1 Introduction

Since the beginning of the last century, the fundamental nature of the concept of automatic computation has attracted great attention from mathematicians and computer scientists (see [5, 15–17, 23, 24, 28, 43]). The first studies had as their reference context David Hilbert's program, and as their reference language the one introduced by Georg Cantor [4]. These approaches led to different mathematical models of computing machines (see [2, 7, 10]) that, surprisingly, were discovered to be equivalent (e.g., anything computable in the λ-calculus is computable by a Turing machine). Moreover, these results, and especially those obtained by Alonzo Church, Alan Turing [5, 11, 43], and Kurt Gödel, contributed fundamentally to demonstrating that David Hilbert's program, which was based on the idea that all of mathematics could be precisely axiomatized, cannot be realized.

In spite of this fact, the idea of finding an adequate set of axioms for one or another field of mathematics continues to be among the most attractive goals for contemporary mathematicians. Usually, when it is necessary to define a concept or an object, logicians try to introduce a number of axioms describing the object in the absolutely best way. However, it is not clear how to reach this absoluteness. First, when we describe a mathematical object or a concept we are limited by the expressive capacity of the language we use to make this description. A richer language allows us to say more about the object and a weaker language allows us to say less. Thus, the continuous development of the mathematical (and not only mathematical) languages leads to a continuous necessity of a transcription and specification of axiomatic systems. Second, there is no guarantee that the chosen axiomatic system defines “sufficiently well” the required concept, and a continuous comparison with practice is required in order to check the adequacy of the accepted set of axioms. However, there can again be no guarantee that the new version will be the last and definitive one. Finally, the third limitation, already mentioned above, was discovered by Gödel in his two famous incompleteness theorems (see [11]).

Starting from these considerations, in this paper, we study the relativity of mathematical languages in situations where they are used to observe and to describe automatic computations. We consider the traditional computational paradigm mainly following results of Turing (see [43]) whereas emerging computational paradigms (see, e.g., [1, 26, 45, 47]) are not considered here. In particular, we focus our attention on different kinds of Turing machines by enriching and extending the results presented in [42].

The point of view presented in this paper strongly uses three methodological ideas borrowed from physics and applied to mathematics, namely: the distinction between the object (we speak here about a mathematical object) of an observation and the instrument used for this observation; interrelations holding between the object and the tool used for this observation; the accuracy of the observation determined by the tool.

The main attention is dedicated to numeral systems (see Footnote 1) that we use to write down numbers, functions, models, etc. and that are among our tools of investigation of mathematical and physical objects. It is shown that numeral systems strongly influence our capabilities to describe both the mathematical and physical worlds. A new numeral system (introduced in [31, 33, 38]) for performing computations with infinite and infinitesimal quantities is used for the observation of mathematical objects and studying Turing machines. The new methodology is based on the principle “the part is less than the whole” introduced by Ancient Greeks and observed in practice. It is applied to all sets and processes (finite and infinite) and all numbers (finite, infinite, and infinitesimal).

In order to see the place of the new approach in the historical panorama of ideas dealing with infinite and infinitesimal, see [20, 21, 36, 37, 42]. The new methodology has been successfully applied for studying a number of applications: percolation (see [14, 44]), Euclidean and hyperbolic geometry (see [22, 30]), fractals (see [32, 34, 41, 44]), numerical differentiation and optimization (see [8, 35, 39, 49]), infinite series (see [36, 40, 48]), the first Hilbert problem (see [37]), and cellular automata (see [9]).

The rest of the paper is structured as follows. In Sect. 2, single and multi-tape Turing machines are introduced along with “classical” results concerning their computational power and related equivalences; in Sect. 3, a brief introduction to the new language and methodology is given, whereas their exploitation for analyzing and observing the different types of Turing machines is discussed in Sect. 4. It is shown that the new approach allows us to observe Turing machines with a higher accuracy, thus making it possible to better characterize and distinguish machines that are equivalent when observed within the classical framework. Finally, Sect. 5 concludes the paper.

2 Single and multi-tape Turing machines

The Turing machine is one of the simple abstract computational devices that can be used to model computational processes and investigate the limits of computability. In the following Sects. 2.1 and 2.2, single and multi-tape Turing machines will be described along with important classical results concerning their computational power and related equivalences.

2.1 Single tape Turing machines

A Turing machine (see, e.g., [13, 43]) can be defined as a 7-tuple

$$ \mathcal{M}=\langle Q, \varGamma, \bar{b}, \varSigma, q_0, F, \delta \rangle , $$
(1)

where Q is a finite and not empty set of states; Γ is a finite set of symbols; \(\bar{b}\in\varGamma\) is a symbol called blank; \(\varSigma\subseteq\{ \varGamma-{\bar{b}}\}\) is the set of input/output symbols; \(q_0\in Q\) is the initial state; \(F\subseteq Q\) is the set of final states; \(\delta:\{Q-F\}\times\varGamma\mapsto Q\times\varGamma\times\{R,L,N\}\) is a partial function called the transition function, where L means left, R means right, and N means no move.

Specifically, the machine is supplied with: (i) a tape running through it which is divided into cells each capable of containing a symbol \(\gamma\in\varGamma\), where Γ is called the tape alphabet, and \(\bar{b}\in\varGamma\) is the only symbol allowed to occur on the tape infinitely often; (ii) a head that can read and write symbols on the tape and move the tape left and right one and only one cell at a time. The behavior of the machine is specified by its transition function δ and consists of a sequence of computational steps; in each step the machine reads the symbol under the head and applies the transition function that, given the current state of the machine and the symbol it is reading on the tape, specifies (if it is defined for these inputs): (i) the symbol \(\gamma\in\varGamma\) to write on the cell of the tape under the head; (ii) the move of the tape (L for one cell left, R for one cell right, N for no move); (iii) the next state \(q\in Q\) of the machine.
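As an illustration of the definition above, the following minimal Python sketch runs a single-tape machine whose partial transition function δ is stored as a dictionary. The names `run_turing_machine`, `BLANK`, `SHIFT`, and `max_steps`, the use of "_" for the blank, and the convention that R and L move the head rather than the tape are assumptions made for this example only. The machine `SUCC`, which appends one symbol to a unary string, is likewise hypothetical.

```python
from collections import defaultdict

BLANK = "_"                          # assumed stand-in for the blank symbol
SHIFT = {"R": 1, "L": -1, "N": 0}    # assumed convention: the head moves, not the tape


def run_turing_machine(delta, q0, finals, tape_input, max_steps=10_000):
    """Run a single-tape machine and return (state, non-blank tape content, steps)."""
    tape = defaultdict(lambda: BLANK)        # every unwritten cell holds the blank
    for i, symbol in enumerate(tape_input):
        tape[i] = symbol
    state, head, steps = q0, 0, 0
    while state not in finals and steps < max_steps:
        key = (state, tape[head])
        if key not in delta:                 # delta is partial: stop when undefined
            break
        state, symbol, move = delta[key]
        tape[head] = symbol
        head += SHIFT[move]
        steps += 1
    used = [i for i, s in tape.items() if s != BLANK]
    content = "".join(tape[i] for i in range(min(used), max(used) + 1)) if used else ""
    return state, content, steps


# Hypothetical example machine: skip over the 1s, then write one more 1 and stop.
SUCC = {
    ("q0", "1"): ("q0", "1", "R"),
    ("q0", BLANK): ("qf", "1", "N"),
}

result = run_turing_machine(SUCC, "q0", {"qf"}, "111")
print(result)
```

The `max_steps` guard is only a practical safeguard for the sketch; the formal definition places no bound on the number of steps.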

Starting from the definition of Turing machine introduced above, classical results (see, e.g., [2]) aim at showing that different machines in terms of provided tape and alphabet have the same computational power, i.e., they are able to execute the same computations. In particular, two main results are reported below in an informal way.

Given a Turing machine \(\mathcal{M}=\{Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta\}\), which is supplied with an infinite tape, it is always possible to define a Turing machine \(\mathcal{M}'=\{Q', \varGamma', \bar{b}, \varSigma', q_{0}', F', \delta'\}\) which is supplied with a semi-infinite tape (e.g., a tape with a left boundary) and is equivalent to \(\mathcal{M}\), i.e., is able to execute all the computations of \(\mathcal{M}\).

Given a Turing machine \(\mathcal{M}=\{Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta\}\), it is always possible to define a Turing Machine \(\mathcal{M}'=\{Q', \varGamma', \bar{b}, \varSigma', q_{0}', F', \delta'\} \) with |Σ′|=1 and \(\varGamma'=\varSigma'\cup\{\bar{b}\}\), which is equivalent to \(\mathcal{M}\), i.e., is able to execute all the computations of \(\mathcal{M}\).

It should be mentioned that these results, together with the usual conclusion regarding the equivalences of Turing machines, can be interpreted in the following, less obvious, way: they show that when we observe Turing machines by exploiting the classical framework we are not able to distinguish, from the computational point of view, Turing machines which are provided with alphabets having different numbers of symbols and/or different kinds of tapes (infinite or semi-infinite) (see [42] for a detailed discussion).

2.2 Multi-tape Turing machines

Let us consider a variant of the Turing machine defined in (1) where a machine is equipped with multiple tapes that can be simultaneously accessed and updated through multiple heads (one per tape). These machines can be used for a more direct and intuitive resolution of different kinds of computational problems. As an example, in checking whether a string is a palindrome it can be useful to have two tapes on which the input string is represented, so that the verification can be efficiently performed by reading one tape from left to right and the other one from right to left.
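The palindrome example can be sketched in Python as follows. This is a simplified illustration and not a full multi-tape machine: the function name `is_palindrome_two_tapes` is hypothetical, the two tapes are modeled as lists, and the input is assumed to be already present on both tapes.

```python
def is_palindrome_two_tapes(x):
    """Compare tape 1 read left to right with tape 2 read right to left."""
    tape1, tape2 = list(x), list(x)      # assumption: input already copied to both tapes
    head1, head2 = 0, len(tape2) - 1     # head 1 at the left end, head 2 at the right end
    while head1 < len(tape1):
        if tape1[head1] != tape2[head2]:
            return False                 # mismatch: not a palindrome
        head1 += 1                       # head 1 moves right
        head2 -= 1                       # head 2 moves left
    return True


print(is_palindrome_two_tapes("abba"), is_palindrome_two_tapes("abc"))
```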

Moving toward a more formal definition, a k-tapes, k≥2, Turing machine (see [13]) can be defined (cf. (1)) as a 7-tuple

$$ \mathcal{M}_{K}=\bigl \langle Q, \varGamma, \bar{b}, \varSigma, q_0, F, \delta ^{(k)}\bigr \rangle , $$
(2)

where \(\varSigma=\bigcup^{k}_{i=1}\varSigma_{i}\) is given by the union of the symbols in the k input/output alphabets \(\varSigma_{1},\dots,\varSigma_{k}\); \(\varGamma=\varSigma\cup\{\bar{b}\}\) where \(\bar{b}\) is a symbol called blank; Q is a finite and not empty set of states; \(q_0\in Q\) is the initial state; \(F\subseteq Q\) is the set of final states; \(\delta^{(k)}:\{Q-F\}\times\varGamma_{1}\times\dots\times\varGamma_{k}\mapsto Q\times\varGamma_{1}\times\dots\times\varGamma_{k}\times\{R,L,N\}^{k}\) is a partial function called the transition function, where \(\varGamma_{i}=\varSigma_{i}\cup\{\bar{b}\}, i=1,\dots ,k\), L means left, R means right, and N means no move.

This definition of \(\delta^{(k)}\) means that the machine executes a transition starting from an internal state \(q_i\) and with the k heads (one for each tape) above the characters \(a_{i_1},\dots,a_{i_k}\), i.e., if \(\delta^{(k)}(q_i,a_{i_1},\dots,a_{i_k})=(q_j,a_{j_1},\dots,a_{j_k},z_{j_1},\dots,z_{j_k})\) the machine goes into the new state \(q_j\), writes on the k tapes the characters \(a_{j_1},\dots,a_{j_k}\), respectively, and moves each of its k heads left, right, or not at all, as specified by \(z_{j_l}\in\{R,L,N\}\), \(l=1,\dots,k\).
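A single transition of a k-tape machine, as just described, can be sketched in Python. The function `step`, the dictionary encoding of the transition function, the "_" blank, and the two-tape machine `COPY` (which copies tape 1 onto tape 2) are all assumptions introduced for illustration; tapes are modeled as dictionaries from cell index to symbol.

```python
BLANK = "_"                          # assumed stand-in for the blank symbol
SHIFT = {"R": 1, "L": -1, "N": 0}


def step(delta_k, state, tapes, heads):
    """Apply one transition in place; return the new state, or None if undefined."""
    read = tuple(tape.get(h, BLANK) for tape, h in zip(tapes, heads))
    action = delta_k.get((state, *read))
    if action is None:               # the transition function is partial
        return None
    k = len(tapes)
    new_state, written, moves = action[0], action[1:1 + k], action[1 + k:]
    for i in range(k):
        tapes[i][heads[i]] = written[i]   # write one symbol per tape
        heads[i] += SHIFT[moves[i]]       # move each head independently
    return new_state


# Hypothetical 2-tape machine: copy the string on tape 1 onto tape 2.
COPY = {
    ("q0", "a", BLANK): ("q0", "a", "a", "R", "R"),
    ("q0", "b", BLANK): ("q0", "b", "b", "R", "R"),
    ("q0", BLANK, BLANK): ("qf", BLANK, BLANK, "N", "N"),
}

tapes = [dict(enumerate("ab")), {}]
heads = [0, 0]
state = "q0"
while state not in ("qf", None):
    state = step(COPY, state, tapes, heads)

copied = "".join(tapes[1][i] for i in sorted(tapes[1]) if tapes[1][i] != BLANK)
print(state, copied)
```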

A machine can adopt a different alphabet for each tape; in any case, for each tape, as for single-tape Turing machines, only the minimum portion containing characters distinct from \(\bar{b}\) is usually represented. In general, a typical configuration of a multi-tape machine consists of a read-only input tape, several read and write work tapes, and a write-only output tape, with the input and output tapes accessible only in one direction. In the case of a k-tapes machine, the instant configuration of the machine, as for the single-tape case, must describe the internal state, the contents of the tapes, and the positions of the heads of the machine.

More formally, for a k-tapes Turing machine \(\mathcal{M}_{K}=\langle Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta^{(k)}\rangle \) with \(\varSigma=\bigcup^{k}_{i=1}\varSigma_{i}\) (see (2)) a configuration of the machine is given by

$$ q\#\alpha_{1}\uparrow\beta_{1}\#\alpha_{2} \uparrow\beta_{2}\# \dots\#\alpha_{k}\uparrow \beta_{k}, $$
(3)

where \(q\in Q\); \(\alpha_{i}\in\varSigma_{i}\varGamma^{*}_{i}\cup\{ \epsilon\}\) and \(\beta_{i}\in\varGamma^{*}_{i}\varSigma_{i}\cup\{\bar {b}\}\). A configuration is final if \(q\in F\).

The starting configuration usually requires the input string x on a tape, e.g., the first tape so that \(x\in \varSigma_{1}^{*}\), and only \(\bar{b}\) symbols on all the other tapes. However, it can be useful to assume that, at the beginning of a computation, these tapes have a starting symbol \(Z_{0}\notin\varGamma=\bigcup^{k}_{i=1}\varGamma_{i}\). Therefore, in the initial configuration the head on the first tape will be on the first character of the input string x, whereas the heads on the other tapes will observe the symbol \(Z_{0}\). More formally, by replacing \(\varGamma_{i}=\varSigma_{i}\cup\{\bar{b}, Z_{0}\}\) in all the previous definitions, a configuration \(q\#\alpha_{1}\uparrow\beta_{1}\#\alpha_{2}\uparrow\beta_{2}\#\dots\#\alpha_{k}\uparrow\beta_{k}\) is an initial configuration if \(\alpha_{i}=\epsilon, i=1,\dots,k, \beta_{1}\in\varSigma_{1}^{*},\beta_{i}=Z_{0}, i=2,\dots,k\) and \(q=q_{0}\).

The application of the transition function \(\delta^{(k)}\) at a machine configuration (cf. (3)) defines a computational step of a multi-tape Turing machine. The set of computational steps which bring the machine from the initial configuration into a final configuration defines the computation executed by the machine. As an example, the computation of a multi-tape Turing machine \(\mathcal{M}_{K}\) which computes the function \(f_{\mathcal {M}_{K}}(x)\) can be represented as follows:

$$ q_{0}\#\uparrow x\#\uparrow Z_{0}\#\dots\#\uparrow Z_{0}\stackrel {\rightarrow}{\mathcal{M}_{K}}q\#\uparrow x \#\uparrow f_{\mathcal {M}_{K}}(x)\#\uparrow \bar{b}\#\dots\#\uparrow\bar{b} $$
(4)

where \(q\in F\) and \(\stackrel{\rightarrow}{\mathcal{M}_{K}}\) indicates the transition among machine configurations.

It is worth noting that, although the k-tapes Turing Machine can be used for a more direct resolution of different kinds of computational problems, in the classical framework it has the same computational power as the single-tape Turing machine. More formally, given a multi-tape Turing machine it is always possible to define a single-tape Turing machine, which is able to fully simulate its behavior and, therefore, to completely execute its computations. In particular, the single-tape Turing machines adopted for the simulation use a particular kind of tape, which is divided into tracks (multi-track tape). In this way, if the tape has m tracks, the head is able to access (for reading and/or writing) all the m characters on the tracks during a single operation. If for the m tracks the alphabets \(\varGamma_{1},\dots,\varGamma_{m}\) are adopted respectively, the machine alphabet Γ is such that \(|\varGamma|=|\varGamma_{1}\times\dots\times\varGamma_{m}|\) and can be defined by an injective function from the set \(\varGamma_{1}\times\dots\times\varGamma_{m}\) to the set Γ; this function will associate the symbol \(\bar{b}\) in Γ to the tuple \((\bar{b},\bar{b},\dots,\bar{b})\) in \(\varGamma_{1}\times\dots\times\varGamma_{m}\). In general, the elements of Γ which correspond to the elements in \(\varGamma_{1}\times\dots\times\varGamma_{m}\) can be indicated by \([a_{i_1},a_{i_2},\dots,a_{i_m}]\) where \(a_{i_j}\in\varGamma_{j}\).
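The construction of the multi-track alphabet can be sketched as follows. The function name `track_alphabet`, the "_" blank, and the bracketed string encoding are assumptions chosen for illustration; the sketch also assumes that individual track symbols contain neither commas nor brackets, so that the mapping is injective.

```python
from itertools import product


def track_alphabet(alphabets, blank="_"):
    """Map every tuple of the Cartesian product of the track alphabets to one symbol.

    The all-blank tuple is mapped to the blank itself; every other tuple
    (a_1, ..., a_m) is mapped to the string "[a_1,...,a_m]".
    """
    encode = {}
    for cell in product(*alphabets):
        if all(symbol == blank for symbol in cell):
            encode[cell] = blank
        else:
            encode[cell] = "[" + ",".join(cell) + "]"
    return encode


# Two tracks with alphabets {a, _} and {0, 1, _}: the product has 2 * 3 = 6 tuples.
encode = track_alphabet([["a", "_"], ["0", "1", "_"]])
print(len(encode), encode[("a", "1")], encode[("_", "_")])
```

Since distinct tuples receive distinct symbols, the resulting alphabet has exactly as many symbols as the Cartesian product, as required.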

By adopting this notation, it is possible to demonstrate that given a k-tapes Turing Machine \(\mathcal{M}_{K}=\{ Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta^{(k)}\}\) it is always possible to define a single-tape Turing machine which is able to simulate t computational steps of \(\mathcal{M}_{K}\) in \(O(t^{2})\) transitions by using an alphabet with \(O((2|\varGamma|)^{k})\) symbols (see [2]).

The proof is based on the definition of a machine \(\mathcal{M}'=\{ Q', \varGamma', \bar{b}, \varSigma', q_{0}', F', \delta' \}\) with a single tape divided into 2k tracks (see [2]): k tracks for storing the characters in the k tapes of \(\mathcal{M}_{K}\) and k tracks for marking, through the marker ↓, the positions of the k heads on the k tapes of \(\mathcal{M}_{K}\). As an example, this kind of tape can represent the content of each tape of \(\mathcal{M}_{K}\) and the position of each machine head in its even and odd tracks, respectively. As discussed above, for obtaining a single-tape machine able to represent these 2k tracks, it is sufficient to adopt an alphabet with the required cardinality and define an injective function which associates the 2k-tuple of characters in a cell of the multi-track tape with a symbol in this alphabet.

The transition function \(\delta^{(k)}\) of the k-tapes machine is given by \(\delta^{(k)}(q_i,a_{i_1},\dots,a_{i_k})=(q_j,a_{j_1},\dots,a_{j_k},z_{j_1},\dots,z_{j_k})\), with \(z_{j_1},\dots,z_{j_k}\in\{R,L,N\}\); as a consequence the corresponding transition function δ′ of the single-tape machine, for each transition specified by \(\delta^{(k)}\), must identify the current state and the position of the marker for each track and then write on the tracks the required symbols, move the markers, and go into another internal state. For each computational step of \(\mathcal{M}_{K}\), the machine \(\mathcal{M}'\) must execute a sequence of steps for covering the portion of tape between the two most distant markers. Since in each computational step a marker can move by at most one cell, and thus two markers can move away from each other by at most two cells, after t steps of \(\mathcal{M}_{K}\) the markers can be at most 2t cells apart; thus, if \(\mathcal{M}_{K}\) executes t steps, \(\mathcal{M}'\) executes at most \(2\sum^{t}_{i=1}i = t^{2}+t =O(t^{2})\) steps.
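The closed form of the bound can be checked explicitly. Assuming, as in the argument above, that the i-th simulated step costs at most 2i steps of \(\mathcal{M}'\) (the maximal distance between the two most distant markers after i steps), the total is an arithmetic series:

$$ 2\sum^{t}_{i=1}i = 2\cdot\frac{t(t+1)}{2} = t^{2}+t . $$

For instance, for t=3 the sum gives 2(1+2+3)=12, which agrees with \(3^{2}+3=12\).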

Moving to the cost of the simulation in terms of the number of required characters for the alphabet of the single-tape machine, we recall that \(|\varGamma_{1}|=|\varSigma_{1}|+1\) and that \(|\varGamma_{i}|=|\varSigma_{i}|+2\) for \(2\leq i\leq k\). So, by multiplying the cardinalities of these alphabets, we obtain that: \(\vert \varGamma'\vert =2^{k}(\vert \varSigma_{1}\vert +1)\prod^{k}_{i=2}(\vert \varSigma_{i}\vert +2)=O({(2\max_{1\leq i\leq k}\vert \varGamma_{i}\vert )}^{k})\).
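The cardinality formula above can be evaluated numerically with a short Python sketch. The function names `simulated_alphabet_size` and `alphabet_bound` are hypothetical; the factor \(2^{k}\) is read here as accounting for the k marker tracks, each cell of which either holds the marker or does not.

```python
from math import prod


def simulated_alphabet_size(sigma_sizes):
    """Evaluate 2^k * (|S_1| + 1) * prod_{i>=2} (|S_i| + 2) for given |S_1|, ..., |S_k|."""
    k = len(sigma_sizes)
    first = sigma_sizes[0] + 1                       # tape 1: input symbols plus blank
    rest = prod(s + 2 for s in sigma_sizes[1:])      # other tapes: plus blank and Z_0
    return 2 ** k * first * rest


def alphabet_bound(sigma_sizes):
    """Evaluate the bound (2 * max_i |Gamma_i|)^k from the same formula."""
    k = len(sigma_sizes)
    gammas = [sigma_sizes[0] + 1] + [s + 2 for s in sigma_sizes[1:]]
    return (2 * max(gammas)) ** k


sizes = [2, 2, 2]    # three tapes with two input/output symbols each
print(simulated_alphabet_size(sizes), alphabet_bound(sizes))
```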

3 The Grossone methodology

In this section, we give just a brief introduction to the methodology of the new approach [31, 33] dwelling only on the issues directly related to the subject of the paper. This methodology will be used in Sect. 4 to study Turing machines and to obtain some more accurate results with respect to those obtainable by using the traditional framework [5, 43].

To begin, let us recall that numerous attempts have been made over the centuries to evolve existing numeral systems in such a way that numerals representing infinite and infinitesimal numbers could be included in them (see [3, 4, 6, 18, 19, 25, 29, 46]). Since new numeral systems appear very rarely, in each concrete historical period their significance for mathematics is very often underestimated (especially by pure mathematicians). In order to illustrate their importance, let us recall the Roman numeral system, which does not allow one to express zero and negative numbers. In this system, the expression III-X is an indeterminate form. As a result, before the positional numeral system appeared and zero was invented, mathematicians were not able to create theorems involving zero and negative numbers and to execute computations with them.

There exist numeral systems that are even weaker than the Roman one. They seriously limit their users in executing computations. Let us recall a study published recently in Science (see [12]). It describes a primitive tribe living in Amazonia (Pirahã). These people use a very simple numeral system for counting: one, two, many. For Pirahã, all quantities larger than two are just “many” and such operations as 2+2 and 2+1 give the same result, i.e., “many.” Using their weak numeral system, Pirahã are not able to see, for instance, numbers 3, 4, 5, and 6, to execute arithmetical operations with them, and, in general, to say anything about these numbers because in their language there are neither words nor concepts for that.

In the context of the present paper, it is very important that the weakness of Pirahã’s numeral system leads them to such results as

$$ \mbox{`many'}+ 1= \mbox{`many'}, \quad\quad \mbox{`many'} + 2 = \mbox{`many'}, $$
(5)

which are very familiar to us in the context of views on infinity used in the traditional calculus

$$ \infty+ 1= \infty, \quad\quad \infty+ 2 = \infty. $$
(6)

The arithmetic of Pirahã involving the numeral ‘many’ also has a clear similarity with the arithmetic proposed by Cantor for his Alephs (see Footnote 2):

$$ \aleph_0 + 1= \aleph_0, \quad\quad \aleph_0 + 2= \aleph_0, \quad\quad\aleph_1+ 1= \aleph_1, \quad\quad \aleph_1 + 2 = \aleph_1. $$
(7)

Thus, the modern mathematical numeral systems allow us to distinguish a larger quantity of finite numbers with respect to Pirahã but give results that are similar to those of Pirahã when we speak about infinite quantities. This observation leads us to the following idea:

Probably our difficulties in working with infinity are not connected to the nature of infinity itself but are a result of inadequate numeral systems that we use to work with infinity, more precisely, to express infinite numbers.

The approach developed in [31, 33, 38] proposes a numeral system that uses the same numerals for several different purposes for dealing with infinities and infinitesimals: in Analysis for working with functions that can assume different infinite, finite, and infinitesimal values (functions can also have derivatives assuming different infinite or infinitesimal values); for measuring infinite sets; for indicating positions of elements in ordered infinite sequences; in probability theory, etc. (see [8, 9, 14, 22, 30, 32, 34–37, 39–41, 44, 48, 49]). It is important to emphasize that the new numeral system avoids situations of the type (5)–(7) providing results ensuring that if a is a numeral written in this system then for any a (i.e., a can be finite, infinite, or infinitesimal) it follows a+1>a.

The new numeral system works as follows. A new infinite unit of measure expressed by the numeral ① called Grossone is introduced as the number of elements of the set, ℕ, of natural numbers. Concurrently with the introduction of Grossone in the mathematical language all other symbols (like ∞, Cantor’s ω, ℵ0,ℵ1,… , etc.) traditionally used to deal with infinities and infinitesimals are excluded from the language because Grossone and other numbers constructed with its help not only can be used instead of all of them, but can be used with a higher accuracy (see Footnote 3). Grossone is introduced by describing its properties postulated by the Infinite Unit Axiom (see [33, 38]) added to axioms for real numbers (similarly, in order to pass from the set, ℕ, of natural numbers to the set, ℤ, of integers a new element, zero, expressed by the numeral 0, is introduced by describing its properties).

The new numeral ① allows us to construct different numerals expressing different infinite and infinitesimal numbers and to execute computations with them. Let us give some examples. For instance, in Analysis, indeterminate forms are not present and, for example, the following relations hold for ① and \(①^{-1}\) (that is infinitesimal), as for any other (finite, infinite, or infinitesimal) number expressible in the new numeral system

(8)
(9)
(10)

The new approach gives the possibility to develop a new Analysis (see [36]) where functions assuming not only finite values but also infinite and infinitesimal ones can be studied. For all of them, it becomes possible to introduce a new notion of continuity that is closer to our modern physical knowledge. Functions assuming finite and infinite values can be differentiated and integrated.

By using the new numeral system it becomes possible to measure certain infinite sets and to see, e.g., that the sets of even and odd numbers have ①/2 elements each. The set, ℤ, of integers has 2①+1 elements (① positive elements, ① negative elements, and zero). Within the countable sets and sets having cardinality of the continuum (see [20, 37, 38]) it becomes possible to distinguish infinite sets having different number of elements expressible in the numeral system using Grossone and to see that, for instance,

(11)
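Some of the arithmetic described above can be mimicked with a small Python sketch. It is only an illustration under strong assumptions and not the numeral system of [31, 33, 38] itself: the class `G` is hypothetical, it covers only finite sums of terms c·①^p with rational c and integer p, and it compares two such numerals by the sign of the leading coefficient of their difference, which is an assumption consistent with ① being infinite.

```python
from fractions import Fraction


class G:
    """Finite sum of c * grossone**p with rational c and integer p (illustrative subset)."""

    def __init__(self, terms=None):
        # terms maps an exponent p to its coefficient c; zero coefficients are dropped
        self.terms = {p: Fraction(c) for p, c in (terms or {}).items() if c != 0}

    @staticmethod
    def lift(x):
        return x if isinstance(x, G) else G({0: x})   # a finite number is c * grossone**0

    def __add__(self, other):
        out = dict(self.terms)
        for p, c in G.lift(other).terms.items():
            out[p] = out.get(p, 0) + c
        return G(out)

    def __neg__(self):
        return G({p: -c for p, c in self.terms.items()})

    def __sub__(self, other):
        return self + (-G.lift(other))

    def __mul__(self, other):
        out = {}
        for p1, c1 in self.terms.items():
            for p2, c2 in G.lift(other).terms.items():
                out[p1 + p2] = out.get(p1 + p2, 0) + c1 * c2
        return G(out)

    def sign(self):
        if not self.terms:
            return 0
        lead = self.terms[max(self.terms)]   # the highest power of grossone dominates
        return 1 if lead > 0 else -1

    def __gt__(self, other):
        return (self - other).sign() > 0

    def __eq__(self, other):
        return (self - other).sign() == 0


GROSSONE = G({1: 1})          # the numeral for grossone
INFINITESIMAL = G({-1: 1})    # grossone to the power -1

evens = GROSSONE * Fraction(1, 2)     # number of even natural numbers
integers = GROSSONE * 2 + 1           # number of integers

print(GROSSONE + 1 > GROSSONE, INFINITESIMAL + 1 > INFINITESIMAL)
print(evens + evens == GROSSONE, integers > GROSSONE)
```

In this sketch, a+1>a holds for the infinite, finite, and infinitesimal values tested, in contrast with relations (5)–(7).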

Another key notion for our study of Turing machines is that of infinite sequence. Thus, before considering the notion of the Turing machine from the point of view of the new methodology, let us explain how the notion of the infinite sequence can be viewed from the new positions.

Traditionally, an infinite sequence \(\{a_{n}\}, a_{n}\in A, n\in\mathbb{N}\), is defined as a function having the set of natural numbers, ℕ, as the domain and a set A as the codomain. A subsequence \(\{b_{n}\}\) is defined as a sequence \(\{a_{n}\}\) from which some of its elements have been removed. In spite of the fact that the removal of the elements from \(\{a_{n}\}\) can be directly observed, the traditional approach does not allow one to register, in the case where the obtained subsequence \(\{b_{n}\}\) is infinite, the fact that \(\{b_{n}\}\) has fewer elements than the original infinite sequence \(\{a_{n}\}\).

Let us study what happens when the new approach is used. From the point of view of the new methodology, an infinite sequence can be considered in a dual way: either as an object of a mathematical study or as a mathematical instrument developed by human beings to observe other objects and processes. First, let us consider it as a mathematical object and show that the definition of infinite sequences should be made more precise within the new methodology. In the finite case, a sequence \(a_{1},a_{2},\dots,a_{n}\) has n elements and we extend this definition directly to the infinite case saying that an infinite sequence \(a_{1},a_{2},\dots,a_{n}\) has n elements where n is expressed by an infinite numeral such that the operations with it satisfy the methodological Postulate 3. Then the following result (see [31, 33]) holds. We reproduce here its proof for the sake of completeness.

Theorem 1

The number of elements of any infinite sequence is less than or equal to ①.

Proof

The new numeral system allows us to express the number of elements of the set ℕ as ①. Thus, due to the sequence definition given above, any sequence having ℕ as the domain has ① elements.

The notion of subsequence is introduced as a sequence from which some of its elements have been removed. This means that the resulting subsequence will have fewer elements than the original sequence. Thus, we obtain infinite sequences having the number of members less than Grossone. □

It becomes appropriate now to define the complete sequence as an infinite sequence containing ① elements. For example, the sequence of natural numbers is complete, whereas the sequences of even and odd natural numbers are not complete because they have ①/2 elements each (see [31, 33]). Thus, the new approach imposes a more precise description of infinite sequences than the traditional one: to define a sequence \(\{a_{n}\}\) in the new language, it is not sufficient just to give a formula for \(a_{n}\); we should determine (as it happens for sequences having a finite number of elements) its number of elements and/or the first and the last elements of the sequence. If the number of the first element is equal to one, we can use the record \(\{a_{n}:k\}\) where \(a_{n}\) is, as usual, the general element of the sequence and k is the number (that can be finite or infinite) of members of the sequence; the following example clarifies these concepts.

Example 1

Let us consider the following three sequences:

(12)
(13)
(14)

The three sequences have \(a_{n}=b_{n}=c_{n}=4n\), but they are different because they have different numbers of members. Sequence \(\{a_{n}\}\) has ① elements and, therefore, is complete, \(\{b_{n}\}\) has , and \(\{c_{n}\}\) has elements.

Let us consider now infinite sequences as one of the instruments used by mathematicians to study the world around us and other mathematical objects and processes. The first immediate consequence of Theorem 1 is that any sequential process can have at maximum ① elements. This means that a process of sequential observations of any object cannot contain more than ① steps (see Footnote 4). We are not able to execute any infinite process physically, but we assume the existence of such a process; moreover, only a finite number of observations of elements of the considered infinite sequence can be executed by a human who is limited by the numeral system used for the observation. Indeed, we can observe only those members of a sequence for which there exist the corresponding numerals in the chosen numeral system; to better clarify this point, the following example is discussed.

Example 2

Let us consider the numeral system, \(\mathcal{P}\), of Pirahã able to express only numbers 1 and 2. If we add to \(\mathcal{P}\) the new numeral ①, we obtain a new numeral system (we call it \(\widehat{\mathcal{P}}\)). Let us consider now a sequence of natural numbers {n:①}. It goes from 1 to ① (note that both numbers, 1 and ①, can be expressed by numerals from \(\widehat{\mathcal{P}}\)). However, the numeral system \(\widehat {\mathcal{P}}\) is very weak and it allows us to observe only ten numbers from the sequence {n:①} represented by the following numerals:

(15)

The first two numerals in (15) represent finite numbers, the remaining eight numerals express infinite numbers, and dots represent members of the sequence of natural numbers that are not expressible in \(\widehat{\mathcal{P}}\) and, therefore, cannot be observed if one uses only this numeral system for this purpose.

In the light of the limitations concerning the process of sequential observations, the researcher can choose how to organize the required sequence of observations and which numeral system to use for it, defining so which elements of the object he/she can observe. This situation is exactly the same as in natural sciences: before starting to study a physical object, a scientist chooses an instrument and its accuracy for the study.

Example 3

Let us consider the set as an object of our observation. Suppose that we want to organize the process of the sequential counting of its elements. Then, due to Theorem 1, starting from the number 1 this process can arrive at maximum to ①. If we consider the complete counting sequence {n:①}, then we obtain

(16)

Analogously, if we start the process of the sequential counting from 5, the process arrives at maximum to ①+4:

(17)

The corresponding complete sequence used in this case is {n+4:①}. We can also change the length of the step in the counting sequence and consider, for instance, the complete sequence {2n−1:①}:

(18)

If we use again the numeral system \(\widehat{\mathcal{P}}\), then among finite numbers it allows us to see only number 1 because already the next number in the sequence, 3, is not expressible in \(\widehat {\mathcal{P}}\). The last element of the sequence is 2①−1 and \(\widehat{\mathcal{P}}\) allows us to observe it. □

The introduced definition of the sequence allows us to work not only with the first but with any element of any sequence if the element of our interest is expressible in the chosen numeral system, independently of whether the sequence under our study has a finite or an infinite number of elements. Let us use this new definition for studying infinite sets of numerals, in particular, for calculating the number of points at the interval [0,1) (see [31, 33]). To do this, we need a definition of the term “point” and mathematical tools to indicate a point. If we accept (as is usually done in modern Mathematics) that a point A belonging to the interval [0,1) is determined by a numeral x, x∈𝕊, called coordinate of the point A where 𝕊 is a set of numerals, then we can indicate the point A by its coordinate x and we are able to execute the required calculations.

It is worthwhile to emphasize that giving this definition we have not used the usual formulation “x belongs to the set, ℝ, of real numbers.” This has been done because we can express coordinates only by numerals and different choices of numeral systems lead to different sets of numerals and, as a result, to different sets of numbers observable through the chosen numerals. In fact, we can express coordinates only after we have fixed a numeral system (our instrument of the observation) and this choice defines which points we can observe, namely, points having coordinates expressible by the chosen numerals. This situation is typical for natural sciences where it is well known that instruments influence the results of observations. Recall the work with a microscope: we decide the level of the precision we need and obtain a result which is dependent on the chosen level of accuracy. If we need a more precise or a rougher answer, we change the lens of our microscope.

We should now decide which numerals we shall use to express the coordinates of the points. After this choice, we can calculate the number of numerals expressible in the chosen numeral system and, as a result, we obtain the number of points in the interval [0,1). Different variants (see [31, 33]) can be chosen depending on the precision level we want to obtain. For instance, we can choose a positional numeral system with a finite radix b that allows us to work with numerals

\[ (0.a_{1}a_{2}\ldots a_{\text{①}-1}a_{\text{①}})_{b},\quad a_{i}\in\{0,1,\ldots,b-2,b-1\},\quad 1\leqslant i\leqslant\text{①}. \tag{19} \]

Then the number of numerals (19) gives us the number of points within the interval [0,1) that can be expressed by these numerals. Note that a number expressed in the positional numeral system (19) cannot have more than Grossone digits (in contrast to the sets discussed in Example 3) because a numeral having g>① digits would not be observable in a sequence. In this case (g>①), such a record becomes useless in sequential computations because it does not allow one to identify numbers entirely, since g−① numerals remain unobserved.

Theorem 2

If coordinates of points x∈[0,1) are expressed by numerals (19), then the number of the points x over [0,1) is equal to \(b^{\text{①}}\).

Proof

In the numerals (19), there is a sequence of digits, \(a_{1}a_{2}\ldots a_{\text{①}-1}a_{\text{①}}\), used to express the fractional part of the number. Due to the definition of the sequence and Theorem 1, any infinite sequence can have at most ① elements. As a result, there are ① positions to the right of the dot, each of which can be filled in by one of the b digits from the alphabet {0,1,…,b−1}, which leads to \(b^{\text{①}}\) possible combinations. Hence, the positional numeral system using the numerals of the form (19) can express \(b^{\text{①}}\) numbers. □

Corollary 1

The number of numerals

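\[ (a_{1}a_{2}\ldots a_{\text{①}-1}a_{\text{①}})_{b},\quad a_{i}\in\{0,1,\ldots,b-2,b-1\},\quad 1\leqslant i\leqslant\text{①}, \tag{20} \]

expressing integers in the positional system with a finite radix b in the alphabet {0,1,…,b−2,b−1} is equal to \(b^{\text{①}}\).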

Proof

The proof is a straightforward consequence of Theorem 2 and is therefore omitted. □

Corollary 2

If coordinates of points x∈(0,1) are expressed by numerals (19), then the number of the points x over (0,1) is equal to \(b^{\text{①}}-1\).

Proof

The proof follows immediately from Theorem 2. □

Note that Corollary 2 shows that it becomes possible now to observe and to register the difference of the number of elements of two infinite sets (the interval [0,1) and the interval (0,1), respectively) even when only one element (the point 0, expressed by the numeral 0.00…0 with ① zero digits after the decimal point) has been excluded from the first set in order to obtain the second one.
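The counting argument behind Theorem 2 and Corollary 2 can be checked on a finite analogue. The following Python sketch is only an illustration under a strong assumption: the infinite number ① of digit positions is replaced by a finite stand-in n, so it reproduces the combinatorics (b^n numerals, and b^n − 1 once the numeral of zero is excluded) and not Grossone arithmetic itself. The function name and its parameters are hypothetical and do not come from the paper.

```python
from itertools import product


def fractional_numerals(b, n):
    """Enumerate all numerals (0.a_1 a_2 ... a_n)_b as tuples of digits.

    ASSUMPTION: n is a finite stand-in for the number of digit positions
    (grossone in the text); b is the finite radix.
    """
    return list(product(range(b), repeat=n))


b, n = 3, 4
points_closed_open = fractional_numerals(b, n)               # analogue of [0,1)
points_open = [x for x in points_closed_open if any(x)]      # drop 0.00...0

# Finite analogue of Theorem 2: b**n numerals.
print(len(points_closed_open) == b ** n)
# Finite analogue of Corollary 2: removing one point gives b**n - 1.
print(len(points_open) == b ** n - 1)
```

In the finite setting the difference of one element is, of course, always visible; the point of Corollary 2 is that the same difference remains registrable when n is replaced by ①.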

4 The Turing machines observed through the lens of the Grossone methodology

In this section, the different types of Turing machines introduced in Sect. 2 are analyzed and observed by using as instruments of the observation the Grossone language and methodology presented in Sect. 3. In particular, new results for multi-tape Turing machines are presented and discussed.

Before starting the discussion, it is useful to recall the main results from the previous section: (i) any infinite sequence can have at most ① elements; (ii) the elements which we are able to observe in this sequence depend on the adopted numeral system.

Then, in order to be able to read and to understand the output of a Turing machine that writes its output on the tape using an alphabet Σ containing b symbols {0,1,…,b−2,b−1}, where b is a finite number, the researcher (the user) should know a positional numeral system \(\mathcal{U}\) with an alphabet {0,1,…,u−2,u−1} where u≥b; otherwise the output cannot be decoded by the user. Moreover, the researcher must be able to observe a number of symbols at least equal to the maximal length of the output sequence that can be computed by the machine; otherwise the user is not able to interpret the obtained result (see [42] for a detailed discussion).

In this section, a first set of results aims to specify, with higher accuracy than that provided by the mathematical language developed by Cantor and adopted by Turing, how and when the computations performed by a multi-tape Turing machine can be observed in a sequence. Moreover, it is shown that the Grossone language and methodology allow us to perform a more accurate investigation of situations traditionally interpreted as equivalences among different multi-tape machines and among multi- and single-tape machines.

4.1 Observing computations performed by a multi-tape Turing machine

Before starting to analyze the computations performed by a k-tapes Turing machine (with k≥2) \(\mathcal{M}_{K}=\langle Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta^{(k)}\rangle \) (see (1), Sect. 2.2), it is worth making some considerations about the process of observation itself in the light of the Grossone methodology. As discussed above, if we want to observe the process of computation performed by a Turing machine while it executes an algorithm, then we have to execute observations of the machine in a sequence of moments. In fact, it is not possible to organize a continuous observation of the machine. Any instrument used for an observation has its accuracy, and there will always be a minimal period of time related to this instrument allowing one to distinguish two different moments of time and, as a consequence, to observe (and to register) the states of the object in these two moments. In the period of time passing between these two moments, the object remains unobservable.

Since our observations are made in a sequence, the process of observations can have at most ① elements. This means that inside a computational process it is possible to fix more than Grossone steps (defined in some way), but it is not possible to count them one by one in a sequence containing more than Grossone elements. For instance, in a time interval [0,1), up to \(b^{\text{①}}\) numerals of the type (19) can be used to identify moments of time, but not more than Grossone of them can be observed in a sequence. Moreover, it is important to stress that any process itself, considered independently of the researcher, is not subdivided into iterations, intermediate results, moments of observation, etc. The structure of the language we use to describe the process imposes what we can say about the process (see [42] for a detailed discussion).

On the basis of the considerations made above, we should choose the accuracy (granularity) of the process of observation of a Turing machine; for instance, we can choose a single operation of the machine, such as reading a symbol from the tape or moving the tape. However, in order to stay as close as possible to the traditional results, we consider an application of the transition function of the machine as our observation granularity (see Sect. 2).

Moreover, concerning the output of the machine, we consider the symbols written on all the k tapes of the machine by using, on each tape i, with 1≤i≤k, the alphabet \(\varSigma_{i}\) of the tape, containing \(b_{i}\) symbols, plus the blank symbol (\(\bar{b}\)). Due to the definition of complete sequence (see Sect. 3), on each tape at most ① symbols can be produced and observed. This means that on a tape i, after the last symbol belonging to the tape alphabet \(\varSigma_{i}\), if the sequence is not complete (i.e., if it has fewer than ① symbols), we can consider the number of blank symbols (\(\bar{b}\)) necessary to complete the sequence. We say that we are considering a complete output of a k-tapes Turing machine when on each tape of the machine we consider a complete sequence of symbols belonging to \(\varSigma_{i}\cup\{\bar{b}\}\).

Theorem 3

Let \(\mathcal{M}_{K}=\langle Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta^{(k)}\rangle \) be a k-tapes, k≥2, Turing machine. Then, a complete output of the machine will result in k① symbols.

Proof

Due to the definition of the complete sequence, on each tape at maximum ① symbols can be produced and observed and thus by considering a complete sequence on each of the k tapes of the machine the complete output of the machine will result in k① symbols. □

Having proved that a complete output that can be produced by a k-tapes Turing machine results in k① symbols, it is interesting to investigate what part of the complete output produced by the machine can be observed in a sequence, taking into account that it is not possible to observe in a sequence more than ① symbols (see Sect. 3). As examples, we can decide to make in a sequence one of the following observations: (i) ① symbols on one among the k tapes of the machine; (ii) ①/k symbols on each of the k tapes of the machine; (iii) ①/2 symbols on 2 among the k tapes of the machine; and so on.

Theorem 4

Let \(\mathcal{M}_{K}=\langle Q, \varGamma, \bar {b}, \varSigma, q_{0}, F, \delta^{(k)}\rangle \) be a k-tapes, k≥2, Turing machine. Let M be the number of all possible complete outputs that can be produced by \(\mathcal{M}_{K}\). Then it follows that \(M=\prod_{i=1}^{k}(b_{i}+1)^{\text{①}}\).

Proof

Due to the definition of the complete sequence, on each tape i, with 1≤i≤k, at most ① symbols can be produced and observed by using the \(b_{i}\) symbols of the alphabet \(\varSigma_{i}\) of the tape plus the blank symbol (\(\bar{b}\)); as a consequence, the number of all the possible complete sequences that can be produced and observed on a tape i is \((b_{i}+1)^{\text{①}}\). A complete output of the machine is obtained by considering a complete sequence on each of the k tapes of the machine; thus, by considering all the possible complete sequences that can be produced and observed on each of the k tapes of the machine, the number M of all the possible complete outputs will result in \(\prod_{i=1}^{k}(b_{i}+1)^{\text{①}}\). □

Since the number of complete outputs that can be produced by \(\mathcal{M}_{K}\) is larger than Grossone, there can be different sequential enumerating processes that enumerate complete outputs in different ways. In any case, each of these sequential enumerating processes cannot contain more than Grossone members (see Sect. 3).
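The counts in Theorems 3 and 4 can also be illustrated on a finite analogue. The Python sketch below is a hedged illustration, not the construction of the paper: a finite length n stands in for ①, the blank symbol is represented by a hypothetical character `_`, and the two tape alphabets are made up for the example. It enumerates every complete output by brute force and compares the result with the closed form given by the product of (b_i + 1)^n over the tapes.

```python
from itertools import product
from math import prod

BLANK = "_"  # hypothetical stand-in for the blank symbol


def complete_sequences(alphabet, n):
    """All length-n sequences over the tape alphabet plus the blank.

    ASSUMPTION: n is a finite stand-in for grossone.
    """
    return list(product(list(alphabet) + [BLANK], repeat=n))


def count_complete_outputs(alphabets, n):
    """Closed form: product over the tapes of (b_i + 1) ** n."""
    return prod((len(a) + 1) ** n for a in alphabets)


alphabets = ["01", "abc"]   # k = 2 tapes with b_1 = 2 and b_2 = 3 (made up)
n = 3
per_tape = [complete_sequences(a, n) for a in alphabets]
outputs = list(product(*per_tape))   # one complete sequence per tape

# Finite analogue of Theorem 4: brute-force count equals the closed form.
print(len(outputs) == count_complete_outputs(alphabets, n))
# Finite analogue of Theorem 3: every complete output has k * n symbols.
print(all(sum(len(seq) for seq in out) == len(alphabets) * n for out in outputs))
```

Brute-force enumeration is used deliberately here, since it checks the closed form independently instead of restating it; it is feasible only for very small n.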

4.2 Equivalences among different multi-tape machines and among multi- and single-tape machines

In the classical framework, k-tape Turing machines have the same computational power as single-tape Turing machines: given a multi-tape Turing machine \(\mathcal{M}_{K}\), it is always possible to define a single-tape Turing machine which is able to fully simulate its behavior and, therefore, to completely execute its computations. As shown for the single-tape Turing machine (see [42]), the Grossone methodology allows us to give a more accurate definition of the equivalence among different machines, as it provides the possibility not only to separate different classes of infinite sets with respect to their cardinalities, but also to measure the number of elements of some of them. With reference to multi-tape Turing machines, the single-tape Turing machines adopted for their simulation use a particular kind of tape, which is divided into tracks (multi-track tape). In this way, if the tape has m tracks, the head is able to access (for reading and/or writing) all the m characters on the tracks during a single operation. This tape organization leads to a straightforward definition of the behavior of a single-tape Turing machine able to completely execute the computations of a given multi-tape Turing machine (see Sect. 2.2). However, the so-defined single-tape Turing machine \(\mathcal{M}\), to simulate t computational steps of \(\mathcal{M}_{K}\), needs to execute \(O(t^{2})\) transitions (\(t^{2}+t\) in the worst case) and to use an alphabet with \(2^{k}(\vert \varSigma_{1}\vert +1)\prod^{k}_{i=2}(\vert \varSigma _{i}\vert +2)\) symbols (again see Sect. 2.2). By exploiting the Grossone methodology, it is possible to obtain the following result, which has a higher accuracy than that provided by the traditional framework.

Theorem 5

Let us consider \(\mathcal{M}_{K}=\langle Q, \varGamma, \bar{b}, \varSigma, q_{0}, F, \delta^{(k)}\rangle \), a k-tapes, k≥2, Turing machine, where \(\varSigma=\bigcup^{k}_{i=1}\varSigma_{i}\) is given by the union of the symbols in the k tape alphabets \(\varSigma_{1},\ldots,\varSigma_{k}\) and \(\varGamma=\varSigma\cup\{\bar{b}\}\). If this machine performs t computational steps such that

\[ t\leqslant\frac{1}{2}\bigl(\sqrt{4\text{①}+1}-1\bigr), \tag{21} \]

then there exists \(\mathcal{M}'=\langle Q', \varGamma', \bar{b}, \varSigma', q_{0}', F', \delta' \rangle\), an equivalent single-tape Turing machine with \(\vert \varGamma'\vert =2^{k}(\vert \varSigma_{1}\vert +1)\prod^{k}_{i=2}(\vert \varSigma_{i}\vert +2)\), which is able to simulate \(\mathcal{M}_{K}\) and can be observed in a sequence.

Proof

Let us recall that the definition of \(\mathcal{M}'\) requires a single tape to be divided into 2k tracks: k tracks for storing the characters on the k tapes of \(\mathcal{M}_{K}\) and k tracks for marking, through the marker ↓, the positions of the k heads on the k tapes of \(\mathcal{M}_{K}\) (see Sect. 2.2). The transition function \(\delta^{(k)}\) of the k-tapes machine is given by \(\delta^{(k)}(q_{1},a_{i_{1}},\ldots,a_{i_{k}})=(q_{j},a_{j_{1}},\ldots,a_{j_{k}},z_{j_{1}},\ldots,z_{j_{k}})\), with \(z_{j_{1}},\ldots,z_{j_{k}}\in\{R,L,N\}\); as a consequence, the corresponding transition function δ′ of the single-tape machine must, for each transition specified by \(\delta^{(k)}\), identify the current state and the position of the marker on each track, then write the required symbols on the tracks, move the markers, and pass to another internal state. For each computational step of \(\mathcal{M}_{K}\), \(\mathcal{M}'\) must execute a sequence of steps covering the portion of tape between the two most distant markers. Since in each computational step a marker can move by at most one cell, two markers can move apart from each other by at most two cells; hence, after t steps of \(\mathcal{M}_{K}\) the markers can be at most 2t cells apart. Thus, if \(\mathcal{M}_{K}\) executes t steps, \(\mathcal{M}'\) executes at most \(2\sum^{t}_{i=1}i = t^{2}+t\) steps. In order to be observable in a sequence, the number \(t^{2}+t\) of steps performed by \(\mathcal {M}'\) to simulate t steps of \(\mathcal{M}_{K}\) must be less than or equal to ①, that is, \(t^{2}+t\leqslant\text{①}\). Solving this quadratic inequality for t≥0 gives \(t\leqslant\frac{1}{2}(\sqrt{4\text{①}+1}-1)\), which is condition (21), and this completes the proof. □
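The bound used in the proof comes from solving the quadratic inequality t² + t ≤ ① for nonnegative t, which gives t ≤ (√(4① + 1) − 1)/2. As a rough numerical illustration, the Python sketch below replaces ① by a finite observation budget (an assumption made only so that the arithmetic can be executed) and checks both the worst-case step count 2(1 + 2 + … + t) = t² + t and the largest admissible t. The function names are hypothetical.

```python
from math import isqrt


def simulation_steps(t):
    """Worst-case single-tape steps for t multi-tape steps: 2 * (1 + ... + t)."""
    return 2 * sum(range(1, t + 1))


def max_observable_steps(bound):
    """Largest integer t with t**2 + t <= bound.

    ASSUMPTION: bound is a finite stand-in for grossone. Since
    t**2 + t <= bound is equivalent to (2*t + 1)**2 <= 4*bound + 1,
    the answer is floor((isqrt(4*bound + 1) - 1) / 2).
    """
    return (isqrt(4 * bound + 1) - 1) // 2


bound = 1_000_000
t_max = max_observable_steps(bound)

print(simulation_steps(t_max) == t_max ** 2 + t_max)   # closed form of the sum
print(simulation_steps(t_max) <= bound)                # t_max is admissible
print(simulation_steps(t_max + 1) > bound)             # t_max + 1 is not
```

Integer square roots are used instead of floating-point ones so that the check stays exact for large budgets.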

5 Concluding Remarks

In the paper, single- and multi-tape Turing machines have been described and observed through the lens of the Grossone language and methodology. This new language, unlike the traditional one, makes it possible to distinguish among infinite sequences of different lengths, thus enabling a more accurate description of single- and multi-tape Turing machines. The possibility of expressing the length of an infinite sequence explicitly makes it possible to establish more accurate results regarding the equivalence of machines in comparison with the observations that can be made by using the traditional language.

It is worth noting that the traditional results and those presented in the paper do not contradict one another. They are just written by using different mathematical languages having different accuracies. Both mathematical languages observe and describe the same objects—Turing machines—but with different accuracies. As a result, both traditional and new results are correct with respect to the mathematical languages used to express them and correspond to different accuracies of the observation. This fact is one of the manifestations of the relativity of mathematical results formulated by using different mathematical languages in the same way as the usage of a stronger lens in a microscope gives a possibility to distinguish more objects within an object that seems to be unique when viewed by a weaker lens.

Specifically, the Grossone language has allowed us to give the definition of complete output of a Turing machine, to establish when and how the output of a machine can be observed, and to establish a more accurate relationship between a multi-tape Turing machine and a single-tape one which simulates its computations. Future research efforts will be directed toward applying the Grossone language and methodology to the description and observation of new and emerging computational paradigms.