1 Introduction

Information is everywhere, shaping our discourses and our thoughts. In everyday life, we know that the information spread by the media may trigger deep social, economic and political changes. In science, the concept of information has pervaded almost all scientific disciplines, from physics and chemistry to biology and psychology. It is for this reason that the philosophical analysis of its meaning and scope is nowadays an urgent task. In this sense, the works of Christopher Timpson constitute an outstanding contribution to the field, since they have brought to the fore many aspects of the concept of information: the domain of application of Shannon’s theory (Timpson 2003), the relation between information transmission and quantum entanglement (Timpson 2005), the interpretation of teleportation (Timpson 2006), the relation of quantum information with the interpretations of quantum mechanics (Timpson 2008, 2013), among others. In particular, Timpson proposes a deflationary view about information, according to which the term ‘information’ is an abstract noun and, as a consequence, information is not physical in any relevant sense. This innovative and well-articulated view has had a great impact on the philosophy of physics community, especially among authors interested in the use of the concept of information for interpreting physical theories. For this reason, Timpson’s proposal deserves to be critically analyzed in detail, in order to assess the consequences usually drawn from it. The main purpose of the present article consists precisely in supplying such an analysis. In particular, after recalling certain distinctions regarding the concept of information (Section 2), the basic elements of Shannon information will be introduced (Section 3). In the following section, Timpson’s distinction between quantity of information and pieces of information will be presented, with some first qualms against it (Section 4). On this basis, it will be argued that Timpson’s characterization of quantity of information in terms of Shannon’s coding theorems is open to conceptual objections when considered from the standpoint of scientific practice (Section 5). It will also be claimed that the arguments appealed to by Timpson to ground his deflationary view of information oscillate between two questionable positions (Section 6): sometimes, the goal of communication is described as reproducing at the destination another token of the same type as that produced at the source; in other cases, the relation between tokens of the same type is identified with the formal relation of sameness of structure. This analysis will lead us to claim that information is an item even more abstract than what Timpson claims; nevertheless, this is not an obstacle to conceiving information as a physical item (Section 7). Finally, in contrast with Timpson’s monist interpretation, we will propose a pluralist view about information (Section 8), according to which, even on the basis of a single formalism, the concept of information admits a variety of interpretations, each one useful in a different context.

2 Which information?

As many recognize, information is a polysemantic concept that can be associated with different phenomena (Floridi 2015). In this conceptual tangle, the first distinction to be introduced in philosophy is that between a semantic and a non-semantic view of information. According to the first view, information is something that carries semantic content (Bar-Hillel and Carnap 1953; Bar-Hillel 1964; Floridi 2010, 2011); it is therefore strongly related to semantic notions such as reference, meaning and representation. In general, semantic information is carried by propositions that intend to represent states of affairs; so, it has intentionality, “aboutness”, that is, it is directed to other things. And although it remains controversial whether false factual content may qualify as information (see Graham 1999; Fetzer 2004; Floridi 2004, 2005; Scarantino and Piccinini 2010), semantic information maintains strong links with the notion of truth.

Non-semantic information, also called ‘mathematical’, is concerned with the compressibility properties of sequences of states of a system and/or the correlations between the states of two systems, independently of the meanings of those states. In this domain there are at least two different contexts in which the concept of information is essential. In the computational context, information is something that has to be computed and stored in an efficient way. In this framework, the algorithmic complexity measures the minimum resources needed to effectively reconstruct an individual message (Solomonoff 1964; Kolmogorov 1965, 1968; Chaitin 1966): it supplies a measure of information for individual objects taken in themselves, independently of the source that produces them. In the theory of algorithmic complexity, the basic question is the ultimate compression of individual messages. The main idea that underlies the theory is that the description of some messages can be considerably compressed if they exhibit enough regularity. Many information theorists, especially computer scientists, regard algorithmic complexity as more fundamental than Shannon entropy as a measure of information (Cover and Thomas 1991: 3), to the extent that algorithmic complexity assigns an asymptotic complexity to an individual message without any recourse to the notion of probability (for a discussion of the relation between Shannon entropy and Kolmogorov complexity, see Lombardi et al. 2015b). By contrast, in the traditional communicational context, whose classical locus is Claude Shannon’s formalism (Shannon 1948; Shannon and Weaver 1949), information is primarily something that has to be transmitted between two points for communication purposes. Shannon theory is purely quantitative: it ignores any issue related to informational content: “[the] semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages.” (Shannon 1948: 379). Following Timpson’s elucidation of the notion of information, in this paper we will focus on the concept of information in the communicational context. Nevertheless, the coexistence of different technical concepts of information points towards a pluralist stance regarding information, which is not just a weakness of common-sense contexts, but rather a matter of fact even in the scientific uses of the concept (we will come back to the issue of pluralism in the conclusions of the present article).

Timpson does not begin with the distinction between semantic and non-semantic information. According to him (2013: 11), the first and most important distinction is that between the everyday notion of information and the technical concept of information, such as that derived from the work of Shannon. The everyday notion of information is intimately associated with the concepts of knowledge, language and meaning: “If something is said to contain information then this is because it provides, or may be used to provide, knowledge” (Timpson 2013: 12). Information in the everyday sense displays intentionality, it is directed towards something, it is about something. By contrast, a technical concept of information is specified by means of a mathematical and/or physical vocabulary and, prima facie, has at most limited and derivative links to semantic and epistemic concepts.

The only semantic view analyzed by Timpson is that of Fred Dretske (1981). In this context, our author says: “His distinctive claim is that a satisfactory semantic concept of information is indeed to be found in information t theory and may be achieved with a simple extension of the Shannon theory: in his view there is not a significant distinction between the technical and everyday concepts of information.” (Timpson 2013: 38, our emphasis). This quote and others − “First, the distinction between the technical notions of information deriving from information theory and the everyday semantic/epistemic concept is not sufficiently noted” (ibid.: 3, our emphasis); “Does this establish a link between the technical communication-theoretic notions of information and a semantic, everyday one?” (ibid.: 40, our emphasis) − suggest that Timpson tends to equate the semantic and the everyday views of information. This suspicion is reinforced by the fact that the everyday concept is endowed with the same features as those traditionally used to characterize semantic information: meaning, intentionality, “aboutness”. Opposing the technical concept of information to the semantic concept, identified with the everyday concept, runs the risk of depriving the semantic view of any technical status. And, in turn, this would deprive the elucidation of a technical concept of semantic information, with its links with meaning and reference, of any philosophical interest. By contrast, at present there is a well-developed field of research in the philosophy of information (see, for instance, Adriaans and van Benthem 2008, and the Web site of the Society for the Philosophy of Information) in the context of which many strongly technical views of semantic information are proposed (just to mention some of them: Dretske 1981; Barwise and Seligman 1997; Floridi 2011; for a wide and updated source of references, see Floridi 2015).

In spite of devoting several pages of his book to the everyday notion of information and its relation with knowledge (2013: 12–15), Timpson announces that, since he is concerned with classical and quantum information theories, his work addresses the technical concept of information. He also stresses from the beginning that, although there are different technical concepts of information other than Shannon’s, he will focus on the best known technical concept of information, the Shannon information, along with some closely related concepts from quantum information theory. So, let us begin by recalling the basic notions of Shannon theory.

3 Elements of Shannon theory

According to Shannon (1948; see also Shannon and Weaver 1949), a general communication system consists of five parts:

  • A source S, which generates the message to be received at the destination.

  • A transmitter T, which turns the message generated at the source into a signal to be transmitted. In the cases in which the information is encoded, coding is also implemented by this system.

  • A channel CH, that is, the medium used to transmit the signal from the transmitter to the receiver.

  • A receiver R, which reconstructs the message from the signal.

  • A destination D, which receives the message.

The source S is a system with a range of possible states s_1, …, s_n, usually called letters, whose respective probabilities of occurrence are p(s_1), …, p(s_n). S produces sequences of states, usually called messages. The entropy of the source S can be computed as

$$ H(S)=\sum_{i=1}^{n} p(s_i)\,\log\big(1/p(s_i)\big) $$
(1)

Analogously, the destination D is a system with a range of possible states d_1, …, d_m, with respective probabilities p(d_1), …, p(d_m). The entropy of the destination D can be computed as

$$ H(D)=\sum_{j=1}^{m} p(d_j)\,\log\big(1/p(d_j)\big) $$
(2)

When ‘log’ is the logarithm to the base 2, the resulting unit of measurement for H(S) and H(D) is called ‘bit’, contraction of binary unit. If the natural logarithm is used, the unit of measurement is the nat, contraction of natural unit, and in the case of the logarithm to base 10, the unit is the Hartley.
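As a numerical illustration of Eqs. (1) and (2) and of the units just mentioned, the following sketch (our own; the four-letter source and its probabilities are arbitrary examples, not taken from Shannon or Timpson) computes the entropy of a source in bits, nats and hartleys:

```python
import math

def entropy(probs, base=2):
    # H = sum_i p_i * log(1/p_i); the unit depends on the base of the logarithm:
    # base 2 gives bits, base e gives nats, base 10 gives hartleys.
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

# Hypothetical four-letter source
p_source = [0.5, 0.25, 0.125, 0.125]

print(entropy(p_source, 2))        # 1.75 bits
print(entropy(p_source, math.e))   # ~1.213 nats
print(entropy(p_source, 10))       # ~0.527 hartleys
```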

The dependence between source and destination is defined by the matrix [p(d_j/s_i)], where p(d_j/s_i) is the conditional probability of the occurrence of the state d_j at the destination D given the occurrence of the state s_i at the source S, and the elements in any row must add up to 1.

The relationship between H(S) and H(D) can be represented as follows:

[Figure a: diagram relating the entropies H(S) and H(D) through the mutual information H(S; D), the equivocity E and the noise N]

The mutual information H(S; D) measures the amount of information generated at the source S and received at the destination D:

$$ H(S;D)=H(S)-E=H(D)-N $$
(3)

E measures the amount of information generated at S but not received at D, and N measures the amount of information received at D but not generated at S. Equivocity E and noise N are measures of the dependence between source and destination and, therefore, are functions not only of S and D, but also of the conditional probabilities p(d_j/s_i). Thus, they are computed as

$$ N=\sum_{i=1}^{n} p(s_i)\sum_{j=1}^{m} p(d_j/s_i)\,\log\big(1/p(d_j/s_i)\big) $$
(4)
$$ E=\sum_{j=1}^{m} p(d_j)\sum_{i=1}^{n} p(s_i/d_j)\,\log\big(1/p(s_i/d_j)\big) $$
(5)

where p(s_i/d_j) = p(d_j/s_i)p(s_i)/p(d_j). The channel capacity C is defined as:

$$ C=\max_{p(s_i)} H(S;D) $$
(6)

where the maximum is taken over all the possible distributions p(s_i) at the source. C is the largest amount of information that can be transmitted over the communication channel CH.
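To make Eqs. (3)–(6) concrete, the following sketch (our own illustration; the binary symmetric channel with crossover probability 0.1 is an arbitrary example) computes H(S), H(D), the noise N, the equivocity E and the mutual information H(S; D) from the source probabilities and the conditional matrix [p(d_j/s_i)]:

```python
import math

def H(probs):
    # Entropy in bits of a probability distribution (Eqs. (1) and (2))
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def channel_quantities(p_s, p_d_given_s):
    n, m = len(p_s), len(p_d_given_s[0])
    # Joint and marginal distributions
    p_joint = [[p_s[i] * p_d_given_s[i][j] for j in range(m)] for i in range(n)]
    p_d = [sum(p_joint[i][j] for i in range(n)) for j in range(m)]
    HS, HD = H(p_s), H(p_d)
    # Noise N (Eq. (4)) and equivocity E (Eq. (5))
    N = sum(p_s[i] * H(p_d_given_s[i]) for i in range(n))
    E = sum(p_d[j] * H([p_joint[i][j] / p_d[j] for i in range(n)]) for j in range(m))
    # Mutual information (Eq. (3)); H(S) - E and H(D) - N coincide
    return HS, HD, N, E, HS - E

# Hypothetical binary symmetric channel with crossover probability 0.1
p_s = [0.5, 0.5]
p_d_given_s = [[0.9, 0.1],
               [0.1, 0.9]]
print(channel_quantities(p_s, p_d_given_s))
# (1.0, 1.0, ~0.469, ~0.469, ~0.531); for this symmetric channel the uniform
# p(s_i) maximizes H(S;D), so ~0.531 bits is also the capacity C of Eq. (6)
```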

In the context of Shannon theory, coding is a mapping from the alphabet A_S = {s_1, …, s_n} of letters of the source S to the set of finite-length strings of symbols from the code alphabet A_C = {c_1, …, c_q} of the transmitter T. In general, those strings, called code-words, do not have the same length: the code-word w_i, corresponding to the letter s_i, has a length l_i. This means that coding is a fixed- to variable-length mapping. Therefore, the average code-word length L can be defined as:

$$ L=\sum_{i=1}^{n} p(s_i)\,l_i $$
(7)

L indicates the compactness of the code: the lower the value of L, the more efficient the coding, that is, the fewer resources needed to encode the messages. The noiseless coding theorem (or First Shannon Theorem) proves that, for very long messages (strictly speaking, for messages of length N → ∞), there is an optimal encoding process such that the average code-word length L is as close as desired to the lower bound L_min of L:

$$ L_{\min}=\frac{H(S)}{\log q} $$
(8)

where, when H(S) is expressed in bits, log is the logarithm to base 2. When H(S) is expressed in bits and the code alphabet has two symbols (an alphabet of binary digits, q = 2), then log_2 q = log_2 2 = 1, and the noiseless coding theorem establishes the direct relation L_min = H(S) between the entropy of the source and the lower bound L_min of the average code-word length L.
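As an illustration of Eqs. (7) and (8), consider the following sketch (our own; the dyadic probabilities and the binary prefix code are deliberately chosen so that the lower bound is attained exactly):

```python
import math

def entropy_bits(probs):
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Hypothetical source with dyadic probabilities and a binary prefix code for it
p_source = [0.5, 0.25, 0.125, 0.125]
code_words = ['0', '10', '110', '111']   # w_i over the code alphabet A_C = {0, 1}
q = 2

L = sum(p * len(w) for p, w in zip(p_source, code_words))   # Eq. (7)
L_min = entropy_bits(p_source) / math.log2(q)               # Eq. (8)

print(L, L_min)   # 1.75 1.75: with dyadic probabilities the bound is reached exactly;
                  # in general, optimal coding only approaches L_min for long messages
```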

In turn, the noisy coding theorem (or Second Shannon Theorem) proves that information can be transmitted over a communication channel with an arbitrarily low probability of error as long as the communication rate is kept below the channel capacity. In other words, the channel capacity is equal to the maximum rate at which information can be sent over the channel and recovered at the destination with a vanishingly low probability of error.

4 Talking about information: quantity and pieces

Timpson (2013: 10) introduces a quote by Peter Strawson (1950: 448) as the epigraph of the second chapter of his book, entitled “2. What is information?”: “To suppose that, whenever we use a singular substantive, we are, or ought to be, using it to refer to something, is an ancient, but no longer a respectable, error.” And, immediately at the beginning of that chapter, he recalls a quote by John L. Austin (1950: 149): “For ‘truth’ itself is an abstract noun, a camel, that is, of a logical construction, which cannot get past the eye even of a grammarian. We approach it cap and categories in hand: we ask ourselves whether Truth is a substance (the Truth, the Body of Knowledge), or a quality (something like the color red, inhering in truths), or a relation (‘correspondence’). But philosophers should take something more nearly their own size to strain at. What needs discussing rather is the use, or certain uses, of the word ‘true’.” By relying on the analogy between ‘truth’ and ‘information’, Timpson takes these quotes as a point of departure to support his claim that ‘information’ is an abstract noun: “Austin’s aim was to de-mystify the concept of truth, and make it amenable to discussion, by pointing to the fact that ‘truth’ is an abstract noun. So too is ‘information’.” (Timpson 2013: 10).

Timpson recalls that very often abstract nouns arise as nominalizations of various adjectival or verbal forms. On this basis, he extends the analogy between truth and information: “Austin leads us from the substantive ‘truth’ to the adjective ‘true’. Similarly, ‘information’ is to be explained in terms of the verb ‘inform’.” (2013: 11). It is true that the meaning of the term ‘information’ is the result of a historical process of substantivation (see Adriaans 2013). But what does ‘to inform’ mean? “To inform someone is to bring them to know something (that they did not already know).” (Timpson 2013: 11). In other words, the meaning of ‘information’ is given by the operation of bringing knowledge. However, as pointed out above in Section 2, according to Timpson only the everyday concept of information has meaningful links with knowledge; thus, the analogy with truth and the transition from the verb ‘inform’ to the noun ‘information’ should only apply to the everyday concept. Therefore, this discussion about information is only a kind of motivation for the analysis in the technical domain: the reason why ‘information’, in its technical sense, is an abstract noun is not related to Strawson’s and Austin’s quotes, but is given on the basis of a further distinction, between “bits” and “pieces” of information.

In his review paper about quantum information, Timpson introduces the difference between “bits” and “pieces” in the following terms: “the notion of bits of information, quantum or classical; the amount of information t that a source produces; […] is to be contrasted with pieces of information t , what the output of a source (quantum or classical) is” (2008: 27, emphasis in the original). In his book of 2013, Timpson presents the same idea in terms of the difference between quantity of information and pieces of information (2013: 16).

On this basis, the argument for the abstractness of information runs easily. On the one hand, information qua-quantity is abstract because quantities are abstract (in the following section we will analyze what that quantity measures according to Timpson). On the other hand, the abstractness of information qua-piece relies on the philosophical distinction between types and tokens: “one should distinguish between the concrete systems that the source outputs and the type that this output instantiates.” (Timpson 2004: 22). The piece of information is not the token produced at the source, but the corresponding type; and since types are abstract, information qua-piece is abstract (we will come back to the type-token distinction in Section 6).

Although convincing at first, this distinction deserves further scrutiny. In the technical context of Shannon theory, the notion of quantity of information, as what is measured by the magnitudes introduced by the theory, is quite clear. But what about the notion of piece of information? In Timpson’s book the notion makes its first appearance in the context of everyday information: “Any statement of fact is a candidate piece of information.” (2013: 12). Nevertheless, he later claims that the Shannon theory itself includes the notion: “the Shannon theory also—and importantly—introduces its own novel concept of what pieces of (Shannon) information are. It introduces its own technical notion of what it is that is transmitted. It is a theory, then, not only of bits (amount), but of pieces (what) of information too.” (2013: 16). The problem with this claim is that it is not clear at all where the notion of information qua-piece can be found in Shannon theory.

The first point that a communication engineer has to learn in training is that Shannon information theory is a quantitative theory: in the context of the theory, information is something amenable to quantification. In particular, prior to any interpretation, information is that item whose quantity (in general, whose average quantity; we will come back to this point in the next section) is measured by the entropies as theoretically defined. On the contrary, information qua-piece is not something amenable to quantification: talking about the amount of a type makes no sense. Therefore, the notion of information qua-piece cannot be read off from Shannon theory. Moreover, the theory includes a concept that might be seen as a kind of correlate of Timpson’s piece of information: the concept of message. But, when studying information theory, we have to understand from the very beginning that messages are not information; by contrast, information, being a quantifiable item, is related in a certain way to the number of possible messages and their probabilities, so it is independent not only of the semantic content of the messages, but also of the identity of the messages themselves.

After recalling that the notion of piece of information applies in the everyday domain, where “pieces of information (e.g., the truth that it is overcast at the cricket ground before the match) are abstract, not concrete, objects” (2013: 24), the author claims that the same argument can be applied in the technical context: “If one has in mind pieces of information t , then, as these are various types, they are abstract too, just as any type is. Thus a shift from the everyday to the technical context does not involve any shift in the truth of the claim that the term ‘information’ is an abstract noun, even though in the technical Shannon case, ‘information t ’ evidently does not derive from the verb ‘inform’.” (2013: 24). One might wonder whether the initial appeal of the notion of piece of information is not due to its links to the everyday notion of information.

5 About the quantity of information

In Shannon theory, the quantities H(S) and H(D), usually called ‘entropies’, measure the amounts of information generated at the source and received at the destination, respectively. But what kind of amount? In many presentations of the theory, H(S) and H(D) are defined directly in terms of the probabilities of the states of the source and the destination. However, from a conceptual viewpoint, it makes sense to ask for the information generated at the source by the occurrence of one of its states. Moreover, since Eqs. (1) and (2) have the form of a weighted average, it also makes sense to define the individual magnitudes on which the average is computed. From this perspective, I(s_i) measures the amount of information generated at the source by the occurrence of s_i and I(d_j) measures the amount of information received at the destination by the occurrence of d_j:

$$ I(s_i)=\log\big(1/p(s_i)\big) $$
(9)
$$ I(d_j)=\log\big(1/p(d_j)\big) $$
(10)

Once I(s_i) and I(d_j) are introduced, the entropies H(S) and H(D) turn out to measure average amounts of information per letter generated by the source and received by the destination, respectively, and can be defined as (see, e.g., Lombardi 2005: 24–25; Bub 2007: 558):

$$ H(S)=\sum_{i=1}^{n} p(s_i)\,I(s_i) $$
(11)
$$ H(D)=\sum_{j=1}^{m} p(d_j)\,I(d_j) $$
(12)

In other words, only when log(1/p(s_i)) and log(1/p(d_j)) are linked to individual amounts of information can it be said that the entropies H(S) and H(D) measure average amounts of information, as is usual in the technical literature on information theory: averages can be meaningfully defined as such only in terms of individual magnitudes.
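Read in this way, Eqs. (9)–(12) can be checked numerically; the following sketch (our own, with an arbitrary three-letter source) computes the individual amounts I(s_i) and recovers H(S) as their weighted average:

```python
import math

def I(p):
    # Individual amount of information, in bits, of an event of probability p (Eq. (9))
    return math.log2(1.0 / p)

# Hypothetical source distribution
p_source = [0.7, 0.2, 0.1]

individual = [I(p) for p in p_source]       # ~[0.515, 2.322, 3.322] bits
H_S = sum(p * I(p) for p in p_source)       # Eq. (11): weighted average, ~1.157 bits
print(individual, H_S)
```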

In contrast to the traditional conception of entropies as averages, Timpson does not define the amount of information generated by a single letter of the source: “It is essential to realize that ‘information’ as a quantity in Shannon theory is not associated with individual messages, but rather characterizes the source of the messages” (Timpson 2013: 21, emphasis in the original). In the few cases in which he speaks about the information that we would gain if the state s_i were to occur (2013: 29), it is conceived as a “surprise information” associated with s_i, which only makes sense when s_i is the outcome of a single experiment considered as a member of a long sequence of experiments − where, apparently, the probabilities are conceived as frequencies.

The distinction between conceiving the entropies of the source and the destination as measuring amounts of information or average amounts of information might seem an irrelevant detail. However, this is not the case when we are interested in elucidating the very notion of information − in Shannon’s sense−. In fact, assuming the conceptual priority of H(S) over individual amounts of information allows Timpson to define the concept of information in terms of the noiseless coding theorem: “the coding theorems that introduced the classical (Shannon 1948) and quantum (Schumacher 1995) concepts of information t do not merely define measures of these quantities. They also introduce the concept of what it is that is transmitted, what it is that is measured.” (Timpson 2008: 23, emphasis in the original). In other words, Shannon information measures “the minimal amount of channel resources required to encode the output of the source in such a way that any message produced may be accurately reproduced at the destination. That is, to ask how much information t a source produces is ask to what degree is the output of the source compressible?” (Timpson 2008: 27, emphasis in the original; see also Timpson 2013: 37, 43). In the same vein, Timpson relates mutual information with the noisy coding theorem: “the primary interpretation of the mutual information t H(X:Y) was in terms of the noisy coding theorem” (2013: 43).

A first point to notice is that, as explained in Section 3, the noiseless coding theorem identifies the entropy of the source with the lower bound L_min of the average code-word length only when H(S) is expressed in the unit of measurement defined by the logarithm to base q and the code alphabet has q symbols. In the general case, the entropy of the source is only proportional to L_min (see Eq. (8)), and the constant of proportionality depends on which unit is used to express the entropy of the source and on how many symbols the code alphabet has; these two aspects are completely independent. This means that the value of the compressibility of the messages produced by the source (compressibility that, according to Timpson, is a property of the source and defines the amount of information generated by the source) does not depend only on the source but also on a feature of the transmitter. Timpson’s definitional move blurs the difference between two aspects of communication that are clearly distinguished in the traditional textbooks on information theory: the information generated at the source, which depends on the probability distribution over the states of the source and is independent of coding, and the number of symbols necessary to encode the occurrence of those states, which also depends on the alphabet used for coding.
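The dependence of L_min on the code alphabet can be seen with a trivial numerical example (ours): keeping H(S) fixed, the lower bound of Eq. (8) changes with the number q of code symbols, which is a feature of the transmitter, not of the source:

```python
import math

H_S = 1.75   # entropy of a hypothetical source, expressed in bits
for q in (2, 3, 4):
    # Eq. (8): minimum average code-word length, in code symbols per source letter
    print(q, H_S / math.log2(q))   # 2 -> 1.75, 3 -> ~1.104, 4 -> 0.875
```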

A second issue to notice here is that the strategy of defining information via the noiseless coding theorem turns the result of the theorem into a definition. In fact, now the entropy H(S) of the source is not defined by Eq. (1) as the average amount of information per letter generated by the source, but is defined by Eq. (8) as proportional to the minimum average code-word length in optimal coding. Starting from this new definition, Eq. (1) must now be obtained as the result of a mathematical proof, given by the inverse of the original noiseless coding theorem. Of course, there is no formal mistake in this strategy, but it causes a kind of uneasiness when considered from a conceptual viewpoint.

In fact, if the noiseless coding theorem says what the quantity-information is, now we know what H(S) represents. But what about H(D), which is not involved in the theorem? If the noiseless coding theorem establishes what quantity-information is, H(D) does not represent quantity-information; then, it cannot be said that it is the amount of information received at the destination. Moreover, if H(D) does not represent an amount of information, it is not clear how it can be involved together with H(S) in algebraic operations. For instance, let us consider an ideal channel where noise and equivocity are zero and, therefore, H(S; D) = H(S) = H(D) (see Eq. (3)): in this case we would have a mathematical identity between variables representing different items − since only H(S), but not H(D), represents an amount of information −, something difficult to accept in mathematized sciences.

As pointed out above, the coding theorem is proved for the case of very long messages, strictly speaking, for messages of length N → ∞. Thus, it says nothing about the relation between the information I(s_i) generated at the source by the occurrence of the state s_i and the length of the binary sequence used to encode it. Therefore, if the noiseless coding theorem embodies the very meaning of information, I(s_i) is deprived of its meaning as an individual amount of information, and H(S) cannot be conceived as an average. In other words, when H(S) is defined by Eq. (1), we are free to decide whether to interpret it as an average or not and, with this, whether to admit that the I(s_i) are individual quantities of information. But Timpson’s strategy of defining information in terms of the noiseless coding theorem leaves us with a single possibility: the I(s_i) cannot be conceived as individual quantities of information and H(S) cannot be conceived as an average amount, in contrast to the usual stance in the literature on information theory. From this perspective, when Ralph Hartley − whose work is explicitly acknowledged by Shannon as one of the bases of his proposal − states that, in the case of equiprobable alternatives, “[t]he information associated with a single selection is the logarithm of the number of symbols available” (Hartley 1928: 541), he is simply wrong. Not only that, but one might also wonder whether short binary messages can be conceived as carrying a quantity of information at all, given that they are not covered by the noiseless coding theorem.

In turn, if the concept of information qua-quantity is defined through the noiseless coding theorem, it acquires content in the case of ideal coding. But, then, what happens in the case of non-ideal coding? Can we still say that the same amount of information can be better or worse encoded? Somebody might argue that the answer to this question is ‘yes’: the source produces a certain amount of information, as characterized by its behavior in the ideal case; then actual coding schemes can be measured against the ideal case. However, this view embodies a conceptual difficulty. As Timpson repeatedly stresses, information qua-quantity is a property of the source itself, in particular, the compressibility of its messages in the ideal case. But if this is strictly the case, it is not clear how this time-independent and intrinsic property can be “produced” by the source, since time-independent and intrinsic properties are possessed by objects but not produced by them. By contrast, it should be said that what is produced by the source are its messages, not its properties. Moreover, it is not clear how that property of the source can be later subjected to a further process of coding independent of the source itself, a process which, in addition, can be non-ideal, in contrast to the ideal coding used in the definition of that property. These dissonances are easily removed when we consider, as in the technical presentations of the theory (see, e.g., Cover and Thomas 1991), that the amount of information produced by the source is defined by the features of the source itself (the probabilities of its states), and that that information is later encoded: the noiseless coding theorem says how this previously defined quantity of information can be ideally encoded; but, since it is previously defined, it can also be non-ideally encoded.

The strategy of defining the amounts of information involved in Shannon theory in terms of the Shannon coding theorems seems to suggest that coding is a feature essential to communication. However, when explaining the elements of the general communication system, Shannon (1948: 381) characterizes the transmitter as a system that operates on the message coming from the source in some way to produce a signal suitable for transmission over the channel. He also stresses that, in many cases, such as in telegraphy, the transmitter is also responsible for encoding the source messages. This means that, as any communication engineer knows, in certain cases the message is not encoded; for instance, in traditional telephony the transmitter operates as a mere transducer, by changing sound pressure into a proportional electrical current. If one insisted on defining information qua-quantity in terms of the noiseless coding theorem, the entropy of the source qua-quantity would turn out to be defined in terms of something that is not essential to it: coding. Analogously, mutual information can be defined as the information generated at the source and received at the destination without reference to the capacity of the channel (see Eqs. (3), (4) and (5)), which, in turn, can be defined in terms of the mutual information as usual (see Eq. (6)). Moreover, mutual information needs neither coding nor noise to have a definite value. If, by contrast, we claim that the meaning of the mutual information is given by the noisy coding theorem, mutual information qua-quantity would turn out to be defined in terms of factors that are not essential to it: coding and noise. In both cases the definition would not express an essential feature of the definiendum.

Of course, with sufficient effort and perseverance, each one of the arguments against identifying quantity of information with compressibility can be answered with an ad hoc counterargument. However, it is not clear why one should undertake such a difficult task. In particular, it is not clear what the technical or the philosophical advantage is of defining information qua-quantity in terms of the coding theorems, instead of following the most usual strategy adopted in the traditional technical literature on information theory: the − average − amount of information produced by the source is defined by the features of the source itself (the probabilities of its states), and is independent of coding − even of whether the messages are encoded at all −; when that information is later encoded, the noiseless coding theorem says how this previously defined quantity of information can be ideally encoded. This widespread view removes all the difficulties pointed out in this section in a single move, without particular and unnecessary further arguments.

6 Information qua-piece: the deflationary interpretation

In Section 4 we have introduced Timpson’s distinction between quantity of information and pieces of information, and we have challenged the notion of information qua-piece by pointing out that it is not clear at all where that notion can be found in Shannon theory. Nevertheless, somebody might claim that, even if the notion plays no technical role in Shannon theory, it serves to supply a philosophical elucidation of what counts as success in communication. In the present section it will be shown that the notion of piece of information distorts the definition of the success of communication in the technical domain.

According to Timpson, in communication, when the source of information produces a message, what we want to transmit is not the sequence of the states itself: “one should distinguish between the concrete systems that the source outputs and the type that this output instantiates.” (Timpson 2004: 22; see also Timpson 2008). The goal of communication, then, is to reproduce at the destination another token of the same type: “What will be required at the end of the communication protocol is either that another token of this type actually be produced at a distant point (as a consequence of the production of the initial token); or at least that it be possible to produce it there (as a consequence of the initial production) by a standard procedure.” (Timpson 2013: 23, emphasis in the original; see also Timpson 2008: 25).

Although very convincing at first sight, the argument deserves to be examined in detail. Is it true that the goal of communication (in the context of Shannon theory) is to reproduce at the destination a token of the same type as that produced at the source? As Shannon stresses, in communication, “[t]he significant aspect is that the actual message is one selected from a set of possible messages.” (1948: 379, emphasis in the original). The states d_j of the destination system D can be any kind of states, completely different from the states s_i of the source system S: the goal of communication is to identify at the destination which sequence of states s_i was produced by the source. Timpson explains that “if the source X produces a string of letters like the following: x_2, x_1, x_3, x_1, x_4, …, x_2, x_1, x_7, x_1, x_4, say, then the type is the sequence ‘x_2, x_1, x_3, x_1, x_4, …, x_2, x_1, x_7, x_1, x_4’; we might name this ‘sequence 17’. The aim is to produce at the receiving end of the communication channel another token of this type. What has been transmitted, though, the information transmitted on this run of the protocol, is sequence 17.” (2004: 21–22). But this is not the case: what has been transmitted is not sequence 17, but the fact that a particular string was the actual message selected from the set of the possible messages of the source. Indeed, the occurrence in D of another token of the type ‘x_2, x_1, x_3, x_1, x_4, …, x_2, x_1, x_7, x_1, x_4’ is not necessary to identify which sequence occurred at S: the particular string produced at S can be identified by means of the occurrence in D of a sequence d_7, d_4, d_3, d_4, d_5, …, d_7, d_4, d_1, d_4, d_5, such that each state of the source is correlated with one state of the destination; in this particular case, x_1 → d_4, x_2 → d_7, x_3 → d_3, x_4 → d_5 and x_7 → d_1. In what sense is the sequence d_7, d_4, d_3, d_4, d_5, …, d_7, d_4, d_1, d_4, d_5 a token of the type ‘x_2, x_1, x_3, x_1, x_4, …, x_2, x_1, x_7, x_1, x_4’? Moreover, this is a deterministic situation, without equivocity and without noise, characterized by a one-to-one mapping from the set of letters that characterize the source to the set of letters that characterize the destination. But the situation may be less simple. In a noisy case with no equivocity, the mapping is one-to-many (see, e.g., Cover and Thomas 1991: 184–185); for instance, in the above example, the mapping might have been x_1 → d_4, d_2, x_2 → d_7, d_6, x_3 → d_3, x_4 → d_5 and x_7 → d_1. Nevertheless, given any state of the destination, the state that occurred at the source can be univocally identified. In this case, the message produced by the source might be identified by means of either of the two sequences d_7, d_4, d_3, d_4, d_5, …, d_7, d_4, d_1, d_4, d_5 and d_7, d_2, d_3, d_4, d_5, …, d_6, d_4, d_1, d_4, d_5. Again, in what sense are these two sequences tokens of the type ‘x_2, x_1, x_3, x_1, x_4, …, x_2, x_1, x_7, x_1, x_4’?

Summing up, the goal of communication consists in identifying at the destination the state produced at the source. The success criterion is given by a one-to-one or a one-to-many mapping from the set of letters of the source to the set of letters of the destination. Since this mapping is arbitrary, the states of the source and the states of the destination may be of a completely different nature: the source may be a die and the destination a panel of lights; or the source may be a device that produces words in English and the destination a device that operates a machine. It is difficult to say in what sense a face of a die and a light on a panel are tokens of the same type. Admitting arbitrary functions as defining the relation of being tokens of the same type leads to admitting that any two things arbitrarily chosen can always be conceived as tokens of the same type and, thus, trivializes the type-token distinction (see Wetzel 2011).
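The point can be pictured with a minimal sketch (our own; the letter names and the one-to-many mapping are hypothetical, echoing the example above): the destination identifies which source letter occurred, even though the destination states are of a different kind and no token of the source type is reproduced:

```python
# One-to-many mapping from source letters to destination states (noise, no equivocity)
channel = {
    'x1': ['d4', 'd2'],
    'x2': ['d7', 'd6'],
    'x3': ['d3'],
    'x4': ['d5'],
    'x7': ['d1'],
}

# Since no destination state is shared by two source letters, every received
# state identifies exactly one source letter: communication is successful.
decode = {d: s for s, ds in channel.items() for d in ds}

received = ['d7', 'd2', 'd3', 'd4', 'd5']      # one possible received sequence
print([decode[d] for d in received])           # ['x2', 'x1', 'x3', 'x1', 'x4']
```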

Somebody who seems to suspect that there is something odd in Timpson’s argument is Armond Duwell. After publishing an article arguing that quantum information is not different from classical information (Duwell 2003), Duwell changes his mind under the influence of Timpson’s works. So, in a later article he also takes into account the distinction between types and tokens. Nevertheless, he correctly acknowledges that: “To describe the success criterion of Shannon theory as being the reproduction of the tokens produced at the information source at the destination is unacceptable because it lacks the precision required of a success criterion.” (Duwell 2008: 199). The reasons are several. First, any token is a token of many different types simultaneously; so the type-token argument leaves undetermined the supposedly transmitted type (ibid.: 199). Moreover, in Shannon theory the success criterion is given by an arbitrary one-to-one mapping from the set of the letters of the source to the set of the letters of the destination (ibid.: 200). Duwell also notes that the Shannon entropy associated with a source can change due to a change in the probability distribution describing the source, without any change in the types of which the source produces tokens (ibid.: 202). Furthermore, the types of which a source produces tokens can change without the Shannon entropy of the source changing (ibid.: 203).

We might suppose that all these correct observations are sufficient to convince Duwell that the success of communication in Shannon theory cannot be characterized in terms of the type-token distinction. However, this is not the conclusion he draws. In particular, Duwell considers that the mapping that determines the success criterion in Shannon theory is a one-to-one mapping that “establishes an identity between the symbols that characterize the source and destination […]. In other words, this function establishes the appropriate conditions for token instantiation of the type that the information source produced tokens of.” (Duwell 2008: 200). But, as stressed above, since the mapping is completely arbitrary, there is no constraint on the way in which the states of source and destination are correlated, and this trivializes the type-token distinction. Moreover, as explained above, the mapping does not need to be one-to-one, but may be one-to-many.

In a further argument, Duwell distinguishes the success of communication − to identify at the destination the state generated at the source − from the goal of communication, which “is to produce, at the destination, a token of the type produced by the information source. For example, if the information source produces a sequence of letters, the destination ought to produce the same sequence of letters.” (Duwell 2008: 199). In this way, he sustains Timpson’s proposal at the cost of introducing a notion, the goal of communication, that is absent from Shannon’s original theory to the extent that it is not necessary for the success of communication.

The philosophical distinction between types and tokens, although not confined to logic and philosophy of language, finds its paradigmatic example in the difference between a sentence and its concrete utterances. This is a difference we learn when studying logico-semantic topics, in order to avoid the confusion between the sentence, with its semantic content, and its concrete instances. Of course, when Timpson introduces the idea of type-information, he is not endowing types with meaning. However, a type needs to have some content in order to identify its tokens: the distinction between types and tokens is not merely formal or syntactic; being tokens of the same type is not an arbitrary relation. By contrast, Shannon information is neutral with respect to any content, since the only relevant issue is the selection of a message among many. It seems that, although Timpson explicitly keeps his distance from endowing information with any semantic content, when he introduces the notion of piece of information certain semantic notions creep into his argumentation, in such a way that his concept of information turns out to acquire a sort of content completely alien to Shannon’s original proposal.

The idea that ‘information’ is an abstract noun, justified on the basis of the type-token distinction, has had a great impact on the philosophy of physics community since the publication of Timpson’s thesis (2004). However, Timpson seems to have perceived the need for clarification because, almost ten years later, he came back to the point. Although strongly based on his thesis, Timpson’s book (2013) adds a detailed discussion of the type-token distinction (2013: 17–20), which begins with the traditional Peircean difference between sentence-type (abstract) and sentence-token (concrete). But immediately the type-token distinction is generalized in terms of sameness of pattern or structure: “the distinction may be generalized. The basic idea is of a pattern or structure: something which can be repeatedly realized in different instances” (2013: 18). However, this new move is not free of difficulties.

First, sameness of pattern or structure is a purely formal relation, which cannot be simply identified with the philosophical relation between tokens of the same type, as argued above. Now Timpson is closer to a purely formal characterization of Shannon information, in which the only relevant notion of information is the information qua-quantity, but farther away from his original argumentation in terms of pieces of information and the type-token distinction.

But the main difficulty is technical: the idea of sameness of structure as characterizing the goal of communication could be defended only by forgetting the possibility of noisy situations, that is, only if the states of the source and the states of the destination were always linked through a one-to-one mapping. However, as stressed above, this is not the case: communication can be successful even in noisy cases, with one-to-many mappings linking the states of the source and the states of the destination. It seems that, when defining the goal of communication, Timpson, like Duwell before him, does not take into account the possibility of noisy situations, which are, however, the cases of real interest in the practice of communication engineering.

This abstract-noun deflationary interpretation of information allows Timpson to dissolve the problems related to communication based on entanglement. In particular, he cuts the Gordian knot of teleportation: if ‘information’ is an abstract noun, the question about how information “travels” from source to destination in teleportation makes no sense (Timpson 2006). The point stressed in this section is that appealing to the notion of piece of information and to the philosophical distinction between types and tokens is not necessary for supporting the abstract nature of information. In fact, for this purpose it is sufficient to notice that information in Shannon theory is completely formal and, therefore, even more abstract than types. But, in Timpson’s general argumentation, the abstract nature of information is the cornerstone of his claim that information is not physical. Therefore, it seems that, from a different argumentative line, we should arrive at the same conclusion. However, we will see in the next section that the matter is not so simple.

7 Why is information not physical?

According to Timpson, in the transmission of a piece of information, what is transmitted is a type sequence, and “types are abstracta. They are not themselves part of the contents of the material world, nor do they have a spatio-temporal location.” (Timpson 2008: 27, emphasis in the original). Since ‘information’, in its meaning as piece of information, is an abstract noun, “it doesn’t serve to refer to a material thing or substance.” (Timpson 2004: 20). Therefore, “one should not understand the transmission of information on the model of transporting potatoes, or butter, say, or piping water.” (2008: 31): a piece of information is not a substance or a kind of stuff (see 2004: 34, 2008: 28, 2013: 34–36). But, according to Timpson, information is not only not a substance; it is not a physical item either. For him, the slogan ‘Information is physical’, applied to the technical concept of information, “simply involves a category mistake. Pieces of information t , quantum or classical, are abstract types. They are not physical, it is rather their tokens which are.” (Timpson 2013: 69, emphasis in the original). Therefore, the slogan does not embody an ontological lesson but rather a logical confusion, “a confusion of token and type.” (ibid.: 69).

As argued in the previous sections, the notion of information qua-piece cannot be found in Shannon theory, and would play no technical role if added to it. In spite of Timpson’s efforts to distinguish between the everyday and the technical notions of information, some everyday assumptions implicitly and unintentionally seep into his argumentation. The style of Timpson’s argumentation is typical of a certain traditional analytic philosophy: philosophical conclusions are drawn from the analysis of ordinary, everyday language. This style reappears when considering whether information is physical or not: the physical world is what ordinary language talks about and, consequently, we discover the world’s structure by analyzing the grammar of that language. For this reason, the grammatical fact that a noun is abstract expresses the non-existence of its referent as a concrete item in the physical world. It is true that Timpson distinguishes between the everyday notion and the technical notion of information. Nevertheless, in both cases the strategy is the same: to analyze the grammatical role played by the word ‘information’ in non-formal language, and to draw ontological conclusions from that analysis. However, physicists do not appeal to that strategy to decide what a physical item is when they say, as Rolf Landauer (1991, 1996) does, that information is physical. If one does not want to turn the structure of non-formal languages into the key witness about what exists and does not exist in the physical world, a more reasonable strategy seems to be to admit that physics supplies us with the best tools to know what the physical world is. Therefore, in order to decide whether or not a certain item belongs to the physical world, we should see what role it plays in physical science. But for this we have to put aside the technically dubious notion of information qua-piece, and ask for the reference of information qua-quantity.

The first point to notice here is that using the term ‘quantity of information’ still says nothing about the item referred to by the term, other than that such an item is measurable. When we are interested in the interpretation of the concept of information as used in Shannon theory, we need to decide about the ontological category of the item whose amount (or average amount) is measured by the entropies, the equivocity and the noise in Shannon theory. This is a legitimate question, whose answer should not be sought in the structure of natural language nor in a priori logico-ontological assumptions, but in the practice of science.

According to Timpson, information as a quantity is clearly a property: “If one has in mind the Shannon information t as a quantity − the compressibility of a source − then we certainly have in mind an abstract item, not a concrete one, just as any property must be abstract.” (2013: 24); “What had gone wrong was thinking of what is in fact a property—the information t of the source—as a kind of object (physical substance or stuff). […] (quantitative) information t is a property rather than an object” (2013: 36). In other words, the ontological category of the item whose amount is measured by H(S) (note that Timpson never addresses the interpretation of mutual information H(S; D), entropy of the destination H(D), noise N and equivocity E) is that of property. In particular, such an item is a property of the source: the compressibility of the messages produced by the source. This view, although it justifies the abstractness of information qua-quantity, logically depends on defining H(S) as the compressibility of the source via the noiseless coding theorem. But, as argued in Section 5, this definition leads to several difficulties that, at least, leave open the possibility of a different interpretation of the entropy of the source and, derivatively, of the other information quantities involved in Shannon theory.

But leaving aside those difficulties, that definition makes the item whose amount is measured by H(S) abstract, since properties are abstract. But what about its physical character? Timpson’s answer is that the claim ‘information is physical’ “would seem to be that some physically defined quantity (information t ) is physical; and that is hardly an earth-shattering revelation.” (2013: 68). However, the question about whether information is a physical item or not, far from being trivial, leads to an interesting philosophical discussion.

The first question is why information qua-quantity is “physically defined”, as Timpson claims. If the amount of information measured by H(S) is defined by the noiseless coding theorem, it is defined by logical-mathematical arguments: no physical theory is involved in that definition. There seems to be a substantial difference between the compressibility of a source and the mass of a particle regarding their physical nature. Therefore, one might suppose that, when Landauer and others claim ‘information is physical,’ they are not imagining a kind of stuff flowing through space, but alluding to a physical property analogous, as regards its physicality, to the mass of a particle in classical mechanics or the charge of a particle in classical electromagnetism. In this sense, information would be abstract, because it is a property, but physical in a way that cannot easily be applied to information qua-quantity defined as compressibility. Moreover, from this physical perspective, the picture of the “flow” of information might make a certain sense. A traditional assumption in physics and engineering is that the transmission of information between two points of physical space necessarily requires an information-bearing signal, that is, a physical process propagating from one point to the other. If information is a physical property, it must be a property of a physical signal that links transmitter and receiver; then, even if properties do not “flow”, there is a propagation of the carrier of information qua-physical property. It would be interesting to analyze the literature and the practice of physics and engineering to know to what extent this is the idea behind the successful manipulation of information in technical contexts, and to explore the limitations of that view.

From a philosophical perspective, it is well known that physics, far from being a static body of knowledge, changes substantially through history. In this process, concepts undergo deep mutations that modify the worldview described by physics. Let us consider, for instance, the concept of a wave, which begins by referring to a property of a physical medium: a wave is nothing else than an abstract description of how a material medium changes its properties in space and/or in time. In this sense, the concept of a wave belongs to the category of property: there are no waves without a material medium that carries them. On the other hand, the concept of field derives from the force exerted by a certain body on a test particle due to a particular interaction. However, with the development of physics certain waves, like electromagnetic waves, become something that does not need a further underlying physical medium to exist. Moreover, in classical electromagnetism, the central concept turned out to be that of electromagnetic field, which lost its reference to test particles. Although at present the precise ontological status of a field is still under debate, it is usually agreed that a field is something that changes in a wave-like way but exists by itself, with no need of further underlying physical substratum, and that has its own properties and its specific physical description (for a historical account of this transformation, see Berkson 1974).

The examples of waves and fields show that, in certain cases, physics, in its evolution, tends to perform a substantialization of certain concepts (see footnote 10): from originally being conceived as properties, certain items turn into substances, not in the sense of becoming kinds of stuff, referents of mass nouns, but in the Aristotelian philosophical sense (“primary substance” in the Categories) of being objects of predication but not predicable of anything else, and of being bearers of properties (see Robinson 2014). One might wonder whether the − technical − concept of information is undergoing a mutation analogous to that experienced by the concepts of wave and of field, and is beginning to be conceived as a physical magnitude that exists by itself, without the need of a material carrier supporting it.

A concept that immediately comes to mind when thinking about a physical interpretation of information is that of energy, since energy also seems to be something “abstract” and non-material, at least when compared to, say, a molecule. Timpson considers the analogy between information and energy, and assumes that ‘energy’ is akin to a property name (2004: 20) (again, grammar playing a central role in ontological discussions): energy is a property, “it is not something which, properly speaking, has a spatio-temporal location at all, so it is not something which—in strict sense—moves around. Thus by talk of the flow of energy, what we have in mind is certain kinds of changes in the energies possessed by things having spatial locations: the energies of various located items can change over time.” (Timpson 2013: 36). The question here is: who are those “we” who have in mind that view of energy? Of course, that view is not wrong, but perhaps it is not the only correct one: the theoretical and experimental practice of physics might show that there are other pragmatically successful ways of conceiving energy. In order to know how energy is conceived in the practice of physics, it is necessary to take that practice into account. And, on this basis, philosophical discussion may enrich the understanding of the concept.

In the context of the analogy between information and energy, Timpson asks whether information is “adventitious”, that is, added from without, from the perspective of the pragmatic interest of an agent: “Is it a fundamental one? […] Or is it an adventitious one: of the nature of an addition from without; an addition from the parochial perspective of an agent wishing to treat some system information-theoretically, for whatever reason?” (Timpson 2008: 46–47, emphasis in the original). The comparison with energy is relevant also with respect to this question. In fact, in the context of strict Newtonian mechanics, the concept of energy is subsidiary to the dynamical description of a system; in Timpson’s terms, it is an adventitious concept designed to measure the capacity of a system to perform a certain task − work −. However, in the framework of physics as a whole, it gradually acquired its own, not merely adventitious, reference, becoming one of the fundamental physical concepts. The words of William Thomson in the nineteenth century already express this transformation clearly: “The very name energy, though first used in its present sense by Dr. Thomas Young about the beginning of this century, has only come into use practically after the doctrine which defines it had […] been raised from a mere formula of mathematical dynamics to the position it now holds of a principle pervading all nature and guiding the investigator in every field of science” (Thomson 1881: 475). At present, the word ‘energy’ does not refer to something concrete: if a perturbation in a physical medium is transmitted between two points of space, nothing material is transmitted; nevertheless, physics describes the phenomenon as a transference of energy between those points. And although in many cases the word ‘energy’ is still used as a property name, in many others energy has acquired a substantial nature − in the Aristotelian sense − that plays a central unifying role in physics: energy is an item essentially present in absolutely all contemporary physical theories; it is conceived as something that can be generated, accumulated, stored, processed, converted from one form to another, and transmitted from one place to another.

In his insistence on depriving information of any relevant physical nature, Timpson says that “Quantum information_t theory and quantum computation are theories about what we can do using physical systems” (Timpson 2013: 69, emphasis in the original). Pursuing the analogy with energy, one can say that the concept of energy also began as a tool to describe what we can do with material systems. However, its status gradually changed with the historical development of physics: now energy is an undoubtedly physical item existing in the physical world, which, although non-material, plays an essential role in the physical sciences. In the light of the strong presence of the concept of information in present-day physics, several authors (Stonier 1990, 1996; Rovelli, personal communication) consider that the concept of information is following a historical trajectory analogous to that followed by the concept of energy in the nineteenth century.

Summing up, it is quite clear that the world described by contemporary physics is not a world of material individuals and stuffs, with properties applying to them. This traditional ontology was superseded by very peculiar ontological pictures, completely alien to that view. For instance, in the world of quantum field theory, particles lose any classical feature and fields become substantial items: philosophical discussions revolve around whether particles or fields hold ontological priority (see, e.g., Kuhlmann 2010). In a general relativistic universe, energy acquires a sort of “materiality” and space-time is no longer a neutral container of material things; it has been claimed that perhaps the space-time of general relativity fits neither traditional relationalism nor traditional substantivalism (Earman 1989: 208). These discussions are philosophically interesting when one admits that it is physics, and not grammar, that provides the best clue for discovering the content of the physical world. It does not matter what kinds of words are used to refer to properties, such as charge and mass, or to name items that tended toward substantialization through the history of science, such as fields and energy. All that matters is that those items inhabit the world of physics, that is, that according to physics they are part of the furniture of the world. And this implies that contemporary physics offers no grounds to deny the possibility of a non-trivial and meaningful physical interpretation of the concept of information.

8 Conclusion: the many faces of information

Timpson considers that there is a single correct interpretation of the technical concept of information and, for this reason, he devotes a great effort to elucidating it. However, this “monist” position contrasts with the “pluralist” perspective adopted by other authors (Lombardi 2004; for a detailed argumentation see Lombardi et al. 2015a; see also Floridi 2011), which follows a present-day trend in technical books on the matter: information theory is introduced from a formal perspective, with no mention of transmitters, receivers or signals, and its basic concepts are explained in terms of random variables and probability distributions over their possible values (Cover and Thomas 1991). According to this position (see also Khinchin 1957; Reza 1961), the concept of Shannon information is purely formal and belongs to a mathematical theory. Thus, the word ‘information’ does not belong to the language of the empirical sciences: it has no extralinguistic reference in itself, and from this fact derives the generality of the concept. As a consequence, the relationship between the word ‘information’ − in Shannon’s formal context − and the different views about the nature of information is the logical relationship between a mathematical term and its interpretations, each of which endows it with a specific referential content. This pluralism is a matter of fact even in the scientific uses of the concept; therefore, deflationism runs the risk of becoming a kind of “conflationism” of different technical uses that need to be distinguished.
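
A minimal sketch of such a purely formal presentation, in the style of the textbooks just cited, may help to fix ideas: given a random variable X taking values x with probability distribution p(x), and a second random variable Y, one defines

\[
H(X) = -\sum_{x} p(x)\,\log_2 p(x), \qquad I(X;Y) = H(X) - H(X \mid Y),
\]

where nothing in these definitions determines what X and Y refer to: each interpretation of the formalism supplies that referential content.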

From this pluralist perspective, the epistemic view of information is one of those interpretations. According to it, information provides knowledge: it modifies the state of knowledge of those who receive it. The epistemic interpretation may be applied in different technical domains, for example, in the attempts to ground a theory of knowledge on informational bases (Dretske 1981), or in psychology and the cognitive sciences to conceptualize the human ability to acquire knowledge (Hoel et al. 2013).

A different interpretation is the physical view, which turns information into a physical magnitude. This is the position of many physicists (Stonier 1990, 1996; Landauer 1991, 1996; Rovelli 1996) and most engineers, for whom the essential feature of information consists in its capacity to be generated at one point of physical space and transmitted to another point; it can also be accumulated, stored and converted from one form to another. This interpretation is appropriate for communication theory, in which the main problem consists in optimizing the transmission of information by means of physical bearers whose energy and bandwidth are constrained by technological and economic limitations. And in the physics domain, the attempts to reconstruct an objectively interpreted quantum mechanics on the basis of informational constraints (e.g., Clifton et al. 2003) find conceptual support in the physical interpretation.
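
A standard result of communication theory, the Shannon–Hartley formula, illustrates why the physical bearer matters here: for an additive white Gaussian noise channel of bandwidth B, with signal power P and noise power N, the capacity is

\[
C = B \log_2\!\left(1 + \frac{P}{N}\right) \ \text{bits per second},
\]

so the constraints on the physical carrier (available bandwidth and power) directly bound the rate at which information can be transmitted.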

A traditional physical context in which the formal concept of Shannon information acquires a physical content is statistical mechanics. Depending on how the probabilities involved in its definition are endowed with reference, Shannon information can be interpreted as Boltzmann entropy or as Gibbs entropy (see Lombardi et al. 2015b). Although Gibbs entropy is sometimes viewed as a generalization of Boltzmann entropy to the case in which microstates are not equiprobable, such a view hides the deep difference between the Boltzmann and the Gibbs approaches, which leads even to different concepts of equilibrium and irreversibility (see Lombardi and Labarca 2005; Frigg 2008). This means that not even in statistical mechanics does the formal concept of Shannon information have a single interpretation.
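
The formal relationship behind these two readings can be made explicit by recalling the standard expressions (given here only as an illustration): for a macrostate compatible with W equiprobable microstates, and for a probability distribution p_i over microstates,

\[
S_B = k_B \ln W, \qquad S_G = -k_B \sum_i p_i \ln p_i ,
\]

so the Gibbs entropy has the same form as Shannon’s H (up to the constant k_B and the base of the logarithm) and reduces to the Boltzmann entropy when p_i = 1/W; nevertheless, which probabilities are inserted, and what they refer to, differs between the two approaches.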

The discussion of the many faces of the concept of information is beyond the scope of the present paper. However, it is worth recalling Shannon’s words: “The word ‘information’ has been given different meanings by various writers in the general field of information theory. […] It is hardly to be expected that a single concept of information would satisfactorily account for the numerous possible applications of this general field.” (Shannon 1993: 180).