1 Introduction

In his book about the geometry of Information Retrieval (IR), Rijsbergen writes in the prologue [30]:

Well imagine the world in IR before keywords or index terms. A document, then, was not simply a set of words, it was much more: it was a set of ideas, a set of concepts, a story, etc., in other words a very abstract object. It is an accident of history that a representation of a document is so directly related to the text in it. If IR had started with documents that were images then such a dictionary kind of representation would not have arisen immediately. So let us begin by leaving the representation of a document unspecified. That does not mean that there will be none, it simply means it will not be defined in advance. […] a document is a kind of fictive object. Strangely enough Schrödinger […] in his conception of the state-vector for QM envisaged it in the same way. He thought of the state-vector as an object encapsulating all the possible results of potential measurements. Let me quote: ‘It (ψ-function) is now the means for predicting probability of measurement results. In it is embodied the momentarily attained sum of theoretically based future expectation, somewhat as laid down in a catalogue.’ Thus a state-vector representing a document may be viewed the same way – it is an object that encapsulates the answers to all possible queries.

In the present chapter, we adopt that part of Rijsbergen’s perspective that emphasizes the importance of distinguishing a corpus of written documents, like the pages forming the World Wide Web, made of actual (printed or printable) webpages, from the meaning (conceptual) entity associated with it, which in the case of the Web we simply call the ‘Quantum Web’ (in short, the ‘QWeb’), because its modeling requires the use of notions derived from quantum theory, as we are going to discuss. This requirement is not at all accidental, and we are going to consider this crucial aspect too. Indeed, a strong analogy has been established between the operational-realistic description of a physical entity, interacting with a measurement apparatus, and the operational-realistic description of a conceptual entity, interacting with a mind-like cognitive entity (see [13] and the references therein). In that respect, in a recent interpretation of quantum theory, the non-classical behavior of quantum micro-entities, like electrons and photons, is precisely explained as being due to the fact that their fundamental nature is conceptual, instead of objectual (see [14] and the references therein). Considering the success of the quantum formalism in modeling and explaining data collected in cognitive experiments with human participants, it is then natural to assume that a similar approach can be proposed, mutatis mutandis, to capture the information content of large corpora of written documents, as it is clear that such content is precisely what is revealed when human minds interact with said documents, in a cognitive way.

What we will describe is of course relevant for Information Retrieval (IR), i.e., [27]: “the complex of activities performed by a computer system so as to retrieve from a collection of documents all and only the documents which contain information relevant to the user’s information need.” Although the term “information” is customarily used in this ambit, it is clear that the retrieval is about relevant information, that is, meaningful information, so that, in the first place, IR is really about Meaning Retrieval. More specifically, similarly to a quantum measurement, an IR process is an interrogative context where a user enters a so-called query into the system. Indeed, on a pragmatic level, a query works as an interrogation, where the system is asked to provide documents whose meaning is strongly connected to the meaning conveyed by the query, usually consisting of a word or sequence of words. In fact, a search engine does not provide just a single document as an outcome, but an entire collection of documents. If the numerical values that are calculated to obtain the ranking are considered to be a measure of the outcome probabilities of the different documents, the analogy then consists in considering the action of a search engine to be similar to that of an experimenter performing a large number of measurements, all with the same initial condition (specified by the query), and then presenting the obtained results in an ordered way, according to their relative frequencies of appearance. Of course, the analogy is not perfect, as today’s search engines, when they look for the similarities between the words in the query and the documents, only use deterministic processes in their evaluations. But we can certainly think of the deterministic functioning of today’s search engines as a provisional stage in the development of more advanced searching strategies, which in the future will also exploit non-deterministic processes, i.e., probabilistic rankings (see [1], for an example where the introduction of some level of randomness, by means of probabilities that reflect the relative weights of the parts involved in a decision process, is able to offer a more balanced way to reach a meaningful outcome; see also [28], for an explanation of how indeterminism, in measurement situations, can increase our discriminative power).

It is important to say, however, that our focus here is primarily on ‘the meaning that is associated with a collection of documents’ and not on the exploration of more specific properties like ‘relevance’ and ‘information need’, which are more typically considered in IR. For the time being, our task is that of trying to find a way of modeling meaning content in a consistent way, and not yet that of considering the interplay between notions like ‘relevance’ and ‘content’, or ‘information need’ and ‘user’s request’ [27]. Our belief is that the adoption of a more fundamental approach, in the general modeling of meaning, will help us in the future to also address in new and more effective ways those more specific properties and their relationships.

Before entering into the description of our quantum approach, its motivations and foundations, it is useful to provide a definition of the terms “meaning” and “concept,” which we use extensively. By the term “meaning,” we usually refer to that content of a word, and more generally of any means of communication or expression, that can be conveyed in terms of concepts, notions, information, importance, values, etc. Meaning is also what different ‘meaning entities’, like concepts, can share, and when this happens they become connected, and more precisely ‘connected through meaning’. By the term “concept,” we usually mean a well-defined and ideally formed thought, expressible and usable at different levels, like the intuitive, logical, and practical ones. Concepts are therefore paradigmatic examples of ‘meaning entities’, used as inputs or obtained as outputs of cognitive activities, for instance, aimed at grasping and defining the essence of situations, decisions, reasoning, objects, physical entities, cultural artifacts, etc. Concepts are what minds (cognitive entities) are able to intend and understand, what they are sensitive to, and can respond to. They are what is created and discovered as the result of a cognitive activity, like study, meditation, observation, reasoning, etc. And more specifically, concepts are what minds use to make sense of their experiences of the world, allowing them, in particular, to classify situations, interpret them (particularly when they are new), connect them to previous or future ones, etc.

An important aspect is that concepts, like physical entities, can be in different states. For instance, the concept Fruits (Footnote 1), when considered in the context of itself, can be said to be in a very neutral or primitive meaning-state, which can be metaphorically referred to as its ‘ground state’. But concepts can also be combined with other concepts, and when this is done their meaning changes, i.e., they enter into different contextual states. For instance, the combination Sugary fruits can be metaphorically interpreted as the concept Fruits in an ‘excited state’, because of the context provided by the Sugary concept. But of course, it can also be interpreted as an excited state of the concept Sugary, because of the context provided by the Fruits concept.

An important notion when dealing with meaning entities like human concepts is that of abstractness, and its complementary notion of concreteness. For instance, certain concepts, like Table, Chair, and House, are considered to be relatively concrete, whereas other concepts, like Joy, Entity, and Justice, are considered to be relatively abstract. We can therefore find ways to order concepts in terms of their degree of concreteness or abstractness. For example, the concept Table can be considered to be more concrete than the concept Entity, the concept Chess table to be more concrete than the concept Table, the concept Alabaster chess table to be more concrete than Chess table, and so on. Here there is the idea that concepts are associated with a set of characteristic properties, and that by making their properties more specific, we can increase their degree of concreteness, up to the point that a concept possibly enters a one-to-one correspondence with an object of our spatiotemporal theater. This is because, according to this view, concepts would typically have been created by abstracting them from objects.

There is however another line to go from the abstract to the concrete, which can be considered to be more fundamental, and therefore also more important in view of a construction of a quantum model for the meaning content of a collection of documents. Indeed, although physical objects have played an important role in how we have formed our language, and in the distinction between abstract and concrete concepts, it is true that this line of going from the concrete to the abstract, linked to our historical need of naming the physical entities around us and defining categories of objects having common features, remains a rather parochial one, in the sense that it does not take into full account how concepts behave in themselves, because of their non-objectual nature, particularly when they are combined, so giving rise to more complex entities having new emerging meanings.

When this observation is taken into account, a second line of going from the abstract to the concrete appears, related to how we have learned to produce conceptual combinations to better think and communicate (Fig. 1). The more abstract concepts are then those that can be expressed by single words, and an increase in concreteness is then the result of conceptual combinations, so that the most concrete concepts are those formed by very large aggregates of meaning-connected (entangled) single-word concepts, corresponding to what we would generically indicate as a story, like those written in books, articles, webpages, etc. Of course, not a story only in the reductive sense of a novel, but in the more general sense of a cluster of concepts combined so as to create a well-defined meaning. It is this line of going from the abstract to the concrete that we believe is the truly fundamental one (Footnote 2), and in a sense also the universal one, which we will consider in our modeling strategy, when exploiting the analogy between a meaning retrieval situation, like when doing a Web search, and a quantum measurement in a physics laboratory. But before doing this, in the next section we describe in some detail one of the most paradigmatic physics experiments, which Feynman used to say contains the only mystery: the double-slit experiment.

Fig. 1. Two main lines connecting abstract to concrete exist in human culture. The first one goes from concrete objects to more abstract collections of objects having common features. The second one goes from abstract single-word concepts to stories formed by the combination of many meaning-connected concepts

In Sect. 3, we continue by providing a conceptualistic interpretation of the double-slit experiment, understanding it as an interrogative process. Then, in Sect. 4, we show how to use our analysis of the double-slit situation to provide a rationale for capturing the meaning content of a collection of documental entities. In Sect. 5, we observe that quantum interference effects are insufficient to model all data, so that additional mechanisms, like context effects, also need to be considered. In Sect. 6, we conclude our presentation by offering some final thoughts. In Appendix 1, we demonstrate that the combination of “interference plus context effects” allows us in principle to model all possible data, while in Appendix 2, we introduce the notion of meaning bond of a concept with respect to another concept, showing its relevance to the interpretation of our quantum formalism.

2 The Double-Slit Experiment

The double-slit experiment is among the paradigmatic quantum experiments and can be used to effectively illustrate the rationale of our quantum modeling of the meaning content of corpora of written documents. One of the best descriptions of this experiment can be found in Feynman’s celebrated Lectures on Physics [24]. We will provide three different descriptions of the experiment. The first one (this section) is just about what can be observed in the laboratory, showing that an interpretation in terms of particle or wave behaviors cannot be consistently maintained. The second one (Sect. 3) is about characterizing the experiment in a conceptualistic way, attaching to the quantum entities a conceptual-like nature, and to the measuring apparatus a cognitive-like nature. The third one (Sect. 4) is about interpreting the experiment as an IR-like process.

We first consider the classical situation where the entities entering the apparatus, in its different configurations, are small bullets. Imagine a machine gun shooting a stream of these bullets over a fairly large angular spread. In front of it there is a barrier with two slits (that can be opened or closed), just about big enough to let a bullet through. Beyond the barrier, there is a screen stopping the bullets, absorbing them each time they hit it. Since a localized and visible trace of the impact is left on the screen each time this happens, the screen functions as a detection instrument, measuring the position of the bullet at the moment of its absorption. Considering that the slits can be opened and closed, the experiment of shooting the bullets and observing the resulting impacts on the detection screen can be performed in four different configurations. The first one, not particularly interesting, is when both slits are closed. Then, there are no impacts on the detection screen, as no bullets can pass through the barrier. On the other hand, impacts on the detection screen will be observed if (A) the left slit is open and the right one is closed; (B) the right slit is open and the left one is closed; (AB) both slits are open. The distribution of impacts observed in these three configurations is schematically depicted in Fig. 2. As one would expect, the ‘both slits open’ situation can be easily deduced from the two ‘only one-slit open’ situations, in the sense that if \(\mu_A(x)\) and \(\mu_B(x)\) are the probabilities of having an impact at location \(x\) on the detection screen, when only the left (resp., the right) slit is open, then the probability \(\mu_{AB}(x)\) of having an impact at that same location \(x\), when both slits are kept open, is simply given by the uniform average:

$$\displaystyle \begin{aligned} \mu_{AB}^{\mathrm{bull}}(x)={1\over 2}[\mu_A(x) + \mu_B(x)]. {} \end{aligned} $$
(1)
Fig. 2. A schematic description of the classical double-slit experiment, when: (A) only the left slit is open; (B) only the right slit is open; and (AB) both slits are simultaneously open. Note that the time during which the machine gun fired the bullets in situation (AB) is twice as long as in situations (A) and (B)

Consider now a similar experiment, using electrons instead of small bullets. As with the bullets, well-localized traces of impact are observed on the detection screen in the situations when only one slit is open at a time, always with the traces of impact distributed in positions that are in proximity of the open slit. On the other hand, as schematically depicted in Fig. 3, when both slits are jointly open, what is obtained is no longer deducible from the two ‘only one-slit open’ situations. More precisely, when bullets are replaced by electrons, (1) is no longer valid and we have instead:

$$\displaystyle \begin{aligned} \mu_{AB}^{\mathrm{elec}}(x)={1\over 2}[\mu_A(x) + \mu_B(x)]+\mathrm{Int}_{AB}(x), {} \end{aligned} $$
(2)

where \(\mathrm{Int}_{AB}(x)\) is a so-called interference contribution, which corrects the classical uniform average (1) and can take both positive and negative values. Clearly, a corpuscular interpretation of the experiment now becomes impossible, as the region where most of the traces of impact are observed is exactly in between the two slits, where instead we would expect to have almost no impacts. Also, in the regions in front of the two slits, where we would expect to have the majority of impacts, practically no traces of impact are observed.

Fig. 3. A schematic description of the quantum double-slit experiment, when: (A) only the left slit is open; (B) only the right slit is open; and (AB) both slits are simultaneously open. Different from the classical (corpuscular) situation, a fringe (interference) pattern appears when the left and right slits are both open

Imagine for a moment that we are only interested in modeling the data of the experiment (either with bullets or electrons) in a very instrumentalistic way, by limiting the description only to what can be observed at the level of the detection screen, i.e., the traces that are left on it. For this, one can proceed as follows. The surface of the detection screen is first partitioned into a given number \(n\) of numbered cells \(\mathrm{C}_1,\dots,\mathrm{C}_n\) (see Fig. 4). Then, the experiment is run until \(m\) traces are obtained on it, \(m\) being typically a large number, and the number of traces of impact in each cell is counted. If \(m_{AB}(\mathrm{C}_i)\) is the number of traces counted in cell \(\mathrm{C}_i\), \(i=1,\dots,n\), the experimental probability of having an impact in that cell is given by the ratio \(\mu_{AB}(\mathrm{C}_i;m)={m_{AB}(\mathrm{C}_i)\over m}\). Here by ‘experimental probability’ we simply mean the probability “induced” by a relative frequency over a large number of repetitions of the same measurement, under the same experimental conditions. Similarly, we have \(\mu_A(\mathrm{C}_i;m)={m_{A}(\mathrm{C}_i)\over m}\) and \(\mu_B(\mathrm{C}_i;m)={m_{B}(\mathrm{C}_i)\over m}\), where \(m_A(\mathrm{C}_i)\) and \(m_B(\mathrm{C}_i)\) are the numbers of traces counted in cell \(\mathrm{C}_i\) when only the left and only the right slit is kept open, respectively. If the experiments are performed using small bullets, one finds that the difference \(\mu_{AB}(\mathrm{C}_i;m)-{1\over 2}[\mu_A(\mathrm{C}_i;m)+\mu_B(\mathrm{C}_i;m)]\) tends to zero, as \(m\) tends to infinity, for all \(i=1,\dots,n\), whereas if the experiment is done using micro-entities, like electrons, it does not converge to zero, but towards a function \(\mathrm{Int}_{AB}(\mathrm{C}_i)\), expressing the amount of deviation from the uniform average situation.
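
The counting procedure just described can be illustrated with a small simulation of the classical (bullet) case. The following is only a sketch under stated assumptions: the single-slit distributions `p_A` and `p_B`, the number of cells, and the number of shots are hypothetical values chosen for illustration, and each bullet is assumed to pass through either slit with probability one-half when both are open.

```python
import random

def experimental_probabilities(hits, n_cells):
    """Relative frequencies mu(C_i; m) = m(C_i) / m from a list of hit cells."""
    counts = [0] * n_cells
    for cell in hits:
        counts[cell] += 1
    m = len(hits)
    return [c / m for c in counts]

def shoot(weights, m, rng):
    """Draw m impact cells (0-based indices) according to the cell weights."""
    return rng.choices(range(len(weights)), weights=weights, k=m)

rng = random.Random(0)          # fixed seed, so the run is reproducible
n_cells = 7
m = 100_000                     # number of traces collected per configuration

# Hypothetical single-slit distributions, each peaked in front of its slit.
p_A = [0.05, 0.30, 0.40, 0.20, 0.05, 0.00, 0.00]
p_B = p_A[::-1]
# Classical assumption: with both slits open, the law is the uniform average.
p_AB = [0.5 * (a + b) for a, b in zip(p_A, p_B)]

mu_A = experimental_probabilities(shoot(p_A, m, rng), n_cells)
mu_B = experimental_probabilities(shoot(p_B, m, rng), n_cells)
mu_AB = experimental_probabilities(shoot(p_AB, m, rng), n_cells)

# Deviation from the uniform average, cell by cell.
deviation = [ab - 0.5 * (a + b) for ab, a, b in zip(mu_AB, mu_A, mu_B)]
max_abs_deviation = max(abs(d) for d in deviation)
print(max_abs_deviation)
```

For bullets, the printed deviation is only statistical noise and shrinks as `m` grows. For electrons the same bookkeeping would instead reveal a residual that does not vanish, which is the interference contribution.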

Fig. 4. The detection screen, partitioned into \(n=21\) different cells, each one playing the role of an individual position detector, here showing the traces of \(m=54\) impacts. The experimental probabilities are: \(\mu_{AB}(\mathrm{C}_1;54)={2\over 54}\), \(\mu_{AB}(\mathrm{C}_2;54)={2\over 54}\), \(\mu_{AB}(\mathrm{C}_3;54)={1\over 54}\), \(\mu_{AB}(\mathrm{C}_4;54)={7\over 54}\), …, \(\mu_{AB}(\mathrm{C}_{20};54)={1\over 54}\), \(\mu_{AB}(\mathrm{C}_{21};54)=0\)

Now, once the three real functions \(\mu_A(\mathrm{C}_i;m)\), \(\mu_B(\mathrm{C}_i;m)\), and \(\mu_{AB}(\mathrm{C}_i;m)\) have been obtained, and their \(m\to\infty\) limits deduced, one could claim to have successfully modeled the experimental data, in the three different configurations of the barrier. However, a physicist would not be satisfied with such a modeling. Why? Well, because it is not able to explain why \(\mu_{AB}(\mathrm{C}_i)=\lim_{m\to\infty}\mu_{AB}(\mathrm{C}_i;m)\) cannot be deduced, as one would expect, from \(\mu_A(\mathrm{C}_i)=\lim_{m\to\infty}\mu_A(\mathrm{C}_i;m)\) and \(\mu_B(\mathrm{C}_i)=\lim_{m\to\infty}\mu_B(\mathrm{C}_i;m)\), and why \(\mu_{AB}(\mathrm{C}_i)\) possesses such a particular interference-like fringe structure. So, let us explain how the quantum explanation typically goes. For this, we will need to exit the two-dimensional plane of the detection screen and describe things at a much more abstract and fundamental level of our physical reality.

As is well known, even if our description extends from the two-dimensional plane of the detection screen to the three-dimensional theater containing the entire experimental apparatus, this will still be insufficient to explain how the interference pattern is obtained. Indeed, electrons cannot be modeled as spatial waves, as they leave well-localized traces of impact on a detection screen, and they cannot be modeled as particles, as they cannot be consistently associated with trajectories in space (Footnote 3). They are truly “something else,” which needs to be addressed in more abstract terms. And this is precisely what the quantum formalism is able to do, when describing physical entities in terms of the abstract notions of states, evolutions, measurements, properties, and probabilities, not necessarily attributable to a description of a spatial (or spatiotemporal) kind.

So, let \(|\psi\rangle\) be the state of an electron (Footnote 4), at a given moment in time, after having interacted with the double-slit barrier, with both slits open (we use here Dirac’s notation). We can consider that this state vector has two components: one corresponding to the electron being reflected back towards the source (assuming for simplicity that the barrier cannot absorb it), and the other one corresponding to the electron having successfully passed through the barrier and reached the detection screen. Let then \(P_C\) be the projection operator associated with the property of “having been reflected back by the barrier,” and \(P_{AB}\) the projection operator associated with the property of “having passed through the two slits.” For instance, \(P_C\) could be chosen to be the projection onto the set of states localized in the half-space defined by the barrier and containing the source, whereas \(P_{AB}\) would project onto the set of states localized in the other half-space, containing the detection screen (Footnote 5). We thus have \(P_C+P_{AB}=\mathbb{I}\), and we can define \(|\psi_{AB}\rangle={P_{AB}|\psi\rangle\over\|P_{AB}|\psi\rangle\|}\), which is the state the electron is in after having passed through the barrier and reached the detection screen region. Note that the barrier acts as a filter, in the sense that if the electron does leave a trace on the detection screen, we know it did successfully pass through the barrier, and therefore was in state \(|\psi_{AB}\rangle\) when detected.

Now, since by assumption the \(n\) cells \(\mathrm{C}_i\) of the detection screen work as distinct measuring apparatuses, and an electron cannot be simultaneously detected by two different cells, for all practical purposes we can associate them with \(n\) orthonormal vectors \(|e_i\rangle\), \(\langle e_i|e_j\rangle=\delta_{ij}\), corresponding to the different possible outcome-states of the position measurement performed by the screen. This means that we can consider \(\{|e_1\rangle,\dots,|e_n\rangle\}\) to form a basis of the subspace of states having passed through the barrier, and since we are not interested in electrons not reaching the detection screen, we can consider such \(n\)-dimensional subspace to be the effective Hilbert space \({\mathcal{H}}\) of our quantum system, which, for instance, can be taken to be isomorphic to the vector space \({\mathbb C}^n\) of all \(n\)-tuples of complex numbers.

According to the Born rule (which in quantum mechanics is used to obtain a correspondence between what is observed in measurement situations, in terms of relative frequencies, and the objects of its mathematical formalism, thus expressing the statistical content of the theory and allowing the latter to be brought into contact with the experiments), the probability for an electron in state \(|\psi_{AB}\rangle\in{\mathcal{H}}\) to be detected by cell \(\mathrm{C}_i\) is given by the square modulus of the amplitude \(\langle e_i|\psi_{AB}\rangle\), that is: \(\mu_{AB}(\mathrm{C}_i)=|\langle e_i|\psi_{AB}\rangle|^2\), and if we assume that an electron that has passed through the barrier is necessarily absorbed by the screen (assuming, for instance, that the latter is large enough), we have \(\sum_{i=1}^n\mu_{AB}(\mathrm{C}_i)=1\). Introducing the orthogonal projection operators \(P_i=|e_i\rangle\langle e_i|\), we can also write, equivalently:

$$\displaystyle \begin{aligned} \mu_{AB}(\mathrm{C}_i) = \|P_i |\psi_{AB}\rangle\|^2 = \langle \psi_{AB} |P_i^\dagger P_i|\psi_{AB}\rangle = \langle \psi_{AB} |P_i^2|\psi_{AB}\rangle = \langle \psi_{AB} |P_i|\psi_{AB}\rangle. \end{aligned} $$
(3)
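
The equalities in (3) can be checked numerically. The sketch below is only illustrative: it assumes a toy screen with \(n=4\) cells and a hypothetical set of amplitudes for \(|\psi_{AB}\rangle\), and it represents \(|e_i\rangle\) as the \(i\)-th unit vector of \({\mathbb C}^4\). The helper names (`normalize`, `born_probabilities`, `expectation`, `cell_projector`) are ours, not part of the original text.

```python
import math

def normalize(vec):
    """Rescale a vector of complex amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def born_probabilities(psi):
    """mu(C_i) = |<e_i|psi>|^2, with |e_i> the i-th unit vector."""
    return [abs(c) ** 2 for c in psi]

def cell_projector(i, n):
    """Matrix of P_i = |e_i><e_i| as a list of rows."""
    return [[1.0 if (r == i and c == i) else 0.0 for c in range(n)]
            for r in range(n)]

def expectation(psi, op):
    """<psi|op|psi> for a square matrix op given as a list of rows."""
    n = len(psi)
    op_psi = [sum(op[r][c] * psi[c] for c in range(n)) for r in range(n)]
    return sum(psi[r].conjugate() * op_psi[r] for r in range(n))

n = 4
# Hypothetical amplitudes, chosen only for illustration.
psi_AB = normalize([1 + 1j, 2, -1j, 0.5])

mu = born_probabilities(psi_AB)
mu_via_projector = [expectation(psi_AB, cell_projector(i, n)).real
                    for i in range(n)]

print(mu)
print(sum(mu))  # normalization: the probabilities add up to 1 (up to rounding)
```

Both lists agree cell by cell, which is the content of the chain of equalities in (3) for this toy state.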

More generally, if \(I\) is a given subset of \(\{1,\dots,n\}\), we can define the projection operator \(M=\sum_{i\in I}P_i\), onto the set of states localized in the subset of cells with indexes in \(I\), and the probability of being detected in one of these cells is given by:

$$\displaystyle \begin{aligned} \mu_{AB}(i\in I)=\langle \psi_{AB} |M|\psi_{AB}\rangle=\sum_{i\in I} \mu_{AB}(\mathrm{C}_i). {} \end{aligned} $$
(4)

As an example, consider the situation of Fig. 4, where one can, for instance, define the following seven projectors \(M_k=P_k+P_{k+7}+P_{k+14}\), \(k=1,\dots,7\), describing the seven columns of the 3 × 7 screen grid. In particular, we have: \(\mu_{AB}(i\in\{4,11,18\})={7\over 54}+{8\over 54}+{3\over 54}={1\over 3}\), i.e., the probability for a trace of impact to appear in the central vertical sector of the screen (the central fringe) is one-third.
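
The column projectors of this example can be sketched in a few lines. The code below assumes that the 21 cells are numbered row by row on the 3 × 7 grid (so that column \(k\) contains cells \(k\), \(k+7\), \(k+14\)), and it represents a projector \(M\) simply by the set of cell indexes it keeps. Only the three counts quoted above for the central column are taken from the text; the uniform state used at the end is a hypothetical test state.

```python
import math
from fractions import Fraction

N_ROWS, N_COLS = 3, 7

def column_cells(k):
    """1-based cell indexes of column k on the row-major grid: k, k+7, k+14."""
    return [k + N_COLS * row for row in range(N_ROWS)]

def apply_projector(cells, psi):
    """Apply M = sum_{i in cells} P_i to psi (cell i is stored at index i-1)."""
    keep = {i - 1 for i in cells}
    return [amp if idx in keep else 0 for idx, amp in enumerate(psi)]

def probability(cells, psi):
    """<psi|M|psi>, i.e. the sum of |psi_i|^2 over the selected cells."""
    return sum(abs(a) ** 2 for a in apply_projector(cells, psi))

central = column_cells(4)
print(central)

# Counts quoted in the text for cells 4, 11 and 18, out of m = 54 impacts.
central_counts = {4: 7, 11: 8, 18: 3}
p_central = sum(Fraction(c, 54) for c in central_counts.values())
print(p_central)

# The seven columns partition the 21 cells, so the M_k sum to the identity.
all_cells = sorted(i for k in range(1, N_COLS + 1) for i in column_cells(k))

# Hypothetical uniform state: every column then has probability 3/21 = 1/7.
psi_uniform = [1 / math.sqrt(21)] * 21
p_uniform_central = probability(central, psi_uniform)
```

Representing \(M\) by an index set works here because all the \(P_i\) are diagonal in the same basis; a projector that is not diagonal in the cell basis would require a full matrix.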

The double-slit experiment does not allow one to determine if an electron that leaves a trace of impact on the detection screen has passed through the left slit or the right slit. This means that the properties “passing through the left slit” and “passing through the right slit” remain potential properties during the experiment, i.e., alternatives that are not resolved and therefore (as we are going to see) can give rise to interference effects [24]. Let us however write \(P_{AB}\) as the sum of two projectors: \(P_{AB}=P_A+P_B\), where \(P_A\) corresponds to the property of “passing through the left slit” and \(P_B\) to the property of “passing through the right slit.” Note that there is no unique way to define these properties, and the associated projections, as it is clear that electrons are not corpuscles moving along spatial trajectories. A possibility here is to further partition the half-space defined by \(P_{AB}\) into two sub-half-spaces, one incorporating the left slit, defined by \(P_A\), and the other one incorporating the right slit, defined by \(P_B\), so that \(P_AP_B=P_BP_A=0\). For symmetry reasons, we can assume that the electron has no preferences regarding passing through the left or right slits (this will be the case if the source is placed symmetrically with respect to the two slits), so that \(\|P_A|\psi_{AB}\rangle\|^2=\|P_B|\psi_{AB}\rangle\|^2={1\over 2}\). We can thus define the two orthogonal states \(|\psi_A\rangle=\sqrt{2}\,P_A|\psi_{AB}\rangle\) and \(|\psi_B\rangle=\sqrt{2}\,P_B|\psi_{AB}\rangle\), and write:

$$\displaystyle \begin{aligned} |\psi_{AB}\rangle= (P_A + P_B)|\psi_{AB}\rangle= {1\over\sqrt{2}}(|\psi_A\rangle + |\psi_B\rangle). {} \end{aligned} $$
(5)

According to the above definitions, \(|\psi_A\rangle\) and \(|\psi_B\rangle\) can be interpreted as the states describing an electron passing through the left and right slit, respectively (Footnote 6). In other words, in accordance with the quantum mechanical superposition principle, we have expressed the electron state in the double-slit situation as a (uniform) superposition of one-slit states. Inserting (5) in (4), now omitting the argument in the brackets to simplify the notation, we thus obtain:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mu_{AB}&\displaystyle =&\displaystyle \langle \psi_{AB} |M|\psi_{AB}\rangle = {1\over 2}(\langle \psi_A| + \langle\psi_B|)M(|\psi_A\rangle + |\psi_B\rangle)\\ &\displaystyle =&\displaystyle {1\over 2}(\langle \psi_A |M|\psi_A\rangle + \langle \psi_B |M|\psi_B\rangle + \langle \psi_A |M|\psi_B\rangle +\langle \psi_B |M|\psi_A\rangle)\\ &\displaystyle =&\displaystyle {1\over 2}(\mu_A + \mu_B)+\underbrace{\mathfrak{R}\, \langle \psi_A |M|\psi_B\rangle}_{\mathrm{Int}_{AB}}, {} \end{array} \end{aligned} $$
(6)

where \(\mathrm{Int}_{AB}\) is the interference contribution, with the symbol \(\mathfrak{R}\) denoting the real part of a complex number, and we have used \(\langle\psi_B|M|\psi_A\rangle=\langle\psi_A|M|\psi_B\rangle^{*}\). So, when there are indistinguishable alternatives in an experiment, as is the case here, since we can only observe the traces of the impact on the detection screen, without being able to tell through which slit the electrons have passed, states are typically expressed as a superposition of the states describing these alternatives, and because of that a deviation from the classical probabilistic average (1) will be observed, explaining in particular why an interference-like fringe pattern can form (Footnote 7).
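
The decomposition (6) can be verified on a toy example. The sketch below assumes a screen with only four cells and two hypothetical one-slit states, `psi_A` and `psi_B`, chosen to be orthogonal and to give the same flat single-slit distribution; the measurement \(M\) is taken to be a single-cell projector \(P_i\), so that the interference term reduces to the real part of the product of the conjugated \(A\)-amplitude and the \(B\)-amplitude in that cell.

```python
import math

def inner(u, v):
    """<u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def interference(i, psi_a, psi_b):
    """Int_AB for M = P_i: Re <psi_A|P_i|psi_B> = Re(conj(a_i) * b_i)."""
    return (complex(psi_a[i]).conjugate() * psi_b[i]).real

# Hypothetical orthonormal one-slit states on a 4-cell screen.
psi_A = [0.5, 0.5, 0.5, 0.5]
psi_B = [0.5, -0.5, 0.5, -0.5]

# Uniform superposition, as in (5).
psi_AB = [(a + b) / math.sqrt(2) for a, b in zip(psi_A, psi_B)]

n = len(psi_AB)
mu_A = [abs(a) ** 2 for a in psi_A]
mu_B = [abs(b) ** 2 for b in psi_B]
mu_AB = [abs(c) ** 2 for c in psi_AB]

classical = [0.5 * (a + b) for a, b in zip(mu_A, mu_B)]
int_terms = [interference(i, psi_A, psi_B) for i in range(n)]

for i in range(n):
    print(i, mu_AB[i], classical[i], int_terms[i])
```

In this toy case both single-slit distributions are flat, yet the two-slit distribution alternates between cells with enhanced and cells with suppressed probability: the interference term is positive in some cells and negative in others. Since the two states are orthogonal, the interference terms add up to zero over the whole screen, so they redistribute probability among the cells without changing the total.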

3 Interrogative Processes

We now want to provide a cognitivistic/conceptualistic interpretation of the double-slit experiment, describing it as an interrogative process [11, 14]. It is of course well understood that measurements in physics laboratories are like interrogations. Indeed, when we want to measure a physical observable on a given physical entity, we can always say that we have a question in mind, that is: “What is the value of such a physical observable for the entity?” By performing the corresponding measurement, we then obtain an answer to the question. More precisely, the outcome of the measurement becomes an input for our human mind, which attaches to it a specific meaning, and it is only when such a mental process has been completed that we can say that we have obtained an answer to the question that motivated the measurement. In other words, there is a cognitive process, performed by our human mind, and there is a physical process, which provides an input for it.

All this is clear. However, we want to push things further and consider that a measurement can also be described, per se, as an interrogative process, independently of a human mind possibly taking knowledge of its outcome. In other words, we also consider the physical apparatus as a cognitive entity, which answers a question each time it interacts with a physical entity subjected to a measurement, here viewed as a conceptual entity carrying some kind of meaning. This means that two cognitive processes are typically involved in a measurement, one at the level of the apparatus, and another one at the level of the mind of the scientist interacting with it. The latter is founded on human meaning, but not the former, which is the reason why we, as humans, have to make a considerable effort to understand what is going on. In that respect, we can say that the construction of the theoretical and conceptual edifice of quantum mechanics has been precisely our effort to understand the non-human meaning that is exchanged in physical processes, for instance, when an electron interacts with a detection screen in a double-slit experiment.

We will not enter here into the details of this conceptuality interpretation of quantum mechanics, and simply refer to the review article [14] and to the references cited therein, not only for understanding the genesis of this interpretation, but also for appreciating why it possibly provides a deep insight into the nature of our physical world. In the following, we limit ourselves to describing the double-slit experiment in a cognitivistic way, as this will be useful when we transpose the approach to an IR-like ambit. So, we start from the hypothesis that the electrons emitted by the electron gun are ‘meaning entities’, i.e., entities behaving in a way that is similar to how human concepts behave. And we also consider the detection screen to be a ‘cognitive entity’, i.e., an entity sensitive to the meaning carried by the electrons and able to answer questions by means of a written (pointillistic) language of traces of impact on its surface. We are then challenged as humans to understand the meaning of this language, and more precisely to guess the query that is answered each time, and then see if the collection of obtained answers is consistent with the logic of such a query.

There are of course different equivalent ways to formulate the question answered by the screen detector’s mind. A possible formulation of it is the following: “What is a good example of a trace of impact left by an electron passing through the left slit or the right slit?” This way of conceptualizing the question is of course very “human,” being based on the prejudice that the electron would be an entity always having spatial properties, which is not the case (this depends on its state). But we can here understand the “passing through” concept as a way to express the fact that the probability of detecting the electron by the final screen is zero if both slits are closed. An alternative way of formulating the same question, avoiding the “passing through” concept could be: “What is a good example of an effect produced by an electron interacting with the barrier having both the left and right slits open?” However, we will use in our reasoning the previous formulation of the question, as more intuitive for our spatially biased human minds. What we want is to explain the emergence of the fringe pattern by understanding the process operated by the detection screen, when viewed as a cognitive entity answering the above question.

The first thing to observe is that such a process will generally be indeterministic. Indeed, when we say “passing through a slit,” this is not sufficient to specify a unique trajectory in space for an electron (when assumed to be like a spatial corpuscle). This means that, if the screen cognitive entity thinks of the electron as a corpuscle, there are many ways in which it can pass through a slit, so it will have to select one among several possibilities, which is the reason why, every time the question is asked, the answer (the trace of the impact on the screen) can be different (and cannot be predicted in advance), even though the state of the electron is always the same. The same unpredictability will manifest if the screen cognitive entity does not think of the electron as a spatial entity, but as a more abstract (non-spatial) conceptual entity, which can only acquire spatial properties by interacting with it. Indeed, also in this case the actualization of spatial properties will be akin to a symmetry breaking process, whose outcomes cannot be predicted in advance.

To understand how the cognitive process of the screen detector entity might work, let us first concentrate on the central fringe, which is the one exhibiting the highest density of traces of impact and which is located exactly in between the two slits. It is there that the “screen mind” is most likely to manifest an answer. To understand the reason for that, we observe that an impact in that region elicits a maximum doubt as regards the slit the electron would have taken to cross the barrier, or even whether it would have necessarily passed through either the left or the right slit, in an exclusive manner. Thus, an impact in that region is a perfect exemplification of the concept “an electron passing through the left slit or the right slit.” Now, the two regions on the screen that are exactly opposite the two slits have instead a very low density of traces of impact, and again this can be understood by observing that an answer in the form of a trace of impact there would be a very bad exemplification of the concept “an electron passing through the left slit or the right slit,” as it would not make us doubt much about the slit taken by the electron. Moving away from these two low-density regions, we will then be back in situations of doubt, although less perfect than that of the central fringe, so we will find again a density of traces of impact, but this time a lower one, and then again regions of low density will appear, and so on, explaining in this way the alternating fringe pattern observed in experiments [11, 14].

4 Modeling the QWeb

Having analyzed the double-slit experiment, and its possible cognitivistic/conceptualistic interpretation, we are now ready to transpose its narrative to the modeling of the meaning entity associated with the Web, which we have called the QWeb. Our aim is to provide a rationale for capturing the full meaning content of a collection of documental entities, which in our case will be the webpages forming the Web, but of course all we are going to say also works for other corpora of documents. As we explained in Sect. 1, there is a universal line for going from abstract concepts to more concrete ones, which is the one going from concepts indicated by single words (or few words) to those that are complex combinations of large numbers of concepts, which in our spatiotemporal theater can manifest as full-fledged stories, and which in our case we are going to associate with the different pages of the Web. Assuming they have been numbered, we denote them Wi, i = 1, …, n. The meaning content of the Web has of course been created by us humans, and each time we interact with the webpages, for instance, when reading them, cognitive processes will be involved, which in turn can give rise to the creation of new webpages. However, we will not be interested here in the modeling of these human cognitive activities, just as when we model an experiment conducted in a physics laboratory we are generally not interested in also modeling the cognitive activity of the involved scientists.

As mentioned in Sect. 1, we want to fully exploit the analogy between an IR process, viewed as an interrogation producing a webpage as an outcome, and a measurement, like the position measurement produced by the screen detector in a double-slit experiment, also viewed as being the result of an interrogative process. So, instead of the n cells Ci, i = 1, …, n, partitioning the surface of the detection screen, we now have the n webpages Wi, i = 1, …, n, partitioning the Web canvas. What we now measure is not an electron, but the QWeb meaning entity, which similarly to an electron we assume can be in different states and can produce different possible outcomes when submitted to measurements. We will limit ourselves to measurements having the webpages Wi as their outcomes. More precisely, webpages Wi will play the same role as the cells Ci of the detection screen in the double-slit experiment, in the sense that we do not distinguish in our measurements the internal structure of a webpage, in the same way that we do not distinguish the locations of the impacts inside a single cell. So, similarly to what we did in Sect. 2, we can associate each webpage with a state |e i〉, i = 1, …, n, so that {|e 1〉, …, |e n〉} will form a basis of the n-dimensional QWeb’s Hilbert state space.

Let us describe the kind of measurements we have in mind for the QWeb. We will call them “tell a story measurements,” and they consist in having the QWeb, prepared in a given state, interacting with an entity sensitive to its meaning, having the n webpages stored in its memory, as stories, so that one of these Web’s stories will be told at each run of these measurements, with a probability that depends on the QWeb’s state. The typical example of this is that of a search engine having the n webpages stored in its indexes, used to retrieve some meaningful information, with the QWeb initial state being an expression of the meaning contained in the retrieval query (here assuming that the search engine in question would be advanced enough to also use indeterministic processes, when delivering its outcomes).

If the state of the QWeb is |e i〉, associated with the webpage Wi, then the ‘tell a story measurement’ will by definition provide the latter as an outcome, with probability equal to one. But the states |e i〉, associated with the stories written in the webpages Wi, only correspond, as we said, to the more concrete states of the QWeb, according to the definition of concreteness given in Sect. 1, and therefore only represent the tip of the iceberg of the QWeb’s state space, as it would be the case for the position states of an electron. Indeed, the QWeb’s states, in general, can be written as a superposition of the webpages’ basis states:

$$\displaystyle \begin{aligned} |\psi\rangle = \sum_{j=1}^n r_j e^{i\rho_j}|e_j\rangle,\quad r_j,\rho_j\in {\mathbb R}, \quad r_j\ge 0, \quad \sum_{j=1}^{n}r_j^{2}=1. {} \end{aligned} $$
(7)

We can right away point out an important difference between (7) and what is usually done in IR approaches, like the so-called vector space models (VSM), where the states that are generally written as a superposition of basis states are those associated with the index terms used in queries (see, for instance, [30, p. 5], and [27, p. 19]). Here it is exactly the other way around: the dimension of the state space is determined by the number of available documents, associated with the outcome-states of the ‘tell a story measurements’, interpreted as stories, i.e., as the more concrete states of the QWeb entity subjected to measurements. This also means that (as we will explain in the following) the states associated with single terms will not necessarily be mutually orthogonal, i.e., will not generally form a basis. Of course, another important difference with respect to traditional IR approaches is that the latter are built upon real vector spaces, whereas our quantum modeling is intrinsically built upon complex vector spaces (Hilbert spaces), where linearity works directly at the level of the complex numbers and weights are only obtained from the square of their moduli. In other words, the complex numbers \(r_j e^{i\rho _j}\), appearing in the expansion (7), can be understood as generalized coefficients expressing a connection between the meaning carried by the QWeb in state |ψ〉, and the meaning “sticking out” from (the stories contained in) the webpages Wj.Footnote 8

As a very simple example of initial state, we can consider a state |χ〉 expressing a uniform meaning connection towards all the Web stories: \(|\chi \rangle ={1\over \sqrt {n}} \sum _{j=1}^n e^{i\rho _j}|e_j\rangle \), so that the probability to obtain story Wi, in a ‘tell a story measurement’, when the QWeb is in such uniform state |χ〉, is:

$$\displaystyle \begin{aligned} \mu(\mathrm{W}_i)=\langle\chi |P_i|\chi \rangle = {1\over n}\sum_{j,k=1}^n e^{i(\rho_j-\rho_k)}\underbrace{\langle e_k|e_i\rangle}_{\delta_{ki}}\underbrace{\langle e_i|e_j\rangle}_{\delta_{ij}} = {1\over n}. {} \end{aligned} $$
(8)

As another simple example, we can consider the QWeb state \(|\chi _I\rangle ={1\over \sqrt {m}} \sum _{j\in I} e^{i\rho _j}|e_j\rangle \), which is uniform only locally, i.e., such that only a subset I of m webpages, with m ≤ n, would have the same (non-zero) probability of being selected as an actual story, so that in this case \(\mu _I(\mathrm {W}_i)=\langle \chi _I|P_i|\chi _I \rangle = {1\over m}\), if i ∈ I, and zero otherwise.
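As a quick numerical check of (8) and of its locally uniform variant, the following minimal Python sketch builds the two states with arbitrary phases and applies the Born rule. The toy dimension n = 6, the subset I = {1, 3, 4}, and the helper names are illustrative assumptions, not part of the model.

```python
import cmath
import random

def uniform_state(n, support=None, seed=0):
    """Amplitudes of a state that is uniform on `support` (all n pages by default).

    The phases rho_j are drawn at random: they do not affect the outcome probabilities.
    """
    rng = random.Random(seed)
    support = set(range(n)) if support is None else set(support)
    m = len(support)
    return [
        cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) / m ** 0.5 if j in support else 0j
        for j in range(n)
    ]

def story_probabilities(state):
    """Born rule for the projectors P_i = |e_i><e_i|: mu(W_i) = |<e_i|psi>|^2."""
    return [abs(c) ** 2 for c in state]

n = 6                                          # toy number of webpages (assumption)
chi = uniform_state(n)                         # globally uniform state
chi_I = uniform_state(n, support={1, 3, 4})    # uniform on I only, with m = 3

mu = story_probabilities(chi)      # every entry is 1/n, up to rounding
mu_I = story_probabilities(chi_I)  # 1/m on I, zero elsewhere
```

Changing `seed` changes the phases but leaves both probability lists unchanged, which illustrates that the phases only matter once states are superposed.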

It is important to observe that we are here viewing the QWeb as a whole entity, when we speak of its states, although it is clearly also a composite entity, in the sense that it is a complex formed by the combination of multiple concepts. Take two concepts A and B (for example, A =  Fruits and B =  Vegetables). As individual conceptual entities, they are certainly part of the QWeb composite entity, and as such they can also be in different states, which we can also write as linear combinations of the webpages’ basis states:

$$\displaystyle \begin{aligned} |\psi_A\rangle=\sum_{j=1}^n a_je^{i\alpha_j}|e_j\rangle,\quad |\psi_B\rangle=\sum_{j=1}^n b_je^{i\beta_j}|e_j\rangle, {} \end{aligned} $$
(9)

with \(a_j,b_j,\alpha _j,\beta _j\in {\mathbb R}\), a j, b j ≥ 0, and \(\sum _{j=1}^{n}a_j^{2}=\sum _{j=1}^{n}b_j^{2}=1\). These states, however, will be considered to be also states of the QWeb entity as a whole, as they also belong to its n-dimensional Hilbert space. In other words, even if states are all considered to be here states of the QWeb entity, some of them will also be interpreted as describing more specific individual conceptual entities forming the QWeb. We thus consider that individual concepts forming the composite QWeb entity can be viewed as specific states of the latter. Of course, the quantum formalism also offers another way to model composite entities, by taking the tensor product of the Hilbert spaces of the sub-entities in question. This is also a possibility, when modeling conceptual combinations, which proved to be very useful in the quantum modeling of data from cognitive experiments, particularly in relation to the notion of entanglement (see [6, 7] and the references cited therein), but in the present analysis we focus more directly on the superposition principle (and the interference effects it subtends) as a mechanism for accounting for the emergence of meaning when concepts are considered in a combined way [2] (see however the discussion in the first part of Sect. 5).

Since we are placing ourselves in the same paradigmatic situation of the double-slit experiment, we want to consider how the combination of two concepts A and B (let us denote the combination AB) can manifest at the level of the Web stories, in the ambit of a “tell a story measurement.” Here we consider the notion of “combination of two concepts” in a very general way, in the sense that we do not specify how the combination of A and B is actually implemented, at the conceptual level. In human language, if A is the concept Fruits and B is the concept Vegetables, their combination can, for instance, be Fruits–vegetables, Fruits and vegetables, Fruits or vegetables, Fruits with vegetables, Fruits are sweeter than vegetables, etc., which of course carry different meanings, i.e., describe different states of their two-concept combination. In fact, also stories which are jointly about Fruits and Vegetables can be considered to be possible states of the combination of these two concepts. All these possibilities give rise to different states |ψ AB〉, describing the combination of the two concepts A and B.

These two concepts can be seen to play the same role as the two slits in the double-slit experiment. When the two slits are jointly open, we are in the same situation as when the two concepts A and B are jointly considered in the combination AB, producing a state |ψ AB〉 that we can describe as the superposition of two states |ψ A〉 and |ψ B〉, which are the states of the concepts A and B, respectively, when considered not in a combination, and which play the same role as the states of the electron in the double-slit experiment traversing the barrier when only one of the two slits is kept open at a time. Of course, different superposition states can in principle be defined, each one describing a different state of the combination of the two concepts, but here we limit ourselves to the superposition (5), where the states |ψ A〉 and |ψ B〉 have the exact same weight in the superposition.

Let now X be a given concept. It can be a concept described by a single word or a more complex concept described by the combination of multiple concepts. We consider the projection operator \(M_X^w\), onto the set of states that are manifest stories about X. This means that we can write:

$$\displaystyle \begin{aligned} M_X^w=\sum_{i\in J_X} |e_i\rangle\langle e_i|, {} \end{aligned} $$
(10)

where J X is the set of indexes associated with the webpages that are manifest stories about X, where by “manifest” we mean stories that explicitly contain the word(s) “X” indicating the concept X, hence the superscript “w” in the notation, which stands for “word.” Indeed, we could as well have defined a more general projection operator \(M_X^s=\sum _{i\in I_X} |e_i\rangle \langle e_i|\), onto the set of states that are stories about X not necessarily of the manifest kind, i.e., not necessarily containing the explicit word(s) indicating the concept(s) the stories are about, with J X ⊂ I X, and the superscript “s” now standing for “story.”

To avoid possible confusions, we emphasize again the difference between the notion of state of a concept and that of story about a concept. The latter, in our definition, is a webpage, i.e., a full-fledged printed or printable document. But webpages that are stories about a concept may explicitly contain the word indicating such concept or not. For example, one can conceive a text explaining what Fruits are, without ever writing the word “fruits” (using in replacement other terms, like “foods in the same category of pineapple, pears, and bananas”). On the other hand, the notion of state of a concept expresses a condition which cannot in general be reduced to that of a story, as it can also be a superposition of stories of that concept (or better, a superposition of the states associated with the stories of that concept), as expressed, for instance, in (7) and (9), and a superposition of (states of) stories is not anymore a (state of a) story.

Now, when considering a “tell a story measurement,” we can also decide to only focus on stories having a predetermined content. In the double-slit experiment, this would correspond to being interested only in the detection of the electron by a certain subset of cells, indicated by a given set of indexes J X, and not by the others. More specifically, we can consider only those stories that are “stories about X,” where X is a given concept. This means that if the QWeb is in a pre-measurement state |ψ A〉, which is the state of a given concept A, what we are asking through the measurement is whether the stories about X are good representatives of A in state |ψ A〉 (in the same way we can ask if a certain subset of traces of impact, say those of the central fringe, is a good example of electrons passing through the left slit; see the discussion of Sect. 3). In other words, we are asking how much |ψ A〉 is meaning connected to concept X, when the latter is in one of the maximally concrete states defined by the webpages that are “stories of X” or even more specifically “manifest stories of X.”

In the latter case, we can test this by using the projection operator \(M_X^w\) and the Born rule. According to (4), the probability μ A with which the concept A in state |ψ A〉 is evaluated to be well represented by a “manifest story about X” is given by the average:

$$\displaystyle \begin{aligned} \mu_A(i\in J_X)= \langle\psi_A| M_X^w |\psi_A\rangle=\sum_{i\in J_X} |\langle e_i|\psi_A\rangle|{}^2 = \sum_{i\in J_X}a_i^2, {} \end{aligned} $$
(11)

where for the last equality we have used (9). If we additionally assume that A is more specifically described by a state that is a superposition only of those stories that explicitly contain the word “A” (manifest stories about A), the above probability becomes (omitting from now on the argument, to simplify the notation): \(\mu _A= \sum _{i\in J_{A,X}}a_i^2\), where J A,X denotes the set of indexes associated with the webpages jointly containing the words “A” and “X.” Note that if n A,X = |J A,X| is the number of webpages containing both terms “A” and “X,” and n A = |J A| and n X = |J X| are the numbers of webpages containing the “A” term and the “X” term, respectively, we have n A,X ≤ n A and n A,X ≤ n X. Becoming even more specific, we can consider states of A expressing a uniform meaning connection towards all the different manifest stories about A, that is, characteristic function states of the form:

$$\displaystyle \begin{aligned} |\chi_A\rangle={1\over \sqrt{n_A}} \sum_{j\in J_A} e^{i\alpha_j}|e_j\rangle, {} \end{aligned} $$
(12)

for which the probability (11) becomes:

$$\displaystyle \begin{aligned} \mu_A= \langle\chi_A| M_X^w |\chi_A\rangle=\sum_{i\in J_{A,X}}{1\over n_A}= {n_{A,X}\over n_A}, {} \end{aligned} $$
(13)

which can be simply interpreted as the probability of randomly selecting a webpage containing the term “X,” among those containing the term “A.”
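The counting interpretation of (13) can be illustrated on a toy corpus. In the sketch below, each webpage is reduced to the set of words it contains; the six pages, the choice A = "fruits" and X = "juice", and the function names are all hypothetical and serve only to show that the Born-rule value of (11) coincides with the ratio n A,X / n A for a characteristic function state.

```python
import cmath

# Toy corpus (assumption): each webpage is represented by the set of words it contains.
pages = [
    {"fruits", "juice"},
    {"fruits", "vegetables", "juice"},
    {"fruits"},
    {"vegetables"},
    {"fruits", "vegetables"},
    {"juice"},
]

def index_set(word):
    """J_word: indexes of the pages that explicitly contain `word` (manifest stories)."""
    return {i for i, page in enumerate(pages) if word in page}

def characteristic_state(word, phases=None):
    """|chi_word>: equal moduli on the manifest stories about `word`, zero elsewhere."""
    J = index_set(word)
    phases = phases or {}
    return [
        cmath.exp(1j * phases.get(j, 0.0)) / len(J) ** 0.5 if j in J else 0j
        for j in range(len(pages))
    ]

def probability(state, word):
    """<state| M_word^w |state>: sum of |amplitude|^2 over J_word, as in (11)."""
    return sum(abs(state[i]) ** 2 for i in index_set(word))

J_A, J_X = index_set("fruits"), index_set("juice")
mu_A = probability(characteristic_state("fruits"), "juice")   # Born-rule value
ratio = len(J_A & J_X) / len(J_A)                             # n_{A,X} / n_A
```

Here `mu_A` and `ratio` agree, whatever phases are passed to `characteristic_state`.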

With respect to the double-slit experiment analogy, the probability μ A describes the “only left slit open” situation, and of course, mutatis mutandis, we can write (with obvious notation) an equivalent expression for a different concept B: \(\mu _B= \langle \chi _B| M_X^w |\chi _B\rangle =\sum _{i\in J_{B,X}}{1\over n_B}= {n_{B,X}\over n_B}\). So, when calculating the probability μ AB for the combination AB of the two concepts A and B, we are in a situation equivalent to when the two slits are kept jointly open, with the question asked being now about the meaning connection between AB, in state |ψ AB〉, and a (here manifest) story about X. Concerning the state |ψ AB〉, describing the combination, we want it to be able to account for the emergence of meanings that can possibly arise when the two concepts A and B are considered one in the context of the other, and for consistency reasons we expect the probability μ AB to be equal to \({n_{AB,X}\over n_{AB}}\) (since we are here limiting our discussion, for simplicity, to manifest stories), where n AB is the number of webpages containing both the “A” and “B” terms and n AB,X is the number of webpages containing in addition also the “X” term, and of course: n AB,X ≤ n AB, n AB ≤ n A, and n AB ≤ n B. This can be easily achieved if the state of AB is taken to be the characteristic function state: \(|\chi _{AB}\rangle ={1\over \sqrt {n_{AB}}} \sum _{j\in J_{AB}} e^{i\delta _j}|e_j\rangle \); however, coming back to our discussion of Sect. 2, this would not be a satisfactory way to proceed, as the modeling would then remain at the level of the canvas of printed documents of the Web, and would therefore not be able to capture the level of meaning associated with it, that is, the more abstract QWeb entity. It is only at the level of the latter that emergent meanings can be explained as the result of combining concepts.

By analogy with the paradigmatic double-slit experiment, we will here assume that a state of AB, i.e., a state of the combination of the two concepts A and B, when they are in individual states |ψ A〉 and |ψ B〉, respectively, can be generally represented as a superposition vector (5). Since here we are considering the special case where these states are characteristic functions, we more specifically have:

$$\displaystyle \begin{aligned} |\psi_{AB}\rangle= {1\over\sqrt{2}}(|\chi_A\rangle + |\chi_B\rangle), \end{aligned} $$
(14)

where we have assumed for simplicity that |χ A〉 and |χ B〉 can be taken to be orthogonal states (this need not be the case in general). The interference contribution \(\mathrm {Int}_{AB}= \mathfrak {R}\, \langle \chi _A |M_X^w|\chi _B\rangle \) can then be calculated by observing that:

$$\displaystyle \begin{aligned} \begin{array}{rcl} M^w_X |\chi_B\rangle&\displaystyle =&\displaystyle \left(\sum_{j\in J_X} |e_j\rangle\langle e_j| \right)\left({1\over \sqrt{n_B}} \sum_{k\in J_B} e^{i\beta_k}|e_k\rangle\right)\\ &\displaystyle =&\displaystyle {1\over \sqrt{n_B}}\sum_{j\in J_X} \sum_{k\in J_B}e^{i\beta_k}|e_j\rangle\underbrace{\langle e_j|e_k\rangle}_{\delta_{jk}}={1\over \sqrt{n_B}}\sum_{j\in J_{B,X}}e^{i\beta_j}|e_j\rangle, {} \end{array} \end{aligned} $$
(15)

so that, multiplying the above expression from the left by 〈χ A| and taking the real part, we obtain:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{Int}_{AB}&\displaystyle =&\displaystyle \mathfrak{R}\left({1\over \sqrt{n_A}}\sum_{j\in J_A} e^{-i\alpha_j}\langle e_j|\right)\left({1\over \sqrt{n_B}}\sum_{k\in J_{B,X}}e^{i\beta_k}|e_k\rangle\right)\\ &\displaystyle =&\displaystyle {1\over \sqrt{n_An_B}}\sum_{j\in J_A}\sum_{k\in J_{B,X}}\underbrace{\langle e_j|e_k\rangle}_{\delta_{jk}} \underbrace{\mathfrak{R}\, e^{i(\beta_k-\alpha_j)}}_{\cos{}(\beta_k-\alpha_j)}=\sum_{j\in J_{AB,X}} {\cos{}(\beta_j-\alpha_j)\over \sqrt{n_An_B}}. {} \end{array} \end{aligned} $$
(16)

According to (6), (13), and (16), the probability μ AB for the combined concept AB is therefore:

$$\displaystyle \begin{aligned} \mu_{AB}= {1\over 2}\Big(\underbrace{n_{A,X}\over n_A}_{\mu_A} +\underbrace{n_{B,X}\over n_B}_{\mu_B}\Big)+\sum_{j\in J_{AB,X}} {\cos{}(\beta_j-\alpha_j)\over \sqrt{n_An_B}}. {} \end{aligned} $$
(17)

It is important to observe in (17) the role played by the phases α j and β j characterizing the states |χ A〉 and |χ B〉. When they are varied, the individual probabilities μ A and μ B remain perfectly invariant, whereas the values of μ AB can explore an entire range of values, within the interference interval \(I_{AB}= [\mu ^{\mathrm {min}}_{AB},\mu ^{\mathrm {max}}_{AB}]\), where according to (17) we have:

$$\displaystyle \begin{aligned} \begin{aligned} \mu^{\mathrm{min}}_{AB}&= {1\over 2}\left({n_{A,X}\over n_A}+{n_{B,X}\over n_B}\right)-{n_{AB,X}\over \sqrt{n_An_B}},\\ \mu^{\mathrm{max}}_{AB}&= {1\over 2}\left({n_{A,X}\over n_A}+{n_{B,X}\over n_B}\right)+{n_{AB,X}\over \sqrt{n_An_B}}. {} \end{aligned} \end{aligned} $$
(18)

Therefore, we see that via the interference effects, the co-occurrence of the terms “A,” “B,” and “X” can be related to what is revealed in the Web for the co-occurrence of just “A” and “X” or the co-occurrence of just “B” and “X.” This means that it is really at the more abstract level of the QWeb, and not of the Web, that these three situations of co-occurrence can be seen to be related to each other.
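Formula (17) can be checked against a direct Born-rule evaluation on explicit vectors. The following sketch uses hypothetical index sets on n = 6 pages and an arbitrary phase difference `theta`; the phase on page 3 is chosen so that the two characteristic function states are orthogonal, as assumed in (14). It is a toy verification under these assumptions, not a general implementation.

```python
import cmath
import math

n = 6
J_A, J_B, J_X = {0, 1, 2, 3}, {2, 3, 4, 5}, {0, 2, 4}   # toy index sets (assumption)
theta = 2 * math.pi / 3                                  # free phase difference on page 2

alpha = {j: 0.0 for j in J_A}
beta = {j: 0.0 for j in J_B}
beta[2] = theta              # page 2 lies in J_{AB,X}: its phase difference drives the interference
beta[3] = theta + math.pi    # page 3 lies in J_AB but not in J_X: makes <chi_A|chi_B> vanish

def chi(J, phases):
    """Characteristic function state of the form (12)."""
    return [cmath.exp(1j * phases[j]) / math.sqrt(len(J)) if j in J else 0j for j in range(n)]

chi_A, chi_B = chi(J_A, alpha), chi(J_B, beta)
overlap = sum(a.conjugate() * b for a, b in zip(chi_A, chi_B))   # should be ~0
psi_AB = [(a + b) / math.sqrt(2) for a, b in zip(chi_A, chi_B)]  # superposition (14)

# Direct evaluation of <psi_AB| M_X^w |psi_AB>
mu_direct = sum(abs(psi_AB[i]) ** 2 for i in J_X)

# Closed form (17), computed from counts and phase differences only
n_A, n_B = len(J_A), len(J_B)
mu_A = len(J_A & J_X) / n_A
mu_B = len(J_B & J_X) / n_B
interference = sum(math.cos(beta[j] - alpha[j]) for j in J_A & J_B & J_X) / math.sqrt(n_A * n_B)
mu_formula = 0.5 * (mu_A + mu_B) + interference
```

Varying `theta` moves `mu_direct` across the interval (18) while `mu_A` and `mu_B` stay fixed, which is the behavior described in the text.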

5 Adding Context

According to (18), by using the superposition principle and the corresponding interference effects, we can extend the values of the probability μ AB beyond those specified by the uniform average \(\mu _{AB}^{\mathrm {uni}}={1\over 2}\left ({n_{A,X}\over n_A}+{n_{B,X}\over n_B}\right )\). One may wonder then if, generally speaking, interference effects would be sufficient to model all possible situations. The answer is negative, and to see why let us consider a simple example of a collection of documents for which interference effects are insufficient for their modeling.Footnote 9

Assume that the collection is formed by n documents (n ≥ 140), that n A = 100 of them contain a given word “A,” and that n B = 50 of them contain another word “B.” Also, the number of documents containing both words is assumed to be n AB = 10 (see Fig. 5). Consider then a third word “X,” which is assumed to be present in 80 of the documents containing the word “A,” in 15 of the documents containing the word “B,” and in 5 of the documents containing both words, that is: n A,X = 80, n B,X = 15, n AB,X = 5. So, \(\mu _A={n_{A,X}\over n_A}={80\over 100}=0.8\), \(\mu _B={n_{B,X}\over n_B}={15\over 50}=0.3\), and \(\mu _{AB}^{\mathrm {uni}}={1.1\over 2}= 0.55\). We also have \({n_{AB,X}\over \sqrt {n_An_B}}={5\over \sqrt {5000}}\approx 0.07\), so that \(\mu ^{\mathrm {min}}_{AB}\approx 0.55-0.07=0.48\) and \(\mu ^{\mathrm {max}}_{AB}\approx 0.55+0.07=0.62\).

Fig. 5 A schematic Venn-diagram representation of the number of documents containing the words “A,” “B,” and “X” (left), which can be modeled using only interference effects, and the words “A,” “B,” and “Y” (right), which instead also require context effects

Now, as we said, μ AB, for consistency reasons, should be equal to \({n_{AB,X}\over n_{AB}}={5\over 10}=0.5\), i.e., to the probability of randomly selecting a document containing the word “X,” among those containing the words “A” and “B.” Since 0.5 is contained in the interference interval I AB = [0.48, 0.62], by a suitable choice of the phase differences in (17), the equality \(\mu _{AB} = {n_{AB,X}\over n_{AB}}\) can be obtained; hence, interference effects are sufficient to model this situation. But if we consider a word “Y” that, unlike “X,” would only be present in 10 of the documents containing the word “A” and in 10 of those containing the word “B,” and again in 5 of those containing both words (see Fig. 5), this time we have \(\mu _A={n_{A,Y}\over n_A}={10\over 100}=0.1\), \(\mu _B={n_{B,Y}\over n_B}={10\over 50}=0.2\), and \(\mu _{AB}^{\mathrm {uni}} ={0.3\over 2}= 0.15\). So, μ min ≈ 0.15 − 0.07 = 0.08 and μ max ≈ 0.15 + 0.07 = 0.22, which means that \({n_{AB,Y}\over n_{AB}}=0.5\) is no longer contained in the interference interval I AB = [0.08, 0.22]. Hence, interference effects are not sufficient to model this situation.
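The arithmetic of the two examples can be reproduced with a few lines of Python. The sketch below evaluates the interval (18) from the document counts given above (taking n AB,Y = 5, as implied by the ratio 0.5) and, for the word “X,” computes one possible common phase difference on the pages of J AB,X that yields the target value 0.5. The function names are illustrative assumptions.

```python
import math

def interference_interval(n_A, n_B, n_AX, n_BX, n_ABX):
    """Return (mu_min, mu_max) of (18) from document counts."""
    mu_uni = 0.5 * (n_AX / n_A + n_BX / n_B)
    half_width = n_ABX / math.sqrt(n_A * n_B)
    return mu_uni - half_width, mu_uni + half_width

def can_model(target, interval, tol=1e-12):
    """True if `target` lies inside the closed interference interval."""
    lo, hi = interval
    return lo - tol <= target <= hi + tol

n_A, n_B, n_AB = 100, 50, 10
target = 5 / n_AB                                   # n_{AB,X}/n_AB = n_{AB,Y}/n_AB = 0.5

I_X = interference_interval(n_A, n_B, 80, 15, 5)    # word "X"
I_Y = interference_interval(n_A, n_B, 10, 10, 5)    # word "Y"

x_ok = can_model(target, I_X)   # interference alone suffices for "X"
y_ok = can_model(target, I_Y)   # but not for "Y"

# For "X": solve 0.55 + 5*cos(phi)/sqrt(5000) = 0.5 for a common phase difference phi
cos_phi = (target - 0.55) * math.sqrt(n_A * n_B) / 5
phi = math.acos(cos_phi)        # close to 3*pi/4, i.e. 135 degrees
```

With these counts, a common phase difference of about 135° on the five pages containing “A,” “B,” and “X” reproduces the value 0.5, whereas no choice of phases can do so for “Y.”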

Additional mechanisms should therefore be envisioned to account for all the probabilities that can be calculated by counting the relative number of documents containing certain words and co-occurrences of words. A possibility is to explore more general forms of measurements on more general versions of the QWeb entity. In our approach here, we focused on the superposition principle to account for the emergence of new meanings when concepts are combined. But of course, when a cognitive entity interacts with a meaning entity, the emergence of meaning is not the only element that might play a role. In human reasoning, for instance, a two-layer structure can be evidenced: one consisting of conceptual thoughts, where a combination of concepts is evaluated as a new single concept, and the other consisting of classical logical thoughts, where a combination of concepts is evaluated as a classical combination of different entities [17].

To also account for the existence of classical logical reasoning, one can define more general “tell a story measurements,” by considering a specific type of Hilbert space called Fock space, originally used in quantum field theory to describe situations where there is a variable number of identical entities. This amounts to considering the QWeb as a more general “quantum field entity” that can be in different number operator states and in different superpositions of these states. In the present case, since we are only considering the combination of two concepts, the construction of the Fock space \({\mathcal {F}}\) can be limited to two sectors: \({\mathcal {F}}= {\mathcal {H}}\oplus ({\mathcal {H}}\otimes {\mathcal {H}})\), where “⊕” denotes a direct sum between the first sector \({\mathcal {H}}\) (isomorphic to \({\mathbb C}^n\)) and the second sector \({\mathcal {H}}\otimes {\mathcal {H}}\) (isomorphic to \({\mathbb C}^{n^2}\)), where “⊗” denotes the tensor product. The first sector describes the one-entity states, where the combination of the two concepts A and B is evaluated as a new (emergent) concept, typically described by a superposition state (5). The second sector describes the two-entity situation, where the two concepts A and B remain separate in their combination, which is something that can be described by a so-called product (non-entangled) state |ψ A〉⊗|ψ B〉.

Instead of (5), we can then consider the more general superposition state:

$$\displaystyle \begin{aligned} |\psi_{AB}\rangle= \sqrt{1-m^2}\, e^{i\nu}{1\over\sqrt{2}}(|\psi_A\rangle + |\psi_B\rangle) + m\, e^{i\lambda} |\psi_A\rangle\otimes |\psi_B\rangle, {} \end{aligned} $$
(19)

where the number 0 ≤ m ≤ 1 determines the degree of participation in the second sector. Also, instead of (10), we have to consider a more general projection operator, acting now on both sectors. Here we can distinguish the two paradigmatic projection operators:

$$\displaystyle \begin{aligned} M^{w,\mathrm{and}}_X= M_X^w\oplus(M_X^w \otimes M_X^w), \quad M^{w,\mathrm{or}}_X= M_X^w\oplus(M_X^w \otimes \mathbb{I} + \mathbb{I} \otimes M_X^w -M_X^w \otimes M_X^w), {} \end{aligned} $$
(20)

where \(M^{w,\mathrm {and}}_X\) describes the situation where the combination of concepts is logically evaluated as a conjunction (and), whereas \(M^{w,\mathrm {or}}_X\) describes the situation where the combination of concepts is logically evaluated as a disjunction (or). When we use \(M^{w,\mathrm {and}}_X\), one finds, in replacement of (6), the more general formulaFootnote 10:

$$\displaystyle \begin{aligned} \mu_{AB}=m^2\, \mu^{\mathrm{and}}_{AB} +(1-m^2)\left[{1\over 2}(\mu_A + \mu_B)+\mathrm{Int}_{AB}\right], {} \end{aligned} $$
(21)

where \(\mu ^{\mathrm {and}}_{AB} = \mu _A\mu _B\). However, this will not be sufficient to model all possible data, as it is clear that in the previously mentioned example of word “Y,” we have: \(\mu ^{\mathrm {and}}_{AB}=0.02\), so that the interval of values that can be explored by the above convex combination (by varying not only the phases α j and β j, but now also the coefficient m) is [0.02, 0.22], which still does not contain the value 0.5 of \({n_{AB,Y}\over n_{AB}}\). When we use instead \(M^{w,\mathrm {or}}_X\), we have to replace \(\mu ^{\mathrm {and}}_{AB}\) in (21) by \(\mu ^{\mathrm {or}}_{AB}=\mu _A +\mu _B - \mu _A\mu _B\), whose value for the word “Y” of our example is 0.28, so that the interval of possible values becomes [0.08, 0.28], which however is still not sufficient.
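The two intervals quoted for the word “Y” follow from the fact that (21) is a convex combination of a second-sector value and a first-sector value. The sketch below, with a hypothetical helper `two_sector_range`, computes the resulting range for the conjunction and disjunction cases, using the counts of the running example.

```python
import math

def two_sector_range(mu_A, mu_B, n_ABX, n_A, n_B, rule="and"):
    """Range of mu_AB in (21) when the phases and the weight 0 <= m <= 1 are varied."""
    if rule == "and":
        mu_classical = mu_A * mu_B                   # conjunction in the second sector
    elif rule == "or":
        mu_classical = mu_A + mu_B - mu_A * mu_B     # disjunction in the second sector
    else:
        raise ValueError("rule must be 'and' or 'or'")
    mu_uni = 0.5 * (mu_A + mu_B)
    half_width = n_ABX / math.sqrt(n_A * n_B)
    lo1, hi1 = mu_uni - half_width, mu_uni + half_width   # first-sector interval (18)
    # A convex combination of mu_classical with any value in [lo1, hi1]
    # spans the smallest interval containing both.
    return min(mu_classical, lo1), max(mu_classical, hi1)

# Word "Y": mu_A = 0.1, mu_B = 0.2, n_{AB,Y} = 5, n_A = 100, n_B = 50
and_range = two_sector_range(0.1, 0.2, 5, 100, 50, "and")   # about (0.02, 0.22)
or_range = two_sector_range(0.1, 0.2, 5, 100, 50, "or")     # about (0.08, 0.28)
target = 0.5
reachable = any(lo <= target <= hi for lo, hi in (and_range, or_range))
```

In both cases `reachable` is false: the target value 0.5 stays outside the range, in line with the conclusion that a further mechanism is needed.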

So, we must find some other cognitive effects, in order to be able to model and provide an explanation for a wide spectrum of experimental values for the probabilities, related to different possible collections of documental entities. A general way of proceeding, remaining in a “first sector” modeling of the QWeb, is to consider that there are also context effects that can alter the QWeb state before it is measured. In the double-slit experiment analogy, we can imagine a mask placed somewhere in between the barrier and the screen, acting as a filter allowing certain states to pass through, whereas others will be blocked (see Fig. 6). Note that if we place the mask close to the detection screen, some cells will be deactivated, as the components of the pre-measurement state relative to them will be filtered out by the mask. On the other hand, if it is placed close to the double-slit barrier, it makes it possible to control the transmission through the slits and to produce, by changing its position, a continuum of interference figures, for instance interpolating between the probability distributions of the two one-slit arrangements; see [21]. More complex effects can of course be obtained if the mask is placed at some finite distance from the barrier and the screen, and more general filters than just masks can also be considered, but their overall effect will always be that only certain states will be allowed to interact with the measuring apparatus (here the screen).

Fig. 6. By placing a screen with a mask (or, more generally, a filter) between the barrier and the detection screen, the structure of the observed interference pattern can be modulated. The effect of this additional structure can be ideally described using a projection operator.

From a cognitivistic viewpoint, context effects can have different origins and logics. For instance, we can consider that an interrogative context, for the very fact that a given question is asked, will inevitably alter the state of the meaning entity under consideration. Even more specifically, consider the example of a cognitive entity that is asked to tell a story (it can be a person, a search engine, or the combination of both). For this, a portion of the entity’s memory needs to become accessible, and one can imagine that the extent and nature of such available portion of memory can depend on the story that is being asked (see footnote 11).

So, we will now assume that when the QWeb entity is subjected to a “tell a story measurement,” there will be a preliminary change of state, and we will adopt the very simple modeling of such state change by means of an orthogonal projection operator, which in general can also depend on the choice of stories we are interested in, like “stories about X,” so we will generally write \(N_X\) for it (\(N_X^2=N_X=N_X^\dagger \)). Just to give a simple example of an X-dependent projection \(N_X\), it could be taken to be the projection operator onto the subspace of QWeb’s states that are “states of X” (we recall that a “state of X” is generally not necessarily also a “story about X”). However, in the following we will just limit ourselves to the idealization that context effects can be formally modeled using a projection operator, without specifying their exact nature and origin. So, the presence of this additional context produces the pre-measurement transitions: \(|\psi _A\rangle \to |\psi ^{\prime }_A\rangle \), \(|\psi _B\rangle \to |\psi ^{\prime }_B\rangle \), and \(|\psi _{AB}\rangle \to |\psi ^{\prime }_{AB}\rangle \), where we have defined (from now on, for simplicity, we just write \(N\) for \(N_X\), dropping the X-subscript):

$$\displaystyle \begin{aligned} |\psi^{\prime}_A\rangle = {N|\psi_A\rangle \over \|N|\psi_A\rangle\|}, \quad |\psi^{\prime}_B\rangle = {N|\psi_B\rangle \over \|N|\psi_B\rangle\|},\quad |\psi^{\prime}_{AB}\rangle = {N|\psi_{AB}\rangle \over \|N|\psi_{AB}\rangle\|}. \end{aligned} $$
(22)

With the above re-contextualized states, the probability \(\mu _A = \langle \psi ^{\prime }_A| M_X^w |\psi ^{\prime }_A\rangle \) becomes:

$$\displaystyle \begin{aligned} \mu_A = {\langle\psi_A | N^\dagger M_X^w N|\psi_A\rangle\over \|N|\psi_A\rangle\|^2}={\langle\psi_A | N M_X^w N|\psi_A\rangle\over \langle\psi_A | N |\psi_A\rangle}={1\over p_A}\langle\psi_A | N M_X^w N|\psi_A\rangle, {} \end{aligned} $$
(23)

where for the second equality we have used \(\|N|\psi_A\rangle\|^2 = \langle\psi_A|N^\dagger N|\psi_A\rangle = \langle\psi_A|N^2|\psi_A\rangle = \langle\psi_A|N|\psi_A\rangle\), and for the last equality we have defined the probability \(p_A = \langle\psi_A|N|\psi_A\rangle\) for the state \(|\psi_A\rangle\) to be an eigenstate of the context \(N\). Similar expressions clearly hold for the concept B: \(\mu _B = {1\over p_B}\langle \psi _B | N M_X^w N|\psi _B\rangle \), with \(p_B = \langle\psi_B|N|\psi_B\rangle\). For the probability \(\mu _{AB}=\langle \psi ^{\prime }_{AB}| M_X^w |\psi ^{\prime }_{AB}\rangle \), relative to the concept combination AB, we now have:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mu_{AB} &\displaystyle =&\displaystyle {\langle\psi_{AB} | N^\dagger M_X^w N|\psi_{AB}\rangle\over \|N|\psi_{AB}\rangle\|{}^2} = {\langle\psi_{AB} | NM_X^w N|\psi_{AB}\rangle\over \langle\psi_{AB} | N |\psi_{AB}\rangle} \\ &\displaystyle =&\displaystyle {\langle\psi_A | NM_X^w N|\psi_A\rangle +\langle\psi_B | NM_X^w N|\psi_B\rangle + 2 \mathfrak{R}\, \langle\psi_A | NM_X^w N|\psi_B\rangle \over \langle\psi_A | N |\psi_A\rangle +\langle\psi_B | N |\psi_B\rangle + 2 \mathfrak{R}\, \langle\psi_A | N |\psi_B\rangle}\\ &\displaystyle =&\displaystyle {p_A\,\mu_A +p_B\, \mu_B + 2 \mathfrak{R}\, \langle\psi_A | NM_X^w N|\psi_B\rangle \over p_A +p_B + 2 \mathfrak{R}\, \langle\psi_A | N |\psi_B\rangle}. {} \end{array} \end{aligned} $$
(24)

The first two terms in the numerator of (24) correspond to a weighted average, whereas the third term, in both the numerator and the denominator, is the interference-like contribution. Note that in the special case where \(|\psi_A\rangle\) and \(|\psi_B\rangle\) are eigenstates of the context \(N\), that is, \(N|\psi_A\rangle = |\psi_A\rangle\) and \(N|\psi_B\rangle = |\psi_B\rangle\), we have \(p_A = p_B = 1\), so that (24) reduces to (6), or, if \(|\psi_A\rangle\) and \(|\psi_B\rangle\) are not orthogonal vectors, to:

$$\displaystyle \begin{aligned} \mu_{AB}= {{1\over 2}(\mu_A +\mu_B) + \mathfrak{R}\, \langle\psi_A | M_X^w |\psi_B\rangle \over 1 + \mathfrak{R}\, \langle\psi_A |\psi_B\rangle}, {} \end{aligned} $$
(25)

where the weighted average now becomes a uniform one. The more general expression (24), incorporating both context and interference effects, can cover a much larger range of values. In fact, as we show in Appendix 1, under certain assumptions the full [0, 1] interval of values can be spanned, thus allowing all possible data about occurrence and co-occurrence of words to be modeled.
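As a numerical sanity check of (24), the following sketch builds a small toy instance and compares the direct computation of \(\langle \psi ^{\prime }_{AB}| M_X^w |\psi ^{\prime }_{AB}\rangle \) with the right-hand side of (24). Everything specific in it is an assumption made for illustration only: the dimension n = 8, the index sets, the set of pages taken to contain the word, and the choice of context \(N = \mathbb{I} - |v\rangle\langle v|\) for a random unit vector \(|v\rangle\). The text deliberately leaves the nature of \(N\) unspecified.

```python
import cmath
import math
import random

rng = random.Random(0)
n = 8                      # toy number of webpages (assumption)
J_A, J_B = [0, 1, 2], [3, 4, 5]   # disjoint supports, so psi_A and psi_B are orthogonal
WORD_PAGES = {1, 2, 3, 6}  # pages assumed to contain the word X


def inner(u, w):
    """Hermitian inner product <u|w>."""
    return sum(x.conjugate() * y for x, y in zip(u, w))


def normalize(u):
    norm = math.sqrt(inner(u, u).real)
    return [x / norm for x in u]


def uniform_state(indices):
    """Equal-modulus amplitudes with random phases on the given pages."""
    amp = 1 / math.sqrt(len(indices))
    psi = [0j] * n
    for j in indices:
        psi[j] = amp * cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    return psi


def apply_context(v, psi):
    """Apply N = I - |v><v|, an orthogonal projector when v is a unit vector."""
    overlap = inner(v, psi)
    return [x - overlap * y for x, y in zip(psi, v)]


def apply_measurement(psi):
    """Apply the diagonal projector that keeps only the components in WORD_PAGES."""
    return [x if j in WORD_PAGES else 0j for j, x in enumerate(psi)]


def sandwich(u, w, v):
    """<u| N M N |w>, using the fact that N is self-adjoint."""
    return inner(apply_context(v, u), apply_measurement(apply_context(v, w)))


psi_a, psi_b = uniform_state(J_A), uniform_state(J_B)
psi_ab = [(x + y) / math.sqrt(2) for x, y in zip(psi_a, psi_b)]
v = normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)])

# Direct route: re-contextualize the state as in (22), then measure.
phi = normalize(apply_context(v, psi_ab))
mu_direct = inner(phi, apply_measurement(phi)).real

# Route via (24): weighted average plus interference-like terms.
p_a = inner(psi_a, apply_context(v, psi_a)).real
p_b = inner(psi_b, apply_context(v, psi_b)).real
mu_a = sandwich(psi_a, psi_a, v).real / p_a
mu_b = sandwich(psi_b, psi_b, v).real / p_b
cross_num = 2 * sandwich(psi_a, psi_b, v).real
cross_den = 2 * inner(psi_a, apply_context(v, psi_b)).real
mu_formula = (p_a * mu_a + p_b * mu_b + cross_num) / (p_a + p_b + cross_den)

print(f"direct: {mu_direct:.6f}   formula (24): {mu_formula:.6f}")
```

The two routes should agree up to floating-point error for any choice of phases and of \(|v\rangle\), since (24) is an algebraic identity. The sketch does not explore which values of \(\mu_{AB}\) are reachable, which is the subject of Appendix 1.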

6 Conclusion

In this chapter, we have motivated a fundamental distinction between the Web of printed pages (or any other collection of documental entities) and a more abstract entity of meaning associated with it, which we have called the QWeb, for which we have proposed a Hilbertian (Born rule based) quantum model. In our discussion, we have focused on an important class of measurements, which we have called the “tell a story measurements,” whose outcome-states are associated with the n webpages and were taken to form a basis of the (n-dimensional) Hilbert space. We have tested the model by considering the specific situation where only stories manifestly containing the words denoting certain concepts are considered, so that the theoretical probabilities can be related to those obtained by calculating the relative frequencies of occurrence and co-occurrence of these words, which in turn depend on how strongly the associated concepts are meaning-connected. We have done so by also considering context effects, in addition to interference effects, the former being modeled by means of orthogonal projection operators and the latter by means of superposition states. Also, we have extensively used the double-slit experiment as a guideline to motivate the transmigration of fundamental notions from physics to human cognition and theoretical computer science.

Note that more general models than those explored here can also be considered, exploiting more general versions of the quantum formalism, like the GTR-model and the extended Bloch representation of quantum mechanics [8, 9, 10, 12, 13]. Hence, the “Q” in “QWeb” refers to a quantum structure that need not be understood in the limited sense of the standard quantum formalism. We have also mentioned in Sect. 5 the possibility of working in a multi-sector Fock space, as a way to extend the range of probabilities that can be modeled. However, we observed that not all values can be modeled in this way. Another direction that can be explored (as an alternative to context effects) is to consider states whose meaning connections are not necessarily uniform, although still localized within the sets \(J_A\) and \(J_B\). A further direction is to consider step function states extending beyond the manifest word subspaces, for example, states of the form \(|\psi _A^{\, a}\rangle = a\, |\chi _A\rangle + \bar {a}\, |\bar {\chi }_A\rangle \), where \(|\bar {\chi }_A\rangle ={1\over \sqrt {n-n_A}} \sum _{j\notin J_A}e^{i\alpha _j}|e_j\rangle \) and \(|a|^2 + |\bar {a}|^2 =1\).

Regarding the co-occurrences of words in documents, it is worth observing that they are determined by the meaning carried by the corresponding concepts and documents, and not by the physical properties of the latter. This means that we can access the traces left by meaning by analyzing the co-occurrence of words in the different physical (printed or stored in memory) documents, and that such meaning “sticks out” from the latter in ways that can be accessed without the intervention of the human minds that created it. Note however that the meaning extending out of these documents, here the webpages, is not the full meaning of the QWeb, as encoded in its quantum state. This is so because one cannot reconstruct the pre-measurement state of a quantum measurement by having access only to the outcome (collapsed) states and the associated probabilities of a single measurement. For this, one needs to perform a series of different measurements, characterized by different informationally complete bases, as is done in so-called quantum state tomography [22]. Here we only considered the basis associated with the webpages, and it is still unclear which complementary measurements could be defined, using different bases and having a clear operational meaning, that is, which can be concretely performed, at least in principle [4].

Let us also observe that, generally speaking, in IR situations the modeling of how human minds interact with the QWeb can and will also play a role, in addition to the modeling per se of the QWeb. Indeed, as we mentioned already in Sect. 3, the outcome provided by a measurement of the QWeb, say a given story in a “tell a story measurement,” becomes the input with which human minds will further interact, an interaction that can again be described as a deterministic or indeterministic context, possibly creating new meanings. These human cognitive interactions are what is typically investigated in cognitive psychology experiments, and they can again be modeled using the mathematical formalism of quantum theory, in the emerging field known as quantum cognition; see [15] and the references cited therein.

We stress that, in our view, it is only when a more abstract, meaning-oriented approach is adopted in relation to documental entities, like the Web, and an operational-realistic modeling of its conceptual structure is attempted, exploiting the panoply of quantum effects that have been discovered in physics laboratories, that, quoting from [4], “a deeper understanding of how meaning can leave its traces in documents can be accessed, possibly leading to the development of more context-sensitive and semantic-oriented information retrieval models.” Note however that we have not attempted here any evaluation of the pros and cons, differences and similarities, of our modeling with respect to the other existing approaches that also integrate quantum features. Let us just mention, to give a few examples, Foskett’s work in the 1980s [25], the work of Agosti et al. in the 1990s [19, 20], and the more recent work of Sordoni et al., where the double-slit experiment analogy is also used to investigate quantum interference effects for topic models such as LDA [29] (see footnote 12).

To conclude, let us observe that in the same way the quantum cognition program, and its effectiveness, does not require the existence of microscopic quantum processes in the human brain [15], the path “towards a quantum Web” that we have sketched here, and in [4], where the Web of written documents is viewed as a “collection of traces” left by an abstract meaning entity—the QWeb—should not be confused with the path “towards a quantum Internet” [23], which is about constructing an Internet able to transmit “quantum information,” instead of just “classical information,” that is, information carried by entities allowing quantum superposition to also take place and be fully exploited. In the future, there will certainly be a Quantum Internet and a Quantum Web, that is, there will be a physical Internet more and more similar in structure to the abstract Web of meanings it conveys. These will be fascinating times for the evolution of the human race on this planet, who will then be immersed in a fully developed noosphere, but at the moment we are not there yet.