1 Introduction

“The enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious...[T]here is no rational explanation for it”, wrote Eugene Wigner in a well-known article in 1960 (Wigner 1960). Above all, this “unreasonable” effectiveness manifests itself in physics. The latter, for Wigner, is devoted to “discovering the laws of inanimate nature”. This view of physics, widespread but also challenged several times during the twentieth century, relies on the concept of “law of nature” in a fundamental way. Any such law applies to one or several kinds of inanimate matter and describes their dynamical evolution. Physics is seen, then, as a study of natural phenomena by first deducing and subsequently applying corresponding general laws. The overarching aim of the laws is to enable the prediction of future events. Wigner wonders why this goal happens to be aligned with an apparently different one, that of mathematics, which he describes as selecting concepts that are “amenable to clever manipulations [in producing] striking, brilliant arguments”. If one takes for granted that mathematical thinking is exclusively concerned with a search for such arguments, it may indeed seem mysterious that the mathematical concepts and formulae should be useful in facilitating the prediction of future events.

I submit that the effectiveness of mathematics in the natural sciences is perfectly reasonable and rational if one adopts a different view of physical theory. The aim of predicting future observations, for sure, remains; but the substance changes. This view applies whenever the object of study involves phenomena or processes whose nature remains unknown. Under these circumstances, physicists are not in a position to say what kind of matter is involved, but they are nevertheless eager to build a theory. In order to do so, they employ fundamental principles tasked with limiting the possibilities in a theoretical description of unknown facts. For short, this approach will be referred to as ‘blackbox models.’ Its main feature is that physical theory is constrained by universal principles rather than dynamical laws. On this point the blackbox approach complements, but does not contradict, Wigner’s conception of physics. It is now broadly used, with applications spanning more than a century of research work that gave birth to new physical theories and discoveries. I illustrate the importance of this physics of the unknown with four examples: Einstein’s principle theories (Sect. 2), the S-matrix (Sect. 3), effective field theories (Sect. 4), and device-independent approaches (Sect. 5).

On the basis of these four case studies I argue in Sect. 6 that the effectiveness of mathematics in blackbox models is neither surprising nor unreasonable. Blackbox models leave no room for Wignerian amazement because their success depends on mathematics as a driving force of theory construction. Yet they are predictive, as required of physics, and also explanatory. That physical explanation can be provided by blackbox models is precisely the missing element in Wigner’s view: these models do not seek to establish a law of nature, yet their explanatory power is as real as that of constructive, dynamical theories. Combined with the constitutive role of mathematical concepts in blackbox models, this clears away the cloud of mystery over the use of mathematics.

2 Principle theories

In 1919 Einstein made a well-known distinction between principle and constructive theories (Einstein 1982). Constructive theories match Wigner’s view of physics: they contain dynamical laws describing the behaviour of particular kinds of matter, e.g., Newton’s laws for the movement of rigid bodies. Since their aim, employing laws to predict future events, is different from the aim of mathematics, Wigner’s claim of surprising effectiveness fully applies to constructive theories. By contrast, a principle theory, e.g., Einstein’s own special relativity, is a theory derived from simple postulates. It does not begin with an assumption about the type of matter or its dynamics; these become consequences of the postulates rather than theoretical prerequisites. The postulates are formulated as universal physical principles and are expressed in the formalism as mathematical axioms.

For example, the relativity principle and the independence of the speed of light from the reference frame in which it is measured play the role of fundamental principles in Einstein’s relativity theory. A different set of postulates may begin by setting an upper limit on velocities (Fock 1959, 1971). A modern avatar of these postulates, called no-signalling, stipulates that in an experimental setting with two observers the choice of measurement by one party must not influence the statistics of the outcomes registered by the other party. It is widely used in device-independent approaches for introducing constraints on operations with quantum information (see Sect. 5). The interest of this formulation is that it is entirely non-dynamical: no-signalling is an algebraic condition expressed in the language of conditional probability. At best it receives a kinematic, but not a dynamical, expression.

To use Einstein’s own words about principle theories, the principles are employed in them in order to “narrow the possibilities” (Einstein 2004). This means that one should begin the model-building exercise by adopting a very inclusive framework that can encompass the unknown phenomena in question but also much more. This framework may well extend beyond what has been or can be experimentally observed at the current stage of technological development. The rationale for this starting point is that a broad framework can accommodate a yet unspecified theory with unpredictable empirical consequences. Principles, then, limit the possibilities and serve to narrow the framework down to a particular model. For example, no-signalling excludes faster-than-light travel in a geometric framework with a preselected spacetime, either Euclidean or Minkowski, or in the Riemannian way of introducing a spacetime manifold and an arbitrary Riemannian metric. In a non-dynamical framework which does not begin with a geometric object, the very notion of ‘travel’ might be undefined. Here, the no-signalling principle helps to make sure that a purely algebraic model will not produce a contradiction with the theory of relativity when it is applied to the description of real-world phenomena. The impossibility of faster-than-light signalling is “elevated” (Friedman 2001, p. 88) to the status of a universal postulate even in the absence of geometric assumptions. It then becomes a fundamental principle of nature and a constitutive feature of physical theories.

Einstein’s own road to the distinction between principle and constructive theories was a challenging one. After his 1905 article describing the photoelectric effect in terms of light quanta (Einstein 1905), his belief in the fundamental character and the exact validity of Maxwell’s electrodynamics was destabilized. As he wrote in the 1949 Autobiographical Notes,

Reflections of this type [on the dual wave-particle nature of radiation] made it clear to me as long ago as shortly after 1900, i.e., shortly after Planck’s trailblazing work, that neither mechanics nor electrodynamics could (except in limiting cases) claim exact validity. By and by I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. Einstein (1949, p. 51, 53)

This “desperation” led Einstein to special relativity. He sought a theory that would not be based on “known facts”. Special relativity, indeed, remains mute on the issue of the material constitution of the rods and clocks that act as measurement devices.

There is good evidence that Einstein believed that this lack of constructivity was a disadvantage and that principle theories did not offer a satisfactory understanding of physics (Brown and Timpson 2006; Frisch 2005). This claim has been challenged recently via a comparison with James Jeans’s position (Lange 2014), but another, more seasoned critique focuses on the status of general relativity. According to Brown, it should be seen as a constructive theory since it contains a dynamical law (Brown 2005). Without entering the debate on constructive relativity, I would like to emphasize the importance of the argument from explanatory power. The capacity to explain phenomena was uncontroversially ascribed by Einstein only to constructive theories: “When we say we have succeeded in understanding a group of natural processes, we invariably mean that a constructive theory has been found which covers the processes in question” (Einstein 1982). Einstein wished to build an explanatory account based on known facts but despaired of doing so. In his time and later, the desideratum of a constructive theory as a replacement for principle-based special relativity was never realized.

To be sure, constructive theories are still widely in use. What has changed since the time of Einstein’s tergiversations is that principle theories are now taken to be explanatory. They are capable of giving an understanding of physics on a par with constructive theories, i.e., they can underlie theoretical knowledge as well as experimental setups (e.g., in quantum cryptography, see Sect. 5). That physical knowledge can be gained through the pursuit of a principle-based approach has helped to legitimize it, not only as a widespread method on sociological grounds, but also on the grounds of epistemology as an approach that is explanatory. Its key method, the choice of a broad framework and its subsequent narrowing down through limiting principles, is at the same time an application of mathematics to physics and the enabling force behind theory-building and, ultimately, behind explanation. The conjunction of these two factors showcases a paradigm of physical theory in which it is perfectly reasonable to assign the central methodological place to mathematics.

3 S-matrix

In the years before quantum electrodynamics and subsequently quantum chromodynamics were fully developed, it was not clear that a field-theoretic approach would be successful in accounting for the electromagnetic, the weak and the strong interactions. In the early 1950s, for example, it was not obvious to the physics community whether the method of quantum field theory (QFT) based on gauge symmetry would be an appropriate framework for building the theory of strong interactions. A similar uncertainty had plagued quantum electrodynamics a decade earlier. In 1954, the same year as the work by Yang and Mills, during a conference discussion in the presence of Oppenheimer, Gell-Mann, Fermi, Wick, and Dyson, Goldberger challenged the applicability of QFT methods to nuclear interactions. Surprisingly, nobody in the audience spoke to the contrary (Noyes 1954). This episode was still remembered in the 1970s as a typical example of early doubts about the future of quantum field theory (Appelquist and Bjorken 1971).

The doubts about the applicability of QFT were prevalent because of renormalizability issues. In response, physicists began to look for methods to build a theory that did not assume any known particle content leading to divergences. The main idea of this approach was borrowed from Heisenberg’s philosophical program of the 1920s, which prescribed that a theory should focus only on observable quantities. This idea had proved extremely successful in the discovery of quantum mechanics (Heisenberg 1925). The hope was that the same approach would again produce a crucial insight. As Weinberg wrote, the physicists of the generation before his own believed that “by using principles of unitarity, analyticity, Lorentz invariance and other symmetries, it would be possible to calculate the S-matrix, and you would never have to think about a quantum field” (Weinberg 1996, p. 248). Indeed, history has largely followed this prescription in developing the way in which our current physical theories with unknown particle content are constructed. One detail of this approach presents a particular philosophical interest. For the theoretician, the central question bears on the mathematical content of the theory: what mathematical concepts should one use to represent observable quantities? What physical constraints are to be imposed on such representatives? The success of the theory-building exercise depends directly on finding a framework in which the connections between mathematical concepts turn out to have predictive power.

In 1937 John Wheeler introduced one such mathematical concept, which he called the scattering matrix, later to be known as the S-matrix (Wheeler 1937). Wheeler’s initial intent was to develop a mathematical method of “resonating group structure” that would allow one to build a description of the whole interacting system of elementary particles from the knowledge of its parts. This did not fully work out. Wheeler, however, obtained a result suggesting that the problem as he had formulated it could in fact be bypassed: “The connection which we have obtained between the scattering and disintegration cross sections does not depend for its validity on the accuracy of what we have called the method of resonating group structure”. The scattering matrix that involved the cross sections depended only on some general asymptotic properties but not on the details of the interacting compound system. Among the general arguments used by Wheeler one mainly finds symmetry considerations credited by him to Bohr and Jordan. In the wake of Heisenberg’s quantum mechanics, Wheeler’s work provided a new example of a physical theory of unknown interactions, one that involved exclusively the observables. It was built through the introduction of a new mathematical object. Wheeler published the results but did not pursue his method further; only much later did his scattering matrix become known as a precursor of the S-matrix theory of strong interactions (Mehra 2001, p. 990).

Between 1942 and 1944 Heisenberg, who did not know about Wheeler’s work, wrote a series of three articles in Zeitschrift für Physik explicitly pursuing the goal of building a theory of unknown physics. The reason why the constructive physical content had to be taken as unknown, according to Heisenberg, was that the theory could change in the future:

In view of the later alteration [Abänderung] of the theory, the present investigation attempts to isolate from the conceptual scheme of the quantum theory of wave fields those concepts which probably will not be affected by the future changes [in the theory of elementary particles] and which may therefore represent an integral part [Bestandteil] also of the future theory. Heisenberg (1942)

The concepts that Heisenberg thought would not be affected by a future theory change were the observable quantities. He admitted, though, in what was indisputably an influence of his earlier discussions with Einstein, that ‘only the final theory will decide which quantities are “really observable”’. As early as 1938, simultaneously with Wheeler but independently, Heisenberg wrote:

Perhaps one may remember to advantage, in attempting to find new concepts, that in mathematical formulae, we are now confronted with the task of finding computational rules, by which we can connect the cross sections... (Heisenberg 1938)

This stance, to quote the historian Helmut Rechenberg, was a consequence of the fact that “one did not yet know how to formulate a divergence-free theory describing elementary particles” (Rechenberg 1989). Heisenberg’s conviction was that the right theory would contain a minimal length. It was not immediately clear, however, how one was supposed to introduce such a minimal length in QFT. Heisenberg reasonably believed that the asymptotic results, because they belong among the observable quantities that the theory must be able to predict, should remain independent of the minimal length. While working out a complete theory remained a matter for future research, it was possible to introduce a direct connection between the momenta and the energies of free particles and the scattering and reaction cross sections. The connection was to be expressed mathematically: “It seemed appropriate to find a mathematical [our emphasis—AG] object capable of housing these observable quantities. Heisenberg realized that the momentum space kernel of the probability amplitude for transitions between free particle states was the object he wanted” (Grythe 1982). Thus Heisenberg introduced a unitary ‘characteristic’ S-matrix, becoming the founding father of the S-matrix approach in theoretical physics.

Heisenberg’s S-matrix met fierce opposition from Wolfgang Pauli. He believed that it could not be fundamental, because the way the approach was constructed did not rely on arguments from simplicity and, in fact, produced a result that was anything but simple:

In general I have arrived at the opinion that the S-matrix is not a concept, of which we may expect that it occurs in a future theory as a primary fundamental concept. It indeed has the character of something complicated and derived and therefore might hardly be suitable to lead us beyond the present wave mechanics. (Pauli 1946)

This lack of simplicity underwrote Pauli’s conviction that the S-matrix could not be a part of the laws of nature. It seems that Pauli believed that for reasons of mathematical elegance a law of nature should have a simple expression. He then concluded:

The S-matrix, although it might exist in a future theory, seems to be completely unfit to constitute the point of departure for a [new] theory. It is not the quantity which will occur in the general laws of nature, but a late consequence of them. (Pauli 1948)

This reveals a tension between two approaches to physical theory, each pushing toward a different role of mathematical formalism. Pauli wished to have a theory containing laws of nature, i.e., dynamical rules of evolution of particular kinds of matter. If a law is found, e.g., describing light quanta, and if this law has a mathematical expression, then it is perfectly legitimate to wonder, as Wigner did, why mathematics would be so effective in describing their behaviour. It is even more surprising that mathematics is equally effective in describing the evolution of directly perceivable objects like tables or chairs. Whatever answer one may give to this Wignerian wonder, the theory in question is, in Einstein’s terms, a constructive one.

The situation is different for principle theories. They explore unknown territories, which cannot yet be accounted for in terms of a particular kind of matter, let alone a law of its dynamical evolution. The S-matrix, as Wheeler discovered, bypasses the problem of “resonating group structure”, which would describe the content of the theory in terms of interacting particles. Similarly, Heisenberg’s focus on observable quantities does not require a physical description of how one such observable gets dynamically converted into another. The middle ground can remain unknown, a black box, while mathematical relations connecting the observables remain available. This is a clear sign that mathematics in a principle theory is not playing the role of underwriting the laws of nature, as Wigner thought, but rather of letting a theory of the unknown be built in the first place. When Gregor Wentzel called the S-matrix program “very incomplete—it is like an empty frame for a picture yet to be painted” (Wentzel 1947), he took himself to be giving a pejorative assessment of Heisenberg’s program. In fact, he put his finger on the main feature of principle theories: a physical theory is possible without filling in “an empty frame” or opening up a black box.

The S-matrix theory of nuclear interactions became history after the advent of quantum chromodynamics, but the S-matrix approach is still alive and well. In quantum gravity, for example, it is used for constructing low-energy models of supergravity from high-energy theories like string theory. String theory itself was discovered by Veneziano as a consequence of his work on the S-matrix approach, when he used general principles to correctly guess the unknown amplitudes satisfying duality properties, which described the excitations of a one-dimensional object (Veneziano 1968). One can use this all-encompassing theory of quantum gravity to construct low-energy gravitational models with unknown physical content. This study of unknown territory requires the same tool as the one used by Heisenberg for exploring the unknown land of QFT, the S-matrix:

Such an S-matrix, which is tightly constrained by properties such as unitarity and analyticity, can be a very powerful way to summarize our ignorance of a theory. ...We might anticipate that such study in the context of gravity, supplemented by additional physical input, could bear important fruit. (Giddings 2013)

Thus the anticipated physics is always mathematical.

4 Effective field theories

The S-matrix approach only asked ‘practical’ questions about the yet unknown theory of strong interactions, formulated in the language of physical observables, and methodically avoided the need to have a full theory. In the effective field theory (EFT) approach the unknown is not the theory of nuclear interactions but new physics beyond the Standard Model. With little prospect of distinguishing between the different alternatives in the near future, EFT offers a method for developing a theory-independent approach, in which observable effects are all that matters about new unknown physics. Just as the S-matrix enables an exclusive focus on observable quantities by disregarding the quantum field, EFT relieves one of the need to worry about the physical content of the high-energy theory. To this end, EFT prescribes that the Lagrangian of the theory should include all terms in the most general form compatible with symmetry principles. It assumes no particular physical content or physical meaning, with symmetry principles being the only constraints.

The notion of renormalizability in the context of quantum field theory and its early representatives like quantum electrodynamics was developed by Bethe, Schwinger, Tomonaga, Feynman, and Dyson, the latter introducing crucial power-counting techniques for analyzing operator relevance. From Dyson’s 1949 work (Dyson 1949a, b) and up to the 1970s, renormalizability was thought to be a necessary condition for a field theory to make sense. Wilson’s work on the renormalization group (Wilson 1971) paved the way for a new attitude due to a modified view of the reality of the renormalization cut-off. In the older understanding, the cut-off scale was a residue of abstract mathematics introduced with the sole goal of avoiding infinities in summation series. The new appreciation of non-renormalizable theories came with the understanding that the cut-off could be taken as physical and corresponding to the limit of applicability of a given theory. New physics was to be expected beyond this cut-off scale \(\Lambda _{NP}\). Since the domain of applicability of particular field theories became limited by a number denoting an energy scale, they began to be seen as effective rather than fundamental theories, whose validity only extends up to some frontier. Wilson’s work and Weinberg’s reintroduction of EFTs as useful theories with ‘phenomenological Lagrangians’ (Weinberg 1967, 1979, 1989) boosted this new view.

Much of the historic development of EFTs focused on the top-down approach, stipulating that the fundamental physical theory is known but inapplicable for practical purposes. This may be due to the complexity of the high-energy theory or, in the case of EFTs in condensed matter physics, to heuristic arguments as suggested by Shankar: “Even when one knows the theory at a microscopic level (i.e., the fundamental theory), there is often a good reason to deliberately move away to an effective theory” (Shankar 1999). A typical example from particle physics is chiral perturbation theory, which gives a low-energy approximation of quantum chromodynamics in the light quark sector (for a review see Pich 1999). Even when the physical content is known, it is often instructive and necessary to build a physical theory as if it had remained unknown. This effectively transforms EFTs into blackbox models; the history of the approach that treats the known as if it had been unknown goes back to the Euler-Heisenberg calculation in the 1930s of photon-photon scattering at small energies within the framework of Dirac’s quantum field theory (Euler and Kockel 1935; Euler 1936; Heisenberg and Euler 1936).

High-energy physics often uses an alternative ‘bottom-up’ approach, whose popularity reflects a change in the conception of EFTs. Today physicists tend to think of all physical theories, including the Standard Model, as EFTs with respect to new physics at higher energies. Blackbox models have become universal: it is not wrong to claim that to some extent any quantum field theory is a theory of the unknown.

A typical model-building scenario, following Wilson, starts with a Lagrangian of an effective field theory valid up to scale \(\Lambda \). This Lagrangian can be generally written as a sum over local operator products:

$$\begin{aligned} \mathcal {L}=\sum _{n=0}^\infty \frac{\lambda _n}{\Lambda ^n} \mathcal {O} _n. \end{aligned}$$
(1)

Coefficients \(\lambda _n\) are the coupling constants. They encode information on the unknown physics at scales higher than \(\Lambda \) and can be fixed experimentally; additionally, when the underlying high-energy theory happens to be known, the values of the coupling constants can be obtained through a renormalization group calculation.

The only constraints on the form of operator product terms \(\mathcal {O} _n\) come from the symmetries of the theory. The tree level of the power series in \(\frac{1}{\Lambda }\) is obtained by the usual Standard Model calculation. Effects of new physics appear in loop corrections and influence the value of coupling constants \(\lambda _n\). The main value of Lagrangian (1) for high-energy physics is that it can be used to study low-energy effects of new physics beyond the Standard Model without having to specify what this new physics actually is, apart from the assumption of its irrelevance to interactions below the cut-off.
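
To make the Wilsonian logic of Lagrangian (1) concrete, the following minimal sketch (in Python, with purely hypothetical numbers and the customary assumption of order-one dimensionless couplings) estimates how much an operator suppressed by \(1/\Lambda ^n\) contributes at a probing energy \(E\ll \Lambda \). It is an illustration of naive dimensional analysis, not a calculation taken from the literature.

```python
# Illustrative sketch of naive power counting for L = sum_n (lambda_n / Lambda^n) O_n.
# Assumption: dimensionless couplings lambda_n of order one, so the n-th term
# contributes to a low-energy amplitude roughly as (E / Lambda)^n relative to the
# leading (n = 0) term.

def relative_size(n: int, E: float, Lambda: float, lambda_n: float = 1.0) -> float:
    """Rough relative weight of the n-th operator at probing energy E."""
    return lambda_n * (E / Lambda) ** n

E = 0.1        # probing energy (hypothetical units, e.g. TeV)
Lambda = 1.0   # cut-off scale where new physics is expected

for n in range(5):
    print(f"term suppressed by 1/Lambda^{n}: relative size ~ {relative_size(n, E, Lambda):.4f}")
```

On this estimate only the first few operators matter at energies well below the cut-off, which is why the infinite sum in Lagrangian (1) can serve as a practical tool despite its apparent complexity.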

To give a realistic example, consider a ‘top-down’ electroweak EFT that reproduces the Standard Model for the light degrees of freedom (light quarks, leptons and gauge bosons) as long as energies are small compared with the Higgs mass (Pich 1999). This EFT is Higgsless in the sense that it cuts off the Higgs sector by a choice of \(\Lambda \). The lowest-order effective Lagrangian fixes the masses of the Z and W bosons at tree level and does not carry information on the underlying symmetry breaking \(SU(2) _L \times U(1) _Y \rightarrow U(1) _{\mathrm {QED}}\) down to the gauge group U(1) of quantum electrodynamics. At the next order the most general effective chiral Lagrangian with only gauge bosons and Goldstone fields,

$$\begin{aligned} \mathcal {L} ^{(4)} _{\mathrm {EW}}=\sum _{i=0} ^{14} a _i \mathcal {O} _i, \end{aligned}$$
(2)

contains fifteen operators. This complexity is essential as it stems from the requirement that we use the most general form of the Lagrangian compatible with symmetry principles. Gell-Mann formulated a rule called “the totalitarian principle”, which asserts that everything that is not forbidden is compulsory (Bilaniuk and Sudarshan 1969). For Lagrangian (2), constraints from symmetry include invariance with respect to CP and \(SU(2) _L \times U(1) _Y\). Also, three of the fifteen operators vanish as a consequence of the equations of motion under the assumption of light fermions. With the remaining terms, one finds various effects such as the usual electroweak oblique corrections (six operators involved at the bilinear, four at the trilinear and five at the quartic levels), corrections to rare B and K decays, the CP-violating parameter, etc. This effective approximation of a very large Higgs mass in the Standard Model gives a field theory, whose operator content is not simple but which nevertheless possesses phenomenological predictive power and provides an easier way to perform calculations than the complete Standard Model Lagrangian.

As if he were developing an argument to counter Pauli’s critique of Heisenberg’s S-matrix, Weinberg insists that the absence of any assumption of simplicity about the EFT Lagrangian is what makes the EFT method so efficient (Weinberg 1996, p. 246). He further supports the parallel by claiming that “the S-matrix philosophy is not far from the modern philosophy of effective field theories”. However, he also adds a critique of the S-matrix: “More important than any philosophical hang-ups was the fact that quantum field theory didn’t seem to be going anywhere in accounting for the strong and weak interactions”. The S-matrix was the only rational reaction to a situation in which no one knew what language to use, nor in which direction to look for a theory of the strong and weak interactions. This was despair quite analogous to Einstein’s unease when he realized that the theory he had been seeking could not be based on known facts (Sect. 2). Similarly, today we do not know whether supersymmetry, or extra dimensions, or yet another model will turn out to be the right solution for new physics. However, the blackbox approach to unknown phenomena is generalized in EFTs to the point where it can be applied above and beyond any despair. It has become a usual, and arguably a normative, tool in quantum field theory.

Like Einstein or Heisenberg, we resort to a language that does not require knowledge of the dynamical laws or the constitutive types of matter. Unlike Einstein or Heisenberg, we treat this situation as perfectly reasonable. The method of building an EFT, which starts from a general mathematical framework of gauge theory and then proceeds with a Lagrangian compatible with the constraints coming from symmetry principles, is neither a surprising nor a scandalous jump, as Pauli may have thought about Heisenberg’s S-matrix. That mathematics plays an effective role in the physics of the unknown becomes, from the point of view of effective field theories, the new normal.

This view is a far cry from Wigner’s insistence on physical theories as collections of laws of nature, but it fits well with another one of his ideas. When Wigner announces that the aim of mathematics is to develop concepts that can be “manipulated [for] making striking, brilliant arguments”, he insists that these concepts are not chosen for their conceptual simplicity. Similarly, simplicity is not at work in the selection of operators for an EFT Lagrangian dictated by Gell-Mann’s totalitarian principle. This selection is driven by the concern of effectiveness in the prediction of future events: this is precisely Wigner’s definition of physical theory. What is more, this type of theory, like all blackbox models of unknown physics, is constitutively grounded in the use of mathematics.

Wigner connected mathematics with a capacity to make striking arguments. What is striking in EFTs is that an argument can be made at all without knowledge of the complete theory. Summing up, we see in the case of effective field theories an example of the perfectly reasonable and rational effectiveness of mathematics in physics.

5 Device-independent models

Quantum cryptography works with systems of “unspecified character” (Bancal et al. 2011) or “unknown nature” (Bardyn et al. 2009). This is done in a device-independent approach: a theoretical investigation that does not rely on the knowledge of the laws governing the systems’ behaviour. A conventional ‘device’ refers here to any process or apparatus that is explicitly described by an operational theory, whether classical or quantum. This terminology was first introduced by Mayers and Yao (1998), who developed device-independent quantum cryptography with imperfect sources. Over the years quantum cryptography has developed an array of such methods for dealing with adversaries who, by acting upon the sources, effectively turn systems into untrusted entities. Device-independent protocols play an important role in experimental tasks such as randomness generation (Colbeck 2006; Pironio et al. 2010), quantum key distribution (Barrett et al. 2005), estimation of the states of unknown systems (Bardyn et al. 2009), certification of multipartite entanglement (Bancal et al. 2011), and distrustful cryptography (Aharon et al. 2016). Some of these cryptographic protocols have found a broader use in quantum information, e.g., device-independent tests are performed on Bell inequalities or on the assumption that superluminal signalling is impossible (Bancal 2013).

In full generality, device-independent models are defined as a set of n parties, each of which ‘selects’ a measurement setting or ‘places’ an input value \(x_1\in \mathcal {X}_1,\ldots ,x_n\in \mathcal {X}_n\) respectively, and ‘subsequently’ ‘obtains’ an output value or a measurement result \(a_1\in \mathcal {A}_1,\ldots , a_n\in \mathcal {A}_n\). The sets \(\mathcal {X}_1,\ldots ,\mathcal {X}_n\) and \(\mathcal {A}_1,\ldots ,\mathcal {A}_n\) are alphabets of finite cardinality. The verbs used in these expressions merely convey an operational meaning of the inputs and outputs; they do not imply that any party exercises free will or has conscious decision-making procedures. The term ‘subsequently’ introduces a local time arrow pointing from each party’s input to its output. Although such local time arrows seem quite intuitive, in full generality they need not be assumed either. A fully general setting requires, therefore, that absolutely nothing be postulated about the way inputs are transformed into outputs, except two conditions: a) these two types of data are clearly distinguished; b) the process of transformation is physical. Physics is contained in the generalized probability distribution \(\mathbf {p}=P(a_1,\ldots ,a_n|x_1,\ldots ,x_n)\) (Fig. 1).

Fig. 1  In the case of \(n=3\) parties, physics is fully contained in the probabilities \(\mathbf {p}=P(a_1 a_2 a_3|x_1 x_2 x_3)\)

All device-independent models studied in the literature introduce further constraints on \(\mathbf {p}\). A customary one is the no-signalling principle mentioned in Sect. 2: a choice of measurement by one party must not influence the statistics of the outcomes registered by a different party. Mathematically, the distribution \(\mathbf {p}\) is non-signalling if and only if all one-party marginal probabilities are functions of their respective inputs \(x_i\):

$$\begin{aligned} P(a_i|x_1,\ldots ,x_n)=P(a_i|x_i).\end{aligned}$$
(3)

Although very common, this assumption is not universal: when device-independent methods are used to test general causal inequalities, the impossibility of signalling is not a prerequisite (Baumeler and Wolf 2014).
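
As a minimal illustration of condition (3), the following Python sketch (assuming two parties with binary inputs and outputs, and representing a behaviour simply as a function returning \(P(ab|xy)\); both choices are purely illustrative) checks whether a given behaviour is non-signalling by verifying that each party’s marginal is independent of the other party’s input.

```python
import itertools

def is_no_signalling(P, tol=1e-9):
    """Check the no-signalling condition (3) for a two-party behaviour.

    P(a, b, x, y) returns the conditional probability P(a, b | x, y),
    with a, b, x, y in {0, 1}.
    """
    for a, x, y1, y2 in itertools.product(range(2), repeat=4):
        # Alice's marginal P(a | x) must not depend on Bob's input y.
        if abs(sum(P(a, b, x, y1) for b in range(2)) -
               sum(P(a, b, x, y2) for b in range(2))) > tol:
            return False
    for b, y, x1, x2 in itertools.product(range(2), repeat=4):
        # Bob's marginal P(b | y) must not depend on Alice's input x.
        if abs(sum(P(a, b, x1, y) for a in range(2)) -
               sum(P(a, b, x2, y) for a in range(2))) > tol:
            return False
    return True
```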

One of the earliest examples of device-independent methods in quantum information involves what is literally called a box. The no-signalling constraint was studied by Popescu and Rohrlich (1994) through the introduction of a non-local, or Popescu-Rohrlich (PR), box describing unknown processes which connect the inputs \(x,y\in \{0,1\}\) and the outputs \(a,b\in \{0,1\}\) of two parties according to the joint distribution:

$$\begin{aligned} P(ab|xy) = \left\{ \begin{array}{l@{\quad }l} 1/2:&{} a+b={xy} \mod 2\\ 0:&{} \mathrm {otherwise.} \end{array} \right. \end{aligned}$$
(4)

While a PR-box is a general algebraic framework designed to go beyond quantum theory, the application of the no-signalling principle implies that this box will respect the laws of special relativity. Its device-independent non-local structure accommodates a violation of the Tsirelson bound (Cirel’son 1980) by reaching the maximum amount of correlations in the CHSH inequality (Bell 1964; Clauser et al. 1969).
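
For a concrete, self-contained illustration of the box of Eq. (4), the following Python sketch builds the PR-box distribution and computes its CHSH value; the numbers it prints, 4 for the PR box against the classical bound 2 and the Tsirelson bound \(2\sqrt{2}\approx 2.83\), are standard results, and the code is offered only as an illustration.

```python
import itertools
import math

def pr_box(a, b, x, y):
    """PR-box distribution of Eq. (4): P(a, b | x, y) with a, b, x, y in {0, 1}."""
    return 0.5 if (a + b) % 2 == (x * y) % 2 else 0.0

def correlator(P, x, y):
    """E(x, y) = sum over a, b of (-1)^(a+b) P(a, b | x, y)."""
    return sum((-1) ** (a + b) * P(a, b, x, y)
               for a, b in itertools.product(range(2), repeat=2))

def chsh(P):
    """CHSH expression S = E(0,0) + E(0,1) + E(1,0) - E(1,1)."""
    return (correlator(P, 0, 0) + correlator(P, 0, 1)
            + correlator(P, 1, 0) - correlator(P, 1, 1))

print(chsh(pr_box))       # 4.0: the algebraic maximum
print(2 * math.sqrt(2))   # ~2.83: the Tsirelson bound for quantum correlations
```

The PR box also passes the no-signalling check sketched above: its one-party marginals equal 1/2 regardless of the other party’s input, so it respects condition (3) even though it is more strongly correlated than any quantum system.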

Hailed as a “very important recent development” (Popescu 2014), device-independent models are characterized by the absence of assumptions about the internal workings of the box. Its ‘interior’ is not described by a particular physical theory. The box is unknown territory which, since it is assumed to be of interest for physical theory, is also a territory of science. The entire setup belongs within the boundaries of physics; at the same time, as we argue elsewhere, it opens new possibilities to redefine these very boundaries (Grinbaum 2017).

The redefinition of the boundaries of physics achieved by device-independent methods in quantum cryptography and quantum information is entirely due to the use of mathematics. The prediction of future events, to use Wigner’s term, is made possible by a connection, which remains to be found, between the inputs and the outputs. In the absence of any additional assumptions, the search for such a connection is performed in the space of mathematical tools available to the physicist. This is a common trait of blackbox models that deal with unknown physics. In the operational framework based on generalized probability distributions, the physicist’s only elementary notions are the inputs and the outputs. She then applies mathematical constraints, like the no-signalling principle, to obtain a particular theory with predictive power. Not only is the effectiveness of mathematics unsurprising: it becomes a driving force that propels device-independent theory building.

6 Analysis and conclusion

Wigner’s argument about the unreasonable effectiveness of mathematics in physical theory relied on an intuitive feeling that nothing, in principle, urges nature to be mathematical. Unlike, for example, Galileo, Wigner did not seek to ground his statement in a particular philosophical system. He expressed the immediate surprise of someone who discovers that mathematical formulae can correctly predict future events and account for reality outside the human mind. Evidently, a Pythagorean or a neo-Platonist would not be puzzled, for these philosophical systems put number among the fundamental constitutive principles of nature. But Wigner’s amazement produced an urge to motivate the underlying connection by means other than the application of a doctrine.

In rational terms the Wignerian wonder can be understood as two questions: one about the substance and another about the aim of physical theory. The first one is: why are objective phenomena and matter in the world outside the human mind described by mathematical laws? However, physical theory does not always deal with phenomena or matter that are known or already available. The assumption that it always does is a limitation that led Wigner to the view of physical theory as a collection of laws of nature. Physics must often explore the unknown, and one of its tasks is to determine what kind of matter is involved in an experiment or what events can be predicted, and subsequently observed, in support of a theory. In this case, the impulse for creating a theory stems from the desire to study new and unknown territory. Its deeper motivation usually relates to a feeling of dissatisfaction with the available old theories; it is rare that one would obtain at an early stage a sufficiently precise idea about the content of the new one. On other occasions old theories are too complex or unsuited for the needs of solving particular problems. Even if complete knowledge is available, it may be reasonable to treat it as if it had remained unknown. Thus the physics of the unknown is established as a collection of exploratory instruments whose nature, as we saw in four examples, is mathematical.

This last point shifts focus to Wigner’s second question about the divergence of aims between physics and mathematics. Had Wigner taken a suitably broad definition of physical theory to include the physics of the unknown, he would have seen this divergence quickly evaporate. This is because the physical theory of the unknown takes the form of a blackbox model purporting to establish a link between the inputs and the outputs of the box. The link must be conceptual and striking, i.e., mathematical in Wignerian terms, but it must also provide the power of prediction of future events, i.e. it should position the theory within the boundaries of physics. This is the ‘box’ language of the device-independent approach. Equivalently, one may say with Heisenberg that the theory should only operate with observable quantities. Wheeler expressed the same idea by focusing his introduction of the S-matrix on asymptotically free particle states. In effective field theories, the unknown high-energy theory is replaced by operators describing all possible effects observable at a given energy scale.

What is unknown is placed in a black box, which the theory does not necessarily seek to open up. As shown by Einstein’s principle theories, this approach does not, and often cannot, help to uncover the content of the box. In spite of its non-constructive character, it can still be predictive and explanatory. Explanation in this case originates, not from any knowledge of what lies inside the box, but from the postulates that constrain the connection between the inputs and the outputs. If a theory is successful in predicting future events, then these principles become our best candidates for fundamental principles of nature. This new knowledge about the world does not come in the form of a dynamical law for a new kind of matter. Instead our worldview is put on a new foundation, whose status is established through an enquiry enabled by mathematics.

Wigner stopped short of claiming this essential theory-building role of mathematics. He called it “somewhat irresponsible”, identifying it with the following attitude: “When [the physicist] finds a connection between two quantities which resembles a connection well-known from mathematics, he will jump at the conclusion that the connection is that discussed in mathematics simply because he does not know of any other similar connection”. This phrase appears after a brief reference to Einstein’s appreciation of beauty, which in Wigner’s words “comes closest to an explanation for the mathematical concepts’ cropping up in physics”. Had he been reading Einstein’s correspondence, which predated his article by only a few years, Wigner might have noticed that the “jump” he was talking about was strictly analogous to another such jump, or an “elevation”, which Einstein placed at the center of his epistemology (Einstein 1987). As he was drawing a schema of theory building in physics in a letter to Maurice Solovine, Einstein stipulated that there exists “no logical path from the E[xperiences] to the A[xioms], but only an intuitive (psychological) connection” and, furthermore, that “the relations between the concepts that appear in [theorems] and the experiences are not of a logical nature”. Therefore, the correspondence between theoretical results and experimental findings, although “obtainable with great certainty”, requires a jump which is performed by the physicist. The validity of a result obtained by such means, as Olivier Darrigol puts it, “comes as a surprise” (Darrigol 2014, p. 344), hence perhaps Wigner’s complaint about the “somewhat irresponsible” attitude of the physicist.

It appears that this Einstein-inspired surprise is very much the same as Wigner’s. The latter, however, has a different rationale. Like Heisenberg’s reason for using the S-matrix, Wigner’s account of theory building refers to the physicist’s ignorance. Unlike Heisenberg’s appeal to ignorance, though, the Wignerian claim should lead to a connection with [only?] “well-known” mathematical concepts. Wigner then, perhaps seeing his mistake, tries to clarify this point: “It is true also that the concepts which were chosen were not selected arbitrarily from a listing of mathematical terms but were developed, in many if not most cases, independently by the physicist and recognized then as having been conceived before by the mathematician”. If one puts together this prescription and Heisenberg’s conviction that only observable quantities should be used in theory building, one gets very close to the general characterization of the blackbox approach. It seems that in the discussion of the aims of mathematics Wigner stopped just short of finding a satisfactory answer to his bewilderment about its connection to physics.

This shows that the connection between the inputs and the outputs of a black box in a device-independent approach is inherently conceptual and striking or, in other words, mathematical. To paraphrase Einstein, its mathematical character is not subject to logical deduction. Once established, this connection becomes a basis for physical explanation.

To sum up: to develop a blackbox model means to identify a mathematical link, i.e. the right mathematical concept and often the right mathematical language, for connecting the inputs and the outputs. This search is performed in the space of concepts and theories rather than in the empirical world of physical experimentation. A success, for sure, can only be proclaimed if the identified mathematical object helps to make empirically testable predictions of future events or to explain previously unaccounted phenomena. Whenever one achieves such success, the mathematical nature of the connection between the inputs and the outputs helps to provide both the constitutive and the explanatory powers of the theory. Contrary to Wigner’s claim about a “somewhat irresponsible” intrusion of mystery, the effectiveness of mathematics in describing the physics of the unknown—new and uncharted territory of science or nature yet unexplored—cannot but be deemed reasonable and unsurprising.