1 Introduction

The early part of the twentieth century witnessed two revolutions that laid the foundations of physics today: quantum theory and Einstein’s general theory of relativity. While each pillar has been tremendously successful in its respective domain, they seemingly stand apart from a number of perspectives. At the sharpest end of the apparent contradictions, general relativity and quantum theory clash with seemingly paradoxical consequences [1, 2], leaving our account of Nature fragmented and incomplete. This is the problem of quantum gravity. While there are a number of promising paths to quantum gravity, string/M-theory being the most relevant in the context of this review, its ultimate resolution remains a challenge for twenty-first-century physics. Even leaving these most dramatic schisms aside, general relativity and quantum theory paint curiously distinct pictures of the known fundamental forces. The electroweak and strong forces of the Standard Model of Particle Physics are described by quantised Yang–Mills gauge theories that play out on a fixed spacetime stage. On the other hand, gravity is most economically thought of as the curvature of spacetime itself, which is elevated to a dynamical degree of freedom in its own right. Moreover, the gauge theories underpinning the Standard Model are renormalisable, whereas the straightforward perturbative quantisation of general relativity, with or without matter, is plagued by uncontrollable divergences [3,4,5,6,7].

From this point of view it would seem that gravity cuts a lonely figure. However, the diffeomorphism invariance of general relativity makes it in a sense the “gauge theory” par excellence. Thought of this way, perhaps gravity and gauge theory are not so distant after all. Indeed, the hope that gravity can be understood in terms of gauge theory, or vice versa, has been a recurring theme in the annals of theoretical physics. It has taken numerous guises, many of which are inter-related. The earliest example is provided by Kaluza–Klein theory [8,9,10]; general relativity in \(D=5\) spacetime dimensions compactified on a circle \(S^1\) includes in its massless spectrum a \(\mathrm{U}(1)\) gauge field (footnote 1). Electromagnetism is seen to derive from pure geometry and gravity. Turning this relation around, the first and perhaps most logically transparent approach is to gauge the Lorentz, Poincaré or de Sitter symmetries [12,13,14,15,16]. Indeed, general relativity in its first-order form, which given the need to conveniently couple to fermions is not merely a cosmetic rewriting, has a manifest local Lorentz symmetry. The spin-connection \(w^{a}{}_{b}\) is the corresponding gauge field and its field strength is nothing but the Riemann tensor, \(R^{a}{}_{b}=dw^{a}{}_{b}+w^{a}{}_{c}\wedge w^{c}{}_{b}\). This would seem to make the relationship to Yang–Mills theory manifest, but more care is needed. In fact, the necessary extra ingredients make the differences between the Yang–Mills theory of local Poincaré symmetry and general relativity quite clear. Gauging the Poincaré group (footnote 2) yields a covariant derivative \(D=d+A\) with connection

$$\begin{aligned} A=e^aP_a+w^{a}{}_{b}M^{b}{}_{a} \end{aligned}$$
(1)

where \(P_a\) and \(M^{b}{}_{a}\) are the translation and Lorentz generators, respectively, and we have suggestively labelled the corresponding gauge fields by \(e^a\) and \(w^{a}{}_{b}\). The Poincaré algebra, \([M, P]\sim P, [M, M]\sim M, [P, P]=0\), implies the field strength takes the form

$$\begin{aligned} F =[D, D]= T^{a} P_a + R^{a}{}_{b}M^{b}{}_{a} \end{aligned}$$
(2)

where

$$\begin{aligned} R^{a}{}_{b}= dw^{a}{}_{b}+w^{a}{}_{c}\wedge w^{c}{}_{b}, \quad T^a= de^a+w^{a}{}_{b}\wedge e^b. \end{aligned}$$
(3)

Hence, identifying \(e^a\) and \(w^{a}{}_{b}\) with the frame field and spin-connection, respectively, \(T^a\) and \(R^{a}{}_{b}\) correspond to the torsion and Riemann tensors. However, on the one hand the Yang–Mills Lagrangian \(\propto \mathrm{tr}(F\wedge \star F)\) obviously does not yield the Einstein equation (and, besides, the Hodge dual requires a metric, somewhat undermining the whole approach) and, on the other hand, the Einstein–Hilbert Lagrangian \(\propto \sqrt{-g}R\) is not gauged Poincaré invariant (and again requires a metric). Both deficiencies can be rectified using the Palatini Lagrangian \( \propto e^a\wedge e^b \wedge R^{cd} \epsilon _{abcd}\), which is invariant under local Poincaré transformations up to a term proportional to the torsion \(T^a\). So, we have a local Poincaré invariant action that yields the Einstein equation if we impose that the torsion vanishes, \(T^a=0\). With this condition in place, we may identify the local Poincaré translation parameter \(\epsilon ^a(x)\) with a vector \(\xi \), via \(\epsilon ^a =i_\xi e^a\), so that the local Poincaré transformation of \(e^a\) becomes a Lie derivative with respect to \(\xi \) together with a shifted local Lorentz rotation,

$$\begin{aligned} \delta e^a = D\epsilon ^a + \alpha ^{a}{}_{b}e^b ={\mathcal {L}}_{\xi }e^a -i_\xi T^a +\left( \alpha ^{a}{}_{b} -i_\xi w^{a}{}_{b}\right) e^b \underset{\tiny {T^a\rightarrow 0}}{\longrightarrow } {\mathcal {L}}_{\xi }e^a +\alpha _{(\xi )}^{a}{}_{b}e^b \end{aligned}$$
(4)

So far, so good. However, while the \(T^a=0\) constraint is consistent with the equations of motion, one should still object that it is not itself gauged Poincaré invariant, \(\delta T^a|_{T^a\rightarrow 0} = \left( R^{a}{}_{b}\epsilon ^b+ \alpha ^{a}{}_{b}T^b\right) |_{T^a\rightarrow 0} = R^{a}{}_{b}\epsilon ^b\). In this sense, general relativity is strictly not given by gauging the Poincaré group. Nonetheless, the procedure of gauging spacetime symmetries and then identifying appropriate constraints is remarkably effective and, for example, has been applied to construct Poincaré, anti-de Sitter and conformal supergravity theories [15, 17, 18], a highly non-trivial task when tackled conventionally. Moreover, the explicit imposition of constraints can be replaced by a spontaneous symmetry breaking mechanism of a larger group [19]. Such successes notwithstanding, a metric on an n-dimensional Riemannian manifold can be regarded as a section of the associated frame bundle with fibres \(\mathrm{GL}(n)/\mathrm{O}(n)\), so is closer in spirit to a gauged non-linear sigma model equipped with a soldering form, rather than a conventional Yang–Mills theory of the Poincaré group. This takes us rather beyond the direct connection to Yang–Mills gauge theory and certainly the scope of this review.

However, this is not the only way one might think to relate gauge theory to gravity. In particular, the holographic principle [20, 21], realised concretely through the Anti-de Sitter/Conformal Field Theory (AdS/CFT) correspondence [22,23,24], represents a subtler and ultimately deeper gauge/gravity relation. In fact, the idea that the spin-two graviton is a composite of two spin-one gauge bosons can be taken as the starting point of a heuristic route to AdS/CFT, without appealing a priori to string theory (footnote 3) [27, 28]. Let us expand on this point of view briefly, as it is instructive and serves to highlight the crucial differences between the AdS/CFT gauge/gravity duality and “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”. It was suggested long ago, and more than once, that the spin-two graviton is not elementary, but rather a bound state appearing in some renormalisable quantum field theory [29,30,31,32]. On the basis of representation theory alone it is not unreasonable to suppose that a spin-2 graviton (coupled to a scalar) might be composed as the symmetric tensor product of two spin-1 particles (or even four spin-1/2 particles [29]). The problem with this idea is that the Weinberg–Witten theorem [33] would seem to rule out massless composite particles of spin greater than one in any quantum field theory, under the assumption that there exists a conserved Lorentz covariant energy-momentum tensor. However, along with requiring a stress tensor, there is another hidden assumption so seemingly innocuous that it almost does not bear mentioning: the composite graviton lives in the same spacetime as its elementary constituents. But this is precisely what the holographic principle violates: a theory in \((D+1)\) spacetime dimensions is the “holographic image” of a theory in D dimensions. 
The Weinberg–Witten theorem does not exclude the possibility that a graviton propagating in a \((D+1)\)-dimensional bulk spacetime is equivalently described in terms of a gauge theory living on a D-dimensional boundary. Of course, things are not quite so simple. First, in order to capture the physics of a local \((D+1)\)-dimensional theory the boundary theory needs an additional “dimension”, that is, a parameter with respect to which the physical observables are local. This is provided by the energy scale \(\mu \); the couplings are governed by the renormalisation group flow equations, which are local functions of \(\mu \). Second, this scale should be macroscopic. Coupled with the expectation that the boundary theory should be highly quantum, since a perturbative gauge theory does not look like classical gravity, this suggests that the boundary theory is strongly-coupled over a large range of energies. For infinite bulk radial dimension the boundary theory should therefore be conformal. Supersymmetry can be employed to retain control in the strongly-coupled regime, in which case the gauge theory is superconformal. Finally, to account for the extra classical functional degrees of freedom enjoyed by fields propagating in \(D+1\) dimensions with respect to those in D dimensions, a large gauge group limit is invoked. For example, the equivalence of Yang–Mills theory, with \(\mathrm{SU}(N)\) colour group and gauge coupling g, with classical gravity is valid for \(N\gg \lambda \gg 1\), where we have introduced the ’t Hooft coupling \(\lambda = g^2 N\). The best understood case, which is by now extremely well-tested, has type IIB string theory on an asymptotically \(\text {AdS}_{5}\times S^5\) bulk spacetime, with the maximally supersymmetric \(D=4\) Yang–Mills theory on the \(\text {AdS}_{5}\) boundary. It is widely believed to be a complete duality, also holding for finite N and \(\lambda \), although this is much harder to test. 
Regardless, even in its most conservative form, \(N\rightarrow \infty \), the AdS/CFT correspondence provides a remarkable gauge/gravity relation, with profound implications and myriad applications.

In the present contribution, our concern is a third (at least naively) independent relationship between gauge theory and gravity: the “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” paradigm. While it also takes as its starting point the idea that spin-2 can be built from spin-1, it is quite different from a number of perspectives. First, it is rather generic, not requiring anything like large N, strong/weak-coupling duality, supersymmetry or the holographic principle. It does however require other hidden properties, which are shared by a very broad class of gauge theories. We shall come to those momentarily. On the other hand, it is rather more limited in the sense that it is not a complete duality, but rather a growing set of compelling relations. To be more precise, we simply do not yet know just how general or deep it is. In this review, we shall describe the various and connected perspectives on “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” and explore just how far it can be taken. Let us state at the outset that we do not have a definitive answer; the jury is still out. However, the proliferation of surprising, illuminating and powerful insights uncovered thus far compels further serious consideration, its rather radical nature notwithstanding.

The heuristic picture to have in mind is that one can regard the product of two gauge potentials as a gravitational theory as described by the suggestive, but for the moment purely illustrative, equation:

$$\begin{aligned} ``A_\mu \otimes \tilde{A}_\nu = g_{\mu \nu }\oplus B_{\mu \nu }\oplus \varphi \text {''}. \end{aligned}$$
(5)

Here, \(A_\mu \) and \(\tilde{A}_\nu \) are the gauge potentials of two distinct Yang–Mills theories, which we will refer to as the left (no tilde) and right (tilde) theories, or factors, respectively. They can have arbitrary and independent non-Abelian gauge groups G and \({\tilde{G}}\). Their “product” yields a metric, \(g_{\mu \nu }\), an Abelian 2-form gauge potential \(B_{\mu \nu }\) and a scalar field \(\varphi \). This equation has meaning if we interpret it as the tensor product of the corresponding spacetime little group representations, but going beyond this requires a rather more subtle approach. Indeed, there are a number of good reasons, Weinberg–Witten aside, to suspect it cannot be well-defined. For one, the right-hand side should be covariant with respect to general coordinate transformations, whereas the left-hand side transforms locally under two arbitrary finite-dimensional Lie groups. Nonetheless, the idea represented by (5) has proven itself incredibly powerful, particularly in the context of scattering amplitudes, motivating a reappraisal of these apparent obstructions.

Given the lessons of Weinberg–Witten (forbidding composite particles) and AdS/CFT (relying on a holographic dimension), how could such a proposal work? The first substantial clues came, just as in the case of AdS/CFT, from string theory in the guise of the Kawai–Lewellen–Tye (KLT) relations [34], which connect the tree-level scattering amplitudes of closed strings to sums of products of open string amplitudes. While highly non-trivial, the intuition underpinning these relations is quite clear (footnote 4). First, the spectrum of closed strings is given by the tensor product of those corresponding to left- and right-moving open strings. Since the low energy effective field theory limits of closed and open superstrings are given by supergravity and super Yang–Mills theory, respectively, graviton states arise as the tensor product of the gluon states.

For instance, the massless sector of type I superstrings in \(D=10\) is given by super Yang–Mills theory [36] with gluons and gluini, that is adjoint-valued Majorana–Weyl (MW) spinors, in the \(\mathbf {8}_v\) and \(\mathbf {8}_s\), respectively, of (the double cover of) the spacetime little group \(\text {Spin}(8)\). The product of type I superstrings with opposing chiralities yields type IIA superstrings, with massless spectrum,

$$\begin{aligned} \begin{array}{c|c|ccccccccc} \otimes &{}\mathbf {8}_v &{}\mathbf {8}_c \\ \hline \mathbf {8}_v &{}\underset{\text {graviton}}{\mathbf {35}} +\underset{\text {KR 2-form}}{\mathbf {28}}+\underset{\text {dilaton}}{\mathbf {1}} &{} \underset{\text {gravitino}}{\mathbf {56}_s}+\underset{\text {MW spinor}}{\mathbf {8}_s} \\ \hline \mathbf {8}_s &{} \underset{\text {gravitino}}{\mathbf {56}_c} +\underset{\text {MW spinor}}{\mathbf {8}_c} &{} \underset{\text {RR 3-form}}{\mathbf {56}_v} +\underset{\text {RR 1-form}}{\mathbf {8}_v} \\ \end{array} \end{aligned}$$
(6)

In particular, we see that the on-shell gluon\(\otimes \)gluon sector (footnote 5) yields an on-shell graviton \(\mathbf {35}\), the on-shell Kalb–Ramond (KR) 2-form \(\mathbf {28}\), and the dilaton \(\mathbf {1}\), just as anticipated by the heuristic formula (5). Of course, this is just a special case of \(V\otimes V\cong \text {Sym}^2(V)_0\oplus \wedge ^2(V)\oplus {\mathbb {R}}\), where \(V\cong {\mathbb {R}}^{D-2}\) is the vector representation of \(\mathrm{SO}(D-2)\) and \(\text {Sym}^2(V)_0\) denotes the trace-free symmetric product. This is the universal sector of the product of all conventional (super) Yang–Mills theories for any dimension D and any number of allowed supersymmetries \({\mathcal {N}}\). It is therefore useful to give it a name. It is sometimes referred to as “\({\mathcal {N}}=0\) supergravity” and we shall adopt this convention. In addition to the universal bosonic sector, we have the fermions from the gluon\(\otimes \)gluino\(+\)gluino\(\otimes \)gluon sector consisting of two gravitini, \(\mathbf {56}_s\) and \(\mathbf {56}_c\), and two MW spinors, \(\mathbf {8}_c\) and \(\mathbf {8}_s\). The appearance of two gravitini with opposing chiralities implies that there are \({\mathcal {N}}=2=1+1\) local supersymmetries corresponding to type IIA supergravity. This reveals our second universal property apparent just at the level of representation theory: the global left and right supersymmetries sum in the product to give local supersymmetries (footnote 6). Finally, in the gluino\(\otimes \)gluino sector we find the R-R 1-form \(\mathbf {8}_v\) and 3-form \(\mathbf {56}_v\) Abelian gauge fields. This is just the Clifford expansion of the spinor tensor products.
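
As a quick sanity check of the counting above, the following minimal sketch computes the dimensions of the three pieces of \(V\otimes V\) for the \(\mathrm{SO}(D-2)\) vector and confirms that they add up to \((D-2)^2\). The function name `universal_sector_dims` is purely illustrative and not taken from any reference.

```python
def universal_sector_dims(D):
    """Dimensions of the SO(D-2) pieces of V (x) V for the vector V = R^(D-2)."""
    n = D - 2  # dimension of the little-group vector representation
    graviton = n * (n + 1) // 2 - 1   # trace-free symmetric part, Sym^2(V)_0
    two_form = n * (n - 1) // 2       # antisymmetric part, wedge^2(V)
    dilaton = 1                       # trace part
    return graviton, two_form, dilaton


for D in (4, 10):
    graviton, two_form, dilaton = universal_sector_dims(D)
    # The three pieces must exhaust the (D-2)^2 states of gluon (x) gluon.
    assert graviton + two_form + dilaton == (D - 2) ** 2
    print(D, graviton, two_form, dilaton)
# prints:
# 4 2 1 1
# 10 35 28 1
```

For \(D=10\) this reproduces the \(\mathbf {35}+\mathbf {28}+\mathbf {1}\) of the gluon\(\otimes \)gluon entry in (6); for \(D=4\) it gives the two graviton helicities together with one state each for the 2-form and the dilaton.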

The matching of closed string spectra with the products of open string spectra is, of course, just the minimal requirement for anything like the KLT relations to work, but also allows for (or, rather, follows from) the construction of closed string vertex operators in terms of those of left- and right-moving open strings, \(V_{\mathrm{closed}}(2p, \tau )\sim \int d\sigma V_{\mathrm{open}}(p, \tau -\sigma )\tilde{V}_{\mathrm{open}}(p, \tau +\sigma )\), where \((\tau , \sigma )\) are the world-sheet coordinates. This, in turn, implies that the tree-level n-closed-string amplitudes are given by sums of products,

$$\begin{aligned} {\mathcal {A}}^{n}_{\mathrm{closed}} \sim \sum _{\sigma , \sigma '} e^{i \theta (\sigma , \sigma ')} {\mathcal {A}}^{n}_{\mathrm{open}} (\sigma )\tilde{{\mathcal {A}}}^{n}_{\mathrm{open}}(\sigma ') \end{aligned}$$
(7)

where \({\mathcal {A}}^{n}_{\mathrm{open}}\) and \(\tilde{{\mathcal {A}}}^{n}_{\mathrm{open}}\) are left- and right-moving open string amplitudes, \(\sigma , \sigma '\) are non-cyclic permutations of the external lines and the \(\theta \) are model-independent phases determined entirely by \(\sigma , \sigma '\) [34]. For example, focussing on the bosonic string, consider the very simplest case of the three-graviton vertex,

$$\begin{aligned} {\mathcal {A}}^{3}_{\mathrm{closed}} \sim \kappa \varepsilon _{1}^{\mu \alpha } \varepsilon _{2}^{\nu \beta }\varepsilon _{3}^{\rho \gamma }A^{3}_{\mathrm{open} ~\mu \nu \rho }\tilde{A}^{3}_{\mathrm{open}~\alpha \beta \gamma }. \end{aligned}$$
(8)

Here, \(\kappa \) is the closed string coupling constant, \(\varepsilon _{i}^{\mu \alpha }\) are the transverse-traceless graviton polarisation tensors and the open string three-gluon vertex is given by

$$\begin{aligned} {\mathcal {A}}^{3}_{\mathrm{open}} \sim g \varepsilon _{1}^{\mu } \varepsilon _{2}^{\nu }\varepsilon _{3}^{\rho } A^{3}_{\mathrm{open}~\mu \nu \rho }, \end{aligned}$$
(9)

where g is the open string coupling constant and \(\varepsilon _{i}^{\mu }\) are the transverse gluon polarisation vectors. Note, at zeroth-order in the inverse string tension, \(\alpha '\), \({\mathcal {A}}^{3}_{\mathrm{open}}\) is precisely the three-gluon vertex of pure Yang–Mills theory, but also has an order \(\alpha '\) contribution corresponding to an \(F^3\) term. Consequently, \({\mathcal {A}}^{3}_{\mathrm{closed}}\) has both order \(\alpha '\) and \(\alpha '^2\) contributions following from four- and six-derivative terms of the form \(R^2\) and \(R^3\). However, in the infinite string tension limit, \(\alpha '\rightarrow 0\), we recover a precise relationship between the three-point vertices of perturbative \({\mathcal {N}}=0\) supergravity (which can be restricted to Einstein–Hilbert gravity by choosing the gluon polarisations appropriately, since it is tree-level) and Yang–Mills theory. At four-point with external tachyons we obtain a relationship [34] between the famous closed string Virasoro–Shapiro [37, 38] and open string Veneziano [39] amplitudes, which initiated the string theory programme itself. The four-point KLT relations hold for all external states and, again, reduce to four-point Einstein–Hilbert gravity and Yang–Mills relations in the \(\alpha '\rightarrow 0\) limit. More generally, (7) for \(\alpha '\rightarrow 0\) gives a precise set of relations between graviton amplitudes as sums of products of gluon amplitudes for any number of points [34].
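
For orientation, the simplest instance of these field-theory relations can be written out explicitly. The expression below is a sketch in one commonly used normalisation, with coupling constants stripped; the symbols \(M^{\mathrm{tree}}_4\) (the four-graviton tree amplitude), \(A^{\mathrm{tree}}_4\) (colour-ordered gluon partial amplitudes) and \(s_{12}=(k_1+k_2)^2\) are notation assumed here rather than introduced in the text above, and overall factors of i and signs vary between references.

$$\begin{aligned} M^{\mathrm{tree}}_4(1,2,3,4) = -i\, s_{12}\, A^{\mathrm{tree}}_4(1,2,3,4)\, \tilde{A}^{\mathrm{tree}}_4(1,2,4,3). \end{aligned}$$

The structure mirrors (7): a single kinematic prefactor multiplies a product of a left and a right gauge theory amplitude with different orderings of the external legs.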

The KLT relations and their field theory descendants are by construction tree-level. Nonetheless, the “closed \(=\) open \(\times \) open” string theoretic approach was successfully applied to one-loop four-graviton amplitudes [40], suggesting that it may be possible to go beyond the semi-classical regime. At this stage it should be noted that direct contact with the standard Lagrangian approach has been lost. Instead, the key idea is to use unitarity methods [41,42,43,44,45,46,47] to build loop amplitudes from the tree-level KLT relations without passing through the usual Feynman prescription at all. This facilitated, for example, the calculation of the two-loop four-point amplitudes in \(D=4,{\mathcal {N}}=8\) supergravity from those of \({\mathcal {N}}=4\) super Yang–Mills theory [48]. The idea that unitarity can be used to glue trees into loops forms a part of what might be described as the “on-shell paradigm”. Starting with Lagrangian field theory we learnt how to perturbatively compute amplitudes to arbitrary precision. While conceptually straightforward, the factorial growth in complexity with loop order quickly renders the traditional Feynman diagram approach impractical. The search for computational efficiency precipitated a renaissance in amplitude techniques, focusing on physical, gauge invariant objects. Over time various amplitude structures (recursion relations, generalised unitarity cuts, Grassmannians, scattering equations, positivity, ...) were uncovered, eventually allowing the Lagrangian ladder to be kicked away altogether. As well as being computationally powerful, this programme offers new perspectives on quantum field theory itself. For a review of many of these developments see [49].

Importantly, this new-found freedom led to the discovery of new features of amplitudes, not visible from the original Lagrangian perspective. A remarkable example of one such hidden structure, which lies at the heart of “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”, is the Bern–Carrasco–Johansson (BCJ) colour-kinematic duality, which relates the kinematic dependence of an amplitude to its colour data [50, 51]. One can write any gluon amplitude entirely in terms of trivalent graphs (which are not Feynman diagrams) by “blowing up” the four-point contact terms. Having done so, the BCJ duality conjecture is that there exists a rewriting of the amplitude such that: (i) for any triple of graphs, ijk with colour factors \(c_i, c_j, c_k\), built entirely from the gauge group structure constants, obeying a Jacobi identity \(c_i+c_j+c_k=0\), the corresponding kinematic factors, \(n_i, n_j, n_k\), which are built from the momenta and polarisation tensors, also obey the same Jacobi-type identity \(n_i+n_j+n_k=0\); (ii) for any diagram, i, such that \(c_i\rightarrow -c_i\) under the interchange of two legs then \(n_i\rightarrow -n_i\). A reorganisation admitting this surprising relationship between colour and kinematic data exists for all n-point tree-level amplitudes, as has been demonstrated from a number of perspectives [52,53,54,55]. Although there is as yet no proof that the colour-kinematic duality will hold to all loops, there are many highly non-trivial examples providing supportive evidence [56,57,58,59,60,61,62,63]. While it is clear that the colour factors should obey Jacobi identities (by definition), it is not at all obvious that the kinematic factors should play by the same rules! It is certainly not apparent from the Yang–Mills Lagrangian. Moreover, the kinematic identities have further implications for amplitude architecture. 
For example, an immediate consequence of tree-level BCJ duality is the existence of BCJ relations amongst colour-ordered partial amplitudes, reducing the number of independent n-point partial amplitudes down to \((n-3)!\) [50].

More remarkable still is the BCJ double-copy prescription [51, 64]. Consider two n-point L-loop Yang–Mills amplitudes, both written in trivalent form with respective colour and kinematic factors \((c_i, n_i)\) and \((\tilde{c}_i, \tilde{n}_i)\), at least one of which has been successfully cast in a BCJ duality respecting form, say \((c_i, n_i)\). We can construct a corresponding gravitational theory amplitude by simply replacing each colour factor in \((\tilde{c}_i, \tilde{n}_i)\) with the corresponding kinematic factor of \((c_i, n_i)\): \((\tilde{c}_i, \tilde{n}_i)\rightarrow (n_i, \tilde{n}_i)\). We have removed all reference to the gauge group and “doubled” the kinematic terms. For two pure Yang–Mills theories (footnote 7) this double-copy procedure generates all possible amplitudes of \({\mathcal {N}}=0\) supergravity, giving precise meaning to the heuristic equation (5), at least at the semi-classical level. However, the two amplitudes need not belong to the same theory. For example, we could take the \((c_i, n_i)\) from maximally supersymmetric \({\mathcal {N}}=4\) Yang–Mills amplitudes and the \((\tilde{c}_i, \tilde{n}_i)\) from pure Yang–Mills theory. This yields the amplitudes of pure \({\mathcal {N}}=4\) supergravity [65]. Alternatively, if both factors are \({\mathcal {N}}=4\) Yang–Mills theory we generate the maximally supersymmetric \({\mathcal {N}}=8\) supergravity [51], which can be thought of as the dimensional reduction on a 6-torus of the “type II \(=\) type I \(\times \) type I” relation described in (6). By varying the left and right factors over all BCJ duality compatible gauge theories we generate all BCJ double-copy constructible gravitational theories. Of course, this is easier said than done, but there is nonetheless a rapidly multiplying zoology of BCJ double-copy constructible gravity theories [51, 56, 64, 66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89]. 
We shall come to describe this forest of theories once we have covered the necessary ground work.
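
Why it suffices for only one of the two factors to be in BCJ form can be illustrated with a toy four-point computation. The sketch below is not a physical amplitude: the propagators, colour factors and numerators are arbitrary rational numbers chosen by hand (all hypothetical), subject only to the constraints named in the comments. It assumes the schematic form \(\sum _i c_i n_i/d_i\) for the gauge amplitude and \(\sum _i n_i \tilde{n}_i/d_i\) for its double copy, with \(d_i\) the propagators of the three channels, and uses the shift \(\tilde{n}_i\rightarrow \tilde{n}_i+\Delta \, d_i\), often called a generalised gauge transformation.

```python
from fractions import Fraction


def dressed_sum(left, right, props):
    """Sum over the three four-point channels of left_i * right_i / propagator_i."""
    return sum(Fraction(l * r, d) for l, r, d in zip(left, right, props))


# Toy data (hypothetical numbers, exact rational arithmetic).
s, t = 3, 5
props = (s, t, -s - t)      # propagators s, t, u with s + t + u = 0
c = (2, -7, 5)              # colour factors obeying c_s + c_t + c_u = 0
n = (4, 9, -13)             # numerators in BCJ form: n_s + n_t + n_u = 0
n_bad = (4, 9, 1)           # numerators violating the kinematic Jacobi identity
n_tilde = (6, -1, 8)        # second copy, not required to obey Jacobi

# Generalised gauge shift of the second copy: n~_i -> n~_i + delta * d_i.
delta = 11
n_tilde_shifted = tuple(x + delta * d for x, d in zip(n_tilde, props))

# The gauge amplitude is unchanged because the colour factors obey Jacobi.
assert dressed_sum(c, n_tilde, props) == dressed_sum(c, n_tilde_shifted, props)

# The double copy is unchanged because the first copy is in BCJ form.
assert dressed_sum(n, n_tilde, props) == dressed_sum(n, n_tilde_shifted, props)

# Without BCJ form on either side, the would-be double copy is ambiguous.
assert dressed_sum(n_bad, n_tilde, props) != dressed_sum(n_bad, n_tilde_shifted, props)
```

In this toy setting the replacement \(\tilde{c}_i\rightarrow n_i\) gives a well-defined answer precisely because the \(n_i\) inherit the algebraic property of the colour factors that made the gauge amplitude insensitive to the shift.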

The double-copy picture is not only conceptually compelling but also computationally powerful, bringing previously intractable calculations within reach. This has pushed forward dramatically our understanding of divergences in perturbative quantum gravity [56, 57, 60, 67, 90,91,92,93,94,95], revealing a number of unexpected features and calling into question previously accepted arguments regarding finiteness. A remarkable example is given by the four-point graviton amplitude in \({\mathcal {N}}=8\) supergravity, which was shown to be finite to four loops in [56], contradicting some early expectations [96, 97]. It has since been shown that the four-loop cancellation can be accounted for by supersymmetry and \(E_{7(7)}\) U-duality [98,99,100,101,102]. The consensus, however, is that at seven loops any would-be cancellations cannot be “consequences of supersymmetry in any conventional sense” [98]. Unfortunately, seven loops in \({\mathcal {N}}=8\) supergravity remains beyond reach (for now), but by decreasing the amount of supersymmetry the same arguments apply at lower loop order. For example, the four-point amplitude of \(D=4, {\mathcal {N}}=5\) supergravity has been shown to be finite to four loops, contrary to all expectations based on standard symmetry arguments [57]. There are “enhanced cancellations” [57] at work, in the sense that they cannot be explained by any standard symmetry argument (footnote 8), and the conclusion that \({\mathcal {N}}=8\) supergravity will diverge at seven loops is thrown into doubt. More recently, in a computational tour de force the \({\mathcal {N}}=8\) four-point five-loop amplitude was completed using generalised BCJ duality and the double-copy of \({\mathcal {N}}=4\) Yang–Mills theory [95, 104]. It was found to be finite in agreement with the expectation that \({\mathcal {N}}=8\) should diverge at seven loops. 
Its degree of finiteness was in line with standard symmetry arguments; there were no enhanced cancellations that might lead one to speculate that seven loops will be finite contrary to conventional expectations. However, this conclusion was reached in \(D=24/5\) where the five-loop amplitude first diverges and there is a \(\partial ^8R^4\) counter-term. Since we do not fully understand the nature or origin of the enhanced cancellations it could still be that they kick in at seven loops in the relevant case of \(D=4\). Similarly, the fact that \({\mathcal {N}}=5\) supergravity is rendered finite at four loops by enhanced cancellations can be interpreted in a number of ways. On the one hand, it makes it clear that standard symmetry arguments are insufficient to predict when a supergravity theory will diverge, opening a small window of opportunity for perturbative finiteness. On the other hand, although it is not a logical impossibility, no amplitude practitioners (as far as we are aware) expect \({\mathcal {N}}=5\) supergravity to be perturbatively finite, so: (i) the \({\mathcal {N}}=5\) enhanced cancellations witnessed at four loops are anticipated to fail eventually and, even if \({\mathcal {N}}=8\) supergravity (where there is active debate regarding finiteness, see for example [98, 105,106,107,108]) is magically finite at seven loops, it is not guaranteed that the cancellations will persist at higher loops; (ii) the observation of enhanced cancellations in \({\mathcal {N}}<8\) supergravity theories implies that \({\mathcal {N}}=8\) is not actually special in this regard. Does this undermine its privileged position as a candidate finite theory? If theories that are not expected to be finite can have enhanced cancellations, then why should we think that their existence might suggest finiteness of other theories? 
Of course, without a complete understanding of the amplitudes, including hidden features such as the enhanced cancellations, these are all just speculations and we will not know the answer at any particular order until we do the calculation.

Whatever the case, there is something deeper at work that we have yet to fully comprehend; the questions regarding finiteness, and the “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” paradigm more generally, remain very much open. In particular:

  1. Why does the correspondence work? Can we prove the BCJ colour-kinematic conjecture? Is there some underlying geometric or world-sheet origin?

  2. How deep is the correspondence? Is “\(\hbox {gravity} =\hbox {gauge} \times \hbox {gauge}\)” strictly a property of amplitudes or can it be generalised to other/all aspects of gauge and gravity theories? What are the implications for quantum gravity?

  3. How general is the correspondence? When does a gauge theory respect BCJ duality? What classes of gravitational theories admit a gauge theory squared origin; are the factorisable theories special in some regard?

1.1 Outline

In the remainder of this review we shall begin by describing BCJ duality and the double-copy construction in some detail before giving an (inevitably incomplete) overview of the rapidly evolving work tackling these questions and their various related puzzles. In Sect. 2 we consider “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” in the context of scattering amplitudes, where it takes its most concrete and developed form. In particular, we shall use this section to introduce some notation and the basic background concepts, before turning to BCJ duality and the double-copy construction themselves. This will address to some degree question (3) above regarding what theories admit a “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” origin. We shall also review the status of the implications for perturbative quantum gravity.

Finally, in Sect. 3, we take a step back from amplitudes and discuss various “off-shell” approaches to understanding BCJ duality and “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”, addressing aspects of (1), (2) and (3). The first example is a Lagrangian approach to making BCJ duality manifest introduced in the original work on the double-copy [51, 64]. One can also consider the double copy of classical gauge solutions [109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135]. We will then review a field theoretic formulation of “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” independent of, but consistent with, the BCJ double-copy [81, 85, 115, 116, 119, 136,137,138,139,140,141,142].

A comment on the scope of this review. Given the length constraints of La Rivista we have not been able to be pedagogical nor close to comprehensive. Regarding the latter, we can only apologise for any omissions and would welcome suggestions for future additions. Regarding the former, while it has not been possible to be pedagogical, we have endeavoured to be reasonably self-contained and elementary on the essential introductory matter so the non-expert will have some chance to understand the core basics, opening the door as it were. Otherwise, further reading will no doubt be required and we have tried to provide sufficient references throughout.

Before proceeding any further let us first mention other reviews that cover the various subthemes treated here in more detail. First, for a pedagogical introduction to BCJ duality and the double-copy one could not do better than [143, 144]. The latter is also the superlative reference for BCJ duality, the double-copy and their applications more generally. For students starting out in this subject, or those coming from outside the amplitudes community, [44, 49, 145,146,147,148] provide excellent introductions, the latter two with one eye on BCJ duality and the double-copy from the outset. For a broader, rather inspirational, early account of “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” and its potential implications for perturbative quantum gravity see [149]. For an eminently approachable review of twistors, setting the scene for their applications to amplitudes, see [150]. For an excellent account of their applications to perturbative quantum field theory and the relationship between gauge and gravity amplitudes see [151].

2 Scattering amplitudes

The slogan “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” takes its most concrete and complete form in the setting of scattering amplitudes and so this is where we shall begin our journey. In the context of modern particle physics tested at accelerators, such as the Large Hadron Collider, scattering amplitudes constitute the most basic gauge theory observables. Through their relation to cross-sections for scattering processes, they encode the probabilities that a set of colliding particles will interact to produce some other set of particles, thus providing direct contact between theory and experiment. Belying their conceptual simplicity, they are replete with subtle hidden structures and continue to reveal remarkable surprises to this day, with profound consequences, not only for computational techniques, but also quantum field theory itself.

One such surprise is the idea that gauge theory amplitudes can be used as the building blocks for gravity theory amplitudes. This approach can be traced back to the KLT relations of string theory. The modern, further-reaching incarnation takes the form of the BCJ double-copy construction, which will be our principal preoccupation here. The double-copy in turn relies upon BCJ colour-kinematic duality, which is thus our first concern.

2.1 Bern–Carrasco–Johansson colour-kinematic duality

The cornerstone of the double-copy realisation of graviton scattering amplitudes in terms of gluon scattering amplitudes is the Bern–Carrasco–Johansson colour-kinematic duality [50]. Everyone knows that the colour factors dressing gluonic Feynman diagrams obey certain relations, such as the Jacobi identity. The colour factors derive from the Lie group characterising the gauge theory so it is in their very nature to do so. However, BCJ duality implies that there exists a rewriting of the amplitude such that whenever a set of diagrams satisfies a Jacobi identity amongst their colour factors, the corresponding kinematic numerators obey precisely the same identity. Kinematic numerators satisfying these identities are referred to as BCJ numerators. It is not at all obvious that this should be true; it certainly is not manifest in the conventional Yang–Mills Lagrangian.

Let us now describe in detail the BCJ duality conjecture for pure Yang–Mills theory. We shall add supersymmetry, matter couplings and more exotic structures in the subsequent sections. For now, we take this opportunity to give a lightning review of Yang–Mills theory, setting some notation, and introducing the basic ideas that will carry us through the remaining material.

2.1.1 Yang–Mills theory and gluon scattering amplitudes

We begin by reviewing pure Yang–Mills gauge theories, which are specified by a choice of compact Lie group G and a principal G-bundle P(M, G), defined over a base manifold M corresponding, in this context, to a fixed spacetime background. Given an open subset U of M with local section \(\sigma \), the local Yang–Mills gauge potential \(A\in \Omega ^1(U)\otimes \mathfrak {g}\), where \(\mathfrak {g}\) is the Lie algebra of G, is given by \(A=\sigma ^* \omega \), where \(\omega \) is a connection on P(M, G). The associated Yang–Mills field strength given by

$$\begin{aligned} F = dA+A\wedge A \end{aligned}$$
(10)

then corresponds to a local form of the curvature of the connection. For an open covering \(\{U_i\}\) of M with local sections \(\sigma _i\), the gauge potentials on non-trivial overlaps \(U_i \cap U_j\not =\varnothing \) satisfy the compatibility relations,

$$\begin{aligned} A_j=t^{-1}_{ij}A_i t_{ij} + t^{-1}_{ij}dt_{ij}, \end{aligned}$$
(11)

where \(t_{ij}: U_i \cap U_j \rightarrow G\) are the transition functions of P(M, G). This implies the compatibility relation for the associated field strengths

$$\begin{aligned} F_j=t^{-1}_{ij}F_i t_{ij}. \end{aligned}$$
(12)

For two local sections on a chart U related by \(\sigma '(p)=\sigma (p)g(p)\) for all \(p\in U\), where \(g: U\rightarrow G\), the corresponding gauge potentials are related by the gauge transformations

$$\begin{aligned} A'=g^{-1}Ag + g^{-1}dg, \end{aligned}$$
(13)

which imply

$$\begin{aligned} F' = g^{-1}Fg. \end{aligned}$$
(14)

For gauge transformations connected to the identity there is a \(\theta : U\rightarrow \mathfrak {g}\) such that \(g=\exp [\theta ]\) and

$$\begin{aligned} \delta A= d\theta + [A, \theta ] \equiv D_A\theta ,\quad \delta F= [F, \theta ] \end{aligned}$$
(15)

to first order in \(\theta \), where we define the commutatorFootnote 9 of \(\mathfrak {g}\)-valued forms by \([x, y] =x\wedge y - (-1)^{pq}y\wedge x\), for \(x\in \Omega ^p(U)\otimes \mathfrak {g}\) and \(y\in \Omega ^q(U)\otimes \mathfrak {g}\).
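For completeness, the first-order expressions (15) follow directly from (13) and (14). The short expansion below is our own filling-in of the intermediate step, using only \(g=\exp [\theta ]\) and the graded commutator just defined (for a 0-form \(\theta \) it reduces to \([A,\theta ]=A\theta -\theta A\)):

$$\begin{aligned} g&=1+\theta +{\mathcal {O}}(\theta ^2),\qquad g^{-1}=1-\theta +{\mathcal {O}}(\theta ^2),\nonumber \\ A'&=(1-\theta )A(1+\theta )+(1-\theta )d\theta +{\mathcal {O}}(\theta ^2) =A+d\theta +[A,\theta ]+{\mathcal {O}}(\theta ^2),\nonumber \\ F'&=(1-\theta )F(1+\theta )+{\mathcal {O}}(\theta ^2)=F+[F,\theta ]+{\mathcal {O}}(\theta ^2). \end{aligned}$$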

Here we have introduced the local covariant derivative \(D_A: \Omega ^p(U)\otimes \mathfrak {g}\rightarrow \Omega ^{p+1}(U)\otimes \mathfrak {g}\) for local connection A (we will henceforth omit the subscript on \(D_A\) indicating the choice of connection),

$$\begin{aligned} D x : = dx +[A, x]. \end{aligned}$$
(16)

The field strength obeys the local Bianchi identity,

$$\begin{aligned} DF=d(dA+A\wedge A)+A\wedge dA- dA\wedge A =0 . \end{aligned}$$
(17)
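The cancellation in (17) can be spelled out term by term. The following is a sketch of the intermediate steps, assuming only \(d^2=0\), the graded Leibniz rule and the definition (16) of D acting on the 2-form F:

$$\begin{aligned} dF&=d(dA+A\wedge A)=dA\wedge A-A\wedge dA,\nonumber \\ [A,F]&=A\wedge F-F\wedge A=A\wedge dA-dA\wedge A+\left( A\wedge A\wedge A-A\wedge A\wedge A\right) ,\nonumber \\ DF&=dF+[A,F]=0. \end{aligned}$$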

It is often useful to introduce a basis \(\{T_a\}_{a=1}^{\dim \mathfrak {g}}\) for \(\mathfrak {g}\) in some representation \(\rho \), especially when considering the colour structure of scattering amplitudes. Then

$$\begin{aligned} A = g A_{\mu }^{a} dx^\mu \otimes T_a, \quad F =\frac{1}{2} g F_{\mu \nu }^{a} dx^\mu \wedge dx^\nu \otimes T_a \end{aligned}$$
(18)

where g is the Yang–Mills coupling and

$$\begin{aligned} F_{\mu \nu }^{a} = \partial _\mu A_{\nu }^{a} - \partial _\nu A_{\mu }^{a} + g f^{a}{}_{bc} A_{\mu }^{b}A_{\nu }^{c} \end{aligned}$$
(19)

and \( f^{a}{}_{bc}\) are the structure constants of \(\mathfrak {g}\), \([T_a, T_b] = f_{ab}{}^{c}T_c\), which are totally antisymmetric \(f_{abc}=f_{[abc]}\) (adjoint indices are raised/lowered via minus the Cartan-Killing form, so may be taken as \(\delta _{ab}\) in an appropriate basis) and satisfy the Jacobi identity

$$\begin{aligned} 3f_{[ab}{}^{e}f_{c]e}{}^{d}=f_{ab}{}^{e}f_{ce}{}^{d} +f_{ca}{}^{e}f_{be}{}^{d}+f_{bc}{}^{e}f_{ae}{}^{d}=0. \end{aligned}$$
(20)
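As a concrete illustration, the total antisymmetry and the Jacobi identity (20) can be checked by brute force for the simplest non-Abelian example. The sketch below assumes \(\mathfrak {g}=\mathfrak {su}(2)\) with \(f_{abc}=\epsilon _{abc}\) in a basis where indices are raised and lowered with \(\delta _{ab}\); the function names are illustrative choices of ours and are not taken from the references.

```python
from itertools import product

DIM = 3  # dimension of su(2)


def levi_civita(a, b, c):
    """Totally antisymmetric symbol on the index values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


# Assumption: su(2) structure constants f_{abc} = epsilon_{abc}. Index position
# is immaterial because indices are raised/lowered with delta_{ab}.
f = [[[levi_civita(a, b, c) for c in range(DIM)]
      for b in range(DIM)]
     for a in range(DIM)]


def jacobi_sum(a, b, c, d):
    """Left-hand side of (20): f_ab^e f_ce^d + f_ca^e f_be^d + f_bc^e f_ae^d."""
    return sum(
        f[a][b][e] * f[c][e][d]
        + f[c][a][e] * f[b][e][d]
        + f[b][c][e] * f[a][e][d]
        for e in range(DIM)
    )


def jacobi_holds():
    """True if (20) vanishes for every choice of free indices."""
    return all(jacobi_sum(*idx) == 0 for idx in product(range(DIM), repeat=4))


def totally_antisymmetric():
    """True if f_abc flips sign under exchange of any adjacent pair of indices."""
    return all(
        f[a][b][c] == -f[b][a][c] and f[a][b][c] == -f[a][c][b]
        for a, b, c in product(range(DIM), repeat=3)
    )


if __name__ == "__main__":
    print("Jacobi identity holds:", jacobi_holds())
    print("f_abc totally antisymmetric:", totally_antisymmetric())
```

The same loops apply unchanged to any other Lie algebra once its structure constants are supplied in place of the Levi-Civita symbol.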

In components the gauge transformations (15) are given by

$$\begin{aligned} \delta A_{\mu }^{a}= \partial _\mu \theta ^a + gf^{a}{}_{bc} A_{\mu }^{b} \theta ^c,\quad \delta F_{\mu \nu }^{a}= gf^{a}{}_{bc} F_{\mu \nu }^{b} \theta ^c. \end{aligned}$$
(21)

where \(\theta = g\theta ^aT_a\). With the goal of developing gluonic scattering amplitude relations in mind, for the remainder of this section we shall restrict M to be \(D=(1+d)\) dimensional Minkowski spacetime so that the bundle is trivial. We may leave for now G to be an arbitrary compact Lie group, since the adjoint representation is always real, but the typical example to have in mind is \(\mathrm{SU}(N)\). When we consider matter couplings we will have to be more careful about the properties of G, taking into account the specific reality properties of the other representations required. With these comments in mind, we can now turn to the perturbative quantum Yang–Mills theory valid in the high energy regime.

The quantum theory is most transparently formulated through an action principle. The classical Yang–Mills action functional is given by

$$\begin{aligned} S_{\mathrm{YM}}=\frac{1}{2 g ^2}\int _M \mathrm{tr}\ F \wedge \star F =-\frac{1}{4}\int _M d^DxF_{\mu \nu }^{a} F^{a \mu \nu } \end{aligned}$$
(22)

where \(\mathrm{tr}\) denotes an appropriately normalised G-invariant and negative-definite quadratic form on \(\mathfrak {g}\). It is by construction invariant under the gauge transformations (15). Consequently, there is a large gauge redundancy, which must be treated carefully using, for example, the Faddeev–Popov procedure [152]. Although well-trodden territory, we will briefly review the approach of Becchi–Rouet–Stora–Tyutin (BRST) quantisation [153, 154], as certain ingredients will be explicitly needed later in the perhaps less familiar context of a classical field-theoretic realisation of “\(\hbox {gravity} =\hbox {gauge} \times \hbox {gauge}\)”. There are several good reviews, for example [155,156,157], of the BRST formalism and the more general Batalin–Vilkovisky (BV) [158,159,160,161,162,163] approach, as needed for open symmetry algebras encountered, for example, in supergravity. We will in fact need the full machinery of the BV formalism later, but refer the reader to [157] for the required background material.

Following the Faddeev–Popov prescription, using the Nakanishi–Lautrup Lagrange multiplier \(b : M\rightarrow {\mathfrak {g}}\) we can lift to the action the generalised gauge-fixing delta \(\delta (G[A]-w)\), with a width \(\xi \) Gaussian weighting by \(w :M\rightarrow {\mathfrak {g}}\). The Faddeev–Popov determinant is also lifted to the action by the inclusion of the anti-commuting ghost and antighost fields \(c, \bar{c}: \Omega ^0(M)\rightarrow \mathfrak {g}\). This results in the total Yang–Mills BRST action

$$\begin{aligned} S_{\mathrm{YM}_{\mathrm{BRST}}} = S_{\mathrm{YM}} + S_{\mathrm{gf}} + S_{\mathrm{gh}} \end{aligned}$$
(23)

where

$$\begin{aligned} S_{\mathrm{gf}}= -\frac{1}{g^2}\int _M \mathrm{tr}\ b \left( \frac{1}{2}\xi b+ G[A]\right) \end{aligned}$$
(24)

follows from the gauge-fixing terms and

$$\begin{aligned} S_{\mathrm{gh}}= -\frac{1}{g^2} \int _M d^4x \mathrm{tr} \left( \bar{c} \int _M d^4y \tfrac{\delta G(x)}{\delta A^\mu (y)}D^\mu c(y)\right) , \end{aligned}$$
(25)

follows from the Faddeev–Popov determinant. Although we have explicitly broken the gauge symmetry by the addition of (24), the total Yang–Mills BRST action (23) is annihilated under the global BRST transformations, in which the ghost field replaces the local gauge parameter \(\theta \):

$$\begin{aligned}&QA:= Dc, \end{aligned}$$
(26a)
$$\begin{aligned}&Qc:= -\tfrac{1}{2}\{c,c\}, \end{aligned}$$
(26b)
$$\begin{aligned}&Q\bar{c}:= b, \end{aligned}$$
(26c)
$$\begin{aligned}&Qb:=0, \end{aligned}$$
(26d)

where \(Q, c, \bar{c}, b\) have ghost numbers \(\text {gh}(Q)=1, \text {gh}(c)=1, \text {gh}(\bar{c})=-1, \text {gh}(b)=0\) and

$$\begin{aligned} Q(ab) = (Qa)b +(-1)^{\varepsilon (a)} a(Qb) \end{aligned}$$
(27)

where \(\varepsilon (f)\in \{0,1\}\) is the Grassmann grade of f.

In this case the BRST charge is nilpotent, \(Q^2=0\), without imposing any further conditions on the fields, which follows from the off-shell closure of the classical gauge transformations. It is the Noether charge associated to the continuous global symmetry with anti-commuting parameter \(\epsilon \), given by \(\delta _\epsilon =\epsilon Q\). Using these transformations we can rewrite \(S_{\mathrm{gf}} + S_{\mathrm{gh}}\) in terms of a Q-exact term \(\int \star Q \Psi _{\mathrm{gf}}\), where the gauge-fixing fermion \( \Psi _{\mathrm{gf}}\) is given by \(-\mathrm{tr}\ \bar{c} (\frac{1}{2}\xi b+ G[A])\). Functions on this enlarged space of fields, \((A, c, \bar{c}, b)\), together with the homological vector field, Q, form a chain complex and its cohomology characterises the set of possible physical observables. The expectation values of physical observables are independent of the choice of gauge-fixing, which is made manifest by the freedom to add Q-exact terms that modify the gauge-fixing and ghost terms, but leave the physical sector invariant.

Making a choice for G[A] and eliminating the auxiliary field b, we can then follow the usual perturbative approach to quantising (23), safe in the knowledge that the gauge redundancy has been appropriately accounted for, to arrive at the familiar Feynman diagrams for pure Yang–Mills theory, as described in the standard quantum field theory textbooks.Footnote 10 Since the colour structure of these diagrams will lie at the heart of our discussion, let us be completely explicit nonetheless.

Consider the linear covariant gauge fixing condition \(G[A]=\text {div}A\). After eliminating the Nakanishi-Lautrup field b through its algebraic equation of motion we are left with,

$$\begin{aligned} S_{\mathrm{YM}_{\mathrm{BRST}}} =\frac{1}{2 g ^2}\int _M \mathrm{tr}\left( F \wedge \star F-\frac{1}{\xi } \star (d^\dagger A)^2 +2d\bar{c} \wedge \star D c\right) , \end{aligned}$$
(28)

which has no gauge symmetry left, but is invariant under the BRST transformations (26) with (26c) replaced by \(Q\bar{c}:=- \tfrac{1}{\xi }d^\dagger A\), which follows from the equation of motion \(b = - \tfrac{1}{\xi }d^\dagger A\).

The Feynman diagrams then follow immediately from the standard canonical approach. The pure gluon diagrams in Feynman gauge (\(\xi =1\)) are as follows:

(29a)
(29b)
(29c)

where \(p_{ij}=p_i-p_j\) and \(\eta ^{\mu \nu \rho \sigma }=\eta ^{\mu \rho }\eta ^{\nu \sigma }-\eta ^{\mu \sigma }\eta ^{\nu \rho }\). The ghost diagrams are given by,

(30a)
(30b)

With the gluon Feynman diagrams at our disposal we can now turn to scattering amplitudes in pure Yang–Mills gauge theory. Consider a collection of \(n_i\) non-interacting and well-separated gluons in-coming from past infinity in the initial separable state

$$\begin{aligned} |\text {in}\rangle =| p_1, \varepsilon _1 \rangle \otimes | p_2, \varepsilon _2 \rangle \cdots \otimes | p_{n_i}, \varepsilon _{n_i} \rangle \equiv | p_1, \varepsilon _1 ; p_2, \varepsilon _2 ;\dots p_{n_i}, \varepsilon _{n_i} \rangle , \end{aligned}$$
(31)

where \(p_i, \varepsilon _i\) are the on-mass-shell momentum and polarisation (we are suppressing here the colour data) of the \(i^{\mathrm{th}}\) gluon, respectively. The amplitude for these in-coming particles to scatter into some out-going collection of \(n_f\) non-interacting, well-separated particles at future infinity, \(t\rightarrow \infty \), \(|\text {out}\rangle = | p_{n_i+1}, \varepsilon _{n_i+1} ; \dots p_{n_i+n_f}, \varepsilon _{n_i+n_f} \rangle \), is given by the S-matrix element, \({\mathcal {A}}_{n_i\rightarrow n_f} =\langle \text {out}|S|\text {in}\rangle \).

The Lehmann–Symanzik–Zimmermann reduction formula re-expresses the scattering amplitude \( {\mathcal {A}}_{n_i\rightarrow n_f}\) in terms of the Green function of \(n_i + n_f\) local fields dressed by polarisation tensors and reciprocal renormalised two-point Green functions, which precisely cancel the external propagators. Splitting the S-matrix into its trivial and interacting pieces \(S=\mathbb {1}+iT\), the non-trivial amplitude is then given by the sum of all amputated and connected graphs using the Feynman diagrams in (29) contracted with the external legs, given in (32), with on-shell momenta \(\{p_i\}_{i=1}^{n_i+n_f}\) and polarisations \(\{\varepsilon _i\}_{i=1}^{n_i+n_f}\) (suppressing the helicity index \(\varepsilon _i=\varepsilon _{i}^{s}, s=\pm \)),

(32a)
(32b)

where the \(\mu , b\) indices are contracted with the colour and momentum indices of the associated leg of the internal diagram (represented here by the blob).

2.1.2 The colour-kinematic duality

Here we introduce the BCJ colour-kinematic duality for gluons. Let us first state the claim, so that we know where we are heading, before carefully unpacking the various ingredients.

First, recall an n-point L-loop gluon amplitude \({\mathcal {A}}_{\mathrm{YM}} ^{n, L}\) may always be written as

$$\begin{aligned} {\mathcal {A}}_{\mathrm{YM}} ^{n, L}=i^Lg ^{n-2+2L}\sum _{i}\int \prod ^{L}_{l=1}\frac{d^Dp_l}{(2\pi )^DS_i} \frac{c_i n_i}{d_i}, \end{aligned}$$
(33)

where the sum is over all n-point L-loop graphs, labelled i, with only trivalent vertices (these are not Feynman diagrams). The colour numerator, or colour factor, \(c_i\) associated to graph i is composed of gauge group structure constants and can be read off directly from the graph. The kinematic numerator, or kinematic factor, \(n_i\) associated to graph i is a polynomial of Lorentz-invariant contractions of polarisation vectors and momenta. The denominators \(d_i\) are composed of the propagators given by products of the momentum squared of each internal line of graph i. Here, \(S_i\in \mathbb {N}\) accounts for any over-counting due to the graph symmetries. At tree-level, \(L=0\), this simplifies to

$$\begin{aligned} {\mathcal {A}}_{\mathrm{YM}} ^{n, 0}=g ^{n-2}\sum _{i=1}^{(2n-5)!!} \frac{c_i n_i}{d_i}, \end{aligned}$$
(34)

since there are \((2n-5)!!\) trivalent tree diagrams at n-points.Footnote 11
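The counting \((2n-5)!!\) can be understood recursively: an \((n-1)\)-point trivalent tree has \(n-1\) external and \(n-4\) internal edges, so \(2n-5\) edges in total, and the n-th leg may be attached to any one of them. A minimal sketch of this argument, with function names that are purely illustrative, is:

```python
def double_factorial(m):
    """Return m!! for odd m >= -1, with the convention (-1)!! = 1!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_trivalent_trees(n):
    """Number of trivalent tree graphs with n labelled external legs (n >= 3).

    Built up leg by leg: a k-point graph arises by attaching the k-th leg
    to any of the 2k - 5 edges of a (k-1)-point graph.
    """
    count = 1  # the single three-point graph
    for k in range(4, n + 1):
        count *= 2 * k - 5
    return count


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, count_trivalent_trees(n), double_factorial(2 * n - 5))
```

For \(n=4,5,6\) both functions return 3, 15 and 105, respectively.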

Having reorganised the amplitudes into a sum over purely trivalent graphs we can state the BCJ duality conjecture:

figure a

Having presented the conjecture, let us now expand on the various components, starting with (33). Since \([-,-]: \mathfrak {g}\otimes \mathfrak {g}\rightarrow \mathfrak {g}\), the possibility of relating colour to kinematics relies on writing the amplitude in terms of trivalent diagrams only. This is possible because the four-point contact terms (29c) can always be ‘blown-up’ and absorbed into three-point diagrams. Consider the simplest example of the four-point, tree-level amplitude,

(35)

We shall leave the helicities unspecified as the argument should not depend on any particular configuration and simply denote the kinematic numeratorsFootnote 12 by \(n=n(p_i, \varepsilon _i)\) as in (33). Explicitly, with all momenta outgoing \(p_1+p_2+p_3+p_4=0\),

(36a)
(36b)
(36c)
(36d)

where we have used the Mandelstam variables \(s=(p_1+p_2)^2, t=(p_1+p_4)^2, u=(p_1+p_3)^2\) in the trivalent s-, t- and u-channel diagrams, respectively. We have suggestively labelled the kinematic factors appearing in the four-point contact term by \(n^{{(4)}}_{s}, n^{{(4)}}_{t}, n^{{(4)}}_{u}\), where explicitly

$$\begin{aligned} n^{{(4)}}_{s}&=2 \varepsilon _{1}\cdot \varepsilon _{[3} \varepsilon _{4]}\cdot \varepsilon _{2},\nonumber \\ n^{{(4)}}_{t}&=2 \varepsilon _{1}\cdot \varepsilon _{[2} \varepsilon _{3]}\cdot \varepsilon _{4},\nonumber \\ n^{{(4)}}_{u}&=2 \varepsilon _{1}\cdot \varepsilon _{[4} \varepsilon _{2]}\cdot \varepsilon _{3}. \end{aligned}$$
(37)

Writing the colour factors corresponding to each trivalent diagram as

$$\begin{aligned} c_s=f^{abx}f_{x}{}^{cd}, \quad c_t=f^{axd}f_{x}{}^{bc}, \quad c_u=f^{axc}f_{x}{}^{db}, \end{aligned}$$
(38)

the four-point contact term becomes

$$\begin{aligned} -ig^{2}\left( c_s n_{s}^{{(4)}}-c_t n_{t}^{{(4)}}-c_u n_{u}^{{(4)}}\right) =-ig^{2}\left( \frac{c_s sn_{s}^{{(4)}}}{s}-\frac{c_t tn_{t}^{{(4)}}}{t} -\frac{c_u un_{u}^{{(4)}}}{u}\right) \end{aligned}$$
(39)

where we have trivially inserted the corresponding propagators. This makes it immediately clear that the three terms can be absorbed into the s-, t- and u-channels respectively, shifting their kinematic factors,

$$\begin{aligned} n_s\rightarrow n'_s=n_s+sn_{s}^{{(4)}}, \quad n_t\rightarrow n'_t=n_t-tn_{t}^{{(4)}},\quad n_u\rightarrow n'_u=n_u-un_{u}^{{(4)}}. \end{aligned}$$
(40)

so that the amplitude is a sum over the three trivalent diagrams,

$$\begin{aligned} {\mathcal {A}}_{\mathrm{YM}} ^{4, 0}=-ig^{2} \left( \frac{c_s n'_{s}}{s}+\frac{c_t n'_{t}}{t}+\frac{c_u n'_{u}}{u}\right) . \end{aligned}$$
(41)

This argument trivially goes through for any set of four arbitrarily complex diagrams that differ only by a set of four-point subdiagrams embedded in a common subsector as depicted here,

(42)

where the dotted lines connect each subdiagram to otherwise identical total diagrams. Hence, wherever we see a four-point contact term we can absorb it into the three corresponding diagrams with the trivalent s-, t- and u-channels in its place.

We have rather laboured this essentially trivial observation, that the three sets of structure constants appearing in the four-point contact term are the same as those in the three four-point trivalent diagrams, because it brings into focus the first ingredients of the BCJ duality. From the outset, it was clear that the kinematic numerators entering the Feynman diagram decomposition of the amplitude are not unique since the polarisations are only defined up to shifts, \(\varepsilon (p)\rightarrow \varepsilon (p)+\alpha p\), which change each kinematic factor (but of course leave the amplitude itself invariant). This is no surprise, since each individual diagram is not gauge invariant. However, the preceding discussion makes a second, less trivial, ambiguity in the kinematic numerators apparent: the generalised gauge transformations introduced in [50]. The nomenclature derives from the observation that while they may look and feel like gauge transformations, there need not be any gauge transformation that actually realises a given generalised gauge transformation. To describe the generalised gauge transformations, let us return to our four-point example. Note that the three colour factors \(c_s, c_t, c_u\) of (36) are precisely the combinations of structure constants appearing in the Jacobi identity (20),

$$\begin{aligned} c_s-c_t-c_u=3 f^{xa[b}f_{x}{}^{cd]}=0. \end{aligned}$$
(43)

Hence, under a shift of the kinematic numerators by an arbitrary function \(\alpha \),

$$\begin{aligned} n_s \mapsto n_s - s \alpha ,\quad n_t \mapsto n_t + t \alpha , \quad n_u \mapsto n_u + u \alpha , \end{aligned}$$
(44)

the amplitude (41) is left invariant,

$$\begin{aligned} {\mathcal {A}}_{\mathrm{YM}} ^{4, 0}\mapsto -ig^{2} \left( \frac{c_s (n_s - s \alpha )}{s}+\frac{c_t (n_t + t \alpha )}{t} +\frac{c_u (n_u + u \alpha )}{u}\right) = {\mathcal {A}}_{\mathrm{YM}} ^{4, 0}, \end{aligned}$$
(45)

since \((-c_s+c_t+c_u)\alpha =0\) by the Jacobi identity. Again, it is clear that this invariance generalises to any triple of trivalent diagrams (ijk) that only differ in a common four-point subsector with colour factor satisfying a Jacobi identity of the form \(c_i+c_j+c_k=0\), where the generalised gauge transformation acting on the corresponding kinematic numerators is given by,

$$\begin{aligned} n_i \mapsto n_i + s_i \alpha ,\quad n_j \mapsto n_j + s_j \alpha ,\quad n_k \mapsto n_k + s_k \alpha \end{aligned}$$
(46)

and \(s_i, s_j, s_k\) are the three (and only three) distinct propagators as illustrated here,

figure b
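The four-point invariance (44)–(45) is easily checked with exact arithmetic. The following is a minimal sketch rather than anything taken from [50]: the colour factors are random rationals merely constrained to obey \(c_s=c_t+c_u\), the numerators and \(\alpha \) are arbitrary, the overall factor \(-ig^2\) is dropped, and all function names are our own illustrative choices.

```python
from fractions import Fraction
import random


def bracket(c, n, s, t, u):
    """c_s n_s / s + c_t n_t / t + c_u n_u / u, as in (41) without -i g^2."""
    (c_s, c_t, c_u), (n_s, n_t, n_u) = c, n
    return c_s * n_s / s + c_t * n_t / t + c_u * n_u / u


def shift(n, alpha, s, t, u):
    """Generalised gauge transformation (44) acting on (n_s, n_t, n_u)."""
    n_s, n_t, n_u = n
    return (n_s - s * alpha, n_t + t * alpha, n_u + u * alpha)


def amplitude_change(c, n, alpha, s, t, u):
    """Difference of the bracket after and before the shift (44)."""
    return bracket(c, shift(n, alpha, s, t, u), s, t, u) - bracket(c, n, s, t, u)


def random_rational(rng):
    """A strictly positive random rational (hypothetical test data)."""
    return Fraction(rng.randint(1, 50), rng.randint(1, 50))


def sample(seed=0):
    """Random data with s + t + u = 0 and colour factors obeying (43)."""
    rng = random.Random(seed)
    s, t = random_rational(rng), random_rational(rng)
    u = -s - t  # massless four-point kinematics
    c_t, c_u = random_rational(rng), random_rational(rng)
    c = (c_t + c_u, c_t, c_u)  # impose c_s - c_t - c_u = 0
    n = tuple(random_rational(rng) for _ in range(3))
    alpha = random_rational(rng)
    return c, n, alpha, s, t, u


if __name__ == "__main__":
    c, n, alpha, s, t, u = sample()
    print("change with Jacobi identity imposed:", amplitude_change(c, n, alpha, s, t, u))
    broken = (c[0] + 1, c[1], c[2])  # violate the Jacobi identity by hand
    print("change with Jacobi identity violated:", amplitude_change(broken, n, alpha, s, t, u))
```

The change is \((-c_s+c_t+c_u)\alpha \), so it vanishes identically in the first case and equals \(-\alpha \) in the second, where the colour Jacobi identity has been violated by hand.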

Let us summarise. Every gluon scattering amplitude can be written in terms of purely trivalent graphs. The kinematic numerators associated to these graphs are not unique. In particular, for any triple of such diagrams with colour factors obeying a Jacobi identity, the amplitude is invariant under the generalised gauge transformations acting on the corresponding kinematic numerators. The BCJ colour-kinematic conjecture states that there is a writing of the kinematic numerators, exploiting their ambiguity, such that (i) whenever the colour factors of a triple of graphs obey a Jacobi identity then so do the corresponding kinematic numerators and, (ii) if interchanging two legs of diagram i implies \({c_{i}}\mapsto -{c_{i}}\), then we also have \({n_{i}}\mapsto -{n_{i}}\).Footnote 13 Before discussing the conjecture further let us take a look at some simple examples to get a feel for it.

We start with a triviality, the three-point amplitude (allowing complex momenta), which consists of a single diagram,

(47)

Under interchange of any two edges \(c=f^{abc} \mapsto -c\) since \(f^{abc}\) is totally antisymmetric. Since \(p_{ij}=-p_{ji}\), we see that the same is true for n, as claimed.

The next example, tree-level four points, is already less immediately obvious, although it has been known to satisfy BCJ duality for some time now (before the notion of BCJ duality had been articulated) [165, 166]. From (36), (37) and (40) the kinematic numerators with momenta out-going in the trivalent form are given by

$$\begin{aligned} n_s&=4\big (\varepsilon _{1}\cdot p_{2} \varepsilon _{2} -\varepsilon _{2}\cdot p_{1} \varepsilon _{1} +\tfrac{1}{2}\varepsilon _{1}\cdot \varepsilon _{2} p_{12}\big ) \cdot \big (12\rightarrow 34\big )+2 s\varepsilon _{1} \cdot \varepsilon _{[3}\varepsilon _{4]}\cdot \varepsilon _{2}\nonumber \\ n_t&=4\big (\varepsilon _{4}\cdot p_{1} \varepsilon _{1} -\varepsilon _{1}\cdot p_{4} \varepsilon _{4} +\tfrac{1}{2}\varepsilon _{4}\cdot \varepsilon _{1} p_{41}\big ) \cdot \big (41\rightarrow 23\big )+2 t\varepsilon _{4} \cdot \varepsilon _{[2}\varepsilon _{3]}\cdot \varepsilon _{1}\nonumber \\ n_u&=4\big (\varepsilon _{4}\cdot p_{2} \varepsilon _{2} -\varepsilon _{2}\cdot p_{4} \varepsilon _{4} +\tfrac{1}{2}\varepsilon _{4}\cdot \varepsilon _{2} p_{42}\big ) \cdot \big (42\rightarrow 31\big )+2 u\varepsilon _{4} \cdot \varepsilon _{[3}\varepsilon _{1]}\cdot \varepsilon _{2} \end{aligned}$$
(48)

Recall, the perhaps unfamiliar final terms come from absorbing the four-point contact term, hence the appearance of the propagators s, t, u. The claim is that since \(c_s-c_t-c_u=0\), see (43), by BCJ duality (without any further intervention in this case) we also have

$$\begin{aligned} n_s-n_t-n_u=0 \end{aligned}$$
(49)

on-shell (\(\sum _{i=1}^{4}p_i=0\), \(p_{i}^{2}=0, \varepsilon _i \cdot p_{i}=0\)). Using

$$\begin{aligned} p_{12}\cdot p_{34}&=u-t,\nonumber \\ p_{41}\cdot p_{23}&=u-s,\nonumber \\ p_{42}\cdot p_{31}&=s-t, \end{aligned}$$
(50)

we see that the three unfamiliar terms, deriving from the four-point contact term, cancel identically against the terms \(\varepsilon _{1}\cdot \varepsilon _{2} \varepsilon _{3}\cdot \varepsilon _{4} p_{12}\cdot p_{34}, \varepsilon _{4}\cdot \varepsilon _{1} \varepsilon _{2}\cdot \varepsilon _{3} p_{41}\cdot p_{23}\) and \(\varepsilon _{4}\cdot \varepsilon _{2} \varepsilon _{3}\cdot \varepsilon _{1} p_{42}\cdot p_{31}\). To handle the remaining 24 terms we can first make a judicious choice for the polarisation reference vectors, \( q_1=q_2=q_3=p_4\) and \(q_4=p_2\), so that

$$\begin{aligned} \varepsilon _i\cdot p_4=0,\quad \varepsilon _4\cdot p_2=0. \end{aligned}$$
(51)
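The kinematic identities (50) used above rely only on momentum conservation and masslessness, and can be confirmed numerically. The sketch below assumes \(D=4\), a mostly-minus metric and a centre-of-mass parametrisation of the four outgoing momenta; the parametrisation, the sample values and the function names are our own illustrative choices.

```python
import math


def dot(p, q):
    """Mostly-minus Minkowski inner product in D = 4."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def four_point_momenta(energy, theta):
    """Massless p_1..p_4, all outgoing, with p_1 + p_2 + p_3 + p_4 = 0."""
    e, st, ct = energy, math.sin(theta), math.cos(theta)
    p1 = (e, 0.0, 0.0, e)
    p2 = (e, 0.0, 0.0, -e)
    p3 = (-e, -e * st, 0.0, -e * ct)
    p4 = (-e, e * st, 0.0, e * ct)
    return p1, p2, p3, p4


def mandelstam(p1, p2, p3, p4):
    """s = (p1+p2)^2, t = (p1+p4)^2, u = (p1+p3)^2, as defined in the text."""
    square = lambda p: dot(p, p)
    return square(add(p1, p2)), square(add(p1, p4)), square(add(p1, p3))


def residuals(energy=1.3, theta=0.7):
    """Left-hand minus right-hand sides of s + t + u = 0 and of (50)."""
    p1, p2, p3, p4 = four_point_momenta(energy, theta)
    s, t, u = mandelstam(p1, p2, p3, p4)
    return [
        s + t + u,
        dot(sub(p1, p2), sub(p3, p4)) - (u - t),
        dot(sub(p4, p1), sub(p2, p3)) - (u - s),
        dot(sub(p4, p2), sub(p3, p1)) - (s - t),
    ]


if __name__ == "__main__":
    print(residuals())  # all entries vanish up to floating-point rounding
```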

There are also various vanishing products amongst the polarisation tensors, for example \(\varepsilon ^{\pm }_{i}\cdot \varepsilon ^{\pm }_{j}=0, \forall i,j=1,2,3\), but we shall not need these as the BCJ duality holds for all helicity configurations. By inspection we see that for this choice \(n_u=0\) and we are left with

$$\begin{aligned} n_s-n_t&= 4\Big [ \varepsilon _{4}\cdot p_{3} \left( \varepsilon _{{2}}\cdot p_{1} \varepsilon _{{1}}\cdot \varepsilon _{{3}} -\varepsilon _{{1}} \cdot p_{2} \varepsilon _{{2}}\cdot \varepsilon _{{3}}- \tfrac{1}{2} \varepsilon _{{1}} \cdot \varepsilon _{{2}} \varepsilon _3\cdot p_{12}\right) \nonumber \\&\qquad +\tfrac{1}{2} \varepsilon _{{3}}\cdot \varepsilon _{4} (\varepsilon _{{1}} \cdot p_{2} \varepsilon _2\cdot p_{3}-\varepsilon _{{2}} \cdot p_{1}\varepsilon _1\cdot p_{3})\Big ] \nonumber \\&\qquad -4\Big [ \varepsilon _{4}\cdot p_{1} \left( \varepsilon _{{2}} \cdot p_{3} \varepsilon _{{3}}\cdot \varepsilon _{{1}} -\varepsilon _{{3}}\cdot p_{2} \varepsilon _{{2}} \cdot \varepsilon _{{1}}- \tfrac{1}{2} \varepsilon _{{3}}\cdot \varepsilon _{{2}} \varepsilon _1\cdot p_{32}\right) \nonumber \\&\qquad +\tfrac{1}{2} \varepsilon _{{1}}\cdot \varepsilon _{4} (\varepsilon _{{3}}\cdot p_{2} \varepsilon _2 \cdot p_{1}-\varepsilon _{{2}}\cdot p_{3}\varepsilon _3\cdot p_{1})\Big ] \end{aligned}$$
(52)

It is straightforward to show that this combination vanishes due to the special four-point kinematics (with our choice of polarisation reference vectors, but of course there is no loss of generality). First, note that \(\varepsilon _{{2}}\cdot p_{1}\varepsilon _1\cdot p_{3}= \varepsilon _{{1}}\cdot p_{2} \varepsilon _2\cdot p_{3}\) since \(\varepsilon _1\cdot p_{3} = \varepsilon _1\cdot (p_{3}+p_1)=-\varepsilon _1\cdot (p_{2}+p_4)=-\varepsilon _1\cdot p_{2}\) and \(\varepsilon _{{2}}\cdot p_{1}=\varepsilon _{{2}}\cdot (p_{1}+p_2)=-\varepsilon _{{2}}\cdot (p_{3}+p_4)=-\varepsilon _{{2}}\cdot p_3\). Consequently, \(\varepsilon _{{1}}\cdot p_{2} \varepsilon _2\cdot p_{3}-\varepsilon _{{2}}\cdot p_{1}\varepsilon _1\cdot p_{3}=0\) and similarly \(\varepsilon _{{3}}\cdot p_{2} \varepsilon _2\cdot p_{1}-\varepsilon _{{2}}\cdot p_{3}\varepsilon _3\cdot p_{1}=\varepsilon _{{4}}\cdot p_{3} \varepsilon _{{2}}\cdot p_{1} -\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{2}}\cdot p_{3}=0\), leaving

$$\begin{aligned} n_s-n_t&=4\Big [ \varepsilon _{{2}}\cdot \varepsilon _{{1}} ( \varepsilon _{{4}}\cdot p_{1} \varepsilon _{{3}}\cdot p_{2} - \tfrac{1}{2}\varepsilon _{{4}}\cdot p_{3} \varepsilon _{{3}}\cdot p_{12}) \nonumber \\&\qquad + \varepsilon _{{2}}\cdot \varepsilon _{{3}} (\varepsilon _{{1}}\cdot p_{2} \varepsilon _{{4}}\cdot p_{3} -\tfrac{1}{2}\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{1}}\cdot p_{32})\Big ]\nonumber \\&=0 \end{aligned}$$
(53)

where we have reorganised the remaining terms to make the final cancellations clear. Using \(\varepsilon _{{4}}\cdot p_3=\varepsilon _{{4}}\cdot (p_3+p_4)=-\varepsilon _{{4}}\cdot (p_1+p_2)=-\varepsilon _{{4}}\cdot p_1\), we have \(\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{3}}\cdot p_{2} - \tfrac{1}{2}\varepsilon _{{4}}\cdot p_{3} \varepsilon _{{3}}\cdot p_{12}= \tfrac{1}{2}\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{3}}\cdot (p_1+p_{2})=- \tfrac{1}{2}\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{3}}\cdot (p_3+p_{4})=0\) and similarly \(\varepsilon _{{1}}\cdot p_{2} \varepsilon _{{4}}\cdot p_{3} - \tfrac{1}{2}\varepsilon _{{4}}\cdot p_{1} \varepsilon _{{1}}\cdot p_{32}=0\). While this four-point example can be accounted for by the special kinematics associated to four-points, it makes the principle clear.

Five-points, considered in the original BCJ duality work [50], provides the first truly non-trivial check. It also makes the general principles entering the duality clear, so let us reexamine it here, following closely [50], but of course with the benefit of hindsight. At five points there are 15 trivalent diagrams contributing to the amplitude

$$\begin{aligned} {\mathcal {A}}_{\mathrm{YM}} ^{5, 0} = \sum _{i=1}^{15}\frac{n_i c_i}{d_i} \end{aligned}$$
(54)

Any triple of diagrams with a common pair of joined external legs gives us a colour Jacobi identity, for example

(55)

which corresponds to the identity

$$\begin{aligned} f^{a_1a_2}{}_{x}\left( f^{xa_3y}f_{y}{}^{a_4a_5} +f^{xa_5y}f_{y}{}^{a_3a_4} + f^{xa_4y}f_{y}{}^{a_5a_3}\right) =0, \end{aligned}$$
(56)

but of course not all are independent.

Generally, at n-points each colour factor is an order \(n-2\) polynomial in the structure constants \(f^{abc}\). There are \((2n-5)!!\) trivalent diagrams, each of which has \(n-3\) internal lines. Each internal line, regarded arbitrarily as say the s-channel, can contribute to one, and only one, colour Jacobi identity with two other diagrams containing the corresponding t- and u-channels. The total number of independent colour Jacobi relations is given by \(\sum _{k=1}^{\lfloor (n-2)/2\rfloor }C^{n-2}_{2k}C^{2 k}_{k}(n-2)!/2^{2k}\). This agrees with the number of independent partial colour-ordered amplitudes. For an n-point tree-level amplitude, the Kleiss–Kuijf relations [167, 168] imply that there are at most \((n-2)!\) independent basis partial amplitudes.Footnote 14 Using the multi-peripheral colour decomposition of [168] we learn that the number of independent (under the Jacobi identities) colour factors is given by the number of independent (under the Kleiss–Kuijf relations) partial amplitudes, that is \((n-2)!\). Hence, at five points we must have \(15-6=9\) independent Jacobi identities, as claimed.
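The counting in this paragraph can be cross-checked numerically: the closed-form sum quoted above should coincide with the number of trivalent diagrams minus the number of independent colour factors, \((2n-5)!!-(n-2)!\). The sketch below simply evaluates both expressions for small n using exact rational arithmetic; it is a consistency check under that reading of the text, not a proof, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def double_factorial(m):
    """Return m!! for odd m >= -1, with the convention (-1)!! = 1!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def jacobi_count_formula(n):
    """Closed-form sum over k = 1, ..., floor((n-2)/2) quoted in the text."""
    total = Fraction(0)
    for k in range(1, (n - 2) // 2 + 1):
        total += Fraction(
            comb(n - 2, 2 * k) * comb(2 * k, k) * factorial(n - 2),
            2 ** (2 * k),
        )
    return total


def jacobi_count_by_subtraction(n):
    """Trivalent diagrams minus independent colour factors."""
    return double_factorial(2 * n - 5) - factorial(n - 2)


if __name__ == "__main__":
    for n in range(4, 11):
        print(n, jacobi_count_formula(n), jacobi_count_by_subtraction(n))
```

At five points both expressions give 9, in agreement with \(15-6=9\).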

A set of nine independent Jacobi relations at five points, according to the labelling given in [50], can be chosen as

$$\begin{aligned} \begin{array}{llllll} c_{7}=c_{6}-c_{1} &{}\quad c_{8}=c_{2}-c_{1} &{}\quad c_{9}=c_{3}-c_{2}; \\ c_{10}=c_{4}-c_{3} &{}\quad c_{11}=c_{5}-c_{4} &{}\quad c_{12}=c_{5}-c_{6}; \\ c_{13}=c_{10}-c_{7} &{}\quad c_{14}=c_{11}-c_{8} &{}\quad c_{15}=c_{12}-c_{9}. \end{array} \end{aligned}$$
(57)

If BCJ duality is valid there should exist a writing of the amplitude such that the nine kinematic identities

$$\begin{aligned} \begin{array}{llllll} n_{7}=n_{6}-n_{1} &{}\quad n_{8}=n_{2}-n_{1} &{}\quad n_{9}=n_{3}-n_{2}; \\ n_{10}=n_{4}-n_{3} &{}\quad n_{11}=n_{5}-n_{4} &{}\quad n_{12}=n_{5}-n_{6}; \\ n_{13}=n_{10}-n_{7} &{}\quad n_{14}=n_{11}-n_{8} &{}\quad n_{15}=n_{12}-n_{9}. \end{array} \end{aligned}$$
(58)

hold. For any one of the kinematic relations given in (58) it is straightforward to check that it may be satisfied using a slight generalisation of the arguments used in the basic four-point example. The challenge is to realise all the relations consistently at once. One route is to first establish some consequences of BCJ duality if it were to hold. Assuming BCJ duality at five points, it was shown in [50] that there must exist a new set of relations amongst the \(6=(5-2)!\) partial amplitudes. These were the first example of the BCJ relations [50], of which the fundamental relations take a particularly simple form,

$$\begin{aligned} \sum _{i=2}^{n-1}p_1\cdot \left( \sum _{j=2}^{i}p_j\right) A_n[2,\ldots ,i, 1, i+1,\ldots , n]=0. \end{aligned}$$
(59)
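
To make the index structure of (59) concrete, here is a small sketch that lists its terms for a given n. It only enumerates the momentum sums and leg orderings; it does not evaluate any amplitude. The function name and the tuple layout are assumptions made for illustration.

```python
def fundamental_bcj_terms(n):
    """List the terms of the fundamental BCJ relation (59) at n points.

    Each entry is (momenta, ordering): the coefficient is
    p_1 . (sum of p_j for j in momenta) and it multiplies the
    colour-ordered partial amplitude A_n[ordering].
    """
    terms = []
    for i in range(2, n):  # i = 2, ..., n-1
        momenta = list(range(2, i + 1))  # j = 2, ..., i
        # Leg 1 is inserted between legs i and i+1.
        ordering = list(range(2, i + 1)) + [1] + list(range(i + 1, n + 1))
        terms.append((momenta, ordering))
    return terms


for momenta, ordering in fundamental_bcj_terms(5):
    coefficient = "p_1.(" + " + ".join(f"p_{j}" for j in momenta) + ")"
    print(coefficient, "* A_5" + str(ordering))
```

At five points this produces three terms, with leg 1 moving one slot to the right in each successive partial amplitude.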

These relations were established up to eight points explicitly in [50] and conjectured to hold generally on the basis of further significant evidence at higher points. Returning to the 5-point example specifically, and pretending that we do not yet know of the BCJ relations, assuming BCJ duality we can take six numerators \(\{n_i\}_{i=1}^{6}\) as independent, the remaining \(n_j\notin \{n_i\}_{i=1}^{6}\) being generated by (58). Using the fact that each partial amplitude depends on only five colour-ordered diagrams, we can define, using the kinematic Jacobi relations (58), two of the six \(n_i\), let us say \(n_5, n_6\), in terms of only two partial amplitudes and the remaining four \(n_i, i=1,2,3,4\). For any of the other four independent partial amplitudes, by (58) we can replace any dependence on the \(n_j\) not belonging to our choice of independent \(\{n_i\}_{i=1}^{6}\) and then further remove any occurrence of \(n_5, n_6\) by their definition in terms of our two special partial amplitudes and the remaining \(n_i, i=1,2,3,4\). Then something unexpected happens - all dependence on \(\{n_i\}_{i=1}^{4}\) drops out due to only the kinematic relations amongst the propagators! All six partial amplitudes are simple functions of our two selected partial amplitudes and the propagators alone. These identities generate the KLT and BCJ relations at five points [50].

Now, given the BCJ relations it is possible to explicitly construct a representation of the total amplitude such that BCJ duality holds [52, 53]. That this representation yields the correct amplitude is checked via the KLT relations. The loop of reasoning is then cut by demonstrating independently that the BCJ relations do in fact hold. This was done at any number of points in [169, 170] by considering the \(\alpha '\rightarrow 0\) limit of string theory monodromy relations. They may also be deduced from pure spinor cohomology [171]. There are a number of powerful stringy perspectives on the BCJ relations, see for example [172,173,174,175]. A purely field theoretic derivation was given in [176] using only Britto-Cachazo-Feng-Witten (BCFW) recursion [177]. They have also been established [178] in \({\mathcal {N}}=4\) super Yang–Mills, which contains the pure Yang–Mills case, using the connected formalism of Roiban, Spradlin, Volovich and Witten.

Let us unpack further what we have seen, following closely [179] and the very clear account given in [143]. Consider the n-point tree amplitude written as a sum over \((2n-5)!!\) trivalent graphs. Thinking of the \((2n-5)!!\) colour \(c_i\) and kinematic factors \(n_i\) as vectors, \(\mathbf {c}, \mathbf {n}\), we can trivially rewrite the amplitude as

$$\begin{aligned} {\mathcal {A}}^{n,0}_{\mathrm{YM}}= \mathbf {c}^t\cdot \mathbf {D}\cdot \mathbf {n} \end{aligned}$$
(60)

where \([\mathbf {D}]_{ij}=\delta _{ij}/d_j\). Of course, only \((n-2)!\) of the \(c_i\) are independent and we can choose \((n-2)!\) master colour factors and put them into an \((n-2)!\)-vector \(\mathbf {c}_{\mathrm{m}}\). The rest are generated by the Jacobi identities

$$\begin{aligned} \mathbf {c}=\mathbf {J}\cdot \mathbf {c}_{\mathrm{m}} \end{aligned}$$
(61)

where \(\mathbf {J}\) is a \((2n-5)!!\times (n-2)!\) matrix encoding these relations. For example, at four points, in the conventions of (36), we can choose \(c_t, c_u\) as our master colour factors and then

$$\begin{aligned} \mathbf {J} = \begin{pmatrix} 1 &{} 1\\ 1&{} 0 \\ 0&{}1 \end{pmatrix}. \end{aligned}$$
(62)

At five points, for the choice of \(\mathbf {c}_{\mathrm{m}}^t =(c_1,\dots , c_6)\), in the conventions of (57) we have

$$\begin{aligned} \mathbf {J} =\begin{pmatrix} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 \\ -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 \\ -1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} -1 &{} 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 \\ 1 &{} 0 &{} -1 &{} 1 &{} 0 &{} -1 \\ 1 &{} -1 &{} 0 &{} -1 &{} 1 &{} 0 \\ 0 &{} 1 &{} -1 &{} 0 &{} 1 &{} -1 \\ \end{pmatrix}. \end{aligned}$$
(63)
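
The rows of \(\mathbf {J}\) can be generated mechanically from the relations (57). The sketch below, under the assumption that \(c_1,\dots ,c_6\) are taken as the masters, expands each of \(c_7,\dots ,c_{15}\) in the master basis by applying (57) in order; the dictionary encoding and function names are illustrative only.

```python
# Relations (57) in the form c_k = c_a - c_b, stored as k -> (a, b).
RELATIONS = {
    7: (6, 1), 8: (2, 1), 9: (3, 2),
    10: (4, 3), 11: (5, 4), 12: (5, 6),
    13: (10, 7), 14: (11, 8), 15: (12, 9),
}


def build_jacobi_matrix(relations, n_masters=6):
    """Return J as a list of rows, one per colour factor, in the master basis."""
    # Masters are unit vectors.
    rows = {i: [int(i == j) for j in range(1, n_masters + 1)]
            for i in range(1, n_masters + 1)}
    # Process in increasing k so that c_13..c_15 can use c_7..c_12.
    for k in sorted(relations):
        a, b = relations[k]
        rows[k] = [x - y for x, y in zip(rows[a], rows[b])]
    return [rows[i] for i in sorted(rows)]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product for lists of numbers."""
    return [sum(r * v for r, v in zip(row, vector)) for row in matrix]


J = build_jacobi_matrix(RELATIONS)

# Any choice of masters must reproduce every relation in (57).
c = apply_matrix(J, [2, 3, 5, 7, 11, 13])
for k, (a, b) in RELATIONS.items():
    assert c[k - 1] == c[a - 1] - c[b - 1]

for index, row in enumerate(J, start=1):
    print(index, row)
```

The same routine applies unchanged to the kinematic numerators, since BCJ duality asks that \(\mathbf {n}\) be generated from its masters by the same matrix.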

Of course, the choice of \(\mathbf {c}_{\mathrm{m}}\) is not unique. In this language, BCJ duality amounts to the existence of a rewriting of the amplitude such that

$$\begin{aligned} \mathbf {n}=\mathbf {J}\cdot \mathbf {n}_{\mathrm{m}}. \end{aligned}$$
(64)

But we also have the \((n-2)!\) independent (under Kleiss–Kuijf relations, but prior to applying the BCJ relations) partial amplitudes \(\mathbf {A}\), which in this language may be written

$$\begin{aligned} \mathbf {A} = \mathbf {P}\cdot \mathbf {n}= \mathbf {P} \mathbf {J} \cdot \mathbf {n}_{\mathrm{m}}. \end{aligned}$$
(65)

where \(\mathbf {P}\) is an \((n-2)!\times (2n-5)!!\) matrix with elements determined by the permutations defining each partial amplitude in \(\mathbf {A}\) relative to the colour order of the graphs. So the question of identifying a BCJ duality respecting set of numerators reduces to the invertibility of \(\mathbf {P} \mathbf {J}\). But wait, what if \(\mathbf {P} \mathbf {J}\) is singular? Well, that is the point: the BCJ relations imply that only \((n-3)!\) of the partial amplitudes are independent, so \(\mathbf {P} \mathbf {J}\) is indeed singular. However, we can solve for \((n-3)!\) elements of \(\mathbf {n}_{\mathrm{m}}\) in terms of \((n-3)!\) partial amplitudes and the remaining \((n-2)!-(n-3)!=(n-3)!(n-3)\) elements of \(\mathbf {n}_{\mathrm{m}}\). More generally, any matrix \(\mathbf {M}\), with linear system \(\mathbf {y}=\mathbf {M}\cdot \mathbf {x}\), admits a generalised inverse \(\widetilde{\mathbf {M}}\) satisfying \({\mathbf {M}}\widetilde{\mathbf {M}}{\mathbf {M}}={\mathbf {M}}\), which implies \(\mathbf {y}=\mathbf {M} \widetilde{\mathbf {M}}\cdot \mathbf {y}\). The generalised inverse is not unique, however, so the solution for \(\mathbf {n}_{\mathrm{m}}\) given by \(\mathbf {n}_{\mathrm{m}}=\widetilde{\mathbf {P} \mathbf {J}}\cdot \mathbf {A}\) is not unique.Footnote 15 On substituting this solution back into (65), the dependence of the remaining \((n-3)!(n-3)\) partial amplitudes on \(\mathbf {n}_{\mathrm{m}}\) must drop out and we are left with the BCJ relations only; the \((n-3)!(n-3)\) leftover kinematic numerators are entirely unconstrained and may be set to zero, at the expense of rendering the surviving kinematic numerators non-local, as the propagators corresponding to the vanishing numerators have been shuffled into them.
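
The role of the generalised inverse can be illustrated on a toy singular system. In the sketch below the \(2\times 2\) matrix M is a stand-in chosen for simplicity (it is not an actual \(\mathbf {P}\mathbf {J}\)), and the three candidate inverses are hand-picked examples; each satisfies the defining property and each returns a different, equally valid, solution of the same consistent linear system.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    """Product of a matrix with a vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


# A singular toy matrix (rank one), standing in for P J.
M = [[1, 1], [1, 1]]

# Three different generalised inverses of M (illustrative choices).
quarter = Fraction(1, 4)
candidates = {
    "G1": [[1, 0], [0, 0]],
    "G2": [[0, 0], [0, 1]],
    "G3": [[quarter, quarter], [quarter, quarter]],
}

# A consistent right-hand side: y lies in the image of M.
y = matvec(M, [3, -1])

solutions = {}
for name, G in candidates.items():
    assert matmul(matmul(M, G), M) == M  # defining property M G M = M
    x = matvec(G, y)
    assert matvec(M, x) == y             # x solves the original system
    solutions[name] = x
    print(name, x)
```

The differing solutions mirror the freedom in \(\mathbf {n}_{\mathrm{m}}\): components along the kernel of the matrix are not fixed by the partial amplitudes.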

Our discussion so far has been restricted to tree-level, but to take “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”  beyond KLT we need to go to loops. The statement of the duality is not affected, up to some minor subtleties that we shall comment on momentarily. For example, consider the simplest example at one loop

(66)

which yields the colour Jacobi identity

$$\begin{aligned} f^{a}{}_{xa'}f^{x b}{}_{b'}(c_{s}^{a'b'cd}-c_{t}^{a'b'cd}-c_{u}^{a'b'cd})=0, \end{aligned}$$
(67)

where \(c_s, c_t, c_u\) are just the four-point tree colour factors given in (43). Under BCJ duality, we should then have

$$\begin{aligned} n_s-n_t-n_u=0, \end{aligned}$$
(68)

but where the kinematic factors are functions of the loop momentum \(n=n(\ell )\). The kinematic Jacobi-type identities are functional identities. The four-point one-loop example corresponding to (66) in \({\mathcal {N}}=4\) Yang–Mills theory is especially simple, due to the particularly simple structure of one-loop amplitudes [36, 180]. See for example [49, 144]. For pure Yang–Mills at one and two loops see [181]. For detailed examples at three loops see for example [51, 182]. These simple cases make it clear that BCJ can work at loop level. However, the proof of BCJ duality at tree-level given in [52, 53] relied on the KLT relations and therefore does not extend to loop level. At the time of writing there is no proof that BCJ duality will hold to all loops, despite an impressive number of highly non-trivial concrete examples [51, 57,58,59,60,61,62,63, 65, 67, 82, 90,91,92,93, 181, 183,184,185].

Understanding BCJ duality to all orders is no doubt a central problem, particularly in the context of applications to scattering amplitudes in gravity, the subject of the next section. We shall explore some of the possible routes to BCJ duality later, but let us now summarise some key properties and generalisations of BCJ duality.

2.1.3 BCJ duality: universal properties and generalisations

We have so far only discussed BCJ duality for pure Yang–Mills theory. We did not mention spacetime dimension, so the first obvious generalisation to Yang–Mills theory in arbitrary dimensions is already implicitly contained in our previous discussion. Of course, there are many other possible generalisations that are desirable for a better understanding of the principles as well as for applications. For example, to test the UV properties of \({\mathcal {N}}=8\) supergravity, addressing question (2), we would like to be able to put \({\mathcal {N}}=4\) Yang–Mills amplitudes into BCJ duality respecting form. More generally, we would like to know what kind of gravity theories can be generated by the BCJ double-copy, addressing question (3), which requires knowing what kind of gauge theories admit a BCJ dual representation. Here we give a lightning tour of the key properties and generalisations:

  • Supersymmetry: Supersymmetry and BCJ duality are curiously compatible [144]. On the one hand, it is not hard to convince oneself that if BCJ duality holds for pure Yang–Mills, then it will hold for any pure super Yang–Mills multiplet simply through the supersymmetry transformations. Conversely, BCJ duality for Yang–Mills coupled to adjoint fermions implies supersymmetry [73, 186]. If we add a single minimally coupled adjoint-valued (minimal) fermion, \({\mathcal {L}}_{\mathrm{fermion}} \sim \bar{\lambda } \not {D} \lambda \), then BCJ duality requires [73] that for a four-point interaction with only fermions on the external legs we have the following identity,

    $$\begin{aligned} \bar{u}_{[1}\gamma _\mu v_2 \bar{u}_{3]}\gamma ^\mu v_4 = 0 \end{aligned}$$
    (69)

    which in turn implies the Fierz identity required for supersymmetry. If BCJ duality is to hold, then supersymmetry follows. Once we have introduced the double-copy we will see that it had to be this way [73].

    As is well-known in the context of supersymmetry, this identity only holds in \(D=3,4,6,10\), so \({\mathcal {N}}=1\) Yang–Mills theories only exist in these dimensions [35, 36]. Similarly, BCJ duality with a single adjoint fermion only holds in these dimensions. However, BCJ duality survives toroidal dimensional reduction and consistent truncations, so that colour-kinematic duality in \(D=10, {\mathcal {N}}=1\) implies BCJ duality for all pure super Yang–Mills theories in \(D\le 10\).

    The restriction of \({\mathcal {N}}=1\) Yang–Mills theories to \(D=3,4,6,10\) can be related to the existence of the four normed division algebras \({\mathbb {A}}={\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\) and the fact that they are alternative algebras [187,188,189,190]. This is reflected by the Lie algebra relation,

    $$\begin{aligned} \mathfrak {so}(1,1+\dim {\mathbb {A}})\cong \mathfrak {sl}(2, {\mathbb {A}}) \end{aligned}$$
    (70)

    connecting spacetime and algebra symmetries [191,192,193]. These structures are in turn related to the notion of triality [194, 195] and the triality algebras [196]. Remarkably (or inevitably, depending on your tastes), these structures feed directly into supergravity theories through “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” with some surprising results [136, 138,139,140, 190, 197], as we shall describe in section 3.2.2.

  • Bi-adjoint scalars: The addition of adjoint-valued fermions implied supersymmetry, which typically introduces adjoint-valued scalars, unless it is minimal. In this case, BCJ duality for amplitudes involving scalars is again taken care of by supersymmetry. But can we include adjoint-valued scalars without fermions or supersymmetry? Well, it depends on the couplings. If the scalar field is minimally coupled to the gauge field and has a quartic potential with the Yang–Mills coupling g, \({\mathcal {L}}_{\phi ^4}\sim -g^2\mathrm{tr}([\phi , \phi ][\phi , \phi ])\), then certainly, since it is merely the dimensional reduction on \(S^1\) of pure Yang–Mills theory in \(D+1\). What about a cubic scalar potential with its own coupling constant? The answer is yes, but with the caveat that \( \phi \) must then carry the adjoint representation of a second global symmetry group [75]. A gauge invariant cubic term for a set of adjoint scalars \(\phi ^{\tilde{a}}\), where \(\tilde{a}\) is for now an arbitrary global index, may be written,

    $$\begin{aligned} {\mathcal {L}}_{\phi ^3} = \frac{1}{3!}\lambda \tilde{f}_{\tilde{a}\tilde{b}\tilde{c}}\mathrm{tr}([\phi ^{\tilde{a}}, \phi ^{\tilde{b}}]\phi ^{\tilde{c}})=\frac{1}{3!}\lambda \tilde{f}_{\tilde{a}\tilde{b}\tilde{c}} {f}_{{a}{b}{c}}\phi ^{a\tilde{a}} \phi ^{b\tilde{b}}\phi ^{c\tilde{c}}. \end{aligned}$$
    (71)

    Here, aside from necessarily being totally antisymmetric, \(\tilde{f}_{\tilde{a}\tilde{b}\tilde{c}}\) is an unconstrained constant tensor. However, for four external scalars at tree-level this term, essentially by construction, contributes to the s-, t- and u-channel kinematic factors one piece of a would-be Jacobi identity each

    $$\begin{aligned} n_s=\lambda ^2 {\tilde{f}}^{\tilde{a}\tilde{b}\tilde{x}}\tilde{f}_{\tilde{x}} {}^{\tilde{c}\tilde{d}} + \cdots , \quad n_t=\lambda ^2 \tilde{f}^{\tilde{a} \tilde{x}\tilde{d}}\tilde{f}_{\tilde{x}}{}^{\tilde{b}\tilde{c}} + \cdots , \quad n_u=\lambda ^2 \tilde{f}^{\tilde{a}\tilde{x}\tilde{c}} \tilde{f}_{\tilde{x}}{}^{\tilde{d}\tilde{b}} + \cdots . \end{aligned}$$
    (72)

    Since these are the unique \({\mathcal {O}}(\lambda ^2)\) contributions, the kinematic Jacobi relation \(n_s=n_t+n_u\) (the four-point colour Jacobi relation is clearly unaffected, the only difference being that it is the scalar four-point contact term that must be absorbed into the trivalent stu diagrams) requires

    $$\begin{aligned} \tilde{f}^{\tilde{a}[\tilde{b}\tilde{x}} \tilde{f}_{\tilde{x}}{}^{\tilde{c} \tilde{d}]}=0. \end{aligned}$$
    (73)

    Our a priori unconstrained tensor \(\tilde{f}_{\tilde{a}\tilde{b}\tilde{c}}\) obeys the Jacobi identity and \(\phi ^{a\tilde{a}}\) is a \(G\times \tilde{G}\) bi-adjoint field. For the moment \(\tilde{G}\) is just some flavour group, but our suggestive notation is not a coincidence, as we shall see in section 2.2.2. This seems like a rather esoteric addition to the set of fundamental BCJ duality respecting ingredients, especially given that cubic potentials are unbounded from below, but it turns out that bi-adjoint scalar fields are important to the relationship between gauge and gravity amplitudes, as first identified in [198]. They are by now a ubiquitous element of “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”, as evidenced by [75, 76, 81, 86, 109, 110, 112, 113, 136, 137, 179, 199,200,201,202,203,204,205], and so deserve some special comment.

  • Matter couplings: We have thus far only considered fields carrying the adjoint representation of the gauge group. What about other representations? Generically, matter fields carrying \(\mathfrak {g}\)-representations, \(\rho , \rho '\) such that \(\text {Ad}_\mathfrak {g} \in \rho \otimes \rho '\), can couple to the gauge field through the structure constants in the appropriate representation \([T^a]_{i}{}^{j}\), where down/up indices belong to the \(\rho \) and \(\rho '\) representations, respectively. Of course, the properties of the fields involved (commuting vs. Grassmann, spacetime representations etc) will place restrictions on the allowed \(\rho , \rho '\). For quarks, this is the familiar case of \([T^a]_{i\bar{\jmath }}\equiv [T^a]_{i}{}^{j}\), where i and \(\bar{\imath }\) are fundamental and anti-fundamental representations of \(\mathrm{SU}(3)\), respectively. So far, so ordinary. But what happens to BCJ duality? In particular, the T’s do not satisfy Jacobi relations. Rather, they satisfy commutation relations

    $$\begin{aligned} {[}T^{a}]_{i}{}^{j}[T^{b}]_{j}{}^{k}-[T^{b}]_{i}{}^{j}[T^{a}]_{j}{}^{k} =f^{ab}{}_{d}[T^{d}]_{i}{}^{k}. \end{aligned}$$
    (74)

    Of course, letting \([f^{a}]_{bc}=f^{a}{}_{bc}\) the Jacobi identities are just the commutation relations in the adjoint representation,

    $$\begin{aligned} {[}f^{a}]_{c}{}^{d}[f^{b}]_{d}{}^{e}-[f^{b}]_{c}{}^{d}[f^{a}]_{d}{}^{e} =f^{ab}{}_{d}[f^{d}]_{c}{}^{e}, \end{aligned}$$
    (75)

    which immediately suggests the appropriate generalisation of BCJ duality to non-adjoint matter [74]: BCJ duality for matter fields is mediated by the commutation relations. For any triple of diagrams involving matter fields with colour factors satisfying a commutation relation, the corresponding kinematic factors must satisfy the same relations. This has been applied to quantum chromodynamics [206] and has applications to black hole physics through the double-copy [130]. Note, quarks (or other matter fields) do not imply supersymmetry, unlike adjoint fermions. Again, in the context of the double-copy this is perfectly natural as we shall come to discuss. Note that for particular choices of gauge group and matter representations there may be additional colour identities, beyond the basic commutation relations, specific to these choices [74]. However, as emphasised in [74], these need not be imposed; what is essential to BCJ duality are generic identities that do not hinge on any special properties of the representations or gauge groups used.

  • Symmetry breaking: We previously mentioned consistent truncations of pure super Yang–Mills theories as a method for producing BCJ duality respecting theories with less supersymmetry. This is the simplest example of a broader class of symmetry breaking principles that preserve BCJ duality at tree-level. These are particularly useful in the construction of large classes of supergravity theories using the double-copy [73, 75, 76, 86, 87, 88, 207]. We shall discuss some of these in subsequent sections.

    One can both spontaneously and explicitly break symmetries while preserving the BCJ relations [76]. Consider a subgroup \(H\subset G\) corresponding to the positive eigenspace of a Cartan involution \(\theta : \mathfrak {g}\rightarrow \mathfrak {g}\). The adjoint representation decomposes as \(\text {Ad}_G = \text {Ad}_H\oplus \rho _H\), where \(\rho _H\) is a (not necessarily irreducible) representation of H. Under this subgroup any adjoint-valued multiplet \(\varphi _{\mathrm{Ad}_G}\) will decompose accordingly,

    $$\begin{aligned} \varphi _{\mathrm{Ad}_G}\rightarrow \varphi _{\mathrm{Ad}_H } \oplus \varphi _{\rho _H}, \end{aligned}$$
    (76)

    where \(\varphi _{\mathrm{Ad}_H}\) and \(\varphi _{\rho _H}\) belong to the positive and negative eigenspaces of \(\theta \), respectively. If \(\varphi _{\mathrm{Ad}_G}\) transforms under some further global symmetry group \({\mathcal {G}}\), which may include R-symmetry, we may also decompose to a subgroup \({\mathcal {H}}\subset {\mathcal {G}}\). Combining both the explicit gauge and global group breakings, we can effect various truncations and introduce matter representations in a manner that automatically preserves BCJ duality at tree-level. Particularly useful examples are given by field theory orbifolds [73]. Take some order-k element \(\tau =\rho _{{\mathcal {G}}}(\sigma )\otimes \text {Ad}_{G}(g)\), and project the fields \(\varphi \) onto the \(\tau \)-invariant subsector. Perhaps the simplest example is given by \(\tau :=(-1)^F\cdot \theta \), where F is the fermion number operator. In this case, for adjoint-valued bosons b and fermions f, the \(\tau \)-invariant subsector is given by \( b_{\mathrm{Ad}_H}, f_{\rho _H}. \)

    Hence, we generate matter representations \(f_{\rho _H}\) starting from purely adjoint fields in such a way that BCJ duality is inherited. This clearly generalises to supermultiplets. For example, following the same procedure we can project an adjoint \(D=4, {\mathcal {N}}=4\) supervector multiplet onto an adjoint \(D=4, {\mathcal {N}}=2\) supervector multiplet plus a fundamental \(D=4, {\mathcal {N}}=2\) hyper multiplet.

  • Other algebraic structures: The presentation thus far would understandably leave one with the impression that trivalent diagrams are essential. However, this is really an artefact of the fact that Lie algebras have rank-3 structure constants. It is the algebraic structure of the gauge symmetry that dictates the nature of the duality. Given a generalised gauge theory with something other than a Lie algebra underpinning its colour structure, the BCJ duality will reflect its fundamental identities, which need not be the Jacobi relation. Although this principle is reasonable, working examples are rare. An important case is given in [71], where \({\mathcal {N}}=16\) supergravity [208, 209] was derived from the double-copy of the \(D=3, {\mathcal {N}}=8\) Bagger–Lambert–Gustavsson theory [210,211,212] through a colour-kinematic duality based on the fundamental Lie 3-algebra identity. Since \({\mathcal {N}}=16\) supergravity is the unique maximally supersymmetric theory in \(D=3\), this construction must be equivalent to that of the standard BCJ double-copy of \(D=3, {\mathcal {N}}=8\) Yang–Mills and indeed it is [70]. Further related examples were explored in [213], including the Aharony–Bergman–Jafferis–Maldacena theory [214], although the BCJ relations are absent beyond six points in that case. Where could one search for generalised gauge theories admitting a novel colour-kinematic duality structure? Since the Bagger–Lambert–Gustavsson theory is also a higher gauge theory [215], this could be one broad avenue to explore.

  • There are various geometric or world-sheet perspectives on BCJ duality and “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”. For example, the BCJ relations and duality can be cast in string theoretic terms [53, 169, 170, 174, 216,217,218]. There are also “world-sheet theories for field theory”, in particular the scattering equation formalism [199, 201, 219, 220] and the ambi-twistor string approach [221,222,223,224,225], which provides a route to a better all-loop understanding through a “nodal” world-sheet genus expansion.
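
Returning to the bi-adjoint scalar condition (73), a short numerical sketch shows that total antisymmetry alone does not guarantee the Jacobi identity. It assumes a Euclidean metric for raising and lowering the global indices, and compares the \(\mathfrak {su}(2)\) structure constants \(\epsilon _{abc}\) with a hand-picked totally antisymmetric tensor in five dimensions; both tensors and all function names are illustrative choices, not taken from the references.

```python
from itertools import permutations, product


def permutation_sign(perm):
    """Sign of a permutation of (0, 1, ..., len-1), via inversion counting."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def antisymmetric_tensor(dim, seeds):
    """Build a totally antisymmetric rank-3 tensor from independent components."""
    f = [[[0] * dim for _ in range(dim)] for _ in range(dim)]
    for indices, value in seeds.items():
        for perm in permutations(range(3)):
            a, b, c = (indices[p] for p in perm)
            f[a][b][c] = permutation_sign(perm) * value
    return f


def jacobi_violation(f):
    """Largest absolute value of the cyclic sum f^{abx}f^{xcd} + (b,c,d cyclic)."""
    dim = len(f)
    worst = 0
    for a, b, c, d in product(range(dim), repeat=4):
        total = sum(f[a][b][x] * f[x][c][d]
                    + f[a][c][x] * f[x][d][b]
                    + f[a][d][x] * f[x][b][c] for x in range(dim))
        worst = max(worst, abs(total))
    return worst


# su(2): the Levi-Civita symbol, which does satisfy the Jacobi identity.
epsilon = antisymmetric_tensor(3, {(0, 1, 2): 1})

# A totally antisymmetric tensor that is NOT a set of structure constants.
generic = antisymmetric_tensor(5, {(0, 1, 2): 1, (0, 3, 4): 1})

print("su(2) violation:", jacobi_violation(epsilon))
print("generic violation:", jacobi_violation(generic))
```

For a totally antisymmetric tensor the cyclic sum used here is proportional to the antisymmetrisation over \([\tilde{b}\tilde{c}\tilde{d}]\) in (73), so a vanishing result is the statement that \(\tilde{f}\) defines a Lie algebra.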

2.2 The Bern–Carrasco–Johansson double-copy construction

The notion of BCJ duality is an intriguing property of gauge theories in its own right, with non-trivial and otherwise hidden implications, such as the reduction of the number of independent partial amplitudes down to \((n-3)!\). It is not yet clear why it should hold and there are many open questions left to explore, in particular whether or not it can be made manifest and, relatedly, if it holds to all loops. However, the (remarkably sheltered!) reader might be forgiven for asking what this all has to do with “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”. Here is the answer: given BCJ duality holds for pure Yang–Mills then every \({\mathcal {N}}=0\) supergravity amplitude follows directly from gluon amplitudes through the BCJ double-copy [51, 64]. Note, this statement only depends on the validity of the BCJ colour-kinematic conjecture, otherwise it is completely generic; it applies to all non-Abelian gauge groups in all spacetime dimensions. In this section we shall describe this construction. We begin with a review of perturbative \({\mathcal {N}}=0\) supergravity, before exploring its double-copy construction. Finally, we shall lay out the growing zoology of double-copy constructible theories and discuss their implications for perturbative quantum gravity.

2.2.1 \({\mathcal {N}}=0\) supergravity

The common, or NS-NS, sector of the \(\alpha '\rightarrow 0\) limit of closed string theories is given by \({\mathcal {N}}=0\) supergravity,

$$\begin{aligned} S_{{\mathcal {N}}=0} =\frac{1}{2\kappa ^2} \int \star R -\frac{1}{(D-2)} d\varphi \wedge \star d\varphi - \frac{1}{2}e^{-\frac{4}{D-2}\varphi } H \wedge \star H, \end{aligned}$$
(77)

where \(2\kappa ^2=16\pi G_{\mathrm{N}}^{(D)}\). Aside from the metric g we have the dilaton \(\varphi \) and the Kalb–Ramond (KR) 2-form \(B\) with field strength \(H=dB\). The solutions of the associated equations of motion give backgrounds (with vanishing cosmological constant) around which strings can be quantised to lowest order in \(\alpha '\) and string coupling. They also ensure that the conformal invariance of the string is non-anomalous in the critical dimension. From the relationship between the double-copy and the KLT relations, we should not be surprised by the appearance of this action in “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”.

In \(D=4\), for topologically trivial manifolds, we can dualise B into a pseudo-scalar, the axion \(\chi \). To do so, forget B and consider H as the field with respect to which we vary. Of course, H is closed and so we must add a Lagrange multiplier \(\chi \) to enforce this condition

$$\begin{aligned} {\mathcal {L}}=\star R -\frac{1}{2} d\varphi \wedge \star d\varphi - \frac{1}{2}e^{-2\varphi } H \wedge \star H - d\chi \wedge H. \end{aligned}$$
(78)

Varying with respect to \(\chi \) we find \(dH=0\) as required. Now, varying with respect to H we find

$$\begin{aligned} H= - \frac{1}{2}e^{2\varphi } \star d\chi . \end{aligned}$$
(79)

Since this is algebraic we substitute back into \({\mathcal {L}}\) to obtain

$$\begin{aligned} {\mathcal {L}}_{\mathrm{dual}} = \star R -\frac{1}{2} d\varphi \wedge \star d\varphi - \frac{1}{2}e^{2\varphi } d\chi \wedge \star d\chi , \end{aligned}$$
(80)

which is often referred to as axion–dilaton gravity. This is equivalent (semi-classically, at least [226]) to \(D=4, {\mathcal {N}}=0\) supergravity. We emphasise the dual axion–dilaton picture as it highlights a rather general feature of simple double-copy constructible theories. To describe this we need one further step. Consider the \(\mathfrak {sl}(2, {\mathbb {R}})\) generators

$$\begin{aligned} E_-= & {} \begin{pmatrix} 0&{} 0\\ 1&{}0 \end{pmatrix}, \quad H = \begin{pmatrix} 1&{} 0\\ 0&{}-1 \end{pmatrix}, \quad E_+ = \begin{pmatrix} 0&{} 1\\ 0&{}0 \end{pmatrix} \end{aligned}$$
(81)
$$\begin{aligned} {[}H, E_\pm ]= & {} \pm 2 E_\pm , \quad [E_+, E_-] = H. \end{aligned}$$
(82)

Then \({\mathcal {V}} = e^{\frac{1}{2}\varphi H}e^{E_+\chi }\), with H here the Cartan generator of (81) rather than the KR field strength, is an \(\mathrm{SL}(2, {\mathbb {R}})/\mathrm{SO}(2)\) coset representative in the “positive root gauge” and

$$\begin{aligned} \frac{1}{4}\mathrm{tr}\left( d{\mathcal {M}}^{-1}\wedge \star d{\mathcal {M}}\right) =-\frac{1}{2} d\varphi \wedge \star d\varphi - \frac{1}{2}e^{2\varphi } d\chi \wedge \star d\chi , \end{aligned}$$
(83)

where \({\mathcal {M}}={\mathcal {V}}^T{\mathcal {V}}\). This makes the invariance of \({\mathcal {L}}_{\mathrm{dual}}\) under global \(\mathrm{SL}(2, {\mathbb {R}})\) transformations \({\mathcal {V}}\mapsto {\mathcal {V}}M\), \(\det (M)=1\), and local (in the sense that they are functions of \(\varphi , \chi \)) \(\mathrm{SO}(2)\) transformations, \({\mathcal {V}}\mapsto M(\varphi , \chi ) {\mathcal {V}}\), \(M(\varphi , \chi )^TM(\varphi , \chi )=1\), manifest. A symmetric homogeneous space \({\mathcal {G}}/{\mathcal {H}}\) satisfies

$$\begin{aligned} {[}{\mathfrak {p}},{\mathfrak {p}}] \subset {\mathfrak {h}}, \end{aligned}$$
(84)

where \({\mathfrak {g}}={\mathfrak {h}}+{\mathfrak {p}}\). From the commutation relations (82) we note that \(\mathrm{SL}(2, {\mathbb {R}})/\mathrm{SO}(2)\) is symmetric. For a complete characterisation of symmetric spaces see, for example, [227, 228].
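
The algebraic statements above are easy to check explicitly. The sketch below verifies the commutators (82), the symmetric-space condition (84) for \(\mathfrak {h}=\mathfrak {so}(2)\) spanned by \(E_+-E_-\) and \(\mathfrak {p}\) spanned by \(H\) and \(E_++E_-\), and the scalar kinetic term (83) along a straight line in \((\varphi , \chi )\) by finite differences. It assumes the coset representative \({\mathcal {V}}=e^{\frac{1}{2}\varphi H}e^{E_+\chi }\) with H the Cartan generator; the helper names, sample point and step size are arbitrary illustrative choices.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(alpha, a, beta, b):
    """Linear combination alpha*a + beta*b of 2x2 matrices."""
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))


E_MINUS = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
E_PLUS = [[0, 1], [0, 0]]

# Commutation relations (82).
assert commutator(H, E_PLUS) == combine(2, E_PLUS, 0, E_PLUS)
assert commutator(H, E_MINUS) == combine(-2, E_MINUS, 0, E_MINUS)
assert commutator(E_PLUS, E_MINUS) == H

# Symmetric-space condition (84): [p, p] lies in h = so(2).
K = combine(1, E_PLUS, -1, E_MINUS)   # compact generator
S = combine(1, E_PLUS, 1, E_MINUS)    # non-compact, together with H
assert commutator(H, S) == combine(2, K, 0, K)


def coset_M(phi, chi):
    """M = V^T V for V = exp(phi*H/2) exp(chi*E_plus)."""
    V = [[math.exp(phi / 2), chi * math.exp(phi / 2)], [0.0, math.exp(-phi / 2)]]
    V_T = [[V[0][0], V[1][0]], [V[0][1], V[1][1]]]
    return matmul(V_T, V)


def inverse(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def kinetic_term(phi, chi, dphi, dchi, h=1e-5):
    """(1/4) tr(d(M^-1)/dt dM/dt) along (phi + t*dphi, chi + t*dchi) at t = 0."""
    M_plus = coset_M(phi + h * dphi, chi + h * dchi)
    M_minus = coset_M(phi - h * dphi, chi - h * dchi)
    dM = combine(1 / (2 * h), M_plus, -1 / (2 * h), M_minus)
    dM_inv = combine(1 / (2 * h), inverse(M_plus), -1 / (2 * h), inverse(M_minus))
    product = matmul(dM_inv, dM)
    return 0.25 * (product[0][0] + product[1][1])


phi0, chi0, dphi0, dchi0 = 0.3, -0.7, 1.1, 0.4
expected = -0.5 * dphi0 ** 2 - 0.5 * math.exp(2 * phi0) * dchi0 ** 2
assert abs(kinetic_term(phi0, chi0, dphi0, dchi0) - expected) < 1e-6
```

Restricting to a one-parameter curve replaces the wedge product with the Hodge dual by an ordinary product of derivatives, which is enough to test the coefficient structure of (83).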

The fact that the scalars parametrise a symmetric space is an almost generic property of double-copy constructible gravity theories. We say “almost generic”, as there are numerous exceptions [77, 85, 229], but it is true for all the basic examples, of which there are many. As we shall describe, it is possible to understand when such scalar manifolds appear from “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”  and why they should be consistent with the double-copy construction [72, 85, 138]. Despite this, there is no complete proof that the double-copy yields a symmetric space when it should, although the statement has passed all tests at the level of symmetries and amplitudes to date.

Let us now turn to scattering amplitudes. Ignoring \(\varphi , B\) for now, we can expand the Einstein–Hilbert action perturbatively around a Minkowski background \(g_{\mu \nu }=\eta _{\mu \nu }+\kappa h_{\mu \nu }\)

$$\begin{aligned} S_{\mathrm{EH}}\sim \int d^Dx\sum _{n=0}^{\infty } \kappa ^n h^{n+1}\Box h \end{aligned}$$
(85)

and construct amplitudes as pioneered by Bryce DeWitt [230,231,232]. The Feynman diagrams for gravitons include n-point vertices for all n. However, just as for the four-point vertex in Yang–Mills, these can all be absorbed into the kinematic numerators of the purely trivalent diagrams. For example, consider as in [143] a pure cubic diagram i, contributing \(N_i/d_i\), where \(N_i\) is the kinematic factor (no colour factors here as we are dealing with gravitons), to the amplitude integrand, and another diagram \(i_{(4)}\), contributing \(N_{i_{(4)}}/d_{i_{(4)}}\), which is identical except that one cubic four-point sub-diagram with propagator s has been contracted to a four-point vertex. Then \(d_{i_{(4)}}=d_{i}/s\) and so

$$\begin{aligned} \frac{N_i}{d_i} + \frac{N_{i_{(4)}}}{d_{i_{(4)}}} =\frac{N_i+s N_{i_{(4)}}}{d_i} = \frac{N'_i}{d_i}. \end{aligned}$$
(86)
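
The bookkeeping in (86) can be checked with exact arithmetic. In the sketch below the numerator values, the propagator s and the product of the remaining propagators are arbitrary placeholder numbers chosen for illustration; only the algebraic identity is being tested.

```python
from fractions import Fraction


def absorb_contact_term(n_cubic, n_contact, s):
    """Return the shifted cubic numerator N'_i = N_i + s * N_{i(4)}."""
    return n_cubic + s * n_contact


# Placeholder values (not physical data).
N_i = Fraction(7, 3)        # numerator of the purely cubic diagram
N_i4 = Fraction(-2, 5)      # numerator of the diagram with one contracted propagator
s = Fraction(11, 2)         # the contracted propagator
rest = Fraction(13, 7)      # product of the remaining propagators

d_i = s * rest              # full cubic denominator
d_i4 = d_i / s              # denominator with the s propagator removed

lhs = N_i / d_i + N_i4 / d_i4
rhs = absorb_contact_term(N_i, N_i4, s) / d_i
assert lhs == rhs
print(lhs, rhs)
```

The price of this rewriting is that the shifted numerator now carries an explicit factor of the Mandelstam invariant s.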

Having written the graviton amplitude in terms of pure cubic diagrams, it takes a form resembling closely the gluon amplitude (33),

$$\begin{aligned} A_{g, B, \varphi }^{n, L}=i^L\left( \frac{\kappa }{2}\right) ^{n-2+2L} \sum _{i}\int \prod ^{L}_{l=1}\frac{d^Dp_l}{(2\pi )^DS_i} \frac{N_i}{d_i}. \end{aligned}$$
(87)

2.2.2 The double-copy

Although not obvious, for factorisable external states, which form a basis, the gravitational kinematic numerators can always be written as a product \(N_i=n_i\tilde{n}_i\). This brings us to the statement of the BCJ double-copy prescription for pure Yang–Mills [51, 64]:

[Figure c: statement of the BCJ double-copy prescription for pure Yang–Mills]

Footnote 16

Some immediate comments are in order:

  1.

    The external states of \(A_{g, B, \varphi }^{n, L}\) are determined by the tensor product of the external states of \({\mathcal {A}}_{\mathrm{YM}} ^{n, L}\) and \(\widetilde{A}_{\mathrm{YM}}^{n, L}\), which need not be the same. The external states are labelled by their representations of the on-shell spacetime little group \(\mathrm{SO}(D-2)\) and, more generally, any other global representation they carry. For example, in \(D=4\) Yang–Mills we have the various possible products of gluons with \(\pm 1\) helicity states:

    $$\begin{aligned} \begin{array}{c|cc} \otimes &{} +1 &{} -1 \\ \hline +1 &{} +2, ~\text {graviton} &{} 0, ~\tau \\ -1 &{} 0, ~{\bar{\tau }} &{} -2, ~\text {graviton} \end{array} \end{aligned}$$
    (88)

    where \(\tau =\varphi +e^{i\chi }\). Of course we get the \(2\times 2 = 4 \) degrees of freedom of \({\mathcal {N}}=0\) supergravity on a Minkowski background.

  2.

    The amplitudes \({\mathcal {A}}_{\mathrm{YM}} ^{n, L}\) and \(\widetilde{A}_{\mathrm{YM}}^{n, L}\) need not derive from the same theory, as long as both theories admit a BCJ duality respecting form. This allows one to construct the product of different theories. The spectrum of states of the gravity theory is given by the tensor product of the left and right gauge theory factors. For example, if one factor is \({\mathcal {N}}=4\) Yang–Mills and the other is pure \({\mathcal {N}}=2\) Yang–Mills then the double-copy is \({\mathcal {N}}=6\) supergravity.

    Varying over all BCJ compatible factors we generate a panoply of double-copy constructible gravity theories. Clearly, if one wishes to restrict to a single graviton, then each factor must have at most one massless adjoint gauge field. Note, however, that the left and right factors need not have any gauge fields at all. For example, the amplitudes of \({\mathcal {N}}= 2\) hyper multiplets generate those of \({\mathcal {N}}= 4\) Maxwell theory: “gauge \(=\) matter \(\times \) matter”. However, for the hyper multiplets to have a local symmetry they must come coupled to an \({\mathcal {N}}= 2\) Yang–Mills multiplet, which will generate the \({\mathcal {N}}= 4\) gravitational sector when included in the double-copy. So the \({\mathcal {N}}= 4\) Maxwell amplitudes generated by the hypers must be regarded as a subsector of a double-copy theory including the gravitational degrees of freedom.

  3.

    Invariance of the gauge theory amplitudes under the linearised gauge transformations together with BCJ duality implies the invariance of the double-copy amplitudes under linearised diffeomorphisms and hence that they belong to some gravitational theory [86]. The emergence of linearised diffeomorphisms can also be seen directly at the level of field theory [137], as we shall discuss in Sect. 3. The same field theoretic mechanism generates a linearised local supersymmetry for every adjoint fermion belonging to the factors [137] and so their product should be locally supersymmetric, i.e. a (possibly generalised) supergravity theory, where the gravitini follow from the product of the adjoint gluons and fermions. But if the product is locally supersymmetric the factors must be globally supersymmetric, which is precisely consistent with the observation that BCJ duality and adjoint fermions together imply supersymmetry. Conversely, if one or both of the factors have global supersymmetry, then the corresponding invariance of the gauge amplitudes, together with BCJ duality, implies that the amplitudes of the double-copy theory are invariant under linearised local supersymmetry transformations [144].

  4.

    Of course, we can always go back to one of the gauge amplitudes by turning kinematics back into colour. We can proceed further and replace the remaining kinematics by a second copy of the colour, leaving us with an amplitude of the bi-adjoint \(\phi ^3\) theory introduced in Sect. 2.1.3. This reflects the idea expressed in [198] that the relationship between gauge theory and gravity is heuristically of the form “\(\phi ^3 \times \hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”. For tree-level Yang–Mills at four points, imposing BCJ duality, this is quite literally the case,

    $$\begin{aligned} A_{\phi ^3}^{4,0}A_{g, B, \varphi }^{4,0} = A_{\mathrm{YM}}^{4, 0}\widetilde{A}_{\mathrm{YM}}^{4, 0}. \end{aligned}$$
    (89)

    It is also manifest at the level of integrands in the scattering equation formalism for any number of points [199]. There is another interpretation of the form “gravity \(=\) gauge \(\times \tilde{\phi }^3 \times \) gauge”, where \(\tilde{\phi }\) is in some sense the inverse of \(\phi \) [109, 136, 137, 199]. At tree-level this can be made concrete using double-partial Yang–Mills amplitudes

    $$\begin{aligned} A_{g, B, \varphi }^{n,0} = \mathbf {A}^{n,0}_{\mathrm{YM}}{}^t\cdot \mathbf {S}_{\mathrm{KLT}} \cdot \mathbf {A}^{n,0}_{\mathrm{YM}}. \end{aligned}$$
    (90)

    Here, \(\mathbf {A}^{n,0}_{\mathrm{YM}}\) is a specific choice of \((n-3)!\) independent partial amplitudes and \(\mathbf {S}_{\mathrm{KLT}}\) is the corresponding momentum kernel [233], which in this case is exactly the inverse of the matrix of scalar propagators, that is the double-partial amplitudes of the \(\phi ^3\) theory [199].

This picture allows one to construct a vast array of gravitational theories in the sense that every double-copy of a pair of gauge theory amplitudes gives a gravitational amplitude of some theory and that every amplitude of that theory can be written as a double-copy.
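The four-point statement “\(\phi ^3 \times \hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” of comment 4 can be verified numerically from the cubic-diagram representation alone. The sketch below rests on assumptions that the text does not fix: massless kinematics with \(s+t+u=0\), the sign convention \(c_s+c_t+c_u=0\) and \(n_s+n_t+n_u=0\) for the Jacobi identities, overall couplings stripped off, and arbitrary rational numbers standing in for the BCJ numerators. The function name `four_point_data` is hypothetical.

```python
from fractions import Fraction

def four_point_data(n_s, n_t, s, t):
    """Tree-level four-point quantities built from three cubic diagrams.

    Assumed conventions: s + t + u = 0 and Jacobi identities
    c_s + c_t + c_u = 0, n_s + n_t + n_u = 0.
    """
    n_s, n_t, s, t = map(Fraction, (n_s, n_t, s, t))
    u = -s - t
    n_u = -n_s - n_t
    # Gauge theory: sum_i c_i n_i / d_i with c_u eliminated gives
    # c_s * A_1 + c_t * A_2, defining two partial amplitudes.
    A_1 = n_s / s - n_u / u
    A_2 = n_t / t - n_u / u
    # Gravity: colour factors replaced by a second copy of the numerators.
    M = n_s**2 / s + n_t**2 / t + n_u**2 / u
    # Bi-adjoint scalar: sum_i c_i ctilde_i / d_i; coefficients of
    # c_s * ctilde_s and c_s * ctilde_t are the double-partial amplitudes.
    m_11 = 1 / s + 1 / u
    m_12 = 1 / u
    return A_1, A_2, M, m_11, m_12

A_1, A_2, M, m_11, m_12 = four_point_data(3, 5, 2, 7)

# "phi^3 x gravity = gauge x gauge" for the mixed double-partial amplitude.
assert m_12 * M == A_1 * A_2
# KLT form: the kernel is the inverse of the 1x1 matrix of scalar amplitudes.
assert M == A_1 * (1 / m_11) * A_1
```

With \((n-3)!=1\) independent partial amplitude at four points, the matrix of double-partial amplitudes is a single number, so the second assertion is the \(n=4\) instance of the matrix relation (90) in these conventions.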

2.2.3 A growing zoology

The basic principle is that if one can cast two gauge theories into BCJ duality respecting form, then their amplitudes yield a double-copy theory with spectra given by the tensor product of the spectra of the two gauge theories: double-copy states = left gauge states \(\otimes \) right gauge states. There is a growing list of BCJ compatible gauge theories and, thus, double-copy constructible theories. Gauge invariance in the left and right factors implies linearised diffeomorphisms in the double-copy theory, so it will (typically) be a theory of gravity. Here we summarise the known double-copy constructible theories together with their gauge factors and the key ideas facilitating the implementation of BCJ duality and identifying the double-copy theory. Rather than follow the chronology, we will start with the simplest examples and work up in complexity.Footnote 17 We are not consistent in our labelling of the classes of double-copies as it is easiest to characterise them in terms of the factors in some cases and the double-copy theory in others. In all cases we are strictly referring only to the tree-level theories, although in many examples they have passed numerous loop-level tests.
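The rule “double-copy states = left gauge states \(\otimes \) right gauge states” can be made concrete in \(D=4\) by multiplying helicity contents. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard on-shell construction in which the pure \({\mathcal {N}}\)-extended vector multiplet is generated from the helicity \(+1\) state by the supercharges and then completed under CPT (the \({\mathcal {N}}=4\) multiplet being self-conjugate).

```python
from collections import Counter
from fractions import Fraction
from math import comb

def vector_multiplet(N):
    """Helicity content {helicity: multiplicity} of the D=4 pure
    N-extended super Yang-Mills multiplet, CPT-completed."""
    top = Counter()
    for k in range(N + 1):
        # k supercharges lower the helicity by k/2, in comb(N, k) ways.
        top[Fraction(1) - Fraction(k, 2)] += comb(N, k)
    if N == 4:
        return top  # already CPT self-conjugate
    full = Counter(top)
    for helicity, multiplicity in top.items():
        full[-helicity] += multiplicity
    return full

def tensor(left, right):
    """Helicities add and multiplicities multiply under the double-copy."""
    out = Counter()
    for h_left, m_left in left.items():
        for h_right, m_right in right.items():
            out[h_left + h_right] += m_left * m_right
    return out

# N=0 x N=0: graviton plus two scalars (the axion-dilaton), four states.
n0 = tensor(vector_multiplet(0), vector_multiplet(0))
assert dict(n0) == {Fraction(2): 1, Fraction(0): 2, Fraction(-2): 1}

# N=4 x N=2: one graviton and six gravitini, as for N=6 supergravity.
n6 = tensor(vector_multiplet(4), vector_multiplet(2))
assert n6[Fraction(2)] == 1 and n6[Fraction(3, 2)] == 6
assert sum(n6.values()) == 128
```

In this counting the \({\mathcal {N}}=3\) and \({\mathcal {N}}=4\) vector multiplets have identical helicity content, which is the on-shell reflection of the statement that \({\mathcal {N}}=3\) supersymmetry implies \({\mathcal {N}}=4\) for Yang–Mills.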

Table 1 On-shell helicity states of all \(D=4\) supermultiplets
Table 2 Summary of supergravity theories obtained from the double-copy of two pure super Yang–Mills theories
Table 3 The \(D=4\) content resulting from the product of the on-shell helicity states of left and right matter multiplets, as summarised in Table 1

  1.

    \(\phi ^3\) theory: The very simplest thing one can do is to replace not the colour factors, but the kinematic factors of pure Yang–Mills theory:

    $$\begin{aligned} \frac{n_ic_i}{d_i}\rightarrow \frac{\tilde{{c}}_ic_i}{d_i}. \end{aligned}$$
    (91)

    It is not hard to convince oneself that this yields the amplitudes of the bi-adjoint \(\phi ^3\) theory, with interaction (71), as identified in [199].

  2.

    Pure \({\varvec{\mathcal {N}}}\)-extended Yang–Mills \(\times \) pure \(\tilde{{{\mathcal {N}}}}\)-extended Yang–Mills: As we have mentioned, adjoint fermions and BCJ duality together imply supersymmetry. Conversely, given BCJ duality for gluons, supersymmetry extends it to any pure super Yang–Mills multiplet. Consequently, the double-copy of super Yang–Mills theories follows directly. Of course, the left and right vectors \(A, \tilde{A}\) alone yield \({\mathcal {N}}=0\) supergravity, but the vectors together with the \({\mathcal {N}}, \tilde{{{\mathcal {N}}}}\) left/right adjoint fermions yield \({\mathcal {N}}+\tilde{{{\mathcal {N}}}}\) gravitini. Hence, the double-copy must be a supergravity theory. For \({\mathcal {N}}+\tilde{{{\mathcal {N}}}}\) half-maximal or greater, there is a unique candidate supergravity theory so the identification of the double-copy theory is trivial; it is just read off from the tensor product of the left/right gauge theory states given in Table 2. For less supersymmetry, the couplings are not uniquely determined by the spectra so there is a priori some ambiguity. This can be resolved by examining the symmetries alone. This was done in [138, 140] for all pure super Yang–Mills theories in \(D=3,\ldots ,10\), as reviewed in Sect. 3.2.2. The key observation is that the scalar manifolds of the resulting supergravity theories are symmetric homogeneous spaces \({\mathcal {G}}/{\mathcal {H}}\), as can be deduced by consistently truncating the maximally supersymmetric examples. This in turn implies that the Lagrangian is uniquely determined by the non-compact global symmetry group \({\mathcal {G}}\). The BCJ double-copy has been explicitly established and tested at the loop level for many of the cases [56, 57, 91, 95, 104, 183, 184].

    An important approach to the double-copy construction of such theories (and others besides) is through orbifoldings of \({\mathcal {N}}=8\) supergravity that factorise into orbifolds of the left and right \({\mathcal {N}}=4\) Yang–Mills theories [68, 73]. We summarise results in Table 2. For our conventions and details of the various supermultiplets see Table 1. For the remaining discussion we will mostly focus on \(D=4\) and simply comment on other dimensions.

  3.

    \({\varvec{\mathcal {N}}}=4\) and \({\varvec{\mathcal {N}}}=3\) supergravity coupled to vector multiplets: Restricting to \(D=4\), the next simplest class of double-copy constructible supergravity theories is \({\mathcal {N}}=4\) supergravity coupled to an arbitrary number of vector multiplets [68, 72, 73]. From the above list we see that \({\mathcal {N}}=2\times \tilde{{{\mathcal {N}}}}=2\) yields \(\mathbf {V}_2\otimes \tilde{{\mathbf {V}}}_2=\mathbf {G}_{4}\oplus 2 \mathbf {V}_{4}\) so that we have at least two vector multiplets. If one restricts to adjoint-valued multiplets then this is the only consistent \({\mathcal {N}}=2\times \tilde{{{\mathcal {N}}}}=2\) case; adding extra vector or hyper multiplets into the factors would generate additional graviton and/or gravitini multiplets. Of course, there is the other possibility of \({\mathcal {N}}=4\times \tilde{{{\mathcal {N}}}}=0\). This yields pure \({\mathcal {N}}=4\) supergravity [183]. However, one can couple n adjoint-valued scalars to the right factor (with couplings determined from the dimensional reduction of pure Yang–Mills in \(D=5\)) to give \({\mathcal {N}}=4\) supergravity coupled to n vector multiplets, \(\mathbf {V}_4\otimes [\tilde{{A}}\oplus n\phi ]=\mathbf {G}_{4}\oplus n \mathbf {V}_{4}\), with global symmetry \(\mathrm{SL}(2, {\mathbb {R}})\times \mathrm{SO}(6, n)\) [67, 68, 72, 90,91,92, 138, 183]. This can be trivially extended to half-maximal supergravities in \(D=3,5,6,7,8,9,10\). This exhausts all half-maximal or greater supergravity theories: the double-copy spans all such theories.

    The story for \({\mathcal {N}}=3\) is slightly different. There is no perturbative \({\mathcal {N}}=3\) super Yang–Mills theory, since \({\mathcal {N}}=3\) supersymmetry implies \({\mathcal {N}}=4\) for Yang–Mills.Footnote 18 Hence, there is only one way to obtain \({\mathcal {N}}=3\) through the double-copy, \(\mathbf {V}_2\otimes \tilde{{\mathbf {V}}}_1=\mathbf {G}_{3}\oplus \mathbf {V}_{3}\). The double-copy necessarily comes coupled to at least one vector multiplet. Adding adjoint-valued multiplets (vector, hyper or chiral) to either factor would result in extra graviton or gravitini multiplets so is forbidden (without increasing the degree of supersymmetry) (Table 2). However, we are able to include hyper and chiral multiplets in non-adjoint representations, using the matter (by which we mean any fields not valued in the adjoint) colour-kinematic duality of [74, 206]. Let us include a single half-hyper multiplet in the left \({\mathcal {N}}=2\) factor in a pseudo-real representation \(\rho \) (required for half-hypers), which is compatible with BCJ duality [77], and n chiral multiplets on the \(\tilde{{{\mathcal {N}}}}=1\) right theory, also in a pseudo-real representation \(\tilde{{\rho }}\):

    $$\begin{aligned} {[}\mathbf {V}_2\oplus \tfrac{1}{2}\mathbf {H}^{\rho }_{2}] \otimes [\tilde{{\mathbf {V}}}_1\oplus n\mathbf {C}^{\tilde{{\rho }}}_{1}] =\mathbf {G}_{3}\oplus (n+1) \mathbf {V}_{3}. \end{aligned}$$
    (92)

    For convenience the tensor products of the on-shell matter multiplets are given in Table 3. Note, the “matter” representations \(\rho ,\tilde{{\rho }}\) do not double-copy with the adjoint-valued fields since the amplitudes necessarily have distinct colour structures and the Jacobi identities are replaced with commutation relations [74]. As in the case of \({\mathcal {N}}=4\) supergravity the vector multiplet coupling is unique [234] and the scalars parametrise the symmetric spaces

    $$\begin{aligned} \frac{\mathrm{SU}(3,1+n)}{\mathrm{SU}(3)\times \mathrm{U}(1+n)}. \end{aligned}$$
    (93)

    Through dimensional reduction/oxidation this exhausts the analysis for all Poincaré supergravity theories with more than eight supercharges (\({\mathcal {N}}>2\) in \(D=4\)). Every such theory is double-copy constructible with the single exception of pure \(D=4, {\mathcal {N}}=3\) supergravity and its dimensional reductions/oxidations.

  4.

    \({\varvec{\mathcal {N}}}=2\) supergravity with homogeneous scalar manifolds: The complete classification for more than eight supercharges relied on the fact that the scalar manifolds of supergravity are, in this case, necessarily symmetric homogeneous spaces. For eight (or fewer) supercharges, there is far more freedom. The scalar manifolds are required to be special geometries [235, 236], which includes real, Kähler and quaternionic manifolds [237,238,239], but homogeneity is not essential. Consequently the space of theories is far richer in this case.

    Focussing on \(D=4\), the scalars belonging to vector multiplets must parametrise a projective special Kähler manifold [235, 238, 240], while those belonging to hyper multiplets parametrise a quaternionic-Kähler manifold [241,242,243]. A manageable subclass of \({\mathcal {N}}=2\) supergravity theories, in the sense that there is an explicit and complete characterisation, is given by those with homogeneous scalar manifolds. A unified double-copy construction of almost all \({\mathcal {N}}=2\) supergravity theories coupled to vector multiplets with homogeneous scalar manifolds was given in [77] through a left \({\mathcal {N}}=2\) Yang–Mills theory coupled to a single half-hyper multiplet in a pseudo-real representation and a right \(\tilde{{{\mathcal {N}}}}=0\) Yang–Mills theory coupled to adjoint scalars and pseudo-real fermions. If non-symmetric, the scalar manifolds are indexed by three integers \((q, P, \dot{P})\),

    $$\begin{aligned} \mathrm{SO}(1,1)\times \frac{\mathrm{SO}(q+2,2)}{\mathrm{SO}(q+2)\times \mathrm{U}(1)} \times \frac{S_{q}(P,\dot{P})}{S_{q}(P,\dot{P})} \ltimes \Big [(\mathbf {spin},\mathbf {def},\mathbf {1})^1\ltimes (\mathbf {1,1,1})^2\Big ], \end{aligned}$$
    (94)

    where \(\mathbf {spin}\) indicates the spinor representation of \(SO(q+2,2)\) and \(\mathbf {def}\) the defining representation of \(S_q(P, \dot{P})\). Here, \((q, P,\dot{P})\) are integers, which fix the number of vector multiplets, the factor \(S_q(P, \dot{P})\) and representations carried by the fields. See, for example, [85] for full details.

    If the scalar manifold is symmetric there are three classes: (i) the generic Jordan sequence [244] indexed by a single integer, \((q, P, \dot{P})=(q, 0, 0)\), (ii) the four magic supergravities [237, 244, 245] for which \((q, P, \dot{P})=(n, 1, 0)\), where \(n=\dim \mathbb {A}=1,2,4,8\), and (iii) the minimally coupled sequence [246] indexed by a single integer, \((q, P, \dot{P})=(-2, P, 0)\). The scalar manifolds are respectively

    $$\begin{aligned} \frac{\mathrm{SU}(1,1)}{\mathrm{U}(1)_g}\times \frac{\mathrm{SO}(q+2,2)}{\mathrm{SO}(q+2)\times \mathrm{U}(1)}; \quad \frac{\text {Conf}( \mathfrak {J}^{{\mathbb {A}}}_{3})}{[\text {Str}_0 (\mathfrak {J}^{{\mathbb {A}}_\mathbb {C}}_3)]_c}; \quad \frac{\mathrm{SU}(1,P+1)}{\mathrm{U}(1)\times \mathrm{SU}(P+1)}, \end{aligned}$$
    (95)

    where \({\mathbb {A}}_\mathbb {C}\cong \mathbb {C}\otimes {\mathbb {A}}\), \(\mathfrak {J}^{{\mathbb {A}}}_{3}\) is the cubic Jordan algebra of \(3\times 3\) Hermitian matrices over \({\mathbb {A}}={\mathbb {R}},{\mathbb {C}},{\mathbb {H}},{\mathbb {O}}\) and \( \mathfrak {J}^{{\mathbb {A}}_\mathbb {C}}_3\cong \mathbb {C}\otimes \mathfrak {J}^{{\mathbb {A}}}_3\) its complexification, \(\text {Conf}(\mathfrak {J})\) is the conformal group of the cubic Jordan algebra \(\mathfrak {J}\), \(\text {Str}_0(\mathfrak {J})\) is the reduced structure group and \([G]_c\) denotes the compact real form of the complexified group G. The minimally coupled sequence was given as a truncation of the generic Jordan sequence, but can also be constructed directly [85]. This list includes almost all \({\mathcal {N}}=2\) supergravity theories coupled to vector multiplets with scalars parametrising a homogeneous manifold. The only exceptions are pure \({\mathcal {N}}=2\) supergravity and the \(T^3\) model,Footnote 19 which cannot be double-copy constructed [85].Footnote 20 For recent work on the double-copy construction of this class of theories at one-loop see [229]. One can in principle include an arbitrary number of hyper multiplets with homogeneous scalar manifolds [85], completing the classification of this subclass of double-copy constructible theories, although in the non-symmetric case it is not clear how it is to be realised. All cases may be summarised by

    $$\begin{aligned}&\mathbf {G}_2\oplus (1+q+2+r)\mathbf {V}_2\oplus (q'+4+t/2)\mathbf {H}_2\nonumber \\&\quad =\Big [\mathbf {V}_2\oplus \mathbf {H}_2^\rho \Big ] \otimes \Big [\tilde{{V}}\oplus (q+2)\tilde{{\phi }}\oplus (r)\tilde{{\lambda }}^{\tilde{{\rho }}} \oplus 2(q'+4)\Phi ^{\tilde{{\rho }}}\oplus (t)\varphi ^{\tilde{{\rho }}}\Big ], \end{aligned}$$
    (96)

    where the specific supergravity theory obtained is determined by the various parameters and choices of couplings and symmetries of the right gauge theory, as described in detail in [77, 85].

  5.

    Einstein–Maxwell–Yang–Mills supergravity: Thus far all vector multiplets appearing in the double-copy theory have been Abelian. It is also possible to introduce Yang–Mills multiplets through a simple mechanism [75]. The left theory is taken to be a pure super Yang–Mills theory, while the right theory is given by pure Yang–Mills coupled to a \(\tilde{{G}}\times G'\) bi-adjoint scalar \(\phi ^{aa'}\), where \(\tilde{{G}}\) is the gauge group as usual, but \(G'\) is a global symmetry,

    $$\begin{aligned} \tilde{{{\mathcal {L}}}}_{\mathrm{YM}+\phi ^3}&= \mathrm{tr}\left( \frac{1}{2}F \wedge \star F +\frac{1}{2}D\phi ^{a'}\wedge \star D\phi ^{a'} -\frac{g^2}{4}\star [\phi ^{a'}, \phi ^{b'}][\phi ^{a'}, \phi ^{b'}]\right. \nonumber \\&\quad \left. -\frac{g\lambda }{3!}\star f_{a'b'c'} \phi ^{a'}[\phi ^{b'}, \phi ^{c'}]\right) . \end{aligned}$$
    (97)

    The key observation is that the global symmetry \(G'\) of the right theory is promoted to a gauge symmetry of the corresponding vector multiplets, \(\mathbf {V}_{\mathcal {N}}\otimes \phi ^{a'}\), of the double-copy theory with coupling determined by the cubic scalar term of (97), \(g'\sim \kappa \lambda \) [75]. Combined with the previous techniques this allows for the double-copy construction of a variety of \({\mathcal {N}}\le 4\) Einstein–Maxwell–Yang–Mills supergravity theories [75]. There have been several subsequent tree-level [86, 249,250,251], one-loop [252, 253] and all-loop for a single external graviton [86] developments of Einstein–Maxwell–Yang–Mills amplitudes. Finally, the double-copy Einstein–Maxwell–Yang–Mills supergravity theories can be Higgsed by taking the left factor on the Coulomb branch and introducing matching masses for the scalars on the right through an explicit symmetry breaking [76].

  6.

    Gauged Poincaré supergravity: Gauged supergravity theories with Minkowski background can also be constructed [87]. Here a subgroup of the R-symmetry of the corresponding ungauged supergravity is gauged, leading to massive gravitini. The left theory is a Higgsed Yang–Mills theory coupled to a set of scalars, which introduces the required massive bosons. The right super Yang–Mills theory is also Higgsed and has explicitly broken (through orbifolding) supersymmetry, which introduces the massive fermions required to generate massive gravitini. Starting with an \({\mathcal {N}}=2\) super Yang–Mills theory, the simplest examples generate \(\mathrm{U}(1)\) gaugings of the generic Jordan supergravities discussed above [87]. However, it is possible to extend to more supersymmetry and non-Abelian gaugings [87, 88].

  7.

    \({\varvec{\mathcal {N}}}\le 1\) (super)gravity: For theories with less supersymmetry it is much harder to make general statements. Of course, there is the central example of \({\mathcal {N}}=0\) supergravity, but beyond this it is difficult to characterise what classes of gravity theories may be double-copy constructed. This is principally due to the lack of symmetry, which makes it more difficult to identify what theory is generated by the double-copy. Nonetheless, there are numerous examples. This includes orbifoldings of “parent” double-copy supergravity theories that preserve BCJ duality, but break all, or almost all, the supersymmetries [73]. Control over the resulting theories follows from the control over the parent theory and the relevant orbifold. This technique has, for example, been used to construct all “twin” supergravity theories in [81].Footnote 21 This procedure generates new theories from old such as, for example, the \({\mathcal {N}}=1\) twins in \(D=4\). Again, control over the nature of these theories is inherited from their parents. There are various examples, both at tree-level and for loops [73, 81, 86], but a coherent picture remains to be developed. A particularly important example is the double-copy of Yang–Mills coupled to quarks [74, 206]. As well as being of interest in its own right, the techniques developed in this context have opened the door to a vastly expanded array of double-copy theories, as the preceding discussion of examples that make use of non-adjoint representations makes clear.

  8.

    Conformal (super)gravity: In the cases previously considered here, the graviton sector has been Einstein–Hilbert. Remarkably, however, this is not necessary. A counterexample is given by conformal (super)gravity [83, 259]. The key idea is to use in one of the gauge theory factors a higher derivative \((DF)^2\) theory. In conjunction with various deformations one can then double-copy construct a number of conformal (super)gravity theories, including the Berkovits–Witten theory [260] and (mass-deformed) minimal conformal supergravity. See [261] for a review of conformal supergravity.

  9.

    Exceptions to the exceptions: We should be clear about our definition of a “double-copy constructible” theory. The double-copy theory is defined by the totality generated by the two gauge theory factors: a particular theory is double-copy constructible if (1) all its amplitudes can be generated by the double-copy of the amplitudes of two BCJ dualityFootnote 22 respecting theories and conversely (2) all amplitudes of the two theories generate an amplitude belonging to the corresponding gravitational theory.Footnote 23 For example, a conspicuous absentee is good old Einstein–Hilbert gravity; it is not double-copy constructible in the above sense, as it always comes with the axion–dilaton sector. Similarly, the \(T^3\) model is not double-copy constructible [85], which rather stands out as the only case of an \({\mathcal {N}}\ge 2\) supergravity theory with a (non-trivial) symmetric scalar manifold not admitting a double-copy construction.

    However, as always there are exceptions to the exceptions. All amplitudes of pure Einstein–Hilbert gravity can be systematically double-copy constructed by consistently restricting the external states to the graviton sector, while cancelling the would-be axion–dilaton sector appearing in loops with the product of “ghost” chiral fermion amplitudes [74]. The restriction on the external states violates our strong definition, but all amplitudes of Einstein–Hilbert gravity may nonetheless be double-copy constructed using these “ghost” cancellations. With this understanding of “double-copy constructible” it may be possible to fill in all the gaps, as well as providing alternative constructions of double-copy theories. For example, pure \({\mathcal {N}}=4\) supergravity may also be constructed through pure \({\mathcal {N}}=2\) Yang–Mills \(\times \tilde{{{\mathcal {N}}}}=2\) Yang–Mills using ghost cancellations to remove the unwanted vector multiplet [82].
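As a simple consistency check on the symmetric scalar manifolds quoted in items 3 and 4 of the list above, one can compare the real dimension of each coset with the number of scalars in the double-copy spectrum. The sketch below is illustrative only. It assumes the standard real dimensions of the classical groups, that the \({\mathcal {N}}=4\) scalar manifold is the quotient of \(\mathrm{SL}(2, {\mathbb {R}})\times \mathrm{SO}(6, n)\) by its maximal compact subgroup (the text quotes only the global symmetry group), and the usual multiplet contents: two real scalars in the \({\mathcal {N}}=4\) gravity multiplet, six in each \({\mathcal {N}}=4\) or \({\mathcal {N}}=3\) vector multiplet, none in the \({\mathcal {N}}=3\) gravity multiplet, and two in each \({\mathcal {N}}=2\) vector multiplet. All function names are hypothetical.

```python
def dim_su(p, q=0):
    """Real dimension of SU(p, q)."""
    return (p + q) ** 2 - 1

def dim_so(p, q=0):
    """Real dimension of SO(p, q)."""
    n = p + q
    return n * (n - 1) // 2

def dim_u(n):
    """Real dimension of U(n)."""
    return n ** 2

def scalars_n4(n):
    """[SL(2,R)/SO(2)] x [SO(6,n)/(SO(6) x SO(n))], assumed compact quotient."""
    return (3 - 1) + dim_so(6, n) - dim_so(6) - dim_so(n)

def scalars_n3(n):
    """The coset (93): SU(3,1+n)/(SU(3) x U(1+n))."""
    return dim_su(3, 1 + n) - dim_su(3) - dim_u(1 + n)

def scalars_jordan(q):
    """Generic Jordan sequence: [SU(1,1)/U(1)] x [SO(q+2,2)/(SO(q+2) x U(1))]."""
    return (dim_su(1, 1) - 1) + dim_so(q + 2, 2) - dim_so(q + 2) - 1

for n in range(0, 10):
    assert scalars_n4(n) == 2 + 6 * n       # gravity multiplet + n vectors
    assert scalars_n3(n) == 6 * (n + 1)     # (n+1) vector multiplets
for q in range(0, 10):
    assert scalars_jordan(q) == 2 * (q + 3)  # q+3 vector multiplets
```

Matching dimensions is of course only a necessary condition; it does not by itself establish that the double-copy reproduces the full coset structure.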

The above list is by no means exhaustive, although it clearly demonstrates the long arm of the double-copy construction. In particular, we have not discussed the double-copy construction of: open and closed string amplitudes using Z-theory [79, 80, 89, 262, 263]; Dirac–Born–Infeld theories, including couplings to super Yang–Mills theories and non-linear sigma models [86, 201, 264]; the special-Galileon theory [204]; and double-copy correlator relations [265,266,267]. It is also possible to apply the “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” perspective to construct, discover, or deduce properties of, theories for which there is no, and perhaps can be no, Lagrangian description in the conventional sense [66, 138, 142, 258, 268, 269]. For example, previously unknown \(D=4, {\mathcal {N}}=2\) superconformal S-fold theories of the type introduced in [270,271,272,273], which being intrinsically non-perturbative are not amenable to the double-copy proper, were discovered using this approach in [258]. Finally, an alternative and elegant realisation, at tree-level, of many of these “\(\hbox {gravity} =\hbox {gauge} \times \hbox {gauge}\)” examples, and the relations between them, is given by the Cachazo–He–Yuan scattering equation formalism [199, 201, 219]. The amplitudes in this framework can be regarded as world-sheet integrals, but localised on the solutions of the scattering equations. They sit in-between a string and particle picture. This formalism may also be derived from ambi-twistor string theory [221, 274], which then opens a route to loops [223, 224, 275] and curved backgrounds [225]. It has also been employed to construct candidate tree-level amplitudes for the \(D=6, {\mathcal {N}}=(4,0)\) theory, conjectured to arise in M-theory [276], through the double-copy of (2, 0) theory amplitudes [269]. This is all the more remarkable in light of the fact that we, at present, have no other insight regarding the interacting (4, 0) theory.

3 Field theory relations

These developments raise the question: to what extent, or in what sense, can one regard gravity as the square of Yang–Mills? Is there a deeper connection underlying the amplitude relations? Having exposed the hidden dualities of amplitudes through an intrinsically on-shell window, is it possible to now step back and understand their origins from a geometric or off-shell point of view? This is not only a conceptual question; having an off-shell understanding may shine light on the outstanding amplitude questions, such as BCJ duality beyond tree-level. There are a number of approaches one might take: can BCJ duality be manifested at the level of the Lagrangian or field equations [200, 204, 277,278,279,280,281,282]; can we rewrite gravity in a form that, in some sense, factorises [64, 204, 281, 283,284,285]; is there a field theory “product” of gauge theories [81, 85, 115, 116, 119, 136,137,138,139,140,141,142]? We can also turn this on-shell versus off-shell question around: can the BCJ double-copy paradigm be repurposed to efficiently construct solutions in theories of gravity from gauge amplitudes and/or solutions [109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135]? In a sense this runs contrary to the “on-shell paradigm” that took us here. Going back off-shell may nonetheless be instructive.

3.1 Manifesting BCJ duality and the double-copy off-shell

3.1.1 BCJ Lagrangians and kinematic algebras

The remarkable relationship between colour and kinematics hidden in the amplitudes suggests that there is some underlying kinematic algebra mirroring the properties of conventional Lie algebras [200, 277, 278, 280, 282]. In general, the nature of this conjectured hidden algebra is not known, however for the self-dual sector it can be identified precisely as an area-preserving diffeomorphism Lie algebra in a particular two-dimensional subspace [277]. To see this, recall the self-duality constraint, \(F_{\mu \nu }=\frac{i}{2}\varepsilon _{\mu \nu \rho \sigma }F^{\rho \sigma },\) reduces the equations of motion in light-cone gauge \(A_u=0\) to

$$\begin{aligned} \Box \varphi + ig[\partial _w\varphi , \partial _u \varphi ]=0, \end{aligned}$$
(98)

where \(u=t-z, v=t+z, w=x+iy\) and \(A_w=0, A_v=-\frac{1}{4}\partial _w\varphi , A_{\overline{w}}=-\frac{1}{4}\partial _u\varphi .\) For suitable boundary conditions (98) can be solved perturbatively in momentum space \(\varphi ^{a}(p)=\sum _n \varphi ^{a}_{{n}}(p)\). Schematically, we have

$$\begin{aligned} \varphi ^{a}_{{n}}(p)&= \frac{1}{2} g^n\sum _i \int \prod ^{n+1}_{l=1} \frac{dp_l}{(2\pi )^4} \left( \frac{n_i(F)^{p}{}_{p_1p_2\dots p_{n+1}} c_i(f)^{a}{}_{a_1a_2\dots a_{n+1}}}{p^2d_i}\right) \nonumber \\&\quad \times \varphi ^{a_1}_{{0}}(p_1)\varphi ^{a_2}_{{0}}(p_2) \cdots \varphi ^{a_{n+1}}_{{0}}(p_{n+1}). \end{aligned}$$
(99)

Each order n correction can be represented in terms of \((n+1)\)-point trivalent tree diagrams, labelled here by i, with n sources of momenta \(p_l\) and one external field \(\varphi ^{a}_{{n}}(p)\). The \(d_i\) term appearing in the denominator is the product of the momenta squared of the internal lines. The important components from our perspective are the kinematic and colour numerators \(n_i(F)\) and \(c_i(f)\). As usual \(c_i(f)\) is a polynomial in the gauge group structure constants \(f^{abc}\) generated by attaching one to each vertex. Remarkably, \(n_i(F)\) is constructed in precisely the same way with \(f^{abc}\) replaced by a kinematic “structure constant” \(F^{p_a p_b p_c}\) such that \(c_i\leftrightarrow n_i\) under \( f\leftrightarrow F\). Specifically, \( F_{p_1p_2}{}^{q}=(2\pi )^4\delta ^4(p_1+p_2-q)(p_{1w}p_{2u}-p_{1u} p_{2w}), \) where indices are raised/lowered by \(\delta ^{pq}=\delta _{pq}=(2\pi )^4\delta ^4(p+q)\) with contractions given by integration \( X_{p\cdots }Y^{p\cdots } := \int \frac{dp}{(2\pi )^4} X(p, \cdots )Y(p, \cdots ). \) Using these conventions \(F^{p_a p_b p_c}\) is totally anti-symmetric and obeys the Jacobi identity [277], which combined with \(c_i\leftrightarrow n_i\) under \( f\leftrightarrow F\) makes the BCJ colour-kinematic duality manifest at the level of perturbative classical solutions in the self-dual sector. The kinematic structure constants F are those of the algebra of infinitesimal area-preserving diffeomorphisms. Moreover, this algebra has been shown to determine the kinematic numerators of tree-level maximally helicity violating amplitudes in the complete Yang–Mills theory including the anti-self-dual sector [277]. Understanding these structures beyond the self-dual sector remains an important open question.
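The anti-symmetry and Jacobi identity of the kinematic structure constants can be checked directly on the momentum-dependent factor \(p_{1w}p_{2u}-p_{1u} p_{2w}\), with the delta function implemented by setting the third momentum equal to the sum of the first two. The following is a minimal sketch: the function names are hypothetical, only the \((u, w)\) components of each momentum are kept, and integer components are used purely for exact arithmetic (the identity is algebraic and does not depend on this choice).

```python
import random

def X(p, q):
    """X(p, q) = p_w q_u - p_u q_w for momenta stored as (p_u, p_w) pairs."""
    return p[1] * q[0] - p[0] * q[1]

def add(p, q):
    """Momentum conservation at a vertex: the outgoing momentum is p + q."""
    return (p[0] + q[0], p[1] + q[1])

def jacobi(p1, p2, p3):
    """Cyclic sum of nested kinematic structure constants."""
    return (X(p1, p2) * X(add(p1, p2), p3)
            + X(p2, p3) * X(add(p2, p3), p1)
            + X(p3, p1) * X(add(p3, p1), p2))

rng = random.Random(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    p1, p2, p3 = [(rng.randint(-50, 50), rng.randint(-50, 50)) for _ in range(3)]
    assert X(p1, p2) == -X(p2, p1)   # anti-symmetry
    assert jacobi(p1, p2, p3) == 0   # Jacobi identity
```

The identity holds because \(X\) is bilinear and anti-symmetric, which is the same structure as the Poisson bracket of plane waves in the \((u, w)\) plane.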

Another approach is to modify the Yang–Mills action so that it manifests the duality between colour and kinematics directly in its Feynman diagrams [64]. One can in principle constructively determine the BCJ duality respecting Lagrangian order-by-order [279],

$$\begin{aligned} {\mathcal {L}}_{\mathrm{BCJ}} = {\mathcal {L}}_{\mathrm{YM}} +{\mathcal {L}}_{{(5)}}+ {\mathcal {L}}_{{(6)}}+\cdots . \end{aligned}$$
(100)

Of course, \({\mathcal {L}}_{{(n)}}\) are constrained to leave the amplitudes invariant, but nevertheless rearrange the kinematic numerators. This was done explicitly in [64] to six points. For example, choosing Feynman gauge one possibility at five points is given by

$$\begin{aligned} {\mathcal {L}}_{{(5)}} \propto f_{[a_1a_2}{}^{b}f_{a_3]b}{}^{c} f_{ca_4a_5}\partial _{[\mu } A_{\nu ]}^{a_1}A_{\rho }^{a_2}A^{a_3 \mu } \frac{1}{\Box }(A^{a_4 \nu }A^{a_5 \rho }). \end{aligned}$$
(101)

This is identically zero since we have the Jacobi identity sitting up front \(f_{[a_1a_2}{}^{b}f_{a_3]b}{}^{c}=0\), hence the amplitudes are trivially left invariant by the addition of \({\mathcal {L}}_{{(5)}}\). However, separating the three terms, trivially inserting \(\Box /\Box \) for each term, and redistributing them, as we would in the amplitude, shifts the numerators of the five-point diagrams such that they are in BCJ dual form [64]. Let us assume we had found BCJ numerators starting from the original Lagrangian. Adding \({\mathcal {L}}_{{(5)}}\) would preserve the duality by construction; that we can add identically zero terms to the Lagrangian while maintaining BCJ duality is another way to see that the BCJ numerators are non-unique. Note, BCJ duality can be made completely manifest at the level of the Lagrangian for a non-linear sigma model [204]. The BCJ double-copy of the non-linear sigma model yields the special Galileon and “squaring” the non-linear sigma model action gives a novel form of Galileon action [204].
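To see concretely why the prefactor of \({\mathcal {L}}_{{(5)}}\) vanishes, one can evaluate the anti-symmetrised product of structure constants for a specific gauge group. The sketch below assumes \(G=\mathrm{SU}(2)\), for which \(f^{abc}=\varepsilon ^{abc}\) and indices are raised and lowered trivially; the function names are hypothetical and the anti-symmetrisation is left unnormalised.

```python
from itertools import permutations


def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2 (the su(2) structure constants)."""
    return (a - b) * (b - c) * (c - a) // 2


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def jacobi_tensor(a1, a2, a3, c):
    """Unnormalised anti-symmetrisation over (a1, a2, a3) of f_{a1 a2 b} f_{a3 b c}."""
    slots = (a1, a2, a3)
    total = 0
    for perm in permutations(range(3)):
        x, y, z = (slots[i] for i in perm)
        total += perm_sign(perm) * sum(eps(x, y, b) * eps(z, b, c)
                                       for b in range(3))
    return total


# Every component vanishes, so any term proportional to this tensor is zero.
all_zero = all(jacobi_tensor(a1, a2, a3, c) == 0
               for a1 in range(3) for a2 in range(3)
               for a3 in range(3) for c in range(3))
```

The individual terms in the sum are non-zero; only their anti-symmetrised combination cancels, which is what allows the three pieces to be redistributed among diagrams.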

This brings us to the idea that the \({\mathcal {N}}=0\) supergravity Lagrangian can be “factorised”. What does one mean by this? In the context of string theory the left and right movers heuristically give rise to the spacetime indices on \(h_{\mu \nu }, B_{\mu \nu }, \varphi \). Thus, the left/right indices of \(Z_{\mu \nu }\sim h_{\mu \nu }+B_{\mu \nu }\sim A_\mu \tilde{A}_\nu \) have their origin in the left/right open strings corresponding to the left/right gauge theories. Given that each gauge theory is independent, one might therefore anticipate a formulation of \({\mathcal {N}}=0\) supergravity that makes this manifest in that the left and right indices only “talk” amongst themselves. It is in this sense that we mean the action factorises. This idea was proposed some time ago by Siegel, who demonstrated that there does indeed exist such a formalism, at least for specific gauge choices [286, 287]. Later Bern and Grant developed a perturbative Lagrangian that manifests the left/right split order-by-order [283]. To give a simple illustration of the idea Bern and Grant imposed de Donder gauge and made a field redefinition for the metric perturbation \(g_{\mu \nu }=\eta _{\mu \nu }+\kappa h_{\mu \nu }\) and the dilaton \(\varphi \)

$$\begin{aligned} h_{\mu \nu }\rightarrow h_{\mu \nu }+\eta _{\mu \nu } \sqrt{\frac{2}{D-2}}\varphi ,\quad \varphi \rightarrow \tfrac{1}{2} h +\sqrt{ \frac{D-2}{2}}\varphi , \end{aligned}$$
(102)

which yields at zeroth order

$$\begin{aligned} {\mathcal {L}}_{\mathrm{EH}}=-\tfrac{1}{2}h^{\mu }{}_{\nu } \Box h_{\mu }{}^{\nu }+\varphi \Box \varphi . \end{aligned}$$
(103)

The terms contracting amongst the “left” and “right” indices (an ambiguous notion since \(h_{\mu \nu }\) is symmetric) have been removed. Of course, this condition has to be maintained to all orders. The field redefinition realising this goal, even before making any gauge choice, is remarkably simple,

$$\begin{aligned} g_{\mu \nu }=e^{\sqrt{\frac{2}{D-2}}\kappa \varphi } e^{\kappa h_{\mu \nu }}, \quad \varphi \rightarrow \sqrt{\frac{2}{D-2}}\left( \varphi +\frac{1}{2}h\right) , \end{aligned}$$
(104)

and was checked explicitly through order six in \(\kappa \), allowing the KLT relations to be derived directly from the action itself up to five points [283].

Here the dilaton was introduced as an auxiliary device to aid the factorisation of the Einstein–Hilbert Lagrangian, but given the nature of the double-copy one should only expect the full factorisation to work for \({\mathcal {N}}=0\) supergravity, where the dilaton and KR 2-form are genuine components of the full theory.Footnote 24 Indeed, one could take the view that the non-symmetric \(Z_{\mu \nu }\) is required to make sense of the notion of having left and right indices at all. This was taken seriously in [284], where a left/right factorised action was constructed using the double-field theory formalism [288,289,290], which enlarges the set of spacetime coordinates to accommodate a (symmetric) generalised metric to render the dualities of string theory manifest. The generalised metric was introduced in the context of string and membrane dualities in earlier related work [286, 287, 291,292,293,294]. As particularly relevant to “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”, the generalised metric of [291] was obtained in [286] using left and right vierbeins, making the left/right sectors apparent with a manifest \(\mathrm{GL}(D, {\mathbb {R}})\times \mathrm{GL}(D, {\mathbb {R}})\) symmetry. More recent approaches to this question [281, 285] have also made a twofold Lorentz symmetry (rather than \(\mathrm{GL}(D, {\mathbb {R}})\), since only the metric was considered) manifest to all orders [281]. This formulation also has the potentially appealing feature, from the BCJ double-copy point of view, that the left/right factorised Lagrangian of [285] has only cubic interactions with the aid of only a single auxiliary field \(a^{\rho }{}_{\mu \nu }\),

$$\begin{aligned} {\mathcal {L}}_{\mathrm{EH}}\sim a^{\rho }{}_{\mu \nu } \partial _{{\rho }}{\mathfrak {g}}^{\mu \nu } -\left( a^{\rho }{}_{\sigma \mu }a^{\sigma }{}_{\rho \nu }-\frac{1}{D-1} a^{\rho }{}_{\rho \mu }a^{\sigma }{}_{\sigma \nu }\right) {\mathfrak {g}}^{\mu \nu } \end{aligned}$$
(105)

where \({\mathfrak {g}}^{\mu \nu }\) is the usual tensor density \(\sqrt{-g}g^{\mu \nu }\).

3.1.2 Double-copy solutions

One can apply the BCJ double-copy paradigm to the construction of classical solutions in theories of gravity, such as black holes, from gauge theory. This may take the guise of applying a classical double-copy-like map to classical gauge theory solutions or extracting perturbative classical solutions from the double-copy of gauge theory amplitudes [109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135].

Let us consider the former case. In its simplest incarnation, introduced in [109], there is a double-copy-like map of gauge theory solutions that yields non-perturbative solutions in Einstein–Hilbert gravity under the assumption that the spacetime metric is of Kerr–Schild type,

$$\begin{aligned} g_{\mu \nu }(x)=\eta _{\mu \nu } + \phi (x) k_{\mu }{k}_{\nu }, \end{aligned}$$
(106)

where \(\phi (x)\) is, morally speaking, related to the by-now familiar bi-adjoint scalar, although it carries no indices here. The covector field \(k_\mu \) is null with respect to both g and \(\eta \). The Minkowski background can be generalised to an arbitrary background, in which case \(k_\mu \) is null and geodesic with respect to the background metric. The Kerr–Schild form of the metric effectively linearises the Ricci tensor.
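One elementary facet of this simplification is that the inverse of a Kerr–Schild metric truncates exactly at first order in \(\phi \), \(g^{\mu \nu }=\eta ^{\mu \nu }-\phi k^{\mu }k^{\nu }\), and that \(k_\mu \) is then null with respect to both metrics. The sketch below checks this at a single point with exact rational arithmetic; the mostly-plus signature, the sample null covector and the value of \(\phi \) are assumptions made purely for illustration.

```python
from fractions import Fraction

# Assumed convention: eta = diag(-1, +1, +1, +1), index order (t, x, y, z).
ETA = [[Fraction(-1 if i == 0 else 1) if i == j else Fraction(0)
        for j in range(4)] for i in range(4)]


def kerr_schild(phi, k_lower):
    """Return g_{mu nu} = eta + phi k k and the candidate inverse eta - phi k k."""
    # eta is diagonal and equal to its own inverse, so raising is a sign flip.
    k_upper = [ETA[m][m] * k_lower[m] for m in range(4)]
    g = [[ETA[m][n] + phi * k_lower[m] * k_lower[n] for n in range(4)]
         for m in range(4)]
    g_inv = [[ETA[m][n] - phi * k_upper[m] * k_upper[n] for n in range(4)]
             for m in range(4)]
    return g, g_inv


def contract(upper, lower):
    """Matrix product upper^{mu rho} lower_{rho nu}."""
    return [[sum(upper[m][r] * lower[r][n] for r in range(4))
             for n in range(4)] for m in range(4)]


# Sample data: k = (1, x_hat) with the unit vector x_hat = (3/5, 4/5, 0).
k = [Fraction(1), Fraction(3, 5), Fraction(4, 5), Fraction(0)]
phi = Fraction(7, 3)

g, g_inv = kerr_schild(phi, k)
identity = [[Fraction(int(m == n)) for n in range(4)] for m in range(4)]

norm_eta = sum(ETA[m][m] * k[m] * k[m] for m in range(4))
norm_g = sum(g_inv[m][n] * k[m] * k[n] for m in range(4) for n in range(4))
```

The cross terms in the product \(g^{\mu \rho }g_{\rho \nu }\) cancel pairwise and the quadratic term is proportional to \(k^\mu k_\mu =0\), so no higher powers of \(\phi \) are needed.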

To give a feeling for the classical double-copy let us turn to the simplest example:

$$\begin{aligned} \text {Schwarzschild black hole} = (\text {static colour charge})^2. \end{aligned}$$
(107)

More specifically, the solution to the sourced Yang–Mills equation \(D\star F=j\), where \(j^{\mu }_{a} = -g c_a \delta (\mathbf {x})(1, 0,0,0)\) is a static point-like colour charge located at the origin with constant \(c_a\), is taken to be

$$\begin{aligned} A_{\mu }^{a} = \phi c^a k_{\mu }, \quad k_{\mu } =(1, \hat{\mathbf {x}}), \quad \phi =\frac{1}{4\pi r} \end{aligned}$$
(108)

which obviously linearises the Yang–Mills equation. Now, in precise analogy to the BCJ double-copy we send the gauge coupling g to the gravitational coupling \(\kappa /2\) and the colour factor \(c^a\) to a second copy of the kinematics \(c^a\mapsto M k_\nu \), where M is a mass-dimension-one constant. The scalar \(\phi \) goes along for the ride, just as for the propagators in the BCJ double-copy. It is in this sense that it is related to \(\phi ^3\)-theory. Hence

$$\begin{aligned} A_{\mu }^{a} = \phi c^a k_{\mu }\mapsto \frac{\kappa }{2} \frac{M}{4\pi r} k_{\mu }k_{\nu } \end{aligned}$$
(109)

which we recognise as nothing but the Schwarzschild solution in Kerr–Schild coordinates

$$\begin{aligned} g^{\mathrm{Schwar.}}_{\mu \nu }(x)=\eta _{\mu \nu } + \frac{\kappa }{2} \frac{M}{4\pi r} k_{\mu }k_{\nu }=\eta _{\mu \nu } + \frac{2GM}{ r} k_{\mu }k_{\nu } \end{aligned}$$
(110)

with static point-like mass located at the origin \(T_{\mu \nu }=M v_\mu v_\nu \delta (\mathbf {x})\), where \(v^\mu =(1,0,0,0)\), which is the obvious “double-copy” of \(j^\mu \). The solution (108) is perhaps a little unfamiliar, but is related by a gauge transformation to the standard Coulomb solution \(A_{\mu }^{a}=\frac{gc^a}{4\pi r}(1,0,0,0)\). This serves to highlight the subtle role played by gauge and coordinate choices in the context of solutions, as opposed to amplitudes [117, 119, 132]. Indeed, making another gauge choice for the point charge solution it is possible instead to obtain a gravity solution including a dilatonic contribution [114, 117]; it would seem that the Kerr–Schild classical double-copy is not unique [117]. In fact, the most general Kerr–Schild classical double-copy of the Coulomb solution has been argued perturbatively [117] to be the two-parameter Janis–Newman–Winicour solution [295], which can be tuned to turn off the dilaton, leaving the Schwarzschild solution. The exact Kerr–Schild double-copy realisation of this space of solutions was given in [296], which, interestingly, used the T-duality generalised metric [297] and double field theory formalism [288], generalising the class of solutions considered in the Kerr–Schild double-field theory double-copy of [298].
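The gauge equivalence of (108) with the Coulomb solution can be made explicit. Since the colour direction \(c^a\) is fixed, the commutator terms drop out and the transformation reduces to shifting the potential by a gradient; the spatial part of (108), \(\hat{\mathbf {x}}/(4\pi r)\), is then the gradient of \(\ln r/(4\pi )\). The sketch below checks this numerically by central finite differences. The name `chi` for the gauge parameter, the sample point and the step size are assumptions of the sketch, and overall normalisations of the coupling and sign conventions for lowering spatial indices are ignored.

```python
import math


def chi(x, y, z):
    """Candidate gauge parameter ln(r) / (4 pi) (hypothetical name)."""
    return math.log(math.sqrt(x * x + y * y + z * z)) / (4 * math.pi)


def grad(f, point, h=1e-6):
    """Central finite-difference gradient of f at a point in R^3."""
    out = []
    for i in range(3):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        out.append((f(*up) - f(*down)) / (2 * h))
    return out


def kerr_schild_spatial_potential(point):
    """Spatial components phi * x_hat with phi = 1 / (4 pi r)."""
    r = math.sqrt(sum(c * c for c in point))
    return [c / (4 * math.pi * r * r) for c in point]


point = (0.3, -1.2, 2.0)
difference = [a - b for a, b in zip(grad(chi, point),
                                    kerr_schild_spatial_potential(point))]
max_error = max(abs(d) for d in difference)
```

The time component is untouched because the parameter is static, so the two potentials differ only by this pure-gauge spatial piece.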

This basic example has since been extended to a number of (generalised) Kerr–Schild spacetimes [109,110,111,112,113,114,115,116, 118,119,120,121,122,123,124,125, 296, 298]. It is also possible to construct spacetimes perturbatively using a direct classical analog of the BCJ duality and the double-copy [117].

3.2 Field theoretic “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)”

Another approach to addressing such questions is to build a dictionary at the level of fields, as opposed to on-shell states or amplitudes, expressing the covariant fields of (super)gravity in terms of the product of (super) Yang–Mills fields. That is, can we interpret

$$\begin{aligned} A_\mu (x)``\otimes \text {''} \tilde{A}_\nu (x). \end{aligned}$$
(111)

directly in the context of field theory, without appealing to on-shell conditions? Invoking the known properties of open and closed strings we can deduce a consistent identification of the product of gauge potentials or vector supermultiplets [299, 300]. But is there an independent definition of “\(\otimes \)” at the level of field theory which is valid whether or not there is an underlying string interpretation to guide our identifications? This raises two immediate sub-questions: (i) gravity has no colour, so where do the left and right gauge groups go? (ii) amplitudes are multiplicative in momentum space; is this reflected in the product? Said another way, does the product violate the Leibniz rule?

From the Weinberg–Witten theorem the product cannot be a straightforward tensor product of any kind. Moreover, the lessons of the double-copy strongly suggest a subtle relationship with the \(\phi ^3\)-theory. It should at least be compatible with the intricacies of BCJ duality and the double-copy in cases where the product field theory agrees with that generated by the double-copy of the corresponding amplitudes of the factor theories.

Guided by the structure of the amplitude relations and the requirements of symmetry, a covariant product rule was introduced in [137]. It is independent of the amplitude relations but, in all cases where we have been able to test it, it is compatible in the sense that the product field theory agrees with the amplitude product. It is defined as:

$$\begin{aligned} f\circ \tilde{f}:=\langle \langle f \cdot \Phi \cdot { \tilde{f}}\rangle \rangle . \end{aligned}$$
(112)

Here, \(f, \tilde{f}\) are arbitrary spacetime fields valued in \(\mathfrak {g}\) and \(\tilde{{\mathfrak {g}}}\), respectively. The “spectator” field \(\Phi =\Phi ^{a\tilde{a}}T_a\otimes \tilde{T}_{\tilde{a}}\) is a \(G \times {\tilde{G}}\) bi-adjoint valued scalar. The \(\cdot \) product denotes an associative convolutive inner tensor product with respect to the Poincaré group

$$\begin{aligned} {[}f\cdot g](x)=\int d^Dy f(y) \otimes g(x-y) \end{aligned}$$
(113)

and \(\langle \langle ~,~,~\rangle \rangle :\mathfrak {g}\times (\mathfrak {g}\otimes {\tilde{\mathfrak {g}}})\times {\tilde{\mathfrak {g}}} \rightarrow {\mathbb {R}}\) is a trilinear trace form constructed from the negative-definite trace forms of \({\mathfrak {g}}, \tilde{{\mathfrak {g}}}\), which in the standard basis is simply,

$$\begin{aligned} \langle \langle X , Y , Z\rangle \rangle = X_aY^{a\tilde{a}}Z_{\tilde{a}}. \end{aligned}$$
(114)

The convolution reflects the fact that the amplitude relations are multiplicative in momentum space. For sufficiently well-behaved functions the convolution obeys,

$$\begin{aligned} \partial _{\mu }[f\cdot g](x)=[\partial _{\mu }f\cdot g](x)=[f\cdot \partial _{\mu }g](x). \end{aligned}$$
(115)

This turns out to be essential for reproducing the local symmetries of (super)gravity from those of the two (super) Yang–Mills factors. The double trace form accounts for the gauge groups, while the spectator field allows for arbitrary and independent G and \({\tilde{G}}\). Of course, it is closely related to the bi-adjoint scalar of the BCJ zeroth-copy. Heuristically, it can be considered as its convolutive pseudo-inverse \(\Phi \sim \phi ^{-1}\).
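The property (115) has an exact discrete counterpart that is easy to verify. The sketch below replaces the spacetime convolution by a circular convolution on a periodic lattice of eight sites and the derivative by a forward difference; both replacements, along with the sample data, are assumptions made for illustration only. It also shows that the result is not zero, so the derivative really does act on a single factor rather than obeying the Leibniz rule.

```python
def circular_convolution(f, g):
    """(f . g)[i] = sum_j f[j] g[i - j] on a periodic lattice."""
    n = len(f)
    return [sum(f[j] * g[(i - j) % n] for j in range(n)) for i in range(n)]


def forward_difference(f):
    """Lattice derivative (D f)[i] = f[i + 1] - f[i] with periodic wrap-around."""
    n = len(f)
    return [f[(i + 1) % n] - f[i] for i in range(n)]


# Arbitrary integer-valued sample fields on eight lattice sites.
f = [2, -1, 0, 5, 3, 0, 0, 1]
g = [1, 4, -2, 0, 0, 7, 0, -3]

d_of_product = forward_difference(circular_convolution(f, g))
df_times_g = circular_convolution(forward_difference(f), g)
f_times_dg = circular_convolution(f, forward_difference(g))

# A Leibniz rule would instead give the sum of the last two expressions.
leibniz_sum = [a + b for a, b in zip(df_times_g, f_times_dg)]
```

On the lattice the equality follows from relabelling the summation index, mirroring the change of integration variable in the continuum.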

3.2.1 Local symmetries

Having introduced the covariant product, let us consider the case of two pure Yang–Mills theories. The field-theoretic product of two gauge potentials, \(A_\mu \) and \(\tilde{A}_\nu \), is given by

$$\begin{aligned}{}[{}A_\mu \circ \tilde{A}_\nu ](x) = g^2 [A_\mu ^a \cdot \Phi _{a\tilde{a}} \cdot \tilde{A}_\nu ^{\tilde{a}}](x). \end{aligned}$$
(116)

In addition to the gauge potential, one must also include the accompanying BRST ghost fields [137, 141, 142]. This reflects the fact that we must include the gauge transformations while the product is defined on fields. Thus it is natural to include the BRST ghosts in the “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” construction. Indeed, the inclusion of BRST ghosts in the context of Yang–Mills squared or “open \(\times \) open” strings was advocated some time ago by Siegel [299, 300]. With the ghosts incorporated, the total product of left and right pure Yang–Mills theories is given schematically by

$$\begin{aligned} \begin{array}{c|c|cccccc} \circ &{} \tilde{A}_\nu &{}\tilde{c}^{\beta } \\ \hline A_\mu &{}\underset{\text {graviton}}{g_{\mu \nu }} +\underset{\text {KR 2-form}}{B_{\mu \nu }} +\underset{\text {dilaton}}{\eta _{\mu \nu }\varphi } &{} \underset{\text {right diffeo. + KR ghosts}}{\tilde{C}^{\beta }_{\mu }}\\ \hline c^{\alpha } &{} \underset{\text {left diffeo. + KR ghosts}}{{C}^{\alpha }_{\nu }} &{} \underset{\text {KR ghost-for-ghosts}}{\lambda ^{(\alpha \beta )}} +\underset{\det g\ \text {+ dilaton}}{(g+\varphi )\varepsilon ^{\alpha \beta }}\\ \end{array} \end{aligned}$$
(117)

Here we have introduced the \(\mathrm{SL}(2, {\mathbb {R}})\)-doublets of left/right ghosts and anti-ghosts, \(c^\alpha =(c, \bar{c})\), following [301]. This dictionary is heuristic (we give the precise relationship below at the linear level), but it lays out the basic structure. First, it splits into four sectors:

  (i) \(A \times \tilde{A} = \hbox {physical} + \hbox {auxiliary}\)

  (ii) \(A \times \tilde{c} = \hbox {right ghosts}\)

  (iii) \(c \times \tilde{A} = \hbox {left ghosts}\)

  (iv) \(c \times \tilde{c} = \text {ghosts-for-ghosts} +\hbox {physical}/\hbox {auxiliary}\)

This is quite intuitive, except perhaps for the mixing of physical and auxiliary degrees of freedom in the \(A \times \tilde{A}\) and \(c \times \tilde{c}\) sectors. This mixing is a consequence of choosing Einstein frame, as opposed to string frame. We will make this precise momentarily. Second, the ghost numbers and mass dimensions of the \({\mathcal {N}}=0\) supergravity follow consistently from the product. Indeed, ghost numbers \(\text {gh}(f)\) and Grassmann grades \(\varepsilon (f)\) are additive under the product

$$\begin{aligned} \text {gh}(f\circ g)&= \text {gh}(f)+\text {gh}(g);\nonumber \\ \varepsilon (f\circ g)&= \varepsilon (f)+\varepsilon (g) \mod 2 ; \end{aligned}$$
(118)

Similarly, since the mass dimension of the spectatorFootnote 25 is \((3D+2)/2\) we have

$$\begin{aligned} {[}f\circ g]=[f]+[g] -\frac{D-2}{2} \end{aligned}$$
(119)

which as we shall see is precisely as required. The ghost number, grade and mass dimension, \(\left( \text {gh}(f), \varepsilon (f), [f]\right) \), of the product are summarised here:

$$\begin{aligned} \begin{array}{c|c|cccccc} \circ &{} \begin{array}{c}\tilde{A}^{\tilde{a}}_\nu \\ \left( 0, 0, \frac{D-2}{2}\right) \end{array} &{}\begin{array}{c} \tilde{{c}}^{\tilde{a}} \\ \left( 1, 1, \frac{D-2}{2}\right) \end{array} &{} \begin{array}{c} \tilde{\bar{c}}^{\tilde{a}} \\ \left( -1, 1, \frac{D-2}{2}\right) \end{array} \\ \hline \begin{array}{c} {A}^{{a}}_\mu \\ \left( 0, 0, \frac{D-2}{2}\right) \end{array} &{} \left( 0, 0, \frac{D-2}{2}\right) &{} \left( 1, 1, \frac{D-2}{2}\right) &{} \left( -1, 1, \frac{D-2}{2}\right) \\ \hline \begin{array}{c} {{c}}^{{a}} \\ \left( 1, 1, \frac{D-2}{2}\right) \end{array} &{} \left( 1, 1, \frac{D-2}{2}\right) &{} \left( 2, 0, \frac{D-2}{2}\right) &{}\left( 0, 0, \frac{D-2}{2}\right) \\ \begin{array}{c} \bar{c}^{a} \\ \left( -1, 1, \frac{D-2}{2}\right) \end{array} &{} \left( -1, 1, \frac{D-2}{2}\right) &{} \left( 0, 0, \frac{D-2}{2}\right) &{}\left( -2, 0, \frac{D-2}{2}\right) \\ \end{array} \end{aligned}$$
(120)
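The entries of this table follow mechanically from (118) and (119). The sketch below reproduces them, deriving the shift in (119) from simple power counting: the product \(f\cdot \Phi \cdot g\) involves two convolution integrals, each of mass dimension \(-D\), together with the quoted spectator dimension \((3D+2)/2\). The function names and the dictionary keys `"A"`, `"c"` and `"cbar"` are hypothetical labels introduced for the example.

```python
from fractions import Fraction


def spectator_dimension(D):
    """Mass dimension of the spectator field, as quoted in the text."""
    return Fraction(3 * D + 2, 2)


def product_charges(left, right, D):
    """Combine (ghost number, Grassmann grade, mass dimension) triples.

    Ghost numbers add, grades add mod 2, and the dimension picks up the
    spectator's dimension minus D for each of the two convolution integrals.
    """
    gh = left[0] + right[0]
    grade = (left[1] + right[1]) % 2
    dim = left[2] + right[2] + spectator_dimension(D) - 2 * D
    return (gh, grade, dim)


def yang_mills_fields(D):
    """Charges of the gauge potential, ghost and antighost used in the table."""
    d = Fraction(D - 2, 2)
    return {"A": (0, 0, d), "c": (1, 1, d), "cbar": (-1, 1, d)}


def product_table(D):
    """All nine left/right products for spacetime dimension D."""
    fields = yang_mills_fields(D)
    return {(left, right): product_charges(fields[left], fields[right], D)
            for left in fields for right in fields}


table = product_table(4)
```

In every dimension the product of two fields of dimension \((D-2)/2\) again has dimension \((D-2)/2\), as expected for canonically normalised bosonic fields.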

As first noted in Refs. [299, 300], we see from the above that the degrees of freedom, ghost number and parity inherited by the products are very suggestive that squaring two BRST-covariant Yang–Mills theories results in the states, physical as well as first- and second-level ghosts, of a graviton, two-form and dilaton. Let us now make this precise at the linear level using the convolutive product [137, 141]. For simplicity we adopt Feynman–’t Hooft gauge for both Yang–Mills factors and eliminate the Nakanishi–Lautrup fields, \(b, \tilde{b}\), through their equations of motion. This is not required; arbitrary and independent gauge choices can be made [302]. The simplestFootnote 26 ansatz for the “\(\hbox {gravity} =\hbox {gauge} \times \hbox {gauge}\)” dictionary is given by:

  1. The graviton

    $$\begin{aligned} h_{\mu \nu } = A_{(\mu } \circ \tilde{A}_{\nu )} - a\eta _{\mu \nu } (A^{\rho } \circ \tilde{A}_\rho -c^{\alpha } \circ \tilde{c}_{\alpha }). \end{aligned}$$
    (121)

  2. The KR two-form

    $$\begin{aligned} B_{\mu \nu } = A_{[\mu } \circ \tilde{A}_{\nu ]}. \end{aligned}$$
    (122)

  3. The dilaton

    $$\begin{aligned} \varphi = A^{\rho } \circ \tilde{A}_\rho -c^{\alpha } \circ \tilde{c}_{\alpha }. \end{aligned}$$
    (123)

Note, here we have rescaled \(A_\mu , c^\alpha \) (and similarly on the right) by \(g^{-1}\) to ensure that the mass dimensions are consistent, that is \(h_{\mu \nu }= A_{(\mu }^a \cdot \Phi _{a\tilde{a}} \cdot \tilde{A}_{\nu )}^{\tilde{a}}+\cdots \) and similarly for the remaining fields. Note also that a is left as a free parameter for now, and that the \(\mathrm{SL}(2, {\mathbb {R}})\) ghost-antighost singlet \(c^{\alpha } \circ \tilde{c}_{\alpha }=c^\alpha \circ \tilde{c}^{\beta }\varepsilon _{\alpha \beta }\) provides the trace of the graviton, but also contributes to the dilaton in Einstein frame.

Similarly, we have the ghost and ghost-for-ghost dictionaries:

  1. Diffeomorphism (anti)ghost

    $$\begin{aligned} c_{\mu }^{\alpha } =\frac{1}{2}\left( {C}^{\alpha }_{\mu } +\tilde{C}^{\alpha }_{\mu }\right) = \frac{1}{2}\left( c^\alpha \circ \tilde{A}_\mu + A_\mu \circ \tilde{c}^\alpha \right) \end{aligned}$$
    (124)

  2. Two-form gauge (anti)ghost

    $$\begin{aligned} \lambda _{\mu }^{\alpha }= \frac{1}{2}\left( {C}^{\alpha }_{\mu } -\tilde{C}^{\alpha }_{\mu } \right) = \frac{1}{2}\left( c^{\alpha } \circ \tilde{A}_\mu - A_\mu \circ \tilde{c}^{\alpha }\right) . \end{aligned}$$
    (125)

  3. Two-form gauge (anti)ghost-for-(anti)ghost

    $$\begin{aligned} \lambda ^{\alpha \beta } =c^{(\alpha } \circ \tilde{c}^{\beta )}=\begin{pmatrix} \lambda &{} \eta \\ \eta &{} \bar{\lambda } \end{pmatrix} \end{aligned}$$
    (126)

Having proposed a dictionary between the \({\mathcal {N}}=0\) supergravity and pure Yang–Mills fields, the first consistency check is the BRST transformations. Said another way, the local gauge symmetries of the Yang–Mills factors must consistently generate the local diffeomorphism and 2-form gauge symmetries of \(h\) and \(B\). Since we are working at linear level the non-Abelian Yang–Mills gauge group G breaks to \(\dim {\mathfrak {g}}\) local \(\mathrm{U}(1)\) gauge symmetries and a global group, \(G_{\mathrm{global}}\cong G\),

$$\begin{aligned} \delta _{\epsilon , X} A = \epsilon dc + [A, X], \end{aligned}$$
(127)

where \(\epsilon , \varepsilon (\epsilon )=1, \text {gh}(\epsilon )=-1\) is a constant parameter, \(\delta _\epsilon = \epsilon Q\) and \(X\in {\mathfrak {g}}_{\mathrm{global}}\). Similarly, on the right factor \(\tilde{G}\rightarrow \mathrm{U}(1)^{\dim \tilde{{\mathfrak {g}}}}\times \tilde{G}_{\mathrm{global}}\).

First, the gravity fields must be invariant under \({G}_{\mathrm{global}}\times \tilde{G}_{\mathrm{global}}\). This is trivially ensured by the spectator field, which transforms as

$$\begin{aligned} \delta _{X, \tilde{X}}\Phi = [\Phi , X] + [ \Phi , \tilde{X}] \end{aligned}$$
(128)

so that for any \(f, \tilde{g}\) such that \(\delta _{X}f = [f, X]\) and \(\delta _{\tilde{X}}\tilde{g} = [\tilde{g}, \tilde{X}]\),

$$\begin{aligned} \delta _{X, \tilde{X}}f\circ \tilde{g} = 0, \end{aligned}$$
(129)

which follows from the Killing form property \(\langle X, [Y,Z]\rangle = \langle [X, Y] ,Z\rangle \).

Let us now turn to the BRST transformations: the linearised diffeomorphisms and the Abelian 2-form gauge transformations of \({\mathcal {N}}=0\) supergravity. For convenience we recall here the linear BRST transformations in this gauge choice

$$\begin{aligned} Q A = d c, \quad Qc = 0,\quad Q \bar{c} = -\partial A \end{aligned}$$
(130)

and similar for the right factor. Then

$$\begin{aligned} {\mathcal {Q}}\varphi&= Q A^{\rho } \circ \tilde{A}_\rho - Q c^{\alpha } \circ \tilde{c}_{\alpha }+A^{\rho } \circ \tilde{Q} \tilde{A}_\rho +c^{\alpha } \circ \tilde{Q} \tilde{c}_{\alpha }\nonumber \\&= \partial ^{\rho } (c \circ \tilde{A}_\rho )- \partial ^\rho (A_\rho \circ {\tilde{c}})+ \partial ^\rho (A_{\rho } \circ \tilde{c}) - \partial ^\rho ( c \circ \tilde{A}_{\rho })=0 \end{aligned}$$
(131)

and

$$\begin{aligned} {\mathcal {Q}} (A_{\mu } \circ \tilde{A}_{\nu })&= Q A_{\mu } \circ \tilde{A}_{\nu }+ A_{\mu } \circ \tilde{Q}\tilde{A}_{\nu }\nonumber \\&= \partial _{\mu } (c \circ \tilde{A}_{\nu }) +\partial _\nu (A_{\mu } \circ \tilde{c})\nonumber \\&= \partial _{\mu } C_\nu + \partial _\nu \tilde{C}_\mu . \end{aligned}$$
(132)

Hence, using the ghost dictionary (124) and (125), we recover the linearised diffeomorphisms and 2-form gauge transformations:

$$\begin{aligned} {\mathcal {Q}} h_{\mu \nu }&= \partial _\mu c_\nu + \partial _\nu c_\mu , \end{aligned}$$
(133a)
$$\begin{aligned} {\mathcal {Q}} B_{\mu \nu }&= \partial _\mu \lambda _\nu - \partial _\nu \lambda _\mu , \end{aligned}$$
(133b)
$$\begin{aligned} {\mathcal {Q}} \varphi&= 0. \end{aligned}$$
(133c)

We see that the linearised general coordinate transformations of \(h, \varphi \) and the 2-form gauge symmetry of B are precisely recovered. Varying the ghost fields we obtain \({\mathcal {Q}} c_\mu = 0\) for the diffeomorphism ghost, as expected, and

$$\begin{aligned} {\mathcal {Q}} \lambda _{\mu }&= \partial _\mu \lambda , \end{aligned}$$
(134a)
$$\begin{aligned} {\mathcal {Q}} \lambda&= 0, \end{aligned}$$
(134b)
$$\begin{aligned} {\mathcal {Q}} \eta&= \partial ^{\mu }\lambda _{\mu }. \end{aligned}$$
(134c)

These are precisely the gauge-for-gauge transformations of an Abelian 2-form [141, 303].Footnote 27 Note, this result relies on the Grassmann grading, strongly suggesting that the inclusion of ghosts is a necessary ingredient. We are left only with the antighost transformations. These play the crucial role of mapping the gauge-fixing choice of the Yang–Mills factors into the gravity theory. Recall, \(Q\bar{c}=b\) and the equation of motion of b is determined by the gauge-fixing term. Similarly, focussing on the graviton, \({\mathcal {Q}}\bar{c}_\mu = b_\mu \), where \(b_\mu \) is the 1-form Lagrange multiplier of the gauge-fixing action for the linearised Einstein–Hilbert action. The variation \({\mathcal {Q}}\bar{c}_\mu \) is determined by the antighost dictionary, (124), so that for our left/right Yang–Mills gauge choices with \(a=1/(D-2)\) we have,

$$\begin{aligned} {\mathcal {Q}}\bar{c}_\mu&=\frac{1}{2}\left( Q A_{\mu } \circ \bar{\tilde{c}}+ A_{\mu } \circ \tilde{Q} \bar{\tilde{c}}+Q \bar{{c}} \circ \tilde{A}_{\mu } -\bar{{c}} \circ \tilde{Q} \tilde{A}_{\mu } \right) \nonumber \\&= \frac{1}{2}\left( \partial _{\mu } (c^\alpha \circ \tilde{c}_\alpha )+ \partial ^{\rho } (A_{\mu } \circ \tilde{A}_{\rho }+A_{\rho } \circ \tilde{A}_{\mu }) \right) \nonumber \\&= -\partial ^{{\rho }}h_{\rho \mu }+\frac{1}{2}\partial _{{\mu }}h, \end{aligned}$$
(135)

which corresponds to the de Donder linear diffeomorphism gauge-fixing function,

$$\begin{aligned} b^{{(h)}}_{\mu } =- \partial ^{{\rho }}h_{\rho \mu }+\frac{1}{2}\partial _{{\mu }}h. \end{aligned}$$
(136)

The requirement that \(b^{{(h)}}_{\mu } \) is expressible in terms of gravity fields implies the de Donder term. The choice of \(a=1/(D-2)\) in (121) is fixed by the requirement that the diffeomorphism gauge-fixing is independent of the dilaton, reflecting our choice of Einstein frame. For arbitrary a there is a \(\partial _{{\mu }}\varphi \) contribution to (136), while the de Donder term is left invariant. This is clearly a consequence of our choice of Yang–Mills gauge-fixing functions and the restriction to a local field dictionary. The point is that whatever choices we make for the left/right Yang–Mills theories they consistently map into the gravity theory; the gauge-fixing function of the gravity theory is determined by those of the Yang–Mills factors [302, 304]. Similarly, the 2-form gauge-fixing term is given by

$$\begin{aligned} b^{{(B)}}_{\mu } =\partial _{{\mu }} \eta -\partial ^{{\rho }}B_{\mu \rho }, \end{aligned}$$
(137)

which is precisely the canonical gauge-fixing term of the Abelian 2-form [156, 157, 303]. Equipped with the gauge-fixing terms we can then impose the equations of motion (including the gauge fixing) to uniquely determine the relationship between the Yang–Mills and \({\mathcal {N}}=0\) supergravity sources, completing the linear dictionary [302, 304].

Let us now work through the simplest example exhibiting all the local symmetries of interest, including supersymmetry. We consider the product of a left \(D=4, {\mathcal {N}}=1\) super Yang–Mills multiplet and \(D=4\) Yang–Mills on the right. In this case we have the luxury of a full off-shell vector superfield for the left \({\mathcal {N}}=1\) super Yang–Mills,

$$\begin{aligned} V(x, \theta , \bar{\theta })=&M+i\theta \chi -i{\bar{\theta }}{\bar{\chi }}+i{\theta }^2{ F} -i{\bar{\theta }}^2{\bar{F}}-{ \theta }\sigma ^{\mu }{\bar{\theta }}A_\mu \nonumber \\&\quad +i{ \theta }^2{\bar{\theta }}\left( {\bar{\psi }}+\frac{i}{2} \bar{\sigma }^\rho \partial _\rho { \chi } \right) -i{\bar{\theta }}^2{ \theta }\left( \psi +\frac{i}{2}\sigma ^\rho \partial _\rho {\bar{\chi }} \right) \nonumber \\&\quad +\frac{1}{2}{\bar{\theta }}^2{\theta }^2\left( D +\frac{1}{2} \Box M \right) \end{aligned}$$
(138)

transforming under local supergauge, non-Abelian global G and global super-Poincaré:

$$\begin{aligned} \delta V=\underbrace{C +\bar{C}}_{\mathrm{local}\ \mathrm{Abelian}\ \mathrm{supergauge}} +\overbrace{[V, X]}^{\mathrm{global}\ \mathrm{non}\text {-}\mathrm{Abelian} ~G} +\underbrace{\delta _{\epsilon } V}_{\mathrm{global}\ \mathrm{supersymmetry}} \end{aligned}$$
(139)

where \(C(x, \theta , \bar{\theta })\) is a chiral superfield of ghosts

$$\begin{aligned} C(x, \theta , \bar{\theta }) = B+\sqrt{2}\theta \zeta + \theta ^2 K +i{ \theta }\sigma ^{\rho }{\bar{\theta }}\partial _{\rho }c +\frac{i}{\sqrt{2}} \theta ^2{\bar{\theta }}{\bar{\sigma }}^\rho \partial _\rho \zeta +\frac{1}{4}\theta ^2{\bar{\theta }}^2\Box B \nonumber \\ \end{aligned}$$
(140)

The product of \({\mathcal {N}}=1\) with \({\mathcal {N}}=0\) Yang–Mills generates \({\mathcal {N}}=1\) supergravity coupled to a single chiral multiplet, as can be seen directly from the product of helicity states,

$$\begin{aligned} \left( +1, +\tfrac{1}{2}, -\tfrac{1}{2}, -1\right) \otimes (+1, -1) =\underbrace{\left( +2, +\tfrac{3}{2}, -\tfrac{3}{2}, -2\right) }_{{\mathcal {N}}=1\ \mathrm{supergravity}} ~\oplus \underbrace{\left( +\tfrac{1}{2}, 0, 0, -\tfrac{1}{2}\right) }_{\mathrm{chiral}\ \mathrm{multiplet}}. \end{aligned}$$
(141)
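The helicity counting in this product is simple enough to automate. The sketch below adds the helicities of the left and right on-shell states pairwise and compares the resulting multiset with that of an \({\mathcal {N}}=1\) supergravity multiplet together with a single chiral multiplet; the function and variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction


def tensor_helicities(left, right):
    """Multiset of helicities obtained by adding every left/right pair."""
    return Counter(l + r for l in left for r in right)


half = Fraction(1, 2)

vector_multiplet = [1, half, -half, -1]   # left N=1 super Yang-Mills states
gluon = [1, -1]                           # right pure Yang-Mills states

product_states = tensor_helicities(vector_multiplet, gluon)

supergravity = Counter([2, 3 * half, -3 * half, -2])
chiral = Counter([half, 0, 0, -half])

matches = product_states == supergravity + chiral
total_states = sum(product_states.values())
```

The same function applied to other pairs of multiplets gives the on-shell content of the corresponding product theories, although identifying the resulting supermultiplets still has to be done by hand.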

The field and ghost dictionary is directly analogous:

$$\begin{aligned} H_{\nu }= & {} V \circ {\tilde{A}}_\nu \quad \text {real supergravity superfield} \nonumber \\ S= & {} V \circ \tilde{{c}} \quad \text {real ghost superfield} \nonumber \\ S_\nu= & {} C \circ {\tilde{A}}_\nu \quad \text {chiral ghost superfield} \end{aligned}$$
(142)

Varying the gravitational superfield via the dictionary gives

$$\begin{aligned} \delta H_{\nu }={S_\nu +{\bar{S}}_\nu +{\partial }_{\nu }S}+\delta _{\epsilon } H_{\nu }. \end{aligned}$$
(143)

This is the complete set of transformation rules for the new-minimal superfield in the linearised approximation [305, 306]. Hence, the local gravitational symmetries of general covariance, 2-form gauge invariance, local supersymmetry and local chiral symmetry follow from those of Yang–Mills at linear level. In particular, the product of the left fermion \(\psi \) with the right ghost \(\tilde{c}\) gives a local supergauge transformation of the gravitino. Schematically,

$$\begin{aligned} \psi _\nu = \psi \circ \tilde{A}_\nu , \quad \eta = \psi \circ \tilde{c}\quad \Rightarrow \quad {\mathcal {Q}} \psi _\nu =\partial _{{\nu }} \eta , \end{aligned}$$
(144)

so that the presence of adjoint fermions induces local supersymmetries, in agreement with the BCJ double-copy. Including \({\mathcal {N}}\) and \(\tilde{{\mathcal {N}}}\) adjoint fermions in the left and right factors, we obtain \({\mathcal {N}}+\tilde{{\mathcal {N}}}\) gravitini and local supergauge transformations and, hence, an \(({\mathcal {N}}+\tilde{{\mathcal {N}}})\)-extended supergravity theory.

The \(12+12\) new-minimal multiplet splits with respect to superconformal transformations into an \(8+8\) conformal supergravity multiplet plus a \(4+4\) conformal tensor multiplet,

$$\begin{aligned} \underbrace{\begin{pmatrix} \mathbf {5+3+1+3}\\ \mathbf {4+2+4+2}\\ \end{pmatrix}}_{\mathrm{new}\text {-}\mathrm{minimal}} \rightarrow \underbrace{\begin{pmatrix} \mathbf {5+3}\\ \mathbf {4+4}\\ \end{pmatrix}}_{\mathrm{conformal}}+\underbrace{\begin{pmatrix} \mathbf {3+1}\\ \mathbf {2+2}\\ \end{pmatrix}}_{\mathrm{tensor}} \end{aligned}$$
(145)

in terms of \(\mathrm{Spin}(3)\) representations. Since the left (anti)ghost is a chiral superfield the ghost-antighost sector gives a compensating \(4+4\) chiral (dilaton) multiplet [299, 300], yielding old-minimal \(12+12\) supergravity [307, 308] coupled to a tensor multiplet, which, with the conventional 2-derivative Lagrangian, correctly corresponds to the on-shell content obtained by tensoring left/right helicity states.

This linear mapping can be used to construct a higher-order perturbative relationship using a BCJ-type formalism [302], building on that of [117]. Even in the absence of a full perturbative framework, the dictionary can be used to construct, for example, supersymmetric (single and multi-centre) black hole solutions in \({\mathcal {N}}=2\) supergravity [115, 116], in the weak-field limit. Finally, it can be applied to curved backgrounds, at least where the convolution can be made tractable [304].

3.2.2 Global symmetries

When coupled to other fields the Yang–Mills factors may have further global symmetries. In particular, in addition to global supersymmetry, pure \({\mathcal {N}}\)-extended Yang–Mills theories always possess global R-symmetries. The global symmetries of the factors generate global symmetries in the product. In the context of supergravity, these typically take the form of non-compact global symmetries, \({\mathcal {G}}\), acting non-linearly on the scalar fields, the preeminent example being \(D=4, {\mathcal {N}}=8\) supergravity, which has global symmetry \(E_{7(7)}\), the maximally split non-compact real form of the second largest exceptional Lie group [309]. From the “\(\hbox {open} \times \hbox {open} = \hbox {closed}\)” string point of view, when it applies in the sense that the supergravity theory is the low energy effective field theory limit, these global symmetries can be understood as the continuous limit of the U-duality groups of M-theory [310]. This is indeed the case for all supergravity theories obtained from pure “\({\mathcal {N}}\) Yang–Mills \(\times \tilde{{\mathcal {N}}}\) Yang–Mills” in any dimension, since the factors are all (possibly consistent truncations of) open string theories. In these examples, the scalar fields of the corresponding supergravity theory always parametrise a symmetric space \({\mathcal {G}}/{\mathcal {H}}\), where \({\mathcal {H}}\) is the maximal compact subgroup of the non-compact global symmetry group \({\mathcal {G}}\) [140]. This reflects and generalises our earlier observation that in \(D=4\) the axion–dilaton of \({\mathcal {N}}=0\) supergravity belongs to \(\mathrm{SL}(2, {\mathbb {R}})/\mathrm{SO}(2)\). An obvious question at this point is the Yang–Mills origin of such symmetries. In the following we shall describe this situation, revealing some surprises along the way, as well as some general principles both in the supersymmetric and non-supersymmetric cases.
The question of global symmetries from squaring Yang–Mills has been addressed in, for example, [66, 68, 69, 72, 75, 77, 81, 85, 136, 138,139,140, 144, 311, 312], both in the context of scattering amplitudes and field theory.

Table 4 U-dualities (global symmetries) of M-theory (\(D=11, {\mathcal {N}}=1\) supergravity) compactified on an n-torus

As an example, in Table 4 we give the U-dualities of \(D=11\) M-theory compactified on an n-torus (equivalently \(D=10\) type IIA string theory on an \((n-1)\)-torus), and the \({\mathcal {G}}, {\mathcal {H}}\) of the corresponding supergravity low energy effective field theory limits. These are the square of the maximally supersymmetric Yang–Mills theories in \(D=10-(n-1)\). As one observes, the global symmetries become increasingly manifest as one descends in dimension.Footnote 28 Thus, to fully expose the structure of the global symmetries with respect to squaring we should consider the product of super Yang–Mills theories in \(D=3\). This was done in [136]. The result reveals a rather intriguing mathematical structure. The symmetry algebras obtained make up the Freudenthal–Rosenfeld–Tits magic square [317,318,319] as given in Table 6. As we shall explain, this surprise has an elegant origin, but first let us briefly return to the familiar case of \(D=4\) to make some generic observations.

Global symmetries: a first look To discuss global symmetries we can put aside the gauge/BRST transformations and focus on the asymptotic states of the on-shell spectrum. For the left/right Yang–Mills factors, these are labelled by their representations under the common spacetime little group \(\mathrm{Spin}(D-2)\) and any internal global symmetries they may carry, which may include both R-symmetries \(R, \tilde{R}\) and flavour groups \(F, \tilde{F}\). We shall work at the infinitesimal level, denoting the spacetime and internal Lie algebras by \({\mathfrak {so}}(D-2)\) and \({\mathfrak {int}}, {\mathfrak {\tilde{int}}}\). Suppressing the momentum label, the states are denoted:

$$\begin{aligned} \text {left states} \quad |{\rho _{{\mathfrak {so}}(D-2)}; \rho _{{\mathfrak {int}}}}\rangle _{L}^{\rho _{\mathfrak {g}}}, \quad \text {right states} \quad |{\rho _{{\mathfrak {so}}(D-2)}; \rho _{\tilde{{\mathfrak {int}}}}}\rangle _{R}^{\rho _{\tilde{{\mathfrak {g}}}}}. \end{aligned}$$
(146)

We have suppressed the momentum and colour labels. However, it is sometimes important to recall the gauge group representation carried by the states, as indicated by the superscript. When all states are in the adjoint we will leave this implicit. Since the left/right spacetimes are identified, but the internal symmetries are not, the product states are \((\mathfrak {so}(D-2)\oplus \mathfrak {int} \oplus \tilde{\mathfrak {int}})\)-modules.Footnote 29 Since, at tree-level, all amplitudes of the left/right factors are invariant under \({\mathfrak {int}}\) and \(\tilde{{\mathfrak {int}}}\), respectively, the global internal symmetry of the gravity theory is at least \(\mathfrak {int} \oplus \tilde{\mathfrak {int}}\). In the absence of anomalies this persists to all orders in perturbation theory. But this need not exhaust the symmetries of the gravitational theory. As we have argued, and will make explicit in the following, the global supersymmetries of the left (\({\mathcal {N}}\)-extended) and right (\(\tilde{{\mathcal {N}}}\)-extended) factors sum to give local supersymmetries, so that the product theory has \(({\mathcal {N}}+\tilde{{\mathcal {N}}})\)-extended local supersymmetry. Such theories have, at least, a linearly realised global symmetry isomorphic to the \(({\mathcal {N}}+\tilde{{\mathcal {N}}})\)-extended R-symmetry group, which includes as a subgroup the product of the left and right R-symmetry groups. The product theory can and will (typically) have more symmetry than is present in its factors. From the perspective of the double-copy this is quite remarkable. The gravity amplitudes are built from the numerators of the factors only, which individually manifest only their own global symmetries, yet they conspire to yield the larger symmetry of the gravity theory.

Given enough symmetry in the factors, the symmetries of the product theory can be deduced unambiguously from the field theory product. First one must determine the linearised local symmetry as described in the previous section. Given this structure, one can then focus on global symmetries in terms of either the fields or, more simply, the asymptotic states. Let us work through the paradigmatic example of “\({\mathcal {N}}=4\) Yang–Mills \(\times \tilde{{\mathcal {N}}}=4\) Yang–Mills” in \(D=4\). The \({\mathcal {N}}=4\) Yang–Mills multiplet includes a gluon, four gluini and six scalars. The only allowed internal symmetry is the \({\mathfrak {su}}(4)\) R-symmetry. The states are given by,

$$\begin{aligned} \wedge ^0 Q |{1; \mathbf {1}}\rangle&= |{1; \mathbf {1}}\rangle \nonumber \\ \wedge ^1 Q |{1; \mathbf {1}}\rangle&= |{\tfrac{1}{2}; \mathbf {4}}\rangle \nonumber \\ \wedge ^2 Q |{1; \mathbf {1}}\rangle&= |{0; \mathbf {6}}\rangle \nonumber \\ \wedge ^3 Q |{1; \mathbf {1}}\rangle&= |{{-}\tfrac{1}{2}; \bar{\mathbf {4}}}\rangle \nonumber \\ \wedge ^4 Q |{1; \mathbf {1}}\rangle&= |{{-}1; \mathbf {1}}\rangle \end{aligned}$$
(147)

where \(|{h; \mathbf {n}}\rangle \) denotes a helicity h state carrying \({\mathfrak {su}}(4)\) representation \(\mathbf {n}\). We have also indicated the action of the supersymmetry charge \(Q\sim |{{-}\tfrac{1}{2}; \mathbf {4}}\rangle \). The product yields

$$\begin{aligned} \begin{array}{l|lllllllllll} &{}|{1; \mathbf {1}}\rangle &{} |{\tfrac{1}{2}; \mathbf {4}}\rangle &{}|{0; \mathbf {6}}\rangle &{} |{-\tfrac{1}{2}; \bar{\mathbf {4}}}\rangle &{} |{-1; \mathbf {1}}\rangle \\ \hline |{1; \mathbf {1}}\rangle &{}|{2; \mathbf {1,1}}\rangle &{} |{\tfrac{3}{2}; \mathbf {1,4}}\rangle &{}|{1; \mathbf {1,6}}\rangle &{}|{\tfrac{1}{2}; \mathbf {1}, \bar{\mathbf {4}}}\rangle &{} |{0; \mathbf {1,1}}\rangle \\ |{\tfrac{1}{2}; \mathbf {4}}\rangle &{}|{\tfrac{3}{2}; \mathbf {4,1}}\rangle &{} |{1; \mathbf {4,4}}\rangle &{}|{\tfrac{1}{2}; \mathbf {4,6}}\rangle &{}|{0; \mathbf {4}, \bar{\mathbf {4}}}\rangle &{} |{{-}\tfrac{1}{2}; \mathbf {4,1}}\rangle \\ |{0; \mathbf {6}}\rangle &{}|{1; \mathbf {6,1}}\rangle &{} |{\tfrac{1}{2}; \mathbf {6,4}}\rangle &{}|{0; \mathbf {6,6}}\rangle &{}|{{-}\tfrac{1}{2}; \mathbf {6}, \bar{\mathbf {4}}}\rangle &{} |{{-}1; \mathbf {6,1}}\rangle \\ |{{-}\tfrac{1}{2}; \bar{\mathbf {4}}}\rangle &{}|{\tfrac{1}{2}; \bar{\mathbf {4}}, \mathbf {1}}\rangle &{} |{0;\bar{\mathbf {4}},\mathbf {4}}\rangle &{}|{{-}\tfrac{1}{2}; \bar{\mathbf {4}},\mathbf {6}}\rangle &{}|{{-}1; \bar{\mathbf {4}}, \bar{\mathbf {4}}}\rangle &{} |{{-}\tfrac{3}{2}; \bar{\mathbf {4}}, \mathbf {1}}\rangle \\ |{{-}1; \mathbf {1}}\rangle \ &{}|{0; \mathbf {1,1}}\rangle &{}|{{-}\tfrac{1}{2}; \mathbf {1,4}}\rangle &{}|{{-}1; \mathbf {1,6}}\rangle &{}|{{-}\tfrac{3}{2}; \mathbf {1}, \bar{\mathbf {4}}}\rangle &{} |{{-}2; \mathbf {1,1}}\rangle \end{array} \end{aligned}$$
(148)

Gathering the positive helicity states we find they carry the \({\mathfrak {int}}\oplus \tilde{{\mathfrak {int}}}\cong {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\) representations given by

$$\begin{aligned}&2~\text {Graviton: }\quad \mathbf {(1,1)}_{0}\quad \leftarrow \quad \mathbf {1} \end{aligned}$$
(149)
$$\begin{aligned}&\tfrac{3}{2}~\text {Gravitini: }\quad \mathbf {(4,1)}_{{\frac{1}{2}}} +\mathbf {(1,4)}_{{-\frac{1}{2}}}\quad \leftarrow \quad \mathbf {8} \end{aligned}$$
(150)
$$\begin{aligned}&1~\text {Vectors: }\quad \mathbf {(6,1)}_{{1}}+\mathbf {(1,6)}_{{-1}} +\mathbf {(4,4)}_{{0}}\quad \leftarrow \quad \mathbf {28} \end{aligned}$$
(151)
$$\begin{aligned}&\tfrac{1}{2}~\text {Spinors: }\quad (\bar{\mathbf {4}},\mathbf {1})_{{\frac{3}{2}}} +(\mathbf {1},\bar{\mathbf {4}})_{{-\frac{3}{2}}}+\mathbf {(6,4)}_{{\frac{1}{2}}} +\mathbf {(4,6)}_{{-\frac{1}{2}}}\quad \leftarrow \quad \mathbf {56} \end{aligned}$$
(152)
$$\begin{aligned}&0~\text {Scalars: }\quad \mathbf {(1,1)}_{{2}}+\mathbf {(1,1)}_{{-2}} +(\bar{\mathbf {4}},\mathbf {4})_{{1}}+(\mathbf {4},\bar{\mathbf {4}})_{{-1}} +\mathbf {(6,6)}_{{0}}\quad \leftarrow \quad \mathbf {70} \end{aligned}$$
(153)
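The counting in Eqs. (149)–(153) can be cross-checked mechanically. The following is a minimal Python sketch and not part of the original construction: the names `N4_MULTIPLET`, `square` and `count_by_helicity` are hypothetical, and each state is recorded only by its helicity, an \({\mathfrak {su}}(4)\) label and a dimension. It tensors the left and right multiplets of Eq. (147), assigns the helicity \(h+\tilde{h}\) and the charge \(q=\tilde{h}-h\), and tallies the number of states at each helicity.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)

# On-shell N=4 Yang-Mills multiplet of Eq. (147):
# (helicity, su(4) representation label, dimension)
N4_MULTIPLET = [
    (Fraction(1), "1", 1),
    (HALF, "4", 4),
    (Fraction(0), "6", 6),
    (-HALF, "4bar", 4),
    (Fraction(-1), "1", 1),
]


def square(left, right):
    """Tensor left and right states, keyed by (helicity, u(1) charge q)."""
    product = defaultdict(list)
    for h, rep_left, dim_left in left:
        for h_tilde, rep_right, dim_right in right:
            helicity = h + h_tilde  # diagonal so(2): sum of helicities
            q = h_tilde - h         # extra u(1): difference of helicities
            product[(helicity, q)].append((rep_left, rep_right, dim_left * dim_right))
    return product


def count_by_helicity(product):
    """Total number of states at each helicity."""
    counts = defaultdict(int)
    for (helicity, _q), states in product.items():
        counts[helicity] += sum(dim for _, _, dim in states)
    return dict(counts)


product = square(N4_MULTIPLET, N4_MULTIPLET)
counts = count_by_helicity(product)
for helicity in sorted(counts, reverse=True):
    print(f"helicity {helicity}: {counts[helicity]} states")
```

For the non-negative helicities the tally reproduces the dimensions 1, 8, 28, 56 and 70 quoted on the right of Eqs. (149)–(153), and the full product contains \(16\times 16=256\) states.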

while the negative helicity states carry their complex conjugates. We have indicated in the subscripts an additional \({\mathfrak {u}}(1)\) charge q corresponding to the difference, rather than sum, of the left and right helicities, \(q=\tilde{h}-h\), so that \({\mathfrak {so}}(2)\oplus \widetilde{{\mathfrak {so}}(2)} \cong {\mathfrak {so}}(2)\oplus {\mathfrak {u}}(1)\), introduced in [72]. We have also indicated the branching under \({\mathfrak {su}}(8)\supset {\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\) of the positive helicity \({\mathcal {N}}=8\) supergravity states carrying the corresponding representations under its linearly realised global symmetry \({\mathcal {H}}\cong \mathrm{SU}(8)\) (we have weighted the \({\mathfrak {u}}(1)\) by a factor of 2 relative to the standard conventions). As observed in [72], with the \({\mathfrak {u}}(1)\) charges included, the spectrum of “\({\mathcal {N}}=4\) Yang–Mills \(\times \tilde{{\mathcal {N}}}=4\)” is precisely that of \({\mathcal {N}}=8\) supergravity under \({\mathfrak {su}}(8)\supset {\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\):

$$\begin{aligned} \wedge ^0 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{2; \mathbf {1}}\rangle \nonumber \\ \wedge ^1 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{\tfrac{3}{2}; \mathbf {8}}\rangle \nonumber \\ \wedge ^2 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{1; \mathbf {28}}\rangle \nonumber \\ \wedge ^3 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{\tfrac{1}{2}; \mathbf {56}}\rangle \nonumber \\ \wedge ^4 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{0; \mathbf {70}}\rangle \nonumber \\ \wedge ^5 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{{-}\tfrac{1}{2}; \mathbf {\overline{56}}}\rangle \nonumber \\ \wedge ^6 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{{-}1; \mathbf {\overline{28}}}\rangle \nonumber \\ \wedge ^7 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{{-}\tfrac{3}{2}; \mathbf {\overline{8}}}\rangle \nonumber \\ \wedge ^8 {\mathcal {Q}} |{2; \mathbf {1}}\rangle&= |{{-}2; \mathbf {1}}\rangle . \end{aligned}$$
(154)

This also makes it clear that the left and right supercharges generate the \({\mathcal {N}}=8\) supercharges \({\mathcal {Q}}=Q\oplus \tilde{Q}\); under the product the supersymmetries sum \({\mathcal {N}}\times \tilde{{\mathcal {N}}}\rightarrow {\mathcal {N}}+\tilde{{\mathcal {N}}}\).
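The statement \({\mathcal {Q}}=Q\oplus \tilde{Q}\) can be illustrated at the level of dimensions: the k-th exterior power of the eight combined supercharges splits into products of exterior powers of the four left and four right supercharges. The sketch below is a hedged illustration only (the function name `exterior_powers` and its defaults are hypothetical); it uses the Vandermonde identity \(\binom{8}{k}=\sum _i\binom{4}{i}\binom{4}{k-i}\) and assumes that each application of a supercharge lowers the helicity by \(1/2\), starting from the helicity 2 graviton.

```python
from fractions import Fraction
from math import comb


def exterior_powers(n_left=4, n_right=4, top_helicity=Fraction(2)):
    """Return [(k, helicity, dimension, branching)] for wedge^k of the combined charges."""
    n = n_left + n_right
    rows = []
    for k in range(n + 1):
        helicity = top_helicity - Fraction(k, 2)
        # wedge^k(V + W) = sum_i wedge^i(V) x wedge^(k-i)(W); drop empty pieces
        branching = [comb(n_left, i) * comb(n_right, k - i) for i in range(k + 1)]
        branching = [d for d in branching if d]
        rows.append((k, helicity, comb(n, k), branching))
    return rows


rows = exterior_powers()
for k, helicity, dim, branching in rows:
    assert dim == sum(branching)  # Vandermonde identity
    print(f"k={k}  helicity={helicity}  dim={dim}  pieces={branching}")
```

For example, \(k=2\) gives \(28=6+16+6\) and \(k=4\) gives \(70=1+16+36+16+1\), matching the dimensions of the \({\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\) pieces in Eqs. (151) and (153).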

Note, unlike \({\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\), the additional \({\mathfrak {u}}(1)\) was not a priori required to be a symmetry of the product theory. Nonetheless, its presence for the product of pure (super) Yang–Mills theories may be anticipated from various points of view. This analysis can be repeated for any product of pure super Yang–Mills theories, including \({\mathcal {N}}=0\), in \(D=4\) [138, 140]. In these cases, the additional \({\mathfrak {u}}(1)\) is always required at the level of symmetries. In particular, it ensures that the scalar manifold is symmetric [85], just as in the case of axion–dilaton gravity (dualised \({\mathcal {N}}=0\) supergravity) derived from the square of pure Yang–Mills. All such theories may be consistently truncated to \({\mathcal {N}}=0\) supergravity, so from this point of view it is reasonable, although not unquestionable, to expect the presence of the \({\mathfrak {u}}(1)\) symmetry. It is in this sense that the product of Yang–Mills theories generically yields symmetric scalar manifolds. Moreover, it is crucially present in all the double-copy constructed amplitudes in all cases that have been tested, although there is no formal proof that it holds to all points. For \({\mathcal {N}}+\tilde{{\mathcal {N}}}>4\) this had to be the case as the supergravity theories are unique and have this symmetry. For \({\mathcal {N}}+\tilde{{\mathcal {N}}}\le 4\) it is also present at tree-level, but is anomalous. Rather satisfyingly, these anomalies can be traced back to the factors through the double-copy [72, 82]. As we shall review, it can also be understood from the division algebraic perspective on spacetime and supersymmetry.

We have thus far left the generators of \({\mathfrak {su}}(8)\) not contained in \({\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\) unaccounted for. Although these cannot be generated by left or right transformations alone, they may be deduced from their product. Since we are seeking bosonic symmetries we can consider the “product” of the left and right supercharges, \(Q\otimes \tilde{Q}\), to give us a map of \({\mathcal {N}}=8\) states that preserves helicity [140]. Let us denote the formally modified (by, roughly speaking, \(\Box ^{-1}\)Footnote 30) supersymmetry charges introduced in [140], by \(Q^{a}_{-}, Q^{+}_{a}\), where the ± charge raises/lowers the helicity by \(\pm 1/2\) and the superscript (subscript) a is in the \(\mathbf {4} \ (\mathbf {\overline{4}})\) of \({\mathfrak {su}}(4)\). The helicity preserving operators \(Q_{-}^{a}\otimes \tilde{Q}^{+}_{a}\) and \(Q^{+}_{a}\otimes \tilde{Q}_{-}^{a}\) sit in the \((\mathbf {4}, \bar{\mathbf {4}})_{{1}} +(\bar{\mathbf {4}},\mathbf {4})_{{{-}1}}\) of \({\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\), which matches the decomposition under \({\mathfrak {su}}(8)\supset {\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\),

$$\begin{aligned} \mathbf {63}=(\mathbf {15,1})_{{0}}\oplus (\mathbf {1, 15})_{{0}}\oplus (\mathbf {1, 1})_{{0}} +(\mathbf {4}, \bar{\mathbf {4}})_{{1}}+ (\bar{\mathbf {4}},\mathbf {4})_{{{-}1}}. \end{aligned}$$
(155)

The action of the helicity preserving operators \(Q\otimes \tilde{Q}\) on the \({\mathcal {N}}=8\) states gives precisely the required transformations as described in [140]. Computing the commutators through this action, the full Lie algebra of \({\mathfrak {su}}(8)\) is recovered. This generalises to all dimensions and degrees of supersymmetry [140].

But this is not the end of the story. As emphasised, the equations of motion of \({\mathcal {N}}=8\) supergravity have a non-linearly realised non-compact global symmetry, \(E_{7(7)}\). If we make the assumption that the supergravity scalars parametrise a Riemannian symmetric homogeneous space \({\mathcal {G}}/{\mathcal {H}}\), then \(T_{p}({\mathcal {G}}/{\mathcal {H}})\cong {\mathfrak {p}}\), where \({\mathfrak {g}}={\mathfrak {h}}+{\mathfrak {p}}\) for \({\mathfrak {g}},{\mathfrak {h}}\) the Lie algebras of \({\mathcal {G}}, {\mathcal {H}}\) respectively. Then the previously derived \({\mathfrak {h}}\) representation carried by the scalars is enough to fix \({\mathcal {G}}\). In the present example we have \(\mathbf {133}=\mathbf {63+70}\) under \({\mathfrak {su}}(8)\subset {\mathfrak {e}}_{7(7)}\). Since the scalars are indeed in the \(\mathbf {70}\) of \({\mathfrak {su}}(8)\), we infer the \(E_{7(7)}\) and then can check the consistency of the representations carried by the vectors. Justifying the assumption that the scalars will parametrise a symmetric space by considering all \({\mathcal {N}}=0\) truncations, this provides a relatively systematic approach to fixing the global symmetries. In fact, it has a very natural algebraic/geometric underpinning as we shall describe in the following section. Of course, one could argue that given the uniqueness of \({\mathcal {N}}>4\) supergravity, we knew the answer all along; however, this is against the spirit of “\(\hbox {gravity} =\hbox {gauge} \times \hbox {gauge}\)”. Besides, it can be generalised to \({\mathcal {N}}\le 4\) supergravity theories, with a very large class of matter couplings, where we lose uniqueness and can drop the symmetricFootnote 31 scalar manifold assumption [85]. Nonetheless, one would like to see the generators in \({\mathfrak {p}}\subset {\mathfrak {g}}\) arise directly in terms of products of operators belonging to the left and right factors.
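The dimension count behind \({\mathfrak {g}}={\mathfrak {h}}+{\mathfrak {p}}\) is simple enough to check by hand, but a short script makes the bookkeeping explicit. The following Python sketch is illustrative only: the dictionary names are hypothetical, the entries are the dimensions of the \({\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\) pieces quoted in Eqs. (153) and (155), and the extra cosets listed in `OTHER_COSETS` are standard Lie algebra dimensions for maximal supergravity in \(D=3\) and \(D=5\), included here as an assumption rather than taken from the text.

```python
# Pieces of the su(8) adjoint, Eq. (155): label -> dimension
SU8_ADJOINT = {
    "(15,1)_0": 15,
    "(1,15)_0": 15,
    "(1,1)_0": 1,
    "(4,4bar)_1": 16,
    "(4bar,4)_-1": 16,
}

# Pieces of the scalar representation, Eq. (153): label -> dimension
SCALARS = {
    "(1,1)_2": 1,
    "(1,1)_-2": 1,
    "(4bar,4)_1": 16,
    "(4,4bar)_-1": 16,
    "(6,6)_0": 36,
}

dim_h = sum(SU8_ADJOINT.values())  # dimension of h = su(8)
dim_p = sum(SCALARS.values())      # dimension of p = tangent space of G/H
dim_g = dim_h + dim_p              # dimension of g = h + p

print(f"dim h = {dim_h}, dim p = {dim_p}, dim g = {dim_g}")

# Assumed (dim G, dim H) for two further maximal supergravity cosets
OTHER_COSETS = {
    "D=3: E8(8)/SO(16)": (248, 120),
    "D=5: E6(6)/USp(8)": (78, 36),
}
scalar_counts = {name: g - h for name, (g, h) in OTHER_COSETS.items()}
for name, count in scalar_counts.items():
    print(f"{name}: {count} scalars")
```

The first line of output gives \(63+70=133\), the dimension of \({\mathfrak {e}}_{7(7)}\). Dimensions alone do not fix the real form; that information comes from the compactness of \({\mathfrak {h}}\) and the non-compactness of \({\mathfrak {p}}\).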
Let us reconsider the tensor product of the (modified) supersymmetry charges introduced above. The \(\pm 1\) helicity states transform irreducibly in the \(\mathbf {56}\) of \({\mathfrak {e}}_{7(7)}\), which decomposes into the \(\mathbf {28+\overline{28}}\) under \({\mathfrak {su}}(8)\). The \(\mathbf {70}\in {\mathfrak {e}}_{7(7)}\ominus {\mathfrak {su}}(8)\) exchanges the helicity states so we require operators in the

$$\begin{aligned} \mathbf {70}=\mathbf {(1,1)}_{{2}}+\mathbf {(1,1)}_{{-2}} +(\bar{\mathbf {4}}, \mathbf {4})_{{1}} +(\mathbf {4}, \bar{\mathbf {4}})_{{-1}}+\mathbf {(6,6)}_{{0}} \end{aligned}$$
(156)

of \({\mathfrak {u}}(1)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\subset {\mathfrak {su}}(8)\) carrying helicity charge \(\pm 2\) related by conjugation and self-duality:

$$\begin{aligned} \begin{array}{rlllllll} Q^{+}_{a}Q^{+}_{b} &{}\otimes &{} \tilde{Q}^{+}_{\tilde{{a}}}\tilde{Q}^{+}_{\tilde{{b}}} &{} (\mathbf {6,6})^{{2}}_{{0}}\\ Q^{+}_{a}Q^{+}_{b} Q^{+}_{c} &{}\otimes &{} \tilde{Q}^{+}_{\tilde{{a}}} &{} (\mathbf {{4},\overline{4}})^{{2}}_{{{-}1}}\\ {Q}^{+}_{{a}} &{}\otimes &{} \tilde{Q}^{+}_{\tilde{{a}}}\tilde{Q}^{+}_{\tilde{{b}}} \tilde{Q}^{+}_{\tilde{{c}}} &{} (\overline{\mathbf {4}},{\mathbf {4}})^{{2}}_{{1}}\\ Q^{+}_{a}Q^{+}_{b} Q^{+}_{c}Q^{+}_{d} &{}\otimes &{}\mathbb {1} &{} (\mathbf {1,1})^{{2}}_{{{-}2}}\\ \mathbb {1}&{}\otimes &{} \tilde{Q}^{+}_{\tilde{{a}}}\tilde{Q}^{+}_{\tilde{{b}}} \tilde{Q}^{+}_{\tilde{{c}}} \tilde{Q}^{+}_{\tilde{{d}}} &{} (\mathbf {1,1})^{{2}}_{{2}}\\ \end{array} \end{aligned}$$
(157)

which gives the correct commutation relations acting on the vector states. This is just suggestive and the full picture is yet to be made clear at the level of field theory.

Returning to amplitudes and the BCJ double-copy we can be much more concrete, since they intrinsically carry non-linearities. Generically, the presence of a non-linear symmetry of the scalar Lagrangian (we have in mind here a \({\mathcal {G}}/{\mathcal {H}}\) sigma model) manifests itself through low-energy theorems regarding soft limits of the scalar amplitudes [311, 312, 320]. If the space of scalars is a homogeneous manifold, they are the Goldstone bosons of \({\mathcal {H}}\subset {\mathcal {G}}\) and so are derivatively coupled. Consequently, on sending the momentum of any external scalar in any amplitude to zero the amplitude itself will vanish. This, in contrast, is not the case for the Yang–Mills factors. In the present case, the \(\mathrm{SU}(4)\) symmetry of \({\mathcal {N}}=4\) Yang–Mills is linearly realised on the scalars and single-soft-scalar limits do not vanish, but rather give the small-mass limit of Coulomb branch amplitudes [321]. Building amplitudes involving scalars through the double-copy of \({\mathcal {N}}=4\) and then testing the vanishing of single-soft-scalar limits establishes that the scalars of the double-copy theory belong to a homogeneous space, despite the fact that scalars of the factors do not. One can go further by carefully considering taking double-soft limits in different orders to extract the commutation relations of the coset manifold and in this way piece together the full non-linear global symmetry of the double-copy theory, as has been done in some detail for the \(E_{7(7)}\) of \({\mathcal {N}}=8\) supergravity [312].
The same principles can be applied to any case where the scalars belong to a homogeneous manifold. A particularly elegant example, constructed through the double-copy in [77], is given by the magic \({\mathcal {N}}=2\) supergravity theories [237, 244, 245], which have exceptional non-compact global symmetries belonging to (a particular non-compact real form of) the Freudenthal–Rosenfeld–Tits magic square [317,318,319, 322, 323]. In fact, (a different non-compact real form of) the magic square arises naturally from “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” in a completely different context. This forms the next part of the story of the global symmetries.

3.2.3 Magic pyramids of symmetries

Recall, the U-duality symmetries grow as we descend in spacetime dimension. See Table 4. Let us therefore consider the global symmetries of the product of all pure \({\mathcal {N}}=1,2,4,8\) Yang–Mills theories in \(D=3\). Applying the principles of the preceding sections a remarkable result follows. It was shown in [136] that the resulting \(({\mathcal {N}}+\tilde{{\mathcal {N}}})\)-extended supergravity theories have global symmetries given precisely by (a particular non-compact real form of) the Freudenthal–Rosenfeld–Tits magic square, as summarised in Table 5.

Table 5 \(({\mathcal {N}}+\tilde{{\mathcal {N}}})\)-extended \(D=3\) supergravities obtained by the product of left/right super Yang–Mills multiplets with \({\mathcal {N}},\tilde{ {\mathcal {N}}}=1,2,4,8\)
Table 6 The magic square with real form corresponding to the product of pure super Yang–Mills theories in \(D=3\) spacetime dimensions

The Freudenthal–Rosenfeld–Tits magic square [317,318,319, 322, 323] is a \(4\times 4\) array \({\mathfrak {m}}({\mathbb {A}}, \tilde{{\mathbb {A}}})\) of semi-simple Lie algebras given by pairs of composition algebras \({\mathbb {A}}, \tilde{{\mathbb {A}}}={\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\), as given in Table 6. The original magic square was for compact real forms, but there are various modifications that allow for a variety of real forms. The complete set of possibilities is given in [324]. The magic square given in Table 6 can be concisely summarised by the magic square formula [136],

$$\begin{aligned} {\mathfrak {ms}}({\mathbb {A}}_{ {\mathcal {N}}_L}, {\mathbb {A}}_{ {\mathcal {N}}_R}) :={\mathfrak {tri}}({\mathbb {A}}_{ {\mathcal {N}}_L})\oplus {\mathfrak {tri}} ({\mathbb {A}}_{ {\mathcal {N}}_R})+3({\mathbb {A}}_{ {\mathcal {N}}_L}\otimes {\mathbb {A}}_{ {\mathcal {N}}_R}), \end{aligned}$$
(158)

which adapts the compact version given in [196]. The triality algebra of \({\mathbb {A}}\), denoted \({\mathfrak {tri}}({\mathbb {A}})\), is related to the total on-shell global symmetries of the associated super Yang–Mills theory [190]. This rather surprising connection, relating the magic square of Lie algebras to the square of super Yang–Mills, can be attributed to the existence of a unified \({\mathbb {A}}_{\mathcal {N}}={\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\) description of \(D=3,\, {\mathcal {N}}=1,2,4,8\) super Yang–Mills theories.

This observation was subsequently generalised to \(D=3, 4, 6\) and 10 dimensions [138, 190] by incorporating the well-known relationship between the existence of minimal super Yang–Mills theories in \(D=3,4,6,10\) and the existence of the four division algebras \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\) [187, 189, 191, 195]. From this perspective the \(D=3\) magic square forms the base of a “magic pyramid” of supergravities, given in Fig. 1. The Lie algebras are given by the magic pyramid formula:

$$\begin{aligned} \mathfrak {mp}({\mathbb {A}}_{n}, {\mathbb {A}}_{n{\mathcal {N}}}, {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}) :=\left\{ u\in {\mathfrak {ms}}({\mathbb {A}}_{n{\mathcal {N}}}, {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}) \ominus {\mathfrak {so}}({\mathbb {A}}_n)_{ST}\Big |[u,{\mathfrak {so}}({\mathbb {A}}_n)_{ST}]=0\right\} . \end{aligned}$$
(159)

Fig. 1 A magic pyramid of supergravities. The vertical axis labels the spacetime division algebra \({\mathbb {A}}_n\), while the horizontal axes label the algebras associated with the number of supersymmetries \({\mathbb {A}}_{n {\mathcal {N}}}\) and \({\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}\)

These constructions build on a long line of work relating division algebras and magic squares to spacetime and supersymmetry. See [35, 187,188,189, 191, 195, 237, 244, 245, 313, 324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358] for a glimpse of the relevant literature. An early exampleFootnote 32 in the context of group disintegrations in supergravity appears in [313]. Before developing these ideas we should take a brief detour through division algebras and the magic square.

Division algebras and the magic square In this section we follow closely [195, 196]; we refer the reader to these works for more detailed explanations and proofs. An algebra \(\mathbb {A}\) defined over \({\mathbb {R}}\) with identity element \(e_0\), is said to be composition if it has a non-degenerate quadratic formFootnote 33\({\mathbf {n}}:\mathbb {A}\rightarrow {\mathbb {R}}\) such that,

$$\begin{aligned} {\mathbf {n}}(ab)={\mathbf {n}}(a){\mathbf {n}}(b),\quad \forall ~~ a,b \in {\mathbb {A}}, \end{aligned}$$
(160)

where we denote the multiplicative product of the algebra by juxtaposition. Regarding \({{\mathbb {R}}}\subset {\mathbb {A}}\) as the scalar multiples of the identity \( {\mathbb {R}}e_0\) we may decompose \(\mathbb {A}\) into its “real” and “imaginary” parts \({\mathbb {A}}={{\mathbb {R}}}\oplus \text {Im} {\mathbb {A}}\), where \(\text {Im}{\mathbb {A}}\subset {\mathbb {A}}\) is the subspace orthogonal to \({\mathbb {R}}\). An arbitrary element \(a\in {\mathbb {A}}\) may be written \(a=\text {Re}(a) +\text {Im}(a)\). Here \(\text {Re}(a)\in {\mathbb {R}}e_0\), \(\text {Im}(a)\in \text {Im}{\mathbb {A}}\). Defining conjugation using the bilinear form,

$$\begin{aligned} \tau (a)\equiv \overline{a}:=\langle {a}, {e_0}\rangle e_0-a, \quad \langle a, b\rangle :={\mathbf {n}}(a+b)-{\mathbf {n}}(a)-{\mathbf {n}}(b), \end{aligned}$$
(161)

we as usual write

$$\begin{aligned} \text {Re}(a)=\frac{1}{2}(a+\overline{a}), \quad \text {Im}(a) =\frac{1}{2}(a-\overline{a}). \end{aligned}$$
(162)

A composition algebra \(\mathbb {A}\) is said to be division if it contains no zero divisors,

$$\begin{aligned} ab=0\quad \Rightarrow \quad a=0\quad \text {or}\quad b=0, \end{aligned}$$

in which case \({\mathbf {n}}\) is positive definite and \({\mathbb {A}}\) is referred to as a normed division algebra. Hurwitz’s celebrated theorem states that there are exactly four normed division algebras [359]: the reals, complexes, quaternions and octonions, denoted respectively by \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}\) and \({\mathbb {O}}\). They may be constructed via the Cayley–Dickson doubling procedure, \({\mathbb {A}}'={\mathbb {A}}\oplus {\mathbb {A}}\) with multiplication in \({\mathbb {A}}'\) defined by

$$\begin{aligned} (a, b)(c, d) = (ac - d\bar{b}, \bar{a}d + cb). \end{aligned}$$
(163)

With each doubling a property is lost as summarised here:

$$\begin{aligned} \begin{array}{lllllll} &{} \dim &{} \text {Division} &{} \text {Associative} &{} \text {Commutative} &{} \text {Ordered} \\ {\mathbb {R}}= {\mathbb {R}}&{} 1&{} yes &{} yes &{} yes &{} yes \\ {\mathbb {C}}\cong {\mathbb {R}}\oplus {\mathbb {R}}&{} 2&{} yes &{} yes &{} yes &{} no \\ {\mathbb {H}}\cong {\mathbb {C}}\oplus {\mathbb {C}}&{} 4&{} yes &{} yes &{} no &{} no \\ {\mathbb {O}} \cong {\mathbb {H}}\oplus {\mathbb {H}}&{} 8&{} yes &{} no &{} no &{} no \\ \end{array} \end{aligned}$$

Note that, while the octonions are not associative they are alternative:

$$\begin{aligned} {[}a, b, c]:= (ab)c-a(bc) \end{aligned}$$
(164)

is an alternating function under the interchange of its arguments. This property is crucial for supersymmetry.
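The loss of properties under doubling can be verified directly from Eq. (163). The following Python sketch is a hedged illustration, not a construction taken from the references: elements are nested pairs of integers, the helper names (`build`, `mul`, `properties`, and so on) are hypothetical, and conjugation on a pair is assumed to act as \(\overline{(a,b)}=(\bar{a},-b)\), the usual companion of this doubling rule. Commutativity, associativity and alternativity are multilinear conditions and are tested on basis elements; the composition property (160) is quadratic and is tested on random integer elements.

```python
import random


def build(coeffs):
    """Nest a flat list of 2^k real coefficients into Cayley-Dickson pairs."""
    if len(coeffs) == 1:
        return coeffs[0]
    half = len(coeffs) // 2
    return (build(coeffs[:half]), build(coeffs[half:]))


def flat(x):
    return flat(x[0]) + flat(x[1]) if isinstance(x, tuple) else [x]


def neg(x):
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x


def add(x, y):
    return (add(x[0], y[0]), add(x[1], y[1])) if isinstance(x, tuple) else x + y


def conj(x):
    # Assumed conjugation on pairs: conj(a, b) = (conj(a), -b)
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x


def mul(x, y):
    if not isinstance(x, tuple):
        return x * y
    (a, b), (c, d) = x, y
    # Eq. (163): (a, b)(c, d) = (ac - d conj(b), conj(a) d + cb)
    return (add(mul(a, c), neg(mul(d, conj(b)))), add(mul(conj(a), d), mul(c, b)))


def norm(x):
    return sum(c * c for c in flat(x))


def associator(a, b, c):
    return add(mul(mul(a, b), c), neg(mul(a, mul(b, c))))


def is_zero(x):
    return all(c == 0 for c in flat(x))


def properties(dim, trials=200, seed=0):
    """Test the algebra of the given dimension (1, 2, 4 or 8)."""
    es = [build([int(i == j) for j in range(dim)]) for i in range(dim)]
    rng = random.Random(seed)
    pairs = [(build([rng.randint(-5, 5) for _ in range(dim)]),
              build([rng.randint(-5, 5) for _ in range(dim)])) for _ in range(trials)]
    triples = [(a, b, c) for a in es for b in es for c in es]
    return {
        "commutative": all(mul(a, b) == mul(b, a) for a in es for b in es),
        "associative": all(is_zero(associator(a, b, c)) for a, b, c in triples),
        # Associator antisymmetric in adjacent arguments on a basis => alternative
        "alternative": all(
            is_zero(add(associator(a, b, c), associator(b, a, c)))
            and is_zero(add(associator(a, b, c), associator(a, c, b)))
            for a, b, c in triples),
        "composition": all(norm(mul(x, y)) == norm(x) * norm(y) for x, y in pairs),
    }


for name, dim in [("R", 1), ("C", 2), ("H", 4), ("O", 8)]:
    print(name, properties(dim))
```

Nested pairs are used rather than a precomputed multiplication table so that the same few lines cover all four algebras. The random test of the composition property is a numerical check on a sample and not a proof.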

An element \(a\in {\mathbb {O}}\) may be written \(a=a^ae_a\), where \(a=0,\ldots ,7\), \(a^a\in {\mathbb {R}}\) and \(\{e_a\}\) is a basis with one real \(e_0\) and seven \(e_i, i=1,\ldots , 7,\) imaginary elements. The octonionic multiplication rule is,

$$\begin{aligned} e_ae_b=\left( \delta _{a0}\delta _{bc}+\delta _{0b}\delta _{ac} -\delta _{ab}\delta _{0c}+C_{abc}\right) e_c, \end{aligned}$$
(165)

where \(C_{abc}\) is totally antisymmetric and \(C_{0bc}=0\). The non-zero \(C_{ijk}\) are given by the Fano plane. See Fig. 2.

Fig. 2 The Fano plane. The structure constants are determined by the Fano plane, \(C_{ijk}=1\) if ijk lies on a line and is ordered according to its orientation. Each oriented line follows the rules of quaternionic multiplication. For example, \(e_2e_3=e_5\) and cyclic permutations; odd permutations go against the direction of the arrows on the Fano plane and we pick up a minus sign, e.g. \(e_3e_2=-e_5\)
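The multiplication rule (165) can be made concrete once the oriented lines of the Fano plane are specified. Since the figure itself is not reproduced here, the sketch below assumes the seven oriented lines \((i, i+1, i+3)\) with indices taken modulo 7 in the range 1 to 7; this is one common labelling and is consistent with the quoted example \(e_2e_3=e_5\), but it is an assumption and the labelling in the figure may differ. All function names are hypothetical.

```python
import random

# Assumed oriented lines of the Fano plane: (i, i+1, i+3) mod 7, labels 1..7
LINES = [(i, i % 7 + 1, (i + 2) % 7 + 1) for i in range(1, 8)]


def structure_constants():
    """Totally antisymmetric C_abc with C_ijk = +1 along each oriented line."""
    C = {}
    for i, j, k in LINES:
        for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
            C[(a, b, c)] = 1    # cyclic (even) permutations
            C[(b, a, c)] = -1   # odd permutations
    return C


def delta(a, b):
    return 1 if a == b else 0


def multiplication_table():
    """table[(a, b)][c] is the coefficient of e_c in e_a e_b, following Eq. (165)."""
    C = structure_constants()
    return {
        (a, b): [delta(a, 0) * delta(b, c) + delta(0, b) * delta(a, c)
                 - delta(a, b) * delta(0, c) + C.get((a, b, c), 0)
                 for c in range(8)]
        for a in range(8) for b in range(8)
    }


def mul(x, y, table):
    """Product of two octonions given as lists of eight real coefficients."""
    out = [0] * 8
    for a in range(8):
        for b in range(8):
            coeff = x[a] * y[b]
            if coeff:
                for c in range(8):
                    out[c] += coeff * table[(a, b)][c]
    return out


def norm(x):
    return sum(c * c for c in x)


TABLE = multiplication_table()
e = [[int(i == j) for j in range(8)] for i in range(8)]

print("e2 e3 =", mul(e[2], e[3], TABLE))
print("e3 e2 =", mul(e[3], e[2], TABLE))

# Sample check of the composition property n(xy) = n(x) n(y)
rng = random.Random(1)
x = [rng.randint(-5, 5) for _ in range(8)]
y = [rng.randint(-5, 5) for _ in range(8)]
print(norm(mul(x, y, TABLE)) == norm(x) * norm(y))
```

With this labelling each line closes into a quaternionic triple, for example \(e_2e_3=e_5\), \(e_3e_5=e_2\) and \(e_5e_2=e_3\).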

There are three symmetry algebras on \({\mathbb {A}}\) that we will make use of here. The norm preserving algebra is defined as,

$$\begin{aligned} \mathfrak {so}({\mathbb {A}}):=\{A\in \mathrm{Hom}_{{\mathbb {R}}}({\mathbb {A}}) | \langle A a, b\rangle +\langle a, A b\rangle =0, \;\forall a, b \in {\mathbb {A}}\}, \end{aligned}$$
(166)

yielding,

$$\begin{aligned} {\mathfrak {so}}({\mathbb {R}})&\cong \emptyset , \nonumber \\ {\mathfrak {so}}({\mathbb {C}})&\cong {\mathfrak {so}}(2),\nonumber \\ {\mathfrak {so}}({\mathbb {H}})&\cong {\mathfrak {so}}(3)\oplus {\mathfrak {so}}(3),\nonumber \\ {\mathfrak {so}}({\mathbb {O}})&\cong {\mathfrak {so}}(8). \end{aligned}$$
(167)

The triality algebra of \({\mathbb {A}}\) is defined as triples \((A, B, C) \in \mathfrak {so}({\mathbb {A}})\oplus \mathfrak {so}({\mathbb {A}})\oplus \mathfrak {so}({\mathbb {A}})\) that act as generalised derivations,

$$\begin{aligned} \mathfrak {tri}({\mathbb {A}}):=\{(A, B, C)| A(ab)=B(a)b+aC(b), \; \forall a,b \in {\mathbb {A}}\}, \end{aligned}$$
(168)

yielding,

$$\begin{aligned} {\mathfrak {tri}}({\mathbb {R}})&\cong \emptyset , \nonumber \\ {\mathfrak {tri}}({\mathbb {C}})&\cong {\mathfrak {so}}(2)\oplus {\mathfrak {so}}(2),\nonumber \\ {\mathfrak {tri}}({\mathbb {H}})&\cong {\mathfrak {so}}(3)\oplus {\mathfrak {so}}(3)\oplus {\mathfrak {so}}(3),\nonumber \\ {\mathfrak {tri}}({\mathbb {O}})&\cong {\mathfrak {so}}(8). \end{aligned}$$
(169)

Note, the octonionic triality algebra reduces to a single copy of \({\mathfrak {so}}(8)\). This is the statement of infinitesimal triality: for all \(A\in {\mathfrak {so}}(8)\) there exist unique \(B, C\in {\mathfrak {so}}(8)\) such that

$$\begin{aligned} A(ab)=B(a)b+aC(b), \quad \forall a,b \in {\mathbb {O}}. \end{aligned}$$
(170)
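
The dimensions quoted in (169) can be recovered by solving the linear condition (168) on a basis. The sketch below is an illustration under stated assumptions rather than a derivation from the text: it assumes the Fano-plane convention \((i, i+1, i+3)\) mod 7, realises \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}\) as the subalgebras spanned by \(\{e_0\}\), \(\{e_0,e_1\}\), \(\{e_0,e_1,e_2,e_4\}\), and counts solutions with an exact integer row reduction. Setting \(A=B=C\) without the antisymmetry condition gives the derivation condition of (171) below. All function names are hypothetical.

```python
from functools import reduce
from math import gcd

# Assumption: oriented Fano lines (i, i+1, i+3) mod 7, consistent with e_2 e_3 = e_5.
LINES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def octonion_table():
    """table[a][b] = (sign, c) encodes e_a e_b = sign * e_c."""
    table = [[None] * 8 for _ in range(8)]
    for a in range(8):
        table[0][a] = table[a][0] = (1, a)
    for i in range(1, 8):
        table[i][i] = (-1, 0)
    for i, j, k in LINES:
        for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
            table[a][b], table[b][a] = (1, c), (-1, c)
    return table


def restrict(table, units):
    """Multiplication table of the subalgebra spanned by the given basis units."""
    pos = {u: n for n, u in enumerate(units)}
    return [[(table[a][b][0], pos[table[a][b][1]]) for b in units] for a in units]


def rank(rows):
    """Exact rank of integer rows (dict: column -> coefficient) by row reduction."""
    pivots = {}
    for raw in rows:
        row = {c: v for c, v in raw.items() if v}
        while row:
            col = min(row)
            piv = pivots.get(col)
            if piv is None:
                pivots[col] = row
                break
            factor, lead = row[col], piv[col]
            new = {c: lead * v for c, v in row.items()}
            for c, v in piv.items():
                new[c] = new.get(c, 0) - factor * v
            g = reduce(gcd, new.values(), 0)          # keep the integers small
            row = {c: v // g for c, v in new.items() if v}
    return len(pivots)


def product_rule_rows(table, slots):
    """Equations X(ab) = Y(a)b + aZ(b) on basis elements; slots label (X, Y, Z).

    The unknown (s, c, a) is the e_c-component of map number s applied to e_a.
    """
    n = len(table)
    rows = []
    for a in range(n):
        for b in range(n):
            sign, m = table[a][b]
            terms = [(d, (slots[0], d, m), sign) for d in range(n)]
            for c in range(n):
                s1, d1 = table[c][b]                  # Y(e_a) e_b contains e_c e_b
                terms.append((d1, (slots[1], c, a), -s1))
                s2, d2 = table[a][c]                  # e_a Z(e_b) contains e_a e_c
                terms.append((d2, (slots[2], c, b), -s2))
            eqs = [{} for _ in range(n)]              # one equation per component
            for d, key, value in terms:
                eqs[d][key] = eqs[d].get(key, 0) + value
            rows.extend(eqs)
    return rows


def dim_der(table):
    """Dimension of the solution space of A(ab) = A(a)b + aA(b)."""
    n = len(table)
    return n * n - rank(product_rule_rows(table, (0, 0, 0)))


def dim_tri(table):
    """Dimension of the space of antisymmetric triples obeying (168)."""
    n = len(table)
    skew = [{(s, c, a): 1, (s, a, c): 1}
            for s in range(3) for a in range(n) for c in range(a, n)]
    return 3 * n * n - rank(skew + product_rule_rows(table, (0, 1, 2)))


O = octonion_table()
ALGEBRAS = {"R": restrict(O, [0]), "C": restrict(O, [0, 1]),
            "H": restrict(O, [0, 1, 2, 4]), "O": O}

if __name__ == "__main__":
    for name, table in ALGEBRAS.items():
        print(name, "dim tri =", dim_tri(table), " dim der =", dim_der(table))
```

The count for \({\mathbb {O}}\) equals \(\dim {\mathfrak {so}}(8)\), which is the dimensional shadow of the uniqueness of \(B\) and \(C\) in (170).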

One can regard the triality algebra as a generalised form of the derivation algebra,

$$\begin{aligned} \mathfrak {der}({\mathbb {A}})=\{A\in \mathrm{Hom}_{\mathbb {R}}({\mathbb {A}}) | A(ab) = A(a)b+aA(b)\}, \end{aligned}$$
(171)

which for \({\mathbb {A}}={\mathbb {O}}\) gives the smallest exceptional Lie algebra,

$$\begin{aligned} {\mathfrak {der}}({\mathbb {R}})&\cong \emptyset , \nonumber \\ {\mathfrak {der}}({\mathbb {C}})&\cong \emptyset ,\nonumber \\ {\mathfrak {der}}({\mathbb {H}})&\cong {\mathfrak {so}}(3),\nonumber \\ \mathfrak {der}({\mathbb {O}})&\cong \mathfrak {g}_{2(-14)}. \end{aligned}$$
(172)

This provides the first example of a division algebraic description of an exceptional Lie algebra. In fact, the entire magic square can be realised in terms of the division algebras. The magic square was the result of an effort to give a unified and geometrically motivated description of Lie algebras, including the remaining exceptional cases of \(\mathfrak {f}_4, \mathfrak {e}_6, \mathfrak {e}_7, \mathfrak {e}_8\). The classical Lie algebras \(\mathfrak {so}(n), \mathfrak {su}(n), \mathfrak {sp}(n)\) are very naturally captured by \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}\) geometrical structures, respectively. There are a number of ways of articulating this idea, but perhaps the most concise is in terms of the isometries of projective geometries. In particular, the isometry Lie algebras are:

$$\begin{aligned} \mathfrak {Isom}({\mathbb {R}}\mathbb {P}^n)\cong \mathfrak {so}(n+1), \quad \mathfrak {Isom}({\mathbb {C}}\mathbb {P}^n)\cong \mathfrak {su}(n+1), \quad \mathfrak {Isom}({\mathbb {H}}\mathbb {P}^n)\cong \mathfrak {sp}(n+1). \end{aligned}$$
(173)

This sequence is rather suggestive; can we continue it to include \(\mathfrak {Isom}({\mathbb {O}}\mathbb {P}^n)\)? Despite non-associativity it was shown by Moufang [360] that one can consistently construct the octonionic projective line and plane, \({\mathbb {O}}\mathbb {P}^1\) and \({\mathbb {O}}\mathbb {P}^2\). The latter is often referred to as the Cayley plane. However, we cannot go beyond \(n=2\) for the octonions,Footnote 34 which in this context reflects the fact that there is indeed just a finite set of exceptional Lie algebras not belonging to any countably infinite family. The \({\mathbb {O}}\mathbb {P}^1\) example is constructed in direct analogy with the real, complex and quaternionic cases.Footnote 35 It has \({\mathfrak {Isom}}({\mathbb {O}}\mathbb {P}^1) \cong {\mathfrak {so}}(8)\), so does not give us anything new. The octonionic plane has a more intricate structure. An element \((a, b, c)\in {\mathbb {O}}^3\) with \(\mathbf {n}(a)+\mathbf {n}(b)+\mathbf {n}(c)=1\) and \((ab)c=a(bc)\) gives a point in \({\mathbb {O}}\mathbb {P}^2\), the line through the origin containing \((a, b, c)\) in \({\mathbb {O}}^3\). It is not difficult to show the space of such elements is a 16-dimensional real manifold embedded in \({\mathbb {O}}^3\) through eight real constraints: \(\mathbf {n}(a)+\mathbf {n}(b)+\mathbf {n}(c)=1\) and \((ab)c=a(bc)\). The lines in \({\mathbb {O}}\mathbb {P}^2\) are copies of \({\mathbb {O}}\mathbb {P}^1\) and there is a duality relation sending lines/points into points/lines preserving the incidence structure. Borel showed that \(F_{4(-52)}\) is the isometry group of a 16-dimensional projective plane, which is none other than \({\mathbb {O}}\mathbb {P}^2\).
One can show that the points and lines in \({\mathbb {O}}\mathbb {P}^2\) are in one-to-one incidence preserving correspondence with trace 1 and 2 projectors in the Jordan algebra of \(3\times 3\) octonionic Hermitian matrices \(\mathfrak {J}_{3}({{\mathbb {O}}})\) (treating projectors as propositions, the incidence relation in \(\mathfrak {J}_{3}({{\mathbb {O}}})\) is given by implication) [361]. Then \(F_{4(-52)}=\text {Isom} ({\mathbb {O}}\mathbb {P}^2)\) follows automatically from the result of Chevalley and Schafer that \(F_{4(-52)}=\mathrm{Aut}(\mathfrak {J}_{3} ({{\mathbb {O}}}))\), the group preserving the Jordan product with Lie algebra \(\mathfrak {der}(\mathfrak {J}_{3}({{\mathbb {O}}}))\) [362]. In summary, the sequence in (173) is continued to include,

$$\begin{aligned} \mathfrak {Isom}({\mathbb {O}}\mathbb {P}^2)\cong \mathfrak {der} (\mathfrak {J}_{3}({{\mathbb {O}}}))\cong \mathfrak {f}_{4(-52)}. \end{aligned}$$
(174)

Since \(F_{4(-52)}\) acts transitively on the space of trace 1 projectors and the stabiliser of a given trace 1 projector is isomorphic to \(\mathrm{Spin}(9)\) we have,

$$\begin{aligned} {\mathbb {O}}\mathbb {P}^2\cong F_{4(-52)}/\mathrm{Spin}(9). \end{aligned}$$
(175)

The Cayley plane is a homogeneous symmetric space with \(T_p({\mathbb {O}}\mathbb {P}^2)\cong {\mathbb {O}}^2\), which carries the spinor representation of \(\mathrm{Spin}(9)\); under \(F_{4(-52)}\supset \mathrm{Spin}(9)\) we have

$$\begin{aligned} \mathfrak {f}_{4(-52)}&\cong \mathfrak {so}({\mathbb {R}}\oplus {\mathbb {O}})+{\mathbb {O}}^2\nonumber \\&\cong \mathfrak {so}({\mathbb {O}})+{\mathbb {O}}+{\mathbb {O}}+{\mathbb {O}}. \end{aligned}$$
(176)

The three \({\mathbb {O}}\) terms in the final line transform in the three triality related 8-dimensional representations of \(\mathfrak {so}(8)\), the vector, spinor and conjugate spinor. It is this triality relation which implies that \(\mathfrak {tri}({\mathbb {O}})\cong {\mathfrak {so}}({\mathbb {O}})\).

Seemingly inspired by the trivial identity \({\mathbb {O}}\cong {\mathbb {R}}\otimes {\mathbb {O}}\), Boris Rosenfeld [319] proposed a natural extension of this construction,

$$\begin{aligned} \mathfrak {Isom}(({\mathbb {C}}\otimes {\mathbb {O}})\mathbb {P}^2)&\cong \mathfrak {e}_{6(-78)},\nonumber \\ \mathfrak {Isom}(({\mathbb {H}}\otimes {\mathbb {O}})\mathbb {P}^2)&\cong \mathfrak {e}_{7(-133)},\nonumber \\ \mathfrak {Isom}(({\mathbb {O}}\otimes {\mathbb {O}})\mathbb {P}^2)&\cong \mathfrak {e}_{8(-248)}, \end{aligned}$$
(177)

thus giving a uniform geometric description for all Lie algebras. The would-be tangent spaces \(({\mathbb {A}}\otimes {\mathbb {O}})^2\) have the correct dimensions. However, it is not actually possible to construct projective spaces over \({\mathbb {H}}\otimes {\mathbb {O}}\) and \({\mathbb {O}}\otimes {\mathbb {O}}\) using the logic applied to \({\mathbb {O}}\mathbb {P}^2\), essentially because they do not yield Jordan algebras. They nonetheless can be identified with Riemannian geometries with isometries \(E_{7(-133)}\) and \(E_{8(-248)}\), respectively. Indeed, the Lie algebra decompositions,Footnote 36

$$\begin{aligned} \mathfrak {f}_{4(-52)}&\cong \mathfrak {so}({\mathbb {R}}\oplus {\mathbb {O}})+({\mathbb {R}}\otimes {\mathbb {O}})^2\nonumber \\ \mathfrak {e}_{6(-78)}&\cong \mathfrak {so}({\mathbb {C}}\oplus {\mathbb {O}})\oplus \mathfrak {u}(1) +({\mathbb {C}}\otimes {\mathbb {O}})^2\nonumber \\ \mathfrak {e}_{7(-133)}&\cong \mathfrak {so}({\mathbb {H}}\oplus {\mathbb {O}})\oplus \mathfrak {sp}(1) +({\mathbb {H}}\otimes {\mathbb {O}})^2\nonumber \\ \mathfrak {e}_{8(-248)}&\cong \mathfrak {so}({\mathbb {O}}\oplus {\mathbb {O}})+({\mathbb {O}}\otimes {\mathbb {O}})^2 \end{aligned}$$
(178)

naturally suggest the identifications

$$\begin{aligned} ({\mathbb {R}}\otimes {\mathbb {O}})\mathbb {P}^2&= F_{4(-52)}/\mathrm{Spin}(9)\nonumber \\ ({\mathbb {C}}\otimes {\mathbb {O}})\mathbb {P}^2&= E_{6(-78)}/[(\mathrm{Spin}(10)\times \mathrm{U}(1))/{\mathbb {Z}}_4]\nonumber \\ ({\mathbb {H}}\otimes {\mathbb {O}})\mathbb {P}^2&= E_{7(-133)}/[(\mathrm{Spin}(12)\times \mathrm{Sp}(1))/{\mathbb {Z}}_2]\nonumber \\ ({\mathbb {O}}\otimes {\mathbb {O}})\mathbb {P}^2&= E_{8(-248)}/[\mathrm{Spin}(16)/{\mathbb {Z}}_2] \end{aligned}$$
(179)

with tangent spaces \(({\mathbb {R}}\otimes {\mathbb {O}})^2, ({\mathbb {C}}\otimes {\mathbb {O}})^2, ({\mathbb {H}}\otimes {\mathbb {O}})^2, ({\mathbb {O}}\otimes {\mathbb {O}})^2\) carrying the appropriate spinor representations. Using the Tits’ construction [318] the isometry algebras are given by the natural generalisation of (174),

$$\begin{aligned} \mathfrak {f}_{4(-52)}&\cong \mathfrak {der}({\mathbb {R}})\oplus \mathfrak {der} (\mathfrak {J}_{3}({{\mathbb {O}}}))+\text {Im}{\mathbb {R}}\otimes \mathfrak {J}'_{3}({{\mathbb {O}}})\nonumber \\ \mathfrak {e}_{6(-78)}&\cong \mathfrak {der}({\mathbb {C}})\oplus \mathfrak {der} (\mathfrak {J}_{3}({{\mathbb {O}}}))+\text {Im}{\mathbb {C}}\otimes \mathfrak {J}'_{3}({{\mathbb {O}}})\nonumber \\ \mathfrak {e}_{7(-133)}&\cong \mathfrak {der}({\mathbb {H}})\oplus \mathfrak {der} (\mathfrak {J}_{3}({{\mathbb {O}}}))+\text {Im}{\mathbb {H}}\otimes \mathfrak {J}'_{3}({{\mathbb {O}}})\nonumber \\ \mathfrak {e}_{8(-248)}&\cong \mathfrak {der}({\mathbb {O}})\oplus \mathfrak {der} (\mathfrak {J}_{3}({{\mathbb {O}}}))+\text {Im}{\mathbb {O}} \otimes \mathfrak {J}'_{3}({{\mathbb {O}}}), \end{aligned}$$
(180)

where \(\mathfrak {J}'\) denotes the subset of traceless elements in \({\mathfrak {J}}\). Generalising further, the Tits’ construction defines a Lie algebra,

$$\begin{aligned} \mathfrak {ms}({\mathbb {A}}, \tilde{{\mathbb {A}}})_{\mathrm{compact}} :=\mathfrak {der}({\mathbb {A}})\oplus \mathfrak {der}(\mathfrak {J}_{3} (\tilde{{\mathbb {A}}}))+\text {Im}{\mathbb {A}}\otimes \mathfrak {J}'_{3}(\tilde{{\mathbb {A}}}), \end{aligned}$$
(181)

which yields the compact magic square given in Table 7. The “magic” is that Table 7 is symmetric about the diagonal despite the apparent asymmetry of (181).

Table 7 The magic square given by the Tits’ construction

To obtain a magic square with the non-compact real forms that follow from squaring Yang–Mills, as given in Table 6, one can use a Lorentzian Jordan algebra [324],

$$\begin{aligned} \mathfrak {ms}( {\mathbb {A}}, \tilde{{\mathbb {A}}} ):=\mathfrak {der}( {\mathbb {A}}) \oplus \mathfrak {der}(\mathfrak {J}_{1, 2}({\tilde{{\mathbb {A}}} })) +\text {Im} {\mathbb {A}}\otimes \mathfrak {J}'_{1, 2}(\tilde{{\mathbb {A}}} ). \end{aligned}$$
(182)
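
The symmetry of the square, despite the asymmetric roles of the two algebras in the Tits formula, is already visible at the level of dimensions. The sketch below tabulates the dimension of each entry of (181). It takes \(\dim \mathfrak {der}({\mathbb {A}})\) from (172) and assumes the standard values \(\dim \mathfrak {der}(\mathfrak {J}_3({\mathbb {A}}))=3, 8, 21, 52\) (the dimensions of \(\mathfrak {so}(3), \mathfrak {su}(3), \mathfrak {sp}(3), \mathfrak {f}_4\), matching (173) at \(n=2\) and (174)); the dictionary and function names are illustrative only.

```python
DIM = {"R": 1, "C": 2, "H": 4, "O": 8}
DER = {"R": 0, "C": 0, "H": 3, "O": 14}          # dim der(A), read off from (172)
# Assumed standard input: dim der(J_3(A)) for so(3), su(3), sp(3), f_4.
DER_J3 = {"R": 3, "C": 8, "H": 21, "O": 52}


def dim_traceless_j3(name):
    """3x3 Hermitian matrices over A: 3 real diagonal entries plus 3 entries in A, minus the trace."""
    return 3 + 3 * DIM[name] - 1


def dim_tits(left, right):
    """Dimension of der(A) + der(J_3(A~)) + Im A (x) J'_3(A~), as in (181)."""
    return DER[left] + DER_J3[right] + (DIM[left] - 1) * dim_traceless_j3(right)


SQUARE = [[dim_tits(a, b) for b in "RCHO"] for a in "RCHO"]

if __name__ == "__main__":
    for name, row in zip("RCHO", SQUARE):
        print(name, row)
```

The last row and column reproduce the dimensions 52, 78, 133 and 248 of (180). Equality of dimensions is of course only a consistency check, not a proof that the corresponding Lie algebras are isomorphic.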

The commutation relations are omitted here, as later we shall see that Yang–Mills squared gives an alternative form of (182), based on the Barton-Sudbery triality construction [196], that is manifestly symmetric in \({\mathbb {A}}, \tilde{{\mathbb {A}}}\) [136, 138], for which we will present the details in full. This symmetric form reflects the fact that the squaring procedure is itself symmetric on interchanging the left and right theories.

Division algebras and Yang–Mills theories: In the two previous sections we saw that the “square” of \(D=3\) super Yang–Mills theories and the “square” of division algebras both led to the magic square of Freudenthal. Surely this is no coincidence. Indeed, there is a long history of work connecting supersymmetry, spacetime and the division algebras [35, 187,188,189, 191, 195, 237, 244, 245, 324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339, 341,342,343, 346, 348, 352, 353, 356,357,358], which, as we shall review, underlies this magical meeting.

Perhaps the most direct link from division algebras to spacetime symmetries comes via the Lie algebra isomorphism of Sudbery [191],

$$\begin{aligned} {\mathfrak {sl}}(2, {\mathbb {A}})\cong {\mathfrak {so}}(1, 1+\dim {\mathbb {A}}), \end{aligned}$$
(183)

which identifies \(D=3,4,6,10\) as algebraically special. This is itself tied to the earlier observation of Kugo and Townsend [187] that the existence of minimal super Yang–Mills multiplets in only \(D=3,4,6,10\) is related to the uniqueness of \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\). This was followed up by a number of authors [192, 193, 363,364,365], sharpening the correspondence between supersymmetry and division algebras. The final case of \(D=10, {\mathbb {A}}={\mathbb {O}}\) was developed most carefully in [189], where the link between supersymmetry and the alternativity of \({\mathbb {O}}\) was emphasised.
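
For \({\mathbb {A}}={\mathbb {C}}\) the isomorphism (183) is the familiar statement that \(\mathrm{SL}(2,{\mathbb {C}})\) acts on \(2\times 2\) Hermitian matrices while preserving the determinant, which is the Minkowski norm in \(D=4\). The following sketch illustrates this standard fact numerically; the particular parametrisation of the Hermitian matrix and all function names are choices made here for illustration, not taken from the text.

```python
def hermitian(v):
    """Map (t, x, y, z) to the Hermitian matrix [[t + z, x - iy], [x + iy, t - z]]."""
    t, x, y, z = v
    return [[t + z, complex(x, -y)], [complex(x, y), t - z]]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]


def minkowski(v):
    """Minkowski norm with signature (+, -, -, -)."""
    t, x, y, z = v
    return t * t - x * x - y * y - z * z


def act(g, v):
    """X -> g X g^dagger for a 2x2 complex matrix g."""
    return matmul(matmul(g, hermitian(v)), dagger(g))


def spacetime_dimension(dim_a):
    """D = dim A + 2, the dimension singled out by (183)."""
    return dim_a + 2


def dim_so(p, q):
    """Dimension of so(p, q)."""
    n = p + q
    return n * (n - 1) // 2


if __name__ == "__main__":
    v = (3, 1, -2, 5)
    g = [[1, 2 + 1j], [0, 1]]                      # unit determinant
    print("Minkowski norm:", minkowski(v), " det X:", det(hermitian(v)))
    print("det after the SL(2, C) action:", det(act(g, v)))
    for n in (1, 2, 4, 8):
        print("dim A =", n, " D =", spacetime_dimension(n), " dim so(1, D-1) =", dim_so(1, n + 1))
```

The same determinant argument applies over \({\mathbb {R}}\) and \({\mathbb {H}}\); over \({\mathbb {O}}\) more care is needed because of non-associativity, which is not attempted in this sketch.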

Pulling together these ideas, it was shown in [190] that \({\mathcal {N}}\)-extended super Yang–Mills theories in \(D = n + 2\) dimensions are completely specified (the field content, Lagrangian and transformation rules) by selecting an ordered pair of division algebras: \({\mathbb {A}}_n\) for the spacetime dimension and \({\mathbb {A}}_{n{\mathcal {N}}}\) for the degree of supersymmetry, where the subscripts denote the dimension of the algebras. Consequently, the dual appearances of the magic square in \(D=3\), or equivalently for \({\mathbb {A}}_n={\mathbb {R}}\), can be explained by the observation that \(D = 3, {\mathcal {N}}= 1, 2, 4, 8\) Yang–Mills theories can be formulated with a single Lagrangian and a single set of transformation rules, using fields valued in \({\mathbb {R}}, {\mathbb {C}}, {\mathbb {H}}\) and \({\mathbb {O}}\), respectively [136]. Tensoring an \({\mathbb {A}}\)-valued \(D=3\) super Yang–Mills multiplet with an \(\tilde{{\mathbb {A}}}\)-valued \(D=3\) super Yang–Mills multiplet yields a \(D=3\) supergravity multiplet with fields valued in \({\mathbb {A}}\otimes \tilde{{\mathbb {A}}}\), making a magic square of U-dualities appear rather natural.
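
The labelling by an ordered pair of division algebras can be made concrete with a small lookup. The sketch below assumes only what is stated above, namely \(D=n+2\) with \(\dim {\mathbb {A}}_n=n\) and \(\dim {\mathbb {A}}_{n{\mathcal {N}}}=n{\mathcal {N}}\); the function names are hypothetical, and \({\mathcal {N}}\) is assumed to count supersymmetries in units of the minimal spinor in each dimension.

```python
ALGEBRA_BY_DIM = {1: "R", 2: "C", 4: "H", 8: "O"}


def division_algebras(D, N):
    """Names of (A_n, A_{nN}) for N-extended super Yang-Mills in D = n + 2 dimensions."""
    n = D - 2
    if n not in ALGEBRA_BY_DIM or n * N not in ALGEBRA_BY_DIM:
        raise ValueError(f"no division-algebraic description for D={D}, N={N}")
    return ALGEBRA_BY_DIM[n], ALGEBRA_BY_DIM[n * N]


def allowed_supersymmetries(D):
    """Values of N for which n * N is the dimension of a division algebra."""
    n = D - 2
    return [N for N in range(1, 9) if n * N in ALGEBRA_BY_DIM]


if __name__ == "__main__":
    for D in (3, 4, 6, 10):
        for N in allowed_supersymmetries(D):
            print(f"D={D}, N={N}:", division_algebras(D, N))
```

The number of allowed values of \({\mathcal {N}}\) drops from four in \(D=3\) to one in \(D=10\), which is the counting behind the \(4\times 4\), \(3\times 3\), \(2\times 2\) and \(1\times 1\) levels obtained when two such theories are tensored.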

As noted in [190], the overall (spacetime little group plus internal) symmetry of the \({\mathcal {N}}=1\) theory in \(D=n+2\) dimensions is given by the triality algebra, \({\mathfrak {tri}}({\mathbb {A}}_n)\). If we dimensionally reduce these theories we obtain super Yang–Mills with \({\mathcal {N}}\) supersymmetries whose overall symmetries are given by,

$$\begin{aligned} {\mathfrak {sym}}({\mathbb {A}}_n,{\mathbb {A}}_{n{\mathcal {N}}}):=\big \{(A,B,C)\in \mathfrak {tri} ({\mathbb {A}}_{n{\mathcal {N}}})| [A, {\mathfrak {so}}({\mathbb {A}}_n)_{ST}]=0, ~~\forall A\notin {\mathfrak {so}} ({\mathbb {A}}_n)_{ST} \big \}, \end{aligned}$$
(184)

where \({\mathfrak {so}}({\mathbb {A}}_n)_{ST}\) is the subalgebra of \(\mathfrak {so}({\mathbb {A}}_{n{\mathcal {N}}})\) that acts as orthogonal transformations on \({\mathbb {A}}_n\subseteq {\mathbb {A}}_{n{\mathcal {N}}}\). The division algebras used in each dimension and the corresponding \({\mathfrak {sym}}\) algebras are summarised in Table 8.

Table 8 A table of algebras: \({\mathfrak {sym}}({\mathbb {A}}_n,{\mathbb {A}}_{n{\mathcal {N}}})\)

Let us take \(D=3\) as a concrete example. The \({\mathcal {N}}=8\) Lagrangian is given by

$$\begin{aligned} {\mathcal {L}}&=\mathrm{tr}\left( \tfrac{1}{2}F\wedge \star F-\tfrac{1}{2} D\varphi _i\wedge \star D\varphi _i+i\bar{\lambda }_a\not {D} \lambda _a \right. \nonumber \\&\quad \left. -\tfrac{1}{4}g^2 [\varphi _i,\varphi _j] [\varphi _i,\varphi _j] -g\bar{\lambda }^{a}\Gamma ^i_{ab}[\varphi _i, \lambda ^{b}]\right) , \end{aligned}$$
(185)

where \(\Gamma ^i_{ab}\), \(i=1,\ldots ,7\), \(a,b=0,\ldots ,7\), belongs to the SO(7) Clifford algebra. The key observation is that this gamma matrix can be represented by the \({\mathbb {A}}\) structure constants \(C_{abc}\),

$$\begin{aligned} \Gamma ^i_{ab}=i(\delta _{bi}\delta _{a0}-\delta _{b0}\delta _{ai}+C_{iab}), \end{aligned}$$
(186)

which allows us to rewrite the \({\mathcal {N}}=1,2,4,8\) action in terms of a single expression defined over \({\mathbb {R}},{\mathbb {C}},{\mathbb {H}},{\mathbb {O}}\):

$$\begin{aligned} {\mathcal {L}}=\mathrm{tr}\left( \tfrac{1}{2}F\wedge \star F-\tfrac{1}{2} D\overline{\varphi }\wedge \star D\varphi +i\bar{\lambda }\not {D} \lambda -\tfrac{1}{4}g^2 \langle [\varphi ,\varphi ] | [\varphi ,\varphi ] \rangle -g [\bar{\lambda }, \varphi , \lambda ]\right) , \end{aligned}$$
(187)

where \(\varphi =\varphi ^i e_i\) is an \(\hbox {Im}\mathbb {A}\)-valued scalar field, \(\lambda =\lambda ^a e_a\) is an \(\mathbb {A}\)-valued two-component spinor and \(\bar{\lambda }=\bar{\lambda }^ae_a^*\).
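
The claim that (186) furnishes gamma matrices can be tested directly: the seven \(8\times 8\) matrices should be Hermitian and satisfy \(\{\Gamma ^i,\Gamma ^j\}=2\delta ^{ij}\). The sketch below performs this check for \({\mathbb {A}}={\mathbb {O}}\). It assumes the Fano-plane convention \((i, i+1, i+3)\) mod 7 for \(C_{ijk}\), since Fig. 2 is not reproduced here; the helper names are illustrative.

```python
# Assumption: oriented Fano lines (i, i+1, i+3) mod 7, consistent with e_2 e_3 = e_5.
LINES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def structure_constants():
    """Totally antisymmetric C[a][b][c], vanishing whenever an index is 0."""
    C = [[[0] * 8 for _ in range(8)] for _ in range(8)]
    for i, j, k in LINES:
        for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
            C[a][b][c] = 1
            C[b][a][c] = -1
    return C


def gamma(i, C):
    """Gamma^i_{ab} = i (delta_{bi} delta_{a0} - delta_{b0} delta_{ai} + C_{iab}), as in (186)."""
    d = lambda p, q: int(p == q)
    return [[1j * (d(b, i) * d(a, 0) - d(b, 0) * d(a, i) + C[i][a][b])
             for b in range(8)] for a in range(8)]


def matmul(x, y):
    return [[sum(x[a][c] * y[c][b] for c in range(8)) for b in range(8)] for a in range(8)]


def anticommutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[a][b] + yx[a][b] for b in range(8)] for a in range(8)]


def is_hermitian(m):
    return all(m[a][b] == m[b][a].conjugate() for a in range(8) for b in range(8))


def is_clifford(gammas):
    """True if {Gamma^i, Gamma^j} = 2 delta^{ij} times the identity."""
    for i, gi in enumerate(gammas):
        for j, gj in enumerate(gammas):
            ac = anticommutator(gi, gj)
            target = 2 if i == j else 0
            if any(ac[a][b] != (target if a == b else 0)
                   for a in range(8) for b in range(8)):
                return False
    return True


GAMMAS = [gamma(i, structure_constants()) for i in range(1, 8)]

if __name__ == "__main__":
    print("Hermitian:", all(is_hermitian(g) for g in GAMMAS))
    print("Clifford algebra:", is_clifford(GAMMAS))
```

Up to the factor of \(i\) and a sign, the matrix in (186) is left multiplication by \(e_i\), so the Clifford relation is a restatement of the alternativity of \({\mathbb {O}}\).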

Now consider the product of two division algebraic multiplets, where \({\mathcal {N}}=\dim {\mathbb {A}}\), \(A_\mu \in \text {Re}{\mathbb {A}}, \varphi \in \text {Im}{\mathbb {A}}, \lambda \in {\mathbb {A}}\) and similarly for the right theory. We obtain the field content of an \(({\mathcal {N}}+{\tilde{{\mathcal {N}}}})\)-extended supergravity theory valued in both \({\mathbb {A}}\) and \(\tilde{{\mathbb {A}}}\):

$$\begin{aligned} g_{\mu \nu } \in {\mathbb {R}}, \quad \Psi _{\mu } \in \begin{pmatrix} {\mathbb {A}}\\ \tilde{{\mathbb {A}}} \end{pmatrix}, \quad \varphi ,\chi \in \begin{pmatrix} {\mathbb {A}}\otimes \tilde{{\mathbb {A}}} \\ {\mathbb {A}}\otimes \tilde{{\mathbb {A}}} \end{pmatrix}. \end{aligned}$$
(188)

The \({\mathbb {R}}\)-valued graviton and \({\mathbb {A}}\oplus \tilde{{\mathbb {A}}}\)-valued gravitino carry no degrees of freedom. The \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})^2\)-valued scalar and Majorana spinor each have \(2(\dim {\mathbb {A}}\times \dim \tilde{{\mathbb {A}}})\) degrees of freedom.

The \({\mathcal {H}}\) algebra then follows immediately in this division algebraic language. The left and right factors each come with a commuting copy of the triality algebra, \({\mathfrak {tri}}({\mathbb {A}})\oplus {\mathfrak {tri}}(\tilde{{\mathbb {A}}})\). However, the \({\mathbb {A}}\otimes \tilde{{\mathbb {A}}}\) doublets in (188) form irreducible representations of R-symmetry. The corresponding generators must themselves transform under \({\mathfrak {tri}}({\mathbb {A}})\oplus {\mathfrak {tri}}(\tilde{{\mathbb {A}}})\) consistently, implying they are elements of \({\mathbb {A}}\otimes \tilde{{\mathbb {A}}}\). Formally, these generators arise from the left/right supersymmetries \(Q\otimes \tilde{Q}\), giving

$$\begin{aligned} {\mathfrak {h}}({\mathbb {A}}, \tilde{{\mathbb {A}}}) :=\underbrace{{\mathfrak {tri}} ({\mathbb {A}})}_{\mathrm{Left}\ \mathrm{global}\ \mathrm{symmetries}}\oplus \underbrace{{\mathfrak {tri}} (\tilde{{\mathbb {A}}})}_{\mathrm{Right}\ \mathrm{global}\ \mathrm{symmetries}} +\underbrace{{\mathbb {A}}\otimes \tilde{{\mathbb {A}}}}_{Q\otimes \tilde{Q}}. \end{aligned}$$
(189)

This follows from the observation that \(Q\otimes \tilde{Q}\in {\mathbb {A}}\otimes \tilde{{\mathbb {A}}}\). Recall, these are “pseudo-supersymmetry” transformations since they do not change the mass dimension of the component fields. This Lie algebra yields the maximal compact subalgebras of the corresponding non-compact global symmetries of the magic square, as given in Table 9.

Table 9 Magic square of maximal compact subalgebras

The U-dualities \({\mathcal {G}}\) are realised non-linearly on the scalars, which parametrise the symmetric spaces \({\mathcal {G}}/{\mathcal {H}}\). This can be understood using the identity relating \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})^2\) to \({\mathcal {G}}/{\mathcal {H}}\),

$$\begin{aligned} ({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})\mathbb {P}^2 \cong {\mathcal {G}}/{\mathcal {H}}. \end{aligned}$$
(190)

The scalar fields may be regarded as points in division-algebraic projective planes. The tangent space \(T_{p}({\mathcal {G}}/{\mathcal {H}})\cong \mathfrak {p}=\mathfrak {g}\ominus \mathfrak {h}\) implies the scalars carry the \(\mathfrak {p}\)-representation of \({\mathcal {H}}\). The tangent space at any point of \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})\mathbb {P}^2\) is just \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})^2\), the required representation space of \({\mathcal {H}}\). Since \({\mathcal {G}}/{\mathcal {H}}\) is a symmetric space, the U-duality Lie algebra is given by adjoining the scalar representation space \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})^2\) to (189),

$$\begin{aligned} {\mathfrak {ms}}({\mathbb {A}}, \tilde{{\mathbb {A}}}):=\underbrace{{\mathfrak {tri}}({\mathbb {A}}) \oplus {\mathfrak {tri}}(\tilde{{\mathbb {A}}})+({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})}_{{\mathfrak {h}} ({\mathbb {A}}, \tilde{{\mathbb {A}}})}+\underbrace{({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})^2}_{\mathrm{``scalars}\text {''}}. \end{aligned}$$
(191)

This has a \({\mathbb {Z}}_2\times {\mathbb {Z}}_2\) graded Lie algebra structure uniquely determined by the left/right super Yang–Mills factors and yields precisely the magic square [138].
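
A quick dimension count shows how (189) and (191) fit together. The sketch below uses only \(\dim {\mathfrak {tri}}({\mathbb {A}})\) from (169) and \(\dim {\mathbb {A}}\); the names `dim_h` and `dim_ms` are illustrative. It checks that the count is symmetric under exchange of the two algebras and that the difference \(\dim \mathfrak {g}-\dim \mathfrak {h}\) equals the number of scalars, \(2\dim {\mathbb {A}}\dim \tilde{{\mathbb {A}}}\).

```python
DIM = {"R": 1, "C": 2, "H": 4, "O": 8}
TRI = {"R": 0, "C": 2, "H": 9, "O": 28}      # dim tri(A), read off from (169)


def dim_h(left, right):
    """Dimension of tri(A) + tri(A~) + A (x) A~, as in (189)."""
    return TRI[left] + TRI[right] + DIM[left] * DIM[right]


def dim_ms(left, right):
    """Dimension of (191): h(A, A~) plus the scalar space (A (x) A~)^2."""
    return dim_h(left, right) + 2 * DIM[left] * DIM[right]


H_SQUARE = [[dim_h(a, b) for b in "RCHO"] for a in "RCHO"]
MS_SQUARE = [[dim_ms(a, b) for b in "RCHO"] for a in "RCHO"]

if __name__ == "__main__":
    for name, h_row, ms_row in zip("RCHO", H_SQUARE, MS_SQUARE):
        print(name, "dim h:", h_row, " dim ms:", ms_row)
```

For \({\mathbb {A}}=\tilde{{\mathbb {A}}}={\mathbb {O}}\) this gives \(120+128=248\), the familiar split of \(\mathfrak {e}_8\) into \(\mathfrak {so}(16)\) and its spinor. As with any dimension count, this is a consistency check rather than a derivation of the Lie algebra structure.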

Let us describe how this formula works. For an element \(A\in \mathfrak {so}({\mathbb {A}})\) define \(\sigma {A}\equiv \tau A\tau ^{-1}\in \mathfrak {so}({\mathbb {A}})\). Then

$$\begin{aligned} \theta : {\mathfrak {tri}}({\mathbb {A}}) \rightarrow {\mathfrak {tri}}({\mathbb {A}}) : (A, B, C)\mapsto (\sigma {B}, C, \sigma {A}), \end{aligned}$$
(192)

is an order three Lie algebra automorphism, which for \({\mathbb {A}}={\mathbb {O}}\) interchanges the three inequivalent 8-dimensional representations of \(\mathfrak {so}({\mathbb {O}})\).
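
That \(\theta \) has order three can be seen by iterating (192). The short computation below assumes that \(\sigma \) is an involution, \(\sigma ^2=1\), as would follow if \(\tau ^2=1\); the text does not state this explicitly, so it should be read as an assumption of this sketch:

$$\begin{aligned} (A, B, C)\;{\mathop {\longmapsto }\limits ^{\theta }}\;(\sigma B, C, \sigma A)\;{\mathop {\longmapsto }\limits ^{\theta }}\;(\sigma C, \sigma A, \sigma ^2 B)\;{\mathop {\longmapsto }\limits ^{\theta }}\;(\sigma ^2 A, \sigma ^2 B, \sigma ^2 C)=(A, B, C). \end{aligned}$$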

Given two normed division algebras \({\mathbb {A}}\) and \(\tilde{{\mathbb {A}}}\) we can define on

$$\begin{aligned} {\mathfrak {ms}}({\mathbb {A}}, \tilde{{\mathbb {A}}})=[{\mathfrak {tri}}({\mathbb {A}})\oplus {\mathfrak {tri}}(\tilde{{\mathbb {A}}})]_{00} +({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})_{01}+({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})_{10} +({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})_{11} \end{aligned}$$
(193)

a \({\mathbb {Z}}_2\times {\mathbb {Z}}_2\) graded Lie algebra structure. First, \({\mathfrak {tri}}({\mathbb {A}})\) and \({\mathfrak {tri}}(\tilde{{\mathbb {A}}})\) are Lie subalgebras. For elements \(T=(A, \sigma B, \sigma C)\) in \(\mathfrak {tri}({\mathbb {A}})\) and \((a\otimes b, 0, 0)\), \((0, a\otimes b, 0)\), and \((0, 0, a\otimes b)\) in \(3({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})\), the commutators are given by the natural action of \({\mathfrak {tri}}({\mathbb {A}})\),

$$\begin{aligned} {[}T, (a\otimes b, 0, 0)]&=(A(a) \otimes b, 0, 0),\nonumber \\ {[}T, (0, a\otimes b, 0)]&=(0, B(a)\otimes b, 0),\nonumber \\ {[}T, (0, 0, a\otimes b)]&=(0, 0, C(a)\otimes b). \end{aligned}$$
(194)

Similarly for \(\tilde{T}=(\tilde{A}, \sigma \tilde{B}, \sigma \tilde{C})\) in \(\mathfrak {tri}(\tilde{{\mathbb {A}}})\),

$$\begin{aligned} {[}\tilde{T}, (a\otimes b, 0, 0)]&=(a \otimes \tilde{A}(b), 0, 0),\nonumber \\ {[}\tilde{T}, (0, a\otimes b, 0)]&=(0, a\otimes \tilde{B}(b), 0),\nonumber \\ {[}\tilde{T}, (0, 0, a\otimes b)]&=(0, 0, a\otimes \tilde{C}(b)). \end{aligned}$$
(195)

For two elements belonging to the same summand \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})_{ij}\) in (193) the commutators are given by

$$\begin{aligned} {[}(a\otimes b, 0, 0), (a'\otimes b', 0, 0)]&=\quad \langle {a}, {a'}\rangle \ \ \tilde{T}_{b,b'}+\langle {b}, {b'}\rangle T_{a,a'},\nonumber \\ {[}(0,a\otimes b, 0), (0, a'\otimes b', 0)]&=-\langle {a}, {a'}\rangle \ \theta \tilde{T}_{b,b'} -\langle {b}, {b'}\rangle \theta T_{a,a'},\nonumber \\ {[}(0, 0, a\otimes b), (0, 0, a'\otimes b')]&=-\langle {a}, {a'}\rangle \theta ^2 \tilde{T}_{b,b'}-\langle {b}, {b'}\rangle \theta ^2 T_{a,a'}, \end{aligned}$$
(196)

where

$$\begin{aligned} T_{a,a'}:=(S_{a,a'}, R_{a'}R_{\overline{a}}-R_{a}R_{\overline{a}'}, L_{a'}L_{\overline{a}}-L_{a}L_{\overline{a}'}), \end{aligned}$$
(197)

and

$$\begin{aligned} S_{a,a'}(b)=\langle {a}, {b}\rangle a'-\langle {a'}, {b}\rangle a,\quad L_{a}(b)=ab,\quad R_{a}(b)=ba. \end{aligned}$$
(198)

Finally, we have

$$\begin{aligned} {[}(a\otimes b, 0, 0), (0, a'\otimes b', 0)]&=(0, 0, \overline{aa'}\otimes \overline{bb'}),\nonumber \\ {[}(0,0,a\otimes b), ( a'\otimes b', 0,0)]&=(0, \overline{aa'} \otimes \overline{bb'},0),\nonumber \\ {[}(0, a\otimes b, 0), (0, 0, a'\otimes b')]&=-(\overline{aa'} \otimes \overline{bb'},0,0). \end{aligned}$$
(199)

With these commutators the magic square formula (193) describes the Lie algebras of Table 6. The formula (191) is based on the triality construction described in [196]. Although the two constructions are isomorphic as vector spaces, they have different Lie algebra structures, as reflected in the distinct real forms appearing in each case. We see that we truncate to the maximal compact subalgebra (189) by discarding any two of the three summands \(({\mathbb {A}}\otimes \tilde{{\mathbb {A}}})_{ij}\).

For \(D=n+2\), we begin with a pair of Yang–Mills theories with \({\mathcal {N}}\) and \({\tilde{{\mathcal {N}}}}\) supersymmetries written over the division algebras \({\mathbb {A}}_{n{\mathcal {N}}}\) and \({\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}\), respectively, as described in [190]. In terms of spacetime little group representations we may then write all the bosons of the left (right) theory as a single element \(b\in {\mathbb {A}}_{n{\mathcal {N}}}\) (\(\tilde{b}\in {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}\)), and similarly for the fermions \(f\in {\mathbb {A}}_{n{\mathcal {N}}}\) (\(\tilde{f}\in {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}\)). After tensoring we arrange the resulting supergravity fields into a bosonic doublet and a fermionic doublet,

$$\begin{aligned} B= \begin{pmatrix} b\otimes \tilde{b} \\ f\otimes \tilde{f} \end{pmatrix},\quad F =\begin{pmatrix} b\otimes \tilde{f} \\ f\otimes \tilde{b} \end{pmatrix}, \end{aligned}$$
(200)

just as we did in \(D=3\). The algebra (189) acts naturally on these doublets. However, a diagonal \({\mathfrak {so}}({\mathbb {A}}_n)_{ST}\) subalgebra of this corresponds to spacetime transformations, so we must restrict \({\mathfrak {h}}({\mathbb {A}}_{n{\mathcal {N}}},{\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}})\) to the subalgebra that commutes with \({\mathfrak {so}}({\mathbb {A}}_n)_{ST}\). Heuristically, we identify a diagonal spacetime subalgebra \({\mathbb {A}}_n\) in \({\mathbb {A}}_{n{\mathcal {N}}}\otimes {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}\) and require that it is preserved by the global isometries, which picks out a subset in \({\mathfrak {Isom}}(({\mathbb {A}}_{n{\mathcal {N}}}\otimes {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}})\mathbb {P}^2)\). Imposing this condition selects the U-duality algebra of the \(D=n+2\), \(({\mathcal {N}}+{\tilde{{\mathcal {N}}}})\)-extended supergravity theory obtained by tensoring \({\mathcal {N}}\) and \({\tilde{{\mathcal {N}}}}\) super Yang–Mills theories. The Lie algebras are given by the magic pyramid formula:

$$\begin{aligned} \mathfrak {mp}({\mathbb {A}}_{n}, {\mathbb {A}}_{n{\mathcal {N}}}, {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}}):=\left\{ u\in {\mathfrak {ms}} ({\mathbb {A}}_{n{\mathcal {N}}}, {\mathbb {A}}_{n{\tilde{{\mathcal {N}}}}})\ominus {\mathfrak {so}}({\mathbb {A}}_n)_{ST}\Big |[u,{\mathfrak {so}} ({\mathbb {A}}_n)_{ST}]=0\right\} . \end{aligned}$$
(201)

The terminology is made clear by the pyramid of corresponding U-duality groups presented in Fig. 1. The base of the pyramid in \(D=3\) is the \(4 \times 4\) Freudenthal magic square, while the higher levels are comprised of a \(3 \times 3\) square in \(D=4\), a \(2 \times 2\) square in \(D=6\) and Type II supergravity at the apex in \(D=10\). Note, in [313] the oxidation of \({\mathcal {N}}\)-extended \(D=3\) dimensional supergravity theories was shown to generate a partially symmetric “trapezoid” of non-compact global symmetries for \(D=3,4,\ldots 11\) and \(0, 2^0, 2^1,\ldots 2^7\) supercharges. A subset of algebras in the trapezoid with \(D=3,4,5\) and \(2^5, 2^6, 2^7\) supercharges matches the \(D=3,4,5\) and \({\mathbb {A}}={\mathbb {C}}, {\mathbb {H}}, {\mathbb {O}}\) exterior wall of the pyramid of Fig. 1.

To illustrate the principles of the magic pyramid let us consider the simplest example in \(D=4\), the product of two \({\mathcal {N}}=1\) Yang–Mills multiplets \((A_\mu ,\lambda )\), which must yield \({\mathcal {N}}=2\) supergravity coupled to one hypermultiplet. This follows from state counting and supersymmetry alone, but the actual coupling is not fixed. By determining the symmetry this ambiguity is resolved (assuming, as before, a homogeneous scalar manifold). The left/right Yang–Mills on-shell multiplets are represented by the complex numbers (helicity states):

$$\begin{aligned} A,~\lambda \in {\mathbb {C}}, \quad \tilde{{A}},~\tilde{{\lambda }} \in \tilde{{{\mathbb {C}}}}. \end{aligned}$$
(202)

Collecting the bosonic/fermionic states, the product gives us the \(({\mathbb {C}}\otimes {\mathbb {C}})^2\) valued objects:

$$\begin{aligned} B=\begin{pmatrix} A \otimes \tilde{{A}} \\ \lambda \otimes \tilde{{\lambda }} \end{pmatrix}~~\text {and}~~~ F=\begin{pmatrix} A \otimes \tilde{{\lambda }} \\ \lambda \otimes \tilde{{A}} \end{pmatrix}. \end{aligned}$$
(203)

Let us consider the \(D=3\) maximal compact algebra

$$\begin{aligned} {\mathfrak {h}}({\mathbb {C}}, \tilde{{{\mathbb {C}}}}):={\mathfrak {tri}}({\mathbb {C}})\oplus {\mathfrak {tri}}(\tilde{{{\mathbb {C}}}})+{\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}, \end{aligned}$$
(204)

in this representation. To describe the generators acting on \(({\mathbb {C}}\otimes {\mathbb {C}})^2\) it is convenient to define the quantities

$$\begin{aligned} 1_{\pm }:=\frac{1}{2}(1\otimes 1\mp i\otimes i), ~~~~i_{\pm }:=\frac{1}{2}(i\otimes 1\pm 1\otimes i), \end{aligned}$$
(205)

which form two orthogonal copiesFootnote 37 of \({\mathbb {C}}\):

$$\begin{aligned} 1_{\pm }^2=1_{\pm },~~1_{\pm }i_{\pm }=i_{\pm },~~i_{\pm }^2=-1_{\pm },\quad 1_{\pm }1_{\mp }=0,~~1_{\pm }i_{\mp }=0,~~i_{\pm }i_{\mp }=0. \end{aligned}$$
(206)
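
The relations (206) follow from (205) by multiplying out in \({\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}\), and can be confirmed mechanically. The sketch below stores an element as a dictionary of coefficients over the basis \(u_p\otimes u_q\) with \(u_0=1\), \(u_1=i\); this representation and the names `mul`, `ONE`, `IMAG` and `neg` are choices made here for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def mul(x, y):
    """Product in C (x) C for elements stored as {(p, q): coeff} over u_p (x) u_q."""
    out = {}
    for (p, q), cx in x.items():
        for (r, s), cy in y.items():
            # u_1 u_1 = -u_0 in each tensor factor
            sign = (-1 if p == r == 1 else 1) * (-1 if q == s == 1 else 1)
            key = ((p + r) % 2, (q + s) % 2)
            out[key] = out.get(key, 0) + sign * cx * cy
    return {k: v for k, v in out.items() if v != 0}


def neg(x):
    return {k: -v for k, v in x.items()}


# The elements of (205); the keys +1 and -1 label the subscripts + and -.
ONE = {+1: {(0, 0): HALF, (1, 1): -HALF}, -1: {(0, 0): HALF, (1, 1): HALF}}
IMAG = {+1: {(1, 0): HALF, (0, 1): HALF}, -1: {(1, 0): HALF, (0, 1): -HALF}}

if __name__ == "__main__":
    for s in (+1, -1):
        label = "+" if s > 0 else "-"
        print(f"1_{label}^2 == 1_{label}:", mul(ONE[s], ONE[s]) == ONE[s])
        print(f"i_{label}^2 == -1_{label}:", mul(IMAG[s], IMAG[s]) == neg(ONE[s]))
        print(f"i_{label} times the opposite i vanishes:", mul(IMAG[s], IMAG[-s]) == {})
```

The vanishing products between the two sectors are the zero divisors of \({\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}\), so any element splits uniquely into a \(+\) part and a \(-\) part.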

A basis for \({\mathfrak {tri}}({\mathbb {C}})\cong \mathfrak {so}(2)\oplus \mathfrak {so}(2)\) is given by the \({\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}\)-valued \(2\times 2\) matrices

$$\begin{aligned} i_+\mathbb {1}, \quad i_+\sigma ^1 , \end{aligned}$$
(207)

while those of \({\mathfrak {tri}}(\tilde{{{\mathbb {C}}}})\) are given by \(i_-\mathbb {1},\;i_-\sigma ^1\). The generators of the \({\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}\) term are similarly given by

$$\begin{aligned} 1_+\varepsilon ,\quad i_+\sigma ^3, \quad 1_-\varepsilon ,\quad i_-\sigma ^3. \end{aligned}$$
(208)

It is straightforward to verify that these matrices generate \({\mathfrak {su}}(2)\times {\mathfrak {su}}(2)\times {\mathfrak {u}}(1)\times {\mathfrak {u}}(1) \cong {\mathfrak {so}}(4)\times {\mathfrak {so}}(2)\times {\mathfrak {so}}(2)\), as stated in the compact sub-magic square in Table 9.

Thus far this is just the \(D=3\) analysis. But recall, in \(D=4\) a diagonal \({\mathbb {C}}\subset {\mathbb {C}}\otimes \tilde{{{\mathbb {C}}}}\) is identified with spacetime. The left Yang–Mills multiplet transforms under its spacetime little group algebra \({\mathfrak {u}}(1)\) acting on \({\mathbb {C}}\) as

$$\begin{aligned} \delta _{\theta } A=i\theta A,~~~~~\delta _{\theta } \lambda =\frac{1}{2}i\theta \lambda , \end{aligned}$$
(209)

(together with the complex conjugates) and similarly for the right multiplet, with \(\tilde{{\mathfrak {u}}}(1)\) acting on \(\tilde{{{\mathbb {C}}}}\) with parameter \(\tilde{{\theta }}\). Focussing on the fermion doublet and identifying the left/right spacetimes, \(\theta =\tilde{{\theta }}\), we find

$$\begin{aligned} \delta _\theta F=\theta \left( \frac{3}{2}i_+\mathbb {1}+\frac{1}{2}i_-\sigma ^3\right) F. \end{aligned}$$
(210)
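
The intermediate step may be spelled out as follows. Inverting (205) gives \(i\otimes 1=i_++i_-\) and \(1\otimes i=i_+-i_-\), so that acting with (209) on each component of \(F\) in (203), with \(\theta =\tilde{{\theta }}\), yields

$$\begin{aligned} \delta _\theta (A\otimes \tilde{{\lambda }})&=\theta \left( i\otimes 1+\tfrac{1}{2}\, 1\otimes i\right) (A\otimes \tilde{{\lambda }})=\theta \left( \tfrac{3}{2}i_++\tfrac{1}{2}i_-\right) (A\otimes \tilde{{\lambda }}),\nonumber \\ \delta _\theta (\lambda \otimes \tilde{{A}})&=\theta \left( \tfrac{1}{2}\, i\otimes 1+1\otimes i\right) (\lambda \otimes \tilde{{A}})=\theta \left( \tfrac{3}{2}i_+-\tfrac{1}{2}i_-\right) (\lambda \otimes \tilde{{A}}). \end{aligned}$$

The relative sign of the \(i_-\) terms is what the \(\sigma ^3\) in (210) records, assuming the standard convention \(\sigma ^3=\mathrm{diag}(1,-1)\).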

Let us unravel what is happening here. Focusing on the fermions we see that the positive helicity spin-\(\tfrac{3}{2}\) and spin-\(\tfrac{1}{2}\) states belong to \(i_+\) and \(i_-\) sectors, just as one would anticipate. This is only consistent because \({\mathbb {C}}\otimes {\mathbb {C}}\) is not a division algebra; it contains zero divisors and we have \(i_+i_-=0\). The role of the failure of the division property here is to ensure that each component in the multiplet transforms with the correct helicity and only the correct helicity. Having identified the spacetime little group generator, the remaining internal symmetries are determined. All the matrices commute with \(i_+\mathbb {1}\), but \(i_-\sigma ^1\) and \(1_-\varepsilon \) do not commute with \(i_-\sigma ^3\), so we are forced to discard these generators, leaving the subalgebra

$$\begin{aligned} {\mathfrak {u}}(1)\oplus {\mathfrak {u}}(1)\oplus {\mathfrak {su}}(2). \end{aligned}$$
(211)

This is the maximal compact subalgebra of the corresponding \(D=4, {\mathcal {N}}=2\) (or \({\mathbb {C}}, {\mathbb {C}}, {\mathbb {C}}\)) entry in the pyramid, as given in Table 9. Acting on the gravitino with these generators we find it transforms as a doublet but, again because \(i_+\) annihilates \(i_-\), the spin-\(\frac{1}{2}\) fields are singlets, as required in the supergravity theory. The Yang–Mills R-symmetries have been absorbed into the U-duality group. A similar analysis for the bosonic fields in the theory shows that we do indeed obtain a graviton, a vector and two scalars, which transform as a singlet, a singlet and a doublet under the \({\mathfrak {su}}(2)\), as required.
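The zero-divisor property invoked above can be checked by hand or by machine. The following is a minimal sketch, not taken from the original analysis. It assumes the definition \(i_\pm =\tfrac{1}{2}(i\otimes 1\pm 1\otimes i)\) and assumes that the fermion doublet is ordered as \(F=(A\otimes \tilde{\lambda }, \lambda \otimes \tilde{A})\); the helper names `elem`, `mul`, `add` and `scale` are purely illustrative.

```python
from fractions import Fraction
from itertools import product

# Real basis of C (x) C, labelled (left, right) with 0 -> 1 and 1 -> i:
# (0,0) = 1(x)1, (1,0) = i(x)1, (0,1) = 1(x)i, (1,1) = i(x)i.
BASIS = [(0, 0), (1, 0), (0, 1), (1, 1)]


def elem(one=0, i_left=0, i_right=0, ii=0):
    """Build an element of C (x) C from its four real coefficients."""
    return dict(zip(BASIS, map(Fraction, (one, i_left, i_right, ii))))


def mul(x, y):
    """Multiply two elements using (a (x) b)(c (x) d) = ac (x) bd."""
    out = elem()
    for (a1, a2), (b1, b2) in product(BASIS, repeat=2):
        # In each tensor factor i * i = -1; otherwise the labels combine.
        sign = (-1 if a1 and b1 else 1) * (-1 if a2 and b2 else 1)
        out[(a1 ^ b1, a2 ^ b2)] += sign * x[(a1, a2)] * y[(b1, b2)]
    return out


def add(x, y):
    return {b: x[b] + y[b] for b in BASIS}


def scale(c, x):
    return {b: Fraction(c) * x[b] for b in BASIS}


half = Fraction(1, 2)
i_plus = elem(i_left=half, i_right=half)    # assumed: (i(x)1 + 1(x)i)/2
i_minus = elem(i_left=half, i_right=-half)  # assumed: (i(x)1 - 1(x)i)/2
zero = elem()

# Zero divisors: neither factor vanishes, yet their product does.
assert i_plus != zero and i_minus != zero
assert mul(i_plus, i_minus) == zero

# u(1) charges with theta = theta-tilde (assumed doublet ordering):
# A (x) lambda-tilde carries i(x)1 + (1/2) 1(x)i, lambda (x) A-tilde the mirror image.
upper = elem(i_left=1, i_right=half)
lower = elem(i_left=half, i_right=1)
assert upper == add(scale(Fraction(3, 2), i_plus), scale(half, i_minus))
assert lower == add(scale(Fraction(3, 2), i_plus), scale(-half, i_minus))
```

Under these assumptions the two diagonal entries reproduce the \(\tfrac{3}{2}i_+\mathbb {1}+\tfrac{1}{2}i_-\sigma ^3\) structure of (210), with the relative sign of the \(i_-\) term supplied by \(\sigma ^3\).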

Applying the same principles to the non-compact global symmetries we arrive at the complete pyramid given in Fig. 1. Let us just illustrate the point by returning to our discussion of \({\mathcal {N}}=8\) supergravity in \(D=4\), which we expect to yield \(E_{7(7)}\). Just as for the case treated above we need to identify the \({\mathbb {C}}\subset {\mathbb {O}}\otimes \tilde{{\mathbb {O}}}\) corresponding to the \(D=4\) spacetime algebra. To do so we start from the \(D=3\ ({\mathbb {O}}, {\mathbb {O}})\) entry of the magic square and decompose with respect to left/right Yang–Mills symmetries:

(212)

where we have separated the compact (\({\mathfrak {tri}}({\mathbb {O}}) \oplus {\mathfrak {tri}}({\mathbb {O}})+{\mathbb {O}}\otimes \tilde{{\mathbb {O}}}\)) and non-compact (\({\mathbb {O}}\otimes \tilde{{\mathbb {O}}}+{\mathbb {O}}\otimes \tilde{{\mathbb {O}}}\)) generators with square brackets. To distinguish the spacetime little group \({\mathfrak {u}}(1)_{\mathrm{st}}\) from the internal \({\mathfrak {u}}(1)\) we must take the sum and difference of the \({\mathfrak {u}}(1)\oplus {\mathfrak {u}}(1)\) charges, giving

(213)

where the first charge corresponds to \({\mathfrak {u}}(1)_{\mathrm{st}}\). The \(D=4\) global symmetry generators are those left once we have discarded all generators carrying a non-trivial \({\mathfrak {u}}(1)_{\mathrm{st}}\) charge (as well as the \({\mathfrak {u}}(1)_{\mathrm{st}}\) itself, of course), which yields

(214)

We recognise this as precisely the decomposition of \({\mathfrak {e}}_{7(7)}\) under \({\mathfrak {e}}_{7(7)}\supset {\mathfrak {su}}(8) \supset {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {u}}(1)\), where the compact pieces, contained in the first bracket, form the maximal compact subalgebra \({\mathfrak {su}}(8)\),

(215)
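As a rough cross-check of the dimensions involved, one can use the standard branching rules for \({\mathfrak {su}}(8) \supset {\mathfrak {su}}(4)\oplus {\mathfrak {su}}(4)\oplus {\mathfrak {u}}(1)\). This is only a sketch: the \({\mathfrak {u}}(1)\) charges are suppressed, the assignment of barred representations is convention dependent, and it is not a reproduction of the displayed equations. The adjoint of \({\mathfrak {su}}(8)\) and the four-index antisymmetric representation carrying the non-compact generators decompose as

$$\begin{aligned} \mathbf{63}&\rightarrow (\mathbf{15},\mathbf{1})+(\mathbf{1},\mathbf{15})+(\mathbf{1},\mathbf{1})+(\mathbf{4},\bar{\mathbf{4}})+(\bar{\mathbf{4}},\mathbf{4}),\\ \mathbf{70}&\rightarrow (\mathbf{1},\mathbf{1})+(\mathbf{1},\mathbf{1})+(\mathbf{6},\mathbf{6})+(\mathbf{4},\bar{\mathbf{4}})+(\bar{\mathbf{4}},\mathbf{4}), \end{aligned}$$

so that \(15+15+1+16+16=63\) and \(1+1+36+16+16=70\), giving \(63+70=133=\dim {\mathfrak {e}}_{7(7)}\).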

To extract the field content we simply decompose the \(\mathbf{128}\) (B) and \(\mathbf{128}'\) (F) of \({\mathfrak {so}}(16)\) with respect to \({\mathfrak {u}}(1)_{\mathrm{st}}\oplus {\mathfrak {su}}(8)\),

(216)

which yields the helicity states and global representations of the expected \({\mathcal {N}}=8\) supermultiplet.
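The expected state counting can be sketched independently of the \({\mathfrak {so}}(16)\) decomposition. The snippet below is an illustrative check, not the method used in the text. It assumes the standard construction of the \({\mathcal {N}}=8\) multiplet, in which acting \(k\) times with the lowering supercharges on the helicity \(+2\) state gives helicity \(2-k/2\) in the \(k\)-index antisymmetric representation of \({\mathfrak {su}}(8)\); the variable names are hypothetical.

```python
from math import comb

N = 8  # number of supersymmetries in D=4

# Helicity 2 - k/2 appears in the k-index antisymmetric representation
# of su(8), which has dimension C(8, k). Half-integers are exact as floats.
multiplet = {2 - k / 2: comb(N, k) for k in range(N + 1)}

# Integer helicities are bosonic (B), half-integer helicities fermionic (F).
bosons = sum(dim for h, dim in multiplet.items() if h == int(h))
fermions = sum(dim for h, dim in multiplet.items() if h != int(h))

for h, dim in multiplet.items():
    print(f"helicity {h:+.1f}: {dim} states")

# Both totals should match the dimensions of the 128 and 128' of so(16).
assert bosons == 128
assert fermions == 128
```

The bosonic total \(1+28+70+28+1\) and the fermionic total \(8+56+56+8\) both equal 128, matching the dimensions of the \(\mathbf{128}\) and \(\mathbf{128}'\).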

Let us conclude with some comments on the product of theories other than super Yang–Mills. Particularly interesting examples are provided by the superconformal multiplets in \(D=3,4,6\). In a manner directly analogous to the magic pyramid the tensor product of left and right superconformal theories yields the “conformal pyramid”, described in [138]. It has the remarkable property that its faces are also given by the Freudenthal magic square. In particular, ascending the maximal spine one encounters the famous exceptional sequence \(E_{8(8)}, E_{7(7)}, E_{6(6)}\), but where \(E_{6(6)}\) belongs to the \(D=6, (4, 0)\) theory proposed by Hull as the superconformal limit of M-theory compactified on a 6-torus [276, 366, 367]. This pattern suggests the existence of some highly exotic \(D=10\) theory with \(F_{4(4)}\) U-duality group. The existence of such a theory would be more than a little surprising and there is a (slightly) more conventional interpretation of the conformal pyramid, including its \(F_{4(4)}\) tip, but for theories in \(D=3,4,5,6\), as described in [138].

The product of conformal theories in the context of amplitudes has been considered previously in, for example, [66, 70, 71, 368, 369]. In particular, the maximally supersymmetric \(D=3\), \({\mathcal {N}}=8\) Bagger–Lambert–Gustavsson (BLG) Chern–Simons-matter theory [210,211,212] has been shown to enjoy a colour-kinematic duality reflecting its three-algebra structure [71]. The “square” of BLG amplitudes yields those of \({\mathcal {N}}=16\) supergravity. Since \({\mathcal {N}}=16\) supergravity is the unique theory with 32 supercharges in three dimensions it is also the “square” of the \({\mathcal {N}}=8\) Yang–Mills theory. The squares of the amplitudes agree in both cases, despite their distinct structures [70].
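The counting behind this statement can be made explicit. The following is a minimal sketch, assuming the standard on-shell content of a \(D=3\), \({\mathcal {N}}=8\) multiplet (8 bosonic and 8 fermionic states, for both the BLG and Yang–Mills theories) and two real components per three-dimensional supercharge; the dictionary layout and names are illustrative only.

```python
# Assumed on-shell content of one D=3, N=8 multiplet:
# 8 bosonic (B) and 8 fermionic (F) states.
left = {"B": 8, "F": 8}
right = {"B": 8, "F": 8}

# Tensor the two factors: B(x)B and F(x)F are bosonic, B(x)F and F(x)B fermionic.
product_states = {"B": 0, "F": 0}
for stat_l, n_l in left.items():
    for stat_r, n_r in right.items():
        stat = "B" if stat_l == stat_r else "F"
        product_states[stat] += n_l * n_r

# Supercharges add: each N=8 factor carries 8 two-component real spinors.
supercharges_per_factor = 8 * 2
total_supercharges = 2 * supercharges_per_factor

assert product_states == {"B": 128, "F": 128}
assert total_supercharges == 32
```

The resulting \(128+128\) states and 32 supercharges are those of \({\mathcal {N}}=16\) supergravity, whichever \({\mathcal {N}}=8\) theory supplies the factors.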

In \(D=6\) one might expect relations between the “square” of the \({\mathcal {N}}=(2,0)\) tensor multiplet and the \({\mathcal {N}}=(4, 0)\) theory proposed by Hull [276, 366, 367], as put forward in [66]. Of course, amplitudes are generically not well-defined in these cases, but one can make some precise statements in terms of the tree-level S-matrix in particular regimes, as discussed in [368, 369]. For example, in the absence of additional degrees of freedom all three-point tree-level amplitudes of the (2, 0) tensor multiplet vanish [368]. The \(D=5, {\mathcal {N}}=4\) super Yang–Mills theory squares to give the amplitudes of \(D=5, {\mathcal {N}}=8\) supergravity. However, being non-renormalisable, the Yang–Mills theory ought to be regarded as a superconformal \(D=6, {\mathcal {N}}=(2,0)\) theory compactified on a circle of radius \(R=g_{YM}^{2}/4\pi ^2\). At linearised level Hull’s (4, 0) theory follows from the square of the (2, 0) theory [142] and gives \({\mathcal {N}}=8, D=5\) supergravity when compactified on a circle [366]. In fact, it would seem that one can go beyond the free theory and construct candidate tree-level amplitudes of the (4, 0) theory from the double-copy of the tree-level amplitudes of the (2, 0) theory using \(D=6\) spinor-helicities and a polarised version of the Cachazo–He–Yuan scattering equation formalism [269]. They necessarily start at four points, since the three-point (2, 0) amplitudes are trivial. Of course, the (4, 0) amplitudes need to be tested before one can claim they correspond to Hull’s conjecture. For instance, their double-soft scalar limits should reveal the \(E_{6(6)}\) symmetry. They do already pass the first test; on dimensional reduction on a circle they yield the amplitudes of \(D=5, {\mathcal {N}}=8\) supergravity. From this perspective the \((2, 0) \times (2, 0) = (4, 0)\) identity constitutes an, as yet ill-defined, M-theory up-lift of the maximally supersymmetric \(D=5\) squaring relation.

4 Closing remarks

We started our journey by posing a number of questions regarding the nature of the “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” paradigm. In particular, we asked: (1) Why does the correspondence work? (2) Is it strictly a property of amplitudes or can it be generalised to other/all aspects of gauge and gravity theories? (3) What classes of gravitational theories admit a gauge theory squared origin?

In the course of the subsequent discussion we have witnessed remarkable progress on all fronts. Our understanding of BCJ duality and the double-copy has developed dramatically and along with it our handle on the divergences of perturbative quantum gravity. The programme has clearly shown itself to be an effective point of view beyond amplitudes from a number of perspectives, from novel approaches to the construction of solutions to the identification of new gauge and gravity theories. It has also become increasingly clear that the BCJ double-copy, and “\(\hbox {gravity} = \hbox {gauge} \times \hbox {gauge}\)” more generally, can be applied to a vast and diverse set of theories that continues to grow.

Yet, we have no complete answers and the central questions remain: Does BCJ duality hold to all orders, at least for some theories? What are the ultimate implications for quantum gravity? Is there a geometrical or world-sheet underpinning? Can we characterise all theories admitting a gauge squared origin? Is it a/the right way to think of gravity?