1 Introduction

Precise knowledge of the content of protons, the parton distribution functions (PDFs), is a necessary ingredient for accurate predictions of both Standard Model (SM) and beyond-the-SM (BSM) cross sections at the Large Hadron Collider (LHC). Searching for BSM effects in the deviations of SM parameters from their predicted SM values [1, 2] requires better knowledge of PDFs in the kinematic regions where they are already best known [3, 4]. This means that many effects which were previously considered negligible must now be examined seriously. In this paper, particular attention is paid to some of these effects, including theoretical scale uncertainties and the correlations of experimental systematic uncertainties between, as well as within, data sets.

With present theoretical knowledge, PDFs can be determined at up to next-to-next-leading order (NNLO) in perturbative quantum chromodynamics (QCD). In order to determine the PDFs to the required precision, a wide coverage of the scale, \(Q^2\), and x, the fraction of the proton’s longitudinal momentum carried by the parton participating in the initial interaction, is required. This is facilitated by combining new inputs with older data from various experiments and by measurement of different processes to constrain different regions of phase space, which span the kinematic region: \(10^{-5} \lesssim x \lesssim 1\) and \(1 \lesssim Q^2 \lesssim 10^6~\text {GeV}{}^2\).

Knowledge of the PDFs of the proton, in the kinematic region relevant for the LHC, comes mainly from the precision deep inelastic scattering (DIS) data from ep collisions at the HERA collider, which covers a broad range of \(Q^2\) and x. The PDF set HERAPDF2.0 was determined from HERA data alone [5] using information from \(e^{\pm } p\) neutral-current (NC) and charged-current (CC) processes. The lower-\(Q^2\) NC data constrain the low-x sea-quark distribution but are not able to distinguish between quark flavours in the sea at low x, or between the down-type quarks, \(\bar{d}\) and \(\bar{s}\) at any x. The difference between the NC \(e^+p\) and \(e^-p\) cross sections at high \(Q^2\), together with the high-\(Q^2\) CC data, constrains the valence distributions. The \(Q^2\) dependence measured in the data constrains the gluon distribution. Additional information about the gluon comes from using HERA cross sections measured at various centre-of-mass energies (\(\sqrt{s}\)) through the contribution of the longitudinal structure function [6,7,8].

Diverse ATLAS data sets can be used in addition to the HERA data to constrain the PDFs better. In Ref. [9], high precision measurements of the inclusive differential \(W^{\pm }\) and \(Z/\gamma ^*\) boson cross sections at 7 \(\text {TeV}\) were added to the HERA data, resulting in the PDF set ATLASepWZ16, which improved on the HERAPDF2.0 set in various respects. Firstly, the strange content of the sea was determined rather than assumed to be a fixed fraction of the light sea. Indeed, compared to previous determinations, the strange sea was found to be enhanced at low x. Secondly, the accuracy of the valence quark distributions for \(x<0.1\) was improved. The effect of adding differential \(t\bar{t}\) distributions, in the lepton + jets and dilepton channels at 8 \(\text {TeV}\), to the HERA data and the inclusive \(W,Z/\gamma ^*\) data was studied in Ref. [10]. These \(t\bar{t}\) data are complementary to the \(W,Z/\gamma ^*\) data in their power to constrain PDFs because they are sensitive to the high-x gluon distribution. The resulting PDF set was called ATLASepWZtop18. In Ref. [11], data on the production of W and Z bosons in association with jets (V+ jets) were added to the HERA data and inclusive W, \(Z/\gamma ^*\) data, resulting in the ATLASepWZVjets20 PDF set. The V+ jets data are sensitive to partons at higher x than can be accessed by inclusive W and \(Z/\gamma ^*\) data and, in particular, they constrain the \(\bar{d}\) and \(\bar{s}\) quarks at higher x.

It would clearly be advantageous to combine ATLAS \(W,Z/\gamma ^*\) data, \(t\bar{t}\) data and V+ jets data in a single QCD fit. This is done in the present paper, and additional ATLAS data sets are also included. This paper presents the first comprehensive and comparative NNLO perturbative QCD analysis of a number of ATLAS data sets with potential sensitivity to parton distributions. The additional ATLAS data sets are described in the following. Firstly, the data on W [12] production and \(Z/\gamma ^*\) [13] production with 20.2 fb\(^{-1}\) at 8 \(\text {TeV}\) are added, providing further constraints on the valence PDFs and on the composition of the light-quark sea. Secondly, the direct photon production differential cross sections with 20.2 fb\(^{-1}\) at 8 \(\text {TeV}\) and 3.2 fb\(^{-1}\) at 13 \(\text {TeV}\) are added in the form of their ratios [14], whereby many systematic uncertainties cancel out. Thirdly, the \(t\bar{t}\) differential cross sections in the lepton + jets channel from 3.2 fb\(^{-1}\) at 13 \(\text {TeV}\) [15] are added and, fourthly, inclusive jet production cross sections with 4.5 fb\(^{-1}\) at 7 \(\text {TeV}\) [16], 3.2 fb\(^{-1}\) at 8 \(\text {TeV}\) [17] and 3.2 fb\(^{-1}\) at 13 \(\text {TeV}\) [18] are considered. These direct photon data, \(t\bar{t}\) data and inclusive jet data all have impact on the gluon PDF at medium to high x.

All ATLAS data sets considered in this study have full information about correlated systematic uncertainties. The analysis considers systematic uncertainty correlations between, as well as within, data sets. This is important now that several of the input data sets have systematic uncertainties deriving from jet measurements, since these uncertainties are larger than those from the lepton measurements. Theoretical uncertainties, such as scale uncertainties, are also considered, including their correlations between data sets. The resultant PDF set is called ATLASpdf21.

The structure of the paper is as follows. Section 2 presents the input data sets. Section 3 describes the theoretical framework of the fit. Section 4 presents the fitting methodology, including the definition of the fit \(\chi ^2\) and details of the parameterisation. Section 5 presents ATLASpdf21 PDFs and then compares them with a fit in which correlations between data sets are not implemented. The impact of each data set is considered and then model uncertainties and further theoretical uncertainties, including scale uncertainties, are considered. Section 6 presents a study of the \(\chi ^2\) tolerance and a comparison with other modern PDF sets. Section 7 gives the summary and conclusions. Appendix A gives a more detailed exposition of the impact of scale uncertainties for the inclusive \(W,Z/\gamma ^*\) data. The impacts of inclusive jet data at different centre-of-mass energies are compared in Appendix B. Appendices C and D compare the ATLASpdf21 fit predictions with the ATLAS input data sets and with data sets from other experiments, respectively.

2 Input data sets

2.1 Description of data sets

The combined \(e^{\pm }p\) NC cross-section measurements of H1 and ZEUS [5] cover a kinematic range of \(Q^2\), defined as the negative four-momentum transfer squared, from 0.045 \(\text {GeV}\) \(^2\) to 50,000 \(\text {GeV}\) \(^2\) and of Bjorken-x, \(x_{\mathrm {Bj}}\), which is equal to the fraction of the proton’s longitudinal momentum carried by the struck parton at leading order (LO) in QCD, from \(6\times 10^{-7}\) to 0.65. The CC cross-section measurements cover a kinematic region \(Q^2\) \(\sim 300\) \(\text {GeV}\) \(^2\) to beyond \(10^4\) \(\text {GeV}\) \(^2\) and of x from \({\sim }0.65\) down to \({\sim }10^{-2}\). Low-x data, below \(x_{\mathrm {Bj}} = 10^{-5}\), are excluded from this analysis by requiring \(Q^2 > 10~\) \(\text {GeV}\) \(^2\), motivated by the poorer fit observed in this region compared to the rest of HERA data [5], which may reflect the need for resummation corrections at low x [19, 20]. Full information about correlated systematic uncertainties is provided. There are 169 sources of correlated systematic uncertainty. Total uncertainties are below 1.5\(\%\) over the \(Q^2\) range of \(3< Q^2 < 500\) \(\text {GeV}\) \(^2\) and remain below 3\(\%\) up to \(Q^2= 3000\) \(\text {GeV}\) \(^2\).

The ATLAS \(W,Z/\gamma ^*\) differential cross-section measurements at \(\sqrt{s} = 7\) \(\text {TeV}\) based on an integrated luminosity of 4.6 fb\(^{-1}\) are used [9]. A combination of the electron and muon decay channels is used. They access a kinematic range whereby the scale is identified with the measured boson mass range, and the x range is determined by the scale, the proton beam energy and the measured rapidity (y) ranges, such that \(Q^{2} = m^{2}\) and \(x= (Q/\sqrt{s})\,\mathrm {e}^{\pm y}\), which gives an x range \(0.001 \lesssim x \lesssim 0.1\).Footnote 1 The \(W^{\pm }\) differential cross sections were measured as a function of the W decay lepton pseudorapidity, \(\eta _\ell \), with an experimental precision of 0.6%–1.0%. The \(Z/\gamma ^*\) boson rapidity distribution, \(y_{ Z }\), was measured in three mass ranges: \( 46<m_{\ell \ell }<66~\) \(\text {GeV}\); \(66<m_{\ell \ell }<116~\) \(\text {GeV}\) and \(116<m_{\ell \ell }<136~\) \(\text {GeV}\), with the two decay leptons in central-central (c-c) and central-forward (c-f) rapidity ranges,Footnote 2 and experimental uncertainties as low as \(0.4\%\) for c-c rapidity and \(2.3\%\) for c-f rapidity. There are 131 sources of correlated systematic uncertainty. The low-mass off-peak \(Z/\gamma ^*\) data are not used in the present analysis because they are subject to much larger corrections for non-fixed order, parton shower \(p_\mathrm {T}\)-resummation effects than the other \(W,Z/\gamma ^*\) data [21]. These low-mass data had little impact on the ATLASepWZ16 PDFs.

ATLAS has published differential cross sections and forward-backward asymmetries for \(Z/\gamma ^*\) production at \(\sqrt{s}= 8\) \(\text {TeV}\), based on 20.2 fb\(^{-1}\) of data [13]. The measurement is presented in terms of three variables for both the c-c and c-f regions of the dilepton rapidity, \(y_{\ell \ell }\). For the c-c region, there are 12 bins of absolute rapidity ranging from 0.0 to 2.4, in intervals of 0.2, divided into 7 bins of the dilepton mass \(m_{\ell \ell }\) ranging from 46 to 200 \(\text {GeV}\) and 6 bins of the polar angle, \(\theta \), of the decay lepton in the Collins–Soper frame [22]. For the c-f region, there are 5 bins of absolute rapidity ranging from 1.2 to 3.6 and 5 mass bins ranging from 66 to 150 \(\text {GeV}\) in the same 6 bins of the Collins–Soper angle. The dilepton data were extracted in both the dielectron and dimuon channels up to \(|y_{\ell \ell }|=2.4\), which is extended to \(|y_{\ell \ell }|=3.6\) in the dielectron channel. The dimuon and dielectron data are combined for the c-c region. Measurements are used as cross sections when the acceptance of the lepton fiducial cuts with respect to the (\(m_{\ell \ell }\), \(|y_{\ell \ell }|\), \(\cos \theta ^{*}\)) bin is greater than \(95\%\), such that the leptons are constructed with high accuracy and the cross sections are known to \({\sim }0.5\%\) accuracy. There are 184 cross-section data points which fulfil these requirements and all of them lie in the c-c region of dilepton rapidity. These are used for the central fit. The selection of the bins in \(m_{\ell \ell }\), \(y_{\ell \ell }\) and \(\theta \) ensures that the NNLO predictions for these high acceptance bins are not highly sensitive to fiducial cuts applied to leptons [23]. The uncertainties of the high-\(|y_{\ell \ell }|\) data range from \(2\%\) to \(33\%\) and the combined central rapidity data reaches a precision ranging from \(0.4\%\), for the Z-mass-peak bins at \(|y_{\ell \ell }| < 1.0\), to \(1.8\%\) for the high-mass bins at \(|y_{\ell \ell }|=2.4\). There are 330 sources of correlated systematic uncertainty. The remaining lower-acceptance data are used only in cross-checks. They are converted to 188 data points for forward-backward asymmetries across the Collins–Soper angle in order to minimise sensitivity to acceptance cuts. These data lie in both the c-c and c-f dilepton rapidity regions.

The W inclusive differential cross sections, from 20.2 fb\(^{-1}\) of data at \(\sqrt{s} = 8\) \(\text {TeV}\) [12], are measured as a function of muon pseudorapidity, \(\eta _{\mu }\). Data are provided for \(W^+\) and \(W^-\) cross sections separately as well as for the W-asymmetry. The separate \(W^+\) and \(W^-\) cross sections are used for the fit, rather than the W-asymmetry. The precision of these cross-section measurements varies from \(0.8\%\) to \(1.5\%\) as a function of \(\eta _{\mu }\). The pseudorapidity covers the same central range as the \(\sqrt{s} = 7\) \(\text {TeV}\) inclusive W data. There are 41 sources of correlated systematic uncertainty.

The ATLAS differential cross-section measurements of W production in association with jets (W+ jets) are based on data recorded during pp collisions at \(\sqrt{s} = 8\) \(\text {TeV}\), and a total integrated luminosity of 20.2 fb\(^{-1}\), in the electron decay channel only. Each event contains at least one jet with transverse momentum \(p_{\text {T}} > 30~\text {GeV}{}\) and rapidity \(|y|< 4.4\). The data are split into \(W^{+}\) and \(W^{-}\) cross sections. There are 51 sources of correlated systematic uncertainty. As explained in Ref. [11] the differential spectra of the transverse momenta of the \(W^+\) and \(W^-\) bosons \(p_{\mathrm {T}}^{ W }\) are fitted in the range \(25< p_{\mathrm {T}}^{ W } < 800~\text {GeV}{}\). The data precision ranges from 10 to 25\(\%\) from low to high \(p_{\mathrm {T}}^{ W }\) [24].

The ATLAS differential cross-section measurements of Z production in association with jets [25] are based on data recorded during pp collisions at \(\sqrt{s} = 8\) \(\text {TeV}\), and a total integrated luminosity of 20.2 fb\(^{-1}\), in the electron decay channel only. The measurement was performed as a function of the absolute rapidity of inclusive jets, \(|y^\mathrm {jet}|\), for several bins of the transverse momentum within \(25~\text {GeV}< p_{\mathrm {T}}^{\mathrm {jet}} < 1050~\text {GeV}\). There are 42 sources of correlated systematic uncertainty. The data precision ranges from \({\sim }10\%\) to \({\sim }30\%\) from low \(|y^{\mathrm {jet}}|\) and \(p_{\mathrm {T}}^{\mathrm {jet}}\) to high \(|y^{\mathrm {jet}}|\) and \(p_{\mathrm {T}}^{\mathrm {jet}}\).

The \(t\bar{t}\) differential cross sections were measured at \(\sqrt{s} = 8\) \(\text {TeV}\) using 20.2 fb\(^{-1}\) of data in the lepton + jets [26] and dilepton [27] decay channels. The \(t\bar{t}\) cross sections are provided as both normalised and absolute spectra. The absolute spectra are used in the present study since they carry extra information about the normalisation of the cross-sections, which is not input anywhere else in the fit. As detailed in Ref. [10], in the lepton + jets channel the differential spectra that are used are the mass of the \(t\bar{t}\) pair, \(m_{t\bar{t}}\), and the average top-quark transverse momentum, \(p_{\mathrm {T}}^t\). Each of these spectra has full information about systematic and statistical bin-to-bin correlations, including statistical correlations between the spectra. There are 55 sources of correlated systematic uncertainty in common between the different spectra. The precision of these lepton + jets data is \({\sim }15\%\) for \(p_{\mathrm {T}}^t\) and ranges from \({\sim }10\%\) to \({\sim }20\%\) from low to high \(m_{t\bar{t}}\). In the dilepton channel, the spectrum for the rapidity of the \(t\bar{t}\) pair, \(y_{t\bar{t}}\), is used. The precision of these data ranges from \({\sim }5\) to \({\sim }15\%\) from low to high \(|y_{t\bar{t}}\)|. For these dilepton channel data the correlations are provided as a total covariance matrix. Further studies of these top-quark data are presented in Ref. [10].

The \(t\bar{t}\) differential cross sections were measured at \(\sqrt{s} = 13\) \(\text {TeV}\) using 36 fb\(^{-1}\) of data in the lepton + jets [15] channel. Similarly to the \(t\bar{t}\) data at 8 \(\text {TeV}\), the absolute spectra are used. For the \(t\bar{t}\) data at 13 \(\text {TeV}\), two complementary topologies are measured; the ‘resolved’ and ‘boosted’ topologies, which differ in that the decay products of the hadronically decaying top quark are well separated for the resolved case and collimated for the boosted case. The ‘boosted’ topology reaches higher absolute rapidity of the \(t\bar{t}\) pair. The differential spectra used for the PDF fit are the mass of the \(t\bar{t}\) pair, \(m_{t\bar{t}}\), the average top-quark transverse momentum, \(p_{\mathrm {T}}^t\), the average top-quark rapidity, \(y_t\), and the boosted rapidity of the \(t\bar{t}\) pair, \(y^\mathrm {b}_{t\bar{t}}\). These spectra were found to be the most sensitive to the gluon PDF in Ref. [28]. Each of these spectra has full information about systematic and statistical bin-to-bin correlations, including statistical correlations between the spectra.Footnote 3 There are 88 sources of correlated systematic uncertainty in common between the different spectra and statistical correlations between the spectra are implemented. Double differential spectra in \(p_{\mathrm {T}}^t\) and \(m_{t\bar{t}}\) were also studied. The precision of these lepton + jets data ranges from \(15\%\) to \(30\%\) for \(p_{\mathrm {T}}^t\) and from \(15\%\) to \(22\%\) for \(m_{t\bar{t}}\), with the largest uncertainties in the lowest and highest bins. For both of the rapidity spectra the total uncertainty is \({\sim }15\%\).

The ratio of the cross sections for inclusive isolated-photon production at \(\sqrt{s} = 8\) and 13 \(\text {TeV}\) comes from integrated luminosities of 20.2 fb\(^{-1}\) and 3.2 fb\(^{-1}\) respectively. The data are presented as a function of photon transverse energy, \(E_{\mathrm {T}}^{\gamma }\), for \(E_{\mathrm {T}}^{\gamma }> 125\) \(\text {GeV}\) in bins of photon pseudorapidity, \(\eta ^{\gamma }\), for the central region \(|\eta ^{\gamma }| < 2.37\). Details of the isolation criteria are given in Ref. [14]. Evaluation of experimental uncertainties in the ratio data takes into account the correlations between the data at the two different centre-of-mass energies. The dominant source of correlated uncertainty is the photon energy scale. The uncertainty due to this source is much lower when considering the ratio of cross sections. Nevertheless, 25 correlated uncertainty components from this source remain, and there are 21 further uncertainty components which are treated as correlated or uncorrelated as specified in Ref. [14]. It should be noted that for the present fits the combined luminosity uncertainty at 8 and 13 \(\text {TeV}\) is not used, so the separate luminosity uncertainties may be correlated with those of other data at these centre-of-mass energies. The experimental precision of the ratios ranges from \({\sim }4\%\) at low \(E_{\mathrm {T}}^{\gamma }\) and \(|\eta ^{\gamma }|\) to \({\sim }10\%\) at the largest \(E_{\mathrm {T}}^{\gamma }\) and \(|\eta ^{\gamma }|\).

Table 1 Summary of all the ATLAS input data sets considered in the QCD fit. It should be noted that: (i) the inclusive W cross-section data at 7 and 8 \(\text {TeV}\) are used as \(W^+\) and \(W^-\) data separately, (ii) the isolated photon data are used as the ratio of the 13 \(\text {TeV}\) to 8 \(\text {TeV}\) cross sections and (iii) the inclusive jet production data at 7 and 13 \(\text {TeV}\) are used only for a cross-check

The ATLAS measurements of inclusive jet cross sections at \(\sqrt{s}= 7\) \(\text {TeV}\) [16], 8 \(\text {TeV}\) [17] and 13 \(\text {TeV}\) [18] were considered. The data are presented as a function of jet \(p_{\mathrm {T}}^{\mathrm {jet}}\), in six bins of absolute rapidity from \(|y^{\mathrm {jet}}|=0.0\) to \(|y^{\mathrm {jet}}| = 3.0\). Jets were reconstructed using the anti-\(k_t\) algorithm with jet radius parameter \(R=0.4\) and \(R=0.6\) for data at 7 and 8 \(\text {TeV}\), and with \(R=0.4\) for data at 13 \(\text {TeV}\). Statistical correlation matrices between bins are provided for all jet data sets. However, it is not possible to fit inclusive jet production data at different centre-of-mass energies simultaneously, because the full experimental systematic uncertainty correlations between these data sets have not been fully specified. The data at 8 \(\text {TeV}\) are selected for the central fit, firstly because a better understanding of correlated systematic uncertainties has been achieved for these data than for the data at 7 \(\text {TeV}\), and secondly because they are available for \(R=0.6\), unlike for the data at 13 \(\text {TeV}\), and a larger R value is considered more reliable theoretically [29]. Details of the data set are therefore given only for the jet data at 8 \(\text {TeV}\). The data at 7 and 13 \(\text {TeV}\) have similar features, the main difference being in the \(p_{\mathrm {T}}^{\mathrm {jet}}\) range probed. These data sets, including the alternative choice of jet radius, are used in cross-checks. At 8 \(\text {TeV}\) there are 171 jet data points from 20.2 fb\(^{-1}\) of integrated luminosity. The fitted \(p_{\mathrm {T}}^{\mathrm {jet}}\) range is from 85 \(\text {GeV}\) to 2.5 \(\text {TeV}\). The dominant uncertainties come from the estimation of the jet energy scale and jet energy resolution. Extensive work [17] has been done to understand correlated sources of uncertainty. There are 320 sources of correlated systematic uncertainty, in addition to the luminosity uncertainty, and the jet energy scale’s \(\eta \) intercalibration uncertainty is split into 250 sources. The total uncertainty in the central \(p_{\mathrm {T}}^{\mathrm {jet}}\) range, \(300< p_{\mathrm {T}}^{\mathrm {jet}} <600\) \(\text {GeV}\), is \({\sim }5\%\) for \(|y^{\mathrm {jet}}| < 0.5\), rising to \({\sim }10\%\) for \(|y^{\mathrm {jet}}|\) in the range 2.5–3.0. For low \(p_{\mathrm {T}}^{\mathrm {jet}}\), \(85< p_{\mathrm {T}}^{\mathrm {jet}} <300\) \(\text {GeV}\), the uncertainty is \({\sim }15\%\) and for high \(p_{\mathrm {T}}^{\mathrm {jet}}\), \(600< p_{\mathrm {T}}^{\mathrm {jet}} <2000\) \(\text {GeV}\), it is \({\sim }50\%\).

All the ATLAS input data sets for the QCD fit are summarised in Table 1.

2.2 Correlations of uncertainties within and between data sets

The correlated systematic uncertainties are applied within each data set with the following small number of exceptions. Firstly, for the W+ jets, the Z + jets and the inclusive jet spectra at 8 \(\text {TeV}\), the systematic uncertainty due to the unfolding procedure is treated as being uncorrelated, both within and between these spectra.Footnote 4 As shown in Ref. [11], this affects the \(\chi ^2\) of the fits to V+ jets, but has little impact on the fitted PDFs. Similarly, the parton shower systematic uncertainty is decorrelated between the \(p_{\mathrm {T}}^t\) and \(m_{t\bar{t}}\) spectra in the \(t\bar{t}\) lepton + jets channel, as done in Ref. [10]. It was established that this decorrelation has a minimal effect on the PDFs, while reducing the fit \(\chi ^2\) to acceptable levels. This decorrelation and the aforementioned decorrelation of the unfolding systematic uncertainty in V+ jets and inclusive jet data, can be justified because the systematic uncertainties concerned are evaluated from the difference of two Monte Carlo estimates, and thus do not represent well-behaved Gaussian uncertainties. Similar conclusions were reached in a recent study in the MMHT framework [30]. Thirdly, in the inclusive jet data at 8 \(\text {TeV}\) further decorrelations of such systematic uncertainties, derived from the difference of two Monte Carlo estimates, are considered following Ref. [17]. The experimental systematic uncertainties for the jet energy scale (JES) such as the ‘Flavour Response’, ‘Multi-Jet Balance Fragmentation’, ‘Pile-up Rho Topology’,Footnote 5 and the ‘Non-Perturbative Correction’ uncertainty, are not considered completely correlated between all rapidity bins. Instead, they are split into two or three components as a function of rapidity and \(p_{\mathrm {T}}^{\mathrm {jet}}\) as specified in the various splitting options described in the Appendix of Ref. [17]. For the central fit, the preferred set of splitting options for \(R=0.6\) is used, in which the JES ‘Flavour Response’ is split into three components (see Table 6 of Ref. [17]). In the present paper, this is called ‘Decorrelation Scenario 2’ and it is chosen because it is one of the two preferred options as determined in the analysis in Ref. [17].Footnote 6 Alternative decorrelation scenarios are also considered in Sect. 5.3.1.

Correlations of systematic uncertainties between data sets are explained below. The luminosity uncertainties are considered fully correlated for all data sets at the same centre-of-mass energy. For the data sets considered in this analysis, systematic uncertainties involving electron and muon measurements are small (\({<}1\%\)), whereas systematic uncertainties involving the jet measurements can be much larger (\(O(10\%)\)). Moreover, the high-precision inclusive \(W,Z/\gamma ^*\) differential cross-section measurements at 7 \(\text {TeV}\) and the inclusive \(Z/\gamma ^*\) triple differential cross-section measurements at 8 \(\text {TeV}\) both had the electron and muon channel data combined and thus the identities of the systematic uncertainty sources are lost in the combination procedure such that the analysis cannot correlate them with muon and electron uncertainties in other data sets (or between the inclusive \(Z/\gamma ^*\) measurements at 7 and 8 \(\text {TeV}\)). Thus, correlations of lepton uncertainties between these inclusive W and Z measurements and the other data sets are neglected. This is not a significant limitation in the study of the effect of correlations between data sets since the lepton uncertainties are small compared to the jet uncertainties, as shown in the recent study of the impact of V+ jets data on PDF fits [11]. This study reverted to the use of the separate electron and muon channel data for the inclusive \(W,Z/\gamma ^*\) data at 7 \(\text {TeV}\) in order to correlate the electron systematic uncertainties with those of the W+ jets data and the Z + jets data at 8 \(\text {TeV}\) in the electron channel. These studies produce PDFs which are barely different from those in which the electron/muon combined data are used and lepton systematic uncertainty correlations are not applied between the data sets. Indeed, in Sect. 5.2 of the present analysis, it is shown that the impact of the V+ jets data within the present analysis, which uses combined electron/muon data for both inclusive \(W,Z/\gamma ^*\) data at 7 \(\text {TeV}\) and inclusive \(Z/\gamma ^*\) data at 8 \(\text {TeV}\), is very similar to the impact of these data in Ref. [11], where uncombined electron and muon data are used. Thus, the use of the more accurate combined electron and muon data is preferred.

In this paper, the correlations of the much larger jet-measurement uncertainties are considered. The W+ jets data and the Z + jets data at 8 \(\text {TeV}\) have common sources of jet systematic uncertainties, which are listed in Table 2. These sources are considered \(100\%\) correlated. Since the \(t\bar{t}\) cross sections at both 8 and 13 \(\text {TeV}\) consider data in the lepton + jets channel, there are some common sources of systematic uncertainty, due to the jet measurements, between these data and the V+ jets data. These are also listed in Table 2 and are considered \(100\%\) correlated between these data sets. This table also lists correlations of some sources of smaller systematic uncertainties, in the muon measurements and diboson, single-top and Z + jets backgrounds, which can be correlated between the \(t\bar{t}\) measurements. It should be noted that detailed correlations between the \(t\bar{t}\) data at 8 \(\text {TeV}\) in the dilepton channel and other data sets cannot be applied because the correlations for the dilepton channel are supplied as a covariance matrix and the separate individual sources of uncertainty cannot be separated. This does not have a significant effect on the fit since the dilepton data itself has only a small impact on the fit, as seen in Ref. [10].

Correlations of jet systematic uncertainties between the inclusive jet data and the V+ jets data and \(t\bar{t}\) data in the lepton + jets channel have been identified and are listed in Table 2. There is a choice to be made in the implementation of these inter-data-set correlations. The JES ‘Flavour Response’ and ‘Pile-up Rho Topology’ are part of the Jet Decorrelation Scenario 2 chosen for the central fit, and thus they have been split into three components according to jet rapidity and \(p_\mathrm {T}\) [17]. Consequently, there are no longer corresponding systematic sources in the V+ jets data and \(t\bar{t}\) data. The choice made for the central fit is to correlate only the remaining six sources (in the right-hand column of Table 2) between the data sets. The alternative choice of correlating these two systematic sources with the other data sets but not including them in the jet decorrelation scenario was also investigated. The effect on the resulting PDFs is negligible.

There is a further caveat on how the correlations between the inclusive jet data set and the V+ jets and \(t\bar{t}\) data sets are applied, namely that the last two data sets have radius \(R=0.4\) jets, and hence the systematic uncertainties may not be fully correlated. Checks were made using \(100\%\) correlation and no correlation, yielding little difference between the resultant PDFs. For the central fit a correlation of \(100\%\) is used.

The systematic uncertainties of the inclusive jet data at different beam energies are correlated with each other, but understanding these correlations in detail is non-trivial. In the present study, these data sets are fitted separately and results are compared. As already stated the data at 8 \(\text {TeV}\) are used for the central fit.

Table 2 Systematic uncertainties that are correlated between the W+ jets data at 8 \(\text {TeV}\), Z + jets data at 8 \(\text {TeV}\), \(t\bar{t}\) lepton + jets data at 8 \(\text {TeV}\), \(t\bar{t}\) lepton + jets data at 13 \(\text {TeV}\) and inclusive jets data at 8 and 13 \(\text {TeV}\) are listed. The names of the systematic uncertainties are those found in the HEPData entries [31]. Entries in the same row are taken as \(100\%\) correlated for the V+ jets and \(t\bar{t}\) lepton + jets data, which all have jet radius \(R=0.4\). Different degrees of correlation are considered for the inclusive jet data at \(R=0.6\), because of the differing choice of jet radius. Where entries are omitted, that systematic uncertainty does not exist for that data set (denoted by ‘–’). The luminosity uncertainty of data sets at the same centre-of-mass energy are also fully correlated. The JES ‘Flavour Response’ and JES ‘Pile-up Rho topology’ are considered fully correlated with other data sets only for cross-checks. They are not correlated for the central fit because they are part of the Decorrelation Scenario 2 which is applied to the inclusive jet measurements, as explained in the text. For this reason they are marked with the symbol \(^*\)

The measurement of the direct-photon production ratio already considered correlations between the data at 8 \(\text {TeV}\) and 13 \(\text {TeV}\). The photon energy scale is the largest correlated systematic uncertainty between the two measurements. There are no further important correlations with the other data sets. The luminosity uncertainties of the data at 8 \(\text {TeV}\) and 13 \(\text {TeV}\) are not combined for the present study. Instead, the 8 \(\text {TeV}\) luminosity is correlated with that of the other 8 \(\text {TeV}\) data sets and the 13 \(\text {TeV}\) luminosity is correlated with that of the other 13 \(\text {TeV}\) data sets.

3 Theoretical framework

3.1 NNLO QCD and NLO electroweak predictions for each data set

The present analysis uses the xFitter framework [32,33,34]. This program interfaces to theoretical calculations directly or uses fast interpolation grids to make theoretical predictions for the considered processes. The program MINUIT [35] is used for the minimisation. Each step is cross-checked with an independent fit program [36]. This section describes how these predictions are obtained for each data set.

For the DIS processes the light-quark coefficient functions are calculated to NNLO in QCD theory as implemented in QCDNUM [37]. The contributions of charm and bottom quarks are calculated in the general-mass variable-flavour-number scheme of Refs. [38, 39], known as the optimised TRVFN scheme.

Table 3 Summary of code used for NLO, NNLO QCD and LO, NLO EW predictions for ATLAS data as applied in the QCD fit. For most data sets, predictions are provided at NLO in QCD and LO in EW in the form of fast interpolation grids and are corrected to NNLO QCD and NLO EW by K-factors. For the \(t\bar{t}\) lepton + jets channel the grids are calculated at NNLO in QCD directly, so no entry appears in the column ‘NLO QCD code’

The renormalisation and factorisation scales for the DIS processes are taken to be the conventional choices, \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} =\sqrt{Q^2}\), where \(Q^2\) is the negative four-momentum transfer squared as already stated. For the DIS data electroweak (EW) effects are already unfolded to leading order, such that LO-EW corrections need not be applied, apart from the running of \(\alpha \). NLO-EW corrections are not well defined and no such corrections are applied to the DIS data.

In the present analysis, the photon PDF within the proton is not accounted for and thus data sets with substantial sensitivity to the photon PDF, such as very high-mass Drell–Yan data, are excluded. It is estimated [40] that the photon takes only \(\sim 0.3\%\) of the momentum of the proton in our kinematic range.

The DIS processes are the only ones for which fast NNLO QCD predictions can be made analytically, such that they can be used in an iterative fit. For the LHC processes, NNLO calculations are too time-consuming and thus fast interpolation grids are used. However, currently only the \(t\bar{t}\) production processes in the lepton + jets channel have grids available for NNLO calculations. For the remainder of the data sets the grids are available only at NLO and K-factors are used to correct the QCD predictions for the differential cross sections from NLO to NNLO. These K-factors are calculated from the ratio of the NNLO to NLO cross sections, with the same cuts as the data, using a fixed input PDF. These K-factors have very little dependence on the choice of input PDF for this calculation, and iteration using the output PDF of the fit is not necessary. The calculations used to construct the interpolation grids are usually to leading order (LO) in the EW part of the calculation. Correction from LO to NLO in EW predictions is also applied by a K-factor technique. For some data sets, this is applied together with the QCD predictions’ correction from NLO to NNLO while for others it is a separate calculation. Details for each data set are given below and summarised in Table 3. For data sets measuring final-state jets, there are also non-perturbative corrections to account for hadronisation. These are also applied using a K-factor technique as specified below.Footnote 7

The W and Z inclusive cross sections at 7 \(\text {TeV}\) are calculated at fixed order, to NNLO in QCD theory and to NLO in EW theory, as described in Ref. [9]. The results obtained from DYNNLO 1.5 [41, 42] and FEWZ 3.1.b2 [43, 44] are compared. A small difference between the NNLO predictions of FEWZ and DYNNLO of up to \({\sim }1\%\) is observed for the W- and Z-peak data. The sensitivity of NNLO programs to the fiducial cuts applied to the data was highlighted again recently [23]. This is considered further in Sect. 3.2. The scales for the Drell–Yan (DY) processes are taken to be the decay dilepton invariant mass appropriate to the centre of the mass bin in the NC case and the W-boson mass in the CC case. The xFitter package uses the APPLgrid code [45] interfaced to the MCFMprogram [46, 47] for fast calculation of the differential W and \(Z/\gamma ^*\) boson cross sections at NLO in QCD theory and LO in EW theory, and a K-factor technique is used to correct the NLO QCD predictions to NNLO and the LO EW predictions to NLO. These K-factors are within 1–2% of unity for the \(W^{\pm }\) and \(Z/\gamma ^*\) in the c-c dilepton rapidity region and within \({\sim }4\%\) of unity in the c-f dilepton rapidity region. The high-mass sideband of \(Z/\gamma ^*\) production is also subject to background from photon-induced dilepton production, which was estimated using the MRST2004qed photon PDF [48] and subtracted from the data. Compared to the signal, the size of this background is \({\sim }(1.5\pm 0.5)\%\).

The QCD predictions for the triple differential distributions of \(Z/\gamma ^*\) production at 8 \(\text {TeV}\) are made to NLO by MCFMinterfaced to APPLgrid, and K-factors are applied for NNLO QCD and NLO EW effects using NNLOjet. These K-factors are within \({\sim }3\%\) of unity. The renormalisation and factorisation scales for these data are taken to be the decay dilepton invariant mass appropriate to the centre of the mass bin. For the present study, Particle Data Group (PDG) values [49] are used for the electroweak parameters in the \(G_\mu \) scheme, in particular the value of \(\sin ^2\!\theta _{\text {W}}\) is \(\sin ^2\!\theta _{\text {W}} = 0.23127\).

The QCD predictions for the W cross sections at 8 \(\text {TeV}\) are calculated to NLO and the EW predictions to LO using MG5_aMC@NLO 2.6.4 interfaced to APPLgrid via aMCfast 1.3.0, and the NNLO QCD + NLO EW K-factors are calculated using DYNNLO 1.5. These K-factors are within 1%–2% of unity. The renormalisation and factorisation scales for these data are taken to be the W-boson mass.

The predictions for W+ jets production at 8 \(\text {TeV}\) are obtained at fixed order, up to NNLO in QCD theory and at LO in EW theory, using the \(\mathrm {N_{jetti}}\) program [50]. Outputs from the APPLgrid code are used for fast calculation at NLO in QCD theory, and K-factors from the aforementioned \(\mathrm {N_{jetti}}\) prediction are used to correct this calculation to NNLO. The renormalisation and factorisation scales are set to \(\sqrt{m_W^2+\Sigma (p_{\text {T}} ^{\mathrm {jet}})^2}\), where the second term in the square root is a sum over the transverse momentum of each jet. Additionally, all non-perturbative corrections, such as for hadronisation effects, are included in the K-factors. Further K-factors are applied for NLO EW corrections as computed by the authors of Sherpa [51]. As in the previous ATLAS analysis of V+ jets data [11], the lowest \(p_\mathrm {T}^{\mathrm {jet}}\) bin, \(p_\mathrm {T}^{\mathrm {jet}} < 25\) \(\text {GeV}\), is not used, since the calculation is effectively only at NLO for such low \(p_\mathrm {T}\).

The NNLO QCD predictions for Z + jets production at 8 \(\text {TeV}\) were calculated by the authors of Ref. [52]. The renormalisation and factorisation scales are set to \(\mu _{\mathrm {r}} = \mu _{\mathrm {f}} = \frac{1}{2}\left( \sqrt{m_{\ell \ell }^2 + p_{\text {T},\ell \ell }^2} + \Sigma p_{\text {T},\text {partons}}\right) \) where \(m_{\ell \ell }\) is the invariant mass of the electron pair, \(p_{\text {T},\ell \ell }\) is the transverse momentum of the electron pair and \(\Sigma p_{\text {T},\mathrm {partons}}\) is the sum of the transverse momenta of the outgoing partons. Four sets of K-factors are provided: the first corrects the NLO QCD predictions to NNLO, the second gives corrections for non-perturbative effects, the third corrects from LO to NLO in QED and the fourth applies NLO EW corrections (excluding QED radiation) as computed by Sherpa[2.2.10]. The K-factors for both W+ jets and Z + jets production are typically within \(10\%\) of unity. Uncertainties in the non-perturbative corrections are supplied for the Z + jets predictions and these are applied as correlated systematic uncertainties. Their impact is very small. Such uncertainties are not supplied for the W+ jets predictions.

The NNLO QCD predictions for \(t\bar{t}\) production at 8 \(\text {TeV}\) were calculated by the authors of Ref. [53] and are available in the form of fast interpolation grids, APPLgrid [45] or fastNLO [54, 55], for the data in the lepton + jets channel. The predictions for \(m_{t\bar{t}}\) are given for renormalisation and factorisation scales equal to \(H_{\mathrm {T}}/4\), where \(H_{\mathrm {T}}=\sqrt{m_t^2+(p_{\mathrm {T}}^t)^2}+\sqrt{m_t^2 +(p_{\mathrm {T}}^{\bar{t}})^2}\), and the predictions for \(p_{\mathrm {T}}^t\) are given for scales equal to \(m_{\mathrm {T}}/2\), where \(m_{\mathrm {T}}=\sqrt{m_t^2+(p_{\mathrm {T}}^t)^2}\) and \(m_t=173.3~\) \(\text {GeV}\) is the pole mass. Non-perturbative corrections and their uncertainties are supplied with the data and these uncertainties are applied as correlated systematic uncertainties. For the dilepton channel, APPLgrid is interfaced to MCFMto produce NLO grids, and a K-factor technique is used to correct NLO predictions to NNLO, using K-factors from Ref. [28]. The scales used for the predictions are equal to \(H_{\mathrm {T}}/4\). The NNLO/NLO K-factors are \({\sim }7\%\) above unity and constant for \(y_{t\bar{t}}\). The calculations are for zero width of the top mass. NLO EW corrections for the spectra are also considered, using the additional K-factors given in Ref. [56]. For the 8 \(\text {TeV}\) data, these corrections can be up to \(1\%\) for \(y_{t\bar{t}}\) and \(y_t\), up to \(2\%\) for \(m_{t\bar{t}}\), and up to \(4\%\) for \(p_{\mathrm {T}}^t\).

The NNLO predictions for \(t\bar{t}\) production at 13 \(\text {TeV}\) have been calculated by the authors of Ref. [53] in the form of fast interpolation grids [45, 54, 55] for the data in the lepton + jets channel [57]. The predictions for \(m_{t\bar{t}}\), \(y_t\) and \(y^b_{t\bar{t}}\) are given for renormalisation and factorisation scales equal to \(H_{\mathrm {T}}/4\) and the predictions for \(p_{\mathrm {T}}^t\) are given for scales equal to \(m_{\mathrm {T}}/2\), just as for the 8 \(\text {TeV}\) \(t\bar{t}\) data. The only difference is that at 13 \(\text {TeV}\) the value \(m_t=172.5~\) \(\text {GeV}\) is used for the top-quark mass, because this is the value used for the interpolation grids. The small difference between the masses used for the central fit for 8 and 13 \(\text {TeV}\) \(t\bar{t}\) data is studied as part of the model uncertainties and found to be negligible (see Sect. 5.3.1). NLO EW corrections for the spectra are also considered, using K-factors taken from Ref. [56]. For the 13 \(\text {TeV}\) data, these corrections can be up to \(1\%\) for \(y_{t\bar{t}}\), up to \(0.5\%\) for \(y_t\), up to \(2.5\%\) for \(m_{t\bar{t}}\), and up to \(5\%\) for \(p_{\mathrm {T}}^t\). Non-perturbative corrections and their uncertainties are again supplied with the data and these uncertainties are applied as correlated systematic uncertainties.

The NLO QCD predictions for direct photon production were taken from MCFM interfaced to APPLgrid, and K-factors from Ref. [58] were applied to obtain NNLO QCD predictions. Resummation of electroweak effects is also applied as in Ref. [59]. The corrections from these K-factors are 2%–15% and 0.5%–10% for the separate 8 and 13 \(\text {TeV}\) data sets, respectively, but they almost cancel out in the ratio, leaving corrections of less than 2%. Finally, a photon isolation criterion is applied to the predictions to follow that applied to the data as closely as possible (see Ref. [60]). The renormalisation and factorisation scales are set to \(E_{\mathrm {T}}^{\gamma }\).

The NLO QCD predictions for inclusive jet production at 7, 8 and 13 \(\text {TeV}\) are made with NLOjet++ interfaced to APPLgrid. The renormalisation and factorisation scale choices \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {max}}\) and \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {jet}}\) are both available. For the 7 and 8 \(\text {TeV}\) data, predictions are available for jet radii \(R=0.4\) and 0.6, while for the 13 \(\text {TeV}\) data they are available for \(R=0.4\) only, matching the data. The K-factors to convert NLO QCD predictions to NNLO have been calculated using NNLOjet [61]. These K-factors are rather different for the two scale choices. For the 8 \(\text {TeV}\) data and scale \(p_{\mathrm {T}}^{\mathrm {max}}\) they range from \({\sim }{+}10\%\) at low \(p_{\mathrm {T}}\) to a few percent at high \(p_{\mathrm {T}}\) for low \(|y^{\mathrm {jet}}|\), but the high-\(p_{\mathrm {T}}\) corrections can become slightly negative at high \(|y^{\mathrm {jet}}|\). For the scale \(p_{\mathrm {T}}^{\mathrm {jet}}\), they range from \({\sim }{-}5\%\) at low \(p_{\mathrm {T}}\) to a few percent positive at high \(p_{\mathrm {T}}\) for low \(|y^{\mathrm {jet}}|\), but become increasingly negative for all \(p_\mathrm {T}\) at high \(|y^{\mathrm {jet}}|\). The K-factors for the 7 \(\text {TeV}\) data are very similar, and those for the 13 \(\text {TeV}\) data follow the same trends while being a bit larger. The choice of scale for the central fit is \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_\mathrm {T}^{\mathrm {jet}}\), since this is preferred theoretically [29], but the alternative scale choice \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_\mathrm {T}^{\mathrm {max}}\) and other scale variations are considered in Sect. 5.3.1.

The precision of the K-factors, in the predictions for inclusive jet data, depends on the statistical precision of the NNLOjet calculations which evaluated the NLO and NNLO cross sections. For the 8 \(\text {TeV}\) inclusive jet data, the resulting K-factors suffer from some statistical fluctuations that can be as large as \({\sim }1\%\).Footnote 8 For the central fit, these K-factors are smoothed and the estimated uncertainty in this procedure is applied as being \(60\%\) correlated and \(40\%\) uncorrelated between \(p_\mathrm {T}\) bins within the same rapidity bin. Alternative treatments are considered in Sect. 5.3.1.

The QCD inclusive jet predictions are corrected for NLO electroweak effects as detailed in Ref. [17]. This correction can reach more than \(10\%\) for the highest \(p_{\mathrm {T}}^{\mathrm {jet}}\) in the lowest \(|y^{\mathrm {jet}}|\) bin, but decreases rapidly as \(|y^{\mathrm {jet}}|\) increases. It is less than \(3\%\) for \(|y^{\mathrm {jet}}| > 1.0\).

In order to compare fixed-order QCD calculations with the measured inclusive jet cross sections, corrections for non-perturbative effects must also be applied. These are derived using LO Monte Carlo event generators. There are significant differences between the corrections derived with different event generators, Pythia [62] and Herwig [63], and different sets of tuned parameter values. The corrections can be up to \({\sim }10\%\) at low \(p_{\mathrm {T}}^{\mathrm {jet}}\). They are different for \(R=0.4\) and \(R=0.6\) due to differing interplay of the underlying-event and hadronisation corrections, and for this reason fits to data with both jet radii are investigated. Values of the non-perturbative correction are taken at the centre of the uncertainty band, which is constructed as the envelope of all such corrections considered, and the systematic uncertainty of each correction value, correlated in \(p_{\mathrm {T}}^{\mathrm {jet}}\) and \(y^{\mathrm {jet}}\), is taken from the spread of the band (see Ref. [17] for details).

Fits were performed using 7, 8 and 13 \(\text {TeV}\) inclusive jet data separately, using both choices for the jet radius, \(R=0.4\) and \(R=0.6\), when available, and both choices for the scales, \(p_\mathrm {T}^{\mathrm {max}}\) and \(p_\mathrm {T}^{\mathrm {jet}}\), but the final choice for the central fit is 8 \(\text {TeV}\) jets with \(R=0.6\) and scales \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {jet}}\), as discussed in Sect. 5.3.1. Table 3 details the code used for NNLO/NLO QCD corrections and NLO/LO EW corrections to ATLAS data in the QCD fit, for ease of reference.

3.2 Scale uncertainties and sensitivity to NNLO code

For some of the ATLAS data sets considered, the precision of the data set is so high that it is comparable to the size of the NNLO scale uncertainties. This is the case for the ATLAS W and \(Z/\gamma ^*\) inclusive data sets at both 7 and 8 \(\text {TeV}\), for which the total experimental uncertainty and the scale uncertainties both approach \({\sim }0.5\%\). In this case, the scale uncertainties are considered as additional theoretical uncertainties which are added to the \(\chi ^2\) calculation in the same way as experimental systematic uncertainties (see Sect. 4). These scale uncertainties are evaluated as follows. The K-factors are evaluated for separate changes of the renormalisation and factorisation scales by factors of 2 and 0.5. The magnitude of the K-factor difference is symmetrised as \((K [\mu _{\mathrm {r}} (2)] - K[\mu _{\mathrm {r}} (0.5)])/2\) and \((K [\mu _{\mathrm {f}} (2)] - K[\mu _{\mathrm {f}} (0.5)])/2\) and its sign is preserved as positive if the upward variation of \(\mu _{\mathrm {r}} \) or \(\mu _{\mathrm {f}} \) makes the K-factor increase and negative if it makes the K-factor decrease. For these renormalisation and factorisation scale changes, the ratios of the signed K-factor changes to the K-factors for the nominal scales are calculated to obtain the fractional scale uncertainties. These can then be used in the fit like any other systematic uncertainty, as suggested in Ref. [65].

Due to the similarity of the W and Z processes, both the renormalisation scale and factorisation scale are considered correlated within the WZ data sets at 7 \(\text {TeV}\) and between the W and Z data sets at 8 \(\text {TeV}\). They are also considered to be correlated between the W and Z data sets at 7 and 8 \(\text {TeV}\) for the central fit. Studies of alternative approaches for the scale uncertainties are presented in Appendix A.

Scale uncertainties are accounted for in the central fit only for the W and Z inclusive data sets, because the scale uncertainties are comparable to the experimental uncertainties only in this case. For the other ATLAS data sets the experimental uncertainties are larger and scale uncertainties are treated by repeating the fit with varied scales as discussed in Sect. 5.3.1.

As remarked earlier, there is a small difference of up to \({\sim }1\%\) between the NNLO predictions of FEWZ and DYNNLO. A fractional uncertainty can be defined as the ratio of the signed difference between the FEWZ and DYNNLO predictions to the FEWZ prediction, and applied similarly to the scale uncertainties. The inclusion of such a theoretical uncertainty makes no further difference to the fits, provided the scale uncertainties are applied.

Recently, next-to-next-to-next-to-leading-order (N\(^{3}\)LO) calculations for the total cross section of the Drell–Yan process have become available [66]. It is notable that the N\(^{3}\)LO result does not quite lie within the scale uncertainties of the NNLO result. The N\(^{3}\)LO correction to the central value is \({\sim }2\%\) compared to NNLO scale uncertainties of \({\sim }0.5\%\). Full accounting for N\(^{3}\)LO corrections, as well as \(p_\mathrm {T}\) resummation effects which affect the prediction of W and Z fiducial cross sections at the level of \({\sim }0.5-1\%\) [23], are beyond the scope of the current fixed-order-NNLO QCD fit.

4 Fit methodology

The DGLAP [67,68,69] evolution equations yield the PDFs at any value of \(Q^2\) given that they are parameterised as a function of x at a starting scale \(Q^2_0\). In the present analysis, this scale is chosen to be \(Q^2_0 = 1.9\) \(\text {GeV}\) \(^2\) so that it is below the charm mass threshold \(m_c^2\). A recent combination of HERA heavy-quark data [70] has stimulated a re-analysis of the optimal values of the heavy-quark masses and their variations as used in the optimised TRVFN scheme [39]. The resulting values are \(m_c=1.41\) \(\text {GeV}\) and \( m_b=4.2\) \(\text {GeV}\) [71], and these are used for the central fit as applied to the DIS data. A minimum \(Q^2\) cut of \(Q^2_{\mathrm {min}} = 10.0\) \(\text {GeV}\) \(^2\) is imposed on the HERA data. All these assumptions are varied in the consideration of model uncertainties, as shown in Sect. 5.3.1. The strong coupling constant is fixed to \(\alpha _{\text {s}} (m_{Z}) = 0.118\).

The quark distributions at the starting scale are represented by the generic form

$$\begin{aligned} xq_i(x) = A_i x^{B_i} (1-x)^{C_i} P_i(x), \end{aligned}$$

where \(P_i(x)= (1 + D_i x + E_i x^2 + F_ix^3)\). The parameterised quark distributions, \(xq_i\), are chosen to be the valence quark distributions (\(xu_v,~xd_v\)) and the light anti-quark distributions (\(x\bar{u},~x\bar{d},~x\bar{s}\)). The gluon distribution is parameterised with the more flexible form

$$\begin{aligned} xg(x) = A_g x^{B_g} (1-x)^{C_g}P_g(x) - A'_g x^{B'_g} (1-x)^{C'_g}, \end{aligned}$$

where \(C'_g\) is set to 25 to suppress negative contributions at high x. The parameters \(A_{u_v}\) and \(A_{d_v}\) are fixed using the valence-quark number sum rule, and \(A_g\) is fixed using the momentum sum rule. The normalisation and slope parameters, A and B, of the light-quark sea, \(x\bar{u}\), \(x\bar{d}\) and \(x\bar{s}\), are all independent of each other, such that there is no constraint on \(x\bar{d} - x\bar{u}\), or on \(\bar{s}/(\bar{d}+\bar{u})\), either in shape or in normalisation as \(x \rightarrow 0\). By default it is assumed that \(xs=x\bar{s}\). The DE and F terms in the polynomial expansion \(P_i(x)\) are used only if required by the data, following the procedure described in Ref. [33], whereby parameters are added only if the \(\chi ^2\) of the fit decreases significantly. This leads to a 21-parameter fit with: \(P_{u_v}(x)=1+ D_{u_v} x + E_{u_v} x^2\), \(P_{d_v}(x)=1+ D_{d_v} x\) and \(P_{g}(x)=1+ D_{g} x \), with \(P_i(x)=1\) for the light-sea PDFs. It has been established that adding further parameters beyond the point at which there is little further change in \(\chi ^2\) (called the point of ‘saturation’ of the \(\chi ^2\)) only serves to fit noise in the data [72]. Thus, addition of further parameters to the central fit is not considered. However, parameterisation uncertainties are considered in Sect. 5.3.2.

A cross-check was made using Chebyshev polynomials rather than ordinary polynomials for the \(P_i(x)\) terms. There is no improvement in \(\chi ^2\) for a comparable 21-parameter fit using the Chebyshev polynomials and the extracted PDFs lie within the total uncertainty bands of the ordinary polynomial fit.

The level of agreement between the data and the predictions from a PDF parameterisation is quantified by the \(\chi ^2\) per degree of freedom (\(\chi ^2\)/NDF, for NDF degrees of freedom). The definition of the \(\chi ^2\) is

$$\begin{aligned} \begin{aligned} \chi ^2&= \sum _{i,k}\left( D_i - T_i(1-\sum _{j}\gamma _{ij}b_{j})\right) \\&\, \times C^{-1}_{\text {stat,uncor}, ik} (D_i,D_k)\left( D_k \!-\! T_k(1\!-\!\sum _{j}\gamma _{kj}b_{j})\right) \\&\, + \sum _{i} \log \frac{\delta ^{2}_{i,\text {uncor}}T_i^2 + \delta ^{2}_{i,\text {stat}}D_i T_i}{\delta ^{2}_{i,\text {uncor}}D_i^2 + \delta ^{2}_{i,\text {stat}}D_i^2}\\&\, + \sum _{j} b_j^2 \end{aligned} \end{aligned}$$
(1)

where \(D_i\) represent the measured data, \(T_i\) the corresponding theoretical prediction, \(\delta _{i,\text {uncor}}\) and \(\delta _{i,\text {stat}}\) are the uncorrelated systematic uncertainties and the statistical uncertainties of \(D_i\), and the correlated systematic uncertainties, described by \(\gamma _{ij}\), are accounted for using the nuisance parameters \(b_j\). The quantity \(C_{\text {stat,uncor}, ik}\) is a covariance matrix for both the statistical and uncorrelated systematic uncertainties. Summations over i and k run over all data points and summation over j runs over all sources of correlated systematic uncertainty. For each data set, the first term gives the main contribution to the partial \(\chi ^2\) of the data set and the second term is a small bias correction term, referred to as the log penalty, which arises because the diagonal term of the matrix, C, is given by \(C_{ii} =\delta ^{2}_{i,\text {uncor}}T_i^2 + \delta ^{2}_{i,\text {stat}}D_i T_i\), with different weighting for statistical uncertainties and uncorrelated systematic uncertainties. This form of the \(\chi ^2\) is used as standard in HERA and ATLAS PDF fits [5, 9, 10]. In reporting the \(\chi ^2\) for the fits, the first and second terms together give the partial \(\chi ^2\) of the data set. The third term gives the correlated \(\chi ^2\) arising from the penalty for the nuisance parameters describing correlations of systematic uncertainties within and between data sets. The experimental uncertainties of the fit are first set using the usual tolerance, \(T=1\), where \(T^2 = \Delta \chi ^2 =1\). The use of an enhanced tolerance is considered in Sect. 6.

5 Results

In this section the ATLASpdf21 PDF set is presented. The impact of variations of the central choice of fit settings and parameterisation is discussed in Sect. 5.3. Table 4 gives the total \(\chi ^2\) per degree of freedom, \(\chi ^2/\mathrm {NDF}\), of the fit using all data sets and the \(\chi ^2\) per data point, partial \(\chi ^2/\mathrm {NDP}\) for NDP data points, of each data set. The correlated terms (term 3) are shown separated into groups with common systematic correlations. In order to evaluate the separate contributions of the data sets to this correlated term, the fit is run with its final parameters fixed for each data set separately. These values follow the total correlated terms in brackets, in order of their appearance in the table. The quality of the fit to the HERA data is \(\chi ^2/\mathrm {NDP} =1.14\), comparable to that of HERAPDF2.0, so that there is no tension between the ATLAS data and HERA data. The quality of the fit to the ATLAS WZ data is \(\chi ^2/\mathrm {NDP} =1.44\), and to the ATLAS V+ jets, \(t\bar{t}\) and inclusive jet data it is \(\chi ^2/\mathrm {NDP} =1.40\). The quality of fit to the ATLAS direct photon data is \(\chi ^2/\mathrm {NDP} = 0.7\). These \(\chi ^2\) values are comparable to those obtained by the global PDF fits for similar data sets, but indicate a need to consider the appropriate \(\chi ^2\) tolerance of the fit. Both of these points are discussed further in Sect. 6.

Table 4 \(\chi ^2\) contributions for the all data sets entering the PDF fit. The partial \(\chi ^2\) for the individual data sets are given with respect to the number of data points (NDP). They represent the addition of terms 1 and 2 in Eq. (1). The correlated terms (term 3) are shown separated into groups with common systematic correlations. The total value of the correlated term for these groups is also split into the addition of the separate contributions in the order in which they are given in the table

5.1 Comparison of PDFs with and without inclusion of correlations between data sets

The ATLASpdf21 PDFs at \(Q^2 = 1.9\) \(\text {GeV}\) \(^2\) (the starting scale) are shown in Figs. 1 and 2. Only experimental uncertainties with tolerance \(T=1\) are shown for the comparisons throughout this section. Full uncertainties including model and parameterisation variations are considered for the ATLASpdf21 fit in Sect. 5.3.

Fig. 1
figure 1

ATLASpdf21 PDFs comparing those extracted from a fit in which correlations of systematic uncertainties between data sets are applied, with those extracted from a fit in which only the luminosity uncertainties for each centre-of-mass energy are correlated between data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: \(xu_v\). Top right: \(xd_v\). Bottom left: \(x\bar{u}\). Bottom right: \(x\bar{d}\). The lower panels show the comparison as a ratio to the default ATLASpdf21 PDF

Fig. 2
figure 2

ATLASpdf21 PDFs comparing those extracted from a fit in which correlations of systematic uncertainties between data sets are applied, with those extracted from a fit in which only the luminosity uncertainties for each centre-of-mass energy are correlated between data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: \(x\bar{s}\). Top right: xg. Bottom left: \(x(\bar{d}-\bar{u})\). Bottom right: \(R_s\). For the \(x\bar{s}\) and xg plots, the lower panels show the comparison as a ratio to the default ATLASpdf21 PDF

Figure 2 also shows the ratio \(R_s=x(s + \bar{s})/x(\bar{u}+\bar{d})\) and the difference \(x(\bar{d}-\bar{u})\). In this section, all PDFs are presented with uncertainties evaluated using the standard \(\chi ^2\) tolerance \(T^2 = \Delta \chi ^2 = 1\). Enhanced tolerance is considered in Sect. 6. Also shown in these figures are the PDFs from a fit in which correlations of systematic uncertainties between the V+ jets, \(t\bar{t}\) (lepton + jets) and inclusive jets data sets are not applied, with the exception of the luminosities for the data at the same centre-of-mass energy.Footnote 9 The systematic uncertainties which are correlated are specified in Table 2 and their treatment for the central fit is discussed in Sect. 2.2.

The differences between the extracted PDFs with and without inter-data-set correlations are mostly in the d-type sector and can reach \({\sim }20\%\) for \(x\bar{d}\) at large x. The PDF difference \(x(\bar{d}-\bar{u})\) and ratio \(R_s\), shown in the bottom half of Fig. 2, are particularly sensitive to the use of these correlations.

It is important to examine these differences at a scale relevant for precision LHC physics and this is illustrated in Figs. 3 and 4 where the ratios of the PDFs with and without the correlations are shown at \(Q^2 = {10,000}\) \(\text {GeV}\) \(^2\). Although these differences are not large compared to current experimental precision, they can nevertheless be important if \(O(1\%)\) accuracy is to be a realistic goal for PDFs [73].

Fig. 3
figure 3

Ratios of ATLASpdf21 PDFs extracted from a fit including correlations of systematic uncertainties between data sets to those extracted from a fit in which only the luminosity uncertainties for each centre-of-mass energy are correlated between data sets, at scale \(Q^2 = {10,000}\) \(\text {GeV}\) \(^2\). Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: \(xu_v\). Top right: \(xd_v\). Bottom left: \(x\bar{u}\). Bottom right: \(x\bar{d}\)

Fig. 4
figure 4

Ratios of ATLASpdf21 PDFs extracted from a fit including correlations of systematic uncertainties between data sets to those extracted from a fit in which only the luminosity uncertainties for each centre-of-mass energy are correlated between data sets, at scale \(Q^2 = {10,000}\) \(\text {GeV}\) \(^2\). Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Left: xs. Right: xg

Fig. 5
figure 5

The PDF ratio \(R_s=x(s+\bar{s})/x(\bar{u}+\bar{d})\) from ATLASpdf21 compared with \(R_s\) for fits not including some of the WZ data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Left: not including WZ data at both 7 and 8 \(\text {TeV}\). Right: not including WZ data at 8 \(\text {TeV}\)

These plots also allow a comparison of the size of the current experimental uncertainty of the PDFs with and without inclusion of inter-data-set correlations. The sizes are generally similar but the uncertainty in the high-x \(\bar{d}\), strange and gluon PDFs is larger if such correlations are not taken into account.

5.2 Impact of each data set

In this section the impact of each data set is considered. Only experimental uncertainties with tolerance \(T=1\) are shown for these comparisons. Full uncertainties including model and parameterisation variations are considered for the ATLASpdf21 fit in Sect. 5.3.

Fig. 6
figure 6

Relative uncertainties in ATLASpdf21 \(xd_v\) and xg compared with fits not including some of the WZ data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top: \(xd_v\) uncertainties, (left) not including inclusive WZ data at both 7 and 8 \(\text {TeV}\), (right) not including inclusive WZ data at 8 \(\text {TeV}\). Bottom: xg uncertainties, (left) not including inclusive WZ data at both 7 and 8 \(\text {TeV}\), (right) not including inclusive WZ data at 8 \(\text {TeV}\)

5.2.1 Impact of WZ inclusive data

Figure 5 shows the ratio \(R_s\) for the ATLASpdf21 fit and compared with a fit in which the inclusive WZ data at 7 and 8 \(\text {TeV}\) are removed (left-hand plot), as well as to a fit in which only WZ data at 8 \(\text {TeV}\) are removed (right-hand plot). It is clear that without WZ inclusive data the ratio \(R_s\) cannot be determined reliably. Once WZ data at 7 \(\text {TeV}\) are input the determination improves considerably, but the inclusive WZ data at 8 \(\text {TeV}\) still add information.

In contrast, the valence and gluon PDFs are still reasonably well determined without any WZ data but the input of these data decreases their uncertainties significantly, as illustrated for the \(xd_v\) and xg PDFs on the left-hand side of Fig. 6.

On the right-hand side of Fig. 6 the decrease in the uncertainties of the \(xd_v\) and xg PDFs from removing only the WZ data taken at 8 \(\text {TeV}\) is illustrated, showing that the major decrease comes from retaining the WZ data taken at 7 \(\text {TeV}\).

However, the WZ data taken at 8 \(\text {TeV}\) have a major role to play in ensuring that \(x\bar{u} \sim x\bar{d}\) holds at low x, even though this constraint is not imposed. Without them, one observes \(x\bar{d} < x\bar{u}\) at low x, as seen in Fig. 7. These data also somewhat reduce the low-x strange distribution and harden the high-x strange distribution, while softening the high-x \(x\bar{d}\) distribution, as also shown in Fig. 7.

Fig. 7
figure 7

ATLASpdf21 PDFs compared with those from a fit not including the WZ data at 8 \(\text {TeV}\). Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: \(x\bar{u}\). Top right: \(x\bar{d}\). Bottom: \(x\bar{s}\)

Fig. 8
figure 8

ATLASpdf21 PDFs compared with those from a fit not including V+ jets data. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: \(x\bar{d}\). Top right: \(x\bar{s}\). Bottom left: \(R_s\). Bottom right: \(x(\bar{d}-\bar{u})\)

Fig. 9
figure 9

ATLASpdf21 PDFs compared with those from a fit not including V+ jets data. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Left: \(xd_v\). Right: the ratio of xg for the two fits

There is mild tension between the WZ data at 8 \(\text {TeV}\) and the WZ data at 7 \(\text {TeV}\). The partial \(\chi ^2/\mathrm {NDP}\) for the WZ data at 7 \(\text {TeV}\) decreases from 68/55 to 50/55 if the WZ data at 8 \(\text {TeV}\) are excluded from the fit, and the partial \(\chi ^2/\mathrm {NDP}\) for the WZ data at 8 \(\text {TeV}\) decreases from 239/206 to 222/206 if the WZ data at 7 \(\text {TeV}\) are excluded from the fit. These increases in \(\chi ^2\) are most pronounced for the 7 \(\text {TeV}\) c-c data around the Z mass-peak (66–116 \(\text {GeV}\)) and for the mass bins around the Z peak in 8 \(\text {TeV}\) data. As already remarked, theoretical scale uncertainties for WZ data at both 7 and 8 \(\text {TeV}\) are added to the fit uncertainties. If these uncertainties are not added the tension between WZ data at 7 and 8 \(\text {TeV}\) increases. The partial \(\chi ^2/\mathrm {NDP}\) for WZ data at 7 \(\text {TeV}\) increases to 80/55 and the partial \(\chi ^2/\mathrm {NDP}\) for WZ data at 8 \(\text {TeV}\) increases to 268/206 if both data sets are included in the fit and scale uncertainties are not applied. The differences between the PDFs are not large, whether or not scale uncertainties are applied, compared to the current experimental precision. However, consideration of such theoretical uncertainties is important if accuracy to 1% is the ultimate goal for the PDFs. N\(^{3}\)LO calculations [74] indicate that higher-order corrections are likely to be larger than our current estimate of scale uncertainties. A further study of scale uncertainties is given in Appendix A.

5.2.2 Impact of V+ jets data

The impact of the V+ jets data is shown in Figs. 8 and  9. There are significant changes in the \(x\bar{d}\) and \(x\bar{s}\) PDF shapes such that the high-x \(x\bar{s}\) and \(x\bar{d}\) PDFs become softer and harder, respectively, with the input of V+ jets data. Because of the change in the \(x\bar{d}\) shape, the difference \(x(\bar{d}-\bar{u})\) is also strongly affected. This is shown in Fig. 8. This figure also shows that, without the V+ jets data, there is little information about the ratio \(R_s\) at high x. The changes in Fig. 8 are large because the V+ jets data resolve a double minimum in parameter space such that the fit now prefers a hard \(x\bar{d}\) and soft \(x\bar{s}\) at high x, whereas it previously had an additional minimum with \(x\bar{d} \sim x\bar{s}\) at high x, which was marginally preferred. These results are similar to those already seen and fully explained in the ATLASepWZVjets20 PDF analysis [11] and the double minimum may be seen in Figure 5 of that paper. A fit with only HERA data, or with HERA plus ATLAS WZ data at 7 \(\text {TeV}\) (ATLASepWZ20), prefers the minimum with \(x\bar{d} \sim x\bar{s}\) at high x, but once ATLAS V + jets data at 8 TeV are added to the fit the double minimum with a hard \(x\bar{d}\) and soft \(x\bar{s}\) at high x is preferred and the previous minimum disappears. In Figure 3 of Ref. [11], it can also be seen that for ATLASepWZ20 (the fit without V+ jets data) the full uncertainties, including model and parameterisation variations, are very large because they cover both minima, whereas the full uncertainties of the ATLASepWZVjets20 fit (including V + jets data) are much smaller because there is no double minimum. There is thus no strong inconsistency between the fits with or without the V+ jets data because of the large uncertainties of the fit without the V + jets data. This also applies in the present analysis, but in Fig. 8, only experimental uncertainties are shown. The full uncertainties for the fit without V + jets data in our present analysis are of little interest and are not presented. Model and parametrisation uncertainties for the ATLASpdf21 fit will be discussed in Sect. 5.3.

These V+ jets data also effect a modest change in the \(xd_v\) shape and the gluon PDF shape as shown in Fig. 9. The u-type quarks are not strongly affected by these data and thus they are not shown. There is no tension between the V+ jets data at 8 \(\text {TeV}\) and other data sets in the fit, since all data are fitted well at the minimum chosen by these V+ jets data.

5.2.3 Impact of \(t\bar{t}\) data

Fig. 10
figure 10

ATLASpdf21 xg PDF compared with xg for a fit not including various \(t\bar{t}\) data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Top left: the shape ratio to a fit not including \(t\bar{t}\) data at 8 and 13 \(\text {TeV}\). Top right: the ratio to a fit not including \(t\bar{t}\) data at 8 and 13 \(\text {TeV}\) for which both distributions are centred on unity. Bottom left: the shape ratio to a fit not including \(t\bar{t}\) data at 13 \(\text {TeV}\). Bottom right: the shape ratio to a fit not including \(t\bar{t}\) data at 8 \(\text {TeV}\)

The impact of the \(t\bar{t}\) data is shown in the top half of Fig. 10. The high-x gluon distribution is mildly softened when the \(t\bar{t}\) data are added to the fit. This effect is opposite to the one observed in the ATLASepWZtop18 fit. This is because more data which harden the gluon PDF, in particular the V+ jets and inclusive jet data, are included in the present fit. The more significant effect is in the uncertainties of the high-x gluon distribution, which are reduced. There is no significant tension between the \(t\bar{t}\) data and the other data in the fit. Figure 10 (bottom half) also shows the impact of removing only the \(t\bar{t}\) data at 13 \(\text {TeV}\) (left) or only the \(t\bar{t}\) data at 8 \(\text {TeV}\) (right). It is clear that the data at 8 \(\text {TeV}\) have the stronger impact on the shape of the xg PDF but both data sets contribute to a modest reduction in the uncertainties.

5.2.4 Impact of photon data and inclusive jet data

Fig. 11
figure 11

ATLASpdf21 xg PDF compared with xg for fits not including various data sets. Only experimental uncertainties are shown, evaluated with tolerance \(T=1\). Left: not including the direct-photon production ratio data taken at 13 and 8 \(\text {TeV}\). Right: not including inclusive jet data at 8 \(\text {TeV}\)

There is little impact from the addition of the direct-photon production ratio data apart from a marginal softening of the high-x gluon distribution as shown in Fig. 11 (left). However, it is notable that these data can now be well fitted at NNLO in QCD, given that they have been excluded from PDF fits for the last 20 years because of poor fits to lower-energy data [60, 75]. There is minimal tension with other data sets.

The principal impact of the inclusive jet data is on the gluon PDF. The main effect is a considerable decrease in high-x gluon uncertainties, with a mild hardening of the gluon PDF at high x, as shown in Fig. 11 (right). There is minimal tension with other data sets. As explained earlier in Sect. 2, the central ATLASpdf21 fit includes only the inclusive jet data at 8 TeV, with \(R=0.6\). The full uncertainties of the fit, see Sect. 5.3.1, include the difference between the choices \(R=0.6\) and \(R=0.4\). The impact of using inclusive jet data at 7 TeV and at 13 TeV, instead of at 8 TeV, is explored in Appendix B.

5.3 Model, theoretical and parameterisation uncertainties

The consideration of additional uncertainties affecting the PDFs is presented in this section. These are classified and labeled here as either model, theoretical or parameterisation uncertainties.

5.3.1 Model and theoretical uncertainties

The class of model uncertainties includes effects due to variations of the heavy-quark masses input to the TRVFN heavy-quark-mass scheme for the inclusive DIS calculations, the minimum \(Q^2\) cut on the HERA inclusive DIS data and the value of the starting scale for evolution. The minimum \(Q^2\) cut was varied in the range \(7.5< Q^2_{\mathrm {min}} < 12.5\) \(\text {GeV}\) \(^2\) and the starting scale was varied in the range \(1.6< Q^2_0 < 2.2\) \(\text {GeV}\) \(^2\). In the inclusive DIS calculations, the heavy-quark masses were varied in the ranges \(1.37< m_c <1.45\) \(\text {GeV}\) and \(4.1< m_b< 4.3\) \(\text {GeV}\) [71]. The variations of \(m_c\) and \(Q^2_0\) are coupled since the requirement \(Q^2_0 < m_c^2\) must be met. For this reason the upward variation of \(m_c\) and the downward variation of \(Q^2_0\) are symmetrised. Figure 12 illustrates the effect of these variations on the gluon distribution since this is the PDF most sensitive to these changes. The impact of the choice of \(m_c\) and \(m_b\) is modest. The effect of the variation of the \(Q^2_{\mathrm {min}}\) cut is larger but still within the experimental uncertainties. The largest of these uncertainties is due to the choice of \(Q^2_0\), which gives a gluon PDF differing from the central fit by \({\sim }2\sigma \) for \(x \sim 0.1\).

An additional model uncertainty comes from the assumed value of the top-quark mass. The interpolation grids for the NNLO predictions for the \(t\bar{t}\) data at 8 \(\text {TeV}\) are available for pole masses, \(m_{t}=172.5, 173.3, 175.0\) \(\text {GeV}\). The smaller and larger values are used to estimate an asymmetric model uncertainty around the central value, shown in Fig. 12. For the \(t\bar{t}\) 1-D distributions at 13 \(\text {TeV}\) the grids are only available for \(m_{t}=172.5\) \(\text {GeV}\). The effect of this change in central value is negligible, as illustrated in Fig. 12. A cross-check was performed using the double differential \(p_{\mathrm {T}}^t\) and \(m_{t\bar{t}}\) distributions at 13 \(\text {TeV}\), for which predictions are available for \(m_{t}=171.0,172.5,174.0\) GeV and the effect was found to be negligible. The 13 \(\text {TeV}\) \(t\bar{t}\) data have a smaller impact in the fit than the 8 \(\text {TeV}\) \(t\bar{t}\) data, so only the \(m_{t}\) variations of the 8 \(\text {TeV}\) data set are input to the final model uncertainty.

Fig. 12
figure 12

Impact of model assumption variations on the ATLASpdf21 gluon PDF illustrated in their ratio to the xg distribution for the central choice. Uncertainties of the central fit are experimental, evaluated with tolerance \(T=1\). Top left: variation of \(m_c\) and \(m_b\) values. The \(m_c\) upward variation is shown and this is symmetrised. Top right: impact of variations of \(Q^2_{\mathrm {min}}\) and \(Q^2_0\) on the ATLASpdf21 gluon PDF. The \(Q^2_0\) downward variation is shown and it is symmetrised. Bottom: impact of the variation of \(m_{t}\) and of the choice of jet radius for the inclusive jets on the ATLASpdf21 gluon PDF

Table 5 \(\chi ^2\) contributions for the inclusive jet data set at 8 \(\text {TeV}\) with R = 0.6, for different correlation scenarios, as explained in the text. The \(\chi ^2\) values given here represent the addition of all terms in Eq. (1)

A further potential model uncertainty comes from the treatment of the jet systematic uncertainties. Alternative decorrelation scenarios are considered as follows: the alternative option in which the JES Flavour Response is split into only two components rather than three, called “Decorrelation scenario 1”; complete decorrelation of the Jet Flavour Response between rapidity bins, called “FR decorrelated”; and no decorrelation, called “Fully correlated”. Table 5 gives the total \(\chi ^2/\mathrm {NDP} \) for the jets (including all three terms of Eq. (1)) for alternative correlation scenarios, showing that the difference between full correlations and various choices of decorrelation can have a significant effect on the \(\chi ^2\).

Fig. 13
figure 13

Ratios of the ATLASpdf21 gluon PDFs for different inclusive jet scale choices. Left: \(p_{\mathrm {T}}^{\mathrm {max}}\) over \(p_{\mathrm {T}}^{\mathrm {jet}}\) (default). Right: \(2p_{\mathrm {T}}^{\mathrm {jet}}\) and \(p_{\mathrm {T}}^{\mathrm {jet}}/2\) over \(p_{\mathrm {T}}^{\mathrm {jet}}\) (default). Uncertainties of the central fit are experimental, evaluated with tolerance \(T=1\)

Table 6 \(\chi ^2/\mathrm {NDF}\) for the total fit and the contribution \(\chi ^2/\mathrm {NDP}\) for the inclusive jet data, for various scale choices, and treatment of K-factors and their uncertainties. The first row represents the default values used in the central ATLASpdf21 fit

However, the difference between the ATLASpdf21 gluon PDFs obtained using jet data with these different correlation scenarios is relatively small compared to the model uncertainties considered above, e.g. the variation of \(Q^2_0\). There are also no changes in the PDF uncertainties as a result of using different correlation scenarios. Hence, these variations are not considered as a source of significant uncertainty.

A further source of uncertainty for the inclusive jet data comes from the choice of jet radius R. The difference between the PDFs due to a different choice of jet radius for the 8 \(\text {TeV}\) jets is shown in Fig. 12. The only significant change is in the gluon PDF. This small difference between the PDFs extracted using jet data at 8 \(\text {TeV}\) for \(R = 0.4\) and \(R = 0.6\) is considered as an extra uncertainty because, although it is clear that the choice R = 0.6 is theoretically favoured [29], this paper also inputs \(t\bar{t}\) data from the lepton + jets channel and V+ jets data, and for these data sets only \(R = 0.4\) jets are available.

The effect of using jet production data at different centre-of-mass energies of 7, 8 and 13 \(\text {TeV}\) is shown in Appendix B. This is not considered as an additional uncertainty but is presented as a cross-check that the choice of centre-of-mass energy for the jet production data does not have a significant influence on the PDFs extracted.

All model uncertainties are added in quadrature to form a total model uncertainty.

Theoretical uncertainties include the scale uncertainties of the predictions. The largest of these are the scale uncertainties for the inclusive W and Z data which are considered in detail in Sect. 3.2. Since these are treated as correlated systematic uncertainties in the fit \(\chi ^2\) they are actually already included in what has been labelled as the experimental uncertainty of the fit.

Fig. 14
figure 14

Impact of adding \(F_{u_{v}}\) and \(D_{\bar{d}}\) as free parameters on the valence PDFs in comparison with the central ATLASpdf21 21-parameter fit. Uncertainties of the central fit are experimental, evaluated with tolerance \(T=1\). Top left: \(xu_v\). Top right: \(xd_v\). Bottom left: \(x\bar{u}\). Bottom right: \(x\bar{d}\)

The effect of scale uncertainties for the V+ jets data was studied in Ref. [11] and was found to be negligible. The effect of scale uncertainties for the \(t\bar{t}\) lepton + jets data was studied for the present paper and was also found to be negligible. The direct-photon production ratio data are relatively insensitive to scale variations because \(100\%\) correlation of the scale uncertainties is assumed between the 8 and 13 \(\text {TeV}\) data. Since the fit is relatively insensitive to these data, alternative assumptions for these correlations were not pursued.

The scale uncertainty for the inclusive jets at 8 \(\text {TeV}\) requires further consideration. These K-factors were supplied for two choices of scale: \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {max}}\) or \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {jet}}\). A scale related to \(p_{\mathrm {T}}^{\mathrm {jet}}\) is favoured over \(p_{\mathrm {T}}^{\mathrm {max}}\) [29], as already mentioned. However, this difference in scale choice is now investigated. A comparison of the gluon PDFs for these two scales is shown in Fig. 13 (left), since this is the PDF most sensitive to this change. It can be seen that there is no significant difference between the resulting gluon PDFs. Changes in the nuisance parameter values for the correlated systematic uncertainties absorb the change in the predictions for the two scales. The \(\chi ^2\) values for these fits are given in Table 6. The only significant change in \(\chi ^2\) comes from the inclusive jets, not from other data sets.

Next, scale variations \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = 2 p_{\mathrm {T}}^{\mathrm {jet}}\) and \(\mu _{\mathrm {r}} =\mu _{\mathrm {f}} = p_{\mathrm {T}}^{\mathrm {jet}}/2\) are considered. The effect on the gluon PDF is shown in Fig. 13 (right). The \(\chi ^2\) values for these fits are given in Table 6. Again, the only significant change in \(\chi ^2\) comes from the inclusive jets, not from other data sets.

Table 7 Total \(\chi ^2\)/NDF for each model and parameterisation variation considered for the ATLASpdf21 PDF fit. The notation ‘(sym)’ indicates that the upward and downward model variations have been symmetrised

Finally, a cross-check is performed in which the K-factors are not smoothed but their statistical uncertainties are used as an additional uncorrelated uncertainty. The \(\chi ^2\) value for this fit is also given in Table 6. It is lower than that for smoothed K-factors because the statistical uncertainties of the unsmoothed K-factors are \({\sim }1\%\). However, the resulting PDFs are very similar to those obtained using smoothed K-factors. Since none of the scale variation effects produced significant changes in the PDFs, no further theoretical uncertainty is added for this source.

Thus the theoretical uncertainties considered so far are either already included in the experimental uncertainties of the fit, or they are negligible. It should be noted that the data are also sensitive to the value of \(\alpha _{\text {s}} (m_{Z})\), which affects the shape of the gluon PDF. The correlation between \(\alpha _{\text {s}} (m_{Z})\) and the gluon PDF shape is specified by the DGLAP formalism. A determination of \(\alpha _{\text {s}} (m_{Z})\) is beyond the scope of the current paper, since a correct NNLO treatment requires variation of the K-factor calculations with \(\alpha _{\text {s}} (m_{Z})\), or direct NNLO grids for all processes. This is left for future work. The conventional value \(\alpha _{\text {s}} (m_{Z}) = 0.118\) is used, in line with the value used by the global fitting groups, CT [76], MSHT [77] and NNPDF [78].

5.3.2 Parameterisation uncertainties

The optimal number of parameters was determined by ‘saturation’ of the \(\chi ^2\) as explained in Sect. 4. However, the effect on the PDFs of adding extra parameters is investigated. Although there is no significant further decrease in \(\chi ^2\), some small shape changes are observed when adding an \(F_{u_{v}}\) term to the u-valence PDF and/or a \(D_{\bar{d}}\) term to the \(x\bar{d}\) PDF. Figure 14 shows the impact of adding both of these as free parameters on the \(xu_v\), \(xd_v\), \(x\bar{u}\) and \(x\bar{d}\) PDFs, which are the PDFs which show the largest variations. The total parameterisation uncertainty is the envelope of these D and F parameterisation variations. Model and parameterisation variations, and their effect on the total \(\chi ^2\) of the fit, are summarised in Table 7.

5.3.3 Imposing a maximum \(Q^2\) cut on the input data

A further theoretical uncertainty stems from the possibility that subtle effects of BSM physics may be present in high-scale data. If these are included in a PDF fit, then the estimates of background to further higher-scale physics from new data could be distorted. A fit is performed in which the maximum scale for events accepted from each process is 500 \(\text {GeV}\), or \(Q^2 ={250,000}\) \(\text {GeV}\) \(^2\). This removes 82 points from the fit: 63 from the inclusive jets data, 6 from the V+ jets data, 11 from the direct-photon ratio data, and one each from the \(t\bar{t}\) \(p_{\mathrm {T}}\) spectra at 8 and 13 \(\text {TeV}\). The \(\chi ^2\)/NDF value of the fit improves from 2010/1620 to 1870/1538, with the largest improvement coming from the inclusive jets, for which the \(\chi ^2\)/NDP value improves from 248/171 to 136/108.

Figure 15 compares the PDFs at the scale of \(Q^2 = {10,000}\) \(\text {GeV}\) \(^{2}\), with a linear scale in x so as to emphasise the differences at high x. The shapes of the PDFs and their uncertainties are not strongly affected by the cut, and a maximum \(Q^2\) cut is not considered further.

5.4 Combination of uncertainties

The PDFs are now presented with experimental uncertainties from the fit and with additional model and parameterisation uncertainties as discussed in this section. These model and parameterisation uncertainties are added in quadrature to the experimental uncertainty. It should be noted that the scale uncertainties for the inclusive WZ data appear as part of the experimental uncertainty, whereas the scale uncertainties for other data sets are negligible. Figure 16 shows the valence distributions and the \(x\bar{u}\) and \(x\bar{d}\) sea distributions, and Fig. 17 shows the \(x\bar{s}\) and xg distributions and the \(x(\bar{d}-\bar{u})\) distribution. The experimental uncertainties are much larger for the d-type sector than for the u-type sector. The larger fractional uncertainty of the \(x\bar{s}\) PDF in comparison with the \(x\bar{d}\) PDF reflects its smaller absolute size as \(x\rightarrow 1\). The gluon PDF is well determined for \(0.01< x < 0.3\) but not at low or high x. The model uncertainties in all PDFs are moderate, being largest in the gluon PDF. The parameterisation uncertainty of \(x\bar{d}\) affects both \(x\bar{d}\) and \(xd_v\) since \(xd_{\mathrm {sea}} = x\bar{d}\) and the sea and valence d-quarks act together. The \(xd_v\) distribution is affected at both low and high x because of the valence-quark number sum rule. Similarly, the parameterisation uncertainty of \(xu_v\) is seen at both low and high x. The parameterisation uncertainty of the gluon PDF comes from the other PDFs through the momentum sum rule.

Fig. 15
figure 15

Ratios of the PDFs from a fit in which a maximum scale cut of 500 \(\text {GeV}\) is imposed, to the central ATLASpdf21 at the scale \(Q^2 = {10,000}\) \(\text {GeV}\) \(^{2}\). Both fits are shown with just experimental uncertainties, evaluated with tolerance \(T=1\). Top left: \(xu_v\). Top right: \(xd_v\). Middle left: \(x\bar{u}\). Middle right: \(x\bar{d}\). Bottom left: xs. Bottom right: xg. Here the x-scale is linear

Fig. 16
figure 16

ATLASpdf21 PDFs showing experimental uncertainties evaluated with \(T=1\) (red), model (yellow) and parameterisation (green) uncertainties. Experimental, model and parameterisation uncertainties are cumulative. Top left: \(xu_v\). Top right: \(xd_v\). Bottom left: \(x\bar{u}\). Bottom right: \(x\bar{d}\). The lower panels illustrate the fractional uncertainties

Fig. 17
figure 17

ATLASpdf21 PDFs showing experimental uncertainties evaluated with \(T=1\) (red), model (yellow) and parameterisation (green) uncertainties. Experimental, model and parameterisation uncertainties are cumulative. Top left: \(x\bar{s}\). Top right: xg. Bottom: \(x\bar{d}-x\bar{u}\). The lower panels illustrate the fractional uncertainties

Figure 18 shows the \(R_s\) distribution and its uncertainties as a function of x, at \(Q^2 = 1.9\) \(\text {GeV}\) \(^2\). Figure 19 shows the value of \(R_s\), at \(Q^2=1.9\) \(\text {GeV}\) \(^2\) and \(x=0.023\) and at \(Q^2 = m_Z^2\) and \(x=0.013\), compared with other PDFs. For the ATLASpdf21 fit, the value of \(R_s\) at this low scale has decreased relative to previous ATLAS fits. The ATLASpdf21 central value of \(R_s \sim 0.8\) is due firstly to the input of the V+ jets data, as seen in the change from ATLASepWZ16 (\(R_s \sim 1.15\)) to ATLASepWZVjets20 (\(R_s \sim 1.0\)), secondly to the new freedom in the low-x parameterisation, whereby the ATLASpdf21 central parameterisation corresponds to the low end of the uncertainty band of the ATLASepWZVjets20 result (\(R_s \sim 0.85\)), and thirdly to the input of the WZ data at 8 \(\text {TeV}\). On the other hand, \(R_s\) has increased for some of the other PDFs, e.g. MSHT20 [77] compared with MMHT14 [79], and CT18A [76] compared with CT14 [80]. This increase is due to the input of the ATLAS precision inclusive WZ data at 7 \(\text {TeV}\). Thus, the recent evaluations of this ratio of strangeness to light quarks are now in fair agreement. However, the strangeness is still substantially larger at low x than the older values which estimated \(R_s\sim 0.5\). This illustration of the ratio \(R_s\) at \(x=0.023\) should be supplemented by a comparison of the full shape of the \(R_s\) distribution in Sect. 6.

6 Consideration of \(\chi ^2\) tolerance and comparison with global fits

6.1 \(\chi ^2\) tolerance

The experimental uncertainties presented for the fitted PDFs were evaluated by the usual criterion of parameter setting, \(\Delta \chi ^2 = 1\). However, global PDF fitting groups such as CT and MSHT use enhanced \(\chi ^2\) tolerances, \(\Delta \chi ^2 = T^2\), where \(T^2\) is a dynamic tolerance ranging from 10 to 100.Footnote 10 The justification for this comes from consideration of the consistency, or relative inconsistency, of input data sets [84]. In the present case, whereas the HERA data is a single consistent set of data with common correlated uncertainties, the ATLAS data are more disparate and a study to find an appropriate tolerance was performed.

One way to estimate a suitable tolerance for a fit including the ATLAS data is to consider the dynamic tolerance procedure of the MSHT group, as first introduced in the MSTW paper [84]. Briefly restated, the method is as follows. The inverse of the Hessian matrix of the fit is diagonalised to give the eigenvector combinations of the parameters of the fit. The eigenvalues are the squares of the uncertainties in these eigenvector combinations of the parameters. The parameter displacements from the global minimum can then be expanded in a basis of rescaled eigenvectors, \(e_{ik}=\sqrt{\lambda _k} v_{ik}\), where \(\lambda _k\) is the k-th eigenvalue and \(v_{ik}\) is the i-th component of the k-th orthonormal eigenvector. Pairs of eigenvector PDFs have parameters given by \(a_i(k^{\pm })=a_i^0 \pm t e_{ik}\), where \(a_i^0\) are the central parameters, and \(t=1\) should give the usual change in \(\chi ^2\) of \(T^2 = \Delta \chi ^2=1\). In the quadratic approximation, \(t=T\), and this is reasonably well obeyed for the present ATLASpdf21 21-parameter fit, at least up to \(t=5\).

Having determined the eigenvectors, each of these is treated as a different ‘hypothesis’ for a theory which can fit the data, and each of them is checked to see if it can describe each of the data sets within the hypothesis-testing criterion, \(\chi _n^2 \lesssim N+\sqrt{2N}\), as t is increased and decreased along the eigenvector, where \(\chi ^2_n\) is the \(\chi ^2\) for a particular data set and N is the number of data points for that set. As soon as \(\chi ^2_n\) goes above this limit, this sets the maximum value of t, and hence of T, that can be tolerated for this eigenvector in this direction, for this data set. Each eigenvector will have a most constraining data set, for each direction, and in principle the values of the tolerances T for the most constraining data set can be different for each eigenvector and each direction, hence the term ‘dynamic’.

.

Fig. 18
figure 18

ATLASpdf21 \(R_s\) distribution showing experimental uncertainties evaluated with \(T=1\) (red), model (yellow) and parameterisation (green) uncertainties. Experimental, model and parameterisation uncertainties are cumulative. The lower panel illustrates the fractional uncertainties

Fig. 19
figure 19

\(R_s\) from ATLASpdf21, showing experimental uncertainties evaluated with \(T=1\), model and parameterisation uncertainties, compared with other recent PDFs: ABMP16 [81], CT14 [80], CT18, CT18A [76], MMHT14 [79], MSHT20 [77], NNPDF3.0 [82], NNPDF3.1_strange [83], ATLASepWZ16 [9] and ATLASepWZVjets20 [11]. Left: \(R_s\) at \(Q^{2}=1.9\) \(\text {GeV}\) \(^2\) and \(x=0.023\). Right: \(R_{s}\) at \(Q^2 = m_{Z}^{2}\) and \(x=0.013\)

This method is made more sophisticated in a number of ways. Firstly, the hypothesis-testing criterion chosen is not exactly the simple \(\chi ^2_n< N + \sqrt{2N}\), but is \(\chi ^2_n < \xi _{68}\), where \(\xi _{68}\) is given by \(\int ^{\xi _{68}}_0 \mathrm {d}\chi ^2 P_N(\chi ^2) =0.68\) and \(P_N(\chi ^2)\) is the usual probability density function for the \(\chi ^2\) distribution with N degrees of freedom. Thus each data set should be described within its \(68\%\) confidence level. Secondly, the criterion imposed is not actually \(\chi ^2_n < \xi _{68}\), but \(\chi ^2_n < (\chi ^2_{n,0}/\xi _{50})\,\xi _{68}\), to allow for the fact that the central fit \(\chi ^2_{n,0}\) for the data set may not be perfect, where perfect would imply \(\chi ^2_{n,0} = \xi _{50}\).

For the ATLASpdf21 fit the most constraining data sets for the eigenvectors are the WZ data at 7 \(\text {TeV}\) or the \(Z/\gamma ^{*}\) triple differential cross-section data at 8 \(\text {TeV}\) (and very occasionally the W data at 8 \(\text {TeV}\)). The tolerance values for all the eigenvectors, in both directions, are all in the range 2.4–4.2. These values are all fairly similar, so a single value can be used and, since the tolerance should not be set too large whereby a data set is far outside its \(68\%\) CL limit, this suggests the simple choice of \(T = 3\).

In the above procedure to evaluate the dynamic tolerance, the fact that the data sets are not fitted perfectly was taken into account. One may estimate how much of the increased tolerance comes from this part of the procedure by increasing the uncertainties of each data set by \(\sqrt{\chi ^2/\mathrm {NDP}}\) for those data, such that \(\chi ^2/\mathrm {NDP}\sim 1\) is achieved in the fit. This will clearly increase the uncertainties of the output PDFs. However, a naive application of this procedure would increase all the experimental uncertainties and this seems somewhat unnecessary for the statistical uncertainties, which are well known. Hence, only the systematic uncertainties are increased. Starting with the \(\sqrt{\chi ^2/\mathrm {NDP}}\) scaling factors, the fit is iterated, adjusting these factors at each iteration, until a \(\chi ^2/\mathrm {NDP} \sim 1\) for each data set is achieved, After three iterations there is no longer any significant change and the scaling factors range from 1.0 to 1.6.

Fig. 20
figure 20

ATLASpdf21 fits comparing the full uncertainties (experimental, model, parameterisation) for tolerance values \(T=1\) and \(T=3\). Top left: \(xu_v\). Top right: \(xd_v\). Middle left: \(x\bar{u}\). Middle right: \(x\bar{d}\). Bottom left: \(x\bar{s}\). Bottom right: xg

Fig. 21
figure 21

ATLASpdf21 \(xu_v\) and \(xd_v\) distributions with full uncertainties (experimental \(T=3\), model, parameterisation) compared with other PDFs. Left: CT18, CT18A, HERAPDF2.0. Right: MSHT20, NNPDF3.1, ABMP16

Fig. 22
figure 22

ATLASpdf21 \(x\bar{u}\), \(x\bar{d}\) and \(x(\bar{d}-\bar{u})\) distributions with full uncertainties (experimental \(T=3\), model, parameterisation) compared with other PDFs. Left: CT18, CT18A, HERAPDF2.0. Right: MSHT20, NNPDF3.1, ABMP16

Fig. 23
figure 23

ATLASpdf21 \(x\bar{s}\), xg and \(R_s\) distributions with full uncertainties (experimental \(T=3\), model, parameterisation) compared with other PDFs. Left: CT18, CT18A, HERAPDF2.0. Right: MSHT20, NNPDF3.1, ABMP16

It is non-trivial that for these reweightings of the data the PDF shapes hardly differ from those of an unweighted fit. This shows that the fit is stable for modest reweighting of the input data sets. The uncertainties of the PDFs have increased, but only by a factor which corresponds to a tolerance of up to \(T \sim 1.5\). Thus the greater part of the dynamic tolerance estimate of \(T=3\) comes from the application of the hypothesis-testing criterion to the \(\chi ^2\) for the eigenvectors of the fit.

The tolerance \(T=3\) is applied for the final estimation of PDF uncertainty; however, ATLASpdf21 PDFs with both choices of tolerance, \(T=1\) and \(T=3\), will be made available on the LHAPDF PDF repository [85]. Having fixed the tolerance for the experimental uncertainties of the PDFs, the full uncertainties, including experimental, model and parameterisation uncertainties are compared for the choices \(T=3\) and \(T=1\) in Fig. 20. When an enhanced tolerance such as this is applied to the experimental uncertainties, the model and parameterisation uncertainties are far less significant.

Comparisons of the differential cross-section measurements used as input to the fit with the predictions of ATLASpdf21 fit, with \(T=3\) and full uncertainties, are shown in Appendix C.

6.2 Comparison of ATLASpdf21 with global PDFs

In Figs. 21, 22 and 23, the ATLASpdf21 PDFs are compared with the HERAPDF2.0 [5], CT18, CT18A [76], NNPDF3.1 [78], MSHT20 [77] and ABMP16 [81] PDFs. All figures in this section show ATLASpdf21 PDFs with full uncertainties, such that experimental uncertainties are evaluated with \(T=3\) and model and parametrisation uncertainties are also included. It is observed that the addition of the ATLAS data sets to the HERA data brings the PDFs much closer to those of the global fits than to HERAPDF2.0, particularly for \(xd_v\), \(x\bar{d}\) and \(x\bar{u}\). The \(xu_v\) and \(xd_v\) distributions of the ATLASpdf21 fit are in good agreement with CT18, CT18A, MSHT20 and ABMP16, whereas NNPDF3.1 is an outlier. The \(x\bar{u}\) distribution agrees reasonably well with all the other PDFs. The \(x\bar{d}\) distribution is in fair agreement with CT, MSHT and NNPDF, with ABMP presenting as somewhat of an outlier.Footnote 11 However, it should be noted that \(x\bar{d}\) from the ATLASpdf21 fit is somewhat higher at high x. This is further illustrated by the \(x(\bar{d}-\bar{u})\) distribution, which changes from negative at high x when fitting only HERA data in HERAPDF2.0, to positive at high x for ATLASpdf21, in agreement with the global PDFs. However, the ATLASpdf21 \(x(\bar{d}-\bar{u})\) distribution is more positive than the global PDFs for \(x >0.3\), in better agreement with the newer E906 Drell–Yan data [86], rather than the E866 data which is in the global fits. This is explored further in Appendix D. The central value of the \(x\bar{s}\) distribution is larger than those in NNPDF3.1 and CT18 for \(x\sim 0.02\), but is in good agreement with CT18A and MSHT20. The gluon distribution agrees best with NNPDF3.1, but is in reasonable agreement with the other global PDFs.Footnote 12 The distributions of \(R_s\) for all the PDFs illustrated are now in broad agreement but there are differences in detail. The uncertainties of HERAPDF2.0 and CT18 are larger than the others because they do not include ATLAS inclusive WZ data at 7 TeV.Footnote 13 The uncertainties of ABMP16 are smaller for the usual reason that they use tolerance \(T=1\). The input of ATLAS inclusive WZ data at 7 TeV is the only difference between the CT18 and CT18A analyses. After input of these data the CT18A \(R_s\) PDF ratio has smaller uncertainties and a larger central value of \(R_s\) at low x than CT18. CT18A is in good agreement with ATLASpdf21, for both central value and uncertainty, over the full x range illustrated. The PDF analyses of MSHT20 and NNPDF3.1 also use ATLAS inclusive WZ data at 7 TeV and have a similar size of uncertainty and level of agreement with ATLASpdf21. In addition, the MSHT20 analysis uses ATLAS inclusive WZ data at 8 TeV, just as the ATLASpdf21 analysis does. Note that NNPDF have updated their study of strangeness recently to NNPDF3.1_strange [83], which has a somewhat larger strangeness at low x than NNPDF3.1. The modern PDFs which use ATLAS inclusive WZ data, including ATLASpdf21 itself, all agree that strangeness is not suppressed strongly at low x, but there is substantial suppression at high x.

In summary, the ATLASpdf21 fit now agrees with the global fits as well as they agree with each other. Thus ATLAS data seem able to replicate most of the features that the fixed-target DIS and DY data plus the Tevatron data brought to the global PDFs, see Appendix D. Using only the HERA and ATLAS data allows a more rigorous treatment of correlated systematic uncertainties and, in particular, of correlations between data sets.

For the global fits considered here, the \(\chi ^2\) values for the data sets included in the ATLASpdf21 fit are HERAPDF2.0: 2262, CT18: 2135, CT18A: 2133, MSHT20: 2218, NNPDF3.1: 2109, compared with ATLASpdf21: 2010. Although the global PDFs from CT, MSHT and NNPDF have more flexible parameterisations, the \(\chi ^{2}\) value of the ATLASpdf21 fit is better for the data sets considered. This is a further indication that the ATLASpdf21 parameterisation is sufficiently flexible.

Fig. 24
figure 24

ATLASpdf21 fit with full uncertainties (experimental \(T=3\), model, parameterisation) compared with CT18A, MSHT20 shown as a ratio with each distribution centred on unity. Top left: \(xu_v\). Top right: \(xd_v\). Middle left: \(x\bar{u}\). Middle right: \(x\bar{d}\). Bottom left: xg. Bottom right: \(x\bar{s}\)

Figure 24 shows the uncertainties of the ATLASpdf21 fit, with the enhanced tolerance \(T=3\), compared with CT18A and MSHT20 PDFs. These two PDFs are chosen because they both use the ATLAS WZ inclusive cross-section data at 7 \(\text {TeV}\) and they both evaluate uncertainties in a similar way to the present analysis. In general, the uncertainties of CT18A are somewhat larger than those of MSHT20, even though both represent \(68\%\) CL uncertainties. For the \(x\bar{u}\) and \(x\bar{d}\) sea distributions at low x, the uncertainties of ATLASpdf21 lie between those of CT18A and MSHT20, becoming larger at high x, \(x>0.1\), reflecting the absence of fixed-target DY data. For the \(x\bar{s}\) sea distribution there is a similar pattern, with the low x uncertainties of the ATLAS PDF being comparable to those of CT18A. The uncertainty of the ATLASpdf21 gluon distribution at \(x< 2\times 10^{-3}\) is very similar to that of both CT18A and MSHT20, but the high x gluon uncertainty is somewhat larger for the ATLASpdf21 fit, reflecting the absence of Tevatron jet data. In the valence sector, the \(xu_v\) uncertainties of the ATLASpdf21 fit are comparable to those of CT18A at low x, but larger at high x, reflecting the absence of Tevatron WZ data. For \(xd_v\), the uncertainties are larger for all x except \(x\sim 0.003\)–0.008, reflecting the fact that there is less information about the d-quark than the u-quark in the absence of deuterium target data. Appendix D shows comparisons between the ATLASpdf21 predictions and some of these older data sets which, while not fitted, are generally well described.

Fig. 25
figure 25

ATLASpdf21 fit with full uncertainties (experimental \(T=3\), model, parameterisation) compared with global PDFs: CT18A and MSHT20, focusing on high x behaviour. Top left: \(xu_v\). Top right: \(xd_v\). Middle left: \(x\bar{u}\). Middle right: \(x\bar{d}\). Bottom left: \(x\bar{s}\). Bottom right: xg

In order to focus more on the high x regime, which is important in searches for new physics, the comparisons between the ATLASpdf21 fit, with the enhanced tolerance of \(T=3\), and the CT18A and MSHT20 PDFs, are repeated with a linear-x and log-y scale in Fig. 25. Some discrepancies, not evident before, appear for \(x \ge 0.7\). In this high x region there is no data to determine any of the PDFs. Nevertheless, there is agreement within \({\sim }2\sigma \).

7 Conclusion

A PDF fit is presented using selected proton–proton collision data sets recorded by the ATLAS experiment at the LHC together with the final combined inclusive HERA data in order to assess the impact of the ATLAS data with a minimum of other input data. The resulting PDF set is called ATLASpdf21. The addition of ATLAS inclusive W and Z data to HERA data allows a fit to be made without any constraints imposed on the relationship between \(x\bar{u}\), \(x\bar{d}\) and \(x\bar{s}\), nevertheless, \(x\bar{d}\sim x\bar{u}\) is observed at low x. The strangeness at low x is found not to be as suppressed as found in earlier PDFs such as CT14, MMHT14 and NNPDF3.0, but to be in agreement with modern PDFs such as CT18A, MSHT20 and NNPDF3.1_strange which use these ATLAS inclusive WZ data. This increase in the low x strange was first seen in the ATLASepWZ12 and ATLASepWZ16 PDF analyses, but is now somewhat moderated by further ATLAS data, and the freedom of the parametrisation at low x. These inclusive WZ data also exhibit sensitivity to the valence quark distributions and the gluon distribution, considerably reducing their uncertainty. The ATLAS V+ jets data constrain the light-quark sea at higher x, such that strangeness is strongly suppressed at high x, and impact the high x gluon distribution. The \(t\bar{t}\) data serve to moderately reduce the uncertainty in the high x gluon distribution. Ratios of direct photon production at different proton–proton collision energies have only a mild impact on the gluon PDF, but it is notable that they may now be reliably fitted to NNLO in QCD. The jet production data are sensitive to the gluon distribution at medium to high x and they reduce its uncertainty considerably.

The role of scale uncertainties is considered for all data sets and these theoretical uncertainties are implemented in the fit for the inclusive WZ data, where they are most impactful. The effects of such scale uncertainties are small.

A study is made of excluding high-scale data (with a scale above 500 \(\text {GeV}\)), which may contain subtle effects of physics beyond the Standard Model, from the fit. The resulting PDFs are not strongly affected by this restriction of the fitted kinematic region.

It is observed that the addition of the ATLAS data sets to the HERA data brings the PDFs much closer to the global PDFs of MSHT, CT and NNPDF than to HERAPDF2.0. The ATLASpdf21 PDFs agree with these global fits as well as they agree with each other. Thus, ATLAS data seem to be able to replicate many of the features that the fixed-target deep inelastic scattering and Drell–Yan data plus the Tevatron Drell–Yan data bring to the global PDFs. Using only the HERA and ATLAS data allows a detailed treatment of correlated systematic uncertainties.

Correlations of systematic uncertainties within and between ATLAS data sets are considered, with particular emphasis on the larger systematic uncertainties appertaining to the jet energy scale. Information about correlations between data sets is provided such that the major correlations could be considered by the global fitting groups CT, MSHT and NNPDF in future fits. The effects of these correlations are relatively small, \(O({\sim }1\%\)), at the scale \(Q^2\sim {10{,}000}\) GeV\(^2\) considered for precision physics at the LHC, but large enough to be considered in future precision data analyses such as the measurement of \(m_{W}\) and \(\sin ^2\!\theta _{\text {W}}\), given that such a level of accuracy is now desired to resolve effects of physics beyond the Standard Model in the measurements of Standard Model parameters.