1 Introduction

Redfield (1934) was the first to observe that, below the mixed layer, the concentrations of dissolved nitrate and phosphate vary in approximately constant proportions. The modern estimate of the molar ratio of dissolved nitrate to phosphate is 16:1; it is called the “Redfield ratio.”

The commonly accepted explanation for this phenomenon is related to the biological activity of marine microorganisms. At the ocean surface, in the layer that receives light, phytoplankton use the sun’s energy through photosynthesis, producing oxygen and transforming some of the inorganic carbon and nutrients (nitrogen and phosphorus) into organic matter. The reverse reaction is the oxidation of organic matter, or remineralization. Below the mixed layer, within a water mass, this oxidation is mainly responsible for variations in the concentrations of carbon, oxygen, nitrogen, and phosphorus. That these variations remain in constant ratios means simply that biological activity follows the same rules everywhere in the ocean.

This process is represented by the following chemical reaction:

$$ \begin{array}{c}\hfill {\mathrm{C}}_{\upalpha}{\mathrm{H}}_{\upbeta}{\mathrm{O}}_{\upgamma}{\mathrm{N}}_{\updelta}{\mathrm{P}}_{\upvarepsilon}+{\uppsi}_{{\mathrm{O}}_2}{\mathrm{O}}_2\underset{\mathrm{photosynthesis}}{\overset{\mathrm{oxidation}}{\rightleftarrows }}\kern11em \hfill \\ {}\hfill {\uppsi}_{CO_2}{CO}_2+{\uppsi}_{{\mathrm{N}\mathrm{O}}_3^{-}}{\mathrm{N}\mathrm{O}}_3^{-}+{\uppsi}_{PO_4^{3-}}{PO}_4^{3-}+{\uppsi}_{{\mathrm{H}}^{+}}{\mathrm{H}}^{+}+{\uppsi}_{{\mathrm{H}}_2\mathrm{O}}{\mathrm{H}}_2\mathrm{O}\hfill \end{array} $$
(1)

\( {\mathrm{C}}_{\upalpha}{\mathrm{H}}_{\upbeta}{\mathrm{O}}_{\upgamma}{\mathrm{N}}_{\updelta}{\mathrm{P}}_{\upvarepsilon} \) is a fictitious molecule, which contains the elements entering into the composition of any organic matter. The coefficients α, β, γ, δ, and ε are determined by the different stoichiometric coefficients of the reaction, \( {\uppsi}_{{\mathrm{O}}_2} \), \( {\uppsi}_{CO_2} \), \( {\uppsi}_{{\mathrm{NO}}_3^{-}} \), \( {\uppsi}_{PO_4^{3-}} \), \( {\uppsi}_{{\mathrm{H}}^{+}} \), and \( {\uppsi}_{{\mathrm{H}}_2\mathrm{O}} \).

The Redfield ratios are the ratios of these stoichiometric coefficients. Let us take, for example, the concentrations of oxygen and nitrate. When the oxygen concentration decreases by \( {\Delta \mathrm{O}}_2 \), the concentration of nitrate increases by \( {\Delta \mathrm{NO}}_3^{-}=\frac{\uppsi_{{\mathrm{NO}}_3^{-}}}{\uppsi_{{\mathrm{O}}_2}}{\Delta \mathrm{O}}_2 \). These concentrations therefore vary according to a constant ratio, called \( {R}_{\mathrm{ON}} \) and defined by

$$ {R}_{\mathrm{O}\mathrm{N}}=\frac{{\varDelta \mathrm{O}}_2}{{\varDelta \mathrm{NO}}_3^{-}}=\frac{\uppsi_{{\mathrm{O}}_2}}{{\ \uppsi}_{{\mathrm{NO}}_3^{-}}} $$
(2)

Similarly, we define the coefficient \( {R}_{\mathrm{OP}} \) as

$$ {R}_{\mathrm{O}\mathrm{P}}=\frac{{\varDelta \mathrm{O}}_2}{{\varDelta PO}_4^{3-}} = \frac{\uppsi_{{\mathrm{O}}_2}}{\uppsi_{PO_4^{3-}}} $$
(3)

and the coefficient \( {R}_{\mathrm{NP}} \), which can be deduced from the two previous ones,

$$ {R}_{NP}=\frac{{\varDelta \mathrm{NO}}_3^{-}}{{\varDelta PO}_4^{3-}} = \frac{{\ \uppsi}_{{\mathrm{NO}}_3^{-}}}{\uppsi_{PO_4^{3-}}}=\frac{R_{\mathrm{OP}}}{R_{\mathrm{ON}}} $$
(4)

The concept of Redfield therefore asserts the existence of coefficients \( {R}_{\mathrm{ON}} \), \( {R}_{\mathrm{OP}} \), and \( {R}_{\mathrm{NP}} \) that are constant in both time and space. Yet, in the oceans, several processes, such as physiological plasticity, bacterial denitrification, nitrogen fixation, or differential remineralization, can produce small deviations from this stoichiometry (Hupe and Karstensen 2000).

Determination of Redfield ratios in seawater is of great interest because of their role in the cycles of nutrients and carbon (Goyet and Brewer 1993; Roy-Barman and Jeandel 2011).

Human activities are currently causing a dramatic increase of carbon dioxide in the atmosphere. As a result, the ocean absorbs more and more carbon dioxide, which brings about unprecedented changes in ocean biogeochemical processes and in marine ecosystems. Describing how this “anthropogenic” activity continues to drive carbon toward the ocean interior is crucial today (Goyet and Touratier 2009; Touratier et al. 2012).

Typical values of the Redfield ratios are those given by Redfield et al. (1963); the stoichiometric coefficients \( {\uppsi}_{{\mathrm{O}}_2} \), \( {\uppsi}_{{\mathrm{NO}}_3^{-}} \), and \( {\uppsi}_{PO_4^{3-}} \) of Eq. (1) are equal to 138, 16, and 1, respectively, and lead to \( {R}_{\mathrm{OP}}=138 \), \( {R}_{\mathrm{NP}}=16 \), and \( {R}_{\mathrm{ON}}\approx 9 \) (all numerical values are rounded to the nearest whole number).
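As a quick numerical check of Eqs. (2) to (4), the short Python sketch below recomputes the three ratios from the coefficients quoted above. The variable names are ours and are only illustrative.

```python
# Stoichiometric coefficients of Eq. (1), as quoted from Redfield et al. (1963).
psi_O2 = 138.0
psi_NO3 = 16.0
psi_PO4 = 1.0

# Redfield ratios, Eqs. (2)-(4).
R_ON = psi_O2 / psi_NO3   # oxygen consumed per nitrate released
R_OP = psi_O2 / psi_PO4   # oxygen consumed per phosphate released
R_NP = psi_NO3 / psi_PO4  # nitrate released per phosphate released

# Eq. (4) also gives R_NP as the quotient of the two oxygen-based ratios.
assert abs(R_NP - R_OP / R_ON) < 1e-12

print(R_ON, R_OP, R_NP)  # 8.625 138.0 16.0
```

The unrounded value 138/16 = 8.625 is the one that rounds to 9 in the text.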

Since then, numerous studies have been conducted on the subject (see Sect. 3). They all confirm the concept of Redfield, but the values they report differ from one study to another: from 7 to 11 for \( {R}_{\mathrm{ON}} \), from 95 to 190 for \( {R}_{\mathrm{OP}} \), and from 12 to 26 for \( {R}_{\mathrm{NP}} \). Moreover, while Redfield established constant values at any point of the world ocean (the oceans are in open communication with each other and constitute the world ocean), the majority of observations indicate that the Redfield ratios vary with ocean basin and depth. This variability could be further linked to the role of anthropogenic carbon, which does not penetrate the ocean in the same way everywhere.

We present here a new method for determining the Redfield ratios. Its main advantage is that it is theoretically applicable everywhere (subject to the availability of sufficient independent data).

2 Water mass properties

2.1 Water masses

A “water mass” is a volume of water which has been formed in the same way in the same place (Tomczak 1999). As soon as a water mass has left the surface, its potential temperature and salinity represent the signature of this water mass and can be used as “conservative tracers.” Thus, the observed variations of potential temperature and salinity, from one ocean area (geographical or depth) to another, are due only to the mixing of various water masses (except in very few areas, such as near hydrothermal vents).

Consider an area of the ocean where N water masses are present. In this area, any seawater sample results from the mixing of these water masses, in the proportions \( {k}_1,{k}_2,\dots, {k}_N \), called mixing coefficients. These coefficients satisfy

$$ \forall i\in \left\{1,2,\dots, N\right\},0\le {k}_i\le 1 $$
(5)

and

$$ {\sum}_{i=1}^N{k}_i=1 $$
(6)

Any conservative characteristic C of a given sample satisfies

$$ C={\sum}_{i=1}^N{k}_i{C}_i $$
(7)

where \( {C}_i \) is the characteristic of the water mass i. The value of C depends only upon the mixing of the water masses at the considered location.
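To make Eqs. (5) to (7) concrete, here is a minimal Python sketch of conservative mixing. The function name `mix` and all end-member values are hypothetical and chosen only for illustration; they are not taken from the data used in this study.

```python
def mix(k, c):
    """Eq. (7): value of a conservative characteristic in a mixture."""
    if len(k) != len(c):
        raise ValueError("one mixing coefficient is needed per water mass")
    if any(ki < 0.0 or ki > 1.0 for ki in k):
        raise ValueError("each mixing coefficient must lie in [0, 1] (Eq. 5)")
    if abs(sum(k) - 1.0) > 1e-9:
        raise ValueError("mixing coefficients must sum to 1 (Eq. 6)")
    return sum(ki * ci for ki, ci in zip(k, c))

# Hypothetical end-member values for three water masses (illustration only).
theta_sources = [2.0, 10.0, 18.0]   # potential temperature, deg C
salt_sources = [34.9, 35.2, 36.0]   # salinity

k = [0.5, 0.3, 0.2]                 # mixing coefficients of one sample
theta = mix(k, theta_sources)       # about 7.6 deg C
salt = mix(k, salt_sources)         # about 35.21
print(theta, salt)
```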

2.2 Conservative tracers

There are conservative tracers other than potential temperature and salinity. Furthermore, it is also possible to construct a conservative tracer by combining two non-conservative tracers. For example, oxygen, nitrogen, and phosphorus are not considered conservative tracers, since their concentrations vary according to biological activity (as mentioned in Sect. 1). Their concentrations can be decomposed as the sum of one term reflecting the conservative mixing of water masses and a second term reflecting local biochemical processes.

$$ \left[{\mathrm{O}}_2\right]={\sum}_{i=1}^N{k}_i{\left[{\mathrm{O}}_2\right]}_i-\varDelta {\mathrm{O}}_2 $$
(8)
$$ \left[{\mathrm{NO}}_3^{-}\right]={\sum}_{i=1}^N{k}_i{\left[{\mathrm{NO}}_3^{-}\right]}_i+\varDelta {\mathrm{NO}}_3^{-} $$
(9)
$$ \left[{PO}_4^{3-}\right]={\sum}_{i=1}^N{k}_i{\left[{PO}_4^{3-}\right]}_i+{\varDelta PO}_4^{3-} $$
(10)

\( \left[{\mathrm{O}}_2\right] \), \( \left[{\mathrm{NO}}_3^{-}\right] \), and \( \left[{PO}_4^{3-}\right] \) are the sample concentrations (in μmol/kg) of oxygen, nitrate, and phosphate, respectively. \( {\left[{\mathrm{O}}_2\right]}_i \), \( {\left[{\mathrm{NO}}_3^{-}\right]}_i \), and \( {\left[{PO}_4^{3-}\right]}_i \) are the concentrations of the water mass i. The minus sign in Eq. (8) is a reminder that oxygen is consumed when nitrate and phosphate are produced.

Redfield’s principle of variations in constant proportions allows the construction of the conservative tracers NO and PO (Broecker 1974).

$$ \mathrm{NO}=\left[{\mathrm{O}}_2\right]+{R}_{\mathrm{O}\mathrm{N}}\left[{\mathrm{NO}}_3^{-}\right] $$
(11)
$$ PO=\left[{\mathrm{O}}_2\right]+{R}_{\mathrm{O}\mathrm{P}}\left[{PO}_4^{3-}\right] $$
(12)

NO and PO are called “composite tracers” (each consisting of two non-conservative characteristics), and they are themselves conservative (this is strictly true only within the validity of constant Redfield ratios). Indeed, we get, by combining Eqs. (8), (9), and (11),

$$ \mathrm{NO}={\sum}_{i=1}^N{k}_i{\left[{\mathrm{O}}_2\right]}_i-\varDelta {\mathrm{O}}_2+{R}_{\mathrm{O}\mathrm{N}}\left({\sum}_{i=1}^N{k}_i{\left[{\mathrm{NO}}_3^{-}\right]}_i+\varDelta {\mathrm{NO}}_3^{-}\right) $$
(13)
$$ ={\sum}_{i=1}^N{k}_i\left({\left[{\mathrm{O}}_2\right]}_i+{R}_{\mathrm{O}\mathrm{N}}{\left[{\mathrm{NO}}_3^{-}\right]}_i\right)-\varDelta {\mathrm{O}}_2+{R}_{\mathrm{O}\mathrm{N}}\varDelta {\mathrm{NO}}_3^{-} $$
(14)

The variations of concentrations respect the equation: \( {\Delta \mathrm{O}}_2={R}_{\mathrm{ON}}{\Delta \mathrm{NO}}_3^{-} \) (cf. Eq. (2)), so

$$ \mathrm{NO}={\sum}_{i=1}^N{k}_i\left({\left[{\mathrm{O}}_2\right]}_i+{R}_{\mathrm{O}\mathrm{N}}{\left[{\mathrm{NO}}_3^{-}\right]}_i\right) $$
(15)

By defining the characteristic NO i of the water mass,

$$ {\mathrm{NO}}_i={\left[{\mathrm{O}}_2\right]}_i+{R}_{\mathrm{O}\mathrm{N}}{\left[{\mathrm{NO}}_3^{-}\right]}_i $$
(16)

one finds again an equation (see Eq. (7)) for a conservative tracer,

$$ \mathrm{NO}={\sum}_{i=1}^N{k}_i{\mathrm{NO}}_i $$
(17)

Thus, NO (and similarly PO) depends only upon the mixing of water masses at the considered point.
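The derivation of Eqs. (13) to (17) can be checked numerically. In the sketch below, all concentrations, the mixing coefficients, and the value taken for the ratio are hypothetical; the point is only that [O2] and [NO3−] change with the amount of remineralization while NO does not.

```python
R_ON = 9.0  # assumed constant Redfield ratio (rounded value quoted in Sect. 1)

# Hypothetical end-member concentrations in umol/kg (illustration only).
o2_sources = [250.0, 180.0]
no3_sources = [10.0, 30.0]
k = [0.4, 0.6]  # mixing coefficients of the sample, summing to 1

def sample(delta_no3):
    """Return ([O2], [NO3-]) after mixing plus remineralization, Eqs. (8)-(9)."""
    delta_o2 = R_ON * delta_no3  # Eq. (2)
    o2 = sum(ki * c for ki, c in zip(k, o2_sources)) - delta_o2
    no3 = sum(ki * c for ki, c in zip(k, no3_sources)) + delta_no3
    return o2, no3

def tracer_no(o2, no3):
    """Composite tracer NO, Eq. (11)."""
    return o2 + R_ON * no3

# Eq. (17): NO predicted from the end-members and mixing coefficients alone.
no_from_mixing = sum(
    ki * tracer_no(o, n) for ki, o, n in zip(k, o2_sources, no3_sources)
)

for delta_no3 in (0.0, 2.0, 5.0):
    o2, no3 = sample(delta_no3)
    # [O2] and [NO3-] depend on delta_no3, but NO stays at its mixing value.
    print(delta_no3, o2, no3, tracer_no(o2, no3), no_from_mixing)
```

If a value other than the one used in `sample` were passed to `tracer_no`, the last two printed columns would no longer agree, which is the property exploited in Sect. 3.3.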

3 Theory

In order to determine the values of the Redfield ratios, existing methods generally seek to determine the non-conservative fractions ∆O2, \( {\Delta \mathrm{NO}}_3^{-} \), and \( {\Delta PO}_4^{3-} \) of Eqs. (8), (9), and (10). These fractions are proportional to each other; proportionality coefficients are the Redfield ratios.

A first category of methods (Alvarez-Borrego et al. 1975; Castro et al. 1998; Hupe and Karstensen 2000; Schneider et al. 2005) begins by determining, for a given area, the various water masses present and their characteristics, \( {\left[{\mathrm{O}}_2\right]}_i \), \( {\left[{\mathrm{NO}}_3^{-}\right]}_i \), and \( {\left[{PO}_4^{3-}\right]}_i \). Then, for each sample, the mixing coefficients \( {k}_i \) must be calculated. These methods then determine the conservative parts of the oxygen and nutrient concentrations, \( {\sum}_{i=1}^N{k}_i{\left[{\mathrm{O}}_2\right]}_i \), \( {\sum}_{i=1}^N{k}_i{\left[{\mathrm{NO}}_3^{-}\right]}_i \), and \( {\sum}_{i=1}^N{k}_i{\left[{PO}_4^{3-}\right]}_i \). The non-conservative fractions are obtained by subtracting the conservative parts from the corresponding measured concentrations \( \left[{\mathrm{O}}_2\right] \), \( \left[{\mathrm{NO}}_3^{-}\right] \), and \( \left[{PO}_4^{3-}\right] \). The disadvantage of these methods lies in the large estimation errors of the mixing coefficients. First, the characteristics of a water mass can never be determined exactly. Second, the mixing coefficients can be calculated only sample by sample, without the possibility of reducing errors through a global calculation over all samples.

A second category of methods does not explicitly determine the mixing coefficients. From assumptions on the various contributing water masses, they build relationships (other than simple proportionality) between concentrations [O2], \( \left[{\mathrm{NO}}_3^{-}\right], \) and \( \left[{PO}_4^{3-}\right] \). The identification of the parameters of these relationships leads to the Redfield ratios. But, these methods are only applicable either to specific areas (Takahashi et al. 1985; Minster and Boulahdid 1987; Shaffer et al. 1999; Li and Peng 2002; Schroeder et al. 2010) or to areas where at most two water masses mix (Anderson and Sarmiento 1994; Placenti et al. 2013).

The advantage of the method presented here is that first, it does not require the identification of characteristics of each water mass or any knowledge about the mixing coefficients. Thus, it eliminates a primary source of errors. Secondly, if there are enough data available, it can be applied everywhere, whatever the area, and its application is always done in the same way.

3.1 Assumptions and notations

Let us consider P samples of seawater, spread in an area of the world ocean where N water masses (known also as “sources”) are present. We suppose that for each sample, we have measurements of M conservative tracers with P ≥ M.

We can build the matrix D (P rows × M columns) of data,

$$ D=\left(\begin{array}{cccc}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \\ {}\hfill {d}_{l,1}\hfill & \hfill {d}_{l,2}\hfill & \hfill \cdots \hfill & \hfill {d}_{l,M}\hfill \\ {}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \end{array}\right) $$
(18)

For each measurement point l, 1 ≤ l ≤ P, the corresponding values \( {d}_{l,1},{d}_{l,2},\dots, {d}_{l,M} \) of the M conservative tracers are in row l. For example, the first column of the matrix D can be composed of potential temperature data and the second column of salinity data.

We denote by \( {\underset{\_}{1}}_P \) the vector of P rows and one column, whose elements are all equal to 1,

$$ {\underset{\_}{1}}_P=\left(\begin{array}{c}1\\ {}\vdots \\ {}1\end{array}\right) $$
(19)

The starting point of our method is based on the following question:

Is there a vector \( \underset{\_}{a} \) of M real coordinates, \( \underset{\_}{a}=\left(\begin{array}{c}{a}_1\\ {}{a}_2\\ {}\vdots \\ {}{a}_M\end{array}\right) \), such that

$$ D\ \underset{\_}{a}={\underset{\_}{1}}_P $$
(20)

Let K be the matrix (P rows × N columns) of mixing coefficients,

$$ K=\left(\begin{array}{ccccc}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \\ {}\hfill {k}_{l,1}\hfill & \hfill {k}_{l,2}\hfill & \hfill \cdots \hfill & \hfill \hfill & \hfill {k}_{l,N}\hfill \\ {}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \end{array}\right) $$
(21)

For each sample l, 1 ≤ l ≤ P, the corresponding mixing coefficients \( {k}_{l,1},{k}_{l,2},\dots, {k}_{l,N} \) are in row l. Their sum being equal to 1 (cf. Eq. (6)), we have

$$ K\ {\underset{\_}{1}}_N={\underset{\_}{1}}_P $$
(22)

where \( {\underset{\_}{1}}_N \) is the vector of N rows × 1 column, whose elements are all equal to 1.

Let W be the matrix (N rows × M columns) of the characteristics of sources,

$$ W=\left(\begin{array}{cccc}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \\ {}\hfill {w}_{l,1}\hfill & \hfill {w}_{l,2}\hfill & \hfill \cdots \hfill & \hfill {w}_{l,M}\hfill \\ {}\hfill \vdots \hfill & \hfill \hfill & \hfill \hfill & \hfill \vdots \hfill \end{array}\right) $$
(23)

For each water mass l, 1 ≤ l ≤ N, the corresponding conservative tracers \( {w}_{l,1},{w}_{l,2},\dots, {w}_{l,M} \) are in row l. As all considered tracers are conservative, Eq. (7) becomes, in matrix form,

$$ D=K\ W $$
(24)

It is necessary to define K and W to demonstrate that the vector \( \underset{\_}{a} \) exists, but it is not necessary to know their values to apply our method (and this is the main advantage of this method).

3.2 Existence of a linear relation between conservative tracers

Let us now focus on solving Eq. (20). In other words, do the data for the M conservative tracers lie on a hyperplane of \( {\mathbb{R}}^M \)?

Furthermore, suppose that the N sources are independent (none is the result of a mixing of the others). In practice, it is mainly the knowledge of the hydrology of the ocean area which ensures correct identification of the independent water masses. Thus, the rank of the matrix D is equal to the minimum of N and M. The following three cases are possible (referring to the results on the resolution of a linear system):

  • Case 1, N > M (more water masses than conservative tracers)

Then, the rank of matrix D is equal to M. So we know that the only possible solution of the equation \( D\ \underset{\_}{a}={\underset{\_}{1}}_P \) is the solution in the least squares sense,

$$ {\underset{\_}{a}}_{mc}={\left({D}^{\prime }D\right)}^{-1}{D}^{\prime }\ {\underset{\_}{1}}_P $$
(25)

The vector \( {\underset{\_}{a}}_{mc} \) minimizes the norm of the residuals of the equation \( D\ \underset{\_}{a}={\underset{\_}{1}}_P \), but there is no guarantee that this minimum is equal to zero or in other words that \( {\underset{\_}{a}}_{mc} \) is an exact solution of the equation. Moreover, in general, it is not! So in this case, there exists a priori no hyperplane of the tracers.

  • Case 2, N < M (fewer water masses than conservative tracers)

Then, the rank of D is equal to N, and nothing can be concluded directly about \( D\ \underset{\_}{a}={\underset{\_}{1}}_P \).

Consider the same equation, not anymore on data but on sources,

$$ W\ \underset{\_}{a}={\underset{\_}{1}}_N $$
(26)

The rank of matrix W is equal to N. So, Eq. (26) has at least one (exact) solution, the vector

$$ {\underset{\_}{a}}_s={W}^{\prime }{\left(W\ {W}^{\prime}\right)}^{-1}\ {\underset{\_}{1}}_N $$
(27)

(There is no guarantee that this solution is unique, but this is not a problem for our final objective.)

This solution satisfies

$$ W\ {\underset{\_}{a}}_s={\underset{\_}{1}}_N $$
(28)

Hence, by multiplying on the left by K,

$$ K\ W\ {\underset{\_}{a}}_s=K\ {\underset{\_}{1}}_N $$
(29)

or by using Eq. (22),

$$ K\ W\ {\underset{\_}{a}}_s={\underset{\_}{1}}_P $$
(30)

and finally, with Eq. (24),

$$ D\ {\underset{\_}{a}}_s={\underset{\_}{1}}_P $$
(31)

In other words, \( {\underset{\_}{a}}_s \) is also a solution of the first equation on data (Eq. (20)). So in this case, the tracers form at least one hyperplane.
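The chain of Eqs. (26) to (31) can be verified on a small noise-free example with N = 2 sources and M = 3 tracers. The sketch below is written in plain Python; the matrices W and K hold hypothetical values, and the helper functions are ours. In practice W and K are unknown, so this only illustrates the existence argument, not the estimation procedure.

```python
def matvec(A, x):
    """Product of matrix A (list of rows) with vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    (p, q), (r, s) = A
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("singular matrix: the two sources are not independent")
    return [[s / det, -q / det], [-r / det, p / det]]

# Hypothetical source characteristics (rows: water masses; columns: T, S, NO).
W = [[2.0, 34.9, 340.0],
     [10.0, 35.2, 450.0]]
# Hypothetical mixing coefficients of five samples; each row sums to 1.
K = [[1.0, 0.0], [0.75, 0.25], [0.5, 0.5], [0.2, 0.8], [0.0, 1.0]]

D = matmul(K, W)                                   # Eq. (24)
Wt = transpose(W)
a_s = matvec(Wt, matvec(inverse_2x2(matmul(W, Wt)), [1.0, 1.0]))  # Eq. (27)

residuals = [abs(value - 1.0) for value in matvec(D, a_s)]        # Eq. (31)
print(max(residuals))  # zero up to floating-point rounding
```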

  • Case 3, N = M (as many water masses as conservative tracers)

This case falls in both the second case (rank of D equal to N) and in the first case (rank of D equal to M). The equation \( D\ \underset{\_}{a}={\underset{\_}{1}}_P \) therefore has a unique solution, which is

$$ {\underset{\_}{a}}_s={\left({D}^{\prime }D\right)}^{-1}{D}^{\prime }\ {\underset{\_}{1}}_P $$
(32)

Or equivalently, since here W is invertible,

$$ {\underset{\_}{a}}_s={W}^{-1}\ {\underset{\_}{1}}_N $$
(33)

In conclusion, if N ≤ M (as many or more conservative tracers than water masses), we know that there exists a linear relation between the M conservative tracers.

We will see in Sect. 4 how to find the coefficients of this relation, using a total least squares algorithm (Markovsky and Van Huffel 2007). For now, we have demonstrated the existence of a relation, but we have no means of obtaining it. If N < M, the vector \( {\underset{\_}{a}}_s={W}^{\prime }{\left(W\ {W}^{\prime}\right)}^{-1}\ {\underset{\_}{1}}_N \) (cf. Eq. (27)) cannot be computed since the matrix W is unknown. If N = M, Eq. (33) fails to provide the solution (since W is still unknown) and Eq. (32) cannot be applied directly. Indeed, the computation of the pseudoinverse \( {\left({D}^{\prime }D\right)}^{-1}{D}^{\prime } \) of the matrix D is so sensitive to the unavoidable noise in the measured data that it often leads to unusable results.

3.3 Principle of the determination of the Redfield ratios

Let us now assume that N ≤ M. This restricts the use of our method to areas where sufficient data are available compared to the water masses involved.

How does the existence of a linear relation between the conservative tracers enable the determination of the Redfield ratios? Just take, among the tracers, the composite tracer NO or PO. We now know that there exists a linear relation between this composite tracer and the other considered tracers. The estimate of this linear relationship that has the minimum error will correspond to the correct value of the coefficient \( {R}_{\mathrm{ON}} \) or \( {R}_{\mathrm{OP}} \) (used in the calculation of NO or PO).

For example, consider an area where we have at our disposal data for potential temperature T, salinity S, oxygen \( \left[{\mathrm{O}}_2\right] \), and nitrate \( \left[{\mathrm{NO}}_3^{-}\right] \) and where there is a maximum of three water masses. The tracer NO is given by Eq. (11), in which the value of \( {R}_{\mathrm{ON}} \) is the unknown to be determined. From now on, we know that there exist real numbers \( {a}_1 \), \( {a}_2 \), and \( {a}_3 \) such that, when the value taken for \( {R}_{\mathrm{ON}} \) is the correct one, the data of every sample in the area satisfy the equation

$$ {a}_1\ T+{a}_2\ S+{a}_3\ \mathrm{NO}=1 $$
(34)

The correct value of \( {R}_{\mathrm{ON}} \) minimizes the error of reconstruction of Eq. (34).

Note that we can write Eq. (34) in the form

$$ {a}_1\ T+{a}_2\ S+{a}_3\ \left[{\mathrm{O}}_2\right]+{a}_3\ {R}_{\mathrm{O}\mathrm{N}}\ \left[{\mathrm{NO}}_3^{-}\right]=1 $$
(35)

Hence, if we set

$$ {a}_4={a}_3\ {R}_{\mathrm{ON}} $$
(36)

we have

$$ {a}_1\ T+{a}_2\ S+{a}_3\ \left[{\mathrm{O}}_2\right]+{a}_4\ \left[{\mathrm{NO}}_3^{-}\right]=1 $$
(37)

Thus, we can also determine \( {a}_1 \), \( {a}_2 \), \( {a}_3 \), and \( {a}_4 \) and then apply

$$ {R}_{\mathrm{ON}}=\frac{a_4}{a_3} $$
(38)
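The principle of Eqs. (37) and (38) can be illustrated on synthetic, noise-free data. In the Python sketch below, three hypothetical water masses are mixed, a known ratio `TRUE_R_ON` is used to apply remineralization, and the four coefficients are recovered by solving a 4 × 4 linear system built from four samples. All numerical values and helper names are assumptions made for this illustration. With real, noisy data such a direct solve is not appropriate, and this sketch is not the estimator described in Sect. 4.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: samples are not in general position")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

TRUE_R_ON = 9.0  # ratio used to generate the synthetic data
# Hypothetical water masses: (T, S, [O2], [NO3-]).
sources = [(2.0, 34.9, 250.0, 20.0),
           (10.0, 35.2, 180.0, 30.0),
           (18.0, 36.0, 210.0, 5.0)]
# Hypothetical samples: (k1, k2, k3, delta_NO3).
recipes = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0),
           (0.0, 0.0, 1.0, 0.0), (0.3, 0.3, 0.4, 3.0)]

def make_sample(k1, k2, k3, delta_no3):
    """Mix the sources (Eq. (7)), then apply remineralization (Eqs. (8)-(9))."""
    ks = (k1, k2, k3)
    t, s, o2, no3 = (sum(k * src[j] for k, src in zip(ks, sources)) for j in range(4))
    return [t, s, o2 - TRUE_R_ON * delta_no3, no3 + delta_no3]

D = [make_sample(*recipe) for recipe in recipes]
a = solve(D, [1.0] * 4)          # coefficients a1..a4 of Eq. (37)
R_ON_estimate = a[3] / a[2]      # Eq. (38)
print(R_ON_estimate)             # close to TRUE_R_ON
```

At least one sample must have undergone some remineralization (non-zero `delta_NO3`); otherwise the four rows carry no information on the ratio and the system is singular.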

4 Data and calculation

4.1 Conservative tracers

We have used Global Data Analysis Project (GLODAP) data (Key et al. 2004) covering the global ocean by combining a wide set of oceanographic measurements. Measurements of different expeditions since 1990 were synthesized and calibrated to provide a global database. In GLODAP, there are measurements of temperature, salinity, oxygen, nitrate, and phosphate. Oxygen and nitrate lead to the conservative tracer NO, while oxygen and phosphate lead to PO.

Strictly speaking, because of the slight compressibility of seawater, which causes warming as pressure increases, the “in situ” temperature (archived in GLODAP) is not exactly a conservative parameter because it varies with pressure. The laws of thermodynamics make it possible to calculate the temperature corrected for the pressure effect (see the official site http://www.teos-10.org/ about the thermodynamic equation of seawater). It is this conservative temperature that we calculated and used. Similarly, we calculated the “preformed” salinity (IOC, SCOR, and IAPSO 2010), which is strictly conservative. It is also calculated from the measured salinity, according to the same laws of thermodynamics, and it is the one we used here.

Moreover, the measurements available in GLODAP allow us to build another composite tracer, TrOCA0. It is made from temperature and total alkalinity (Dickson 1981). Its precise definition and its properties are described in Touratier and Goyet (2004) and Touratier et al. (2007). Here, its primary feature is that it is conservative.

So, we have the following five conservative tracers: potential temperature T, (preformed) salinity S, NO, PO, and TrOCA0. We recall that we have demonstrated in the preceding sections the existence of a linear relation between any set of M conservative tracers, in any area comprising at most M water masses. In addition, including NO or PO among the M tracers makes it possible to determine the Redfield ratios; the correct values of \( {R}_{\mathrm{ON}} \) or \( {R}_{\mathrm{OP}} \) are those minimizing the error of reconstruction of the linear relation between tracers.

4.2 Measurement noise

Let us continue with the example already given in Sect. 3.3 of computing \( {R}_{\mathrm{ON}} \) in an ocean area with a maximum of three water masses (in an area with four water masses, we would use TrOCA0 in addition to potential temperature and salinity). In order to determine \( {R}_{\mathrm{ON}} \), we have two possibilities. The first one is to directly determine the four coefficients of the linear Eq. (37) and then to calculate \( {R}_{\mathrm{ON}} \) using Eq. (38). The second one is a minimization loop on the values of \( {R}_{\mathrm{ON}} \): we successively assume different values for \( {R}_{\mathrm{ON}} \). For each assumed value, we compute the tracer NO (Eq. (11)) and we determine the three coefficients of Eq. (34) as well as the norm of its residuals. Finally, we choose the optimal value of \( {R}_{\mathrm{ON}} \) as the one corresponding to the smallest norm of residuals.

If the data were free from noise, the above two options would lead to the same result. But of course, the available data are noisy! They are assumed to be affected by additive Gaussian noise with zero mean and a standard deviation depending on the data type: 0.001 °C for temperature, 0.0001 for salinity, 0.02 μmol/kg for nitrate, 0.005 μmol/kg for phosphate, and 2 μmol/kg for oxygen (Sabine et al. 2005 (www.seabird.com); Oudot et al. 1998).

The presence of noise makes the choice of the calculation method crucial. The second of the two possibilities mentioned above is not satisfactory because it too often leads to a divergence of the minimization algorithm. Thus, we wrote our calculation program by choosing the first option, i.e., by decomposing the tracer NO to arrive at Eq. (37).

Similarly, in order to calculate \( {R}_{\mathrm{OP}} \), we must determine the four coefficients \( {b}_1 \), \( {b}_2 \), \( {b}_3 \), and \( {b}_4 \) of the equation

$$ {b}_1\ T+{b}_2\ S+{b}_3\ \left[{\mathrm{O}}_2\right]+{b}_4\ \left[{PO}_4^{3-}\right]=1 $$
(39)

then calculate

$$ {R}_{\mathrm{OP}}=\frac{b_4}{b_3} $$
(40)

The determinations of \( {R}_{\mathrm{ON}} \) and \( {R}_{\mathrm{OP}} \) are thus made independently. In theory, one can imagine a simultaneous calculation of \( {R}_{\mathrm{ON}} \) and \( {R}_{\mathrm{OP}} \): it suffices to construct a double minimization loop over the values of both \( {R}_{\mathrm{ON}} \) and \( {R}_{\mathrm{OP}} \), in order to minimize the residuals of an equation involving both NO and PO,

$$ {\alpha}_1\ T+{\alpha}_2\ S+{\alpha}_3\ \mathrm{NO}+{\alpha}_4\ PO=1 $$
(41)

In practice, this double loop, like the single loop considered earlier, often diverges (here again, because of the presence of noise in the data). We therefore did not retain it as a valid algorithm.

Note that, since we cannot use both NO and PO tracers simultaneously, we are limited to at most four conservative tracers: potential temperature, salinity, TrOCA0, and NO for the estimation of \( {R}_{\mathrm{ON}} \); and potential temperature, salinity, TrOCA0, and PO for the estimation of \( {R}_{\mathrm{OP}} \). Consequently, we are limited to studying ocean areas with a maximum of four water masses. For most ocean areas, as shown below (for example, for the Atlantic Ocean), this is not a limitation since a large area can often be split into smaller areas. Yet, for an ocean area with more than four water masses, the calculation algorithm would remain the same, but it would require additional conservative tracers (CFCs, for example).

4.3 Estimation of the Redfield ratios

Returning to Eq. (37), it is now necessary to estimate its coefficients in order to determine \( {R}_{\mathrm{ON}} \) (with Eq. (38)). The approach for calculating \( {R}_{\mathrm{OP}} \) is analogous (Eqs. (39) and (40)).

It is a least squares problem, but one in which the data are noisy and have different standard deviations. Furthermore, in Eq. (37), the right-hand side is not zero but equal to 1. The appropriate robust estimator is therefore the generalized total least squares estimator (Markovsky and Van Huffel 2007). In order to compute this estimator, we write the equations

$$ {a}_1\ T+{a}_2\ S+{a}_3\ \left[{\mathrm{O}}_2\right]+{a}_4\ \left[{\mathrm{NO}}_3^{-}\right]-1=0 $$
(42)

or

$$ {a}_1\ T+{a}_2\ S+{a}_4\ \left[{\mathrm{NO}}_3^{-}\right]-1=-{a}_3\ \left[{\mathrm{O}}_2\right] $$
(43)

or finally

$$ {x}_1\ T+{x}_2\ S+{x}_3\ \left[{\mathrm{NO}}_3^{-}\right]+{x}_4=\left[{\mathrm{O}}_2\right] $$
(44)

or in matrix form

$$ A\ \underset{\_}{x}=\underset{\_}{b} $$
(45)

For every measurement point l, let \( {T}_l \), \( {S}_l \), \( {\left[{\mathrm{NO}}_3^{-}\right]}_l \), and \( {\left[{\mathrm{O}}_2\right]}_l \) represent the corresponding data values. The first column of the matrix A contains potential temperature data, the second column those of salinity, the third column those of nitrate, and the fourth column consists of constant values equal to 1.

$$ A=\left(\begin{array}{cccc}\vdots & \vdots & \vdots & 1\\ {}{T}_l& {S}_l& {\left[{\mathrm{NO}}_3^{-}\right]}_l& 1\\ {}\vdots & \vdots & \vdots & 1\end{array}\right) $$
(46)

The vector \( \underset{\_}{b} \) contains the measurements of oxygen,

$$ \underset{\_}{b}=\left(\begin{array}{c}\vdots \\ {}{\left[{\mathrm{O}}_2\right]}_l\\ {}\vdots \end{array}\right) $$
(47)

The vector

$$ \underset{\_}{x}=\left(\begin{array}{c}{x}_1\\ {}{x}_2\\ {}{x}_3\\ {}{x}_4\end{array}\right) $$
(48)

is that of the parameters to be estimated.

Equation (38), which allowed us to determine the value of \( {R}_{\mathrm{ON}} \), becomes

$$ {R}_{\mathrm{ON}}=-{x}_3 $$
(49)
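For reference, the step from Eq. (43) to Eq. (44) can be made explicit. Assuming \( {a}_3\ne 0 \) (an assumption already implicit in Eq. (38)), dividing Eq. (43) by \( -{a}_3 \) and comparing term by term with Eq. (44) gives the identification below, from which Eq. (49) follows directly:

$$ {x}_1=-\frac{a_1}{a_3},\kern1em {x}_2=-\frac{a_2}{a_3},\kern1em {x}_3=-\frac{a_4}{a_3}=-{R}_{\mathrm{ON}},\kern1em {x}_4=\frac{1}{a_3} $$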

The non-zero right-hand side of the initial Eq. (37) therefore results in a constant column in the matrix A (i.e., a column without noise). To get an estimate as robust as possible, the vector \( \underset{\_}{b} \) must contain the noisiest data (oxygen).

Furthermore, our problem is an “ill-posed” inverse problem; the matrix containing all the measurement data is ill-conditioned due to both the additive noise and the significant differences in the orders of magnitude and accuracies among the tracers. This ill-conditioning prohibits any direct pseudoinverse solution. Instead, in order to “regularize” the problem, the estimation algorithm provides for truncation, where necessary, of the smallest singular values of this matrix.

Provided that a solution exists, the method of generalized total least squares (Van Huffel and Vandewalle 1989; Markovsky and Van Huffel 2007) allows us to solve the problem.

The method takes into account not only the measurement noise of all the data (in vector \( \underset{\_}{b} \) and in matrix A of Eq. (45)) but also the different values of the standard deviations. We assume that all the data are disturbed by additive Gaussian noise with zero mean but with a standard deviation depending on the data type: 0.001 °C for temperature, 0.0001 for salinity, 0.02 μmol/kg for nitrate, 0.005 μmol/kg for phosphate, and 2 μmol/kg for oxygen.

Furthermore, the method gives an analytical formulation of the result (based upon the singular value decomposition of the data matrix) instead of searching numerically for the optima of a cost function. So, the solution is much more reliable than a result that would have been found by a local optimization algorithm (which may diverge or converge toward a local optimum instead of the global one).

Hence, it was very important to first demonstrate that a solution exists.

The analytic formulation we used to construct the generalized total least squares estimator of the Redfield ratios is detailed in the Appendix.

Finally, let us note that among the measurement points, there are always a few that are aberrant (at these points, the measurements were, for various reasons, biased or flawed). To overcome this issue, we calculate the estimate iteratively; at every iteration, we remove the data points whose residuals exceed three times the standard deviation of the residuals of the estimation.
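The iterative rejection of aberrant points can be sketched as follows in Python. This is a generic illustration under our own assumptions: the function name, the `max_iter` safeguard, and the stopping rule are ours, and the demonstration uses a simple mean as a stand-in estimator rather than the generalized total least squares estimator of this study.

```python
import statistics

def iterative_clip(values, fit, residual, factor=3.0, max_iter=50):
    """Refit repeatedly, dropping points whose residual exceeds factor * sigma."""
    kept = list(values)
    for _ in range(max_iter):
        model = fit(kept)
        res = [residual(model, v) for v in kept]
        sigma = statistics.pstdev(res)  # standard deviation of the residuals
        if sigma == 0.0:
            break
        survivors = [v for v, r in zip(kept, res) if abs(r) <= factor * sigma]
        if len(survivors) == len(kept):
            break  # nothing was rejected: the estimate is stable
        kept = survivors
    return fit(kept), kept

# Stand-in example: estimate a constant from 20 consistent readings and 1 flawed one.
readings = [10.0, 10.1, 9.9, 10.2, 9.8] * 4 + [50.0]
estimate, kept = iterative_clip(
    readings,
    fit=statistics.mean,                   # stand-in estimator (assumption)
    residual=lambda model, v: v - model,   # residual of one point
)
print(estimate, len(kept))
```

Note that a three-sigma rule cannot reject anything when too few points are available, since a single outlier inflates the standard deviation it is compared against; the example therefore uses 21 readings.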

We wrote the estimation algorithm using MATLAB.

5 Results and discussion

5.1 Study area

We chose to test this method in the Atlantic Ocean because it is a key element in the large-scale circulation of the global ocean. Many different water masses constitute the Atlantic Ocean (Talley et al. 2011), and the North Atlantic Ocean is an area of deep-water formation.

Since the method presented here requires that the number of water masses not exceed the number of conservative tracers, it is necessary to “cut” the Atlantic Ocean into several zones.

In an area where N sources and M tracers are available, the M tracer values of each measurement point are the coordinates of a point lying within a convex polyhedron with N vertices in the M-dimensional tracer space. The projection of this polyhedron on a plane (for example, the plane formed by the potential temperature and salinity tracers, which gives the “T-S diagram”) is a polygon having X vertices, with X ≤ N, each vertex of the polygon corresponding to one of the sources. Such a projection is often used to determine the number and position of the sources involved by identifying the vertices of the polygon. However, the accuracy with which these vertices can be determined varies, depending upon the respective positions of the available measurement points. Moreover, as there is no guarantee that X equals N, the remaining (N − X) sources cannot be identified from this projection.

Consequently, defining zones based solely on the measurement points would be unreliable. Here, we identified 16 areas in the Atlantic Ocean, based not only upon the available data (via the T-S diagram) but also, and mainly, on the current knowledge of the water masses and their movements (Fieux 2010).

5.1.1 Latitudes

The longitudinal limits of the Atlantic Ocean are the natural continental limits. The Atlantic Ocean is limited to the west by the American coasts and to the east by the European and African coasts. The hydrological characteristics of the Atlantic Ocean allow us to divide it into four latitudinal zones, as shown in Fig. 1 and defined as follows:

  • The North Atlantic Ocean between 60°N and 20°N (in red in Fig. 1).

  • The northern equatorial Atlantic Ocean, at latitudes ranging from 20°N to 5°N (in yellow in Fig. 1).

  • The southern equatorial Atlantic Ocean, at latitudes ranging from 5°N to 20°S (in pink in Fig. 1).

  • The South Atlantic Ocean, at latitudes ranging from 20°S to 45°S (in green in Fig. 1). The limit of 45°S lies between the Subtropical Front and the Polar Front.

Fig. 1 The Atlantic Ocean

5.1.2 Depths

To these latitudinal zones, depth ranges must be added. First, recall that we must consider only the data from below the surface mixed layer (cf. Sect. 1). For this, we used the database NDP-076 (Goyet et al. 2000), which gives the maximum depth of the mixed layer for each month of the year at any point of the world ocean (on a grid of 1° longitude by 1° latitude). Then, the characteristics of ocean circulation allow us to distinguish the following three layers:

  • The surface ocean, at depths ranging from 50 down to 500 m.

  • The intermediate layer, at depths between 500 and 1750 m.

  • The deep ocean, at depths deeper than 1750 m.

In the deep ocean, an additional split is necessary. Indeed, the Mid-Atlantic Ridge separates the western basins from the eastern basins, as well as the flow of bottom waters. This ridge, which follows approximately the shape of the continents, appears light blue in Fig. 1. Layers deeper than 1750 m must therefore be separated into two areas, one west of the ridge and another east of it. This split is made following the ridge (not according to an arbitrary longitude).

Consequently, as specified in Table 1, for this study, the Atlantic Ocean is divided into 16 zones. In each zone, there are three or four independent water masses, also indicated in Table 1. For areas with three sources, the calculation of the Redfield ratios is based upon the following three tracers: potential temperature, salinity, and either NO or PO. For areas with four sources, we use these three tracers plus the TrOCA0 tracer.
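
The count of 16 zones follows from four latitudinal bands, each with a surface layer, an intermediate layer, and a deep layer split west and east of the ridge. The Python sketch below shows how a measurement point could be assigned to a zone. It is an illustration under assumptions: the function and variable names are hypothetical, the treatment of boundary values (a point at exactly 20°N falls in the northern band, a point at exactly 500 m falls in the intermediate layer) is an arbitrary choice, the longitudinal (continental) limits are not modeled, and the side of the Mid-Atlantic Ridge is supplied by the caller, since the ridge does not follow a fixed longitude.

```python
# Latitudinal bands as (name, southern limit, northern limit), in degrees north.
LATITUDE_ZONES = [
    ("North Atlantic", 20.0, 60.0),
    ("northern equatorial Atlantic", 5.0, 20.0),
    ("southern equatorial Atlantic", -20.0, 5.0),
    ("South Atlantic", -45.0, -20.0),
]
DEPTH_LAYERS = ["surface", "intermediate", "deep west", "deep east"]


def classify(lat_deg, depth_m, west_of_ridge, mixed_layer_depth_m=0.0):
    """Return (latitude zone, depth layer), or None outside the study area."""
    # Only data below 50 m and below the local mixed layer are considered.
    if depth_m < 50.0 or depth_m <= mixed_layer_depth_m:
        return None
    zone = next((name for name, south, north in LATITUDE_ZONES
                 if south <= lat_deg <= north), None)
    if zone is None:
        return None
    if depth_m < 500.0:
        layer = "surface"
    elif depth_m < 1750.0:
        layer = "intermediate"
    else:
        # Deep layers are split along the Mid-Atlantic Ridge.
        layer = "deep west" if west_of_ridge else "deep east"
    return zone, layer


# Every combination of latitudinal band and depth layer: 4 x 4 = 16 zones.
all_zones = [(name, layer) for name, _, _ in LATITUDE_ZONES for layer in DEPTH_LAYERS]
```

For instance, `classify(40.0, 1000.0, True)` returns `("North Atlantic", "intermediate")`, whereas a point at 70°N, or one lying above the local mixed-layer depth, returns `None`.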

Table 1 Study areas and their water masses

5.2 Water masses

We recall here the advantage of our method: it does not require determination of the characteristics of the water sources involved. It is sufficient to know their number (in order to adjust accordingly the number of conservative tracers used in the calculation and, thus, to ensure the existence of a solution to the algorithm).

A water mass can, of course, be present in several study areas. Consequently, several distinct study areas (cf. Table 1) consist of the same sources. However, it does not make sense to merge those areas when calculating the Redfield ratios. Merging them would certainly yield numerical results, but these results would be meaningless. Indeed, for each area, we have identified the major water masses involved, those mainly responsible for the characteristics of the water in this ocean area. However, each area is also subject to the influence of other sources of less importance and of more or less distant origin. Neglecting these other sources when estimating the Redfield ratios is possible only if the choice of the measurement points considered for an estimate, and therefore the definition of study areas, remains based on previous knowledge of ocean movements.

Two deep water masses are present at all latitudes of the Atlantic Ocean, the Antarctic Bottom Water (AABW) and the North Atlantic Deep Water (NADW). The AABW is the deepest water mass; it lies over the entire bottom of the Atlantic Ocean, below 4000 m. The NADW has its origin in the Nordic Seas and flows southward along the American continent. It is present in all deep layers except in the northeast. It can be divided into the following three sources: the upper NADW at depths between 1200 and 1900 m, the central NADW between 1900 and 3500 m, and the lower NADW between 3500 and 3900 m.

Another, shallower water mass can also be found at almost all latitudes: the Antarctic Intermediate Water (AAIW). It originates from the Southern Ocean and flows northward to a maximum latitude of about 20°N. Its presence is less strong in the east than in the west. With an average depth of 1000 m, it influences the intermediate layer (between 500 and 1750 m) but also the deep layer (beyond 1750 m).

The central or subtropical waters are water masses with a more limited geographical extension and a relatively shallow depth range (average depth between 500 and 700 m). They are formed in convergence zones, where waters tend to be subducted due to wind forcing. The North Atlantic Subtropical Water (NASTW) and the North Atlantic Central Water (NACW) are created north of the equator, while the South Atlantic Subtropical Water (SASTW) and the South Atlantic Central Water (SACW) are created south of the equator. The NACW is subjected to intense winter cooling, causing convection significantly deeper than in the southern Atlantic Ocean, and hence, it can mix with water from the bottom layer.

In addition to these main water masses, there are water masses of local influence. Among them, two are deep sources; in the northwest, there is the Denmark Strait Overflow Water (DSOW), at depths between 3000 and 3500 m, and in the northeast, there is the Iceland-Scotland Overflow Water (ISOW), at depths between 2000 and 2500 m. An intermediate water mass, the Mediterranean Water (MW), enters the Atlantic Ocean through the Strait of Gibraltar, in the northeast, and stabilizes at an average depth of 1000 m. In the surface layer, local sources have characteristics directly related to the various exchanges of heat and freshwater with the atmosphere, and they are therefore relatively variable. These sources have no specific name. We have found a different local source in each surface zone.

5.3 Redfield ratios

Table 2 shows the results obtained for each zone. We estimated R ON and R OP independently, using NO for the calculation of R ON and using PO for the calculation of R OP (cf. Sect. 4.3).

Table 2 Redfield ratios, within each studied ocean area

For the 16 areas considered here, high coefficients of determination (r² > 0.84, average 0.94) were found for both R ON and R OP.

Standard deviations were calculated based on the least squares estimator (Markovsky and Van Huffel 2007), yielding maximal values of 0.125 for R ON and 2.73 for R OP. These values enable the determination of a 99% confidence interval of the results, ±0.3 for R ON values and ±6 for R OP values.

Table 2 also indicates values of the R NP ratio. This R NP ratio is not estimated directly from the data; it is deduced from the results obtained for the first two ratios (according to Eq. (4), R NP = R OP/R ON).
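
As a brief illustration of this deduction, the Python sketch below applies R NP = R OP/R ON to values quoted elsewhere in this section (those of Redfield and of Li and Peng 2002). The function name is a hypothetical choice made here; no uncertainty propagation is attempted.

```python
def r_np(r_on, r_op):
    """R NP deduced from the two estimated ratios: R NP = R OP / R ON."""
    return r_op / r_on


# Redfield's values quoted in this section: R ON = 9, R OP = 138.
redfield = r_np(9.0, 138.0)

# Li and Peng (2002), northern and southern Atlantic zones.
li_peng_north = r_np(8.5, 137.0)
li_peng_south = r_np(8.4, 128.0)
```

Rounded to one decimal, these give about 15.3, 16.1, and 15.2, respectively; the last two agree with the R NP values reported by Li and Peng (2002).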

Overall, the first striking result is that the values obtained for the ratios are remarkably close to constant. They vary relatively little from one zone to another, which is in good agreement with Redfield’s concept.

Furthermore, our results are very close to those first given by Redfield et al. (1963). Nevertheless, the values here are, in general, slightly lower than those estimated by Redfield (R ON = 9 and R OP = 138). However, in view of our method (selection of study areas according to the hydrological characteristics and then use of an identical estimator whatever the area), the variations in these values from one area to another are certainly more significant than the values themselves.

We detect significant variations both with latitude and depth. From north to south (from 60°N to 45°S), regardless of depth, R ON decreases and then increases; R OP increases and then decreases; and therefore, R NP (\( {R}_{NP}=\frac{R_{\mathrm{OP}}}{R_{\mathrm{ON}}} \)) increases and then decreases. When the depth increases, regardless of latitude, both R ON and R OP increase and then decrease; thus, R NP decreases and then increases (since the numerator R OP varies less than the denominator R ON).

In the literature, several studies report various values for the Redfield ratios. Often, these values also differ, to a greater or lesser extent, from those of Redfield himself (depending on the study, R ON can range from 7 to 11, R OP can range from 95 to 190, and R NP can range from 12 to 26). In general, they also vary geographically, depending both upon the area and upon the depth.

However, the results (as well as the zones that are considered) differ depending on the methods used, and their associated explanations are more or less related to local phenomena. Thus, it is very difficult to compare results (Schneider et al. 2005). Some results can be interpreted as similar to ours; others cannot.

For example, Anderson and Sarmiento (1994) calculated R OP and R NP in the South Atlantic (20°N to 50°S), Indian (20°N to 30°S), and south central Pacific (between 10°N and 30°S) basins, between 400- and 4000-m depth. They found that the R OP ratio is constant with depth and basin, at a value of 170 ± 10, whereas R NP is constant with basin but exhibits a mid-depth minimum; in the 1000–3000-m zone, R NP ≅ 12 ± 2, above and below R NP ≅ 16 ± 1 (consequently, R ON varies from approximately 11 above and below this zone to approximately 14 at mid-depth).
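
The R ON range implied by these values can be checked with the relation R NP = R OP/R ON. The short worked calculation below is added for illustration, uses only the central values quoted above, and ignores their uncertainties:

$$ {R}_{\mathrm{ON}}=\frac{R_{\mathrm{OP}}}{R_{\mathrm{NP}}}:\kern1em \frac{170}{12}\approx 14.2\kern0.5em \left(1000\hbox{--} 3000\ \mathrm{m}\right),\kern1em \frac{170}{16}\approx 10.6\kern0.5em \left(\mathrm{above}\ \mathrm{and}\ \mathrm{below}\right) $$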

Shaffer et al. (1999) analyzed data chosen in the same three ocean sectors (low- and middle-latitude Pacific, Indian, and South Atlantic Oceans) and once again obtained ratios constant with basin. However, they estimated that R OP increases from about 140 at 750-m depth to about 170 at 1500 m and remains so deeper down. R NP was found to decrease from about 15 at 750 m to about 12 at 1500–2000 m, similar to the results from Anderson and Sarmiento (1994).

On a smaller scale, Hupe and Karstensen (2000) estimated R OP and R NP within the Arabian Sea between 550 and 4500 m. Both ratios were found to increase continuously with depth. In the 550–1200-m zone, R OP ≅ 139 ± 7 and R NP ≅ 14.4 ± 0.2. Between 1200 and 2000 m, R OP ≅ 152 ± 5 and R NP ≅ 14.9 ± 0.2. Deeper than 2000 m, R OP ≅ 158 ± 5 and R NP ≅ 15.3 ± 0.2. No mid-depth minimum was observed for R NP.

Li and Peng (2002) did not reveal any variation with depth but showed, for the Atlantic Ocean divided into two zones, 45°N to 5°N and 5°N to 50°S, a decrease of each ratio R ON, R OP, and R NP from north to south. In the northern zone, R ON ≅ 8.5 ± 0.8, R OP ≅ 137 ± 7, and R NP ≅ 16.1 ± 1.0. In the southern zone, R ON ≅ 8.4 ± 0.3, R OP ≅ 128 ± 5, and R NP ≅ 15.2 ± 0.7.

Today, the only agreed-upon conclusions are the confirmation of Redfield’s concept and the variability of these ratios depending upon the ocean area.

6 Conclusions

Redfield ratios are prime indicators of the chemical composition of the ocean, which is governed by biological, chemical, and physical processes. In the upper layer of the ocean, photosynthesis transforms carbon and nutrients into organic matter. This organic matter then remineralizes at depth, allowing the different elements to be entrained in a large-scale cycle through convection movements and ocean circulation. In the context of climate change, these biogeochemical processes are very important because they determine how fast the ocean can absorb carbon dioxide from the atmosphere.

From the notion of water mass circulation and conservative tracers, we have constructed the generalized total least squares estimator of the Redfield ratios R ON and R OP. We have tested it over the Atlantic Ocean from 60°N down to 45°S. The results remain close to the nominal values of Redfield, with a variability consistent with some previous studies. The precise determination of the Redfield ratios is particularly important today in the context of global change, since they are used to calculate the concentrations of anthropogenic carbon that penetrates into the ocean. The variability of these ratios demonstrates the need for a detailed study of their values prior to their use and the limitations of any mean that could be used.

Therefore, the method presented here could be particularly useful for future studies. While it respects the hydrological specificities of each ocean area, it can simply be applied everywhere, in an analytical manner (without the risks of local artifacts or divergence that an optimization algorithm would have), always constructing the same estimator. Its applicability naturally depends upon the available measurements.

The estimation of R ON is based on the use of the conservative tracer NO and therefore always requires oxygen and nitrate data. Similarly, the estimation of R OP is based on the use of the conservative tracer PO and therefore always requires oxygen and phosphate data. The other measurement data can vary, as long as they allow enough conservative tracers to be constructed for the considered area.

For example, to determine the Redfield ratios in the Atlantic Ocean, we used in situ measurements of oxygen, nitrate, and phosphate (necessarily, for NO and PO), together with temperature, salinity, and total alkalinity. From these in situ measurements, we built the conservative tracers: conservative temperature, conservative salinity (so-called preformed), and the TrOCA0 tracer (the one that requires the measurements of total alkalinity). However, the method would apply in the same way to different measurements and tracers.