Introduction

Dissolved organic matter (DOM) biogeochemistry constitutes a critical link in the global carbon cycle, contributing to the evasion of CO2 from freshwaters (Battin et al. 2008), and connecting energy flow and nutrient cycling (Bernhardt et al. 2002; Qualls 2000) as an energy source and cellular building blocks for the microbial heterotrophs that dominate aquatic ecosystem respiration (del Giorgio 2005). Because DOM occupies such a central role in ecosystem-level processes, attempts to describe the composition of the DOM pool and relate organic composition to function constitute an area of intensive research in all aquatic habitats ranging from the oceans (Benner et al. 1992; Cauwet 2002; Hansell 2013) to inland waters (Jaffe et al. 2008). As the connections between DOM metabolism in inland waters and the global carbon cycle have become more apparent (Cole et al. 2007; Lauerwald et al. 2012), knowledge of DOM biogeochemistry has taken on new urgency (Battin et al. 2009; IPCC 2014).

Significant strides have been made toward understanding sources, concentrations, biodegradability, and seasonal dynamics of DOM in inland waters (Cleveland et al. 2004; Fellman et al. 2009a; Fellman et al. 2009b; Fellman et al. 2008; Wickland et al. 2007). However, in most studies, characterization of DOM has been limited to bulk characteristics associated with δ13C signatures, excitation-emission fluorescence spectroscopy, or fractionation with XAD (hydrophobic cross-linked polystyrene copolymer) resins. These bulk average techniques, as well as elemental composition, acidic functional group content (Sun et al. 1997) and oxidation state (Vallino et al. 1996) have limited utility for addressing molecular composition. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) overcomes some of these limitations and provides a highly resolved view of molecular-level DOM composition for natural waters (Kujawinski et al. 2002; Marshall et al. 1998), and has revealed stunning diversity, with thousands of individual molecules present in low concentrations, comprising the DOM pool in streams and rivers (Hockaday et al. 2009; Kim et al. 2006).

FT-ICR MS has shown great promise for analysis of DOM quality in natural waters (Kim et al. 2003; Kujawinski et al. 2002; Marshall et al. 1998). The technique has been applied increasingly to DOM in freshwater ecosystems (Kujawinski et al. 2002; Minor et al. 2012; Mosher et al. 2010; Stenson et al. 2002; Stenson et al. 2003), including the compositional variability in headwater streams across climatic regions (Jaffe et al. 2012). However, to the best of our knowledge, ours is the first study to consider the spatial patterns of DOM molecular composition, or chemogeography (Dunlop and Jeffries 1985), and DOM chemodiversity (Kellerman et al. 2014) both across watersheds from different geographical regions and longitudinally over fluvially connected sites, from the headwaters of unimpounded streams downstream across stream orders within a catchment.

The River Continuum Concept (RCC) provides a heuristic theoretical framework that has guided research in stream and river ecosystems for more than three decades (Vannote et al. 1980). The RCC predicts that the diversity of DOM peaks in 1st-order streams as ground waters with low DOM molecular diversity surface and extract organic molecules from detritus, but then diversity diminishes approximately 2-fold in 2nd order streams and 3-fold by 5th-order streams, with little change through 11th-order streams as heterotrophic microbial activity removes labile compounds from downstream transport (Minshall et al. 1985; Vannote et al. 1980). The RCC was conceived prior to the advent of ultrahigh-resolution mass spectrometry, so that until recently the state of knowledge has made it difficult to design an empirical test of the prediction. We now know that DOM throughout a broad range of aquatic environments, including groundwater, is one of the most complex mixtures on Earth (Longnecker and Kujawinski 2011; Osterholz et al. 2014; Sleighter and Hatcher 2008) and that the molecular composition in headwater streams across broad geographic regions shows clear similarities yet distinct differences (Jaffe et al. 2012). We use that knowledge to amend the RCC prediction in a broader context of global DOM biogeochemistry to suggest that DOM diversity, while reaching a peak where the connection of the aquatic environment and the terrestrial environment is maximal, i.e., 1st-order streams, is high in ground water and remains high where surface flows originate and throughout a river network. Furthermore, we note that others have shown that the molecular character of DOM displays substantial overlap among streams globally, yet contains components that make each DOM pool distinct (Jaffe et al. 2012; Kim et al. 2006). 
In the study described below, we harness a modern organic geochemical methodology, FT-ICR MS, exploiting the sensitivity and resolution of molecular level organic geochemical analyses (Hockaday et al. 2009; Kim et al. 2006) to test these predictions in tropical and temperate stream catchments.

Methods

Site descriptions and experimental design

Three forested headwater stream ecosystems were sampled: the 1st–3rd order, 725 ha White Clay Creek (WCC), an eastern deciduous forest catchment within the Pennsylvania Piedmont (39°53′N, 75°47′W) (Newbold et al. 1997), the 1st–3rd order, 319 ha Rio Tempisquito (RT), a tropical evergreen catchment within the Cordillera de Guanacaste of Costa Rica (10°57′N, 85°29′W) (Newbold et al. 1995), and the 1st–5th order, 17,094 ha Neversink River (NVK), an eastern deciduous forest catchment in the Catskill Mountains of New York (41°90′N, 74.58′W) (Newbold et al. 2006). A sampling design utilizing fluvially connected sites was employed within each watershed. In the WCC and RT watersheds 1st-, 2nd-, and 3rd-order streams were sampled and in the NVK watershed 1st-, 3rd-, and 5th-order streams were sampled. All sites were sampled a single time in April, 2012 with water collected from shallow (<0.5 m), well mixed runs under base flow conditions.

DOM composition via FT-ICR MS

Stream water samples for DOM composition analyses were collected in borosilicate glass bottles rendered organic carbon-free by combustion (450 °C, 5 h). The bottles were rinsed 3× with site stream water before collection, stream water was filtered through pre-combusted glass fiber filters (Whatman GF/F), and the pH was adjusted to 2.3–2.5 with Optima grade HCl. DOM in 1 L was extracted by use of 100 mg PPL Bond Elut cartridges, a modified styrene divinyl benzene polymer (Agilent Technologies, Santa Clara, CA) following methods outlined by Dittmar et al. (2008) at loading rates that ranged from 8 to 21 mg C g−1 adsorbent.

Ammonium hydroxide (5 µL, 30 %, Optima grade) was added to 500 µL of each DOM sample to facilitate deprotonation (Kim et al. 2003). Samples were introduced into the mass spectrometer via a syringe pump at 400 nL/min and a 50 µm i.d. fused silica micro electrospray ionization (ESI) needle under typical ESI conditions (−2.0 kV; tube lens, −300 V; and heated capillary current, 2.4 A). Mass analysis was performed with a custom built FT-ICR mass spectrometer equipped with a 22 cm diameter horizontal bore 9.4 T actively shielded magnet (Kaiser et al. 2011a). Data were collected and processed with a modular ICR data acquisition system (Predator) (Blakney et al. 2011). Ions were accumulated external to the magnet (Senko et al. 1997) in a linear octopole ion trap (25.1 cm long) equipped with an axial electric field (Wilcox et al. 2002) for 20 s and transferred through rf-only multipoles to a seven segment, open cylindrical cell with capacitively coupled excitation electrodes similar to the configuration by Tolmachev et al. (2008) and Kaiser et al. (2011b). Chirp excitation (∼1400–70 kHz at a sweep rate of 50 Hz μs−1 and 360 Vp−p amplitude) accelerated the ions to a detectable cyclotron radius.

Multipoles were operated at 1.8 MHz at a peak-to-peak rf amplitude of 70 V. Broadband frequency sweep (“chirp”) dipolar excitation (70 kHz to 1.27 MHz at a sweep rate of 150 Hz/µs and a peak-to-peak amplitude of 190 V) was followed by direct mode image current detection (digitization rate at twice the highest excited spectral frequency, in this case 1.27 MHz) for 1.6 s to yield 4 Mword time-domain data. The time-domain data were processed and Hanning-apodized, followed by a single zero-fill before fast Fourier transformation and magnitude calculation (Marshall et al. 1998). Frequency was converted to mass-to-charge ratio (m/z) by the quadrupolar electric trapping potential approximation to generate an m/z spectrum (Ledford et al. 1984; Shi et al. 2000). External mass calibration for negative ESI FT-ICR MS was performed with Agilent G2421A electrospray “tuning mix” (high mass) and stearic acid (low mass) as previously reported by Qian et al. (2001). Molecular formulae were assigned according to the parameters outlined by Kujawinski and Behn (2006). This analysis resulted in a mass spectrum containing 2950–3700 peaks per sample. Mass spectral peak height is a direct measure of ion relative abundance, but ionization efficiency can vary among different compound classes, so that ion relative abundance does not necessarily reflect the relative abundances of the parent neutrals in the original sample.
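The signal-processing steps described above (Hanning apodization, a single zero-fill, fast Fourier transformation, magnitude calculation, and frequency-to-m/z conversion) can be illustrated with a minimal sketch. This is not the Predator software used in the study. The function names, the synthetic single-frequency transient, and the calibration constants are assumptions made for illustration: the constant A is taken as the unperturbed cyclotron value for a 9.4 T field and the trapping-field term B is set to zero, whereas in practice both would be fitted to calibrant ions. The sample rate of 2.54 MHz corresponds to twice the 1.27 MHz upper excitation frequency; over 1.6 s that gives about 4 million points, in line with the 4 Mword transient.

```python
import numpy as np

def magnitude_spectrum(transient, sample_rate_hz):
    """Hann-apodize, zero-fill once, FFT, and return (frequencies, magnitudes)."""
    n = len(transient)
    apodized = transient * np.hanning(n)
    # A single zero-fill doubles the record length before the transform.
    padded = np.concatenate([apodized, np.zeros(n)])
    magnitudes = np.abs(np.fft.rfft(padded))
    freqs = np.fft.rfftfreq(2 * n, d=1.0 / sample_rate_hz)
    return freqs, magnitudes

def frequency_to_mz(freq_hz, a, b):
    """Two-term calibration of the form m/z = A/f + B/f**2."""
    return a / freq_hz + b / freq_hz**2

# Hypothetical constants: A = e*B/(2*pi*u) for B = 9.4 T, in Hz*Da; B term ignored.
A_CAL = 1.602176634e-19 * 9.4 / (2 * np.pi * 1.66053907e-27)
B_CAL = 0.0

# Synthetic transient: one ion species at 300 kHz (illustrative only).
sample_rate = 2.54e6
n_points = 2**16
t = np.arange(n_points) / sample_rate
true_freq = 300_000.0
transient = np.cos(2 * np.pi * true_freq * t)

freqs, mags = magnitude_spectrum(transient, sample_rate)
peak_freq = freqs[np.argmax(mags)]
peak_mz = frequency_to_mz(peak_freq, A_CAL, B_CAL)
```

A real transient contains thousands of superimposed frequencies, and peak picking, calibration, and formula assignment would follow the cited procedures rather than this simplified example.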

Geochemical and environmental parameters

Samples for concentrations of dissolved organic carbon (DOC), anions, cations, and inorganic nutrients were collected in pre-combusted (500 °C for 6 h) borosilicate glassware. DOC samples were filtered through pre-combusted glass fiber filters (Whatman GF/F) with a syringe and syringe-type holder and analyzed by UV-promoted persulfate oxidation with a Sievers 900 total organic carbon analyzer equipped with an inorganic carbon removal module.

Anion and cation samples were filtered through a sterile Pall Gelman HF Tuffryn 0.2 µm acrodisc filter. Anions were measured with a Dionex DX 500 ion chromatography system equipped with an AS-15 column and conductivity detector. Volatile fatty acids were determined as part of the anion analysis. Filtrates for cation determinations were acidified with trace-metal grade nitric acid and analyzed by inductively coupled plasma-atomic emission spectrometry with an Intrepid II XSP Duo View instrument (Thermo Elemental).

Temperature, conductivity, pH and dissolved oxygen (DO) were measured with a hand-held field meter (Orion Star A329). Percent canopy cover was estimated by use of a hand-held convex spherical crown densiometer (Forestry Suppliers, Jackson, MS).

Statistical analysis

Principal coordinate analysis (PCoA) based on Bray-Curtis similarity indices was performed in PC-Ord software (version 4, MjM Software Design) (McCune and Mefford 1999), with the natural log-transformed relative abundances of the compound classes as the main (dependent variable) matrix and the natural log-transformed environmental/geochemical data as the secondary (independent variable) matrix. Chemodiversity was estimated with the Chao1 index (Kellerman et al. 2014).
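For readers unfamiliar with the two measures named above, the following sketch shows their textbook definitions. It is not the PC-Ord implementation, and the function names are hypothetical. The Chao1 function assumes integer abundance counts (singletons and doubletons); how mass spectral peak magnitudes are converted to such classes follows Kellerman et al. (2014) and is not reproduced here.

```python
import numpy as np

def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity, 1 - sum|x - y| / sum(x + y), for non-negative abundances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = x.sum() + y.sum()
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - np.abs(x - y).sum() / total

def chao1(counts):
    """Chao1 richness estimate from integer abundance counts."""
    counts = np.asarray(counts)
    observed = int(np.count_nonzero(counts))
    singletons = int(np.sum(counts == 1))
    doubletons = int(np.sum(counts == 2))
    if doubletons > 0:
        return observed + singletons**2 / (2.0 * doubletons)
    # Bias-corrected form, used when no doubletons are present.
    return observed + singletons * (singletons - 1) / 2.0
```

The similarity equals 1 for identical samples and 0 for samples that share no formulae; a matrix of pairwise values would be the input to an ordination such as PCoA.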

Results

Concentrations of DOC in stream water increased with stream order within the WCC watershed (WCC 1st, 0.8 mg C L−1; WCC 2nd, 1.2 mg C L−1; WCC 3rd, 2.1 mg C L−1) but not in the other watersheds (RT 1st, 1.1 mg C L−1; RT 2nd, 0.8 mg C L−1; RT 3rd, 1.0 mg C L−1; NVK 1st, 1.0 mg C L−1; NVK 3rd, 1.1 mg C L−1; NVK 5th, 0.9 mg C L−1) (Supplementary Table 1). The DOM pool for each of the three watersheds was composed of approximately 2000 to 3000 different molecular formulae with declining diversity from the tropical evergreen catchment (RT) to the eastern deciduous forest catchments (WCC and NVK) (Table 1). Nearly 70 % of these compound formulae were common to all watersheds and 48.5 % were found in all streams (Fig. 1a). When plotted on a van Krevelen diagram, the common formulae centered in the area associated with carboxyl-rich alicyclic molecules (CRAM; Fig. 1a; Hertkorn et al. 2006) whereas formulae unique by watershed were largely outside of the CRAM region on the van Krevelen diagram (Fig. 1b) and were overwhelmingly confined to 1st-order streams (Fig. 2a,b). Of the 20.2 % of the compound formulae that were unique to a single watershed, most were found in the tropical evergreen forest streams (14.4 %; Table 1) and were predominantly found in the van Krevelen diagram regions associated with lignin, condensed hydrocarbon, and protein compound formulae with O:C atomic ratios <0.8 (Fig. 1b). The smaller number of compound formulae found exclusively in the WCC watershed (5.1 %) more closely resembled tannins, with O:C ratios >0.6, whereas the 0.6 % of formulae unique to the NVK watershed were distributed between these major compound classes (Fig. 1b, Table 1). Because location on a van Krevelen diagram is not a definitive assessment of molecular identity, these compound formulae are referred to as lignin-like, condensed hydrocarbon-like, and CRAM-like.
The DOM molecular composition of stream waters within the tropical evergreen forest (RT) was distinct from those of the eastern deciduous forest (WCC and NVK); and molecular compositions of all streams were distinct from each other, with the composition in 1st-order streams most different from the compositions in higher order streams (Fig. 3a, b).
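The partition of formulae into those common to all watersheds and those unique to one watershed amounts to set intersection and set difference. The sketch below uses hypothetical function names and a toy set of formulae chosen for illustration; it does not use the study data.

```python
def shared_and_unique(formulae_by_group):
    """Return (formulae common to every group, formulae unique to each group)."""
    common = set.intersection(*formulae_by_group.values())
    unique = {}
    for name, formulae in formulae_by_group.items():
        # Union of the formulae found in every other group.
        others = set().union(
            *(f for other, f in formulae_by_group.items() if other != name)
        )
        unique[name] = formulae - others
    return common, unique

def percent_of_pool(subset, formulae_by_group):
    """Express a subset as a percentage of all distinct formulae across groups."""
    pool = set().union(*formulae_by_group.values())
    return 100.0 * len(subset) / len(pool)

# Toy example (not measured data).
watersheds = {
    "RT": {"C10H12O5", "C12H14O6", "C9H10O4", "C20H22O4"},
    "WCC": {"C10H12O5", "C12H14O6", "C14H10O9"},
    "NVK": {"C10H12O5", "C12H14O6", "C9H10O4"},
}
common, unique = shared_and_unique(watersheds)
common_percent = percent_of_pool(common, watersheds)
```

The same functions apply when the groups are stream orders within a watershed rather than watersheds.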

Table 1 Number of compound formulae (corresponding to negative ion ESI 9.4 T FT-ICR mass spectral peaks) unique to watersheds
Fig. 1 van Krevelen diagram for a unique and b ubiquitous molecular formulae found in each watershed. Key: Rio Tempisquito-pink, White Clay Creek-blue, Neversink River-green, based on individual elemental compositions derived from negative ESI 9.4 T FT-ICR mass spectra. CRAM (carboxyl-rich alicyclic molecules)

Fig. 2 van Krevelen diagrams (based on negative ESI 9.4 T FT-ICR mass spectra) for molecular formulae unique to each stream order for a White Clay Creek and b Rio Tempisquito. Key: first order-blue, second order-green, third order-pink

Fig. 3 a Principal coordinate analysis of stream water DOM molecular compositions (obtained from negative ESI 9.4 T FT-ICR MS) for the eight stream water samples collected from three watersheds, with the DOM grouped by relative magnitude of each compound formula. b Hierarchical cluster analysis (Bray-Curtis similarity) depicting DOM molecular compositions for each sample. Key: Rio Tempisquito-red, White Clay Creek-blue, Neversink River-black. Number indicates stream order

Within a stream catchment, 63–71 % of all formulae were common across all stream orders (Table 1). The NVK 1st-order stream sample was lost during processing; therefore, much of the watershed focus in our analysis is limited to RT and WCC samples. DOM formula diversity was highest in 1st-order streams, with 17–21 % of all formulae unique to those streams and chemodiversity declined by 15–27 % with increasing stream size, changing little beyond the 2nd-order streams (Tables 2 and 3). Most of the compound formulae for all samples are located in the area of the van Krevelen diagrams that is characteristic of lignin, CRAM, tannin, and condensed hydrocarbon compounds (Table 4). Stream water from the 1st-order sites of WCC and RT contained the highest number of lignin-like molecular formulae, which subsequently decreased in the downstream direction by 18.4 % in the WCC watershed and by 10.6 % in the RT watershed (Table 4). Similar trends were observed for compound formulae that were tannin-like (WCC 41.3 % decrease, RT 23.2 % decrease), condensed hydrocarbon-like (WCC 16.1 % decrease, RT 48.9 % decrease), CRAM-like (WCC 4.9 % decrease, RT 17.9 % decrease), protein-like (WCC 16.8 % decrease, RT 48.0 % decrease), and “other” classes (WCC 26.7 % decrease, RT 25.6 % decrease) although formulae within these compound classes were lower in abundance. We did not observe any consistent pattern in compositional differences within a watershed based on molecular size, H:C or O:C ratios, average O, C, or double bond equivalents (DBE = number of rings plus double bonds to carbon) (Table 3). The pattern of unique compound formulae within stream orders differed between the WCC and RT watersheds. Most of the formulae unique to 1st-order streams in WCC had O:C >0.8 (Fig. 2a) whereas in RT most formulae unique to 1st-order streams had O:C <0.6 and (Fig. 2b).
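The quantities compared above (H:C and O:C ratios and double bond equivalents) follow directly from the elemental composition of each assigned formula. The sketch below is a minimal illustration with hypothetical function names. It assumes neutral formulae containing only C, H, N, O, S, and P, for which DBE = C − H/2 + N/2 + P/2 + 1, and it handles only simple formula strings without parentheses or charges.

```python
import re

def parse_formula(formula):
    """Parse a simple formula string such as 'C10H12O5' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def van_krevelen_coordinates(formula):
    """Return (O:C, H:C) atomic ratios, the x and y axes of a van Krevelen diagram."""
    counts = parse_formula(formula)
    carbon = counts["C"]
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon

def double_bond_equivalents(formula):
    """Rings plus double bonds for a neutral C, H, N, O, S, P formula."""
    counts = parse_formula(formula)
    return (counts.get("C", 0) - counts.get("H", 0) / 2
            + counts.get("N", 0) / 2 + counts.get("P", 0) / 2 + 1)
```

For example, C10H12O5 plots at O:C = 0.5 and H:C = 1.2 with a DBE of 5. As noted in the text, a position on the diagram suggests but does not establish a compound class.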

Table 2 Number of compound formulae (corresponding to negative ion ESI 9.4 T FT-ICR mass spectral peaks) unique to stream order within a watershed
Table 3 Characteristics of DOM molecular formulae analyzed by negative ESI 9.4 T FT-ICR MS
Table 4 Compound classes of DOM molecular formulae characterized by negative ESI 9.4 T FT-ICR MS

The van Krevelen diagrams of the H:C versus O:C ratios of the compound formulae from stream water DOM samples show a high degree of similarity but display subtle shifts downstream of the 1st-order streams. This trend is most clearly seen in the WCC watershed as the cloud of points associated with lowest relative magnitude molecular formulae (pink) within the van Krevelen diagram shrinks along the O:C axis, especially at high O:C values above 0.9, but also at low H:C values <0.5, whereas the column of points in the CRAM region centered around O:C = 0.5; H:C = 1 increases in relative magnitude (Fig. 4a–c). Downstream changes across stream orders in the RT watershed are most clearly seen in the spreading of the points associated with the second level of relative magnitude formulae (blue) across O:C values >0.8 in the 3rd-order stream. In this watershed the highest relative magnitude formulae diminish and shift from O:C near 0.4 in the 1st-order stream to the center of the CRAM region with O:C = 0.5; H:C = 1 in the 3rd-order stream (Fig. 4d–f). The relative magnitudes of formulae with specific numbers of oxygen atoms diminished from the highest levels in 1st-order streams to lower and nearly equivalent levels in the 2nd- and 3rd-order streams (Fig. 5).

Fig. 4 Three-dimensional van Krevelen diagrams (based on negative ESI 9.4 T FT-ICR mass spectra) for a 1st-order White Clay Creek, b 2nd-order White Clay Creek, c 3rd-order White Clay Creek, d 1st-order Rio Tempisquito, e 2nd-order Rio Tempisquito, f 3rd-order Rio Tempisquito. Colors indicate relative magnitude of DOM compounds

Fig. 5 Relative abundances of Ox heteroatom classes of DOM ions from stream water measured by FT-ICR MS for three stream orders from a White Clay Creek, and b Rio Tempisquito

The background biogeochemical parameters differed at the watershed level, but no downstream trends were detected within watersheds. Base cations, alkalinity, and pH reflected the geological characteristics of each region with low ionic strength waters within the NVK watershed, higher concentrations in the WCC watershed, and intermediate concentrations within the RT watershed (Supplementary Table 1). Nitrate concentrations reflected the agricultural activities within the WCC watershed for which oxalate concentrations were particularly elevated. Concentrations of DOC, soluble reactive P, NO2–N and NH4–N were similar among all watersheds.

Discussion

Our study of DOM chemogeography and chemodiversity is the first to consider both the DOM patterns across watersheds from different geographical regions and the longitudinal patterns over fluvially connected stream orders. We identified a core set of molecular formulae in the lignin/CRAM/tannin/condensed hydrocarbon region of the van Krevelen diagram and noted their presence across distant watersheds and throughout stream orders within a watershed. Similar findings of common DOM molecular character have been reported for aquatic environments that include streams over small spatial scales (Mosher et al. 2010), headwater streams across distant watersheds (Jaffe et al. 2012; Kim et al. 2006), a lake with its tributary swamps, streams, and river (Minor et al. 2012), freshwater DOM and marine DOM (Gonsior et al. 2011; Koch et al. 2005), and large rivers, oceans, and estuaries (Bae et al. 2011; Koch et al. 2005; Sleighter and Hatcher 2008; Spencer et al. 2012; Stubbins et al. 2010). This phenomenon in freshwaters has been attributed to common source materials and similar diagenetic processes (Jaffe et al. 2012), or the natural refractory nature of conserved portions of the DOM pool, hydrological connectivity, terrestrial inputs, and changes in the discharge regime (Minor et al. 2012).

The striking concentration within the ubiquitous DOM of molecular formulae in the CRAM region of the van Krevelen diagram also has been described in marine waters (Hertkorn et al. 2006; Hertkorn et al. 2013), Lake Ontario (Lam et al. 2007), the main-stem Congo River (Stubbins et al. 2010), a river to ocean transect of the lower Chesapeake Bay (Sleighter and Hatcher 2008), and a range of terrestrial to aquatic ecosystems (Roth et al. 2014). The location of a peak in a particular compound class region of a van Krevelen diagram is a necessary but not sufficient condition for compound class identification, and distinguishing CRAM from lignin, for example, would require information on molecular structure (Roth et al. 2014). Thus, without definitive structural data it is impossible to ascertain the identity of the common core of molecules, but it does appear that they are recalcitrant, heavily degraded, and originating from terrestrial ecosystems.

DOM formulae that were unique among watersheds largely fell outside the CRAM region of the van Krevelen diagram and were overwhelmingly confined to 1st-order streams. Within a watershed, the chemodiversity of DOM peaked in 1st-order streams. However, the measured diversity of molecular formulae based on FT-ICR MS cannot be extrapolated unequivocally to the true molecular diversity of DOM, because each formula could represent an unknown number of structural isomers (Hertkorn et al. 2006; Sleighter and Hatcher 2007). Thus, the chemodiversity peak is, in fact, a peak in molecular formula diversity. Nevertheless, the chemogeography and chemodiversity patterns we observed suggest that unique DOM formulae are products of terrestrial origin and soil diagenetic alterations that are susceptible to selective adsorption on mineral surfaces and/or biotic and abiotic oxidation during downstream transport.

First-order streams are intimately connected to the landscape and the associated riparian zones that are a major source of organic matter entering streams through shallow subsurface pathways (Mei et al. 2012). The near-stream environment supports both aerobic and anaerobic soil processes that change with hydrologically driven fluctuations in redox potential (Sawyer et al. 2014) and generate a diverse spectrum of microbially produced organic end-products. The fluvial geomorphology of 1st-order streams is typically restricted to shallow water depths and slow water velocities (Leopold et al. 1964). The high water table in the adjacent terrestrial environment that renders trees susceptible to wind throw creates openings in the forest canopy (Kaplan et al. 1980). In combination, these attributes make 1st-order streams particularly suited to process DOM. The short uptake lengths over which microbial processing of DOM occurs in 1st-order streams are directly proportional to depth and velocity (Hall et al. 2013; Newbold et al. 1982); the shallow depths promote interactions with streambed sediments that foster mass transfer of DOM to sediment microbial communities, biological uptake, and processing (Kaplan et al. 2008); selective abiotic sorption on mineral surfaces (Aufdenkampe et al. 2001; Kleber et al. 2011); and shallow depths plus open canopies allow sunlight penetration that can promote photooxidation of DOM (Cory et al. 2013; Moran et al. 2000; Tranvik and Bertilsson 2001; Wetzel et al. 1995) which has not been exposed previously to sunlight. However, the magnitudes of changes in DOM formula diversity with downstream transport were not matched by similar changes in DOC concentration, so we suspect that the chemodiversity pattern is the result of alterations to the DOM molecules involving partial molecular degradation rather than complete oxidation.
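The proportionality between uptake length and the product of depth and velocity can be made explicit with the standard nutrient spiraling relation, S_w = u·d / v_f, where u is water velocity, d is depth, and v_f is the uptake (mass transfer) velocity. The sketch below uses a hypothetical function name and illustrative parameter values that are assumptions, not measurements from the study streams.

```python
def uptake_length_m(velocity_m_s, depth_m, uptake_velocity_m_s):
    """Uptake length S_w = u * d / v_f (all inputs in SI units, result in meters)."""
    if uptake_velocity_m_s <= 0:
        raise ValueError("uptake velocity must be positive")
    return velocity_m_s * depth_m / uptake_velocity_m_s

# Illustrative values only: the same uptake velocity in a small and a larger channel.
small_stream = uptake_length_m(velocity_m_s=0.1, depth_m=0.1, uptake_velocity_m_s=1e-5)
larger_stream = uptake_length_m(velocity_m_s=0.5, depth_m=1.0, uptake_velocity_m_s=1e-5)
```

With an identical uptake velocity, the shallower and slower channel yields an uptake length fifty times shorter in this example, which is the sense in which 1st-order streams are suited to processing DOM over short distances.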

We also observed a decline in DOM formula diversity from the tropics through temperate forest watersheds with increasing latitude. With more than 1000 % greater tree species diversity in the seasonally dry, evergreen tropical forests (Hartshorn 1983) compared to temperate forests, and a decline in tree species diversity between the Pennsylvania Piedmont forest (Fike 1999) and those within the Catskill Mountains (McIntosh 1962), there was a positive relationship between the watershed-level diversity in DOM formulae and the diversity of tree species. Collectively, these observations about DOM chemodiversity are consistent with prior suggestions that aquatic organic matter compositions in streams and rivers are set by degradation processes that occur upslope in soils and riparian zones in a drainage basin (Hedges et al. 2000), are strongly correlated with the terrestrial vegetation (Goni et al. 2003), and are impacted by processes within the riparian zone (Mei et al. 2012).

The fact that a significant number (20 %) of the unique lignin-like and condensed hydrocarbon molecular formulae in the Rio Tempisquito watershed and 34 % of the unique tannin-like molecular formulae in the White Clay Creek watershed were present in the 1st-order streams and disappeared with increasing stream order, indicates that many of the compounds historically thought to be recalcitrant are actually subject to removal from the system by biotic, photochemical, or abiotic adsorption processes. A growing body of evidence strongly supports the idea that molecularly uncharacterized humic material contributes to the DOM pool that is susceptible to biological oxidation (Cory and Kaplan 2012; Sleighter et al. 2014; Volk et al. 1997). Further support for the biological oxidation of this group of molecules is provided by a positive correlation between humic fluorescent DOM and percent biodegradable DOM in Arctic rivers (Mann et al. 2012) and measurement of the extensive degradation of lignin and associated macromolecules in the Amazon River (Ward et al. 2013). The notion that DOM in headwaters undergoes extensive processing is consistent with the finding that longitudinal changes in DOM in fluvial networks involve decreasing DOM biolability and the accumulation of modern DOC of terrestrial origin as stream size increases (Fellman et al. 2014).

Most DOM molecular formulae unique to the Rio Tempisquito watershed were lignin-like and condensed hydrocarbon-like whereas most molecular formulae found exclusively in the White Clay Creek watershed streams were tannin-like. Lignin and tannin molecules are part of a natural group of aromatic organic substances in soils, produced by decaying vegetation that are susceptible to photodegradation. However, despite the differences in the tropical and temperate streams in canopy cover, we are unable to ascribe the differences in DOM between the watersheds to photochemistry, as there is no evidence that tannins and lignins differ in their susceptibility to photodegradation (Gonsior et al. 2011; Gonsior et al. 2013; Rossel et al. 2013; Stubbins et al. 2010). Instead, it seems more likely that inter-specific differences in the content of tannins (Coq et al. 2010) and lignin (Osono and Takeda 2004) within the tree species and associated forest floor litter, combined with differences in bacterial communities that dominate tropical and temperate zone leaf litter (Kim et al. 2014) influence the quality of DOM passing through the litter layer into the soils.

Once DOM enters the soils, differences in the geology and resulting soils among the watersheds may play a role in what enters the groundwater and ultimately the stream. The Rio Tempisquito watershed contains relatively young allophane-rich Andisols of volcanic origin, whereas the White Clay Creek watershed contains deep, unglaciated Ultisols developed from quartz, schist, gneiss, and marble, and the Neversink River watershed soils are primarily shallow Inceptisols of glacial origin with Histosols in wetlands near some of the smaller streams. These differences in the soils (Buurman et al. 2007; Hernandez et al. 2012) and their associated microbial communities (Ramette and Tiedje 2007) likely contributed to the resulting patterns of DOM molecular composition across watersheds (Findlay et al. 2008).

The River Continuum Concept prediction of peak DOM diversity in 1st-order streams and the suggestion that DOM diversity would decline 2- to 3-fold with downstream transport were made well before advances in geochemistry revealed the complexity of the DOM pool (Hockaday et al. 2009; Kim et al. 2003; Marshall et al. 1998) and thus could not be tested. These advances in geochemical analyses and our studies reported here are based on ultrahigh-resolution mass spectrometry that allows for the calculation of unique molecular formulae for the ion peaks present in each DOM spectrum (Marshall et al. 1998). Our measurement of formula diversity as opposed to the molecular DOM diversity complicates an explicit test of the River Continuum hypothesis concerning DOM and also presents a challenge to our understanding of the geochemistry and biogeochemistry of natural DOM, as small changes in DOM structure can influence its reactivity (Ball and Aluwihare 2014). Determining the structure associated with molecular composition within a complex mixture and relating structure to DOM reactivity and degradation is a contemporary analytical challenge that remains to be solved. The current best approach appears to be a combination of different techniques to constrain the likely structures associated with an elemental composition (Abdulla et al. 2013).

Our findings of peak formula diversity and peak number of unique molecular formulae in the 1st-order sites within the Rio Tempisquito and White Clay Creek watersheds support the River Continuum Concept prediction, but the 15–27 % decline in formula diversity from 1st-order streams to 2nd- and 3rd-order reaches is considerably more gradual than the 2- to 3-fold decline (equivalent to a 50–67 % loss) in DOM molecular diversity envisioned (Vannote et al. 1980). Furthermore, the molecular-level data presented here extend previous observations that in-stream modifications to DOM composition are most pronounced in 1st-order sites within a river network and the resulting composition persists over several subsequent stream orders (Kaplan et al. 1980). Clearly, extending molecular analyses to larger, downstream rivers would be a useful addition to our studies. Those measurements are hampered by limited access to large river networks without substantial anthropogenic alteration. However, we anticipate that our findings that dramatic changes were confined to 1st-order streams and that little change occurred with increasing stream order downstream would persist beyond the 5th-order stream we sampled.
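To place the observed decline and the River Continuum Concept prediction on a common scale, a k-fold decline can be converted to a percent decline, and vice versa. The sketch below is simple arithmetic with hypothetical function names; it assumes that a "k-fold decline" means the downstream value is 1/k of the upstream value.

```python
def fold_to_percent_decline(fold):
    """A k-fold decline leaves 1/k of the original, a loss of 100 * (1 - 1/k) percent."""
    if fold < 1:
        raise ValueError("fold must be at least 1")
    return 100.0 * (1.0 - 1.0 / fold)

def percent_to_fold_decline(percent):
    """Inverse conversion: a p percent decline corresponds to a 1 / (1 - p/100) fold decline."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)

predicted = [fold_to_percent_decline(k) for k in (2, 3)]    # 2- and 3-fold declines
observed = [percent_to_fold_decline(p) for p in (15, 27)]   # 15 % and 27 % declines
```

Under this reading, 2- and 3-fold declines correspond to losses of 50 % and about 67 %, whereas the observed 15–27 % declines correspond to roughly 1.2- to 1.4-fold.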

Although the number of samples analyzed in our study was small and constrained to a single time of year and the headwater streams within river networks, we believe that the similarity of DOM trends across watersheds adds credence to our interpretation, and posit that the novel data we have generated provide valuable insights to DOM patterns and processes within a meta-ecosystem context (Vannote et al. 1980). Additionally, the relatively uniform loading of DOC onto the columns of PPL adsorbent gives us confidence that the DOM diversity patterns we observed are real and not artifacts associated with varying recoveries during the solid phase extraction. Any contributions from bacteria or similarly-sized particles that passed through the GF/F filters are likely to be trivial, because we have been unable to detect any DOC concentration differences between GF/F and 0.2 µm filtrates of stream water (Kaplan, unpublished data) and estimates based on calculations involving cell density, cell size, and C cell−1 (Bott and Kaplan 1985) suggest a maximum contribution of ≤20 µg C L−1.

Conclusion

Our study represents a major step toward affirming the long-standing hypothesis within river ecology that DOM molecular diversity decreases with stream order (Vannote et al. 1980). Understanding of DOM biogeochemistry, including patterns of chemogeography and chemodiversity within and among fluvial networks, would benefit from analyses of a greater number of samples that extend the spatial and temporal coverage of our investigation. Including structural measurements as a complement to the compositional determinations could lead to a true assessment of DOM chemodiversity and improved understanding of DOM processing and diagenesis. Finally, combining recent advances in low throughput, ultrahigh-resolution mass spectrometry with more commonly used high throughput, low-resolution optical techniques (Sleighter et al. 2014; Stubbins et al. 2010) could advance the understanding of temporal dynamics of DOM molecules in river networks.