1 Introduction

The implementation of the Water Framework Directive (WFD) in Europe requires cost-effective methods to assess, on a large scale, the status of water bodies and to evaluate management options to advise water managers. Nutrient loading and water quality models are increasingly used as a tool to support assessment of the ecological status of water bodies and of the recovery options that should be implemented (e.g., [1]).

For the assessment of ecological status, various types of models are available. The export coefficient modeling approach focuses on large-scale nutrient modeling. It typically uses land use classes based on remote sensing, GIS data sets, and export coefficient data from empirical studies. The approach is simple and logical and the limited input requirements make it useful for assessment at the catchment or national scale [2, 3]. Examples include Moneris for simulating nutrient emissions [4, 5], PolFlow which simulates the transport of nutrients as a function of soil, lithology, and runoff [6], and N_EXRET [7] for simulating nitrogen (N) export from different sources and N retention in lakes and peatlands.

At the catchment scale and in river-dominated basins, advanced process-based, semi-distributed, dynamic nutrient models such as SWAT (e.g., [8]) and INCA [9, 10] can be applied over a wide range of spatial and temporal scales, but these models are currently unsuitable for complex waterways with large lakes or for national scales.

Other models include process-based models that focus on physically based hydrology and nutrient leaching in three dimensions. Such models have been developed and used for assessing water quality issues at the field or small-catchment scale (e.g., NELUP by Lunn et al. [11] and NTT-Watershed by Heng and Nikolaidis [12]). However, as a decision tool for planners and managers, the use of these models is often limited by their high input data requirements, which prevent calibration of the models for large river systems.

Finally, models focusing on nutrient transport and retention over a large scale include the HBV-N and HBV-P models applied to Swedish watersheds [13–15], the HYPE model developed for the entire Baltic Sea catchment area [16], and the VEMALA model discussed here for all Finnish watersheds. The VEMALA modeling system simulates runoff and transport of nitrogen and phosphorus (P) on a daily time-step. Simulations are run for both past and present as well as with climate change and load reduction scenarios. The VEMALA model nutrient loading and scenario results are used by Regional Environmental Centres in planning WFD implementation work in Finnish watersheds.

In order to meet the challenge of estimating the spatial and temporal variability of nutrient loading on a national scale, we developed a large-scale modeling system VEMALA. The objectives of this paper are (1) to provide a detailed description of the structure and underlying assumptions of the VEMALA model in its three current versions (VEMALA 1.1, VEMALA-N, and VEMALA-ICECREAM) and (2) to evaluate the model’s performance by comparison to observed N and P loads and concentrations.

2 Materials and Methods

2.1 Model Description

2.1.1 Background and Versions of VEMALA

VEMALA is a novel, national-scale, operational modeling and assessment system that simulates runoff and water quality on a daily time-step for all Finnish watersheds. The model was specifically developed for Finnish conditions, including large, lake-rich watersheds. It simulates nutrient transport and leaching from terrestrial parts, nutrient transport in rivers, and nutrient processes in rivers and lakes (Fig. 1 and Table 1). VEMALA was developed using the WSFS operational hydrological flood forecasting modeling system [17], which has been developed at the Finnish Environment Institute (SYKE) since the 1980s. In 2005, a copy of the WSFS system was made and work on describing the water quality processes began. Since then, two modeling systems have been in use: WSFS, the operational hydrological flood forecasting system, and WSFS-VEMALA (VEMALA), the new nutrient loading model. Both models use a similar hydrological model. The results of both modeling systems (real-time flood forecasts from WSFS and nutrient loads from VEMALA) are shown on the same Internet page (http://www.environment.fi/en-US/Waters/Hydrological_situation_and_forecasts).

Fig. 1 Structure of the VEMALA model

Table 1 VEMALA model structure and different versions

The first version of VEMALA (vs. 1.1) is based on a regression model between nutrient concentrations and runoff for total phosphorus (TP), total nitrogen (TN), and suspended solids (SS). The model is based on the assumptions that runoff is the main driver transporting nutrients to the water bodies and that nutrient concentration is proportional to the runoff amount. Although the concept of the model is simple, the complexity of the system is high due to the high variability of the computational units to be simulated. The regression model parameters are automatically calibrated for each water quality measurement point, characterized by more than eight annual observations. VEMALA 1.1 has been calibrated and is currently applied over all the river catchments in Finland.

Since VEMALA 1.1, the model has been developed towards a more process-based nutrient loading model, capable of simulating climate and agricultural change scenarios. The model development strategy was to gradually increase the complexity of the VEMALA model by developing a catchment scale, semi-process based model of total nitrogen loading, VEMALA-N and then developing VEMALA further, by coupling it with a field-scale process-based model ICECREAM for total phosphorus loading simulations. The WSFS hydrological model is used as a basis for hydrological simulations in VEMALA 1.1 and VEMALA-N versions, but the field-scale hydrological processes are simulated by ICECREAM in the VEMALA-ICECREAM version. Diffuse loading from the terrestrial component is simulated separately for agricultural fields and non-agricultural areas on the basis of different methods, but the river routing and nutrient transport and lake water and nutrient balance models are the same for all versions (Table 1). Point loading includes loading from sewage treatment plants, industry, fish farming, and peat mining as well as atmospheric deposition to the water surface. Point load is described as daily values added to the simulated loads in rivers and lakes. Annual loadings from scattered settlements are added to the daily simulated concentrations as an additional concentration component in the terrestrial models.

A comprehensive spatial description of the catchments has been created using a basic simulation unit for the hydrological model at the third level sub-catchment (average size of about 60 km2). The model is divided further into fourth level sub-catchments for each lake bigger than 1 ha. Lake catchment area, agricultural field area, and non-agricultural area are estimated and used in the nutrient simulation. The water flow path from lake to lake was estimated from the topographic maps automatically using an algorithm. The model’s first version used the areal distribution of agricultural and non-agricultural areas, whereas in later versions, a more detailed description of agricultural fields including soil textures, slopes, and crops was added (Section 2.2).

2.1.2 Terrestrial Sub-Models

Hydrological Model

The hydrological model used in VEMALA is the WSFS conceptual semi-distributed hydrological model, which is based on the HBV model [18]. Since the 1980s, WSFS has been developed separately and includes many differences from the HBV model [19]. The spatial simulation unit in WSFS is a sub-catchment with a mean size of 60 km2. WSFS simulates the following components of the hydrological cycle: snow accumulation and melt, infiltration, soil moisture, evapotranspiration, and two components of flow—sub-surface flow and base flow (Fig. 2). Their simulation is based on the daily input of temperature, precipitation, and potential evaporation. Snow accumulation and melt are simulated by a degree-day approach assuming that if air temperature is below a threshold temperature T M, the precipitation is accumulated as snow. Snow melt is simulated using Eq. 1:

$$ \mathrm{MELT}=\mathrm{K}\mathrm{M}\times \left({T}_{\mathrm{air}} - {T}_{\mathrm{M}}\right) $$
(1)

where MELT is snow melt (mm day−1), T air is air temperature (°C), KM is the snow melt index (calibrated parameter) (mm °C−1), and T M is a threshold temperature above which snow melt occurs (calibrated parameter) (°C). Snow accumulation and melt are simulated separately for open and forested areas.
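The degree-day rule of Eq. 1 can be illustrated with a minimal Python sketch. The function name `snow_step`, the snow water equivalent variable `swe`, and the cap that limits melt to the stored snow are illustrative assumptions and are not taken from the WSFS code; in the model the calculation would be run separately for open and forested areas.

```python
def snow_step(swe, precip, t_air, km, t_m):
    """Advance a snow storage by one day.

    swe    : snow water equivalent at the start of the day (mm), hypothetical name
    precip : daily precipitation (mm)
    t_air  : daily air temperature (deg C)
    km     : snow melt index KM (calibrated)
    t_m    : threshold temperature T_M (deg C, calibrated)

    Returns (new_swe, melt, rain), all in mm.
    """
    if t_air < t_m:
        # Below the threshold, precipitation accumulates as snow and nothing melts.
        return swe + precip, 0.0, 0.0
    # Eq. 1: MELT = KM * (T_air - T_M)
    potential_melt = km * (t_air - t_m)
    # Assumption: melt cannot exceed the snow that is actually stored.
    melt = min(potential_melt, swe)
    return swe - melt, melt, precip


if __name__ == "__main__":
    print(snow_step(10.0, 5.0, -2.0, 3.0, 0.0))  # cold day: snow accumulates
    print(snow_step(10.0, 0.0, 2.0, 3.0, 0.0))   # warm day: part of the pack melts
```

The sum of `melt` and `rain` would correspond to the water yield that enters the soil storages described next.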

Fig. 2 Schematic presentation of the WSFS hydrological model

Water below the ground surface is stored in three storages: soil moisture storage, upper water storage, and groundwater storage. The two components of flow, which respond with different delays, are formed from the last two storages. The water balance of all three storages is calculated for each time step (1 day). Snowmelt or rainfall water can infiltrate and increase either the soil moisture storage or the upper water storage. The division between soil moisture storage and upper water storage depends on the relative soil moisture:

$$ \mathrm{I}\mathrm{N}\mathrm{F}=\mathrm{YIELD}{\left(\frac{\mathrm{MVS}}{\mathrm{MVAK}}\right)}^{\mathrm{EX}} $$
(2)

where INF is the infiltration into upper water storage (mm day−1), YIELD is the water yield of snowmelt and rainfall (mm day−1), MVS is the soil water storage (mm), MVAK is the calibrated maximum soil water storage (mm), and EX is the calibrated parameter controlling infiltration and soil moisture dependency.

Soil moisture storage is filled by infiltration caused by snowmelt or precipitation water and emptied by evapotranspiration. The daily actual evapotranspiration is simulated by an empirical formula depending on potential evaporation and soil moisture storage:

$$ \mathrm{H}\mathrm{A}=\frac{\mathrm{MVS}}{\mathrm{ALPCF}\times \mathrm{MVAK}}\times \mathrm{H}\mathrm{P}\mathrm{M} $$
(3)

where HA is the actual evapotranspiration (mm day−1), HPM is the potential evaporation (mm day−1), and ALPCF is a calibrated coefficient characterizing the soil moisture level at which actual evapotranspiration becomes equal to potential evaporation (0.40–0.80). Class-A potential evaporation observations are used as the potential evaporation in the actual evapotranspiration calculation. If observations are not available, the potential evaporation is calculated by an empirical formula depending on temperature and precipitation.

Water balance equations for water storages in the model are the following:

$$ {\mathrm{MVS}}_i={\mathrm{MVS}}_{i-1}+\left({\mathrm{YIELD}}_i-{\mathrm{INF}}_i\right)-{\mathrm{HA}}_i $$
(4)
$$ {\mathrm{VV}}_i={\mathrm{VV}}_{i-1}+{\mathrm{INF}}_i-{\mathrm{PO}}_i-{\mathrm{VO}}_i $$
(5)
$$ {\mathrm{GV}}_i={\mathrm{GV}}_{i-1}+{\mathrm{PO}}_i-{\mathrm{OG}}_i $$
(6)

where VV is the upper water storage (on time step i and i − 1) (mm), PO is the percolation from upper water storage to groundwater storage (mm day−1), VO is the sub-surface flow (mm day−1), GV is the groundwater storage (mm), and OG is the groundwater flow (mm day−1).

The components of the flow depend on the amount of water stored in the upper and groundwater storages and on the calibrated response coefficients. Percolation also depends on the amount of water in the upper storage and on the calibrated percolation coefficient:

$$ \mathrm{V}\mathrm{O}=\mathrm{V}\mathrm{C}\times \mathrm{V}\mathrm{V} $$
(7)
$$ \mathrm{P}\mathrm{O}=\mathrm{P}\mathrm{C}\times \mathrm{V}\mathrm{V} $$
(8)
$$ \mathrm{O}\mathrm{G}=\mathrm{G}\mathrm{C}\times \mathrm{G}\mathrm{V} $$
(9)

where VC, PC, and GC are calibrated response parameters. Total daily runoff is the sum of daily sub-surface flow VO and groundwater flow OG.
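Equations 2 to 9 together define one daily update of the three storages. The following Python sketch shows how they could be chained. The class and function names are hypothetical, the fluxes are evaluated from the storages at the start of the day (an assumed explicit scheme), and the cap of actual evapotranspiration at the potential rate is an assumption based on the description of ALPCF rather than part of Eq. 3 as written.

```python
from dataclasses import dataclass


@dataclass
class Storages:
    mvs: float  # soil moisture storage (mm)
    vv: float   # upper water storage (mm)
    gv: float   # groundwater storage (mm)


@dataclass
class Params:
    mvak: float   # maximum soil water storage (mm)
    ex: float     # infiltration exponent
    alpcf: float  # soil moisture level where actual ET reaches potential
    vc: float     # sub-surface flow response coefficient
    pc: float     # percolation coefficient
    gc: float     # groundwater flow response coefficient


def daily_step(s, p, water_yield, hpm):
    """Return (new storages, total runoff in mm) for one day."""
    inf = water_yield * (s.mvs / p.mvak) ** p.ex         # Eq. 2
    ha = min(s.mvs / (p.alpcf * p.mvak), 1.0) * hpm      # Eq. 3, capped (assumption)
    vo = p.vc * s.vv                                     # Eq. 7
    po = p.pc * s.vv                                     # Eq. 8
    og = p.gc * s.gv                                     # Eq. 9
    new = Storages(
        mvs=s.mvs + (water_yield - inf) - ha,            # Eq. 4
        vv=s.vv + inf - po - vo,                         # Eq. 5
        gv=s.gv + po - og,                               # Eq. 6
    )
    return new, vo + og                                  # runoff = VO + OG


if __name__ == "__main__":
    state = Storages(mvs=50.0, vv=20.0, gv=100.0)
    params = Params(mvak=100.0, ex=2.0, alpcf=0.5, vc=0.1, pc=0.05, gc=0.01)
    print(daily_step(state, params, water_yield=10.0, hpm=2.0))
```

Because every flux leaving one storage enters another or leaves the system, the sketch conserves water: the change in total storage equals yield minus evapotranspiration minus runoff.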

Concentration-Runoff Relationship Model (VEMALA 1.1)

The terrestrial loading model is based on a non-linear regression model between nutrient concentration and daily simulated runoff. The concentration–discharge relationship is the simplest way to simulate nutrient concentrations, and it is widely used in nutrient loading estimation models [20–22]. Here, daily simulated runoffs are divided into five classes in order to represent the variation in the concentration–discharge relationship depending on the rainfall sum: r 1 (0–1 mm day−1), r 2 (1–3 mm day−1), r 3 (3–6 mm day−1), r 4 (6–10 mm day−1), and r 5 (>10 mm day−1). The nutrient concentrations from agricultural and non-agricultural areas are calculated from Eqs. 10 and 11. The resulting nutrient concentration is calculated as a weighted average over the land-use areas (Eq. 12). In order to take into account the seasonality of the concentration–runoff relationship, we derived a nutrient concentration–runoff relationship for four seasons for agricultural areas (cold season, spring, vegetative season, and autumn; example in Fig. 3). For non-agricultural areas, there is one nutrient concentration–runoff relationship for all four seasons. The need to distinguish between seasons in agricultural areas arises from the different cultivation procedures during the spring and autumn seasons and the resulting differences in nutrient runoff intensity. In general, the mean nutrient concentration is positively correlated with runoff (Fig. 3). The correlation is more pronounced for the autumn than for the spring season, with higher TP concentrations for the same runoff classes. For the spring season, however, TP concentration decreases slightly if the runoff is between 6 and 10 mm in the agricultural areas.

$$ {c}_{\mathrm{a}}=\frac{r_1\times {c}_{1,\mathrm{a}}+{r}_2\times {c}_{2,\mathrm{a}}+{r}_3\times {c}_{3,\mathrm{a}}+{r}_4\times {c}_{4,\mathrm{a}}+{r}_5\times {c}_{5,\mathrm{a}}}{r}{C}_{f,\mathrm{VIHMA}} $$
(10)
$$ {c}_{\mathrm{n}\hbox{-} \mathrm{a}}=\frac{r_1\times {c}_{1,\mathrm{n}\hbox{-} \mathrm{a}}+{r}_2\times {c}_{2,\mathrm{n}\hbox{-} \mathrm{a}}+{r}_3\times {c}_{3,\mathrm{n}\hbox{-} \mathrm{a}}+{r}_4\times {c}_{4,\mathrm{n}\hbox{-} \mathrm{a}}+{r}_5\times {c}_{5,\mathrm{n}\hbox{-} \mathrm{a}}}{r}{C}_{f,\mathrm{VEPS}} $$
(11)
$$ c=\frac{c_{\mathrm{a}}\times {a}_{\mathrm{a}}+{c}_{\mathrm{n}\hbox{-} \mathrm{a}}\times {a}_{\mathrm{n}\hbox{-} \mathrm{a}}}{a_{\mathrm{tot}}}+{c}_{\mathrm{setl}} $$
(12)

where c a and c n-a are the daily nutrient concentrations for agricultural and non-agricultural land, respectively, taking into account the nutrient concentration of each runoff class for each land use class: c 1,a, c 2,a, c 3,a, c 4,a, c 5,a, c 1,n-a, c 2,n-a, c 3,n-a, c 4,n-a, and c 5,n-a; c is the resulting nutrient concentration; a a is the area of agriculture; a n-a is the non-cultivated area; and c setl is the daily nutrient concentration from scattered settlements.
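As an illustration of Eqs. 10 to 12, the Python sketch below computes a runoff-weighted concentration for one land-use class and then the area-weighted catchment concentration. Several points are assumptions made only for this example: the daily runoff r is split incrementally over the five classes so that the parts r 1 to r 5 sum to r, the total area is taken as the sum of the agricultural and non-agricultural areas, and the scaling factor is applied as a multiplier, as Eq. 10 is written. All function names are hypothetical.

```python
# Upper bounds of the five runoff classes (mm/day), taken from the text.
CLASS_UPPER = [1.0, 3.0, 6.0, 10.0, float("inf")]


def split_runoff(r):
    """Split daily runoff r into the parts falling in each class (assumption)."""
    parts = []
    lower = 0.0
    for upper in CLASS_UPPER:
        parts.append(max(0.0, min(r, upper) - lower))
        lower = upper
    return parts


def land_use_concentration(r, class_conc, scale):
    """Eqs. 10 and 11: runoff-weighted class concentrations times a scaling factor."""
    if r <= 0.0:
        return 0.0  # no runoff, no transported load
    parts = split_runoff(r)
    weighted = sum(ri * ci for ri, ci in zip(parts, class_conc))
    return weighted / r * scale


def catchment_concentration(c_a, c_na, a_a, a_na, c_setl):
    """Eq. 12: area-weighted mean plus the scattered-settlement component."""
    a_tot = a_a + a_na  # assumption: only these two land-use classes
    return (c_a * a_a + c_na * a_na) / a_tot + c_setl


if __name__ == "__main__":
    c_agri = land_use_concentration(4.0, [10.0, 20.0, 30.0, 40.0, 50.0], 1.0)
    print(c_agri)
    print(catchment_concentration(c_agri, 10.0, 1.0, 3.0, 1.0))
```

In the model, the class concentrations for agricultural land would additionally depend on the season, so a full implementation would select one of four parameter sets before calling the agricultural calculation.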

$$ {C}_{f,\mathrm{VIHMA}}=\frac{L_{\mathrm{VEMALA},\ \mathrm{a}}}{C_{\mathrm{man},\mathrm{a}}{L}_{\mathrm{VIHMA}}} $$
(13)
$$ {C}_{f,\mathrm{VEPS}}=\frac{L_{\mathrm{VEMALA},\mathrm{n}\hbox{-} \mathrm{a}}}{C_{\mathrm{man},\mathrm{n}\hbox{-} \mathrm{a}}{L}_{\mathrm{VEPS}}} $$
(14)

where L VEMALA,a and L VEMALA,n-a are the long-term simulated mean annual loadings from agricultural and non-agricultural areas (kg a−1); L VIHMA is the mean annual loading from agriculture estimated by the VIHMA assessment tool (kg a−1); L VEPS is the mean annual loading from forest and forestry estimated by the VEPS assessment tool (kg a−1); C man,a and C man,n-a are the coefficients set by manual calibration for those river catchments where the VIHMA and VEPS mean annual loadings cannot produce realistic diffuse loading estimates that match observed nutrient concentrations in streams and lakes; C f,VIHMA is the scaling factor that scales the mean annual agricultural loading L VEMALA,a to match the mean annual agricultural loading from VIHMA, L VIHMA; and C f,VEPS is the scaling factor that scales the mean annual non-agricultural loading L VEMALA,n-a to match the mean annual forest loading from VEPS, L VEPS.

Fig. 3 Mean TP concentration–runoff relationship for agricultural and non-agricultural areas for the spring (a) and autumn (b) season for the river Vantaanjoki (48 sub-catchments). Error bars show the 25th and 75th percentiles of parameter variation between sub-catchments

In the VEMALA 1.1 model, there are only two land use classes: agriculture and non-agriculture. The source apportionment of diffuse loading between these two land use classes has been a problematic issue, since parameters are calibrated against water quality observations, which indicate only the total nutrient load. Moreover, retention in rivers and lakes is calibrated simultaneously with the loading parameters, which makes the balance between loading and retention difficult to calibrate. Therefore, we adjusted the VEMALA-simulated average agricultural loading using the average agricultural loading simulated by the VIHMA tool [23]. VIHMA estimates the average loading from each field plot based on crop class, slope class, soil texture class, and the agricultural practices applied on the field. The average non-agricultural loading simulated by VEMALA is adjusted to the loading values of the VEPS load estimation tool [23]. VIHMA and VEPS loading estimates are used in VEMALA by limiting the simulated average loading in VEMALA to a range around the VIHMA and VEPS estimates (Eqs. 13 and 14). The range is set by manual calibration through the coefficients C man,a and C man,n-a, which have a default value of 1.0 and range from 0.7 to 1.3. The resulting scaling factors C f,VIHMA and C f,VEPS are used to scale the daily simulated concentrations (Eqs. 10 and 11).

Nitrogen Leaching Model (VEMALA-N)

In the VEMALA-N model, nitrate (NO3−) and organic nitrogen are described separately. Organic nitrogen is simulated using a concentration–discharge relationship in which sub-surface and base flow are characterized by different organic nitrogen concentrations. Nitrate is simulated using a semi-process-based model similar to the INCA approach [10, 24]. Ammonium (NH4+) leaching is neglected at this point. On average, in Finnish river catchments, ammonium loading represents only a small fraction (around 6 %) of total nitrogen (TN) loading [25]. Although NH4+ leaching is neglected, the NH4+ storage in the soil is described and linked to the soil organic nitrogen and nitrate.

In the VEMALA-N model, six land uses/crop classes are defined as follows: spring cereals, winter cereals, grassland, root crops, green fallow, and forest. The nitrogen processes included in the soil model simulating nitrate leaching are mineralization, nitrification, denitrification, immobilization, plant uptake, fertilizer input, and dissolution and nitrogen leaching (equations presented in Appendix 1) (Fig. 4).

Fig. 4 Schematic presentation of the VEMALA-N model (equations presented in Appendix 1)

VEMALA-N uses the WSFS conceptual hydrological model (Section 2.1.2.1) to simulate nitrate transport through the soil. Thus, there are two flow components: sub-surface flow and base flow. There is no surface runoff component in the WSFS hydrological model; instead, the conceptual sub-surface flow represents the sum of surface runoff and shallow sub-surface flow with a short residence time in the soil. For the sub-surface flow, NO3 concentration is estimated by simulating the NO3 balance in the soil (Appendix 1), using the WSFS soil moisture storage (MVS). It is assumed that WSFS simulates soil moisture over a 1-m deep layer. The conceptual base flow represents the groundwater flow and the soil water flow with a longer residence time. NO3 concentration is assumed to be constant for the base flow and is estimated by calibration, with the calibration limits set according to observed groundwater concentrations in Finland [26]. One limitation of the WSFS hydrological model for nutrient leaching modeling is its lack of a surface runoff component, because nutrient transport greatly depends on whether or not the water flows through the soil profile. Moreover, the soil profile is not divided into layers, which is a limitation because nutrient concentrations in the soil vary with depth. These limitations should be considered in future developments of the hydrological model for VEMALA.

Most of the nitrogen processes in the soil are simulated as first-order kinetic processes that depend on the mass of the nitrogen fractions in the soil, soil temperature, and soil moisture (Appendix 1). The most appropriate function relating mineralization and soil temperature is a logistic function, which has an S-shape as suggested by Dessureault-Rompre et al. [27]. According to this function, mineralization is low at low soil temperatures around 0 °C and increases rapidly between 5 and 15 °C, finally reaching a plateau at ca. 20 °C. A parabolic function is used to describe the effect of soil moisture on mineralization as suggested by Myers et al. [28] and Paasonen-Kivekäs et al. [29]. The maximum rate of mineralization is found when the soil moisture reaches field capacity.
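The shape of the two response functions described above can be sketched in Python. The actual equations and parameter values are given in Appendix 1 and are not reproduced here; the functional forms, the midpoint of 10 °C, the steepness of 0.5, and all function names below are hypothetical choices that merely reproduce the qualitative behaviour described in the text (low near 0 °C, rapid increase between 5 and 15 °C, plateau near 20 °C, and a moisture optimum at field capacity).

```python
import math


def temperature_factor(t_soil, midpoint=10.0, steepness=0.5):
    """S-shaped (logistic) response of mineralization to soil temperature (deg C)."""
    return 1.0 / (1.0 + math.exp(-steepness * (t_soil - midpoint)))


def moisture_factor(soil_moisture, field_capacity):
    """Parabolic response with its maximum (1.0) at field capacity."""
    x = soil_moisture / field_capacity
    return max(0.0, 1.0 - (x - 1.0) ** 2)


def mineralization(n_organic, k_min, t_soil, soil_moisture, field_capacity):
    """First-order mineralization rate modified by temperature and moisture."""
    return (k_min * n_organic
            * temperature_factor(t_soil)
            * moisture_factor(soil_moisture, field_capacity))


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.0, 15.0, 20.0):
        print(t, round(temperature_factor(t), 3))
```

With these illustrative parameters, the temperature factor is below 0.01 at 0 °C and above 0.99 at 20 °C, which matches the qualitative description but not necessarily the calibrated model.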

In VEMALA-N, denitrification depends on the NO3 availability and soil moisture in the soil. Soil temperature is not yet included in this version, mainly in order to be able to reach high denitrification values during low temperature periods as reported by Martikainen et al. [30]. According to these authors, emissions of N2O (originating from nitrification and denitrification) during winter can be, on average, 57 % of the annual flux. Both mineral and organic soils have substantial N2O production at temperatures close to zero.

The simulation of immobilization in the model varies with inorganic N storages in the soil (NO3 and NH4+), soil moisture, and soil temperature. However, Martikainen et al. [30] reported that immobilization responded weakly to soil temperature changes between +0.5 and +15 °C.

The growth of plant biomass is related to air temperature sums over the vegetative season [31]. Nitrogen uptake by plants is simulated using a daily nitrogen demand, taking into account daily plant biomass growth and soil moisture stress. Mass balances of NO3 and NH4+ in the soil are simulated for each time step by the equations found in Appendix 1. The concentration of NO3 in the soil solution is simulated by assuming that all the NO3 storage is dissolved in the soil water of the simulated soil layer. The NO3 concentration in groundwater is assumed to be constant with time, but different for agricultural and non-agricultural areas. Further developments of the VEMALA-N model include the simulation of ammonium leaching from agricultural and non-agricultural areas.
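The two concentration assumptions stated above (complete dissolution of the soil NO3 storage, and a constant base flow concentration) can be illustrated with a short Python sketch. The units of the NO3 storage (kg ha−1), the flow-weighted mixing of the two flow components, and the function names are assumptions made for this example; the model equations themselves are in Appendix 1.

```python
def soil_no3_concentration(no3_storage_kg_ha, soil_water_mm):
    """Concentration (mg/L) if all stored NO3 is dissolved in the soil water.

    1 mm of water over 1 ha is 10 m3 = 10,000 L, and 1 kg = 1e6 mg,
    so the conversion factor is 1e6 / 1e4 = 100.
    """
    return 100.0 * no3_storage_kg_ha / soil_water_mm


def runoff_no3_concentration(vo, og, c_soil, c_base):
    """Flow-weighted concentration of total runoff (assumed mixing rule).

    vo     : sub-surface flow (mm/day), carrying the soil solution concentration
    og     : base flow (mm/day), carrying the constant base flow concentration
    """
    total = vo + og
    if total <= 0.0:
        return 0.0
    return (vo * c_soil + og * c_base) / total


if __name__ == "__main__":
    c_soil = soil_no3_concentration(10.0, 200.0)
    print(c_soil)
    print(runoff_no3_concentration(2.0, 1.0, c_soil, 2.0))
```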

The VEMALA-N model has 20 calibrated parameters (Appendix 1), most of which have rather tight limits because they characterize the rates of the nitrogen processes. The parameter limits have been set using values found in the literature or by trial and error, so that the model results match the observed concentrations in the streams. They also have to match the annual values of nitrogen sub-processes in soils given in the literature. The parameters are calibrated automatically (Section 2.3) by minimizing the difference between observed and simulated river and lake concentrations and loads.

Agricultural Phosphorus Loading Model (VEMALA-ICECREAM)

In the VEMALA-ICECREAM version, the field-scale process-based model ICECREAM has been coupled to the VEMALA model for the simulation of total phosphorus loading from agricultural fields. ICECREAM is based on the CREAMS [32] and GLEAMS [33] models, applied to Finnish conditions by Rekolainen and Posch [31] and further developed for simulation of phosphorus loading from Finnish fields by Tattari et al. [34], Yli-Halla et al. [35], Bärlund et al. [36], and Jaakkola et al. [37]. In VEMALA-ICECREAM, the ICECREAM model has been applied to each field plot in every third level sub-catchment. The total daily loading is simulated for each field, and the daily total agricultural loading is the sum of the loadings from all fields in a third level sub-catchment.

The hydrology of ICECREAM is based on the so-called bucket model. The bucket model simulates the downward movement of soil water based on four parameters: soil hydraulic conductivity, wilting point, field capacity, and soil porosity. The SCS curve number method is used for calculating surface runoff [38]. For clay soils, the bypass flow simulation developed by Jaakkola et al. [37] is applied. Evapotranspiration is calculated using the Penman–Monteith method [39].

Phosphorus simulation is based on the flow between three mineral P pools (stable, active, and labile P) and three organic P pools (manure, fresh organic, and stable organic P). The initial inorganic P content is calculated from the concentration of acid ammonium acetate extractable P, which is routinely tested from agricultural fields in Finland (later referred to as soil-test P value). The simulated P loading consists of particulate phosphorus (PP) and dissolved phosphorus (DP). They are lost via surface runoff and in clay soils also via bypass flow. In addition, dissolved phosphorus is lost with the water percolating through the soil profile. Losses of PP in surface runoff are linked to soil erosion, which is calculated with the modified USLE model [40, 41]. A schematic presentation of the phosphorus flows in the model is shown in Fig. 5.

Fig. 5 Simulation of phosphorus flows in the ICECREAM model

To make the ICECREAM model better applicable in VEMALA, we revised the initialization of the inorganic P pools and the balance calculation between the pools on the basis of experimental data from Finnish fields. Here, we present only the P transport equations that differ from those in the earlier versions of ICECREAM. The equations not presented here can be found in the article of Tattari et al. [34].

Equation 15 is used to calculate the initial inorganic P content of the soil. It is based on the data from 23 mineral soils in Finland [42, 43].

$$ \mathrm{soil}\;\mathrm{inorganic}\;\mathrm{P}=12.1\times \mathrm{Plab}+4.21\times \mathrm{clay}\%+237 $$
(15)

Soil inorganic P is in milligrams per kilogram, Plab is labile phosphorus in milligrams per kilogram, and clay% is the percentage of clay in the soil. Plab is calculated from the soil-test P value as suggested by Bärlund et al. [36]. In Eq. 16, the change in the soil-test P value over 1 year is calculated. This value is used for calculating the balance between different P pools. The equation is derived from the data of the 18 non-fertilized fields in the field trials of Saarela et al. [43, 44].

$$ \mathrm{S}\mathrm{T}\mathrm{P}={\mathrm{STP}}_0+0.0022\times {\mathrm{STP}}_0\times {\mathrm{P}}_{\mathrm{bal}} $$
(16)

STP is the soil-test P value (mg kg−1) at the end of the year, STP0 is the soil-test P value (mg kg−1) at the beginning of the year, and Pbal is the change in the total P content of the soil during the year (kg ha−1).

We also changed the equation for calculating the DP load from surface runoff (in kg m−2) to the same equation used for calculating DP load from bypass flow and percolating water (Eqs. 17 and 18). Surface runoff is assumed to take place in the plough layer. When traditional tillage methods are used in the simulation, the plough layer consists of the three uppermost layers (25 cm). Otherwise, the layer used in the simulation consists of the two uppermost layers (13 cm). The calculation is carried out for each layer starting from the top layer.

$$ \mathrm{D}\mathrm{P}={C}_{\mathrm{Pav}}\times q $$
(17)

where

$$ {C}_{\mathrm{Pav}}=\frac{\mathrm{Plab}}{K_{\mathrm{d}}\times {m}_{\mathrm{s}}+\left(\mathrm{s}\mathrm{w}+{q}_{\mathrm{runoff}}+{q}_{\mathrm{bypass}}\right)} $$
(18)

C Pav is the concentration of DP in soil water (kg L−1), q is the daily runoff through soil matrix/surface runoff/water flow through macropores (mm), Plab is labile P in the soil layer (kg m−2), K d is a partitioning coefficient (L kg−1), m s is soil mass (kg m−2), sw is total amount of water in the soil layer (mm), q runoff is daily surface runoff (mm), and q bypass is daily runoff through macropores (mm).
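Equations 15 to 18 are simple enough to check by hand, and a small Python sketch makes the unit handling explicit. The function and argument names are hypothetical, and the input values in the usage lines are arbitrary illustrative numbers, not data from the field trials. Note that 1 mm of water equals 1 L m−2, so a concentration in kg L−1 multiplied by a flow in mm gives a load in kg m−2.

```python
def initial_soil_inorganic_p(plab_mg_kg, clay_percent):
    """Eq. 15: initial soil inorganic P (mg/kg) from labile P and clay content."""
    return 12.1 * plab_mg_kg + 4.21 * clay_percent + 237.0


def soil_test_p_end_of_year(stp0, p_bal):
    """Eq. 16: soil-test P (mg/kg) after one year with P balance p_bal (kg/ha)."""
    return stp0 + 0.0022 * stp0 * p_bal


def dp_concentration(plab_kg_m2, k_d, m_s, sw, q_runoff, q_bypass):
    """Eq. 18: dissolved P concentration in soil water (kg/L).

    k_d * m_s has units L/m2, and the water terms are in mm = L/m2.
    """
    return plab_kg_m2 / (k_d * m_s + (sw + q_runoff + q_bypass))


def dp_load(c_pav, q):
    """Eq. 17: dissolved P load (kg/m2) carried by a daily flow q (mm)."""
    return c_pav * q


if __name__ == "__main__":
    print(initial_soil_inorganic_p(10.0, 30.0))
    print(soil_test_p_end_of_year(10.0, -10.0))  # negative balance lowers soil-test P
    c = dp_concentration(1e-3, 100.0, 1.0, 80.0, 15.0, 5.0)
    print(c, dp_load(c, 15.0))
```

In the model, the last two functions would be evaluated for each soil layer of the plough layer, starting from the top, and for each flow path (surface runoff, bypass flow, and percolation).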

Bypass flow (water flow through macropores in clay soils) is simulated in a manner similar to that described by Jaakkola et al. [37], with some modifications. As in that work, bypass flow is generated when the threshold values for soil moisture and precipitation/snow melt are exceeded, according to Eq. 19:

$$ {q}_{\mathrm{bypass}}=\varepsilon \times \mathrm{reff} $$
(19)

where ε is the fraction of precipitation/snow melt routed through macropores (a calibrated parameter) and reff is precipitation/snow melt in millimeters. However, the soil moisture is calculated from the whole soil profile rather than from the topmost two layers, as the bucket model is not sensitive enough to reliably calculate the soil moisture in just a small soil layer. As another new feature, bypass flow is also generated when the soil is frozen, even if the soil moisture threshold value is not exceeded. The fraction of precipitation/snow melt routed through macropores is calibrated but, unlike in the work of Jaakkola et al. [37], it is calibrated separately for frozen and frost-free soil. The amount of surface runoff is then calculated from the precipitation/snow melt without first deducting the amount of the bypass flow.
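The decision logic around Eq. 19 can be summarized in a short Python sketch. The threshold arguments and the function name are hypothetical, since the threshold values are not given in the text. The text also does not state whether the precipitation/snow melt threshold still applies when the soil is frozen; the sketch keeps that threshold in both cases, which is an assumption.

```python
def bypass_flow(reff, soil_moisture, frozen,
                eps_unfrozen, eps_frozen,
                moisture_threshold, reff_threshold):
    """Daily bypass flow through macropores (mm), following Eq. 19.

    reff          : precipitation/snow melt (mm)
    soil_moisture : soil moisture of the whole profile (mm)
    frozen        : True if the soil is frozen
    eps_*         : calibrated fractions routed through macropores,
                    separate for frost-free and frozen soil
    """
    if reff <= reff_threshold:
        return 0.0  # assumption: too little water input to trigger bypass flow
    if frozen:
        # Frozen soil: generated even if the moisture threshold is not exceeded.
        return eps_frozen * reff
    if soil_moisture > moisture_threshold:
        return eps_unfrozen * reff
    return 0.0


if __name__ == "__main__":
    args = dict(eps_unfrozen=0.2, eps_frozen=0.4,
                moisture_threshold=100.0, reff_threshold=1.0)
    print(bypass_flow(10.0, 50.0, False, **args))   # dry, unfrozen soil
    print(bypass_flow(10.0, 150.0, False, **args))  # wet, unfrozen soil
    print(bypass_flow(10.0, 50.0, True, **args))    # frozen soil
```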

PP load in bypass (kg m−2) is calculated from the same soil layers as DP load according to Eq. 20:

$$ {\mathrm{PP}}_{\mathrm{bypass}}={q}_{\mathrm{bypass}}\times \omega \times {\displaystyle \sum}\left({\mathrm{PP}}_{\mathrm{il}}+{\mathrm{PP}}_{\mathrm{ia}}+{\mathrm{PP}}_{\mathrm{is}}+{\mathrm{PP}}_{\mathrm{oh}}+{\mathrm{PP}}_{\mathrm{om}}\right) $$
(20)

where ω is the fraction of PP moving from P pools to macropores (mm−1, a calibrated parameter) and the PP pools are labile (PPil), active (PPia), stable (PPis), stable organic (PPoh) and manure (PPom) in kilograms per square meter. However, this approach underestimated the bypass PP loading in wet autumn months. Thus, we developed a new empirical equation to reach the high PP loadings in wet seasons. The equation is based on simulated daily soil moisture and monthly measurements of particulate phosphorus loading through sub-surface drains in the Kotkanoja field in Jokioinen [45]. When observing the data, we found that the PP loading through macropores increased exponentially as the simulated soil moisture increased. The best correlation was achieved when the average soil moisture from the day under simulation and the 9 previous days were used. When the soil is frozen, the soil moisture is calculated as if the soil were at its wilting point. The new equation (Eq. 21) is used for calculating a coefficient to fix the fraction of PP pools moving from P pools to macropores.

$$ \beta = \rho \times {e}^{\gamma \times \mathrm{s}\mathrm{w}10} $$
(21)

where β is a coefficient for adjusting the fraction of PP pools moving from P pools to macropores, ρ and γ are parameters for calibrating β, and sw10 is the average soil water content of the last 10 days (mm).

The hydrological simulation of ICECREAM gave much lower evaporation, especially for the bare soil in spring after snow melt, than the VEMALA model. For this reason, the total evapotranspiration for ICECREAM outside the growing season is now calculated by VEMALA. During the growing season, transpiration and evaporation are calculated by ICECREAM but corrected with a correction factor in order to reach the same annual total evapotranspiration as in VEMALA. Snow water equivalent and precipitation/snow melt are also inputs from VEMALA.
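A minimal Python sketch of Eqs. 20 and 21 is given below. It assumes that the 10-day mean covers the current day and the 9 previous days, that a frozen day contributes the wilting point value to that mean, and that the mean is taken over the days available so far during the first 9 days of a run. How β is combined with ω is not specified in the text, so the two equations are kept separate here. All names are hypothetical.

```python
import math
from collections import deque


def pp_bypass(q_bypass, omega, pp_pools):
    """Eq. 20: particulate P load in bypass flow (kg/m2).

    pp_pools : iterable with the labile, active, stable, stable organic,
               and manure PP pools (kg/m2)
    """
    return q_bypass * omega * sum(pp_pools)


class BypassPPCoefficient:
    """Eq. 21: beta = rho * exp(gamma * sw10), with a rolling 10-day mean."""

    def __init__(self, rho, gamma, wilting_point, window=10):
        self.rho = rho
        self.gamma = gamma
        self.wilting_point = wilting_point
        self.history = deque(maxlen=window)  # keeps only the latest `window` days

    def update(self, soil_water, frozen):
        # Frozen soil is treated as if it were at its wilting point.
        self.history.append(self.wilting_point if frozen else soil_water)
        sw10 = sum(self.history) / len(self.history)
        return self.rho * math.exp(self.gamma * sw10)


if __name__ == "__main__":
    beta = BypassPPCoefficient(rho=2.0, gamma=0.01, wilting_point=50.0)
    for day, sw in enumerate([100.0, 300.0, 250.0]):
        print(day, beta.update(sw, frozen=False))
    print(pp_bypass(5.0, 0.01, [1.0, 2.0, 3.0, 4.0, 5.0]))
```

The exponential form means that β grows quickly once the averaged soil water content is high, which is the behaviour the authors report for wet autumn months.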

Further minor changes have been made to the model. After modifying the frost simulation, soil frost does not melt before the snow has melted. The growth of grass has been changed so that the grass which has grown above the ground after the last harvest in the autumn will die in the spring and the above-ground growth starts from zero. The growth of winter wheat has also been changed. In the new version, the growth of winter wheat after winter is slower than in the earlier version. The erosion of grass-covered soil has been increased by making the soil loss factor independent of grass height.

For calibration and development of the ICECREAM model, experimental data from four Finnish agricultural fields were used. These experiments have been published by Turtola and Kemppainen [46], Koskiaho et al. [47], Puustinen et al. [48], Uusitalo et al. [49], and Turtola et al. [45]. However, most of the parameters used have been adopted from the previous versions of ICECREAM [34, 36].

Parameters of the field-scale ICECREAM model have been calibrated against field measurements of nutrient transport. However, the catchment scale final loading results of VEMALA-ICECREAM have been manually calibrated in the same way as described in Section 2.1.2.2. Manually calibrated coefficients are used to adjust the ICECREAM simulated load for those river catchments where ICECREAM agricultural loading and VEMALA forest loading are not able to produce realistic diffuse loading estimates to match observed nutrient concentrations in streams and lakes.

The ICECREAM model was originally designed for simulating mineral soils only. However, in some river catchments, peat is a common soil type of agricultural fields. For example, in the river Lapuanjoki and Karvianjoki catchments, one fourth of the arable land is peat. For this reason, it is also necessary to be able to simulate phosphorus loading from organic fields. The development of the ICECREAM extension for simulating peat soils has been reported by Piirainen [50].

2.1.3 River Sub-Models

River Routing Sub-Model

In the VEMALA model, we simulate water flow through the river according to the WSFS model. First, the river is split into smaller stretches. The water flows into a river stretch either from the previous river stretch or lake or as runoff from a land area. Then, the inflow (Q in) increases the water level of the river stretch (w), which increases its outflow (Q out). Traditionally, the rating curve between the outflow and the water level of the river can be described with the following equation:

$$ {Q}_{\mathrm{out}}=a{\left({w}_i-b\right)}^c $$
(22)
$$ a={C}_Q{A}_{\mathrm{tot}}^{C_{\exp }}, $$
(23)

where b, c, C Q , and C exp are calibrated parameters and \( {A}_{tot} \) is the catchment size. The parameter a characterizes the effect of the river size on the outflow discharge. In the VEMALA model, we express the rating curve (Eq. 22) in a simpler form:

$$ {Q}_{\mathrm{out},i}=a\left({C}_{\mathrm{k}1}w+\left(1-{C}_{\mathrm{k}1}\right){w}^2\right), $$
(24)

where C k1 is a calibrated parameter characterizing the shape of the rating curve. If C k1 is 1.0, the rating curve is linear. If C k1 is 0.0, the rating curve is a parabola (see Fig. 6). Equation 24 has a simple quadratic form, which allows a rapid numerical solution of the equations. However, it still captures the essential form of the rating curve and gives acceptable results. In order to calculate Q out, we need to know the water level w. For this, we use the continuity (mass balance) equation for each river reach:

$$ V\left(t+dt\right)=V(t)+{Q}_{\mathrm{in},i}dt-{Q}_{\mathrm{out},i}dt, $$
(25)

where V(t) is the volume of the river stretch at time t and \( dt \) is a time step. \( {Q}_{in} \) and \( {Q}_{out} \) are the in and out flows of water from river reach i. In VEMALA, the cross-section of the river has a trapezoidal shape, allowing calculation of the volume of the river reach:

$$ V(t) = l\ \left({\theta}_{\mathrm{slope}}{w}^2(t)+{H}_{\mathrm{width}}\ w(t)\right), $$
(26)

where l is the length, θ slope is the slope, and H width is the width of the river stretch. By substituting Eqs. 26 and 24 for the terms V(t + dt), V(t) and Q out,i in Eq. 25, we get

$$ l\left({\theta}_{\mathrm{slope}}{w}^2(t)+{H}_{\mathrm{width}}w(t)\right)+a\left({C}_{\mathrm{k}1}{w}_i+\left(1-{C}_{\mathrm{k}1}\right){w}_i^2\right)dt=l\left({\theta}_{\mathrm{slope}}{w}^2\left(t-dt\right)+{H}_{\mathrm{width}}w\left(t-dt\right)\right)+{Q}_{\mathrm{in},i}dt $$
(27)

Here, we can solve w(t) using the water level from the previous time step w(t-dt). Finally, \( {Q}_{out} \) is solved from Eq. 24 and used as an inflow to other river parts or lakes.
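The solution step described above can be sketched as follows. Collecting terms in Eq. 27 gives a quadratic in the unknown water level, which has one non-negative root. The function name, argument names, and the numbers in the usage lines are hypothetical and chosen only to make the example self-contained.

```python
import math


def route_reach(w_prev, q_in, dt, length, slope, width, a, ck1):
    """One time step for a river reach (Eqs. 24-27).

    Returns the new water level w and the outflow Q_out.
    """
    # Volume at the previous time step, Eq. 26.
    v_prev = length * (slope * w_prev**2 + width * w_prev)
    # Eq. 27 rearranged to A*w^2 + B*w - C = 0.
    A = length * slope + a * (1.0 - ck1) * dt
    B = length * width + a * ck1 * dt
    C = v_prev + q_in * dt
    if A == 0.0:
        w = C / B  # purely linear case
    else:
        # Non-negative root of the quadratic.
        w = (-B + math.sqrt(B * B + 4.0 * A * C)) / (2.0 * A)
    q_out = a * (ck1 * w + (1.0 - ck1) * w**2)  # Eq. 24
    return w, q_out


# Hypothetical reach: 1 km long, 10 m wide, hourly time step.
w_new, q_new = route_reach(w_prev=1.0, q_in=5.0, dt=3600.0,
                           length=1000.0, slope=2.0, width=10.0,
                           a=2.0, ck1=0.5)
```

A quick check on any such implementation is that the new volume plus the outflow over the time step equals the previous volume plus the inflow, as required by Eq. 25.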

Fig. 6

Shape of the rating curves (Eq. 24) with different C k1 coefficients

Nutrient River Routing Sub-Model

During the VEMALA model development, the nutrient routing equations were built into the existing WSFS river routing sub-model. Therefore, the nutrient routing sub-model is based on the water continuity equation and river discretization described in Section 2.1.3.1. The nutrient transport in rivers is simulated as an advection process, omitting diffusion. The concentration of a substance in a river changes according to physical and biogeochemical processes such as sedimentation, erosion, and denitrification. Inputs from point and diffuse sources also affect the concentrations in the river model. A diffuse source contributes both to flow and mass according to the following equation derived from Eq. 25:

$$ {A}_{c,i}l{c}_i={V}_{i-1}{c}_{i-1}+{Q}_{\mathrm{in},i}{c}_i\times dt+{Q}_{r,i}{c}_{\mathrm{diff}}\times dt-{Q}_{\mathrm{out},i}\times dt\pm \mathrm{reaction} $$
(28)

where A c is the cross section area of the stream (m2), l is the length of the river reach (m), Q in and Q out are the inflow and outflow discharge of the river reach (m3 s−1), Q r is the inflow to the river reach from the local catchment (m3 s−1), c is the concentration of a substance in the river (mg L−1), c diff is the diffuse source concentration (mg L−1), t is the time (s), and reaction is the reaction processes (denitrification, sedimentation, and erosion). Since the outflow discharge from the river reach is already estimated, the unknown concentration of a substance in each river reach can be obtained from Eq. 29 for total nitrogen and from Eq. 30 for total phosphorus:

$$ {c}_{i,\mathrm{TN}}=\frac{V_i{c}_{i,\mathrm{TN}}+{Q}_{\mathrm{in}}{c}_{\mathrm{in},\mathrm{TN}}\varDelta t+{Q}_r{c}_{\mathrm{diff},\mathrm{TN}}\varDelta t-{M}_{\mathrm{denitrif}}-{M}_{\mathrm{sedim}}+{M}_{\mathrm{eros}}}{V_i+{Q}_{\mathrm{in}}\varDelta t+{Q}_r\varDelta t-{Q}_{\mathrm{out}}\varDelta t} $$
(29)
$$ {c}_{i,\mathrm{TP}}=\frac{V_i{c}_{i,\mathrm{TP}}+{Q}_{\mathrm{in}}{c}_{\mathrm{in},\mathrm{TP}}\varDelta t+{Q}_r{c}_{\mathrm{diff},\mathrm{TP}}\varDelta t-{M}_{\mathrm{sedim}}+{M}_{\mathrm{eros}}}{V_i+{Q}_{\mathrm{in}}\varDelta t+{Q}_r\varDelta t-{Q}_{\mathrm{out}}\varDelta t} $$
(30)

where V is the volume of the river reach (m3), i is the index of the river reach, Q r is the inflow to each river reach from the local catchment (m3 s−1), M denitrif is the mass of nitrogen removed from the system by denitrification (kg day−1), M sedim is the mass of nutrient sedimenting (kg day−1), and M eros is the mass of nutrient eroded from the bottom (kg day−1).

Reaction processes are different for each substance. Denitrification, sedimentation, and erosion are included in total nitrogen and nitrate simulations and sedimentation and erosion are incorporated in total phosphorus simulations. Sedimentation and erosion rates are calibrated parameters (Section 2.3). Denitrification from river sediments is an important component of the nitrogen cycling in freshwater ecosystems [51–53]. In the model, denitrification depends on the temperature and nitrogen concentration in the stream:

$$ {M}_{\mathrm{denitrif}}={r}_{\mathrm{denitrif}, \max }f(T){k}_{\mathrm{conc}}{k}_{\mathrm{unit}}{A}_{\mathrm{surf}}\varDelta t $$
(31)

where M denitrif is the mass of nitrogen removed from the system by denitrification (kg day−1), r denitrif,max is the maximum denitrification rate (mg m−2 day−1), f(T) is the temperature effect coefficient, which increases linearly from 0 at −5 °C to 1 at 25 °C, k conc is the concentration effect coefficient (0–1), k unit is a unit conversion coefficient, and A surf is the surface area of the river reach (m2). The maximum denitrification rate and k conc are calibrated parameters (Section 2.3).
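A minimal sketch of Eq. 31 is given below. Two details are assumptions of this example rather than statements from the model description: the temperature factor is clamped to 0 below −5 °C and to 1 above 25 °C, and k unit is taken as 1e-6 to convert milligrams to kilograms. Function names are hypothetical.

```python
def temperature_factor(temp_c, t_min=-5.0, t_max=25.0):
    """f(T): 0 at t_min, 1 at t_max, linear in between.

    Clamping outside the interval is an assumption of this sketch.
    """
    f = (temp_c - t_min) / (t_max - t_min)
    return min(1.0, max(0.0, f))


def denitrified_mass_kg(r_max_mg_m2_day, temp_c, k_conc,
                        surface_area_m2, dt_days=1.0):
    """Eq. 31 with k_unit = 1e-6 (mg to kg, assumed)."""
    k_unit = 1.0e-6
    return (r_max_mg_m2_day * temperature_factor(temp_c) * k_conc
            * k_unit * surface_area_m2 * dt_days)


# Hypothetical reach with 1 km2 of water surface at 10 degrees C.
mass = denitrified_mass_kg(20.0, 10.0, 1.0, 1.0e6)
```

With these illustrative inputs the temperature factor is 0.5, so half of the maximum areal rate is applied over the reach surface.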

The mean annual denitrification values are checked against estimates of mean annual denitrification values given in the literature and by using expert judgment. Sedimentation and erosion in each river reach are simulated in a similar way for each substance. In the long term, nutrient erosion may not exceed sedimentation. Erosion occurs when discharge exceeds the calibrated value for erosive discharge. When discharge is below this value, sedimentation occurs.

$$ {M}_{\mathrm{eros}}={\sigma}_{\mathrm{eros}}{c}_i{V}_i,\;\mathrm{if}\;Q\ge {Q}_{\mathrm{eros}} $$
(32)
$$ {M}_{\mathrm{sedim}}={\sigma}_{\mathrm{sedim},r}{c}_i{V}_i,\;\mathrm{if}\;Q<{Q}_{\mathrm{eros}} $$
(33)
$$ {Q}_{\mathrm{eros}}={Q}_{\mathrm{mean}}{C}_{\mathrm{eros}} $$
(34)

where σ eros (day−1), σ sedim,r (day−1), and C eros are calibrated parameters, Q mean is the mean simulated discharge (m3 day−1), and Q eros is the threshold discharge above which erosion occurs (m3 day−1).
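The threshold logic of Eqs. 32 to 34 can be sketched as a single function that returns either an erosion mass or a sedimentation mass for a reach. Names and example values are hypothetical, and the product of concentration and volume is left in whatever mass unit the caller's inputs imply.

```python
def erosion_sedimentation(q, q_mean, c_eros, sigma_eros, sigma_sedim,
                          conc, volume):
    """Return (M_eros, M_sedim) for one reach following Eqs. 32-34."""
    q_eros = q_mean * c_eros      # Eq. 34: threshold discharge
    stock = conc * volume         # c_i * V_i
    if q >= q_eros:
        return sigma_eros * stock, 0.0    # Eq. 32: erosion only
    return 0.0, sigma_sedim * stock       # Eq. 33: sedimentation only


# Hypothetical values: discharge above the threshold of 2 x mean flow.
m_eros, m_sedim = erosion_sedimentation(q=30.0, q_mean=10.0, c_eros=2.0,
                                        sigma_eros=0.01, sigma_sedim=0.02,
                                        conc=2.0, volume=100.0)
```

Only one of the two masses is non-zero in any time step, which mirrors the either-or formulation in the text.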

2.1.4 Lake Sub-Models

Lake Water Balance Sub-Model

Lake water balance components, e.g., daily inflow, lake evaporation, lake precipitation, and the daily volume of the lake are simulated. The current water level is estimated depending on the lake’s volume and then the outflow from the lake is simulated. The lake water balance is simulated for each lake every day:

$$ \frac{dV}{dt}={Q}_{\mathrm{in}}\times 86400-{Q}_{\mathrm{out}}\times 86400+P{A}_s-E{A}_s $$
(35)

where V is the lake’s volume (m3), Q in is the inflow to the lake (m3 s−1), Q out is the outflow from the lake (m3 s−1), P is the daily precipitation (mm day−1), E is the evaporation from the lake surface (mm day−1), and A s is the lake surface area (m2).
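As a worked illustration of Eq. 35, the sketch below evaluates the daily change in lake volume. The factor 1e-3, which converts millimeters over the lake surface to meters, is written out explicitly here as an assumption about units; it does not appear in Eq. 35 itself. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def lake_volume_change_m3(q_in, q_out, precip_mm, evap_mm, area_m2):
    """Daily change in lake volume (m3) following Eq. 35."""
    # Inflow and outflow in m3 s-1, integrated over one day.
    flow_term = (q_in - q_out) * SECONDS_PER_DAY
    # Precipitation minus evaporation over the lake surface;
    # 1e-3 converts mm to m (assumed unit handling).
    surface_term = (precip_mm - evap_mm) * 1.0e-3 * area_m2
    return flow_term + surface_term


# Hypothetical lake of 1 km2 with balanced flows and 10 mm of rain.
dv = lake_volume_change_m3(2.0, 2.0, 10.0, 0.0, 1.0e6)
```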

The outflow from the lake can be simulated in three ways: (1) if the outflow measurements are available, then the simulated daily outflow from the lake can be set to equal the observed outflow; (2) if there is an estimated rating curve between measured outflow and measured water level in the lake, then that is used in the model; (3) in other cases, the outflow is estimated by solving the mass balance equation in a similar way as for the river reaches by the following equations:

$$ \frac{A_s{w}_i}{dt}=\frac{V_{i-1}}{dt}+{Q}_{\mathrm{in},i}-{Q}_{{\mathrm{out}}_l,i} $$
(36)
$$ \mathrm{Volume}=\frac{1}{2}h\times {w}_i\times L=\frac{1}{2}h\times {w}_i\times \frac{2{w}_i}{\varTheta }=\frac{h}{\varTheta}\times {w_i}^2 $$
(37)
$$ \frac{h\times {w}_i^2}{\varTheta \times dt}=\frac{V_{i-1}}{dt}+{Q}_{\mathrm{in},i}-{Q}_{{\mathrm{out}}_l,i} $$
(38)
$$ {C}_{\mathrm{L}1}={C}_{\mathrm{L}3}{10}^{C_Q} $$
(39)
$$ {C}_{\mathrm{L}2}=\left(1-{C}_{\mathrm{L}3}\right){10}^{C_Q} $$
(40)
$$ {Q}_{{\mathrm{out}}_l,i}={C}_{\mathrm{L}1}\left({w}_i-{w}_0\right)+{C}_{\mathrm{L}2}{\left({w}_i-{w}_0\right)}^2 $$
(41)

where A s,input is the lake surface area used as input data (m2), h is the length of the lake (calibrated parameter) (m), \( \boldsymbol{\varTheta} \) is the slope of the lake shores (calibrated parameter), L is the width of the lake (L = A s,input/h) (m), w i is the water level (m), and w 0 is the water level at which outflow is equal to 0 (m). For the lakes simulated according to case (3), the lake geometry is assumed to be a triangular prism, being a rectangle from above (width L × length h) and a triangle from the side (slope of sides \( \boldsymbol{\varTheta} \) and known side L). Parameters C L3 and C Q are calibrated for each lake. Parameter C L3 characterizes the shape of the rating curve. If C L3 is 1.0, the rating curve is linear; if C L3 is 0.0, the rating curve is a parabola. The exponential function y = 10^x is used to assess the rating curve parameters (Eqs. 39 and 40) relating water level difference to outflow, in order to be able to simulate a wide range of lakes in terms of size and outflow.

When Eq. 41 is substituted into Eq. 38, we get Eq. 42. The water level w i of the lake can be obtained by solving the quadratic function:

$$ \frac{h}{\varTheta dt}{w}_i^2+{C}_{\mathrm{L}1}{w}_i-{C}_{\mathrm{L}1}{w}_0+{C}_{\mathrm{L}2}{w}_i^2-2{C}_{\mathrm{L}2}{w}_i{w}_0+{C}_{\mathrm{L}2}{w}_0^2-\frac{V_{i-1}}{dt}-{Q}_{\mathrm{in},i}=0 $$
(42)

By substituting w i into Eq. 41, the outflow from the lake can be estimated.
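The procedure for case (3) can be sketched as follows: compute C L1 and C L2 from Eqs. 39 and 40, collect the coefficients of Eq. 42, take the larger root as the water level, and evaluate Eq. 41. The function name and the example values are hypothetical. This sketch does not treat water levels below w 0, for which the model presumably applies its own rule.

```python
import math


def lake_step(v_prev, q_in, dt, h, theta, w0, c_l3, c_q):
    """Solve Eq. 42 for the water level and return (w, q_out)."""
    c_l1 = c_l3 * 10.0**c_q             # Eq. 39
    c_l2 = (1.0 - c_l3) * 10.0**c_q     # Eq. 40
    # Eq. 42 written as A*w^2 + B*w + C = 0.
    A = h / (theta * dt) + c_l2
    B = c_l1 - 2.0 * c_l2 * w0
    C = -c_l1 * w0 + c_l2 * w0**2 - v_prev / dt - q_in
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        raise ValueError("no real water level for these inputs")
    w = (-B + math.sqrt(disc)) / (2.0 * A)          # larger root
    q_out = c_l1 * (w - w0) + c_l2 * (w - w0)**2    # Eq. 41
    return w, q_out


# Hypothetical lake in steady state: previous level 2 m, inflow 10 m3 s-1.
h, theta = 5000.0, 0.01
v_prev = h / theta * 2.0**2             # Eq. 37 with w = 2 m
w_i, q_out = lake_step(v_prev, q_in=10.0, dt=86400.0, h=h, theta=theta,
                       w0=1.0, c_l3=0.5, c_q=1.0)
```

The example inputs are chosen so that inflow equals outflow at a level of 2 m; the solver should therefore return the same level and an outflow equal to the inflow.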

Lake Nutrient Balance Sub-Model

A lake is simulated as a continuously stirred tank reactor system assuming that the concentration c of a given substance in the volume V of the system is always uniformly distributed in space [54]. By using the lake water mass balance Eq. 35, we can write an equation for nutrient transport and fate in lakes. The nutrient mass balance includes external loading or inflow loading, outflow loading, sedimentation, denitrification, and internal loading [55]:

$$ \frac{dM}{dt}={Q}_{\mathrm{in}}(t){c}_{\mathrm{in}}(t)\cdot 86400\cdot {10}^{-3}-{Q}_{\mathrm{out}}\cdot 86400\frac{M}{V}-{\sigma}_{\mathrm{sedim}}M\pm {r}_{\mathrm{reaction}}\cdot {10}^{-6}{A}_{\mathrm{surf}} $$
(43)

where M is the mass of a nutrient in the lake (kg), Q in(t) is the inflow discharge changing in time (m3 s−1), c in(t) is the inflow concentration changing in time (mg L−1), Q out is the outflow discharge (m3 s−1), V is the volume of the lake (m3), σ sedim is the sedimentation rate (day−1), r reaction is the internal loading for the TP equation (with a “+” sign) or the denitrification rate for TN (with a “−” sign) (mg m−2 day−1), and A surf is the lake surface area (m2). Since the outflow discharge from the lake is already estimated, the unknown concentration of a substance at time-step i in the lake can be obtained from Eq. 44 for total phosphorus and from Eq. 45 for total nitrogen:

$$ {c}_i=\frac{c_{i-1}{V}_{i-1}+{Q}_{\mathrm{in}}(t){c}_{\mathrm{in}}(t)\varDelta t-{Q}_{\mathrm{out}}{c}_{i-1}\varDelta t-{\sigma}_{\mathrm{sedim}}{V}_i{c}_{i-1}\varDelta t+{r}_{\mathrm{reaction}}{A}_{\mathrm{surf}}\varDelta t}{V_{i-1}+{Q}_{\mathrm{in}}\varDelta t-{Q}_{\mathrm{out}}\varDelta t} $$
(44)
$$ {c}_i=\frac{c_{i-1}{V}_{i-1}+{Q}_{\mathrm{in}}(t){c}_{\mathrm{in}}(t)\varDelta t-{Q}_{\mathrm{out}}{c}_{i-1}\varDelta t-{\sigma}_{\mathrm{sedim}}{V}_i{c}_{i-1}\varDelta t-{M}_{\mathrm{denitrif}}}{V_{i-1}+{Q}_{\mathrm{in}}\varDelta t-{Q}_{\mathrm{out}}\varDelta t} $$
(45)

where c i and c i-1 are the concentration in the lake at the present and previous time steps (mg L−1), respectively; V i and V i-1 are the lake volume at the present and previous time steps (m3), respectively; M denitrif is the mass of denitrified nitrogen (kg) estimated in the same way as for rivers by Eq. 31. Sedimentation is simulated as a first-order rate process related to the mass of the nutrient in the lake. Denitrification and sediment release processes are simulated as a release rate depending on lake surface area.
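The explicit update in Eqs. 44 and 45 can be sketched with one function. To keep the arithmetic transparent, this example assumes a consistent unit set (concentrations in g m−3, which equals mg L−1, volumes in m3, flows in m3 day−1) and lumps the area-based term into a single mass rate, positive for sediment release (TP) and negative for denitrification (TN). These unit choices and all names are assumptions of the sketch.

```python
def lake_concentration(c_prev, v_prev, v_now, q_in, c_in, q_out,
                       sigma_sedim, areal_source=0.0, dt=1.0):
    """Explicit update of lake concentration following Eqs. 44-45.

    areal_source is a mass rate for the whole lake (g day-1, assumed):
    r_reaction * A_surf for TP, or -M_denitrif / dt for TN.
    """
    numerator = (c_prev * v_prev
                 + q_in * c_in * dt                     # external loading
                 - q_out * c_prev * dt                  # outflow loading
                 - sigma_sedim * v_now * c_prev * dt    # sedimentation
                 + areal_source * dt)                   # release or loss
    denominator = v_prev + q_in * dt - q_out * dt
    return numerator / denominator


# Hypothetical closed lake with a sedimentation rate of 0.005 day-1.
c_new = lake_concentration(c_prev=0.05, v_prev=1.0e6, v_now=1.0e6,
                           q_in=0.0, c_in=0.0, q_out=0.0,
                           sigma_sedim=0.005)
```

With no flows and no areal source, the concentration falls by the sedimentation fraction in one day, which is the first-order behavior described in the text.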

For total phosphorus, the processes taken into account in the mass balance calculation are sedimentation and sediment release (r int, mg m−2 day−1). Sedimentation rate characterizes the fraction of the total mass of a nutrient in the lake which sediments during the time step (σ sedim, day−1). Sedimentation rate is constant with time, but sedimentation mass changes because of the changing mass of nutrients in the lake. Sedimentation and sediment release rates are lake specific and they are calibrated on the basis of lake concentration observations. Calibrated sedimentation rates for lakes in Finland vary in the range of 0.002–0.005 day−1, meaning that 0.20–0.50 % of the TP mass in the lake is sedimented in 1 day. Phosphorus sediment release can be caused by different processes such as biogeochemical release of phosphate from the sediments or resuspension of particulate phosphorus caused by wind or fish. The model simulates the resulting sediment release and assumes that it occurs only during the summer months from June to August. Sediment release is an important phenomenon in total phosphorus simulations, especially in shallow lakes in which the concentration increases during the summer months without external loading. Sediment release is calibrated using phosphorus concentrations during the summer months.

For total nitrogen mass balance simulations in lakes, the processes taken into account are sedimentation of nitrogen and denitrification of nitrate. The main process responsible for the nitrogen loss from the system is denitrification in lakes (Eq. 31). Sedimentation has a smaller impact on the nitrogen concentration in the water column. The maximum denitrification rate r denitrif,max is a calibrated parameter which has the same value for rivers and lakes and is calibrated for each third level sub-catchment. The calibrated maximum denitrification rate for rivers and lakes in Finland varies in the range of 15–25 mg m−2 day−1. Calibrated sedimentation rates for nitrogen in Finnish lakes vary from 0.0001 to 0.00015 day−1, which means that 0.01–0.015 % of the total nitrogen mass in the lake is sedimented in 1 day.

2.2 Data Sources

2.2.1 Meteorology, Hydrology, and Water Quality Monitoring Data

Air temperature and precipitation observations from the Finnish Meteorological Institute are used as input to the hydrological model of VEMALA. There are approximately 190 stations with daily temperature observations and 250 stations with precipitation observations in Finland. The data from daily discharge and water level observations carried out by SYKE are used in calibration of the hydrological model of VEMALA. In Finland, there are around 300 discharge stations and 400 water level observation stations with daily measurements [56]. Water quality data are collected by SYKE from the environmental administration’s monitoring and compulsory inspections and are available from the national HERTTA database. The frequency of water quality observations and the analyzed factors vary according to local needs from one sample per 12 years up to more than 13 samples per year. However, all of the ca. 16,000 river and 15,000 lake water quality observation points are used in the VEMALA model calibration and testing.

Only a small fraction of the observation points have more than 13 observations per year (around 35 river water quality observation points). Observations at these sites are used for estimating the annual loads of TN and TP from Finnish rivers and are reported to the Baltic Marine Environment Protection Commission (HELCOM). We used these annual estimations for testing the VEMALA model performance. The annual river loads (L a ) were calculated by the averaging method by multiplying the mean monthly concentration by the mean monthly flow and summing up the monthly loads:

$$ {L}_a=\sum_{m=1}^{12}{c}_m{q}_m $$
(46)

where c m is the observed instantaneous concentration in month m (mg m−3) (mean in the case of several samples per month or seasonal average in the case of missing monthly observations) and q m is the mean monthly flow calculated from the daily flow values (m3 s−1) [57].
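A small sketch of the averaging method in Eq. 46 follows. Equation 46 does not spell out the unit conversion, so this example assumes that each monthly product is multiplied by the number of seconds in the month (non-leap year) and that the result is converted from milligrams to tonnes. The function name and inputs are hypothetical.

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year


def annual_load_tonnes(conc_mg_m3, flow_m3_s):
    """Averaging method (Eq. 46) with explicit, assumed unit conversion."""
    if len(conc_mg_m3) != 12 or len(flow_m3_s) != 12:
        raise ValueError("expected 12 monthly values")
    total_mg = 0.0
    for c_m, q_m, days in zip(conc_mg_m3, flow_m3_s, DAYS_IN_MONTH):
        seconds = days * 86400
        total_mg += c_m * q_m * seconds   # mg m-3 * m3 s-1 * s = mg
    return total_mg * 1.0e-9              # mg to tonnes


# Hypothetical river: constant 1000 mg m-3 and 1 m3 s-1 all year.
load = annual_load_tonnes([1000.0] * 12, [1.0] * 12)
```

For the constant inputs used here the load is 1 g s−1 over 365 days, or about 31.5 t year−1.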

2.2.2 Watershed Characteristics

The watersheds analyzed in this study were selected on the basis of the availability of the water quality monitoring data to estimate the nutrient loading using the averaging method (Fig. 7). The characteristics of watersheds influencing nutrient loading (Table 2) vary to a great extent over Finnish catchments: (1) agricultural area (0–40 %), (2) peat soil area (0–60 %), (3) share of clay soil from cultivated land (0–65 %), (4) slopes (mean slope for fields on river catchments 0.7–4.8 %), (5) agricultural crop distribution (spring cereal-dominated or grassland-dominated catchments), and (6) lake area (0–20 %). Watersheds on the south and south-west coasts (from Taasianjoki to Aurajoki) are agriculture-dominated catchments on clay soils. Watersheds along the west coast and in northern Finland are peat soil-dominated catchments with a gradually decreasing area under cultivation towards the northern catchments.

Fig. 7

Major watersheds in Finland. The watersheds analyzed in this study are in darker gray color, and they are described in Table 2

Table 2 General characteristics influencing nutrient loading for the main watersheds in Finland

In Finland, a major share of nutrient loading originates from agriculture. Although the area under cultivation is small (7.4 %), the contribution of agriculture to nutrient loading and water quality is significant [58]. Fields are concentrated in southern, south-western, and western Finland. Forests, which cover 78 % of the total land area in Finland and of which one third is forested peatland, also contribute to diffuse pollution. The contribution of forestry as a nutrient source to water ecosystems increases towards eastern and northern Finland [7].

2.2.3 GIS Data of Agricultural Fields

Detailed soil and catchment characteristics and GIS data are used to estimate diffuse loading. The agricultural fields are characterized by the data received from the field plot register (i.e., “identification system of the fields”). This is a nationwide register, in which all the field plots receiving area-based subsidies are digitized. The field plot register is owned by the Agency for Rural Affairs (Mavi), which operates under the Ministry of Agriculture and Forestry. The data characterizing the fields are exact location, size, boundaries, and crop grown during the year 2007. In addition, for part of the fields, soil-test P value and soil texture are received from companies providing a private soil analysis service. The soil texture is classified into 22 main soil texture classes, which are sorted into four broader soil texture groups (clay, silt, coarse, and organic) for VEMALA 1.1 and VEMALA-ICECREAM. For VEMALA 1.1 and VEMALA-N, crop data are classified into five crop classes: spring cereals, winter cereals, grasslands, green fallow, and root crops. For VEMALA-ICECREAM, the crop classes are oats, barley, spring wheat, winter wheat, grass, sugar beet, potato, oilseed, green fallow, bare fallow, rye, onion, and clover. Field slopes are estimated for each field, using a digital elevation model (DEM) with a grid resolution of 25 m × 25 m, combined with field border data. All the comprehensive field characteristics are used in the nutrient loading simulation, depending on the model version, either directly in field-scale VEMALA-ICECREAM or summarized for each third level sub-catchment in VEMALA-N.

2.2.4 Sources of Nutrient Loading Data

The VEMALA system includes all the relevant nitrogen and phosphorus sources, covering both diffuse and point source pollution. Diffuse loading is simulated by VEMALA, whereas annual point loads and peat production loads are gathered from the Compliance Monitoring Data System (VAHTI), which contains data on pollution loading, water and air pollution control, waste management and noise abatement, and annual atmospheric depositions. The annual loading estimates from scattered settlements are gathered from the VEPS loading assessment tool [59].

2.3 Calibration and Evaluation of the Model Performance

The VEMALA model parameters are estimated by an automatic calibration procedure, which is carried out in two steps: first, the hydrological model parameters are calibrated by minimizing the sum of squared differences between the observed and simulated discharge, water level, and water equivalent of snow, together with the difference between the simulated extent of snow-covered area and the snow-covered area observed from satellite pictures; second, nutrient loading model parameters are calibrated by minimizing the squared difference between observed and simulated concentrations and loads in rivers and lakes (Eq. 47). In both calibrations, a modification of the direct search Hooke–Jeeves optimization algorithm [60] is used. A river catchment is divided into calibration areas consisting of several third level sub-catchments depending on the availability of the observation data. In the hydrological model calibration, the calibration areas are divided by hydrological observation points. In the calibration of nutrient loading parameters, an observation point with more than eight observations per year is necessary in order to define a separate calibration area.

$$ {\mathrm{OC}}_{\mathrm{ca}}={w}_1\sum_{i=1}^{n_{wq,1}}{\left({\mathrm{conc}}_{\mathrm{obs},i}-{\mathrm{conc}}_{\mathrm{sim},i}\right)}^2+{w}_2\sum_{i=1}^{n_{wq,2}}{\left({\mathrm{load}}_{\mathrm{obs},i}-{\mathrm{load}}_{\mathrm{sim},i}\right)}^2 $$
(47)

In Eq. 47, OCca is the optimization criterion for each calibration area, concobs,i and loadobs,i are the observed nutrient concentration and load, respectively; concsim,i and loadsim,i are simulated nutrient concentration and load, respectively; n wq,1 is a river or lake water quality observation point and n wq,2 is a river water quality observation point, w1 is a weighting factor for concentration bias and w2 is a weighting factor for load bias. The weighting factors w1 and w2 are set based on an expert judgment of the model performance with different weights. Both concentration and load are used in the optimization criterion because the model is a simplification of reality and cannot describe all phenomena seen in the observations. The calibration can be guided by weighting the observations so that the resulting model and parameterization describe more accurately those characteristics of the observations which are most relevant for the use of model results. If calibration is made only by load observations, then observations during discharge peaks have the greatest effect and observations during low flow have a minor effect. If only concentration observations are used, then low flow period concentration is simulated more accurately, at the cost of a less accurate load simulation. The aim of the model is to give both estimates of loading to lakes and sea and estimates of water quality, i.e., concentration in lakes and rivers, including during low flow. Therefore, both concentration and load observations are necessary in the calibration.
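The optimization criterion of Eq. 47 is a weighted sum of two sums of squared errors, which can be sketched as below. The function name, the list-based inputs, and the example weights are hypothetical; in the model the weights are set by expert judgment.

```python
def optimization_criterion(conc_obs, conc_sim, load_obs, load_sim,
                           w1=1.0, w2=1.0):
    """Weighted sum of squared errors following Eq. 47."""
    if len(conc_obs) != len(conc_sim) or len(load_obs) != len(load_sim):
        raise ValueError("observed and simulated series must have equal length")
    conc_term = sum((o - s) ** 2 for o, s in zip(conc_obs, conc_sim))
    load_term = sum((o - s) ** 2 for o, s in zip(load_obs, load_sim))
    return w1 * conc_term + w2 * load_term


# Hypothetical observations and simulations.
oc = optimization_criterion([1.0, 2.0], [1.0, 4.0], [10.0], [7.0],
                            w1=2.0, w2=0.5)
```

Raising w2 relative to w1 makes errors during discharge peaks dominate the criterion, while raising w1 emphasizes low-flow concentrations, as discussed above.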

The main difference between the calibration of the hydrological part of WSFS and the nutrient load model is the frequency of the water quality observations. Indeed, observations are not available daily and the highest peaks in concentrations/loads are most probably missed due to the infrequent measurements, making the calibration less efficient. The number of observation points with frequent observations determines the number of calibration areas. All observation points located in the same calibration area are taken into account for the optimization criteria estimation. Observation points with more frequent measurements will have a higher weight in the parameter values. Therefore, very often, we choose the option to have the same parameter values for all third level sub-catchments within a calibration area. The smaller the calibration area, the better is the spatial variation of the parameter values.
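For readers unfamiliar with direct search, the following is a minimal sketch of the basic textbook Hooke–Jeeves pattern search. It is not the modified version used in VEMALA, whose details are not given here; the function name, default step sizes, and the toy objective are all assumptions for illustration.

```python
def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_iter=10000):
    """Minimize f with the basic Hooke-Jeeves pattern search."""

    def explore(base, base_val, h):
        # Try +h and -h along each coordinate, keeping any improvement.
        x, fx = list(base), base_val
        for i in range(len(x)):
            for delta in (h, -h):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx = trial, ft
                    break
        return x, fx

    base = list(x0)
    f_base = f(base)
    for _ in range(max_iter):
        if step < tol:
            break
        x_new, f_new = explore(base, f_base, step)
        if f_new < f_base:
            # Pattern moves: keep stepping in the direction that worked.
            while True:
                pattern = [2.0 * n - b for n, b in zip(x_new, base)]
                base, f_base = x_new, f_new
                x_try, f_try = explore(pattern, f(pattern), step)
                if f_try < f_base:
                    x_new, f_new = x_try, f_try
                else:
                    break
        else:
            step *= shrink   # no improvement: refine the step size
    return base, f_base


# Toy objective standing in for a calibration criterion.
best_x, best_f = hooke_jeeves(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                              [0.0, 0.0])
```

The method needs only function values, not derivatives, which suits a criterion that is evaluated by running a simulation model.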

The observed stream concentrations and loads and the concentrations in lakes for the period from 1991 to 2013 were used to calibrate the model versions VEMALA 1.1 and VEMALA-N, whereas VEMALA-ICECREAM was not calibrated against observed concentrations. The total number of observation points used in the calibration for the whole of Finland is about 31,000 in rivers and 35,000 in lakes. The total number of observations was about 2,000,000 for phosphorus and about 1,800,000 for nitrogen. We used the Nash–Sutcliffe efficiency (NSE) to evaluate the model performance; the NSE values were calculated for the period from 1991 to 2013 for both nutrient loads and concentrations. NSE values were also calculated separately for VEMALA 1.1 (the concentration–runoff relationship model) and for the new model versions (VEMALA-N for TN simulation and VEMALA-ICECREAM for TP simulation).
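The Nash–Sutcliffe efficiency compares the squared simulation errors with the variance of the observations around their mean. A minimal sketch using the standard definition is shown below; the function name and the example series are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Standard Nash-Sutcliffe efficiency.

    NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0.0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - sse / variance_sum


# A perfect simulation scores 1; simulating the observed mean scores 0.
nse_perfect = nash_sutcliffe([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
nse_mean = nash_sutcliffe([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Values below zero indicate that the simulation performs worse than simply using the mean of the observations.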

3 Results

3.1 TP and TN Loadings to the Baltic Sea

3.1.1 Mean Annual Loads

Nutrient gross loading, retention, and net loading for all 74 river catchments in Finland and for the coastal areas of the Baltic Sea were calculated using VEMALA 1.1. To test the model performance, the simulated nutrient loadings were compared with loadings calculated from monitoring data from the river mouths, over the period 1991–2011 (Table 3).

Table 3 Comparison between TP and TN loadings simulated by the VEMALA 1.1 model and estimated by the averaging method based on the monitoring data of 27 river catchments over the period 1991–2011

Simulated TP and TN gross specific loadings for each third level sub-catchment (average size 60 km2) are shown in Fig. 8a, b, respectively. The highest TP gross-specific loading is found in watersheds with the highest agricultural percentage, which are located on the south-western coast of Finland in the river Uskelanjoki, Aurajoki, and Paimionjoki watersheds and in the lowest part of the Kymijoki watershed on the south coast (above 70 kg km−2 year−1). The highest TN gross specific loadings are found in the rivers Porvoonjoki, Uskelanjoki, and Paimionjoki and the lowest parts of the Kymijoki and Kyrönjoki watersheds (above 900 kg km−2 year−1), in keeping with the high percentage of agricultural fields.

Fig. 8

Simulated TP and TN gross specific loading maps for Finnish watersheds

The highest nutrient retention is detected in watersheds with high lake percentages: in the Vuoksi, Kymijoki, and Karjaanjoki watersheds, 83, 74, and 66 % of TP gross loading, respectively, is retained in the watershed (Fig. 8a), as well as 68, 54, and 52 % of TN loading (Fig. 8b). TP retention varies between 0 and 83 % of the TP loading in the 27 watersheds studied here, with an average retention of 42 %. For TN, the retention varies between 4 and 68 % of the TN loading, with an average retention of 37 %. For both TP and TN, the retention is linearly related to the lake percentage in the watershed (Fig. 9).

Fig. 9

Relationship between lake percentage and TP retention (a) and TN retention (b)

The absolute difference between simulated and monitored net TP loads (Table 3) varies between 0 and 44 % at the river basin scale, with a mean difference of 12 %. The greatest differences between simulated and estimated net TP loads were detected in the Kiskonjoki (44 %), Tornionjoki (30 %), and Paimionjoki (−22 %) river basins. In the river Tornionjoki, VEMALA 1.1 clearly overestimates TP loading (404 t year−1), whereas the TP loading simulated by VEMALA-ICECREAM (324 t year−1) is closer to the estimate based on monitoring data (282 t year−1).

The absolute difference between simulated and estimated net TN loads varies from 1 to 26 % for different river basins, with a mean difference of 9 %. The greatest differences between simulated and estimated TN loads are in the Kiskonjoki (26 %), Kyrönjoki (20 %), and Tornionjoki (19 %) river basins.

The highest nutrient net specific loading values (Table 3, Fig. 10a, b) for TP and TN are for the river catchments Uskelanjoki, Aurajoki, and Paimionjoki located in south-western Finland, with a combination of high agricultural field and low lake percentage in the catchment (TP loading above 60 kg km−2 year−1 and TN loading above 800 kg km−2 year−1). By contrast, the lowest nutrient specific loading values are for the river catchments Vuoksi, Kymijoki, Oulujoki, and Tornionjoki with high lake and low agricultural field percentages in the catchment (3–10 kg km−2 year−1 of TP loading and 130–230 kg km−2 year−1 of TN loading).

Fig. 10

Relationship between the net specific loading (kg km−2 year−1) simulated by VEMALA and calculated from monitoring data for total phosphorus (a) and total nitrogen (b)

3.1.2 Variability of Annual Loads and Retention

Variation of annual simulated TP loading, retention, and discharge was analyzed for the river Kokemäenjoki catchment, and compared to the estimated loading from monitoring data. The river Kokemäenjoki is a large watershed (27,046 km2) with a considerable share of lakes (11 %) and long waterways; agriculture accounts for only 18 % of the area and peat soils cover 16 %. Annual TP loading varied depending on the variation of annual discharge (Fig. 11). During the period 1991–2011, the lowest simulated TP loading occurred during the dry years 2003 and 2009 with TP loads of 200 and 235 t year−1, respectively, and the highest simulated loading took place during the wet years 2000 and 2008 with TP loads of 470 and 560 t year−1, respectively. The highest absolute difference between simulated load and estimated load from monitoring data was during the wet years 2000 and 2008, with a difference up to 140 t year−1 (up to 30 %). During the average precipitation years and dry years, the absolute difference was lower. In wet years, TP loading estimated from monitoring data was usually higher than simulated TP loading. By contrast, in dry years, TP loading estimated from monitoring data was lower than simulated TP loading. It appears that the averaging method used for estimation of loads overestimates TP load during the wet years and underestimates it during the dry years. The disadvantage of the averaging method is that for months with many high runoff and high concentration events, mean monthly concentration is overestimated. In reality, concentration fluctuates in a similar way to runoff. Nutrient loading models such as VEMALA take into account the fluctuation of the concentration depending on the runoff, and therefore they can be used for more reliable nutrient loading estimates for individual years.

Fig. 11

Annual variation of simulated TP net loading and estimated TP loading based on monitoring data and discharge for the river Kokemäenjoki

Net retention is negatively correlated with annual discharge in the river Kokemäenjoki catchment, as shown in Fig. 12. Net retention is higher during years with lower annual discharge, because the residence time of water and nutrients in lakes is longer. Simulated mean annual retention in the Kokemäenjoki catchment was 48 %, and it varied from 43 % during the wet year 2008 to 60 % during the dry year 2003. Other factors also influence the variation of the retention, such as the sequence of dry and wet years. For example, if there are two dry years in a row, then retention is higher. The ability of the VEMALA model to simulate the variation of retention depending on the hydrological conditions is an advantage compared to regression models that relate retention to the lake percentage of the catchment area.
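The dependence of retention on discharge can be sketched with a single, fully mixed lake and a first-order sedimentation loss. This is a simplified illustration and not the VEMALA lake model: nutrient release from the sediment is omitted, and the lake volume, sedimentation rate, inflow concentration, and discharges are hypothetical values.

```python
def simulate_lake_retention(q_m3_day, c_in_kg_m3, volume_m3, k_sed_per_day,
                            spinup_days=1825, eval_days=365):
    """Daily explicit mass balance of a fully mixed lake.

    Returns the net retention fraction (1 - outflow load / inflow load)
    over the evaluation period that follows the spin-up.
    """
    mass = c_in_kg_m3 * volume_m3          # start at the inflow concentration
    load_in = q_m3_day * c_in_kg_m3        # constant daily inflow load, kg/day
    total_in = 0.0
    total_out = 0.0
    for day in range(spinup_days + eval_days):
        outflow = q_m3_day * mass / volume_m3   # load leaving with the water
        sedimentation = k_sed_per_day * mass    # assumed first-order loss
        mass += load_in - outflow - sedimentation
        if day >= spinup_days:
            total_in += load_in
            total_out += outflow
    return 1.0 - total_out / total_in


# Hypothetical lake: 1e8 m3 volume, sedimentation rate 0.01 per day,
# inflow concentration 1e-4 kg/m3 (0.1 mg/l).
dry_year = simulate_lake_retention(5e5, 1e-4, 1e8, 0.01)   # low discharge
wet_year = simulate_lake_retention(2e6, 1e-4, 1e8, 0.01)   # high discharge

print(f"retention, dry year: {dry_year:.2f}")
print(f"retention, wet year: {wet_year:.2f}")
```

In this sketch the residence time (volume divided by discharge) is four times longer in the dry case, and the retained fraction is correspondingly higher, in line with the negative relationship between retention and discharge described above.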

Fig. 12

Simulated annual retention and discharge relationship for the Kokemäenjoki watershed

3.2 Nutrient Concentrations and Loads for the Aurajoki river—Comparison Between the VEMALA Model Versions

The VEMALA model simulates daily nutrient concentrations and loads at different spatial scales. VEMALA-ICECREAM simulates phosphorus concentrations and phosphorus loads at the field scale for agricultural fields only. Nutrient concentrations and loads are then simulated at the third level sub-catchment (mean size 60 km2) for all the point and diffuse pollution sources. Finally, nutrient concentrations and loads are modeled at the river mouth. The model results are presented here against observations for the river Aurajoki (874 km2). Aurajoki is a watershed with few lakes (lake percentage 0.3 %), considerable cultivated area (37 %), and minor peat areas (11 %).

The Nash–Sutcliffe efficiency (NSE) for the discharge simulation was rather high (0.77), which means that discharge was well simulated. The main flood peaks in the Aurajoki river were caused by spring snowmelt, during which the discharge varied from 80 to 110 m3 s−1 (Fig. 13). However, during the year 2011, there was also a winter flood peak (70 m3 s−1) caused by elevated precipitation and relatively high air temperatures. Autumn and winter flood peaks are becoming more frequent due to the warming climate and increase in precipitation. During the period 2009–2011, the spring discharge was well simulated, but summer and autumn discharge peaks were underestimated. Although discharge was higher during the spring floods in the period 2009–2011, the observed TP concentrations were higher during the high discharge events in autumn and winter. This can be explained by the higher erodibility of the clay soils during the autumn and winter months, when the soils are wet and lack vegetation cover. By contrast, during the spring flood, the soil is frozen, which reduces soil erodibility. Because of this observed seasonal variability of the runoff-concentration relationship, the VEMALA 1.1 model is parameterized separately for each season.
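For reference, the NSE compares the squared simulation errors with the variance of the observations around their mean. The function below is a minimal sketch of the standard definition; the function name and the example values are illustrative and are not taken from the VEMALA code or from the Aurajoki data.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error_sum = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - error_sum / variance_sum


# Illustrative values only.
obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe(obs, [1.0, 2.0, 3.0]))  # perfect simulation
print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # simulation equal to the observed mean
```

An NSE of 1 indicates a perfect fit, whereas an NSE of 0 means that the simulation performs no better than using the mean of the observations as the prediction.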

Fig. 13

Simulated and observed daily discharge for the river Aurajoki within the calibration period (2009–2011)

The highest TN concentrations occurred on three types of occasion: during snowmelt flooding, during the autumn–winter high discharge peaks, and during the summer after fertilizer application. During the period 2009–2011, the highest observed TN concentration occurred during the snowmelt flood in the year 2011. In all 3 years there were high concentrations during the autumn discharge peaks. This is probably caused by excess fertilizer left in the soil after the growing season and by mineralization of organic fertilizers during the autumn months. During the year 2010, there was also a high concentration peak during the summer after fertilizer application. For TN leaching, the most important hydrological factors determining the amount of NO3 leached during the spring are the amount of snow, soil frost, and soil moisture conditions over the winter. Due to deep snow, shallow soil frost, and dry soils over the winter 2010–2011, the snowmelt infiltrated into the soil and large amounts of NO3 were leached during the spring of 2011. During years with deeper and more solid soil frost, there is more surface runoff and less infiltration, leading to less NO3 leaching during the snowmelt flood. Simulation of soil frost depth, and of surface runoff as a function of soil frost conditions, has been tested in the present VEMALA-N model, but these new features still need to be tested properly and incorporated into the operational model version in future development work.

NSE values for TN and TP load simulations were high during the calibration period (0.77–0.86 for Aurajoki, Table 4), which means that the models are suitable for simulating the variability in daily loading at the river mouth. NSE values for TN concentration were 0.00 for VEMALA-N and 0.34 for VEMALA 1.1. Figure 14 shows that the VEMALA-N model overestimates some concentration peaks, which explains the higher NSE obtained with VEMALA 1.1. The biggest difference between the two model versions can be seen during the summer months, when VEMALA-N simulates a high TN concentration peak even during a small runoff event (Fig. 14), because fertilizer application in the spring increases the soil NO3 storage considerably. VEMALA 1.1, which is based on a concentration-runoff relationship, is not able to simulate this phenomenon. However, neither model version was able to simulate the maximum TN concentrations in Aurajoki (Fig. 14).

Table 4 NSE values for TN and TP concentration (conc.) and loading from calibration (VEMALA 1.1, VEMALA-N) and validation (VEMALA-ICECREAM) for the river Aurajoki, 1991–2013

Fig. 14

Simulated and observed daily TN concentrations for the river Aurajoki within the calibration period (2009–2011)

The TP concentration simulated by VEMALA 1.1 agrees better with the observed TP concentrations (NSE = 0.58) than that simulated by VEMALA-ICECREAM. Figure 15 shows that the VEMALA-ICECREAM model is not able to simulate the maximum autumn TP concentrations in some years, such as 2009 and 2011, although the maximum spring TP concentrations and the autumn concentrations during the year 2010 are simulated rather well. Simulation of TP transport during autumn runoff events on clay soils will be one of the development areas in the VEMALA-ICECREAM model.

Fig. 15

Simulated and observed TP concentrations for the river Aurajoki during 2009–2011, which is part of the calibration period for VEMALA 1.1 and of the validation period for VEMALA-ICECREAM

4 Discussion

4.1 The National-Scale VEMALA Nutrient Loading Model Versus Other Estimation Methods

VEMALA can be used to estimate TP and TN loads for all river catchments in Finland, including those where the observation data are too scarce to apply the averaging method. VEMALA 1.1 simulated a mean annual net nutrient loading from the Finnish territory (including transboundary watersheds) to the Baltic Sea (or to transboundary watercourses) of 3900 t year−1 of TP and 94000 t year−1 of TN for the period 1991–2011. These estimates do not include point loads flowing directly to the Baltic Sea. The total mean annual TP and TN gross loads to watercourses simulated by VEMALA 1.1 for all the river catchments in Finland were 6100 and 135000 t year−1, respectively. The simulated mean annual retention in lakes and rivers in Finland was 2200 t year−1 of TP, or 36 % of the TP gross load to watercourses, and 42000 t year−1 of TN, or 31 % of the TN gross load. Nutrient retention values depend greatly on the total lake area in the river catchments. The average N retention of 31 % of the TN gross load estimated by VEMALA is close to an earlier estimate of 35 % [7].
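The relationship between the reported gross load, retention, and net load can be checked with simple arithmetic. The sketch below uses only the rounded national totals quoted above; the function name is illustrative.

```python
def nutrient_balance(gross_t_year, retained_t_year):
    """Return (net load in t/year, retention as % of gross load)."""
    net = gross_t_year - retained_t_year
    retention_pct = 100.0 * retained_t_year / gross_t_year
    return net, retention_pct


# Rounded national totals reported for VEMALA 1.1 (1991-2011).
tp_net, tp_retention_pct = nutrient_balance(6100, 2200)
tn_net, tn_retention_pct = nutrient_balance(135000, 42000)

print(f"TP: net {tp_net} t/year, retention {tp_retention_pct:.0f} %")
print(f"TN: net {tn_net} t/year, retention {tn_retention_pct:.0f} %")
```

For TP the balance reproduces the reported net load of 3900 t year−1 and the retention of 36 %. For TN the rounded inputs give 93000 t year−1 and 31 %; the small difference from the reported 94000 t year−1 presumably reflects rounding of the published totals.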

The TN balance for the Finnish river catchments (without transboundary catchments) was estimated earlier by the GIS-based assessment model N_EXRET [7]. In that study, the TN gross loading to watercourses was estimated to be 119000 t year−1 and the retention in lakes and peatlands 42000 t year−1, leading to a TN net loading of 77000 t year−1 to the Baltic Sea. The difference between the TN gross loadings to watercourses obtained by these two models is only 13 %. Moreover, the VEMALA results cover an 8 % larger territory, because transboundary catchments are included in the simulation. The transboundary catchments are mainly located in the northern part of Finland and are characterized by low nutrient loading.

The net nutrient loading from Finnish watersheds to the Baltic Sea (excluding the river Vuoksi watershed) has been estimated for HELCOM [61] by using monitoring data and an averaging method for the period from 1991 to 2011. The estimation of unmonitored river export is based on an area-specific extrapolation of export in nearby monitored rivers with similar watershed characteristics. The TP mean annual net load to the Baltic Sea was estimated to be 3480 t year−1 and the TN net load 69200 t year−1. The corresponding values estimated by VEMALA 1.1 were a TP mean annual net load of 3500 t year−1 and a TN mean annual net load of 73000 t year−1. Thus, the TP net loading was similar with both calculation methods and the TN net loading differed by only 5 %.

Comparison with other large-scale nutrient loading estimation methods shows that VEMALA 1.1 is able to reproduce results similar to those from other large-scale methods using modeling or monitoring.

4.2 Use of Different VEMALA Versions

Each version of the model has advantages and disadvantages. VEMALA 1.1, the model based on concentration–runoff relationships, performs well if the number of water quality observations available is sufficient to calibrate the model parameters. Therefore, VEMALA 1.1 is mainly suited to simulating present nutrient loading at the river mouths of large and medium-sized river catchments.

The more process-based modeling approaches are suitable for simulating a wide range of spatial scales, starting from the field scale (VEMALA-ICECREAM), to the small river basin scale, and further to the large river basin or watershed scale. They have more physically based parameters, which vary within a smaller range than the calibrated parameters, and therefore the parameter estimation involves less uncertainty than in VEMALA 1.1. The process-based models simulate the change in nutrient storage and processes in the soil depending on various biogeochemical and climatic factors. Therefore, these models can be used for studies of climate change and changes in agricultural management. However, these more complex models do not always fit the observed nutrient concentrations in the streams better, because of the high spatial variability of the simulation units (fields or crop classes) and the complexity of the biogeochemical processes.

4.3 Retention

The major N removal processes in lakes are regulated by a number of factors, of which hydraulic residence time, availability of other nutrients (primarily phosphorus), light regime, and redox potential are among the most important (e.g., [62]), whereas P removal processes are regulated by sedimentation processes (co-precipitation with calcium carbonate or adsorption by iron and aluminum hydroxides), and typically, P retention increases with longer hydraulic retention times (e.g., [63]). Within the aquatic landscape, the key environments for N removal or retention are stream riparian areas, wetlands, and lakes [13]. In lakes, two essential processes, sedimentation (both N and P) and denitrification (N), are related to the surface area of the sediments and to the surface area of the lakes.

According to our results, the most important factor explaining nutrient retention is the lake percentage of the watershed (Fig. 9a, b). Other characteristics influencing nutrient retention in lakes, such as residence time, which depends on the relationship between lake volume and catchment area, and nutrient concentration in the lake, cause further variability. With the same residence time, nutrient sedimentation is higher in more heavily loaded, more eutrophic lakes. Five watersheds (Karjaanjoki, Ähtävänjoki, Oulujoki, Kokemäenjoki, and Eurajoki in Fig. 9a) with approximately the same lake percentage (9.8 to 12.9 %) had highly variable TP retention, from 37 to 66 %. The highest TP retention was estimated for the river Karjaanjoki watershed, which combines the longest residence time (the ratio characterizing residence time, i.e., the volume of lakes divided by the watershed area, is 0.92) with the highest gross specific loading (37 kg km−2 year−1). The lowest TP retention was estimated for the river Eurajoki, which has the shortest residence time (the same ratio is 0.67), although its gross specific loading is also rather high (33 kg km−2 year−1).

Denitrification in river sediments may also be an important component of the TN retention in watersheds, particularly in temperate environments. In Finnish conditions, in the boreal zone, denitrification in rivers and riparian areas plays some role, as does denitrification in various peatland areas. Retention of TN was low in watersheds practically without lakes, such as Aurajoki and Lapväärtinjoki (Table 3). In the VEMALA-N simulation, denitrification in the river Aurajoki was 41 t year−1 (4 % of the gross N load). There is some evidence that increased NO3–N loading enhances denitrification in high-latitude eutrophic rivers (Temmesjoki in northern Finland), but denitrification has a limited capacity to remove nitrate from rivers (1.2–7.9 % of the added nitrate) [64]. River denitrification is expected to be higher in southern Finland because of the warmer climate and higher nitrate concentrations in the rivers.

4.4 Parameter Sensitivity and Uncertainties of the Model Approach

There are several sources of uncertainty: model input uncertainty, model structure uncertainty (process descriptions), model parameter uncertainty, and model technical uncertainty [65]. Here, we discuss only uncertainty related to the model parameters. The uncertainty of the VEMALA model parameters also depends on the spatial and temporal scale used. A high uncertainty in the VEMALA 1.1 model loading estimates is found for watershed parts without a sufficient number of water quality observations, whereas the uncertainty decreases for those watersheds (or parts of watersheds) with frequent water quality observations (more than eight per year). For the watershed parts with limited water quality observation sites, the parameter values were calibrated on the basis of the closest downstream frequent water quality observation site. Therefore, in some watershed parts a large number of third level sub-catchments can belong to the same calibration area. The parameter values can also have a high variation (Fig. 3), due to the lack of observations for the calibration. The process-based versions, VEMALA-N and VEMALA-ICECREAM, were developed to reduce the uncertainty due to the over-parameterization and lack of observations for VEMALA 1.1.

We studied the sensitivity of the simulated TP net specific loadings to two parameterizations of non-agricultural loading (Fig. 16). The first parameterization uses the parameters described in Section 2.1.2 with two restrictions: (1) there is no division into four seasons, so a single parameter set is used instead of a separate parameter set for each season, and (2) the spatial variation of the parameters between third level sub-catchments is reduced by setting it to be similar to that in the VEPS load estimation tool [59]. The second parameterization differs in two main respects: (1) the non-agricultural loading simulation is divided into four seasons (the same as for agricultural loading), and (2) the spatial variation of the parameters is based only on the calibration procedure and is not set to the spatial variation of forest and forestry loading in VEPS.

Fig. 16

TP net specific loading simulated by VEMALA 1.1 with two non-agricultural loading parameterizations, together with loads estimated from monitoring data

The range and average values of these two simulations are presented in Fig. 16, together with the load estimated from monitoring data. The simulated mean TP net specific loading for 27 watersheds was 11.6 ± 0.4 kg km−2 year−1 (±4 %). More variation in TP net specific loading values was found for some watersheds, e.g., 14 ± 3 kg km−2 year−1 (±24 %) in Simojoki, 25 ± 2.6 kg km−2 year−1 (±12 %) in Lapuanjoki, and 41 ± 3 kg km−2 year−1 (±8 %) in Porvoonjoki. Differences between simulated values and those estimated from monitoring were typically not very high, except for Kiskonjoki, where simulated loadings were on average 44 % higher (see also Section 3.1.1). In Kiskonjoki, the results based on monitoring data underestimate the nutrient loadings because the national water quality monitoring point represents only 60 % of the river catchment. The remaining 40 % is located downstream from that point, whereas the simulated loadings are for the whole watershed area. This case demonstrates the advantage of using models for nutrient loading simulations. Models can be used when observation data are too scarce or not sufficiently representative for the averaging method to be applied to riverine loading estimation. The variability between the simulations is higher in the forest-dominated watersheds, since it is the forest parameterization that was changed. However, in some agricultural watersheds, there can also be a relatively high variability between the simulations, for example, in the river Porvoonjoki watershed. In the calibration-based VEMALA 1.1 model, the forest loading tends to compensate for the low agricultural loading in order to match the observed high nutrient concentrations in the streams. Therefore, there is a need to develop a more process-based forest nutrient loading model in the future.

Parameters of the terrestrial, river, and lake models are calibrated simultaneously (for the VEMALA 1.1 and VEMALA-N versions), which causes a problem in lakes without inflow concentration observations. Inflow loading and retention may compensate for each other: the same mean concentration in the lakes can be simulated by underestimating both the inflow loading and the retention. Another problem is the wide variation of parameter values (Fig. 3), which causes uncertainty in simulated nutrient concentrations. To reduce the model uncertainty, more effort is needed to develop process-based terrestrial, river, and lake models; this work is in progress in the further development of the VEMALA-ICECREAM and VEMALA-N models. In the VEMALA-ICECREAM version, the terrestrial part is not calibrated against water quality observations in streams and lakes, and only the river and lake model parameters are calibrated.
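The compensation between inflow loading and retention can be made concrete with the steady state of a fully mixed lake with first-order sedimentation, for which the lake concentration is C = L / (Q + kV). This is a simplified sketch under assumed values, not the calibrated VEMALA lake model; the load L, discharge Q, volume V, and sedimentation rate k below are hypothetical.

```python
def steady_state_concentration(load_kg_day, q_m3_day, volume_m3, k_sed_per_day):
    """Steady state of a mixed lake: inflow load = outflow load + sedimentation."""
    return load_kg_day / (q_m3_day + k_sed_per_day * volume_m3)


def retention_fraction(q_m3_day, volume_m3, k_sed_per_day):
    """Share of the inflow load removed by sedimentation at steady state."""
    sink = k_sed_per_day * volume_m3
    return sink / (q_m3_day + sink)


Q = 1.0e6   # outflow, m3/day (assumed)
V = 1.0e8   # lake volume, m3 (assumed)

# Parameter set A: higher inflow load combined with higher sedimentation rate.
conc_a = steady_state_concentration(100.0, Q, V, 0.010)
ret_a = retention_fraction(Q, V, 0.010)

# Parameter set B: both inflow load and sedimentation rate are lower.
conc_b = steady_state_concentration(75.0, Q, V, 0.005)
ret_b = retention_fraction(Q, V, 0.005)

print(f"A: conc {conc_a * 1000:.3f} mg/l, retention {ret_a:.0%}")
print(f"B: conc {conc_b * 1000:.3f} mg/l, retention {ret_b:.0%}")
```

Both parameter sets produce the same lake concentration, although the inflow load differs by 25 % and the retention fraction differs markedly. Lake concentration observations alone therefore cannot distinguish between them, which is why inflow concentration observations matter for constraining the calibration.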

4.5 Challenges in a Nationwide Nutrient Loading Simulation—Phosphorus Simulation on Mineral and Peat Soils

The main challenges of a nationwide nutrient loading simulation system lie in the high variability of catchment characteristics and in the ability of the model to capture the various biogeochemical and climatic conditions that influence nutrient leaching and transport. The spatial variability of agricultural areas, crops, soil textures, slopes, lake percentages, and peatlands influences the nutrient loading to a great extent. River catchments in the southern and south-western coastal zones (from Taasianjoki to Aurajoki) are agriculture-dominated catchments on clay soils. Their TP loading is characterized by high loads during the snowmelt in spring or during rainfall events outside the growing season, due to transport of particulate phosphorus via preferential flow paths (surface runoff and macropore flow). For these river catchments, the concentration-runoff relationship model (VEMALA 1.1) produced satisfactory results, because TP concentrations increased rapidly during the runoff peak events.

The dynamics of phosphorus leaching in peat soils are different from those in mineral soils. The hydrological properties of peat soils differ considerably from those of mineral soils. Peat soils have high water conductivity and water holding capacity, so there is very little surface runoff even during the runoff peaks. In peat soils, phosphorus sorption sites are scarce. Consequently, added phosphorus is rather easily leached from these soils. Unlike mineral soils, a considerable part of the P load from agricultural peat soils comes through the matrix flow as dissolved phosphorus [66], but also as organic substances. Phosphorus is mainly leached from forested peat soils as organic substances [67]. In the low flow periods, the stagnant water in the peat soil profile leaches phosphorus. This phenomenon causes higher P concentrations in the peat-dominated catchments during the low flow periods compared with high flow periods [68]. Because of these differences in the phosphorus leaching dynamics of mineral and peat soils, calibration of the concentration-runoff model VEMALA 1.1 is difficult in catchments with a considerable area of peat soils.

5 Conclusions

In this paper, we described the operational, national-scale nutrient loading model VEMALA and its three versions. The concentration-runoff relationship model (VEMALA 1.1) is suitable for simulating present nutrient loading in large and medium-sized watersheds where good water quality observation sites are available. The more process-based model versions are suitable for nutrient load simulations on a wider range of spatial scales, starting from the field scale (VEMALA-ICECREAM) and smaller catchment scale (VEMALA-N) up to the large watershed scale, both for the present climate and for scenarios of changing climate and agricultural practices.

VEMALA simulates the nutrient balance in the watershed by simulating gross nutrient loading, retention, and net loading. The lake nutrient balance model, common to all three versions, is based on the simple approach of a continuously stirred tank reactor system with calibrated sedimentation and release rates. Despite the simple approach, the model is still able to represent nutrient retention in the watersheds by taking into account the volume of the lakes, water residence time, and actual nutrient concentrations in the lakes. However, the lake nutrient balance model needs further development in the future. For example, nutrient sedimentation rate could be related to the texture of the sediment particles entering the lake and the processes involved in phosphorus release from the sediments could be developed further.

The uncertainty of the model results depends on the spatial and temporal scales of the simulations. The annual nutrient net loading at the river mouth has a relatively low uncertainty due to good water quality monitoring sites for calibration of the model parameters. The uncertainty of the model results increases with decreasing size of the watershed. In small headwater catchments without observation sites, the uncertainty is high. Annual gross loading and nutrient retention are usually more uncertain than net loading simulations, due to the small amount of observation data for calibration of the model parameters.

In the future, emphasis will be placed on developing the process-based phosphorus leaching model ICECREAM to simulate erosion processes typical of Finnish conditions (snowmelt in the spring, wet autumns causing high concentrations of suspended sediments in runoff from clay soils), and on extending the catchment-scale nitrogen leaching model VEMALA-N to include ammonium and organic nitrogen leaching. Finally, a more process-based model of forest environments should be developed in order to better simulate export loads of nutrients (nitrogen and phosphorus) from forested areas (pristine and managed).