Elsevier

Geoderma

Volume 377, 1 November 2020, 114575
A comparison of the use of local legacy soil data and global datasets for hydrological modelling a small-scale watersheds: Implications for nitrate loading estimation

https://doi.org/10.1016/j.geoderma.2020.114575

Highlights

  • The SWAT model was set up with SoilGrids soil data and with a dataset derived from local legacy data.

  • Parameter sampling was performed to assess streamflow and nitrate load prediction.

  • The SoilGrids-based model led to more biased predictions of water balance and nitrate load.

  • For smaller-scale studies, SoilGrids still seems to be insufficient.

Abstract

Appropriate soil property data are important inputs for the development of hydrological models, and as the scale of the study decreases, the relevance of these data increases. Watershed soil surveys require time and money, so readily available data are often used. This study compares the hydrological and nitrate loading prediction performance of the globally available dataset SoilGrids and a dataset prepared from local legacy data by a random forest approach. The study was carried out using the SWAT model for a 33 km2 watershed in the Czech Republic. Model performance was tested by applying the same Latin hypercube-sampled parameter sets to two models: one using the SoilGrids data as input, and one based on the local soil dataset. The SoilGrids dataset generally shows finer soil texture and higher organic carbon content than the dataset based on local legacy data. These differences are also reflected in the hydrological process predictions: the SoilGrids-based model produced less lateral flow than the model based on the local dataset, and this was compensated by higher evapotranspiration. This difference in water balance led to worse nitrate loading estimation by the SoilGrids-based model. The results of this case study suggest that the use of a local dataset is still more appropriate, despite the availability of global data with medium resolution.
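As an illustration of the Latin hypercube sampling mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' code (the reference list points to the R package lhs); the parameter names and ranges are hypothetical placeholders rather than the SWAT parameters used in the study. Each parameter range is divided into as many equal strata as there are samples, and every stratum is used exactly once.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples parameter sets by Latin hypercube sampling.

    bounds maps a parameter name to a (low, high) tuple. Each range is
    split into n_samples equal-width strata, and each stratum contributes
    exactly one value per parameter.
    """
    rng = random.Random(seed)
    names = list(bounds)
    columns = []
    for name in names:
        low, high = bounds[name]
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired randomly across parameters.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    # Transpose the columns into one dict per parameter set.
    return [dict(zip(names, row)) for row in zip(*columns)]


# Hypothetical parameter names and ranges, for illustration only.
bounds = {"param_a": (0.0, 1.0), "param_b": (10.0, 50.0)}
samples = latin_hypercube(5, bounds, seed=42)
```

Applying the same list of sampled parameter sets to both soil-data variants of a model, as the abstract describes, means that differences in output can be attributed to the soil input rather than to the sampling.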

Introduction

Hydrological models support decision-making on issues such as water resource planning, flood prevention and contamination mitigation (Beven, 2012). To reduce the uncertainty in model outputs, realistic input data are needed (Robinson et al., 2016). Soils play a crucial role in rainfall-runoff processes and constituent loading. Soil properties that relate to the rate of infiltration, or ability to store water, significantly affect the water balance in watersheds (Geroy et al., 2011). Additionally, the impact of soils on hydrological processes and ion sorption also affects nutrient loss (Gaines and Gaines, 1994, Kurunc et al., 2011). Moreover, soil properties vary spatially (Biggar and Nielsen, 1976, Iqbal et al., 2005) and with depth in the soil profile (Franzluebbers, 2002). Some hydropedological properties, e.g. hydraulic conductivity, may even vary with season (Šípek et al., 2019). Uncertainty in soil information can outweigh the uncertainty of climate change impact, as shown by Folberth et al. (2016) in the case of crop yield modelling by global, gridded crop models.

One of the most popular tools for modelling hydrological processes in watersheds is SWAT (Gassman et al., 2007). It is commonly used for the estimation of water balance and nonpoint source water pollution, especially by nutrients, and the estimation of the effectiveness of best management practices (e.g. Strauch et al., 2013) or the possible impact of climate change (e.g. Bhatta et al., 2019). The importance of the input soil data resolution on the output of the SWAT model has been shown by many authors (Bouslihim et al., 2019, Bhandari et al., 2018, Bossa et al., 2012, Mednick, 2010, Moriasi and Starks, 2010, Romanowicz et al., 2005). Although it is possible to obtain comparable results for the watershed outlet, spatial model performance is expected to decrease significantly with less reliable soil data, as shown by Tavares Wahren et al. (2016). In addition, a few authors have investigated the impact of soil input data of different resolution on nitrate loading (Chaplot, 2005, Cotter et al., 2003, Geza and McCray, 2008) and have observed considerable effects of these different soil input data.

There are several ways to address soil property inputs for models. Global datasets, such as the FAO soil map (Batjes, 1997, Nachtergaele et al., 2010) or SoilGrids (Hengl et al., 2017), provide soil data for the whole world and are readily available. These data are suitable for large-scale studies (Abbaspour et al., 2015). In most cases, traditional soil maps with soil classes are used. The mean soil properties are then estimated for particular soil classes from databases (Čerkasova et al., 2018, Mbungu and Kashaigili, 2017) or in combination with field surveys (Cordeiro et al., 2018, Kmoch et al., 2019, Lima et al., 2013). This approach carries uncertainty related to the spatial delimitation of the patches representing soil classes. Another way is to predict soil properties spatially by digital soil mapping (DSM) (Ma et al., 2019, Piikki and Söderström, 2019, McBratney et al., 2003). Soil properties are predicted using known information from particular pedons represented by spatial points. Approaches to these predictions include interpolation methods, regression models and machine learning methods. Hydropedological properties can be predicted directly from existing measurements, but, more often, only information about particle size distribution (PSD), including the proportions of the clay, silt and sand fractions, and organic carbon content (OC) is available. In that case, hydropedological properties must be derived by pedotransfer functions (PTFs). These two approaches can provide comparable results (Tóth et al., 2018).

For the SWAT model, the DSM approach was used in a study by Santra et al. (2011), in which the basic soil properties (PSD and organic carbon) were interpolated by regression kriging, the soil hydraulic properties were derived by PTFs and the resulting pixels were aggregated by a fuzzy approach. In comparison to the map of dominant soil classes, the fuzzy approach performed better for stream flow prediction. Ziadat et al. (2015) developed a tool called SLEEP (soil-landscape estimation evaluation program), which spatially estimates soil properties from point surveys, using a digital terrain model and remote sensing data to derive the covariates for fitting a linear regression model. Testing the performance of the SWAT model with soil data from SLEEP, kriging and FAO maps showed that the model with SLEEP-derived data performed similarly to the FAO map-based model and better than the kriging-based model. Tavares Wahren et al. (2016) tried to solve the problem of soil data scarcity and improve the spatial distribution of soil depths by using the SoLIM tool. Although stream flow prediction was comparable to that from the model with the base soil data, these authors demonstrated reduced uncertainty in model parameters after calibration.

In the case of large-scale watersheds, more detailed soil data may not have a significant effect on hydrological model performance, but in the case of smaller studies, more detailed soil information may have a more pronounced effect (Mukundan et al., 2010). Conducting soil surveys for particular modelling studies can require time and money. The aim of this study is to determine whether SoilGrids, the most detailed global soil data set currently available, is sufficient for smaller watershed modelling studies of nitrate loading by comparison with a dataset derived from local soil legacy data (SLD). The differences in streamflow output, possible parameter range and implications for water balance are assessed. The implications of soil datasets and the resulting difference in the effects of water balance on nitrate loading are investigated.

The study area

The Olešná reservoir is located in the eastern part of the Czech Republic (49° 38′ N, 18° 18′ E), and its contributing watershed area covers approximately 33 km2 (Fig. 1). The reservoir protects a nearby residential area against flooding, supplies water to a company producing wood pulp, and is used for recreational swimming and fishing. The elevation of the watershed ranges from 300 m to 860 m (mean 401 m). The long-term mean annual precipitation is 995 mm, and the

Legacy data-based soil property prediction

The mean absolute error (MAE) and root mean square error (RMSE) for predicting the percentage of particles <0.01 mm are summarized in Table 2. The random forest prediction showed relatively low error when compared against the training dataset, but performance decreased when it was evaluated against the independent control dataset. The prediction error was lower for the topsoil layer than for the subsoil layer. Performance improved slightly after residual interpolation.
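For reference, the two error measures named above can be computed as in the short sketch below. It uses the standard definitions of MAE and RMSE; the observed and predicted values are made-up numbers for illustration and are not taken from Table 2.

```python
import math


def mae(observed, predicted):
    """Mean absolute error between paired observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Made-up percentages of particles <0.01 mm, for illustration only.
observed = [30.0, 45.0, 25.0, 50.0]
predicted = [32.0, 41.0, 25.0, 56.0]

mae_value = mae(observed, predicted)    # absolute errors 2, 4, 0, 6 -> 3.0
rmse_value = rmse(observed, predicted)  # sqrt((4 + 16 + 0 + 36) / 4)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and is more sensitive to a few large errors, which is one reason the two are often reported together.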

Mean

Prediction of soil properties

Prediction accuracy is in accordance with that of other studies (e.g. Nussbaum et al., 2018), including those describing SoilGrids development (Hengl et al., 2017). Generally, the pattern of soil texture and OC is consistent in both datasets, but SoilGrids shows systematically finer soil texture and higher OC content, which was also observed in France (Vaysse and Lagacherie, 2015). SoilGrids gives a good representation of the coarsest signal in global soil property variation (Hengl et al., 2017

Conclusions

The lack of detailed, spatially distributed soil property data still limits the use of hydrological models. DSM methods seem to be promising approaches for estimating soil properties and increasing the relevance of hydrological model output. Globally, the most detailed product, SoilGrids, provides soil property data at high resolution, which may allow applications in smaller catchment areas. In this study, we have shown a comparison with our own DSM data prepared from local legacy data in the case of

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The research was funded by the University of Ostrava from internal project SGS02/PřF/2019-2020.

The authors kindly thank the reviewers for their constructive and inspiring comments.

The English language was reviewed by James P. Leckie.

References (76)

  • Y.T. Dile et al.

    Introducing a new open source GIS user interface for the SWAT model

    Environ. Model. Software

    (2016)
  • A.J. Franzluebbers

    Water infiltration and soil structure related to organic matter and its stratification with depth

    Soil Tillage Res.

    (2002)
  • M. Geza et al.

    Effects of soil data resolution on SWAT model stream flow and water quality predictions

    J. Environ. Manage.

    (2008)
  • A. Kurunc et al.

    Identification of nitrate leaching hot spots in a large area with contrasting soil texture and management

    Agric. Water Manage.

    (2011)
  • M. Ließ et al.

    Uncertainty in the spatial prediction of soil texture. Comparison of regression tree and Random Forest models

    Geoderma

    (2012)
  • B.P. Malone et al.

    Mapping continuous depth functions of soil carbon storage and available water capacity

    Geoderma

    (2009)
  • E. Molina-Navarro et al.

    A QGIS plugin to tailor SWAT watershed delineations to lake and reservoir waterbodies

    Environ. Model. Software

    (2018)
  • Y. Ouyang

    Estimation of shallow groundwater discharge and nutrient load into a river

    Ecol. Eng.

    (2012)
  • K. Piikki et al.

    Digital soil mapping of arable land in Sweden – validation of performance at multiple scales

    Geoderma

    (2019)
  • M. Piniewski et al.

    The effect of sampling frequency and strategy on water quality modelling driven by high-frequency monitoring data in a boreal catchment

    J. Hydrol.

    (2019)
  • N.J. Robinson et al.

    Soil data for biophysical models in Victoria, Australia: current needs and future challenges

    Geoderma Reg.

    (2016)
  • A.A. Romanowicz et al.

    Sensitivity of the SWAT model to the soil and land use data parametrisation: a case study in the Thyle catchment, Belgium

    Ecol. Modell.

    (2005)
  • M. Strauch et al.

    The impact of Best Management Practices on simulated streamflow and sediment load in a Central Brazilian catchment

    J. Environ. Manage.

    (2013)
  • F. Tavares Wahren et al.

    Combining digital soil mapping and hydrological modeling in a data scarce watershed in north-central Portugal

    Geoderma

    (2016)
  • A. van Griensven et al.

    A global sensitivity analysis tool for the parameters of multi-variable catchment models

    J. Hydrol.

    (2006)
  • K. Vaysse et al.

    Evaluating Digital Soil Mapping approaches for mapping GlobalSoilMap soil properties from legacy data in Languedoc-Roussillon (France)

    Geoderma Reg.

    (2015)
  • Y. Zhu et al.

    Uncertainty assessment in baseflow nonpoint source pollution prediction: the impacts of hydrographic separation methods, data sources and baseflow period assumptions

    J. Hydrol.

    (2019)
  • AOPK, 2007. Soil map. 1:50 000. Web map available from: <http://mapy.geology.cz/pudy/>. WMS available from:...
  • N.H. Batjes

    A world dataset of derived soil properties by FAO-UNESCO soil unit for global modelling

    Soil Use Manage.

    (1997)
  • K. Beven
  • R. Bhandari et al.

    Effects of soil data resolution on the simulated stream flow and water quality: application of watershed-based SWAT model

    World Environ. Water Resour. Congr. 2018: Watershed Manag., Irrig. Drainage, Water Resour. Plan. Manag.

    (2018)
  • J.W. Biggar et al.

    Spatial variability of the leaching characteristics of a field soil

    Water Resour. Res.

    (1976)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • Carnell, R., 2019. lhs: Latin Hypercube Samples. R package version...
  • O. Conrad et al.

    System for Automated Geoscientific Analyses (SAGA) v. 2.1.4

    Geosci. Model Dev.

    (2015)
  • M.R.C. Cordeiro et al.

    Deriving a dataset for agriculturally relevant soils from the Soil Landscapes of Canada (SLC) database for use in Soil and Water Assessment Tool (SWAT) simulations

    Earth Syst. Sci. Data

    (2018)
  • A.S. Cotter et al.

    Water quality model output uncertainty as affected by spatial resolution of input data

    J. Am. Water Resour. Assoc.

    (2003)
  • C. Folberth et al.

    Uncertainty in soil data can outweigh climate impact signals in global crop yield simulations

    Nat. Commun.

    (2016)