The integrated eco-hydrological-economic model of the Heihe River Basin was constructed by integrating existing models (or modules) of the basin. In terms of functional completeness, model performance, simulation and prediction capability, applicability to different watersheds, and use of remote sensing data, the model stays ahead of existing models.
A model is structurally more complicated than data. A model structure is characterized by the model conceptualization, formalization, representation, and code. All of these elements are personally and stylistically diverse; usually, no well-recognized standard is followed. Therefore, model integration is much more difficult than data integration (Dolk and Kottemann, 1993; Tang, 2001; Jakeman et al., 2013). From the perspective of computer and information sciences, the structure of a watershed model, whether a hydrological, ecological, or land surface model, is composed of the physical structure, the input and output interfaces, the user interface, and auxiliary tools (Figure 2). A brief description of each of these components follows:
(1) The physical structure is composed of the core structure and the peripheral structure. The core structure consists of the governing equations (usually differential equations) and parameterization equations/formulae of the model. The peripheral structure includes numerical solution methods, spatial and temporal discretization schemes, and model initialization schemes. The physical structure of the model, particularly the core structure, is a formalization of the hydrological, ecological, and socioeconomic knowledge related to the watershed. This structure is closely related to our understanding of the watershed behavior and the corresponding mathematical expressions and solutions of problems that we want to address.
(2) The input-output interface (I/O interface) refers to the relationship between the model and the model dataset. For watershed models, the model dataset can be classified into three categories: the forcing data, which could be the near-surface atmospheric state, human activity forcing, geological settings, and other data used as boundary conditions; the model parameter dataset; and the dataset for the validation and diagnosis of the model (Li et al., 2010a). In general, the interface covers the input, output, pre-processing, and post-processing of the above datasets. The interface, if well designed, is independent of the physical structure of the model. From a model integration viewpoint, the I/O interface is also one of the interfaces for coupling to other models or modules.
(3) The user interface is the space in which users interact with the model; it should be easy, efficient, and enjoyable to use. The command-line interface is still common for scientific models, but user-friendly graphical and/or web-based user interfaces still need to be developed.
(4) Auxiliary tools can be included in a well-developed integrated model. These tools cover parameter estimation, data preparation (e.g., data interpolation, stochastic simulation such as ensemble prediction, data fusion, and data conversion), data provision, visualization, and high-performance computation.
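The separation between the physical structure and the I/O interface described above can be illustrated with a minimal Python sketch. It is not taken from any of the Heihe River Basin models: the toy linear-reservoir equation, the explicit Euler scheme, and all names (storage_tendency, euler_step, ToyWatershedModel) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


# Core structure (hypothetical): governing equation of a toy linear reservoir,
# dS/dt = P - k * S, with storage S, precipitation forcing P, and parameter k.
def storage_tendency(storage: float, precip: float, k: float) -> float:
    return precip - k * storage


# Peripheral structure (hypothetical): numerical solution method (explicit
# Euler) and temporal discretization (fixed time step dt).
def euler_step(
    tendency: Callable[[float, float, float], float],
    storage: float,
    precip: float,
    k: float,
    dt: float,
) -> float:
    return storage + dt * tendency(storage, precip, k)


# I/O interface (hypothetical): accepts the forcing data and the parameter
# set, and returns the simulated discharge without exposing the equations.
@dataclass
class ToyWatershedModel:
    k: float                      # model parameter dataset
    dt: float                     # time step of the discretization scheme
    initial_storage: float = 0.0  # model initialization

    def run(self, forcing: Iterable[float]) -> List[float]:
        storage = self.initial_storage
        discharge = []
        for precip in forcing:
            storage = euler_step(storage_tendency, storage, precip, self.k, self.dt)
            discharge.append(self.k * storage)
        return discharge


if __name__ == "__main__":
    model = ToyWatershedModel(k=0.5, dt=1.0)
    print(model.run([2.0, 0.0]))
```

In this sketch, replacing euler_step with another solver changes only the peripheral structure, whereas replacing storage_tendency changes the core structure; the run method, which plays the role of the I/O interface, is unaffected in both cases.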
Modeling environment and the two approaches for model integration
A narrow definition of model integration, using the concepts of the model structure defined in the above section, is to couple the model components to simulate the system. However, in a broad sense, model integration includes the development of tools to support more efficient and effective model development and enable simpler interactions between the model and data. The modeling environment, sometimes called the environmental modeling framework, realizes the broad definition of model integration and can create domain-specific models of large-scale systems (David et al., 2013; Granell et al., 2013). The modeling environment is a computer software platform that supports the efficient development of integrated models, the convenient coupling of existing models or modules, model management, data management and processing, parameter calibration, and data and results visualization (Castronova et al., 2013b; Chen et al., 2011; Cheng et al., 2014; Li et al., 2010b; Wen et al., 2013).
Therefore, we propose that the model integration in watershed science research consists of two themes. One theme is the development of a watershed system model, which should be capable of representing the dynamics of the water-land-air-plant-human nexus. The other theme is the development of modeling environments, which focus on software support for efficient and effective model integration using advanced computational and information techniques.
To clarify the role of the modeling environment, we classify model integration into two approaches, namely, the know-how approach and the technical approach. The know-how approach is defined as the modeling approach used to formalize new understanding in the models. For example, if we use a unified equation to express the water flow in saturated and unsaturated zones, the governing equations have to be redefined; Wang et al. (2007) used such a set of unified equations to formulate the three-dimensional water movement in the vadose and saturated zones. Alternatively, if a new parameterization formula is added to close a set of equations, the numerical scheme of the model may change. In these two examples, the physical structure of the model, particularly the core structure, changes; low-level source code has to be developed; and coupling the existing modules by data transfer only is not possible. This new model development is, therefore, a multidisciplinary synthesis, and cross-field collaboration by researchers from different disciplines may be required. This type of model integration adds new knowledge to the process of integration, so it is called the know-how approach of model integration. The technical approach, in contrast, focuses on linking existing models or model components by data transfer. This approach does not modify the core structure of existing models, and it seldom modifies the peripheral structure unless some schemes, e.g., the spatial and temporal discretization schemes, need to be changed. The technical approach mainly addresses the I/O interfaces, which pass data and messages among different models and model components (Gregersen et al., 2007; David et al., 2013). Overall, the technical approach falls within the scope of computer and information technologies.
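The technical approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the interface of any existing modeling environment: the two components, their coefficients, the exchange-item names ("precip", "demand", "runoff", "withdrawal", "streamflow"), and the run_coupled function are all hypothetical. The point is that the coupler moves data between the I/O interfaces and never touches the internals of either component.

```python
from typing import Callable, Dict, List

# An exchange is a named set of values passed between components.
Exchange = Dict[str, float]
Component = Callable[[Exchange], Exchange]


def runoff_component(inputs: Exchange) -> Exchange:
    # Hypothetical existing model A: a fixed runoff coefficient of 0.5.
    return {"runoff": 0.5 * inputs["precip"]}


def irrigation_component(inputs: Exchange) -> Exchange:
    # Hypothetical existing model B: withdraw water up to the demand,
    # limited by the runoff that model A made available.
    withdrawal = min(inputs["runoff"], inputs["demand"])
    return {"withdrawal": withdrawal, "streamflow": inputs["runoff"] - withdrawal}


def run_coupled(components: List[Component], forcing: List[Exchange]) -> List[Exchange]:
    """Link components by data transfer only, one time step at a time."""
    results = []
    for step_forcing in forcing:
        exchange = dict(step_forcing)  # copy so the forcing is not mutated
        for component in components:
            # Outputs of each component become available as inputs to the next.
            exchange.update(component(exchange))
        results.append(exchange)
    return results


if __name__ == "__main__":
    forcing = [{"precip": 10.0, "demand": 2.0}, {"precip": 2.0, "demand": 2.0}]
    for step in run_coupled([runoff_component, irrigation_component], forcing):
        print(step)
```

A know-how integration, by contrast, could not be expressed this way: if runoff generation and water withdrawal had to be solved as one set of equations, the body of the components themselves would need to be rewritten.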
Classifying the modeling approaches into know-how and technical approaches helps to clarify the relationship between the development of an integrated model and the modeling environment. As illustrated in Figure 3, the technical approach, which is usually equipped with an icon drag-and-drop function or a modeling language, is efficient for developing integrated models and promoting the integration of the model with a geographic information system (GIS), data management, parameter calibration, and visualization environments. However, the know-how approach must be used when new formalization is required or when the complexity of the new integrated model prevents the linkage of existing models/modules using only the input and output interfaces. In these cases, a new model at the level of the source code must be developed. Nevertheless, the know-how approach of model integration can also benefit from the technical approach. When a new model is formulated, it can be coupled with other models/modules to form a more functional model by using the modeling environment. Additionally, the modeling environment provides data management, parallel computation, visualization and other necessary support for the integrated model and helps improve the model performance using various tools, such as parameter calibration.
Relationship between a scientific model and river basin management model
Whether a model is a tool or a hypothesis has been a subject of debate for a long time (Savenije, 2009). Our opinion is that the purpose of watershed model integration is to develop both a scientific model and a river basin management model. The scientific model, i.e., the watershed system model, is used for understanding the complex processes of a river basin; thus, it is similar to a scientific hypothesis. The river basin management model is mainly applied to manage water resources, other natural resources, and socio-economic resources; thus, it often acts as a tool. The scientific model is usually built on the basis of the existing laws of physics and is often called a mechanism model, while the river basin management model can be a completely empirical model. A scientific model is often very complex due to the complexity of watershed science, while a river basin management model mainly focuses on simplicity and usability. A scientific model does not typically need to be equipped with a graphical user interface, but a river basin management model requires a friendly user interface for easy operation. The former often involves a huge computational cost, while the latter must be computationally efficient. The former, in theory, has better predictability, but the uncertainty in the simulations is also very large and difficult to quantify. The latter is expected to inherit the predictability of the former, but uncertainty must be reduced and controlled to communicate the risks to the decision-making process.
Whether an integrated model should be simple or complex is also debated. As Albert Einstein said, “Make everything as simple as possible, but not simpler.” Specifically, we should pursue the development of a simple model, but we also must ensure that the model can describe the complexity within a river basin. Balancing authenticity and simplicity is the art of modeling. Building a river basin management model based on a mature scientific model (Cai et al., 2015; CUAHSI, 2007; Surridge and Harris, 2007) is a reflection of this aim. In Figure 4, we summarize the relationship between the two types of integrated models. The new generation of scientific models will have a large capacity, but the complexity, computational cost, and uncertainty generally increase with the extent of the model integration. To reduce and control uncertainty, we propose to apply data assimilation and other model-data fusion methods for merging multi-source observation data into the dynamics of the watershed system model. This strategy can reduce the uncertainty in the simulation and also increase the predictability of the simulations.
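How merging an observation into a model state reduces uncertainty can be shown with the simplest possible case, a scalar Kalman filter analysis step. The text above does not specify which data assimilation method is used, so this choice is an assumption made only for illustration; the function name kalman_update, its parameters, and the numbers in the example are hypothetical.

```python
from typing import Tuple


def kalman_update(
    forecast: float,      # model forecast of a state variable (e.g., soil moisture)
    forecast_var: float,  # error variance of the forecast
    obs: float,           # observation of the same variable
    obs_var: float,       # error variance of the observation
) -> Tuple[float, float]:
    """Scalar Kalman analysis step: blend forecast and observation by their variances."""
    if forecast_var < 0 or obs_var < 0 or forecast_var + obs_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    # The gain weights the observation more heavily when the forecast is uncertain.
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    # The analysis variance never exceeds the forecast variance.
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


if __name__ == "__main__":
    state, variance = kalman_update(forecast=10.0, forecast_var=4.0, obs=14.0, obs_var=4.0)
    print(state, variance)
```

With equal forecast and observation variances, the analysis lies halfway between the two values and its variance is half the forecast variance, which is the sense in which assimilation reduces the uncertainty of the simulation.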
However, a new problem emerges: watershed managers and stakeholders have difficulty using a watershed system model with a data-assimilation ability because the model structure is too complex and the computational cost is too high. Although stakeholder involvement is very important, stakeholders generally do not intend to learn the complex processes behind the watershed system model. So, is there a compromise between model predictability and applicability? A feasible approach may be to build a physically or computationally simplified surrogate model (Razavi et al., 2012; Wu et al., 2015; Wu et al., 2016) or to develop an offline data-driven surrogate model by using a huge amount of data generated from the watershed system model. In addition, it could be possible to pre-define various scenarios of climate change, land use change, river basin management planning, and economic development and to conduct scenario-based simulations to support decision making. In this way, model integration could take into account both the strong predictability of a scientific model and the simplicity, robustness, applicability, and easy interaction of a management-oriented model, while at the same time lowering the complexity and computational cost and reducing the uncertainties.
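The idea of an offline data-driven surrogate can be sketched as follows. This is a minimal illustration under stated assumptions, not the surrogate method of the cited studies: expensive_model is a hypothetical stand-in for a full watershed system model run, the single input (an irrigation fraction) is invented for the example, and piecewise-linear interpolation is used only because it is the simplest surrogate to write down.

```python
from bisect import bisect_right
from typing import Callable, List


def expensive_model(irrigation_fraction: float) -> float:
    # Hypothetical stand-in for one run of a complex watershed system model:
    # returns a downstream flow index for a given irrigation fraction.
    return 100.0 * (1.0 - irrigation_fraction) ** 2


def build_surrogate(
    model: Callable[[float], float], sample_points: List[float]
) -> Callable[[float], float]:
    """Run the model offline at the sample points and return a cheap interpolator."""
    xs = sorted(set(sample_points))
    if len(xs) < 2:
        raise ValueError("at least two distinct sample points are required")
    ys = [model(x) for x in xs]  # the only place the expensive model is called

    def surrogate(x: float) -> float:
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("input lies outside the sampled range")
        # Index of the right-hand node of the interval that contains x.
        i = min(bisect_right(xs, x), len(xs) - 1)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return surrogate


if __name__ == "__main__":
    fast_model = build_surrogate(expensive_model, [0.0, 0.5, 1.0])
    print(fast_model(0.25), expensive_model(0.25))
```

The surrogate reproduces the model exactly at the sampled points and only approximately between them, so the number and placement of the offline runs control the trade-off between computational cost and fidelity. Pre-defined scenarios fit the same pattern: each scenario is one more offline run whose result is stored for later lookup.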