AgMIP calibration protocol

Overview

Fig. 1: Answers to the question “What was your approach to calibration?” (n=189) from a web-based survey of calibration practices in crop modeling (Seidel et al. 2018).

The protocol covers all the steps involved in calibration, namely the choice of default parameter values,      observed variables to fit,      objective function,       parameters to be estimated,      numerical algorithm and implementation, and finally of evaluation. The protocol also includes instructions as to documentation to be produced at each step of the calibration. The protocol is intended to be applicable to a wide range of crop growth models and crop development data sets, e.g. covering a number of sites and/or several treatments at one or more sites and different types of measured variables. These can be variables measured over time (phenology, biomass, leaf area index, soil water contents, soil mineral nitrogen contents etc.) or variables measured only at one point in time (e.g. crop yield, yield components, grain protein or grain number at harvest), but for several treatments or environments (we call site/year combinations of measurements “environments”).

Description of the calibration protocol

The proposed protocol for      crop model calibration, is composed of eight steps (Fig. 2). Implementation of t     he first five steps (the model expertise part of the protocol) require detailed knowledge of the model and the data. No calculations are performed here. The result of these steps is a series of tables that contain all the model-specific information needed for the calculations. The last three steps (the calculation steps) describe the calculations to be done. The protocol includes instructions for each step, and the documentation to be produced in each step. The documentation of all decisions and assumptions is an integral part of the protocol, ensuring transparency and reproducibility of the calibration procedure.

Fig. 2: Schema of calibration protocol. The first five steps involve codifying model expertise. Given that information, the calculation steps 6-8 require no further model-specific inputs (according to Wallach et al. 2024).

Step 1. Choose default parameter values

Only a small fraction of crop model parameters will usually be estimated from the data. The majority of the parameters will remain at their default values. Default values[1]  in this context are the pre‑set parameter values provided by the model developers, or taken from an earlier model calibration, intended to represent the most relevant existing parameter values for the studied conditions. Also, for those parameters that are estimated from the data, the starting values for exploring the parameter space are centered around the default values. The optimization method used here, works best, if the default values for parameters to be estimated are reasonably close to the new best values. Therefore, it is important to choose the default values with care. In particular, one should obtain as much information as possible about the cultivar characteristics (maturity class, photoperiod sensitivity, vernalization requirements etc.) and choose default parameter values based on a cultivar as similar as possible to the cultivar in question. 

The documentation required here (Table 1) contains the cultivar characteristics and the rationale for the choice of default parameter values.

Table 1 : Example of protocol documentation for step 1, showing cultivar characteristics of the target population variety and of the variety providing the default parameter values. This example is for the French data set and the STICS model.

VarietyCharacteristics 
Variety of measurements: ApacheA soft winter wheat. Stem elongation – semi-early. Heading – early. Vernalization requires 40 days where full vernalization occurs if daily average temperature is between 3°C and 10°C. There is no vernalization below -4°C or above 17°C. Otherwise there is a proportional reduction in vernalization effectiveness.
Cultivar used to provide default parameter values: SoissonsSoissons seems to be close to Apache in terms of vernalization requirements and earliness.

Step 2. List observed variables and corresponding simulated variables

The purpose of this step is to identify the correspondence between observed and simulated variables. For crop model calibration, it is advantageous to have as many variables as possible, because it provides richer, more diverse information that helps constrain parameters more effectively, reduces equifinality, and improves the model’s ability to represent multiple processes accurately, which leads to more robust and reliable predictions across conditions. Therefore, it is strongly recommended to use all observed or measured variables for calibration that have a corresponding output variable in the model. A simulated variable is assigned to each measured variable. The documentation required for this step is a table with one row for each measured variable, showing also the corresponding simulated variable and the respective unit (Table 2 shows an example).

Table 2: Example of protocol documentation for step 2 and 3, showing observed and corresponding simulated variables as well as their units [9] (step 2) and the group, each parameter is attributed to (step 3), including their order of calibration. This example is for the French data set and the STICS model.

Name of the observed or required variableGroup for calibrationName of the simulated variableUnit of the simulated variable
Date_BBCH10phenologyilevsd
Date_BBCH30phenologyiamfsd
Date_BBCH55phenologyilaxsd
Date_BBCH90phenologyimatsd
EarsPerSqMearsNANA
Biomassplant_biomassmasec_nt ha-1
N_in_biomassHarvestplant_N_contentN_in_biomassHarvest%
Grain_Numbergrain_numberchargefruitm-2
Grain_Yieldyieldmafruitt ha-1
ProteinContentGrainseed_proteinProteinContentGrain%

Step 3. Define groups of variables and order them

In this step, observed variables are divided into groups. All variables in one group need to have the same unit. For example, data of days to development stages are grouped together in a phenology group. Multiple observations over time of the same variable (e.g., biomass, if there are in-season measurements) will constitute a single group[11] . Other variables, such as observations at crop harvest (e.g. yield, grain number, grain protein content etc.) will each constitute a separate group.

Then, the order in which the variable groups will be used for calibration needs to be defined. The order should be such that estimating parameters to fit variables in one group has little or no effect on the simulated values of variables in previous groups. In the case of modeling crop growth, phenology is usually the first group, since while changing simulated phenology usually has a major effect on simulated values of other variables, changing the simulated values of other variables often has little or no effect on simulated phenology.

If there is little or no feedback in between variable groups, then the fit to each variable group will hardly change when subsequent groups are fit, so the parameter values found for each group will be a good approximation to the best parameters considering all the data. If, however, there is substantial feedback, which is quite typical in crop models, then the parameter values found for each group will no longer give a good fit after all groups have been considered. This is taken into account in the later steps of the protocol. The required documentation for this step     , which is combined with the documentation for step 2, shows the group and order for each observed variable (Table 2).

Step 4. Identify the major parameter(s) for each group of variables

The purpose of this step is to identify the major           parameters for each variable group. The main role of the major parameters is to reduce bias. The major     g parameters for a group of variables should have an effect on the simulated values in all environments. Nearly additive parameters, i.e. parameters that have a similar effect across all environments, can be good choices as major parameters, as they make the model bias nearly zero for the associated variable.  An example of      a major      parameter is thermal degree days to a development stage, since increasing the required number of degree days will, in general, increase the days to the stage by a similar amount for all environments.[15] [16]  Parameters that describe the effect of stresses, which only affect the simulated values if the stresses are present, will not be bias-reducing parameters.

There is a strict limit[17]  on the number of major      parameters for each group, to avoid over-parameterization. If there is only one measured variable in the parameter group (e.g. yield), there can only be[18]  one major      parameter. For variables with at least two measurements in some environments      (e.g. biomass with in-season measurements) there can be [19] at most two major      parameters (for example, one that determines rate of increase during vegetative growth and a second that determines rate of increase during reproductive growth). For phenology, there can be as many major      parameters as observed development stages with simulated equivalents. However, each phenological major     g parameter must affect the time to a different stage. If no parameter acts across nearly all environments for a given variable, then there may be no major parameter for that variable, only candidate parameters (see Step 5).

The required documentation here is a table which shows the major      parameters for each variable group (see example in Table 3).

Table 3: Example of protocol documentation for step 4, showing major      parameters for the different variable groups with default values, lower and upper boundaries[20]  and a short explanation and unit of the parameter. This example is for the French data set and the STICS model.

Name of the parameterGroupDefault valueLower boundUpper boundShort explanation (and units if possible)
stlevamfphenology212150400cumulative thermal time between the stages LEV (emergence) and AMF (maximum acceleration of leaf growth, end of juvenile phase) (degree-d)
stamflaxphenology367150500cumulative thermal time between the stages AMF (maximum acceleration of leaf growth, end of juvenile phase) and LAX (maximum leaf area index, end of leaf growth )  (degree-d)
stdrpmatphenology700500900cumulative thermal time between the stages DRP (starting date of filling of harvested organs) and MAT (maturity)  degree-d)
efcroivegplant_biomass4.2536maximum radiation use efficiency during the vegetative stage (AMF = maximum acceleration of leaf growth, end of juvenile phase – DRP=starting date of filling of harvested) (g.MJ-1 organs)
efcroireproplant_biomass4.2536maximum radiation use efficiency during the grain filling phase (DRP= starting date of filling of harvested organs – MAT= maturity), (g.MJ-1)
Vmax2plant_N_content0.050.0020.1maximum specific N uptake rate with the high affinity transport system (µmole.cm-1 h-1)
cgraingrain_number0.0360.030.04slope of the relationship between grain number and growth rate (grains.g-1.d)
vitircarbTyield0.00070.000050.002rate of increase of the C harvest index vs thermal time (g grain.g-1.d-1)
vitirazoseed_protein0.01450.0010.04rate of increase of the N harvest index vs time (g grain.g-1.d-1)

Step 5. Identify the candidate parameters for each group of variables

The candidate parameters are those parameters that are likely to explain a substantial part of the variability between environments   that remains after the major      parameters are estimated. Each of these parameters will be tested (in the next step) for the respective group of variables, and will only be included in the final list of parameters to estimate if estimation leads to a sufficient improvement in fit to the data. The candidate parameters should be ordered from supposedly most to supposedly least important. The number of candidate parameters is not limited, but it has a significant impact on computational time     . The required documentation here is a table with the candidate parameters for each variable group (see example in Table 4).

Table 4: Example of protocol documentation for step 5, showing candidate parameters for the different variable groups with default values, lower and upper boundaries and a short explanation and unit of the parameter. This example is for the French data set and the STICS model.

Name of the parameterGroupDefault valueLower boundUpper boundShort explanation (and units if possible)
jvcphenology55.912560number of vernalizing days (d)
sensrsecphenology0.501index of root sensitivity to drought (1=insensitive) (without dimension)
belongphenology0.0120.0050.03parameter of the curve of coleoptile elongation (degree.day-1)
jvcminiphenology7215minimum vernalizing days required (d)
stressdevphenology001maximum phasic delay allowed due to stresses (without dimension)
dlaimaxbrutplant_biomass0.000475E-060.005maximum rate of the setting up of LAI (m2 leaf.plant-1.degree-d-1)
durvieFplant_biomass20040300maximal lifespan of an adult leaf expressed in summation of Q10=2 (2**(T-Tbase)) (without dimension)
vlaimaxplant_biomass2.21.52.5ulai at the inflexion point of the function DELTAI=f(ULAI), where ulai is the relative development unit for LAI (without dimension)
psistoplant_biomass151125potential of stomatal closing (absolute value) (bars)
psiturgplant_biomass4115potential of the beginning of decrease of the cellular extension (absolute value) (bars)
croiracplant_N_content0.1200.5elongation rate of the root apex (cm.degree-d-1)
draclongplant_N_content8011000maximum rate of root length production per plant (cm.plant-1.degree-d-1)
nbjgraingrain_number30540number of days used to compute the number of viable grains (d)
pgrainmaxiyield0.04070.030.065maximum grain weight (at 0% water content)
cgrainv0yield000.07fraction of the maximal number of grains when growth rate is zero

Step 6. Select parameters to estimate for each group and estimate their values

In this calculation step, each group of variables is treated separately, in the order chosen in step 3 (Table 2). A list of parameters to estimate for each group is initialized with the major      parameter(s), and then each candidate parameter is tested. The major      parameters for the group are estimated using ordinary least squares (OLS), and the corrected Akaike Information Criterion (AICc, Brewer et al., 2016; Chakrabarti and Ghosh, 2011) is calculated as

   

where SS is the sum of squared errors for all variables in the group, n is the number of data points and p the number of estimated parameters. This assumes that all model errors for the group are independent and identically normally distributed.

Once the major      parameters have been estimated, each candidate parameter in turn is added tentatively to the list of parameters to be estimated. If estimating all the parameters on the list reduces AICc below the previous smallest value, the candidate is kept on the list of parameters to be estimated. Otherwise, the candidate is removed from the list of parameters to be estimated, and returns to its default value      (see flow diagram in Fig. 3). AICc is a standard model selection criterion, designed to choose the best predicting model even if none of the proposed models is the true model (Aho et al., 2014).

[21] A first required documentation table here shows the result of adding each new candidate parameter, for each variable group (see example in Table 4[22] [23] ). The optimum parameter values and the fit to the measured data after this step are combined with the documentation of step 7 (Table 5, Table 6[24] [25] ).

Fig. 3: Flow diagram for the selection of parameters to estimate, and first estimation of their values for one variable group. This is done for each variable group in step 6 of the protocol (Wallach et al. 2024).

Step 7. Estimate all selected parameters together with WLS, using all variables together

In this step, all the selected parameters from step 6 are estimated together, using all the data, using weighted least squares (WLS). The objective function, to be minimized, is a sum of terms, one for each variable group. The term for each group is the sum of squared errors for that group, divided by errVar:

   

where SS is the sum of squared errors for all variables in the group from step 6, n is the number of data points and p the number of estimated parameters in step 6. The required documentation table here shows the estimated parameter values after steps 6 and 7 (see example in Table 5). [26] [27] 

Step 8. Evaluate goodness-of-fit

In this step, metrics of goodness-of-fit are calculated for the simulations using the default parameter values, values after step 6 and using the parameter values after step 7. The required documentation table here shows the metrics for goodness-of-fit at each stage. An example is shown in Table 6 and Supplementary Table S. Additional metrics could also be calculated. Graphs of simulated versus observed values for each variable should also be produced (see example in Supplementary Figures S1-S3). [29] 

Discussion

Before starting calibration, some aspects need to be considered: Some variables might need to be transformed before calibration, to give them comparable weights.[31] [32]  For example, biomass measurements over time go over a wide range of values over the growth period. This is associated with an increase of standard deviation of model error. Therefore, biomass measurements should be replaced by the natural logarithm of biomass. The log transformation will make the standard deviations approximately constant for all dates. Similarly, a log transformation should be used for any other variables expected to vary over a wide range over time.

In CroptimizR (REF), the Nelder-Mead simplex algorithm is currently used to find optimal parameter values, both in step 6 and step 7 (Nelder and Mead, 1965). [33] However, use of the simplex is not an obligatory part of the protocol. Other optimization algorithms could be used. [34] 

Software tools

The open-source R package CroptimizR automates the implementation of the AgMIP Calibration protocol and the generation of associated diagnostics. More broadly, CroptimizR connects crop models or platforms (e.g. STICS, DSSAT, APSIM, SQ2) with parameter estimation packages (e.g. BayesianTools, nloptr) through the development of wrappers, while providing functionalities specifically tailored to crop modeling. Further details on the available methods, criteria, and features can be found on the CroptimizR github page (https://github.com/SticsRPacks/CroptimizR/) and in the package documentation. An example of the protocol applied using this package is also available here: https://sticsrpacks.github.io/CroptimizR/articles/AgMIP_Calibration_Protocol.html.

References see publication page

Maybe to add here links to all our papers (and in future also papers applying the protocol) with their doi links and very short statement of the key development (or testing) the paper involves?

Seidel, S. J.; Palosuo, T.; Thorburn, P.; Wallach, D. (2018): Towards improved calibration of crop models – Where are we now and where should we go? In: European Journal of Agronomy 94, S. 25–35. DOI: 10.1016/j.eja.2018.01.006.


 


Twenty Twenty-Five

email@example.com
+1 555 349 1806