Comparison of Daily Precipitation Bias Correction Methods Based on Four Regional Climate Model Outputs in Ouémé Basin, Benin

Abstract: Precipitation projections from regional climate models (RCMs) in West Africa show significant biases relative to observations. This study evaluates six precipitation bias correction methods applied to the outputs of four RCMs (CCLM, CRCM, RACMO and REMO) in the Ouémé basin. The methods fall into three groups: the Delta approach, the Linear Scaling method and the quantile approaches. Corrected and uncorrected RCM precipitation data were compared with observations using the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). The findings show that the raw RCM outputs are affected by several biases. In general, the models overestimate precipitation. For daily precipitation, the quantile approaches that assume a gamma distribution were not able to reduce the biases. Empirical quantile mapping and adjusted quantile mapping were the most effective at correcting the biases of daily precipitation. The adjusted quantile mapping can therefore be used to correct biases in precipitation projections when modeling the future availability of water resources.


Introduction
Managing future freshwater resources under a changing climate, with highly uncertain future atmospheric greenhouse gas emissions, is a daunting challenge facing society today. Several studies on climate change impacts have shown that water resources are affected in all investigated areas [1][2][3]. These changes affect critical sectors such as water supply, agriculture and energy in many regions of the world, especially in West Africa where agriculture is mainly rain-fed. The decreasing rainfall and devastating droughts in the Sahel region (West Africa) during the last decades of the 20th century, with peaks in 1972-1974 and 1983-1985 [4], are a clear illustration. Hydrological models are among the most powerful tools available for assessing the impact of climate change on water resources. However, hydrological simulations of scenario projections require adequate projected fields of meteorological forcing variables such as precipitation. Several researchers have demonstrated that raw output from regional climate models (RCMs) cannot be used directly as input for impact models because of systematic biases [4][5][6][7][8]. These errors originate from different sources, such as (1) errors transferred from GCMs to RCMs (boundary problem), (2) insufficiently resolved surface properties (such as orography) and (3) errors due to numerical resolution and parameterization [9][10][11][12][13]. RCM errors also depend on the simulated variable and may be large for precipitation because of its highly nonlinear nature and large spatial variability, which make them strongly dependent on model resolution [12].
To bring climate model output to the desired scale (grid versus local scale) and accuracy (model biases), different bias correction approaches have been developed; an overview can be found in Themeßl et al. [13]. One bias correction method commonly used in climate change impact studies is the "delta change approach", also called the perturbation method [12,14,15]. This method generates climate scenarios by adding the climate change signal from an RCM simulation to daily or monthly observations. It is widely used in climate change impact studies [8,16,17,18]. Besides the delta approach, other authors have used quantile mapping, or histogram equalization, which fits a probability density function (PDF) or cumulative distribution function to the modeled and observed data [5,7,13,16,17,18,19,20]. The applicability of these bias correction methods, which range from the simple scaling approach to sophisticated distribution mapping [21], has not been investigated in the Ouémé basin. Identifying an appropriate bias correction method is therefore necessary before evaluating the impact of climate change on water resources in the basin; this study aims to bridge that gap.
This study evaluates the performance of six precipitation bias correction methods applied to four RCM simulations in the Ouémé basin. These methods include the most frequently used bias correction approaches.
The rest of the paper is structured as follows: Section 2 introduces the study area and data; Section 3 describes the bias correction methods for precipitation and the intercomparison parameters; Section 4 presents the results and discussion, followed by conclusions in Section 5.

Study Area
The present study focuses on the Ouémé river basin at the Bonou outlet (Figure 1). The basin lies between latitudes 7°58' and 10°12' North and longitudes 1°30' and 3°05' East, and covers an area of 49,256 km². Rainfall is mainly controlled by the atmospheric circulation of two air masses (the Harmattan and the monsoon) and their seasonal movement. It follows two regimes, from bimodal in the south to unimodal in the north. The mean annual rainfall from 1960 to 2013 is 1200 mm/year in the northern part of the basin, close to the value estimated by Lawin et al. [22] for the period 1954 to 2005. For the southern part of the basin, the annual mean is 1275 mm/year. The average discharge of the main watercourse from 1960 to 2013 is approximately 50 m³/s at the Bétérou hydrometric station and 190.75 m³/s at the Bonou station. The relief is characterized by altitudes varying from 273 to 480 m and a slope of 0.5 m/km. The form factor of the basin is 1.37 and the drainage density is 0.12 km/km² [23].

Data
Daily precipitation amounts from four regional climate models were used in this study. These models are REMO2009, RACMO22T, CRCM5 and CCLM4, as presented in Table 1. Their data are available through the Coordinated Regional Climate Downscaling Experiment (CORDEX) over Africa at 0.44° resolution for the period 1950 to 2100 [24] and have already been used over Africa [3,8,17,25,26,27,28]. Usually, only REMO data are used in impact studies in the Ouémé basin [29]. The other three RCMs were chosen to test their ability to reproduce the rainfall cycle in the Ouémé basin for future impact studies. We used the RCM projections under the most extreme IPCC scenario, RCP8.5 (except for CRCM5, which was not run for RCP8.5 over Africa), and under the intermediate scenario RCP4.5, both available for the period 2006-2100 in the CORDEX database, together with the historical data of these models for the period 1960-2005. All these data are available online in the CORDEX database (http://www.cordex.org).
The observed daily rainfall used in this study was provided by the National Meteorology Agency of Benin (Météo Bénin) for the period 1960-2013 for eighteen rain gauges located as shown in Figure 1. Since the RCM output for a given grid cell is the spatial mean over that cell, for each rain gauge we extracted the RCM data of the grid cell containing the gauge.

Delta Method
The first bias correction method used in this study is the delta approach (simply called Delta in the following text). This conventional way of constructing a precipitation time series for a future climate perturbs an observed data series with a projected future climate change [30,31]. The long-term mean changes are calculated and added to the observation records. This method is sometimes referred to as the direct method in the scientific literature. The method was applied in two steps. First, we corrected the data of the reference period (calibration period), which runs from 1960 to 1993. For this period, the delta approach is defined as: These two steps constitute the delta approach as defined by Déqué [14]. The delta method removes biases in the mean but not in the coefficient of variation of the modeled precipitation [32]. The variance in the future climate is kept the same as under the present climate, which is unlikely to hold [33]. In the following text, 1960-1993 is referred to as the calibration period and 1994-2005 as the validation period.
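The perturbation idea described above can be illustrated with a minimal Python sketch. It is a generic illustration, not the exact formulation of Déqué [14]: the function name `delta_change`, the pooling of days into a single sample, the clipping at zero and the optional ratio form (a common variant for precipitation) are all assumptions made here.

```python
from statistics import mean


def delta_change(obs, sim_control, sim_future, multiplicative=False):
    """Perturb observed daily precipitation (mm/day) with the simulated
    change in the long-term mean between a control and a future period.

    The three inputs are plain lists of daily values, e.g. all days of one
    calendar month pooled over the period (the pooling is an assumption).
    """
    if multiplicative:
        # Ratio form, often preferred for precipitation (assumption, not from the text).
        factor = mean(sim_future) / mean(sim_control)
        return [p * factor for p in obs]
    # Additive form, following "calculated and added to the observation records".
    change = mean(sim_future) - mean(sim_control)
    # Clipping at zero is an added safeguard against negative rainfall.
    return [max(p + change, 0.0) for p in obs]
```

Because the additive form shifts every observed value by the same amount, the spread of the observed series is unchanged apart from the clipping, which is the limitation noted above for the variance of the future climate.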

Linear Scaling Approach
The linear scaling method (simply called Scaling) aims to perfectly match the mean of corrected values with that of observed ones [18,32]. Precipitation is typically corrected with a multiplier term on a monthly basis. This approach is defined as follows [18]
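A common form of the monthly multiplier is the ratio of the observed to the simulated long-term monthly mean over the calibration period. The sketch below illustrates that form; it is not necessarily the exact expression of [18], and the function names and the handling of months with no simulated rain are assumptions.

```python
from collections import defaultdict
from statistics import mean


def monthly_scaling_factors(obs, sim, months):
    """Return {month: mean(obs) / mean(sim)} from calibration-period data.

    obs, sim and months are equal-length lists; months[i] is 1..12.
    """
    obs_by_month = defaultdict(list)
    sim_by_month = defaultdict(list)
    for o, s, m in zip(obs, sim, months):
        obs_by_month[m].append(o)
        sim_by_month[m].append(s)
    factors = {}
    for m in obs_by_month:
        sim_mean = mean(sim_by_month[m])
        # A month with no simulated rain cannot be rescaled; leave it unchanged (assumption).
        factors[m] = mean(obs_by_month[m]) / sim_mean if sim_mean > 0 else 1.0
    return factors


def apply_scaling(sim, months, factors):
    """Multiply each simulated daily value by the factor of its month."""
    return [s * factors.get(m, 1.0) for s, m in zip(sim, months)]
```

By construction the corrected monthly means equal the observed ones over the calibration period, while the number of wet days is left untouched because zero values remain zero.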

Quantile Mapping Methods
We evaluated four variants of quantile matching to correct the daily precipitation of the RCM outputs. Quantile matching adjusts all moments of the probability distribution function (PDF) of a model variable [5][6][7] by using the PDF of the observations, integrating both PDFs to cumulative distribution functions (CDFs) and constructing a transfer function. This transfer function translates the raw model output into corrected output. After the correction, the CDF of the model should be equal to the observed one [7].
The reference method uses empirical distribution functions [7,14,34] and is hence referred to as empirical quantile mapping (EQM). This method is expected to produce the best correction, but it depends on many degrees of freedom and may not be stationary because of possible overfitting [7]. For climate change applications, however, it is assumed that the transfer function stays constant with time [5], which is not a trivial assumption [35]. This classical quantile matching method is constructed as follows:
$$y = F_{obs}^{-1}\left(F_{RCM}(x)\right) \qquad (7)$$
where $y$ is the corrected precipitation value, $x$ the precipitation value to be corrected, $F_{obs}^{-1}$ the inverse of the CDF of the observations and $F_{RCM}$ the CDF of the RCM used. The probability of observing $x$ millimeters per day (or less) in the RCM is thus transferred to the quantile of the observed CDF that matches exactly this probability.
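The transfer function described above can be sketched with empirical CDFs built from sorted calibration samples. This is a minimal illustration, not the authors' implementation: the function name `eqm_correct`, the step-function definition of the empirical CDF and the clamping of values outside the calibration range are assumptions.

```python
from bisect import bisect_right


def eqm_correct(x, sim_cal_sorted, obs_cal_sorted):
    """Map a simulated value x to the observed quantile with the same
    empirical non-exceedance probability.

    sim_cal_sorted, obs_cal_sorted: calibration samples sorted ascending.
    """
    n_sim = len(sim_cal_sorted)
    n_obs = len(obs_cal_sorted)
    # k / n_sim is the empirical F_RCM(x): share of simulated values <= x.
    k = bisect_right(sim_cal_sorted, x)
    # Inverse empirical CDF of the observations: smallest observed value
    # whose empirical CDF is >= k / n_sim, i.e. index ceil(k * n_obs / n_sim) - 1.
    idx = (k * n_obs + n_sim - 1) // n_sim - 1
    # Values outside the calibration range are clamped, not extrapolated (assumption).
    idx = min(max(idx, 0), n_obs - 1)
    return obs_cal_sorted[idx]
```

Applied to the calibration sample itself, the mapping reproduces the observed sample when both have the same length, which is the sense in which the corrected CDF equals the observed one. The clamping step is one place where the stationarity assumption matters, since future values may exceed anything seen during calibration.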
The quantile-quantile transformation is more flexible than the previous methods and has been widely used for correcting biases in simulated meteorological variables [14,36,37]. Within this context, Amengual et al. [38] developed a new quantile-quantile calibration method based on a nonparametric function that amends mean, variability and shape errors in the simulated cumulative distribution functions (CDFs) of the climatic variables. The procedure consists of calculating the changes, quantile by quantile, in the CDFs of daily RCM outputs between an x-year control period and successive x-year future time slices. These changes are rescaled based on the observed CDF for the same control period and then added, quantile by quantile, to these observations to obtain new calibrated future CDFs that convey the climate change signal. The choice of x depends on the length of the available observation datasets, but the chosen x-year period must be climatologically meaningful [38]. In this study, we chose 15-year periods because of the limited length of the observed database for the reference period (33 years) and for consistency with these authors.
The statistical adjustment is written as a relationship between the i-th ranked values $P_i$ (projected or future calibrated), $O_i$ (control observed or baseline), $S_{ci}$ (raw control simulated) and $S_{fi}$ (raw future simulated) of the corresponding CDFs. This is only a summary of the method, called here AQM (Adjusted Quantile Mapping); all details can be found in Amengual et al. [38].
Amengual et al. [38] proposed $IQR_O$ (interquartile range of the observed data) and $IQR_{S_c}$ (interquartile range of the raw control simulated data) as surrogates of the population variability. IQR is the difference between the 75th (P75) and 25th (P25) percentiles for all variables except precipitation, for which they proposed using the 90th (P90) and 10th (P10) percentiles because of the highly asymmetrical gamma-type distribution of this variable, with its high proportion of non-rainy days. Factor $g$ modulates the variation in the mean state $\bar{\Delta}$, while $f$ calibrates the change in variability and shape expressed by $\Delta'_i$.
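One reading of this adjustment, given here as a hedged sketch rather than a verified transcription of Amengual et al. [38], is P_i = O_i + g·mean(Δ) + f·Δ'_i, with Δ_i = S_fi − S_ci, Δ'_i = Δ_i − mean(Δ), g = mean(O)/mean(S_c) and f = IQR_O/IQR_Sc. The function names, the interpolated percentile, the clipping at zero and the parameters `q_low` and `q_high` are assumptions introduced for illustration.

```python
from statistics import mean


def percentile(sorted_sample, q):
    """Linearly interpolated percentile of a sorted sample, q in [0, 100]."""
    pos = (len(sorted_sample) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    return sorted_sample[lo] + (sorted_sample[hi] - sorted_sample[lo]) * (pos - lo)


def aqm_future_cdf(obs, sim_control, sim_future, q_low=10, q_high=90):
    """Return ranked calibrated future values P_i for one time slice.

    All three samples must have the same length (same x-year window).
    q_low/q_high default to P10/P90, the range proposed for precipitation.
    """
    if not (len(obs) == len(sim_control) == len(sim_future)):
        raise ValueError("samples must have equal length")
    o, sc, sf = sorted(obs), sorted(sim_control), sorted(sim_future)
    # Quantile-by-quantile change between control and future simulations.
    delta = [f_i - c_i for f_i, c_i in zip(sf, sc)]
    delta_mean = mean(delta)
    # g rescales the mean change, f the change in variability and shape.
    g = mean(o) / mean(sc)
    f = (percentile(o, q_high) - percentile(o, q_low)) / (
        percentile(sc, q_high) - percentile(sc, q_low)
    )
    # Clipping at zero avoids negative precipitation (assumption).
    return [
        max(o_i + g * delta_mean + f * (d_i - delta_mean), 0.0)
        for o_i, d_i in zip(o, delta)
    ]
```

When the future simulation equals the control simulation, every Δ_i is zero and the function returns the sorted observations, so the calibrated series carries only the simulated change and not the raw model bias.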
Another way of using the quantile-quantile transformation to correct the bias of RCM data is to replace the empirical CDFs with a parametric distribution. The gamma distribution is commonly used to represent the PDF of precipitation [5,7,33] and depends on only two parameters. However, the gamma distribution does not represent daily precipitation adequately for every region, as shown by Vlček and Radan [39] for some parts of Europe. Since this study aims to compare bias correction methods, we did not test the goodness of fit; we assumed that the daily precipitation distribution can be represented by a gamma distribution. This method will be called GQM. The gamma distribution is defined as:
$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad x > 0$$
where $\beta$ is the scale parameter, $\alpha$ the shape parameter and $\Gamma$ the gamma function. The gamma distribution is not defined for x = 0 mm/day. The correction is therefore a two-step process [7,19]. First, the number of dry days is corrected by optimizing a threshold value $s_0$; all values smaller than this threshold are set to zero. Second, gamma PDFs are fitted to the remaining wet-day values, integrated, and the resulting CDFs are used to replace the empirical CDFs in equation (7).
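The first step, the dry-day correction, can be sketched as follows. The text only states that the threshold is optimized; the rule used here (choose s0 so that the simulated dry-day frequency matches the observed one) is an assumption, as are the function names and the `obs_dry_limit` parameter.

```python
import math


def dry_day_threshold(obs, sim, obs_dry_limit=0.0):
    """Choose s0 so that setting simulated values below s0 to zero
    gives the same dry-day frequency as in the observations."""
    n_dry_obs = sum(1 for p in obs if p <= obs_dry_limit)
    # Target number of dry days, rescaled in case the series differ in length.
    target = round(n_dry_obs * len(sim) / len(obs))
    sim_sorted = sorted(sim)
    if target >= len(sim_sorted):
        return math.inf  # every simulated day would have to be dry
    # Values strictly below the (target + 1)-th smallest value become dry.
    return sim_sorted[target]


def apply_dry_day_threshold(sim, s0):
    """Set all simulated values smaller than s0 to zero."""
    return [0.0 if p < s0 else p for p in sim]
```

This step only removes spurious light rain. If a model simulates fewer wet days than observed, a threshold of this kind cannot create the missing wet days.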
The last quantile method replaces the empirical PDFs (or CDFs) with a combination of a gamma distribution and a GPD (generalized Pareto distribution). This method is referred to as GPQM in the following text. The generalized Pareto distribution is a heavy-tailed extreme value distribution [7], and its density is defined as follows:
$$f(x) = \frac{1}{\sigma(s)}\left(1 + \xi\,\frac{x - s}{\sigma(s)}\right)^{-\left(\frac{1}{\xi}+1\right)}, \quad x > s \qquad (12)$$
with $s$ as the threshold, $\sigma(s)$ as the reparameterized scale parameter and $\xi$ as the shape parameter.
As a first step, the same threshold $s_0$ used for the dry-day correction in GQM is applied here to correct the dry days [7]. Second, for the threshold $s$, we use the 95th percentile as proposed by Yang et al. [33] and used by Gutjahr et al. [7]. This means that values smaller than the 95th percentile are assumed to follow a gamma distribution, whereas values larger than this threshold are assumed to follow a generalized Pareto distribution:
$$y = \begin{cases} F_{obs,gamma}^{-1}\left(F_{RCM,gamma}(x)\right) & \text{if } x < \text{95th percentile} \\ F_{obs,GPD}^{-1}\left(F_{RCM,GPD}(x)\right) & \text{if } x \geq \text{95th percentile} \end{cases}$$
where $y$ is the corrected precipitation value and $x$ the precipitation value to be corrected. There are thus four parameters to estimate, for the observations and for the RCM used, respectively: the scale ($\beta$) and shape ($\alpha$) parameters of the gamma distribution and the scale ($\sigma$) and shape ($\xi$) parameters of the GPD. With the threshold $s$, five parameters have to be estimated.
To detect any distortion of the RCM's temporal structure by the bias correction methods, we consider the root mean square error (RMSE), the mean absolute error (MAE), the mean (µ), the 90th percentile (P90), the probability of a wet day ($P_{wet}$) and the standard deviation (σ) of the corrected and uncorrected time series from the RCM simulations. Both the RMSE and the MAE are regularly employed in model evaluation studies. Willmott and Matsuura [40] suggested that the RMSE is not a good indicator of average model performance and might be a misleading indicator of average error, so that the MAE would be a better metric for that purpose. Chai et al. [41] argued that the RMSE is not ambiguous in its meaning, contrary to the claim of Willmott and Matsuura [40]. There is therefore no consensus on the most appropriate metric for model errors. To avoid ambiguity, we compared the performance of the methods using both metrics.
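For reference, the two error metrics and the wet-day probability can be computed as in the short sketch below. The function names and the 0.1 mm/day wet-day threshold are illustrative assumptions; the text does not state which threshold was used.

```python
import math


def mae(sim, obs):
    """Mean absolute error between paired simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root mean square error between paired simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def wet_day_probability(series, wet_threshold=0.1):
    """Fraction of days with precipitation at or above wet_threshold (mm/day).

    The 0.1 mm/day default is a hypothetical choice for illustration.
    """
    return sum(1 for p in series if p >= wet_threshold) / len(series)
```

With a single large error, for example simulated values [1, 2, 3, 10] against observed values [1, 2, 3, 2], the MAE is 2 while the RMSE is 4. The RMSE weights large deviations more heavily, which is the core of the disagreement between [40] and [41].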

Ability of RCMs to Reproduce Past Rainfall
Figure 2 shows the annual rainfall cycle of the observations and of the raw simulations of the four RCMs for the eighteen stations used in this study. The ability of the RCMs to reproduce the precipitation cycle depends on the gauge station. CRCM and REMO overestimate the precipitation amount in the basin. The overestimation is very large between June and September, the period of heavy rain in the northern part of the basin. Despite this overestimation, these two RCMs reproduce the shape of the observed annual precipitation cycle well. Indeed, the unimodal character of the annual precipitation cycle at most stations of the basin is preserved by REMO and CRCM. The period of extreme precipitation (from July to September) is also preserved by these RCMs for the stations situated in the northern part of the basin. RACMO and CCLM, in contrast, underestimate the rainfall amount at most stations with a unimodal rainfall regime. At these stations (Bassila, Bembèrèkè, Birni, Djougou, Ina, Kouandé, Natitingou, Parakou), the underestimation worsens between June and September, the opposite of the behavior of REMO and CRCM. However, RACMO and CCLM reproduce the shape of the seasonal precipitation cycle in the southern part of the basin, which is characterized by a bimodal seasonal cycle.
Despite these shortcomings in reproducing past rainfall amounts, the REMO, RACMO, CRCM and CCLM models generally simulate the annual cycle of precipitation well. The rainfall predicted by these models can therefore be used for impact studies, provided the data are first improved by bias correction.
The performance of a given method in reducing the bias depends on the rain gauge station and the RCM considered. At the daily scale, Delta, Scaling, EQM and AQM reduce the bias of the REMO, CRCM, CCLM and RACMO models, whereas GQM and GPQM degrade the quality of the raw RCM data.
The performances of the methods in calibration and validation are practically the same. This is due to the length of the calibration period (34 years against 12 years for validation), which makes the bias correction transfer functions of this period very representative of the study period. We therefore show the rainfall cycle only for the validation period. At most stations, the Adjusted Quantile Mapping (AQM) reduces the biases more than the other methods. Figures 5, 6, 7 and 8 show the cycle of observed, uncorrected and corrected precipitation for the REMO, CCLM, CRCM and RACMO regional climate models, respectively, for the validation period.

Figure 5. Rainfall intensity for the validation period 1994-2005 as observed (black), simulated raw by REMO (red) and corrected using the Delta approach (yellow), Scaling (blue), AQM (green), EQM (cyan), GQM (magenta) and GPQM (blue dashed) for eighteen stations in the Ouémé catchment.
For REMO data, EQM gives the better correction at the Bassila, Kouandé and Parakou stations; at most other stations, and across the four RCMs used, AQM outperforms the other methods. These three stations are situated in the northern part of the basin, where the rainfall cycle is unimodal and is well preserved by the REMO simulation.
For CCLM data, AQM outperforms the other bias correction methods at every station. Unlike REMO, CCLM predicts a bimodal rainfall cycle at all the evaluated stations of the basin. In this context, thanks to its flexibility, AQM reduces the bias of the raw data.

Figure 7. Rainfall intensity for the validation period 1994-2005 as observed (black), simulated raw by CRCM (red) and corrected using the Delta approach (yellow), Scaling (blue), AQM (green), EQM (cyan), GQM (magenta) and GPQM (blue dashed) for eighteen stations in the Ouémé catchment.
Like REMO, CRCM predicts a unimodal rainfall cycle for all the stations. EQM performs well at a few stations, but in general AQM gives the most successful bias correction. The Scaling method performs similarly to EQM. For the raw precipitation of all four RCMs, GQM and GPQM, which assume a gamma distribution for daily precipitation, perform worse than the raw model outputs (Figures 3 and 4).
All the bias correction methods except GQM and GPQM improve the raw RCM-simulated precipitation; however, there are some differences in their corrected statistics. Unlike the other methods, GQM and GPQM increase the precipitation amounts predicted by the RCMs, which are already overestimated compared with the observed precipitation. This inability of GQM and GPQM to correct the RCM simulations means that the gamma distribution, which is the basis of these two methods, does not represent daily precipitation in the basin. The Delta, Scaling, EQM and AQM approaches reduce the bias of the raw simulated data for the four RCMs used in the study. All these methods adequately correct the extreme values, especially the 90th percentile (Table 2). The number of dry days was also reduced for the four RCMs by all bias correction methods. Comparing MAE and RMSE (Figures 3 and 4) shows that the Scaling approach is also suited to correcting the biases of the RCM-simulated precipitation. However, this approach does not adequately reduce the biases of the precipitation peaks, nor does it reduce the number of dry days. For all RCMs used, the quantile approaches AQM and EQM give a good correction of peak values, especially the 90th percentile. These methods adequately correct the high variability of the raw simulated data.

Future Rainfall Predicted Trends
Using the AQM method, we corrected the projection data of each model for the period 2006-2100. Figures 9, 10, 11 and 12 show the precipitation trends for the two chosen IPCC scenarios, RCP4.5 and RCP8.5.
The CRCM model was not run for the RCP8.5 scenario over Africa. The actual precipitation trend in the basin is ambiguous. In the southern part of the basin, no significant trend is noted. In the north of the basin, however, CRCM under scenario RCP4.5 predicts a slight decrease of precipitation at the end of the 21st century.

Discussion
Bias is defined as the long-term average difference between model and observation [20]. Bias correction was applied to RCM-simulated precipitation at eighteen stations. The behavior of the models in simulating climate variables naturally depends on the location of the evaluated stations in the basin. The correction methods that assume a gamma distribution for daily precipitation (GQM and GPQM) failed to correct the daily precipitation amounts during the months of heavy downpours (June to September). This means that the gamma distribution does not represent daily precipitation in our study area. A similar result was reported by Vlček and Radan [39], who showed that the gamma distribution does not adequately represent daily precipitation in every region of Europe and that it is more frequently accepted in spring and summer than in winter and autumn.
The linear scaling approach (simply called Scaling in this paper) estimates the mean correctly but slightly underestimates the 90th percentile and the standard deviation. It also does not reduce the number of wet days well. The overestimation of low precipitation has already been shown by Fang et al. [18], who indicated that the linear scaling method has a very limited ability to reproduce dry-day precipitation. The results of this study confirm those of Teutschbein and Seibert [21], who found that the linear scaling method does not adjust the standard deviation and the percentiles. This means that the scaling approach is not good at correcting the RCM outputs.
The Delta method overestimates the number of dry days but removes the bias of precipitation in the mean. The ability of this method to correct the mean of precipitation has been demonstrated by Lenderink et al. [32] and Wetterhall et al. [16].
The last two methods used for the correction of daily precipitation are the EQM and AQM quantile methods. These methods are very good at reducing the biases of daily precipitation simulated by the four RCMs used. They reduce the number of wet days, which confirms the findings of Themeßl et al. [13], who showed that quantile mapping corrects the frequency of dry days adequately. The extreme values (P90 for precipitation) are also well corrected. This ability to correct precipitation biases is due to the flexibility of the quantile mapping approach, which was originally designed for daily precipitation [13]. The bias correction is likely to remove the known drizzle error [42] and to reduce the precipitation bias of the RCM. The quantile methods AQM and EQM thus appear to resolve these issues better than the other methods tested in this study.
Bias correction reduces the probability of detecting a climate change signal, because the correction weakens the signal while the variability remains [7]. However, the AQM-corrected data show an increasing rainfall trend, in line with the IPCC [43] finding of increasing precipitation towards the end of the 21st century. These results also confirm those of Kaboré et al. [8], who found an increasing trend in the annual rainfall amounts of future model projections over the period 2006 to 2050. CCLM shows a reduction in precipitation at the end of the century over the whole basin. This finding is consistent with Dosio and Panitz [44], who, using CCLM, predicted a significant reduction of precipitation at the end of the century in West Africa.

Conclusion
We evaluated the ability of six daily precipitation bias correction methods to reduce the biases of the outputs of four regional climate models. All methods except GQM and GPQM reduce the bias of the raw data provided by the RCMs used. The failure of GQM and GPQM is likely due to the gamma assumption on which these methods are based. The Delta approach remains effective at removing the bias of precipitation in the mean, but it overestimates the number of dry days. Linear scaling adequately reduces the biases of daily precipitation; however, it overestimates the number of wet days. The ability of the quantile methods AQM and EQM to correct the biases of daily precipitation has been established. Based on the common parameters MAE, RMSE, mean and extreme values (90th percentile), AQM and EQM are excellent at reducing the biases of precipitation. AQM nevertheless corrects better than EQM, which makes it the most recommended method for correcting the bias of RCM outputs in climate change impact studies, which are often conducted at the local scale. Further studies should evaluate the influence of the bias correction of RCM precipitation data on projected runoff in the Ouémé basin.