On the Use of Simple Scaling Stochastic (SSS) Framework to the Daily Hydroclimatic Time Series in the Context of Climate Change

The climate change hypothesis has pushed the scientific community to question the behavior of classical statistics (mean, variance, standard deviation, covariance, etc.) in the hydroclimatic field. Many studies have revealed that the climate has always changed and that these changes are closely related to the Hurst phenomenon detected in long hydroclimatic time series, which in stochastic terms is equivalent to a simple scaling behavior of climate variability over time scale. A new statistical framework that takes this climatic variability into account is now applied. Most studies work at the annual scale, where variability at finer scales is not taken into account. This paper tests the validity of the new statistical framework at a finer time scale: the daily scale. Twelve daily time series of flows, rainfall and temperatures, each with 18,628 observations, were studied. Four methods, namely the Rescaled Range Statistic (R/S) method, the modified R/S method, the Aggregated Variance method and the Aggregated Standard Deviation (ASD) method, were applied to determine the Hurst exponent (H). All methods lead to the conclusion that the investigated time series exhibit long-term persistence. Contrary to annual time series, whose variability corresponds to a Simple Scaling Stochastic (SSS) process, the daily time series seem to correspond to a process having both an SSS component and a deterministic component.


Introduction
In the last three decades, climate change has been the subject of intensive scientific research. The focus is on understanding the factors, mechanisms and processes related to climate, and on climate modeling at the global scale. Another important field of recent research is the detection and attribution of changes in past climate. Many hydrologists agree that climate change results from human activities, but recent research has led to the strong conclusion that climate has always, throughout the Earth's history, changed irregularly on all time scales [1]. The same author nevertheless notes that change has undoubtedly accelerated in modern times due to radical developments in demography, technology and living conditions. He also classifies change with respect to its predictability. Change is regular in simple systems (left part of Fig. 1). Regular change can be periodic or non-periodic; either way, it is predictable using the equations of dynamical systems, but this kind of change is rather trivial. More interesting are the more complex systems with long time horizons (right part of Fig. 1), where change is unpredictable in deterministic terms, or random.
Fig. 1. Classification of change with respect to predictability [1].
For [2], falling and rising local trends can be regarded as climate changes or variations, which many consider to be deterministic components of climatic time series. According to this author, climate under change cannot be predicted in deterministic terms [1][2][3]. In all cases, these changes are irregular and, in the absence of an accurate deterministic model that could explain them and predict their future, are better modeled as stochastic fluctuations on many time scales. Under climate change, do the typical statistics used in hydroclimatology (such as mean, variance, standard deviation, cross-correlation and autocorrelation), which assume independent and identically distributed variables, remain consistent? According to [4], climate must be considered as a stochastic system. A stochastic basis for dealing with these shifts and trends is offered by Simple Scaling Stochastic (SSS) processes, which are consistent with the assumption of hydroclimatic fluctuations on multiple time scales, a behavior that is none other than the Hurst phenomenon discovered by [5].
Pure randomness, as in classical statistics, where the variables are independent and identically distributed, is sometimes a useful model, but in most cases it is inadequate [1]. In several studies, such as [2][6][7][8][9], the Hurst phenomenon was explained by the multi-scale variability of time series. A number of studies have identified the Hurst phenomenon in several environmental quantities such as (to mention a few of the more recent) wind power [10]; global mean temperatures [11]; flows of the Nile [2,12]; flows of the River Warta, Poland [13]; inflows of Lake Maggiore, Italy [14]; indexes of the North Atlantic Oscillation [15]; and tree-ring widths, which are indicators of past climate [6]. The Hurst phenomenon, which characterizes the dependence in time series, is interpreted as a "memory indicator". Thus, long-range dependence is interpreted as "long memory".
But the Hurst phenomenon is not necessarily an indicator of infinite memory of a process [16]. It is more insightful to interpret long-range dependence as long-term change [2]. According to [1][2], the fluctuations (i.e. changes) in time series can be regarded as a manifestation of the Hurst phenomenon.

Study Location and Data
The Ouémé catchment covers an area of 49,256 km² at the hydrometric station of Bonou, with a length of 523 km, representing 47.2% of Benin's area [17]. It extends from latitude 7°58′ to 10°12′ and from longitude 1°35′ to 3°05′ [18] (Fig. 2). The rainfall, which is mainly controlled by the atmospheric circulation of two air masses (the Harmattan and the monsoon) and their seasonal movement, follows two climate regimes: bimodal in the south and unimodal in the north [17]. The average annual rainfall (1960-2010) is 1204.77 mm at Bétérou and 1098.40 mm at Savè. The flow regime is characterized by high discharge during the rainy season. The maximum flow over 1952-2010 is around 267.88 m³/s at Bétérou and 478.87 m³/s at the Savè bridge. From November to May almost all the rivers dry up, and the average low flows range from 49 to 5 m³/s at the Savè station and from 17 to 2 m³/s at Bétérou [18].
The data used in this study comprise daily time series of river flows, rainfall and temperatures from 1 January 1960 to 31 December 2010 (18,628 days). Meteorological data (rainfall and temperature) were provided by the Benin Meteorological Department and ASECNA (Agency for Air Navigation Safety in Africa and Madagascar), and river flow data by the National Directorate of Water (DG-Eau). Spatialized regional daily average rainfall was obtained by kriging by [19]. A common characteristic of the four examples depicted in Fig. 3 is that a local multi-day average (30-day and 365-day average) is not stable but exhibits significant variability. In Fig. 3a and Fig. 3b, one observes an alternation of rising and falling trends in the moving averages of the runoff and rainfall series. For temperatures, in contrast, the moving averages of each time series show a rising trend. As noted above, these fluctuations can be regarded as evidence of long-term change in the time series. To verify the hypothesis of long-term change, we need to determine the Hurst exponent (H).
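The multi-day averages discussed above can be computed with a simple trailing moving average. The sketch below is a minimal illustration, assuming the daily values are held in a plain Python list; the function name `moving_average` is hypothetical and not taken from the original study.

```python
def moving_average(series, window):
    """Trailing moving average of `series` over `window` consecutive values.

    Returns len(series) - window + 1 averages (one per complete window).
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])          # sum of the first complete window
    averages = [running / window]
    for i in range(window, len(series)):
        # Slide the window by one day: add the new value, drop the oldest one.
        running += series[i] - series[i - window]
        averages.append(running / window)
    return averages
```

With a daily series `daily`, `moving_average(daily, 30)` and `moving_average(daily, 365)` would give the 30-day and 365-day averages of the kind plotted in Fig. 3.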

Estimation of Hurst Exponent
The Hurst exponent H (0<H<1) provides a measure of the intensity of the long-term persistence (LTP). Thus, it is used to classify time series according to their dependence structure [20].
If H = 1/2, the autocorrelations are zero and the spectral density is constant and positive. The process therefore has no LTP (white noise).
If 1/2<H<1, the autocorrelations are all positive and decrease hyperbolically to zero. The spectral density exhibits a pole at zero frequency. The series presents non-periodic cycles of all lengths: low frequencies dominate, and slow (non-periodic) cycles become more pronounced. The process exhibits LTP.
If 0<H<1/2, the autocorrelation alternates in sign and the spectral density, which is null at zero frequency, is dominated by high-frequency components. The process is anti-persistent.
Many methods for estimating the Hurst exponent are available. Some of these are described in detail in [21][22]. In this paper, we use only the Rescaled Range Statistic (R/S) method, the R/S method modified by [23], the aggregated variance method [22] and the Aggregated Standard Deviation (ASD) method.

The R/S Method
Introduced by [5] and developed in [24][25][26], the R/S statistic is certainly the best-known and most widely used method of estimating H. It is defined as the range of the partial sums of deviations of a time series from its mean, divided by its standard deviation. For a time series $Y_t$, $t = 1, 2, \ldots, T$, with mean $\bar{Y}_T$, the range R is defined by:

$$R_T = \max_{1 \le k \le T} \sum_{j=1}^{k} \left(Y_j - \bar{Y}_T\right) - \min_{1 \le k \le T} \sum_{j=1}^{k} \left(Y_j - \bar{Y}_T\right)$$

and the R/S statistic is $Q_T = R_T / S_T$, where $S_T$ is the standard deviation of the series.

In practice, this method involves several steps. First, one determines a sequence of integers $(k_i)_{i = 1, \ldots, m}$ of length m, arbitrarily chosen, such that $1 < k_m < \ldots < k_1 < n$; the sequence defined by [27] is used here. Next, the statistics $Q(k_i)$ are determined, and $\log Q(k_i)$ is plotted against $\log k_i$. Finally, one fits the line $\log Q(k_i) = a + b \log k_i + u$ by ordinary least squares, which gives the estimators of a and b; the estimated Hurst coefficient is H = b.

Various authors [24,26,28] have emphasized the superiority of R/S analysis over more traditional methods of detecting LTP, such as the study of autocorrelations, variance ratios and spectral analysis. [20] shows that R/S analysis can detect the presence of LTP even in a highly non-Gaussian time series.

[Figure panel: daily upper temperature with 30-day average, 365-day average and mean; axes: time (day), temperature (°C).]
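The R/S procedure described in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the block sizes are free parameters (the sequence of [27] is not reproduced), Q(k) is averaged over non-overlapping blocks of length k (an assumption), and the names `rs_statistic`, `hurst_rs`, `ols_slope` and `white_noise` are hypothetical.

```python
import math
import random


def white_noise(n, seed=0):
    """Hypothetical test input: n independent standard normal values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rs_statistic(block):
    """Range of partial sums of deviations from the mean, divided by the standard deviation."""
    n = len(block)
    mean = sum(block) / n
    partial_sums, running = [], 0.0
    for y in block:
        running += y - mean
        partial_sums.append(running)
    r = max(partial_sums) - min(partial_sums)
    s = math.sqrt(sum((y - mean) ** 2 for y in block) / n)
    return r / s if s > 0 else None      # undefined for a constant block


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def hurst_rs(series, block_sizes):
    """Estimate H as the slope b of log Q(k) against log k."""
    log_k, log_q = [], []
    for k in block_sizes:
        # Assumption: Q(k) is the mean R/S over non-overlapping blocks of length k.
        q_values = [rs_statistic(series[i:i + k])
                    for i in range(0, len(series) - k + 1, k)]
        q_values = [q for q in q_values if q is not None]
        if q_values:
            log_k.append(math.log(k))
            log_q.append(math.log(sum(q_values) / len(q_values)))
    return ols_slope(log_k, log_q)
```

For an uncorrelated series the estimate is expected to lie near 0.5, usually somewhat above it for short blocks, which is one reason the classical statistic is complemented by other estimators.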

Modified R/S Method (Lo Method)
Among the disadvantages of the R/S statistic proposed by Hurst, one can cite its sensitivity to the presence of short-term persistence (STP). To overcome this problem, [23] proposed another statistic, called the "modified R/S statistic", whose limit distribution is invariant to different forms of short-memory processes. This method allows testing the null hypothesis of STP only (no LTP) against the alternative of LTP. The modified R/S statistic of Lo has the following form:

$$\tilde{Q}_n = \frac{1}{\hat{\sigma}_n(q)} \left[ \max_{1 \le k \le n} \sum_{j=1}^{k} \left(X_j - \bar{X}_n\right) - \min_{1 \le k \le n} \sum_{j=1}^{k} \left(X_j - \bar{X}_n\right) \right], \qquad \hat{\sigma}_n^2(q) = S_n^2 + 2 \sum_{j=1}^{q} w_j(q)\, \hat{\gamma}_j$$

where $S_n^2$ and $\bar{X}_n$ are respectively the empirical variance and mean, $\hat{\gamma}_j$ is the empirical autocovariance at lag j, and

$$w_j(q) = 1 - \frac{j}{q+1}, \qquad j = 1, 2, \ldots, q$$

are the weights proposed by [29]. In practice, the selection of the integer q is a real problem. [30][31] have shown by Monte Carlo studies that when q is relatively large compared to the sample size, the estimator is biased, and therefore q must be chosen as a small integer; other Monte Carlo studies have shown that q = 1 is an acceptable choice. We therefore choose q = 1. Contrary to the classical R/S statistic, the limit distribution of the modified R/S statistic is known, and the statistic V defined by:

$$V = \frac{\tilde{Q}_n}{\sqrt{n}}$$

converges to the range of a Brownian bridge on the unit interval. It is therefore possible to perform a statistical test of the null hypothesis of short-term persistence against the alternative hypothesis of long-term persistence by referring to the table of critical values provided by [23].
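The test statistic V can be sketched as follows, assuming the usual form of Lo's statistic: the range of the partial sums divided by a standard deviation corrected with weighted autocovariances up to lag q, then by the square root of n. The function names `lo_v_statistic` and `white_noise` are hypothetical, and the critical value must be taken from the table in [23].

```python
import math
import random


def white_noise(n, seed=0):
    """Hypothetical test input: n independent standard normal values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def lo_v_statistic(series, q=1):
    """V = modified R/S statistic divided by sqrt(n), with weights w_j = 1 - j/(q+1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]

    # Range of the partial sums of deviations from the mean.
    partial_sums, running = [], 0.0
    for d in dev:
        running += d
        partial_sums.append(running)
    r = max(partial_sums) - min(partial_sums)

    # Empirical variance plus weighted autocovariances up to lag q.
    variance = sum(d * d for d in dev) / n
    for j in range(1, q + 1):
        gamma_j = sum(dev[i] * dev[i - j] for i in range(j, n)) / n
        variance += 2.0 * (1.0 - j / (q + 1)) * gamma_j
    if variance <= 0:
        raise ValueError("non-positive corrected variance; try a smaller q")

    return r / math.sqrt(variance) / math.sqrt(n)
```

A value of V above the chosen critical value leads to rejecting the null hypothesis of short-term persistence only; with q = 0 the function reduces to the classical R/S statistic scaled by the square root of n.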

Aggregated Variance Method
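The aggregated variance method is based on aggregating the time series into blocks of size m. The series $X_i$, $i = 1, \ldots, N$, is divided into $N/m$ non-overlapping blocks and the mean of each block is computed:

$$X^{(m)}(k) = \frac{1}{m} \sum_{i=(k-1)m+1}^{km} X_i, \qquad k = 1, 2, \ldots, N/m$$

The sample variance of the block means, $\widehat{\mathrm{Var}}\, X^{(m)}$, is then calculated. The procedure is repeated for successive values of m. For a process with LTP, $\mathrm{Var}\, X^{(m)}$ is proportional to $m^{2H-2}$, so that the plot of $\log \widehat{\mathrm{Var}}\, X^{(m)}$ against $\log m$ is a line with slope 2H-2, which provides an estimator of H.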
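A minimal sketch of the aggregated variance estimator follows, assuming non-overlapping blocks and an ordinary least squares fit on the log-log plot. The block sizes are free parameters, and the names `block_means`, `hurst_aggregated_variance` and `white_noise` are hypothetical.

```python
import math
import random


def white_noise(n, seed=0):
    """Hypothetical test input: n independent standard normal values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def block_means(series, m):
    """Means of consecutive non-overlapping blocks of size m (incomplete tail dropped)."""
    n_blocks = len(series) // m
    return [sum(series[b * m:(b + 1) * m]) / m for b in range(n_blocks)]


def hurst_aggregated_variance(series, block_sizes):
    """Fit log Var(X^(m)) against log m; the slope equals 2H - 2."""
    log_m, log_var = [], []
    for m in block_sizes:
        means = block_means(series, m)
        if len(means) < 2:
            continue                      # variance undefined with fewer than two blocks
        mu = sum(means) / len(means)
        var = sum((x - mu) ** 2 for x in means) / (len(means) - 1)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    mx, my = sum(log_m) / len(log_m), sum(log_var) / len(log_var)
    slope = (sum((x - mx) * (y - my) for x, y in zip(log_m, log_var))
             / sum((x - mx) ** 2 for x in log_m))
    return 1.0 + slope / 2.0
```

For independent data the variance of the block means decays as 1/m, so the fitted slope is close to -1 and the estimate is close to H = 0.5.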

Aggregated Standard Deviation (ASD)
To apply the ASD method, we need to assess the standard deviation at several time scales. The method has several advantages: (1) easy understandability and transparency, which enable a better perception of the behavior and do not hide its implications; (2) simplicity and minimal parameterization (it involves no concept other than the standard deviation), which enable a probabilistic description of the concepts it uses and hence a statistical framework for estimation and testing; and (3) appropriateness, in terms of producing estimates within the interval (0, 1) [31].
Let $X_i$ be a stationary process in discrete time i (referring to days here) with standard deviation σ, and let

$$X_i^{(k)} = \frac{1}{k} \sum_{l=(i-1)k+1}^{ik} X_l$$

be the aggregated process at time scale k, with standard deviation $\sigma^{(k)}$. The LTP is expressed by the elementary scaling law:

$$\sigma^{(k)} = k^{H-1}\, \sigma \qquad (6)$$

Equation (6) corresponds to a stochastic process in discrete time termed the Hurst-Kolmogorov (HK) process. Its continuous-time form is [1,33]:

$$\sigma^{(k)} = \left(\frac{a}{k}\right)^{1-H} \sigma^{(a)} \qquad (7)$$

where a is any time scale, $\sigma^{(a)}$ is the standard deviation at scale a, and both a and k have units of time.
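As a short worked step, and assuming the elementary scaling law has the usual Hurst-Kolmogorov form in which the standard deviation at scale k equals the scale-1 standard deviation times k to the power H-1, taking logarithms shows why H can be read from the slope of a log-log plot, and why H = 1/2 recovers the classical law for independent variables:

```latex
\sigma^{(k)} = k^{H-1}\,\sigma
\;\Longrightarrow\;
\log \sigma^{(k)} = \log \sigma + (H-1)\,\log k ,
\qquad
H = \tfrac{1}{2} \;\Longrightarrow\; \sigma^{(k)} = \frac{\sigma}{\sqrt{k}} .
```

The slope of the line is therefore H-1, which lies between -1 and 0 for any admissible H in (0, 1).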
To determine H, we use the algorithm of [2], which by construction ensures appropriate estimates of H. For comparison, we add another common stochastic process: the simple Markov process or, in discrete time i, the autoregressive process of order 1 (AR(1)), which is the most common example of short-term persistence (STP). Its dependence is expressed at scale 1 as

$$x_i = \rho\, x_{i-1} + v_i \qquad (9)$$

where ρ stands for the lag-one autocorrelation coefficient (-1 < ρ < 1) and $v_i$ (i = 1, 2, …) are independent, identically distributed random variables. In this case the standard deviation at scale k is given by [6,32]:

$$\sigma^{(k)} = \frac{\sigma}{\sqrt{k}} \sqrt{\frac{1 - \rho^2 - 2\rho\,(1 - \rho^k)/k}{(1 - \rho)^2}}$$

Table 1 presents the results of the estimation of H with the four methods described above. To determine whether the estimated H values are significantly higher than 0.5 (i.e. whether the time series have long-term dependence), the statistic V is compared to the critical values provided by [23], which for a one-sided test are around 1.620 and 1.747 for the 10% and 5% significance levels, respectively. The values of the statistic V are between 3.82 and 25.58. We therefore conclude that the time series exhibit LTP behavior.
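The contrast between the AR(1) benchmark and the HK process, which underlies the comparison of ρ and H in the results, can be made concrete numerically. The sketch below assumes the standard closed form for the standard deviation of the mean of k consecutive values of a stationary AR(1) process and the usual HK power law; the function names `hk_std` and `ar1_std` are hypothetical.

```python
import math


def hk_std(sigma, hurst, k):
    """HK scaling law: standard deviation at scale k is sigma * k**(H - 1)."""
    return sigma * k ** (hurst - 1.0)


def ar1_std(sigma, rho, k):
    """Standard deviation of the k-step average of a stationary AR(1) process."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    ratio = (1.0 - rho ** 2 - 2.0 * rho * (1.0 - rho ** k) / k) / (1.0 - rho) ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return sigma / math.sqrt(k) * math.sqrt(max(ratio, 0.0))
```

With ρ = 0 the AR(1) expression reduces to the classical law (σ divided by the square root of k), which is also the HK law for H = 0.5. For large k the AR(1) standard deviation still decays like the inverse square root of k, whereas the HK standard deviation with H > 0.5 decays more slowly; this is the sense in which a high ρ alone does not imply long-term persistence.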
The H values for the three rainfall time series (Bétérou, Savè, Bonou) are of the same order of magnitude, whatever the method used. In the Ouémé at Bétérou basin, H values vary from 0.67 (R/S method) to 0.63 (ASD method) with an average of 0.65. In the Ouémé at Savè basin, the values of H vary from 0.67 (R/S method) to 0.64 (aggregated variance method) with an average of 0.65. In the Ouémé at Bonou basin, H ranges from 0.69 (R/S method) to 0.64 (aggregated variance method) with an average of 0.67. The lag-one correlation coefficients are also low: of the order of 0.224 in the Ouémé at Bétérou basin, 0.386 in the Ouémé at Savè basin and 0.412 in the Ouémé at Bonou basin. In summary, the daily rainfall series exhibit LTP behavior, but its intensity is low (low values of H). These time series therefore undergo only small changes, which is consistent with the low fluctuations of the moving averages shown in Fig. 3b.

Regarding the flows, the H values obtained with the modified R/S method are significantly lower than those obtained with the other methods. This can be explained by the fact that the modified R/S method is not sensitive to the STP that characterizes these series (ρ close to 1 in the three series). In summary, the H values for the river flow series are quite high and the ρ values are very close to 1. These time series therefore exhibit strong LTP and STP behaviors. The strong LTP is consistent with the large fluctuations of the moving averages shown in Fig. 3a. When we consider only the ASD method, the H values (0.86 for Bétérou, 0.88 for Savè and 0.87 for Bonou) are equivalent to those obtained with the same method on the annual series of Nile flows, i.e. 0.87 by [1], 0.89 by [2] and [34] for a time series of 849 years, and 0.85 by [35] for a time series of 131 years. By comparison, in the Boeoticos Kephisos basin in Greece, H was estimated at 0.79 for 95 years of annual flows [1].
In the case of temperatures, the H values obtained with the modified R/S method are also significantly lower than those obtained with the other methods. Since the ρ values are also high (between 0.654 and 0.815), the series have a high STP. The insensitivity of the modified R/S method to STP can therefore explain the low values of H obtained by this method compared to the other three methods.
For the two stations considered, the H values are almost identical for the same kind of series. The average values of H are 0.80 for the daily mean temperatures, 0.77 for the daily maximum temperatures and 0.81 for the daily minimum temperatures. When the modified R/S method is excluded, the averages are 0.85 for the daily mean temperatures, 0.84 for the daily minimum temperatures and 0.80 for the daily maximum temperatures.
The H values given by the ASD method are also of the same order of magnitude as those obtained for annual temperature series [6,32].

Prediction with SSS
The climacogram, which is the logarithmic plot of the standard deviation $\sigma^{(k)}$ against the scale k, is very informative about the behavior of the process. To construct the empirical climacogram, we calculate an averaged time series for each scale (k = 1, 2, 3, ..., n/10) and then calculate the sample estimate of the standard deviation $\sigma^{(k)}$. In a purely random process, the climacogram would be a straight line with slope -0.5, as implied by the classical statistical law:

$$\sigma^{(k)} = \frac{\sigma}{\sqrt{k}}$$

But in real-world processes the slope differs from -0.5; it is equal to H - 1, where H is the Hurst coefficient. This slope corresponds to the scaling law in equation (7). Fig. 4 presents the empirical climacograms of the time series. The slopes of the climacograms are not constant; they vary with the time scale (Table 2). For the three runoff time series, when k is between 1 and 10, the average climacogram slope is -0.008, which gives a Hurst coefficient of 0.99. For k ranging from 11 to 30, the slope decreases to an average of -0.04, which corresponds to H = 0.96. When k is from 31 to 100, the average slope is -0.20 and corresponds to H = 0.80. When k is between 101 and 365, the average slope is -0.84 and H is equal to 0.16. Finally when k is greater than 365, the slope rises to -0.33 and the Hurst coefficient value is 0.77.
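The construction of the empirical climacogram and the reading of H over a chosen range of scales can be sketched as follows. This is an illustrative sketch under stated assumptions: non-overlapping averaging blocks, at least ten blocks per scale (mirroring the upper limit n/10), and an ordinary least squares slope; the names `aggregated_std`, `climacogram`, `hurst_from_climacogram` and `white_noise` are hypothetical.

```python
import math
import random


def white_noise(n, seed=0):
    """Hypothetical test input: n independent standard normal values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def aggregated_std(series, k):
    """Sample standard deviation of the series averaged over non-overlapping blocks of length k."""
    n_blocks = len(series) // k
    means = [sum(series[b * k:(b + 1) * k]) / k for b in range(n_blocks)]
    mu = sum(means) / n_blocks
    return math.sqrt(sum((m - mu) ** 2 for m in means) / (n_blocks - 1))


def climacogram(series, scales):
    """Pairs (k, standard deviation at scale k), keeping scales with at least 10 blocks."""
    return [(k, aggregated_std(series, k)) for k in scales if len(series) // k >= 10]


def hurst_from_climacogram(points, k_min, k_max):
    """H = 1 + slope of log(std) against log(k), fitted for k_min <= k <= k_max."""
    pts = [(math.log(k), math.log(s)) for k, s in points if k_min <= k <= k_max and s > 0]
    if len(pts) < 2:
        raise ValueError("need at least two scales in the requested range")
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return 1.0 + slope
```

Calling `hurst_from_climacogram` separately on ranges such as 1 to 10 or 11 to 30 gives range-specific slopes of the kind reported in Table 2; a value outside (0, 1) signals that a pure scaling law does not describe that range.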
In summary, when k is between 1 and 100 and when k is greater than 365, the runoff time series exhibit LTP behavior (H>0.5), whereas when k is between 101 and 365 they exhibit anti-persistent behavior (H<0.5).
The climacograms of rainfall also show several slopes. When k is between 1 and 7, the average climacogram slope is -0.24, which gives H = 0.76. For k from 8 to 70, the average slope becomes higher (-0.097) and the Hurst coefficient is 0.90. For k between 71 and 160, the average slope drops to -0.35, which corresponds to H = 0.65. For k between 161 and 365 days, the slope is abnormally low (-1.87); this value does not allow us to determine H. When k is greater than 365, the slope is -0.49 and H is equal to 0.5. In summary, the rainfall series exhibit LTP behavior when k is less than 161 days, but when k is greater than 365 days they behave like a purely random process (H=0.5).
The climacograms of the daily minimum temperature series have three different slopes. When k is from 1 to 70, the average slope is -0.11 and the Hurst coefficient is 0.89. For k between 71 and 365, the slope is -0.375, which gives H = 0.63. When k is greater than 365 days, the slope is -0.096 and H is equal to 0.90. One can therefore conclude that, whatever the value of k, the minimum temperature series exhibit LTP behavior.

The climacograms of the daily maximum and mean temperature series reveal four different slopes. The first slope is obtained for k between 1 and 70; it varies from -0.044 to -0.056 with an average of -0.05, so the Hurst coefficient is equal to 0.95. The second slope is obtained when k is between 71 and 160; it varies from -0.33 to -0.31 with an average of -0.32, so the Hurst coefficient is equal to 0.68. The third slope, obtained when k is between 161 and 365, varies from -1.18 to -1.88 with an average of -1.51. As noted for the daily rainfall, this slope does not allow the determination of the Hurst coefficient. The last slope is obtained when k > 365 and varies from -0.16 to -0.38 with an average of -0.25, in which case H is equal to 0.75.

The above analysis shows that the slope of the climacograms varies with the time scale, contrary to the assumption of classical statistics that the slope is constant and equal to -0.5. For some time series (rainfall, mean and maximum temperatures), when k is between 161 and 365, the slope of the climacograms is outside the interval (-1, 0), and H therefore falls outside (0, 1). This raises the question of whether the stochastic aspect alone is sufficient to model hydroclimatic phenomena.
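A brief worked example makes explicit why such slopes cannot be converted into a Hurst coefficient. Writing s for the climacogram slope (a symbol introduced here for illustration only) and using the rainfall values quoted in the text:

```latex
H = 1 + s , \qquad
s = -0.24 \;\Rightarrow\; H = 0.76 \in (0, 1) , \qquad
s = -1.87 \;\Rightarrow\; H = -0.87 \notin (0, 1) .
```

Any slope steeper than -1 gives a negative H, which is outside the admissible range of a scaling process, so the variability in that range of scales must have another origin.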
The empirical climacograms have the appearance of periodic processes. This is consistent with the fact that most hydrometeorological processes are cyclical with a period of one year. To improve on the fit of the empirical climacograms by the HK model, we model the series as cyclostationary [1]. Equation (12) describes a periodic process with white noise [equation (12) not recoverable], where T is the period of the process, a and b are parameters and sinc(x) is the cardinal sine of x [definition not recoverable]. The periodic component of equation (14) is more deterministic than stochastic. Thus, integrating the HK behavior into the periodic model, we get the periodic process with Hurst-Kolmogorov behavior [equation not recoverable]. Hurst-Kolmogorov behavior is consistent with all series (Fig. 5). Table 3 presents the uncertainties ($\sigma^{(n)}$) related to each model. It appears from this table that the uncertainties related to the Hurst-Kolmogorov and periodic models are higher than those related to the model of classical statistics (i.e. white noise), but that the periodic model reduces the uncertainties significantly compared with the Hurst-Kolmogorov model. Since the periodic model takes into account both the stochastic component and the deterministic component, including both components appears to be a better approach to reducing uncertainties in the context of climate change.

Conclusion
Climate change is a reality and is as old as the world. Studies of past climate show that the climate has always changed, on every time scale. Climate change is related to the Hurst phenomenon and corresponds to a simple scaling stochastic process. The daily series studied here seem to correspond to a process having both a noise component and a deterministic component. Including both components appears to be a better approach to reducing uncertainties in the context of climate change.