American Journal of Theoretical and Applied Statistics
Volume 5, Issue 6, November 2016, Pages: 365-375

Estimating Extreme Quantiles of the Maximum Surface Air Temperatures for the Sir Seretse Khama International Airport Using the Generalized Extreme Value Distribution

Thuto Mothupi1, *, Wilson Moseki Thupeng1, Baitshephi Mashabe2, Botho Mokoto3

1Department of Statistics, University of Botswana, Gaborone, Botswana

2Department of Mathematics and Statistical Sciences, Botswana International University of Science and Technology, Palapye, Botswana

3Department of Risk Management, Insurance and Actuarial Science, Ba Isago University, Gaborone, Botswana

Email address:

(T. Mothupi)
(W. M. Thupeng)
(B. Mashabe)
(B. Mokoto)

*Corresponding author

To cite this article:

Thuto Mothupi, Wilson Moseki Thupeng, Baitshephi Mashabe, Botho Mokoto. Estimating Extreme Quantiles of the Maximum Surface Air Temperatures for the Sir Seretse Khama International Airport Using the Generalized Extreme Value Distribution. American Journal of Theoretical and Applied Statistics. Vol. 5, No. 6, 2016, pp. 365-375. doi: 10.11648/j.ajtas.20160506.16

Received: October 10, 2016; Accepted: October 22, 2016; Published: November 14, 2016


Abstract: In this paper, extremes of quarterly maximum surface air temperature are modelled by employing the block maxima approach to extreme value analysis. The aim of the paper is to predict the future behaviour of the quarterly maximum surface air temperatures by estimating their high quantiles using the generalized extreme value distribution, an extreme value distribution usually used to model block maxima. The data are derived from monthly maximum surface air temperatures recorded at the SSSK International Airport Weather Station from January 1985 to December 2015. The Jarque-Bera normality test is performed on the data, and shows that the quarterly maximum temperatures do not follow a normal distribution. The Seasonal Mann-Kendall test detects no monotonic trends for the quarterly maximum temperatures. The Kwiatkowski- Phillips-Schmidt-Shin test indicates that the data are stationary. Parameter values of the generalized extreme value distribution are estimated using the method of maximum likelihood, and both the Kolmogorov-Smirnov and Anderson-Darling goodness of fit tests show that the distribution gives a reasonable fit to the quarterly maximum surface air temperatures. Estimates of the T-year return levels for the return periods 5, 10, 25, 50, 100, 110 and 120 years reveal that the surface air temperature for the SSK International Airport will be increasing over the next 120 years.

Keywords: Maximum Temperature, Extreme Value, Return Level, Generalized Extreme Value Distribution, Stationarity, Climate Change


1. Introduction

There is growing concern around the world about increased emissions by the industrialised nations of greenhouse gases (GHG), which will cause increase in the global temperature and changes of other climatic variables such as rainfall and evaporation [38], [35] and [36]. In its Fifth Assessment Report, the International Panel on Climate Change (IPPC) concludes, among others, that future climate extreme events would indicate significant changes in terms of the frequency, intensity and duration around the world. In vulnerable regions, extreme weather and climate can lead to disasters with significant impacts on human and natural systems [13]. Historically, according to [13], extreme events were generally rare in any one location, with time between events when human and natural systems could recover from the impacts experienced. However, as climate change increases the frequency, intensity and duration of some extreme weather and climate events [26], the time between extreme events will shorten across this century. This means that as the frequency, intensity and duration of extreme weather and climate events increase, there will be greater decreases in the return periods of such events. And, shorter return periods could affect the resilience of the affected communities to subsequent extreme events, especially in communities or countries at high risk of experiencing extreme weather and climate events like droughts. Precipitation and temperature extremes are considered to be the most important climate events and, have been extensively explored over the past several decades, according to [48]. Although deficient rainfall is considered the chief architect of droughts, heat waves and temperature extremes, though underestimated, often play crucial roles in drought development and intensification [11] and [3]. Regions with arid and semi-arid climate are more susceptible to drought, since they are more sensitive to rainfall deficiency and temperature extremes [6] and [3]. The present study focuses on temperature extremes.

Global temperature has increased since the beginning of the last century and will most likely continue to do so in the next decades [24] and [37]. [37] further warn that this increasing trend may induce more frequent and more intense heat waves in the future as pointed out by [34], [14] and [4]. In their study on climate variability and trends in meteorological time series in semi-arid Botswana, [8] studied the variability in rainfall, maximum and minimum air temperatures at 14 synoptic stations in Botswana for a period of 1960 to 2014. The results from the study indicate a significant increasing trend in minimum and maximum temperatures at most of the stations. The recent years have been marked by exceptional heat waves in Botswana, especially in the southern parts of the country. According to the Daily News of 21st August, 2016, Botswana had been experiencing searing heat since the 20th August 2016, with temperatures ranging between the 40°C and 43°C marks in several parts of the country. The same newspaper reports Mr Radithupa Radithupa, the Chief Meteorologist at the Department of Meteorological Services (DMS), which is responsible for weather forecasting, had explained early that week that the heat wave was anticipated to break high temperatures in some places. Mr Radithupa indicated that the current maximum temperature record was 43.3°C recorded in Gaborone almost 72 years ago in 1944. MAGICC/SCENGEN results indicate an increase in temperature of 2°C, on average, for all areas of Botswana in the year 2030, with the temperature increasing from 0.5°C to over 2°C, according to Botswana’s Second National Communication to the [44].

Gaborone, located in the south east of semi-arid Botswana, characterized by very low and erratic rainfall patterns, suffers from regular extreme temperatures and is highly drought prone. The city has been experiencing chronic water scarcity and power outages as the Gaborone dam, which contributes 38% supply to Greater Gaborone, reaches unprecedented low levels due to low rainfall amounts and high temperatures. As at August 23, 2016, the Gaborone Dam, with a capacity of 141.4 million cubic meters, stood at 15.2% with a supply period of 7 months without inflow. As pointed out by [3], in spite of normal or above-average rainfall (no meteorological drought) in a season or year, heat waves and extreme temperatures may affect vegetation health and trigger vegetative/agricultural drought. A heat wave is considered only when the maximum temperature of a station reaches at least 40°C for plains and at least 30°C for hilly regions [39].

Understanding and predicting the trends of extreme weather events is crucial for the protection of socio-economic well-being. Quantifying extremely high surface air temperature changes and trends in extremes is also crucial for understanding global warming and mitigating its regional impact [16]. As pointed out by [16], often of interest to scientists is the T-year return level for annual maximum of daily temperatures, which is defined as the quantile  (from the distribution of the extreme temperatures) which has probability  of being exceeded in a particular year. The return level is a fundamental quantity in the description of the behaviour of the upper tail of a distribution as it represents a rare event. And, to calculate return levels, we need to determine the most appropriate probability model for the distribution of the extreme temperatures. The current investigation attempts to provide a better understanding of trends in temperature extremes as a possible source of vulnerability to residents of the City of Gaborone by studying the patterns in the quarterly maximum surface air temperatures and estimating their return levels for different return periods.

Extreme Value Theory (EVT) provides a rigorous framework for analysis of climate extremes and their return levels [28] and [10]. It is widely applicable in a wide range of disciplines including finance, insurance, engineering, hydrology and climatology. Statistical methods for modelling extremes of stationary sequences have receivedmuch attention and though different methods for inference do exist the modelling strategies are basically identical [10], [5] and [12]. Given the potentially disastrous social, economic and public health impacts of weather-related events such as heat waves, droughts and floods, and given the relative rarity of such events, applying the insights of extreme value theory to environmental risk analysis is both a necessity and natural evolution [40]. In practice, there are two commonly used approaches to model extreme values of a series of independent observations. The first approach is called the block maxima approach, in which data are blocked into sequences of observations of length n, for some large value of n, producing a series of block maxima. The generalized extreme value (GEV) distribution provides a model for the distribution of block maxima. However, modelling only the block maxima is a wasteful approach if other extreme values are available. The alternative approach is the peaks over threshold (POT) method, which focuses on exceedances over a fixed high threshold. The advantage of the POT method in modelling extremes is that more observations are incorporated into the analysis. [43] made an attempt along these lines by modelling monthly maximum temperature for Shakawe in Botswana using the generalized Pareto distribution. However, for the current investigation preliminary analysis of the data supports the modelling of the data using the GEV distribution. The modelling of extreme temperature using the GEV distribution introduced by [26], is gaining currency (see [41], [19], [9], [37], [48]). Most recently, [2] model a 20-yeartime series of maximum temperatures from the Cameroon Development Corporation (C.D.C) for Mbonge using the generalized extreme value family of distributions. The authors find that the three-parameter generalized extreme value model best fits maximum temperature data as compared to the Weibull, Frechet and Gumbel models.

Despite the potentially disastrous effects of high temperatures on public health and the socio-economic wellbeing of the people, not much research work has been done in terms of applying extreme value methods to study the behaviour of extreme temperature data for Botswana. This study seeks to fill this gap. The results of the study will provide useful information to decision-makers in government, those involved in early warning systems and the health sector to prepare the public for the negative impacts of weather changes due to extreme temperatures

The objective of this study is to quantify and describe the behaviour of extreme surface air temperature at the Sir Seretse International Airport in Gaborone, Botswana. Our aim is to model the quarterly maximum temperatures through the use of the GEV distribution and predict their future behaviour by estimating high quantiles of the maximum temperature distribution. The return levels are important for prediction and planning purposes, and must be estimated from a stationary model. Thus, the main purpose of developing a stationary GEV model for the SSK Airport quarterly maximum is to compare the estimated return levels (expected quantiles) from the GEV distribution with the currently used design value based on the some guideline, for example, the optimal temperature for human health. The paper is organised as follows. Section 1 gives the introduction. The remainder of the paper is organised as follows. Section 2 provides a description of the historical maximum temperature data and the extreme value methodology employed in the study. The results and discussion are provided in Section 3 while Section 4 contains the conclusions of the study.

2. Data and Research Methodology

2.1. Data

The data consist of monthly maximum surface air temperatures, measured in degrees Celsius (°C), over the period January, 1985 to January, 2016, recorded at the Sir Seretse Khama (SSK) Airport weather station, in Gaborone, the capital city of Botswana. The dataset was obtained from the Department of Meteorological Services (DMS) of the Government of Botswana. This particular weather station was chosen based on availability and good quality of the data. For performing the analysis, the R software packages tseries, nortest, evir and extremes were used.

2.2. Generalized Extreme Value Distribution

Extreme value theory relies on the Extremal Types Theorem ([15], [18] and [30]), which identifies three distinct types of extremal behaviours: Let  be a series of independent and identically distributed random variables with a common distribution function F, and let . If there exist normalising constants  and  such that, as ,

(1)

for all, where G is a non-degenerate distribution function, then G must be either Frechet, Gumbel or negative Weibull. [26] shows that the three types based on Equation (1) ca be combined to form a single parametric family called the generalized extreme value (GEV) distribution with cumulative distribution function (cdf)

(2)

with  The quantities µ,  and 𝝃 are the location, scale and shape parameters, respectively. The shape parameter determines the rate of tail decay. The three types of extreme value distributions are obtained when  (heavy-tailed Frechet),  (light-tailed Gumbel) and  (short-tailed negative Weibull). Interested readers are referred to [10], [5] and [42].

2.3. Choice of Block Size

In the block maxima approach, extreme values are obtained by selecting the maximum values from equal length blocks such as a year. The GEV distribution provides a model for the distribution of block maxima. Its application consists of blocking the data into blocks of equal length, and fitting the GEV distribution to the set of block maxima [10]. The block size is very important in GEV modelling as the size of a block and the number of block maxima form a crucial trade-off between bias and variance of parameters estimates. A small block size generates a long sequence of block maxima, leading to a violation of the asymptotic validity of the model, which results in bias in parameter estimates and poor extrapolation; a large block size produces a shorter sequence of block maxima, leading to a large estimation variance. In implementing the GEV model, the choice of block size has to be chosen so that individual block maxima have a common distribution. Furthermore, the data generating process is assumed to be stationary.

However, in practice, extreme value data, especially environmental time series, may show some violation of the assumptions mentioned above due to one of or a combination of local temporal dependence, long term trends and seasonal variation. For instance, quarterly temperatures are likely to vary with the season, which violates the assumption of a common distribution for the individual block maxima. In many situations concerning climate studies a reasonable choice is to consider the annual maxima [10]. However, for this study due to scarcity of the temperature data, an annual block size would lead to only 10 annual maximum temperatures, since we are using a 10-year data set, which is too small a sample size for any meaningful inference. Hence, in this paper, we use the quarter of a year as the block and partition the series of monthly maximum temperatures for each year into four blocks of equal length: November – January (Q1), February – April (Q2), May – July (Q3) and August – October (Q4), with , denoting the ith quarter of the year.

2.4. Testing for Trend and Stationarity

Before carrying out the modelling process one should first test a time series to see if it is stationary. A stationary series is one whose statistical properties, such as the mean, variance and covariance, do not change over time or space. As mentioned before, non-stationarity may be due to local temporal dependence, long term trends and seasonal variation. Violation of stationarity in climate extremes can lead to wrong estimates of return levels derived from a given extreme value model.

Statistical tests that are used to detect time series trends are broadly grouped into two categories: parametric and non-parametric methods. Non-parametric tests are more appropriate for hydro-meteorological time series data, which are generally non-normally distributed and censored ([49] and [7]). One of the well-established non-parametric (distribution-free) tests for trend detection is the Mann-Kendall (MK) test, first proposed by [32], further studied by [29] and improved by [21], who modified the test to take seasonality into account. The MK test is used when the series values can be assumed to be independent and identically distributed and, thus, seasonality is not expected in the series. However, quarterly temperatures are likely to vary with the season. Therefore, in this paper, we apply one extension of the Mk test which takes seasonality into account, namely, the Seasonal Mann-Kendall (SMK) test (due to [21]) to detect trends in the quarterly maximum temperatures for the SSK Airport in the period under study.

The purpose of a trend test is to determine if the time series has a monotonic trend. However, monotonicity of a series does not always indicate non-stationarity. Therefore, when the purpose is to identify non-stationarity in a time series, it is necessary to perform further tests on the series. The most commonly used stationarity test, the KPSS test, is due to [27]. Hence, in this paper, to investigate non-stationarity in the quarterly maximum surface air temperature time series data for the SSK Airport, we apply the KPSS test. The test is selected due to its common use in environmental studies (see [46], [47], [50] and [19]). If the test of stationarity indicates non-stationarity in a particular way, then that non-stationarity can be modelled using models for non-stationary extremes; if the test indicates stationarity, we do modelling under the stationary assumption. In fact, as pointed out by [49], it is useful to develop non-stationary GEV models when there is evidence of statistically significant trends even if stationarity tests do not indicate non-stationarity.

2.4.1. Seasonal Mann Kendall Trend Test

The Seasonal Mann-Kendall test is described in detail in [21], [22], [17] and [20]. It is used to test for a monotonic trend of the variable of interest when the data collected over time are expected to have seasonality and serial correlation. The test assumes that when no trend is present, the observations in the time series are not serially correlated.

Adopting the notation of [21] and [22], let  denote the observation obtained in season  in year , where a season may be a day, week, month, quarter or any other period of time. In the present paper, the season is a quarter of a year. Hence, let  represent the entire dataset collected over years consisting of data subsets  where , denotes the set of data for the ith quarter for n years,

The null hypothesis H0 for the two-sided (homogeneity) SMK test is that there is no monotonic trend in the series and the alternative hypothesis HA is that for one or more seasons there is an upward or downward trend over time.

If we denote the difference  for the ith quarter, with , and let  be the indicator function for quarter , then the Mann-Kendall statistic for the ith season is calculated as

, where m is the number of seasons, in our case, and

And, [21] have shown that the seasonal Mann-Kendall statistic for the entire series is given by , with variance  and , where  is the number of tied groups for the ith quarter and  is the number of data points in the pth group for the ith quarter. Then the SMK test statistic  is given by

(3)

Thus, under the null hypothesis,  (Equation (3)) is equal to zero. On the other hand, a positive (negative) value of  is an indicator of an increasing (decreasing) trend. Thus, we reject the null hypothesis in favour of the alternative if , where  is the )th percentile of the standard normal distribution, that is, if  is improbably large.

2.4.2. Kwiatkowski-Philips-Schmidt-Shin (KPSS) Stationary Test

The KPSS test tests the null hypothesis that a time series is stationary against the alternative that the series is non-stationary due to the presence of a unit root. The test is derived by starting with the model

, with

where  is a vector of regression coefficients,  contains deterministic components (constant or constant plus time trend),  is a pure random walk with innovation variance , and  is a random error term of  and may be heteroskedastic.,  denotes the error term of  and is assumed to be a series of identically distributed independent random variables of expected value equal to zero and constant variance .

The KPSS test is a one-sided test with the null hypothesis of stationarity being equivalent to the assumption that the variance , which implies that  is a constant and, hence, that  is trend stationary. Thus, the KPSS test statistic is the Lagrange multiplier (LM) or score statistic for testing  against the alternative that  and is given by Equation (4) as

(4)

where ,  is the residual of a regression of  on  and  is a consistent estimate of the long-run variance of  using .

As the KPSS stationary test is a one-sided right-tailed test, one rejects the null of stationarity at the  level if the KPSS test statistic K is greater than the  quantile from the appropriate asymptotic distribution.

2.5. Parameter Estimation

Many methods are available in statistical literature for estimating the parameters of a probability distribution. These include the commonly used ones such as the methods of moments, maximum likelihood estimation, probability weighted moments and the L-moments method. In this study, we estimate the parameter of the GEV distribution using the widely used maximum likelihood estimation (MLE) method.

Suppose  is a set of independent and identically distributed random variables having the GEV distribution. Then, the log-likelihood for the GEV parameters  when  is

(5)

provided , for ..

The MLEs of  are determined by maximising Equation (5) with respect to the parameter vector . See [23] for details. No analytical solution to Equation (5) exists and numerical optimisation algorithms have been used in practice. In this study, the Newton-Raphson method was used to solve the likelihood equations above, following [23] and [31].

2.6. Model Assessment

2.6.1. Diagnostic Plots

Though it is impossible to check the validity of an extrapolation based on a GEV model, assessment can be made with reference to the observed data [10]. In this paper, we use the probability plot, quantile plot, return level plot and density plot for model checking. However, graphical diagnostic techniques are subjective in nature and should, therefore, be supported by more rigorous statistical tests such as the well-known Kolmogorov-Smirnov and Anderson-Darling as they allow for quantitative assessment of model fit.

2.6.2. Goodness-of-Fit and Model Selection

The two commonly used goodness-of-fit tests for extreme value models are the Kolmogorov-Smirnov test and the Anderson-Darling test, both of which are based on the empirical distribution function (edf). We apply these to the quarterly maximum surface air temperature for the SSK Airport data.

i.        Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (K-S) test is defined as the largest vertical difference between the empirical cumulative distribution function (edf)  to and the theoretical cumulative distribution function (cdf) . It is used for testing the null hypothesis that a sample of size n comes from a population with a specified distribution. That is, if denote an ordered sample of independent observations from a population with distribution function , the K-S test statistic is given by Equation (6).

(6)

where the cdf evaluated at the ordered observation of the ordered sample, , and n is the number of observations. The null hypothesis of the test is rejected in favour of the alternative if the calculated value of D is improbably large.

ii.       Anderson-Darling Test

The Anderson-Darling (A-D) test due to [1] is used for testing the null hypothesis that a sample of size n comes from a population with a specified distribution. It is based on the discrepancy between the empirical cumulative distribution function  to and the theoretical cumulative distribution function . This test gives more weight to the tails of the distribution than the Kolmogorov-Smirnov test. Let  denote an ordered sample of independent observations from a population with distribution function . The test statistic of the Anderson-Darling test is

(7)

where, as before,  .

The distribution function of the A-D statistic is complicated, even asymptotically. However, [33] provide a computational method to evaluate the accuracy of the cdf for any sample size. And, the null hypothesis of the test is rejected in favour of the alternative if the calculated value of  is improbably large.

2.7. Return Level Estimation

In extreme value analysis, often of interest to scientists is the T-year return level for annual extremal data, which is defined as the quantile  (from the distribution of the extreme values) which has probability  of being exceeded in a particular year. The main purpose of developing stationary models is to compare estimated return levels (expected quantiles) from the GEV distribution with the currently used design values based on some guideline, where available. Extreme value theory is often used to find return values for return periods that amply exceed the record length [45]. This implies extrapolation of the GEV fit to a domain outside the range of the observations. In this study, we consider the 5-year, 10 -year, 25-year, 50-year, 100-year, 110-year and 120-year return levels. The T-year return level  is obtained by inverting the GEV distribution function given by Equation (2). Thus, if , then  is given by (see [5], Equation 5.9)

(8)

And, the maximum Likelihood estimates for the return levels are obtained by substituting  and  into Equation (8).

3. Results and Discussions

3.1. Descriptive Statistics

Summary statistics for the SSK Airport quarterly maximum temperatures are contained in Table 1. The minimum (Min) and maximum (Max) surface air temperatures recorded at the SSK Airport during the study period are 26.5°C and 41.5°C, respectively. The small value of the coefficient of variation (CV) indicates little variability in the quarterly maximum temperatures. The p-value of 0.0005 of the Jarque-Bera (BP) normality test shows overwhelming evidence against the null hypothesis that the quarterly maximum temperatures follow a normal distribution. The negative values of the coefficients of skewness and excess kurtosis, respectively, indicate that the data are negatively skewed and not heavy-tailed. A kurtosis measure of less than 3 (excess kurtosis <0), as compared to that of a normal distribution (3), depicts a central peak that is lower and broader with tails that are shorter and thinner. These results and the histogram imposed on the density plot in Figure 2, which represents a left-skewed distribution, support the use of the GEV to model the quarterly maximum temperatures.

Table 1. Summary statistics for the SSK Airport quarterly maximum temperatures.

n Min Max Mean Median Mode S.D CV Skewness Kurtosis J.B (P-value)
124 26.5 41.5 35.975 37.150 39.50 3.45 0.0958 -0.851 -0.17 15.049 (0.0005)

Figure 1 shows a time series plot of the quarterly maximum surface air temperatures for the SSK Airport under the study period. From the figure, there appears to be no increase in the quarterly maximum temperatures through time. And, there is no strong indication that the pattern of variation in the quarterly maximum temperatures has changed over the years.

Figure 1. Time series plot for quarterly maximum temperature at SSK Airport, January – December, 2016.

3.2. Testing for Trend and Stationarity

3.2.1. Seasonal Mann-Kendall Test Without Correlation

Table 2. Results of the Seasonal Mann-Kendall trend test.

Statistics for Quarter (Season) Var () Z tau p-value
Individual season 1 -48 3451.3 -0.8 -0.103 0.4139
2 120 3454.7 2.0 0.258 0.0412
3 -187 3457.7 -3.2 -0.402 0.00147
4 -62 3447.3 -1.1 -0.133 0.291
Total series   -177 13811 -1.5 -0.095 0.13204

The results contained in Table 2 show that the quarterly maximum temperature at SSK Airport for Quarter 2 (April to June) have a positive value of , with p-value=0.0412, which indicates a significant increasing trend in the maximum temperatures for the season. The result suggests that the months of April, May and June are getting warmer with time possibly due to the effects of climate change. The results from the Seasonal Mann-Kendall test further reveal that the maximum temperatures for Quarter 3 (July to September), with  and p-value=0.00147 are showing a significant decreasing trend. This result is also interesting since in Botswana, surface air temperatures normally begin to increase during July through September, after the winter season, with the coldest month of the year being June. The results seen here may be due to the effect of climate change but, the conclusion cannot be made definitively because the series consists of only 124 observations and, so, may not be long enough to us to discern the effects of climate change. However, according to the Seasonal Mann-Kendall trend test, for the total series, with p-value=0.13204, indicating that though the value of  suggests a decreasing trend in the series, there is has not been a significant monotonic trend in the overall quarterly maximum temperatures for the area.

3.2.2. KPSS Test for Stationarity

The observed KPSS test statistics is  with a p-value greater than the printed p-value of 0.1, indicating that the test is not significant. Therefore, we conclude that the given series of quarterly maximum surface air temperatures at the SSK Airport indicate that the data are stationary.

From the results of the Seasonal Mann-Kendall trend test and the KPSS stationarity test, the series of the quarterly maximum temperatures for the SSK Airport for the period under study exhibit neither a monotonic trend nor non-stationarity. Hence, we fit the GEV under the assumption of stationarity.

3.3. Parameters Estimates

Table 3 contains maximum likelihood estimates (MLEs) of the parameters of the GEV for the quarterly maximum surface air temperatures at the SSK Airport. The numbers in brackets indicate the standard errors of the estimates. The standard errors are generally small, indicating small estimation variance. The goodness-of-fit of the model is discussed in Section 3.5

Table 3. MLE parameter estimates of the GEV for the quarterly maximum temperature s.

Parameter Location (μ) Scale (σ) Shape (ξ)
Estimate (standard error) 35.2761 (0.3530) 3.6956 (0.2733) -0.5833(0.0457)

3.4. Model Assessment

3.4.1. Model Diagnostics

Figure 2 shows four diagnostic plots for goodness-of-fit test of the GEV distribution and it reflects the fitting degree between the theoretical distribution, GEV, and the actual sample series. The upper left plot is the probability plot. The upper right plot is the quantile plot. All the dots that represent sample data in the probability plot and the quantile plot are virtually located on a straight line with a slope of 1. The lower left plot is the return level plot. Its ordinate is the return level of the maximum temperature and its abscissa is the return period in year on a logarithmic scale. In the return level plot, the upper and the lower curves represent the confidence upper limit and confidence lower limit of the return level, respectively. All the dots representing sample data (quarterly maximum temperatures) are within the range of the confidence interval, concentrated near the middle convex curve and approaching a certain finite value. Therefore, like both the probability plot and the quantile plot, the return level curve supports the GEV model. The lower right plot is the density plot, which consists of the estimated probability density curve imposed on the histogram for the data. Its abscissa is maximum temperature (z) and its ordinate is the associated frequency. The estimated density curve also fits in with the histogram. Therefore, all the four diagnostic plots indicate the plausibility of the GEV (35.2761, 3.6956, -0.5833) distribution as a model for the SSK Airport quarterly maximum temperatures.

Figure 2. Model diagnostic for quarterly maximum temperatures at SSKA.

3.4.2. Goodness-of-Fit Tests: Kolmogorov-Smirnov and Anderson-Darling Test

Table 4 shows the Kolmogorov-Smirnov and Anderson-Darling test results, obtained using Equations (6) and (7), respectively, for the SSK Airport quarterly maximum temperatures. An inspection of the p-value of each test leads to a non-rejection of the null hypothesis that the series follows the generalized extreme value distribution.

Table 4. Kolmogorov-Smirnov (K-S) and Anderson-Darling (A-D) statistics.

Test Statistic p-value Significance level Reject
A-D 1.0482342 - 0.05 no
K-S 0.099 0.164 0.05 no

Thus, we conclude that the SSK Airport quarterly maximum temperature series follows the GEV distribution which is the specified distribution under the null hypothesis.

3.5. Return Level Estimates

Once the best model for the data has been selected, interest is now in deriving the return levels of extreme quarterly maximum temperatures at the SSK Airport. The quarterly maximum temperature for the past 31 years was 41.5°C. Hence, to predict the probability that a quarterly maximum temperature exceeding 41.5°C will occur in a longer period, return levels were used based on maximum likelihood estimation method. The results, contained in Table 5, reveal that the surface air temperature for the SSK International Airport will be increasing over the next 120 years.

Table 5. Return level estimates.

Return period T (years) Estimated return level (°C)
5 38.98
10 39.91
25 40.63
50 40.96
100 41.18
110 41.20
120 41.22

From the table, the return level corresponding to a 5-year event for the SSK Airport maximum temperature is 38.98°C. This is the temperature which has a probability  of being exceeded, on average, once every year in the next 5 years. The return temperatures for the 10-year to the 120-year return periods range from about 40 to 41°C. Thus, the 120-year return level estimate (41. 22°C) means that in the SSK Airport, a maximum surface air temperature of 41. 2°C is expected to be exceeded, on average, once in 120 years with probability 0.008, a rare event, in deed.

The return level estimates above are actually not far from the upper end-point of the GEV distribution. Let  denote the maximum likelihood estimate of the return level associated with the return period , with the abscissa of the return level plot the return period on a logarithmic scale. If , the right endpoint (asymptotic limit as ) of the GEV distribution is finite and is given by  ([10] and [5]). For the SSK Airport quarterly maximum temperature series, the MLEs of the GEV parameters,, imply that °C. This finite right endpoint is evident on the return level plot in Figure 2, and means that, on the basis of the available data, we can conclude that the maximum surface air temperature will most likely never reach °C in the SSK Airport even beyond 120 years. However, the authors caution the reader that trend detection using extreme value analysis requires long-term and reliable data as currently observed trends may not persist in the future. As a result, great care must be exercised in extrapolating historical trends into the future, more particularly when making long-term projections of weather or climate extremes, as these may be affected by very different natural and anthropogenic causes than historical events. Hence, further studies involving larger data sets are necessary to identify trends in extreme temperatures in the SSK Airport and their timing, as well as possible links to climate change.

It is inarguable that unusually high temperatures observed in the study area pose risks to public health and safety and the environment. Prolonged dry conditions due to low and erratic rainfall coupled with very high temperatures will no doubt jeopardise access to clean drinking water, result in extreme heat events and flash flooding. Among the possible consequences of extremely high temperatures are deaths that may occur due to direct impacts and the indirect effects of heat-exacerbated, life-threatening illnesses, such as heat exhaustion, heatstroke, and cardiovascular diseases. The results of this study should, therefore, be of help to members of the public in the City of Gaborone and all other stakeholders to take precautionary in the event of extremely high surface air temperatures.

4. Conclusions

The generalized extreme value distribution is used to model quarterly maximum temperatures using data obtained from the Sir Seretse Khama international Airport weather station in Gaborone for the period January 1985 to December 2015. The Kwiatkowski-Philips-Schmidt-Shin test of stationarity on the series reveals that the maximum temperatures are stationary. The Seasonal Mann-Kendall trend test reveals no presence of monotonic trend. Parameters of the GEV distribution are estimated using the maximum likelihood method. Model diagnostics, which include the probability plot, quantile plot, return level plot and density plot, show a good fit of the GEV model to the quarterly maximum temperature series in the SSK Airport area. In addition, the Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests support the plausibility of the GEV (35.2761, 3.6956, -0.5833) distribution as a model for the SSK Airport quarterly maximum temperatures.

T-year return levels for the return periods 5, 10, 25, 50, 100, 110 and 120 years, respectively, are estimated. The results show that the surface air temperature for the SSK International Airport will be increasing over the next 120 years. On the basis of the available data, it is also revealed that the observed quarterly maximum temperature of 41.5°C cannot be exceeded even in the next 120 years.

This study has demonstrated how extreme value theory can be used in the estimation of extreme quantiles of surface air temperature in a weather station in Gaborone. Our hope is that the study will be used to understand and predict the trends of extreme weather events in other stations around Botswana and elsewhere. The main limitation of this study is a lack of availability of data on extreme temperatures in the SSK Airport prior to the year 1985. However, further studies involving larger data sets are necessary to identify trends in extreme temperatures and their timing, as well as possible links to climate change. Further studies will be carried out to model joint (multivariate) data on air temperatures at several sites around south eastern Botswana in order to understand how extremely high temperatures in one weather station relate to those in nearby stations.


References

  1. Anderson, T.W. and Darling, D.A. (1952). Asymptotic theory of certain ‘goodness-of-fit’ criteria based on stochastic processes. The Annals of Mathematical Statistics,vol. 23, No. 2,193-212.
  2. Arreyndip NA, Joseph E (2015) Extreme temperature forecast in Mbonge, Cameroon, through return level analysis of the generalized extreme value (GEV) distribution.International Journal of Mathematical, Computational, Physical, Electrical and Computer Engineering vol. 9, No. 6, 343–348.
  3. Bandyopadhyay, N., Bhuiyan, C., and Saha,A. K.(2016), Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, vol. 82, issue 1, 367-388.
  4. Barriopedro, D., Fischer, E. M., Luterbacher, J., Trigo, R. M., and Garcia-Herrera, R. (2011). The hot summer of 2012: Redrawing the temperature record map of Europe, Science 332 (6026), 220-224.
  5. Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. (2004). Statistics of extremes. Theory and Applications, with contributions from de Waal, D.J. and Ferro, C. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd., Chichester.
  6. Bhuiyan, C. (2008).Desert Vegetation during Droughts: Response and Sensitivity. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B8. Beijing.
  7. Bouza-Deano, R., Ternero-Rodriguez, M., and Fernandez-Espinosa, A.J. (2008). Trend study and assessment of surface water quality in the Ebro River (Spain). Journal of Hydrology 361, 227–239.
  8. Byakatonda, J., Parida, P.B., and Kenabatho, K.P. (2015). Climate Variability and Trends in Meteorological Time Series in Semi-Arid Botswana. 10th Alexander von Humboldt International Conference. Addis Ababa (Ethiopia) 18–20 November 2015, AvH10-54.
  9. Chikobvu, D. and Sigauke, C. (2013). Modelling influence of temperature on daily peak electricity demand in South Africa.Journal of Energy in South Africa 24(4), 63-70.
  10. Coles, S. (2001). An introduction to Statistical Modelling of Extreme Values. Springer-Verlag, London.
  11. Dai, K., Trenberth, K., and Qian, T. (2004). A Global Dataset of Palmer Drought Severity Index for 1870-2002: Relationship with Soil Moisture and Surface Warming. Journal of Hydrometeorology 5 (6), 1117-1130.
  12. de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering.
  13. Ebi, K.L. and Bowen, K. (2016). Extreme events as sources of health vulnerability: Drought as an example. Weather and Climate Extremes 11 (2016), 95-102.
  14. Fischer, E. M. and Schär, C. (2010). Consistent geographical patterns of changes in high-impact European heatwaves. Nature Geoscience 3, 398–403.
  15. Fisher, R.A. and Tippett, L.H.C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society 24,180–190.
  16. Fuentes, M., Henry, J., and Reich, B. (2013). Nonparametric Spatial Models for Extremes: Application to Extreme Temperature Data. Extremes 16, 75-101.
  17. Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold Company Inc., New York.
  18. Gnedenko, B. (1943). Sur la distribution limite du terme maximum d’unesériealéatoire. Annals of Mathematics 44, 423–453.
  19. Hasan, H. B., Ahmad Radi, N. F. B. and Kassim, S. B. (2012). Modelling of Extreme Temperature Using Generalized Extreme Value (GEV) Distribution: A Case Study of Penang. World Congress on Engineering 2012. Vol. 1, 181-186.
  20. Helsel, D.R. and R.M. Hirsch. (2002). Statistical methods in water resources. Studies in Environmental Science 49. New York: Elsevier. (available on-line as a pdf file at: http://water.usgs.gov/pubs/twri/twri4a3/) Accessed 4-10-2016.
  21. Hirsch, R.M., Slack, J.R, and Smith, R.A. (1982). Techniques of trend analysis for monthly water quality data. Water Resources. Research 18, 107-121.
  22. Hirsch, R.M. and J.R. Slack. (1984). A nonparametric trend test for seasonal data with serial dependence. Water Resources Research 20 (6):727-732.
  23. Hosking, J. R. M. (1985). Algorithm AS 215: Maximum-likelihood estimation of the parameter of the generalized extreme-value distribution. Applied Statistics 34, 301–310.
  24. IPCC (2007). Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K.B, Tignor, M. and Miller, H.L. (editors). Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 996 pp.
  25. IPCC (2013): Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Stocker, T.F., D. Qin, G.-K. Plattner, M. Tignor, S.K. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex and P.M. Midgley (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 1535 pp.
  26. Jenkins, A.F. (1955). The frequency distribution of the annual maximum (or minimum) values of meteorological events. Quarterly Journal of the Royal Meteorological Society 81, 158-172.
  27. Kwiatkowski, D., Phillips, P.C.B., and Shin, P.S., Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics 54, 159-178. North-Holland.
  28. Katz, R., Parlange, M.B., and Naveau, P. (2002). Statistics of extremes in hydrology. Advances in Water Resources 25, 12871304.
  29. Kendall, M.G. (1975). Rank correlation methods, 4th ed. Charles Griffin, London.
  30. Leadbetter, M.R., Lindgreen, G., and Rootzen, H. (1983). Extremes and Related Properties of Random Sequences and Series. Springer-Verlag, New York.
  31. MacLeod, A.J. (1989). AS R76 - A remark on algorithm AS 215: Maximum likelihood estimation of the parameters of the generalized extreme value distribution. Applied Statistics 38(3), 198–199.
  32. Mann, H.B. (1945). Non-parametric tests against trend. Econometrica 13,163-171.
  33. Marsaglia, G. and Marsaglia, J. (2004) Evaluating the Anderson-Darling Distribution. Journal of Statistical Software 9 (2), 1–5. http://www.jstatsoft.org/v09/i02. Accessed 04-10-2016.
  34. Meehl, G. A., and C. Tebaldi (2004). More intense, more frequent, and longer lasting heat waves in the 21st century. Science 305, 994–997.
  35. Parida, B.P., Moalafhi, D.B., Dube, O.P. (2005). Estimation of Likely Impact of Climate variability on Runoff Coefficients from Limpopo Basin using Artificial Neural Networks (ANN). In: Proceedings of the International Conference on Monitoring, Prediction and Mitigation of Water-Related Disasters, 12–15 January, Kyoto University, Japan, pp. 443–449.
  36. Parida, B.P. and Moalafhi D.B. (2008), Regional rainfall frequency analysis for Botswana using L-Moments and radial basis function network. Physics and Chemistry of the Earth, Parts A/B/C, 33(8), 614-620.
  37. Parey, S., Hoang, T.T.H., Dacunha-Castelle (2013). The importance of mean and variance in predicting changes in temperature extremes. Journal of Geophysical Research: Atmospheres 118, 8285–8296.
  38. Raskin, P. and Kemp-Benedict, E. (2004). Background Paper for UNEP’s Third Global Environment Outlook Report (GEO-3), UNEP, Kenya.
  39. Ray, K., Mohapatra, M., Bandyopadhyay, B.K., and Rathore, L.S. (Editors) (2015). High-Impact Weather Events over the SAARC Region. Springer Heidelberg New York Dordrecht London.
  40. Shaby, B. A. and Reich, B. J. (2012). Bayesian spatial extreme value analysis to assess the changing risk of concurrent high temperatures across large portions of European cropland. Environmetrics 23, 638–648.
  41. Siliverstovs, B., Oetsch, R., Kemfert, C., Jaeger, C., Haas, A., and Kremers, H. (2008). Climate Change and Modelling of Extreme Temperatures in Switzerland. DIW Berlin Discussion Paper No 840.
  42. Stephenson, A. and Tawn, J. (2004). Bayesian inference for extremes: accounting for the three extremal types. Extremes 7, 291-307.
  43. Thupeng, W.M. and Kgosi, P.M. (2012). The Deviance Information Criterion as a Bayesian Measure of Model Assessment for Extreme Values. Pioneer Journal of Theoretical and Applied Statistics Volume 3, Issue 2, 93-103.
  44. United Nations Framework Convention on Climate Change (UFCCC) (2016). https://en.wikipedia.org/wiki/Paris_Agreement. Accessed 19 May 2016.
  45. Van Den Brink, H.W., Konnen, G.P, Opsteegh, J.D., Van Oldenborgh, G.J., and Burgers, G. (2005): Estimating return periods of extreme events from ECMWF seasonal forecast ensembles. International Journal of Climatology25, 1345–1354.
  46. Wang, W., Van Gelder, G.P. and Vrijling, J.K. (2005).Trend and Stationary Analysis for Streamflow Processes of Rivers in Western Europe in the 20th Century. IWA International Conference on Water Economics, Statistics and Finance, Rethymno, Greece.
  47. Wang, W., Vrijling, J.K., Van Gelder, P.H.A.J.M., and Ma, J (2006) Testing for nonlinearity of stream flow processes at different timescales. Journal of Hydrology, 322 (1–4), 247–268.
  48. Wen, Fang, Qi, Zhou and Gao (2015). Changes of temperature and precipitation extremes in China: past and future. Theoretical and Applied Climatology, 10.1007/s00704-015-1584-x, 369-383.
  49. Yilmaz, A.G. and Perera, B.J.C. (2014). Extreme Rainfall Nonstationary Investigation and Intensity-Frequency-Duration Relationship. Journal of Hydrologic Engineering, Vol. 19, No. 6, 1160-1172.
  50. Yoo, S-H. (2007). Urban Water Consumption and Regional Economic Growth: The Case of Taejeon, Korea. Water Resources Management 21, 1353–1361.

Article Tools
  Abstract
  PDF(351K)
Follow on us
ADDRESS
Science Publishing Group
548 FASHION AVENUE
NEW YORK, NY 10018
U.S.A.
Tel: (001)347-688-8931