Estimating Extreme Quantiles of the Maximum Surface Air Temperatures for the Sir Seretse Khama International Airport Using the Generalized Extreme Value Distribution
Thuto Mothupi^{1, *}, Wilson Moseki Thupeng^{1}, Baitshephi Mashabe^{2}, Botho Mokoto^{3}
^{1}Department of Statistics, University of Botswana, Gaborone, Botswana
^{2}Department of Mathematics and Statistical Sciences, Botswana International University of Science and Technology, Palapye, Botswana
^{3}Department of Risk Management, Insurance and Actuarial Science, Ba Isago University, Gaborone, Botswana
Email address:
To cite this article:
Thuto Mothupi, Wilson Moseki Thupeng, Baitshephi Mashabe, Botho Mokoto. Estimating Extreme Quantiles of the Maximum Surface Air Temperatures for the Sir Seretse Khama International Airport Using the Generalized Extreme Value Distribution. American Journal of Theoretical and Applied Statistics. Vol. 5, No. 6, 2016, pp. 365-375. doi: 10.11648/j.ajtas.20160506.16
Received: October 10, 2016; Accepted: October 22, 2016; Published: November 14, 2016
Abstract: In this paper, extremes of quarterly maximum surface air temperature are modelled by employing the block maxima approach to extreme value analysis. The aim of the paper is to predict the future behaviour of the quarterly maximum surface air temperatures by estimating their high quantiles using the generalized extreme value distribution, an extreme value distribution usually used to model block maxima. The data are derived from monthly maximum surface air temperatures recorded at the SSSK International Airport Weather Station from January 1985 to December 2015. The Jarque-Bera normality test is performed on the data, and shows that the quarterly maximum temperatures do not follow a normal distribution. The Seasonal Mann-Kendall test detects no monotonic trends for the quarterly maximum temperatures. The Kwiatkowski- Phillips-Schmidt-Shin test indicates that the data are stationary. Parameter values of the generalized extreme value distribution are estimated using the method of maximum likelihood, and both the Kolmogorov-Smirnov and Anderson-Darling goodness of fit tests show that the distribution gives a reasonable fit to the quarterly maximum surface air temperatures. Estimates of the T-year return levels for the return periods 5, 10, 25, 50, 100, 110 and 120 years reveal that the surface air temperature for the SSK International Airport will be increasing over the next 120 years.
Keywords: Maximum Temperature, Extreme Value, Return Level, Generalized Extreme Value Distribution, Stationarity, Climate Change
1. Introduction
There is growing concern around the world about increased emissions by the industrialised nations of greenhouse gases (GHG), which will cause increase in the global temperature and changes of other climatic variables such as rainfall and evaporation [38], [35] and [36]. In its Fifth Assessment Report, the International Panel on Climate Change (IPPC) concludes, among others, that future climate extreme events would indicate significant changes in terms of the frequency, intensity and duration around the world. In vulnerable regions, extreme weather and climate can lead to disasters with significant impacts on human and natural systems [13]. Historically, according to [13], extreme events were generally rare in any one location, with time between events when human and natural systems could recover from the impacts experienced. However, as climate change increases the frequency, intensity and duration of some extreme weather and climate events [26], the time between extreme events will shorten across this century. This means that as the frequency, intensity and duration of extreme weather and climate events increase, there will be greater decreases in the return periods of such events. And, shorter return periods could affect the resilience of the affected communities to subsequent extreme events, especially in communities or countries at high risk of experiencing extreme weather and climate events like droughts. Precipitation and temperature extremes are considered to be the most important climate events and, have been extensively explored over the past several decades, according to [48]. Although deficient rainfall is considered the chief architect of droughts, heat waves and temperature extremes, though underestimated, often play crucial roles in drought development and intensification [11] and [3]. Regions with arid and semi-arid climate are more susceptible to drought, since they are more sensitive to rainfall deficiency and temperature extremes [6] and [3]. The present study focuses on temperature extremes.
Global temperature has increased since the beginning of the last century and will most likely continue to do so in the next decades [24] and [37]. [37] further warn that this increasing trend may induce more frequent and more intense heat waves in the future as pointed out by [34], [14] and [4]. In their study on climate variability and trends in meteorological time series in semi-arid Botswana, [8] studied the variability in rainfall, maximum and minimum air temperatures at 14 synoptic stations in Botswana for a period of 1960 to 2014. The results from the study indicate a significant increasing trend in minimum and maximum temperatures at most of the stations. The recent years have been marked by exceptional heat waves in Botswana, especially in the southern parts of the country. According to the Daily News of 21^{st} August, 2016, Botswana had been experiencing searing heat since the 20^{th} August 2016, with temperatures ranging between the 40°C and 43°C marks in several parts of the country. The same newspaper reports Mr Radithupa Radithupa, the Chief Meteorologist at the Department of Meteorological Services (DMS), which is responsible for weather forecasting, had explained early that week that the heat wave was anticipated to break high temperatures in some places. Mr Radithupa indicated that the current maximum temperature record was 43.3°C recorded in Gaborone almost 72 years ago in 1944. MAGICC/SCENGEN results indicate an increase in temperature of 2°C, on average, for all areas of Botswana in the year 2030, with the temperature increasing from 0.5°C to over 2°C, according to Botswana’s Second National Communication to the [44].
Gaborone, located in the south east of semi-arid Botswana, characterized by very low and erratic rainfall patterns, suffers from regular extreme temperatures and is highly drought prone. The city has been experiencing chronic water scarcity and power outages as the Gaborone dam, which contributes 38% supply to Greater Gaborone, reaches unprecedented low levels due to low rainfall amounts and high temperatures. As at August 23, 2016, the Gaborone Dam, with a capacity of 141.4 million cubic meters, stood at 15.2% with a supply period of 7 months without inflow. As pointed out by [3], in spite of normal or above-average rainfall (no meteorological drought) in a season or year, heat waves and extreme temperatures may affect vegetation health and trigger vegetative/agricultural drought. A heat wave is considered only when the maximum temperature of a station reaches at least 40°C for plains and at least 30°C for hilly regions [39].
Understanding and predicting the trends of extreme weather events is crucial for the protection of socio-economic well-being. Quantifying extremely high surface air temperature changes and trends in extremes is also crucial for understanding global warming and mitigating its regional impact [16]. As pointed out by [16], often of interest to scientists is the T-year return level for annual maximum of daily temperatures, which is defined as the quantile (from the distribution of the extreme temperatures) which has probability of being exceeded in a particular year. The return level is a fundamental quantity in the description of the behaviour of the upper tail of a distribution as it represents a rare event. And, to calculate return levels, we need to determine the most appropriate probability model for the distribution of the extreme temperatures. The current investigation attempts to provide a better understanding of trends in temperature extremes as a possible source of vulnerability to residents of the City of Gaborone by studying the patterns in the quarterly maximum surface air temperatures and estimating their return levels for different return periods.
Extreme Value Theory (EVT) provides a rigorous framework for analysis of climate extremes and their return levels [28] and [10]. It is widely applicable in a wide range of disciplines including finance, insurance, engineering, hydrology and climatology. Statistical methods for modelling extremes of stationary sequences have receivedmuch attention and though different methods for inference do exist the modelling strategies are basically identical [10], [5] and [12]. Given the potentially disastrous social, economic and public health impacts of weather-related events such as heat waves, droughts and floods, and given the relative rarity of such events, applying the insights of extreme value theory to environmental risk analysis is both a necessity and natural evolution [40]. In practice, there are two commonly used approaches to model extreme values of a series of independent observations. The first approach is called the block maxima approach, in which data are blocked into sequences of observations of length n, for some large value of n, producing a series of block maxima. The generalized extreme value (GEV) distribution provides a model for the distribution of block maxima. However, modelling only the block maxima is a wasteful approach if other extreme values are available. The alternative approach is the peaks over threshold (POT) method, which focuses on exceedances over a fixed high threshold. The advantage of the POT method in modelling extremes is that more observations are incorporated into the analysis. [43] made an attempt along these lines by modelling monthly maximum temperature for Shakawe in Botswana using the generalized Pareto distribution. However, for the current investigation preliminary analysis of the data supports the modelling of the data using the GEV distribution. The modelling of extreme temperature using the GEV distribution introduced by [26], is gaining currency (see [41], [19], [9], [37], [48]). Most recently, [2] model a 20-yeartime series of maximum temperatures from the Cameroon Development Corporation (C.D.C) for Mbonge using the generalized extreme value family of distributions. The authors find that the three-parameter generalized extreme value model best fits maximum temperature data as compared to the Weibull, Frechet and Gumbel models.
Despite the potentially disastrous effects of high temperatures on public health and the socio-economic wellbeing of the people, not much research work has been done in terms of applying extreme value methods to study the behaviour of extreme temperature data for Botswana. This study seeks to fill this gap. The results of the study will provide useful information to decision-makers in government, those involved in early warning systems and the health sector to prepare the public for the negative impacts of weather changes due to extreme temperatures
The objective of this study is to quantify and describe the behaviour of extreme surface air temperature at the Sir Seretse International Airport in Gaborone, Botswana. Our aim is to model the quarterly maximum temperatures through the use of the GEV distribution and predict their future behaviour by estimating high quantiles of the maximum temperature distribution. The return levels are important for prediction and planning purposes, and must be estimated from a stationary model. Thus, the main purpose of developing a stationary GEV model for the SSK Airport quarterly maximum is to compare the estimated return levels (expected quantiles) from the GEV distribution with the currently used design value based on the some guideline, for example, the optimal temperature for human health. The paper is organised as follows. Section 1 gives the introduction. The remainder of the paper is organised as follows. Section 2 provides a description of the historical maximum temperature data and the extreme value methodology employed in the study. The results and discussion are provided in Section 3 while Section 4 contains the conclusions of the study.
2. Data and Research Methodology
2.1. Data
The data consist of monthly maximum surface air temperatures, measured in degrees Celsius (°C), over the period January, 1985 to January, 2016, recorded at the Sir Seretse Khama (SSK) Airport weather station, in Gaborone, the capital city of Botswana. The dataset was obtained from the Department of Meteorological Services (DMS) of the Government of Botswana. This particular weather station was chosen based on availability and good quality of the data. For performing the analysis, the R software packages tseries, nortest, evir and extremes were used.
2.2. Generalized Extreme Value Distribution
Extreme value theory relies on the Extremal Types Theorem ([15], [18] and [30]), which identifies three distinct types of extremal behaviours: Let be a series of independent and identically distributed random variables with a common distribution function F, and let . If there exist normalising constants and such that, as ,
(1)
for all, where G is a non-degenerate distribution function, then G must be either Frechet, Gumbel or negative Weibull. [26] shows that the three types based on Equation (1) ca be combined to form a single parametric family called the generalized extreme value (GEV) distribution with cumulative distribution function (cdf)
(2)
with The quantities µ, and 𝝃 are the location, scale and shape parameters, respectively. The shape parameter determines the rate of tail decay. The three types of extreme value distributions are obtained when (heavy-tailed Frechet), (light-tailed Gumbel) and (short-tailed negative Weibull). Interested readers are referred to [10], [5] and [42].
2.3. Choice of Block Size
In the block maxima approach, extreme values are obtained by selecting the maximum values from equal length blocks such as a year. The GEV distribution provides a model for the distribution of block maxima. Its application consists of blocking the data into blocks of equal length, and fitting the GEV distribution to the set of block maxima [10]. The block size is very important in GEV modelling as the size of a block and the number of block maxima form a crucial trade-off between bias and variance of parameters estimates. A small block size generates a long sequence of block maxima, leading to a violation of the asymptotic validity of the model, which results in bias in parameter estimates and poor extrapolation; a large block size produces a shorter sequence of block maxima, leading to a large estimation variance. In implementing the GEV model, the choice of block size has to be chosen so that individual block maxima have a common distribution. Furthermore, the data generating process is assumed to be stationary.
However, in practice, extreme value data, especially environmental time series, may show some violation of the assumptions mentioned above due to one of or a combination of local temporal dependence, long term trends and seasonal variation. For instance, quarterly temperatures are likely to vary with the season, which violates the assumption of a common distribution for the individual block maxima. In many situations concerning climate studies a reasonable choice is to consider the annual maxima [10]. However, for this study due to scarcity of the temperature data, an annual block size would lead to only 10 annual maximum temperatures, since we are using a 10-year data set, which is too small a sample size for any meaningful inference. Hence, in this paper, we use the quarter of a year as the block and partition the series of monthly maximum temperatures for each year into four blocks of equal length: November – January (Q1), February – April (Q2), May – July (Q3) and August – October (Q4), with , denoting the i^{th} quarter of the year.
2.4. Testing for Trend and Stationarity
Before carrying out the modelling process one should first test a time series to see if it is stationary. A stationary series is one whose statistical properties, such as the mean, variance and covariance, do not change over time or space. As mentioned before, non-stationarity may be due to local temporal dependence, long term trends and seasonal variation. Violation of stationarity in climate extremes can lead to wrong estimates of return levels derived from a given extreme value model.
Statistical tests that are used to detect time series trends are broadly grouped into two categories: parametric and non-parametric methods. Non-parametric tests are more appropriate for hydro-meteorological time series data, which are generally non-normally distributed and censored ([49] and [7]). One of the well-established non-parametric (distribution-free) tests for trend detection is the Mann-Kendall (MK) test, first proposed by [32], further studied by [29] and improved by [21], who modified the test to take seasonality into account. The MK test is used when the series values can be assumed to be independent and identically distributed and, thus, seasonality is not expected in the series. However, quarterly temperatures are likely to vary with the season. Therefore, in this paper, we apply one extension of the Mk test which takes seasonality into account, namely, the Seasonal Mann-Kendall (SMK) test (due to [21]) to detect trends in the quarterly maximum temperatures for the SSK Airport in the period under study.
The purpose of a trend test is to determine if the time series has a monotonic trend. However, monotonicity of a series does not always indicate non-stationarity. Therefore, when the purpose is to identify non-stationarity in a time series, it is necessary to perform further tests on the series. The most commonly used stationarity test, the KPSS test, is due to [27]. Hence, in this paper, to investigate non-stationarity in the quarterly maximum surface air temperature time series data for the SSK Airport, we apply the KPSS test. The test is selected due to its common use in environmental studies (see [46], [47], [50] and [19]). If the test of stationarity indicates non-stationarity in a particular way, then that non-stationarity can be modelled using models for non-stationary extremes; if the test indicates stationarity, we do modelling under the stationary assumption. In fact, as pointed out by [49], it is useful to develop non-stationary GEV models when there is evidence of statistically significant trends even if stationarity tests do not indicate non-stationarity.
2.4.1. Seasonal Mann Kendall Trend Test
The Seasonal Mann-Kendall test is described in detail in [21], [22], [17] and [20]. It is used to test for a monotonic trend of the variable of interest when the data collected over time are expected to have seasonality and serial correlation. The test assumes that when no trend is present, the observations in the time series are not serially correlated.
Adopting the notation of [21] and [22], let denote the observation obtained in season in year , where a season may be a day, week, month, quarter or any other period of time. In the present paper, the season is a quarter of a year. Hence, let represent the entire dataset collected over years consisting of data subsets where , denotes the set of data for the ith quarter for n years,
The null hypothesis H0 for the two-sided (homogeneity) SMK test is that there is no monotonic trend in the series and the alternative hypothesis HA is that for one or more seasons there is an upward or downward trend over time.
If we denote the difference for the ith quarter, with , and let be the indicator function for quarter , then the Mann-Kendall statistic for the ith season is calculated as
, where m is the number of seasons, in our case, and
And, [21] have shown that the seasonal Mann-Kendall statistic for the entire series is given by , with variance and , where is the number of tied groups for the ith quarter and is the number of data points in the pth group for the ith quarter. Then the SMK test statistic is given by
(3)
Thus, under the null hypothesis, (Equation (3)) is equal to zero. On the other hand, a positive (negative) value of is an indicator of an increasing (decreasing) trend. Thus, we reject the null hypothesis in favour of the alternative if , where is the )^{th} percentile of the standard normal distribution, that is, if is improbably large.
2.4.2. Kwiatkowski-Philips-Schmidt-Shin (KPSS) Stationary Test
The KPSS test tests the null hypothesis that a time series is stationary against the alternative that the series is non-stationary due to the presence of a unit root. The test is derived by starting with the model
, with
where is a vector of regression coefficients, contains deterministic components (constant or constant plus time trend), is a pure random walk with innovation variance , and is a random error term of and may be heteroskedastic., denotes the error term of and is assumed to be a series of identically distributed independent random variables of expected value equal to zero and constant variance .
The KPSS test is a one-sided test with the null hypothesis of stationarity being equivalent to the assumption that the variance , which implies that is a constant and, hence, that is trend stationary. Thus, the KPSS test statistic is the Lagrange multiplier (LM) or score statistic for testing against the alternative that and is given by Equation (4) as
(4)
where , is the residual of a regression of on and is a consistent estimate of the long-run variance of using .
As the KPSS stationary test is a one-sided right-tailed test, one rejects the null of stationarity at the level if the KPSS test statistic K is greater than the quantile from the appropriate asymptotic distribution.
2.5. Parameter Estimation
Many methods are available in statistical literature for estimating the parameters of a probability distribution. These include the commonly used ones such as the methods of moments, maximum likelihood estimation, probability weighted moments and the L-moments method. In this study, we estimate the parameter of the GEV distribution using the widely used maximum likelihood estimation (MLE) method.
Suppose is a set of independent and identically distributed random variables having the GEV distribution. Then, the log-likelihood for the GEV parameters when is
(5)
provided , for ..
The MLEs of are determined by maximising Equation (5) with respect to the parameter vector . See [23] for details. No analytical solution to Equation (5) exists and numerical optimisation algorithms have been used in practice. In this study, the Newton-Raphson method was used to solve the likelihood equations above, following [23] and [31].
2.6. Model Assessment
2.6.1. Diagnostic Plots
Though it is impossible to check the validity of an extrapolation based on a GEV model, assessment can be made with reference to the observed data [10]. In this paper, we use the probability plot, quantile plot, return level plot and density plot for model checking. However, graphical diagnostic techniques are subjective in nature and should, therefore, be supported by more rigorous statistical tests such as the well-known Kolmogorov-Smirnov and Anderson-Darling as they allow for quantitative assessment of model fit.
2.6.2. Goodness-of-Fit and Model Selection
The two commonly used goodness-of-fit tests for extreme value models are the Kolmogorov-Smirnov test and the Anderson-Darling test, both of which are based on the empirical distribution function (edf). We apply these to the quarterly maximum surface air temperature for the SSK Airport data.
i. Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov (K-S) test is defined as the largest vertical difference between the empirical cumulative distribution function (edf) to and the theoretical cumulative distribution function (cdf) . It is used for testing the null hypothesis that a sample of size n comes from a population with a specified distribution. That is, if denote an ordered sample of independent observations from a population with distribution function , the K-S test statistic is given by Equation (6).
(6)
where the cdf evaluated at the ordered observation of the ordered sample, , and n is the number of observations. The null hypothesis of the test is rejected in favour of the alternative if the calculated value of D is improbably large.
ii. Anderson-Darling Test
The Anderson-Darling (A-D) test due to [1] is used for testing the null hypothesis that a sample of size n comes from a population with a specified distribution. It is based on the discrepancy between the empirical cumulative distribution function to and the theoretical cumulative distribution function . This test gives more weight to the tails of the distribution than the Kolmogorov-Smirnov test. Let denote an ordered sample of independent observations from a population with distribution function . The test statistic of the Anderson-Darling test is
(7)
where, as before, .
The distribution function of the A-D statistic is complicated, even asymptotically. However, [33] provide a computational method to evaluate the accuracy of the cdf for any sample size. And, the null hypothesis of the test is rejected in favour of the alternative if the calculated value of is improbably large.
2.7. Return Level Estimation
In extreme value analysis, often of interest to scientists is the T-year return level for annual extremal data, which is defined as the quantile (from the distribution of the extreme values) which has probability of being exceeded in a particular year. The main purpose of developing stationary models is to compare estimated return levels (expected quantiles) from the GEV distribution with the currently used design values based on some guideline, where available. Extreme value theory is often used to find return values for return periods that amply exceed the record length [45]. This implies extrapolation of the GEV fit to a domain outside the range of the observations. In this study, we consider the 5-year, 10 -year, 25-year, 50-year, 100-year, 110-year and 120-year return levels. The T-year return level is obtained by inverting the GEV distribution function given by Equation (2). Thus, if , then is given by (see [5], Equation 5.9)
(8)
And, the maximum Likelihood estimates for the return levels are obtained by substituting and into Equation (8).
3. Results and Discussions
3.1. Descriptive Statistics
Summary statistics for the SSK Airport quarterly maximum temperatures are contained in Table 1. The minimum (Min) and maximum (Max) surface air temperatures recorded at the SSK Airport during the study period are 26.5°C and 41.5°C, respectively. The small value of the coefficient of variation (CV) indicates little variability in the quarterly maximum temperatures. The p-value of 0.0005 of the Jarque-Bera (BP) normality test shows overwhelming evidence against the null hypothesis that the quarterly maximum temperatures follow a normal distribution. The negative values of the coefficients of skewness and excess kurtosis, respectively, indicate that the data are negatively skewed and not heavy-tailed. A kurtosis measure of less than 3 (excess kurtosis <0), as compared to that of a normal distribution (3), depicts a central peak that is lower and broader with tails that are shorter and thinner. These results and the histogram imposed on the density plot in Figure 2, which represents a left-skewed distribution, support the use of the GEV to model the quarterly maximum temperatures.
n | Min | Max | Mean | Median | Mode | S.D | CV | Skewness | Kurtosis | J.B (P-value) |
124 | 26.5 | 41.5 | 35.975 | 37.150 | 39.50 | 3.45 | 0.0958 | -0.851 | -0.17 | 15.049 (0.0005) |
Figure 1 shows a time series plot of the quarterly maximum surface air temperatures for the SSK Airport under the study period. From the figure, there appears to be no increase in the quarterly maximum temperatures through time. And, there is no strong indication that the pattern of variation in the quarterly maximum temperatures has changed over the years.
3.2. Testing for Trend and Stationarity
3.2.1. Seasonal Mann-Kendall Test Without Correlation
Statistics for | Quarter (Season) | Var () | Z | tau | p-value | |
Individual season | 1 | -48 | 3451.3 | -0.8 | -0.103 | 0.4139 |
2 | 120 | 3454.7 | 2.0 | 0.258 | 0.0412 | |
3 | -187 | 3457.7 | -3.2 | -0.402 | 0.00147 | |
4 | -62 | 3447.3 | -1.1 | -0.133 | 0.291 | |
Total series | -177 | 13811 | -1.5 | -0.095 | 0.13204 |
The results contained in Table 2 show that the quarterly maximum temperature at SSK Airport for Quarter 2 (April to June) have a positive value of , with p-value=0.0412, which indicates a significant increasing trend in the maximum temperatures for the season. The result suggests that the months of April, May and June are getting warmer with time possibly due to the effects of climate change. The results from the Seasonal Mann-Kendall test further reveal that the maximum temperatures for Quarter 3 (July to September), with and p-value=0.00147 are showing a significant decreasing trend. This result is also interesting since in Botswana, surface air temperatures normally begin to increase during July through September, after the winter season, with the coldest month of the year being June. The results seen here may be due to the effect of climate change but, the conclusion cannot be made definitively because the series consists of only 124 observations and, so, may not be long enough to us to discern the effects of climate change. However, according to the Seasonal Mann-Kendall trend test, for the total series, with p-value=0.13204, indicating that though the value of suggests a decreasing trend in the series, there is has not been a significant monotonic trend in the overall quarterly maximum temperatures for the area.
3.2.2. KPSS Test for Stationarity
The observed KPSS test statistics is with a p-value greater than the printed p-value of 0.1, indicating that the test is not significant. Therefore, we conclude that the given series of quarterly maximum surface air temperatures at the SSK Airport indicate that the data are stationary.
From the results of the Seasonal Mann-Kendall trend test and the KPSS stationarity test, the series of the quarterly maximum temperatures for the SSK Airport for the period under study exhibit neither a monotonic trend nor non-stationarity. Hence, we fit the GEV under the assumption of stationarity.
3.3. Parameters Estimates
Table 3 contains maximum likelihood estimates (MLEs) of the parameters of the GEV for the quarterly maximum surface air temperatures at the SSK Airport. The numbers in brackets indicate the standard errors of the estimates. The standard errors are generally small, indicating small estimation variance. The goodness-of-fit of the model is discussed in Section 3.5
Parameter | Location (μ) | Scale (σ) | Shape (ξ) |
Estimate (standard error) | 35.2761 (0.3530) | 3.6956 (0.2733) | -0.5833(0.0457) |
3.4. Model Assessment
3.4.1. Model Diagnostics
Figure 2 shows four diagnostic plots for goodness-of-fit test of the GEV distribution and it reflects the fitting degree between the theoretical distribution, GEV, and the actual sample series. The upper left plot is the probability plot. The upper right plot is the quantile plot. All the dots that represent sample data in the probability plot and the quantile plot are virtually located on a straight line with a slope of 1. The lower left plot is the return level plot. Its ordinate is the return level of the maximum temperature and its abscissa is the return period in year on a logarithmic scale. In the return level plot, the upper and the lower curves represent the confidence upper limit and confidence lower limit of the return level, respectively. All the dots representing sample data (quarterly maximum temperatures) are within the range of the confidence interval, concentrated near the middle convex curve and approaching a certain finite value. Therefore, like both the probability plot and the quantile plot, the return level curve supports the GEV model. The lower right plot is the density plot, which consists of the estimated probability density curve imposed on the histogram for the data. Its abscissa is maximum temperature (z) and its ordinate is the associated frequency. The estimated density curve also fits in with the histogram. Therefore, all the four diagnostic plots indicate the plausibility of the GEV (35.2761, 3.6956, -0.5833) distribution as a model for the SSK Airport quarterly maximum temperatures.
3.4.2. Goodness-of-Fit Tests: Kolmogorov-Smirnov and Anderson-Darling Test
Table 4 shows the Kolmogorov-Smirnov and Anderson-Darling test results, obtained using Equations (6) and (7), respectively, for the SSK Airport quarterly maximum temperatures. An inspection of the p-value of each test leads to a non-rejection of the null hypothesis that the series follows the generalized extreme value distribution.
Test | Statistic | p-value | Significance level | Reject |
A-D | 1.0482342 | - | 0.05 | no |
K-S | 0.099 | 0.164 | 0.05 | no |
Thus, we conclude that the SSK Airport quarterly maximum temperature series follows the GEV distribution which is the specified distribution under the null hypothesis.
3.5. Return Level Estimates
Once the best model for the data has been selected, interest is now in deriving the return levels of extreme quarterly maximum temperatures at the SSK Airport. The quarterly maximum temperature for the past 31 years was 41.5°C. Hence, to predict the probability that a quarterly maximum temperature exceeding 41.5°C will occur in a longer period, return levels were used based on maximum likelihood estimation method. The results, contained in Table 5, reveal that the surface air temperature for the SSK International Airport will be increasing over the next 120 years.
Return period T (years) | Estimated return level (°C) |
5 | 38.98 |
10 | 39.91 |
25 | 40.63 |
50 | 40.96 |
100 | 41.18 |
110 | 41.20 |
120 | 41.22 |
From the table, the return level corresponding to a 5-year event for the SSK Airport maximum temperature is 38.98°C. This is the temperature which has a probability of being exceeded, on average, once every year in the next 5 years. The return temperatures for the 10-year to the 120-year return periods range from about 40 to 41°C. Thus, the 120-year return level estimate (41. 22°C) means that in the SSK Airport, a maximum surface air temperature of 41. 2°C is expected to be exceeded, on average, once in 120 years with probability 0.008, a rare event, in deed.
The return level estimates above are actually not far from the upper end-point of the GEV distribution. Let denote the maximum likelihood estimate of the return level associated with the return period , with the abscissa of the return level plot the return period on a logarithmic scale. If , the right endpoint (asymptotic limit as ) of the GEV distribution is finite and is given by ([10] and [5]). For the SSK Airport quarterly maximum temperature series, the MLEs of the GEV parameters,, imply that °C. This finite right endpoint is evident on the return level plot in Figure 2, and means that, on the basis of the available data, we can conclude that the maximum surface air temperature will most likely never reach °C in the SSK Airport even beyond 120 years. However, the authors caution the reader that trend detection using extreme value analysis requires long-term and reliable data as currently observed trends may not persist in the future. As a result, great care must be exercised in extrapolating historical trends into the future, more particularly when making long-term projections of weather or climate extremes, as these may be affected by very different natural and anthropogenic causes than historical events. Hence, further studies involving larger data sets are necessary to identify trends in extreme temperatures in the SSK Airport and their timing, as well as possible links to climate change.
It is inarguable that unusually high temperatures observed in the study area pose risks to public health and safety and the environment. Prolonged dry conditions due to low and erratic rainfall coupled with very high temperatures will no doubt jeopardise access to clean drinking water, result in extreme heat events and flash flooding. Among the possible consequences of extremely high temperatures are deaths that may occur due to direct impacts and the indirect effects of heat-exacerbated, life-threatening illnesses, such as heat exhaustion, heatstroke, and cardiovascular diseases. The results of this study should, therefore, be of help to members of the public in the City of Gaborone and all other stakeholders to take precautionary in the event of extremely high surface air temperatures.
4. Conclusions
The generalized extreme value distribution is used to model quarterly maximum temperatures using data obtained from the Sir Seretse Khama international Airport weather station in Gaborone for the period January 1985 to December 2015. The Kwiatkowski-Philips-Schmidt-Shin test of stationarity on the series reveals that the maximum temperatures are stationary. The Seasonal Mann-Kendall trend test reveals no presence of monotonic trend. Parameters of the GEV distribution are estimated using the maximum likelihood method. Model diagnostics, which include the probability plot, quantile plot, return level plot and density plot, show a good fit of the GEV model to the quarterly maximum temperature series in the SSK Airport area. In addition, the Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests support the plausibility of the GEV (35.2761, 3.6956, -0.5833) distribution as a model for the SSK Airport quarterly maximum temperatures.
T-year return levels for the return periods 5, 10, 25, 50, 100, 110 and 120 years, respectively, are estimated. The results show that the surface air temperature for the SSK International Airport will be increasing over the next 120 years. On the basis of the available data, it is also revealed that the observed quarterly maximum temperature of 41.5°C cannot be exceeded even in the next 120 years.
This study has demonstrated how extreme value theory can be used in the estimation of extreme quantiles of surface air temperature in a weather station in Gaborone. Our hope is that the study will be used to understand and predict the trends of extreme weather events in other stations around Botswana and elsewhere. The main limitation of this study is a lack of availability of data on extreme temperatures in the SSK Airport prior to the year 1985. However, further studies involving larger data sets are necessary to identify trends in extreme temperatures and their timing, as well as possible links to climate change. Further studies will be carried out to model joint (multivariate) data on air temperatures at several sites around south eastern Botswana in order to understand how extremely high temperatures in one weather station relate to those in nearby stations.
References