Inter-Arrival Time Modeling of Threshold Scores in Mathematics Among School Pupils; A Case of Acacia Crest School, Kenya

Mathematical literacy is the ability to use numbers to help solve real-world problems. It focuses on pupils' ability to analyze, justify and communicate ideas effectively when formulating, solving and interpreting mathematical problems in a variety of forms and situations. This study modeled above-threshold scores in mathematics among school pupils as an indicator of mathematical literacy. The modeling was based on the inter-arrival times of scores above the threshold (the mathematics class mean) for a sample of pupils in their mid-term and end-of-term examinations. The Poisson distribution has been widely used for modeling count data outcomes and their associated inter-arrival times. However, for heavy-tailed inter-arrival times between successive outcomes, the Poisson distribution fits the observed data poorly, which motivates the use of other distributions that can handle such heavy-tailed data. The study used the generalized Gumbel and Weibull inter-arrival time distributions, which were assumed to nest the standard Poisson distribution; of the two, the Weibull inter-arrival model gave the better fit to the data. The data were secondary data on pupil performance in mathematics relative to other subjects at Acacia Crest School.


Introduction
Mathematical education involves more than executing techniques. It implies a knowledge base and the ability to apply this knowledge confidently in the practical world. Mathematically literate persons can evaluate and interpret information and tackle everyday problems. They can also reason in numerical, graphical and geometric settings and convey information in a logical manner.
With the growth of information and economic development worldwide, more individuals are working with advanced technologies in settings where mathematics is regarded as a cornerstone of solving everyday problems. This calls for a proper understanding of mathematics by pupils at school level, and thus for an evaluation of whether that understanding is achieved.
To determine the inter-arrival times at which pupils score above the threshold score (the mathematics class mean), and to use them to make inferences about pupil performance, the inter-arrival times of these scores need to be modeled correctly.
This research therefore models the inter-arrival times for threshold grades in pupils' mathematics scores at Acacia Crest School. The Weibull count model is compared with the Gumbel count model to determine which best fits the pupils' mathematics threshold scores.

Statement of the Problem
The measurement of academic literacy among pupils has been based on the performance of the Kenyan education system. Researchers in this field have identified factors that either overlap with other research findings or are worded differently but have the same effect on literacy skills.
In order to utilize the factors affecting mathematical literacy skills in Kenya, there is a need to reduce them to the most influential ones. This could lead to a better understanding of the underlying levels of mathematical literacy skills in Kenya.
This research considers the threshold score (the mathematics class mean) as an indicator of good mathematical skills, and thus seeks to determine which of the Gumbel and Weibull inter-arrival time distributions best models the time taken between two successive threshold scores among school pupils.

General Objective
The general objective of the study is to model inter-arrival times for threshold scores in Mathematics among school pupils using the Gumbel and Weibull Inter-Arrival time distributions.

Specific Objectives
The specific objectives are:
1) To model the inter-arrival times for threshold scores in mathematics among school pupils using the inter-arrival time distributions.
2) To determine the inter-arrival time model that best fits the data and use it to estimate the inter-arrival times for the threshold scores.
3) To perform goodness of fit tests on the fitted models used to estimate the inter-arrival times for the threshold scores.
4) To predict future inter-arrival times for threshold scores among school pupils using the fitted models.

Significance of the Study
Everybody is capable of becoming proficient in mathematics. The path towards this societal objective starts at home and in the school classroom, supported by the family and the community at large. All pupils can learn mathematics given enough help, resources and time, together with the expectation that they will succeed.
This study intends to stimulate a conversation on mathematical abilities in Kenyan schools, using Acacia Crest School as an instance. The approach taken was to model the times taken by a given cohort of pupils to attain threshold scores in their mathematics examinations throughout school.
This should in turn help educational policy makers to formulate policies aimed at improving mathematical literacy skills among school pupils. In the long run, this should help build a population that can solve real-life problems using the numeracy skills acquired.

Scope and Outline of the Study
The study focuses on using inter-arrival time distributions to model the time taken between one threshold score in mathematics and the next. The remaining sections of this study are organized as follows: chapter two gives the empirical literature review, chapter three describes the study methodology, chapter four gives the data analysis and chapter five gives the conclusions and recommendations.

Introduction
This chapter reviews empirical literature with regard to inter-arrival time modeling so as to get appropriate theories to substantiate this research.

Empirical Literature Review
In the modeling of inter-arrival time data, Dokter et al. (2017) utilized the gamma distribution [1], Bushra et al. (2018) investigated the impact of the packet inter-arrival time feature on online P2P classification in terms of accuracy, the Kappa statistic and time [3], and Farayibi (2016) investigated the inter-arrival times of customers in the Nigerian banking system [4].
Harahap (2018) modeled and simulated queue waiting time at a traffic light intersection [5], and Gleb et al. (2008) derived an analytical expression for the inter-arrival time distribution of a non-homogeneous Poisson process (NHPP) [6].
Seigha et al. (2017) evaluated the queuing system using the exponentiated Poisson process [7], while Mohammad et al. (2013) used the multiple-channel queuing model, with Poisson arrivals and exponential service times, to solve waiting line models [8]. Onoja & Kembe (2018) modeled the inter-arrival times on patient waiting time [9] and Sahana (2018) studied a counting process with generalized exponential inter-arrival times [2].
Eri & Mihaela (2019) presented a queue-based model of the aircraft arrival process at a single airport [16], while Stoynov et al. (2015) developed a stochastic point process, the ST distribution, for modeling the inter-arrival times of floods [15]. Manu et al. (2020) modeled the inter-arrival times of electric vehicles and their charging sessions using an exponential distribution [14], while Thiagarajan et al. (2020) studied queuing theory based on decentralized onload data using an expected max-min probabilistic decision for reducing workload [13].

Introduction
This chapter discusses the Weibull and Gumbel count data distributions used to model the inter-arrival times for threshold scores in Mathematics for Acacia Crest School pupils.

Inter-Arrival Times
The Weibull and Gumbel inter-arrival time distributions were used to model the inter-arrival times for threshold scores in Mathematics among school pupils, a case of Acacia Crest School. The mean of the inter-arrival time distribution was denoted by $\mu$, its variance by $\sigma^2$, and its hazard function by $h(t) = f(t)/(1 - F(t))$, where $f(t)$ and $F(t)$ are the density and cumulative probability functions respectively. To derive the relationship between inter-arrival times and their count model equivalent, $T_n$ was taken to be the time from the measurement origin at which the $n$th event occurs and $N(t)$ the number of events that have occurred up until time $t$.
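As a minimal illustration of the hazard function, the Python sketch below evaluates the standard definition h(t) = f(t) / (1 - F(t)) for a Weibull inter-arrival distribution. The function names and parameter values are illustrative assumptions and are not taken from the study.

```python
import math


def weibull_pdf(t, shape, scale):
    """Density f(t) of a Weibull distribution, for t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_cdf(t, shape, scale):
    """Cumulative probability F(t) of a Weibull distribution, for t > 0."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def hazard(t, shape, scale):
    """Hazard h(t) = f(t) / (1 - F(t))."""
    return weibull_pdf(t, shape, scale) / (1.0 - weibull_cdf(t, shape, scale))


# Hypothetical parameter values, chosen only to show the behaviour of h(t)
constant_hazard = [hazard(t, shape=1.0, scale=2.0) for t in (0.5, 1.0, 3.0)]
rising_hazard = [hazard(t, shape=2.0, scale=1.0) for t in (0.5, 1.0, 1.5)]
```

With a shape of 1 the Weibull reduces to the exponential distribution and the hazard is constant, which corresponds to the Poisson case; a shape above 1 gives a hazard that rises with time.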

Inter-Arrival Time Distributions
The relationship between inter-arrival times and the number of threshold scores is $T_n \le t \iff N(t) \ge n$. That is, the time $T_n$ from the time origin at which the $n$th threshold score occurred is less than or equal to $t$ if and only if the number of threshold scores $N(t)$ that have occurred by time $t$ is greater than or equal to $n$ [10,11].
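The link between arrival times and counts can be checked numerically. The sketch below simulates a short sequence of inter-arrival gaps and verifies that the time of the nth event is at most t exactly when at least n events have occurred by time t. The Weibull parameters, the seed and all names are assumptions made for illustration only.

```python
import random
from itertools import accumulate


def arrival_times(gaps):
    """Cumulative sums of the gaps: time of the 1st, 2nd, ... event."""
    return list(accumulate(gaps))


def count_by(times, t):
    """Number of events that have occurred up to and including time t."""
    return sum(1 for x in times if x <= t)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical Weibull gaps with scale 1.0 and shape 1.5 (not study estimates)
gaps = [rng.weibullvariate(1.0, 1.5) for _ in range(20)]
times = arrival_times(gaps)

# Check the equivalence for every event index n over a grid of time points t
grid = [0.5 * j for j in range(61)]
equivalence_holds = all(
    (times[n - 1] <= t) == (count_by(times, t) >= n)
    for n in range(1, len(times) + 1)
    for t in grid
)
print(equivalence_holds)
```

This equivalence is what allows a distribution for the gaps between threshold scores to be turned into a distribution for the number of threshold scores in a fixed period.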

Parameter Estimation
The maximum likelihood parameter estimation technique was used to estimate the fitted distribution model parameters. The likelihood functions for the Weibull and Gumbel Count models were respectively given as [12];
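To make the idea of maximum likelihood estimation concrete, the following sketch fits a two-parameter Weibull distribution directly to a sample of inter-arrival times. It is a deliberately simplified stand-in: it has no covariates and does not use the count-model likelihoods of [12]. The simulated data, the bisection bounds and every function name are assumptions.

```python
import math
import random


def weibull_mle(x, lo=0.01, hi=50.0, tol=1e-10):
    """Maximum likelihood estimates (shape, scale) for positive data x.

    The shape solves the profile score equation
        sum(x^k * ln x) / sum(x^k) - 1/k - mean(ln x) = 0,
    which is increasing in k, so bisection is sufficient.
    """
    n = len(x)
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / n

    def score(k):
        powers = [v ** k for v in x]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - 1.0 / k - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(v ** shape for v in x) / n) ** (1.0 / shape)
    return shape, scale


def weibull_loglik(x, shape, scale):
    """Log-likelihood of a Weibull(shape, scale) model for the data x."""
    return sum(
        math.log(shape / scale)
        + (shape - 1) * math.log(v / scale)
        - (v / scale) ** shape
        for v in x
    )


rng = random.Random(42)
# Hypothetical inter-arrival times: true scale 2.0 and true shape 1.5
sample = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]
shape_hat, scale_hat = weibull_mle(sample)
loglik_hat = weibull_loglik(sample, shape_hat, scale_hat)
```

The maximized log-likelihood returned at the end is the quantity that information criteria such as the AIC and BIC take as input.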

Model Selection
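This research used the AIC/BIC model selection criteria to select the model that best fits the data. If $k$ is the number of parameters, $n$ the sample size and $\hat{L}$ the maximized likelihood, then the AIC and BIC are given as $\mathrm{AIC} = -2\ln\hat{L} + 2k$ and $\mathrm{BIC} = -2\ln\hat{L} + k\ln n$.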
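The two information criteria are simple to compute once a maximized log-likelihood is available. The sketch below uses the standard definitions, AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n. The log-likelihood values and parameter counts are hypothetical placeholders, not the values reported in the study; only the sample size of 96 follows the number of examination results the study describes.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for log-likelihood loglik and k parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion for a sample of size n."""
    return -2.0 * loglik + k * math.log(n)


# Hypothetical (log-likelihood, parameter count) pairs for two fitted models
candidates = {"weibull": (-180.0, 9), "gumbel": (-185.0, 9)}
n_obs = 96

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in candidates.items()
}
# The preferred model is the one with the smallest criterion value
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

When both models have the same number of parameters, as assumed here, the AIC and BIC necessarily rank them in the same order as the log-likelihood.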

Model Diagnostics
To evaluate the goodness of fit of the fitted distributions, the study used the Pearson residual statistic, Pearson Chi-Square test, Cameron Trivedi test for Dispersion and the Hosmer-Lemeshow test.
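As a brief illustration of the first two diagnostics, the sketch below computes Pearson residuals, (observed - fitted mean) / sqrt(fitted variance), and sums their squares to obtain the Pearson chi-square statistic. The observed counts and fitted values are made-up numbers, and setting the fitted variance equal to the fitted mean is a Poisson-type assumption used only to keep the example short.

```python
import math


def pearson_residuals(observed, fitted_mean, fitted_var):
    """Pearson residual for each observation."""
    return [
        (y - m) / math.sqrt(v)
        for y, m, v in zip(observed, fitted_mean, fitted_var)
    ]


def pearson_chi_square(observed, fitted_mean, fitted_var):
    """Sum of squared Pearson residuals."""
    return sum(r * r for r in pearson_residuals(observed, fitted_mean, fitted_var))


# Hypothetical counts and fitted values (variance set equal to the mean)
observed = [2, 4, 6]
fitted_mean = [1.0, 4.0, 9.0]
fitted_var = [1.0, 4.0, 9.0]

residuals = pearson_residuals(observed, fitted_mean, fitted_var)
chi_square = pearson_chi_square(observed, fitted_mean, fitted_var)
```

Residuals that are centered on zero and roughly symmetric suggest that the fitted model does not systematically over- or under-predict the counts.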

Introduction
This chapter gave a detailed analysis of the research results and the subsequent discussions. The descriptive statistics, fitted model coefficients and residual analyses were used to aid the data exploration and further make meaningful insights about the data.

Data Analysis
To model the threshold scores in mathematics, a total of 96 examination results for Acacia Crest School were used in the study. These comprised mid-term and end-term examinations for classes five to eight over the study period 2016-2019. The average scores per subject, together with the overall class mean, were used to evaluate their effect on the above-threshold scores for the Mathematics subject. The descriptive statistics and the fitted model coefficients gave an analysis of the threshold scores data.

Descriptive Data Analysis
In the preliminary analysis of the threshold scores data, a total of six thousand one hundred and sixty five (6,165) pupil results were used in the study. Four hundred and four pupils recorded an above threshold score during the study period 2016-2019.
The above-threshold scores were evaluated as the number of pupils whose mathematics score was higher than the average mathematics score in their respective class for a given examination. Table 1 gave a summary of the threshold scores and Figure 1 gave a graphical visualization of the same. The mean of the threshold scores was higher than, but very close to, the median, which implied that the majority of the threshold scores were centered close to the mean value. The maximum and minimum threshold scores were 9.00 and 0.000 respectively. There existed a variance of 3.135 in the data, which was lower than the mean of the threshold scores, an early indication of under-dispersion in the threshold scores data.
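The construction of a threshold count for one examination can be sketched as follows: the threshold is the class mean, and the count is the number of pupils scoring strictly above it. The scores used here are invented for illustration and the function name is an assumption.

```python
from statistics import mean


def above_threshold_count(scores):
    """Number of scores strictly above the class mean (the threshold)."""
    threshold = mean(scores)
    return sum(1 for s in scores if s > threshold)


# Hypothetical mathematics scores for one class in one examination
class_scores = [40, 55, 60, 72, 83]
threshold_count = above_threshold_count(class_scores)
```

Repeating this for each class and each examination gives one count per examination result, which is the kind of response variable the count models are fitted to.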

Dispersion in the Threshold Scores Data
Since the variation in the threshold scores data with regard to the mean threshold score gave an indication of the presence of under-dispersion in the data, there was need to confirm the presence of this type of dispersion in the data.
The Cameron Trivedi test for dispersion was used and it gave a value of 0.8028. This was less than the equidispersion value of 1 thus confirming the presence of underdispersion and thus the need for the use of appropriate models that can handle the same.
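A dispersion value of this kind can be sketched with a simple moment estimator. If the variance is assumed to be phi times the mean, then ((y - mu)^2 - y) / mu has expectation phi - 1, so one plus its sample average estimates phi. Whether the study computed its reported value this way is an assumption; the counts below are invented, and the fitted means are simply set to the sample mean.

```python
from statistics import mean


def dispersion_estimate(counts, fitted_means):
    """Moment estimate of phi in Var(y) = phi * mean.

    Values below 1 point to under-dispersion and values above 1 to
    over-dispersion; a value of 1 is equidispersion.
    """
    aux = [((y - m) ** 2 - y) / m for y, m in zip(counts, fitted_means)]
    return 1.0 + mean(aux)


# Hypothetical counts, with an intercept-only fit (every fitted mean = sample mean)
counts = [2, 3, 3, 4, 3]
mu = mean(counts)
estimate = dispersion_estimate(counts, [mu] * len(counts))
is_under_dispersed = estimate < 1.0
```

A formal test would also attach a standard error to this estimate before concluding that the data are under-dispersed.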
The Weibull inter-arrival time model was used in the study in comparison with the Gumbel inter-arrival time model. For the fitted Weibull model, the shape and scale parameters had an 81% and a 21% respective effect on the threshold scores. The total number of pupils had a 1.72% positive effect on the total expected threshold scores. Maths, English and Kiswahili average scores had a 0.16%, 0.53% and 0.51% respective positive effect on the threshold scores.

Fitted Model Coefficients
For the Weibull model, Science, Social Studies and the class mean average score had a respective 0.11%, 0.40% and 0.64% negative effect on the threshold scores. The total number of pupils, the English mean score and the shape parameter had a significant effect on the threshold scores. Table 3 gave a summary of the fitted Gumbel inter-arrival count model. The location and scale parameters had a 4.7332 and a 0.4198 respective unit effect on the threshold scores. The total number of pupils had a 0.68% positive effect on the total expected threshold scores. Maths, Science, Social Studies and the class mean averages had a 0.23%, 0.42%, 0.36% and 0.67% respective negative effect on the threshold scores.
English and Kiswahili mean average scores had a respective 1.31% and 1.15% positive effect on the threshold scores. The total number of pupils, the English mean score and the Social Studies mean score had a significant effect on the threshold scores.
Tables 2 and 3 showed a consistent effect of the English mean average on the threshold scores. This indicated a need to enhance the teaching of English as a means of attaining high mathematical literacy at Acacia Crest School. The mathematics averages did not by themselves have a significant effect on the threshold scores.

Results Discussion
To aid the discussion of results for the threshold scores data, the fitted model residuals were analyzed. The Pearson residuals, the Hosmer-Lemeshow goodness of fit test and the AIC/BIC model selection criteria were used. Table 4 gave a summary of the Pearson residuals and Figure 2 gave the residual plots for the fitted models. The Weibull inter-arrival count model had a median residual closer to zero than that of the Gumbel inter-arrival distribution with location and scale parameters. The Weibull residuals were also more nearly symmetric than the Gumbel residuals. This indicated that the Weibull model was less biased than the Gumbel model in modeling the threshold scores data.

Hosmer Lemeshow Test
As a goodness of fit test, the Hosmer-Lemeshow test was used, as given in Table 5. Both models had a p-value of 1 on 8 degrees of freedom, which indicated a good fit. The Weibull model had an X-squared value smaller in magnitude (−1.2986) than that of the Gumbel model (−2.4908), suggesting a better fit by the Weibull model to the threshold scores data.

Model Selection
Model selection for this study was informed by the information criteria. Table 6 gave a summary of the AIC, BIC and log-likelihood of the fitted models. The Weibull model had the lowest AIC, BIC and log-likelihood values, which indicated that it fitted the threshold scores better than the Gumbel model.

Threshold Scores Prediction
Since the Weibull model gave a better fit to the threshold scores data than the Gumbel model, it was used to predict future threshold scores. Table 7 and Figure 4 gave a summary of the predicted threshold scores and the associated histogram. The mean of the predicted threshold scores was lower than, but very close to, the median, which implied that the majority of the predicted threshold scores were centered close to the mean value.
The maximum and minimum predicted threshold scores were 7.00 and 0.000 respectively. There existed a variance of 2.44 in the predicted data, which was lower than the mean of the predicted threshold scores, an indication of under-dispersion in the predicted threshold scores data.

Chi-Square test for Estimated and Predicted Threshold Scores
In order to determine the association between the estimated and predicted threshold scores, the Chi-Square test was used as in Table 8. The Chi-Square P-Value was close to zero which gave an indication of the presence of association between the Estimated and Predicted Threshold Scores. This confirmed the usefulness of the Weibull model in modeling Threshold Scores.

Introduction
This chapter is the final stage of the study; it gives conclusions to the findings and recommendations for future research.

Conclusion
The study notes that in the modeling of inter-arrival times, other distributions that nest the Poisson distribution ought to be used. This is in line with Dokter et al. (2017), who utilized the gamma distribution [1], Farayibi (2016), who used multi-server queuing theory [4], Gleb et al. (2008), who used the non-homogeneous Poisson process [6], and Seigha et al. (2017), who used the exponentiated Poisson process [7].
In comparison with the Gumbel inter-arrival count model, the Weibull inter-arrival count model ought to be used in the modeling of inter-arrival time data. This is because the Weibull inter-arrival count model has the lowest AIC/BIC values, which implies a better fit of the model to the data. It is also less biased, as seen in the symmetry of its residuals compared with the Gumbel model residuals.
For mathematical literacy skills, English had a significant effect on the threshold scores, whereas the Mathematics subject mean itself did not. This indicated that lowering or increasing the Mathematics subject mean did not have a significant effect on the number of threshold scores, while for the English subject there was a significant effect. This was attributed to the fact that English is a language of instruction in the teaching of Mathematics in Kenya.

Recommendations
The study notes that in developing strategies aimed at improving mathematical literacy skills among school-going children, emphasis should be given to the teaching of English. This study applied the Weibull and Gumbel inter-arrival time models to the modeling of threshold scores. The study notes that these models can be extended to Bayesian analysis with longitudinal data and can incorporate other statistical techniques such as bootstrapping.