Determination of the Severity of Motorcycle and Tricycle Accidents in Nigeria

Road traffic accidents are a widespread problem, causing injury and loss of life and property worldwide. In this research, a system for determining the severity of motorcycle accidents in Lokoja Metropolis of Central Nigeria was developed. The research considered different areas of Lokoja that are highly prone to accidents. Although accidents cannot be totally avoided, their frequency and severity can be reduced through scientific analysis. The methodology used is Knowledge Discovery in Databases, with the Decision Tree algorithm as the soft computing technique used for analysis. The Python programming language was used for the implementation. The dataset was obtained from the Federal Road Safety Corps (FRSC) in Lokoja. After training and testing on the dataset, we achieved an accuracy of 90.5%. The motorcycle accident severity prediction system developed could serve as a tool to curb the enormous challenges faced by the FRSC in curtailing motorcycle accidents.


Introduction
Movement of people and goods from one place to another is inevitable, and transport is therefore an important element of development. The provision of transport infrastructure has grown extensively across the world through a wide range of modal networks, which have undergone tremendous technological advances in motive power, tracks, and the vehicles that carry passengers and goods. In the past decade, the use and ownership of motorcycles as a means of transport in the country has been on the increase, which has significantly affected the socio-economic facets of people's lives. With the liberalisation of public transport by the government of Nigeria in the early 1960s, commercial motorcycles, popularly known as 'okada', were introduced in Nigerian cities and progressively into rural areas. More recently, the introduction of the tricycle, popularly known as 'keke napep' or 'keke', has not only improved the socio-economic life of the people but has also created employment for the teeming unemployed youths, especially in the present situation where unemployment is widespread [1]. This, however, has worsened intra-urban road traffic accident records in the cities: the riding habits of the cyclists show that the majority of them disobey traffic regulations and road signs and operate on intra-city roads with abandon and without due regard for their own safety. Their carelessness on the roads is overwhelming, and they pay little or no attention to other road users [2]. This has resulted in a series of road mishaps involving fatalities and traffic congestion, which have negatively affected the socio-economic life of inhabitants in most Nigerian cities, towns, and villages.
The profile of the majority of the operators is alarming. The operators are usually very young males, with mean age ranging from 25 to 36 years, who are illiterate or semi-literate, indulge in drugs or psychoactive substances, are ignorant of traffic codes, have no form of training in the use of motorcycles, and mostly hire the motorcycles they use for business. These characteristics explain their behaviour on the road [2,3]. In addition, compared with users of other modes of transportation, motorcycle users are more prone to serious injuries.
Road traffic accidents are responsible for over 1.2 million deaths and about 4.8 million injuries worldwide each year, according to the World Health Organization (WHO) in 2004. Road traffic accidents are increasing daily and are one of the major causes of death in developing countries. The rate of accidents on Nigerian roads has become a menace to many Nigerians because of poor road conditions [2]. The inability of the Nigerian government to provide better means of transportation, such as intra-city railways or the monorails found in some other countries, is the reason why motorcycles and three-wheeled vehicles are common means of transportation. Lokoja, a confluence and historical town in Central Nigeria, has experienced a series of changes over the years. Rapid population growth, coupled with the rise in commercial activities and narrow roads, has encouraged the use of motorcycles and tricycles, which weave through traffic with ease. As a result, they are the commonest means of transportation for residents of Lokoja who do not have other means of transport. They also offer door-to-door services at affordable rates, which has endeared them to most people [4].
As the use of these means of transportation increases, it causes congestion, mostly in the early and evening hours of the day, in some of the densely populated areas of the town, making them highly prone to accidents. It has therefore become paramount to develop tools that predict accidents in these areas, so that the Federal Road Safety Corps and/or traffic wardens (police) can station their officers there to guide road users and prevent such accidents. Although accidents cannot be avoided or prevented entirely, their frequency and severity can be greatly reduced.
The severity of road traffic crashes is a major problem in developing countries. Although injuries sustained in road traffic accidents are an important public health concern in developing countries, they remain a neglected epidemic owing to the lack of appropriate policy responses, or poor implementation of existing ones, to prevent road traffic crashes and make the roads safer for vulnerable road users [5]. The risk of severe injury to the occupants of motorcycles and tricycles is further increased because these are open vehicles without safety devices such as seat belts and airbags. Pedestrians hit by motorcycles and tricycles are also at risk of injuries of varying severity, depending on the orientation of the pedestrian on impact. The vulnerability of pedestrians and other road users is even higher where most roads lack pedestrian walkways and where disregard for traffic rules and safety measures is common among drivers, as is often the case in developing countries. Many researchers [1,3,4,6,7,8,9,10,11] have provided useful information on the proportion of motorcycle traffic, the long-term trend of motorcycle crashes, the riding behaviour of motorcyclists, the causes of motorcycle accidents, the identification of accident black spots in a geographical area, the characteristics of motorcycle crashes, and the factors associated with such crashes in Nigeria; however, the severity [4] of motorcycle and tricycle mishaps has yet to be considered. There is scanty information about motorcycle- and tricycle-related road traffic injuries, the pattern of injuries, and the crash characteristics of two-wheeled motorcycles. Overall, there is very limited data on motorcycle and tricycle injuries and on the vulnerability of their occupants and of pedestrians. This study therefore aims to determine the severity of motorcycle and tricycle mishaps using the decision tree algorithm of machine learning. The research uses a dataset collected from Lokoja metropolis.
Decision trees have also been used to analyse and predict the severity of accidents. They provide a suitable model for finding the causes of accidents because they are easy to interpret and decision rules can readily be derived from them. Learning begins with observations or data, such as direct experience, from which the system learns to make better decisions in the future. This approach has been used in different fields such as medicine, agriculture, education, and sales (for example, customer management). The rest of the paper is organized as follows. Section two gives the literature review, while Section three presents the methods deployed in achieving the purpose of this research. The results are presented in Section four and discussed in Section five. Finally, Section six gives the unique contributions of this article, the limitations of the research, and some future research directions.

Literature Review
A road traffic accident can be seen as an occurrence in which one or more vehicles collide with a road barricade, another vehicle, a person or an animal. Road accidents do not only affect people but also cause damage to or loss of property. The major aim of a transportation system is to convey people and goods safely from one place to another. The growing number of cars, motorcycles and tricycles on the road has brought about loss of lives and property due to accidents that occur daily [12].
The agency responsible for enforcing road safety rules in Nigeria is the Federal Road Safety Corps (FRSC). It was established by the Federal Government of Nigeria via Decree 45 of 1988, as amended by Decree 35 of 1992, with effect from 18th February 1988. The Commission was given responsibility for the administration of road safety in Nigeria. The FRSC reported in 2011 that between January 2007 and June 2010 a total of 4,017 truck accidents happened on Nigerian roads, with an annual average of 1,148 cases and a monthly average of 96 cases [6,8].

Factors Determining Road Traffic Accident
Accidents are complex processes that can be explained as a consequence of different influential factors [13]. To understand and address road traffic accidents, it is important to understand the underlying factors responsible for them [14]. Many researchers have conducted studies to explore these factors [13,15,16,17]. Zhang [18] identified and quantified the factors affecting highway crash severity in Louisiana, using an Ordered Mixed Logit model for prediction. The factors identified include the age and gender of the driver, vehicle speed, alcohol consumption, seatbelt usage, whether the driver was ejected from the vehicle, whether the crash was a head-on collision, whether an airbag was deployed, and whether one of the vehicles was following too closely behind another. Keller [19] modelled crashes with an ordered probit model, which showed that crashes involving a pedestrian or bicyclist have the highest probability of a severe injury. For motor vehicle crashes, left-turn, angle, head-on and rear-end crashes cause higher injury severity levels. Division (a median) on the minor road, as well as a higher speed limit on the minor road, was found to lower the expected injury level. Ratanavaraha and Suangka [20] evaluated the factors that affect accident severity on expressways in Thailand. The independent variables evaluated include average speed on the road section, average daily traffic volume, period of time, weather conditions, physical characteristics of the accident area, and cause of accident. The study found that speed limit was the only factor that affects accident severity on expressways. It is noted that most previous studies focus more on human factors and vehicle equipment (driver age, gender, alcohol usage, seatbelt usage, airbag, etc.).
On the infrastructure side, although knowledge is being accumulated to relate the severity of traffic crashes to roadway characteristics (such as road function class, roadway alignment, speed limits, etc.) and environmental factors (such as weather, and road lighting condition), such knowledge is mostly qualitative in nature [14].

Road Traffic Accident Epicentres
These are areas that are prone to accidents. Researchers have employed different methodologies to identify and predict these areas. For instance, [22] introduces the possibility of using accident prediction models to identify hazardous road locations. The application of this method is presented with an example of secondary rural roads in the South Moravian region, which are classified into road segments that are homogeneous in terms of basic geometric and traffic characteristics. The prediction model is a generalized linear model which, on the basis of the available data, determines the expected number of accidents for individual types of road segments. This method can be used as an effective tool for road network safety management.
Akomolafe and Olutayo [6] studied the various techniques used to analyse the causes of accidents along the Lagos-Ibadan expressway. The research used the decision tree algorithm, and the results showed that the causes of accidents, the specific times and conditions that could trigger accidents, and accident-prone areas could be effectively identified. In a similar vein, [9] combined a decision tree and an artificial neural network to discover new knowledge from historical data about accidents on one of Nigeria's busiest roads, the Lagos-Ibadan expressway. The data was organized into continuous and categorical data. The continuous data was analysed using the Artificial Neural Network technique, while the categorical data was analysed using the Decision Tree. Sensitivity analysis was performed and irrelevant inputs were eliminated. The performance measures used include Mean Absolute Error (MAE), the confusion matrix, accuracy rate, true positives, false positives and the percentage of correctly classified instances.
Muhammad et al. [23] used the decision tree algorithm to predict the likely causes of accidents, accident-prone locations and times along the Kano-Wudil highway, with a view to taking the necessary measures to avert accidents along the road.

Traffic Congestion Detection
Traffic congestion is a transport condition characterized by slower speeds, longer trip times, and increased vehicular queuing. It is a great problem for road users, as many accidents are due to it. Wang et al. [16] studied traffic speed and proposed an error-feedback Recurrent Convolutional Neural Network structure for continuous traffic speed prediction. By integrating the spatio-temporal traffic speeds of contiguous road segments as an input matrix, the method explicitly leverages the implicit correlations among nearby segments to improve predictive accuracy. By further introducing separate error-feedback neurons to the recurrent layer, the method learns from prediction errors so as to meet the predictive challenges arising from abrupt traffic events such as morning peaks and traffic accidents. A novel influence function was designed based on the deep learning model, and the authors showed how to use it to recognize the congestion sources of the ring roads in Beijing.

Severity of Traffic Accidents
The severity of an accident has to do with the extent of the havoc wreaked by the accident. Moghaddam et al. [24] used a series of artificial neural networks (ANNs) to model and estimate crash severity and to identify significant crash-related factors on urban highways. The usefulness of ANNs in engineering science has been demonstrated in recent years. The results show that variables such as highway width, head-on collision, type of vehicle at fault, ignoring lateral clearance, following distance, inability to control the vehicle, exceeding the permissible speed and deviation to the left by drivers are the most significant factors that increase crash severity on urban highways.
Similarly, Li et al. [24] used statistical analysis and data mining algorithms on the FARS fatal accident dataset in an attempt to address this problem. The relationship between the fatality rate and other attributes, including collision manner, weather, surface condition, light condition and drunk driving, was investigated. Association rules were discovered with the Apriori algorithm, a classification model was built with the Naive Bayes classifier, and clusters were formed with the simple K-means clustering algorithm. Certain safe-driving suggestions were made based on the statistics, association rules, classification model, and clusters obtained.
Ramachandiran et al. [25] used classification algorithms to obtain the relation between various attributes that determine the severity of an accident. First, the Naive Bayes classification algorithm was used to obtain the severity of an accident for the given attribute values. Next, the Decision Tree algorithm was used to perform the same function. The accuracy of the two algorithms was compared so that the better algorithm could be used. The main aim of the project was to find a relationship between the factors leading to an accident and the severity of the accident. The contribution of that paper is a tool to predict whether an accident that occurs under the given parameters is critical or non-critical.
Hazaa et al. [26] surveyed the latest studies in the field of traffic accident prediction and the most important tools and algorithms used in the prediction process, such as back-propagation neural networks and the decision tree. They proposed a model for predicting traffic accidents based on a dataset obtained from the Directorate General of Traffic Statistics, IBB, Yemen. The classification algorithm was applied to the dataset, which was divided into a training group and a test group, to obtain predictive results that would contribute to the reduction of traffic accidents after appropriate evaluation and discussion.
Yahaya et al. [10] proposed a decision tree data mining algorithm based on the arithmetic mean of information gain and correlation ratio, which addresses the bias and improves the accuracy of information-gain-based decision tree algorithms. The proposed algorithm was demonstrated using a road accident dataset from the Gombe-Numan-Yola highway, Nigeria, and gave 93.29% accuracy against 74.93% for the information gain decision tree algorithm. The proposed algorithm minimized the bias of information-gain-based decision tree algorithms for datasets with a large number of attributes with different data types and distinct values. Also, Wahab and Jiang [11] used machine-learning-based algorithms to predict and classify motorcycle crash severity. The aim was to evaluate different approaches to modelling motorcycle crash severity and to investigate the effect of risk factors on the injury outcomes of motorcycle crashes. The dataset was classified into four injury severity categories: fatal, hospitalized, injured, and damage-only. Three machine-learning-based algorithms were used: the J48 Decision Tree classifier, Random Forest (RF) and instance-based learning with parameter k (IBk). The results of the study revealed that the predictions of the machine learning algorithms are superior to those of the multinomial logit model (MNLM) in accuracy and effectiveness, and that the RF-based algorithm shows the best overall agreement with the experimental data of the three machine learning algorithms, owing to its global optimization and extrapolation ability.

Summary of Research Findings
Issues such as accidents are country-specific, because the level of development varies across countries in terms of roads, road networks, literacy, the sophistication of traffic legislation, and the agencies that enforce these laws. Consequently, methods for predicting and preventing traffic accidents must continually be adjusted to capture new features. In addition to lapses in legislation, each country has unique economic, political, social, and institutional opportunities and barriers, which make accidents, especially those involving motorcycles, differ among countries. A crucial and peculiar issue is the high level of irresponsibility among cyclists.
It is worth noting that most previous studies focused on vehicular accidents and few considered motorcycles. Among the works on motorcycles, none addressed the Nigerian case. This locality is of particular interest because of the poor state of the roads and the road network. Also, most motorcycle users are illiterate and have little or no knowledge of traffic rules. Previous studies also concentrated on human factors and vehicle equipment (driver age, gender, alcohol usage, seatbelt usage, airbag, etc.). On the infrastructure side, although knowledge is being accumulated to relate the severity of traffic crashes to roadway characteristics (such as road function class, roadway alignment, speed limits, etc.) and environmental factors (such as weather and road lighting condition), such knowledge is mostly qualitative in nature.
The publications and research works reviewed in this section apply machine learning algorithms and techniques, including the decision tree, to problems in road accidents.
The method employed in this research is the Decision Tree algorithm, chosen for its efficiency and accuracy. The aim of using a Decision Tree is to create a training model that can predict the class or value of a target variable by learning decision rules from prior data. A Decision Tree clearly maps out the problem so that every possible option can be examined. It provides a framework for quantifying the values of outcomes and the probabilities of achieving them.
Many traffic crash prediction models have been built over time, some of which failed to achieve the desired results. Different approaches have been used to implement these models, such as Decision Trees, Support Vector Machines, Association Rules and the Naive Bayes algorithm. Effective crash prediction is of great importance for improving safety on our roads. In this research we consider traffic crash prediction based on the Decision Tree algorithm.
The information required for traffic crash prediction includes the location of the crash, the time the crash occurred, the types of vehicles involved and so on. These have to be looked into by the Road Safety Commission to reduce the rate of accidents on the road. A model will be trained on the gathered information using machine learning techniques, and the trained model will be used to predict future traffic crashes in a specified location.
The main problem of the existing models is low prediction accuracy due to poor traffic crash data records in Nigeria. Traffic flow is not properly monitored by real-time monitoring systems. Also, building Traffic Crash Prediction (TCP) models is not an easy task because of the processes involved and the difficulty of getting adequate data. Problems related to the existing models of Traffic Crash Prediction include:
1. Data unavailability.
2. Lack of information security.
3. Management of the existing crash prediction systems is relatively expensive.
4. Lack of real-time monitoring systems on the roads, which is a major cause of inaccurate crash prediction.

Data Collection
The historical data was sourced through a traffic census of motorcycle accidents carried out by the author during this research. Records from the Federal Medical Centre, the General Hospital, the Police State Headquarters and the Federal Road Safety Corps office, all in Lokoja, were collected for the period 2015-2019. A total of 184,514 motorcycle and tricycle accidents were recorded, of which there were 163,958 crash-induced injuries and 20,556 deaths. The dataset has the following features: identity, time, driver, accident type, road, weather, driver's condition, time of the day, location and severity.

Decision Tree Algorithm
Decision tree classifiers are among the most popular and widely used classification techniques. The tree is constructed from the given data using simple equations and an attribute selection measure, such as the gain ratio, which ranks the attributes and determines the most useful one; the researcher can thereby identify the attributes that are most informative for the prediction task. The decision tree is one of the main data mining techniques used to build classification models. It is a very practical method: it is relatively fast, does not require domain knowledge or parameter setting, can deal with multidimensional data, and can easily generate a set of simple classification rules that are interpretable and understandable to humans. In general, decision tree classifiers have good accuracy. The decision tree algorithm was chosen for this research because of its simplicity and accuracy in decision making. The idea behind the decision tree is to create a training model that can be used to predict the class or target variable. The pseudocode for the decision tree algorithm is shown below. Create
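The attribute selection step described in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records and the binary severity coding are assumptions made for illustration, and only the gain ratio measure itself is taken from the text.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of `attribute` with respect to `target`."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value the attribute takes.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (not taken from the Lokoja dataset).
records = [
    {"weather": "rain", "condition": "drunk", "severity": 1},
    {"weather": "rain", "condition": "sober", "severity": 0},
    {"weather": "clear", "condition": "drunk", "severity": 1},
    {"weather": "clear", "condition": "sober", "severity": 0},
]

# Rank the candidate attributes from most to least useful.
ranking = sorted(["weather", "condition"],
                 key=lambda a: gain_ratio(records, a, "severity"),
                 reverse=True)
print(ranking)
```

In this toy data the rider's condition separates the two severity classes perfectly while the weather carries no information, so a tree builder using this measure would split on the condition first.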

Results
The system was implemented using the Python programming language and the decision tree algorithm. The system is web-based with two modules: a user authentication module (which allows the admin to log in to the system) and an analysis module (where the admin performs analysis on the data). The performance evaluation of this system was carried out using different workloads and metrics to determine the efficiency of the system and to find out how well and how fast it responds to different data inputs. A system's performance is crucial in the design, procurement, and use of systems; the aim is to get optimal performance at minimum cost.
To carry out the evaluation, we split the entire dataset into two sets, using 70% for training and 30% for testing. We then fit the model on the training data and evaluate it on the test set, computing accuracy as the proportion of test records whose severity the trained model predicts correctly.
The results obtained from the demonstration of the system are shown in the figures below. One graph shows the number of people involved in the accident against the severity of the accident. Another graph plots the driver (whether the rider was male or female) against severity. It shows that male riders were involved in more accidents than female riders and therefore account for higher severity. A further graph shows the condition of the rider (drunk, sober, or normal) against severity. More accidents were recorded for riders who were drunk. The severity of the motorcycle crash when the rider was sober is seen to be the highest of the three conditions.
From the test results we obtained the precision, recall and F1 score, which are used to assess the quality of the predictions. Precision is the ratio of correctly predicted positive observations to the total predicted positive observations. Recall is the proportion of all actual positive observations that the algorithm classifies correctly. The F1 score is the harmonic mean of precision and recall. On average, the classes "0" and "1" were classified correctly.
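The three metrics defined above can be computed directly from the true and predicted labels. The following is a minimal sketch under stated assumptions: the function name and the two short label lists are hypothetical and are not the study's test data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical labels: 1 = severe, 0 = not severe (assumed coding).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here four positives are predicted correctly, one negative is wrongly flagged and one positive is missed, so precision and recall are both 4/5.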

Discussion
The developed web application was used to predict the severity of motorcycle accidents in Lokoja, using historical data on motorcycle accidents that had previously occurred. The historical data is processed and stored in the database. We split the entire dataset into two sets, using 70% for training and 30% for testing. We then fit the model on the training data and evaluate it on the test set. With the Decision Tree, we obtained an accuracy of 90.5% for our model. During the development phase, the system was tested to determine its performance and response time using a test workload and metrics. The use of the web application for motorcycle accident severity prediction will increase the quality of traffic services in the area of road safety. To put the system into action, a knowledge-based decision system was built. The decision tree algorithm was used to represent the knowledge and to identify significant rules. It was run on the motorcycle and tricycle accident dataset with different numbers of attributes. Rules were generated based on the following attributes: identity, time, driver, accident type, road, weather, driver's condition, time of the day, location and severity. The system was implemented in the Python programming language.
The results presented in the previous section show that the accuracy, precision, recall, and F-measure were 90.50%, 97.00%, 93.00%, and 95.00% respectively. These results are broadly in agreement with those of [9], where the accuracy of the Decision Tree was 77.70%, and with those of Yahaya et al. [10], where the accuracy was 83.52%. In addition, the precision, recall, and F-measure were poor for the minority classes (Fatal and Serious) and biased towards the majority class (Slight); this is due to the skewed distribution of data between classes in the imbalanced original dataset.
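The 70/30 evaluation procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the system's actual code: the records are synthetic, the helper names are hypothetical, and a one-attribute rule stands in for the full decision tree so that the sketch stays self-contained.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_one_rule(train, attribute, target):
    """Stand-in model: predict the majority target for each attribute value."""
    votes = {}
    for row in train:
        votes.setdefault(row[attribute], []).append(row[target])
    table = {value: max(set(labels), key=labels.count)
             for value, labels in votes.items()}
    all_labels = [row[target] for row in train]
    default = max(set(all_labels), key=all_labels.count)
    return lambda row: table.get(row[attribute], default)


def accuracy(model, rows, target):
    """Proportion of rows whose target the model predicts correctly."""
    return sum(1 for row in rows if model(row) == row[target]) / len(rows)


# Synthetic records (assumption): severity is 1 exactly when the rider is drunk.
rows = [{"condition": "drunk" if i % 2 else "sober", "severity": i % 2}
        for i in range(20)]

train, test = train_test_split(rows)
model = fit_one_rule(train, "condition", "severity")
test_accuracy = accuracy(model, test, "severity")
print(len(train), len(test), test_accuracy)
```

The model is fitted only on the training portion and scored only on the held-out portion, which is what makes the reported accuracy an estimate of performance on unseen records.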
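As a quick consistency check on the reported figures, the F-measure can be recomputed as the harmonic mean of the reported precision and recall. The function name below is hypothetical; the two input values are the ones stated in the text.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the discussion.
reported_precision = 0.97
reported_recall = 0.93

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 2))  # 0.95
```

The result rounds to 0.95, which matches the reported F-measure of 95.00%.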

Conclusion
At the end of this research, a model for predicting the severity of motorcycle accidents was developed using the decision tree algorithm and the dataset provided by the Federal Road Safety Corps (FRSC). The emerging trend in the road safety sector is the use of automated systems. Lapses in the security of such a system, such as an unauthorized user (admin or doctor) gaining access and analysing accidents, can have serious consequences that extend beyond the sector. The motorcycle accident severity prediction system developed could serve as a tool to curb the enormous challenges faced by the sector, as it does not allow a new user to register as admin without permission from the FRSC. The model is highly recommended for the FRSC. It is also recommended that the FRSC improve road safety in rural regions by using an automated road traffic accident prediction model such as the one developed here.