Assessment of Groundwater Quality Parameters Using Multivariate Statistics: A Case Study of Majmaah, KSA

This paper evaluates groundwater quality parameters in part of Majmaah City, Saudi Arabia. The study uses multivariate statistical methods (Principal Component Analysis, PCA, and/or Factor Analysis, FA) to determine the most important water quality parameters in the study area, with the aim of providing an environmental assessment of its current situation. Water quality parameters with levels above the standards are identified, and the reasons behind this contamination are investigated. Real data were collected from 15 groundwater wells near farms in the Gewy area. The results of the water analyses were subjected to intensive statistical tests, and interpretations were made in light of physical factors. The paper also introduces a cost-effective methodology for reducing the number of water quality parameters to the most important variables, based on the multivariate statistical results. The interrelationships between different water quality parameters in the study area were also investigated. Moreover, geostatistical techniques and GPS data for the same wells were used to characterize some water quality parameters in 2D, and the results are presented as contour maps and 3D representations of these variables. The results are presented as environmental maps of the water quality parameters in the study area and give, for the first time, a characterization of the water quality in the selected portion of Majmaah City.


Introduction
Worldwide, there is increasing concern about the quality of drinking water. Water is a source of life, if not the most important one. It is the most abundant substance on the Earth's surface, covering about three-quarters of the globe. Water quality must therefore meet standards for the concentrations of its constituents; that is, water should be free from apparent turbidity, color and odor, and from any objectionable taste. This means that it may be consumed in any desired amount without concern for adverse effects on health [1].
Although fresh water is scarce and demand for it is higher than ever before, there is still enough to meet these needs if it is used properly. Unfortunately, fresh water is not evenly distributed over the Earth's surface. Some areas are poor in water resources, while others are rich. Rain does not fall evenly on the ground: some areas are barren desert, while others receive heavy rains and river floods [1]. Some areas suffer from water shortage not because water is scarce but because its sources are misused.
Globally, the lack of potable water for drinking and agriculture is among the most serious problems humankind has ever faced, despite the fact that water covers about 75% of the planet. Approximately 97.5% of this water is unfit for human use because it is salt water or is contaminated by liquid industrial waste. Making matters more difficult, 70% of the remaining fresh water available in the world (2.5%) is in the form of snow, which is difficult to exploit economically [1].
Groundwater is the main source of fresh water in Saudi Arabia, covering 96% of supply, since the country lacks perennial surface water sources because of fluctuating rainfall (Figure 1) and the nature of its geological formations. The country's water sources are therefore rainwater and groundwater. Because about 95% of the water resources depend on groundwater, an objective study and evaluation of this resource is urgently needed, on the grounds that future aspirations will bear fruit only if they are built on a realistic assessment of the country's water wealth [4]. Renewable surface water and groundwater resources, together with an assessment of water withdrawal, are summarized in Table 1 from FAO data [4].
The quality of water may be described in terms of the concentration and state (dissolved or particulate) of some or all of the organic and inorganic material present in the water, together with certain physical characteristics of the water. It is determined by in situ measurements and by examination of water samples on site or in the laboratory. The main elements of water quality monitoring are, therefore, on-site measurements, the collection and analysis of water samples, the study and evaluation of the analytical results, and the reporting of the findings. The results of analyses performed on a single water sample are only valid for the particular location and time at which that sample was taken. One purpose of a monitoring program is, therefore, to gather sufficient data (by means of regular or intensive sampling and analysis) to assess spatial and/or temporal variations in water quality.
The main objective of this work is to provide a clear and comprehensive picture of the overall groundwater quality in a specific area. Other objectives are: (a) to determine and rank the levels of water quality parameters in the study area; (b) to introduce environmental maps of the distribution of different water quality parameters, in the form of contour maps and 3D representations of the variables, so that a clear picture can be drawn for decision makers; and (c) to classify groundwater in the area into spatial water quality types by developing a robust water quality index (WQI).

Groundwater Quality Parameters
Water quality is affected by a wide range of natural and human influences. The most important of the natural influences are geological, hydrological and climatic, since these affect the quantity and the quality of water available. Their influence is generally greatest when available water quantities are low and maximum use must be made of the limited resource; for example, high salinity is a frequent problem in arid and coastal areas. If the financial and technical resources are available, seawater or saline groundwater can be desalinated but in many circumstances this is not feasible. Thus, although water may be available in adequate quantities, its unsuitable quality limits the uses that can be made of it [1].
The quality of groundwater can be assessed by measuring several components in the field or determining their values by chemical analysis in the laboratory, and then comparing the values with local and international standards to determine the degree of suitability or quality. The most important items measured to determine the quality of groundwater are: temperature, pH, electrical conductivity, dissolved oxygen, alkalinity, calcium, sodium, potassium, magnesium, chloride, nitrite, phosphate, etc.

Water Quality Standards
Since there is a wide range of natural water qualities, there is no universal standard against which a set of analyses can be compared. If the natural, pre-pollution quality of a water body is unknown, it may be possible to establish some reference values by surveys and monitoring of unpolluted water in which natural conditions are similar to those of the water body being studied. However, widely used standards are available, such as those of WHO, SAS and EPA.

Groundwater in KSA
The Kingdom of Saudi Arabia is known as one of the most water-scarce countries in the world. It relies on three types of water resources: renewable water resources, non-renewable groundwater resources and desalinated seawater (see Table (3)).

The Study Area
Majmaah city is located within the northern part of the Riyadh region. It lies at latitude N 25° and longitude E 46° [5].
The governorate is about 185 kilometers northwest of Riyadh on the Riyadh-Sudir-Qassim expressway, and about 140 kilometers from Qassim city. With a population of 60,000, the city is considered one of the most promising developing cities in KSA. Its strategic location (a link between different cities, and on the international road that connects the central region with a number of Gulf states) is expected to attract new residents for work, especially after the establishment of the University and Sudir industrial city.
Groundwater samples in the study area were collected in bottles for laboratory analysis, and the coordinates of the wells were recorded using GPS [6].

Data Collection
Table (4) lists all the raw data collected from the 15 water wells in the study area. These data represent the first set of samples (A).

Global Positioning System (GPS)
Table (5) lists the coordinates of the wells as recorded by the students in the field using a GPS instrument. The coordinates were first recorded as longitudes and latitudes and then converted to an X, Y coordinate system. Figure (4) shows the locations of the wells.
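The conversion from longitude and latitude to X, Y coordinates can be illustrated with a minimal sketch. The paper does not state which projection was used, so the local equirectangular approximation below, the mean Earth radius, the function name `to_local_xy` and the well positions are all assumptions made for illustration only; they are not the values in Table (5).

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def to_local_xy(lat_deg, lon_deg, ref_lat_deg, ref_lon_deg):
    """Project latitude/longitude to local X (east) and Y (north) in metres.

    Uses an equirectangular approximation around a reference point, which is
    adequate only for small study areas (a few kilometres across).
    """
    lat0 = math.radians(ref_lat_deg)
    dlat = math.radians(lat_deg - ref_lat_deg)
    dlon = math.radians(lon_deg - ref_lon_deg)
    x = EARTH_RADIUS_M * dlon * math.cos(lat0)
    y = EARTH_RADIUS_M * dlat
    return x, y


# Hypothetical well positions (latitude, longitude), not the data of Table (5).
wells = {"W1": (25.900, 45.350), "W2": (25.905, 45.360)}
ref_lat, ref_lon = wells["W1"]  # use the first well as the local origin
xy = {
    name: to_local_xy(lat, lon, ref_lat, ref_lon)
    for name, (lat, lon) in wells.items()
}
```

For a survey-grade workflow, a standard projected system such as UTM would normally replace this approximation.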

Methodology
The data used throughout this work were collected from 15 groundwater wells in a farm area in Majmaah city. The first set of water samples was analyzed at the Environmental Engineering Lab, Civil and Environmental Engineering Department. Most of the well samples have a complete analysis for about 20 water quality parameters, such as pH, TDS, DO, EC, Cl, F, Cu, Ag, and Fe. The following steps describe the methodology.
A GPS instrument was used to determine the X and Y coordinates of the tested wells, as no coordinates were available for those points. Laboratory analysis was conducted on three samples for 22 water quality parameters (WQP), and the results were organized for further study. Statistical analysis of the WQP from each well was carried out to reveal the nature of the different variables.
The mean values of the tested WQP were compared with national and international standards. PCA was used for variable reduction. Surfer software was used to generate contour maps and 3D representations of each variable.
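The comparison of mean values against standards can be sketched as follows. This is only an illustration: the measurements and the maximum permissible levels below are placeholders, not the data of Table (4) and not the limits of any particular standard, and the function name `exceedances` is hypothetical.

```python
from statistics import mean

# Hypothetical measurements (mg/l) for three wells; not the data of Table (4).
samples = {
    "Pb": [0.012, 0.015, 0.009],
    "Cd": [0.002, 0.005, 0.004],
    "Cl": [120.0, 95.0, 140.0],
}

# Placeholder maximum permissible levels (mg/l); substitute the values of the
# national or international standard actually being applied.
limits = {"Pb": 0.010, "Cd": 0.003, "Cl": 250.0}


def exceedances(samples, limits):
    """Return {parameter: (mean, limit)} for parameters whose mean exceeds the limit."""
    flagged = {}
    for name, values in samples.items():
        avg = mean(values)
        if avg > limits[name]:
            flagged[name] = (avg, limits[name])
    return flagged


flagged = exceedances(samples, limits)
for name, (avg, limit) in flagged.items():
    print(f"{name}: mean {avg:.4f} mg/l exceeds limit {limit} mg/l")
```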

Preliminary Results
The next step is to screen the data and compute summary statistics before applying multivariate statistics. Statistics were computed using Excel and StatGraph Software 2.1 [9,10]. Figures (3 and 4) give the other results.

Discussion of the Summary Statistics
The statistical analysis shown in Table (6) was made for 22 water quality parameters taken from 15 groundwater wells at a location in Majmaah city. At the first screening stage, all laboratory results for the different water quality parameters were checked for extreme values, and erroneous values were corrected or excluded from the next step of the analysis. It should be mentioned that several values had been recorded with serious mistakes. For example, Dissolved Oxygen in well number 12 was initially recorded in the lab analysis sheet as 504 mg/l; rechecking revealed that the correct value is 5.4 mg/l. The preliminary statistics led to the exclusion of the following parameters from the multivariate statistics because of their abnormal distributions: F, S, Cr, Color and Ba.
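A screening step of this kind can be sketched with a simple plausible-range check. The ranges, the function name `flag_suspect` and all values other than the 504 mg/l Dissolved Oxygen entry for well 12 are assumptions for illustration; the paper does not describe the exact screening rule that was applied.

```python
# Assumed plausible ranges per parameter; these are illustrative bounds,
# not limits taken from the paper or from any standard.
PLAUSIBLE_RANGE = {"DO": (0.0, 20.0), "pH": (0.0, 14.0)}


def flag_suspect(records, ranges):
    """Return (well, parameter, value) triples that fall outside the plausible range."""
    suspects = []
    for well, params in records.items():
        for name, value in params.items():
            low, high = ranges[name]
            if not low <= value <= high:
                suspects.append((well, name, value))
    return suspects


# Well 12 carries the mis-recorded DO value reported in the text (504 mg/l);
# every other number here is hypothetical.
records = {
    11: {"DO": 6.1, "pH": 7.4},
    12: {"DO": 504.0, "pH": 7.2},
}

suspects = flag_suspect(records, PLAUSIBLE_RANGE)
for well, name, value in suspects:
    print(f"Well {well}: {name} = {value} is outside the plausible range; recheck the lab sheet")
```

Flagged values are only candidates for rechecking against the original lab sheet; the sketch does not correct or drop them automatically.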

Multivariate Statistics Methods
Multivariate statistical methods such as Principal Components Analysis (PCA) and Factor Analysis (FA) can be used to reduce the number of variables and to detect relationships between them. The large set of water quality parameters can be further studied using one or both of these methods to determine the interrelationships between the parameters [13].

Multivariate Statistics
Multivariate statistics are used: 1) to identify hidden dimensions or constructs that may not be apparent from direct analysis; 2) to identify relationships between variables, which helps in data reduction; and 3) to help the researcher cluster the product and population being analyzed.

Principal Component Analysis (PCA)
This is the most common method used by researchers. PCA first extracts the maximum variance and assigns it to the first factor. It then removes the variance explained by the first factor and extracts the maximum remaining variance for the second factor. This process continues to the last factor [14]. In this paper, the results of PCA are presented, while the FA results are reserved for a further publication.
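The successive extraction of variance described for PCA can be sketched with an eigendecomposition of the correlation matrix. This is a minimal illustration using NumPy on synthetic data, not the procedure of the statistical packages used in the study; the function name `pca_correlation` and the random data are assumptions.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix of `data` (rows = samples, columns = variables).

    Returns eigenvalues (descending), eigenvectors (columns), the fraction of
    variance explained by each component, and the component scores.
    """
    # Standardize so that every variable has zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = z @ eigvecs
    return eigvals, eigvecs, explained, scores


# Synthetic data: 15 samples and 4 variables, with the second variable nearly
# proportional to the first (hypothetical, not the measured well data).
rng = np.random.default_rng(0)
base = rng.normal(size=(15, 1))
data = np.hstack([
    base,
    0.65 * base + 0.05 * rng.normal(size=(15, 1)),
    rng.normal(size=(15, 2)),
])

eigvals, eigvecs, explained, scores = pca_correlation(data)
print("Explained variance ratio:", np.round(explained, 3))
```

The first component carries the largest eigenvalue, and each later component captures the largest share of the variance that remains, which mirrors the description above.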

Results and Discussions of PCA
The correlation matrix for all variables, computed using SPSS, is presented in Table (8). Correlations are highlighted using 4 different colors for ranges (0.2 to 0.8 and above). The PCA results are given in Table (9). The purpose of the analysis is to obtain a small number of linear combinations of the 17 variables that account for most of the variability in the data. In this case, 4 components have been extracted.
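The paper does not state the rule used to decide how many components to extract. One common convention, assumed here purely for illustration, is to retain components whose eigenvalue is at least 1 (the Kaiser criterion). The eigenvalues below are hypothetical and are not those of Table (9); the function name `kaiser_select` is likewise an assumption.

```python
def kaiser_select(eigenvalues, threshold=1.0):
    """Retain eigenvalues >= threshold and report their share of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    retained = [ev for ev in ordered if ev >= threshold]
    cumulative = sum(retained) / sum(ordered)
    return retained, cumulative


# Hypothetical eigenvalues for 17 standardized variables (they sum to 17).
eigenvalues = [5.1, 3.4, 2.2, 1.3, 0.9, 0.8, 0.7, 0.6, 0.5,
               0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.06, 0.04]

retained, cumulative = kaiser_select(eigenvalues)
print(f"{len(retained)} components retained, "
      f"explaining {100 * cumulative:.1f}% of the variance")
```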

Interpretation of Multivariate Statistics Results
The PCA method defined four main components for the tested data set. This can be seen from the scree plot in Figure (4a). Applying PCA thus reduced the number of variables.
The values obtained from data reduction using PCA reveal that the first component (factor) groups Cd, Cn, Cu, NO3 and Pb. The variables Ca, DO, EC and TDS are the main variables in group two. Group three consists of variables that might be interrelated: Ca, Ag, EC, Mg, pH, and TDS. The fourth group has two main variables: Mg and pH.
The strong interrelationship between TDS and EC is logical, as one variable can be estimated directly from the other. It was also expected that Ca and Mg would fall in one group. However, the hidden relations between the variables forming the first group are difficult to interpret.
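The link between TDS and EC is commonly approximated by a proportional relation, TDS ≈ k · EC, where the factor k depends on the ionic composition of the water. The sketch below estimates k by least squares through the origin. The paired values, the units (EC in µS/cm, TDS in mg/l) and the function name `fit_tds_factor` are assumptions for illustration, not measurements from the study.

```python
def fit_tds_factor(ec_values, tds_values):
    """Least-squares estimate of k in TDS = k * EC (regression through the origin)."""
    numerator = sum(ec * tds for ec, tds in zip(ec_values, tds_values))
    denominator = sum(ec * ec for ec in ec_values)
    return numerator / denominator


# Hypothetical paired readings: EC in microsiemens/cm, TDS in mg/l.
ec = [1000.0, 1500.0, 2000.0]
tds = [650.0, 975.0, 1300.0]

k = fit_tds_factor(ec, tds)
print(f"Estimated conversion factor k = {k:.2f}")
```

Because the two variables are nearly proportional, they load on the same component, which is why one of them may be enough in a reduced monitoring program.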

Mapping using Geostatistics Techniques
Geostatistics is the statistics of spatially or temporally correlated data. The technique has been used as a practical approach to the problems of ore reserve estimation and mine planning. It has also been used in other applications concerned with petroleum and gas resource estimation.
One of the characteristics that distinguish earth sciences data from most other data is that the data belong to some location in space. Spatial features of the data set, such as the location of extreme values and the overall trend, are of considerable interest [16]. These features, in other words the variables Z(x), are functions describing natural phenomena that have geographic distributions, such as any water quality parameter (for example pH, EC, etc.). Kriging is the best-known geostatistical technique used for this purpose. In this work, Kriging is performed using SURFER software [17].
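The idea behind Kriging can be sketched with a small ordinary kriging solver. This is not the SURFER implementation: the exponential variogram, its sill and range, the function names and the four well values below are all assumptions chosen for illustration.

```python
import numpy as np


def exp_variogram(h, sill=1.0, vrange=500.0):
    """Exponential semivariogram without nugget (an assumed model)."""
    return sill * (1.0 - np.exp(-h / vrange))


def ordinary_kriging(xy, z, target, variogram=exp_variogram):
    """Estimate Z at `target` from observations `z` located at `xy`.

    Solves the ordinary kriging system, in which a Lagrange multiplier
    forces the weights to sum to one.
    """
    n = len(z)
    # Pairwise distances between observation points.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[n, n] = 0.0
    # Right-hand side: semivariances between observations and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    solution = np.linalg.solve(a, b)
    weights, mu = solution[:n], solution[n]
    estimate = weights @ z
    variance = weights @ b[:n] + mu  # kriging (estimation) variance
    return estimate, variance, weights


# Hypothetical wells on the corners of a 1 km square with pH readings.
xy = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0], [1000.0, 1000.0]])
ph = np.array([7.1, 7.4, 7.2, 7.6])

est_centre, var_centre, w_centre = ordinary_kriging(xy, ph, np.array([500.0, 500.0]))
est_well, var_well, _ = ordinary_kriging(xy, ph, xy[0])
print(f"pH at centre: {est_centre:.3f} (kriging variance {var_centre:.3f})")
```

Evaluating the estimator on a regular grid of target points gives the values from which contour maps and 3D surfaces are drawn. With no nugget, the estimate at a well location reproduces the measured value.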

Conclusions
The water quality parameters of groundwater wells in part of Majmaah city have been characterized using multivariate statistical analyses. The main findings of this paper are as follows. The preliminary statistical analysis revealed that only 17 variables are suitable for multivariate statistical analysis aimed at reducing the number of measured variables to principal components or factors. The variables F, S, Cr, Color and Ba were excluded from the multivariate analysis.
The mean values of the tested WQP were compared with national and international standards. Most of the recorded parameters were below the standards; only two variables, Lead (Pb) and Cadmium (Cd), showed values slightly above the maximum permissible levels.
PCA was used for variable reduction; four principal components were extracted, and the variables defining each group are reported.
An attempt was made to interpret the resulting groups, although the physical interpretation is not deep.
Surfer software was used to generate contour maps and 3D representation of each variable within the study area.
The methodology introduced in this project can assist water quality monitoring programs by providing a quick analysis and a dynamic system for water quality characterization in any defined area.
Although the study was carried out over a small area, the steps and procedure introduced can easily be applied elsewhere for similar purposes.