Study and Prediction of Landslide in Uttarkashi, Uttarakhand, India Using GIS and ANN

A landslide is defined as a slow to rapid downward movement of unstable rock and debris masses under the action of gravity. Landslides are among the major natural hazards, claiming hundreds of lives every year besides causing enormous damage to property and blocking communication links. The area chosen for the present study is the Uttarkashi district of Uttarakhand, which suffers from frequent landslides every year. The study focuses on the possible factors responsible for landslides in the hilly regions of Uttarkashi. We used existing topographical maps, satellite imagery and field work, and integrated them using GIS and soft computing to create a database that generates output for future use in predicting landslide susceptibility. The main aim of the study is to integrate our results with spatial data, soil parameters and land inventory, and to deliver the output as a user-friendly GIS application that can predict the future susceptibility of a region to landslides and the percentage contribution of each factor. The layers are evaluated with the help of stability studies and used to produce a landslide susceptibility map by means of an Artificial Neural Network (ANN). ArcGIS 9.3 and ERDAS were used for zonation, and Excel for statistical analysis. The database of these information layers is used to train, test, and validate the ANN model. A three-layered ANN with an input layer, a hidden layer, and an output layer is found to be optimal. Finally, an overlay analysis will be carried out by evaluating the layers according to their accepted coefficients in the final model. The efficiency of the application will be calculated using previously acquired data from different places in the study area, and the reliability of the application will then be judged.


Introduction
A frequently used definition of a landslide is the "movement of mass of rock, earth or debris down a slope", in the words of Cruden [1]. Landslides are catastrophic phenomena that have taken hundreds of lives, destroyed hard-earned wealth, and disrupted communication facilities. The active areas include the Himalayan region of India, and the development of this region is slowed as a result. According to the official figures of the United Nations International Strategy for Disaster Reduction (UN/ISDR) and the Centre for Research on the Epidemiology of Disasters (CRED) for the year 2006, landslides ranked 3rd in terms of number of deaths among the top ten natural disasters [2]. As far as the Indian scenario is concerned, approximately 0.49 million km² or 15% of the land area of the country is vulnerable to landslide hazard, and 80% of this is spread over the Himalayas, Nilgiris, Ranchi plateau and Eastern Ghats (GSI 2006) [3]. Uttarakhand is an appropriate choice for the study, since the newly formed state has been struggling with this catastrophe and has made front-page headlines for the landslides at Vishnuprayag, Baldora, Lambagharchatti, Jharkula, Phatabyung, and Amiya [4]. The recent landslide of 2012 took a heavy toll on life and property: many people lost their lives, and thousands of tourists were stranded due to disrupted communication services [5]. The phenomenon can be classified and described by two nouns. The first describes the material and the second describes the movement [6,15]. The material can be rock, debris, earth or a mix, and the movement can be fall, topple, slide, spread or flow. The traditional practice in landslide prevention is to equip people with landslide hazard zonation maps. These maps divide the land into homogeneous areas or domains and rank them according to the degree of actual or potential hazard caused by mass movement (Guzzetti et al., 1999 and Varnes, 1984) [7,8,9].
In the present study, a susceptibility map is prepared with the help of satellite imagery, such as a DEM from Cartosat-1, and topographic maps from the GSI (Geological Survey of India). Using an ANN model, we generated a weightage for each factor, and from these weightages the hazard zonation map is produced [10,11]. The resulting vector maps can be used to access the features of a particular location. The study is based on the following objectives:
1. Identification of the factors that affect landslides.
2. Determination of the extent to which the various factors contribute to landslides.
3. Preparation of a landslide hazard zonation map that divides Uttarkashi into different zones depending upon the factors.

Study Area
Uttarkashi falls under the physiographic divisions of the Rohilkhand plains, Nepal Himalayas, Ganga-Yamuna doab, Siwalik range, Kumaun Himalaya, and Dhauladhar range. There are 793 villages, and the area is drained by the major rivers Yamuna and Ganga. The observed annual average rainfall is 1750.50 mm and the mean temperature is 16°C. In the present paper, a landslide hazard zonation map has been prepared for the Rishikesh-Uttarkashi-Gaumukh-Gangotri route.

Data and Material Used
Successful landslide prediction depends on preparing a reliable database from reliable data sources. The relation between landslide occurrence and the conditioning parameters used is therefore crucially important for landslide susceptibility mapping. A parameter may be important with respect to landslide occurrence in one area, while the same parameter may be negligible in another (Mohammad Onagh, 2012). Thus a number of thematic maps (referred to as data layers in GIS), based on the specific parameters related to the occurrence of landslides, viz. slope, aspect, lithology, rainfall, land cover etc., were generated using ERDAS and ArcGIS v. 9.3. Four control points were selected at the corners of each map, and these points were georeferenced using coordinates read from Google Earth. The DEM (digital elevation model) was obtained from Bhuvan.
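The georeferencing step described above, in which four control points tie a scanned map to geographic coordinates, can be illustrated with a least-squares affine fit. This is only a minimal sketch under stated assumptions: the pixel size of the scan and the corner coordinates are hypothetical, and the function names (`solve_3x3`, `fit_affine`, `pixel_to_geo`) are introduced here for illustration. The study itself performed this step in the GIS software.

```python
# Illustrative sketch only: fit an affine transform from four corner control
# points. All coordinates below are hypothetical, not taken from the study.

def solve_3x3(a, b):
    """Solve the 3x3 linear system a @ x = b by Gaussian elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_affine(pixels, coords):
    """Least-squares fit of lon = a*col + b*row + c and lat = d*col + e*row + f."""
    rows = [[float(px), float(py), 1.0] for px, py in pixels]
    # Normal equations (A^T A) p = A^T y, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        aty = [sum(r[i] * c[k] for r, c in zip(rows, coords)) for i in range(3)]
        params.append(solve_3x3(ata, aty))
    return params


def pixel_to_geo(params, px, py):
    """Map a pixel (col, row) to (longitude, latitude) with the fitted transform."""
    (a, b, c), (d, e, f) = params
    return a * px + b * py + c, d * px + e * py + f


# Corners of a hypothetical 1000 x 800 pixel scan as (col, row), with the
# assumed (longitude, latitude) of each corner read from a reference source.
pixels = [(0, 0), (1000, 0), (0, 800), (1000, 800)]
coords = [(78.0, 31.0), (79.0, 31.0), (78.0, 30.2), (79.0, 30.2)]

params = fit_affine(pixels, coords)
lon, lat = pixel_to_geo(params, 500, 400)
```

With four points and six unknowns the system is overdetermined, so the residuals at the control points give a quick check on how well the scan was registered.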

Data Acquisition
To prepare the data for the susceptibility analysis, the raw data first have to be acquired from reliable sources. The chosen data relate to the various factors that cause landslides at a place. The factors relevant to the landslide susceptibility analysis of Uttarkashi are:
1. Soil Type
2. Rock Type
3. Precipitation
4. Land use Pattern
5. Drainage Pattern
6. Slope
7. Water Table
The data pertaining to the factors listed above have been collected in the form of maps of Uttarakhand, as follows:
1. Lithology Map of Uttarakhand

Data Preparation for Analysis
The acquired data are raw and have to be converted into a software-readable form before they can be used in the susceptibility analysis. The maps depicting the various features of Uttarkashi are in raster form, i.e. images, and have to be converted into vector form for use in the susceptibility analysis. Four control points are selected at the four corners of the map so that the points mark the spatial extent of the whole map, as shown:

Digitization
Digitization of the various shape files was then carried out by loading the relevant shape file together with the map from which digitization is to be done. With both the shape file and the map loaded in ArcMap, the map is zoomed to a comfortable level so that all features on the map can be traced easily on the screen itself to create new layers or themes.
The shape files created using the raw data are:

Artificial Neural Network
An artificial neural network is a "computational mechanism able to acquire, represent, and compute a mapping from one multivariate space of information to another, given a set of data representing that mapping". The back-propagation training algorithm is the most frequently used neural network method and is the method used in this study. A network is trained by back-propagation using a set of examples of associated input and output values. The purpose of an artificial neural network is to build a model of the data-generating process, so that the network can generalize and predict outputs from inputs that it has not previously seen [11,12].
The ANN is a black-box model: a multi-layered neural network consisting of an input layer, hidden layers, and an output layer. The hidden and output layer neurons process their inputs by multiplying each input by a corresponding weight, summing the products, and then processing the sum with a nonlinear transfer function to produce a result. An artificial neural network "learns" by adjusting the weights between the neurons in response to the errors between the actual output values and the target output values. At the end of this training phase, the neural network provides a model that should be able to predict a target value from a given input value. A neural network consists of a number of interconnected nodes. Each node is a simple processing element that responds to the weighted inputs it receives from other nodes. The arrangement of the nodes is referred to as the network architecture (figure 18).
Figure 18. Architecture of neural network (source: Lee, 2009).
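The neuron computation described above (weight each input, sum the products, apply a nonlinear transfer function) can be sketched as a forward pass through a small layered network. The sketch makes several assumptions that the text does not state: the transfer function is taken to be a sigmoid, each neuron is given a bias term, and the weights and the sample input are random or made-up placeholders.

```python
import math
import random


def sigmoid(z):
    """Assumed nonlinear transfer function; the text does not name the one used."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """For each neuron: weighted sum of the inputs plus a bias, then the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> one hidden layer -> single output neuron."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)[0]


# Placeholder weights for a network with 6 inputs, 20 hidden neurons, 1 output.
rng = random.Random(0)
n_in, n_hidden = 6, 20
hidden_w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
output_w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]]
output_b = [0.0]

# Hypothetical factor values for one location, scaled to the range 0..1.
sample = [0.8, 0.2, 0.5, 0.1, 0.9, 0.4]
score = network_forward(sample, hidden_w, hidden_b, output_w, output_b)
```

Because the output neuron also uses a sigmoid, `score` always lies strictly between 0 and 1 and can be read as a susceptibility value once the weights have been trained.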
In the present study we selected 107 points and, for each point, recorded all six factors, namely soil depth, soil type, rock type, land cover, slope and elevation, in an Excel database. The dataset was divided into 60% for training and 40% for validation. We used a multi-layer structure of 6 × 20 × 1, that is, 6 input neurons, 20 hidden neurons and 1 output neuron. ANNs can be grouped into two major categories: feed-forward and feedback (recurrent) networks. In the former, no loops are formed by the network connections, while one or more loops may exist in the latter. This study uses a back-propagation neural network with a feed-forward approach. The points were classified into two categories, landslide prone and non-landslide prone. Since an ANN cannot interpret the labels 'landslide prone' and 'non-landslide prone' directly, we encoded them as '1' and '0', respectively.
A neural network can compute the output for a given input only if the coefficients, called weights, are known. For this purpose, a three-layered feed-forward network was implemented using the MATLAB software package. Here, "feed-forward" denotes that the interconnections between the layers propagate forward to the next layer. To calculate the weights of the different factors, the network has to be trained, which yields the interrelationships between the nodes of the different factors. The back-propagation algorithm was then applied to calculate the weights between the input layer (6) and the hidden layer (20), and between the hidden layer (20) and the output layer (1), by modifying the number of hidden nodes and adjusting the learning rate (0.01). During training, the weights are changed in such a way that the network output and the true values move closer and closer to each other.
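The data preparation described above (encoding the two classes as 1 and 0 and dividing the 107 points 60/40) can be sketched as follows. The records here are synthetic placeholders, and the random shuffle with a fixed seed is an assumption; the text does not say how points were assigned to the two subsets.

```python
import random

# Numeric targets for the two classes, as described in the text.
LABEL_CODES = {"landslide prone": 1, "non-landslide prone": 0}


def encode_label(label):
    """Translate a class name into the 1/0 target the network is trained on."""
    return LABEL_CODES[label]


def split_dataset(records, train_fraction=0.6, seed=42):
    """Shuffle the records (an assumption) and cut them into two subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Synthetic stand-in for the 107 sampled points, each with six factor values.
rng = random.Random(0)
records = []
for _ in range(107):
    factors = [rng.random() for _ in range(6)]
    # Made-up labelling rule, used only so that the placeholder data has labels.
    label = "landslide prone" if factors[4] > 0.5 else "non-landslide prone"
    records.append((factors, encode_label(label)))

train_set, validation_set = split_dataset(records)
```

With 107 records and a 0.6 fraction, this yields 64 training records and 43 validation records.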
For a new dataset the weights are unknown. To check that the result is valid, the validation data are evaluated using the same model, and the error between the true and calculated values is computed. The error goal was set to 0.01, and the number of epochs was set to 3,000. Most of the training datasets met the 0.01 RMSE goal.
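The training procedure described above, with its learning rate, RMSE goal and epoch limit, can be sketched in plain Python. This is not the MATLAB implementation used in the study: it assumes sigmoid transfer functions, bias terms, per-sample weight updates and synthetic data, and the demonstration runs for only 200 epochs to keep it fast.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Return the hidden activations and the single output for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    out = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, out


def rmse(data, w1, b1, w2, b2):
    errors = [(forward(x, w1, b1, w2, b2)[1] - y) ** 2 for x, y in data]
    return math.sqrt(sum(errors) / len(errors))


def train(data, n_hidden=20, lr=0.01, goal=0.01, max_epochs=3000, seed=0):
    """Back-propagation with per-sample updates; stops at the RMSE goal or epoch limit."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = [rmse(data, w1, b1, w2, b2)]
    for _ in range(max_epochs):
        for x, y in data:
            hidden, out = forward(x, w1, b1, w2, b2)
            # Output delta: prediction error times the sigmoid derivative.
            delta_out = (out - y) * out * (1.0 - out)
            for j, h in enumerate(hidden):
                # Hidden delta is computed before w2[j] is changed.
                delta_hid = delta_out * w2[j] * h * (1.0 - h)
                w2[j] -= lr * delta_out * h
                b1[j] -= lr * delta_hid
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * delta_hid * xi
            b2 -= lr * delta_out
        history.append(rmse(data, w1, b1, w2, b2))
        if history[-1] <= goal:
            break
    return (w1, b1, w2, b2), history


# Synthetic training set: 64 points, six factors each, labelled by a made-up rule.
data_rng = random.Random(1)
points = [[data_rng.random() for _ in range(6)] for _ in range(64)]
train_data = [(p, 1.0 if p[4] > 0.5 else 0.0) for p in points]

model, history = train(train_data, max_epochs=200)
```

The same `rmse` function can be called on a held-out validation set with the returned `model` to compare the true and calculated values, as the text describes.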
For easy interpretation, the average weight of each factor was calculated, and these values were divided by the average weight of the factor that had the minimum value. The weightages of the different factors are shown in table 1.
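The normalization described above can be sketched as follows. The weight values are invented for illustration and are not the values of table 1. Averaging over several training runs and expressing each factor as a share of the total are assumptions about how the averages and percentage contributions were obtained.

```python
def average_over_runs(runs):
    """Average each factor's weight across several training runs (assumed step)."""
    factors = runs[0].keys()
    return {f: sum(run[f] for run in runs) / len(runs) for f in factors}


def relative_weights(avg_weights):
    """Divide every average weight by the smallest one, so the minimum becomes 1."""
    smallest = min(avg_weights.values())
    return {f: w / smallest for f, w in avg_weights.items()}


def percent_contribution(avg_weights):
    """Express each factor as a share of the total (an assumed definition)."""
    total = sum(avg_weights.values())
    return {f: 100.0 * w / total for f, w in avg_weights.items()}


# Hypothetical weights from two training runs; not the values of the study.
runs = [
    {"slope": 0.42, "elevation": 0.16, "rock type": 0.11,
     "land cover": 0.10, "soil type": 0.09, "soil depth": 0.04},
    {"slope": 0.38, "elevation": 0.14, "rock type": 0.13,
     "land cover": 0.10, "soil type": 0.07, "soil depth": 0.06},
]

avg_weights = average_over_runs(runs)
relative = relative_weights(avg_weights)
percent = percent_contribution(avg_weights)
```

After normalization the factor with the smallest average weight has the value 1, and every other value states how many times more weight that factor carries.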

Results and Conclusion
This study was confined to the Uttarkashi region. Owing to limited resources and time, we were able to generate results only for a limited area, the Rishikesh-Uttarkashi-Gangotri-Gaumukh route, from latitude 30°32'30" N to 31°1'9.33" N and longitude 78°19'55.14" E to 78°47'36.27" E. The results are compiled below.
The first objective of the present study was to study the factors causing landslides. The choice of factors for a particular area depends on many things, such as the extent of the study area and its geographical features. A parameter may be important with respect to landslide occurrence in one area, while the same parameter may be negligible in another [13]. Thus a number of thematic maps (referred to as data layers in GIS), based on the specific parameters related to the occurrence of landslides, viz. slope, aspect, lithology, rainfall, land cover etc., were generated using ERDAS and ArcGIS v. 9.3. We have used only six factors to limit the bulk of data.
They are:
1. Slope
2. Soil depth
3. Soil texture
4. Height
5. Rock type
6. Land use
7. Precipitation
The next objective of our study was to present the weightage of the various factors causing landslides. An artificial neural network technique was used. We selected 107 points with all six factors, and an Excel database was created. The dataset was divided into 60% for training and 40% for validation. The back-propagation algorithm was then applied to calculate the weights between the input layer (6) and the hidden layer (15), and between the hidden layer (15) and the output layer (1), by modifying the number of hidden nodes and adjusting the learning rate (0.01). The regression performance was 2.03; the accuracy was 0.99409 for the training data, and 0.41565 and 0.18369 for testing and validation, respectively. Slope is the largest contributing factor, at 93%, and soil depth is the smallest.
Regions prone to landslides are depicted by 1 and regions not prone to landslides are depicted by 0. Landslide susceptibility values for the adjoining areas were calculated using an interpolation technique, so that the Rishikesh-Uttarkashi-Gangotri-Gaumukh route has been mapped for landslide hazard. Using ArcGIS, the landslide susceptibility of the whole map region can be viewed.
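The text does not say which interpolation technique was used to fill in susceptibility values between the sampled points. As one common possibility, and purely as an assumption, the sketch below uses inverse distance weighting (IDW). The coordinates and values are hypothetical.

```python
import math


def idw(known_points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for px, py, value in known_points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            # The query coincides with a sample, so return its value directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical sampled points: (x, y, susceptibility), where 1 means landslide
# prone and 0 means not prone.
samples = [(0.0, 0.0, 0.0), (2.0, 0.0, 1.0), (0.0, 2.0, 1.0), (2.0, 2.0, 0.0)]

centre_value = idw(samples, 1.0, 1.0)
near_prone = idw(samples, 1.9, 0.1)
```

Values close to 1 mark locations near landslide-prone samples, so thresholding the interpolated surface gives one way to draw zone boundaries.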

Conclusion
The study has led to the determination of factors on the basis of past studies, and to the determination of the weightage of the six chosen factors, namely soil depth, soil texture, rock type, height, slope and land cover. It has helped us to understand the application of the probabilistic approach and its success in this work. With further advances in this type of study, results for the future could be interpreted from past records when a site is inaccessible or the test results are erroneous. We used existing topographical maps, satellite imagery and field work, and integrated them using GIS and the ANN model to create a database that has generated output for future use. The results of the present study were integrated with spatial data, soil parameters and land inventory, and are presented as a landslide hazard zonation map and a user-friendly GIS application that can predict the future susceptibility of a region to landslides and the percentage contribution of each factor. The reliability of the ANN is high compared with other methods. Above all, this study emphasizes the lucid presentation of results for laypersons.