Human Activity Recognition Based on Weighted Sum Method and Combination of Feature Extraction Methods

Abstract: Human Activity Recognition (HAR) is one of the most important areas of computer vision research. The biggest difficulty for a HAR system is that a camera can film from only one direction, which leads to a shortage of data and low recognition accuracy. This paper focuses on building new HAR models that use Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA) to reduce the dimensionality and size of the data, contributing to high recognition accuracy. First, from the 3D motion data


Introduction
Nowadays, Human Activity Recognition [1] receives special attention from researchers around the world. Its applications include intelligent security monitoring systems, health care systems, intelligent transportation systems, and a variety of systems that involve interactions between people and electronic devices, such as human-computer interfaces [2].
Most studies in the area of HAR focus on recognizing activities from videos recorded by regular cameras. The biggest problem with regular cameras is that video can be recorded from only one direction, leading to a shortage of data for recognition. To improve the recognition rate, 3D methods have been proposed in recent years, such as marker-based motion capture using stereo cameras [3] or dedicated depth sensors [4] such as the Microsoft Kinect [5]. This paper proposes a method that combines several feature extraction methods, such as Principal Components Analysis (PCA) [6] and Multi-class Linear Discriminant Analysis (Multi-class LDA) [7]. Because the recognition results depend on the feature extraction method, this paper uses the weighted sum method [8] to combine the feature extraction methods and improve the accuracy of activity recognition.
The paper is organized as follows. The first section introduces the paper. The second section presents related research. The third section describes the proposed method. The fourth section presents the experimental results. Finally, the fifth section concludes the paper.

Principal Components Analysis
Principal Components Analysis [4] is a statistical algorithm that uses an orthogonal transformation to map a set of data from a higher-dimensional space into a new lower-dimensional space. The transformation is defined so that the first principal component has the largest possible variance. Figure 1 illustrates an example of the transformation using PCA. Let the training set be $X = \{x_i \mid x_i \in \mathbb{R}^d,\ i \in 1 \dots n\}$, where $x_i$ is a vector in $d$-dimensional space and $n$ is the number of vectors in $X$.
First, PCA moves the center of all vectors to the origin of the coordinate system:
$$\hat{x}_i = x_i - \mu$$
where $\mu$ is the center of all vectors in $X$, calculated by:
$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Then PCA calculates the covariance matrix of the vectors in $X$. Let $\hat{X} = [\hat{x}_1 \dots \hat{x}_n] \in \mathbb{R}^{d \times n}$ be the matrix containing all the centered training vectors:
$$V = \hat{X} \hat{X}^T \in \mathbb{R}^{d \times d}$$
The projection vector $u$ of PCA is obtained by solving the eigenproblem:
$$V u = \lambda u$$
Finally, PCA uses the first $k$ eigenvectors (those with the largest eigenvalues) to build the new lower-dimensional space. Let $U_{PCA} = [u_1, u_2 \dots u_k] \in \mathbb{R}^{d \times k}$. The coordinates of all vectors in the new space are then:
$$F = U_{PCA}^T \hat{X} \in \mathbb{R}^{k \times n}$$
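The PCA steps above (centering, scatter matrix, eigen-decomposition, projection) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation; the function names `pca_fit` and `pca_transform` are hypothetical, and the data layout (one training vector per column) is an assumption chosen to match the matrix shapes in the text.

```python
import numpy as np

def pca_fit(X, k):
    """Fit PCA on X of shape (d, n), one training vector per column.

    Returns the center mu (d, 1) and the projection matrix U (d, k).
    """
    mu = X.mean(axis=1, keepdims=True)      # center of all vectors
    X_hat = X - mu                          # move the center to the origin
    V = X_hat @ X_hat.T                     # (d, d) matrix, as in the text
    eigvals, eigvecs = np.linalg.eigh(V)    # V is symmetric; eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    return mu, eigvecs[:, order]

def pca_transform(X, mu, U):
    """Project the columns of X into the k-dimensional space: F = U^T (X - mu)."""
    return U.T @ (X - mu)
```

The same `mu` and `U` obtained from the training set would be reused to project validation and test vectors, so that all data share one coordinate system.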

Multi-class Linear Discriminant Analysis
Multi-class Linear Discriminant Analysis (Multi-class LDA) [5] is a generalization of standard two-class LDA that can handle an arbitrary number of classes. Multi-class LDA finds the projection vector $u$ such that the data projected onto $u$ have the largest separation between classes.
Let the labeled training set be $X = \{(x_i, y_i) \mid x_i \in \mathbb{R}^d,\ y_i \in 1 \dots c;\ i \in 1 \dots n\}$, where $x_i$ is the $i$-th vector of the training set in $d$-dimensional space, $y_i$ is the label of $x_i$, $c$ is the number of label kinds, and $n$ is the number of vectors in $X$. The separation between the classes is defined as the ratio of the variance between the classes to the variance within the classes:
$$S = \frac{u^T S_b u}{u^T S_w u}$$
where $S_b$ is the between-class scatter matrix, calculated by:
$$S_b = \sum_{j=1}^{c} n_j (\mu_j - \mu)(\mu_j - \mu)^T$$
Here $\mu_j$ is the center of class $j$ and $n_j$ is the number of vectors in class $j$, while $\mu$ and $n$ are the mean and the number of all vectors in the training set. $S_w$ is the within-class scatter matrix:
$$S_w = \sum_{i=1}^{n} (x_i - \mu_{y_i})(x_i - \mu_{y_i})^T$$
where $\mu_{y_i}$ is the center of the class labeled $y_i$. Multi-class LDA then determines the projection vector $u$ as:
$$u = \arg\max_{u} \frac{u^T S_b u}{u^T S_w u}$$
The vector $u$ can be found by solving the following eigenproblem:
$$S_b u = \lambda S_w u$$
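As an illustration of the multi-class LDA procedure described above, the following NumPy sketch builds the between-class and within-class scatter matrices and solves the resulting eigenproblem. It is a sketch under stated assumptions rather than the authors' code: the name `lda_fit` is hypothetical, data are stored one vector per column, and the generalized eigenproblem is solved through the pseudo-inverse of the within-class scatter matrix, which is one common choice among several.

```python
import numpy as np

def lda_fit(X, y, k):
    """Fit multi-class LDA.

    X: (d, n) array, one training vector per column.
    y: (n,) array of integer class labels.
    Returns U of shape (d, k) whose columns are projection vectors.
    """
    d, _ = X.shape
    mu = X.mean(axis=1, keepdims=True)             # mean of all training vectors
    S_b = np.zeros((d, d))                         # between-class scatter
    S_w = np.zeros((d, d))                         # within-class scatter
    for label in np.unique(y):
        X_j = X[:, y == label]                     # vectors of one class
        n_j = X_j.shape[1]
        mu_j = X_j.mean(axis=1, keepdims=True)     # class center
        S_b += n_j * (mu_j - mu) @ (mu_j - mu).T
        S_w += (X_j - mu_j) @ (X_j - mu_j).T
    # Solve S_b u = lambda S_w u via pinv(S_w) S_b (an assumption of this sketch).
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:k]     # k largest eigenvalues
    return eigvecs[:, order].real
```

Projected features are then obtained as `U.T @ X`, in the same way as for PCA.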

Proposed Method
This paper combines feature extraction methods and uses a weighted sum model to improve recognition results. The proposed method has three steps: first, pre-process the CMU data; next, extract features with the manual, PCA and LDA methods; finally, apply the proposed weighted sum method to machine learning with Support Vector Machine (SVM) [9]. Figure 2 shows the proposed model.

Preprocessing
This paper uses a skeleton model of the human body [3]. Preprocessing consists of two steps:
1. Reduce the number of features by reducing the number of bones selected from the 3D human model. This paper follows the method proposed by K. Adistambha [10] to select the bones.
2. Normalize the duration of all actions. This paper uses the duration of the shortest action.
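The two preprocessing steps can be sketched as follows. The text does not state how the durations are equalized, so this sketch assumes uniform resampling of frames down to the length of the shortest clip; truncation would be another possible reading. The function names and the array layout (frames, bones, channels) are hypothetical.

```python
import numpy as np

def select_bones(motion, bone_indices):
    """Keep only the chosen bones. motion has shape (frames, bones, channels)."""
    return motion[:, bone_indices, :]

def resample_to_length(motion, target_frames):
    """Pick target_frames frames spread uniformly over the clip (assumed strategy)."""
    idx = np.linspace(0, motion.shape[0] - 1, target_frames).round().astype(int)
    return motion[idx]

def preprocess(clips, bone_indices):
    """Return one flat feature vector per clip, all of the same length."""
    target = min(clip.shape[0] for clip in clips)   # duration of the shortest action
    rows = [
        resample_to_length(select_bones(clip, bone_indices), target).reshape(-1)
        for clip in clips
    ]
    return np.stack(rows)                           # shape (n_clips, target * bones * channels)
```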

Feature Extraction
The main purpose of feature extraction is to find vectors that represent the data in fewer dimensions than the original data while still allowing effective human activity recognition. This paper uses three feature extraction methods: the manual method, PCA and multi-class LDA.

Weighted Sum Method
Each feature extraction method has different advantages and disadvantages, so combining the methods can improve the recognition result. This paper uses the recognition rate of each method as its weight in the weighted sum model. Let $l_m \in 1 \dots c$ be the label assigned using the $m$-th feature extraction method, where $c$ is the number of action labels, and let $w_m$ be the recognition rate of the $m$-th feature extraction method. The final label $\hat{l}$ is assigned by a weighted sum over the labels $l_m$ with the weights $w_m$.
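The exact combination formula is not reproduced in this text. One plausible reading, shown here only as an assumption, is a weighted vote: each method adds its recognition rate to the score of the label it predicted, and the label with the largest total score wins. The function name `weighted_sum_label` and the example rates are hypothetical.

```python
from collections import defaultdict

def weighted_sum_label(labels, weights):
    """Combine per-method labels by weighted voting (an assumed reading).

    labels:  label predicted by each feature extraction method.
    weights: recognition rate of each method, in the same order.
    """
    scores = defaultdict(float)
    for label, weight in zip(labels, weights):
        scores[label] += weight          # accumulate the weight of each vote
    return max(scores, key=scores.get)   # label with the largest weighted sum

# Hypothetical rates: two weaker methods agreeing can outvote the strongest one.
print(weighted_sum_label(["walking", "running", "running"], [0.90, 0.80, 0.85]))
```

With these illustrative numbers, "running" scores 0.80 + 0.85 = 1.65 against 0.90 for "walking". When all three methods disagree, the label of the method with the highest rate is chosen.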

Experiment Result
This experiment uses the CMU Graphics Lab Motion Capture Database [3]. The database includes Acclaim Skeleton File (ASF) structures and Acclaim Motion Capture (AMC) files. The data used in this paper consist of 29 bones, and the ASF structure is shown in Figure 3. Figure 4 is an example of a 3D image of the human bone structure reconstructed from AMC data. The data are formatted as ASF and AMC files:
1. The ASF file defines a base pose for the skeleton, which is the starting point for the motion data. It contains the length, direction, local coordinate frame, number of degrees of freedom (DOFs), joint limits, hierarchy and connections of the bones.
2. The AMC file contains the motion data for a skeleton defined by an ASF file. The motion data are given one sample at a time. Each sample consists of a number of lines, one segment per line, containing the data.
This experiment uses four kinds of human action: running, walking, jumping and dancing. The training data include 165 actions, used to build the models with PCA, multi-class LDA and Support Vector Machine (SVM) [9]. The validation data include 163 actions, used to calculate the accuracy rate of each feature extraction method. The testing data include 163 actions, used to evaluate the proposed method. Table 1 shows the CMU data used in this experiment.

Table 1. CMU data used in this experiment (number of actions, one column per action kind).
Data set          Action 1  Action 2  Action 3  Action 4  Total
Training data           24        75        43        23    165
Validating data         24        75        42        22    163
Testing data            24        75        42        22    163
Total                   72       225       127        67    491
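As a sketch of how the AMC motion data described in this section might be read, the parser below collects one dictionary per sample. It assumes a layout in which header lines start with `#` or `:`, a line containing only an integer starts a new sample, and each following line holds a segment name followed by its values. The function name `parse_amc` is hypothetical and the sketch ignores the ASF skeleton entirely.

```python
def parse_amc(text):
    """Parse AMC-style motion text into a list of samples.

    Each sample is a dict mapping a segment name to a list of floats.
    """
    frames = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith(":"):
            continue                      # skip blank, comment and header lines
        if line.isdigit():
            current = {}                  # an integer-only line starts a new sample
            frames.append(current)
        elif current is not None:
            name, *values = line.split()  # one segment per line
            current[name] = [float(v) for v in values]
    return frames
```

The per-sample dictionaries can then be stacked into arrays of shape (frames, bones, channels) for the preprocessing step.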

Recognition Result Using Manual Method
First, this paper uses manual feature extraction based on the method of K. Adistambha [10]. The features are defined from detailed groups of bones. Table 2 shows the results of human activity recognition based on SVM with manual feature extraction. The group of 11 bones achieved 82.7%, lower only than the full set of 29 bones at 83.3%. The 11-bone group can therefore be used as a feature extraction method when constructing the weighted sum in the proposed model.

Recognition Result Using PCA
For the PCA method, the parameter to be determined is the number of eigenvectors (the dimension of the new space) that gives the recognition model its highest accuracy. Figure 5 shows how the recognition rate varies with the number of dimensions when using PCA. As the number of dimensions increases, the accuracy rises to a maximum of 90.1% at 49 dimensions, then begins to decrease and gradually levels off when the number is large. The detailed recognition results are shown in Table 3.
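Choosing the number of dimensions amounts to evaluating the recognition model at each candidate value and keeping the one with the highest accuracy. The sketch below shows that selection loop only; the name `sweep_dimensions` and the `evaluate` callback are hypothetical, and in practice `evaluate(k)` would train the classifier on features reduced to `k` dimensions and return its accuracy on held-out data.

```python
def sweep_dimensions(candidate_dims, evaluate):
    """Return (best_dim, best_accuracy, curve) over the candidate dimensions.

    evaluate: callable taking a dimension and returning an accuracy.
    curve:    list of (dimension, accuracy) pairs, useful for plotting.
    """
    curve = [(k, evaluate(k)) for k in candidate_dims]
    # max keeps the first maximum, so ties go to the earliest candidate.
    best_dim, best_acc = max(curve, key=lambda pair: pair[1])
    return best_dim, best_acc, curve
```

The returned `curve` corresponds to the kind of accuracy-versus-dimension plot discussed in this section.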

Recognition Result Using Multi-class LDA
The multi-class LDA method also requires determining the number of dimensions of the extracted data that gives the recognition model its highest accuracy. Figure 6 shows the accuracy rate of human activity recognition using multi-class LDA. The maximum accuracy is 86.0% when the number of features is 138. The detailed recognition results are shown in Table 4.
The proposed weighted sum method uses the best recognition results of the manual method, PCA and multi-class LDA to assign the label of a human action. Table 5 shows the best performance of each feature extraction method. Table 6 gives the detailed recognition results of the proposed model. It shows that the accuracy rate of human activity recognition using the proposed weighted sum method is 90.7%. Figure 7 compares the conventional methods with the proposed method. Recognition using PCA performs better when combined with the manual method. The proposed method has the highest accuracy rate (90.7%), which is better than the method using PCA alone (90.1%).

Conclusion
This paper described related research on feature extraction for human activity recognition, including Principal Components Analysis (PCA) and Multi-class Linear Discriminant Analysis (Multi-class LDA). It also described a manual feature extraction method based on the method proposed by K. Adistambha. Next, it presented the proposed method, which uses the weighted sum method to combine conventional feature extraction methods. The experimental results showed the accuracy rates of human activity recognition based on manual feature extraction, PCA and Multi-class LDA, as well as the performance of the proposed method, which is better than that of the conventional methods.
In future work, we will combine other feature extraction methods, such as kernel feature extraction methods, to improve the human activity recognition rate.