Infrasound Source Identification Based on Spectral Moment Features

Infrasound signals lie below the frequency range of human hearing and originate from a variety of sources. Because these waves carry useful information about the occurrence of important events, this paper presents a method for recognizing their sources. We combine spectral moment features with Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC), select the subset of features that best discriminates between signal sources, and classify with a classifier ensemble. The resulting system reaches a precision of 98.1% in infrasound source identification.


Introduction
Infrasound is the technical term for acoustic waves with frequencies below 20 Hz, which lie below the range of human hearing (20 Hz to 20 kHz) [1][2][3][4]. Infrasound waves propagate through the atmosphere around the earth, and since they have very low absorption, they travel very long distances [4][5][6].
Infrasound waves are generated by different kinds of natural and man-made sources, including earthquakes, volcanoes, bolides, thunderstorms, chemical and nuclear explosions, airplanes, rockets and so on. Because various events produce infrasound by different mechanisms, the energy of the signals is also distributed over different frequency bands [7].
Thus we are surrounded by a world of imperceptible sounds that carry valuable information about their sources, which makes the detection and analysis of infrasound waves in the atmosphere necessary.
Identifying some of the sources of infrasound waves is the specific mission of the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) and a specific task of research institutes. Scientists and researchers have therefore used different methods to separate infrasound waves, and one of the best approaches is artificial intelligence.
Infrasound waves are collected by infrasound sensors, or microbarographs, which are arranged in specially designed arrays at infrasound network stations. The most important worldwide network of infrasound stations operates under the International Monitoring System (IMS), which includes sixty stations that collect infrasound waves for the International Data Center (IDC) in Vienna, Austria.
Identifying the sources of infrasound signals involves several steps. In the preprocessing step, the signals are normalized and noise is removed. Next, feature vectors that best describe the signals are extracted. Feature selection follows: since not all extracted features are needed for recognition, we select the features that best discriminate between signals from different sources and use only those in the recognition step. Once the training data have been prepared in this way, recognition is performed with a classifier ensemble. Finally, cross-validation is used to test and evaluate the performance of the algorithms.
The block diagram of an infrasound source identification system is shown in Figure 1. Section 2 gives an overview of related work. Section 3 presents our proposed feature extraction method. Section 4 presents our experimental results, and Section 5 offers our conclusions.

Related Works
Infrasound signal identification has been attempted before. In 2005, F. M. Ham proposed a bank of Radial Basis Function (RBF) neural networks to discriminate between six different man-made events [8]. A Mel-Frequency Cepstral Coefficients (MFCC) feature set was extracted for this method. He improved the method in [9] with a Parallel Neural Network Classifier Bank (PNNCB) using the same feature vector.
In 2008, a wavelet-coefficient feature vector was combined with a fuzzy K-means clustering method for earthquake prediction [10].
In [11], a Hidden Markov Model (HMM) is used to detect the presence of elephants, with linear predictive coding used to extract the formants of elephant rumbles.
Another paper, published by F. M. Ham in 2011, focuses on exploiting the infrasonic characteristics of volcanoes by extracting unique cepstral-based features from the volcano's infrasound signature. These feature vectors are then used by a neural classifier to distinguish the ash-generating eruptive activity of three volcanoes [12].
X. Lui et al. proposed a classification method based on the Hilbert-Huang transform (HHT) and a support vector machine (SVM) to discriminate between three different natural events [13]. The frequency spectrum characteristics of infrasound signals produced by different events, such as volcanoes, are unique, which lays the foundation for infrasound signal classification.
Feature extraction is an important block in any machine-learning-based system. Features extracted from signals can be divided into three categories: time-domain features, frequency-based features and time-frequency features. Time-domain (temporal) features are simple to extract and have an easy physical interpretation; signal energy, zero-crossing rate, maximum amplitude and minimum energy are some examples. Frequency-based (spectral) features are obtained by converting the time-domain signal into the frequency domain using the Fourier transform; examples are the fundamental frequency, the spectral centroid and spectral moments. Time-frequency features describe a signal in both the time and frequency domains simultaneously. One of the most basic forms of time-frequency analysis is the Short-Time Fourier Transform (STFT), and a more sophisticated technique is the wavelet transform.
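To make the first two categories concrete, the following is a minimal sketch that computes two time-domain features (energy and zero-crossing rate) and one spectral feature (the spectral centroid) with NumPy. The function name `basic_features`, the test tone and the sampling rate are illustrative assumptions and are not taken from the experiments in this paper.

```python
import numpy as np

def basic_features(signal, sample_rate):
    """Return (energy, zero-crossing rate, spectral centroid in Hz) of a 1-D signal."""
    x = np.asarray(signal, dtype=float)
    # Time-domain feature: total energy of the signal.
    energy = float(np.sum(x ** 2))
    # Time-domain feature: fraction of adjacent sample pairs whose signs differ.
    signs = np.signbit(x)
    zcr = float(np.mean(signs[1:] != signs[:-1]))
    # Spectral feature: power-weighted mean frequency of the one-sided spectrum.
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    centroid = float(np.sum(freqs * power) / np.sum(power))
    return energy, zcr, centroid

# Illustrative input (assumed values): a 2 Hz tone sampled at 100 Hz for 10 s.
fs = 100.0
t = np.arange(1000) / fs
energy, zcr, centroid = basic_features(np.sin(2 * np.pi * 2.0 * t), fs)
```

For a pure tone the spectral centroid coincides with the tone frequency, which is a convenient sanity check before applying the function to recorded data.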
In recent research, the most common feature extraction methods for sound and infrasound signals are those that describe the linear characteristics of the signal, including cepstral coefficient methods and spectral methods. One of the research challenges is dealing with the noisy environment of infrasound waves. Although these features describe the signal well, they are not robust in noisy environments. We therefore combine them with other methods that are more robust to noise in order to improve the performance of our algorithm.
One feature extraction method that is more robust in noisy environments and can describe the nonlinear characteristics of the signal is the spectral moment method. A short description of linear spectral features and spectral moment features follows.
Linear spectral features are derived from the power spectral density of a signal and capture its linear characteristics. These features include cepstral coefficients, which result from applying the discrete cosine transform to the power spectrum of the signal. Perceptual linear prediction also yields cepstral coefficients, which are similar to the MFCC except that they are based on a perceptual model of human hearing.
These features are used in several studies [8,9,11,12,14] and are standard in most automatic speech recognition (ASR) systems. This feature set captures the linear information of the speech signal well, but it cannot describe the nonlinear characteristics or higher-order statistics of the signal. Furthermore, one of the most important weaknesses of spectral features is their low robustness in noisy environments: they are very sensitive to additive noise [15].
To improve the robustness of these features to background noise and other distortions, efforts have been made to find alternative features [16][17][18][19][20].
Since the high-amplitude parts of the spectrum (such as formants) are less susceptible to noise, Paliwal [20] suggested spectral sub-band centroids (SSC) as new features to complement the cepstral coefficients. These features are obtained by dividing the frequency band into a number of sub-bands and then finding the centroid of each sub-band from the power spectrum computed with the Fourier transform. He tested these features on the recognition of English alphabet letters and showed that the centroid features are more robust to noise, but still weaker than linear prediction cepstral coefficients (LPCC) on clean speech. This idea was improved in [21], where it was shown that these new features have considerable potential for robust speech recognition.
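The sub-band centroid computation described above can be sketched as follows. This is a minimal illustration, assuming equal-width sub-bands and an unweighted power spectrum; the function name `subband_centroids`, the number of bands and the test signal are hypothetical choices, and the band layout used in [20] may differ.

```python
import numpy as np

def subband_centroids(frame, sample_rate, n_bands=4):
    """Return the power-weighted centroid frequency (Hz) of each sub-band."""
    x = np.asarray(frame, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    # Assumption: equal-width sub-bands between 0 Hz and the Nyquist frequency.
    edges = np.linspace(0.0, sample_rate / 2.0, n_bands + 1)
    centroids = []
    for i in range(n_bands):
        lo, hi = edges[i], edges[i + 1]
        # The last band also includes its upper edge (the Nyquist bin).
        upper = (freqs <= hi) if i == n_bands - 1 else (freqs < hi)
        mask = (freqs >= lo) & upper
        total = power[mask].sum()
        if total > 0:
            centroids.append(float(np.sum(freqs[mask] * power[mask]) / total))
        else:
            # A band with no power falls back to its midpoint.
            centroids.append(float((lo + hi) / 2.0))
    return np.array(centroids)

# Illustrative frame (assumed values): tones at 3 Hz and 16 Hz, sampled at 40 Hz.
fs = 40.0
t = np.arange(400) / fs
ssc = subband_centroids(np.sin(2 * np.pi * 3 * t) + np.sin(2 * np.pi * 16 * t), fs)
```

Each centroid tracks the dominant spectral peak within its band, which is why these features are expected to move little when broadband noise is added.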
The idea of spectral sub-band centroids, which are equivalent to first-order spectral moments, has been extended to higher-order normalized spectral sub-band moments (NSSM) [22].

The Proposed Method
We extend the first-order spectral moment to higher orders and, by presenting a two-dimensional definition of these features, introduce mixed moments and use them in our work.
The concept of a moment is used to describe the characteristics of a population. In general, the k-th central moment of a real-valued random variable X is defined as follows [23]:

$$\mu_k = E\left[(X - E[X])^k\right]$$

Moments can be defined in two forms, central and non-central. Unlike non-central moments, which are computed about zero, central moments are computed about the mean.
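The difference between the two forms can be illustrated with a short sketch that estimates both from a finite sample. The function names and the sample values are illustrative assumptions; the expectation is replaced by the sample mean.

```python
import numpy as np

def raw_moment(samples, k):
    """k-th non-central moment: the sample mean of x**k (computed about zero)."""
    x = np.asarray(samples, dtype=float)
    return float(np.mean(x ** k))

def central_moment(samples, k):
    """k-th central moment: the sample mean of (x - mean)**k."""
    x = np.asarray(samples, dtype=float)
    return float(np.mean((x - x.mean()) ** k))

data = [1, 2, 3, 4, 5]                 # illustrative values
variance = central_moment(data, 2)     # second central moment
mean_square = raw_moment(data, 2)      # second non-central moment
```

The second central moment is the variance, while the second non-central moment is the mean square; the two differ by the squared mean.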
Each two-dimensional probability density function can be described by an infinite set of numbers, its moments. The lower-order moments describe the general shape of the distribution, while the higher-order moments describe its finer details and noise characteristics. A two-dimensional density function can be regarded as a two-dimensional shape whose characteristics we want to extract.
With the two-dimensional moment concept in hand, we can apply it to a two-dimensional distribution density function; in that case, the two-dimensional spectral moments of the density function describe its spectral shape. In other words, they provide a description of the spectrogram. For our infrasound signals, the second dimension of the distribution function is the sequence of frames of the signal for a specific event, obtained by applying the window function and the filter bank to the signal. The two-dimensional Cartesian moment $m_{pq}$ of order $p + q$ of a distribution function $f(x, y)$ is defined as:

$$m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x, y) \, dx \, dy$$

For a digitized picture of size $M \times N$ with discrete distribution density $f(x, y)$, the two-dimensional moment is [24]:

$$m_{pq} = \sum_{x} \sum_{y} x^p y^q f(x, y)$$

where the sums run over the $M \times N$ grid. The set of moments of order $n$ contains all moments $m_{pq}$ with $p + q \le n$ and has $\frac{1}{2}(n + 1)(n + 2)$ elements.
The first-order moments, $\{m_{10}, m_{01}\}$, are used to locate the centre of mass of the shape. The coordinates of the centre of mass, $(\bar{x}, \bar{y})$, are given by:

$$\bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}}$$

The second-order moments, $\{m_{02}, m_{11}, m_{20}\}$, known as the moments of inertia, are used to determine the principal axes of the object. These principal axes are the pair of axes about which the second moment is minimum and maximum. The two third-order central moments, $\{m_{30}, m_{03}\}$, describe the skewness of the image projections. Skewness is a classical statistical measure of the degree of asymmetry of a distribution about its mean. The two fourth-order moments, $\{m_{40}, m_{04}\}$, describe the kurtosis of the image projections. Kurtosis is a classical statistical measure of the peakedness of a distribution.
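As a minimal sketch of these definitions, the code below computes Cartesian moments, the centre of mass and central moments of a small two-dimensional array standing in for a spectrogram. The zero-based grid coordinates, the convention that x indexes rows and y indexes columns, and all function names are assumptions made for illustration.

```python
import numpy as np

def cartesian_moment(f, p, q):
    """m_pq = sum over the grid of x**p * y**q * f(x, y); x = row, y = column."""
    f = np.asarray(f, dtype=float)
    x = np.arange(f.shape[0], dtype=float)[:, None]
    y = np.arange(f.shape[1], dtype=float)[None, :]
    return float(np.sum((x ** p) * (y ** q) * f))

def centre_of_mass(f):
    """(x_bar, y_bar) = (m10 / m00, m01 / m00)."""
    m00 = cartesian_moment(f, 0, 0)
    return cartesian_moment(f, 1, 0) / m00, cartesian_moment(f, 0, 1) / m00

def central_moment_2d(f, p, q):
    """Moment of order p + q computed about the centre of mass."""
    f = np.asarray(f, dtype=float)
    x_bar, y_bar = centre_of_mass(f)
    x = np.arange(f.shape[0], dtype=float)[:, None] - x_bar
    y = np.arange(f.shape[1], dtype=float)[None, :] - y_bar
    return float(np.sum((x ** p) * (y ** q) * f))

def moment_set(f, n):
    """All moments m_pq with p + q <= n, keyed by (p, q)."""
    return {(p, q): cartesian_moment(f, p, q)
            for p in range(n + 1) for q in range(n + 1 - p)}

# Illustrative 5 x 7 "spectrogram" with two unit cells (assumed values).
f = np.zeros((5, 7))
f[2, 3] = 1.0
f[2, 4] = 1.0
x_bar, y_bar = centre_of_mass(f)
features = moment_set(f, 4)
```

With n = 4 the moment set has 15 elements, matching the count (n + 1)(n + 2) / 2.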
Moments beyond the fourth order are called high-order moments. In [25], M. Vuskovic and S. Du analysed the impact of noise on temporal signals and found it to be very high for higher-order moments. We now extend the moments and extract more characteristics of the signal through what we call mixed moments. Assume $X = (X_1, X_2, \ldots, X_n)$ is a multivariate random vector of dimension $n$ with finite moments up to fourth order, and write its mean vector $E[(X_1, \ldots, X_n)]^T$ briefly as $\mu = (\mu_1, \ldots, \mu_n)$. The $k$-th order central moment matrix is denoted by $M_k$, with $k = 2, 3, 4$. The covariance, which extends the concept of variance, measures the joint variation of two random variables. The covariance matrix is a matrix whose elements show the correlation among the different parameters of the system. For $k = 2$, the $n \times n$ covariance matrix is:

$$M_2 = E\left[(X - \mu)(X - \mu)^T\right]$$

The elements of the matrix are:

$$\sigma_{ij} = E\left[(X_i - \mu_i)(X_j - \mu_j)\right], \qquad i, j = 1, \ldots, n$$

The idea of extending variance to covariance can now be used to extend the skewness and kurtosis moments to obtain the coskewness and cokurtosis matrices.
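The following sketch estimates the covariance, coskewness and cokurtosis matrices from a data matrix whose rows are observations. It assumes the common convention in which the third- and fourth-order co-moment tensors are flattened to n x n^2 and n x n^3 matrices, and it uses the population (1/N) normalization; both choices, and the function name, are assumptions rather than details given in this paper.

```python
import numpy as np

def comoment_matrices(data):
    """Return (covariance n x n, coskewness n x n^2, cokurtosis n x n^3)."""
    X = np.asarray(data, dtype=float)
    N, n = X.shape
    Z = X - X.mean(axis=0)  # centre each variable on its mean
    # Second order: E[(X_i - mu_i)(X_j - mu_j)]
    m2 = np.einsum('ti,tj->ij', Z, Z) / N
    # Third order: E[(X_i - mu_i)(X_j - mu_j)(X_k - mu_k)], flattened over (j, k)
    m3 = np.einsum('ti,tj,tk->ijk', Z, Z, Z).reshape(n, n * n) / N
    # Fourth order: same pattern with one more factor, flattened over (j, k, l)
    m4 = np.einsum('ti,tj,tk,tl->ijkl', Z, Z, Z, Z).reshape(n, n ** 3) / N
    return m2, m3, m4

# Illustrative data (assumed): 200 observations of 3 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
m2, m3, m4 = comoment_matrices(data)
```

The diagonal-type entries recover the ordinary third and fourth central moments of each variable, while the remaining entries are the mixed moments between different variables.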
At the same time, features derived from the power spectrum capture the spectral characteristics of the signal and cannot be ignored, so we always use them together with the spectral moment features.
There are different methods for selecting features from the feature space. One approach is correlation-based feature selection [27]. In this approach, we search the feature space for subsets of features that are highly correlated with the class while having low intercorrelation. Using the scatter search method, we first select candidate subsets [28] and then evaluate the worth of each subset by considering the individual predictive ability of each feature along with the degree of redundancy between them.
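The subset evaluation step can be sketched with the merit score commonly used in correlation-based feature selection, merit = k * r_cf / sqrt(k + k(k - 1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The sketch assumes absolute Pearson correlation as the correlation measure and numerically coded class labels, which is a simplification; the search over candidate subsets is not shown, and the function name and data are hypothetical.

```python
from itertools import combinations

import numpy as np

def cfs_merit(X, y, subset):
    """Merit of a feature subset: high class correlation, low redundancy."""
    subset = list(subset)
    k = len(subset)
    # Mean absolute correlation between each selected feature and the class.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return float(r_cf)
    # Mean absolute correlation between pairs of selected features.
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for a, b in combinations(subset, 2)])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

# Illustrative data (assumed): feature 1 duplicates feature 0, feature 2 is noise.
rng = np.random.default_rng(0)
y = np.tile([0.0, 1.0], 100)
f0 = y + 0.1 * rng.normal(size=200)
X = np.column_stack([f0, f0.copy(), rng.normal(size=200)])
merit_single = cfs_merit(X, y, [0])
merit_redundant = cfs_merit(X, y, [0, 1])
merit_with_noise = cfs_merit(X, y, [0, 2])
```

Adding an exact duplicate of a feature leaves the merit unchanged, and adding an uninformative feature lowers it, which is the behaviour the redundancy term is meant to produce.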

Experiments
As mentioned in the introduction, infrasound waves originate from different sources. In this paper we use data released by the Defense Threat Reduction Agency (DTRA) data centre for six infrasound sources; the data include infrasound signals obtained from IMS and DoE arrays. Detailed information about these six events is given in Table 1. After feature extraction and selection of a suitable feature subset, we identify the sources of the available signals using the classifier ensemble method [29].
The purpose of this algorithm is to build accurate and diverse classifiers. The main idea is to apply feature extraction to subsets of the features and to reassemble a full feature set for each classifier; PCA is used for this purpose. To create the training data for a classifier, the feature set is randomly divided into k subsets (k is a parameter of the algorithm), and PCA is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus k axis rotations take place to form the new features. Decision trees are used as the base classifiers because they are sensitive to rotation of the feature axes. The aim is to train a multiple-classifier system with the same base classification model over the different subsets.
Let $x = [x_1, \ldots, x_n]^T$ be a sample described by $n$ features, and let $X$ be the data set containing the training samples in the form of an $N \times n$ matrix. Let $Y = [y_1, \ldots, y_N]^T$ be the vector of class labels for the data, where $y_j$ takes a value from the set of class labels $\{\omega_1, \ldots, \omega_c\}$. We denote the classifiers in the ensemble by $D_1, \ldots, D_L$ and the feature set by $F$. All the classifiers can be trained in parallel.
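The rotation step for a single ensemble member can be sketched as follows: the features are split at random into k subsets, PCA is run on each subset with all components retained, and the component vectors are assembled into one rotation matrix. This is a simplified illustration under stated assumptions: PCA is computed from the full training matrix by eigen-decomposition of the subset covariance, no resampling is applied, and training of the decision tree on the rotated data is omitted. The function name and data are hypothetical.

```python
import numpy as np

def rotation_matrix(X, k, rng):
    """Build an n x n rotation from PCA on k random, disjoint feature subsets."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    # Randomly split the feature indices into k subsets.
    subsets = np.array_split(rng.permutation(n), k)
    R = np.zeros((n, n))
    for idx in subsets:
        Z = X[:, idx] - X[:, idx].mean(axis=0)
        cov = Z.T @ Z / max(Z.shape[0] - 1, 1)
        # Eigenvectors of the covariance are the principal axes; keep all of them.
        _, vectors = np.linalg.eigh(cov)
        R[np.ix_(idx, idx)] = vectors
    return R

# Illustrative data (assumed): 50 samples with 7 features, split into k = 3 subsets.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))
R = rotation_matrix(X, 3, rng)
X_rotated = X @ R  # new feature set on which one base classifier would be trained
```

Because every component is retained, the assembled matrix is orthogonal, so the rotation changes the feature axes without discarding any variability in the data. Drawing a new random split for each classifier gives each member of the ensemble a different rotation.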
After data preparation, we applied the algorithms to the data and evaluated their performance with 10-fold cross-validation.
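A minimal sketch of 10-fold cross-validation is given below. The samples are shuffled and split into ten folds, and each fold serves once as the test set while the other nine are used for training. The nearest-centroid classifier is only a stand-in so that the example is self-contained; it is not the classifier ensemble used in this paper, and all names and data are illustrative assumptions.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) for each fold of a shuffled split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i in range(n_folds):
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train, folds[i]

def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    classes = np.unique(y_train)
    centres = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(X_test[:, None, :] - centres[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

def cross_validate(fit_predict, X, y, n_folds=10, seed=0):
    """Return the fraction of samples classified correctly over all folds."""
    correct = 0
    for train, test in k_fold_indices(len(y), n_folds, seed):
        predictions = fit_predict(X[train], y[train], X[test])
        correct += int(np.sum(predictions == y[test]))
    return correct / len(y)

# Illustrative data (assumed): two well-separated classes of 30 samples each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)), rng.normal(5.0, 0.1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
accuracy = cross_validate(nearest_centroid, X, y)
```

Every sample is used for testing exactly once, so the reported score is an average over the whole data set rather than over a single held-out split.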

Results
In this paper we used spectral moment features and combined them with linear spectral features. Together with a feature selection technique and a classifier ensemble, this produces a system that can distinguish infrasound signals from different natural and man-made sources. The system uses the spectral moment features to capture the nonlinear characteristics and higher-order statistics of the signals, and combines them with linear spectral features to obtain a proper linear description of the signal. Furthermore, the feature selection technique yields a compact feature vector that discriminates better between the infrasound events. We obtained a recognition precision of 98.1% with the classifier ensemble method.