A New Similarity Measure for Time Series Data Mining Based on Longest Common Subsequence

In this research, a new similarity measurement method named Developed Longest Common Subsequence (DLCSS) is proposed for time series data mining. The main idea of DLCSS is to combine the logic of the Longest Common Subsequence (LCSS) method with the concept of similarity in time series data. Most studies on time series data mining refer to the LCSS and Dynamic Time Warping (DTW) methods as the best and most widely used similarity measurement methods.


Introduction
Time series data is a set of ordered numbers that expresses the temporal properties of an object at each moment in time [1]. Time series data exist in almost all areas: in the medical field (heart rate, breathing intensity, and neurotoxicity of the brain over a period of time), in climatology (the daily temperature or daily humidity of a location), in sales (daily, weekly, monthly, or annual sales), and in many other fields. Time series data have three important features. The first is high dimensionality: a single time series can have hundreds of members or more, which occupies a large amount of memory and slows down time series data mining computations. The second is data dependency, which plays a significant role in mining time series: because the value of each member of a time series is influenced by the values of its preceding members, appropriate mathematical and statistical relationships must be determined carefully. The third is the need for continuous updating in most real applications [2][3][4][5].
Data mining is a particularly important way of discovering knowledge from a wealth of data, and the use of data mining techniques such as classification, clustering, rule deduction, Query by content, and forecasting is increasing in fields such as production, medicine, the social sciences, meteorology, the stock exchange, sales, and customer service [2].
The time series data mining process is difficult and specialized, because data mining techniques are designed for static data, and the corresponding algorithms must be adapted for time series data [6]. These adaptations are reducing the dimension of the time series and choosing an appropriate similarity measurement method. Reducing the dimension of a time series is called indexing. Its aim is to reduce calculation time, and it should be done in such a way that the knowledge lost through the reduction does not prevent the right result from being reached [7]. A survey of the various types of indexing methods has been carried out by Aghabozorgi et al. (2015) [8]. Choosing an appropriate similarity measurement method means determining the time series similarity measure best suited to the mining task, which is an important factor in the quality of the results. Providing a suitable method for measuring the similarity of time series has been one of the most widely studied issues in time series data mining research in recent years [8]. Furthermore, Aghabozorgi et al. (2015) showed that the Longest Common Subsequence (LCSS) and Dynamic Time Warping (DTW) methods have been used in more studies and perform much better than other methods [8].
Given the importance and impact of the similarity measurement method in time series data mining, this research proposes a new method for time series similarity measurement and compares its performance with that of the LCSS and DTW similarity measurement methods.
In the following, the concepts of similarity, the kinds of similarity measurement methods, and the relations used to calculate some of them, especially the LCSS and DTW methods, are discussed first. Next, the developed LCSS-based methods and their specifications are described. Then the proposed method for measuring the similarity of time series is presented. Finally, the Query by content and K-medoids techniques are applied to the time series datasets with the proposed, LCSS, and DTW methods, and the results are analyzed.

Similarity and History of Similarity Measurement Methods
As discussed above, one of the important problems in time series data mining is the similarity problem. In the literature, similarity in time series is defined as either point-to-point similarity or flexible similarity (one point to several points, or several points to one point). Another definition distinguishes similarity in time, similarity in shape, and similarity in model. Similarity in time means that two time series are similar at every given moment in time. Similarity in shape means that two time series are similar with respect to their subsequences. Similarity in model means that the parameters and the fitted model of the two time series are the same [9].
There are generally two approaches to time series similarity measurement: the whole matching approach and the subsequence matching approach. In the whole matching approach, the full length of each time series is used; that is, if the length of each time series is m, all m data points of the first time series and all m data points of the second are used. In the subsequence matching approach, the time series have different lengths, and the similarity between them is measured through their subsequences. If their lengths are n and m, respectively, and n < m, then subsequences of n consecutive data points are selected from the longer time series, and the similarity of each of these subsequences to the shorter time series is measured. The highest similarity obtained is taken as the similarity of the two time series [10].
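The subsequence matching approach can be illustrated with a sliding window. The following is a minimal sketch, not an implementation from the cited work: the function names `best_subsequence_similarity` and `neg_euclidean` are hypothetical, and the negated Euclidean distance is only one possible choice of similarity.

```python
from typing import Callable, Sequence


def best_subsequence_similarity(
    short: Sequence[float],
    long: Sequence[float],
    similarity: Callable[[Sequence[float], Sequence[float]], float],
) -> float:
    """Slide a window of len(short) over `long`; return the highest similarity."""
    n, m = len(short), len(long)
    if n == 0 or n > m:
        raise ValueError("`short` must be non-empty and no longer than `long`")
    # There are m - n + 1 windows of n consecutive data points.
    return max(similarity(short, long[i:i + n]) for i in range(m - n + 1))


def neg_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    # Hypothetical similarity: negated Euclidean distance (larger = more similar).
    return -(sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5)
```

Any similarity function that accepts two equal-length sequences can be passed in place of `neg_euclidean`.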
Similarity measures can also be divided into four categories: (1) shape-based distance measures, (2) edit-based distance measures, (3) feature-based distance measures, and (4) model-based distance measures [11].
In the following, some of the well-known shape-based, edit-based, and feature-based distance measures for time series are presented, together with their strengths and weaknesses.

Shape Based Distance Measures Group
This group of measures works directly on the raw values and shapes of the time series in different ways. The most commonly used methods of this group are discussed below. Suppose that $TS_1 = (x_1, x_2, \ldots, x_n)$ and $TS_2 = (y_1, y_2, \ldots, y_n)$ represent the time series X and Y, respectively, each of length n.

Distance Measurement Method Based on Lp-Norms
One of the best-known shape-based distance measures used in time series data mining is the Lp-norm method. It is a strict metric, applies only to time series of equal length, and is a point-to-point similarity measure [12]. In this method the distance between $TS_1$ and $TS_2$ is calculated by relation (1):
$$d_{L_p}(TS_1, TS_2) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p} \qquad (1)$$
In this relation, p is a natural number; for p = 1 it is called the coordinate relation (Manhattan relation), and for p = 2 it is known as the Euclidean relation. The main weakness of this method is that when two time series are similar in shape but the similarity occurs with a time delay, this relation cannot identify it.
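As a brief illustration, the Lp-norm distance can be written in a few lines. This is a sketch under the standard definition of the Lp norm; the function name `lp_distance` is hypothetical.

```python
from typing import Sequence


def lp_distance(x: Sequence[float], y: Sequence[float], p: int = 2) -> float:
    """Point-to-point Lp-norm distance between two equal-length time series."""
    if len(x) != len(y):
        raise ValueError("Lp-norm distance requires time series of equal length")
    if p < 1:
        raise ValueError("p must be a natural number (p >= 1)")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)
```

With p = 1 the function gives the Manhattan distance and with p = 2 the Euclidean distance. Because the comparison is strictly point to point, a series and its time-shifted copy can receive a large distance even though their shapes are identical.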

Short-time Series Method
In the Short Time Series (STS) method, each time series is considered as a linear function. In this method the distance between $TS_1$ and $TS_2$ is calculated by relation (2), where the parameter $t_i$ represents the time at which the i-th data point was measured [13]. The STS method has the same weakness as the Lp-norm method.
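The idea of comparing time series through their slopes can be sketched as follows. Relation (2) is not reproduced in this text, so the form used here, the root of the summed squared differences between the slopes of consecutive segments, is an assumption based on the commonly used definition of the STS distance. The function name `sts_distance` is hypothetical.

```python
from typing import Sequence


def sts_distance(x: Sequence[float], y: Sequence[float], t: Sequence[float]) -> float:
    """Slope-based distance between two series sampled at the same times t.

    Assumed form: sqrt(sum_k (slope_y(k) - slope_x(k))^2), where slope(k) is
    the slope of the segment between measurement times t[k] and t[k + 1].
    """
    if not (len(x) == len(y) == len(t)):
        raise ValueError("x, y and t must have the same length")
    total = 0.0
    for k in range(len(t) - 1):
        dt = t[k + 1] - t[k]
        slope_x = (x[k + 1] - x[k]) / dt
        slope_y = (y[k + 1] - y[k]) / dt
        total += (slope_y - slope_x) ** 2
    return total ** 0.5
```

Under this assumed form, two series that differ only by a constant vertical offset have distance zero, while a delay along the time axis is still not recognized.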

Dynamic Time Warping Method (DTW)
The DTW method overcomes the weakness of the methods above [14]. Sometimes two time series are roughly the same in general, but this similarity does not coincide along the time axis. The DTW method was introduced to calculate the similarity between two time series of different lengths and differs significantly from the previous methods: it allows a time series to be lengthened by stretching it (repeating some of its data points that are similar to data points of the other time series). This method uses a recursive relation to calculate the dissimilarity between two time series of lengths n and m, respectively, as $DTW(TS_1, TS_2) = M(n, m)$, where $M(n, m)$ is calculated by relation (3).
The result of the DTW method is therefore $DTW(TS_1, TS_2)$ together with a sequence of paired elements of length r, where each pair holds the data points of the first and second time series that are matched (very close together); the length of this sequence is always greater than or equal to $\max(n, m)$. To make the DTW of two time series comparable with the DTW of two other time series, the relation $dissim(TS_1, TS_2) = \frac{DTW(TS_1, TS_2)}{r}$ is used.
In general, this method performs better than other time series similarity measurement methods and has a wider range of application [8].
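The DTW computation can be sketched with the usual dynamic programming table. Relation (3) is not reproduced in this text, so the sketch assumes the standard recurrence, in which the local cost (here the absolute difference, an assumption) is added to the minimum of the three neighbouring cells. The function name `dtw` is hypothetical. The sketch also backtracks through the table to count the paired elements on the warping path.

```python
from typing import Sequence, Tuple


def dtw(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (cumulative cost M(n, m), warping path length r)."""
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both time series must be non-empty")
    inf = float("inf")
    M = [[inf] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # assumed local cost
            M[i][j] = cost + min(M[i - 1][j - 1], M[i - 1][j], M[i][j - 1])
    # Backtrack from (n, m) to (1, 1), counting the paired elements.
    i, j, r = n, m, 1
    while (i, j) != (1, 1):
        _, i, j = min(
            (M[i - 1][j - 1], i - 1, j - 1),
            (M[i - 1][j], i - 1, j),
            (M[i][j - 1], i, j - 1),
        )
        r += 1
    return M[n][m], r
```

For example, `dtw([1, 2, 3], [1, 2, 2, 3])` returns a cost of zero with a path of four pairs, because the value 2 of the first series is repeated to match both 2s of the second. Dividing the cumulative cost by r gives a length-normalized dissimilarity that can be compared across pairs of different lengths.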

Edit Based Measurement Method Group
The edit-based measurement methods were originally introduced to calculate the similarity between two sequences of characters, based on the minimum number of editing operations (deletion, replacement, and insertion) needed to convert one sequence into the other. Some of the most common methods of this group are discussed below. Suppose that $S_1 = (x_1, x_2, \ldots, x_n)$ and $S_2 = (y_1, y_2, \ldots, y_m)$ are two sequences of characters.

Levenshtein Distance Measurement Method
The Levenshtein distance measurement method was introduced by the Russian scientist Vladimir Levenshtein and is widely used in spell checking, speech recognition, DNA analysis, and plagiarism detection [15]. If the lengths of the two sequences are n and m, respectively, then $Lev(S_1, S_2) = M(n, m)$, where $M(n, m)$ is calculated from relation (4) with $Sim(x_i, y_j) = 0$ if $x_i = y_j$ and $Sim(x_i, y_j) = 1$ if $x_i \neq y_j$.
This method was designed to compare two sequences of characters, but it can be applied to two time series by defining a similarity threshold. It is rarely used in time series data mining.
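The edit distance can be sketched with the standard dynamic programming recurrence. Relation (4) is not reproduced in this text, so the recurrence below is the textbook form and should be read as an assumption. The function name `edit_distance` and the optional threshold parameter `eps`, which makes the same routine usable for numerical time series, are hypothetical.

```python
from typing import Optional, Sequence


def edit_distance(s1: Sequence, s2: Sequence, eps: Optional[float] = None) -> int:
    """Minimum number of insertions, deletions and replacements.

    With eps=None, elements match when they are equal (character sequences).
    With a numeric eps, elements match when |a - b| <= eps (time series).
    """
    def differs(a, b) -> bool:
        return a != b if eps is None else abs(a - b) > eps

    n, m = len(s1), len(s2)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        M[i][0] = i  # delete all i leading elements
    for j in range(m + 1):
        M[0][j] = j  # insert all j leading elements
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = 1 if differs(s1[i - 1], s2[j - 1]) else 0
            M[i][j] = min(M[i - 1][j] + 1, M[i][j - 1] + 1, M[i - 1][j - 1] + sim)
    return M[n][m]
```

For instance, `edit_distance("kitten", "sitting")` is 3, and `edit_distance([1.0, 2.0, 3.0], [1.05, 2.5, 3.0], eps=0.1)` is 1 because only the middle pair differs by more than the threshold.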

Longest Common Subsequence Method (LCSS)
The LCSS problem is a classic problem in computer science: the task is to find the longest common subsequence of two sequences. The most important feature of this method is that it can ignore noise and distorted values. The method was designed to compare two sequences of characters. Similarity in this method is defined as the equality of two characters of the two sequences, and $LCSS(S_1, S_2) = M(n, m)$, where $M(n, m)$ is calculated by relation (5) and $0 \le M(n, m) \le \min(n, m)$.
A normalized relation $Sim(S_1, S_2)$, computed from $LCSS(S_1, S_2)$ and the lengths of the sequences, is used to make the LCSS of two time series comparable with the LCSS of two other time series. Its value lies in the range 0 to 1; the closer it is to one, the more similar the two sequences are.
To use the LCSS method for numerical sequences (time series), the way the similarity of two data points is determined has been changed: when the absolute value of the difference between two data points of the two time series is less than or equal to the similarity threshold, they are considered similar; otherwise they are not [16][17]. With this change, relation (5) is rewritten as relation (6).
The similarity threshold is $\varepsilon$. The logic used in this relation is illustrated in Figure 1. The result of LCSS is influenced by the value of $\varepsilon$: the smaller $\varepsilon$, the smaller the LCSS, and the larger $\varepsilon$, the larger the LCSS. The appropriate value of $\varepsilon$ depends on the nature of the data; without any knowledge of the dataset and its features, using this method is practically meaningless.
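The threshold-based LCSS can be sketched as follows. Relations (5) and (6) are not reproduced in this text, so the recurrence below is the commonly used form and should be read as an assumption: a match within the threshold extends the diagonal cell by one, otherwise the larger of the two neighbouring cells is kept. The function name `lcss` is hypothetical.

```python
from typing import Sequence


def lcss(x: Sequence[float], y: Sequence[float], eps: float) -> int:
    """Length of the longest common subsequence under similarity threshold eps."""
    n, m = len(x), len(y)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(x[i - 1] - y[j - 1]) <= eps:
                M[i][j] = M[i - 1][j - 1] + 1  # the two data points are similar
            else:
                M[i][j] = max(M[i - 1][j], M[i][j - 1])
    return M[n][m]
```

The sensitivity to the threshold is easy to see on a small example: for `x = [1, 2, 3, 4]` and `y = [1.1, 9, 3.05, 4.2]`, a threshold of 0.25 gives an LCSS of 3 (the outlier 9 is skipped), while a threshold of 0.01 gives 0.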

Edit Distance for Real Sequence Method (EDR)
In this method, the equality of the characters of the two sequences is the criterion for counting the number of changes needed to make the two sequences identical. By definition, $EDR(S_1, S_2) = M(n, m)$, where $M(n, m)$ is calculated from relation (7) with $SC = 0$ if $x_i = y_j$ and $SC = 1$ if $x_i \neq y_j$.
By defining a similarity threshold, this method can be used to measure time series distance, but it has seen limited use in time series data mining [17].

Edit Distance with Real Penalty Method (ERP)
The ERP method is an adaptation of the edit distance that combines DTW and EDR [18]. It is used to measure the distance between time series of unequal lengths. In this method $ERP(TS_1, TS_2) = M(n, m)$, where $M(n, m)$ is calculated from relation (8). In that relation, g is a constant that represents the penalty and is determined by the user. This method has seen limited use in time series data mining.
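The role of the penalty constant g can be illustrated with a short sketch. Relation (8) is not reproduced in this text, so the recurrence below follows the commonly used definition of ERP and should be read as an assumption: aligning two data points costs their absolute difference, and leaving a data point unmatched costs its absolute difference from g. The function name `erp` is hypothetical.

```python
from typing import Sequence


def erp(x: Sequence[float], y: Sequence[float], g: float = 0.0) -> float:
    """Edit distance with real penalty between series of possibly unequal length."""
    n, m = len(x), len(y)
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        M[i][0] = M[i - 1][0] + abs(x[i - 1] - g)  # all of x matched to gaps
    for j in range(1, m + 1):
        M[0][j] = M[0][j - 1] + abs(y[j - 1] - g)  # all of y matched to gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i][j] = min(
                M[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # align x_i with y_j
                M[i - 1][j] + abs(x[i - 1] - g),             # gap opposite x_i
                M[i][j - 1] + abs(y[j - 1] - g),             # gap opposite y_j
            )
    return M[n][m]
```

For example, `erp([1, 2, 3], [1, 3], g=0)` is 2: the cheapest alignment matches 1 with 1 and 3 with 3 and pays |2 - 0| for the unmatched value.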

Feature Based Distance Measurement Group
The feature-based distance measures focus on extracting a set of features from the time series and calculating the similarities between these features, rather than using the raw data of the time series. Suppose that $TS_1 = (x_1, x_2, \ldots, x_n)$ and $TS_2 = (y_1, y_2, \ldots, y_n)$ represent the time series X and Y, respectively.

Pearson Correlation Coefficient and Related Coefficient
The Pearson correlation coefficient (PCC) is one of the feature-based distance methods and uses relation (9). In this relation, $\mu_1$ and $\mu_2$ are the means of the first and second time series, and $Sd_1$ and $Sd_2$ are their standard deviations.
Based on this relation, two distance metrics are defined: $d_{PC}^{1}(TS_1, TS_2) = \left( \frac{1 - PCC}{1 + PCC} \right)^{\beta}$ and $d_{PC}^{2}(TS_1, TS_2) = 2(1 - PCC)$, where the value of $\beta$ is defined by the user.
Note that the length of two time series must be equal in these relations.
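The correlation-based distances can be sketched as follows. Relation (9) is not reproduced in this text, so the sketch assumes the standard definition of the Pearson correlation coefficient, together with the two distance forms ((1 - PCC) / (1 + PCC))^β and 2(1 - PCC). The function names `pearson`, `d_pc1` and `d_pc2` are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length time series."""
    if len(x) != len(y):
        raise ValueError("the two time series must have equal length")
    n = len(x)
    mu1, mu2 = sum(x) / n, sum(y) / n
    cov = sum((a - mu1) * (b - mu2) for a, b in zip(x, y))
    var1 = sum((a - mu1) ** 2 for a in x)
    var2 = sum((b - mu2) ** 2 for b in y)
    return cov / math.sqrt(var1 * var2)  # undefined for constant series


def d_pc1(x: Sequence[float], y: Sequence[float], beta: float = 1.0) -> float:
    pcc = pearson(x, y)
    return ((1 - pcc) / (1 + pcc)) ** beta  # undefined when PCC = -1


def d_pc2(x: Sequence[float], y: Sequence[float]) -> float:
    return 2 * (1 - pearson(x, y))
```

Perfectly correlated series such as `[1, 2, 3]` and `[2, 4, 6]` have distance zero under both forms, regardless of scale, while perfectly anti-correlated series reach the maximum of 4 under the second form.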

Cosine Angle
The root of $d_{PC}(TS_1, TS_2)$ is called the Cosine Angle, $CA(TS_1, TS_2)$. Note that the two time series must also have equal lengths in this relation. The weakness of these methods is the same as that of the Lp-norm method.
Notably, no single one of the methods above can be said to be appropriate for every time series database. Based on the research carried out, each method works well for some datasets and poorly for the rest, and the DTW and LCSS methods are the most widely used and perform better than the other methods [8][18][19][20][21][22][23][24][25][26].
The purpose of this research is therefore to develop the LCSS method for measuring the similarity of time series. Before the new method is proposed, the existing LCSS-based methods are reviewed.

Constrained Longest Common Subsequence Method
The Constrained Longest Common Subsequence (C-LCSS) method calculates the longest common subsequence of two sequences with respect to a third sequence. By definition, if $S_1$ and $S_2$ are two input sequences and B is a finite sequence of length r, then the constrained longest common subsequence is the longest subsequence common to the two input sequences that also includes B. The C-LCSS method has seen limited use in aligning two biological sequences with a common, assumed structure. It is not used as a distance measure in time series data mining [27].

Multiple Longest Common Subsequence Method
The Multiple Longest Common Subsequence (MLCSS) method calculates the longest common subsequence of more than two sequences. By definition, if $S_1, S_2, \ldots, S_k$ denote the k input sequences with k > 2, this method tries to find the longest common subsequence of these sequences. This is considered an NP-hard problem for k > 3, so heuristic methods are needed to solve it. This method is not used in time series data mining [28][29].

Weighted Longest Common Subsequence Method
The Weighted Longest Common Subsequence (WLCSS), or Heaviest Common Subsequence (HCSS), method calculates the common subsequence of two sequences that has the highest weight. In this method, each character has a positive weight, and the purpose is to determine the common subsequence of the two sequences that has the maximum weight among all available common subsequences. Due to the nature of this method, it cannot be used in time series data mining [30].

Flexible Longest Common Subsequence Method
The Flexible Longest Common Subsequence (FLCSS) method is a newer type of longest common subsequence method that seeks the common subsequence of two sequences with the highest consequence points. In other words, it can be used when the arrangement of the elements is important. Because the arrangement of the common subsequence is not important in time series data mining, this method is practically not used there [31].

Longest Common Subsequence with Gapped Constraint Method
The Longest Common Subsequence with Gapped Constraint (LCSSGC) method is a modification of LCSS. If A and B are two input sequences and C is a constraint sequence with a gap list, with lengths m, n, and r, respectively, the LCSSGC problem is to find the longest subsequence Z of the sequences A, B, and C [32]. Due to the nature of this method, it cannot be used in time series data mining.
In summary, the methods developed from LCSS either cannot be used in time series data mining, like the C-LCSS, WLCSS, FLCSS, and LCSSGC methods, or are used only to determine a representative of several time series, like the MLCSS method.

Proposed Method for Measuring the Similarity of Time Series
As will be shown in section 6.1, the LCSS method is highly sensitive to the similarity threshold. To reduce this sensitivity and to increase the quality of the results of data mining processes such as the Query by content and clustering techniques, this research proposes a new method based on the logic of LCSS, named "Developed Longest Common Subsequence" or "DLCSS".
The DLCSS method uses two similarity thresholds: the first, $\varepsilon_1$, is used to recognize the definite similarity of two data points, and the second, $\varepsilon_2$, is used to detect their conditional similarity. Some conditions must be met in each of these cases. Relation (10) is defined in terms of $a = |x_i - y_j|$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, and satisfies $0 \le DLCSS(TS_1, TS_2) = M(m, n) \le \min(m, n)$.
For a better understanding of the DLCSS method, see the conceptual description of its similarity thresholds in Figure 2. Unlike LCSS, which gives the length of the longest common subsequence and is a natural number between zero and min(m, n), DLCSS gives a similarity score between two time series, which can be any real number between zero and min(m, n).
In contrast to LCSS, DLCSS does not take a rigid (zero or one) view of similarity: a data point that is slightly farther away, but closer than the other adjacent data points, still has a chance to contribute to the similarity.
The logic used in the DLCSS method is as follows:
a) Two data points of the two time series are certainly similar if the absolute value of the difference between them is smaller than or equal to $\varepsilon_1$. In this case, one unit is added to the similarity score of the preceding two data points.
b) Two data points are possibly similar if the absolute value of the difference between them is larger than $\varepsilon_1$ and smaller than or equal to $\varepsilon_2$. Whether this condition holds depends on the status of the data points before them. If it holds, the value added to the similarity score is a fraction of one.
c) Two data points of the two series are definitely not similar if the absolute value of the difference between them is greater than $\varepsilon_2$. In this case, the similarity score is equal to the maximum similarity score before them.
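The three cases above can be illustrated with a two-threshold variant of the LCSS recurrence. This sketch is not relation (10): the exact fraction added in case (b) and the precise condition on the preceding data points are not recoverable from this text. The linear ramp used here (1 at $\varepsilon_1$, falling to 0 at $\varepsilon_2$) and the rule of keeping the larger of the partial match and the neighbouring scores are both assumptions made purely for illustration. The function name `dlcss_sketch` is hypothetical.

```python
from typing import Sequence


def dlcss_sketch(x: Sequence[float], y: Sequence[float],
                 eps1: float, eps2: float) -> float:
    """Illustrative two-threshold similarity score (NOT the paper's relation (10))."""
    if not 0 <= eps1 <= eps2:
        raise ValueError("thresholds must satisfy 0 <= eps1 <= eps2")
    m, n = len(x), len(y)
    M = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a = abs(x[i - 1] - y[j - 1])
            if a <= eps1:
                # (a) definite similarity: add one unit to the preceding score
                M[i][j] = M[i - 1][j - 1] + 1.0
            elif a <= eps2:
                # (b) conditional similarity: ASSUMED linear fraction of one,
                # kept only if it beats the neighbouring scores (assumption)
                fraction = (eps2 - a) / (eps2 - eps1)
                M[i][j] = max(M[i - 1][j - 1] + fraction, M[i - 1][j], M[i][j - 1])
            else:
                # (c) no similarity: carry the maximum score before this pair
                M[i][j] = max(M[i - 1][j], M[i][j - 1])
    return M[m][n]
```

When the two thresholds are equal, case (b) never occurs and the sketch reduces to the threshold-based LCSS recurrence. With distinct thresholds the score is a real number between zero and min(m, n), as described above.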

Performance Evaluation Approach
In this study, 23 time series datasets from the UCR archive are used; their names and specifications are presented in Table 1. Each dataset has two distinct subsets, the training dataset and the experimental dataset. In each subset, the class of each time series is specified. For example, the "statistical control" dataset has 6 clusters (classes), the length of each time series is 60, and the training and experimental datasets contain 300 time series each.
In this research, the performance of the LCSS and DTW methods is compared with that of the proposed method on these datasets using the Query by content and K-medoids clustering techniques.
The Query by content technique has four steps in this research. First, the similarity between each time series of the experimental dataset and every time series of the training dataset is measured with the similarity measurement method. Second, the time series of the training dataset that is most similar to the experimental time series is determined. Third, the experimental time series is assigned the class of that most similar training time series. Fourth, the accuracy index is calculated as the ratio of the number of time series of the experimental dataset whose class is correctly determined to the total number of time series of the experimental dataset. This accuracy is the performance of the Query by content technique.
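The four steps can be sketched as a nearest-neighbour search. This is a minimal illustration, not the code used in the experiments: the function names `query_by_content_accuracy` and `neg_abs_diff` are hypothetical, and any similarity function for which a larger value means more similar can be passed in.

```python
from typing import Callable, Hashable, Sequence


def query_by_content_accuracy(
    train: Sequence[Sequence[float]],
    train_labels: Sequence[Hashable],
    test: Sequence[Sequence[float]],
    test_labels: Sequence[Hashable],
    similarity: Callable[[Sequence[float], Sequence[float]], float],
) -> float:
    """Share of experimental series whose class matches the most similar training series."""
    correct = 0
    for series, true_label in zip(test, test_labels):
        # Steps 1 and 2: measure the similarity to every training series, keep the best.
        best = max(range(len(train)), key=lambda k: similarity(series, train[k]))
        # Step 3: take over the class of the most similar training series.
        if train_labels[best] == true_label:
            correct += 1
    # Step 4: accuracy index.
    return correct / len(test)


def neg_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    # Hypothetical similarity: negated Manhattan distance.
    return -sum(abs(p - q) for p, q in zip(a, b))
```

For a distance-like measure such as DTW, the negated distance can be used as the similarity, so that the same routine serves all three methods being compared.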
The K-medoids clustering technique is used in two steps in this research. In the first step, the technique is run on the training dataset, and the best number of clusters and the representatives of those clusters are selected based on the accuracy of the clustering. The clustering accuracy index in this process is the ratio of the number of time series of the training dataset that are assigned to the right cluster to the total number of time series of the training dataset. In the second step, the experimental dataset is grouped using the best number of clusters and the cluster representatives obtained in the first step, and the accuracy of this grouping is calculated as the ratio of the number of time series of the experimental dataset that are assigned to the right cluster to the total number of time series of the experimental dataset. These accuracies represent the performance of the K-medoids clustering technique.
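The clustering step can be sketched on a precomputed dissimilarity matrix. This is a minimal alternating variant of K-medoids given for illustration only; it is not claimed to be the exact procedure used in the experiments. The function name `k_medoids` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def k_medoids(dist: Sequence[Sequence[float]], k: int,
              max_iter: int = 200, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Cluster n items given an n x n dissimilarity matrix.

    Returns (medoid indices, medoid index assigned to each item).
    """
    rng = random.Random(seed)
    n = len(dist)
    medoids = rng.sample(range(n), k)  # random initial cluster centers

    def assign_all(centers: List[int]) -> List[int]:
        return [min(centers, key=lambda c: dist[i][c]) for i in range(n)]

    for _ in range(max_iter):  # at most max_iter displacements of the centers
        assign = assign_all(medoids)
        new_medoids = []
        for c in medoids:
            members = [i for i in range(n) if assign[i] == c]
            if not members:  # keep an empty cluster's center unchanged
                new_medoids.append(c)
                continue
            # The new representative minimizes the total dissimilarity to its cluster.
            new_medoids.append(
                min(members, key=lambda cand: sum(dist[cand][i] for i in members))
            )
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return medoids, assign_all(medoids)
```

Running the function with many different seeds and keeping the best run corresponds to creating several random initial cluster centers, and `max_iter` bounds the number of times the centers are displaced. The matrix `dist` can be filled with any dissimilarity, for example a DTW cost or a negated similarity score.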

The Query by Content Results
In this section, the results of the Query by content technique with the DTW, LCSS, and DLCSS methods as the similarity measure are presented in Tables 2, 3, and 4, respectively, and the results are analyzed.
In Table 2, for example, for the statistical control dataset, the class of 97.33% of the time series of the experimental dataset is correctly recognized from the classes of the time series of the training dataset. The accuracy for some datasets is very low, such as the OSU dataset with 46.28% and the Middle-P-T dataset with 58.4%. Over all the datasets, this method reaches 80.1% accuracy in determining the correct class of the time series of the experimental dataset. In Table 3, for example, for the statistical control dataset with $\varepsilon = 0.05$, the class of 209 of the 300 time series of the experimental dataset is correctly identified, which equals 69.67%. This process is performed for all datasets and for different values of $\varepsilon$. As noted previously, the different results obtained with different similarity thresholds in LCSS show the effect of the threshold value on the result. For example, for the statistical control dataset, increasing $\varepsilon$ from 0.05 to 0.35 raises the accuracy of class recognition from 69.67% to 94.67%. For the Gun-Point dataset the trend is first increasing and then decreasing, with its maximum at $\varepsilon = 0.15$. These results show that the value of the similarity threshold strongly affects the result of the Query by content technique, and that an inappropriate choice of threshold can have adverse effects.
Looking at the results in Table 3 as a whole, as the similarity threshold increases from 0.05 to 0.35, the accuracy of class recognition with the LCSS method is ascending for the SC, CBF, ECG, Face4, Swedish Leaf, 50words, Distal, and Italy power demand datasets (8 of the 23 datasets); descending for the Adiac, Car, and Olive Oil datasets (3 of the 23 datasets); first ascending and then descending for GP, Medical, OSU, Beef, Lightning, Fish, Trace, Lightning7, Middle-PT, Diatom size reduction, and Gun-Point (11 of the 23 datasets); and first descending and then ascending for the Plane dataset. Based on these results, the highest accuracy of class recognition over all datasets occurs at $\varepsilon = 0.25$ and equals 81.02%.
A notable point about the best similarity threshold ($\varepsilon = 0.25$) is the low accuracy of class determination for the Adiac and Olive Oil datasets, 42.7% and 45%, respectively; both are low and are the worst results for these datasets among the different values of $\varepsilon$, which is a weakness of the LCSS method. Table 4 presents the accuracy of the Query by content technique with the DLCSS method for $\varepsilon_1 = 0.05$ and different values of $\varepsilon_2$. As the results show, the accuracy obtained for each dataset is more stable than with LCSS. For example, the accuracies obtained for the Adiac and Olive Oil datasets are more than 80% and 70%, respectively. Overall, the best result over all datasets is obtained for $\varepsilon_2 = 0.6$, where the accuracy is 84.3%, higher than the best result obtained by the LCSS method (81.02%).
The Query by content technique with DLCSS was run again with $\varepsilon_1 = 0.10$ and different values of $\varepsilon_2$; the results are presented in Table 5. As the results show, the accuracy obtained for each of the datasets is also more stable than with LCSS. However, in contrast to the results in Table 4, the accuracies for the Adiac and Olive Oil datasets are over 68% and about 46%, respectively. Overall, the best result over all datasets is obtained for $\varepsilon_2 = 0.6$, where the accuracy is 83.5%; this is higher than the best accuracy of the LCSS method but lower than the best result in Table 4. Therefore, among the different threshold values for the DLCSS method, the best result occurs at $\varepsilon_1 = 0.05$ and $\varepsilon_2 = 0.6$. A summary of the best results of the Query by content technique with DTW, LCSS, and DLCSS is presented in Table 6.
It now needs to be checked whether the performance of DLCSS is better than that of DTW and of LCSS. For this purpose, a paired comparison test is used. The null hypothesis of this test is that the performance of the two methods is statistically the same, and the alternative hypothesis is that it is not. If a 1% error is tolerable, the interval [1.299, 9.897] is estimated for the accuracy difference between the DLCSS and DTW methods. This interval does not contain zero, so with 99% confidence the two methods differ in performance; since the difference is positive, DLCSS performs better than DTW with 99% confidence. Likewise, if a 10% error is tolerable, the interval [0.207, 7.812] is estimated for the accuracy difference between DLCSS and LCSS, so it can be argued that DLCSS performs better than LCSS with 90% confidence.
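An interval of this kind can be sketched as a paired t-interval on the per-dataset accuracy differences. The exact form of the test is not spelled out in this text, so the paired t-interval is an assumption, and the function name `paired_difference_interval` is hypothetical. The critical value `t_crit` is passed in by the caller; for 23 datasets (22 degrees of freedom) the two-sided 99% value is approximately 2.819.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_difference_interval(acc_a: Sequence[float], acc_b: Sequence[float],
                               t_crit: float) -> Tuple[float, float]:
    """Confidence interval for the mean of the paired differences acc_a - acc_b."""
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    mean = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean - t_crit * se, mean + t_crit * se
```

If the resulting interval lies entirely above zero, the first method can be said to outperform the second at the chosen confidence level; if it contains zero, the null hypothesis of equal performance is not rejected.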

The K-medoids Clustering Results
As discussed earlier, the K-medoids clustering technique is used in two steps in this research. In the first step, each training dataset is clustered by K-medoids, the best number of clusters and the best cluster representatives are selected based on the value of the objective function, and the accuracy index is calculated. In the second step, each experimental dataset is grouped based on the results of the first step, and the accuracy is calculated again. The purpose of this process is to answer these questions:
Question 1: Is the performance of DLCSS in the clustering technique better than the performance of DTW and LCSS?
Question 2: Is the performance of DLCSS in determining the number of clusters better than the performance of DTW and LCSS?
Question 3: Is the performance of DLCSS in determining the cluster representatives better than the performance of DTW and LCSS?
To answer these questions, the clustering technique was run on the 23 training datasets with DTW, LCSS, and DLCSS in two modes. The first mode creates 500 initial cluster centers and allows at most 200 displacements of the cluster centers. The second mode creates 500 initial cluster centers and allows at most 500 displacements of the cluster centers. The results are shown in Tables 7 to 12.
First mode: 500 random initial cluster centers and at most 200 displacements of the cluster centers
Table 7 shows the results of the K-medoids clustering technique with the DTW, LCSS, and DLCSS methods in the first mode. For example, for the statistical control dataset, the best result with DTW is obtained with 6 clusters and 98.67% accuracy, which means that 98.67% of the time series of this dataset are assigned to the correct cluster. The best result with LCSS and $\varepsilon = 0.25$ is 6 clusters with 85.33% accuracy, and the best result with DLCSS, $\varepsilon_1 = 0.05$ and $\varepsilon_2 = 0.6$, is 6 clusters with 90.33% accuracy.
Based on these results, the clustering accuracy over all training datasets is 55.89% with DTW, 58.44% with LCSS, and 62.02% with DLCSS. The paired comparison test on the accuracy results in Table 7 is used to answer the first question. If a 10% error is tolerable, the interval [0.292, 8.151] is estimated for the performance difference between DLCSS and DTW. This difference is therefore nonzero with 90% confidence, so it can be claimed that DLCSS performs better than DTW with 90% confidence. If a 1% error is tolerable, the interval [0.71, 6.973] is estimated for the performance difference between DLCSS and LCSS, so it can be claimed that DLCSS performs better than LCSS with 99% confidence. To answer the second question, the results presented in Table 8 are used: DTW determines the correct number of clusters for 7 datasets, while LCSS and DLCSS do so for 7 and 13 datasets, respectively. Overall, DLCSS has the best performance in this respect.
After the training datasets have been clustered and the best number of clusters and the cluster representatives determined for each of them, the experimental datasets are grouped. These results are presented in Table 9. For example, for the statistical control dataset, the time series of the experimental dataset are grouped by DTW, LCSS, and DLCSS with 95.33%, 84%, and 86.33% accuracy, respectively. Over all datasets, using the best number of clusters and the cluster representatives obtained in the first step, the accuracy of grouping the experimental dataset with DTW, LCSS, and DLCSS is 62.35%, 62.64%, and 64.91%, respectively. To answer the third question, the paired comparison test on the results in Table 9 is used. If a 10% error is tolerable, the interval [0.698, 8.087] is estimated for the performance difference between DLCSS and DTW. This difference is therefore nonzero with 90% confidence, so the methods differ in performance and it can be argued that DLCSS performs better than DTW with 90% confidence. Likewise, if a 2% error is tolerable, the interval [0.381, 7.651] is estimated for the difference between DLCSS and LCSS, and it can be argued that DLCSS performs better than LCSS with 98% confidence.
Second mode: Create 500 random cluster centers and allow the cluster centers to move up to 500 times
Table 10 shows the results of implementing the K-medoids clustering technique with the DTW, LCSS and DLCSS methods in the second mode. For example, for the Statistical Control dataset, the best result with DTW is obtained with 6 clusters and 97.67% accuracy, which means that 97.67% of the time series of this dataset are assigned to the correct cluster. The best result with LCSS and ∈ = 0.25 is 6 clusters with 87.33% accuracy, and the best result with DLCSS, ∈1 = 0.05 and ∈2 = 0.6 is 6 clusters with 91.33% accuracy. Based on these results, the clustering accuracy over all training datasets is 56.24% with DTW, 58.19% with LCSS and 60.61% with DLCSS. To answer the first question, the paired comparison test based on the results in Table 10 is used. If a 5% error is tolerable, the interval [0.03, 7.074] is estimated for the performance difference between DLCSS and DTW. Because this interval does not contain zero, it can be claimed with 95% confidence that DLCSS performs better than DTW. If a 10% error is tolerable, the interval [0.253, 4.728] is estimated for the performance difference between DLCSS and LCSS, so it can also be claimed with 90% confidence that DLCSS performs better than LCSS. To answer the second question, the results presented in Table 11 show that DTW determines the correct number of clusters for 8 datasets, while LCSS and DLCSS do so for 7 and 11 datasets, respectively. In general, DLCSS has the best performance in this respect.
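The restart-and-move scheme of this mode can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the exact procedure used in the experiments: it reads "500 random cluster centers" as 500 random initializations (`n_init`) and "up to 500 moves" as an iteration cap (`max_moves`), and it uses the alternating assign/update variant of K-medoids. The function name `k_medoids` and its parameters are hypothetical. The input `dist` is a precomputed pairwise dissimilarity matrix, which could come from DTW or from a dissimilarity derived from LCSS or DLCSS.

```python
import random


def k_medoids(dist, k, n_init=500, max_moves=500, seed=0):
    """Alternating K-medoids with random restarts on a precomputed matrix.

    Returns (medoid indices, cluster label per item, total cost) of the
    best run, where cost is the sum of distances to the assigned medoid.
    """
    rng = random.Random(seed)
    n = len(dist)

    def assign(medoids):
        # Label each item with the position of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    best = (None, None, float("inf"))
    for _ in range(n_init):                 # random restarts
        medoids = rng.sample(range(n), k)
        for _ in range(max_moves):          # cap on medoid moves per restart
            labels = assign(medoids)
            new_medoids = []
            for c in range(k):
                members = [i for i in range(n) if labels[i] == c]
                if not members:             # keep the old medoid for an empty cluster
                    new_medoids.append(medoids[c])
                    continue
                # The new medoid minimizes the total distance to its cluster members.
                new_medoids.append(
                    min(members, key=lambda m: sum(dist[m][j] for j in members))
                )
            if new_medoids == medoids:      # converged: no medoid moved
                break
            medoids = new_medoids
        labels = assign(medoids)
        cost = sum(dist[i][medoids[labels[i]]] for i in range(n))
        if cost < best[2]:
            best = (medoids, labels, cost)
    return best


# Toy usage with absolute differences standing in for a time series measure.
points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in points] for a in points]
medoids, labels, cost = k_medoids(toy_dist, k=2, n_init=20, max_moves=50)
print(sorted(medoids), labels, cost)
```

Working from a precomputed matrix keeps the clustering loop independent of the similarity measure, so the same code can be rerun for each method being compared.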
After clustering the training datasets and determining the best cluster number and cluster representatives for each of them, the experimental datasets are grouped. These results are presented in Table 12. For example, for the Statistical Control dataset, the time series of the experimental dataset can be grouped by DTW, LCSS and DLCSS with 97.67%, 87.33% and 91.33% accuracy, respectively. In general, over all datasets and using the best cluster number and cluster representatives obtained from the first step, the accuracy of grouping the experimental datasets by DTW, LCSS and DLCSS is 62.55%, 62.22% and 64.24%, respectively.
To answer the third question, the paired comparison test based on the results in Table 12 is used. If a 10% error is tolerable, the interval [0.386, 9.529] is estimated for the performance difference between DLCSS and DTW. This means that the difference is not zero with 90% confidence, so the two methods differ in performance and it can be argued with 90% confidence that DLCSS performs better than DTW. Meanwhile, if a 2% error is tolerable, the interval [0.06, 7.726] is estimated for the difference between DLCSS and LCSS, and it can be argued with 98% confidence that DLCSS performs better than LCSS.

Conclusion
In this research, a new method for measuring the similarity of time series, named Developed Longest Common Subsequence (DLCSS), is presented. It is based on the logic and characteristics of the LCSS method and uses two similarity thresholds. The reasons for using two similarity thresholds in the proposed method are, firstly, the high fluctuation in the results of the query by content technique with the LCSS method; secondly, the low accuracy in determining the number of clusters of the datasets and in assigning time series to the right clusters; and thirdly, the existence of concepts such as Compactness and Separation among the basic concepts of clustering. In the DLCSS method, the smaller similarity threshold is the basis for recognizing definite similarity between two data points, and the larger similarity threshold is the basis for recognizing conditional similarity between them. According to the investigations, the best values for them are ∈1 = 0.05 and ∈2 = 0.60, respectively. By implementing the query by content technique with the DLCSS, LCSS and DTW methods, it was determined that the accuracy of correctly determining the time series class for the 23 datasets was 84.35%, 81.02% and 80.08%, respectively, which shows that the DLCSS method has higher accuracy, good stability in its results and low error.
In the K-Medoids clustering technique, the clustering accuracy on the training datasets, with 500 randomly selected cluster centers and up to 200 displacements of the cluster centers, was 55.89%, 58.44% and 62.02% with DTW, LCSS and DLCSS, respectively. A pairwise comparison test showed that it can be claimed that DLCSS performs better than DTW and LCSS with 95% and 99% confidence, respectively. Using the cluster number and cluster representatives obtained from the first step, the experimental datasets were grouped with DTW, LCSS and DLCSS with accuracies of 62.35%, 62.64% and 64.91%, respectively. Based on pairwise comparison tests, it can be claimed that DLCSS performs better in determining the cluster number and cluster representatives than DTW and LCSS with 90% and 95% confidence, respectively. This clustering process was also repeated with 500 randomly selected cluster centers and up to 500 displacements of the cluster centers, which again shows that DLCSS is superior to DTW and LCSS.
In general, it can be claimed with at least 90% confidence that DLCSS performs better in time series data mining than DTW and LCSS.