The Positional Effect in Soft Classification Accuracy Assessment

Recent research has included the rapid development of soft classification algorithms and of soft classification accuracy assessment beyond the traditional hard approaches. However, less consideration has been given to whether the conditions and assumptions developed for hard classification accuracy assessment are appropriate for the soft one. Positional error is one of the most significant uncertainties that need to be considered. This research examined the impacts of positional errors on the accuracy measures derived from the soft error matrix, using NLCD 2011 as reference data and several coarser maps generated from NLCD 2011 as classification maps at spatial resolutions of 150m, 300m, 600m, and 900m. Eight study sites of different landscape characteristics, each with a spatial extent of 180km×180km, were investigated using a two-level classification scheme. Results showed that, with the registration accuracies achieved by current global land cover mapping, the errors in overall accuracy (OA-error) were 2.13%-39.98% and 2.53%-48.82% for the 8 and 15 classes, respectively, and the errors in Kappa (Kappa-error) were 6.64%-57.09% and 7.08%-58.81% for the 8 and 15 classes, respectively, if soft classifications were implemented based on images whose spatial resolutions varied from 150m to 900m. More complex landscape characteristics and more classes in the classification scheme produced a greater impact of the positional error on the accuracy measures. To keep both OA-error and Kappa-error under 10 percent, the average required registration accuracy should reach 0.1 pixels. This paper strongly recommends the addition of uncertainty analysis due to positional error in future global land cover mapping.


Introduction
Recent research in the fields of global warming, ecological modeling, and environmental monitoring has created a high demand for global land cover products [1,2]. Therefore, many global land cover mapping projects have been implemented based on satellite imagery of a variety of spatial, spectral, and temporal characteristics [3,4] such as Global Land Cover 2000 [5] and GlobCover 2009 [6].
Accuracy assessment is an indispensable component of land cover mapping [7], and it consists of both thematic and positional evaluation. Positional accuracy is achieved by a comparison of coordinates from the same sample locations between the land cover map and its reference data, results of which are measured by root mean square error (RMSE) [8]. In contrast, thematic accuracy is derived by a comparison of thematic labels between the map and a sample of reference data, results of which are presented as an error matrix where the overall accuracy (OA), Kappa, user accuracy (UA) and producer accuracy (PA) are estimated [8,9].
According to published reports, the positional accuracy of IGBP DISCover, UMD Land Cover, MODIS 5, Global Land Cover 2000, and GlobCover 2009 is ~1km, ~1km, 50-100m, 300m-333m, and 77m, respectively, while the thematic accuracy is 66.9%, unknown, 74.8%, 68.6%, and 67.5%, respectively [5,10,11,12,13,14]. The thematic accuracy of UMD Land Cover is unknown because no independent validation of UMD Land Cover has been implemented [12]. All these thematic accuracies were obtained under the assumption that land cover maps are seamlessly geo-registered to their reference data. However, no map is free of positional error, which introduces false thematic error into the reported thematic accuracy [8,15]. This issue raises the questions of how positional errors affect the thematic accuracy and what positional accuracy is required for validating the thematic accuracy of land cover maps.
Many studies have quantified the impact of positional errors on the thematic accuracy based on the hard classification accuracy assessment [8,16,17]. For example, [18] demonstrated that with a one-pixel shift in the cardinal direction, the overall accuracy produced a conservative bias of 15.3%, 23.4%, 36.7% for 5, 10, 25 map classes, respectively using the SPOT HRV multi-spectral images. [19] found that one pixel's location error reduced the overall accuracy by 28%, 20%, and 12% by choosing a pixel, block, or polygon as the assessment unit, respectively. As the use of soft classification has gained increased popularity, few projects have assessed the positional effect on soft classification accuracy assessment. [20] conducted a simulation analysis showing that with one pixel's locational error the bias of the overall accuracy ranges from 4.12% to 34.3% whereas the error of kappa ranges from 8.47% to 71.76% depending on the spatial characteristics. Although this work began the investigation of the positional effect on the soft classification accuracy assessment, two limitations existed with this research. First, the land cover maps analyzed in that study contained only two thematic classes, which is not common in land cover mapping. Second, spatial patterns of these maps were generated based on simulated imagery, some of which may not exist in the real remote sensing images. Both limitations have impeded the analysis of the connection between these results and future soft land cover mapping products.
Therefore, this research continues the work of [20] by analyzing the positional effects using existing land cover maps and actual registration levels when soft land cover mapping is implemented. The imagery selected was of MODIS scale or finer because most global land cover products were generated at these scales and validated with Landsat or SPOT satellite images [21][22][23]. Factors including spatial characteristics and classification schemes were also analyzed. Finally, this paper determined the positional accuracy required for a valid soft classification accuracy assessment.

Study Sites and Landscape Characteristics
Eight study sites representing different landscape structures were chosen from the NLCD 2011 product. They are located in the central or eastern part of the United States. Each of them is a square region with a size of 180km×180km (Figure 1). Two classification scheme levels were employed in this study (Table 1). Level II is the NLCD 2011 classification scheme with 15 classes common to all eight study sites. The scheme actually has other classes (e.g., Sedge/Herbaceous in Alaska) which occur only in some regions of the US. Level I classes were generated by merging (collapsing) the thematic classes from level II into 8 classes. The preliminary analysis demonstrates that the cultivated/planted class dominates study sites #1 and #2. Forest and cultivated/planted are the primary classes in study sites #3, #5, and #6. Developed, forest, and cultivated/planted are most prevalent in study site #4. Finally, study site #7 has a great quantity of cultivated/planted, forest, and wetland, while forest, shrubland, cultivated/planted, and wetland are dominant in study site #8 (Table 2, Table 3). The landscape characteristics of the eight study sites were analyzed using the following indices: landscape shape index (LSI), mean patch size (Area_M), mean shape index (Shape_M), edge density (ED), and contagion index (CON) [24,25]. LSI measures the overall geometric complexity of the entire landscape. Area_M evaluates the mean patch size, while Shape_M assesses the complexity of patch shape compared to a square of the same size. ED denotes the length of edges per unit area. CON indicates both patch type interspersion and patch dispersion at the landscape level. This analysis was accomplished using Fragstats v4.2, which is designed for the quantification and analysis of landscape metrics for categorical maps [26].

Reference Data
The reference data for the eight study sites were obtained from the NLCD 2011 product [27]. Two levels of the classification scheme were generated for the reference data, and the spatial resolution is 30m (Table 1). We assumed that the thematic accuracy of the NLCD 2011 product is one hundred percent for purposes of this research so that the impact of positional error could be evaluated.

Soft Classification Data
The soft classification maps of different spatial resolutions for each study site were generated by upscaling the reference data. Different window sizes, varying from 5×5 to 10×10, 20×20, and 30×30 pixels, were used to create the soft classification maps with spatial resolutions of 150, 300, 600, and 900 meters, respectively (Table 4). Each pixel in the soft classification map contains a vector denoting the proportion of each class. This upscaling method was applied to all reference data across each study site for both levels of the classification scheme.
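The upscaling step described above can be illustrated with a minimal sketch. It assumes the reference map is held as a 2-D integer array of class codes numbered from 0 and that windows do not overlap; the function name `upscale_to_soft` and its arguments are hypothetical and are not taken from the original processing chain.

```python
import numpy as np

def upscale_to_soft(hard_map, window, n_classes):
    """Aggregate a hard (categorical) map into per-class proportions.

    hard_map  : 2-D integer array with class codes 0..n_classes-1 (assumed coding)
    window    : side length of the square window, in reference pixels
    n_classes : number of thematic classes
    Returns an array of shape (rows // window, cols // window, n_classes).
    """
    rows, cols = hard_map.shape
    r, c = rows // window, cols // window
    # Drop incomplete windows at the bottom and right edges.
    trimmed = hard_map[: r * window, : c * window]
    # One-hot encode each reference pixel, then average inside each window.
    one_hot = np.eye(n_classes)[trimmed]
    blocks = one_hot.reshape(r, window, c, window, n_classes)
    return blocks.mean(axis=(1, 3))

# Toy 4x4 reference map aggregated with a 2x2 window.
hard = np.array([[0, 0, 1, 1],
                 [0, 1, 1, 1],
                 [2, 2, 0, 0],
                 [2, 2, 0, 0]])
soft = upscale_to_soft(hard, window=2, n_classes=3)
```

With 30m reference pixels, `window=5` would correspond to the 150m maps and `window=30` to the 900m maps. Each output pixel holds a proportion vector that sums to one.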
There are several reasons that this research upscaled the reference data to generate soft classification maps instead of using other existing coarser land cover classification maps (e.g., land cover datasets classified from AVHRR or MODIS imagery). First, soft classification of coarser-resolution images introduces classification errors [28,29], and various classification errors combined with varied positional errors would make the results difficult to explain. Second, it is nearly impossible to find such a series of soft classification maps for each study site with different spatial resolutions.

Soft Classification Accuracy Assessment with Positional Errors
Suppose that a soft classification map has $N$ pixels and $n$ of them are randomly sampled for accuracy assessment. Soft classification accuracy assessment with positional errors proceeds in the following way. The sampling pixel, in the form of a vector $\mathbf{v}_c(x, y)$ at the location $(x, y)$ in the soft classification map, takes the cluster of pixels at location $(x + \Delta, y + \Delta)$ in the reference data as the validation unit, which is in the form of a vector $\mathbf{v}_r(x + \Delta, y + \Delta)$. The error matrix for the sampling pixel is then constructed using Eq. (1):
$$E(x, y) = \mathbf{v}_c(x, y) \cap \mathbf{v}_r(x + \Delta, y + \Delta) \qquad (1)$$
In Eq. (1), the construction rule $\cap$ is a composite operator developed by Pontius and Cheuk [30], in which the diagonal and off-diagonal elements employ different algorithms. Generally, a minimum operator is used for the diagonal elements, while a multiplication operator is used for the off-diagonal elements. The rationale and the details of the calculation can be found in [30]. This composite operator has been widely accepted for its practicality and simplicity [31]. $\Delta$ is the number of soft pixels by which the validation unit is shifted from its original position in the soft classification map; it was varied from 0 to 3 soft pixels in increments of 0.1 soft pixels in order to measure the positional effect at the sub-pixel level [32]. This translation model has been widely used to simulate positional errors [33,34].
The final soft error matrix is created by averaging the soft error matrices (Eq. (1)) over all sampling pixels ($n$) [7,8,35]. Previous research has shown that sampling introduces errors affecting the accuracy measures [36]. Therefore, in this research all soft pixels in the classification map were included to avoid sampling errors.
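The construction and averaging of the per-pixel error matrices can be sketched as follows. This is an illustration under stated assumptions, not the processing code used in this study: the off-diagonal rule follows one common statement of the composite operator (leftover map membership times leftover reference membership, normalized by the total leftover) and should be checked against [30]; the shift is restricted to whole reference pixels, and map edges are handled by cropping. All function names are hypothetical.

```python
import numpy as np

def composite_error_matrix(map_vec, ref_vec):
    """Per-pixel soft error matrix (rows: map classes, columns: reference classes)."""
    p = np.asarray(map_vec, dtype=float)
    q = np.asarray(ref_vec, dtype=float)
    agree = np.minimum(p, q)        # diagonal: minimum operator
    p_left = p - agree              # map membership left after agreement
    q_left = q - agree              # reference membership left after agreement
    total_left = q_left.sum()
    if total_left > 0:
        # Off-diagonal: product of leftovers, scaled so the matrix sums to one.
        matrix = np.outer(p_left, q_left) / total_left
    else:
        matrix = np.zeros((p.size, q.size))
    matrix[np.diag_indices(p.size)] = agree
    return matrix

def class_proportions(hard_map, window, n_classes):
    """Class proportions of a hard map inside non-overlapping square windows."""
    r, c = hard_map.shape[0] // window, hard_map.shape[1] // window
    one_hot = np.eye(n_classes)[hard_map[: r * window, : c * window]]
    return one_hot.reshape(r, window, c, window, n_classes).mean(axis=(1, 3))

def soft_error_matrix(hard_ref, window, n_classes, shift_ref_px):
    """Average per-pixel matrix when the validation unit is shifted diagonally.

    shift_ref_px is the shift in whole reference pixels (a simplifying assumption).
    """
    soft_map = class_proportions(hard_ref, window, n_classes)
    shifted = class_proportions(hard_ref[shift_ref_px:, shift_ref_px:],
                                window, n_classes)
    rows, cols = shifted.shape[:2]
    total = np.zeros((n_classes, n_classes))
    for i in range(rows):
        for j in range(cols):
            total += composite_error_matrix(soft_map[i, j], shifted[i, j])
    return total / (rows * cols)

# One pixel: map proportions (0.6, 0.4, 0.0) against reference (0.5, 0.2, 0.3).
example = composite_error_matrix([0.6, 0.4, 0.0], [0.5, 0.2, 0.3])
```

In this sketch a shift of one reference pixel with a 10×10 window corresponds to 0.1 soft pixels. Shifts that are not whole reference pixels would need resampling of the reference data, which is not shown here.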
This research was interested in the component of thematic error caused by positional errors. This part of the thematic error equals the absolute value of the difference between an accuracy measure without positional errors and its counterpart with positional errors (Eqs. (2) and (3)):
$$\text{OA-error} = \left| OA_{0} - OA_{\Delta} \right| \qquad (2)$$
$$\text{Kappa-error} = \left| Kappa_{0} - Kappa_{\Delta} \right| \qquad (3)$$
where the subscript 0 denotes the measure without positional error and the subscript $\Delta$ denotes the measure at a given positional error. We calculated OA-error to indicate this part of the thematic error for overall accuracy (OA), and Kappa-error for Kappa. Although a few recent studies have questioned the usefulness of kappa [37], kappa is still widely used in global land cover mapping, such as in [38][39][40]. Besides, the focus of this research is not on kappa itself, but rather on the positional effect on Kappa. This analysis will promote additional and necessary attention to the impact that position has on soft classification accuracy assessment.
Table 5 also shows the percentage of mixed pixels in these soft classification maps. As expected, a bigger window size (lower spatial resolution), combined with a more heterogeneous landscape, increases this percentage.
Figures 6-9 present the Kappa-errors accordingly when the spatial resolution is 150, 300, 600, and 900 meters, respectively. Each figure is divided into two groups by classification scheme (8 classes or 15 classes). Within each group, each type of line represents one study site with specific landscape characteristics. The bottom abscissa, ranging from 0 to 3, gives the positional error in soft pixels at the given spatial resolution, while the top abscissa gives the corresponding absolute distance in meters.
Figure 2 demonstrates the impact of positional errors on the overall accuracy of the eight study sites at a spatial resolution of 150 meters. Within the group using the 8-class scheme, the rate of growth depends on the spatial characteristics of each study site. For example, the rate of growth of study site #1 is the lowest as a result of its most homogeneous landscape pattern with the simplest shapes and largest patch sizes.
The rate of growth of study site #8 is the highest because it has the most heterogeneous landscape pattern with the most complex shapes and smallest patch sizes. The slope of each line is steeper at the beginning and then becomes more stable. The trend and shape of the lines between study site #2 and #3 are similar. The same situation also occurs between study site #5 and #6, and between study site #7 and #8. At the positional error of 0.5 pixels, the OA-error is 9.49% for study site #1 and increases to 24.28% for study site #8. Generally, the lines in the left group are lower than the corresponding ones in the right group, which indicates that a higher number of categorical classes increases the positional effect. For instance, at the positional error of 3 soft pixels, the OA-error of study site #8 is 53.56% using the 8-class scheme while the OA-error is 62.94% for the 15-class scheme. It can also be seen that the order of lines in the right group changes such that the highest line is from study site #7.
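The OA-error and Kappa-error values discussed here are differences between accuracy measures computed from two soft error matrices, one without and one with a positional shift. A minimal sketch follows, assuming each matrix is a square array with map classes in rows and reference classes in columns; the function names and the two example matrices are hypothetical and are not results from this study.

```python
import numpy as np

def overall_accuracy(matrix):
    """Share of the total membership that lies on the diagonal."""
    m = np.asarray(matrix, dtype=float)
    return np.trace(m) / m.sum()

def kappa(matrix):
    """Kappa coefficient: agreement corrected for chance agreement."""
    m = np.asarray(matrix, dtype=float)
    m = m / m.sum()
    observed = np.trace(m)
    expected = np.sum(m.sum(axis=1) * m.sum(axis=0))
    return (observed - expected) / (1.0 - expected)

def positional_error_component(matrix_no_shift, matrix_shifted):
    """Absolute change in OA and Kappa caused by the positional shift."""
    oa_error = abs(overall_accuracy(matrix_no_shift) - overall_accuracy(matrix_shifted))
    kappa_error = abs(kappa(matrix_no_shift) - kappa(matrix_shifted))
    return oa_error, kappa_error

# Hypothetical matrices: perfect agreement versus a shifted comparison.
no_shift = np.diag([0.5, 0.3, 0.2])
shifted = np.array([[0.40, 0.05, 0.05],
                    [0.05, 0.20, 0.05],
                    [0.05, 0.05, 0.10]])
oa_err, kappa_err = positional_error_component(no_shift, shifted)
```

In this toy case the drop in Kappa is larger than the drop in overall accuracy because chance agreement is subtracted from both the numerator and the denominator of Kappa. The sketch does not guard against the degenerate case in which expected agreement equals one.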

Impact of Positional Errors on OA and Kappa Derived from the Soft Error Matrix
The results in Figure 3, for a spatial resolution of 300 meters, are very similar to those in Figure 2. The only difference is that a spatial resolution of 300 meters produces a smaller positional effect on OA-error than a spatial resolution of 150 meters does for the same absolute distance. The same is true for relative distance. For example, a positional error of 300 meters creates an OA-error of 49.28% at study site #8 when the spatial resolution is 150 meters in the 8-class scheme, whereas the positional effect drops to 39.98% when the spatial resolution is 300 meters. A positional error of three soft pixels creates an OA-error of 53.56% at study site #8 when the spatial resolution is 150 meters, while the value decreases to 50.97% at the spatial resolution of 300 meters.
The results in Figure 4, for a spatial resolution of 600 meters, differ from those in Figures 2 and 3. In the 8-class group, the line for study site #1 levels off once the positional error reaches one soft pixel, as do the lines for study sites #2, #3, and #6. The lines for study sites #1 and #3 show the same trend in the 15-class scheme. The relative order of the study sites stays the same, except that the line for study site #4 is higher than the line for study site #6 when the positional error exceeds 1.9 and 1.7 soft pixels in the 8 and 15 class schemes, respectively. Figure 4 and Figure 5 are also very similar, showing the same line trends. The line for study site #4 becomes higher than the line for study site #6 when the positional error is greater than 0.9 and 0.8 soft pixels in the 8 and 15 class schemes, respectively.
The comparison of Figures 2 to 5, holding the classification scheme constant, shows that the spatial resolution alters the impact of positional errors on the OA-error. For example, for study site #1, the OA-error varies from 0 to 20.19% at the spatial resolution of 150 meters, in contrast to the variation from 0 to 17.93% at the spatial resolution of 900 meters.
The analysis performed for OA-error was repeated for Kappa-error (Figures 6-9). A few differences exist between the results for OA-error and Kappa-error. First, the impact of positional error on kappa is higher than the impact on overall accuracy. For example, with a positional error of three soft pixels, the Kappa-error ranges from 0 to 76.83% while the OA-error varies from 0 to 64.1%. Second, the lines for the study sites are much more closely spaced than in the analysis of OA-error. Third, the lines in the 8-class group are lower than the corresponding lines in the 15-class group, although the difference is smaller than in the OA-error analysis. Fourth, the spatial resolution changes the pattern of the lines more than it does in the OA-error analysis. For example, in the 15-class group, the highest line is that of study site #6 at the spatial resolution of 150m, and it becomes that of study site #1 at the spatial resolution of 900m.

The Required Registration Accuracy for Soft Classification Accuracy Assessment
Table 6 shows the required registration accuracies (number of soft pixels) to keep both OA-error and Kappa-error under 10% for the eight study sites with two levels of classification scheme at spatial resolutions of 150, 300, 600, and 900 meters. To keep OA-error below 10% for the 8-class scheme, the required registration accuracy for spatial resolutions of 150, 300, 600, and 900 meters ranges from 0.1 to 0.5 soft pixels, from 0.1 to 0.4 soft pixels, from 0.1 to 0.4 soft pixels, and from 0.1 to 0.5 soft pixels, respectively. It is clear that half of a soft pixel is not enough to obtain an OA-error of less than 10%. As the spatial resolution becomes coarser, the required registration accuracy decreases from 0.20 to 0.18 for the overall accuracy. The 15-class scheme makes achieving the required registration accuracy more difficult. To keep the Kappa-error lower than 10%, the required registration accuracy should reach 0.1 soft pixels for all spatial resolutions and both classification schemes.
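One way to read a required registration accuracy off an error curve is to take the largest tested shift before the error first reaches the threshold. The sketch below assumes this reading; the function name `required_registration` and the example curve are hypothetical, and the values are not taken from Table 6.

```python
import numpy as np

def required_registration(shifts, errors, threshold=10.0):
    """Largest tested shift (soft pixels) before the error first reaches the threshold.

    shifts : tested positional errors in soft pixels
    errors : OA-error or Kappa-error in percent at each shift
    Returns None if even the smallest tested shift reaches the threshold.
    """
    shifts = np.asarray(shifts, dtype=float)
    errors = np.asarray(errors, dtype=float)
    order = np.argsort(shifts)
    shifts, errors = shifts[order], errors[order]
    exceeds = errors >= threshold
    if not exceeds.any():
        return shifts[-1]                 # threshold never reached in the tested range
    first_bad = int(np.argmax(exceeds))   # index of the first shift at or above threshold
    if first_bad == 0:
        return None
    return shifts[first_bad - 1]

# Hypothetical error curve for one study site, in percent.
shifts = [0.0, 0.1, 0.2, 0.3, 0.4]
errors = [0.0, 3.1, 6.0, 9.2, 12.5]
limit = required_registration(shifts, errors)
```

For the hypothetical curve above the function returns 0.3 soft pixels, since the error first reaches 10% at a shift of 0.4 soft pixels.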

Discussion
Soft classification methods have provided an impetus for transforming the hard classification error matrix into a soft one. However, the assumptions applied for the hard classification error matrix may not be suitable for the soft one. This research examined the impacts of positional errors on the thematic accuracy measures derived from the soft error matrix using NLCD 2011 as reference data and several coarser maps generated from NLCD 2011 as classification maps at the spatial resolutions of 150m, 300m, 600m, and 900m. Eight study sites, with a spatial extent of 180km × 180km, of different landscape characteristics were investigated using two levels of classification scheme. The thematic errors caused by positional errors were reported using OA-error and Kappa-error.
Our results are consistent with the results found in the simulation experiment [20] but have now been verified with actual, existing maps and not just simulations. For example, the positional effect becomes greater where the landscape is more heterogeneous. Also, the positional effect is reduced when the spatial resolution becomes coarser. The kappa values were more sensitive to positional error than overall accuracy. As a result, the following analysis and discussion focus mainly on new findings that complement what has been found in the previous simulation experiments.
The use of real images allowed this research to conduct an uncertainty analysis using existing global land cover maps. Table 7 shows that most global land cover products were created at a spatial resolution of 1km. The positional accuracy of IGBP, UMD, MODIS 5, GLC 2000, and GlobCover 2009 is 1km, 1km, 50-100m, 300-333m, and 77m, respectively. Relative to their spatial resolutions, these values correspond to positional accuracies of 1 pixel, 1 pixel, 0.1-0.2 pixels, 0.3-0.33 pixels, and ~0.26 pixels, respectively. Therefore, if global land cover maps are produced by soft classification methods using remote sensing imagery at spatial resolutions from 150m to 1km, and they achieve the existing positional accuracies, the error in the overall accuracy would range from 2.13% to 39.98% while the error in the kappa would vary from 6.64% to 57.09% using the 8-class scheme. The error in the overall accuracy would range from 2.53% to 48.82% whereas the error in the kappa would vary from 7.08% to 58.81% using the 15-class scheme. Most OA-errors are higher than 10%, while all Kappa-errors are higher than 10%. Also, most of the global land cover datasets contain more than 15 classes. Therefore, we can speculate that the errors in the overall accuracy and kappa would be higher for those datasets. This implication raises the issue of the reliability of the thematic accuracies reported by these global mapping projects and highlights the importance of an uncertainty analysis of thematic accuracies based on positional errors. Such an analysis would improve the confidence level of further research based on these land cover datasets. This research strongly recommends an additional uncertainty analysis due to positional errors in future global land cover mapping. Global land cover mapping has made use of images at a spatial resolution of 1km, such as MODIS and AVHRR [3,4], and has preferred finer images such as MERIS [6,14]. Most of these products were validated with Landsat images [21][22][23].
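The conversion from positional accuracy in meters to positional accuracy in pixels is a division by the product's spatial resolution. The sketch below reproduces that arithmetic. The resolutions used are assumptions inferred from the ratios quoted above, not values read from Table 7, and the function name `to_pixels` is hypothetical.

```python
def to_pixels(positional_error_m, resolution_m):
    """Express a positional error in units of pixels at the given resolution."""
    return positional_error_m / resolution_m

# (positional accuracy range in meters, assumed spatial resolution in meters)
products = {
    "IGBP": ((1000, 1000), 1000),
    "UMD": ((1000, 1000), 1000),
    "MODIS 5": ((50, 100), 500),
    "GLC 2000": ((300, 333), 1000),
    "GlobCover 2009": ((77, 77), 300),
}

pixel_accuracy = {
    name: tuple(round(to_pixels(err, res), 2) for err in errors)
    for name, (errors, res) in products.items()
}
```

Under these assumed resolutions the computed ranges agree with the pixel values quoted in the text, for example 77m at 300m giving about 0.26 pixels.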
This paper determined that a registration accuracy of half a pixel is not sufficient for soft classification accuracy assessment at these scales using Landsat images as reference. To keep the thematic errors due to positional errors below 10%, the required registration accuracy should reach 0.14 pixels for overall accuracy and 0.10 pixels for kappa for the 8-class scheme. This requirement tightens to 0.10 pixels for both overall accuracy and kappa for the 15-class scheme. As the spatial resolution becomes coarser and the classification scheme includes more map classes, the positional requirement becomes stricter. Therefore, the positional accuracy standard for soft classification accuracy assessment should be updated, and future requirements for registration accuracy must consider the spatial resolution, the number of classes in the classification scheme, and the spatial characteristics of the imagery. Considering that more land cover and land use maps are being produced by soft classification [46][47][48][49], and given the positional accuracies that existing global land cover maps have achieved, there is a great need to improve image registration methods or to develop methods that eliminate the positional effect on soft classification accuracy assessment. Several techniques have been suggested to remove the positional effect for hard classification accuracy assessment, such as spatial aggregation (e.g., 3×3 pixels) [34,50,51,52,53] and the fuzzy location model [54,55]. However, whether they are appropriate for soft classification accuracy assessment needs further research.
The results also showed that the effect of positional errors is higher in the 15-class scheme than in the 8-class scheme, as indicated by both OA-error and Kappa-error. This result is reasonable because the landscape characteristics become more complex with more classes (Table 5). Nevertheless, this trend becomes weaker when Kappa-error is used as the indicator, as shown in Figures 6 to 9. Besides, in contrast to the effect on overall accuracy, the highest and lowest positional impacts on kappa did not occur at study sites #8 and #1, respectively, regardless of spatial resolution. Moreover, the highest positional impact on kappa occurs at study site #1 when the spatial resolution is 900m. The underlying reason for these unexpected results is that, under positional error, the user accuracy or producer accuracy of classes with smaller proportions approaches zero, which makes the kappa value quite low despite the relatively high value of overall accuracy. This also demonstrates that kappa is inconsistent with overall accuracy, which confirms that kappa provides discrepant accuracy information.

Conclusions
Recent research has revealed the rapid development of soft classification algorithms and soft classification accuracy assessment beyond the traditional hard approaches. However, less consideration has been given to whether conditions and assumptions generated for the hard classification accuracy assessment are appropriate for the soft one. This research examined the impacts of positional error, one of the most significant uncertainties that need consideration, on the accuracy measures derived from the soft error matrix using NLCD 2011 as reference data and several coarser maps generated from NLCD 2011 as classification maps at the spatial resolutions of 150m, 300m, 600m, and 900m. This research complements and continues the simulation work of [20]. New conclusions include that with the existing levels of registration accuracy, if a global land cover product is produced by soft classification method at MODIS scale or similar scale, the OA-error will vary from 2.13% to 39.98% and from 2.53% to 48.82% for the 8 and 15 class schemes, respectively. The Kappa-error will range from 6.64% to 57.09% and from 7.08% to 58.81% for the 8 and 15 class schemes, respectively. To keep both errors in overall accuracy and kappa under 10%, the average required registration accuracy should reach at least 0.18 and 0.14 soft pixels for the 8 and 15 class schemes, respectively. This research highlights the importance of uncertainty analysis of thematic accuracies caused by positional errors while performing the global land cover mapping. The positional effect is inconsistent between overall accuracy and kappa and is greater in the classification scheme with a greater number of classes. There is a great need to update the positional requirements for soft accuracy assessment according to the image spatial resolution, classification scheme, and landscape structure. 
Besides, techniques suggested to remove the positional effect in hard classification accuracy assessment may not be appropriate for the soft one, which needs to be investigated further. Considering the positional issues and the analysis described in this paper will significantly improve soft accuracy assessment in the future.