Mangrove Information Extraction and Precision Analysis Based on Multi-Feature Combination

Abstract: Extracting mangrove information at different tide levels from remote sensing images is challenging. In this study, we investigated the use of multiple features for mangrove information extraction, including spectral features, vegetation indices (NDVI, NIMI)


Introduction
Mangroves are trees and shrubs that grow in the intertidal zone along tropical and subtropical coasts [1]. Because they occupy the transition between land and sea, mangroves have attracted increasing attention in global environmental and climate change research [2]. Mangroves protect the coastal environment, purify water, sequester carbon, and maintain the balance of coastal ecosystems, and they have high economic and ecological value [3].
In China, mangroves grow along the southeast coast and are mainly distributed in provinces such as Guangdong, Guangxi, Hainan, and Fujian [4]. According to existing research data, the mangrove area in China decreased from an original 48,000 hectares to 22,000 hectares in 2000, a decline that took place over 42 years [5]. Because of where they grow, mangroves have little resistance to disturbance, are easily affected by pollution from both land and sea, and are difficult to restore once polluted [6]. Therefore, it is particularly important to strengthen the planning and supervision of mangroves.
Investigating and monitoring mangrove resources is a prerequisite for protecting and managing them. However, because mangroves grow in a special environment that is periodically inundated by seawater, traditional field surveys are time-consuming, costly, and inefficient. With the continuous development of remote sensing technology, research on mangrove information extraction from remote sensing data has deepened, and remote sensing is more economical and efficient than traditional survey methods. For example, many scholars have extracted mangrove information from multispectral imagery by constructing vegetation indices based on the spectral characteristics of vegetation. Liu Kai et al. used Landsat data on the GEE platform and applied NDVI in decision trees based on expert knowledge to classify mangroves [7]. Chen Bangqian et al. found that NDVI is a key variable for determining the classification thresholds of mangrove greenness, canopy coverage, and tidal inundation, and a key component of mangrove extraction from remote sensing images [8].
In recent years, machine learning algorithms such as random forests, neural networks, and support vector machines have improved the accuracy of mangrove information extraction. However, because tidal conditions are uncertain, representative training samples are hard to find, and extracting submerged mangroves from a single-date image with these methods remains difficult. Many studies have pointed out that tides may seriously affect remote sensing extraction results for mangroves. The main current solution is to combine multiple low-tide satellite images to extract mangrove information [9].
Vegetation indices (VIs) are functions constructed from the spectral characteristics of vegetation and have proven effective for monitoring vegetation from space [10]. Over the past two decades, submerged and floating aquatic vegetation has been widely studied with remote sensing [11]. Exposed and submerged mangroves both reflect strongly in the 700~900 nm range, whereas the reflectance of water peaks near 560 nm and declines at longer wavelengths. Many scholars in China and abroad have designed intertidal mangrove extraction indices based on this characteristic. Jia Mingming et al. used Landsat series data to construct the inundated mangrove forest index (IMFI), which extracts inundated mangrove information better than NDVI [12]. The normalized intertidal mangrove index (NIMI), based on Sentinel-2 data, extracts intertidal mangrove information more accurately than NDVI and LSWI [13]. However, most of these studies rely on spectral data alone to extract intertidal mangroves, although it has been shown that classifying with multiple features can improve land cover classification accuracy [14].
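As a minimal sketch of how a vegetation index is built from band reflectances, the snippet below computes NDVI as (NIR - Red) / (NIR + Red). It assumes NumPy arrays of surface reflectance, with Sentinel-2 b8 taken as the near-infrared band and b4 as the red band; the function name and sample values are hypothetical. The NIMI formula is not given in this text, so it is not reproduced here.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    # The index is undefined where both bands are zero; leave NaN there.
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectances: a vegetated pixel, a water pixel, a no-data pixel.
nir = np.array([0.45, 0.02, 0.0])   # assumed to come from Sentinel-2 b8
red = np.array([0.05, 0.04, 0.0])   # assumed to come from Sentinel-2 b4
values = ndvi(nir, red)
```

Vegetated pixels yield values close to 1, while water pixels yield values near or below 0, which is why a single threshold struggles once the canopy is partly submerged.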
In this study, based on the spectral information of mangroves and water bodies, we added texture features to the NDVI and NIMI vegetation indices and used a random forest classifier for classification. We compared the performance and accuracy of mangrove information extraction under different feature combinations.

Study Area
Zhanjiang City in Guangdong Province is located at the southernmost end of mainland China. The Zhanjiang Mangrove National Nature Reserve is the largest contiguous mangrove growth area in mainland China. It was established in 1990 and was approved as a national-level mangrove nature reserve by the State Council in December 1997. The reserve has compiled the "Overall Plan of Zhanjiang Mangrove National Nature Reserve" and the "Management Plan of Zhanjiang Mangrove Protection Area". Gaoqiao Mangrove is located in the core area of the reserve, as shown in Figure 1. The reserve lies in a subtropical climate zone with an average annual temperature of 23.4°C and an average annual rainfall of 1756 mm. It is often hit by tropical storms and heavy rains, and the tide is mostly semi-diurnal, with a maximum average tidal range of 2.5~3 m.

Sentinel-2 Data
The Sentinel-2 series consists of two satellites, Sentinel-2A launched on June 23, 2015, and Sentinel-2B launched on March 7, 2017. Compared to other non-commercial multispectral satellites, Sentinel-2 offers high temporal and spatial resolution and has been widely used for water detection and for mapping agricultural and forest resources.
The Sentinel-2 data available for download from the European Space Agency is at Level-1C, which refers to products that have undergone geometric correction and orthorectification but not atmospheric correction, so they cannot be used directly. Atmospherically corrected Level-2A data need to be produced by the user. In this study, we selected one suitable high-tide image and one low-tide image based on the tidal conditions in the Gaoqiao area obtained from the China Maritime Services website. We used the Sen2Cor plugin from the European Space Agency website to produce Level-2A data after atmospheric correction. We then removed the bands with a resolution of 60 m (b1, b9, and b10), resampled the remaining bands to 10 m, and cropped the images to the study area, resulting in the Sentinel-2 MSI image data required for this study.
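The resampling step can be illustrated with a small sketch. The text does not state which resampling method was used, so nearest-neighbour upsampling is an assumption here, as are the function name and the sample tile. Bringing a 20 m band onto the 10 m grid corresponds to a factor of 2.

```python
import numpy as np

def upsample_nearest(band, factor):
    """Nearest-neighbour upsampling: each pixel becomes a factor x factor block."""
    band = np.asarray(band)
    # Repeat rows, then columns, so pixel values are copied, not interpolated.
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Hypothetical 2 x 2 tile of a 20 m band.
band_20m = np.array([[0.10, 0.20],
                     [0.30, 0.40]])
band_10m = upsample_nearest(band_20m, factor=2)  # 20 m / 10 m = 2
```

Nearest-neighbour resampling keeps the original reflectance values unchanged, which is one reason it is often preferred before computing spectral indices; bilinear or cubic resampling would be equally plausible choices.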
We determined the tidal conditions in the study area based on the tidal data from the China Maritime Services website (https://www.cnss.com.cn/html/tide.html) and selected two Sentinel-2 images with similar dates and low cloud cover, one for high tide and the other for low tide. After screening, we selected the Sentinel-2 high tide image from November 28, 2021, with a cloud cover of 1.47% and a tide level of 5.05 m, and the low tide image from December 3, 2021, with a cloud cover of 0.82% and a tide level of 1.44 m. The tidal difference between the two images was 3.61 m (as shown in Table 1, Figure 2, and Figure 3).

Sample Data
This study focuses on the extraction of mangrove information in the intertidal zone based on multiple feature combinations. Therefore, the land cover samples in this study were divided into three categories: mangroves, non-mangroves, and water bodies. Non-mangroves include farmland, forest land, built-up land, bare land, and tidal flats. Water bodies include seawater, ponds, rivers, etc. Samples were visually interpreted from high-resolution images in Google Earth. A total of 400 sample points were selected from the low-tide image to construct a reference map, including 92 mangrove samples, 236 non-mangrove samples, and 72 water body samples. In addition, 400 sample points were selected from the high-tide image, including 74 mangrove samples, 217 non-mangrove samples, and 109 water body samples. These samples were mainly used to test the accuracy of mangrove information extraction using different feature combinations based on the random forest algorithm.

Object-Oriented Random Forest Classification Method
The random forest (RF) algorithm is an ensemble machine learning algorithm built from multiple CART decision trees. Each decision tree classifies independently, and the final result is obtained by majority vote across the trees. Compared with support vector machines and neural networks, the RF algorithm has good noise resistance and classification performance and is often used for classification problems. Studies have shown that the random forest algorithm extracts mangrove information better than the KNN and support vector machine algorithms [15]. In this study, an object-oriented random forest algorithm was used to classify the image based on six feature combination schemes.
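The voting step described above can be sketched in a few lines. This illustrates only how the per-tree labels are combined, not tree training or the eCognition implementation; the function names and the votes are hypothetical.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the label chosen by the most trees for one image object.

    Ties go to the label that appears first in the list.
    """
    counts = Counter(tree_predictions)
    label, _ = counts.most_common(1)[0]
    return label

def forest_predict(per_tree_labels):
    """per_tree_labels[t][i] is the label tree t assigns to object i."""
    n_objects = len(per_tree_labels[0])
    return [majority_vote([tree[i] for tree in per_tree_labels])
            for i in range(n_objects)]

# Hypothetical votes from three trees for four image objects.
votes = [
    ["mangrove", "water", "non-mangrove", "mangrove"],
    ["mangrove", "water", "mangrove", "water"],
    ["water", "water", "non-mangrove", "mangrove"],
]
labels = forest_predict(votes)
```

Because each tree is trained on a different bootstrap sample and random feature subset, individual errors tend to be uncorrelated, and the vote suppresses them.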

Image Segmentation
Image segmentation is the prerequisite for object-oriented classification. This study used the multi-scale segmentation method because it segments well, is efficient, and takes little time [16]. The optimal segmentation scale was selected using the local variance method, which is more efficient and accurate than the traditional trial-and-error method and avoids its subjectivity. This method assesses the suitability of a segmentation scale from the local variance of the segmented objects and its rate of change across scales. The eCognition 9.0 software and the ESP2 scale evaluation tool were used in this study, and the optimal segmentation scale was selected as 44 for the low-tide image and 32 for the high-tide image. These scales reflect the patch information of mangroves well. The shape and compactness parameters were set to 0.1 and 0.5, respectively.
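The idea behind the local variance method can be sketched as follows, assuming the rate of change (ROC) between consecutive scale levels is defined as the percentage change in local variance, as the ESP approach is commonly described, and that peaks in the ROC mark candidate scales. The function names and all values are hypothetical and are not taken from this study.

```python
def rate_of_change(local_variance):
    """Percentage change of local variance between consecutive scale levels."""
    return [
        (curr - prev) / prev * 100.0
        for prev, curr in zip(local_variance, local_variance[1:])
    ]

def candidate_scales(scales, local_variance):
    """Return the scales at which the rate of change has a local peak."""
    roc = rate_of_change(local_variance)
    peaks = []
    # roc[i] describes the step from scales[i] to scales[i + 1].
    for i in range(1, len(roc) - 1):
        if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]:
            peaks.append(scales[i + 1])
    return peaks

# Hypothetical scale levels and local variance values.
scales = [10, 20, 30, 40, 50, 60]
local_variance = [100.0, 104.0, 106.0, 107.0, 110.0, 111.0]
peaks = candidate_scales(scales, local_variance)
```

In practice, the analyst still inspects the candidate scales visually and keeps the one whose objects best match the mangrove patches.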

Feature Selection for Classification
Because mangroves differ from other land cover in tissue structure and external morphology, they exhibit distinctive features in classification, reflected in aspects such as spectrum, shape, and texture. These features are the main basis for extracting land cover information, and selecting appropriate features makes it possible to distinguish target from non-target land cover. A total of 34 object features were selected in this study, including 14 spectral features and 20 texture features.
The spectral features selected in this study included the band means (Mean), brightness (Brightness), maximum difference (Max. diff), and two vegetation indices: the normalized difference vegetation index (NDVI) and the normalized intertidal mangrove index (NIMI). Three methods are commonly used for texture feature extraction: the gray-level co-occurrence matrix, local gray-level statistics, and the gray-level difference vector. Among them, the gray-level co-occurrence matrix is the most commonly used and the most effective in texture statistics analysis, so it was selected in this study. The texture features selected were homogeneity, correlation, contrast, and entropy, each computed for directions of 0°, 45°, 90°, 135°, and all directions (all dir.).
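The gray-level co-occurrence matrix (GLCM) and the four texture measures named above can be sketched with NumPy. This is a minimal illustration using common textbook definitions (natural-log entropy, symmetric normalised matrix); the exact formulas and quantisation used by eCognition may differ, and the function names, offsets, and sample image are assumptions.

```python
import numpy as np

# Pixel offsets (row, column) assumed for the four directions.
OFFSETS = {"0": (0, 1), "45": (-1, 1), "90": (-1, 0), "135": (-1, -1)}

def glcm(image, levels, offset):
    """Symmetric, normalised gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r, c], image[r2, c2]
                counts[i, j] += 1
                counts[j, i] += 1  # count the pair in both directions
    return counts / counts.sum()

def texture_features(p):
    """Homogeneity, contrast, entropy and correlation of a normalised GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum(p * (i - mu_i) ** 2))
    sd_j = np.sqrt(np.sum(p * (j - mu_j) ** 2))
    return {
        "homogeneity": np.sum(p / (1.0 + (i - j) ** 2)),
        "contrast": np.sum(p * (i - j) ** 2),
        "entropy": -np.sum(nonzero * np.log(nonzero)),
        # Undefined (NaN) for a perfectly uniform image, where sd is zero.
        "correlation": np.sum(p * (i - mu_i) * (j - mu_j)) / (sd_i * sd_j),
    }

# Hypothetical 4 x 4 image already quantised to 4 gray levels (0..3).
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
features = {name: texture_features(glcm(image, 4, off))
            for name, off in OFFSETS.items()}
```

Four measures in four directions plus an all-direction variant give 4 x 5 = 20 texture features, which matches the count stated above.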

Feature Selection
The Salford Predictive Modeler (SPM) software was used to evaluate the importance of the 34 selected classification features with the random forest algorithm. Based on the importance scores, the features with the greatest impact on mangrove classification accuracy were selected as the optimal features for the comparison of mangrove classification results. The feature importance scores are shown in the figure below. After experimentation, the top 21 of the 34 features were selected as the optimal feature combination scheme. The remaining 13 features had importance scores below 0.9 and were all texture features. As shown in Figure 3, the band features generally ranked higher in the importance evaluation, while the texture features scored and ranked lower, indicating that the band features were the main features for extracting mangrove information.
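The selection step amounts to ranking features by importance and keeping the top k. The sketch below shows that logic only; the feature names and scores are hypothetical placeholders, not the values produced by SPM in this study.

```python
def select_top_features(importance, k):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical importance scores (placeholders, not results from the study).
importance = {
    "spectral_a": 100.0,
    "texture_a": 0.4,
    "spectral_b": 85.2,
    "texture_b": 0.7,
    "spectral_c": 60.1,
}
selected = select_top_features(importance, k=3)
dropped = [name for name in importance if name not in selected]
```

With the 34 features of this study, the same call with k=21 would return the optimal feature set and leave the 13 weakest features out.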

Feature Combination Schemes
This study aims to explore how effectively vegetation indices and texture features, combined with Sentinel-2 spectral data, extract intertidal mangrove information from high-tide images. Therefore, six data combination schemes were constructed, as shown in Table 2.

Accuracy Evaluation Method
The confusion matrix is the most commonly used and widely accepted accuracy evaluation method [17]. In this study, the confusion matrix was used to compare the accuracy of mangrove extraction for each feature combination scheme. The confusion matrix is a cross-tabulation of reference sample points against classified results, and its evaluation indicators include producer's accuracy (PA, also called map accuracy), user's accuracy (UA), overall accuracy (OA), and the Kappa coefficient. All four indicators were used in this study to test the classification results.
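The four indicators follow directly from the confusion matrix. The sketch below uses the standard definitions, assuming rows hold the reference classes and columns the classified results; the function name and the matrix values are hypothetical.

```python
def accuracy_metrics(matrix):
    """Compute OA, PA, UA and Kappa from a square confusion matrix.

    matrix[i][j] = number of reference samples of class i classified as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = [matrix[k][k] for k in range(n)]
    row_totals = [sum(matrix[k]) for k in range(n)]                        # reference
    col_totals = [sum(matrix[i][k] for i in range(n)) for k in range(n)]   # classified
    overall = sum(diag) / total
    producers = [diag[k] / row_totals[k] for k in range(n)]
    users = [diag[k] / col_totals[k] for k in range(n)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa

# Hypothetical counts for mangrove, non-mangrove, water (not study results).
matrix = [
    [8, 2, 0],
    [1, 7, 2],
    [1, 1, 8],
]
overall, producers, users, kappa = accuracy_metrics(matrix)
```

Producer's accuracy reflects omission errors (reference mangroves that were missed), while user's accuracy reflects commission errors (other land cover labelled as mangrove).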

Classification Mapping of Feature Combination Schemes
Mangroves grow in the intertidal zone along the coast. The Sentinel-2 low-tide image selected in this study has a tide level of 1.44 m, which is relatively low, so seawater has little impact on the mangroves. Therefore, the object-oriented random forest classification method was used to extract the distribution range of mangroves from the low-tide image, combined with sample points obtained by visual interpretation and expert judgment. The classification results were evaluated with the confusion matrix: the overall classification accuracy was 96%, the Kappa coefficient was 0.932, and the accuracy of mangrove classification was 95.7%. A Kappa coefficient greater than 0.8 indicates a good classification result, so the mapping results of this study are consistent with the ground truth data interpreted from Google Earth. This classification result was therefore used as the reference map, and the mangrove area extracted by this method was assumed to represent the full mangrove extent in the study area, as shown in Figure 4. The study then explored the effectiveness of multi-feature fusion based on intertidal mangrove vegetation indices using the Sentinel-2 high-tide image as the data source. Six feature combination schemes were constructed according to the research purpose, and the results are shown in Figures 5-7.

Analysis of Mangrove Extraction Accuracy for Different Combination Schemes
As shown in Table 2, combinations (a) and (b) both use spectral features for mangrove information extraction, and their accuracy for mangrove identification is relatively high, with a producer's accuracy (PA) of 93.4%, user's accuracy of 95.5% and 94.5%, respectively, overall classification accuracy of 88.7%, and a Kappa coefficient of 0.812. Although the overall classification accuracy of combinations (a) and (b) is the same, the mangrove extraction results in Figure 5 (a) and (b) show that combination (b) extracts submerged mangroves better than combination (a). This may be because the normalized intertidal mangrove index (NIMI) in combination (b) is built on the strong absorption of mangroves in the red band (b4) and their strong reflectance in the vegetation red-edge and near-infrared bands (b6, b7, b8), where water absorbs strongly. This accentuates the difference between mangroves and water bodies and helps identify mangroves under water.
Combinations (c) and (d) combine band features and texture features for extracting mangrove information. The overall accuracy remains almost unchanged or decreases slightly, but the user's accuracy for mangrove classification decreases significantly, by 8.5% and 9.9%, respectively. This indicates that non-mangrove areas are classified as mangroves during random forest classification. In this study, forestland, farmland, and other areas are classified as non-mangrove areas. The main terrestrial plant species in the study area are silvergrass, lychee, longan, etc. The terrestrial vegetation in the study area is mostly trees, and their leaf and crown texture features are very similar to those of mangroves, which causes the significant decrease in user's accuracy for mangrove classification. As shown in Figure 6, forestland in the study area is misclassified as mangroves.
Combination (e) combines all band features and texture features. Evaluated with the confusion matrix, its classification results reach an overall accuracy of 91% and a user's accuracy of 92.5%, and the user's accuracy is higher than that of combinations (c) and (d). This indicates that introducing the NDVI and NIMI indices can improve the classification accuracy of mangroves. According to the classification results of the combination scheme in Figure 7, the probability of misclassifying mangroves as non-mangroves is reduced, indicating that spectral features are important not only for land cover classification but also for species classification.
Combination (f) uses the features retained by feature selection and the object-based random forest method in eCognition software, obtaining an overall classification accuracy of 92%, a user's accuracy of 96.7% for mangroves, and a Kappa coefficient of 0.86. This scheme classifies mangroves much more accurately than the other schemes. Moreover, as shown in Figure 7, mangrove extraction based on feature selection performs better in extracting submerged mangroves. Selecting the features used for classification greatly reduces weak features and avoids feature redundancy. This indicates that the feature selection method based on random forest retains the important land cover information and makes the most of it to improve classification accuracy.

Conclusion
(1) The introduction of NDVI and NIMI indices into the combination scheme of band features and texture features (Scheme 3) effectively improves the classification accuracy of mangroves. This is mainly because the normalized intertidal mangrove index (NIMI) takes into account the strong absorption of red light by mangroves and the strong absorption of water in the red-edge and near-infrared bands, where mangroves reflect strongly, making the difference between mangroves and water bodies more obvious.
(2) When texture features are introduced for classification, the user's accuracy for mangrove classification decreases significantly. This is because the main terrestrial plant species in the study area (such as silvergrass, lychee, longan, etc.) have texture features similar to those of mangroves, which makes it difficult to distinguish mangroves from non-mangroves and lowers the accuracy of classification with texture features.
(3) The feature selection method performs well in mangrove extraction, selecting an appropriate number of features to avoid data redundancy and greatly reducing the influence of weak features. In particular, it significantly improves the accuracy of extracting submerged mangroves.