Home-Journal Online-2026 No.3

Estimation of chlorophyll content in the canopy of Malus sieversii based on unmanned aerial vehicle hyperspectral imaging combined with machine learning algorithms

Online: 2026/3/18 17:01:46
Author: ZHANG Zhicong, CUI Dong, ZHAO Yang, HAN Yaxin, WU Yunhao, JIANG Zhicheng, LIU Wenxin, YAN Jiangchao
Keywords: Malus sieversii; Machine learning; Canopy chlorophyll; UAV hyperspectral; Vegetation index
DOI: 10.13925/j.cnki.gsxb.20250384

Objective: Malus sieversii is an important wild fruit tree resource in the arid regions of Central Asia and the ancestral species of modern cultivated apples. Its leaf chlorophyll content is a key indicator of photosynthetic capacity and overall health. To meet the demand for monitoring the nutritional status and diseases of M. sieversii, this study established a rapid, non-destructive method for monitoring canopy chlorophyll content using unmanned aerial vehicle (UAV) hyperspectral imaging coupled with machine learning algorithms and combined spectral preprocessing techniques. The optimal machine learning model was then selected to invert SPAD values across the study area. Methods: The study area was the Saha Wild Fruit Forest in Xinyuan County, Ili River Valley, Xinjiang, a site with typical distribution characteristics. During the flowering period of wild apples, a DJI M350RTK UAV equipped with the GaiaSky-mini3-VN hyperspectral imaging system acquired hyperspectral images of the canopy in the study area. Meanwhile, the relative chlorophyll content of 85 sample tree canopies was measured synchronously with a SPAD-502Plus chlorophyll meter. During data processing, the noisy leading bands (399.08-409.27 nm) were removed. The spectra were then optimized using eight preprocessing schemes: four basic methods (Savitzky-Golay smoothing (SG), multiplicative scatter correction (MSC), standard normal variate transformation (SNV), and moving average smoothing (MA)) and their combinations with first derivative transformation (D1). On this basis, three types of hyperspectral narrow-band vegetation indices were systematically constructed: the difference spectral index (DSI), the ratio spectral index (RSI), and the normalized difference spectral index (NDSI). 
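As an illustration of the preprocessing and index construction described in the Methods, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used two-band definitions DSI = Ri - Rj, RSI = Ri / Rj, and NDSI = (Ri - Rj) / (Ri + Rj), a per-spectrum SNV using the sample standard deviation, and a simple finite-difference first derivative. The function names and the reflectance values are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1); some implementations divide by n instead.
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def first_derivative(spectrum, wavelengths):
    """D1 as a forward finite difference between adjacent bands."""
    return [
        (spectrum[i + 1] - spectrum[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(spectrum) - 1)
    ]


def dsi(r_i, r_j):
    """Difference spectral index of two band values."""
    return r_i - r_j


def rsi(r_i, r_j):
    """Ratio spectral index of two band values."""
    return r_i / r_j


def ndsi(r_i, r_j):
    """Normalized difference spectral index of two band values."""
    return (r_i - r_j) / (r_i + r_j)


# Hypothetical canopy reflectance at four wavelengths (nm), for illustration only.
wavelengths = [550.0, 680.0, 720.0, 800.0]
reflectance = [0.10, 0.04, 0.25, 0.45]

scaled = snv(reflectance)                            # scatter-corrected spectrum
slopes = first_derivative(reflectance, wavelengths)  # one value per band gap
index = ndsi(reflectance[3], reflectance[1])         # near-infrared vs. red band
print(scaled, slopes, index)
```

When the indices are computed on derivative spectra, band values can be negative or close to zero, so a practical version would need to guard the RSI and NDSI denominators.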
Four algorithms, namely random forest regression (RFR), XGBoost, gradient boosting decision tree (GBDT), and K-nearest neighbor regression (KNN), were used to build chlorophyll content inversion models, with the coefficient of determination (R²) and root mean square error (RMSE) as evaluation metrics. Results: Spectral analysis showed that the canopy spectrum of wild apples exhibited a reflectance peak near 550 nm and a distinct chlorophyll absorption valley at 680 nm, with a steep increase in reflectance in the red-edge region (700-750 nm). Under the basic preprocessing schemes without derivative transformation, RFR stood out as the best-performing model, with an average test-set R² of 0.746 (RMSE = 2.707). Its advantage was particularly pronounced under scatter correction, where MSC-RFR and SNV-RFR attained R² values of 0.772 and 0.775, respectively. The SG-D1 preprocessing method significantly reduced the influence of high-frequency noise on the spectrum without damaging the effective spectral information. Among the constructed indices, MA-D1-NDSI (580.43 nm, 447.92 nm) exhibited a correlation coefficient of 0.75 (P < 0.01) with wild apple SPAD values, significantly outperforming the indices from other preprocessing methods. The characteristic bands were predominantly distributed in the red-edge region (680-750 nm) and the near-infrared plateau region (760-900 nm). Comparison of the four machine learning models indicated that the RFR model had the best predictive performance. In particular, under SG-D1 preprocessing, it achieved an R² of 0.818 with an RMSE of 2.419, a 5.5% improvement in accuracy over the best basic-preprocessing model (SNV-RFR). In contrast, the XGBoost (R² = 0.725), KNN (R² = 0.729), and GBDT (R² = 0.645) models were less accurate, and the GBDT model exhibited significant overfitting. 
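The two evaluation metrics can be written out in a few lines. The sketch below uses the standard definitions of R² and RMSE with hypothetical SPAD values. The abstract does not state how the 5.5% improvement was calculated; the last lines assume it is the relative change in R² between SG-D1-RFR (0.818) and SNV-RFR (0.775), which gives the same figure.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted SPAD values, for illustration only.
measured = [28.0, 31.5, 24.0, 35.0, 30.0]
predicted = [27.0, 32.0, 26.0, 33.5, 30.5]
print(r2(measured, predicted), rmse(measured, predicted))

# Assumed reading of the 5.5% gain: relative change in R² reported in the abstract.
r2_sg_d1_rfr = 0.818
r2_snv_rfr = 0.775
relative_gain = (r2_sg_d1_rfr - r2_snv_rfr) / r2_snv_rfr
print(f"{relative_gain:.1%}")  # prints 5.5%
```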
Based on the optimal SG-D1-RFR model, a spatial distribution map of canopy SPAD values in the study area was generated. The inversion results showed that canopy SPAD values of M. sieversii ranged from 17.1 to 39.8, with higher values in the southeast and lower values in the northwest. Furthermore, the model effectively distinguished plants of different health status: SPAD values of healthy canopies clustered between 30 and 36, while those of stressed or senescent canopies were predominantly below 25. Moreover, the model successfully identified shadow areas (SPAD > 35) and dead branch areas (SPAD < 18) within the trees, validating its applicability under complex canopy conditions. Conclusion: The SG-D1 preprocessing method significantly enhanced the red-edge features and improved the correlation between spectral information and chlorophyll content. The RFR algorithm performed outstandingly in handling high-dimensional spectral features, making it the optimal choice for chlorophyll inversion of M. sieversii leaves. The constructed NDSI, RSI, and DSI indices effectively captured variations in chlorophyll content. This method provides a novel technique for non-destructive monitoring of wild fruit tree resources, and its technical route can also be extended to remote sensing monitoring of physiological parameters of other woody plants. By systematically investigating the influence of eight spectral preprocessing methods on the inversion of canopy chlorophyll content in M. sieversii, we proposed the optimal model combination SG-D1-RFR and established a spectral index system suited to complex canopy structures. Furthermore, this methodology can be applied to the monitoring and conservation of other endangered wild fruit tree resources. 
This study further refines the theoretical framework and technical methods for remote sensing monitoring of wild plant resources.