MLR-based feature splitting regression for estimating plant traits using high-dimensional hyperspectral reflectance data


Shuaipeng Fei   Demin Xu   Zhen Chen   Yonggui Xiao   Yuntao Ma


Accurate and timely estimation of plant traits is essential for improving breeding efficiency and optimizing management. Combining regression algorithms with hyperspectral reflectance allows plant traits to be estimated quantitatively, rapidly, and nondestructively. Multiple linear regression (MLR) is a widely used and efficient regression method. However, it is prone to overfitting on high-dimensional data, which can significantly reduce the robustness and accuracy of the regression model. In this study, a novel regression algorithm, feature splitting regression (FSR), was developed based on MLR and a band-splitting strategy to establish the link between high-dimensional hyperspectral reflectance and various plant traits. The results showed that the FSR framework enables MLR to handle high-dimensional hyperspectral data. FSR significantly improved the accuracy of wheat yield prediction compared with MLR. FSR was also benchmarked against random forest (RF), using full-band reflectance from different datasets as inputs. In wheat yield prediction, the FSR models achieved significantly higher prediction accuracy across environments in most validation cases, with mean absolute error (MAE) reduced by up to 28.50% relative to the RF model. In addition, the FSR and RF algorithms were validated 20 times on public datasets of different species. The results showed that the FSR models achieved higher prediction accuracy than the RF models in assessing plant traits for different types of data, with a mean MAE reduction ranging from 26.18% to 62.27%. This study shows that combining high-dimensional hyperspectral reflectance data with the FSR framework can estimate plant traits with accuracy comparable to that of an advanced machine learning algorithm, providing an alternative for high-precision management in practical production and plant breeding programs.
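To make the idea of pairing MLR with band splitting concrete, the following is a minimal sketch and not the FSR algorithm itself, whose details are not given in this abstract. It assumes that bands are split into contiguous groups, that one MLR is fitted per group, and that the group predictions are averaged; the function names, the number of groups, and the synthetic data are all hypothetical. The MAE reduction is assumed here to be the relative reduction with respect to a reference model.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def fit_split_mlr(X, y, n_groups):
    """Assumed strategy: one MLR per contiguous group of bands."""
    groups = np.array_split(np.arange(X.shape[1]), n_groups)
    return [(idx, fit_mlr(X[:, idx], y)) for idx in groups]

def predict_split_mlr(models, X):
    """Assumed aggregation: plain average of the per-group predictions."""
    preds = [predict_mlr(coef, X[:, idx]) for idx, coef in models]
    return np.mean(preds, axis=0)

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mae_reduction_percent(mae_ref, mae_new):
    """Relative MAE reduction of a new model versus a reference model."""
    return 100.0 * (mae_ref - mae_new) / mae_ref

if __name__ == "__main__":
    # Synthetic stand-in for reflectance data: more bands than samples.
    rng = np.random.default_rng(0)
    n_train, n_test, n_bands = 60, 40, 200
    X = rng.normal(size=(n_train + n_test, n_bands)).cumsum(axis=1)
    y = X[:, ::20].mean(axis=1) + rng.normal(scale=0.5, size=len(X))
    X_tr, X_te, y_tr, y_te = X[:n_train], X[n_train:], y[:n_train], y[n_train:]

    full = fit_mlr(X_tr, y_tr)                       # MLR on all bands
    split = fit_split_mlr(X_tr, y_tr, n_groups=20)   # MLR per band group

    mae_full = mae(y_te, predict_mlr(full, X_te))
    mae_split = mae(y_te, predict_split_mlr(split, X_te))
    print(f"MAE full MLR:  {mae_full:.3f}")
    print(f"MAE split MLR: {mae_split:.3f}")
    print(f"Reduction: {mae_reduction_percent(mae_full, mae_split):.2f}%")
```

Each per-group model sees far fewer predictors than samples, which is one plausible way a splitting scheme could limit the overfitting that plain MLR shows on full-band input; the grouping and aggregation actually used by FSR may differ.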


Hyperspectral reflectance; Plant traits; Machine learning; Feature splitting regression; Multiple linear regression
