Abstract:
To address the low prediction accuracy of tobacco leaf chemical components and the poor performance of existing methods in small-sample scenarios, a regression model for predicting tobacco leaf chemical components was established by combining Nadaraya-Watson (NW) kernel regression with an adaptive random bandwidth calculation method. The model was validated using meteorological and tobacco leaf chemical component data from Henan Province for 2010 to 2020. The results showed that the proposed method was superior overall to existing methods. Taking Yunyan 87 as an example, the mean absolute percentage errors (MAPE) for the main chemical components such as total sugar, reducing sugar, total alkaloids, potassium, and total nitrogen were 8.46%, 8.71%, 12.87%, 14.04%, and 9.95%, respectively. Compared with the random forest, BP neural network, and KNN methods, the MAPE was reduced by 7.69%, 5.02%, and 5.5%, respectively. The model can be applied to predicting the chemical components of different tobacco varieties and yields more stable predictions when the data are biased. Moreover, the proposed method requires only 1% of the data volume used by the random forest model to achieve better results, and it also performs better in robustness verification and reliability assessment.
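For orientation, the two quantities named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the model evaluated in this paper. It assumes a Gaussian kernel and a single fixed bandwidth, whereas the adaptive random bandwidth calculation is not specified in the abstract and is not reproduced here. The function names `nw_predict` and `mape` and the toy values are hypothetical.

```python
import numpy as np


def nw_predict(X_train, y_train, X_query, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel.

    Assumption: one fixed bandwidth for all queries. The adaptive random
    bandwidth method of the paper is NOT implemented here.
    """
    X_train = np.asarray(X_train, dtype=float)   # shape (n, d)
    y_train = np.asarray(y_train, dtype=float)   # shape (n,)
    X_query = np.asarray(X_query, dtype=float)   # shape (m, d)
    # Squared Euclidean distance between every query and training point.
    diff = X_query[:, None, :] - X_train[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)          # shape (m, n)
    # Shift each row by its minimum before exponentiating: the normalised
    # weights are unchanged, and the largest weight in each row equals 1,
    # so the division below never sees a zero denominator.
    sq_dist = sq_dist - sq_dist.min(axis=1, keepdims=True)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each prediction is a kernel-weighted average of the training targets.
    return weights @ y_train


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


# Toy usage with synthetic values (not data from this study).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([10.0, 12.0, 15.0, 19.0])
pred = nw_predict(X, y, X, bandwidth=0.5)
print(mape(y, pred))
```

In this sketch the bandwidth controls how local the weighted average is: a very small value approaches nearest-neighbour prediction, while a very large value approaches the global mean of the training targets.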