
    Prediction of Tobacco Leaf Chemical Composition Based on NW and Adaptive Random Bandwidth Calculation

    • Abstract: To address the low accuracy of tobacco leaf chemical composition prediction and its poor performance in small-sample settings, a regression model was built by combining Nadaraya-Watson (NW) kernel regression with an adaptive random bandwidth calculation method. The model was validated on meteorological and tobacco leaf chemical composition data from Henan Province for 2010 to 2020. Overall, the proposed method outperformed existing methods. For Yunyan 87, the mean absolute percentage errors (MAPE) for total sugar, reducing sugar, total alkaloids, potassium, and total nitrogen were 8.46%, 8.71%, 12.87%, 14.04%, and 9.95%, respectively; compared with the random forest, BP, and KNN methods, MAPE was reduced by 7.69%, 5.02%, and 5.5%, respectively. The model can be applied to different tobacco varieties and gives more stable predictions when the data are biased. The method also needs only 1% of the data volume required by the random forest model to achieve better results, and it performed better in robustness verification and reliability assessment.
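For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of a Nadaraya-Watson estimator and the MAPE metric. The abstract does not describe how the adaptive random bandwidth is computed, so the bandwidth step here is only an assumed stand-in: it draws random candidate bandwidths and keeps the one with the lowest leave-one-out MAPE. The Gaussian kernel, the function names (`nw_predict`, `mape`, `pick_bandwidth`), and all parameter defaults are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def nw_predict(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate: a kernel-weighted average of y_train.

    x_train has shape (n, d); x_query has shape (m, d) or (d,).
    A Gaussian kernel is assumed here.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    # Squared Euclidean distances, shape (m, n).
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    log_w = -d2 / (2.0 * bandwidth ** 2)
    # Shift each row by its maximum so exp() never underflows to all zeros.
    log_w -= log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    return (w @ y_train) / w.sum(axis=1)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))


def pick_bandwidth(x, y, n_candidates=50, low=0.05, high=5.0, seed=0):
    """ASSUMPTION: random search over bandwidths, scored by leave-one-out MAPE.

    This is a generic stand-in, not the paper's adaptive random bandwidth method.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Sample candidates log-uniformly between low and high.
    candidates = np.exp(rng.uniform(np.log(low), np.log(high), n_candidates))
    n = len(y)
    best_h, best_err = None, np.inf
    for h in candidates:
        preds = np.empty(n)
        for i in range(n):
            keep = np.arange(n) != i  # hold out sample i
            preds[i] = nw_predict(x[keep], y[keep], x[i], h)[0]
        err = mape(y, preds)
        if err < best_err:
            best_h, best_err = h, err
    return best_h, best_err
```

In use, `x` would hold standardized meteorological features and `y` one chemical component such as total sugar; `pick_bandwidth(x, y)` returns a bandwidth that can then be passed to `nw_predict` for new samples. Because NW regression has no training phase beyond bandwidth selection, it is a natural candidate for small-sample settings like the one the abstract describes.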

       
