    JIN Huiqing, HAO Xianwei, LIU Jianguo, WANG Qing, REN Zhiguang, YUAN Kailong, LI Qi, JI Guixia, YUAN Xiaolong. Construction of a Tobacco Leaf Origin Prediction Model Based on Multi-Model Ensemble and Metabolomics[J]. CHINESE TOBACCO SCIENCE.

    Construction of a Tobacco Leaf Origin Prediction Model Based on Multi-Model Ensemble and Metabolomics

    Traditional tobacco origin identification methods suffer from insufficient feature coverage and strong subjectivity. To address these limitations, this study proposes a tobacco leaf origin prediction model based on multi-criterion feature selection and ensemble learning. The dataset comprised 576 tobacco samples, including middle strips and upper comprehensive module strips from five major production regions, described by 7024 aroma component features. After preprocessing, variance analysis, Pearson correlation coefficients, and LightGBM feature importance were used to select 130 highly discriminative features, a dimensionality reduction of 98.15%. For model construction, support vector machines (SVM), random forests (RF), and multilayer perceptrons (MLP) were combined: hyperparameters were optimized via grid search, and the predictions of the three models were fused using a weighted voting strategy. In 4-fold cross-validation, the ensemble model reached macro-average precision, recall, and F1 of 1.0, outperforming each individual model and classifying all samples correctly. The study reveals nonlinear relationships between aroma components and ecological regions and provides a new, highly accurate method for tobacco origin identification, with theoretical and practical value for tobacco quality traceability and production process optimization.
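    The fusion and evaluation steps described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the abstract does not say whether voting is applied to class probabilities or to hard labels, so probability-weighted (soft) voting is assumed here, and the function names, region labels, probabilities, and weights are all hypothetical. Only the feature counts (7024 and 130) come from the abstract.

```python
from collections import defaultdict


def reduction_rate(n_original, n_selected):
    """Fraction of features discarded by feature selection."""
    return 1.0 - n_selected / n_original


def weighted_vote(probas, weights):
    """Fuse per-model class probabilities by a weighted average (soft voting).

    probas:  {model_name: {class_label: probability}}
    weights: {model_name: non-negative weight}
    Returns the winning label and the fused distribution.
    """
    total = sum(weights[m] for m in probas)
    fused = defaultdict(float)
    for model, dist in probas.items():
        for label, p in dist.items():
            fused[label] += weights[model] * p / total
    return max(fused, key=fused.get), dict(fused)


def macro_scores(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical outputs of the three base models for one sample.
example_probas = {
    "svm": {"region_A": 0.60, "region_B": 0.40},
    "rf": {"region_A": 0.30, "region_B": 0.70},
    "mlp": {"region_A": 0.55, "region_B": 0.45},
}
# Hypothetical voting weights; the abstract does not report the real ones.
example_weights = {"svm": 0.4, "rf": 0.2, "mlp": 0.4}

if __name__ == "__main__":
    label, fused = weighted_vote(example_probas, example_weights)
    print("fused prediction:", label, fused)
    print("reduction rate: {:.2%}".format(reduction_rate(7024, 130)))
```

    With the feature counts from the abstract, `reduction_rate(7024, 130)` reproduces the reported 98.15%. A macro-average score of 1.0 on all three metrics, as reported, can only occur when every class is predicted without any false positives or false negatives.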