
    Construction of Full-Length cDNA Libraries and EST Sequence Analysis for Nicotiana tomentosiformis and Nicotiana sylvestris

    Full-Length cDNA Library Construction of Nicotiana tomentosiformis and Nicotiana sylvestris and ESTs Analysis of Tobacco

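    • Abstract: Expressed sequence tags (ESTs) are widely used in gene function research and molecular marker development. Using multiple tissues from the two diploid progenitors of common tobacco, Nicotiana tomentosiformis (TT) and Nicotiana sylvestris (SS), as experimental material, normalized full-length cDNA libraries were constructed with the CloneMiner cDNA library construction method, then sequenced and used for sequence assembly, functional annotation, evolutionary analysis, and marker development. The capacities of the normalized full-length cDNA libraries of N. tomentosiformis and N. sylvestris were 0.72×10⁶ and 1.12×10⁶ pfu/mL, respectively, with recombination rates of approximately 94% and 93% and an average insert length of 1.4 kb. Sequencing yielded 20 953 EST sequences, which were assembled into 10 504 unigenes. Mixed assembly with common tobacco EST sequences produced 34 450 contigs and 123 511 singletons; the similarity between the T and S genes of the allotetraploid tobacco and those of N. tomentosiformis and N. sylvestris was far higher than the similarity between the two diploid progenitors. A total of 104 915 coding sequences were predicted, of which 73 670 contained functional domains, and 81% of the unigenes had homologs in tomato. In addition, 11 869 microsatellite loci (SSRs) and 25 209 single nucleotide polymorphism loci (SNPs) were identified. These data are of considerable value for tobacco gene function research and molecular breeding.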

       

      Abstract: Expressed sequence tags (ESTs) are widely used in gene function research and molecular marker development. To obtain a large number of tobacco EST sequences, a variety of tissues and organs from Nicotiana tomentosiformis and Nicotiana sylvestris were used as plant material, and two full-length enriched cDNA libraries were constructed using the CloneMiner cDNA library construction method. The EST sequences were used for sequence assembly, functional annotation, phylogenetic analysis, and molecular marker development. Normalized full-length cDNA libraries were constructed successfully from both species. The titers were 0.72×10⁶ and 1.12×10⁶ pfu/mL for N. tomentosiformis and N. sylvestris, respectively, and the recombination rates were approximately 94% and 93%, respectively. The average length of the inserted cDNA fragments was 1.4 kb. A total of 20 953 ESTs were generated from the full-length enriched cDNA libraries and assembled into 10 504 unigenes. All of the ESTs from allotetraploid tobacco (Nicotiana tabacum) and its two diploid progenitors, N. tomentosiformis and N. sylvestris, were then assembled together, resulting in 34 450 contigs and 123 511 singletons. This global assembly showed that the transcripts from the resident T- and S-genomes in the allotetraploid nucleus were more closely related to their diploid homologs than to each other. In total, 104 915 coding sequences were identified, of which 73 670 contained functional domains. Approximately 81% of the unigenes had homologs in tomato. Furthermore, we found 11 869 putative simple sequence repeats (SSRs) and 55 369 single nucleotide polymorphisms (SNPs). These EST resources have important implications for gene function research and molecular breeding.

       
