Cotton Science, 2021, Vol. 33, Issue (6): 504-512. doi: 10.11963/cs20200085
Wu Jiannan1, Chen Ken1, Wang Huan2, Pang Boshi1, Zhou Yuxun1, Xiao Junhua1, Li Kai1,*
Received: 2020-11-02
Online: 2021-11-15
Published: 2022-04-14
Contact: Li Kai
E-mail: 969747260@qq.com; likai@dhu.edu.cn
About the author: Wu Jiannan (born 1993), female, master's student.
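Abstract:
[Objective] Genotyping single nucleotide polymorphism (SNP) markers in homologous segments of polyploid plants is challenging. Using a polymerase chain reaction (PCR) targeted-amplicon dataset of homologous segments from tetraploid upland cotton as an example, this study examined the influence of homologous segments and optimized the bioinformatics analysis workflow. [Methods] First, three segments containing potential variants were amplified and sequenced from the DNA of 136 upland cotton samples. Second, alignment and SNP calling were performed with different parameters. Finally, the results of the different analysis schemes were compared and optimized. [Results] Conventional analysis identified the three SNP sites in segment 1 and the one SNP site in segment 2 as either wild type or homozygous mutant in the samples, whereas segment 3 was identified as heterozygous mutant in almost all samples. Blast analysis showed that segment 3, located on chromosome A12, shares 96.28% similarity with its homologous sequence on chromosome D12. When only the homologous sequence of segment 3 was used as the reference, the genotype calls at the potential SNP site did not change. When segment 3 and its homologous sequence were used together as references, 48% and 52% of the reads aligned to segment 3 and to its homologous sequence, respectively. The large number of reads originating from the homolog of segment 3 was therefore the main cause of the genotyping errors, and the genotype at the potential SNP site of segment 3 was determined to be TT. In addition, comparison with the GATK results revealed two new SNPs in segment 3 and excluded three false-positive SNPs caused by variation in homeologous sequences. [Conclusion] This study confirmed that the presence of homologous sequences seriously affects SNP identification and bioinformatics analysis in polyploids. Optimizing key parameters, in particular using the polyploid homologous sequences together as reference sequences, improves the accuracy of SNP genotyping.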
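The effect described in the abstract can be illustrated with a small toy sketch. It pools reads from two homologous copies, shows that a single-reference view produces an apparent heterozygous site, and then assigns each read to the closer of two references before genotyping. Everything here is an assumption made for illustration: the 10-base sequences are invented, the read counts are chosen only to mirror the reported 48%/52% split, and the Hamming-distance assignment stands in for a real read aligner. It is not the authors' pipeline.

```python
from collections import Counter

def hamming(a, b):
    """Count mismatches between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def assign_reads(reads, refs):
    """Assign each read to the reference with the fewest mismatches.

    refs maps a reference name to its sequence. Reads that are equally
    close to the two best references are stored under the key None.
    """
    bins = {name: [] for name in refs}
    bins[None] = []
    for read in reads:
        dists = sorted((hamming(read, seq), name) for name, seq in refs.items())
        best_dist, best_name = dists[0]
        tied = len(dists) > 1 and dists[1][0] == best_dist
        bins[None if tied else best_name].append(read)
    return bins

def base_fractions(reads, pos):
    """Return the fraction of each base observed at a 0-based position."""
    counts = Counter(read[pos] for read in reads)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}

# Hypothetical toy references: the two copies differ at positions 3 and 8.
REFS = {"A12": "ACGTTGCAAT", "D12": "ACGATGCACT"}

# Hypothetical read pool: 48 reads from the A12 copy, 52 from the D12 copy.
pooled = [REFS["A12"]] * 48 + [REFS["D12"]] * 52

# Single-reference view: position 3 looks like a T/A heterozygote.
pooled_view = base_fractions(pooled, 3)

# Dual-reference view: reads are separated first, then each copy is genotyped.
bins = assign_reads(pooled, REFS)
a12_view = base_fractions(bins["A12"], 3)

print(pooled_view)
print(a12_view)
```

In the pooled view the two bases appear at roughly equal frequency, which a diploid-style caller would report as 0/1. After the reads are split between the two references, the A12 copy shows a single base at that position, which corresponds to a homozygous call.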
Wu Jiannan, Chen Ken, Wang Huan, Pang Boshi, Zhou Yuxun, Xiao Junhua, Li Kai. Optimization of key parameters for next-generation sequencing bioinformatics analysis of polyploid homologous segments[J]. Cotton Science, 2021, 33(6): 504-512.
Table 2
Primer sequences used in this study
Target segment | Forward primer | Reverse primer |
Segment 1 | TTCCTTGGTTTAAGTAACTCCTAAGC | GCGATCTAAGAAGAGATGTGGAAATAATACCAG |
Segment 2 | CTATATCCTGAAGTTTCTACACCGTTCTC | CTAAGAGAGATGATATAAGATGAAATGAAGCATG |
Segment 3 | CGATGGTCAACATAAACCTTGGATAATAGATCG | CTTGGGAGTTGAACCGAGGGTATC |
Table 3
Variant detection and genotyping results with the target segments as reference sequences
Target segment | SNP position | Reference base | Altered base | Genotype* |
Segment 1 | 79 | T | G | 1/1 (100%) |
Segment 1 | 123 | T | C | 0/0 (100%) |
Segment 1 | 161 | A | C | 0/0 (100%) |
Segment 2 | 120 | T | C | 1/1 (100%) |
Segment 3 | 48 | A | C | 0/1 (100%) |
Segment 3 | 78 | T | A | 0/1 (100%) |
Segment 3 | 143 | A | T | 0/1 (99.23%), 0/0 (0.77%) |
Segment 3 | 174 | G | A | 0/1 (100%) |
Segment 3 | 180 | G | C | 0/1 (100%) |
Segment 3 | 182 | C | A | 0/1 (100%) |
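The pattern in Table 3, where every site in segment 3 is called 0/1 in nearly all samples, is the kind of signal that can be screened for automatically. The sketch below parses the genotype cells from the table and flags sites whose heterozygous share is very high. The 95% threshold and the function names are assumptions chosen for illustration, and the reading of the percentages as shares of samples is inferred from the abstract rather than stated in the table. A flagged site is only a candidate for homolog interference, not a confirmed artifact.

```python
import re

# Matches one genotype entry such as "0/1 (99.23%)".
GENOTYPE_RE = re.compile(r"(\d/\d)\s*\(([\d.]+)%\)")

def parse_genotypes(cell):
    """Turn '0/1 (99.23%), 0/0 (0.77%)' into {'0/1': 99.23, '0/0': 0.77}."""
    return {gt: float(pct) for gt, pct in GENOTYPE_RE.findall(cell)}

def suspicious_sites(rows, min_het_pct=95.0):
    """Return (segment, position) pairs whose 0/1 share is at least min_het_pct.

    min_het_pct is a hypothetical cutoff, not a value taken from the study.
    """
    flagged = []
    for segment, position, cell in rows:
        if parse_genotypes(cell).get("0/1", 0.0) >= min_het_pct:
            flagged.append((segment, position))
    return flagged

# Genotype column of Table 3, copied row by row.
TABLE3 = [
    ("Segment 1", 79, "1/1 (100%)"),
    ("Segment 1", 123, "0/0 (100%)"),
    ("Segment 1", 161, "0/0 (100%)"),
    ("Segment 2", 120, "1/1 (100%)"),
    ("Segment 3", 48, "0/1 (100%)"),
    ("Segment 3", 78, "0/1 (100%)"),
    ("Segment 3", 143, "0/1 (99.23%), 0/0 (0.77%)"),
    ("Segment 3", 174, "0/1 (100%)"),
    ("Segment 3", 180, "0/1 (100%)"),
    ("Segment 3", 182, "0/1 (100%)"),
]

flagged = suspicious_sites(TABLE3)
for segment, position in flagged:
    print(segment, position)
```

With the table data above, only the six segment 3 positions are flagged, which matches the segment the abstract identifies as affected by its D12 homolog.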