Machine-learning-aided precise prediction of deletions with next-generation sequencing

Source journal: Journal of Central South University (English Edition), 2016, Issue 12

Authors: 髙敬阳 (Gao Jingyang), 管瑞 (Guan Rui)

Pages: 3239-3247

Key words: next-generation sequencing; deletion prediction; sensitivity; false discovery rate; feature extraction; machine learning

Abstract: When detecting deletions in complex human genomes, split-read approaches that use short reads from next-generation sequencing still face a trade-off: either the false discovery rate is high or the sensitivity is low. To address this problem, an integrated strategy is proposed. It combines the fundamental principles of the three mainstream methods (read-pair approaches, split-read approaches and read-depth analysis) with modern machine learning algorithms, using feature extraction as the bridge between them. Compared with state-of-the-art split-read methods for deletion detection at both low and high sequence coverage, the machine-learning-aided strategy balances sensitivity and false discovery rate well, yielding a call set that is both more sensitive and more precise at single-base-pair resolution. Users therefore no longer need to rely on prior experience to make a trade-off in advance or to adjust parameters repeatedly. The results suggest that modern machine learning models can play an important role in structural variation prediction.
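The two quantities the abstract balances can be made concrete with a minimal sketch. It uses the standard definitions, sensitivity = TP / (TP + FN) and false discovery rate = FP / (TP + FP), and counts a call as a true positive only when both breakpoints match exactly, in the spirit of single-base-pair resolution. The tuple representation of a deletion, the function name `evaluate_calls` and the toy coordinates are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation protocol may differ.

```python
from typing import Set, Tuple

# Hypothetical representation of a deletion call: (chromosome, start, end).
Deletion = Tuple[str, int, int]


def evaluate_calls(predicted: Set[Deletion], truth: Set[Deletion]) -> Tuple[float, float]:
    """Return (sensitivity, false discovery rate) for a deletion call set.

    A predicted deletion is a true positive only if chromosome, start and
    end all match a truth deletion exactly (assumed matching rule).
    """
    tp = len(predicted & truth)   # calls that match the truth set exactly
    fp = len(predicted - truth)   # calls with no exact match in the truth set
    fn = len(truth - predicted)   # true deletions that were missed
    sensitivity = tp / (tp + fn) if truth else 0.0
    fdr = fp / (tp + fp) if predicted else 0.0
    return sensitivity, fdr


# Toy data, invented for this example only.
truth = {("chr1", 1000, 1500), ("chr1", 8000, 8200),
         ("chr2", 300, 900), ("chr2", 5000, 5600)}
predicted = {("chr1", 1000, 1500), ("chr2", 300, 900),
             ("chr2", 5001, 5600),   # off by one base pair: counted as false
             ("chr3", 10, 90)}       # no corresponding true deletion

sensitivity, fdr = evaluate_calls(predicted, truth)
print(f"sensitivity={sensitivity:.2f}, FDR={fdr:.2f}")
```

In this toy case a caller tuned to report more candidates would raise sensitivity but also tends to raise the false discovery rate, which is the trade-off the abstract says users otherwise handle by adjusting parameters by hand.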
