A novel alignment-free method for HIV-1 subtype classification

Lily He (Tsinghua University), Rui Dong (Tsinghua University), Rong Lucy He (Chicago State University), Stephen S.-T. Yau (Tsinghua University)

Data Analysis, Bio-Statistics, Bio-Mathematics mathscidoc:2004.42001

Infection, Genetics and Evolution, 77, 104080, 2020.1
HIV-1 is the most common and most pathogenic strain of human immunodeficiency virus, and it comprises many subtypes. To study how HIV-1 subtypes differ in infection, diagnosis and drug design, it is important to identify the subtype of clinical HIV-1 samples. In this work, we propose an effective numeric representation, the Subsequence Natural Vector (SNV), to encode HIV-1 sequences. Using this representation, we introduce an improved linear discriminant analysis method to classify HIV-1 viruses. SNV is based on the distribution of nucleotides in HIV-1 viral sequences: it records not only the number of each nucleotide, but also the position and variance of the nucleotides in the sequence. To validate our alignment-free method, 6,902 complete genomes and 11,668 pol gene sequences of HIV-1 subtypes were collected from the up-to-date Los Alamos HIV database. SNV outperforms three popular methods, Kameris, Comet and REGA, achieving almost 100% sensitivity and specificity in much less time. Our subtyping algorithm performs particularly well on circulating recombinant forms (CRFs) that are represented by only a few sequences. Our approach also separates unique recombinant forms (URFs) from other subtypes with 100% sensitivity and specificity. Moreover, phylogenetic trees based on the SNV representation are constructed from full-length HIV-1 genomes and from pol genes, and in both trees viruses from the same subtype are clustered together correctly.
SNV, alignment-free, HIV-1, Classification
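The abstract says SNV encodes the number, position and variance of nucleotides. As a rough illustration only, the sketch below computes the classic 12-dimensional natural vector (count, mean position, and normalized second central moment for each of A, C, G, T) and then concatenates such vectors over equal-length segments. The segmentation scheme, the function names `natural_vector` and `subsequence_natural_vector`, and the `num_segments` parameter are assumptions made for this example; the abstract does not specify how the subsequences are chosen.

```python
from typing import List

NUCLEOTIDES = "ACGT"


def natural_vector(seq: str) -> List[float]:
    """Return [n_A..n_T, mu_A..mu_T, D2_A..D2_T] for one sequence.

    n_k  : number of occurrences of nucleotide k
    mu_k : mean 1-based position of nucleotide k
    D2_k : sum((pos - mu_k)^2) / (n_k * n), with n the sequence length
    Characters other than A, C, G, T are ignored in the per-base
    statistics but still count toward the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in NUCLEOTIDES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # Absent nucleotide: all three statistics are set to zero.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def subsequence_natural_vector(seq: str, num_segments: int = 2) -> List[float]:
    """Hypothetical SNV: concatenate natural vectors of equal-length segments."""
    n = len(seq)
    bounds = [(j * n) // num_segments for j in range(num_segments + 1)]
    vector: List[float] = []
    for start, end in zip(bounds, bounds[1:]):
        vector.extend(natural_vector(seq[start:end]))
    return vector
```

Under these assumptions, a sequence split into `num_segments` pieces yields a fixed-length vector of `12 * num_segments` numbers regardless of sequence length, which is what lets a classifier such as linear discriminant analysis compare genomes without aligning them.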
@article{lily2020a,
  title={A novel alignment-free method for HIV-1 subtype classification},
  author={Lily He and Rui Dong and Rong Lucy He and Stephen S.-T. Yau},
  url={http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20200423105136777525630},
  journal={Infection, Genetics and Evolution},
  volume={77},
  pages={104080},
  year={2020},
}
Lily He, Rui Dong, Rong Lucy He, and Stephen S.-T. Yau. A novel alignment-free method for HIV-1 subtype classification. Infection, Genetics and Evolution, 77, 104080, 2020. http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20200423105136777525630.