Convex hull principle for classification and phylogeny of eukaryotic proteins

Xin Zhao (Tsinghua University), Kun Tian (Tsinghua University), Rong L. He (Chicago State University), Stephen S.-T. Yau (Tsinghua University)

Data Analysis, Bio-Statistics, Bio-Mathematics mathscidoc:1904.42001

This study quantitatively validates the principle that the biological properties associated with a given genotype are determined by the distribution of amino acids. To visualize this central law of molecular biology, each protein was represented as a point in 250-dimensional space based on its amino acid distribution. Proteins from the same family were found to cluster together, leading to the principle that the convex hull surrounding the protein points of one family does not intersect the convex hulls of other protein families. This principle was verified computationally for all available and reliable protein kinases and human proteins. In addition, we generated 2,328,761 figures to show that the convex hulls of different families are disjoint from each other. The classification achieves high and robust accuracy (95.75% and 97.5%), and the resulting reasonable phylogenetic trees further validate our methods.
Convex hull principle, Classification, Protein kinases, Human proteins, Natural vector, Phylogenetic analysis
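The two ideas in the abstract, mapping a protein to a point and checking that two families occupy disjoint convex hulls, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the function names `natural_vector` and `find_separating_hyperplane` are hypothetical, the vector is a simplified 60-dimensional one (count, mean position, and normalized second central moment of positions for each of the 20 amino acids) rather than the 250-dimensional representation used in the study, and the disjointness check is a plain perceptron rather than the authors' procedure. It relies on the fact that two finite point sets have disjoint convex hulls exactly when a hyperplane strictly separates them.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def natural_vector(seq):
    """Simplified 60-dimensional vector (an assumption, not the paper's
    250-dimensional one): 20 counts, 20 mean positions, 20 normalized
    second central moments of the positions of each amino acid."""
    n = len(seq)
    positions = defaultdict(list)
    for i, aa in enumerate(seq, start=1):  # positions are 1-based
        positions[aa].append(i)
    counts, means, moments = [], [], []
    for aa in AMINO_ACIDS:
        pos = positions.get(aa, [])
        n_k = len(pos)
        if n_k == 0:
            # Absent residue: all three components are set to zero.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(pos) / n_k
        d2 = sum((p - mu) ** 2 for p in pos) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def find_separating_hyperplane(family_a, family_b, max_epochs=1000):
    """Perceptron search for (w, b) with w.x + b > 0 on family_a and
    w.x + b < 0 on family_b. A returned pair certifies that the two
    convex hulls are disjoint; None means no separator was found within
    max_epochs, which is inconclusive for separable but hard cases."""
    dim = len(family_a[0])
    w = [0.0] * dim
    b = 0.0
    labelled = [(x, 1.0) for x in family_a] + [(x, -1.0) for x in family_b]
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in labelled:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified or on the hyperplane
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b
    return None
```

In use, each family would be a list of vectors produced by `natural_vector`, and `find_separating_hyperplane(family_a, family_b)` would be called for every pair of families. For a definitive yes-or-no answer, a linear-programming feasibility test would be a more suitable choice than the perceptron shown here.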
Uploaded 2019-04-27 21:07:09 by zhaox15
@inproceedings{xinconvex,
  title={Convex hull principle for classification and phylogeny of eukaryotic proteins},
  author={Xin Zhao and Kun Tian and Rong L. He and Stephen S.-T. Yau},
  url={http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20190427210709465030279},
}
Xin Zhao, Kun Tian, Rong L. He, and Stephen S.-T. Yau. Convex hull principle for classification and phylogeny of eukaryotic proteins. http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20190427210709465030279.