Summary.
Recent advances in large-scale genome sequencing have led to the rapid accumulation of amino acid sequences of proteins whose functions are unknown. Because the functions of these proteins are closely correlated with their subcellular localizations, it is vitally important to develop an automated method as a high-throughput tool to timely identify their subcellular location. Based on the concept of the pseudo amino acid composition by which a considerable amount of sequence-order effects can be incorporated into a set of discrete numbers (Chou, K. C., Proteins: Structure, Function, and Genetics, 2001, 43: 246–255), the complexity measure approach is introduced. The advantage by incorporating the complexity measure factor as one of the pseudo amino acid components for a protein is that it can more effectively reflect its overall sequence-order feature than the conventional correlation factors. With such a formulation frame to represent the samples of protein sequences, the covariant-discriminant predictor (Chou, K. C. and Elrod, D. W., Protein Engineering, 1999, 12: 107–118) was adopted to conduct prediction. High success rates were obtained by both the jackknife cross-validation test and independent dataset test, suggesting that introduction of the concept of the complexity measure into prediction of protein subcellular location is quite promising, and might also hold a great potential as a useful vehicle for the other areas of molecular biology.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Xiao, X., Shao, S., Ding, Y. et al. Using complexity measure factor to predict protein subcellular location. Amino Acids 28, 57–61 (2005). https://doi.org/10.1007/s00726-004-0148-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00726-004-0148-7