|
Aloy, P. and Russell, R.B. Interrogating protein interaction networks through structural biology. Proc Natl Acad Sci U S A 2002;99(9):5896-5901. Andorf, C.M., Honavar, V. and Sen, T.Z. Predicting the binding patterns of hub proteins: a study using yeast protein interaction networks. PLoS One 2013;8(2):e56833. Bernardes, J.S. and Pedreira, C.E. A review of protein function prediction under machine learning perspective. Recent Pat Biotechnol 2013;7(2):122-141. Breiman, L. Random forests. Machine Learning 2001. Camacho, C., et al. BLAST+: architecture and applications. BMC Bioinformatics 2009;10:421. Coletta, A., et al. Low-complexity regions within protein sequences have position-dependent roles. BMC Syst Biol 2010;4:43. Du, P., et al. PseAAC-Builder: a cross-platform stand-alone program for generating various special Chou's pseudo-amino acid compositions. Anal Biochem 2012;425(2):117-119. Emir, B., et al. Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records. J Pain Res 2015;8:277-288. Engelman, D.M., Steitz, T.A. and Goldman, A. Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu Rev Biophys Biophys Chem 1986;15:321-353. Finn, R.D., et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res 2016;44(D1):D279-285. Fong, J.H. and Panchenko, A.R. Intrinsic disorder and protein multibinding in domain, terminal, and linker regions. Mol Biosyst 2010;6(10):1821-1828. Ge, H., et al. Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nat Genet 2001;29(4):482-486. Gene Ontology, C. Gene Ontology Consortium: going forward. Nucleic Acids Res 2015;43(Database issue):D1049-1056. Giaever, G., et al. Functional profiling of the Saccharomyces cerevisiae genome. Nature 2002;418(6896):387-391. Goel, R., et al. Human Protein Reference Database and Human Proteinpedia as resources for phosphoproteome analysis. Mol Biosyst 2012;8(2):453-463. Grissa, D., et al. Feature Selection Methods for Early Predictive Biomarker Discovery Using Untargeted Metabolomic Data. Front Mol Biosci 2016;3:30. Hao, T., et al. Reconstruction and Application of Protein-Protein Interaction Network. Int J Mol Sci 2016;17(6). Haynes, C., et al. Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput Biol 2006;2(8):e100. Hsing, M., Byler, K. and Cherkasov, A. Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors. Bioinformation 2009;4(4):164-168. Hsing, M., Byler, K.G. and Cherkasov, A. The use of Gene Ontology terms for predicting highly-connected 'hub' nodes in protein-protein interaction networks. BMC Syst Biol 2008;2:80. Ivanov, A.A., Khuri, F.R. and Fu, H. Targeting protein-protein interactions as an anticancer strategy. Trends Pharmacol Sci 2013;34(7):393-400. Jones, D.T. and Cozzetto, D. DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics 2015;31(6):857-863. Jonsson, P.F. and Bates, P.A. Global topological features of cancer proteins in the human interactome. Bioinformatics 2006;22(18):2291-2297. Kalita, M.K., et al. CyclinPred: a SVM-based method for predicting cyclin protein sequences. PLoS One 2008;3(7):e2605. Kawashima, S., et al. AAindex: amino acid index database, progress report 2008. Nucleic Acids Res 2008;36(Database issue):D202-205. Kerrien, S., et al. The IntAct molecular interaction database in 2012. Nucleic Acids Res 2012;40(Database issue):D841-846. Latha, A.B., et al. Identification of hub proteins from sequence. Bioinformation 2011;7(4):163-168. Li, L., et al. Prediction of bacterial protein subcellular localization by incorporating various features into Chou's PseAAC and a backward feature selection approach. Biochimie 2014;104:100-107. Li, W. and Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006;22(13):1658-1659. Lin, W.J. and Chen, J.J. Class-imbalanced classifiers for high-dimensional data. Brief Bioinform 2013;14(1):13-26. Liu, Q., et al. Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data. PLoS One 2009;4(12):e8250. Lu, L., Lu, H. and Skolnick, J. MULTIPROSPECTOR: an algorithm for the prediction of protein-protein interactions by multimeric threading. Proteins 2002;49(3):350-364. Magnan, C.N. and Baldi, P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 2014;30(18):2592-2597. Malhis, N., Jacobson, M. and Gsponer, J. MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences. Nucleic Acids Res 2016;44(W1):W488-493. Manning, G., et al. The protein kinase complement of the human genome. Science 2002;298(5600):1912-1934. Martins, F., et al. Unravelling the relationship between protein sequence and low-complexity regions entropies: Interactome implications. J Theor Biol 2015;382:320-327. Mohan, A., et al. Analysis of molecular recognition features (MoRFs). J Mol Biol 2006;362(5):1043-1059. Nacher, J.C., Hayashida, M. and Akutsu, T. Emergence of scale-free distribution in protein-protein interaction networks based on random selection of interacting domain pairs. Biosystems 2009;95(2):155-159. Nakashima, H., Nishikawa, K. and Ooi, T. Distinct character in hydrophobicity of amino acid compositions of mitochondrial proteins. Proteins 1990;8(2):173-178. Ofran, Y. and Rost, B. Predicted protein-protein interaction sites from local sequence information. FEBS Lett 2003;544(1-3):236-239. Ota, M., et al. Multiple-Localization and Hub Proteins. PLoS One 2016;11(6):e0156455. Pancsa, R. and Tompa, P. Structural disorder in eukaryotes. PLoS One 2012;7(4):e34687. Patil, A., Kinoshita, K. and Nakamura, H. Domain distribution and intrinsic disorder in hubs in the human protein-protein interaction network. Protein Sci 2010;19(8):1461-1468. Patil, A., Kinoshita, K. and Nakamura, H. Hub promiscuity in protein-protein interaction networks. Int J Mol Sci 2010;11(4):1930-1943. Patil, A. and Nakamura, H. Disordered domains and high surface charge confer hubs with the ability to interact with multiple proteins in interaction networks. FEBS Lett 2006;580(8):2041-2045. Pellegrini, M., et al. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci U S A 1999;96(8):4285-4288. Peng, Z., et al. Intrinsic disorder in the BK channel and its interactome. PLoS One 2014;9(4):e94331. Peri, S., et al. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res 2004;32(Database issue):D497-501. Pierrot, C., et al. Inhibition of protein-protein interactions in Plasmodium falciparum: future drug targets. Curr Pharm Des 2012;18(24):3522-3530. Prabhakaran, M. The distribution of physical, chemical and conformational properties in signal and nascent peptides. Biochem J 1990;269(3):691-696. Ramana, J. and Gupta, D. LipocalinPred: a SVM-based method for prediction of lipocalins. BMC Bioinformatics 2009;10:445. Rezwan, M. and Auerbach, D. Yeast "N"-hybrid systems for protein-protein and drug-protein interaction discovery. Methods 2012;57(4):423-429. Richardson, J.S. and Richardson, D.C. Amino acid preferences for specific locations at the ends of alpha helices. Science 1988;240(4859):1648-1652. Schad, E., Tompa, P. and Hegyi, H. The relationship between proteome size, structural disorder and organism complexity. Genome Biol 2011;12(12):R120. Tolosi, L. and Lengauer, T. Classification with correlated features: unreliability of feature ranking and solutions. Bioinformatics 2011;27(14):1986-1994. Touw, W.G., et al. Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle? Brief Bioinform 2013;14(3):315-326. Uversky, V.N., Oldfield, C.J. and Dunker, A.K. Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. J Mol Recognit 2005;18(5):343-384. Vacic, V., et al. Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res 2007;6(6):2351-2366. Wang, L., Wang, Y. and Chang, Q. Feature selection methods for big data bioinformatics: A survey from the search perspective. Methods 2016;111:21-31. Wright, M.N.Z., A. ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. J Stat Softw 2017. Yang, R., et al. A Novel Feature Extraction Method with Feature Selection to Identify Golgi-Resident Protein Types from Imbalanced Data. Int J Mol Sci 2016;17(2):218. Zhang, J. Protein-length distributions for the three domains of life. Trends Genet 2000;16(3):107-109.
|