Machine learning for protein subcellular localization prediction, Shibiao Wan, Man-Wai Mak

Label
Machine learning for protein subcellular localization prediction
Title
Machine learning for protein subcellular localization prediction
Statement of responsibility
Shibiao Wan, Man-Wai Mak
Language
eng
Summary
For bioinformaticians, computational biologists, and wet-lab biologists, the authors present the latest machine learning approaches to protein subcellular localization prediction, together with a systematic scheme for improving predictor performance.
Cataloging source
E7B
http://library.link/vocab/creatorName
Wan, Shibiao
Dewey number
572/.696
Index
index present
Language note
English
LC call number
QP551
LC item number
.W36 2015eb
Literary form
non fiction
Nature of contents
  • dictionaries
  • bibliography
NLM call number
QU 55
http://library.link/vocab/relatedWorkOrContributorName
Mak, M. W.
http://library.link/vocab/subjectName
  • Proteins
  • Machine learning
  • Probabilities
  • Carrier Proteins
  • Artificial Intelligence
  • Probability
  • Technology & Engineering / Signals & Signal Processing
Label
Machine learning for protein subcellular localization prediction, Shibiao Wan, Man-Wai Mak
Bibliography note
Includes bibliographical references and index
Carrier category
online resource
Carrier category code
  • cr
Carrier MARC source
rdacarrier
Color
multicolored
Content category
text
Content type code
  • txt
Content type MARC source
rdacontent
Contents
  • Preface -- Contents -- List of Abbreviations
  • 1 Introduction -- 1.1 Proteins and their subcellular locations -- 1.2 Why computationally predict protein subcellular localization? -- 1.2.1 Significance of the subcellular localization of proteins -- 1.2.2 Conventional wet-lab techniques -- 1.2.3 Computational prediction of protein subcellular localization -- 1.3 Organization of this book
  • 2 Overview of subcellular localization prediction -- 2.1 Sequence-based methods -- 2.1.1 Composition-based methods -- 2.1.2 Sorting signal-based methods -- 2.1.3 Homology-based methods -- 2.2 Knowledge-based methods -- 2.2.1 GO-term extraction -- 2.2.2 GO-vector construction -- 2.3 Limitations of existing methods -- 2.3.1 Limitations of sequence-based methods -- 2.3.2 Limitations of knowledge-based methods
  • 3 Legitimacy of using gene ontology information -- 3.1 Direct table lookup? -- 3.1.1 Table lookup procedure for single-label prediction -- 3.1.2 Table-lookup procedure for multi-label prediction -- 3.1.3 Problems of table lookup -- 3.2 Using only cellular component GO terms? -- 3.3 Equivalent to homologous transfer? -- 3.4 More reasons for using GO information
  • 4 Single-location protein subcellular localization -- 4.1 Extracting GO from the Gene Ontology Annotation Database -- 4.1.1 Gene Ontology Annotation Database -- 4.1.2 Retrieval of GO terms -- 4.1.3 Construction of GO vectors -- 4.1.4 Multiclass SVM classification -- 4.2 FusionSVM: Fusion of gene ontology and homology-based features -- 4.2.1 InterProGOSVM: Extracting GO from InterProScan -- 4.2.2 PairProSVM: A homology-based method -- 4.2.3 Fusion of InterProGOSVM and PairProSVM -- 4.3 Summary
  • 5 From single- to multi-location -- 5.1 Significance of multi-location proteins -- 5.2 Multi-label classification -- 5.2.1 Algorithm-adaptation methods -- 5.2.2 Problem transformation methods -- 5.2.3 Multi-label classification in bioinformatics -- 5.3 mGOASVM: A predictor for both single- and multi-location proteins -- 5.3.1 Feature extraction -- 5.3.2 Multi-label multiclass SVM classification -- 5.4 AD-SVM: An adaptive decision multi-label predictor -- 5.4.1 Multi-label SVM scoring -- 5.4.2 Adaptive decision for SVM (AD-SVM) -- 5.4.3 Analysis of AD-SVM -- 5.5 mPLR-Loc: A multi-label predictor based on penalized logistic regression -- 5.5.1 Single-label penalized logistic regression -- 5.5.2 Multi-label penalized logistic regression -- 5.5.3 Adaptive decision for LR (mPLR-Loc) -- 5.6 Summary
  • 6 Mining deeper on GO for protein subcellular localization -- 6.1 Related work -- 6.2 SS-Loc: Using semantic similarity over GO -- 6.2.1 Semantic similarity measures -- 6.2.2 SS vector construction -- 6.3 HybridGO-Loc: Hybridizing GO frequency and semantic similarity features -- 6.3.1 Hybridization of two GO features -- 6.3.2 Multi-label multiclass SVM classification -- 6.4 Summary
  • 7 Ensemble random projection for large-scale predictions -- 7.1 Random projection -- 7.2 RP-SVM: A multi-label classifier with ensemble random projection -- 7.2.1 Ensemble multi-label classifier -- 7.2.2 Multi-label classification -- 7.3 R3P-Loc: A compact predictor based on ridge regression and ensemble random projection -- 7.3.1 Limitation of using current databases -- 7.3.2 Creating compact databases -- 7.3.3 Single-label ridge regression -- 7.3.4 Multi-label ridge regression -- 7.4 Summary
  • 8 Experimental setup -- 8.1 Prediction of single-label proteins -- 8.1.1 Datasets construction -- 8.1.2 Performance metrics -- 8.2 Prediction of multi-label proteins -- 8.2.1 Dataset construction -- 8.2.2 Datasets analysis -- 8.2.3 Performance metrics -- 8.3 Statistical evaluation methods -- 8.4 Summary
  • 9 Results and analysis -- 9.1 Performance of GOASVM -- 9.1.1 Comparing GO vector construction methods -- 9.1.2 Performance of successive-search strategy -- 9.1.3 Comparing with methods based on other features -- 9.1.4 Comparing with state-of-the-art GO methods -- 9.1.5 GOASVM using old GOA databases -- 9.2 Performance of FusionSVM -- 9.2.1 Comparing GO vector construction and normalization methods -- 9.2.2 Performance of PairProSVM -- 9.2.3 Performance of FusionSVM -- 9.2.4 Effect of the fusion weights on the performance of FusionSVM -- 9.3 Performance of mGOASVM -- 9.3.1 Kernel selection and optimization -- 9.3.2 Term-frequency for mGOASVM -- 9.3.3 Multi-label properties for mGOASVM -- 9.3.4 Further analysis of mGOASVM -- 9.3.5 Comparing prediction results of novel proteins -- 9.4 Performance of AD-SVM -- 9.5 Performance of mPLR-Loc -- 9.5.1 Effect of adaptive decisions on mPLR-Loc -- 9.5.2 Effect of regularization on mPLR-Loc -- 9.6 Performance of HybridGO-Loc -- 9.6.1 Comparing different features -- 9.7 Performance of RP-SVM -- 9.7.1 Performance of ensemble random projection -- 9.7.2 Comparison with other dimension-reduction methods -- 9.7.3 Performance of single random-projection -- 9.7.4 Effect of dimensions and ensemble size -- 9.8 Performance of R3P-Loc -- 9.8.1 Performance on the compact databases -- 9.8.2 Effect of dimensions and ensemble size -- 9.8.3 Performance of ensemble random projection -- 9.9 Comprehensive comparison of proposed predictors -- 9.9.1 Comparison of benchmark datasets -- 9.9.2 Comparison of novel datasets -- 9.10 Summary
  • 10 Properties of the proposed predictors -- 10.1 Noise data in the GOA Database -- 10.2 Analysis of single-label predictors -- 10.2.1 GOASVM vs FusionSVM -- 10.2.2 Can GOASVM be combined with PairProSVM? -- 10.3 Advantages of mGOASVM -- 10.3.1 GO-vector construction -- 10.3.2 GO subspace selection -- 10.3.3 Capability of handling multi-label problems -- 10.4 Analysis for HybridGO-Loc -- 10.4.1 Semantic similarity measures -- 10.4.2 GO-frequency features vs SS features -- 10.4.3 Bias analysis -- 10.5 Analysis for RP-SVM -- 10.5.1 Legitimacy of using RP -- 10.5.2 Ensemble random projection for robust performance -- 10.6 Comparing the proposed multi-label predictors -- 10.7 Summary
  • 11 Conclusions and future directions -- 11.1 Conclusions -- 11.2 Future directions
  • A Webservers for protein subcellular localization -- A.1 GOASVM webserver -- A.2 mGOASVM webserver -- A.3 HybridGO-Loc webserver -- A.4 mPLR-Loc webserver
  • B Support vector machines -- B.1 Binary SVM classification -- B.2 One-vs-rest SVM classification
  • C Proof of no bias in LOOCV
  • D Derivatives for penalized logistic regression
  • Bibliography -- Index
Control code
912323205
Dimensions
unknown
Extent
1 online resource (210 pages)
Form of item
online
Isbn
9781501501524
Media category
computer
Media MARC source
rdamedia
Media type code
  • c
http://library.link/vocab/ext/overdrive/overdriveId
2326883343542755850
Specific material designation
remote
System control number
(OCoLC)912323205

Library Locations

    • Curtis Laws Wilson Library
      400 West 14th Street, Rolla, MO, 65409, US
      37.955220 -91.772210