Evaluation and Performance

Our methods are evaluated on a large dataset of 1615 mutations (compiled by Capriotti et al., Bioinformatics, vol. 20, pages 190-201, 2004) using a 20-fold cross-validation procedure. Under this procedure, the dataset is split evenly into 20 folds. Each fold in turn is used as the test dataset, and the remaining 19 folds are used as the training dataset, giving 20 pairs of training and test datasets. For each pair, the SVM and the neural network are trained on the training dataset and tested on the test dataset. The results on all 20 test datasets are combined and reported as the performance of the tested methods. Here is the prediction accuracy (number of correct predictions / total number of predictions) of the support vector machines using sequence information and tertiary structure information, respectively.
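The cross-validation procedure described above can be sketched in Python as follows. This is only an illustrative outline, not the MUpro code: the function names (make_folds, train_majority, cross_validate), the random shuffling, and the seed are assumptions, and the majority-label "model" is a hypothetical stand-in for the SVM and the neural network. It shows how the folds are formed and how the correct predictions from all test folds are pooled into a single accuracy.

```python
import random
from collections import Counter


def make_folds(n_items, n_folds=20, seed=0):
    """Shuffle the indices 0..n_items-1 and split them into n_folds folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives fold sizes that differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def train_majority(train_examples, train_labels):
    """Hypothetical placeholder model: always predicts the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda example: majority


def cross_validate(examples, labels, train_fn, n_folds=20, seed=0):
    """Return the pooled accuracy (correct / total) over all test folds."""
    folds = make_folds(len(examples), n_folds, seed)
    correct = 0
    total = 0
    for test_idx in folds:
        held_out = set(test_idx)
        # The remaining folds together form the training set for this pair.
        train_idx = [i for i in range(len(examples)) if i not in held_out]
        model = train_fn([examples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        # Predictions on the held-out fold are added to the pooled counts.
        for i in test_idx:
            correct += int(model(examples[i]) == labels[i])
            total += 1
    return correct / total
```

Because 1615 is not divisible by 20, a split like this one yields 15 folds of 81 mutations and 5 folds of 80. Any trainer with the same (examples, labels) -> predictor shape could be passed in place of train_majority.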


Download MUpro 1.0