Calculate Pearson correlations between CCLE-EXP genes and PR genes.
The original dataset, downloaded here, with the header (i.e. the first and second lines) deleted.
The original dataset, downloaded here, with the header (i.e. the first and second lines) deleted.
A matrix of Pearson scores between PR genes and CCLE-EXP genes.
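The correlation step can be illustrated with a minimal Python sketch. This is not the repository's calculate_corr.m; the gene names and toy values are hypothetical, and it assumes each profile lists values for the same cell lines in the same order.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant profiles as uncorrelated
    return cov / sqrt(var_x * var_y)

def correlation_matrix(pr_profiles, exp_profiles):
    """Return {pr_gene: {exp_gene: r}} for every PR/expression gene pair."""
    return {
        pr_gene: {exp_gene: pearson(pr_vals, exp_vals)
                  for exp_gene, exp_vals in exp_profiles.items()}
        for pr_gene, pr_vals in pr_profiles.items()
    }

# Hypothetical toy data: each list holds values for the same four cell lines.
pr_profiles = {"PR_GENE_A": [1.0, 2.0, 3.0, 4.0]}
exp_profiles = {
    "EXP_GENE_1": [2.0, 4.0, 6.0, 8.0],  # rises with PR_GENE_A
    "EXP_GENE_2": [4.0, 3.0, 2.0, 1.0],  # falls as PR_GENE_A rises
}
corr = correlation_matrix(pr_profiles, exp_profiles)
```

The nested dictionary plays the role of the output matrix: one row per PR gene, one column per CCLE-EXP gene.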
Obtain the global ranking information of each feature, which is used to calculate global scores.
A list of genes whose essentiality needs to be predicted. Download
A matrix of Pearson scores between PR genes and CCLE-EXP genes. Generated by calculate_corr.m
A two-column table containing feature names and how many times each feature’s local score was in the top 10. This table is used to calculate global scores.
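As a rough illustration of how such a count table could be built from the correlation matrix, here is a small Python sketch. It is not the repository's Perl script: ranking by absolute correlation is an assumption, the feature names are hypothetical, and the toy call uses k=2 instead of 10 to keep the data small.

```python
from collections import Counter

def count_top_k(corr, k=10):
    """Count, per feature, how often it ranks in the top k for a PR gene."""
    counts = Counter()
    for pr_gene, scores in corr.items():
        # Assumption: features are ranked by absolute correlation, strongest first.
        ranked = sorted(scores, key=lambda f: abs(scores[f]), reverse=True)
        counts.update(ranked[:k])
    return counts

# Hypothetical local scores: {pr_gene: {feature: Pearson score}}.
corr = {
    "PR_A": {"F1": 0.9, "F2": -0.8, "F3": 0.1},
    "PR_B": {"F1": 0.2, "F2": 0.7, "F3": -0.6},
}
counts = count_top_k(corr, k=2)

# Emit the two-column table: feature name, number of top-k appearances.
for feature, n in counts.most_common():
    print(f"{feature}\t{n}")
```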
Use local scores and global rankings to calculate the final correlation score, then output the names of the top 9 expression features and 1 copy number feature for each PR gene. One command-line parameter is required; we used 0.7 in this project.
Example: perl get_top_prior_2300.pl 0.7
A two-column table containing feature names and how many times each feature’s local score was in the top 10. Generated by generate_GE_top_100.pl.
Unprocessed copy number data. Download
A matrix of Pearson scores between PR genes and CCLE-EXP genes. Generated by calculate_corr.m
A table of the names of the 10 predictive features for each PR gene.
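The selection of 9 expression features plus 1 copy number feature can be sketched as follows. This sketch assumes the final scores have already been computed; how local scores and global rankings are combined is not reproduced here, and the function name, feature names, and scores are all hypothetical.

```python
def pick_features(exp_scores, cn_scores, n_exp=9, n_cn=1):
    """Return the best n_exp expression and n_cn copy number feature names."""
    top_exp = sorted(exp_scores, key=exp_scores.get, reverse=True)[:n_exp]
    top_cn = sorted(cn_scores, key=cn_scores.get, reverse=True)[:n_cn]
    return top_exp + top_cn

# Hypothetical final scores for one PR gene (higher is better).
exp_scores = {"E1": 0.5, "E2": 0.9, "E3": 0.7}
cn_scores = {"C1": 0.3, "C2": 0.6}

# Toy call with 2 + 1 features; the project itself uses 9 + 1.
selected = pick_features(exp_scores, cn_scores, n_exp=2, n_cn=1)
```

Ranking the two feature types separately guarantees that a copy number feature is always included, even when every expression feature scores higher.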
Generate the formatted SVM input file for the training dataset.
Unprocessed gene expression data. Download
Unprocessed copy number data. Download
Generated by get_top_prior_2300.pl
A list of genes whose essentiality needs to be predicted. Download
Achilles scores scaled by min and max. See main text for more information.
This is the SVM input file for the training dataset.
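For orientation, the sketch below shows one way to scale target scores and write SVM input lines in Python. Two things are assumptions rather than facts about this project: the scaling formula `(x - min) / (max - min)` (the main text defines the actual scaling), and the sparse `target index:value` line layout used by tools such as SVMlight and LIBSVM. All values are hypothetical.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; assumed form of 'scaled by min and max'."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

def to_sparse_line(target, features):
    """Format one sample as 'target 1:v1 2:v2 ...' (assumed file layout)."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{target:g} {pairs}"

# Hypothetical data: one essentiality score and two feature values per cell line.
scores = [-2.0, 0.0, 2.0]
feature_rows = [[1.5, 0.2], [0.3, 0.4], [2.0, 0.1]]

scaled = min_max_scale(scores)
lines = [to_sparse_line(t, row) for t, row in zip(scaled, feature_rows)]
```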
Similar to extract_value_svm.pl, but generates SVM input files for testing data.
Same as extract_value_svm.pl.
This is the SVM input file for the testing dataset.
Use an SVM to perform linear regression and predict on the testing dataset. One command-line parameter is required; we used 0.005 in this project.
Example: perl test_svm_c.pl 0.005
SVM input files for training data. Generated by extract_value_svm.pl.
SVM input files for testing data. Generated by extract_value_svm_test.pl.
The model for a specific gene.
The predicted essentiality score of a specific gene in testing cell lines.
This folder contains scripts for 5-fold cross-validation testing alternative alpha values.
This folder contains scripts for 5-fold cross-validation testing different regression algorithms.
This folder contains scripts for 5-fold cross-validation testing alternative numbers of features (3, 4, 5 … 30) used for prediction.
This folder contains scripts for 5-fold cross-validation testing the performance of using only top 10 expression features as the 10 predictive features.
This folder contains scripts for 5-fold cross-validation testing the performance of using only the top 10 copy number features as the 10 predictive features.
This folder contains scripts for 5-fold cross-validation testing the performance of pooling the copy number and expression profiles and using the top 10 of the mixed features as the 10 predictive features.
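The 5-fold cross-validation used by the folders above can be sketched generically in Python. This is not the repository's code: the shuffling, the fixed seed, and the cell line names are assumptions made for illustration.

```python
import random

def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists; each item is in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# Hypothetical cell line identifiers.
cell_lines = [f"CELL_{n}" for n in range(10)]
splits = list(k_fold_splits(cell_lines))
```

Each alternative setting (alpha value, regression algorithm, number of features, or feature type) would be trained on `train` and scored on `test` for every fold, and the per-fold results compared.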