Project author: gitakartika
Project description:
Classify whether protein sequences have antioxidant properties, based on features derived from the sequences
Language: Jupyter Notebook
Project URL: git://github.com/gitakartika/antioxidant-protein-classification.git
antioxidant-protein-classification
The goal of this project is to classify whether or not a protein sequence has antioxidant properties. To achieve this goal, we perform the following steps:
- Feature extraction: obtain features for each protein from its sequence
- Feature selection: keep only the impactful features, using two approaches:
a) Eliminate highly correlated features
b) Perform RFECV (recursive feature elimination with cross-validation)
- Analyze the optimal SVM parameters to determine which parameters to use in hyperparameter tuning
- Hyperparameter tuning of the SVM
- Evaluate the model on the test data
- Analyze whether or not the model overfits.
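
The steps after feature extraction could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the notebook's actual code. The data is synthetic (generated with `make_classification`) and stands in for the extracted sequence features. The helper `drop_correlated`, the 0.9 correlation threshold, the parameter grid, and the 75/25 split are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def drop_correlated(df, threshold=0.9):
    """Hypothetical helper: drop the later feature of every pair whose
    absolute Pearson correlation exceeds `threshold`."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Synthetic stand-in for the extracted features (NOT the project's data).
X_raw, y = make_classification(
    n_samples=200, n_features=12, n_informative=4, n_redundant=4, random_state=0
)
X = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(X_raw.shape[1])])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# a) Eliminate highly correlated features, decided on the training set only.
X_train_f = drop_correlated(X_train)
X_test_f = X_test[X_train_f.columns]

# SVMs are scale-sensitive, so standardize using training statistics.
scaler = StandardScaler().fit(X_train_f)
X_train_s = scaler.transform(X_train_f)
X_test_s = scaler.transform(X_test_f)

# b) RFECV with a linear SVM, which exposes the coef_ that RFE ranks by.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
selector = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="accuracy")
selector.fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Hyperparameter tuning of the SVM over an assumed, illustrative grid.
param_grid = {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}
grid = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
grid.fit(X_train_sel, y_train)

# Evaluate on the held-out test data and compare against the training score.
train_acc = grid.score(X_train_sel, y_train)
test_acc = grid.score(X_test_sel, y_test)
print("selected features:", selector.n_features_)
print("best parameters:", grid.best_params_)
print(f"train accuracy: {train_acc:.3f}  test accuracy: {test_acc:.3f}")
```

A training accuracy well above the test accuracy suggests overfitting, while similar scores suggest the model generalizes. How large a gap counts as overfitting is a judgment call that depends on the dataset size and the variance of the cross-validation scores.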