Project author: tzaiyang

Project description:
Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
Language: Python
Repository: git://github.com/tzaiyang/SpeechEmoRec.git
Created: 2017-12-27T11:56:51Z
Project page: https://github.com/tzaiyang/SpeechEmoRec


SpeechEmoRec

Introduction

This project aims to implement the speech emotion recognition strategy proposed in the paper "Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching".

Runtime environment

CPU host:

  • Ubuntu 16.04
  • Python 3.5
  • TensorFlow 1.7.0

GPU server:

  • tensorflow-gpu 1.7.0
  • NVIDIA driver version 390
  • CUDA 9.0
  • cuDNN 7.0

Instructions

Preprocessing Data

  1. In path.py, set the path where the dataset should be saved.
  2. Download the datasets:
    1. Berlin Database of Emotional Speech
      $ python load_emodb.py
    2. eNTERFACE Dataset
      Download the eNTERFACE05 dataset and update the dataset root.
  3. Start preprocessing:

    $ python melSpec.py
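
The contents of melSpec.py are not shown here, so the following is only a rough sketch of one step that this kind of preprocessing commonly involves: cutting a spectrogram into fixed-length, overlapping segments so that each one can be fed to a CNN with a fixed input size. The function name split_into_segments, the segment length of 64 frames, and the hop of 30 frames are illustrative assumptions, not values taken from the repository.

```python
def split_into_segments(frames, segment_len=64, hop=30):
    """Cut a sequence of spectrogram frames into overlapping fixed-length segments.

    frames: list of per-frame feature vectors (for example, one list of
            mel-band energies per frame).
    Returns a list of segments, each containing exactly `segment_len` frames.
    Trailing frames that do not fill a whole segment are dropped.
    """
    if segment_len <= 0 or hop <= 0:
        raise ValueError("segment_len and hop must be positive")
    segments = []
    start = 0
    while start + segment_len <= len(frames):
        segments.append(frames[start:start + segment_len])
        start += hop  # hop < segment_len makes consecutive segments overlap
    return segments


if __name__ == "__main__":
    # Toy "spectrogram": 130 frames with 4 bands each (hypothetical sizes).
    toy = [[float(t)] * 4 for t in range(130)]
    segments = split_into_segments(toy)
    print(len(segments), "segments of", len(segments[0]), "frames each")
```

Dropping the incomplete tail keeps every segment the same size; padding the last segment instead would be an equally reasonable choice.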

Feature Extraction

Fine-tune AlexNet with TensorFlow

  1. $ python finetune.py

Discriminant Temporal Pyramid Matching

  1. $ python dtpm.py -s
  2. $ python dtpm.py -n
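
To give a feel for what temporal pyramid pooling does, here is a minimal sketch that turns a variable-length sequence of segment-level feature vectors into one fixed-length utterance-level vector. It uses Lp-norm pooling, in the spirit of the "Geometric ℓp-norm feature pooling" reference below. It is not the code in dtpm.py: the function names, the pyramid levels (1, 2), and the value of p are assumptions, and the discriminant part of the method is left out entirely.

```python
def lp_pool(vectors, p=2.0):
    """Pool equal-length feature vectors into one vector with Lp-norm pooling.

    Each output dimension is (mean(|x|^p))^(1/p). With p=1 this is the
    average of absolute values; as p grows it approaches max pooling.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    dim = len(vectors[0])
    return [
        (sum(abs(v[d]) ** p for v in vectors) / n) ** (1.0 / p)
        for d in range(dim)
    ]


def temporal_pyramid(segment_features, levels=(1, 2), p=2.0):
    """Concatenate pooled features over a temporal pyramid.

    At a level with k parts, the segment sequence is split into k contiguous
    chunks along time; each chunk is pooled and the results are concatenated.
    The output length is sum(levels) * feature_dim, whatever the input length.
    """
    n = len(segment_features)
    pooled = []
    for k in levels:
        if k > n:
            raise ValueError("more pyramid parts than segments")
        for i in range(k):
            lo = i * n // k
            hi = (i + 1) * n // k
            pooled.extend(lp_pool(segment_features[lo:hi], p))
    return pooled


if __name__ == "__main__":
    # Four segments with 2-dimensional features (toy values).
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(temporal_pyramid(feats, levels=(1, 2), p=1.0))
```

Because the output length depends only on the pyramid levels and the feature dimension, utterances of different durations map to vectors of the same size, which is what a classifier such as an SVM needs.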

Classifier

Support Vector Machine

  1. $ python svm.py

References:

Reference Models:

  • AlexNet
  • SVM

Reference Papers:

  • ImageNet Classification with Deep Convolutional Neural Networks
  • Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
  • Geometric ℓp-norm feature pooling for image classification