Project author: stefantaubert

Project description:
Source code for the TUCMI submissions to the GeoLifeCLEF 2018 species recognition task
Language: Python
Project URL: git://github.com/stefantaubert/lifeclef-geo-2018.git
Created: 2018-06-08T10:44:31Z
Project community: https://github.com/stefantaubert/lifeclef-geo-2018

License: MIT License


Species Prediction based on Environmental Variables using Machine Learning Techniques

By Stefan Taubert, Max Mauermann, Stefan Kahl, Thomas Wilhelm-Stein, Danny Kowerko and Maximilian Eibl

Introduction

This is the source code for our submissions to the GeoLifeCLEF 2018 species recognition task.

Contact: Stefan Taubert, Technische Universität Chemnitz, Media Informatics

E-Mail: github@stefantaubert.com

This project is licensed under the terms of the MIT license.

Please cite the paper in your publications if it helps your research.

    @article{taubert2018large,
      title={Species Prediction based on Environmental Variables using Machine Learning Techniques},
      author={Taubert, Stefan and Mauermann, Max and Kahl, Stefan and Wilhelm-Stein, Thomas and Kowerko, Danny and Ritter, Marc and Eibl, Maximilian},
      journal={Working notes of CLEF},
      year={2018}
    }

You can download our working notes here: TUCMI GeoLifeCLEF Working Notes PDF

Installation

Python

    git clone git@github.com:stefantaubert/lifeclef-geo-2018.git
    cd lifeclef-geo-2018
    sudo pip install -r requirements.txt

Training

For training, you need to download the GeoLifeCLEF training data.

Dataset

Set the path to the directory that contains the datasets.
To do so, create a file geo/data_dir_config.py that defines a root variable, like this:

    root = "/path/to/datasetdir"

The dataset directory should contain the following files and directories:

    occurrences_test.csv
    occurrences_train.csv
    patchTrain
    ¦   256
    ¦   ¦   patch_1.tif
    ¦   ¦   patch_2.tif
    ¦   ¦   ...
    ¦   512
    ¦   ¦   patch_257.tif
    ¦   ¦   patch_258.tif
    ¦   ¦   ...
    ¦   ...
    patchTest
    ¦   256
    ¦   ¦   patch_1.tif
    ¦   ¦   patch_2.tif
    ¦   ¦   ...
    ¦   512
    ¦   ¦   patch_257.tif
    ¦   ¦   patch_258.tif
    ¦   ¦   ...
    ¦   ...
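
Before training, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch and not part of the repository: the function name `check_dataset_dir` is hypothetical, and the check only assumes the file and directory names shown in the listing.

```python
from pathlib import Path


def check_dataset_dir(root):
    """Return a list of expected entries that are missing under root."""
    root = Path(root)
    expected = [
        "occurrences_test.csv",
        "occurrences_train.csv",
        "patchTrain",
        "patchTest",
    ]
    missing = [name for name in expected if not (root / name).exists()]
    # Each patch directory is expected to hold numbered subdirectories
    # (256, 512, ...) that contain patch_<id>.tif files.
    for patch_dir in ("patchTrain", "patchTest"):
        base = root / patch_dir
        if base.is_dir() and not any(base.glob("*/patch_*.tif")):
            missing.append(patch_dir + "/*/patch_*.tif")
    return missing


if __name__ == "__main__":
    # Placeholder path; use the value of `root` from geo/data_dir_config.py.
    problems = check_dataset_dir("/path/to/datasetdir")
    print("missing:", problems if problems else "nothing")
```

An empty list means all four top-level entries exist and each patch directory contains at least one patch file.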

Run Models

To run any of the eight models, execute the corresponding Python script with PYTHONPATH set to the repository root:

XGB Single Model

    PYTHONPATH=/path/to/gitrepo python geo/models/xgb/single_model.py

XGB Multi Model

    PYTHONPATH=/path/to/gitrepo python geo/models/xgb/multi_model.py

XGB Multi Model with Groups

    PYTHONPATH=/path/to/gitrepo python geo/models/xgb/multi_model_with_groups.py

Keras Single Model

    PYTHONPATH=/path/to/gitrepo python geo/models/keras/train_keras_model.py

Keras Multi Model

    PYTHONPATH=/path/to/gitrepo python geo/models/keras/train_keras_model.py

Vector Model

    PYTHONPATH=/path/to/gitrepo python geo/models/vector/model.py

Random Model

    PYTHONPATH=/path/to/gitrepo python geo/models/random/model.py

Probability Model

    PYTHONPATH=/path/to/gitrepo python geo/models/probability/model.py
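
The commands above can also be wrapped in a small launcher that sets PYTHONPATH itself. This is a sketch, not part of the repository: the dictionary keys and function names are hypothetical, while the script paths are copied from the commands listed above (the two Keras entries name the same script, so it appears once).

```python
import os
import subprocess
import sys

# Script paths copied from the commands above, relative to the repository root.
# The short keys are made up for this example.
MODEL_SCRIPTS = {
    "xgb_single": "geo/models/xgb/single_model.py",
    "xgb_multi": "geo/models/xgb/multi_model.py",
    "xgb_multi_groups": "geo/models/xgb/multi_model_with_groups.py",
    "keras": "geo/models/keras/train_keras_model.py",
    "vector": "geo/models/vector/model.py",
    "random": "geo/models/random/model.py",
    "probability": "geo/models/probability/model.py",
}


def build_command(repo_root, model):
    """Return (argv, env) for running one model script from repo_root."""
    script = os.path.join(repo_root, *MODEL_SCRIPTS[model].split("/"))
    env = dict(os.environ)
    # Equivalent of the PYTHONPATH=/path/to/gitrepo prefix.
    env["PYTHONPATH"] = repo_root
    return [sys.executable, script], env


def run_model(repo_root, model):
    """Run the chosen model script and return its exit code."""
    argv, env = build_command(repo_root, model)
    return subprocess.call(argv, env=env, cwd=repo_root)
```

For example, `run_model("/path/to/gitrepo", "xgb_single")` corresponds to the first command. Because the environment is passed to the child process directly, the launcher does not rely on shell-specific syntax for setting variables.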

Tests

To run the tests, execute the relevant script in the tests directory:

    PYTHONPATH=/path/to/gitrepo python geo/tests/test_*.py
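
If a shell expands `test_*.py` to several files, `python` runs only the first one and passes the others to it as arguments. The sketch below runs every matching file separately. It assumes each test file can be executed as a standalone script, as the command above suggests; the function name `run_test_scripts` is hypothetical.

```python
import glob
import os
import subprocess
import sys


def run_test_scripts(repo_root, pattern="geo/tests/test_*.py"):
    """Run each matching test script separately; return {path: exit code}."""
    env = dict(os.environ, PYTHONPATH=repo_root)
    results = {}
    for path in sorted(glob.glob(os.path.join(repo_root, pattern))):
        # One interpreter process per file, started from the repository root.
        results[path] = subprocess.call([sys.executable, path], env=env, cwd=repo_root)
    return results
```

A non-zero exit code in the returned dictionary marks a script that failed.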

Analysis

You can run our analysis with any script in the geo/analysis/ directory, for instance:

    PYTHONPATH=/path/to/gitrepo python geo/analysis/species_occurences.py

On Windows

See this post on StackOverflow for how to set the PYTHONPATH. Another option is to use Visual Studio Code with a launch.json like this:

    {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Python: Current File",
                "type": "python",
                "request": "launch",
                "program": "${file}",
                "env": {"PYTHONPATH": "${workspaceRoot}"}
            }
        ]
    }