Project author: jacksonpradolima

Project description:
GitLab CI Torrent - Mining CI job logs
Primary language: Python
Project URL: git://github.com/jacksonpradolima/gitlabci-torrent.git
Created: 2020-03-13T19:00:13Z




GitLabCI-Torrent tool

An easy way to mine GitLab CI job logs for test results


This tool provides methods to download, analyze, and extract data from job logs hosted on GitLab CI:

  1. Harvester (harvester_main.py), which downloads the logs.
  2. Analyzer (analyzer_main.py), which extracts relevant information from the logs.
  3. Data Extraction (data_extraction_main.py), which splits the data by variant (see our article for background on Highly Configurable Systems).
  4. Project Status (project_status_main.py), which reports relevant information about the system and its variants: the period (when the variant existed and logs were extracted), the number of builds, the number of faults (and of failing builds), the number of tests (and its variation), the mean test duration, and the interval between commits.
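To make the Project Status metrics concrete, here is a minimal sketch of how such a summary could be computed. It is not the tool's implementation: the record layout (`commit_time`, `failed`, `test_durations`) and the `summarize` helper are hypothetical names chosen for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical build records; the field names are assumptions for this sketch,
# not the schema that project_status_main.py actually reads.
builds = [
    {"commit_time": datetime(2020, 3, 1, 10, 0), "failed": False, "test_durations": [1.0, 2.0]},
    {"commit_time": datetime(2020, 3, 1, 12, 0), "failed": True, "test_durations": [3.0]},
    {"commit_time": datetime(2020, 3, 1, 16, 0), "failed": False, "test_durations": [2.0, 2.0]},
]


def summarize(builds):
    """Compute period, build counts, mean test duration, and mean commit interval."""
    if not builds:
        raise ValueError("at least one build is required")
    ordered = sorted(builds, key=lambda b: b["commit_time"])
    durations = [d for b in ordered for d in b["test_durations"]]
    # Hours between each pair of consecutive commits.
    gaps = [
        (later["commit_time"] - earlier["commit_time"]).total_seconds() / 3600
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return {
        "period": (ordered[0]["commit_time"], ordered[-1]["commit_time"]),
        "builds": len(ordered),
        "failed_builds": sum(1 for b in ordered if b["failed"]),
        "mean_test_duration": mean(durations) if durations else None,
        "mean_commit_interval_hours": mean(gaps) if gaps else None,
    }


summary = summarize(builds)
print(summary["builds"], summary["failed_builds"])
```

The interval is averaged over consecutive commits only, so a project with a single build reports `None` rather than a misleading zero.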

:pencil: Citation

If this tool contributes to a project which leads to a scientific publication, I would appreciate a citation.

    @InProceedings{PradoLima_Learning2020,
      author = {Prado Lima, Jackson A. and Mendon\c{c}a, Willian D. F. and Vergilio, Silvia R. and Assun\c{c}\~{a}o, Wesley K. G.},
      title = {{Learning-Based Prioritization of Test Cases in Continuous Integration of Highly-Configurable Software}},
      booktitle = {Proceedings of the 24th ACM Conference on Systems and Software Product Line: Volume A - Volume A},
      series = {SPLC'20},
      year = {2020},
      isbn = {9781450375696},
      doi = {10.1145/3382025.3414967},
      articleno = {31},
      numpages = {11},
      location = {Montreal, Quebec, Canada},
      publisher = {Association for Computing Machinery},
    }

:red_circle: Installing required dependencies

The following command installs the required dependencies:

    $ pip install -r requirements.txt

:heavy_exclamation_mark: Allowing the tool to connect with GitLab CI

  1. Create a Personal Access Token (see the Personal Access Token guide) for the desired GitLab instance, for example https://gitlab.com or https://gitlab.dune-project.org/. The token needs permission to read the repository and fetch the job logs.
  2. Complete the configuration.properties file with your GitLab Access Token.

WARNING: Sometimes the connection does not work. In that case, edit gitlabci_torrent/utils/gitlab_utils.py so that it refers to the properties file by its absolute path.
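One way to avoid hard-coding an absolute path is to resolve it at run time. The sketch below is only an illustration: `find_properties` is a hypothetical helper, not part of the tool, and it assumes configuration.properties lives in the starting directory or one of its parents.

```python
from pathlib import Path


def find_properties(start, filename="configuration.properties"):
    """Walk upward from `start` and return the absolute path of the first match.

    `start` may be a directory or a file (for example a module's __file__).
    """
    start = Path(start).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{filename} not found in {start} or any parent directory")
```

Inside a module such as gitlab_utils.py this could be called as `find_properties(__file__)`, which yields the same absolute path regardless of the directory the scripts are launched from.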

Using the tool

📌 Downloading the job logs (Harvester)

To download the logs from a project, do:

    python harvester_main.py -p ProjectID -k ConfigKey


  • -p or --project_id for Repository ID (or Project ID). Follow this answer to find the ID.
  • -k or --configkey is the Configuration Key saved in configuration.properties (default GitLab)

The other available parameters are:

  • -d or --save_dir sets the directory where the logs will be saved (default logs).
  • -t or --threshold sets a date threshold for the mining, in YYYY/MM/DD format. If it is omitted, all logs are returned.
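As an illustration of the YYYY/MM/DD threshold format, the sketch below parses such a value and applies it to a list of jobs. It is not the Harvester's code: `parse_threshold`, `filter_jobs`, and the `created` field are hypothetical, and keeping jobs dated on or after the threshold is an assumption about the direction of the filter.

```python
from datetime import date, datetime


def parse_threshold(value):
    """Parse a YYYY/MM/DD string into a date; None means no threshold."""
    if value is None:
        return None
    # Raises ValueError when the string does not match the expected format.
    return datetime.strptime(value, "%Y/%m/%d").date()


def filter_jobs(jobs, threshold):
    """Return all jobs when no threshold is set.

    ASSUMPTION: with a threshold, keep jobs created on or after that date.
    """
    if threshold is None:
        return list(jobs)
    return [job for job in jobs if job["created"] >= threshold]


# Hypothetical job records for illustration only.
jobs = [
    {"id": 1, "created": date(2020, 1, 10)},
    {"id": 2, "created": date(2020, 3, 13)},
]
recent = filter_jobs(jobs, parse_threshold("2020/03/01"))
print([job["id"] for job in recent])
```

A value written with another separator, such as 2020-03-13, raises `ValueError` here, which makes a wrongly formatted threshold visible immediately.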

📌 Running the data extraction process (Analyzer)

To extract the features for one project, do:

    python analyzer_main.py -d PathToLogs


  • -d or --logs_dir is the directory with the logs.

📌 Splitting by Variants (Data Extraction)

To split the test results by variant, do:

    python data_extraction_main.py -d PathToLogs -p ProjectName


  • -d or --logs_dir is the directory with the logs.
  • -p or --project_name is the project name. Some projects use similar names for the repository and the (GitLab) user, so this option lets you specify the right one.

📌 Observing the project (Project Status)

To observe the project status for one project, do:

    python project_status_main.py -d PathToLogs -p ProjectName


  • -d or --logs_dir is the directory with the logs.
  • -p or --project_name is the project name. Some projects use similar names for the repository and the (GitLab) user, so this option lets you specify the right one.
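The four scripts can also be chained from a small driver. The sketch below only assembles the command lines documented above; `build_commands` and `run_pipeline` are hypothetical helpers. It assumes the scripts are run in the order listed and that the directory passed to the Harvester with -d is the one the later stages read, which may not hold if the Harvester creates per-project subdirectories.

```python
import subprocess
import sys


def build_commands(project_id, project_name, logs_dir="logs",
                   config_key="GitLab", threshold=None):
    """Return the command line for each stage, in the order described above."""
    harvest = [sys.executable, "harvester_main.py",
               "-p", str(project_id), "-k", config_key, "-d", logs_dir]
    if threshold:
        # Optional date threshold in YYYY/MM/DD format.
        harvest += ["-t", threshold]
    return [
        harvest,
        [sys.executable, "analyzer_main.py", "-d", logs_dir],
        [sys.executable, "data_extraction_main.py", "-d", logs_dir, "-p", project_name],
        [sys.executable, "project_status_main.py", "-d", logs_dir, "-p", project_name],
    ]


def run_pipeline(commands):
    """Run each stage in turn and stop at the first one that fails."""
    for command in commands:
        subprocess.run(command, check=True)


# Example with placeholder values; call run_pipeline(commands) from the
# repository root to actually execute the stages.
commands = build_commands(123, "my-project", threshold="2020/01/01")
for command in commands:
    print(" ".join(command[1:]))
```

Using `check=True` means a failed download stops the run before the later stages process an incomplete log directory.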


  • 👨‍💻 Jackson Antonio do Prado Lima :e-mail:
  • 👨‍💻 Willian Douglas Ferrari Mendonça :e-mail: