Reproducible pipeline from Twitter API using DVC
In this project I built a DVC pipeline from a notebook I created earlier, called Twitter API.
Due to the size of my notebook, I only put the most important parts of my work into the pipeline.
These parts are the three stages shown in the pipeline graph below:
+-------+
| fetch |
+-------+
*
*
*
+-------+
| graph |
+-------+
*
*
*
+------------+
| egonetwork |
+------------+
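The chain of stages above can be sketched in plain Python. This is only an illustration: the function names, the sample edge list, and what each stage does are assumptions inferred from the stage names, not the project's actual scripts. The real fetch stage presumably queries the Twitter API rather than returning hard-coded data.

```python
from collections import defaultdict


def fetch():
    """Hypothetical stand-in for the fetch stage.

    Returns hard-coded (user, user) pairs instead of calling the Twitter API.
    """
    return [
        ("alice", "bob"),
        ("alice", "carol"),
        ("bob", "carol"),
        ("carol", "dave"),
    ]


def build_graph(edges):
    """Hypothetical stand-in for the graph stage: undirected adjacency sets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def ego_network(adjacency, ego):
    """Hypothetical stand-in for the egonetwork stage.

    Keeps the ego node, its direct neighbours, and the edges among them.
    """
    nodes = {ego} | adjacency.get(ego, set())
    return {n: adjacency.get(n, set()) & nodes for n in nodes}


# Each stage consumes the output of the previous one, as in the graph above.
edges = fetch()
graph = build_graph(edges)
ego = ego_network(graph, "alice")
```

In a DVC pipeline each stage would read and write files rather than pass values in memory, which is what lets dvc repro skip stages whose inputs have not changed.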
To download the project, clone the repository:
git clone https://github.com/antoniod20/dvc-twitter.git
The project was developed with Python 3.6.9, so a Python 3 installation is required, ideally version 3.6.9 or later.
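A quick way to check the interpreter before installing anything is sketched below. The 3.6 minimum is an assumption taken from the version the project was developed with, not a documented requirement, and the helper name is hypothetical.

```python
import sys

# Assumption: 3.6 is used as the minimum because the project was built on 3.6.9.
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    # str.format is used so the message also works on very old interpreters.
    raise SystemExit(
        "Python {}.{} or later is recommended, found {}.{}".format(
            MIN_VERSION[0], MIN_VERSION[1],
            sys.version_info[0], sys.version_info[1],
        )
    )
```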
To install all the libraries needed to run the project, run the following command:
pip install -r src/requirements.txt
To launch the pipeline, move into the project directory and run dvc repro (the path below uses Windows separators; on Linux or macOS use forward slashes):
cd .\dvc-twitter\dvc-twitter-api\
dvc repro