Project author: russomi-labs

Project description:
Apache Airflow Quickstart

Language: Python
Project URL: git://github.com/russomi-labs/airflow-quickstart.git
Created: 2019-02-05T14:04:03Z
Project community: https://github.com/russomi-labs/airflow-quickstart

License:



Quickstart

Airflow Quickstart and Tutorial

ETL best practices with Airflow

ETL best practices with Airflow is an excellent guide to building ETL
pipelines with Airflow.

Option 1: Install Airflow on Host System

  export SLUGIFY_USES_TEXT_UNIDECODE=yes
  pip2 install -r requirements-dev.txt
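
Once installed, Airflow still needs its metadata database initialized and its
webserver and scheduler started. A minimal sketch, assuming an Airflow
1.10-era release (initdb was renamed db init in Airflow 2):

  # Initialize the metadata database (SQLite by default)
  airflow initdb

  # Start the webserver, and the scheduler in a second terminal
  airflow webserver -p 8080
  airflow scheduler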

Option 2: Run Airflow with Docker

See the Run Airflow from Docker
tutorial for additional details.

Prerequisites

Usage

Run the web service with docker

  # See https://docs.docker.com/compose/reference/envvars and
  # https://docs.docker.com/compose/compose-file/#variable-substitution
  export COMPOSE_FILE=docker-compose-LocalExecutor.yml
  docker-compose up -d

  # Build the image
  # docker-compose up -d --build

Check http://localhost:8080/
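
You can also check readiness from the command line. A minimal sketch,
assuming an image recent enough to serve Airflow's /health endpoint (added
during the 1.10.x line):

  # Query the webserver's health endpoint
  curl -s http://localhost:8080/health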

  • docker-compose logs - Displays log output
  • docker-compose ps - List containers
  • docker-compose down - Stop containers
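
For example, to follow the log output of just one service rather than all of
them (webserver is the service name used throughout this compose file):

  # Tail only the webserver's logs
  docker-compose logs -f webserver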

Other commands

If you want to run other airflow sub-commands, you can do so like this:

  • docker-compose run --rm webserver airflow list_dags - List dags
  • docker-compose run --rm webserver airflow test [DAG_ID] [TASK_ID] [EXECUTION_DATE] - Test specific task
  # See https://docs.docker.com/compose/reference/envvars and
  # https://docs.docker.com/compose/compose-file/#variable-substitution
  export COMPOSE_FILE=docker-compose-LocalExecutor.yml

  # List dags
  docker-compose run \
      --no-deps \
      --rm \
      webserver airflow list_dags

  # Test specific task
  docker-compose run \
      --no-deps \
      --rm webserver \
      airflow test \
      init_docker_example \
      initialize_etl_example \
      2019-01-12T00:00:00+00:00

Other docker-compose examples:

  docker-compose -f docker-compose-LocalExecutor.yml up --abort-on-container-exit
  docker-compose -f docker-compose-LocalExecutor.yml down
  docker-compose -f docker-compose-LocalExecutor.yml up -d

  # list dags by running the webserver container and using the airflow cli
  docker-compose run --rm webserver airflow list_dags

  # unpause init_docker_example
  docker-compose run --rm webserver airflow unpause init_docker_example
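
After unpausing, the example DAG can be triggered through the same run
pattern. A sketch using the 1.10-era trigger_dag sub-command:

  # Trigger a run of init_docker_example
  docker-compose run --rm webserver airflow trigger_dag init_docker_example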

docker-compose usage

For example:

  $ docker-compose run web python manage.py shell

By default, linked services will be started, unless they are already
running. If you do not want to start linked services, use
docker-compose run --no-deps SERVICE COMMAND [ARGS...].
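
For example, a one-off command in the webserver container that does not need
the linked postgres service can skip starting it (a minimal sketch):

  # Run without starting linked services; remove the container afterwards
  docker-compose run --no-deps --rm webserver airflow version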

  Usage:
      run [options] [-v VOLUME...] [-p PORT...] [-e KEY=VAL...] [-l KEY=VALUE...]
          SERVICE [COMMAND] [ARGS...]

  Options:
      -d, --detach          Detached mode: Run container in the background, print
                            new container name.
      --name NAME           Assign a name to the container
      --entrypoint CMD      Override the entrypoint of the image.
      -e KEY=VAL            Set an environment variable (can be used multiple times)
      -l, --label KEY=VAL   Add or override a label (can be used multiple times)
      -u, --user=""         Run as specified username or uid
      --no-deps             Don't start linked services.
      --rm                  Remove container after run. Ignored in detached mode.
      -p, --publish=[]      Publish a container's port(s) to the host
      --service-ports       Run command with the service's ports enabled and mapped
                            to the host.
      --use-aliases         Use the service's network aliases in the network(s) the
                            container connects to.
      -v, --volume=[]       Bind mount a volume (default [])
      -T                    Disable pseudo-tty allocation. By default `docker-compose run`
                            allocates a TTY.
      -w, --workdir=""      Working directory inside the container
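
Several of these options combine naturally. A hedged sketch passing an
environment variable into a one-off run (AIRFLOW__CORE__LOAD_EXAMPLES follows
Airflow's standard config-via-environment naming):

  # Override an Airflow config value for this run only
  docker-compose run --rm \
      -e AIRFLOW__CORE__LOAD_EXAMPLES=False \
      webserver airflow list_dags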

Connect to database

If you want to use the Ad Hoc Query feature, make sure you’ve configured the connection:
go to Admin -> Connections, edit “postgres_default”, and set these values:

  • Host : postgres
  • Schema : airflow
  • Login : airflow
  • Password : airflow
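
The connection can also be created from the CLI instead of the UI. A sketch,
assuming the Airflow 1.10-era connections sub-command and flag names:

  # Recreate postgres_default from the command line (1.10-era CLI flags)
  docker-compose run --rm webserver airflow connections --add \
      --conn_id postgres_default \
      --conn_uri postgres://airflow:airflow@postgres:5432/airflow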

Credits

Resources


Google Cloud Composer Quickstart

Using gcloud to set variables

The gcloud composer command can be used to set Airflow variables.

  export GCP_PROJECT=$(gcloud config get-value core/project)
  export GCE_ZONE=$(gcloud config get-value compute/zone)
  export ENVIRONMENT=dev
  export ENVIRONMENT_NAME=$GCP_PROJECT'-'$ENVIRONMENT
  export LOCATION=us-$GCE_ZONE
  export GCS_BUCKET=$GCP_PROJECT'-'$ENVIRONMENT_NAME'-bucket'

  gsutil mb gs://$GCS_BUCKET

  gcloud composer environments run $ENVIRONMENT_NAME \
      --location $LOCATION variables -- \
      --set gcp_project $GCP_PROJECT

  gcloud composer environments run $ENVIRONMENT_NAME \
      --location $LOCATION variables -- \
      --set gcs_bucket $GCS_BUCKET

  gcloud composer environments run $ENVIRONMENT_NAME \
      --location $LOCATION variables -- \
      --set gce_zone $GCE_ZONE
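
To confirm a variable landed, the same pattern passes --get through to the
Airflow CLI. A sketch, assuming the 1.10-era variables sub-command:

  # Read back a variable that was just set
  gcloud composer environments run $ENVIRONMENT_NAME \
      --location $LOCATION variables -- \
      --get gcp_project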

Airflow tutorial (Original)

This is the code for the Airflow tutorial playlist by Tuan Vu on YouTube.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Usage

Run the web service with docker

  docker-compose up -d

  # Build the image
  # docker-compose up -d --build

Check http://localhost:8080/

  • docker-compose logs - Displays log output
  • docker-compose ps - List containers
  • docker-compose down - Stop containers

Other commands

If you want to run other airflow sub-commands, you can do so like this:

  • docker-compose run --rm webserver airflow list_dags - List dags
  • docker-compose run --rm webserver airflow test [DAG_ID] [TASK_ID] [EXECUTION_DATE] - Test specific task
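
For example, with placeholder IDs (my_dag and my_task are hypothetical;
substitute a DAG and task reported by list_dags):

  # Test a single task for a given execution date without recording state
  docker-compose run --rm webserver airflow test my_dag my_task 2019-01-01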

Connect to database

If you want to use the Ad Hoc Query feature, make sure you’ve configured the connection:
go to Admin -> Connections, edit “postgres_default”, and set these values:

  • Host : postgres
  • Schema : airflow
  • Login : airflow
  • Password : airflow

Credits