Project author: goyal07nidhi

Project description:
An application to make data available as an API with API key authentication.
Primary language: Jupyter Notebook
Repository: git://github.com/goyal07nidhi/Data-as-a-Service.git
Created: 2021-04-07T05:59:42Z
Project page: https://github.com/goyal07nidhi/Data-as-a-Service

License:



Data-as-a-service



Getting Started

The goal of this project is to build an application for a company that is interested in monetizing its data and making it available as an API.
To build this service, we use FastAPI to illustrate how it works.

Data used:

Tasks performed

  • Task 1: Review
  • Task 2: Data Ingestion
  • Task 3: Design the Fast API
  • Task 4: Enabling API key authentication
  • Task 5: Test API

Project Structure

  Assignment_3/
  ├── dags/
  │   └── data_ingestion.py
  ├── FastAPI/
  │   ├── main.py
  │   └── test_main.py
  ├── locust_load_test.py
  ├── MindiagramArchitecture/
  │   ├── datalytics_architecture.py
  │   └── moody_architecture.py
  ├── Production_Plant_data_input/
  │   ├── C11.csv
  │   ├── C13-1.csv
  │   ├── C13-2.csv
  │   ├── C14.csv
  │   ├── C15.csv
  │   ├── C16.csv
  │   ├── C7-1.csv
  │   ├── C7-2.csv
  │   ├── C8.csv
  │   └── C9.csv
  ├── README.md
  └── TestingJupyterNotebook/
      ├── main.ipynb
      └── test_main.ipynb

Task 1: Review

(Figure: api_architecture diagram)

Setup:

  1. Install Chocolatey by running the following command in an administrator PowerShell session:

     Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

  2. Run choco install graphviz in administrator mode (on Windows).
  3. pip install diagrams

Process:

  1. Make a .py file based on your architecture and run it. An image will be created in the same directory as your Python file (see the sketch below).
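
The following is a minimal sketch of such a script, assuming a recent version of the diagrams package that ships FastAPI, Airflow, and Snowflake node classes; the node choices and file name are illustrative, not the repository's actual architecture files (see MindiagramArchitecture/ for those):

  # architecture_sketch.py -- hypothetical example, not one of the repo's architecture files
  from diagrams import Diagram
  from diagrams.onprem.client import Users
  from diagrams.onprem.workflow import Airflow
  from diagrams.programming.framework import FastAPI
  from diagrams.saas.analytics import Snowflake

  # show=False writes api_architecture.png next to this file instead of opening a viewer
  with Diagram("api_architecture", show=False):
      Users("Client") >> FastAPI("Data API") >> Snowflake("Warehouse") << Airflow("Ingestion DAG")

Running python architecture_sketch.py produces the PNG in the same directory.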

Task 2: Data Ingestion


Requirements:

  • Snowflake
  • Airflow

1. Snowflake Account Setup

Create a Snowflake account using the link below:

  https://signup.snowflake.com/?_ga=2.124938569.258300955.1617216030-578664637.1617216030

2. Snowflake connection setup:

To verify your version of Python:

  python --version

Use pip version 19.0 or later. Execute the following command to ensure the required version is installed:

  python -m pip install --upgrade pip

To install the connector, run the following commands:

  pip install snowflake-connector-python==<version>
  pip install --upgrade snowflake-connector-python

Verify your installation
Create a file (e.g. validate.py) containing the following Python sample code, which connects to Snowflake and displays the Snowflake version:

  #!/usr/bin/env python
  import snowflake.connector

  # Connect and print the Snowflake version
  ctx = snowflake.connector.connect(
      user='<user_name>',
      password='<password>',
      account='<account_name>'
  )
  cs = ctx.cursor()
  try:
      cs.execute("SELECT current_version()")
      one_row = cs.fetchone()
      print(one_row[0])
  finally:
      cs.close()
      ctx.close()

Note: Make sure to replace <user_name>, <password>, and <account_name> with the appropriate values for your Snowflake account.

Next, execute the sample code:

  python validate.py

3. Airflow setup:

  pip install apache-airflow
  pip install -r requirements.txt

Once Airflow is installed, configure the same by running:

  # Use your present working directory as the airflow home
  export AIRFLOW_HOME=$(pwd)
  # Export PYTHONPATH to allow Airflow to use custom modules
  export PYTHONPATH="${PYTHONPATH}:${AIRFLOW_HOME}"
  # Initialize the database
  airflow db init
  airflow users create \
      --username admin \
      --firstname <YourName> \
      --lastname <YourLastName> \
      --role Admin \
      --email example@example.com

4. Using Airflow

Start the Airflow webserver as a daemon:

  airflow webserver -D

Start the Airflow scheduler:

  airflow scheduler

Once both are running, you should be able to access the Airflow UI by visiting http://127.0.0.1:8080/home in your browser.

To kill the Airflow webserver daemon, first list the processes listening on port 8080:

  lsof -i tcp:8080

You should see a list of all processes that looks like this:

  COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  Python 24905 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)
  Python 24909 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)
  Python 24911 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)
  Python 24912 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)
  Python 24916 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)
  Python 24923 ng 6u IPv4 0x4b0a093c5550948f 0t0 TCP *:http-alt (LISTEN)

Kill the process by running kill <PID> - in this case, it would be kill 24905.

Running the Pipeline

Log in to Airflow in your browser and turn on the Data_Ingestion DAG from the UI. Start the pipeline by selecting the DAG and clicking Run (a rough sketch of the DAG is shown below).
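
The following is a minimal sketch of what dags/data_ingestion.py might look like, assuming the DAG loads the CSV files from Production_Plant_data_input/ into an existing Snowflake table; the table name, connection placeholders, and task layout are assumptions, not the repository's actual code:

  from datetime import datetime
  import glob
  import os

  import pandas as pd
  import snowflake.connector
  from snowflake.connector.pandas_tools import write_pandas
  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def load_plant_csvs():
      # Replace the placeholders with your Snowflake account values
      ctx = snowflake.connector.connect(
          user='<user_name>', password='<password>', account='<account_name>',
          warehouse='<warehouse>', database='<database>', schema='<schema>'
      )
      try:
          for path in glob.glob('Production_Plant_data_input/*.csv'):
              df = pd.read_csv(path)
              df['SOURCE_FILE'] = os.path.basename(path)
              # write_pandas bulk-loads a DataFrame into an existing Snowflake table
              write_pandas(ctx, df, table_name='PRODUCTION_PLANT_DATA')
      finally:
          ctx.close()

  with DAG(
      dag_id='Data_Ingestion',
      start_date=datetime(2021, 4, 1),
      schedule_interval=None,
      catchup=False,
  ) as dag:
      ingest = PythonOperator(task_id='load_plant_csvs', python_callable=load_plant_csvs)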


Task 3: Design the Fast API

Requirements:

  • Fastapi
  • Pytest

Fastapi setup:

Install the FastAPI framework (high performance, easy to learn, fast to code, ready for production):

  pip install fastapi

Install the lightning-fast ASGI server Uvicorn:

  pip install uvicorn

Install the Python Snowflake connector to get data from or post data into Snowflake:

  pip install snowflake-connector-python==<version>

Install pytest for simple, powerful testing with Python:

  pip install pytest

Note: Make sure to replace the Snowflake connection placeholders (user, password, account, warehouse, database, schema) with the appropriate values for your Snowflake account.

We built various APIs that can be used to query different aspects of the dataset.

Create various GET and POST methods using FastAPI, as in the sketch below.

Review https://fastapi.tiangolo.com/tutorial/ for an introduction to building APIs with FastAPI.
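
As an illustration, a single GET endpoint backed by Snowflake could look like the following sketch; the route, table, and column names are hypothetical, and main.py in the FastAPI/ directory defines the actual endpoints:

  import snowflake.connector
  from fastapi import FastAPI

  app = FastAPI()

  def get_connection():
      # Replace the placeholders with your Snowflake account values
      return snowflake.connector.connect(
          user='<user_name>', password='<password>', account='<account_name>',
          warehouse='<warehouse>', database='<database>', schema='<schema>'
      )

  @app.get("/plants/{plant_id}")
  async def read_plant(plant_id: str):
      # Query Snowflake for the rows belonging to one plant (PLANT_ID is a hypothetical column)
      ctx = get_connection()
      cs = ctx.cursor()
      try:
          cs.execute("SELECT * FROM PRODUCTION_PLANT_DATA WHERE PLANT_ID = %s", (plant_id,))
          rows = cs.fetchall()
      finally:
          cs.close()
          ctx.close()
      return {"plant_id": plant_id, "rows": rows}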

Using Fastapi:

Go to the FastAPI directory path and you will see two files, main.py and test_main.py.

Run uvicorn main:app --reload --port 8080 to start FastAPI on port 8080 (Uvicorn defaults to port 8000 if --port is omitted), then open:

  http://localhost:8080/

Using Pytest:

To check whether our API is working well, we can make use of the test_main.py file by simply running pytest.

In it, we have a TestClient defined to test all the APIs present in our app.

Task 4: Enabling API key authentication

To enable API key authentication, we referred to https://medium.com/data-rebels/fastapi-authentication-revisited-enabling-api-key-authentication-122dc5975680

Authentication Setup:

  from fastapi.security.api_key import APIKeyQuery

- Set an API key name, for instance access_token
- Set the API key value, for instance "Team6"

Extract the access_token key from the query string using the following line of code:

  api_key_query = APIKeyQuery(name=API_KEY_NAME, auto_error=False)

Define the API key dependency:

  from fastapi import Security, HTTPException
  from starlette.status import HTTP_403_FORBIDDEN

  async def get_api_key(api_key_query: str = Security(api_key_query)):
      if api_key_query == API_KEY:
          return api_key_query
      raise HTTPException(
          status_code=HTTP_403_FORBIDDEN, detail="Could not validate credentials"
      )

  • Call this API key dependency from each and every API function to authenticate it, as in the sketch below.
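
For example, a protected endpoint attaches the dependency like this (a sketch; the route is illustrative):

  @app.get("/plants")
  async def list_plants(api_key: str = Security(get_api_key)):
      # Reaching this point means the access_token query parameter matched API_KEY
      return {"message": "authenticated"}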

Task 5: Test API

Unit Test Cases

Setup:

Install pytest by running:

  pip install pytest

Using TestClient

  • Import TestClient.
  • Create a TestClient, passing your FastAPI app to it.
  • Create functions with names that start with test_ (this is the standard pytest convention).
  • Use the TestClient object the same way as you do with requests.
  • Write simple assert statements with the standard Python expressions that you need to check (again, standard pytest).
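
A minimal sketch of such a test, assuming the access_token / "Team6" key from Task 4 and an illustrative /plants route (test_main.py in the repository holds the actual test cases):

  from fastapi.testclient import TestClient

  from main import app

  client = TestClient(app)

  def test_plants_rejects_missing_api_key():
      # No access_token query parameter, so get_api_key should raise HTTP 403
      response = client.get("/plants")
      assert response.status_code == 403

  def test_plants_accepts_valid_api_key():
      response = client.get("/plants", params={"access_token": "Team6"})
      assert response.status_code == 200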

Run pytest (or pytest -v for verbose output) to execute all the test cases in test_main.py.

Locust Load Test

Setup:

  1. Install the libraries:

     pip install locustio==0.14.6
     pip install greenlet==0.4.16

  2. Run locust --help to verify the installation.

Process:

  1. Make a .py file similar to locust_load_test.py for the load test (a sketch is shown below).
  2. Run the file:

     locust -f query_locust.py

  3. Open the browser and enter the following URL:

     http://localhost:8089/

  4. Fill in the Number of users to simulate, Hatch rate, and Host, then click Start swarming.
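
A minimal sketch of such a file in the locustio==0.14.6 style; the task and endpoint are illustrative, and locust_load_test.py in the repository is the real load test:

  from locust import HttpLocust, TaskSet, task, between

  class ApiTasks(TaskSet):
      @task
      def get_plants(self):
          # Hit a protected endpoint, passing the API key as a query parameter
          self.client.get("/plants?access_token=Team6")

  class ApiUser(HttpLocust):
      task_set = ApiTasks
      wait_time = between(1, 3)  # each simulated user waits 1-3 seconds between tasks

The Host field filled in on the Locust web UI (for example http://localhost:8080) tells Locust which running FastAPI instance to swarm.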

Team Members:

  1. Nidhi Goyal
  2. Kanika Damodarsingh Negi
  3. Rishvita Reddy Bhumireddy

Citation: