Official Azure Reference Architectures for AI workloads
This repository contains the recommended ways to train and deploy machine learning models on Azure. The tutorials range from running massively parallel hyperparameter tuning with Hyperdrive to deploying deep learning models on Kubernetes. Each tutorial takes you step by step through training or deploying your model. If you are unsure which service to use and when, see the FAQ below.
For further documentation on the reference architectures please look here.
This repository is arranged as submodules, so you can pull either all the tutorials or only the ones you want.
To pull all the tutorials simply run:
git clone --recurse-submodules https://github.com/Microsoft/AIReferenceArchitectures.git
If you have a Git version older than 2.13, run:
git clone --recursive https://github.com/Microsoft/AIReferenceArchitectures.git
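To pull only one tutorial rather than all of them, one approach is to clone the parent repository without its submodules and then initialise just the submodule you need. The following is a minimal sketch, not an official script: the helper name `clone_one_tutorial` and the tutorial path argument are assumptions for illustration. After a plain clone, the real submodule paths can be read from `.gitmodules` or listed with `git submodule status`.

```shell
# Sketch: clone the parent repository, then fetch a single tutorial submodule.
# clone_one_tutorial is a hypothetical helper, not part of this repository.
#   $1 = parent repository URL
#   $2 = submodule path inside the parent repository (look it up in .gitmodules)
#   $3 = destination directory (optional, defaults to AIReferenceArchitectures)
clone_one_tutorial() (
  repo_url="$1"
  tutorial_path="$2"
  dest="${3:-AIReferenceArchitectures}"

  # A plain clone creates the submodule directories but leaves them empty.
  git clone "$repo_url" "$dest" &&
    # Initialise and check out only the requested submodule.
    git -C "$dest" submodule update --init --recursive -- "$tutorial_path"
)

# Example call (replace <tutorial-path> with a real path from .gitmodules):
# clone_one_tutorial https://github.com/Microsoft/AIReferenceArchitectures.git <tutorial-path>
```

The function body runs in a subshell, so its variables do not leak into your session, and the submodule step is skipped if the clone fails.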
Tutorial | Environment | Description | Status |
---|---|---|---|
Deploy Deep Learning Model on Kubernetes | Python GPU | Deploy image classification model on Kubernetes or IoT Edge for real-time scoring using Azure ML | |
Deploy Classic ML Model on Kubernetes | Python CPU | Train LightGBM model locally using Azure ML, deploy on Kubernetes or IoT Edge for real-time scoring | |
Hyperparameter Tuning of Classical ML Models | Python CPU | Train LightGBM model locally and run Hyperparameter tuning using Hyperdrive in Azure ML | |
Deploy Deep Learning Model on Pipelines | Python GPU | Deploy PyTorch style transfer model for batch scoring using Azure ML Pipelines | |
Deploy Classic ML Model on Pipelines | Python CPU | Deploy one-class SVM for batch scoring anomaly detection using Azure ML Pipelines | |
Deploy R ML Model on Kubernetes | R CPU | Deploy ML model for real-time scoring on Kubernetes | |
Deploy R ML Model on Batch | R CPU | Deploy forecasting model for batch scoring using Azure Batch and doAzureParallel | |
Deploy Spark ML Model on Databricks | Spark CPU | Deploy a classification model for batch scoring using Databricks | |
Train Distributed Deep Learning Model | Python GPU | Distributed training of ResNet50 model using Batch AI | |
The tutorials have been tested mainly on Linux VMs in Azure. Requirements vary slightly between tutorials; for example, some of the deep learning ones need a GPU. For details, consult the README in each tutorial.
Please report issues with a tutorial on that tutorial's own GitHub page.
If there is a particular scenario you would like to see a tutorial for, please fill in a scenario suggestion.
We are constantly developing AI reference architectures using Microsoft AI Platform. Ongoing projects include IoT Edge scenarios and model scoring on mobile devices, among others. To follow the progress and any new reference architectures, please go to the AI section of this link.
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct.
For more information see the Code of Conduct FAQ or
contact opencode@microsoft.com with any additional questions or comments.