Project author: kubeflow-kale

Project description:
Kubeflow’s superfood for Data Scientists
Language: Python
Repository: git://github.com/kubeflow-kale/kale.git
Created: 2019-01-24T17:58:44Z
Project community: https://github.com/kubeflow-kale/kale

License: Apache License 2.0

KALE (Kubeflow Automated pipeLines Engine) is a project that aims to simplify
the Data Science experience of deploying Kubeflow Pipelines workflows.

Kubeflow is a great platform for orchestrating complex workflows on top of
Kubernetes, and Kubeflow Pipelines provides the means to create reusable
components that can be executed as part of workflows. The self-service nature
of Kubeflow makes it extremely appealing for Data Science use, as it provides
easy access to advanced distributed job orchestration, reusability of
components, Jupyter Notebooks, rich UIs and more. Still, developing and
maintaining Kubeflow workflows can be hard for data scientists, who may not be
experts in working with orchestration platforms and related SDKs.
Additionally, data science often involves processes of data exploration,
iterative modelling and interactive environments (mostly Jupyter notebooks).

Kale bridges this gap by providing a simple UI to define Kubeflow Pipelines
workflows directly from your JupyterLab interface, without the need to change a
single line of code.

Read more about Kale and how it works in this Medium post:
Automating Jupyter Notebook Deployments to Kubeflow Pipelines with Kale

Getting started

Install the Kale backend from PyPI and the JupyterLab extension. You can find a
set of curated Notebooks in the examples repository.

  # install kale
  pip install kubeflow-kale
  # install jupyter lab
  pip install "jupyterlab>=2.0.0,<3.0.0"
  # install the extension
  jupyter labextension install kubeflow-kale-labextension
  # verify extension status
  jupyter labextension list
  # run
  jupyter lab
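
The install command above pins JupyterLab to the 2.x series, so it can help to confirm which version is actually installed before adding the extension. The following is a minimal sketch and not part of Kale: the function name jlab_version_supported is hypothetical, and it inspects only the major version number, on the assumption that any 2.x release satisfies the >=2.0.0,<3.0.0 constraint.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Kale): succeed only when the given
# version string belongs to the JupyterLab 2.x series.
jlab_version_supported() {
  local major="${1%%.*}"   # text before the first dot, e.g. "2" from "2.2.9"
  [ "$major" = "2" ]
}

# Query the real installation only when the jupyter command is on PATH.
if command -v jupyter >/dev/null 2>&1; then
  installed="$(jupyter lab --version 2>/dev/null || true)"
  if jlab_version_supported "$installed"; then
    echo "JupyterLab $installed is in the supported range"
  else
    echo "JupyterLab '$installed' is outside >=2.0.0,<3.0.0" >&2
  fi
fi
```

If the check reports an unsupported version, rerunning the pinned pip install command from the steps above should bring JupyterLab back into range.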

To build images to be used as a NotebookServer in Kubeflow, refer to the
Dockerfile in the docker folder.

FAQ

Head over to FAQ to read about some known issues and some of the
limitations imposed by the Kale data marshalling model.

Resources

Contribute

Backend

Create a new Python virtual environment with Python >= 3.6. Then:

  cd backend/
  # quote the extras so shells such as zsh do not expand the brackets
  pip install -e ".[dev]"
  # run tests
  pytest -x -vv
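
The virtual environment itself can be created in several ways. The sketch below shows one possibility under stated assumptions: both function names are hypothetical, the .venv directory name is an arbitrary choice rather than something Kale mandates, and python3 is assumed to be the interpreter you want to use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a "major.minor[.patch]" version
# string is at least 3.6.
python_version_ok() {
  local major minor rest
  major="${1%%.*}"      # "3" from "3.8.10"
  rest="${1#*.}"        # "8.10"
  minor="${rest%%.*}"   # "8"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
}

# Hypothetical helper: check the interpreter version, then create and
# activate a virtual environment in the given directory (default: .venv).
make_backend_venv() {
  local dir="${1:-.venv}"
  local version
  version="$(python3 -c 'import platform; print(platform.python_version())')" || return 1
  if ! python_version_ok "$version"; then
    echo "Python $version is older than 3.6" >&2
    return 1
  fi
  python3 -m venv "$dir" && . "$dir/bin/activate"
}
```

Calling make_backend_venv from the repository root before running the backend steps above would leave the environment active in the current shell.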

Labextension

The JupyterLab Python package comes with its own yarn wrapper, called jlpm.
While using the previously installed venv, install JupyterLab by running:

  pip install "jupyterlab>=2.0.0,<3.0.0"

You can then run the following to install the Kale extension:

  cd labextension/
  # install dependencies from package.lock
  jlpm install
  # build extension
  jlpm run build
  # list installed jp extensions
  jlpm labextension list
  # install Kale extension
  jlpm labextension install .
  # for development:
  # build and watch
  jlpm run watch
  # in another shell, run JupyterLab in watch mode
  jupyter lab --no-browser --watch

Git Hooks

This repository uses husky to set up git hooks.

For husky to function properly, yarn must be installed and available in your
PATH. This is required because husky is installed via jlpm install, and jlpm
is a yarn wrapper. (Similarly, if it were installed using the npm package
manager, npm would have to be in PATH.)
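
A quick way to confirm the PATH requirement before committing is to look the tool up with the shell's command -v builtin. This is only an illustrative sketch: the missing_tools function is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed tool that cannot be
# found in PATH, one per line. Prints nothing when all tools are present.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# husky needs yarn to be reachable from the shell that runs the git hooks.
missing="$(missing_tools yarn)"
if [ -n "$missing" ]; then
  echo "Not found in PATH: $missing" >&2
fi
```

If yarn is reported as missing, the hooks will likely fail when git invokes them, even though jlpm itself works inside the virtual environment.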

Currently installed git hooks:

  • pre-commit: Run a prettier check on staged files, using
    pretty-quick