Project author: criteo

Project description:
A fromconfig Launcher for MlFlow
Language: Python
Project URL: git://github.com/criteo/fromconfig-mlflow.git
Created: 2021-04-22T19:19:49Z



FromConfig MlFlow


A fromconfig Launcher for MlFlow support.


    pip install fromconfig_mlflow


To activate MlFlow logging, simply add --launcher.log=mlflow to your command

    fromconfig config.yaml params.yaml --launcher.log=mlflow - model - train



With model.py

    """Dummy Model."""

    import mlflow


    class Model:
        def __init__(self, learning_rate: float):
            self.learning_rate = learning_rate

        def train(self):
            print(f"Training model with learning_rate {self.learning_rate}")
            # Log to MlFlow only when a run is active
            if mlflow.active_run():
                mlflow.log_metric("learning_rate", self.learning_rate)

config.yaml

    model:
      _attr_: model.Model
      learning_rate: "${params.learning_rate}"

params.yaml

    params:
      learning_rate: 0.001

It should print

    Started run:
    Training model with learning_rate 0.001

If you navigate to the MlFlow run URL, you should see the logged learning_rate metric.
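
The trailing `- model - train` in the command asks fromconfig to build the `model` entry of the config and call its `train` method. The sketch below imitates that step in plain Python to make the mechanics visible. It is a simplification under stated assumptions: `instantiate` is a hypothetical helper and not fromconfig's API, the `model` module is faked in-process instead of being a real model.py, and the MlFlow call is left out.

```python
import importlib
import sys
import types


class Model:
    """Same dummy model as in model.py, without the MlFlow call."""

    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        message = f"Training model with learning_rate {self.learning_rate}"
        print(message)
        return message  # Returned only so the sketch is easy to check


# Stand-in for a real model.py on the import path (assumption of this sketch).
fake_module = types.ModuleType("model")
fake_module.Model = Model
sys.modules["model"] = fake_module


def instantiate(entry: dict):
    """Hypothetical helper: import the `_attr_` target, call it with the other keys."""
    module_name, _, attr_name = entry["_attr_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr_name)
    kwargs = {key: value for key, value in entry.items() if key != "_attr_"}
    return target(**kwargs)


# The config once "${params.learning_rate}" has been replaced by its value.
parsed = {"model": {"_attr_": "model.Model", "learning_rate": 0.001}}
model = instantiate(parsed["model"])
result = model.train()
```

The `_attr_` key names what to import, and every other key in the entry becomes a keyword argument.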

MlFlow server

To set up a local MlFlow tracking server, run

    mlflow server

which should print

    [INFO] Starting gunicorn 20.0.4
    [INFO] Listening at:

From now on, we will assume that the tracking URI is the address printed after "Listening at:".

Configure MlFlow

You can set the tracking URI either via an environment variable or via the config.

To set the MLFLOW_TRACKING_URI environment variable, run

    export MLFLOW_TRACKING_URI=<MLFLOW_TRACKING_URI>

Alternatively, you can set the mlflow.tracking_uri config key, either via the command line with

    fromconfig config.yaml params.yaml --launcher.log=mlflow --mlflow.tracking_uri="" - model - train

or in a config file launcher.yaml with

    # Configure mlflow
    mlflow:
      # tracking_uri: ""  # Or set env variable MLFLOW_TRACKING_URI
      # experiment_name: "test-experiment"  # Which experiment to use
      # run_id: 12345  # To restore a previous run
      # run_name: test  # To give a name to your new run
      # artifact_location: "path/to/artifacts"  # Used only when creating a new experiment

    # Configure launcher
    launcher:
      log: mlflow

and run

    fromconfig config.yaml params.yaml launcher.yaml - model - train

Artifacts and Parameters

In this example, we add logging of the config and parameters.

Re-using the quickstart code, modify the launcher.yaml file

    # Configure logging
    logging:
      level: 20

    # Configure mlflow
    mlflow:
      # tracking_uri: ""  # Or set env variable MLFLOW_TRACKING_URI
      # experiment_name: "test-experiment"  # Which experiment to use
      # run_id: 12345  # To restore a previous run
      # run_name: test  # To give a name to your new run
      # artifact_location: "path/to/artifacts"  # Used only when creating a new experiment
      # include_keys:  # Only log params that match *model*
      #   - model

    # Configure launcher
    launcher:
      log:
        - logging
        - mlflow
      parse:
        - mlflow.log_artifacts
        - parser
        - mlflow.log_params

and run

    fromconfig config.yaml params.yaml launcher.yaml - model - train

which prints

    INFO:fromconfig_mlflow.launcher:Started run:<MLFLOW_RUN_ID>
    Training model with learning_rate 0.001

If you navigate to the MlFlow run URL, you should see

  • the parameters, a flattened version of the parsed config (model.learning_rate is 0.001 and not ${params.learning_rate})
  • the original config, saved as config.yaml
  • the parsed config, saved as parsed.yaml
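
To illustrate why the logged parameter is 0.001 rather than the `${params.learning_rate}` reference, here is a rough sketch of the two steps involved: resolving references, then flattening the nested config into dotted names. The helpers `lookup`, `resolve` and `flatten` are hypothetical and written for this example only; fromconfig's own parser is more capable and is not reproduced here.

```python
import re

# Matches a string that is exactly one reference, e.g. "${params.learning_rate}".
REFERENCE = re.compile(r"^\$\{([^}]+)\}$")


def lookup(root: dict, dotted_key: str):
    """Follow a dotted key such as "params.learning_rate" through nested dicts."""
    value = root
    for part in dotted_key.split("."):
        value = value[part]
    return value


def resolve(node, root: dict):
    """Replace every "${...}" reference by the value it points to."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, root) for item in node]
    if isinstance(node, str):
        match = REFERENCE.match(node)
        if match:
            return resolve(lookup(root, match.group(1)), root)
    return node


def flatten(config: dict, prefix: str = "") -> dict:
    """Turn nested dicts into a single dict with dotted names."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# config.yaml and params.yaml merged into one dictionary.
config = {
    "model": {"_attr_": "model.Model", "learning_rate": "${params.learning_rate}"},
    "params": {"learning_rate": 0.001},
}
parsed = resolve(config, config)
params = flatten(parsed)
```

In this sketch, `config` plays the role of the original config.yaml artifact, `parsed` that of parsed.yaml, and `params` that of the logged parameters.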



To configure MlFlow, add an mlflow entry to your config and set the following parameters

  • run_id: if you wish to resume an existing run
  • run_name: if you wish to give a name to your new run
  • tracking_uri: to configure the tracking remote
  • experiment_name: to use a different experiment than the default
  • artifact_location: the location of the artifacts (config files), used only when creating a new experiment

Additionally, the mlflow launcher can be initialized with the following attributes

  • set_env_vars: if True (default is True), set MLFLOW_RUN_ID and MLFLOW_TRACKING_URI
  • set_run_id: if True (default is False), set mlflow.run_id in config.

For example,

    # Configure logging
    logging:
      level: 20

    # Configure mlflow
    mlflow:
      # tracking_uri: ""  # Or set env variable MLFLOW_TRACKING_URI
      # experiment_name: "test-experiment"  # Which experiment to use
      # run_id: 12345  # To restore a previous run
      # run_name: test  # To give a name to your new run
      # artifact_location: "path/to/artifacts"  # Used only when creating a new experiment

    # Configure Launcher
    launcher:
      log:
        - logging
        - _attr_: mlflow
          set_env_vars: true
          set_run_id: true
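
The effect of the two attributes can be pictured with the following sketch. It only mirrors the description above and is not the launcher's actual code: `apply_run_options` is a hypothetical helper, and the run id and tracking URI passed to it are placeholder values.

```python
import copy
import os


def apply_run_options(config, run_id, tracking_uri, set_env_vars=True, set_run_id=False):
    """Hypothetical helper mirroring the set_env_vars and set_run_id options."""
    if set_env_vars:
        # Make the run visible to code that reads these environment variables.
        os.environ["MLFLOW_RUN_ID"] = run_id
        os.environ["MLFLOW_TRACKING_URI"] = tracking_uri
    if set_run_id:
        # Work on a copy so the caller's config is left untouched.
        config = copy.deepcopy(config)
        mlflow_section = config.get("mlflow") or {}
        mlflow_section["run_id"] = run_id
        config["mlflow"] = mlflow_section
    return config


# "mlflow" is None here because every key under it is commented out in the YAML.
config = {"mlflow": None, "model": {"learning_rate": 0.001}}
updated = apply_run_options(
    config,
    run_id="12345",  # placeholder run id
    tracking_uri="http://localhost:5000",  # placeholder URI, not from the source
    set_env_vars=True,
    set_run_id=True,
)
```

With set_run_id enabled, anything that later receives the config can read mlflow.run_id from it instead of relying on the environment.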


The mlflow.log_artifacts launcher can be initialized with the following attributes

  • path_command: Name for the command file. If None, don’t log the command.
  • path_config: Name for the config file. If None, don’t log the config.

For example,

    # Configure logging
    logging:
      level: 20

    # Configure mlflow
    mlflow:
      # tracking_uri: ""  # Or set env variable MLFLOW_TRACKING_URI
      # experiment_name: "test-experiment"  # Which experiment to use
      # run_id: 12345  # To restore a previous run
      # run_name: test  # To give a name to your new run
      # artifact_location: "path/to/artifacts"  # Used only when creating a new experiment

    # Configure launcher
    launcher:
      log:
        - logging
        - mlflow
      parse:
        - _attr_: mlflow.log_artifacts
          path_command: launch.sh
          path_config: config.yaml
        - parser
        - _attr_: mlflow.log_artifacts
          path_command: null
          path_config: parsed.yaml
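
The configuration above runs the artifact step twice, once before and once after the parser. A sketch of what each pass would write, assuming a hypothetical helper `write_artifacts` and a temporary directory in place of the MlFlow artifact store, could look like this. JSON is used for serialization only to stay within the standard library; it does not reflect how the launcher actually writes the files.

```python
import json
import tempfile
from pathlib import Path


def write_artifacts(directory, config, command, path_command, path_config):
    """Hypothetical helper: write the command and config, skipping any path set to None."""
    written = []
    if path_command is not None:
        target = Path(directory) / path_command
        target.write_text(command + "\n")
        written.append(target.name)
    if path_config is not None:
        target = Path(directory) / path_config
        # JSON stands in for YAML in this sketch.
        target.write_text(json.dumps(config, indent=2) + "\n")
        written.append(target.name)
    return written


command = "fromconfig config.yaml params.yaml launcher.yaml - model - train"
original = {"model": {"learning_rate": "${params.learning_rate}"}, "params": {"learning_rate": 0.001}}
parsed = {"model": {"learning_rate": 0.001}, "params": {"learning_rate": 0.001}}

with tempfile.TemporaryDirectory() as directory:
    # First pass: command and original config.
    before_parsing = write_artifacts(directory, original, command, "launch.sh", "config.yaml")
    # Second pass: parsed config only, since path_command is None.
    after_parsing = write_artifacts(directory, parsed, command, None, "parsed.yaml")
    files = sorted(path.name for path in Path(directory).iterdir())
```

Setting path_command to null on the second pass avoids logging the same command file twice.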


The mlflow.log_params launcher uses include_keys and ignore_keys if they are present under the mlflow key of the config.

  • ignore_keys: if given, parameters whose flattened name contains any of these substrings are not logged.
  • include_keys: if given, only parameters whose flattened name contains one of these substrings are logged. The flattened name is also shortened to start at the first match. For example, if the config is {"foo": {"bar": 1}} and include_keys=("bar",), then the logged parameter will be "bar".

For example,

    # Configure logging
    logging:
      level: 20

    # Configure mlflow
    mlflow:
      # tracking_uri: ""  # Or set env variable MLFLOW_TRACKING_URI
      # experiment_name: "test-experiment"  # Which experiment to use
      # run_id: 12345  # To restore a previous run
      # run_name: test  # To give a name to your new run
      # artifact_location: "path/to/artifacts"  # Used only when creating a new experiment
      include_keys:  # Only log params that match *model*
        - model

    # Configure launcher
    launcher:
      log:
        - logging
        - mlflow
      parse:
        - parser
        - mlflow.log_params
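
The substring rules for include_keys and ignore_keys can be sketched as a small filter over already-flattened parameter names. `select_params` is a hypothetical helper written for this example and is not the library's implementation. In particular, applying ignore_keys before include_keys is an assumption of the sketch.

```python
def select_params(flat_params, include_keys=None, ignore_keys=None):
    """Hypothetical filter over flattened parameters, following the rules described above."""
    selected = {}
    for name, value in flat_params.items():
        # Drop any parameter whose name contains an ignored substring.
        if ignore_keys and any(key in name for key in ignore_keys):
            continue
        if include_keys:
            positions = [name.index(key) for key in include_keys if key in name]
            if not positions:
                continue
            # Shorten the name so that it starts at the first match.
            name = name[min(positions):]
        selected[name] = value
    return selected


flat = {
    "model._attr_": "model.Model",
    "model.learning_rate": 0.001,
    "params.learning_rate": 0.001,
}
only_model = select_params(flat, include_keys=("model",))
without_params = select_params(flat, ignore_keys=("params",))
shortened = select_params({"foo.bar": 1}, include_keys=("bar",))
```

With the quickstart config, include_keys of model keeps the two model.* entries and drops params.learning_rate, while the {"foo": {"bar": 1}} case yields the single parameter "bar".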