Project author: nhatthai

Project description:
Integrate Amazon SageMaker into the Kedro pipeline.
Language: Python
Repository: git://github.com/nhatthai/kedro-aws-sagemaker.git
Created: 2021-01-25T07:42:48Z
Project community: https://github.com/nhatthai/kedro-aws-sagemaker



Kedro AWS SageMaker

  • Integrate Amazon SageMaker into the Kedro pipeline.
  • Build machine learning pipelines in Kedro while taking advantage of the power of SageMaker for potentially compute-intensive machine learning tasks.

Prerequisites

  • Kedro 0.16.6
  • S3 bucket & SageMaker
  • scikit-learn 0.23.0
  • pickle5 0.0.11

Issues

  • SageMaker training fails when the S3 bucket is in a different region from the training job

    File "/usr/local/lib/python3.7/site-packages/kedro/pipeline/node.py", line 433, in run
      raise exc
    File "/usr/local/lib/python3.7/site-packages/kedro/pipeline/node.py", line 424, in run
      outputs = self._run_with_list(inputs, self._inputs)
    File "/usr/local/lib/python3.7/site-packages/kedro/pipeline/node.py", line 471, in _run_with_list
      return self._decorated_func(*[inputs[item] for item in node_inputs])
    File "/Users/nhatthai/Code/kedro-aws-sagemaker/example/src/example/pipelines/data_science/nodes.py", line 104, in train_model_sagemaker
      sklearn_estimator.fit(inputs=inputs, wait=True)
    File "/usr/local/lib/python3.7/site-packages/sagemaker/estimator.py", line 657, in fit
      self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
    File "/usr/local/lib/python3.7/site-packages/sagemaker/estimator.py", line 1420, in start_new
      estimator.sagemaker_session.train(**train_args)
    File "/usr/local/lib/python3.7/site-packages/sagemaker/session.py", line 562, in train
      self.sagemaker_client.create_training_job(**train_request)
    File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
      return self._make_api_call(operation_name, kwargs)
    File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 676, in _make_api_call
      raise error_class(parsed_response, operation_name)
    botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateTrainingJob operation:
    No S3 objects found under S3 URL "s3://kedro-data" given in input data source.
    Please ensure that the bucket exists in the selected region (us-east-1),
    that objects exist under that S3 prefix,
    and that the role "arn:aws:iam::783560535431:role/SageMaker-ExecRole" has "s3:ListBucket" permissions on bucket "kedro-data".
    Error message from S3: The bucket is in this region: ap-southeast-1.
    Please use this region to retry the request

    The training job was created in us-east-1, the current default region, while the bucket is in ap-southeast-1.

    • Fix: set region = ap-southeast-1 in the ~/.aws/config file.
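
    The fix above can also be applied from a script. The following is a minimal sketch, not part of the original project: the helper name set_default_region is hypothetical, and it assumes a simple config file with a [default] profile. It uses only the Python standard library (configparser), which rewrites the file without preserving comments, so back up a hand-edited config first.

```python
import configparser
from pathlib import Path


def set_default_region(region: str,
                       config_path: Path = Path.home() / ".aws" / "config") -> None:
    """Set `region` in the [default] profile of an AWS config file.

    Hypothetical helper for illustration; other keys in the file are kept,
    but comments are dropped because configparser does not preserve them.
    """
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file simply yields an empty parser

    # AWS uses a lowercase [default] section, which configparser treats as a
    # normal section (only the uppercase name DEFAULT is special).
    if not parser.has_section("default"):
        parser.add_section("default")
    parser.set("default", "region", region)

    config_path.parent.mkdir(parents=True, exist_ok=True)
    with open(config_path, "w") as handle:
        parser.write(handle)


if __name__ == "__main__":
    # Demonstrate on a scratch file instead of the real ~/.aws/config.
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "config"
        scratch.write_text("[default]\nregion = us-east-1\n")
        set_default_region("ap-southeast-1", scratch)
        print(scratch.read_text())
```

    As an alternative to editing the file, the AWS SDKs also read the region from the AWS_DEFAULT_REGION environment variable, which is convenient for a one-off run.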

Results

  • Kedro Visualise Pipelines [screenshot: Kedro Viz]

  • Kedro AWS SageMaker [screenshot: Kedro SageMaker]

  • Amazon SageMaker Completed [screenshot]

  • Amazon SageMaker Detail [screenshot]

References