How can we orchestrate and automate data movement and data transformation in an AWS SageMaker pipeline?


I'm migrating our ML notebooks from Azure Databricks to AWS, using SageMaker and Step Functions. I have separate notebooks for data processing, feature engineering, and the ML algorithms, and I want each one to run only after the previous notebook has completed. Can you point me to any resource that shows how to execute SageMaker notebooks in sequence using AWS Step Functions?

CodePudding user response:

For this type of architecture you need to involve some other AWS services as well. A helpful combination is EventBridge (scheduled rules) triggering a Lambda function, which in turn calls SageMaker to execute your notebooks.
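As a rough illustration of that chain, here is a minimal sketch of what the Lambda handler could look like. It assumes the notebook work runs on a SageMaker notebook instance whose on-start lifecycle configuration executes the notebook (for example with papermill) and stops the instance when done; the instance name is a placeholder, not something from the original answer.

```python
# Hypothetical glue code: EventBridge (scheduled rule) -> Lambda -> SageMaker.
# Assumes a stopped notebook instance named "ml-processing-notebook" whose
# lifecycle configuration runs the notebook on start and shuts the instance
# down afterwards. All names here are placeholders.
import boto3

sagemaker = boto3.client("sagemaker")

def lambda_handler(event, context):
    # Starting the instance triggers its on-start lifecycle script,
    # which is where the actual notebook execution happens.
    sagemaker.start_notebook_instance(
        NotebookInstanceName="ml-processing-notebook"
    )
    return {"statusCode": 200, "body": "Notebook instance start requested"}
```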

CodePudding user response:

A new feature lets you operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs. Unfortunately, there is no way yet to tie them together into a pipeline.
The other alternative would be to convert your notebooks to processing and training jobs, and use something like AWS Step Functions or SageMaker Pipelines to run them as a pipeline, along the lines of the sketch below.
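Here is a minimal SageMaker Pipelines sketch of that alternative, assuming the three notebooks have been converted to scripts (preprocess.py, features.py, train.py). The role ARN, image URI, S3 paths, and instance types are placeholders; the ordering comes from each step consuming the previous step's output.

```python
# Sketch only: a three-step SageMaker pipeline mirroring the original notebooks.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder
session = sagemaker.session.Session()

processor = SKLearnProcessor(
    framework_version="1.2-1", role=role,
    instance_type="ml.m5.xlarge", instance_count=1,
)

# Step 1: data processing (was the first notebook).
process_step = ProcessingStep(
    name="DataProcessing",
    processor=processor,
    code="preprocess.py",
    outputs=[ProcessingOutput(output_name="clean",
                              source="/opt/ml/processing/output")],
)

# Step 2: feature engineering, reading step 1's output (was the second notebook).
feature_step = ProcessingStep(
    name="FeatureEngineering",
    processor=processor,
    code="features.py",
    inputs=[ProcessingInput(
        source=process_step.properties.ProcessingOutputConfig
                .Outputs["clean"].S3Output.S3Uri,
        destination="/opt/ml/processing/input",
    )],
    outputs=[ProcessingOutput(output_name="features",
                              source="/opt/ml/processing/output")],
)

# Step 3: training on step 2's output (was the ML algorithm notebook).
estimator = Estimator(
    image_uri="<your-training-image-uri>",  # placeholder
    role=role, instance_type="ml.m5.xlarge", instance_count=1,
    output_path="s3://my-bucket/model-artifacts",  # placeholder
)
train_step = TrainingStep(
    name="ModelTraining",
    estimator=estimator,
    inputs={"train": TrainingInput(
        feature_step.properties.ProcessingOutputConfig
            .Outputs["features"].S3Output.S3Uri
    )},
)

pipeline = Pipeline(
    name="NotebookMigrationPipeline",
    steps=[process_step, feature_step, train_step],
    sagemaker_session=session,
)
pipeline.upsert(role_arn=role)
pipeline.start()
```

Because the feature and training steps reference the properties of the step before them, SageMaker Pipelines infers the dependency graph and runs them strictly in sequence, which matches the run-after-completion behaviour you had with the Databricks notebooks.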
