Kick-start your path to production with a project template
Earlier this year, I published a step-by-step guide to automating an end-to-end ML lifecycle with built-in SageMaker MLOps project templates and MLflow. It brought workflow orchestration, model registry, and CI/CD under one umbrella to reduce the effort of running end-to-end MLOps projects.
In this post, we will go a step further and define an MLOps project template based on GitHub, GitHub Actions, MLflow, and SageMaker Pipelines that you can reuse across multiple projects to accelerate your ML delivery.
We will take an example model trained using Random Forest on the California House Prices dataset, and automate its end-to-end lifecycle until deployment into a real-time inference service.
We will tackle this in five steps:
- First, we will set up a development environment with an IDE and an MLflow tracking server, and connect GitHub Actions to your AWS account.
- Second, I will show how you can experiment and collaborate easily with your team members.
- Third, we will package code into containers and run it in scalable SageMaker jobs.
- Then, we will automate the model build workflow with a SageMaker Pipeline and schedule it to run once a week.
- Finally, we will deploy a real-time inference service in your account with a GitHub Actions-based CI/CD pipeline.
Note: There can be more to MLOps than deploying inference services (e.g., data labeling, data versioning, model monitoring), and this template should give you enough structure to tailor it to your needs.
Step 1: Setting up your project environment
We will use the following components in the project:
- SageMaker for container-based jobs, model hosting, and ML pipelines.
- MLflow for experiment tracking and model registry.
- API Gateway for exposing our inference endpoint behind an API.
- GitHub as the code repository, with GitHub Actions for CI/CD and ML pipeline scheduling.
If you work in an enterprise, this setup may be done for you by IT admins.
Working from your favourite IDE
For productivity, make sure you work from an IDE you are comfortable with. Here, I host VS Code on a SageMaker Notebook Instance and will also use SageMaker Studio to visualize the ML pipeline.
See Hosting VS Code on SageMaker for install instructions.
Setting up a central MLflow tracking server
We need a central MLflow tracking server to collaborate on experiments and register models. If you don’t have one, you can follow the instructions in the blog post on deploying the open source version of MLflow on AWS Fargate.
You can also swap MLflow for native SageMaker options, Weights & Biases, or any other tool of your choice.
Connecting GitHub Actions to your AWS account
Next, we will use OpenID Connect (OIDC) to allow GitHub Actions workflows to access resources in your account, without needing to store AWS credentials as long-lived GitHub secrets. See Configuring OpenID Connect in Amazon Web Services for instructions.
You can set up a github-actions role with the following trust relationships:
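As a rough illustration, the trust policy below is built as a Python dictionary so it can be passed to IAM programmatically. The account ID, repository name, and exact condition keys are placeholders — adapt them to your environment.

```python
import json

# Hypothetical trust policy for the github-actions role: it trusts the GitHub
# OIDC provider (scoped to one repository) plus the SageMaker, Lambda, and
# API Gateway service principals discussed above.
ACCOUNT_ID = "111122223333"   # placeholder AWS account ID
REPO = "your-org/your-repo"   # placeholder GitHub repository

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Lets GitHub Actions workflows assume the role via OIDC
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": f"repo:{REPO}:*"
                }
            },
        },
        {
            # Lets SageMaker, Lambda, and API Gateway use the same role
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "sagemaker.amazonaws.com",
                    "lambda.amazonaws.com",
                    "apigateway.amazonaws.com",
                ]
            },
            "Action": "sts:AssumeRole",
        },
    ],
}

print(json.dumps(trust_policy, indent=2))
```

You can paste the printed JSON into the role's trust relationship in the IAM console, or pass it to `iam.create_role` with boto3.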
We add SageMaker as a principal so we can run jobs and pipelines directly from GitHub workflows, and do the same for Lambda and API Gateway.
For illustrative purposes, I attached the AdministratorAccess managed policy to the role. Make sure you tighten permissions in your environment.
Setting up GitHub repo secrets for the project
Finally, we store the AWS account ID, region name, and github-actions role ARN as secrets in the GitHub repo. This information can be sensitive and will be used securely by your GitHub workflows. See Encrypted secrets for details.
We are now ready to go!
Step 2: Experimenting and collaborating in your project
You can find the experiment folder in the repo with example notebooks and scripts. It is typically the place where you start the project and try to figure out approaches to your ML problem.
Below is the main notebook showing how to train a model with Random Forest on the California House Prices dataset, and do basic prediction:
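A minimal sketch of the notebook's training step is shown below. The real notebook uses `sklearn.datasets.fetch_california_housing` (which downloads the dataset on first use); here a synthetic regression dataset of the same shape stands in so the sketch runs offline.

```python
# Train a Random Forest regressor and run a basic prediction, mirroring
# the experiment notebook. The synthetic dataset is a stand-in for the
# California House Prices data (8 features, continuous target).
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)   # basic prediction on held-out data
score = model.score(X_test, y_test)   # R^2 on the test split
```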
It is a simple example to follow in our end-to-end project, and you can run pip install -r requirements.txt to work with the same dependencies as your team members.
This experimental phase of the ML project can be fairly unstructured, and you can decide with your team how you want to organize the sub-folders. Whether you use notebooks or Python scripts is also totally up to you.
You can save local data and files in the model folders. I have added them to .gitignore so you don’t end up pushing big files to GitHub.
Structuring your repo for easy collaboration
You can structure your repo any way you want. Just keep in mind that ease of use and reproducibility are key for productivity in your project. Here, I have put the whole project in a single repo and tried to strike a balance between Python project conventions and MLOps needs.
You can find below the folder structure with descriptions:
Tracking experiments with MLflow
You can track experiment runs with MLflow, whether you run code in your IDE or in SageMaker jobs. Here, I log runs under a shared project experiment so team members can compare results.
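The snippet below is a minimal sketch of what a tracked run looks like; the tracking URI, experiment name, parameters, and metrics are placeholders for your own values.

```python
import os

# Placeholder tracking URI and assumed experiment name -- adapt to your setup.
TRACKING_URI = os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000")
EXPERIMENT = "california-housing"

params = {"n_estimators": 100, "max_depth": 10}
metrics = {"rmse": 0.52}

def log_run(params: dict, metrics: dict) -> None:
    """Log one experiment run to the central MLflow tracking server."""
    # mlflow is imported here so the sketch loads even without it installed
    import mlflow

    mlflow.set_tracking_uri(TRACKING_URI)
    mlflow.set_experiment(EXPERIMENT)
    with mlflow.start_run():
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
```

Call `log_run(params, metrics)` from your notebooks or scripts once your tracking server is reachable.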
You can also find example labs in this repo for reference.
Step 3: Moving from local compute to container-based jobs in SageMaker
Running code locally can work in early project stages. However, at some point you will want to package dependencies into reproducible Docker images, and use SageMaker to run scalable, container-based jobs. I recommend reading A lift and shift approach for getting started with Amazon SageMaker if this sounds new to you.
Breaking down the workflow into jobs
You can break down your project workflow into steps. Here, we split ours into two: we run data processing in SageMaker Processing jobs, and model training in SageMaker Training jobs.
Building containers and pushing them to ECR
The Dockerfiles for our jobs are in the docker folder and you can run the following shell command to push the images to ECR.
sh scripts/build_and_push.sh <ecr-repo-name> <dockerfile-folder>
Using configuration files in the project
To prevent hardcoding, we need a place to hold our jobs’ parameters. Those parameters can include container image URIs, MLflow tracking server URI, entry point script location, instance types, hyperparameters to use in your code running in SageMaker jobs.
We will use model_build.yaml for this. Its YAML structure makes it easy to extend and maintain over time. Make sure to add your MLflow server URI and freshly pushed container image URIs to the config before running jobs.
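To illustrate, here is a hypothetical model_build.yaml loaded with PyYAML; the exact keys in the repo may differ, and the URIs are placeholders.

```python
import yaml

# A hypothetical model_build.yaml -- key names and values are illustrative.
example_config = """
mlflow:
  tracking_uri: http://my-mlflow-server:5000
processing:
  image_uri: 111122223333.dkr.ecr.eu-west-1.amazonaws.com/processing:latest
  instance_type: ml.m5.xlarge
training:
  image_uri: 111122223333.dkr.ecr.eu-west-1.amazonaws.com/training:latest
  instance_type: ml.m5.xlarge
  hyperparameters:
    n_estimators: 100
    max_depth: 10
"""

# Jobs and pipelines read their parameters from this parsed config
config = yaml.safe_load(example_config)
```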
Running containerized jobs in SageMaker
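As a sketch of what launching such a job looks like with boto3, the request below submits a SageMaker Processing job; the job name, image URI, script path, and role ARN are placeholders.

```python
# Build the request for a containerized SageMaker Processing job.
# All names and ARNs below are illustrative placeholders.
processing_request = {
    "ProcessingJobName": "california-housing-processing",
    "AppSpecification": {
        "ImageUri": "111122223333.dkr.ecr.eu-west-1.amazonaws.com/processing:latest",
        "ContainerEntrypoint": ["python3", "/opt/ml/code/preprocess.py"],
    },
    "ProcessingResources": {
        "ClusterConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",
            "VolumeSizeInGB": 30,
        }
    },
    "RoleArn": "arn:aws:iam::111122223333:role/github-actions",
}

def run_processing_job(request: dict) -> None:
    """Submit the job -- requires AWS credentials, so it is not run here."""
    import boto3

    sm = boto3.client("sagemaker")
    sm.create_processing_job(**request)
```

Training jobs follow the same pattern with `create_training_job` and the training image from the config.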
Step 4: Automating your model building
So you have successfully experimented locally and run workflow steps as container-based jobs in SageMaker. Now we want to automate this process. Let’s call it the model_build process, as it relates to everything happening before a model version is registered in MLflow.
We want to automate the container image building, tie our ML workflow steps into a pipeline, and automate the pipeline creation into SageMaker. There are different triggers we could apply on the pipeline, and we will trigger its executions based on a schedule defined in a GitHub workflow.
Building container images automatically with GitHub workflows
The workflow looks at Dockerfiles in the docker folder and triggers when changes occur on the repo main branch. Under the hood, it uses a composite GitHub action that takes care of logging into ECR, building, and pushing the images.
The workflow also tags the container images based on the GitHub commit to ensure traceability and reproducibility of your ML workflow steps.
Tying our ML workflow steps into a pipeline in SageMaker
Next, we define a pipeline in SageMaker to run our workflow steps. You can find our pipeline in the src/model_build folder. It runs the processing step, gets its output data location, and triggers a training step. As with the standalone jobs, pipeline executions use parameters defined in our model_build.yaml.
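A condensed sketch of such a two-step pipeline with the SageMaker Python SDK is shown below. It mirrors the structure described above but is not the exact code in src/model_build; step names and config keys are illustrative.

```python
def build_pipeline(role_arn: str, config: dict):
    """Define a processing -> training SageMaker Pipeline.
    Imports are local so the sketch loads without the sagemaker SDK installed."""
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    from sagemaker.processing import Processor, ProcessingOutput
    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.steps import ProcessingStep, TrainingStep

    processor = Processor(
        role=role_arn,
        image_uri=config["processing"]["image_uri"],
        instance_count=1,
        instance_type=config["processing"]["instance_type"],
    )
    step_process = ProcessingStep(
        name="Processing",
        processor=processor,
        outputs=[ProcessingOutput(output_name="train",
                                  source="/opt/ml/processing/train")],
    )

    estimator = Estimator(
        role=role_arn,
        image_uri=config["training"]["image_uri"],
        instance_count=1,
        instance_type=config["training"]["instance_type"],
    )
    # The training step consumes the processing step's output S3 location
    step_train = TrainingStep(
        name="Training",
        estimator=estimator,
        inputs={
            "train": TrainingInput(
                s3_data=step_process.properties.ProcessingOutputConfig
                .Outputs["train"].S3Output.S3Uri
            )
        },
    )

    return Pipeline(name="model-build-pipeline",
                    steps=[step_process, step_train])
```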
I have added scripts/submit_pipeline.py to the repo to help you create or update the pipeline in SageMaker on demand. It is also handy for debugging and running the pipeline when needed.
Scheduling our SageMaker Pipeline with GitHub Actions
We schedule our pipeline with the schedule-pipeline GitHub workflow. It uses a cron expression to run the pipeline at 12:00 on Fridays.
This basic scheduling example can work for some of your use cases; feel free to adjust the pipeline triggers as you see fit. You may also want to point the model_build configuration to a place where new data comes in.
After each pipeline execution you will see a new model version appear in MLflow. Those are model versions we want to deploy into production.
Step 5: Deploying your inference service into production
Now that we have model versions regularly coming into the model registry, we can deploy them into production. This is the model_deploy process.
Our real-time inference service
We will build a real-time inference service for our project. For this, we want to get model artifacts from the model registry, build an MLflow inference container, and deploy them into a SageMaker endpoint. We will expose our endpoint via a Lambda function and API that a client can call for predictions.
In case you need to run predictions in batch, you can build an ML inference pipeline using the same approach as we took for model_build.
Pushing the inference container image to ECR
Alongside the ML model, we need a container image to handle the inference in our SageMaker Endpoint. Let’s push the one provided by MLflow into ECR.
Defining our API stack with CDK
We use CDK to deploy our inference infrastructure and define our stack in the model_deploy folder. app.py is our main stack file. You will see that it reads the model_deploy config and creates the SageMaker endpoint, a Lambda function acting as a request proxy, and an API using API Gateway.
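To give an idea of the request proxy, here is a minimal sketch of a Lambda handler that forwards an API request to the SageMaker endpoint; the endpoint name is a placeholder.

```python
import json

ENDPOINT_NAME = "california-housing-endpoint"  # placeholder endpoint name

def build_endpoint_payload(body: str) -> bytes:
    """Validate the incoming API body and re-encode it for the endpoint."""
    features = json.loads(body)
    return json.dumps(features).encode("utf-8")

def lambda_handler(event, context):
    """Proxy an API Gateway request to the SageMaker endpoint."""
    # boto3 is available in the Lambda runtime; imported lazily here
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=build_endpoint_payload(event["body"]),
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": prediction}
```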
Make sure you update your model_deploy config with container image and MLflow tracking server URIs before deploying.
Deploying into production with a multi-stage CI/CD pipeline
We use a trunk-based approach to deploy our inference API into production. Essentially, we use a multi-stage GitHub workflow hooked to the repo main branch to build, test, and deploy our inference service.
The CI/CD workflow is defined in deploy-inference and has 4 steps:
- build reads a chosen model version binary from MLflow (defined in config), and uploads its model.tar.gz to S3. This is done by mlflow_handler, which also saves the model S3 location in AWS SSM for use in later CI/CD stages. The MLflow handler also transitions the model into Staging in the model registry.
- deploy-staging deploys the CDK stack into staging so we can run tests on the API before going into production. The job uses a composite GitHub action I have built for deploying CDK templates into AWS.
- test-api does basic testing of the inference service in Staging. It sends an example payload to the API and checks if the response status is OK. If OK, the MLflow handler will transition the model into Production in the model registry. Feel free to add more tests as you see fit.
- deploy-prod deploys the CDK stack into production.
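The test-api stage can be as simple as the sketch below, which posts an example payload to the staging API and checks for an HTTP 200 response; the URL and payload are placeholders.

```python
import json
import urllib.request

def build_request(api_url: str, payload: dict) -> urllib.request.Request:
    """Build the POST request the smoke test sends to the staging API."""
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def smoke_test(api_url: str, payload: dict) -> bool:
    """Return True if the staging API answers with HTTP 200."""
    with urllib.request.urlopen(build_request(api_url, payload), timeout=10) as resp:
        return resp.status == 200
```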
Using your inference service
When your service is successfully deployed into production, you can navigate to the AWS CloudFormation console, look at the stack Outputs, and copy your API URL.
You are now ready to call your inference API and can use the following example data point in the request body:
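For instance, a request body could look like the dictionary below, using the eight feature names from scikit-learn's California Housing dataset (values taken from its first row):

```python
# Example request body for the inference API; feature names follow
# sklearn.datasets.fetch_california_housing.
example_payload = {
    "MedInc": 8.3252,
    "HouseAge": 41.0,
    "AveRooms": 6.9841,
    "AveBedrms": 1.0238,
    "Population": 322.0,
    "AveOccup": 2.5556,
    "Latitude": 37.88,
    "Longitude": -122.23,
}
```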
You can use tools like Postman to test the inference API from your computer:
In this post, I have shared with you an MLOps project template putting experiment tracking, workflow orchestration, model registry, and CI/CD under one umbrella. Its key goal is to reduce the effort of running end-to-end MLOps projects and accelerate your delivery.
It uses GitHub, GitHub Actions, MLflow, and SageMaker Pipelines, and you can reuse it across multiple projects to accelerate production delivery.
To go further in your learning, you can visit Awesome SageMaker, where you can find in a single place all the relevant and up-to-date resources needed for working with SageMaker.