Kubeflow Demo
This post will help you run your custom PyTorch Lightning training loop - followed by model serving - in a Kubeflow pipeline. Please see the previous post on how to get Kubeflow up and running. Make sure port-forwarding to the services is set up and that you have SSH'd into the instances.
You can see the entire notebook with all the code/steps here.
Step 1: Create a Notebook
First, we will create a notebook. We will use a custom-made public ECR image for this. Although one is already available at public.ecr.aws/j1r0q0g6/notebooks/notebook-servers/jupyter-pytorch-full:v1.5.0, a custom image has been built that uses the latest versions of PyTorch (plus Lightning) as well as kfp. More instructions can be found here.
- Name the notebook: cifar-intel
- Use the custom image: public.ecr.aws/x6u1q5c1/kflow/pytorch:cpu-1.13.1
- CPU: 2, Memory: 4 GiB (adjust as you see fit)
- In Advanced Options -> Configurations, select ‘Allow access to Kubeflow Pipelines’
Step 2: Writing Functions
We will import the necessary libraries.
```python
import kfp
import kfp.components as components
import requests
import kfp.dsl as dsl
from typing import NamedTuple
```
The last import, NamedTuple, is needed for logging outputs to the dashboard.
We will create several functions for a modular approach. Note that these functions will not be executed immediately; they will be run as part of a pipeline later.
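As an illustration of that logging pattern, a component function can return a NamedTuple whose mlpipeline_metrics field carries a JSON blob that the dashboard renders. This is only a minimal sketch; the metric name and value below are placeholders.

```python
from typing import NamedTuple

def report_metrics() -> NamedTuple('Outputs', [('mlpipeline_metrics', 'Metrics')]):
    # Returning a field named 'mlpipeline_metrics' makes the values
    # show up in the run view of the Kubeflow Pipelines dashboard.
    import json
    from collections import namedtuple

    metrics = {
        'metrics': [
            {'name': 'accuracy', 'numberValue': 0.91, 'format': 'PERCENTAGE'},  # placeholder value
        ]
    }
    return namedtuple('Outputs', ['mlpipeline_metrics'])(json.dumps(metrics))
```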
Get Data
This function gets data from a publicly accessible URL. It downloads the data into a Minio bucket.
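A minimal sketch of such a function is shown below. The Minio endpoint and credentials are Kubeflow's in-cluster defaults, and the bucket and object names are illustrative; adjust them to your own setup.

```python
def get_data(url: str, bucket: str = "dataset"):
    # Download a file from a public URL and push it into a Minio bucket.
    import requests
    from minio import Minio

    local_path = "/tmp/data.pkl"
    resp = requests.get(url, stream=True)
    resp.raise_for_status()
    with open(local_path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)

    # Kubeflow's default in-cluster Minio service and credentials (assumption).
    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    if not client.bucket_exists(bucket):
        client.make_bucket(bucket)
    client.fput_object(bucket, "raw/data.pkl", local_path)
```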
Preprocess Data
This function fetches data from the Minio bucket and creates train/test splits. It then pushes the splits back to the Minio bucket.
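A sketch of the idea, assuming the previous step stored the raw data as a pickled (images, labels) pair and keeping the split logic deliberately simple:

```python
def preprocess_data(bucket: str = "dataset"):
    # Fetch the raw data from Minio, split into train/test, and push the splits back.
    import pickle
    from minio import Minio

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "raw/data.pkl", "/tmp/data.pkl")

    with open("/tmp/data.pkl", "rb") as f:
        images, labels = pickle.load(f)

    # Simple 80/20 split; use a proper shuffled split in practice.
    cut = int(0.8 * len(images))
    splits = {
        "train.pkl": (images[:cut], labels[:cut]),
        "test.pkl": (images[cut:], labels[cut:]),
    }
    for name, data in splits.items():
        with open(f"/tmp/{name}", "wb") as f:
            pickle.dump(data, f)
        client.fput_object(bucket, f"splits/{name}", f"/tmp/{name}")
```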
Model Train
This function is responsible for training the model. Before it runs, however, we download the training code from an external site into the Minio bucket (this download could also be done inside the function itself). The function first fetches the train/test splits and then runs the training loop. Note that in this step we generate a TorchScript model and upload it to our Minio bucket.
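Below is a trimmed sketch of that flow, with a deliberately tiny Lightning module standing in for the real network. The Minio defaults and object names follow the earlier sketches and are not the post's actual training code.

```python
def train_model(bucket: str = "dataset"):
    # Fetch the train split, run a short PyTorch Lightning loop,
    # export a TorchScript model, and upload it to Minio.
    import pickle

    import pytorch_lightning as pl
    import torch
    import torch.nn.functional as F
    from minio import Minio
    from torch.utils.data import DataLoader, TensorDataset

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "splits/train.pkl", "/tmp/train.pkl")
    with open("/tmp/train.pkl", "rb") as f:
        x_train, y_train = pickle.load(f)

    # Assumes 3x32x32 CIFAR-style images; the real post uses a proper network.
    ds = TensorDataset(torch.tensor(x_train, dtype=torch.float32),
                       torch.tensor(y_train, dtype=torch.long))
    loader = DataLoader(ds, batch_size=64, shuffle=True)

    class LitClassifier(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Flatten(),
                torch.nn.Linear(3 * 32 * 32, 256), torch.nn.ReLU(),
                torch.nn.Linear(256, 10))

        def forward(self, x):
            return self.net(x)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return F.cross_entropy(self(x), y)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)

    model = LitClassifier()
    pl.Trainer(max_epochs=1).fit(model, loader)

    # Export to TorchScript and push to the bucket.
    torch.jit.script(model.net).save("/tmp/model.pt")
    client.fput_object(bucket, "models/model.pt", "/tmp/model.pt")
```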
Model Generation
This function can be used to convert a model from one format (e.g. PyTorch) to another (e.g. ONNX, TensorRT, etc.). Here, we simply convert the TorchScript model generated in the previous step to the MAR format used by TorchServe.
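A sketch of that conversion using the torch-model-archiver CLI is below. The model name, the bucket layout (a model-store/ folder plus a config/ folder with config.properties, which is the layout the KServe TorchServe runtime expects), and the config contents are all illustrative.

```python
def generate_mar(bucket: str = "dataset"):
    # Package the TorchScript model into a .mar archive and upload it,
    # together with a TorchServe config.properties, in the layout KServe expects.
    import subprocess
    from minio import Minio

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "models/model.pt", "/tmp/model.pt")

    subprocess.run([
        "torch-model-archiver",
        "--model-name", "cifar34",
        "--version", "1.0",
        "--serialized-file", "/tmp/model.pt",
        "--handler", "image_classifier",   # built-in TorchServe handler
        "--export-path", "/tmp",
        "--force",
    ], check=True)

    # Minimal, illustrative TorchServe config; see the TorchServe/KServe docs
    # for the full set of required options.
    with open("/tmp/config.properties", "w") as f:
        f.write("inference_address=http://0.0.0.0:8085\n"
                "management_address=http://0.0.0.0:8085\n"
                "model_store=/mnt/models/model-store\n"
                "load_models=all\n")

    client.fput_object(bucket, "mar/model-store/cifar34.mar", "/tmp/cifar34.mar")
    client.fput_object(bucket, "mar/config/config.properties", "/tmp/config.properties")
```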
Model Serve
This creates a KServe endpoint that serves our PyTorch model using TorchServe. We provide it with the location of our model, along with a service account name that holds our access keys for the Minio storage.
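One way to create the endpoint from inside a pipeline step is via the kserve Python SDK, sketched below. The namespace, service account name, and S3 storage URI are placeholders; the service account must carry the annotations/secrets pointing at your Minio endpoint.

```python
def serve_model():
    # Create a KServe InferenceService backed by TorchServe.
    from kubernetes import client as k8s_client
    from kserve import (KServeClient, constants, V1beta1InferenceService,
                        V1beta1InferenceServiceSpec, V1beta1PredictorSpec,
                        V1beta1TorchServeSpec)

    isvc = V1beta1InferenceService(
        api_version=constants.KSERVE_V1BETA1,
        kind=constants.KSERVE_KIND,
        metadata=k8s_client.V1ObjectMeta(
            name="cifar34",
            namespace="kubeflow-user-example-com"),
        spec=V1beta1InferenceServiceSpec(
            predictor=V1beta1PredictorSpec(
                # Service account holding the Minio/S3 access keys (placeholder name).
                service_account_name="sa-minio-kserve",
                pytorch=V1beta1TorchServeSpec(
                    # Points at the folder containing model-store/ and config/.
                    storage_uri="s3://dataset/mar"))))

    KServeClient().create(isvc)
```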
Step 3: Building the Pipeline
We then convert our functions into ‘components’ which - as the name suggests - are the building blocks of our pipeline. We use the same base image as the one used for our notebook (although a more lightweight image would also work). We can also specify additional libraries to be installed before the function runs, e.g. torch-model-archiver for the model generation function.
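Using the function names from the sketches above (they are illustrative, not the post's exact names), the conversion looks roughly like this:

```python
BASE_IMAGE = "public.ecr.aws/x6u1q5c1/kflow/pytorch:cpu-1.13.1"

get_data_op = components.create_component_from_func(get_data, base_image=BASE_IMAGE)
preprocess_op = components.create_component_from_func(preprocess_data, base_image=BASE_IMAGE)
train_op = components.create_component_from_func(train_model, base_image=BASE_IMAGE)
generate_mar_op = components.create_component_from_func(
    generate_mar, base_image=BASE_IMAGE,
    packages_to_install=["torch-model-archiver"])
serve_op = components.create_component_from_func(
    serve_model, base_image=BASE_IMAGE,
    packages_to_install=["kserve"])
```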
We then create the pipeline by adding a @dsl.pipeline decorator to a function that defines the sequence of steps. The after method enforces the execution order of the steps in our pipeline.
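A sketch of such a pipeline function, reusing the component names from above (the pipeline name and url parameter are illustrative):

```python
@dsl.pipeline(name="cifar-pipeline", description="Train and serve a CIFAR classifier")
def output_test(url: str):
    step1 = get_data_op(url)

    step2 = preprocess_op()
    step2.after(step1)

    step3 = train_op()
    step3.after(step2)

    step4 = generate_mar_op()
    step4.after(step3)

    step5 = serve_op()
    step5.after(step4)
```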
Step 4: Executing the Pipeline
We can then run our pipeline via
```python
client.create_run_from_pipeline_func(output_test, arguments=arguments, experiment_name="intel-pt")
```
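The client and arguments referenced in that call are created beforehand in the notebook; for example (the url value is a placeholder for your dataset's URL):

```python
client = kfp.Client()
arguments = {"url": "<public URL of the dataset>"}
```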
Step 5: Inference
To view the model endpoint, go to Endpoints on the left-hand sidebar. You will see a model named cifar34. Make sure it has a green checkmark next to it before proceeding.
First, we can check the status of our model:

```
!curl http://cifar34.kubeflow-user-example-com.svc.cluster.local/v1/models/cifar34
```

This will give an output like this:

```
{"name": "cifar34", "ready": true}
```
We can download a sample image, convert it to base64, and make a POST request to the endpoint.
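A sketch of that request, run from inside the notebook. The sample image URL is just an example, and the exact payload schema depends on the TorchServe handler baked into the .mar file:

```python
import base64
import requests

# Any small image will do; this one ships with the TorchServe examples.
img_url = "https://raw.githubusercontent.com/pytorch/serve/master/examples/image_classifier/kitten.jpg"
img_bytes = requests.get(img_url).content

# KServe v1 prediction protocol: {"instances": [...]}
payload = {"instances": [{"data": base64.b64encode(img_bytes).decode("utf-8")}]}

resp = requests.post(
    "http://cifar34.kubeflow-user-example-com.svc.cluster.local/v1/models/cifar34:predict",
    json=payload)
print(resp.json())
```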
For the associated files, visit the repo.