Kubeflow Demo
This post will help you run your custom PyTorch Lightning training loop - followed by model serving - in a Kubeflow pipeline. Please see the previous post on how to get Kubeflow up and running. Make sure port-forwarding to the services is set up and that you have SSH'd into the instances.
You can see the entire notebook with all the code/steps here.
Step 1: Create a Notebook
First, we will create a notebook. We will use a custom-made public ECR image for this. Although one is already available at public.ecr.aws/j1r0q0g6/notebooks/notebook-servers/jupyter-pytorch-full:v1.5.0, a custom image has been built that uses the latest versions of PyTorch (plus Lightning) as well as kfp. More instructions can be found here.
- Name the notebook: cifar-intel
- Use the custom image: public.ecr.aws/x6u1q5c1/kflow/pytorch:cpu-1.13.1
- CPU: 2, Memory: 4 GiB (adjust as you see fit)
- In Advanced Options -> Configurations, select ‘Allow access to Kubeflow Pipelines’
Step 2: Writing Functions
We will import the necessary libraries.
```python
import kfp
import kfp.components as components
import requests
import kfp.dsl as dsl
from typing import NamedTuple
```
The last import, NamedTuple, is needed for logging outputs to the dashboard.
We will create several functions for a modular approach. Note that these functions will not be executed immediately; they will be run as part of a pipeline later.
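As an illustration of that logging pattern, a component function can return a NamedTuple whose mlpipeline_metrics field carries a JSON blob that the dashboard renders. This is only a minimal sketch; the metric name and value below are placeholders.

```python
from typing import NamedTuple

def report_metrics() -> NamedTuple('Outputs', [('mlpipeline_metrics', 'Metrics')]):
    # Returning a field named 'mlpipeline_metrics' makes the values
    # show up in the run view of the Kubeflow Pipelines dashboard.
    import json
    from collections import namedtuple

    metrics = {
        'metrics': [
            {'name': 'accuracy', 'numberValue': 0.91, 'format': 'PERCENTAGE'},  # placeholder value
        ]
    }
    return namedtuple('Outputs', ['mlpipeline_metrics'])(json.dumps(metrics))
```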
Get Data
This function gets data from a publicly accessible URL. It downloads the data into a Minio bucket.
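A minimal sketch of such a function is shown below. The Minio endpoint and credentials are Kubeflow's in-cluster defaults, and the bucket and object names are illustrative; adjust them to your own setup.

```python
def get_data(url: str, bucket: str = "dataset"):
    # Download a file from a public URL and push it into a Minio bucket.
    import requests
    from minio import Minio

    local_path = "/tmp/data.pkl"
    resp = requests.get(url, stream=True)
    resp.raise_for_status()
    with open(local_path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)

    # Kubeflow's default in-cluster Minio service and credentials (assumption).
    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    if not client.bucket_exists(bucket):
        client.make_bucket(bucket)
    client.fput_object(bucket, "raw/data.pkl", local_path)
```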
Preprocess Data
This function fetches data from the Minio bucket and creates train/test splits. It then pushes the splits back to the Minio bucket.
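A sketch of the idea, assuming the previous step stored the raw data as a pickled (images, labels) pair and keeping the split logic deliberately simple:

```python
def preprocess_data(bucket: str = "dataset"):
    # Fetch the raw data from Minio, split into train/test, and push the splits back.
    import pickle
    from minio import Minio

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "raw/data.pkl", "/tmp/data.pkl")

    with open("/tmp/data.pkl", "rb") as f:
        images, labels = pickle.load(f)

    # Simple 80/20 split; use a proper shuffled split in practice.
    cut = int(0.8 * len(images))
    splits = {
        "train.pkl": (images[:cut], labels[:cut]),
        "test.pkl": (images[cut:], labels[cut:]),
    }
    for name, data in splits.items():
        with open(f"/tmp/{name}", "wb") as f:
            pickle.dump(data, f)
        client.fput_object(bucket, f"splits/{name}", f"/tmp/{name}")
```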
Model Train
This function is responsible for training the model. Before it runs, however, we download the training code from an external site into the Minio bucket (this download could also be done inside the function itself). The function first fetches the train/test splits and then runs the training loop. Note that in this step we generate a TorchScript model and upload it to our Minio bucket.
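Below is a trimmed sketch of that flow, with a deliberately tiny Lightning module standing in for the real network. The Minio defaults and object names follow the earlier sketches and are not the post's actual training code.

```python
def train_model(bucket: str = "dataset"):
    # Fetch the train split, run a short PyTorch Lightning loop,
    # export a TorchScript model, and upload it to Minio.
    import pickle

    import pytorch_lightning as pl
    import torch
    import torch.nn.functional as F
    from minio import Minio
    from torch.utils.data import DataLoader, TensorDataset

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "splits/train.pkl", "/tmp/train.pkl")
    with open("/tmp/train.pkl", "rb") as f:
        x_train, y_train = pickle.load(f)

    # Assumes 3x32x32 CIFAR-style images; the real post uses a proper network.
    ds = TensorDataset(torch.tensor(x_train, dtype=torch.float32),
                       torch.tensor(y_train, dtype=torch.long))
    loader = DataLoader(ds, batch_size=64, shuffle=True)

    class LitClassifier(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Flatten(),
                torch.nn.Linear(3 * 32 * 32, 256), torch.nn.ReLU(),
                torch.nn.Linear(256, 10))

        def forward(self, x):
            return self.net(x)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return F.cross_entropy(self(x), y)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)

    model = LitClassifier()
    pl.Trainer(max_epochs=1).fit(model, loader)

    # Export to TorchScript and push to the bucket.
    torch.jit.script(model.net).save("/tmp/model.pt")
    client.fput_object(bucket, "models/model.pt", "/tmp/model.pt")
```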
Model Generation
This function can be used to convert a model from one format (e.g. PyTorch) to another (e.g. ONNX, TensorRT, etc.). Here, we simply convert the TorchScript model generated in the previous step to the MAR format used by TorchServe.
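A sketch of that conversion using the torch-model-archiver CLI is below. The model name, the bucket layout (a model-store/ folder plus a config/ folder with config.properties, which is the layout the KServe TorchServe runtime expects), and the config contents are all illustrative.

```python
def generate_mar(bucket: str = "dataset"):
    # Package the TorchScript model into a .mar archive and upload it,
    # together with a TorchServe config.properties, in the layout KServe expects.
    import subprocess
    from minio import Minio

    client = Minio("minio-service.kubeflow:9000",
                   access_key="minio", secret_key="minio123", secure=False)
    client.fget_object(bucket, "models/model.pt", "/tmp/model.pt")

    subprocess.run([
        "torch-model-archiver",
        "--model-name", "cifar34",
        "--version", "1.0",
        "--serialized-file", "/tmp/model.pt",
        "--handler", "image_classifier",   # built-in TorchServe handler
        "--export-path", "/tmp",
        "--force",
    ], check=True)

    # Minimal, illustrative TorchServe config; see the TorchServe/KServe docs
    # for the full set of required options.
    with open("/tmp/config.properties", "w") as f:
        f.write("inference_address=http://0.0.0.0:8085\n"
                "management_address=http://0.0.0.0:8085\n"
                "model_store=/mnt/models/model-store\n"
                "load_models=all\n")

    client.fput_object(bucket, "mar/model-store/cifar34.mar", "/tmp/cifar34.mar")
    client.fput_object(bucket, "mar/config/config.properties", "/tmp/config.properties")
```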
Model Serve
This creates a KServe endpoint that serves our PyTorch model using TorchServe. We provide it with the location of our model, along with a service account name that holds our access keys for the Minio storage.
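One way to create the endpoint from inside a pipeline step is via the kserve Python SDK, sketched below. The namespace, service account name, and S3 storage URI are placeholders; the service account must carry the annotations/secrets pointing at your Minio endpoint.

```python
def serve_model():
    # Create a KServe InferenceService backed by TorchServe.
    from kubernetes import client as k8s_client
    from kserve import (KServeClient, constants, V1beta1InferenceService,
                        V1beta1InferenceServiceSpec, V1beta1PredictorSpec,
                        V1beta1TorchServeSpec)

    isvc = V1beta1InferenceService(
        api_version=constants.KSERVE_V1BETA1,
        kind=constants.KSERVE_KIND,
        metadata=k8s_client.V1ObjectMeta(
            name="cifar34",
            namespace="kubeflow-user-example-com"),
        spec=V1beta1InferenceServiceSpec(
            predictor=V1beta1PredictorSpec(
                # Service account holding the Minio/S3 access keys (placeholder name).
                service_account_name="sa-minio-kserve",
                pytorch=V1beta1TorchServeSpec(
                    # Points at the folder containing model-store/ and config/.
                    storage_uri="s3://dataset/mar"))))

    KServeClient().create(isvc)
```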
Step 3: Building the Pipeline
We then convert our functions into ‘components’ which - as the name suggests - are the building blocks of our pipeline. We use the same base image as the one used for our notebook (although a more lightweight image would also work). We can also specify additional libraries to be installed before the function runs, e.g. torch-model-archiver for the model generation function.
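Using the function names from the sketches above (they are illustrative, not the post's exact names), the conversion looks roughly like this:

```python
BASE_IMAGE = "public.ecr.aws/x6u1q5c1/kflow/pytorch:cpu-1.13.1"

get_data_op = components.create_component_from_func(get_data, base_image=BASE_IMAGE)
preprocess_op = components.create_component_from_func(preprocess_data, base_image=BASE_IMAGE)
train_op = components.create_component_from_func(train_model, base_image=BASE_IMAGE)
generate_mar_op = components.create_component_from_func(
    generate_mar, base_image=BASE_IMAGE,
    packages_to_install=["torch-model-archiver"])
serve_op = components.create_component_from_func(
    serve_model, base_image=BASE_IMAGE,
    packages_to_install=["kserve"])
```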
We then create the pipeline by adding a @dsl.pipeline decorator to a function that defines the sequence of steps. The after method enforces the execution order of the steps in our pipeline.
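A sketch of such a pipeline function, reusing the component names from above (the pipeline name and url parameter are illustrative):

```python
@dsl.pipeline(name="cifar-pipeline", description="Train and serve a CIFAR classifier")
def output_test(url: str):
    step1 = get_data_op(url)

    step2 = preprocess_op()
    step2.after(step1)

    step3 = train_op()
    step3.after(step2)

    step4 = generate_mar_op()
    step4.after(step3)

    step5 = serve_op()
    step5.after(step4)
```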
Step 4: Executing the Pipeline
We can then run our pipeline via
```python
client.create_run_from_pipeline_func(output_test, arguments=arguments, experiment_name="intel-pt")
```
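The client and arguments referenced in that call are created beforehand in the notebook; for example (the url value is a placeholder for your dataset's URL):

```python
client = kfp.Client()
arguments = {"url": "<public URL of the dataset>"}
```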
Step 5: Inference
To view the model endpoint, go to Endpoints on the left-hand sidebar. You will see a model named cifar34. Make sure it has a green checkmark next to it before proceeding.
First, we can check the status of our model:

```
!curl http://cifar34.kubeflow-user-example-com.svc.cluster.local/v1/models/cifar34
```

This will give an output like this:

```
{"name": "cifar34", "ready": true}
```
We can download a sample image, convert it to base64, and make a POST request to the endpoint.
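A sketch of that request, run from inside the notebook. The sample image URL is just an example, and the exact payload schema depends on the TorchServe handler baked into the .mar file:

```python
import base64
import requests

# Any small image will do; this one ships with the TorchServe examples.
img_url = "https://raw.githubusercontent.com/pytorch/serve/master/examples/image_classifier/kitten.jpg"
img_bytes = requests.get(img_url).content

# KServe v1 prediction protocol: {"instances": [...]}
payload = {"instances": [{"data": base64.b64encode(img_bytes).decode("utf-8")}]}

resp = requests.post(
    "http://cifar34.kubeflow-user-example-com.svc.cluster.local/v1/models/cifar34:predict",
    json=payload)
print(resp.json())
```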
For the associated files, visit the repo.