Tutorial - From training to inference - Classification#

This notebook contains all the necessary information needed to run an end-to-end test of OctaiPipe, training a model on the cloud and deploying it out to your own edge devices.

We will develop our pipelines on the OctaiPipe Jupyter server, and use the OctaiPipe Pythonic interface for deployments. We will also make use of the OctaiPipe web portal to register edge devices, monitor and query deployments and trained models.

In order to run the notebook, you will need to have access to one or more devices to deploy to. In this example, we will use a portion of the C-MAPSS dataset, which we prepared in csv files as part of the device registration package. The C-MAPSS dataset contains simulated sensor data from aeroengines as they undergo degradation. The machine learning task is to predict the failure imminence (0, 1, and 2) of a given aeroengine— a multi-class classification problem, using the sensor telemetry data. We use the sensor data from a single aeroengine for training, and that from a different one for inference. You will copy these data files to the test edge devices to run this tutorial, as instructed below. You can also use a dataset of your choice, but this would require adjusting the input and output data specs in the configuration files.

The notebook helps you do the following:

Copy C-MAPSS tutorial files to a registered device for training
Preprocess data and train a model on the edge with OctaiPipe
Finding model version for inference
Deploy model out to edge devices for inference
Remove deployment from device(s)
Clean up

Register devices#

We will deploy our example training pipeline and inference pipeline to a test edge device. If you haven’t already, register your edge devices/computers via OctaiPipe web portal. Search “register devices” in the documentation. This example uses local csv files as data source, but you can configure any data source you prefer. Note down the device_id of the device on which to run this tutorial; in what follows we call it device_0 for definiteness.

Copy C-MAPSS tutorial files to a registered device for training#

We download the octaipipe_e2e_tutorials.zip file from the Tutorials folder on the OctaiPipe Notebook Server. Copy the file to the (registered) test edge device and unzip. In what follows, we assume the destination folder is /home/octaipipe/octaipipe_e2e_tutorials, if that is not where you have located the data on your device, you will have to update the paths in each of the example configs. The data files in this tutorial is contained in octaipipe_e2e_tutorials/cmapss_cat, wherein the train sub-folder contains the training set data_cmapss_cat.csv, sensor data from a single aeroengine— the column RUL_cat is the target label, and the inference contains the inference data, sensor data from a different aeroengine. The target label RUL_cat has 3 classes indicating aeroengine failure imminence: 0, 1, 2, with 0 being the most imminent. The config files referenced below are also included in the configs sub-folder, to be used on the OctaiPipe Jupyter server (not on the edge devices).

Note: The container will have to have write permission to ~/octaipipe_e2e_tutorials. OctaiPipe images run as the octaipipe user; you can configure the permissions yourself or if you are not familiar with linux permissions, you can set lax permissions on ~/octaipipe_e2e_tutorials and its sub directories like so:

chmod -R 777 ~/octaipipe_e2e_tutorials

Preprocess data and train a model#

Now that we have set up our device and have the training dataset on the device, we can build a simple OctaiPipe pipeline which preprocesses the data, and trains a model on the preprocessed data. We will be working on the OctaiPipe Jupyter server, and using OctaiPipe’s Deployments Pythonic inferface to deploy the training pipeline to the device.

To deploy and run an OctaiPipe PipelineStep on an edge device, we need to define two main things: the step name and the path to the configuration file. To get an example configuration file, we use the OctaiPipe function octaipipe.develop.get_example_edge_deployment_config() which takes the step name as input.

For information on these pipeline steps and their configs, see the documentation for the preprocessing step and model training step. If you wish to get information on available OctaiPipe pipeline steps or how to make your own pipeline steps see the relevant documentation.

We use OctaiPipe’s native Preprocessing step to scale the features using sklearn’s min-max scaler. To do this, in the preprocessing step config file, we specify preprocessing_specs.steps to include normalise, and preprocessing_specs.preprocessors_specs to specify the type and the name of the scaler to be fitted. An example is provided below.

./configs/preprocessing.yml

name: preprocessing

# local input data source
input_data_specs:
  default:
    - datastore_type: local
      settings:
        query_type: csv
        query_config:
          filepath_or_buffer: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/train/data_cmapss_cat.csv
          index_col: "_time"  # set timestamp '_time' column as index


# local output data source
output_data_specs:
  default:
    - datastore_type: local
      settings:
        file_path: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/train/data_cmapss_cat_preprocessed.csv
        write_config:
            index: True

run_specs:
  save_results: True
  target_label: RUL_cat
  label_type: "int"
  onnx_pred: False
  train_val_test_split:
    to_split: false
    split_ratio:
      training: 0.6
      validation: 0.2
      testing: 0.2

preprocessing_specs:
  steps:
    - normalise
  preprocessors_specs:
    - type: minmax_scaler
      load_existing: False
      name: scaler_example
  degradation_model: pw_linear
  initial_RUL: 125

We use OctaiPipe’s native Model Training step to train a lightGBM classification model on the scaled training set that is the output of the preprocessing step. To do this, we specify model_specs, in particular the model type lgb_class_sk, name and the model parameters passed to the model (in this example random_state=42). An example is provided below.

./configs/model_training.yml

name: model_training

# local input data source
input_data_specs:
  default:
    - datastore_type: local
      settings:
        query_type: csv
        query_config:
          filepath_or_buffer: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/train/data_cmapss_cat_preprocessed.csv  # output of the preprocessing step
          index_col: "_time"  # set timestamp '_time' column as index to resemble data pulled from influxdb

# local output data source
output_data_specs:
  default:
    - datastore_type: local
      settings:
        file_path: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/train/data_cmapss_cat_trained.csv
        write_config:
            index: True

model_specs:
  type: lgb_class_sk
  load_existing: false
  name: model_example
  model_load_specs:
    version: '1.0'
  params:
    random_state: 42

run_specs:
  save_results: True
  target_label: RUL_cat
  grid_search:
    do_grid_search: false
    grid_definition: ./configs/model_grids/lgb_class_sk_grid.yml
    metric: 'mse'

After filling out the preprocessing and model_training configs to fit our data, we save them as ./configs/preprocessing.yml and ./configs/model_training.yml. If you’re using your own dataset or if you have configured the C-MAPSS data before adding to your device, these config files would need to be amended.

Once we have set up the config files, we can deploy the training pipeline to the edge device using OctaiPipe’s Deployment interface. We will deploy the preprocessing step and model training steps to our test device in sequence. To this end, we fill out the deploymeny config:

./configs/deployment_config_training_1.yml

name: deployment_config

device_ids: ['device_0']  # replace with registered device_id

image_name: octaipipe.azurecr.io/octaipipe-all_data_loaders:latest

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN
    - INFLUX_DEFAULT_BUCKET
  env_file:
    # - ./credentials/.env

grafana_deployment: null
grafana_cloud_config_path:

pipelines:
  - preprocessing:
      config_path: ./configs/preprocessing.yml

We deploy the preprocessing step with the deployment config ./configs/deployment_config_training_1.yml, by simply running: octaipipe.deployments.new_edge_deployment(config_path='./configs/deployment_config_training_1.yml') We can monitor the deployment status on the web portal. Once the preprocessing step is completed and the preprocessing data is output, we can run the model training step similarly, with the following deployment config:

./configs/deployment_config_training_2.yml

name: deployment_config

device_ids: ['device_0']  # replace with registered device_id

image_name: octaipipe.azurecr.io/octaipipe-all_data_loaders:latest

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN
    - INFLUX_DEFAULT_BUCKET
  env_file:
    # - ./credentials/.env

grafana_deployment: null
grafana_cloud_config_path:

pipelines:
  - model_training:
      config_path: ./configs/model_training.yml

You could also combine these steps together into one deployment like so:

./configs/deployment_config_training_3.yml

name: deployment_config

device_ids: ['device_0']  # replace with registered device_id

image_name: octaipipe.azurecr.io/octaipipe-all_data_loaders:latest

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN
    - INFLUX_DEFAULT_BUCKET
  env_file:
    # - ./credentials/.env

grafana_deployment: null
grafana_cloud_config_path:

pipelines:
  - preprocessing:
      config_path: ./configs/preprocessing.yml
  - model_training:
      config_path: ./configs/model_training.yml

Note: It may take a few minutes for the device to pull the OctaiPipe image and start the steps. To check on the progress of this you can either check the status of the dpeloymetn in the OctaiPipe Portal or get the logs for the Edge Client running on the target device.

Finding model version for inference#

With the training pipeline completed, we now have the trained minmax scaler and lightGBM classifier model. To deploy a model, we need to retrieve a model’s version, which along with the model type and name, act as its identifier in inference steps. We can simply find the trained models’ metadata on the web portal, or alternatively, this can be done by OctaiPipe’s Models Pythonic interface.

For example, we can find all model versions by the model name by:

octaipipe.models.find_models_by_name('model_example')

And the model metadata of a particular model version by:

octaipipe.models.get_model_by_name_and_version(name='model_example', version='1.1')

The metadata returned includes the OctaiStep config used to train the model, and the input column names.

We note down the model type, name and version of the minmax scaler and lightGBM model we just trained, which will be referenced in the inference pipeline.

Deploy model out to edge devices for inference#

We deploy the trained scaler and model for inference. For the purpose of this tutorial, we deploy to the same device we ran the training pipeline on, but you can easily deploy to other, multiple devices, once you have prepared the inference data on those devices (see a previous section).

We first fill out the inference pipeline step. Following our training pipeline, we preprocess the aeroengine sensor with the fitted minmax scaler, followed by model inference with the fitted lightGBM model on the scaled data. These steps are run at a periodic interval, as is typical of inference where new live data is being continuous recorded— although our example inference dataset is a fixed one.

The OctaiStep configs are specified similar to what was done in the training pipeline, except that we load an existing scaler/model, and we set the steps to run every 1s:

./configs/preprocessing_inference.yml

name: preprocessing

# local input data source
input_data_specs:
  default:
    - datastore_type: local
      settings:
        query_type: csv
        query_config:
          filepath_or_buffer: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/inference/data_cmapss_cat_inference.csv
          index_col: _time

# local output data source
output_data_specs:
  default:
    - datastore_type: local
      settings:
          file_path: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/inference/data_cmapss_cat_preprocessed.csv
          write_config:
              index: True

run_specs:
  save_results: True
  run_interval: 1s  # if run_interval is present in run_specs, the step is run periodically, e.g. for inference
  onnx_pred: false

preprocessing_specs:
  steps:
    - normalise
  preprocessors_specs:  # specify a trained scaler
    - type: minmax_scaler
      load_existing: True
      name: scaler_example
      version: "3.0"  # fill out version from last section
  degradation_model: linear
  initial_RUL: 125

./configs/model_inference.yml

name: model_inference

# local input data source
input_data_specs:
  default:
    - datastore_type: local
      settings:
        query_type: csv
        query_config:
          filepath_or_buffer: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/inference/data_cmapss_cat_preprocessed.csv
          index_col: _time

# for local testing: output predictions to local csv files
output_data_specs:
  default:
    - datastore_type: local
      settings:
          file_path: /home/octaipipe/octaipipe_e2e_tutorials/cmapss_cat/inference/predictions.csv
          write_config:
              index: True

model_specs:
  name: model_example
  type: lgb_class_sk
  version: "1.1" # fill out version from last section

run_specs:
  prediction_period: 1s
  onnx_pred: false

We deploy the inference steps with the deployment config ./configs/deployment_config_inference.yml, by simply running: octaipipe.deployments.new_edge_deployment(config_path='./configs/deployment_config_inference.yml') We can monitor the deployment status on the web portal.

./configs/deployment_config_inference.yml

name: deployment_config

device_ids: ['demo-device-0']

image_name: octaipipe.azurecr.io/octaipipe-all_data_loaders:latest

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN
    - INFLUX_DEFAULT_BUCKET
  env_file:
    # - ./credentials/.env

grafana_deployment: null
grafana_cloud_config_path:

pipelines:
  - preprocessing:
      config_path: ./configs/preprocessing_inference.yml
  - model_inference:
      config_path: ./configs/model_inference.yml

Once the pipeline is deployed, record the deployment ID from the Jupyter cell output, or look it up on the web portal— you can filter by time to get the latest deployments. You can verify the step outputs on the device.

Remove deployment#

To end the tutorial, we remove the inference pipeline deployment by running:

octaipipe.deployments.kill_deployment(<deployment_id>)

Clean up#

When you have finished this tutorial, you can perform a clean-up by removing the data files in ~/octaipipe_e2e_tutorials.