WASM Model Inference Step#

From OctaiPipe version 2.2 onwards, in addition to our traditional Python-based model inference step, we offer a WebAssembly (WASM) inference step developed in Rust.

Benefits of using our WASM image#

  • Faster performance: WebAssembly bytecode runs at near-native speed without the need for a garbage collector.

  • Reduced CPU and memory footprint allows ML inference to run on lower-spec devices.

  • Smaller image size reduces startup time and storage.

  • Enhanced security: WASM provides a secure sandboxed environment in which all file and network access is denied by default unless explicitly granted.

The configurations used to deploy our models are the same across our Python and WASM image versions. Simply choose the WASM image octaipipe.azurecr.io/octaioxide:2.3.3 in your deployment configuration and everything else remains the same.

The best way to understand how to use OctaiPipe with WASM inference is to run the notebook tutorial in our Jupyter image: Tutorials/05_WASM-inference/WASM_Inference.

Currently, WASM inference supports a CSV data store on your local device; more options will become available in future releases. If you need to use other data stores, please use our Python inference image.

WASM Inference with local CSV data store example#

  • In this example, the input_data_specs settings are parameters for loading a CSV file on your local device, e.g. where live sensor data is being written locally.

  • model_specs references the model to deploy; the model must already have been trained and saved to your Portal for it to be deployable.

  • output_data_specs is again specific to a local CSV in this case; each periodic inference iteration appends predictions to the file_path specified here.

```yaml
name: model_inference

input_data_specs:
  datastore_type: csv
  settings:
    delimiter: ','
    file_path: /home/octaipipe/testdata_tutorial_cmapss_latest.csv
    headers: true
    index:
    - _time
    - Machine number

model_specs:
  fl_model: true
  model_id: inference_test_regressor_0
  name: inference_test_regressor
  type: base_SGDRegressor
  version: '1'

output_data_specs:
- datastore_type: csv
  settings:
    append: true
    delimiter: ','
    file_path: /home/octaipipe/testdata_tutorial_cmapss_out.csv

run_specs:
  only_last_batch: true
  prediction_period: 5s
```
  • By setting only_last_batch: true, we ensure that inference is performed only on the last rows of the CSV file, equal to the batch size used during model training. This is useful for live data, where each iteration only needs predictions on the most recent data rather than the whole dataset.

WASM Deployment Configuration#

  • Define the list of device ids to deploy to; this can be one or many.

  • Specify the image_name as octaipipe.azurecr.io/octaioxide:latest to indicate that we want to use the WASM inference image.

  • Set pipelines to model_inference (as for any model inference step) and the config_path to your inference configuration as above.

```yaml
name: deployment_config

device_ids: ['your_device_id_1', 'your_device_id_2']

image_name: 'octaipipe.azurecr.io/octaioxide:latest'

env: {}

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN
    - INFLUX_DEFAULT_BUCKET
  env_file:
    - ./credentials.env

grafana_deployment:
grafana_cloud_config_path:

pipelines:
  - model_inference:
      config_path: configs/inference_config_wasm.yml
```
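Before deploying, it can help to sanity-check the parsed deployment config (e.g. the YAML above loaded into a dict) for the keys the WASM deployment relies on. The helper below is a hypothetical convenience, not part of the OctaiPipe API; it only checks for device ids, the octaioxide image, and a model_inference pipeline entry.

```python
def check_wasm_deployment(config: dict) -> list[str]:
    """Return a list of problems found in a deployment config dict (hypothetical helper)."""
    problems = []
    if not config.get("device_ids"):
        problems.append("device_ids must list at least one device")
    if "octaioxide" not in config.get("image_name", ""):
        problems.append("image_name should point at the octaioxide WASM image")
    # pipelines is a list of single-key mappings, e.g. [{"model_inference": {...}}]
    steps = [step for entry in config.get("pipelines", []) for step in entry]
    if "model_inference" not in steps:
        problems.append("pipelines must include a model_inference step")
    return problems

config = {
    "device_ids": ["your_device_id_1"],
    "image_name": "octaipipe.azurecr.io/octaioxide:latest",
    "pipelines": [{"model_inference": {"config_path": "configs/inference_config_wasm.yml"}}],
}
assert check_wasm_deployment(config) == []
```

A check like this catches the most common mistake when switching from the Python image: forgetting to change image_name to the octaioxide image.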