Model and Preprocessor Object Management

After fitting a preprocessor (such as a scikit-learn encoder or scaler) or a model in an OctaiPipe training pipeline, the fitted object must be saved so it can be used during inference. This page describes how OctaiPipe saves fitted preprocessors and models.

Each time the Model Training Step runs, or a Preprocessing Step fits a preprocessor, OctaiPipe saves the fitted object in two formats:

ONNX (Open Neural Network Exchange). A format for representing machine learning and deep learning models, as well as fitted preprocessor objects. Supported frameworks include scikit-learn, PyTorch, TensorFlow, and Keras. ONNX enables interoperability: an object fitted in one framework can be run by any ONNX-compatible runtime.

Serialized Python Objects. The object is saved using joblib.dump(), which efficiently serializes arbitrary Python objects, including fitted models and preprocessors.
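The joblib round trip can be sketched as follows; the file name is illustrative, not OctaiPipe's actual naming scheme.

```python
import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Fit a simple model on perfectly linear data (y = 2x).
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
model = LinearRegression().fit(X, y)

# Serialize the fitted object to disk.
joblib.dump(model, "model.joblib")

# Later, at inference time, restore the exact Python object.
loaded = joblib.load("model.joblib")
print(loaded.predict([[4.0]]))  # approximately [8.]
```

Unlike ONNX, loading a joblib file requires the same Python environment (matching library versions) that produced it, which is why having both formats is useful.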

The files for the fitted object are saved locally, uploaded to cloud storage, and registered in the database.