Inference API Deployment#

Launches a REST API that accepts data and serves inferences from the loaded model (defined in deployment_specs.model_specs).

By default, it also launches a Grafana instance to monitor model performance.

class InferenceAPIDeployment

Kubernetes Objects:

  • ConfigMap

  • Deployment

  • Service

  • HorizontalPodAutoscaler

Fields:

  • type: inferenceAPIDeployment

  • name: string

    • required; a unique name for the deployment

  • template: string

    • default: src/octaipipe/configs/cloud_deployment/inference_api_deployment_template.yml

  • kubernetes_namespace: string

    • default: default

  • monitor_model: bool

    • default: True

    • Sets up a Grafana dashboard that monitors model performance.

  • deployment_specs: dict

    • See the Placeholders section below.
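Putting the fields above together, a minimal configuration might look like the following sketch. All values shown are illustrative assumptions, not values from a real deployment; only type and name are required, and the other fields fall back to the defaults listed above:

```yaml
# Hypothetical example of an InferenceAPIDeployment config.
# Field names follow the reference above; values are placeholders.
type: inferenceAPIDeployment
name: my-inference-api            # required, must be unique
kubernetes_namespace: default
monitor_model: true               # provisions the Grafana dashboard
deployment_specs:
  api_http_port: 8222
  model_specs: |
    # multi-line scalar describing the model to load (schema assumed)
    name: my-model
    version: "1"
```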

Placeholders (deployment_specs or env variables):

  • deployment_name string

    • default: the deployment's name

  • api_http_port int

    • default: 8222

  • MONITOR_DB_URL string

    • default: the monitoring database client URL

  • deployment_id string

    • default: the deployment ID

  • model_specs: multi-line scalar value

    • required

    • e.g. a multi-line YAML block specifying the model to load
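Since model_specs is a multi-line scalar, it is typically written with a YAML literal block (|), which preserves newlines. A hypothetical value might look like the following; the keys shown (name, version) are assumptions for illustration and should be checked against the deployment template referenced above:

```yaml
# Hypothetical model_specs placeholder value (schema assumed).
# The literal block style (|) keeps the content as one multi-line string.
model_specs: |
  name: my-model
  version: "1"
```

The literal block style matters here: a plain scalar would fold the lines into a single string, whereas the template expects the value to be substituted verbatim as multi-line content.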