Inference API Deployment
Launches a REST API that accepts data and serves inferences from the loaded model (defined in deployment_specs.model_specs). By default, it also launches a Grafana instance to monitor model performance.
class InferenceAPIDeployment
Kubernetes Objects:
ConfigMap
Deployment
Service
HorizontalPodAutoscaler
Fields:

type: inferenceAPIDeployment
name: string, required - unique name of the deployment
template: string, default: src/octaipipe/configs/cloud_deployment/inference_api_deployment_template.yml
kubernetes_namespace: string, default: default
monitor_model: bool, default: True - sets up a Grafana dashboard which monitors model performance
deployment_specs: dict - see Placeholders below
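The fields above can be combined into a single deployment config. The following is an illustrative sketch only, assuming the field names from the reference above; the exact layout of the shipped template file may differ, and the name and model_specs values here are hypothetical placeholders:

```yaml
# Illustrative sketch of an InferenceAPIDeployment config.
# Field names follow the reference above; values are example
# placeholders, not taken from the shipped template.
type: inferenceAPIDeployment
name: my-inference-api          # required, unique deployment name (hypothetical)
kubernetes_namespace: default
monitor_model: True             # provisions the Grafana monitoring dashboard
deployment_specs:
  api_http_port: 8222           # default port for the REST API
  model_specs: |                # multi-line scalar value, required
    ...                         # model definition goes here
```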
Placeholders (deployment_specs or env variables):

deployment_name: string, default = deployment's name
api_http_port: int, default = 8222
MONITOR_DB_URL: string, default = monitor DB client URL
deployment_id: string, default = deployment id
model_specs: multi-line scalar value, required
e.g.: