Docker Resource Constraints
When running OctaiPipe in Docker on an edge device, you may need to control the resources its containers use. Docker can enforce a maximum CPU and memory limit for each container, as well as a minimum reservation that a container requires.
This is particularly useful on low-powered devices where resources are scarce and it is crucial that the device does not hit failures such as out-of-memory errors.
It also helps ensure that running OctaiPipe on a device does not disturb other processes running on it.
This guide shows how to set resource constraints for edge deployments in OctaiPipe, and documents typical resource usage for Federated Learning and Model Inference containers on a sample dataset.
Configuration in OctaiPipe
Maximum CPU and memory limits are set in the deployment config YAML file. The CPU limit is given as a fraction of CPU cores to use, and memory is given in megabytes; both should be written as strings in the YAML.
Reservations for OctaiPipe containers are set in the same way.
Setting resource constraints for a deployment applies them to all containers defined for that deployment, as the sketch below shows.
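For example, a minimal `resource_constraints` block, using the same field names as the full deployment examples below, might look like this:

```yaml
resource_constraints:
  limits:
    cpus: '0.5'    # use at most half of one CPU core
    memory: 500M   # use at most 500 megabytes of RAM
  reservations:
    cpus: '0.4'    # guarantee 40% of one core to the container
    memory: 400M   # guarantee 400 megabytes of RAM to the container
```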
Configuration for Edge Deployment
Below is an example config file, where the resource constraints are set in the `resource_constraints` field under `limits` and `reservations`. Each has the fields `cpus` and `memory`.
Note that reservations should be the same as or lower than the limits.
```yaml
name: deployment_config

device_ids: [device_0, device_1]

resource_constraints:
  limits:
    cpus: '0.5'
    memory: 500M
  reservations:
    cpus: '0.4'
    memory: 400M

datasources:
  environment:
    - INFLUX_ORG
    - INFLUX_URL
    - INFLUX_TOKEN

env: {}

grafana_deployment:
  grafana_cloud_config_path:

pipelines:
  - preprocessing:
      config_path: preprocessing_inference.yml
  - model_inference:
      config_path: model_inference.yml
```
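These fields mirror Docker's own resource options. As a point of reference, and assuming OctaiPipe passes them through to a Compose-style `deploy.resources` block (an assumption about the underlying mechanics, not documented OctaiPipe internals), the equivalent constraints in a Docker Compose file would be:

```yaml
# Docker Compose equivalent of the constraints above (illustrative only;
# OctaiPipe generates the actual container configuration for you).
services:
  model_inference:        # hypothetical service name
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 500M
        reservations:
          cpus: '0.4'
          memory: 400M
```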
Configuration for Federated Learning
Below is an example setting resource constraints for the edge containers that run FL training. Here, the resource constraints are set in the `infrastructure` section.
```yaml
name: federated_learning

infrastructure:
  device_ids: [device_0, device_1]
  image_name: octaipipe.azurecr.io/octaipipe-all_data_loaders:latest
  server_image: octaipipe.azurecr.io/fl_server:latest
  resource_constraints:
    limits:
      cpus: '0.5'
      memory: 500M
    reservations:
      cpus: '0.4'
      memory: 400M

# Rest of config left out for brevity
```
Reference Resource Usage
The table below details the resources used by a single container running either a Federated Learning workload or Model Inference on a portion of the C-MAPSS dataset, which contains 28 columns and around 20,000 rows. Usage is reported for OctaiPipe's Base PyTorch model (a two-layer perceptron), Sklearn's SGD Regressor, OctaiPipe's FL-XGBoost, and the k-FED algorithm.
| Workload | Resource | SGD Regressor | FL-XGBoost | Base PyTorch | k-FED |
|---|---|---|---|---|---|
| Federated Learning | Memory | 280MB | 280MB | 320MB | 280MB |
| Federated Learning | CPU | 0.5 | 0.5 | 0.7 | 1.0 |
| Model Inference | Memory | 280MB | 280MB | 300MB | 280MB |
| Model Inference | CPU | 0.35 | 0.3 | 0.3 | 0.5 |
NOTE
- These numbers are for this specific dataset and model architecture; data with more columns and rows will require more resources.
- Setting limits below the resources a container needs will cause the container to be killed, preventing OctaiPipe from running.
- These were all run using OctaiPipe's Python images. For lower-powered devices, inference can also be run using OctaiOxide.
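As an illustration of how to use these reference numbers, the hypothetical constraints below size the Base PyTorch Federated Learning workload (roughly 320MB of memory and 0.7 CPU in the table above) with some headroom over its observed usage while still capping it:

```yaml
# Hypothetical sizing for the Base PyTorch FL workload measured above:
# reserve roughly the observed usage, cap somewhat above it for headroom.
resource_constraints:
  limits:
    cpus: '1.0'    # observed ~0.7 CPU; cap at one full core
    memory: 500M   # observed ~320MB; cap well above to avoid OOM kills
  reservations:
    cpus: '0.7'    # guarantee the observed CPU usage
    memory: 350M   # guarantee slightly above the observed memory
```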