Federated XGBoost Implementation with OctaiPipe#

Note

Before proceeding, note that OctaiPipe does not currently support running FL-XGBoost on arm32 devices.

Introduction#

OctaiPipe has implemented XGBoost within a federated learning (FL) framework, enabling clients to train multiple trees using XGBoost on local datasets before aggregating these trees on a server. This method allows for multiple trees to be sent to the server for aggregation in each iteration, differing from other implementations where only a single tree is trained and sent per iteration.

Additionally, we have implemented a normalized learning rate option to account for varying dataset sizes across clients, aiming to improve model performance and fairness.
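To make the idea concrete, the sketch below shows one plausible way a normalized learning rate can work: each client's effective eta is scaled by its share of the overall training data, so that clients contribute in proportion to how much data they hold. This is an illustrative assumption about the general approach, not a description of OctaiPipe's exact internal formula, and the helper function is hypothetical.

def normalized_eta(base_eta: float, client_samples: int, total_samples: int) -> float:
    """Hypothetical illustration: scale a client's learning rate by its share
    of the total training data. OctaiPipe applies its own normalization
    internally when normalized_learning_rate is set to true."""
    return base_eta * client_samples / total_samples


# Three clients holding 1,000, 3,000 and 6,000 rows of a 10,000-row dataset
for n_rows in (1_000, 3_000, 6_000):
    print(normalized_eta(0.15, n_rows, 10_000))  # 0.015, 0.045, 0.09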

Advantages of XGBoost#

XGBoost is preferred for certain scenarios due to its:

  • Ease of Use: Less complex setup compared to neural networks, making it accessible for a wide range of applications.

  • Efficiency and Cost Savings: Demonstrates faster training times and reduced computational costs.

  • Explainability: Facilitates easier extraction of feature importance, enhancing model interpretability.

  • Performance: Often outperforms neural network-based models on tabular datasets, especially when data size is medium and features are sparse or non-IID (not independently and identically distributed).

  • Robustness: Effectively handles missing values, a challenge for many neural network models.

Federated XGBoost Process#

In the federated setting, the training process starts with each client training local XGBoost models. These local models, represented by a set number of trees, are then aggregated by the server into a global model. This model is distributed to clients for further training in subsequent rounds, progressively enhancing the model’s accuracy. The cycle of local training, aggregation, and distribution continues until a predefined number of iterations is reached, optimizing the global model while maintaining data privacy.
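The pseudocode below sketches this cycle. The function names (train_local_trees, aggregate_trees, send_to_clients) are illustrative placeholders for the steps described above, not OctaiPipe APIs.

# Illustrative pseudocode of the federated XGBoost training cycle.
# The helper functions are hypothetical placeholders, not OctaiPipe functions.
global_model = None

for round_idx in range(num_rounds):
    client_updates = []
    for client in clients:
        # Each client continues boosting from the current global model,
        # adding num_local_rounds new trees trained on its local data.
        local_trees = train_local_trees(client.data, global_model, num_local_rounds)
        client_updates.append(local_trees)

    # The server aggregates the clients' trees into a single global ensemble.
    global_model = aggregate_trees(global_model, client_updates)

    # The updated global model is sent back to the clients for the next round.
    send_to_clients(clients, global_model)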

The best way to understand how to use Federated XGBoost is to run the notebook tutorial in our Jupyter image: Tutorials/FLXGBoost-tutorial/flxgboost-tutorial.

Configurations and Methodologies#

Example config:

name: federated_learning

infrastructure:
  device_ids: [demo_device_0, demo_device_1]  # Add desired devices
  image_name: "octaipipe.azurecr.io/octaipipe-all_data_loaders:2.1.0"  # Set OctaiPipe image here
  server_image: "octaipipe.azurecr.io/fl_server:2.1.0"  # Set FL server image here

strategy:
  num_rounds: 20  # Number of federated training rounds
  num_local_rounds: 2  # Local boosting rounds each client trains per federated round
  normalized_learning_rate: true  # Normalize the learning rate across clients with different dataset sizes

input_data_specs:
  devices:
  - device: default
    datastore_type: influxdb
    query_type: dataframe
    query_template_path: ./data/influx_query_def.txt
    query_values:
      start: "2022-11-10T00:00:00.000Z"
      stop: "2022-11-11T00:00:00.000Z"
      bucket: test-bucket
      measurement: sensors-raw
      tags: {}
    data_converter: {}

evaluation_data_specs:
  devices:
  - device: default
    datastore_type: influxdb
    query_type: dataframe
    query_template_path: ./data/influx_query_eval_def.txt
    query_values:
      start: "2022-11-10T00:00:00.000Z"
      stop: "2022-11-11T00:00:00.000Z"
      bucket: test-bucket
      measurement: sensors-raw
      tags: {}
    data_converter: {}

model_specs:
  type: base_xgboost
  load_existing: false
  name: test_xgboost
  model_params:
    objective: reg:squarederror
    eta: 0.15
    max_depth: 8
    eval_metric: auc
    nthread: 16
    num_parallel_tree: 1
    subsample: 1
    tree_method: hist

run_specs:
  target_label: RUL
  cycle_id: "Machine number"
  backend: xgboost  # pytorch, SGD, xgboost

There are two configurable features of the FL-XGBoost implementation which will affect the training process:

  1. The XGBoost model parameters used by the clients to train their local models. These parameters are set in the model_specs section of the federated_learning config file and detailed below.

  2. The strategy used by the server to handle and aggregate models from clients. This can also be set in the federated_learning config file, or configured after the config has been loaded, following the example shown here: Customise Strategy Parameters

XGBoost model parameters#

https://xgboost.readthedocs.io/en/release_1.7.0/parameter.html#learning-task-parameters

Options set here will be passed to the xgboost.Booster object when the model is initialised. This gives you greater control over the model training process, catering to your specific learning task.

  • objective: reg:squarederror - Specifies the learning task and the corresponding learning objective. Here, it is set for regression tasks with squared error as the loss function. - Ideal for regression problems where the goal is to minimize the squared differences between the predicted and actual values.

  • eta: 0.15 (alias: learning_rate) - Controls the step size shrinkage used in each update, making the boosting process more conservative to prevent overfitting. - Set here lower than the default (0.3) to make training more conservative and potentially achieve better generalization on unseen data.

  • max_depth: 8 - Determines the maximum depth of a tree. Deeper trees can model more complex patterns but might overfit. - Increased from the default (6) to allow the model to capture more complex relationships in the data without excessively risking overfitting.

  • eval_metric: auc - Evaluation metric for validation data. - AUC (Area Under the Curve) is effective for binary classification problems, particularly on imbalanced datasets; for a regression objective such as reg:squarederror, a metric like rmse would be the more typical choice.

  • nthread: 16 - Number of parallel threads used to run XGBoost. - Increased to speed up computation. The exact value can be adjusted based on the machine’s CPU cores available for parallel processing.

  • num_parallel_tree: 1 - Number of trees to grow per iteration. Used for boosted random forests. - Set to 1 for standard boosting. Increasing this creates a forest of trees for each iteration and can be used for models like random forests.

  • subsample: 1 - Fraction of training instances to be randomly sampled for building trees, to prevent overfitting. - Set to 1 to use all data, indicating confidence in the dataset’s representativeness and the model’s resilience against overfitting.

  • tree_method: hist - Specifies the tree construction algorithm used in XGBoost. Options include exact, approx, and hist, among others. - Hist is chosen for faster computation time compared to the exact method, especially suitable for datasets with a large number of observations or features.
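For reference, the model_params block corresponds to a standard XGBoost parameter dictionary. The standalone sketch below, using the plain xgboost Python API on synthetic data (purely illustrative, outside OctaiPipe), shows how such parameters drive local boosting and how training can be continued from an existing model, which is the building block of the federated rounds described above.

import numpy as np
import xgboost as xgb

# Parameter dictionary mirroring the model_params block above
# (eval_metric omitted because no evaluation set is passed here).
params = {
    "objective": "reg:squarederror",
    "eta": 0.15,
    "max_depth": 8,
    "nthread": 16,
    "num_parallel_tree": 1,
    "subsample": 1,
    "tree_method": "hist",
}

# Synthetic data standing in for a client's local dataset.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.normal(size=500)
dtrain = xgb.DMatrix(X, label=y)

# Grow two boosting rounds, then continue from the existing booster,
# analogous to a client adding trees on top of the global model.
booster = xgb.train(params, dtrain, num_boost_round=2)
booster = xgb.train(params, dtrain, num_boost_round=2, xgb_model=booster)
print(booster.num_boosted_rounds())  # 4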

Strategy options#

https://flower.ai/docs/framework/how-to-use-strategies.html

This defines the learning method for the Federated Learning process.

“It is basically the federated learning algorithm that runs on the server. Strategies decide how to sample clients, how to configure clients for training, how to aggregate updates, and how to evaluate models.” - https://flower.ai/docs/framework/how-to-use-strategies.html

Some of the options refer to OctaiPipe-specific functionality; for FL-XGBoost training these are num_local_rounds and normalized_learning_rate. Details on all strategy options can be found here: strategy
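As a point of reference only, the snippet below shows roughly what instantiating a strategy looks like in plain Flower. Within OctaiPipe you do not write this code yourself; the strategy is driven by the strategy section of the config, and FedAvg here is a generic example rather than the specific aggregation strategy OctaiPipe uses for XGBoost.

import flwr as fl

# A generic Flower strategy decides how clients are sampled each round,
# how they are configured, and how their updates are aggregated.
strategy = fl.server.strategy.FedAvg(
    fraction_fit=1.0,         # sample every available client for training
    min_available_clients=2,  # wait until at least two clients are connected
)

fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=20),
    strategy=strategy,
)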