Guide - Remaining Useful Life Estimation#
In this tutorial, we predict the Remaining Useful Life (RUL) of commercial aircraft engines. The data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation', a tool for simulating realistic data from large commercial turbofan engines. Each flight is a combination of a series of flight conditions. In brief, the data can be described as follows:
- Multivariate time-series data, one per engine (100 engines)
- Each engine starts with some (unknown) initial wear, but is still in normal condition
- At the start of each time series, the engine is operating normally
- At some point, a fault develops
- In the training data, the fault grows until system failure
- In the test data, the time series ends prior to system failure
Objective:#
Predict the number of operation cycles before system failure for the test data.
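Since the training engines run all the way to failure, the training target can be derived directly from the cycle counter: for each row, RUL is the engine's last recorded cycle minus the current cycle. A minimal sketch with a toy DataFrame (column names are assumptions, not the C-MAPSS schema):

```python
import pandas as pd

# Toy example: two engines, run to failure.
# RUL = (last cycle of that engine) - (current cycle).
df = pd.DataFrame({
    "engine_id": [1, 1, 1, 2, 2],
    "cycle":     [1, 2, 3, 1, 2],
})
df["rul"] = df.groupby("engine_id")["cycle"].transform("max") - df["cycle"]
print(df["rul"].tolist())  # → [2, 1, 0, 1, 0]
```

The same transform applies unchanged to the full training set; for the test set, the true RUL at cut-off is provided separately and cannot be computed this way.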
Environment Setup#
1 - Make sure you have WSL installed on your system. It lets you use Linux tools, like Bash, fully integrated with Windows tools, like PowerShell or Visual Studio Code, and it requires fewer resources (CPU, memory, and storage) than a full virtual machine.
2 - Open the WSL terminal and make sure that you are working in the Linux file system rather than the Windows file system.
3 - Install the necessary VSCode extensions on WSL:Ubuntu (e.g. Python, Pylance, Jupyter, cornflakes-linter, etc.)
4 - Clone the OctaiShow Repo under your project directory.
5 - Create and switch to a new git branch. The branch name should be descriptive of the demo and preferably start with your initials, e.g. cc-cmapss-regression.
6 - Create a folder for your specific OctaiPipe demo, e.g. cmapss-regression.
7 - Create a conda environment using conda create -n [env-name] python=3.9, then activate it.
8 - This is the folder structure needed to run the OctaiPipe pipeline steps:
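A possible layout is sketched below. The configs/data-cmapss and scripts paths are taken from the run commands later in this guide; the remaining names are assumptions and may differ in your repo:

```
cmapss-regression/
├── cmapss-data/              # raw data files (git-ignored)
├── configs/
│   └── data-cmapss/
│       ├── train-config.yml
│       ├── eval-config.yml
│       └── inference-config.yml
├── scripts/
│   └── run.sh
├── notebooks/                # data-loading notebooks
└── requirements.txt
```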
Note that cmapss-data is included in the .gitignore file as it contains data files.
9 - Source your credentials/set_env.sh file to set the environment variables needed to run the framework. The required variables are the following:
INFLUX_URL - URL of the InfluxDB instance
INFLUX_ORG - organization of the database
INFLUX_TOKEN - security token
BLOB_CONNECTION_STRING - connection string to the blob storage containing ML models
INFLUX_CONN_TIMEOUT - connection timeout for InfluxDB
INFLUX_READ_TIMEOUT - read timeout for InfluxDB
MODEL_NAME - name of the model to use
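A minimal sketch of what credentials/set_env.sh might look like. All values below are placeholders (the localhost URL, org name, timeouts, and model name are assumptions); substitute your own before sourcing:

```shell
# credentials/set_env.sh -- placeholder values, replace with your own
export INFLUX_URL="http://localhost:8086"          # local InfluxDB instance (assumed)
export INFLUX_ORG="my-org"                         # your InfluxDB organization
export INFLUX_TOKEN="replace-with-your-token"
export BLOB_CONNECTION_STRING="replace-with-your-connection-string"
export INFLUX_CONN_TIMEOUT=10000                   # example timeout in ms (assumed)
export INFLUX_READ_TIMEOUT=10000
export MODEL_NAME="cmapss-regression-model"        # hypothetical model name
```

Source it with `source credentials/set_env.sh` so the variables are set in your current shell rather than a subshell.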
10 - Next, run pip install -r requirements.txt to install OctaiPipe and its requirements.
11 - Set up your local InfluxDB instance according to Link and create the buckets needed for your pipeline steps. Then run the notebooks to load the data into the database.
Running OctaiPipe Steps#
Navigate to your scripts folder and run the run.sh script to execute the OctaiPipe pipeline steps. The script contains the following commands, which run the training, evaluation, and inference steps:
```
python3 -m octaipipe_core.pipeline.run_step --step-name model_training --config-path ../configs/data-cmapss/train-config.yml
python3 -m octaipipe_core.pipeline.run_step --step-name model_evaluation --config-path ../configs/data-cmapss/eval-config.yml
python3 -m octaipipe_core.pipeline.run_step --step-name model_inference --config-path ../configs/data-cmapss/inference-config.yml
```
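Since the three commands differ only in step name and config file, run.sh could equally be written as a loop. A sketch under that assumption (the step names and config paths are taken from the commands above; the loop structure itself is not from the original script):

```shell
# Hypothetical run.sh sketch: derive each command from a step:config pair.
set -eu
CONFIG_DIR="../configs/data-cmapss"
for pair in "model_training:train" "model_evaluation:eval" "model_inference:inference"; do
  step="${pair%%:*}"                      # part before the colon
  cfg="${CONFIG_DIR}/${pair#*:}-config.yml"  # part after the colon + suffix
  echo "python3 -m octaipipe_core.pipeline.run_step --step-name ${step} --config-path ${cfg}"
done
```

With `set -eu`, the script stops at the first failing step instead of silently running evaluation or inference against a model that never trained.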