Feature Selection Step

To run locally: feature_selection

The feature selection step loads imputed raw data, computes a fitness metric for every feature in that data, and returns a subset of the data, together with the list of fit features, for training.
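The score-then-filter flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the step's actual implementation: the fitness metric used here (absolute Pearson correlation with a target column), the dict-of-lists data layout, and the helper names `pearson` and `select_features` are all hypothetical. Only the idea of a minimum metric threshold (`min_metric_val` in the config) comes from this page.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, min_metric_val):
    """Score every feature against the target and keep those at or above the threshold.

    columns: dict mapping feature name -> list of values (assumed layout)
    target:  list of target values, same length as each column
    Returns (subset of columns, list of fit feature names, all scores).
    """
    # Assumed metric: absolute correlation with the target.
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    fit = [name for name, score in scores.items() if score >= min_metric_val]
    subset = {name: columns[name] for name in fit}
    return subset, fit, scores


# Toy data: one informative sensor and one constant (uninformative) sensor.
data = {
    "Load_Cell_Mid": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Eddy_Top": [2.0, 2.0, 2.0, 2.0, 2.0],
}
target = [50.0, 40.0, 30.0, 20.0, 10.0]

subset, fit_features, scores = select_features(data, target, min_metric_val=0.1)
print(fit_features)  # ['Load_Cell_Mid']
```

In this toy run the constant column scores 0.0 and is dropped, while the column that varies with the target is kept.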

Note: This step will be generalised and made available in a future release. Until then, this page will not be updated.

The following is an example of a config file together with descriptions of its parts.

Step config example

  • Alternatively, you can load a filled-in example config and change its values to suit your problem.

name: feature_selection

input_data_specs:
  datastore_type: influxdb
  query_type: influx  # influx/dataframe/stream/stream_dataframe/csv
  query_template_path: ./configs/data/influx_query.txt
  query_values:
    start: "2020-05-20T13:30:00.000Z"
    stop: "2020-05-20T13:35:00.000Z"
    bucket: sensors-out
    measurement: cat
    tags: {}
  data_converter:
    name: influx_flat
    args: {}

output_data_specs:
  - datastore_type: influxdb
    settings:
      bucket: test-bucket-1
      measurement: testv1-fe

run_specs:
  input_sensors:
    - "Load_Cell_Mid"
    - "Eddy_Top"
  data_filename: /Users/ngcs/Downloads/211106_220102_imputed.parquet
  max_RUL: 1000
  min_metric_val: 0.1
  last_n_strokes: 10000
  cycle_key:
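Before running the step, it can help to sanity-check the run_specs block. The sketch below is a hypothetical helper, not part of OctaiPipe: the expected types are inferred only from the example values above, and cycle_key is treated as optional because the example leaves it empty. The run_specs dict is written out by hand here; in practice it would come from parsing the config file.

```python
# Expected keys and types, inferred only from the example config (an assumption).
EXPECTED_RUN_SPECS = {
    "input_sensors": list,
    "data_filename": str,
    "max_RUL": int,
    "min_metric_val": float,
    "last_n_strokes": int,
}


def check_run_specs(run_specs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key, expected_type in EXPECTED_RUN_SPECS.items():
        if key not in run_specs:
            problems.append(f"missing key: {key}")
        elif not isinstance(run_specs[key], expected_type):
            actual = type(run_specs[key]).__name__
            problems.append(f"{key}: expected {expected_type.__name__}, got {actual}")
    # cycle_key is empty in the example, so it is not checked here.
    return problems


# The run_specs section of the example config, written as a Python dict.
run_specs = {
    "input_sensors": ["Load_Cell_Mid", "Eddy_Top"],
    "data_filename": "/Users/ngcs/Downloads/211106_220102_imputed.parquet",
    "max_RUL": 1000,
    "min_metric_val": 0.1,
    "last_n_strokes": 10000,
    "cycle_key": None,
}

print(check_run_specs(run_specs))  # []
```

Returning a list of problems, rather than raising on the first one, lets all config mistakes be reported in a single pass.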

Input and Output Data Specs

input_data_specs and output_data_specs follow the standard format used by all pipeline steps; see OctaiPipe Steps.

Configs Description

Use this table to describe in detail each field in the configuration file above, apart from input_data_specs and output_data_specs, which are explained on the main OctaiPipe Steps page.

Level 1   | Level 2 | Level 3 | Type/Options | Description
run_specs |         |         |              |

Step Outputs

In this section, provide a description of the expected outputs of the step, both local and in the blob to help users investigate their results.