Setting the RUL clipping level

Remaining Useful Life (RUL) estimation often takes the non-linear degradation of systems into account. Many systems operate under stable conditions for some time before they begin to degrade. Representing the output variable as declining linearly throughout the whole operating cycle is therefore not the most effective way to train a model.

Instead, the output variable is clipped at a chosen value, so that every RUL value greater than this clipping value is replaced by the clipping value (see fig. 1).

Figure 1. Linear and piecewise RUL clipped at 125, from Li et al. (2018)
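As a minimal sketch of the clipping described above, the snippet below builds a piecewise RUL target for a single run-to-failure cycle with NumPy. The function name `piecewise_rul` and the convention that RUL reaches 0 on the final cycle are assumptions made for illustration; they are not taken from OctaiPipe.

```python
import numpy as np


def piecewise_rul(run_length, clip_value):
    """Return the clipped RUL target for one run-to-failure cycle."""
    # Linear RUL counts down from run_length - 1 to 0 at failure
    # (assumed convention; some setups end at 1 instead).
    linear_rul = np.arange(run_length - 1, -1, -1)
    # Every value above the clipping value is replaced by the clipping value.
    return np.minimum(linear_rul, clip_value)


# A 200-cycle run clipped at 125, as in the figure's example value.
rul = piecewise_rul(run_length=200, clip_value=125)
```

The target stays flat at the clipping value during the early, stable part of the run and only starts to decline once the linear RUL drops below it.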

One of the key factors that determine the success of RUL estimation is the selection of a reasonable RUL clipping value. Below is a small guide that goes through how to set this value.

Following Heimes (2008)

Heimes (2008) set up a process for RUL estimation on the CMAPSS dataset, which contains data on the degradation of NASA turbofans. From Heimes, three main points initially guide the choice of an RUL clipping value:

The minimum run length before failure (i.e. the shortest cycle in the dataset)
Half of the average run length
Underestimating RUL (a negative error) is penalized less than overestimating it (a positive error)

The first two points give a ballpark number for the RUL clipping value. The third point suggests that a smaller value might be preferable to a larger one, since it is usually better to be overly cautious about machine failure than to overestimate the time remaining until failure.
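To make the asymmetry in the third point concrete, here is a small sketch of an asymmetric scoring function. It assumes the exponential score commonly used with the PHM08/CMAPSS challenge data, with constants 13 for early predictions and 10 for late ones; whether this is exactly the penalty the text above has in mind is an assumption, and the function name is hypothetical.

```python
import numpy as np


def asymmetric_score(rul_true, rul_pred):
    """Sum of asymmetric penalties; late predictions cost more than early ones."""
    # d < 0: RUL was underestimated (failure predicted too early).
    # d > 0: RUL was overestimated (failure predicted too late).
    d = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)
    penalties = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return float(penalties.sum())


early = asymmetric_score([50], [40])  # prediction 10 cycles too low
late = asymmetric_score([50], [60])   # prediction 10 cycles too high
```

For an error of the same size, the late prediction receives the larger penalty, which is why a more conservative clipping value can be the safer choice.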

For heavily skewed datasets, it might be more appropriate to use half of the median run length instead of the mean, as the median is less sensitive to outliers. In either case, it is useful to look at a histogram of the run lengths in the dataset.
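The ballpark statistics above can be computed in a few lines. The sketch below uses NumPy and a made-up list of run lengths; the function name `clipping_candidates` and the example values are purely illustrative.

```python
import numpy as np


def clipping_candidates(run_lengths):
    """Return ballpark RUL clipping values from a collection of run lengths."""
    run_lengths = np.asarray(run_lengths, dtype=float)
    return {
        "min_run_length": float(run_lengths.min()),
        "half_mean_run_length": float(run_lengths.mean() / 2),
        # The median is less sensitive to unusually long or short runs.
        "half_median_run_length": float(np.median(run_lengths) / 2),
    }


# Made-up run lengths (cycles until failure), with one long outlier.
run_lengths = [128, 150, 170, 190, 200, 210, 362]
candidates = clipping_candidates(run_lengths)

# Histogram counts give a quick view of how skewed the run lengths are.
counts, bin_edges = np.histogram(run_lengths, bins=5)
```

With a long outlier like the one in this example, half of the mean comes out higher than half of the median, which illustrates why the median can be the better guide for skewed data.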

Experimenting with RUL clipping

Another way to determine an appropriate RUL clipping level is simply to test different clipping levels and compare them on a metric such as RMSE. With OctaiPipe’s autoML functions, it is easy to repeatedly test the same model and dataset while changing only the RUL clipping value (intitial_RUL parameter).
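The idea of sweeping clipping levels can be sketched without any particular framework. The example below does not use OctaiPipe's API; it uses NumPy only, a synthetic single-sensor dataset, and a straight-line least-squares model, all of which are assumptions chosen to keep the sketch self-contained. One design choice worth noting: RMSE is measured only on validation samples whose true RUL is at or below the smallest candidate, so every candidate is scored against the same target.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_run(run_length, onset=100, noise=2.0):
    """Toy run: one sensor that is flat, then rises once true RUL falls below `onset`."""
    true_rul = np.arange(run_length - 1, -1, -1)
    sensor = np.maximum(onset - true_rul, 0) + rng.normal(0.0, noise, run_length)
    return sensor, true_rul


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic training and validation runs with random lengths.
train_runs = [make_run(n) for n in rng.integers(150, 300, size=20)]
val_runs = [make_run(n) for n in rng.integers(150, 300, size=10)]


def fit_and_score(clip_value, eval_below):
    """Train on RUL clipped at `clip_value`; return validation RMSE near failure."""
    x = np.concatenate([sensor for sensor, _ in train_runs])
    y = np.concatenate([np.minimum(rul, clip_value) for _, rul in train_runs])
    # Straight-line model from sensor value to clipped RUL.
    slope, intercept = np.polyfit(x, y, deg=1)

    x_val = np.concatenate([sensor for sensor, _ in val_runs])
    rul_val = np.concatenate([rul for _, rul in val_runs])
    # Score only where clipping does not change the target for any candidate.
    mask = rul_val <= eval_below
    return rmse(rul_val[mask], slope * x_val[mask] + intercept)


clip_levels = [75, 100, 125, 150]
scores = {c: fit_and_score(c, eval_below=min(clip_levels)) for c in clip_levels}
best_clip = min(scores, key=scores.get)
```

Comparing RMSE computed against each candidate's own clipped target would favour small clipping values simply because the target range shrinks, so a shared evaluation target like the one above is one way to keep the comparison fair.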

A full approach might be to find a ballpark value using the method of Heimes (2008), and then test a few thresholds around that value using OctaiPipe’s autoML.