Predicting asset failure for proactive maintenance substantially reduces failure incidents, unplanned shutdowns, maintenance (average per plant ~$44m to ~$80m, ref. Zeteky), and costs (average $37m, ref. GE). However, false negative predictions may create blind spots that lead to significant damage in high-consequence areas. We define ‘blind spots’ as the condition in which equipment that is about to fail is wrongly predicted to be safe, so no maintenance is performed.
One of the main reasons for such false negatives is the use of unsupervised Anomaly Detection Techniques (ADTs). In most cases, unsupervised machine learning, such as a deep learning neural network, a recurrent neural network (RNN), or a kernelized artificial neural network (KANN), is employed to detect failure events. For industrial assets, historical data mostly comes from sensors, which could themselves be faulty, so the machine learns incorrectly. This creates the following issues for the user:
- Analyst has no control over the neural network or is unable to explain the results
- ML model cannot distinguish between sensor failure and machine failure
- ML model is unable to correlate the physical mechanism with the failure event
- ML model assigns the wrong failure agents to the problem
These challenges lead to a high percentage of both false negatives and false positives when predicting asset failures.
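To make the terms concrete, here is a minimal sketch of how the false negative rate (the blind spots defined above) and the false positive rate could be counted from a model's predictions. The function name and the label values are illustrative assumptions, not taken from any particular tool; 1 stands for a failure and 0 for a healthy asset.

```python
def failure_prediction_rates(actual, predicted):
    """Count blind spots and error rates for binary failure labels.

    actual, predicted: equal-length sequences of 0 (healthy) or 1 (failure).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # failure caught
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # failure missed (blind spot)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarm
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # correctly left alone

    # Guard against division by zero when a class is absent.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "blind_spots": fn,
        "false_negative_rate": fnr,
        "false_positive_rate": fpr,
    }


# Made-up labels for ten inspection periods (illustration only).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

rates = failure_prediction_rates(actual, predicted)
print(rates)
```

In this made-up example, two of the four real failures are missed. For a critical asset, that count of blind spots is arguably the number to watch, more than overall accuracy.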
Such unsupervised ML algorithms have been useful for fraud detection in financial institutions. However, applying a “one size fits all” approach as a shortcut when predicting safety risks for industrial assets could be perilous.
Recommended Improvement for Failure Prediction
The following is a schematic of how to prepare data and create features/variables for avoiding false classifications of failure events:
Follow these steps to more accurately predict equipment failure and reduce the number of false negatives:
- Discuss your raw data with a subject matter expert (SME)
- Draw on the SME's experience of the physical failure mechanism for that asset
- Draw on the SME's knowledge of sensor failure patterns
- Remove anomalous data caused by sensor failure
- Review the removed and cleaned patterns with the SME
- Create new features/variables or remove existing features that may cause confusion
- Use at least two different genres of machine learning algorithms
- Perform a detailed physical parametric study to understand the features that contribute to failure
- Show a feature importance chart to the SME and decide whether the features that carry the physical mechanism are indeed indicators of failure
- Compare feature importance between the different genres of ML and use the features wisely to train failure models
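The sensor-cleaning steps in the list above could be sketched as follows. This is only an illustration under stated assumptions: the two rules (a reading outside a physically plausible range, and a "flatline" of repeated identical values) are hypothetical examples of patterns an SME might describe, and the thresholds, function names, and temperature values are made up.

```python
def flag_sensor_faults(readings, valid_min, valid_max, flatline_len=5):
    """Return a list of booleans, True where a reading looks like a sensor fault.

    Assumed rules (to be confirmed by an SME for the actual sensor):
      - out of range: value outside [valid_min, valid_max]
      - flatline: flatline_len or more consecutive identical values
    """
    n = len(readings)
    faulty = [not (valid_min <= v <= valid_max) for v in readings]

    run_start = 0
    for i in range(1, n + 1):
        # A run ends at the end of the series or when the value changes.
        if i == n or readings[i] != readings[run_start]:
            if i - run_start >= flatline_len:
                for j in range(run_start, i):
                    faulty[j] = True
            run_start = i
    return faulty


def split_clean_and_removed(readings, faulty):
    """Keep the removed points so the SME can review them, not just the clean ones."""
    clean = [(i, v) for i, (v, f) in enumerate(zip(readings, faulty)) if not f]
    removed = [(i, v) for i, (v, f) in enumerate(zip(readings, faulty)) if f]
    return clean, removed


# Made-up temperature trace: one dropout value and one stuck sensor.
temps = [71.2, 71.5, 72.0, -999.0, 72.4, 80.0, 80.0, 80.0, 80.0, 80.0, 73.1, 73.4]

faulty = flag_sensor_faults(temps, valid_min=0.0, valid_max=150.0, flatline_len=5)
clean, removed = split_clean_and_removed(temps, faulty)
print("removed for SME review:", removed)
print("kept for training:", clean)
```

Returning the removed points alongside the clean ones is deliberate: it supports the step of going back to the SME with what was taken out, rather than silently discarding data.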
Although these steps may require time and effort, the benefits far outweigh the costs, particularly when predicting failures of the critical assets that your operation depends on.
Finally, advanced machine learning algorithms can enhance decision making quite impressively, but only if they are built by subject matter experts and are able to incorporate human experience. We should be very cautious about using decisions from artificial intelligence that relies on machine learning without a human element.
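The feature-importance comparison in the last steps of the list could look something like the sketch below. It assumes importance scores have already been obtained from two different genres of model (for example a tree-based model and a linear model). The feature names and scores here are invented for illustration, and the helper names are hypothetical.

```python
def rank_features(importances):
    """Return feature names sorted from most to least important."""
    return sorted(importances, key=importances.get, reverse=True)


def compare_top_features(imp_a, imp_b, top_k=3):
    """Split the top-k features of two models into agreed and disputed sets."""
    top_a = set(rank_features(imp_a)[:top_k])
    top_b = set(rank_features(imp_b)[:top_k])
    return {
        "agreed": sorted(top_a & top_b),    # in both top-k lists
        "disputed": sorted(top_a ^ top_b),  # in only one top-k list
    }


# Invented importance scores from two hypothetical model genres.
tree_importance = {
    "vibration_rms": 0.41,
    "bearing_temp": 0.27,
    "oil_pressure": 0.18,
    "ambient_temp": 0.09,
    "sensor_id": 0.05,
}
linear_importance = {
    "vibration_rms": 0.35,
    "bearing_temp": 0.30,
    "ambient_temp": 0.20,
    "oil_pressure": 0.10,
    "sensor_id": 0.05,
}

result = compare_top_features(tree_importance, linear_importance, top_k=3)
print("agreed:", result["agreed"])
print("take to the SME:", result["disputed"])
```

Features that both genres rank highly are reasonable candidates for the failure model. The disputed ones are the items to take to the SME, who can judge whether they reflect a real physical mechanism or an artifact of the data.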
–Khairul Chowdhury
Chief Technology Officer, IDARE