This work considers the application of classification algorithms to data-driven fault diagnosis of batch processes. A novel data selection methodology is proposed that enables online classification of detected disturbances without requiring estimates of unknown (future) process behavior, unlike previously reported approaches, which depend on such estimates.
The proposed method is benchmarked in two case studies using the Pensim process model of Birol et al. (2002) implemented in RAYMOND. Both a simple k-Nearest Neighbors (k-NN) classifier and a complex Least Squares Support Vector Machine (LS-SVM) are employed for classification to demonstrate the generic nature of the proposed approach. In addition, the influence of different data pretreatment methods on the classification performance is discussed, together with the rationale for selecting the correct pretreatment steps. Finally, the influence of the number of available training batches is studied.
The results demonstrate that good classification performance can be achieved with the proposed data selection method, even with a small number of faulty training batches, by exploiting knowledge of the nature of the to-be-diagnosed faults in the data pretreatment. This provides a proof of concept for classification-based batch diagnosis and demonstrates the importance of incorporating process insight into the construction of data-driven process monitoring and diagnosis tools.