This work addresses the dual challenge posed by label-flipping attacks and severe data imbalance in Dissolved Gas Analysis (DGA)-based Power Transformer Fault Diagnosis (PTFD). Existing literature has focused on improving the performance of data-driven methods by combining clean training samples with data-balancing methods; to date, however, the combined challenge of data imbalance and label-flipping attacks in the training phase of Machine Learning (ML) and Deep Learning (DL) models remains unaddressed. This work bridges that gap by introducing a customized and extended semi-supervised learning framework. The proposed method identifies potentially correctly and incorrectly labeled samples, processes them with Gaussian augmentation to handle the unlabeled data, and incorporates a re-weighting mechanism to handle class imbalance. Experimental results demonstrate that existing data-balancing methods fail to support ML and DL models under label-flipping attacks, whereas the proposed method achieves an overall accuracy of 90%, a precision of 92%, and an F1-score of 89%, significantly outperforming state-of-the-art ML and DL models. Under combined label-flipping attacks and data imbalance, the proposed method demonstrates an improvement of at least 12.5% in accuracy, 8.2% in precision, and 14% in F1-score, reflecting a significant relative performance gain over state-of-the-art models in such adversarial scenarios. Moreover, the proposed method exhibits resilience to Gaussian noise and False Data Injection Attacks (FDIA), ensuring robust DGA-based PTFD against both physical and cyber anomalies.
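The two data-handling ingredients named above, Gaussian augmentation and class re-weighting, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes additive zero-mean Gaussian noise for augmentation and inverse-frequency class weights for re-weighting, and the function names, the `sigma` value, and the toy feature matrix are hypothetical.

```python
import numpy as np


def gaussian_augment(features, sigma=0.01, rng=None):
    """Return a copy of `features` perturbed by zero-mean Gaussian noise.

    Assumption: additive noise with standard deviation `sigma`; the
    abstract does not specify the augmentation parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    return features + rng.normal(0.0, sigma, size=features.shape)


def inverse_frequency_weights(labels, num_classes):
    """Per-class weights n_samples / (n_present_classes * class_count).

    Assumption: inverse-frequency weighting stands in for the paper's
    re-weighting mechanism. Classes with no samples get weight 0.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    weights = np.zeros(num_classes)
    present = counts > 0
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights


# Hypothetical toy data: 9 samples, 5 features, 3 imbalanced fault classes.
rng = np.random.default_rng(0)
x = rng.random((9, 5))
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])

x_aug = gaussian_augment(x, sigma=0.01, rng=rng)   # noisy view of each sample
class_w = inverse_frequency_weights(y, num_classes=3)
sample_w = class_w[y]                              # per-sample loss weights
```

In this sketch, rarer classes receive proportionally larger loss weights, and the augmented view would feed whatever consistency or pseudo-labeling objective the semi-supervised framework uses for samples treated as unlabeled.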
Journal
IEEE Transactions on Dielectrics and Electrical Insulation