Towards Reliable Diagnostics Under Data Scarcity - Machine Learning for Manufacturing Equipment
Time: Fri 2026-06-05 14.00
Location: F3, Lindstedtvägen 26-28
Video link: https://kth-se.zoom.us/j/62520529067
Language: English
Subject area: Production Engineering
Doctoral student: Eleonora Iunusova, Tillverknings- och mätsystem
Opponent: Professor Anders Skoogh, Chalmers University of Technology
Supervisor: Professor Andreas Archenti, Design and Management of Manufacturing Systems, DMMS, Tillverknings- och mätsystem; Dr Robert Tomkowski, Tillverknings- och mätsystem
Abstract
The ongoing digitalization of industrial systems is transforming maintenance practices by enabling continuous monitoring and data collection. This creates the foundation for data-driven approaches and enables advanced diagnostics through the use of machine learning methods in maintenance applications. To achieve reliable diagnostic performance, supervised machine learning methods rely on representative training data covering diverse operating conditions and characteristic signatures of different failure modes, with sufficient data quality, accurate labels, and proper annotations. However, in industrial maintenance, these requirements are commonly not satisfied due to heterogeneous operating regimes, harsh data acquisition conditions, and the inherently rare occurrence of faults. At the same time, maintenance applications are often safety-critical and associated with significant operational and economic risks, which motivates the need for reliable diagnostics even under data-constrained conditions.
This thesis treats data scarcity as an inherent and largely unavoidable constraint of industrial maintenance and develops a structured approach to characterize it, assess its effects on diagnostic reliability, and identify effective strategies to operate under such limitations. Data scarcity is formally defined as a multidimensional concept with five dimensions: availability, coverage, representativeness, usability, and quality. This definition establishes a framework for systematically assessing data-related limitations in industrial monitoring. Diagnostic reliability is characterized along three properties (accuracy, generalization, and robustness), which together define the basis for evaluating machine learning-based diagnostics under data scarcity.
The effects of data scarcity on diagnostic reliability are investigated through structured empirical studies that systematically vary controlled data scarcity factors, including data volume, fault sample ratio, and measurement degradation, across three transfer scenarios with increasing domain shift. Within these scenarios, classical machine learning and deep learning methods are combined with different knowledge transfer strategies, including domain adaptation, transfer learning, and joint learning, to examine how individual factors, their interactions, and the choice of learning strategy jointly determine diagnostic performance. This factorial approach connects the conceptual characterization of data scarcity directly to empirical evaluation, enabling quantitative assessment and analytical interpretation of model behavior under realistic industrial constraints.
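As a rough illustration of what such a factorial design can look like, the sketch below enumerates every combination of scarcity factors, transfer scenarios, and learning strategies. The factor names follow the abstract, but the specific levels, the scenario labels, and the function name `factorial_grid` are assumptions made for this example and are not taken from the thesis.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration; the abstract
# names the factors but does not state their levels.
DATA_VOLUME = [0.1, 0.5, 1.0]        # fraction of the training data used
FAULT_RATIO = [0.01, 0.05, 0.2]      # share of fault samples in the data
DEGRADATION = ["none", "noise"]      # measurement degradation applied
SCENARIOS = ["small_shift", "medium_shift", "large_shift"]  # placeholder labels
STRATEGIES = ["domain_adaptation", "transfer_learning", "joint_learning"]


def factorial_grid():
    """Return one dict per experimental condition (full factorial design)."""
    keys = ("volume", "fault_ratio", "degradation", "scenario", "strategy")
    levels = (DATA_VOLUME, FAULT_RATIO, DEGRADATION, SCENARIOS, STRATEGIES)
    return [dict(zip(keys, combo)) for combo in product(*levels)]


grid = factorial_grid()
print(len(grid))  # 3 * 3 * 2 * 3 * 3 = 162 conditions
```

Each entry of `grid` would correspond to one training and evaluation run. Covering all combinations, rather than varying one factor at a time, is what allows interactions between factors to be estimated.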
The results establish that model effectiveness is regime-dependent: no single learning strategy is universally optimal, and diagnostic performance is governed by the interaction between data characteristics, domain conditions, and the learning strategy employed. The findings are consolidated into practical insights for scarcity-aware machine learning, providing actionable guidance for data acquisition, model selection, and the design of reliable fault detection systems for condition-based maintenance in industrial environments.