A production director we spoke with last year had a simple question: "How many months before your AI system saves its first motor from failure?" The honest, uncomfortable answer: it depends on how much data you collected before deployment. If you start with fewer than six months of history, even a full year in operation probably won't be enough.
Predictive maintenance (PdM) is one of the domains where AI/ML delivers verifiable results. A 30–50% reduction in unplanned downtime is a number that repeats across projects from different industrial sectors. It is also a domain where marketing hype and actual production results diverge more sharply than almost anywhere else. The aim of this article is to map where an ML model genuinely adds value, what data preconditions you must meet, and when a rule-based approach is the smarter starting point.
What predictive maintenance actually solves
Traditional maintenance operates in two modes: reactive (fix after the failure) and preventive (replace on a schedule regardless of actual condition). Both have obvious weaknesses — reactive generates unplanned outages, preventive leads to premature replacement of parts that had years of life left.
Predictive maintenance adds a third mode: you intervene when equipment is genuinely beginning to degrade — not before, not after. In practice this means:
- Continuous data collection from sensors (vibration, temperature, acoustic emission, current spectrum)
- Anomaly detection or direct estimation of remaining useful life (RUL)
- Operator alerts typically 3–14 days before the predicted failure
It works well on assets with a well-characterised degradation process — bearings, electric motors, compressors, pumps, gearboxes. For exactly these asset types, vibration frequency spectra are a reliable carrier of wear information.
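To make the idea of a frequency spectrum concrete, here is a minimal sketch that recovers the dominant frequencies from a vibration signal. The signal itself (a 50 Hz rotation tone plus a weaker 120 Hz defect tone) and the sample rate are invented for illustration. The naive DFT is used only to keep the example dependency-free; a real pipeline would normally use an FFT routine such as `numpy.fft.rfft`.

```python
import cmath
import math

def amplitude_spectrum(signal, sample_rate_hz):
    """Return (frequency_hz, amplitude) pairs using a naive DFT (fine for short windows)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):  # only frequencies below the Nyquist limit
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append((k * sample_rate_hz / n, 2 * abs(coeff) / n))
    return spectrum

# Hypothetical signal: 50 Hz shaft rotation plus a weaker 120 Hz defect tone
SAMPLE_RATE_HZ = 512
N_SAMPLES = 512
signal = [
    1.0 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE_HZ)
    + 0.3 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE_HZ)
    for t in range(N_SAMPLES)
]

spectrum = amplitude_spectrum(signal, SAMPLE_RATE_HZ)
# Pick the two strongest spectral lines
top_two = sorted(spectrum, key=lambda pair: pair[1], reverse=True)[:2]
peak_freqs = sorted(freq for freq, _ in top_two)
print(peak_freqs)  # [50.0, 120.0]
```

A growing amplitude at a defect-related frequency, tracked over weeks, is the kind of wear signal the models discussed below work with.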
Where an ML model genuinely adds value
Anomaly detection without labelled data
The most universally accessible entry point is training a model exclusively on normal operating states. The model learns the distribution of the "healthy" signal, and any deviation — including one nobody has seen before — triggers an alert. Technically this covers autoencoders, methods such as Isolation Forest, and Vision Transformers trained on vibration spectrograms.
Advantage: no historical failure records required. Disadvantage: a higher false-positive rate in the first months of operation while threshold values are calibrated to the specific asset.
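The principle of training on healthy data only can be sketched with something far simpler than an autoencoder or Isolation Forest. The example below swaps in a plain z-score detector as a stand-in: it learns the mean and spread of a healthy signal and flags anything too far outside it. The readings, the class name, and the threshold of 4.0 are assumptions for illustration, not recommended values.

```python
import random
import statistics

class HealthyBaselineDetector:
    """Flags readings that deviate from a baseline learned on healthy data only."""

    def __init__(self, z_threshold=4.0):
        self.z_threshold = z_threshold  # assumed value; must be tuned per asset
        self.mean = None
        self.stdev = None

    def fit(self, healthy_values):
        self.mean = statistics.fmean(healthy_values)
        self.stdev = statistics.stdev(healthy_values)
        return self

    def score(self, value):
        """Distance from the healthy mean, in standard deviations."""
        return abs(value - self.mean) / self.stdev

    def is_anomaly(self, value):
        return self.score(value) > self.z_threshold

rng = random.Random(42)
# Hypothetical healthy vibration RMS readings in mm/s (no failures included)
healthy_rms = [rng.gauss(2.0, 0.1) for _ in range(1000)]
detector = HealthyBaselineDetector().fit(healthy_rms)

print(detector.is_anomaly(2.05))  # False: inside the healthy band
print(detector.is_anomaly(3.2))   # True: far outside anything seen in training
```

The calibration problem mentioned above lives in `z_threshold`: set it too low and operators drown in false positives, too high and early degradation goes unnoticed.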
Fault classification and RUL estimation
When you have enough history — and that is the critical qualifier — a model can classify the specific fault type and estimate remaining useful life. This is where figures such as 88–97% prediction accuracy appear for well-instrumented assets after 6–12 months of data collection. Those numbers come from benchmark datasets, not necessarily from the first year in production on your specific equipment.
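As a rough illustration of what an RUL estimate is, the sketch below fits a least-squares line to a rising health indicator and extrapolates it to a failure threshold. This is a deliberately simple trend extrapolation, not a trained RUL model; the function name, the data, and the 4.5 mm/s threshold are hypothetical.

```python
def estimate_rul_days(days, indicator, failure_threshold):
    """Fit a least-squares line to a health indicator and extrapolate to the threshold.

    Returns the estimated days remaining after the last observation,
    or None if the indicator is not trending upward.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(indicator) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, indicator))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward trend, so no crossing can be extrapolated
    intercept = mean_y - slope * mean_x
    crossing_day = (failure_threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])

# Hypothetical daily vibration RMS (mm/s) rising by 0.1 per day
days = list(range(10))
rms = [2.0 + 0.1 * d for d in days]
rul = estimate_rul_days(days, rms, failure_threshold=4.5)
print(round(rul, 1))  # 16.0
```

Real degradation is rarely linear, which is one reason production RUL models need the failure history described above.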
Multi-modal signal fusion
Combining vibration, temperature, acoustic emission, and current spectrum in one model delivers substantially better accuracy than single-sensor approaches. In practice this means more sensors and a more robust data pipeline, but for critical assets the investment is justified.
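One way to see why fusion helps is a score-level sketch: each channel is converted to a z-score against its healthy baseline, and the squared z-scores are summed. This is a simplified stand-in for a single multi-sensor model, and it assumes the channels are roughly independent and Gaussian in the healthy state, which real sensors only approximate. All baselines and readings are hypothetical.

```python
# Healthy-state baselines per channel: (mean, standard deviation). Hypothetical values.
BASELINES = {
    "vibration_rms_mm_s": (2.0, 0.2),
    "bearing_temp_c": (60.0, 2.0),
    "acoustic_db": (70.0, 1.5),
    "current_a": (12.0, 0.4),
}
SINGLE_CHANNEL_Z_LIMIT = 3.0
# 99.9th percentile of a chi-square distribution with 4 degrees of freedom
FUSED_LIMIT = 18.47

def z_scores(reading):
    return {name: (reading[name] - mean) / std for name, (mean, std) in BASELINES.items()}

def single_channel_alarm(reading):
    """Alarm only if at least one channel alone exceeds its limit."""
    return any(abs(z) > SINGLE_CHANNEL_Z_LIMIT for z in z_scores(reading).values())

def fused_alarm(reading):
    """Alarm on the combined evidence: sum of squared z-scores across channels."""
    return sum(z * z for z in z_scores(reading).values()) > FUSED_LIMIT

# Every channel sits 2.5 standard deviations above its baseline
reading = {
    "vibration_rms_mm_s": 2.5,
    "bearing_temp_c": 65.0,
    "acoustic_db": 73.75,
    "current_a": 13.0,
}
print(single_channel_alarm(reading))  # False: no channel crosses 3 sigma on its own
print(fused_alarm(reading))           # True: together the deviations are very unlikely
```

No single sensor would have raised an alert here, yet the combined picture is clearly abnormal.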
Generative synthetic data for rare failures
One of PdM's biggest challenges is that catastrophic failures are rare, so the model never sees enough examples of them. An approach that has proven itself is generating synthetic failure events with GANs or diffusion models, which lets you augment the training set in places where real data simply does not exist.
Data preconditions — what the vendor won't tell you
Before any decision to deploy an ML solution, these questions must be answered:
1. How long have we been collecting sensor data? Less than six months means a cold-start problem. The model has no way to know what "abnormal" looks like because it has not seen enough normal states or their seasonal variations.
2. Do we have recorded failure events? Dates and types of faults must be in the system. Without them it is impossible to train a fault-classification model or an RUL model.
3. Are the data synchronised? Sensors, SCADA, ERP, and maintenance logs must share consistent timestamps. A one-hour offset between sources can throw the entire model off.
4. What is the sampling frequency? Vibration analysis typically requires hundreds of Hz at a minimum, and the rate must exceed twice the highest fault frequency you want to detect, which for bearing faults often means several kHz. For temperature, one reading a minute is sufficient. An undersampled signal will not reveal the problem.
5. Are the data clean? Watch for dead sensors, stuck values, and communication interruptions. Data quality is a bigger problem in practice than model selection.
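Several of these checks can be automated before any model is trained. The sketch below shows three of them: detecting stuck values, finding gaps in the timestamps, and checking the sampling rate against the highest frequency of interest. Function names, the minimum run length, the gap tolerance, and the sample data are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def find_stuck_runs(values, min_run=5):
    """Return (start_index, length) for runs of identical consecutive values."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = i
    return runs

def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (previous, next) timestamp pairs further apart than tolerance x interval."""
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]

def satisfies_nyquist(sample_rate_hz, highest_frequency_hz):
    """Sampling must exceed twice the highest frequency you want to observe."""
    return sample_rate_hz > 2 * highest_frequency_hz

# Hypothetical temperature log, nominally one reading per minute
t0 = datetime(2024, 1, 1, 8, 0)
timestamps = [t0 + timedelta(minutes=m) for m in (0, 1, 2, 3, 9, 10, 11, 12, 13, 14)]
values = [61.2, 61.4, 61.3, 61.5, 63.0, 63.0, 63.0, 63.0, 63.0, 63.0]

stuck = find_stuck_runs(values)
gaps = find_gaps(timestamps, timedelta(minutes=1))
print(stuck)      # [(4, 6)]: six identical readings starting at index 4
print(len(gaps))  # 1: the jump from minute 3 to minute 9
print(satisfies_nyquist(sample_rate_hz=500, highest_frequency_hz=400))  # False
```

Running checks like these on historical data is a cheap way to learn whether the pipeline is ready before paying for an ML platform.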
We have seen projects where a company paid for an ML platform before its data-collection pipeline was in order. The outcome: the model trained on noise and the first alarms were false positives. That quickly erodes operator trust — which is a worse situation than if PdM had never been deployed at all.
When a simpler rule is enough
An ML model is not always the right choice. There are scenarios where a rule-based system or statistical process control (SPC) delivers the same or better result at a fraction of the cost:
- Assets with unambiguous threshold values — if a motor heats to 95 °C, that is an alarm. No neural network required.
- New equipment without history — the model has nothing to train on; rules work from day one.
- Few assets of the same type — ML models do not prove themselves on small samples.
- Regulated processes requiring auditability — the rule "alarm when X is exceeded" is easy to explain to a regulator. A black-box model much less so.
The sensible path: start with rules and SPC, collect data, and once you have 12+ months of history with a sufficient number of failure events — then it is worth considering ML.
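A rules-plus-SPC starting point fits in a few lines. The sketch below combines a fixed threshold rule, using the 95 °C motor limit from the list above, with Shewhart-style control limits computed from a stable baseline. The baseline current values are hypothetical, and three sigma is a common convention rather than a requirement.

```python
import statistics

TEMP_ALARM_C = 95.0  # fixed limit from the motor example above

def temperature_rule(temp_c):
    """Plain threshold rule: auditable and usable from day one."""
    return temp_c >= TEMP_ALARM_C

def control_limits(baseline, sigmas=3.0):
    """Shewhart-style limits: mean +/- sigmas x standard deviation of the baseline."""
    mean = statistics.fmean(baseline)
    std = statistics.stdev(baseline)
    return mean - sigmas * std, mean + sigmas * std

def out_of_control(value, limits):
    lower, upper = limits
    return value < lower or value > upper

# Hypothetical baseline of motor current (A) from a period of stable operation
baseline_current = [12.1, 11.9, 12.0, 12.2, 11.8, 12.0, 12.1, 11.9, 12.0, 12.0]
limits = control_limits(baseline_current)

print(temperature_rule(96.5))        # True: above the fixed limit
print(out_of_control(12.1, limits))  # False: normal variation
print(out_of_control(13.0, limits))  # True: outside the control limits
```

Every alarm produced this way can be explained in one sentence, which is exactly the auditability argument made above.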
ROI reality: what to count and what not to
Marketing materials cite 30–50% reductions in unplanned downtime. Those numbers are realistic — but reaching them takes time. A real ROI calculation must account for:
On the benefits side:
- Reduced cost of unplanned outages (line hourly rate × average outage duration × frequency)
- Extended part lifetime through fewer premature replacements
- Reduced safety stock of spare parts
On the cost side:
- Sensor infrastructure (retrofitting legacy equipment is expensive)
- Data pipeline and integration with SCADA/ERP
- Licensing or in-house development of the ML platform
- Model calibration time: the first 3 months with false positives are unproductive
- Ongoing model management (drift, new assets, firmware updates)
Across dozens of deployments we see that break-even typically arrives in year 2–3, not year 1 — when starting from scratch. If you already have an OPC-UA or SCADA system with clean historical data, you can move faster.
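The break-even logic can be worked through with a short calculation. Every number below is a hypothetical placeholder, including the assumed ramp-up of downtime reduction from 10% in year 1 to 40% from year 3 onward; substitute your own figures.

```python
def annual_outage_cost(line_hourly_rate, avg_outage_hours, outages_per_year):
    """Cost of unplanned outages: hourly rate x average duration x frequency."""
    return line_hourly_rate * avg_outage_hours * outages_per_year

def break_even_year(upfront_cost, annual_running_cost, annual_savings_by_year):
    """Return the first year in which cumulative savings cover cumulative cost, else None."""
    cumulative = -upfront_cost
    for year, savings in enumerate(annual_savings_by_year, start=1):
        cumulative += savings - annual_running_cost
        if cumulative >= 0:
            return year
    return None

# All figures are hypothetical placeholders, not benchmarks
baseline_cost = annual_outage_cost(line_hourly_rate=5_000, avg_outage_hours=6, outages_per_year=10)
# Assumed ramp-up of downtime reduction while the model calibrates
reduction_by_year = [0.10, 0.30, 0.40, 0.40, 0.40]
savings_by_year = [baseline_cost * r for r in reduction_by_year]

year = break_even_year(
    upfront_cost=140_000,
    annual_running_cost=30_000,
    annual_savings_by_year=savings_by_year,
)
print(baseline_cost, year)  # 300000 3
```

With these inputs the first year only covers running costs, and the upfront investment is recovered in year 3, which matches the pattern described above.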
This connects to the hardware budget question — more in the article AI Copilot for Operators, where we examine the economics of edge AI devices.
Integration with existing systems
Predictive maintenance is not an isolated module — it must be connected to your existing systems:
- CMMS/EAM (Computerized Maintenance Management System / Enterprise Asset Management): a PdM alert must automatically generate a work order, not just a notification on a dashboard. If the operator must manually transcribe an alarm into the CMMS, the system will stop being used very quickly.
- SCADA / OPC-UA: SCADA is the supervisory layer most plants already run, and OPC-UA is the standard protocol for exchanging industrial data. Most modern PdM platforms support both natively.
- ERP: the connection to spare-parts ordering is critical for a closed loop.
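The alert-to-work-order step might look something like the sketch below. The `PdmAlert` and `WorkOrder` structures, their field names, the two-day safety margin, and the priority rule are all hypothetical; a real integration would follow the schema of your CMMS API.

```python
from dataclasses import dataclass, asdict
from datetime import date, timedelta

@dataclass
class PdmAlert:
    asset_id: str
    fault_type: str
    predicted_failure: date
    confidence: float

@dataclass
class WorkOrder:
    asset_id: str
    description: str
    due_date: date
    priority: str

def alert_to_work_order(alert, today, safety_margin_days=2):
    """Translate an alert into a work order due before the predicted failure."""
    due = max(today, alert.predicted_failure - timedelta(days=safety_margin_days))
    days_left = (alert.predicted_failure - today).days
    priority = "high" if days_left <= 3 else "normal"  # assumed priority rule
    return WorkOrder(
        asset_id=alert.asset_id,
        description=f"PdM alert: {alert.fault_type} (confidence {alert.confidence:.0%})",
        due_date=due,
        priority=priority,
    )

alert = PdmAlert("PUMP-07", "bearing wear", date(2024, 3, 15), 0.87)
order = alert_to_work_order(alert, today=date(2024, 3, 5))
payload = asdict(order)  # plain dict, ready to serialise for a CMMS API call
print(payload["description"])  # PdM alert: bearing wear (confidence 87%)
print(payload["due_date"], payload["priority"])  # 2024-03-13 normal
```

The point is that no human retypes anything: the alert carries enough information to create the work order directly.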
An architecture where PdM sits as a silo with its own dashboard and no connection to operational processes is one of the most common reasons projects fail to deliver the expected ROI. BMS integration with the SCADA layer is covered in more detail in the article on BMS, KNX, and Loxone.
Digital twins and multi-agent approaches
A new architecture is emerging in the current generation of PdM solutions: digital twins of equipment combined with AI agents. Instead of a single classification model you have a network of agents — one collects and cleans data, a second classifies the degradation type, a third simulates remaining useful life on a physics model of the asset, a fourth generates a recommendation for the technical team.
This approach is more powerful than a single-purpose ML model, but also more expensive to implement and operate. It makes sense where the asset is sufficiently valuable (CNC machining centre, turbine, compressor station) and where a wrong prediction means genuinely high costs. For ordinary pumps in a secondary circuit it is over-engineered.
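The division of labour between the four agents can be sketched as a chain of plain functions. This is only a structural illustration: each function is a trivial placeholder for what would be a separate service or model, the "physics model" is replaced by a linear extrapolation per sample, and all names and thresholds are hypothetical.

```python
def data_agent(raw_readings):
    """Agent 1, collect and clean: drop missing readings."""
    return [r for r in raw_readings if r is not None]

def classification_agent(readings, warn_level=3.0):
    """Agent 2, classify degradation: placeholder rule on the latest reading."""
    return "bearing wear" if readings[-1] > warn_level else "healthy"

def rul_agent(readings, failure_level=4.5):
    """Agent 3, stand-in for a physics model: extrapolate the average rise per sample."""
    rate = (readings[-1] - readings[0]) / (len(readings) - 1)
    if rate <= 0:
        return None
    return (failure_level - readings[-1]) / rate

def recommendation_agent(fault, rul_samples):
    """Agent 4, turn the findings into a message for the technical team."""
    if fault == "healthy":
        return "No action"
    if rul_samples is None:
        return f"Schedule inspection for {fault}"
    return f"Schedule inspection for {fault} within {int(rul_samples)} days"

def run_pipeline(raw_readings):
    readings = data_agent(raw_readings)
    fault = classification_agent(readings)
    rul = rul_agent(readings)
    return recommendation_agent(fault, rul)

# Hypothetical vibration RMS readings (mm/s), one per day, with one dropped value
print(run_pipeline([2.0, 2.25, None, 2.75, 3.0, 3.25, 3.5]))
print(run_pipeline([2.0, 2.1, 2.0]))  # No action
```

In a real system each stage would run independently and exchange messages, which is where the additional implementation and operating cost comes from.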
We discuss multi-agent system architectures in more detail in AI in Practice: Multi-Agent Systems.
Predictive maintenance vs. AI visual inspection — where the boundary lies
Predictive maintenance and visual inspection are two distinct domains, even though both fall under "AI in industry." PdM works with time-series sensor data and answers the question when will the asset fail. Visual inspection works with images and answers the question whether a specific part has a defect or not. More on visual inspection in AI Visual Quality Control.
Interestingly, these domains are beginning to converge — vibration spectrograms are now processed by visual models (CNNs), while camera systems simultaneously track thermal profiles. Multi-modal fusion is the direction the entire field is heading.
Frequently asked questions
How long before a PdM system starts working reliably?
In practice, count on 6–12 months from the start of data collection to the first reliable predictions. The first 3 months are characterised by a higher false-positive rate (around 10–15%) while the model calibrates to the specific asset and its operating cycles. This time can be shortened if you have archived historical data from a SCADA system — but it must be clean and synchronised with maintenance logs.
Are standard industrial IoT sensors sufficient for PdM?
Sensors are only the first step. Equally important are consistent time synchronisation, reliable data transmission without outages, and correct storage with at least 2 years of retention. The majority of PdM project failures are unrelated to model selection; they stem from a poor-quality data pipeline with dead sensors, stuck values, and time offsets between sources.
When does it make sense to use ML instead of rules?
An ML model pays off when you have at least 12 months of clean sensor data with a sufficient number of recorded failure events, enough assets of the same type, and a degradation process that cannot be trivially captured with a simple threshold. For assets with an unambiguous limit value (temperature, pressure, current), rules are simpler, more auditable, and equally effective.
What is realistic prediction accuracy?
Under ideal conditions — well-instrumented asset, 6–12 months of data, recurring failure types — figures of 88–97% accuracy are cited. These numbers come predominantly from benchmark datasets and controlled studies. In the first year of real production with new assets, expect lower accuracy until the model absorbs seasonal variations and different operating modes.
Do I need to replace my entire CMMS to deploy PdM?
No. Most modern PdM platforms can integrate via API or standard connectors into existing CMMS and ERP systems. The critical requirement is that a PdM alert automatically generates a work order — manual transcription is not sustainable long-term. If your CMMS lacks an API, middleware solutions typically bridge this gap without requiring a full system replacement.
*If you are deciding whether and how to deploy predictive maintenance in your facility, we are glad to assess your sensor infrastructure and data — and tell you directly whether starting with ML makes sense, or whether a different first step is more worthwhile. Contact us for a no-obligation consultation.*
