Failure Code Taxonomies for Plants: Getting Useful Data Out of CMMS

By Mark Strong on June 27, 2026


Ask most reliability engineers what their CMMS tells them about failure patterns on a specific asset class and you will get one of two answers: either a shrug, or a Pareto chart built from work order descriptions that read "pump leaking," "broke down again," and "fixed." Free-text failure notes are not data. They are a diary. And a diary cannot tell you which failure mode is costing you the most downtime, which cause is recurring across multiple assets, or which maintenance task would have prevented the last three emergency callouts. That gap starts with the failure code taxonomy.

Configure a Structured Failure Code Taxonomy in OxMaint — and Start Getting Analysable Reliability Data From Your Next Work Order

OxMaint enforces separate Mode, Cause, and Effect fields at work order closure — no free text, no inconsistent entries, no data that cannot be Pareto-analysed. Sign up free or book a demo to see the failure code configuration live.

Why Most Plant CMMS Data Is Useless for Reliability Analysis

The problem is almost never the CMMS. It is what technicians are allowed to type into it. When work order closure fields accept any free text, the same physical event gets recorded as "bearing noise," "vibration on pump 4," "overheating — bearing side," and "shut down — hot bearing" by four different technicians across two shifts. Those are the same failure mode. A Pareto analysis will treat them as four separate, low-frequency events — each one below the threshold for investigation. The real pattern stays invisible.
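To make the fragmentation concrete, here is a minimal sketch that counts the same five events twice: once as free-text notes and once as controlled mode codes. The notes paraphrase the four variants quoted above; the fifth event and the code labels are illustrative assumptions, not entries from ISO 14224.

```python
from collections import Counter

# Hypothetical closure notes: four free-text variants describing one
# overheating bearing, plus one unrelated seal leak.
free_text_notes = [
    "bearing noise",
    "vibration on pump 4",
    "overheating - bearing side",
    "shut down - hot bearing",
    "seal leaking",
]

# The same five events closed with a controlled mode code instead.
# The code labels are illustrative only.
coded_modes = [
    "Overheating - bearing housing",
    "Overheating - bearing housing",
    "Overheating - bearing housing",
    "Overheating - bearing housing",
    "Leaking - mechanical seal",
]

free_text_counts = Counter(free_text_notes)
coded_counts = Counter(coded_modes)

# Free text: every "failure mode" occurs exactly once, so nothing stands out.
print(max(free_text_counts.values()))
# Coded: the bearing problem accounts for four of the five events.
print(coded_counts["Overheating - bearing housing"])
```

With free text, the highest count in the Pareto is one. With codes, the bearing problem is immediately the top bar.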

40% variation in reported failure rates between datasets with strict equipment classification discipline and those without it (OREDA handbook data)
3 distinct fields required for every analysable failure record: Mode, Cause, and Effect (most CMMS implementations capture only one)
80% of downtime from specific failure modes that Pareto analysis can isolate, but only when codes are structured, consistent, and complete

The Three Pillars: Mode, Cause, and Effect

ISO 14224 and the OREDA handbook both organise failure data around three distinct concepts. Conflating any two of them produces data that looks complete but cannot be analysed. Each one answers a different question — and each one drives a different type of action.

PILLAR 1
Failure Mode
What observable state did the asset enter when it failed?
The symptom at the functional boundary. What changed that caused the asset to stop performing its required function — or to perform it in a degraded, intermittent, or erratic way. Mode describes the how of failure, not the why.
Examples:
Pump: Reduced flow output • Leaking from mechanical seal • Excessive vibration • Fails to start
Conveyor: Belt tracking off • Drive shaft seized • Reduced speed under load
Compressor: High discharge temperature • Pressure not reaching setpoint • Abnormal noise
Drives: Which assets have the same failure mode pattern? What maintenance task addresses this mode?
PILLAR 2
Failure Cause
What physical or operational mechanism produced the failure mode?
The mechanism at the component level that triggered the failure mode. ISO 14224 distinguishes between random causes (manufacturing defects, latent faults), wear-out causes (fatigue, erosion, corrosion), and systematic causes (design errors, installation faults, operating outside limits).
Examples:
Mechanical: Bearing contamination • Shaft misalignment • Lubrication starvation • Fatigue crack
Electrical: Insulation degradation • Overcurrent • Loose terminal connection
Process: Dry running • Cavitation • Fluid contamination • Thermal shock
Drives: Which PM task eliminates this cause? Is this cause systemic across multiple assets or isolated?
PILLAR 3
Failure Effect
What was the operational and business consequence of this failure?
The impact on the production system and the maintenance resource consumed. ISO 14224 classifies effects at three levels: the effect on the maintainable item, the effect on the equipment unit, and the effect on the plant or system. Most CMMS implementations capture none of them.
Examples:
Production: Full production loss • Reduced throughput • Quality degradation • No production impact
Safety: Hazard created • Environmental release • No safety impact
Maintenance: Immediate repair required • Deferred to next planned stop • Monitored and continued
Drives: Which failures justify emergency response? Which assets warrant higher PM frequency based on consequence?

These three fields together produce a record that is Pareto-analysable, RCM-traceable, and comparable across plants. Missing any one of them degrades the whole dataset — because you cannot answer "which failure cause is driving the most production-loss events on our pump fleet" without all three captured consistently on every work order. Sign up free on OxMaint to configure separate Mode, Cause, and Effect fields as mandatory work order closure requirements.
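As a rough illustration of why all three fields are needed, the sketch below stores each failure as a record with separate mode, cause, and effect values and then answers the pump-fleet question quoted above. The `FailureRecord` class, the `top_causes` helper, and the sample records are hypothetical; they do not represent any CMMS schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRecord:
    """One closed work order with the three fields kept separate."""
    asset_id: str
    equipment_class: str
    mode: str
    cause: str
    effect: str


# Illustrative records only; asset tags and values are made up.
records = [
    FailureRecord("P-101", "Centrifugal pump", "Leaking - mechanical seal",
                  "Dry running", "Full production loss"),
    FailureRecord("P-102", "Centrifugal pump", "Fails to start",
                  "Seized bearing", "Full production loss"),
    FailureRecord("P-103", "Centrifugal pump", "Leaking - mechanical seal",
                  "Dry running", "Full production loss"),
    FailureRecord("P-104", "Centrifugal pump", "Reduced flow output",
                  "Impeller wear", "Reduced throughput"),
    FailureRecord("M-201", "Electric motor", "Fails to start",
                  "Winding fault", "Full production loss"),
]


def top_causes(records, equipment_class, effect):
    """Count causes for one equipment class and one effect category."""
    matching = (
        r.cause
        for r in records
        if r.equipment_class == equipment_class and r.effect == effect
    )
    return Counter(matching).most_common()


print(top_causes(records, "Centrifugal pump", "Full production loss"))
# [('Dry running', 2), ('Seized bearing', 1)]
```

Drop any one of the three fields and the filter in `top_causes` has nothing to work with: without effect there is no way to isolate production-loss events, and without cause there is nothing to rank.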

The ISO 14224 Equipment Hierarchy: Where Taxonomy Starts

Before failure codes can be meaningful, the equipment hierarchy they attach to must be structured. ISO 14224 organises assets in a nine-level taxonomy that runs from industry at the top down to the individual part, with the maintainable item at level eight. For most manufacturing plants, the practical working levels are four:

Plant / Site → System → Equipment Unit → Maintainable Item
Example: Site A → Cooling Water System → Centrifugal Pump P-101 → Mechanical Seal

The critical discipline: failure codes are attached at the equipment unit level, not the system or site level. A failure record that says "Cooling Water System — Leaking" is unsearchable. A failure record that says "P-101 — Mechanical Seal — Reduced Flow — Dry Running — Deferred to Next Planned Stop" is analysable, comparable to P-102 through P-108, and directly traceable to a PM action.

A Working Failure Code Set for Common Plant Equipment

The table below is not exhaustive — it is a starting point. The codes are drawn from ISO 14224 Annex tables and the OREDA handbook, adapted for plant application. Each equipment class needs its own controlled vocabulary; the principle is the same across all of them.

Centrifugal Pumps
Failure Mode | Failure Cause | Effect Class
Reduced flow output | Impeller wear / Cavitation / Clogged suction | Reduced throughput
Leaking - mechanical seal | Dry running / Bearing-induced shaft movement / Contamination | Environmental release / Full loss
Excessive vibration | Shaft misalignment / Bearing wear / Impeller imbalance | Deferred repair / Immediate repair
Fails to start | Electrical supply fault / Seized bearing / Control signal fault | Full production loss
Overheating - bearing housing | Lubrication starvation / Bearing contamination / Overload | Immediate repair required
Conveyors and Material Handling
Failure Mode | Failure Cause | Effect Class
Belt tracking off | Roller misalignment / Belt splice failure / Load asymmetry | Reduced throughput
Drive shaft seized | Bearing failure / Overload / Lubrication failure | Full production loss
Belt tear / splice failure | Foreign object damage / Fatigue at splice / Overloading | Full production loss
Reduced speed under load | Drive belt slipping / Motor fault / Gearbox wear | Quality degradation / Reduced throughput
Abnormal noise - drive end | Gearbox wear / Bearing deterioration / Coupling fault | Monitored and continued
Electric Motors
Failure Mode | Failure Cause | Effect Class
Fails to start | Winding fault / Supply phase loss / Control circuit fault | Full production loss
Overheating - winding | Overload / Insulation degradation / Inadequate ventilation | Immediate repair / Full loss
Excessive vibration | Bearing wear / Rotor imbalance / Shaft misalignment | Deferred repair
High bearing temperature | Lubrication failure / Contamination / Excessive radial load | Immediate repair required
Trips on overcurrent | Driven equipment overload / Phase imbalance / Winding degradation | Full production loss

These are starting codes, not finished taxonomies. Each plant's controlled vocabulary should be built with technicians and reliability engineers together — the technicians who know what they observe, and the engineers who know what they need to analyse. A code set built without technician input will not be used correctly at closure.
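One way to hold starter code sets like these is a lookup keyed by equipment class, so that each class only ever offers its own modes. The sketch below is an assumption about how such a vocabulary could be represented in a script or import file; the names `MODES_BY_CLASS`, `allowed_modes`, and `is_valid_mode` are hypothetical and are not OxMaint configuration.

```python
# Starter mode vocabulary per equipment class, copied from the tables above.
MODES_BY_CLASS = {
    "Centrifugal pump": {
        "Reduced flow output",
        "Leaking - mechanical seal",
        "Excessive vibration",
        "Fails to start",
        "Overheating - bearing housing",
    },
    "Conveyor": {
        "Belt tracking off",
        "Drive shaft seized",
        "Belt tear / splice failure",
        "Reduced speed under load",
        "Abnormal noise - drive end",
    },
    "Electric motor": {
        "Fails to start",
        "Overheating - winding",
        "Excessive vibration",
        "High bearing temperature",
        "Trips on overcurrent",
    },
}


def allowed_modes(equipment_class):
    """Return the sorted dropdown options for one equipment class."""
    return sorted(MODES_BY_CLASS.get(equipment_class, set()))


def is_valid_mode(equipment_class, mode):
    """Check that a mode belongs to the vocabulary of its equipment class."""
    return mode in MODES_BY_CLASS.get(equipment_class, set())


print(allowed_modes("Conveyor"))
print(is_valid_mode("Electric motor", "Belt tracking off"))  # False
```

Keeping the vocabulary in one structure like this also gives technicians and engineers a single artefact to review together when the list is revised.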

The Five Rules of a CMMS Taxonomy That Works

01
Mode, Cause, and Effect must be separate mandatory fields
If they are optional, technicians under time pressure will skip them. If they are combined into one free-text field, the data cannot be split for Pareto analysis. Three fields, all required, all dropdown-enforced — before a work order can be closed.
02
Dropdowns, not free text — ever
Free text produces "pump leaking," "leak from pump seal," "mechanical seal — leaking," and "seal failure" as four separate records that describe one failure mode. A dropdown menu for each equipment class eliminates this immediately. "Leaking — Mechanical Seal" is one code, consistently applied across every shift and every technician.
03
Code sets are equipment-class specific, not plant-generic
A generic list of failure codes applied to all asset types creates the same problem as free text — "Leaking" on a pump means something completely different than "Leaking" on a pressure vessel or a hydraulic system. Build separate dropdown lists per equipment class, with modes that are only observable on that class.
04
Cause codes must distinguish systematic from random failures
ISO 14224 distinguishes between random causes (unpredictable, independent of age), wear-out causes (predictable, age-dependent), and systematic causes (design, installation, or operating condition errors that will repeat on every similar asset). If your cause codes do not preserve this distinction, you cannot identify which failures warrant design review versus which warrant PM interval adjustment.
05
Review and prune the code set every six months
A code set that grows without pruning becomes a dropdown with 80 options that technicians scroll past. Run a Pareto on which codes are actually being used. Codes with under five records in a 12-month period either reflect a genuinely rare event or a code nobody knows how to classify correctly. Investigate and either merge or remove. Keep the active vocabulary lean — 10 to 15 modes per equipment class is the practical ceiling for consistent application.
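Two of these rules translate directly into simple checks. The sketch below is illustrative only: `missing_fields` mimics the mandatory-field rule for a closure held as a plain dict, and `prune_candidates` lists codes with fewer than five records in the last 12 months, as described in rule 05. The function names, the dict layout, and the 365-day window are assumptions.

```python
from collections import Counter
from datetime import date, timedelta

REQUIRED_FIELDS = ("mode", "cause", "effect")


def missing_fields(closure):
    """Return the required fields that are absent or blank in a closure dict."""
    return [f for f in REQUIRED_FIELDS if not str(closure.get(f) or "").strip()]


def prune_candidates(code_set, usage, today, min_records=5, window_days=365):
    """List codes used fewer than min_records times in the review window.

    usage is an iterable of (closure_date, mode_code) pairs.
    """
    cutoff = today - timedelta(days=window_days)
    counts = Counter(code for closed, code in usage if closed >= cutoff)
    # Counter returns 0 for codes that never appear in the window.
    return sorted(code for code in code_set if counts[code] < min_records)


# Hypothetical usage history for three codes.
today = date(2026, 6, 27)
usage = (
    [(date(2026, 1, 10), "Excessive vibration")] * 6
    + [(date(2026, 3, 1), "Fails to start")] * 2
    + [(date(2024, 1, 1), "Reduced flow output")] * 9  # outside the window
)
codes = {"Excessive vibration", "Fails to start", "Reduced flow output"}

print(missing_fields({"mode": "Fails to start", "cause": " ", "effect": None}))
# ['cause', 'effect']
print(prune_candidates(codes, usage, today))
# ['Fails to start', 'Reduced flow output']
```

The output of `prune_candidates` is a list for investigation, not for automatic deletion: a rarely used code may still describe a genuinely rare event.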

What Analysable Failure Data Makes Possible

Structured failure codes are not an admin exercise. They are the foundation for every reliability analysis that moves a maintenance programme from reactive to proactive. Here is what becomes possible once Mode, Cause, and Effect are captured consistently. Book a demo to see how OxMaint's failure code configuration enables all four from day one.

Pareto Analysis
Rank failure modes by frequency and by downtime impact separately. The mode that fails most often is not always the one costing you most production. Only structured codes let you run both analyses and compare them.
Root Cause Identification
When the same cause code appears on five different pump assets over six months, the cause is systematic — not random. Without cause codes, each event looks isolated. With them, the fleet-wide pattern is visible in a single query.
PM Optimisation
Match PM task content to the actual failure causes occurring in your plant. If "lubrication starvation" appears as a recurring cause across a bearing class, the PM interval or lubricant specification needs review — not more technician training.
Cross-Site Benchmarking
Failure rates on the same equipment class across multiple sites are only comparable if both sites use the same code vocabulary. With a consistent taxonomy, a site running 230-hour MTBF on pump class A can be compared directly to a sister site running 310 hours — and the cause codes reveal why.
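The first two analyses can be sketched in a few lines once the codes are structured. The example below ranks modes by event count and by downtime hours separately, then counts how many distinct assets share each cause code. The event fields (`asset`, `mode`, `cause`, `downtime_h`) and the sample values are hypothetical.

```python
from collections import defaultdict


def pareto(events, key, weight=None):
    """Rank values of `key` by count, or by summed `weight` if given.

    Returns (value, total, cumulative_percent) tuples, largest first.
    """
    totals = defaultdict(float)
    for event in events:
        totals[event[key]] += event[weight] if weight else 1
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0.0
    for value, total in ranked:
        running += total
        rows.append((value, total, round(100 * running / grand_total, 1)))
    return rows


def assets_per_cause(events):
    """Count the distinct assets on which each cause code has appeared."""
    assets = defaultdict(set)
    for event in events:
        assets[event["cause"]].add(event["asset"])
    return {cause: len(ids) for cause, ids in assets.items()}


# Illustrative events only.
events = [
    {"asset": "P-101", "mode": "Excessive vibration", "cause": "Shaft misalignment", "downtime_h": 1.0},
    {"asset": "P-102", "mode": "Excessive vibration", "cause": "Bearing wear", "downtime_h": 1.5},
    {"asset": "P-103", "mode": "Excessive vibration", "cause": "Shaft misalignment", "downtime_h": 0.5},
    {"asset": "P-104", "mode": "Fails to start", "cause": "Seized bearing", "downtime_h": 12.0},
    {"asset": "P-101", "mode": "Leaking - mechanical seal", "cause": "Dry running", "downtime_h": 4.0},
]

print(pareto(events, "mode")[0])                # most frequent mode
print(pareto(events, "mode", "downtime_h")[0])  # most costly mode
print(assets_per_cause(events))
```

In this made-up data the most frequent mode is excessive vibration, while the mode costing the most downtime is a single fails-to-start event, which is exactly the divergence the Pareto paragraph describes.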

The Most Common Implementation Mistakes

High Risk
Coding at system level instead of equipment unit
Recording "Cooling Water System — High Temperature" tells you nothing about which asset failed. The failure code must be attached to the specific equipment unit — Pump P-101 — so that asset-level MTBF can be calculated and cause patterns can be compared across the asset class.
High Risk
Allowing "other" or "unknown" as cause codes without follow-up
"Unknown" cause codes are legitimate at initial closure when root cause has not yet been confirmed. They are a data quality failure if they remain unresolved. Every "unknown" cause code should trigger a follow-up investigation task, with the result recorded before the work order is fully closed.
Medium Risk
Building the code set without technician input
Reliability engineers sometimes build taxonomy in a spreadsheet and load it into the CMMS without validation from the technicians who will use it. Codes that do not map to what technicians observe will either be misapplied or ignored. Hold two workshops: one to validate modes against what technicians actually see, one to validate causes against what they find on teardown.
Medium Risk
Treating failure effect as a severity score rather than a consequence category
Numeric severity scores (1–5) are common and almost useless for reliability analysis because they are subjective and shift between technicians. Replace them with consequence categories: Full Production Loss, Reduced Throughput, Quality Degradation, No Production Impact, Safety or Environmental Release. These are objective, consistently applied, and directly linkable to financial impact.
Data Quality
Not validating data quality in the first 90 days
The first 90 days of a new taxonomy implementation will surface miscoding — technicians unsure where to classify a borderline event will make inconsistent choices. Run a monthly data quality check for the first quarter: how many work orders have "unknown" cause codes, how many modes are being used for multiple asset classes incorrectly, which codes have suspiciously high frequency suggesting they are being used as defaults.
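The monthly check described above can be approximated with a short report over closed work orders. This is a sketch under assumptions: each work order is a dict with `equipment_class`, `mode`, and `cause` keys, the literal cause code is `"Unknown"`, and a mode is flagged as a possible default when it exceeds half of all records. A mode appearing on several classes is only a prompt for review, since some modes, such as "Fails to start", legitimately apply to more than one class.

```python
from collections import Counter


def quality_report(work_orders, default_share=0.5):
    """Summarise three coding-quality signals for closed work orders."""
    total = len(work_orders)
    unknown = sum(1 for wo in work_orders if wo["cause"] == "Unknown")

    # Modes recorded against more than one equipment class.
    classes_by_mode = {}
    for wo in work_orders:
        classes_by_mode.setdefault(wo["mode"], set()).add(wo["equipment_class"])
    shared = sorted(m for m, classes in classes_by_mode.items() if len(classes) > 1)

    # Modes whose share of all records exceeds the chosen threshold.
    mode_counts = Counter(wo["mode"] for wo in work_orders)
    dominant = sorted(
        m for m, n in mode_counts.items() if total and n / total > default_share
    )

    return {
        "unknown_cause_share": unknown / total if total else 0.0,
        "modes_on_multiple_classes": shared,
        "possible_default_modes": dominant,
    }


# Illustrative month of closures.
work_orders = [
    {"equipment_class": "Centrifugal pump", "mode": "Excessive vibration", "cause": "Unknown"},
    {"equipment_class": "Centrifugal pump", "mode": "Excessive vibration", "cause": "Bearing wear"},
    {"equipment_class": "Centrifugal pump", "mode": "Excessive vibration", "cause": "Unknown"},
    {"equipment_class": "Electric motor", "mode": "Excessive vibration", "cause": "Rotor imbalance"},
    {"equipment_class": "Conveyor", "mode": "Belt tracking off", "cause": "Roller misalignment"},
]

print(quality_report(work_orders))
```

Tracking these three numbers month over month during the first quarter shows whether miscoding is shrinking as technicians get used to the dropdowns.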

Replace Free-Text Failure Notes With Structured, Analysable Data

OxMaint enforces separate Mode, Cause, and Effect dropdowns at work order closure — configured per equipment class, mandatory before closure, and immediately available for Pareto analysis and root cause reporting. Sign up free to configure your failure code taxonomy, or book a demo to see the configuration workflow.

Frequently Asked Questions

What is the difference between a failure mode and a failure cause in ISO 14224?

A failure mode is the observable state the asset entered when it failed — what changed at the functional boundary. "Leaking from mechanical seal" is a mode. A failure cause is the physical or operational mechanism that produced the mode — why it happened at the component level. "Dry running" or "shaft movement due to bearing wear" are causes of a mechanical seal failure. ISO 14224 requires both to be captured separately because they drive different actions: modes drive maintenance task selection, causes drive design or procedure review. Conflating them is the single most common failure code mistake in plant CMMS implementations.

What is OREDA and how does it relate to failure code taxonomy?

OREDA, originally short for Offshore Reliability Data and now Offshore and Onshore Reliability Data, is a joint industry project run by oil and gas operators that has published equipment reliability handbooks since 1984. Its data collection practice fed into the development of ISO 14224, and its taxonomy is aligned with that standard. The OREDA handbook provides failure mode and cause vocabulary, failure rate data, and equipment subdivision guidance for major equipment classes. Its significance for plant taxonomy is twofold: it validates which failure modes are worth including in your controlled vocabulary (those with material frequency in the data), and it provides a consistency basis that makes your plant data comparable to published benchmarks, but only if your taxonomy aligns with its classification structure.

How many failure mode codes should each equipment class have?

Ten to fifteen modes per equipment class is the practical ceiling for consistent application by technicians. Beyond that, technicians face decision fatigue at work order closure and begin defaulting to the most familiar code rather than the most accurate one. The initial list can be longer, but it should be pruned after the first six months of use: any code that appears fewer than five times in 12 months either represents a genuinely rare event or a code that nobody knows how to classify — both require investigation. Start with the ten most common observable states for the equipment class and add from field experience.

Why does coding at equipment unit level matter more than coding at system level?

MTBF, repeat failure analysis, and PM optimisation all depend on failure records being linkable to a specific asset with a known operating history. A failure record attached to "Cooling Water System" cannot contribute to MTBF calculation for any specific pump, cannot be compared to other pumps of the same class, and cannot trigger a PM review for the specific asset that is failing repeatedly. ISO 14224 is explicit that failure records must be attached at the equipment unit level — the specific item that was repaired — not the parent system or functional location. This is the most common structural error in plant asset hierarchies.

How does OxMaint enforce failure code discipline at work order closure?

OxMaint configures separate Failure Mode, Failure Cause, and Failure Effect dropdown fields at the equipment-class level — so a technician closing a pump work order sees only pump-relevant mode options, not a generic plant-wide list. All three fields are set as mandatory before a work order can be marked complete, preventing free-text entries and incomplete records. Administrators can update the taxonomy centrally — adding, merging, or retiring codes — and changes propagate immediately to all mobile users without requiring app updates. Sign up free to configure your first equipment class taxonomy today.

