
Using AI to Extract Critical Controls from Regulatory Documentation

PKG recently assessed a client's bowtie risk assessment controls against the new National Heavy Vehicle Regulator code of practice.

2 February 2026

A client recently came to us with a specific request: assess their bowtie risk assessment controls against the new National Heavy Vehicle Regulator (NHVR) 2026 Master Code of Practice.

Released in January 2026, the revised Code represents a significant shift from the 2018 version. It moves from a role-based structure to an activity-based structure, focusing on the risks associated with transport activities rather than specific job titles. It's been developed over two years with extensive industry input, and now covers sectors like buses and container transport that weren't addressed in the original.

The Code is comprehensive, prescribing 489 individual items across dozens of activity areas. Reviewing each one against an existing risk framework, classifying it, and determining relevance to your critical controls is weeks of painstaking administrative work.

It's also exactly the kind of task where AI can help. Not to replace critical thinking, but to dramatically reduce the administrative burden so you can focus on what matters: implementing the controls that actually prevent people getting killed.

Our approach

As part of an AI coaching program we're running with the client, we used this task as a live case study. We applied the ICMM critical control flowchart (the version slightly augmented by the Sustainable Minerals Institute) to classify every one of the 489 items in the Code.

The ICMM framework asks a simple question: Is this a physical object or human action that, directly and of itself, prevents harm? And if so, would its absence or failure significantly increase fatality risk?

That's the test for a critical control. It's a reasonably high bar: most things that get called "controls" in safety documentation don't pass it.
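The two-question test can be sketched as a simple decision rule. This is an illustrative reading of the flowchart, not the ICMM's own wording; the field names are ours:

```python
from dataclasses import dataclass

@dataclass
class Item:
    """One item from the Code, with the two ICMM judgments recorded.
    Field names are illustrative, not taken from the ICMM flowchart."""
    text: str
    directly_prevents_harm: bool        # a physical object or human act that, of itself, prevents harm
    failure_raises_fatality_risk: bool  # absence or failure significantly increases fatality risk

def classify(item: Item) -> str:
    # Question 1: is it actually a control (directly prevents harm)?
    if not item.directly_prevents_harm:
        return "Control Support / Verification"  # training, procedures, audits, etc.
    # Question 2: would its absence or failure significantly increase fatality risk?
    if item.failure_raises_fatality_risk:
        return "Critical Control"
    return "Control"
```

Everything that fails the first question, however important, lands outside the control definition entirely.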

What we found

The results were striking:

  • Critical Control (Likely): 86 items (18%)
  • Critical Control (Possible): 21 items (4%)
  • Control: 29 items (6%)
  • Control Support: 287 items (59%)
  • Verification: 66 items (13%)
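The headline figures follow directly from these counts; a quick arithmetic check, using nothing beyond the numbers above:

```python
# Classification counts from the 489-item review
counts = {
    "Critical Control (Likely)": 86,
    "Critical Control (Possible)": 21,
    "Control": 29,
    "Control Support": 287,
    "Verification": 66,
}
total = sum(counts.values())  # 489

for label, n in counts.items():
    print(f"{label}: {n} ({n / total:.0%})")

critical = counts["Critical Control (Likely)"] + counts["Critical Control (Possible)"]
controls = critical + counts["Control"]
print(f"Met the control definition: {controls / total:.0%}")            # ~28%
print(f"Likely or possibly critical: {critical / total:.0%}")           # ~22%
```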

Only 28% of the items met the definition of a control at all. Just 22% were likely or possibly critical: the ones where failure puts someone on a fatal risk pathway.

The remaining 72%? Training programs, procedures, policies, schedules, information sharing, audits, inspections. Important work. But control supports and verification activities, not controls by definition.

The implication for practitioners

To be clear: this isn't a criticism of the NHVR Master Code of Practice. It's a well-constructed regulatory document that serves a purpose, and the shift to an activity-based structure is a meaningful improvement. Training matters. Procedures matter. Verification matters.

But the heavy focus on admin is real. We'd hoped a Master Code of Practice would lean more heavily toward direct controls that prevent serious injuries and fatalities, with administrative measures in a supporting role. Something closer to 70/30 in the other direction.

That has implications for how organisations use the document. If you build your compliance program around every item in the Code with equal weight, you're spreading assurance effort thin. The 86 items that are likely critical controls deserve a fundamentally different level of attention than the 287 that are control supports.

The Critical Control-focused risk assessment exists precisely to help organisations make this distinction. But doing this manually across 489 items? That's where the administrative reality defeats the intent.

Where AI fits

This is a sweet spot use case for AI in safety work: structured classification tasks against defined frameworks, applied to long-format regulatory or technical documentation.

But it's not as simple as uploading a PDF and asking a question. We've tested this extensively across various LLMs, and tools like ChatGPT and Microsoft Copilot simply can't handle these tasks reliably. Generic prompts produce generic outputs. Hallucinations get worse as conversations grow in length and context windows fill up. Consistency falls apart across a 489-item document.

Over the past year, we've spent more than 100 hours developing a prompting methodology using specific LLMs for exactly this kind of task. It's iterative work:

  • Structuring the problem so the model understands the classification criteria
  • Selecting appropriate models and settings for consistency
  • Designing prompts that produce reliable, auditable outputs
  • Building in verification steps to catch edge cases
  • Refining based on where the model struggles

The result is a workflow that can process a document like the NHVR Master Code of Practice systematically and produce a classified control library your team can actually use.

Building capability, not dependency

Our AI coaching approach focuses on this middle ground: where generic queries fall short, but you don't need a six-figure enterprise AI solution.

We work alongside clients on their real problems. Documents they're already dealing with. Frameworks they're already using. The gap between what off-the-shelf AI tools promise and what they actually deliver is something we cover extensively in our coaching because understanding why ChatGPT or Copilot failed is often where the real learning happens.

The goal is to build internal capability: fit-for-purpose prompting workflows your team can run, refine, and own.

The NHVR classification task took a fraction of the time it would have taken manually. More importantly, the client now has a transferable method they can apply to other regulatory documents, other frameworks, other classification challenges.

That's the value. Not a one-off deliverable, but a repeatable capability.

A question for the industry

We're genuinely curious how others working with heavy vehicle operations will navigate this new Code.

How are you planning to use it alongside your critical control frameworks? Will you classify items by type, or treat them uniformly? Where do you go for guidance on identifying and assuring the direct controls that prevent fatalities in heavy vehicle operations?

The split we found (22% critical, 72% administrative) might be exactly right for a regulatory document. Or it might reflect a broader pattern in how safety documentation gets written. We think that's worth unpacking.

Get in touch

If you're exploring how AI can reduce administrative burden in your safety and risk work, we'd be glad to talk. Our coaching programs are designed to help teams develop practical AI capabilities for their specific applications.

And if you'd like a copy of the full NHVR control classification spreadsheet (all 489 items classified against the ICMM framework), get in touch and we'll share it.