Using AI to Draft Risk Assessments: Where It Helps, Where It Doesn't, and How to QA the Output

Q: What does AI do well in risk assessment drafting?

Three things in particular. First, scaffolding: hazard lists, generic control measures and template structure for common assessment types. Second, summarising: turning the assessor's site notes into a defensible executive summary. Third, rewriting: tidying up field notes typed on a phone into clean professional sentences. The common thread is that AI is good at turning structured input from a competent assessor into a presentable output, not at replacing the judgement that produces the input in the first place.

Q: What kinds of mistakes does AI make in risk assessments?

Four categories come up consistently. Citation hallucination: invented regulation numbers, repealed legislation, or fabricated guidance document numbers. Generic control measures that don't fit the site: PPE listed where it is not the right control, training listed where supervision is the gap. Risk score drift: when asked to populate a 5x5 matrix, AI tends to cluster scores in the middle band and underreport high-consequence, low-likelihood hazards. And consequence understatement on rare events: AI often gives serious or fatal hazards a moderate score because the verbal phrasing of the control is calm. Every AI-drafted assessment needs a human assessor to read the controls, sanity-check the scores, and verify every citation.

Q: Is using AI to draft a risk assessment compliant with HSE guidance?

There is no HSE rule that prohibits using AI to draft a risk assessment. The Management of Health and Safety at Work Regulations 1999 require a suitable and sufficient assessment by a competent person, and they require the significant findings to be recorded where the employer has five or more employees. AI is a drafting tool. The competent assessor's name, judgement and sign-off are still required. As long as the assessor reads and is willing to defend every word of the assessment, the use of an AI tool to produce the first draft is, in current regulator practice, the same as using a template.

Q: Should you tell clients you use AI in your risk assessments?

Yes. Clients are going to ask. The fair answer is that AI is used to produce first drafts, to generate executive summaries, to suggest control measures and to do an automated QA pass on every report, but that every assessment is reviewed and signed by the named competent assessor who carries the liability. That answer is professionally defensible. The opposite answer, that no AI is involved at all, is becoming harder to give credibly and is going to be challenged at procurement.

The starting point: liability does not move

Every conversation about AI in risk assessment has to start here. The Management of Health and Safety at Work Regulations 1999 require a suitable and sufficient risk assessment by a competent person. Sector-specific regulations (the Fire Safety Order, CAR 2012, the Work at Height Regulations 2005) add specific duties on top, again on a competent person. There is no clause in any of them that lets a tool sign the assessment. If you, the consultant, put your name on the report, you carry the liability for every word in it, whether the first draft was typed by you, pasted from a template you wrote in 2019, or generated by a model in 2026.

The useful framing is: AI is a faster typewriter. It does not change what a competent assessor has to verify before sign-off. It does change how much time the competent assessor spends typing.

Three jobs AI is genuinely good at

1. Scaffolding the first draft

Give a current AI model the activity (lone worker on a remote farm site), the type of assessment, and a short brief about the client. It will produce a clean structure: hazard categories, generic control measures, a sample 5x5 matrix population, and a covering paragraph that reads professionally. In about 20 seconds, the output covers roughly 70 to 80 per cent of what a competent assessor would have typed manually in 90 minutes. The assessor's job moves from typing to verifying.
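As a minimal sketch of the brief described above, the three inputs can be assembled into a single prompt string. The function name `build_scaffold_prompt`, the wording of the instructions and the example values are assumptions for illustration only. No particular AI tool or API is implied: the resulting text would be pasted into whichever model is in use.

```python
def build_scaffold_prompt(activity: str, assessment_type: str, client_brief: str) -> str:
    """Assemble a first-draft brief from the three inputs the assessor supplies."""
    # Hypothetical wording: adjust to the assessor's own house style.
    lines = [
        f"Draft a {assessment_type} for the following activity: {activity}.",
        f"Client context: {client_brief}",
        "Structure the draft as: hazard categories, generic control measures, "
        "a 5x5 risk matrix (likelihood x consequence), and a covering paragraph.",
        "Do not cite legislation or guidance documents; citations are added at review.",
        "Mark every score and control as PROVISIONAL for assessor review.",
    ]
    return "\n".join(lines)


prompt = build_scaffold_prompt(
    activity="lone worker on a remote farm site",
    assessment_type="lone working risk assessment",
    client_brief="small agricultural contractor, no on-site supervisor",
)
print(prompt)
```

Keeping the brief in one function means the "no citations" and "provisional" instructions go out with every draft request rather than relying on memory.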

2. Executive summaries from site notes

Most assessors carry rough notes off site that are not in client-readable shape. AI is excellent at turning those into a one-paragraph summary, a short list of significant findings, and a tidy actions table. This is the single biggest time saving AI offers in consultancy work, because executive summary writing is the slowest stage of producing a defensible report.

3. Rewriting field text

Field notes typed on a phone, half thumb-typed, half autocorrected, become clean professional sentences in one prompt. Same content, same intent, better signal. The competent assessor reads the rewrite to confirm meaning has not changed, then keeps it.
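One cheap aid to that read-through, sketched here under stated assumptions, is to check that every figure in the field notes still appears in the rewrite. The function name `missing_figures` and the sample notes are hypothetical. The check only compares digits, so it supports the assessor's reading of the rewrite and does not replace it.

```python
import re
from typing import List

# Matches whole numbers and simple decimals such as "2", "3.5" or "2021".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def missing_figures(notes: str, rewrite: str) -> List[str]:
    """Return figures present in the field notes but absent from the rewrite."""
    in_rewrite = set(NUMBER.findall(rewrite))
    missing: List[str] = []
    for figure in NUMBER.findall(notes):
        if figure not in in_rewrite and figure not in missing:
            missing.append(figure)
    return missing


notes = "ladder 3.5m, 2 rungs cracked, last insp 2021"
rewrite = "The 3.5 m ladder has two cracked rungs and was last inspected in 2021."
print(missing_figures(notes, rewrite))  # ['2'] because "two" is spelled out in the rewrite
```

A flagged figure is not necessarily an error, as the spelled-out "two" shows. It is a prompt to look at that sentence.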

Four failure modes that catch consultants out

1. Citation hallucination

AI will, with confident phrasing, cite a regulation number that does not exist, refer to legislation that has been repealed, or invent an HSG document number. This happens often enough that any AI-drafted citation has to be verified. The cheapest fix is to forbid AI from citing legislation in the draft and to insert citations manually in the review step. The next cheapest fix is to verify every legislative reference against legislation.gov.uk before sign-off.
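Whichever fix is used, it helps to know where citation-like text sits in a draft. The sketch below lists every line containing something that looks like a legislative or guidance reference so that a human can verify or strike it. The patterns are illustrative assumptions, not an exhaustive grammar of UK citation formats, and the function name `flag_citations` is hypothetical. The code does not verify anything itself.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumed, not exhaustive).
CITATION_PATTERN = re.compile(
    r"\b(?:Regulations|Act|Order)\s+(?:19|20)\d{2}\b"  # e.g. "... Regulations 1992"
    r"|\b(?:HSG|INDG)\s?\d{1,4}\b"                      # e.g. "HSG65", "INDG163"
    r"|\b[Rr]egulation\s+\d+\b"                         # e.g. "regulation 3"
)


def flag_citations(draft: str) -> List[Tuple[int, str]]:
    """Return (line_number, line_text) for each line with a citation-like token."""
    flagged = []
    for number, line in enumerate(draft.splitlines(), start=1):
        if CITATION_PATTERN.search(line):
            flagged.append((number, line.strip()))
    return flagged


draft = (
    "Hazard: manual handling of deliveries.\n"
    "See the Manual Handling Operations Regulations 1992.\n"
    "Refer to HSG65 for the management system.\n"
    "Control: use a trolley for loads over the agreed limit."
)
for number, line in flag_citations(draft):
    print(number, line)
```

If the draft was produced under a "no citations" instruction, an empty result is the expected outcome and any hit is worth a second look.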

2. Generic controls that do not fit the site

AI defaults to a sensible-sounding general control. "PPE" gets listed where the right control was substitution. "Training" gets listed where the gap was supervision. These are not wrong as words, but they are wrong as the control for this site. The site-specific judgement only happens in the assessor's head; the AI cannot supply it. Every control has to be read against what the assessor actually saw.
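A rough first filter for this failure mode can be sketched as below. It flags controls that are either a bare generic label or too short to carry site detail. The phrase list, the six-word threshold and the function name `needs_site_detail` are assumptions chosen for illustration. A control that passes the filter can still be the wrong control for the site, so the read-through against what the assessor saw still applies.

```python
# Hypothetical list of labels that say nothing site-specific on their own.
GENERIC_CONTROLS = {
    "ppe",
    "training",
    "train staff",
    "supervision",
    "signage",
    "toolbox talk",
}


def needs_site_detail(control: str, min_words: int = 6) -> bool:
    """True if the control is a bare generic label or too short to be actionable."""
    normalised = control.strip().rstrip(".").lower()
    return normalised in GENERIC_CONTROLS or len(normalised.split()) < min_words


controls = [
    "PPE.",
    "Train staff",
    "Refresher manual handling training before next deliveries, "
    "recorded in the staff training matrix",
]
for control in controls:
    print(needs_site_detail(control), "-", control)
```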

3. Risk score drift toward the middle

Given a 5x5 matrix and asked to populate it, current AI models tend to cluster scores in the middle band (likelihood 3, consequence 3, score 9) and underreport rare, high-consequence hazards. This is partly a calibration artefact and partly a tone artefact: the AI tries to sound proportionate. On low-frequency catastrophic hazards (a confined-space fatality, a fall-from-height fatality) AI will frequently give a moderate score because the verbal phrasing of the existing control is calm. The assessor has to recalibrate.
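The clustering described above is easy to measure before the manual recalibration starts. The following sketch computes the share of entries sitting in the centre cell (likelihood 3, consequence 3) of a draft matrix. The `Entry` class, the scale labels in the comments, the sample hazards and the 0.5 threshold are all assumptions for illustration. A high share is a prompt to look harder, not proof that any individual score is wrong.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    hazard: str
    likelihood: int   # assumed scale: 1 (rare) to 5 (almost certain)
    consequence: int  # assumed scale: 1 (minor) to 5 (fatal)

    @property
    def score(self) -> int:
        # Standard 5x5 arithmetic: score = likelihood x consequence.
        return self.likelihood * self.consequence


def centre_share(entries: List[Entry]) -> float:
    """Fraction of entries scored at likelihood 3, consequence 3."""
    if not entries:
        return 0.0
    centre = [e for e in entries if e.likelihood == 3 and e.consequence == 3]
    return len(centre) / len(entries)


def looks_clustered(entries: List[Entry], threshold: float = 0.5) -> bool:
    """True when at least `threshold` of the entries sit in the centre cell."""
    return centre_share(entries) >= threshold


draft = [
    Entry("manual handling of feed sacks", 3, 3),
    Entry("slips in wet yard", 3, 3),
    Entry("confined-space entry to slurry tank", 3, 3),
    Entry("vehicle movements in yard", 2, 4),
]
print(centre_share(draft))     # 0.75
print(looks_clustered(draft))  # True
```

In this made-up draft the confined-space entry shares a score of 9 with a slip hazard, which is exactly the pattern the assessor is looking for.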

4. Consequence understatement on rare events

Related to score drift, but worth calling out separately. AI tends to score a control-failure scenario based on what usually happens, not based on the worst credible outcome. A defensible risk assessment has to be based on the worst credible outcome of the hazard, not the average outcome. The assessor reads the consequence column and adjusts where the AI has been timid.
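The consequence check can be sketched as a simple comparison, assuming the assessor records their own worst-credible consequence (1 to 5) beside each drafted score. The field names, the function name `understated_hazards` and the sample rows are hypothetical. The worst-credible value always comes from the assessor, never from the tool.

```python
from typing import Dict, List


def understated_hazards(rows: List[Dict]) -> List[str]:
    """Hazards whose drafted consequence is below the assessor's worst credible outcome."""
    return [
        row["hazard"]
        for row in rows
        if row["drafted_consequence"] < row["worst_credible_consequence"]
    ]


rows = [
    {
        "hazard": "fall from height during roof inspection",
        "likelihood": 1,
        "drafted_consequence": 3,          # value from the AI draft
        "worst_credible_consequence": 5,   # value set by the assessor
    },
    {
        "hazard": "paper cuts when handling files",
        "likelihood": 2,
        "drafted_consequence": 1,
        "worst_credible_consequence": 1,
    },
]

for row in rows:
    drafted = row["likelihood"] * row["drafted_consequence"]
    corrected = row["likelihood"] * row["worst_credible_consequence"]
    print(row["hazard"], drafted, "->", corrected)

print(understated_hazards(rows))
```

In the first row the score moves from 1 x 3 = 3 to 1 x 5 = 5. Both products are numerically low, which is one reason to review the consequence column directly and not rely on the multiplied score alone.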

The 7-point QA checklist

Run this seven-point QA pass on every AI-drafted assessment before sign-off, in order.

1. Site reality check. Read the assessment with the site in your head. Does it describe what you actually saw? If not, what is missing?
2. Persons at particular risk. Are the people on this site who are at particular risk named or characterised? Lone workers, mobility-impaired occupants, contractors on site for the first time, young persons, pregnant workers.
3. Citation verification. Every regulation, every Act, every HSG document. Compare to legislation.gov.uk. Strike any reference you cannot confirm.
4. Control specificity. Read each control as if you were the duty-holder being asked to implement it. Is it specific enough to do? "Train staff" is not enough. "Refresher manual handling training before next deliveries, recorded in the staff training matrix" is enough.
5. Worst-credible-outcome check. For every hazard, ask: if every control failed, what is the worst credible outcome? Is the consequence score consistent with that?
6. Hazard completeness. What hazards relevant to this site are missing? AI cannot tell you about the hazard that was not in its training set or not in the brief.
7. Sign-off line. Are you, named on the report, willing to defend every word at an enforcement interview? If yes, sign. If no, edit until yes.
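Because the checklist is meant to be run in order, it can be tracked as an ordered list with a simple gate in front of sign-off. The sketch below is one way to do that. The item names are taken from the checklist above, while the function names `first_open_item` and `ready_to_sign` are assumptions for illustration. Ticking an item still depends entirely on the assessor having done the check.

```python
from typing import Optional, Set

CHECKLIST = [
    "Site reality check",
    "Persons at particular risk",
    "Citation verification",
    "Control specificity",
    "Worst-credible-outcome check",
    "Hazard completeness",
    "Sign-off line",
]


def first_open_item(completed: Set[str]) -> Optional[str]:
    """Return the earliest checklist item not yet completed, or None if all are done."""
    for item in CHECKLIST:
        if item not in completed:
            return item
    return None


def ready_to_sign(completed: Set[str]) -> bool:
    """True only when every item on the checklist has been completed."""
    return first_open_item(completed) is None


done = {"Site reality check", "Persons at particular risk"}
print(first_open_item(done))  # Citation verification
print(ready_to_sign(done))    # False
```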

What AI should not be used for

Identifying hazards from site photos. A photo does not show the work activity, the people, or the context. A model trained on internet photos will report on what is visible and miss what is missing. Hazard identification has to happen with the assessor on site.
Risk scoring without a human pass. Score drift is real and predictable. The score column is the most legally sensitive part of the report and has to be checked.
Untouched legislative citations. Hallucination is the most common failure mode and the easiest one for a regulator to challenge.
Stand-alone executive summaries presented without the body. If the body has not been written, the summary cannot be defended. AI will happily write a confident summary of an assessment that does not yet exist.

Telling clients you use AI

Be ahead of the question. The honest professional answer is: AI is used to produce first drafts, to generate executive summaries, to suggest control measures and to run an automated QA check, but every assessment is reviewed and signed by the named competent assessor who carries the liability. That answer is defensible. The opposite answer (no AI is used) is going to be tested at procurement and is rapidly becoming the harder one to back up.

What clients actually want to know is whether your AI use changes who carries the liability. The answer is: no. The AI tool is the typewriter. The signature on the report is yours.

Frequently asked

Can AI write a risk assessment on its own?

No, not in any defensible sense. AI is very good at producing the structure and first draft. It cannot stand on a real site, identify the persons at particular risk, or make the professional judgement the law requires. A risk assessment produced entirely by AI with no human review has not been made or signed off by a competent person, and that is the legal test.

What does AI do well in risk assessment drafting?

Three things: scaffolding (hazard lists, generic control measures and template structure for common assessment types), summarising (turning site notes into a defensible executive summary), and rewriting (tidying up field notes typed on a phone into clean professional sentences). The common thread is that AI is good at turning structured input from a competent assessor into presentable output.

What kinds of mistakes does AI make in risk assessments?

Four categories: citation hallucination (invented regulation numbers or repealed legislation), generic control measures that do not fit the site, risk score drift toward the middle band, and consequence understatement on rare high-consequence events. Every AI-drafted assessment needs a competent assessor to verify the controls, sanity-check the scores and check the citations.

Is using AI to draft a risk assessment compliant with HSE guidance?

There is no HSE rule that prohibits AI drafting. The Management of Health and Safety at Work Regulations 1999 require a suitable and sufficient assessment by a competent person, and require the significant findings to be recorded where the employer has five or more employees. AI is a drafting tool. The competent assessor's name, judgement and sign-off are still required.

Should you tell clients you use AI in your risk assessments?

Yes. The fair answer is that AI is used for first drafts, executive summaries, suggested controls and automated QA, but every assessment is reviewed and signed by the named competent assessor who carries the liability. That answer is defensible. The opposite answer (no AI at all) is going to be harder to give credibly and will be challenged at procurement.
