What good QA looks like in a penetration testing team

Report quality 7 June 2026

QA is often treated as the final step before a penetration testing report goes to the client.

That is part of the job, but it is too narrow.

Good QA is not just proofreading. It is not only checking spelling, formatting and whether screenshots are present. It is a quality control point for the whole engagement.

A good QA process checks whether the report is accurate, evidenced, proportionate, clear and useful.

It also tells leaders where the team needs development.

QA is more than report polishing

A polished report can still be weak.

It may be well formatted and easy to read, but still contain unclear evidence, weak severity logic or vague recommendations. It may describe a technical issue accurately, but fail to explain why it matters to the client.

That is why QA needs to go deeper than language and formatting.

A good reviewer should be asking:

has the issue been proven?
does the evidence support the conclusion?
is the severity rating proportionate?
is the recommendation usable?
is the risk explained clearly?
are assumptions and limitations clear?
would the client understand what to do next?

This matters because the report is often the main deliverable the client remembers.

If the report is unclear, the engagement feels weaker, even if the technical testing was good.

Technical accuracy is only one layer

Technical accuracy is essential.

A finding must be correct. The vulnerability must be described properly. The evidence must show what was observed. The recommendation must not suggest something impractical or technically wrong.

But technical accuracy alone is not enough.

A technically accurate finding can still create problems if it is poorly framed.

For example, a finding may correctly describe missing access control, but fail to explain which users were tested, what access was achieved, what data or function was exposed and what assumptions were made.

That finding may be true, but it is not yet defensible.

QA should test both the technical point and the clarity of the reasoning.

The reviewer should be able to trace the conclusion back to the evidence. If that chain of reasoning is weak, the client will probably spot it too.

This is one reason technical training does not fix every pentest team problem. The team may have the technical skill to find issues, but still need development around judgement, explanation and defensible reporting.

Severity needs to be explainable

Severity is one of the most important areas for QA.

A rating should not feel like a number attached at the end. It should be supported by the evidence, exploitability, impact and client context.

QA should check whether the severity makes sense and whether it can be defended.

That includes questions such as:

what was actually proven?
what level of access was required?
what impact was demonstrated?
what assumptions affect the rating?
are there known compensating controls?
would new client context change the severity?
is the rating consistent with similar findings?

The aim is not to remove professional judgement.

The aim is to make judgement visible.

If a client challenges the rating, the team should be able to explain it calmly. If the rating changes because the client provides new context, that should be handled as an evidence-based update, not a negotiation.

This is why teams need a clear approach to handling severity disputes. A rating should be reviewed through evidence, context and professional judgement, not pressure or convenience.
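One of the questions above, whether a rating is consistent with similar findings, lends itself to a simple automated prompt for the reviewer. The following is a minimal sketch only. The finding fields (`id`, `category`, `severity`) and the function name `inconsistent_ratings` are assumptions made for illustration, not part of any particular reporting tool.

```python
from collections import defaultdict


def inconsistent_ratings(findings):
    """Return {category: sorted ratings} for categories rated more than one way.

    A flagged category is not necessarily wrong. It is a prompt for the
    reviewer to check that the difference is explained by evidence or context.
    """
    ratings_by_category = defaultdict(set)
    for finding in findings:
        ratings_by_category[finding["category"]].add(finding["severity"])
    return {
        category: sorted(ratings)
        for category, ratings in ratings_by_category.items()
        if len(ratings) > 1
    }


# Hypothetical findings, for illustration only.
findings = [
    {"id": "F1", "category": "missing access control", "severity": "high"},
    {"id": "F2", "category": "missing access control", "severity": "medium"},
    {"id": "F3", "category": "verbose error messages", "severity": "low"},
]

flagged = inconsistent_ratings(findings)
print(flagged)
```

A check like this does not replace judgement. It only surfaces the places where judgement needs to be made visible.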

Recommendations should be usable

Weak recommendations create rework.

A recommendation that says “apply appropriate access controls” may be technically correct, but it may not help the client fix the issue. A recommendation that simply repeats the finding is also weak.

Good recommendations are specific enough to guide action.

Where possible, they should explain what needs to change, without pretending to know the client’s full engineering constraints.

QA should check whether the recommendation is:

relevant to the finding
realistic for the affected system
clear enough for remediation teams
proportionate to the severity
free from unnecessary generic advice

A report with poor recommendations may lead to more client questions, poor remediation, retest friction and reduced confidence.

All of that creates cost after the report has been issued.

Evidence should carry the finding

Evidence is one of the main things QA should protect.

A finding should not depend on the reader trusting the tester. It should show enough detail for the conclusion to be understood and challenged.

That does not mean every report needs excessive screenshots or raw request dumps. Too much evidence can make a report harder to use.

The question is whether the evidence is sufficient.

A reviewer should check whether the evidence shows:

the affected component
the tested user or role, where relevant
the action performed
the result observed
the security boundary that failed
the realistic impact

Evidence should support the story of the finding.

If the evidence is thin, the finding becomes easier to challenge and harder to fix.
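The evidence checklist above could be captured as a small structured record that a reviewer, or a pre-review script, checks for gaps. This is a sketch under stated assumptions: the class `Evidence`, its field names and the helper `missing_evidence` are hypothetical, and the tested role is treated as optional because the list says "where relevant".

```python
from dataclasses import dataclass, fields


@dataclass
class Evidence:
    """One field per item in the reviewer's evidence checklist (hypothetical names)."""
    affected_component: str = ""
    tested_role: str = ""  # optional: only needed where a user or role is relevant
    action_performed: str = ""
    result_observed: str = ""
    boundary_failed: str = ""
    realistic_impact: str = ""


OPTIONAL_FIELDS = {"tested_role"}


def missing_evidence(evidence):
    """Return the names of required checklist items that are still blank."""
    return [
        f.name
        for f in fields(evidence)
        if f.name not in OPTIONAL_FIELDS and not getattr(evidence, f.name).strip()
    ]


# Hypothetical finding, for illustration only.
draft = Evidence(
    affected_component="/api/invoices/{id}",
    tested_role="standard user",
    action_performed="Requested an invoice belonging to another customer",
    result_observed="Invoice returned with HTTP 200",
)

gaps = missing_evidence(draft)
print(gaps)
```

A blank field only shows that something is absent. Whether the evidence that is present is sufficient remains a reviewer decision.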

QA should protect the client relationship

QA is also a client experience control.

Clients do not only judge whether the findings are technically valid. They judge whether the provider understands their environment, explains risk clearly and produces work they can act on.

A report that feels careless can weaken trust.

Carelessness can show up as inconsistent wording, unclear severity, generic recommendations, exaggerated impact or findings that look copied from a template without enough client context.

Good QA reduces that risk.

It makes sure the report feels considered, specific and professionally controlled.

That matters commercially. A good report can support repeat work. A poor report can make the client question the engagement, even if the test itself was technically sound.

QA defects are leadership signals

QA should not only fix individual reports.

It should provide feedback about the team.

If the same issues appear repeatedly, leaders should treat that as useful data. Repeated weak evidence may mean testers need better note-taking habits. Repeated severity changes may mean the team needs clearer rating guidance. Repeated report rewrites may mean juniors need structured writing support.

Common QA patterns include:

findings missing business context
recommendations being too generic
evidence not supporting the conclusion
severity ratings being inconsistent
testers overstating impact
reports needing several review cycles
the same reviewer rewriting large sections

These are not just report problems.

They are development, process and leadership signals.

They can also show where senior pentesters become the delivery bottleneck. If the same experienced people are repeatedly rewriting reports or rescuing findings, the team has a scaling problem as well as a QA problem.
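To treat repeated defects as data, a team needs some way of counting them across reports. The sketch below assumes a hypothetical review log that maps each report to the defect labels the reviewer recorded. The labels, the `recurring_defects` name and the `min_reports` threshold are illustrative assumptions, not an established scheme.

```python
from collections import Counter


def recurring_defects(review_log, min_reports=2):
    """Return (defect, report_count) pairs seen in at least `min_reports` reports.

    Each defect type is counted once per report, so one messy report
    does not look like a team-wide pattern.
    """
    counts = Counter()
    for defects in review_log.values():
        counts.update(set(defects))
    return [(defect, n) for defect, n in counts.most_common() if n >= min_reports]


# Hypothetical review log, for illustration only.
review_log = {
    "R1": ["generic recommendation", "weak evidence"],
    "R2": ["weak evidence", "weak evidence"],
    "R3": ["weak evidence", "generic recommendation", "overstated impact"],
}

patterns = recurring_defects(review_log)
print(patterns)
```

Counting per report rather than per occurrence is a deliberate choice here: the aim is to find patterns across the team, not to score individual reports.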

Good QA happens before the final review

A final QA gate is useful, but it should not be the first time quality is considered.

If a report reaches final review in poor condition, the team has already lost time. The reviewer has to do heavy correction, the tester has to revisit old notes and the report issue date may slip.

Better teams build quality earlier into the delivery process.

That might include:

clearer note-taking expectations
finding review during the test
early discussion of complex issues
examples of good findings
severity calibration sessions
report writing coaching
better handover from tester to reviewer

The final QA stage should catch issues. It should not routinely rescue the report.

What this means for pentest team performance

Good QA protects the client, the tester and the business.

It improves report quality. It reduces rework. It makes severity easier to defend. It gives remediation teams clearer guidance. It also shows leaders where the team needs development.

For offensive security teams, QA should be treated as a core operating process, not an administrative step.

The question is not whether the report is tidy enough to send.

The question is whether the work is accurate, evidenced, proportionate and useful enough for the client to trust.

Leaders should also measure penetration testing team performance in ways that expose QA drag, report rework, client clarification requests, severity disputes and unplanned senior involvement.