Validating an artificial intelligence: a matter of judgment before a question of method

nathangozlan48

Artificial intelligence is taking hold in the pharmaceutical industry: visual inspection on production lines, signal detection in pharmacovigilance, demand forecasting, document processing. With it comes a question that quality teams ask sooner or later: how do you validate such a system?

People often expect a technical answer, a procedure to apply. Yet the question is of a different order. Validating an AI forces us to reconsider what "validating" means, and to accept that the answer is as much a matter of judgment as of method.

Validate what, exactly?

First misunderstanding: the word covers two distinct realities. For data teams, validating a model means measuring its performance on data it has never seen. For quality assurance, it means demonstrating, with supporting evidence, that a system performs its function in a controlled and reproducible way. Both approaches are legitimate, and neither is sufficient on its own. A model deemed "validated" in the statistical sense may offer none of the guarantees quality expects, and vice versa.

The validation of an AI therefore plays out at the meeting point of these two requirements, precisely where neither culture is used to working. It is less a problem of tools than a problem of shared language.

A system that is never exact

Then comes a less comfortable reality. Traditional software applies a rule, and you can verify that it produces the right result. An AI model, by contrast, estimates. It does not compute a truth; it produces a probability, and it will get things wrong by design.

From then on, "validating" can no longer mean "proving it is right." The question becomes: where, and to what extent, can this model be wrong without it mattering? And who decides that this proportion is acceptable? This shift changes the very nature of the exercise. Validation stops being a guarantee and becomes a risk decision, owned as such, rather than a mere certificate of conformity.
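To make "to what extent can this model be wrong" concrete, here is a minimal sketch, not a prescribed method. It assumes a hypothetical test campaign (12 errors on 1,000 independent cases) and a hypothetical acceptance limit of 2%, and uses the Wilson score interval to bound the true error rate instead of relying on the observed rate alone. The function name and all figures are illustrative assumptions.

```python
import math


def wilson_upper_bound(errors: int, n: int, z: float = 1.96) -> float:
    """Upper limit of the Wilson score interval for an error rate.

    z = 1.96 corresponds to a two-sided 95% interval (an assumption;
    the confidence level is itself a decision to be justified).
    """
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("need n > 0 and 0 <= errors <= n")
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center + half_width


# Hypothetical figures, for illustration only.
errors_observed = 12
cases_tested = 1000
acceptance_limit = 0.02  # maximum tolerated error rate, decided beforehand

observed_rate = errors_observed / cases_tested
upper_bound = wilson_upper_bound(errors_observed, cases_tested)

print(f"observed error rate: {observed_rate:.4f}")
print(f"upper bound:         {upper_bound:.4f}")
print(f"meets limit on observed rate: {observed_rate <= acceptance_limit}")
print(f"meets limit on upper bound:   {upper_bound <= acceptance_limit}")
```

With these assumed figures, the observed rate sits below the limit while the upper bound sits above it. Which of the two should be compared with the limit is not a statistical question but exactly the kind of risk decision described above.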

A validation that does not hold over time

Another rarely anticipated feature: a model may work correctly on its go-live day, then quietly degrade as its environment evolves, without anything having been changed. A validation frozen in time then protects against nothing.

What value should be placed on an attestation that may expire without warning? This question alone is enough to shift the perspective: validating an AI is not a one-off event to tick off, but a continuous monitoring effort, able to spot the moment when the model no longer meets what is expected of it.
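One possible form of such monitoring, sketched under simple assumptions: a sample of the model's outputs is periodically compared with a reference judgment (for example a human review), and a review is triggered when agreement over a sliding window falls below a limit fixed in advance. The class name, the window size and the 80% limit are hypothetical choices made for this illustration.

```python
from collections import deque
from typing import Optional


class PerformanceMonitor:
    """Tracks agreement between model output and a reference over a sliding window."""

    def __init__(self, window_size: int, minimum_accuracy: float) -> None:
        self.window = deque(maxlen=window_size)  # oldest results drop out automatically
        self.minimum_accuracy = minimum_accuracy

    def record(self, prediction, reference) -> None:
        self.window.append(prediction == reference)

    def accuracy(self) -> Optional[float]:
        # No verdict until the window is full: too few cases prove nothing.
        if len(self.window) < self.window.maxlen:
            return None
        return sum(self.window) / len(self.window)

    def needs_review(self) -> bool:
        current = self.accuracy()
        return current is not None and current < self.minimum_accuracy


# Hypothetical (prediction, reference) pairs, for illustration only.
monitor = PerformanceMonitor(window_size=5, minimum_accuracy=0.8)

for prediction, reference in [(1, 1), (0, 0), (1, 1), (0, 0), (1, 1)]:
    monitor.record(prediction, reference)
review_at_go_live = monitor.needs_review()

# Later on, the environment has changed; the model itself has not.
for prediction, reference in [(1, 0), (0, 0), (1, 0)]:
    monitor.record(prediction, reference)
review_after_drift = monitor.needs_review()

print(review_at_go_live, review_after_drift)
```

The code is the easy part. Deciding who supplies the reference judgment, how often, and what happens when the alert fires is what turns the sketch into a control.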

Performance is never neutral

Performance metrics are readily presented as a technical matter. They are first a matter of responsibility. Favouring detection at the cost of false alerts, or the reverse, amounts to deciding what you are willing to miss. On a production line, wrongly rejecting a compliant product and letting a real defect through do not carry the same consequences.

Behind every threshold, then, lies a trade-off that bears on patient safety and product quality. This trade-off belongs neither to the model nor to the data specialist alone. Stating it explicitly, and at the right level, is an integral part of validation.
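The trade-off can be illustrated with a short sketch. It assumes a visual inspection model that assigns each unit a defect score, rejects units at or above a threshold, and is judged with two hypothetical costs: one for a false reject, one for a missed defect. The scores, labels and cost values are invented for the example and carry no authority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdOutcome:
    threshold: float
    false_rejects: int   # compliant units wrongly rejected
    missed_defects: int  # defective units let through
    cost: float


def evaluate_threshold(scores, is_defect, threshold,
                       cost_false_reject, cost_missed_defect):
    """Count both kinds of error when units scoring >= threshold are rejected."""
    pairs = list(zip(scores, is_defect))
    false_rejects = sum(1 for s, d in pairs if s >= threshold and not d)
    missed_defects = sum(1 for s, d in pairs if s < threshold and d)
    cost = false_rejects * cost_false_reject + missed_defects * cost_missed_defect
    return ThresholdOutcome(threshold, false_rejects, missed_defects, cost)


def cheapest_threshold(scores, is_defect, candidates,
                       cost_false_reject, cost_missed_defect):
    """Return the candidate threshold with the lowest total cost."""
    outcomes = [
        evaluate_threshold(scores, is_defect, t, cost_false_reject, cost_missed_defect)
        for t in candidates
    ]
    return min(outcomes, key=lambda outcome: outcome.cost)


# Hypothetical defect scores and ground truth, for illustration only.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
is_defect = [False, False, False, True, False, False, True, True, True, True]
candidates = (0.3, 0.5, 0.7)

# Same model, same data; only the weighting of the two errors changes.
defect_averse = cheapest_threshold(scores, is_defect, candidates, 1.0, 50.0)
reject_averse = cheapest_threshold(scores, is_defect, candidates, 50.0, 1.0)

print(defect_averse)
print(reject_averse)
```

The two weightings select different thresholds from identical scores. The model does not choose between them; whoever sets the costs does.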

Knowing what cannot be validated

This is perhaps the most important point, and the least often stated. On this subject, rigour shows in the ability to say no. Not all AI uses deserve the same validation effort, and some simply cannot be validated in the sense the industry means, because their behaviour is neither stable nor reproducible.

Recognising this limit, refusing to deploy a system you do not control, sizing the requirement to the real risk rather than trying to cover everything: this is what separates a sound approach from a façade of validation. At a time when the pressure to adopt AI is strong, knowing how to set aside what cannot be mastered is a skill in itself.

A framework that guides, without deciding for you

Regulators are moving in this direction. The EMA's draft Annex 22, the ISPE Artificial Intelligence Guide, the A3P AI Guide and the ten AI principles from the EMA and the FDA outline a coherent framework, centred on risk and on the place of the human. This framework sets expectations; discernment remains to be exercised project by project.

A matter of shared perspective

The most stubborn obstacle is, ultimately, not technical. Data specialists master models but know little of regulatory requirements. Quality experts master regulation but are discovering the inner workings of machine learning. As long as these two worlds do not share a common understanding and vocabulary, every AI project advances with a blind spot at its centre.

It is this conviction that shapes the two-day training course designed by ADN on the validation of artificial intelligence. Built for mixed teams (quality, computer system validation, data, regulatory affairs), it offers not a recipe to apply but a way of reasoning: understanding what a model really is, gauging its limits, identifying and ranking risks, then conducting a proportionate validation, from scoping through to in-production monitoring. With a clear, deliberate stance: ask the right questions before looking for the right answers.

If AI validation is already a topic in your projects, or about to become one, contact us to discuss it and determine how this training can adapt to your context.