You’ve decided AI belongs in your GxP operations. The Stack is the architecture for that move. Traditional CSV assumes a fixed expected result, a system that stays put, behaviour independent of data, and visible failure; AI breaks all four. The Stack is how you keep control anyway.
Classification is the most consequential hour in an AI system's lifecycle. The classification model tiers a use case by its impact on product quality and patient safety and by the system's autonomy, then sets the assurance accordingly. Two worked examples from the book: an HPLC chromatogram review classifier that sits at Tier 2, and a system that drifted from Tier 2 to Tier 3 in eighteen months, which shows why reclassification triggers matter.
| Tier | Impact & autonomy | Assurance |
|---|---|---|
| Tier 1 | Low impact, assistive only | Light, with basic controls |
| Tier 2 | Significant GxP impact, human sign-off | Model validation and decision records |
| Tier 3 | High impact or higher autonomy | Full validation, tight monitoring, strong oversight |
| Tier 4 | Critical or autonomous in critical use | The heaviest controls, or kept out of that use |
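
The tiering logic in the table can be sketched in a few lines. This is a minimal illustration, not the book's method: the `Impact` and `Autonomy` scales, the "take the higher axis" rule in `assign_tier`, and the `needs_reclassification` helper are all assumptions made for the example.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Hypothetical impact scale on product quality and patient safety."""
    LOW = 1
    SIGNIFICANT = 2
    HIGH = 3
    CRITICAL = 4


class Autonomy(IntEnum):
    """Hypothetical autonomy scale; the level names are assumptions."""
    ASSISTIVE = 1          # system suggests, a human does the work
    HUMAN_SIGN_OFF = 2     # system proposes, a human approves each decision
    REVIEW_BY_EXCEPTION = 3  # system acts, humans review only flagged cases
    AUTONOMOUS = 4         # system acts without routine human review


def assign_tier(impact: Impact, autonomy: Autonomy) -> int:
    # Assumed rule: the tier follows whichever axis scores higher,
    # so rising autonomy alone is enough to raise the tier.
    return max(int(impact), int(autonomy))


def needs_reclassification(recorded_tier: int, impact: Impact, autonomy: Autonomy) -> bool:
    # A reclassification trigger fires when the tier implied by current
    # use no longer matches the tier recorded at classification time.
    return assign_tier(impact, autonomy) != recorded_tier


# HPLC chromatogram review with a human signing off each result.
recorded = assign_tier(Impact.SIGNIFICANT, Autonomy.HUMAN_SIGN_OFF)  # 2

# Hypothetical drift: reviewers move to review by exception over time.
drifted = needs_reclassification(recorded, Impact.SIGNIFICANT, Autonomy.REVIEW_BY_EXCEPTION)  # True
```

Under these assumed scales, the drift from Tier 2 to Tier 3 shows up as a mismatch between the recorded tier and the tier implied by how the system is actually used, which is the kind of condition a periodic review could check.
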

The four assumptions of traditional CSV, and how AI breaks each:

| Traditional CSV assumes | AI reality |
|---|---|
| A fixed expected result | Results are probabilistic |
| A system that stays put | The model can change |
| Behaviour independent of data | Behaviour depends on training data |
| Failure is visible | Failure is often silent |
Governance precedes validation. The regulatory spine (Annex 11, the draft Annex 22, 21 CFR Part 11, FDA CSA, the FDA-EMA principles) is the base. On it sits the reference architecture: data, model, workflow, record and audit trail. Then come classification and risk tiering; then the governance operating model (an AI Governance Board, RACI and decision rights); then a risk-based, model-specific validation master plan. At the top sits monitoring and control: drift and performance monitoring, human-in-the-loop review, and decision-integrity records that capture input, output, reviewer and disposition.
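
A decision-integrity record can be sketched as a small immutable structure. The four core fields (input, output, reviewer, disposition) come from the description above; the `model_version` field, the timestamp, and the hash chaining that makes later edits detectable are assumptions added for illustration, not a prescribed design.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    input_ref: str      # reference to the input the model saw
    model_version: str  # assumed field: which model produced the output
    output: str         # what the model proposed
    reviewer: str       # who reviewed the output
    disposition: str    # e.g. "accepted" or "overridden" (illustrative values)
    recorded_at: str    # UTC timestamp, ISO 8601
    prev_hash: str      # digest of the previous record; "" for the first

    def digest(self) -> str:
        # Hash a canonical JSON form so the same record always hashes the same.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(trail, *, input_ref, model_version, output, reviewer, disposition):
    # Link each new record to the digest of the one before it.
    prev_hash = trail[-1].digest() if trail else ""
    record = DecisionRecord(
        input_ref=input_ref,
        model_version=model_version,
        output=output,
        reviewer=reviewer,
        disposition=disposition,
        recorded_at=datetime.now(timezone.utc).isoformat(),
        prev_hash=prev_hash,
    )
    trail.append(record)
    return record


def verify_trail(trail) -> bool:
    # Recompute the chain; any altered earlier record breaks a later link.
    expected = ""
    for record in trail:
        if record.prev_hash != expected:
            return False
        expected = record.digest()
    return True
```

In this sketch, changing the disposition of an earlier record after the fact alters its digest, so `verify_trail` returns `False` as soon as it reaches the next record. A change to the most recent record is not caught by the chain alone. A real audit trail would also need access control, retention and electronic signature handling, which are out of scope here.
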