GPAI obligations for EU SMB SaaS
Most EU AI Act articles focus on high-risk Annex III. Yet for an EU SMB SaaS that uses GPT-4, Claude, Gemini, or Mistral via API, the GPAI (general-purpose AI) obligations are the everyday concern. Q3-Q4 2026 is the enforcement window for GPAI providers, but deployer obligations (yours) actually started on 2 August 2025. Yes, almost a year ago.
If you use an LLM API (OpenAI, Anthropic, Mistral, Google), you are a deployer of a GPAI system, not a provider. Your obligations are light (essentially Art. 25 value-chain hygiene): don't violate the ToS, don't disable safety guardrails, document internal use. The full Art. 53-55 obligations (training data summary, copyright policy, technical documentation, evaluation against systemic risks) sit with the model providers: OpenAI, Mistral, etc. Verify your provider has published AI Act compliance documentation (DeepSeek and smaller open-source providers often haven't).
What exactly is GPAI?
The EU AI Act defines GPAI in Art. 3(63) as an "AI model... that displays significant generality and is capable of competently performing a wide range of distinct tasks". In practice: a foundation model trained on a massive dataset with cross-domain capability. LLMs (GPT, Claude, Llama), text-to-image and vision-language models (DALL-E, CLIP), multimodal systems (Gemini, GPT-4o).
An important distinction:
- GPAI model = the model itself (weights + architecture)
- GPAI system = a product that integrates a GPAI model (e.g. ChatGPT, Claude.ai, but also your application using GPT-4 API)
The Art. 53-55 rules apply to providers of GPAI models. Rules for GPAI systems used in high-risk use cases sit elsewhere (Art. 25 supply chain, Art. 26 high-risk deployer obligations).
Provider vs deployer — which are you?
This is the only question that matters for SMBs. Nearly all EU SMB SaaS are deployers. If your company:
- Trains your own foundation model with compute > 10²² FLOPs (≈ €1M+ training cost) → GPAI provider, full Art. 53 obligations
- Trains your own model but < 10²² FLOPs → not GPAI, regular AI rules apply
- Fine-tunes a GPAI model (e.g. fine-tunes Llama 3 on your own data) with significant modification → possibly a provider of a derived model (Recital 109); get an audit
- Uses GPAI via API (OpenAI, Anthropic, Mistral, Google) → Deployer of GPAI system, light obligations
- Hosts open-source GPAI (Llama, Mistral) without modifications in production → possibly a distributor (Art. 25), still mostly deployer-tier obligations
🟢 Most SMBs = deployer (light obligations)
Practical checklist if you use OpenAI/Anthropic/Mistral API as a deployer:
- Don't violate the provider's Terms of Service (e.g. don't use GPT-4 for political microtargeting if OpenAI ToS prohibits it)
- Don't disable safety guardrails in production (jailbreaking for internal testing is OK; in deployment it is NOT)
- Document internally: what model you use, in what use case, how you monitor output (a 1-pager is enough; a sketch follows this list)
- If your GPAI-using system falls under high-risk Annex III (e.g. CV screening) — you have full Art. 9-15 obligations for your system (model weights come from OpenAI, but the system is yours)
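A minimal sketch of that internal 1-pager as structured data, in Python for concreteness. Every field name here is my own shorthand; the Act prescribes no format:

```python
# A sketch of the deployer 1-pager as structured data.
# Field names are illustrative; the AI Act prescribes no format.
from dataclasses import dataclass, field

@dataclass
class GPAIDeployment:
    system_name: str           # your feature or product
    provider: str              # e.g. "OpenAI"
    model: str                 # e.g. "gpt-4-turbo"
    use_case: str              # what the model does for you
    user_facing: bool          # True -> Art. 50 disclosure needed
    annex_iii_high_risk: bool  # True -> full Art. 9-15 apply
    guardrails: list = field(default_factory=list)
    monitoring: str = ""       # how you review the output

support_bot = GPAIDeployment(
    system_name="Support chatbot",
    provider="OpenAI",
    model="gpt-4-turbo",
    use_case="Answers tier-1 support questions from our docs",
    user_facing=True,
    annex_iii_high_risk=False,
    guardrails=["provider safety defaults", "topic allow-list"],
    monitoring="weekly sample review of flagged conversations",
)
```

One record like this per AI-using system is enough to answer an auditor's first question: what do you run, where, and who watches it.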
Provider obligations (Art. 53)
If you actually are a GPAI provider (rare for SMBs but possible — e.g. fine-tuning Mistral on your own dataset for a specific domain with > 10²² FLOPs, or your own large model trained from scratch), Art. 53 requires four things:
1. Technical documentation (Art. 53(1)(a))
Annex XI of the EU AI Act lists the full requirements. SMB summary:
- General description and intended purpose of the model
- Tasks the model can perform + capability metrics
- Acceptable use policies (where the model should NOT be used)
- Date of release + distribution channels
- Architecture + parameter count + modality (text / image / audio)
- Procedures for testing + validation results
- Risk management procedures (known limitations, hallucination rates, etc.)
Documentation format: any, but must be kept up-to-date and retained for 10 years post-release.
2. Documentation for downstream providers (Art. 53(1)(b))
A second document: lighter version of the technical doc for companies that will integrate your model into their systems. Helps them meet their own Art. 9-15 obligations (if their system is high-risk).
In practice: a model card in Hugging Face style, but with specific fields from Art. 53(1)(b): intended use, limitations, evaluation results, training data sources (general), copyright posture.
3. Copyright policy (Art. 53(1)(c))
The provider must publish a policy on Union copyright law compliance, especially Art. 4 of the DSM Directive (text-and-data mining opt-out). In plain English: if you text-data-mined the internet for training and content owners opted out (e.g. via robots.txt, ai.txt, headers), your policy must respect those opt-outs and describe how you do so.
This was an extremely hot topic in 2025-2026 — most frontier providers (OpenAI, Anthropic) published detailed copyright policies, some open-source models had gaps (e.g. Mistral initially).
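If you do crawl for training data, honouring the opt-outs is mechanical. A minimal sketch using Python's standard robotparser; "MyTrainingBot" is a hypothetical user-agent, and a real pipeline would also check ai.txt and TDM reservation headers, as the paragraph above notes:

```python
# Minimal sketch of honouring a robots.txt opt-out before crawling
# for training data. "MyTrainingBot" is a hypothetical user-agent.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

if rp.can_fetch("MyTrainingBot", "https://example.com/article"):
    print("no opt-out recorded: fetching is allowed")
else:
    print("opted out: skip this source")
```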
4. Training data summary (Art. 53(1)(d)) — the obligation that catches most SMBs off-guard
The provider must publish a "sufficiently detailed summary" of the data used for training. The template was published by the European AI Office in July 2025. It requires the following (a structural sketch follows the list):
- General data sources: publicly available datasets (Common Crawl, Wikipedia, ArXiv), licensed data, user-generated content (with ToS reference), proprietary data
- Modality breakdown: how much text vs image vs audio
- Volume estimates: tokens / images / hours per category (rough percentages OK)
- Scraping practices: whether you respect robots.txt, ai.txt, opt-outs
- Languages covered: which European Union languages are represented
- Time period: when data was collected
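A sketch of what those six fields might look like as structured data. Treat it as illustrative only: the AI Office template has its own exact field names and layout, and the values below are placeholders:

```python
# Illustrative only: mirrors the six bullet points above, not the
# official AI Office template's field names.
training_data_summary = {
    "data_sources": {
        "public_datasets": ["Common Crawl", "Wikipedia", "ArXiv"],
        "licensed_data": ["<licensor categories>"],       # placeholder
        "user_generated": {"included": False, "tos_reference": None},
        "proprietary": ["internal support transcripts"],  # example category
    },
    "modality_breakdown": {"text": 0.92, "image": 0.06, "audio": 0.02},
    "volume_estimates": {"text_tokens": "~2e12 (rough % per category OK)"},
    "scraping_practices": {"robots_txt": True, "ai_txt": True, "opt_outs": True},
    "eu_languages": ["en", "fr", "de", "pl"],
    "collection_period": "2019-01 to 2025-03",
}
```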
Systemic risk GPAI — threshold 10²⁵ FLOPs (Art. 51)
The EU AI Act splits GPAI providers into two classes:
- Standard GPAI — Art. 53 obligations
- GPAI with systemic risk — Art. 53 + Art. 55 (additional)
The threshold for "systemic risk" in Art. 51 = compute used for training > 10²⁵ floating-point operations (FLOPs). A very high bar — in 2026 only the largest providers cross it (GPT-4o, Claude 3.5 Opus, Gemini 1.5 Ultra, Llama 3.1 405B).
For SMBs this is academic interest, not a practical concern. Even large EU AI startups (Mistral, Aleph Alpha, Black Forest Labs) may or may not hit 10²⁵ FLOPs — that's a ~$50M+ training run.
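To gauge where a training run lands relative to these thresholds, the usual back-of-the-envelope estimate for dense transformers (standard practice, not from the Act) is FLOPs ≈ 6 × parameters × training tokens:

```python
# Back-of-the-envelope training-compute estimate for a dense
# transformer: FLOPs ~ 6 * parameters * training tokens.
# The 6ND heuristic is standard practice, not from the Act.

GPAI_THRESHOLD = 1e22   # the bar this article uses for GPAI model status
SYSTEMIC_RISK = 1e25    # Art. 51(2) systemic-risk presumption

def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

flops = training_flops(8e9, 2e12)  # an 8B model on 2T tokens
print(f"{flops:.1e}")                           # 9.6e+22
print("GPAI-scale:", flops > GPAI_THRESHOLD)    # True
print("systemic risk:", flops > SYSTEMIC_RISK)  # False
```

An 8B model trained on 2T tokens lands around 10²³ FLOPs: GPAI-scale, but two orders of magnitude short of the systemic-risk line.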
Art. 55 additional obligations (if you are a systemic risk provider)
- Model evaluation with standardised protocols (red teaming, adversarial testing)
- Identification + mitigation of systemic risks (CBRN, election manipulation, large-scale fraud)
- Tracking + reporting of serious incidents
- Adequate cybersecurity protections for model weights
Penalties Art. 101 — €15M / 3% turnover
A breach of GPAI obligations by a provider falls under the dedicated GPAI fining provision, Art. 101: up to €15 million or 3% of total worldwide annual turnover, whichever is higher. (The SME lower-of rule in Art. 99(6) applies to fines under Art. 99, not to the Commission's GPAI fines under Art. 101.)
Concretely: if OpenAI doesn't publish a training data summary (an Art. 53(1)(d) violation), the maximum penalty = max(€15M, 3% × OpenAI revenue of ~$3.4B) ≈ $102M. For a solo provider with €100k revenue, 3% is only €3k, but the €15M ceiling formally remains; don't count on an SME discount at the model-provider level.
For deployers: penalties for misuse can come from other articles (e.g. Art. 50 transparency for chatbots without disclosure, Art. 99(4) for high-risk systems with GPAI integrated).
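The two ceilings as code, for clarity. Amounts are in whatever currency the turnover is in (the OpenAI figure above is USD):

```python
# Sketch of the two fine ceilings discussed above.

def gpai_provider_cap(turnover: float) -> float:
    """Art. 101: up to 15M or 3% of worldwide turnover, whichever is higher."""
    return max(15_000_000, 0.03 * turnover)

def art_99_cap(turnover: float, is_sme: bool) -> float:
    """Art. 99(4) ceiling; Art. 99(6) flips 'higher of' to 'lower of' for SMEs."""
    fixed, pct = 15_000_000, 0.03 * turnover
    return min(fixed, pct) if is_sme else max(fixed, pct)

print(f"{gpai_provider_cap(3.4e9):,.0f}")   # 102,000,000 -> the ~$102M example
print(f"{art_99_cap(100_000, True):,.0f}")  # 3,000 -> SME lower-of under Art. 99
```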
Decision tree for SMB SaaS
┌─ Do you use an API to a foundation model (OpenAI/Anthropic/Mistral/Google)?
│
├─ YES → You're a DEPLOYER of a GPAI system
│ Light obligations: ToS compliance, no safety guardrail removal,
│ internal documentation. Full obligations ONLY if your system
│ falls under high-risk Annex III — then Art. 9-15 for the system.
│ Verify the provider has published AI Act compliance docs.
│
├─ NO → Do you train your own model?
│ ├─ YES → Compute > 10²² FLOPs (~€1M+ training cost)?
│ │ ├─ YES → You're a GPAI PROVIDER
│ │ │ Full Art. 53 obligations.
│ │ │ Compute > 10²⁵ FLOPs (~$50M+ training)?
│ │ │ ├─ YES → + Art. 55 systemic risk obligations
│ │ │ └─ NO → Standard GPAI (Art. 53 only)
│ │ └─ NO → Not GPAI. Regular AI rules.
│ │
│ └─ NO → Do you fine-tune GPAI with significant modification?
│ ├─ YES → Possible derived model = you're a provider
│ │ of the derived model. Audit needed.
│ └─ NO → Pure deployer, light obligations.
└─
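The same tree as a Python function, using this article's rule-of-thumb thresholds; the argument names are mine:

```python
# The decision tree above as a function. Thresholds and labels
# follow the article's heuristics, not the Act verbatim.

def classify(uses_api: bool, trains_own: bool, train_flops: float,
             significant_finetune: bool) -> str:
    if uses_api:
        return "DEPLOYER of a GPAI system (light obligations)"
    if trains_own:
        if train_flops > 1e25:
            return "GPAI PROVIDER + Art. 55 systemic-risk obligations"
        if train_flops > 1e22:
            return "GPAI PROVIDER (Art. 53 only)"
        return "not GPAI; regular AI rules"
    if significant_finetune:
        return "possible provider of a derived model; audit needed"
    return "pure deployer (light obligations)"

print(classify(uses_api=True, trains_own=False,
               train_flops=0.0, significant_finetune=False))
```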
What this means in practice for EU SMB SaaS — 4 scenarios
Scenario A: Chatbot on OpenAI API
Situation: SaaS support chatbot, GPT-4 API, ~5k users monthly.
Classification: Deployer of GPAI system + Art. 50 transparency (limited risk: user-facing chatbot).
What to do: add "Powered by AI" disclosure (Art. 50), verify OpenAI has published Art. 53 compliance (they do — ToS + Privacy Policy + Model Card), 1-page internal documentation, monitor output for harmful patterns. You are NOT a GPAI provider.
Scenario B: CV screening on Anthropic Claude API
Situation: HR-tech SaaS, Claude API for parsing + ranking CVs.
Classification: Deployer of GPAI + Provider of HIGH-RISK system (Annex III #4 employment). Bad news — your system is high-risk regardless of using GPAI.
What to do: full Art. 9-15 obligations for your system (risk management, data governance, technical documentation, logging, human oversight, accuracy, cybersecurity), conformity assessment per Annex VI, registration in the EU database, post-market monitoring. Anthropic meets their own GPAI provider obligations separately; your responsibility is at the SYSTEM level. High-risk decision tree →
Scenario C: Fine-tuned Llama 3 for legal-tech
Situation: EU legal-tech startup, fine-tunes Llama 3 8B on a proprietary Polish legal documents dataset (~50k docs).
Classification: Borderline. Fine-tuning compute is likely below the 10²² FLOPs threshold for GPAI model status (Recital 99). But Recital 109 suggests significant modification can make you a provider of a derived model.
What to do: most likely you're still a deployer (fine-tuning without fundamental modification ≠ provider). Conservatively: document your fine-tuning methodology (a lighter version of Art. 53(1)(a)) and publish a small training data summary for your fine-tuning dataset (per Art. 53(1)(d)). An audit is recommended for legal certainty.
Scenario D: Self-hosted Mistral for privacy-sensitive deployment
Situation: EU healthcare SaaS, self-hosts Mistral Small 3 for GDPR-compliant on-premise deployment, WITHOUT modifications.
Classification: Deployer of GPAI system + Provider of HIGH-RISK system (Annex III #5 healthcare).
What to do: verify Mistral published Art. 53 compliance docs (Mistral has model cards + tech reports, but check for the formal AI Act Art. 53 documentation). Your obligations = full Art. 9-15 for the healthcare AI system. Self-hosting an unmodified model does NOT make you a GPAI provider: you're deploying a model Mistral made.
Enforcement timeline — what and when
| Date | What becomes enforced |
|---|---|
| 02.02.2025 | Art. 5 (banned practices) + Art. 4 (AI literacy) — already live |
| 02.08.2025 | GPAI provider obligations Art. 53-55 — already live for new models |
| 02.08.2026 | High-risk obligations Art. 9-15 (Annex III) + most enforcement provisions |
| 02.08.2027 | GPAI obligations for models placed on market before 02.08.2025 (transition period) |
| 02.08.2027 | Annex I high-risk (AI as safety components under product regulations) |
| 02.08.2030 | Legacy high-risk systems used by public authorities must be brought into compliance (Art. 111(2)) |
Important for SMBs: if you use the GPT-4 API in 2026, OpenAI's provider obligations have applied since 02.08.2025 and should already be met. You can request their AI Act compliance documentation as part of your vendor management.
Vendor checklist — what to verify with your GPAI provider
Before deploying GPAI in EU production, check whether your provider has published the following (a record-keeping sketch follows the list):
- Annex XI-compliant technical documentation (a model card may be enough for smaller providers)
- Acceptable use policy with explicit EU AI Act references
- Copyright policy — how they respect Art. 4 DSM Directive, whether they honour robots.txt / ai.txt opt-outs
- Training data summary per the European AI Office template (publicly available on their site)
- Incident reporting channel — where to report something problematic generated by their model
- Systemic risk status — whether > 10²⁵ FLOPs (if so, additional Art. 55 docs)
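One way to make the verification stick: a due-diligence record per provider, saved with a date and an evidence link. Field names are mine, and the URLs are placeholders you'd replace with the pages you actually find:

```python
# Due-diligence record per GPAI vendor; keep one alongside your
# internal 1-pager. All field names are illustrative.
from dataclasses import dataclass
from datetime import date

@dataclass
class VendorCheck:
    provider: str
    annex_xi_docs: bool          # technical documentation published?
    acceptable_use_policy: bool
    copyright_policy: bool       # Art. 4 DSM opt-out handling described?
    training_data_summary: bool  # AI Office template format?
    incident_channel: str        # where to report problematic output
    systemic_risk: bool          # > 10^25 FLOPs -> expect Art. 55 docs too
    verified_on: date
    evidence_url: str            # link the page you actually verified

check = VendorCheck(
    provider="OpenAI",
    annex_xi_docs=True,
    acceptable_use_policy=True,
    copyright_policy=True,
    training_data_summary=True,
    incident_channel="<their reporting form>",
    systemic_risk=True,
    verified_on=date(2026, 5, 4),
    evidence_url="<link to their compliance page>",
)
```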
Provider status (as of 04.05.2026):
- OpenAI — full Art. 53 compliance, training data summary live, copyright policy detailed. Systemic risk classified.
- Anthropic — full Art. 53, model cards, copyright policy. Systemic risk classified for Claude 3.5 Opus+.
- Google — Gemini compliance docs live. Mixed disclosure quality for smaller models.
- Mistral — partial. Tech reports + model cards exist; the full Art. 53(1)(d) summary in the template format landed only in Q1 2026.
- Meta (Llama) — model cards + a responsible AI license; formal Art. 53 documentation is partial.
- DeepSeek / Qwen / other Chinese providers — minimal EU AI Act compliance documentation. High risk for EU production deployment.
Pitfalls for SMBs — 4 most common mistakes
1. Confusion: "we use AI = we are a GPAI provider"
Mistake: you use an API (OpenAI, Claude) → you think you have Art. 53 obligations.
Reality: you're a deployer. Provider obligations sit with OpenAI/Anthropic. You have light obligations (ToS, internal docs).
2. Skipping the provider's copyright policy
Mistake: you deploy open-source GPAI without checking whether their training data respected copyright.
Reality: if the provider breaches Art. 53(1)(c) and the European AI Office launches an investigation, the vendor may be forced to withdraw the model from the EU market, and you lose your dependency.
3. Fine-tuning = automatic provider status
Mistake: you do a LoRA fine-tune on 5k examples → you fear you're a GPAI provider with full Art. 53.
Reality: small fine-tunes ≠ significant modification per Recital 109. Full pre-training from scratch or fine-tune with €1M+ compute = potential provider. In between = grey zone, audit needed.
4. GPAI in high-risk system = "OpenAI handles it"
Mistake: SaaS uses GPT-4 for CV screening → thinks "OpenAI has compliance, we're fine".
Reality: your system is a separate high-risk one (Annex III #4). OpenAI's compliance for the GPAI model doesn't release you from full Art. 9-15 obligations for your system. These are two different responsibilities in the supply chain.
Action items for EU SMB SaaS
- Inventory your GPAI integrations. List per system: provider, model name, use case, whether user-facing or internal, whether it falls under high-risk Annex III.
- Verify provider compliance. For each vendor: find their AI Act compliance page (usually /policies/eu-ai-act or /trust). Save the link + date last verified.
- Internal documentation. 1-page per system: name, what it does, what model, what guardrails, what monitoring. Enough for a deployer.
- Art. 50 transparency check. User-facing AI? Add disclosure ("Content/response generated by AI"). Log implementation date.
- High-risk overlap check. If any system falls under Annex III — full Art. 9-15 obligations, regardless of using GPAI. Annex III decision tree →
Audit your AI stack against GPAI + Annex III
I map every AI system: provider/deployer status, Art. 53 compliance verification, high-risk overlap, fix roadmap. €799 founding rate, 5-7 days delivery, 100% async.
Order audit · Reply in 4h →
Or check the Penalty Calculator to estimate your exposure (60 seconds).
FAQ
Do I need to publish a training data summary if I fine-tune Llama 3?
Most likely NO — a LoRA fine-tune ≠ significant modification per Recital 109. If your fine-tuning compute > 10²² FLOPs (rare), yes. Conservatively: a short note describing your fine-tuning dataset (1-2 pages) as vendor due diligence for downstream users.
Is Stable Diffusion a GPAI?
Yes: an image-generation foundation model. SD 1.5 / 2.x / 3.x are all GPAI per Art. 3(63). Stability AI has its own provider obligations.
What about Chinese models (DeepSeek, Qwen)?
If you use them in EU production = high regulatory risk. Most Chinese providers haven't published formal Art. 53 documentation per the European AI Office template. The EU AI Office may launch an investigation and the model may be forced off the EU market, leaving you without your vendor. For mission-critical deployments, prefer European providers (Mistral, Aleph Alpha) or US providers with EU compliance (OpenAI, Anthropic).
Are embedding models (text-embedding-3, voyage-2) GPAI?
Edge case. On a strict reading of Art. 3(63)'s "wide range of distinct tasks", embedding models are typically single-task (vector representation). Most legal opinion: NOT GPAI in the EU AI Act sense, so provider obligations are lighter. But always check the vendor's own classification.
What does "significant modification" mean for fine-tuning?
Recital 109 doesn't give a precise definition. Practical heuristics used by compliance counsel:
- Fine-tune compute < 10²² FLOPs → typically not significant
- Fine-tune compute > 10²³ FLOPs → likely significant (€10M+ training run)
- Architecture change (e.g. adding layers, MoE conversion) → significant regardless of compute
- Domain adaptation without architecture change → typically not significant
Conservatively: if uncertain, treat the modification as significant and document yourself as a derived-model provider; that beats underestimating. The heuristics are condensed into code below.
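A minimal sketch, assuming the rule-of-thumb thresholds above:

```python
# The heuristics above as code; thresholds are rules of thumb
# from compliance practice, not definitions from the Act.

def finetune_significance(flops: float, architecture_changed: bool) -> str:
    if architecture_changed:
        return "significant, regardless of compute"
    if flops > 1e23:
        return "likely significant"
    if flops < 1e22:
        return "typically not significant"
    return "grey zone: audit recommended"

print(finetune_significance(5e20, architecture_changed=False))
# -> typically not significant (e.g. a small LoRA fine-tune)
```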
— Piotr Reder, aiactaudit.pl. 4 May 2026.