
Sovereign AI Safety,
built for Europe.

German HQ 100% EU data residency

Classify and moderate AI content on European GPUs. Detect unsafe prompts and responses in real time, without your data leaving the EU.

Create free account 100K tokens/month free
// models + pricing

Guard Models

We run the Qwen3Guard family, Qwen's first generation of safety guardrail models. They classify content as safe, unsafe, or controversial across 119 languages, and are available in Gen mode for complete texts and Stream mode for real-time token-by-token moderation.

All models run on NVIDIA Blackwell-generation or newer GPUs for optimal performance. Pricing is per million tokens. A free tier is included on all models.

Qwen3Guard-0.6B
Fast, lightweight moderation. Ideal for real-time content filtering at scale.
Input: 0,02 € / 1M tokens Coming soon
Parameters0.6B
Context32K tokens
Languages119
ModesGen + Stream
Qwen3Guard-4B
Balanced accuracy and speed. Production-ready safety layer for any application.
Input: 0,06 € / 1M tokens Coming soon
Parameters4B
Context32K tokens
Languages119
ModesGen + Stream
Qwen3Guard-8B
Maximum detection accuracy for critical safety and compliance workloads.
Input: 0,10 € / 1M tokens Coming soon
Parameters8B
Context32K tokens
Languages119
ModesGen + Stream
Free tier
100K tokens/month · All models · 10 req/min · No credit card
// what you can build

Use Cases

Guard models are the safety layer for production AI systems. Add content moderation to any LLM pipeline without building your own classifiers.

Input Moderation
Screen user prompts before they reach your LLM. Block harmful requests, detect prompt injection attempts, and enforce content policies automatically.
Output Safety
Evaluate LLM responses before showing them to users. Catch hallucinated harmful content, PII leaks, and policy violations in real time with Stream mode.
Compliance & Audit
Log safety classifications for every interaction. Meet regulatory requirements with structured risk assessments, threat categories, and audit trails.
Multilingual Content Safety
Moderate content across 119 languages without separate models per language. Essential for European platforms serving diverse markets.
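As a sketch, an input-moderation gate can map the three Qwen3Guard labels named above (safe, unsafe, controversial) to an allow/review/block decision before a prompt reaches your main LLM. The decision policy here is an assumption for illustration, not a documented API behavior:

```python
# Minimal input-moderation gate. The labels "safe", "unsafe", and
# "controversial" come from the model description above; how strictly
# to treat "controversial" is a policy choice, sketched here.

BLOCKING_LABELS = {"unsafe"}        # always block
REVIEW_LABELS = {"controversial"}   # route to review, or block in strict mode

def moderation_decision(label: str, strict: bool = False) -> str:
    """Map a guard classification label to 'allow', 'review', or 'block'."""
    label = label.strip().lower()
    if label in BLOCKING_LABELS:
        return "block"
    if label in REVIEW_LABELS:
        return "block" if strict else "review"
    return "allow"

print(moderation_decision("Safe"))                         # -> allow
print(moderation_decision("unsafe"))                       # -> block
print(moderation_decision("controversial", strict=True))   # -> block
```

Running prompts through a gate like this before the main model sees them is the pattern described under Input Moderation above.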
// for teams that need more
Need more? The Business Plan covers all Nodion.ai products: Inference, Embeddings, and more. 500 €/month, 50M tokens, dedicated GPU capacity, 99.5% SLA.
View Business Plan →
// getting started

API Documentation

The Guard API uses the chat completions endpoint with a guard model. Send content to classify and receive safety assessments.

# Base URL
https://api.nodion.ai/v1
# Example: curl
curl https://api.nodion.ai/v1/chat/completions \
  -H "Authorization: Bearer $NODION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3guard-0.6b",
    "messages": [
      {"role": "user", "content": "Is this text safe?: Hello, how are you today?"}
    ]
  }'

Returns a safety classification with risk level, category, and explanation. Supports both Gen (complete text) and Stream (token-by-token) modes.
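Since the response follows the chat-completions shape, the verdict arrives as the assistant message's content. A minimal parsing sketch, assuming a hypothetical "Safety: …" first line (the exact verdict format is an assumption, not the documented schema):

```python
import json

# Hypothetical example response in the OpenAI chat-completions shape;
# the verdict text inside "content" is assumed for illustration.
raw = json.dumps({
    "choices": [
        {"message": {"role": "assistant",
                     "content": "Safety: Safe\nCategories: None"}}
    ]
})

def extract_verdict(response_json: str) -> str:
    """Pull the first line of the guard model's reply as the verdict."""
    data = json.loads(response_json)
    content = data["choices"][0]["message"]["content"]
    return content.splitlines()[0]

print(extract_verdict(raw))  # -> Safety: Safe
```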

// why this matters
GDPR-native. Not a policy checkbox; it's how the infrastructure is built. No data leaves the EU. No transatlantic transfers. No adequacy decision risks.
Nordic green energy. GPU clusters in Sweden and Finland run on renewable energy. Cold climate means natural cooling, lower energy waste, smaller footprint.
No US dependency. German company. EU servers. Open-source models. Full stack sovereignty without hyperscaler lock-in.
Open-source only. Every model we serve is fully open. You can inspect the weights, understand the architecture, audit the outputs.
OpenAI-compatible API. Drop-in replacement. Change your base URL and you're running on sovereign European infrastructure.
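Because the API is OpenAI-compatible, the curl request above can also be expressed with the official openai Python SDK, changing only the base URL and model name. A sketch (the payload builder mirrors the curl body; the SDK call is illustrative and uses a placeholder key):

```python
def build_guard_request(text: str, model: str = "qwen/qwen3guard-0.6b") -> dict:
    """Build the same request body as the curl example in the docs."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    }

def classify(text: str) -> str:
    """Send the request through the openai SDK (requires `pip install openai`)."""
    from openai import OpenAI
    client = OpenAI(
        base_url="https://api.nodion.ai/v1",  # base URL from the docs above
        api_key="YOUR_NODION_API_KEY",        # placeholder
    )
    resp = client.chat.completions.create(**build_guard_request(text))
    return resp.choices[0].message.content
```

Existing code built against the OpenAI client needs only the `base_url` change; the request and response shapes stay the same.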

Ready to start?

100K free tokens per month. No credit card required. All models included.

Create free account