
Sovereign AI Embeddings,
built for Europe.

German HQ · 100% EU data residency

Turn text into vector embeddings on European GPUs.
Power semantic search, RAG, and classification
without your data leaving the EU.

Create free account · 100K tokens/month free
// models + pricing

Embedding Models

We run the Qwen3 Embedding family, the #1-ranked model family on the MTEB multilingual leaderboard: fully open weights, support for 100+ languages, up to 32K context length, and flexible output dimensions for any use case.

All models run on NVIDIA Blackwell or newer GPUs. Pricing is per million tokens, and a free tier is included on all models.

Qwen3-Embedding-0.6B · MTEB multilingual score: 64.3
Fast, lightweight embeddings. Ideal for high-throughput workloads.
Input: 0,02 € / 1M tokens (coming soon)
Parameters: 0.6B · Context: 32K tokens · Dimensions: up to 1024 · Languages: 100+
MTEB multilingual comparison: Qwen3-Embd-0.6B 64.3, text-embd-3-lg 58.9, Cohere multi v3 61.1, Gemini Embedding 68.4
Qwen3-Embedding-4B · MTEB multilingual score: 69.5
Balanced performance and efficiency. Great for production RAG.
Input: 0,06 € / 1M tokens (coming soon)
Parameters: 4B · Context: 32K tokens · Dimensions: up to 2560 · Languages: 100+
MTEB multilingual comparison: Qwen3-Embd-4B 69.5, text-embd-3-lg 58.9, Cohere multi v3 61.1, Gemini Embedding 68.4
Qwen3-Embedding-8B · MTEB multilingual score: 70.6
#1 on MTEB multilingual. Maximum quality for critical retrieval.
Input: 0,10 € / 1M tokens (coming soon)
Parameters: 8B · Context: 32K tokens · Dimensions: up to 4096 · Languages: 100+
MTEB multilingual comparison: Qwen3-Embd-8B 70.6, text-embd-3-lg 58.9, Cohere multi v3 61.1, Gemini Embedding 68.4
Free tier: 100K tokens/month · all models · 10 req/min · no credit card required
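The per-million-token pricing above works out with simple arithmetic. A quick sketch; the rates come from the model cards (listed as "coming soon", so they may change), and the 4B/8B model IDs below follow the naming pattern of the 0.6B ID from the API example and are assumptions:

```python
# Estimated embedding cost at the listed per-1M-token rates (EUR).
# Rates are "coming soon" on the cards above and may change.
# Only the 0.6B model ID appears in the docs; the 4B/8B IDs are assumed
# to follow the same pattern.
RATES_EUR_PER_1M = {
    "qwen/qwen3-embedding-0.6b": 0.02,
    "qwen/qwen3-embedding-4b": 0.06,
    "qwen/qwen3-embedding-8b": 0.10,
}

def estimate_cost_eur(model: str, tokens: int) -> float:
    """Cost in EUR for embedding `tokens` input tokens with `model`."""
    return tokens / 1_000_000 * RATES_EUR_PER_1M[model]

# Embedding a 10M-token corpus with the 0.6B model:
print(round(estimate_cost_eur("qwen/qwen3-embedding-0.6b", 10_000_000), 2))  # 0.2
```

Note that the 100K-token monthly free tier covers roughly a thousand short documents before any cost accrues.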
// what you can build

Use Cases

Text embeddings are the foundation of modern AI applications. Generate vector representations of text for a wide range of tasks.

Semantic Search
Find relevant documents by meaning, not just keywords. Build search that understands intent across 100+ languages.
Retrieval-Augmented Generation (RAG)
Ground your LLM responses in your own data. Embed documents, retrieve context, generate accurate answers.
Classification & Clustering
Categorize support tickets, group similar content, detect duplicates. Let embeddings do the heavy lifting.
Multilingual Matching
Match content across languages without translation. Ideal for European businesses operating in multiple markets.
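The semantic search and RAG flows above reduce to one core operation: compare the query's embedding against your document embeddings by cosine similarity and take the closest matches. A minimal, self-contained sketch with toy 3-dimensional vectors standing in for real API output (real vectors have up to 1024 to 4096 dimensions, depending on the model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=1):
    """Indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy vectors standing in for /v1/embeddings responses.
docs = [
    [0.9, 0.1, 0.0],   # "GDPR compliance guide"
    [0.1, 0.9, 0.1],   # "GPU cluster cooling"
    [0.8, 0.2, 0.1],   # "EU data residency FAQ"
]
query = [0.85, 0.15, 0.05]  # "where is my data stored?"

print(top_k(query, docs, k=2))  # the two data-residency documents rank first
```

In a RAG pipeline, the retrieved documents are then passed to the LLM as context; for classification, the same similarity is computed against label or centroid embeddings instead.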
// for teams that need more
Need more? The Business Plan covers all Nodion.ai products, including Inference and Embeddings: 500 €/month for 50M tokens, dedicated GPU capacity, and a 99.5% SLA.
View Business Plan →
// getting started

API Documentation

The Embeddings API is fully compatible with the OpenAI Embeddings API. Point any OpenAI SDK at our base URL and start generating embeddings.

# Base URL
https://api.nodion.ai/v1
# Example: curl
curl https://api.nodion.ai/v1/embeddings \
  -H "Authorization: Bearer $NODION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-embedding-0.6b",
    "input": "Sovereign AI infrastructure for Europe"
  }'

Supported endpoints: /v1/embeddings and /v1/models. Set flexible output dimensions via the dimensions parameter.
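The curl call above translates directly to any HTTP client. A sketch using only the Python standard library; the `build_request` helper is illustrative, not an official SDK, and the payload schema (including the optional `dimensions` parameter and the `data[0].embedding` response field) follows the OpenAI Embeddings API that the endpoint is documented as compatible with:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.nodion.ai/v1"

def build_request(model: str, text, dimensions=None) -> urllib.request.Request:
    """Build a POST request for /v1/embeddings (OpenAI-compatible schema)."""
    payload = {"model": model, "input": text}
    if dimensions is not None:
        # Optional output-dimension override, per the docs above.
        payload["dimensions"] = dimensions
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('NODION_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it (requires a valid NODION_API_KEY and network access):
# with urllib.request.urlopen(build_request(
#         "qwen/qwen3-embedding-0.6b",
#         "Sovereign AI infrastructure for Europe",
#         dimensions=512)) as resp:
#     vector = json.load(resp)["data"][0]["embedding"]
```

Because the schema is OpenAI-compatible, any OpenAI SDK pointed at the base URL should produce an equivalent request.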

// why this matters
GDPR-native. Not a policy checkbox; it's how the infrastructure is built. No data leaves the EU. No transatlantic transfers. No adequacy-decision risk.
Nordic green energy. GPU clusters in Sweden and Finland run on renewable energy. Cold climate means natural cooling, lower energy waste, smaller footprint.
No US dependency. German company. EU servers. Open-source models. Full stack sovereignty without hyperscaler lock-in.
Open-source only. Every model we serve is fully open. You can inspect the weights, understand the architecture, audit the outputs.
OpenAI-compatible API. Drop-in replacement. Change your base URL and you're running on sovereign European infrastructure.

Ready to start?

100K free tokens per month. No credit card required. All models included.

Create free account