Reranking is a second-stage retrieval step that re-scores documents against a query using a cross-encoder model. It significantly improves precision over embedding-only search by evaluating query-document pairs directly.

Is the Nodion.ai Reranking API GDPR-compliant?

Yes. Like all Nodion.ai services, the Reranking API runs entirely on EU-based GPU infrastructure in Sweden and Finland, and no data leaves the EU. The service is operated by Nodion GmbH, a German company.

How does reranking improve RAG?

In a RAG pipeline, embedding search retrieves a broad set of candidates quickly. A reranker then scores each candidate against the query with higher accuracy, selecting only the most relevant chunks for the LLM. This reduces noise and improves answer quality.

What reranking models does Nodion.ai offer?

Nodion.ai offers the Qwen3 Reranker model family in three sizes: 0.6B, 4B, and 8B. All models support 32K context, 100+ languages, and instruction-aware reranking. The models predict relevance scores between 0 and 1 for query-document pairs.
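The model cards list the scoring method as "yes/no logits". A common way to turn two such logits into a score between 0 and 1 is a softmax over the pair, which is the same as a sigmoid of their difference. The sketch below illustrates that calculation only. The function name is hypothetical, and whether the Nodion.ai service normalizes in exactly this way is an assumption.

```python
import math


def relevance_from_logits(yes_logit: float, no_logit: float) -> float:
    """Return P("yes") from a two-way softmax over the yes/no logits.

    Assumption: the reranker exposes one logit for "yes" (relevant) and
    one for "no" (not relevant). softmax([no, yes])[1] equals
    sigmoid(yes - no), computed here in a numerically stable way.
    """
    diff = yes_logit - no_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Equal logits mean the model is undecided, so the score is 0.5.
    print(relevance_from_logits(0.0, 0.0))
    # A much larger "yes" logit pushes the score towards 1.
    print(round(relevance_from_logits(2.0, -1.0), 3))
```

Because only the difference between the two logits matters, the score is unaffected by adding the same constant to both.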


Sovereign AI Reranking,
built for Europe.

German HQ · 100% EU data residency

Re-score and rank documents with cross-encoder precision on European GPUs. Boost your RAG pipeline accuracy without your data leaving the EU.

Create free account · 100K tokens/month free

// models + pricing

Reranker Models

We run the Qwen3 Reranker family: instruction-aware cross-encoder models that score query-document relevance with high precision. 100+ languages, 32K context. Perfect as the second stage after embedding search.

All models run on Blackwell or newer GPUs. Pricing is per million tokens, and a free tier is included on all models.

Qwen

Qwen3-Reranker-0.6B

Fast, lightweight reranking. Ideal for high-throughput RAG pipelines.

Input: 0.02 € / 1M tokens (coming soon)

Parameters: 0.6B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits

Score comparison:
Qwen3-Reranker-0.6B: 65.8
Cohere Rerank v3: 67.1
bge-reranker-v2: 62.4
Jina Reranker v2: 63.8

Qwen3-Reranker-4B

Best balance of speed and accuracy. Production-ready for RAG.

Input: 0.06 € / 1M tokens (coming soon)

Parameters: 4B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits

Score comparison:
Qwen3-Reranker-4B: 72.1
Cohere Rerank v3: 67.1
bge-reranker-v2: 62.4
Jina Reranker v2: 63.8

Qwen3-Reranker-8B

Maximum reranking quality for critical retrieval workloads.

Input: 0.10 € / 1M tokens (coming soon)

Parameters: 8B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits

Score comparison:
Qwen3-Reranker-8B: 72.9
Cohere Rerank v3: 67.1
bge-reranker-v2: 62.4
Jina Reranker v2: 63.8

Free tier

100K tokens/month · All models · 10 req/min · No credit card
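To get a feel for the per-million-token pricing, here is a rough cost estimate in Python. It uses the listed prices and the 100K-token free tier. Two things are assumptions rather than documented behavior: that each query-document pair is billed for its combined token count, and that the free allowance is simply subtracted from monthly usage. The function and dictionary names are hypothetical.

```python
# Listed input prices in EUR per 1M tokens, keyed by model size.
PRICE_EUR_PER_MILLION = {"0.6B": 0.02, "4B": 0.06, "8B": 0.10}

# Free tier from the pricing section: 100K tokens per month.
FREE_TOKENS_PER_MONTH = 100_000


def estimate_monthly_cost_eur(
    model_size: str,
    queries_per_month: int,
    docs_per_query: int,
    avg_tokens_per_pair: int,
) -> float:
    """Estimate the monthly reranking cost in EUR.

    Assumptions (not confirmed by the pricing page): every
    query-document pair is billed for its combined tokens, and the
    free allowance is deducted before billing.
    """
    total_tokens = queries_per_month * docs_per_query * avg_tokens_per_pair
    billable_tokens = max(0, total_tokens - FREE_TOKENS_PER_MONTH)
    return billable_tokens * PRICE_EUR_PER_MILLION[model_size] / 1_000_000


if __name__ == "__main__":
    # 10,000 queries x 20 candidates x 300 tokens = 60M tokens per month.
    cost = estimate_monthly_cost_eur("4B", 10_000, 20, 300)
    print(f"{cost:.2f} EUR")
```

Under these assumptions, reranking 20 candidates of about 300 tokens each for 10,000 queries a month stays in the single-digit euro range on the 4B model.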

// what you can build

Use Cases

Reranking is the precision layer in modern retrieval systems. Add a reranker after embedding search to dramatically improve relevance.

RAG Pipeline Optimization

Retrieve 20 candidates with embeddings, rerank to the top 5. Your LLM gets only the most relevant context, producing better answers with less noise.
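The retrieve-then-rerank flow can be sketched in a few lines of Python. Both scoring functions are pluggable: in a real pipeline the first stage would be embedding similarity and the second a call to the reranker. Here a simple word-overlap scorer stands in for both so that the example runs offline. All names are hypothetical.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def two_stage_retrieve(
    query: str,
    corpus: List[str],
    first_stage: Scorer,
    reranker: Scorer,
    k_candidates: int = 20,
    top_n: int = 5,
) -> List[Tuple[float, str]]:
    """Pick k_candidates with a cheap scorer, then rerank to top_n."""
    # Stage 1: fast, approximate scoring over the whole corpus.
    candidates = sorted(
        corpus, key=lambda doc: first_stage(query, doc), reverse=True
    )[:k_candidates]
    # Stage 2: more precise scoring on the shortlist only.
    scored = [(reranker(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


def word_overlap(query: str, doc: str) -> float:
    """Toy stand-in scorer: share of query words that occur in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / (len(query_words) or 1)


if __name__ == "__main__":
    corpus = [
        "To cancel your subscription go to Settings > Billing > Cancel Plan.",
        "Our pricing starts at 10 EUR per month.",
        "You can upgrade your plan at any time.",
    ]
    results = two_stage_retrieve(
        "how do i cancel my subscription",
        corpus,
        first_stage=word_overlap,
        reranker=word_overlap,
        k_candidates=2,
        top_n=1,
    )
    print(results[0][1])
```

Keeping the two scorers as parameters means the shortlist size and the final cut-off can be tuned independently of the models behind them.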

Enterprise Search

Boost search accuracy for internal knowledge bases, legal documents, and support portals. Cross-encoder scoring understands nuance that keyword and vector search miss.

Multilingual Retrieval

Rerank documents across languages without translation. Query in German, match English documents, score by relevance. Ideal for European multilingual workloads.

E-Commerce & Recommendations

Re-score product search results by true relevance to the query. Improve conversion by surfacing the right products, not just similar ones.

// for teams that need more

Need more? The Business Plan covers all Nodion.ai products: Inference, Embeddings, and more. It costs 500 €/month and includes 50M tokens, dedicated GPU capacity, and a 99.5% SLA.

View Business Plan →

// getting started

API Documentation

The Reranking API uses a simple scoring endpoint. Send a query and a list of documents, get back relevance scores.

    # Base URL
    https://api.nodion.ai/v1

    # Example: curl
    curl https://api.nodion.ai/v1/rerank \
      -H "Authorization: Bearer $NODION_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "qwen/qwen3-reranker-0.6b",
        "query": "How do I cancel my subscription?",
        "documents": [
          "To cancel, go to Settings > Billing > Cancel Plan.",
          "Our pricing starts at 10 EUR per month.",
          "You can upgrade your plan at any time."
        ],
        "top_n": 2
      }'

The endpoint returns a relevance score between 0 and 1 for each document. It also supports instruction-aware reranking via the instruction parameter.
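The same request can be sent from Python using only the standard library. This is a minimal sketch, not an official client: the helper names are hypothetical, and passing instruction as a top-level string field is an assumption based on the parameter name mentioned above. The response schema is not documented here, so the sketch returns the parsed JSON as-is rather than guessing at field names.

```python
import json
import os
import urllib.request
from typing import List, Optional

BASE_URL = "https://api.nodion.ai/v1"  # base URL from the documentation


def build_rerank_payload(
    query: str,
    documents: List[str],
    model: str = "qwen/qwen3-reranker-0.6b",
    top_n: Optional[int] = None,
    instruction: Optional[str] = None,
) -> dict:
    """Build the JSON body for the rerank endpoint."""
    payload = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    if instruction is not None:
        # Assumption: "instruction" is a top-level string field.
        payload["instruction"] = instruction
    return payload


def rerank(payload: dict, api_key: str, timeout: float = 30.0) -> dict:
    """POST the payload to /rerank and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Only runs as a script, and only when an API key is configured.
if __name__ == "__main__" and os.environ.get("NODION_API_KEY"):
    body = build_rerank_payload(
        "How do I cancel my subscription?",
        [
            "To cancel, go to Settings > Billing > Cancel Plan.",
            "Our pricing starts at 10 EUR per month.",
            "You can upgrade your plan at any time.",
        ],
        top_n=2,
    )
    print(json.dumps(rerank(body, os.environ["NODION_API_KEY"]), indent=2))
```

Separating payload construction from the network call makes the request body easy to inspect and test without an API key.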

// why this matters

GDPR-native. This is not a policy checkbox; it's how the infrastructure is built. No data leaves the EU. No transatlantic transfers. No adequacy decision risks.

Nordic green energy. GPU clusters in Sweden and Finland run on renewable energy. Cold climate means natural cooling, lower energy waste, smaller footprint.

No US dependency. German company. EU servers. Open-source models. Full stack sovereignty without hyperscaler lock-in.

Open-source only. Every model we serve is fully open. You can inspect the weights, understand the architecture, audit the outputs.

OpenAI-compatible API. Drop-in replacement. Change your base URL and you're running on sovereign European infrastructure.

Ready to start?

100K free tokens per month. No credit card required. All models included.

Create free account
