Re-score and rank documents with cross-encoder precision on European GPUs. Boost your RAG pipeline accuracy without your data leaving the EU.
We run the Qwen3 Reranker family: instruction-aware cross-encoder models that score query-document relevance with high precision. 100+ languages, 32K context. Well suited as the second stage after embedding search.
All models run on Blackwell or newer chips for best performance. Pricing is per million tokens, and the free tier covers every model.
Reranking is the precision layer in modern retrieval systems. Add a reranker after embedding search to re-score the retrieved candidates against the query and put the most relevant ones first.
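To make the two-stage idea concrete, here is a minimal sketch of the second stage. It assumes the candidates have already come back from embedding search. The `toy_overlap_score` function is a hypothetical stand-in for a real cross-encoder call, used only so the example runs offline; it is not how the Qwen3 Reranker scores documents.

```python
from typing import Callable, List, Tuple


def rerank_top_k(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and return the top_k highest."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): share of query words found in the doc.

    A real pipeline would call the reranker model here instead.
    """
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


# Candidates as they might come back from the embedding search stage.
candidates = [
    "Berlin is the capital of Germany.",
    "GPUs accelerate matrix multiplication.",
    "The capital of France is Paris.",
]

ranked = rerank_top_k("capital of France", candidates, toy_overlap_score, top_k=2)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```

The reranker only sees the short candidate list, so the cheaper embedding search still does the broad recall work and the cross-encoder spends its compute on the few documents that matter.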
The Reranking API uses a simple scoring endpoint. Send a query and a list of documents, get back relevance scores.
The endpoint returns a relevance score between 0 and 1 for each document. It supports instruction-aware reranking via the instruction parameter.
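As a rough illustration of the request and response handling, the sketch below builds a JSON body and orders documents by their returned scores. Only the `instruction` parameter name comes from the description above. The field names `model`, `query` and `documents`, the model id `qwen3-reranker`, and the idea that scores come back as a list in document order are all assumptions made for this example, so check the API reference for the real schema.

```python
import json
from typing import Dict, List, Optional, Tuple


def build_rerank_payload(
    query: str,
    documents: List[str],
    model: str,
    instruction: Optional[str] = None,
) -> Dict[str, object]:
    """Build a request body. Field names other than 'instruction' are assumed."""
    payload: Dict[str, object] = {
        "model": model,
        "query": query,
        "documents": documents,
    }
    if instruction is not None:
        # Instruction-aware reranking: tell the model what "relevant" means.
        payload["instruction"] = instruction
    return payload


def order_by_score(
    documents: List[str], scores: List[float]
) -> List[Tuple[str, float]]:
    """Pair each document with its score and sort from most to least relevant."""
    if len(documents) != len(scores):
        raise ValueError("expected exactly one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


docs = [
    "Refunds are processed within 14 days.",
    "Our office is closed on public holidays.",
]
payload = build_rerank_payload(
    query="How long does a refund take?",
    documents=docs,
    model="qwen3-reranker",  # hypothetical model id
    instruction="Rank passages that answer the customer's question.",
)
print(json.dumps(payload, indent=2))

# Illustrative scores in the 0-1 range, not real model output.
example_scores = [0.93, 0.04]
ranking = order_by_score(docs, example_scores)
```

Keeping the payload builder separate from the HTTP call makes it easy to swap in whichever client and authentication scheme the service actually requires.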
100K free tokens per month. No credit card required. All models included.
Create free account