Cross-model embedding translation
Translate embeddings from any source model into any target model's space. Query a Pinecone index built with OpenAI embeddings using a local MiniLM model, or vice versa. Adapters learn the mapping between embedding spaces so you never need to re-embed your data.
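A minimal sketch of what this could look like in Python. The `embedding_adapters` import, the `Adapter.load` constructor, and its keyword arguments are illustrative assumptions, not the library's confirmed API; only the MiniLM model name is a real, published model:

```python
from sentence_transformers import SentenceTransformer
from embedding_adapters import Adapter  # assumed import path, not confirmed

# Embed the query locally with MiniLM (384 dimensions).
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
query_vec = encoder.encode("how do I rotate my API keys?")

# Load an adapter that maps MiniLM's space into the space the index was
# built with; the loader signature and identifiers here are assumptions.
adapter = Adapter.load(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
)

# The result is a plain vector in the target space, ready to send to the
# existing OpenAI-embedded Pinecone index.
translated = adapter.translate(query_vec)
```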
Confidence scoring
Every translation returns a confidence score that tells you how reliable the adapted embedding is for your specific input. Use this for smart routing: fall back to the native model when confidence is low, or use the adapter when it's high to save cost and latency.
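In practice that enables a router like the sketch below. The `return_confidence` keyword is an assumption about the API shape, and the threshold is something you'd tune on held-out queries:

```python
def route_query(query_text, encoder, adapter, native_embed, threshold=0.8):
    """Route between the cheap adapter path and the native provider.

    `return_confidence=True` is an assumed API shape; `native_embed` is
    any callable that hits the target provider (e.g. the OpenAI API).
    """
    local_vec = encoder.encode(query_text)
    vec, confidence = adapter.translate(local_vec, return_confidence=True)
    if confidence >= threshold:
        return vec                   # confident: local, fast, free
    return native_embed(query_text)  # uncertain: pay for the native model
```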
CLI for quick experiments
Test translations instantly from the command line without writing any code. Pipe in text, get back adapted embeddings in JSON format. Great for prototyping and debugging retrieval pipelines.
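When a prototype graduates into a pipeline, the same CLI can be driven from a script. The subcommand name and flags below are guesses at the shape, not documented usage:

```python
import json
import subprocess

# Pipe text into the CLI and parse the JSON it prints. The command name
# and flags here are assumptions, not documented usage.
proc = subprocess.run(
    [
        "embedding-adapters", "translate",
        "--source", "sentence-transformers/all-MiniLM-L6-v2",
        "--target", "openai/text-embedding-3-small",
    ],
    input="how do I rotate my API keys?",
    capture_output=True,
    text=True,
    check=True,
)
payload = json.loads(proc.stdout)  # e.g. {"vector": [...], "confidence": ...}
```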
Works with any vector database
EmbeddingAdapters is database-agnostic. It operates on the embedding vectors themselves, so it works alongside Pinecone, Weaviate, Qdrant, pgvector, Milvus, FAISS, ChromaDB, or any other vector store. No vendor lock-in, no special integrations needed.
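The contract is vectors in, vectors out. As a stand-in for any of those stores, here is a brute-force cosine search in NumPy; in production you'd replace it with your store's own query call:

```python
import numpy as np

def cosine_top_k(query_vec, doc_matrix, k=5):
    """Brute-force stand-in for a vector store: rank the rows of
    doc_matrix (your existing document embeddings) against an
    adapter-translated query vector."""
    q = query_vec / np.linalg.norm(query_vec)
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = docs @ q
    top = np.argsort(scores)[::-1][:k]
    return top, scores[top]

# Nothing above is database-specific: swap it for index.query(...) in
# Pinecone, collection.query(...) in Chroma, or a pgvector SQL query.
```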
Multiple adapter sizes
Choose the right trade-off for your use case. Small adapters are lightweight and fast for edge deployment. Large adapters deliver maximum translation fidelity for production search systems. The --flavor flag switches between them instantly.
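Assuming the Python loader mirrors the CLI flag (an assumption; only `--flavor` itself is mentioned above), picking a size might look like:

```python
from embedding_adapters import Adapter  # assumed import path

common = dict(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
)

edge = Adapter.load(flavor="small", **common)  # lightweight: edge, serverless
prod = Adapter.load(flavor="large", **common)  # maximum translation fidelity
```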
Provider migration without downtime
Switching from one embedding provider to another? Instead of re-embedding millions of documents (which can take days and cost thousands), use an adapter to query your existing index with the new model immediately. Migrate gradually on your own schedule.
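A staged cutover could look like the sketch below: reads go through the adapter from day one while a background job re-embeds at its own pace. `old_index.query` stands in for your store's search call, and the adapter here runs in the new-model-to-old-index direction; all names are hypothetical:

```python
def search_during_migration(query_text, new_encoder, adapter, old_index, k=10):
    """Serve queries with the new provider immediately (assumed APIs).

    The new model embeds the query, and the adapter maps that vector
    back into the space the existing index was built in, so the old
    index keeps serving traffic unchanged.
    """
    new_vec = new_encoder.encode(query_text)
    old_space_vec = adapter.translate(new_vec)
    return old_index.query(vector=list(old_space_vec), top_k=k)

# Meanwhile a background job re-embeds documents batch by batch; once it
# finishes, drop the adapter and point queries at the new index natively.
```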
A/B testing embedding models
Compare how different models perform on your specific data without maintaining separate indexes. Translate queries into each model's space, run retrieval, and compare results, all against a single vector store.
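A comparison harness can be as small as the sketch below. Everything here is an assumed shape: `candidates` pairs each model with an adapter into the shared index's space, and `query_index` is whatever search call your store exposes:

```python
def ab_compare(query_text, candidates, query_index, k=10):
    """Retrieve top-k IDs for each candidate model against one index.

    `candidates` maps a model name to an (encoder, adapter) pair, where
    the adapter translates that model's embeddings into the space the
    shared index was built in (assumed shapes, not a confirmed API).
    """
    results = {}
    for name, (encoder, adapter) in candidates.items():
        vec = adapter.translate(encoder.encode(query_text))
        results[name] = query_index(vec, k)  # list of document IDs
    return results

# Usage, given encoders, adapters, and a search callable defined elsewhere:
runs = ab_compare("reset 2fa", {"minilm": (minilm, a1), "e5": (e5, a2)}, search)
overlap = set(runs["minilm"]) & set(runs["e5"])  # where the models agree
```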
GPU & CPU support
Adapters run on CUDA when available for batch processing, and fall back gracefully to CPU. The models are small enough (typically <50MB) for edge deployment, serverless functions, or even in-browser inference.
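If the adapters are ordinary PyTorch modules (a reasonable guess given the sizes quoted, but still a guess), device handling is the standard pattern:

```python
import torch

# Standard PyTorch device selection: use CUDA for batch throughput when
# present, otherwise fall back to CPU. Assumes the adapter is an
# nn.Module; adapter_module and query_vectors are defined elsewhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
adapter_module = adapter_module.to(device)

with torch.no_grad():
    batch = torch.as_tensor(query_vectors, dtype=torch.float32, device=device)
    translated = adapter_module(batch).cpu().numpy()
```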
Custom adapter training
Train adapters on your own domain data for higher fidelity translations on specialized corpora. Legal documents, medical records, code repositories: custom-trained adapters consistently outperform the general-purpose public models on domain-specific retrieval tasks.
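The library's actual training entry point isn't shown here, so as a conceptual stand-in: the core idea reduces to fitting a map between paired embeddings of the same domain texts from both models. A plain least-squares linear map illustrates it (file names and shapes are hypothetical):

```python
import numpy as np

# Conceptual stand-in for the real trainer: fit a linear map W so that
# source_embs @ W approximates target_embs, using paired embeddings of
# the *same* domain texts (say, your legal corpus) from both models.
source_embs = np.load("minilm_domain_embs.npy")  # shape (n, 384), hypothetical
target_embs = np.load("openai_domain_embs.npy")  # shape (n, 1536), hypothetical

W, residuals, rank, _ = np.linalg.lstsq(source_embs, target_embs, rcond=None)

def translate_linear(vec):
    return vec @ W  # maps a 384-dim MiniLM vector into the 1536-dim space
```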
Fully local & private
Everything runs on your infrastructure. No data leaves your network, no API calls to external services during inference. Adapters are deterministic โ the same input always produces the same output, making them safe for regulated environments and reproducible pipelines.
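Determinism is easy to verify for yourself, under the same assumed API as the sketches above:

```python
import numpy as np

# Same input, same output: no sampling, no external calls.
# encoder and adapter are the objects from the earlier sketches.
v = encoder.encode("audit trail for Q3 payouts")
assert np.array_equal(adapter.translate(v), adapter.translate(v))
```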