Universal embedding-space translation library. Plug-and-play adapters that map one model's vector space into another — locally, instantly, for free.
$ pip install embedding-adapters

# Translate an embedding between models
$ embedding-adapters embed \
    --source sentence-transformers/all-MiniLM-L6-v2 \
    --target openai/text-embedding-3-small \
    --flavor large \
    --text "Where can I get a hamburger?"
Stop choosing between embedding quality and operational sanity. Adapters let you move between models without touching your vector store.
Serve queries with a lightweight local model even when your index was built with a heavier cloud model. Get sub-10ms embeddings without sacrificing retrieval quality.
Move from one embedding provider to another incrementally. No need to re-embed millions of documents at once — translate on the fly during the transition.
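For example, here is a minimal sketch of the query path during such a transition, assuming your existing index was built with OpenAI's text-embedding-3-small and queries are now served by a local MiniLM model; the embed_query helper is illustrative, not part of the library:

from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

local_model = SentenceTransformer("all-MiniLM-L6-v2")
adapter = EmbeddingAdapter.from_registry(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
    flavor="large",
)

def embed_query(text: str):
    # Encode with the new local model, then translate into the space the
    # legacy index was built in, so old documents stay searchable while the
    # corpus is re-embedded in the background.
    # (Assumes translate() returns one translated row per input row, as in
    # the quickstart example further down.)
    src = local_model.encode([text], normalize_embeddings=True)
    return adapter.translate(src)[0]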
Handle rate limits, outages, and provider changes without breaking your retrieval pipeline. Route through local adapters when the cloud is unavailable.
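As a rough sketch of that fallback path (the OpenAI client usage and the broad exception handler are illustrative choices, not part of embedding-adapters):

from openai import OpenAI
from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

client = OpenAI()
local_model = SentenceTransformer("all-MiniLM-L6-v2")
adapter = EmbeddingAdapter.from_registry(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
    flavor="large",
)

def embed(texts):
    try:
        # Happy path: embed with the cloud provider.
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return [d.embedding for d in resp.data]
    except Exception:
        # Rate limit or outage: encode locally and translate into the same
        # space, so downstream retrieval keeps working unchanged.
        src = local_model.encode(texts, normalize_embeddings=True)
        return adapter.translate(src)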
Compare how different embedding providers behave on your data without committing to any single one. Sample multiple spaces from a single source model.
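One way this comparison loop could look in practice; which (source, target) pairs are actually available depends on the adapter registry, and the second target name below is purely illustrative:

from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

model = SentenceTransformer("all-MiniLM-L6-v2")
eval_queries = ["Where can I get a hamburger?", "How do I reset my password?"]
src_embs = model.encode(eval_queries, normalize_embeddings=True)

candidate_targets = [
    "openai/text-embedding-3-small",
    "cohere/embed-english-v3.0",  # illustrative; check the registry for real coverage
]

translated = {}
for target in candidate_targets:
    adapter = EmbeddingAdapter.from_registry(
        source="sentence-transformers/all-MiniLM-L6-v2",
        target=target,
        flavor="large",
    )
    # The same local embeddings, projected into each candidate provider's
    # space for side-by-side retrieval evaluation on your own data.
    translated[target] = adapter.translate(src_embs)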
Detect when a query is too complex for your local model and intelligently route it to a stronger provider. Built-in quality endpoints tell you when to escalate.
Slash embedding API costs by running common queries through a free local model with an adapter, and reserving cloud calls for only the hardest cases.
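Putting the two previous points together, escalation might look roughly like the sketch below. The quality_score function is a stand-in for the library's built-in quality scoring (see the docs for the real interface), and the 0.8 threshold is an assumption you would tune on your own data:

from openai import OpenAI
from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

client = OpenAI()
local_model = SentenceTransformer("all-MiniLM-L6-v2")
adapter = EmbeddingAdapter.from_registry(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
    flavor="large",
)

def quality_score(query: str, translated_vec) -> float:
    # Placeholder for the library's built-in quality scoring; a fixed value
    # keeps this sketch runnable. Swap in the real call from the docs.
    return 0.9

def embed_query(query: str, threshold: float = 0.8):
    src = local_model.encode([query], normalize_embeddings=True)
    vec = adapter.translate(src)[0]
    if quality_score(query, vec) >= threshold:
        return vec  # the free local path covers the common case
    # Escalate only the hard queries to the paid provider.
    resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
    return resp.data[0].embedding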
Load a pre-trained adapter from the registry, encode with your local model, and translate into any target embedding space — no API calls needed for the translation step.
from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

# 1) Load a lightweight local model
model = SentenceTransformer("all-MiniLM-L6-v2")

# 2) Load a pre-trained adapter
adapter = EmbeddingAdapter.from_registry(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
    flavor="large",
)

# 3) Encode locally, translate into OpenAI's space
texts = ["Where can I get a hamburger?"]
src_embs = model.encode(texts, normalize_embeddings=True)
translated = adapter.translate(src_embs)

# translated embeddings are now compatible
# with your OpenAI-indexed vector store!
Each adapter is a compact trained model that preserves the geometric relationships that matter for retrieval. The key is a carefully designed training process that prevents collapse.
Generate embeddings using your local or source model as you normally would.
The adapter applies a trained transformation, mapping vectors into the target space.
Query your existing vector index directly — no re-embedding, no downtime, no data migration.
MiniLM + Adapter vs. native OpenAI text-embedding-3-small on retrieval tasks.
One-line loading from HuggingFace. Adapters for popular model pairs are ready to go — no training required.
Use the command-line tool for quick experiments, or import the Python library for production integration.
Built-in quality scoring tells you whether an adapted embedding will work for your query — enabling smart routing.
Works alongside any vector database — Pinecone, Weaviate, Qdrant, pgvector, FAISS. No vendor lock-in.
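For example, with a FAISS index that was built from text-embedding-3-small vectors (FAISS is just one option; the other stores work the same way, and the index filename here is a placeholder):

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from embedding_adapters import EmbeddingAdapter

model = SentenceTransformer("all-MiniLM-L6-v2")
adapter = EmbeddingAdapter.from_registry(
    source="sentence-transformers/all-MiniLM-L6-v2",
    target="openai/text-embedding-3-small",
    flavor="large",
)

# An existing index built from OpenAI embeddings; no re-embedding required.
index = faiss.read_index("docs.faiss")

src = model.encode(["Where can I get a hamburger?"], normalize_embeddings=True)
query_vecs = np.asarray(adapter.translate(src), dtype="float32")
scores, ids = index.search(query_vecs, 10)  # top-10 neighbors in OpenAI's space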
Adapters can be custom-trained on your domain data for even better translation fidelity on specialized corpora.
Runs on CUDA when available, falls back gracefully to CPU. Lightweight enough for edge deployment.
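For instance, device selection for the local encoder can be as simple as the snippet below; how the adapter itself places its weights is handled internally by the library, so this only covers the SentenceTransformer side:

import torch
from sentence_transformers import SentenceTransformer

# Use the GPU when one is present, otherwise stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("all-MiniLM-L6-v2", device=device)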
Install the library, load an adapter, and query across embedding spaces in under a minute.
Our public adapters work well for general use, but domain-specific corpora deserve domain-specific translation. We train custom adapters on your data that consistently outperform our open-source adapters — higher recall, tighter semantic fidelity, and full support for proprietary embedding pipelines.