OpenAdapter
Endpoints

Embeddings

Create vector embeddings for search, RAG, clustering, and semantic similarity. The gateway exposes a standard **OpenAI Embeddings** API at `POST /v1/embeddings`.

Base URL & authentication

Base URL: `https://api.openadapter.in`

All requests need an `Authorization: Bearer sk-cv-...` header. Generate or copy your API key from the **Dashboard → API Keys** page.

Use your normal OpenAdapter API key; requests count toward your plan quota like chat completions.

`POST https://api.openadapter.in/v1/embeddings`

Discover models

Call `GET /v1/models` and use entries where `model_type` is `"embedding"`. Your plan may restrict which models appear (for example, curated lists on specific plans).

Aliases are case-sensitive — use the exact `id` from the models list.

Some aliases (for example `qwen3-embedding`) may show `model_type: "chat"` in the models list for historical routing reasons but still work with `POST /v1/embeddings`. If the embeddings call succeeds, the alias is valid.
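The filtering step above can be sketched in a few lines. The `data` array and `model_type` field follow the description in this section; the sample payload itself is illustrative, not real gateway output:

```python
def embedding_aliases(models_response: dict) -> list[str]:
    """Return ids of models whose model_type is "embedding"."""
    return [
        m["id"]
        for m in models_response.get("data", [])
        if m.get("model_type") == "embedding"
    ]

# Illustrative sample of a GET /v1/models response.
sample = {
    "object": "list",
    "data": [
        {"id": "qwen3-embedding-small", "model_type": "embedding"},
        {"id": "some-chat-model", "model_type": "chat"},
    ],
}
print(embedding_aliases(sample))  # ['qwen3-embedding-small']
```

Remember that some working embedding aliases may be listed as `"chat"`, so treat this filter as a starting point, not the full allow-list.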

Request body (OpenAI-compatible)

| Field | Required | Notes |
| --- | --- | --- |
| `model` | Yes | Embedding alias, e.g. `qwen3-embedding-small` |
| `input` | Yes | A string or an array of strings to embed in one call |
| `encoding_format` | No | `float` (default) |
| `dimensions` | No | Reduces vector size only if the upstream model supports it |
| `user` | No | Optional end-user id for logging (string) |

The response shape matches OpenAI: `object`, `data[]` with `embedding`, `index`, `model`, and usually `usage` with `total_tokens`.

```bash
curl https://api.openadapter.in/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-embedding-small",
    "input": ["hello world", "second line"]
  }'
```
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["API_KEY"],  # your OpenAdapter key (sk-cv-...)
    base_url="https://api.openadapter.in/v1",
)

resp = client.embeddings.create(
    model="qwen3-embedding-small",
    input=["Hello from OpenAdapter", "Batch item two"],
)
for item in resp.data:
    print(len(item.embedding), "dimensions")
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.API_KEY, // your OpenAdapter key (sk-cv-...)
  baseURL: 'https://api.openadapter.in/v1',
});

const resp = await client.embeddings.create({
  model: 'jina-embeddings-v5',
  input: 'Single string input is also valid.',
});

console.log(resp.data[0].embedding.length);
```

Choosing a model

Use one model per index or vector store — mixing embeddings from different models breaks similarity search.

| Model alias | Role | Vector size (typical) | Notes |
| --- | --- | --- | --- |
| `qwen3-embedding-small` | General text (default workhorse) | 1024 | Alias for Qwen3 0.6B embedding on OpenAdapter Edge |
| `Qwen3-Embedding-0.6B` | Same family as above | 1024 | Alternative display name for tooling that expects this id |
| `qwen3-embedding` | Same stack | 1024 | Convenience alias |
| `jina-embeddings-v5` | Strong multilingual / general retrieval | 1024 | Hosted on OpenAdapter Edge (Jina v5 text small) |
| `bge-m3` | Dense retrieval, popular for RAG | 1024 | OpenAdapter Edge (BAAI bge-m3) |
| `nv-embedqa-e5-v5` | NVIDIA embedding model | 1024 | Routed via NVIDIA |

Sizes above are what the gateway returns today; always verify with a test call if you rely on exact dimensionality for your DB schema.

RAG and the dashboard

The Vector DB page in the dashboard uses embeddings for ingest and query. Pick an embedding model in the collection settings and keep it consistent for that collection.

Limits and behavior

  • Quota: Each call consumes request quota (and token-based accounting where applicable), similar to other /v1 routes.
  • Rate limits: Your plan RPM and per-model tier limits apply unless your plan explicitly disables them.
  • Routing: The gateway picks a healthy provider key for the alias; you do not pass provider credentials.
  • Input size: Very long inputs may be truncated or rejected by the upstream model — stay within reasonable chunk sizes for your use case.
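To stay within input limits, long documents are usually chunked before embedding. A naive character-based sketch (the window and overlap sizes are arbitrary assumptions — tune them for your model and use case):

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks


parts = chunk_text("x" * 2500, max_chars=1000, overlap=100)
print(len(parts))  # 3 overlapping windows
```

Each chunk can then go into a single `input` array element; token-aware or sentence-aware splitters generally give better retrieval quality than raw character windows.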

If your plan uses model restrictions and `model` is not on the allowed list, the API returns an error listing the allowed models.
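A sketch of surfacing such an error to users. The error body below assumes the OpenAI-style `{"error": {...}}` envelope; the exact `code` and message text are illustrative, not documented values:

```python
# Illustrative OpenAI-style error body; field values are assumptions.
error_body = {
    "error": {
        "message": "Model not allowed for this plan.",
        "type": "invalid_request_error",
        "code": "model_not_allowed",
    }
}


def describe_error(body: dict) -> str:
    """Format an OpenAI-style error envelope for logs or user display."""
    err = body.get("error", {})
    return f"{err.get('code', 'unknown')}: {err.get('message', '')}"


print(describe_error(error_body))
```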
