Embeddings
Create vector embeddings for search, RAG, clustering, and semantic similarity. The gateway exposes a standard **OpenAI Embeddings** API at `POST /v1/embeddings`.
Base URL & authentication
Base URL: https://api.openadapter.in
All requests need an `Authorization: Bearer sk-cv-...` header. Generate or copy your API key from the Dashboard → API Keys page. Use your normal OpenAdapter API key; requests count toward your plan quota like chat completions.
POST https://api.openadapter.in/v1/embeddings

Discover models
Call GET /v1/models and use entries where model_type is "embedding". Your plan may restrict which models appear (for example, curated lists on specific plans).
Aliases are case-sensitive — use the exact id from the models list.
Some aliases (for example qwen3-embedding) may show model_type: "chat" in the models list for historical routing reasons but still work with POST /v1/embeddings. If the embeddings call succeeds, the alias is valid.
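A minimal sketch of this discovery step, using only Python's standard library. It assumes `API_KEY` is set in your environment; `model_type` is the gateway field described above, and the helper name `embedding_ids` is ours, not part of any SDK:

```python
import json
import os
import urllib.request


def embedding_ids(models):
    """Keep only the /v1/models entries whose model_type is "embedding"."""
    return [m["id"] for m in models if m.get("model_type") == "embedding"]


# Only call the gateway when a key is actually configured.
if os.environ.get("API_KEY"):
    req = urllib.request.Request(
        "https://api.openadapter.in/v1/models",
        headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        print(embedding_ids(json.load(resp)["data"]))
```

Remember the caveat above: an alias missing from this filtered list may still work with `POST /v1/embeddings`, so treat a successful embeddings call as the final word.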
Request body (OpenAI-compatible)
| Field | Required | Notes |
|---|---|---|
| `model` | Yes | Embedding alias, e.g. `qwen3-embedding-small` |
| `input` | Yes | A string or an array of strings to embed in one call |
| `encoding_format` | No | `float` (default) |
| `dimensions` | No | Reduces vector size only if the upstream model supports it |
| `user` | No | Optional end-user id for logging (string) |
The response shape matches OpenAI: `object`, `data[]` with `embedding` and `index`, `model`, and usually `usage` with `total_tokens`.
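An illustrative response with invented values (the `embedding` array is truncated for brevity; real vectors are much longer):

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.013, -0.042, 0.088]
    }
  ],
  "model": "qwen3-embedding-small",
  "usage": { "total_tokens": 4 }
}
```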
```shell
curl https://api.openadapter.in/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-embedding-small",
    "input": ["hello world", "second line"]
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="$API_KEY",
    base_url="https://api.openadapter.in/v1",
)
resp = client.embeddings.create(
    model="qwen3-embedding-small",
    input=["Hello from OpenAdapter", "Batch item two"],
)
for item in resp.data:
    print(len(item.embedding), "dimensions")
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: '$API_KEY',
  baseURL: 'https://api.openadapter.in/v1',
});
const resp = await client.embeddings.create({
  model: 'jina-embeddings-v5',
  input: 'Single string input is also valid.',
});
console.log(resp.data[0].embedding.length);
```

Choosing a model
Use one model per index or vector store — mixing embeddings from different models breaks similarity search.
| Model alias | Role | Vector size (typical) | Notes |
|---|---|---|---|
| `qwen3-embedding-small` | General text (default workhorse) | 1024 | Alias for Qwen3 0.6B embedding on OpenAdapter Edge |
| `Qwen3-Embedding-0.6B` | Same family as above | 1024 | Alternative display name for tooling that expects this id |
| `qwen3-embedding` | Same stack | 1024 | Convenience alias |
| `jina-embeddings-v5` | Strong multilingual / general retrieval | 1024 | Hosted on OpenAdapter Edge (Jina v5 text small) |
| `bge-m3` | Dense retrieval, popular for RAG | 1024 | OpenAdapter Edge (BAAI bge-m3) |
| `nv-embedqa-e5-v5` | NVIDIA embedding model | 1024 | Routed via NVIDIA |
Sizes above are what the gateway returns today; always verify with a test call if you rely on exact dimensionality for your DB schema.
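Once you have vectors from a single model, similarity search comes down to comparing them, most commonly with cosine similarity. A minimal, dependency-free sketch (in practice the vectors come from the embeddings endpoint, both from the same model):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors for illustration; real ones have e.g. 1024 dims.
print(cosine([1.0, 0.0, 1.0], [1.0, 0.0, 0.0]))
```

This is also why mixing models in one index breaks: vectors from different models live in unrelated spaces, so their cosine similarity is meaningless even when the dimensions happen to match.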
RAG and the dashboard
The Vector DB page in the dashboard uses embeddings for ingest and query. Pick an embedding model in the collection settings and keep it consistent for that collection.
Limits and behavior
- Quota: Each call consumes request quota (and token-based accounting where applicable), similar to other `/v1` routes.
- Rate limits: Your plan RPM and per-model tier limits apply unless your plan explicitly disables them.
- Routing: The gateway picks a healthy provider key for the alias; you do not pass provider credentials.
- Input size: Very long inputs may be truncated or rejected by the upstream model — stay within reasonable chunk sizes for your use case.
If the requested `model` is not allowed on your plan, the API returns an error listing the allowed models (when your plan uses model restrictions).
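A common way to stay within input limits is to split long documents into overlapping chunks before embedding. A rough sketch; the character budget and overlap here are arbitrary assumptions, not gateway limits:

```python
def chunk(text, max_chars=2000, overlap=200):
    """Split text into overlapping character windows for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Overlap keeps context that straddles a chunk boundary retrievable.
        start = end - overlap
    return chunks
```

Each chunk can then go into one `input` array entry, and the resulting vectors index back to their source chunks. Tune chunk size to your retrieval quality, not just the model's maximum.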