The /v1/embeddings endpoint converts text into numerical vectors that capture semantic meaning. You can embed a single string or a batch of strings in one request. The resulting vectors are suitable for semantic search, retrieval-augmented generation (RAG), clustering, and classification tasks.
Authentication
Include your API key in every request:
Authorization: Bearer YOUR_API_KEY
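As a minimal sketch, the header can be built in Python by reading the key from the environment instead of hard-coding it. The environment variable name TOKMODEL_API_KEY and the helper auth_headers are illustrative assumptions, not names defined by TokModel.

```python
import os


def auth_headers(api_key=None):
    """Return HTTP headers carrying the bearer token.

    TOKMODEL_API_KEY is a hypothetical environment variable name
    chosen for this sketch.
    """
    key = api_key or os.environ.get("TOKMODEL_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code makes it harder to leak through version control.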
Send an embeddings request
Provide an input (a string or array of strings) and a model. TokModel routes the request to the specified embedding model and returns one vector per input item.
curl https://tokmodel.com/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog."
  }'
Pass an array of strings to input to embed a batch in a single API call. The response contains one entry per input, in the same order.
curl https://tokmodel.com/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": [
      "How do I reset my password?",
      "Where can I find my invoices?",
      "How do I cancel my subscription?"
    ]
  }'
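The same single and batch requests can be sketched in Python using only the standard library. The URL, headers, and body fields mirror the curl examples; the helper names build_request and embed are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

API_URL = "https://tokmodel.com/v1/embeddings"  # same URL as the curl examples


def build_request(api_key, model, text_or_texts):
    """Build (but do not send) a POST request.

    text_or_texts may be a single string or a list of strings.
    """
    payload = {"model": model, "input": text_or_texts}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(api_key, model, text_or_texts):
    """Send the request and return the parsed JSON response."""
    request = build_request(api_key, model, text_or_texts)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling embed("YOUR_API_KEY", "openai/text-embedding-3-small", ["first text", "second text"]) would send one batch request; passing a plain string instead of a list sends a single input.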
Example response
Each object in the data array corresponds to one input string. The embedding field contains the raw float vector.
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        0.0023064255,
        -0.009327292,
        0.015797347,
        "..."
      ]
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
The embedding array above is truncated for readability. Real vectors typically contain 512 to 3072 dimensions depending on the model.
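To illustrate how the response maps back to the inputs, here is a small sketch that extracts the vectors from a parsed response. Sorting by the index field is a defensive choice made for this example; the helper name vectors_in_input_order is hypothetical, and the sample payload is hand-written with made-up two-dimensional vectors.

```python
import json


def vectors_in_input_order(response):
    """Return the embedding vectors, ordered by each item's index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written sample shaped like the response above (values are made up).
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]}
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {"prompt_tokens": 10, "total_tokens": 10}
}
""")

vectors = vectors_in_input_order(sample)
print(len(vectors), len(vectors[0]))  # number of vectors, dimensions per vector
```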
Compute cosine similarity
After embedding two pieces of text, compare them with cosine similarity. Scores range from -1 to 1; a score close to 1.0 means the texts are semantically similar.
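import numpy as np
from openai import OpenAI

# Assumption: the endpoint is OpenAI-compatible, so the openai SDK can be
# pointed at the same base URL that the curl examples use.
client = OpenAI(base_url="https://tokmodel.com/v1", api_key="YOUR_API_KEY")


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


# Embed both sentences in one batch request.
response = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input=[
        "How do I reset my password?",
        "I forgot my password, what should I do?",
    ],
)

# Results come back in the same order as the inputs.
vec_a = response.data[0].embedding
vec_b = response.data[1].embedding
score = cosine_similarity(vec_a, vec_b)
print(f"Similarity: {score:.4f}")  # e.g. 0.9312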
Common use cases
Semantic search — embed a user query and compare it against pre-embedded documents to find the most relevant results, even when the exact words differ.
Retrieval-augmented generation (RAG) — embed your knowledge base and retrieve the top-k matching chunks before passing them to a chat model as context.
Clustering — group similar documents together by clustering their vectors using algorithms like k-means without any labeled training data.
Classification — train a lightweight classifier on top of embeddings to categorize text into predefined labels.
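The semantic search and RAG retrieval steps described above can be sketched as a top-k ranking over pre-embedded documents. This is a dependency-free illustration under stated assumptions: the helper name top_k is hypothetical, and the three-dimensional vectors are toy stand-ins for real embeddings. A production system would more likely use a vector index than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return (doc_index, score) pairs for the k most similar documents."""
    scored = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(doc_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings.
docs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
query = [1.0, 0.0, 0.0]

results = top_k(query, docs, k=2)
for doc_index, score in results:
    print(doc_index, round(score, 4))
```

For RAG, the text chunks at the returned indices would be passed to the chat model as context.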