The /v1/responses endpoint implements OpenAI’s Responses API, a newer request format that replaces the messages array with an input field accepting either a plain string or a structured content array. It returns a response object that carries the model’s output items directly, along with token usage. A companion endpoint, /v1/responses/compact, returns only the output text, which is useful when you do not need the full response envelope.

POST /v1/responses

Request parameters

model (string, required)
The ID of the model to use. Use the list models endpoint to retrieve available model IDs.

input (string | array, required)
The input prompt. Pass a plain string for simple text prompts, or an array of content objects for multi-modal or structured inputs.

stream (boolean, default: false)
When true, the response streams as server-sent events. The stream ends with a response.completed event.

instructions (string)
System-level instructions for the model, equivalent to a system message in the Chat Completions API.

temperature (number, default: 1)
Sampling temperature between 0 and 2.

max_output_tokens (integer)
Maximum number of tokens to generate in the response.
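
When stream is true, a client has to read server-sent events until response.completed arrives. The following is a minimal sketch of that loop, assuming standard SSE framing (event: and data: lines, with a blank line closing each event) and JSON data payloads. The helper name iter_sse_events, the response.output_text.delta event name, and the delta field in the sample are illustrative assumptions; only response.completed is stated on this page.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs and stop after response.completed."""
    event_name, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line closes the current event.
            yield event_name, json.loads("\n".join(data_parts))
            if event_name == "response.completed":
                return  # the page states this event ends the stream
            event_name, data_parts = "message", []


# Hypothetical stream contents, for illustration only.
SAMPLE_STREAM = [
    "event: response.output_text.delta\n",
    'data: {"delta": "Hel"}\n',
    "\n",
    "event: response.output_text.delta\n",
    'data: {"delta": "lo"}\n',
    "\n",
    "event: response.completed\n",
    'data: {"response": {"id": "resp_abc123"}}\n',
    "\n",
]

events = list(iter_sse_events(SAMPLE_STREAM))
```

In a real client the lines would come from the HTTP response body, decoded line by line, rather than from a list.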

Example

curl https://tokmodel.com/v1/responses \
  --request POST \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "openai/gpt-4o",
    "input": "Explain the difference between REST and GraphQL in one paragraph.",
    "temperature": 0.5
  }'
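
The same request can be sent from Python. This is a sketch using only the standard library, assuming the URL and headers from the curl example; the function names build_request and create_response are hypothetical helpers, not part of any SDK.

```python
import json
import urllib.request

API_URL = "https://tokmodel.com/v1/responses"  # taken from the curl example


def build_request(api_key: str, model: str, prompt: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/responses."""
    # Extra keyword arguments (temperature, instructions, ...) go into the body as-is.
    body = {"model": model, "input": prompt, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_response(api_key: str, model: str, prompt: str, **options) -> dict:
    """Send the request and decode the JSON response envelope."""
    request = build_request(api_key, model, prompt, **options)
    with urllib.request.urlopen(request, timeout=60) as reply:
        return json.load(reply)
```

Calling create_response("YOUR_API_KEY", "openai/gpt-4o", "Explain the difference between REST and GraphQL in one paragraph.", temperature=0.5) would issue the request shown above. Splitting out build_request keeps the body construction testable without network access.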

Response

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1716470400,
  "model": "openai/gpt-4o",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "REST uses fixed endpoints and HTTP methods to expose resources, while GraphQL exposes a single endpoint where clients specify exactly the data they need using a query language, reducing over-fetching and under-fetching."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 18,
    "output_tokens": 44,
    "total_tokens": 62
  }
}
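
The generated text sits two levels down in this envelope, inside output items of type message and content blocks of type output_text. A minimal sketch of pulling it out, assuming the response has already been decoded into a dict shaped like the example above; extract_output_text is a hypothetical helper name.

```python
def extract_output_text(response: dict) -> str:
    """Concatenate every output_text block found in the response's message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip any non-message output items
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Abbreviated copy of the example response above.
example = {
    "id": "resp_abc123",
    "object": "response",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "REST uses fixed endpoints."}],
        }
    ],
    "usage": {"input_tokens": 18, "output_tokens": 44, "total_tokens": 62},
}

text = extract_output_text(example)
```

Filtering on the type fields rather than indexing output[0]["content"][0] keeps the helper from failing if a response contains several items or blocks.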

POST /v1/responses/compact

The compact variant accepts the same request body as /v1/responses but returns only the generated text, as a plain string in the output field of a small JSON object, without the full response envelope. Use this endpoint when you only need the model’s reply and want to avoid parsing nested response objects.

Example

curl https://tokmodel.com/v1/responses/compact \
  --request POST \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "openai/gpt-4o",
    "input": "Give me a one-sentence summary of the water cycle."
  }'

Response

{
  "output": "Water evaporates from surfaces, condenses into clouds, and falls back to Earth as precipitation, continuously cycling through the atmosphere and hydrosphere."
}
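
Client code that may call either endpoint can normalize both response shapes to a string. The sketch below assumes the two shapes are exactly as shown in the examples on this page: output is a string for /v1/responses/compact and a list of items for /v1/responses. The function name output_text is a hypothetical helper.

```python
def output_text(payload: dict) -> str:
    """Return the generated text from either a compact or a full response."""
    output = payload.get("output")
    if isinstance(output, str):
        # Compact shape: {"output": "..."}
        return output
    if isinstance(output, list):
        # Full shape: walk message items and collect their output_text blocks.
        return "".join(
            block.get("text", "")
            for item in output
            if item.get("type") == "message"
            for block in item.get("content", [])
            if block.get("type") == "output_text"
        )
    raise ValueError("unrecognized response shape: missing 'output'")


compact = {"output": "Water evaporates, condenses, and falls as precipitation."}
full = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Water cycles continuously."}],
        }
    ]
}
```

Here output_text(compact) and output_text(full) both return plain strings, so downstream code does not need to know which endpoint produced the payload.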