POST /v1/responses

The /v1/responses endpoint implements OpenAI’s Responses API, a newer request format that replaces the messages array with an input field accepting either a plain string or a structured content array. It returns a response object whose output array holds the model’s output items directly. A companion endpoint, /v1/responses/compact, returns only the output text, which is useful when you need a lightweight result without the full response envelope.

Request parameters

model (string, required)

The ID of the model to use. Use the list models endpoint to retrieve available model IDs.

input (string | array, required)

The input prompt. Pass a plain string for simple text prompts, or an array of content objects for multi-modal or structured inputs.
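
As a minimal sketch of the two input shapes, the Python below builds one request body of each kind. The structured form follows the message and content-part layout used by OpenAI's Responses API (`input_text`, `input_image`); whether this endpoint accepts every part type is an assumption, and the helper names and example URL are illustrative only.

```python
import json


def text_request(model: str, prompt: str) -> dict:
    """Request body using the plain-string form of `input`."""
    return {"model": model, "input": prompt}


def image_request(model: str, prompt: str, image_url: str) -> dict:
    """Request body using the array form of `input`.

    Assumed shape: one user message holding a text part and an image part.
    """
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": prompt},
                    {"type": "input_image", "image_url": image_url},
                ],
            }
        ],
    }


if __name__ == "__main__":
    body = image_request(
        "openai/gpt-4o",
        "Describe this image.",
        "https://example.com/cat.png",  # placeholder URL
    )
    print(json.dumps(body, indent=2))
```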

stream (boolean, default: false)

When true, the response streams as server-sent events. The stream ends with a response.completed event.
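
One way to consume the stream is sketched below: it reads server-sent event lines, joins the text fragments, and stops at response.completed. Only the response.completed event is confirmed by this page; the `response.output_text.delta` event name and its `delta` field are assumed from OpenAI's streaming format, and the function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore non-JSON payloads rather than failing the stream


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas until the response.completed event arrives."""
    parts = []
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "response.output_text.delta":  # assumed event name
            parts.append(event.get("delta", ""))
        elif kind == "response.completed":
            break
    return "".join(parts)
```

`collect_text` accepts any iterable of decoded lines, so it can be fed from an HTTP client's line iterator or from a recorded stream during testing.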

instructions (string)

System-level instructions for the model, equivalent to a system message in the Chat Completions API.
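
To illustrate the equivalence, the sketch below maps a Chat Completions-style messages list onto a Responses-style body: system messages become instructions and the remaining turns become input. The helper name is hypothetical, and passing role/content pairs in the input array is an assumption based on OpenAI's Responses API.

```python
def chat_to_responses(model: str, messages: list[dict]) -> dict:
    """Convert Chat Completions-style messages into a Responses-style body."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    body: dict = {"model": model}
    if system_parts:
        # Several system messages are merged into a single instructions string.
        body["instructions"] = "\n\n".join(system_parts)

    if len(turns) == 1 and turns[0]["role"] == "user":
        # A single user turn fits the plain-string form of `input`.
        body["input"] = turns[0]["content"]
    else:
        # Assumed array form: one role/content object per turn.
        body["input"] = [
            {"role": m["role"], "content": m["content"]} for m in turns
        ]
    return body
```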

temperature (number, default: 1)

Sampling temperature, between 0 and 2. Lower values make the output more deterministic; higher values make it more varied.

max_output_tokens (integer)

Maximum number of tokens to generate in the response.
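
The parameters above can be assembled and checked on the client before sending. This is a sketch, not part of the API: `build_body` is a hypothetical helper, the 0 to 2 temperature range comes from this page, and the requirement that max_output_tokens be positive is an assumption.

```python
from typing import Optional, Union


def build_body(
    model: str,
    input_: Union[str, list],
    instructions: Optional[str] = None,
    temperature: Optional[float] = None,
    max_output_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Build a request body, omitting optional fields that were not set."""
    body: dict = {"model": model, "input": input_}
    if instructions is not None:
        body["instructions"] = instructions
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        body["temperature"] = temperature
    if max_output_tokens is not None:
        if max_output_tokens < 1:  # assumed lower bound
            raise ValueError("max_output_tokens must be a positive integer")
        body["max_output_tokens"] = max_output_tokens
    if stream:
        body["stream"] = True
    return body
```

Leaving unset fields out of the body lets the server apply its documented defaults (stream false, temperature 1).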

Example

curl https://tokmodel.com/v1/responses \
  --request POST \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "openai/gpt-4o",
    "input": "Explain the difference between REST and GraphQL in one paragraph.",
    "temperature": 0.5
  }'
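
For readers calling the endpoint from Python, the same request might look roughly like this using only the standard library. The URL and headers are taken from the curl example; the function names and the 60-second timeout are illustrative choices, and error handling is left out for brevity.

```python
import json
import urllib.request

API_URL = "https://tokmodel.com/v1/responses"


def make_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_response(api_key: str, body: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response object."""
    with urllib.request.urlopen(make_request(api_key, body), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    result = create_response(
        "YOUR_API_KEY",
        {
            "model": "openai/gpt-4o",
            "input": "Explain the difference between REST and GraphQL in one paragraph.",
            "temperature": 0.5,
        },
    )
    print(result["id"])
```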

Response

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1716470400,
  "model": "openai/gpt-4o",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "REST uses fixed endpoints and HTTP methods to expose resources, while GraphQL exposes a single endpoint where clients specify exactly the data they need using a query language, reducing over-fetching and under-fetching."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 18,
    "output_tokens": 44,
    "total_tokens": 62
  }
}
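
Because the generated text is nested inside output items, a small helper is often convenient. The sketch below walks the structure shown above and joins every output_text part; `output_text` is a hypothetical helper name, and item types other than message are simply skipped.

```python
def output_text(response: dict) -> str:
    """Join the text of all output_text parts found in message output items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip any non-message output items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


if __name__ == "__main__":
    sample = {
        "output": [
            {
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": "REST uses fixed endpoints."}],
            }
        ],
        "usage": {"input_tokens": 18, "output_tokens": 44, "total_tokens": 62},
    }
    print(output_text(sample))
    print(sample["usage"]["total_tokens"])
```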