Reasoning

Some models support a thinking (reasoning) mode: before producing the final answer, the model spends extra effort working through the problem. This improves quality on hard tasks — multi-step logic, math, planning, code — at the cost of higher latency and more tokens.

Reasoning is available only on reasoning-capable models; on other models the setting has no effect.

Controlling effort

You control reasoning with one setting that is the same for every model. Enable it and give it a token budget; a larger budget allows deeper thinking:

{
  "thinking": {
    "enabled": true,
    "budgetTokens": 8000
  }
}
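As a rough illustration, the setting above could be built in TypeScript like this. The `ThinkingConfig` type and the `makeThinkingConfig` helper are hypothetical names used only for this sketch, and omitting `budgetTokens` when reasoning is disabled is an assumption; the language guides have the actual call.

```typescript
// Shape of the "thinking" setting shown above.
interface ThinkingConfig {
  enabled: boolean;
  budgetTokens?: number; // assumed to be optional when reasoning is disabled
}

// Hypothetical helper, not part of any SDK: builds the setting from a budget.
// A missing or non-positive budget turns reasoning off.
function makeThinkingConfig(budgetTokens?: number): { thinking: ThinkingConfig } {
  if (budgetTokens === undefined || budgetTokens <= 0) {
    return { thinking: { enabled: false } };
  }
  // Token counts are whole numbers, so round fractional input down.
  return { thinking: { enabled: true, budgetTokens: Math.floor(budgetTokens) } };
}

// Reproduces the JSON fragment above.
const requestOptions = makeThinkingConfig(8000);
console.log(JSON.stringify(requestOptions, null, 2));
```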

The budget is translated to each model family's native control automatically:

For models with a direct thinking budget, it is used as-is.
For models that expose discrete effort levels, the budget maps to a level (a small budget → light effort, a large budget → maximum effort).
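The mapping for models with discrete effort levels might look something like the sketch below. The level names and the token cut-offs are assumptions chosen for illustration; the real thresholds are internal and are not documented here.

```typescript
// Assumed level names; a given model family may expose different ones.
type EffortLevel = "low" | "medium" | "high";

// Assumed cut-offs, for illustration only.
const LOW_MAX_TOKENS = 2000;
const MEDIUM_MAX_TOKENS = 8000;

// Maps a token budget to a discrete effort level:
// small budgets give light effort, large budgets give maximum effort.
function budgetToEffort(budgetTokens: number): EffortLevel {
  if (budgetTokens <= LOW_MAX_TOKENS) return "low";
  if (budgetTokens <= MEDIUM_MAX_TOKENS) return "medium";
  return "high";
}

console.log(budgetToEffort(500), budgetToEffort(8000), budgetToEffort(32000));
```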

The same setting can be sent to every model, so no per-model configuration is required.

When to use it

Use a larger budget for complex reasoning: math, multi-constraint planning, debugging, careful classification.
Keep it off (or small) for short, factual, or latency-sensitive replies — reasoning adds delay and cost.
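One way to apply this guidance is to choose the setting per request based on the kind of task. The sketch below is only an example: the `TaskKind` names, the `thinkingFor` helper, and the budget values are all assumptions, not recommended defaults.

```typescript
// Hypothetical task categories, loosely following the guidance above.
type TaskKind = "math" | "planning" | "debugging" | "classification" | "factual" | "chat";

interface ThinkingSetting {
  thinking: { enabled: boolean; budgetTokens?: number };
}

// Task kinds that usually benefit from deeper reasoning.
const COMPLEX_TASKS: TaskKind[] = ["math", "planning", "debugging", "classification"];

// Picks a reasoning setting for a task. Budget values are illustrative assumptions.
function thinkingFor(task: TaskKind, latencySensitive = false): ThinkingSetting {
  // Short, factual, or latency-sensitive replies: keep reasoning off.
  if (latencySensitive || !COMPLEX_TASKS.includes(task)) {
    return { thinking: { enabled: false } };
  }
  // Complex reasoning: enable it with a larger budget.
  return { thinking: { enabled: true, budgetTokens: 8000 } };
}

console.log(thinkingFor("debugging"));
console.log(thinkingFor("factual"));
```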

For the exact call in your language, see the JavaScript, Unity, or Unreal text generation guide.
