
    Error Handling

    Behest returns standard HTTP status codes and a consistent JSON error body. The v1.5 TypeScript and Python SDKs expose typed error classes so you can pattern-match without string-sniffing.


    Error body shape

    json
    {
      "error": {
        "code": "quota_exceeded",
        "message": "User has exceeded the daily token limit for tier 'free'.",
        "details": {
          "tier": "free",
          "limit": { "tokens_per_day": 50000 },
          "usage": { "tokens_today": 50123 }
        }
      }
    }

    Plus two response headers you should always read:

    • X-Trace-Id — include in any support ticket; resolves to a Grafana trace instantly.
    • Retry-After — seconds to wait on a 429.
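
    The body shape and the two headers above can be folded into a single normalized value. The following is a minimal sketch, not SDK code: parseBehestFailure, BehestFailure, and HeaderReader are hypothetical names, and the helper assumes only the fields shown in the example body. It tolerates a missing or non-JSON body.

```typescript
// Hypothetical helper (not part of the Behest SDK): normalizes a failed
// response using only the body fields and headers documented above.

interface HeaderReader {
  get(name: string): string | null; // the Fetch API's Headers satisfies this
}

interface BehestFailure {
  status: number;
  code: string | undefined;
  message: string | undefined;
  details: Record<string, unknown> | undefined;
  traceId: string | undefined;
  retryAfterSeconds: number | undefined;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseBehestFailure(
  status: number,
  headers: HeaderReader,
  bodyText: string,
): BehestFailure {
  let error: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(bodyText);
    const inner = isRecord(parsed) ? parsed["error"] : undefined;
    if (isRecord(inner)) error = inner;
  } catch {
    // Body was not JSON (for example a proxy's HTML error page): keep defaults.
  }

  const code = error["code"];
  const message = error["message"];
  const details = error["details"];
  // A missing header converts to 0 and a non-numeric one to NaN; both are ignored.
  const retryAfter = Number(headers.get("Retry-After"));

  return {
    status,
    code: typeof code === "string" ? code : undefined,
    message: typeof message === "string" ? message : undefined,
    details: isRecord(details) ? details : undefined,
    traceId: headers.get("X-Trace-Id") ?? undefined,
    retryAfterSeconds:
      Number.isFinite(retryAfter) && retryAfter > 0 ? retryAfter : undefined,
  };
}
```

    With direct fetch you could call it as parseBehestFailure(resp.status, resp.headers, await resp.text()).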

    Status code cheat sheet

    | Status | Typical code | Cause | Action |
    | --- | --- | --- | --- |
    | 400 / 422 | validation_error | Malformed body, unknown model, invalid header | Fix request; do not retry |
    | 401 | invalid_token | Missing/expired/invalid JWT, bad signature, wrong kid | Refresh token, re-mint |
    | 402 | quota_exceeded | User over tier's daily cap | Show upgrade UI; do not retry |
    | 403 | forbidden | JWT valid but not allowed (wrong tenant, killed project) | Log out user; escalate |
    | 404 |  | Unknown project slug, missing thread | Fix resource id |
    | 429 | rate_limited | Too many requests per minute | Respect Retry-After |
    | 5xx | server_error | Provider outage, timeout, transient infra | Retry with backoff (max 2) |

    Typed SDK errors (TypeScript)

    ts
    import {
      Behest,
      BehestError,
      BehestBadRequestError, // 400 / 422
      BehestAuthError, // 401 / 403
      BehestQuotaError, // 402
      BehestRateLimitError, // 429
      BehestServerError, // 5xx
      BehestConfigError, // local config problem (never from server)
    } from "@behest/client-ts";
     
    const behest = new Behest(); // reads BEHEST_KEY from env
     
    try {
      await behest.chat.completions.create({ messages });
    } catch (err) {
      if (err instanceof BehestAuthError) return forceLogout(err);
      if (err instanceof BehestQuotaError) return showUpgrade(err);
      if (err instanceof BehestRateLimitError) return backoff(err.retryAfter); // seconds
      if (err instanceof BehestServerError) return retryable(err);
      throw err;
    }

    Every BehestError exposes:

    • err.status — HTTP code (undefined for BehestConfigError)
    • err.code — specific sub-reason (invalid_token, quota_exceeded, rate_limited, forbidden, validation_error, server_error, network_error)
    • err.message — human text
    • err.traceId — pass to support; matches the X-Trace-Id response header
    • err.raw — the parsed response body (use for custom fields like upgrade_url if the server returns one)

    BehestRateLimitError additionally carries err.retryAfter (number of seconds, derived from the Retry-After header).

    403 note: classifyHttpError returns BehestAuthError with code: "forbidden" for 403 responses, so a single instanceof BehestAuthError check matches both 401 and 403. Check err.code === "forbidden" to distinguish a 403 from a plain 401.

    404 note: 404 surfaces as the base BehestError with status: 404 and no specific subclass. Check err.status === 404 or err.code (if the server set one).
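
    The two notes above can be combined into one small classifier. This is a sketch under stated assumptions: BehestErrorLike is a structural stand-in for the status and code fields listed earlier (with the real SDK you would pass the caught error itself), and classifyAccessFailure and its return labels are hypothetical names.

```typescript
// Structural stand-in for the documented err.status and err.code fields.
interface BehestErrorLike {
  status?: number;
  code?: string;
}

// Hypothetical labels for the three access-related outcomes described above.
type AccessFailure = "reauthenticate" | "forbidden" | "not_found" | "other";

function classifyAccessFailure(err: BehestErrorLike): AccessFailure {
  // 403 arrives as an auth error, so check it before treating the error as a 401.
  if (err.status === 403 || err.code === "forbidden") return "forbidden";
  if (err.status === 401) return "reauthenticate";
  // 404 has no subclass; status is the only reliable signal.
  if (err.status === 404) return "not_found";
  return "other";
}
```

    Checking "forbidden" first matters because a 401 can be fixed by re-minting the token while a 403 usually cannot.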


    Typed SDK errors (Python)

    python
    import time

    from behest import (
        Behest,
        BehestError,
        BehestAuthError,
        BehestQuotaError,
        BehestRateLimitError,
        BehestServerError,
    )
     
    behest = Behest()  # reads BEHEST_KEY from env
     
    try:
        behest.chat.completions.create(messages=..., model=...)
    except BehestAuthError:
        force_logout()
    except BehestQuotaError as e:
        show_upgrade(e)
    except BehestRateLimitError as e:
        time.sleep(e.retry_after or 1)  # seconds
        retry()
    except BehestServerError:
        retry_with_backoff()

    Browser errors (direct fetch / OpenAI SDK)

    When the browser talks to Behest directly with a server-minted JWT (no Behest SDK in the browser), you do not get typed error classes — read the status and code out of the response yourself:

    ts
    const resp = await fetch(`https://${SLUG}.behest.app/v1/chat/completions`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messages, stream: true }),
    });
     
    if (!resp.ok) {
      const body = await resp.json().catch(() => ({}));
      const traceId = resp.headers.get("X-Trace-Id");
      const code = body?.error?.code;
      if (resp.status === 401) return refreshTokenAndRetry();
      if (resp.status === 402) return showUpgrade(body?.error?.details);
      if (resp.status === 429) {
        const retryAfter = Number(resp.headers.get("Retry-After")) || 1;
        return backoff(retryAfter); // seconds
      }
      throw new Error(`Behest ${resp.status} ${code ?? ""} [${traceId}]`);
    }

    The OpenAI SDK wraps these as APIError with status, code, and headers — same data, idiomatic names.


    Retry policy

    • Do retry: 408, 429 (respect Retry-After), 500, 502, 503, 504, and network errors.
    • Do not retry: 400, 401 (refresh the token first), 402, 403, 404, 422.

    Max 2 retries with exponential backoff + jitter.
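
    The policy above could be sketched as follows. Everything here is illustrative rather than SDK behavior: FailureLike, isRetryable, backoffDelayMs, and withRetries are hypothetical names, the 500 ms base and 8 s cap are assumed values, and the jitter is the "full jitter" variant (a random delay below an exponentially growing ceiling). Network errors are recognized by the network_error code listed earlier.

```typescript
// Structural stand-in for the error fields this sketch relies on.
interface FailureLike {
  status?: number;
  code?: string;
  retryAfter?: number; // seconds, carried by rate-limit errors
}

const RETRYABLE_STATUSES = new Set<number>([408, 429, 500, 502, 503, 504]);
const MAX_RETRIES = 2;
const BASE_DELAY_MS = 500; // assumed value
const MAX_DELAY_MS = 8000; // assumed value

function isRetryable(err: FailureLike): boolean {
  if (err.code === "network_error") return true;
  return err.status !== undefined && RETRYABLE_STATUSES.has(err.status);
}

// Retry-After wins when present; otherwise pick a random delay below an
// exponentially growing, capped ceiling.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  random: () => number = Math.random,
): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

async function withRetries<T>(
  call: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const failure: FailureLike =
        typeof err === "object" && err !== null ? (err as FailureLike) : {};
      // Give up after MAX_RETRIES retries or on any non-retryable failure.
      if (attempt >= MAX_RETRIES || !isRetryable(failure)) throw err;
      await sleep(backoffDelayMs(attempt, failure.retryAfter));
    }
  }
}
```

    The sketch does not refresh tokens: a 401 is rethrown immediately so the caller can re-mint the JWT and issue a fresh request.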


    Remediation per code

    invalid_token / 401

    • Cause: token expired, signature invalid (wrong kid), no Authorization header.
    • Fix: re-mint the JWT (from your backend's /api/behest/token endpoint or a fresh behest.auth.mint() call). If re-minting still fails, verify your kid matches a key published to your tenant JWKS.
    • See auth modes.

    quota_exceeded / 402

    • Cause: user hit tier cap (requests/day or tokens/day).
    • Fix: show an upgrade modal. The response body's error.details carries { tier, limit, usage }; read it from err.raw (SDK) or the parsed response (direct fetch). After the user upgrades, re-mint the JWT — the new tier takes effect on the next call.
    • See tiers and usage.
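
    To show how error.details might feed an upgrade modal, here is a minimal sketch. It assumes the token-based shape from the example body ({ tier, limit.tokens_per_day, usage.tokens_today }); the field names for a requests/day cap are not shown in this page, so anything else yields undefined. readTokenQuota, upgradePrompt, and TokenQuota are hypothetical names.

```typescript
interface TokenQuota {
  tier: string;
  tokensPerDay: number;
  tokensToday: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates the untyped details object; returns undefined when the shape
// differs from the token-based example (for instance a requests/day cap).
function readTokenQuota(details: unknown): TokenQuota | undefined {
  if (!isRecord(details)) return undefined;
  const tier = details["tier"];
  const limit = details["limit"];
  const usage = details["usage"];
  if (typeof tier !== "string" || !isRecord(limit) || !isRecord(usage)) {
    return undefined;
  }
  const tokensPerDay = limit["tokens_per_day"];
  const tokensToday = usage["tokens_today"];
  if (typeof tokensPerDay !== "number" || typeof tokensToday !== "number") {
    return undefined;
  }
  return { tier, tokensPerDay, tokensToday };
}

// Illustrative copy for the modal; adapt the wording to your product.
function upgradePrompt(quota: TokenQuota): string {
  return `You have used ${quota.tokensToday} of ${quota.tokensPerDay} daily tokens on the '${quota.tier}' tier.`;
}
```

    Validating before rendering keeps the modal from showing "undefined" if the server sends a different details shape.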

    rate_limited / 429

    • Cause: too many requests per minute for this tier.
    • Fix: back off by err.retryAfter seconds (same value as the Retry-After header). Add a client-side queue if this is frequent.
    • See rate limiting.
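
    One way to build the client-side queue mentioned above is to run requests one at a time and pause the whole queue after a rate-limit failure. The sketch below is an assumption-laden illustration, not an SDK feature: RateLimitAwareQueue is a hypothetical class, it detects rate limiting only through a numeric retryAfter property on the thrown error, and it does not retry the failed task itself.

```typescript
class RateLimitAwareQueue {
  private tail: Promise<unknown> = Promise.resolve();
  private pausedUntil = 0; // epoch milliseconds
  private readonly now: () => number;
  private readonly sleep: (ms: number) => Promise<void>;

  // The clock and sleep function are injectable so the queue can be tested
  // without real timers.
  constructor(
    now: () => number = Date.now,
    sleep: (ms: number) => Promise<void> = (ms) =>
      new Promise<void>((resolve) => setTimeout(resolve, ms)),
  ) {
    this.now = now;
    this.sleep = sleep;
  }

  // Runs tasks strictly in submission order, one at a time.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = async (): Promise<T> => {
      const waitMs = this.pausedUntil - this.now();
      if (waitMs > 0) await this.sleep(waitMs);
      try {
        return await task();
      } catch (err) {
        const retryAfter =
          typeof err === "object" && err !== null
            ? (err as { retryAfter?: unknown }).retryAfter
            : undefined;
        // A rate-limit failure delays every task queued behind this one.
        if (typeof retryAfter === "number") {
          this.pausedUntil = this.now() + retryAfter * 1000;
        }
        throw err;
      }
    };
    const result = this.tail.then(run);
    // Keep the chain alive even when a task rejects.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

    Callers still receive the original rejection, so this can be combined with whatever retry logic the application already has.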

    forbidden / 403

    • Cause: project killed by admin, kill-switch engaged, tenant mismatch.
    • Fix: log err.code and err.traceId, then contact support. Usually not user-recoverable.

    server_error / 5xx

    • Cause: provider outage, timeout, transient infra.
    • Fix: retry once with backoff. If still failing, show "The AI is having a moment — please retry" and log err.traceId.

    validation_error / 400, 422

    • Cause: unknown model, malformed JSON, missing required field.
    • Fix: inspect err.code and err.raw. This is usually a bug in the calling code.

    Logging for debuggability

    Always log traceId on error — it's the fastest path to a Grafana trace for support:

    ts
    try {
      await behest.chat.completions.create({ messages });
    } catch (err) {
      if (err instanceof BehestError) {
        console.error("[behest]", {
          status: err.status,
          code: err.code,
          traceId: err.traceId,
          message: err.message,
        });
      }
    }

    Or plug into your observability stack:

    ts
    Sentry.withScope((scope) => {
      scope.setTag("behest.trace_id", err.traceId);
      scope.setTag("behest.code", err.code);
      Sentry.captureException(err);
    });

    See also

    Enterprise Token FinOps: Enforce hard budgets and attribute costs per session.
