
    Provider Setup Guide (BYOK)

    Behest supports Bring Your Own Key (BYOK) — you connect your own provider API keys, and Behest routes your inference requests using those keys. Behest never marks up your LLM costs; you pay the provider directly at their published rates.

    BYOK is available on Pro, Business+, and Enterprise plans. Free-plan accounts use the platform's shared Gemini key.


    Supported Providers

    Provider      | Auth method         | Key format                  | LiteLLM prefix
    --------------|---------------------|-----------------------------|---------------
    OpenAI        | Bearer API key      | sk-... or sk-proj-...       | none (native)
    Anthropic     | x-api-key header    | sk-ant-...                  | anthropic/
    Google Gemini | Query-param API key | AIza... (39 chars)          | gemini/
    Mistral       | Bearer API key      | 10+ chars (no fixed prefix) | mistral/
    Cohere        | Bearer API key      | 10+ chars (no fixed prefix) | cohere/
    OpenRouter    | Bearer API key      | sk-or-v1-[64 hex chars]     | openrouter/

    AWS Bedrock and Azure OpenAI are in the integration roadmap. Contact support if you need early access.
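
The key formats in the table above can be pre-checked locally before calling the API. The following is a minimal sketch, not Behest's actual validator: the provider identifiers other than "openai" and the exact character classes are assumptions, since this guide only specifies prefixes and lengths.

```python
import re

# Patterns transcribed from the key formats listed in this guide.
# Character classes are assumptions; widen them if your keys differ
# (for example, if a Gemini key contains '-' or '_').
KEY_PATTERNS = {
    "openai": re.compile(r"sk-(?:proj-|svcacct-)?[A-Za-z0-9_-]{20,}"),
    "anthropic": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "gemini": re.compile(r"AIza[A-Za-z0-9]{35}"),
    "mistral": re.compile(r"\S{10,}"),   # no fixed prefix
    "cohere": re.compile(r"\S{10,}"),    # no fixed prefix
    "openrouter": re.compile(r"sk-or-v1-[0-9a-f]{64}"),
}


def looks_like_valid_key(provider_type: str, api_key: str) -> bool:
    """Return True if api_key matches the documented shape for the provider."""
    pattern = KEY_PATTERNS.get(provider_type)
    if pattern is None:
        raise ValueError(f"unsupported provider: {provider_type}")
    return pattern.fullmatch(api_key) is not None
```

A key that fails this check would presumably be rejected with INVALID_KEY_FORMAT; a key that passes can still be rejected by the live validation call.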


    How Key Storage Works

    Every provider key is encrypted with AES-256-GCM before it touches the database. The ciphertext format is:

    {iv_hex}:{ciphertext_hex}:{auth_tag_hex}
    
    • IV: 12 bytes (96 bits), randomly generated per key using the OS CSPRNG
    • Auth tag: 16 bytes (128 bits, GCM default) — any tampering is detected at decryption time
    • Encryption key: loaded from PROVIDER_ENCRYPTION_KEY environment variable (32 bytes / 64 hex chars) at service startup; fails closed if missing or malformed
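
The storage format and the fail-closed key loading can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the actual AES-256-GCM encryption and decryption step (which needs a cryptography library) is deliberately left out.

```python
import os

IV_BYTES = 12    # 96-bit GCM IV
TAG_BYTES = 16   # 128-bit GCM auth tag
KEY_BYTES = 32   # AES-256 key


def load_encryption_key(env=os.environ) -> bytes:
    """Read PROVIDER_ENCRYPTION_KEY; fail closed if missing or malformed."""
    raw = env.get("PROVIDER_ENCRYPTION_KEY", "")
    try:
        key = bytes.fromhex(raw)
    except ValueError as exc:
        raise RuntimeError("PROVIDER_ENCRYPTION_KEY is not valid hex") from exc
    if len(key) != KEY_BYTES:
        raise RuntimeError("PROVIDER_ENCRYPTION_KEY must be 64 hex chars")
    return key


def parse_envelope(stored: str):
    """Split '{iv_hex}:{ciphertext_hex}:{auth_tag_hex}' into raw bytes."""
    parts = stored.split(":")
    if len(parts) != 3:
        raise ValueError("expected iv:ciphertext:tag")
    iv, ciphertext, tag = (bytes.fromhex(p) for p in parts)
    if len(iv) != IV_BYTES or len(tag) != TAG_BYTES:
        raise ValueError("unexpected IV or auth tag length")
    return iv, ciphertext, tag
```

Raising at startup, rather than falling back to a default key, is what "fails closed" means here: the service does not run with a missing or truncated key.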

    The plaintext key is never logged, never returned in any API response, and never stored. It exists in memory only at two points: during the encryption write path (milliseconds) and during decryption in custom_auth.py for a single request. After decryption, the plaintext reference is immediately deleted.

    Keys are stored in the tenant_provider_keys table:

    • One row per provider per tenant (UNIQUE(tenant_id, provider_type))
    • api_key_enc — the AES-256-GCM ciphertext (never returned in API responses)
    • key_last4 — last 4 characters of the plaintext at write time, for display only (e.g. ...xK9z)
    • key_set_at — timestamp of the most recent save or rotation

    Redis stores a copy of the ciphertext at provider:{tenantId}:{providerType}:api_key_enc with a 24-hour TTL, refreshed hourly by the redis-sync-worker.
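
The Redis key layout and the key_last4 column are simple string operations. A small sketch, with hypothetical helper names:

```python
REDIS_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL described above


def redis_cache_key(tenant_id: str, provider_type: str) -> str:
    """Build the Redis key that holds the ciphertext copy."""
    return f"provider:{tenant_id}:{provider_type}:api_key_enc"


def key_last4(plaintext_key: str) -> str:
    """Last 4 characters of the plaintext, kept for display only."""
    return plaintext_key[-4:]
```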


    Adding a Provider Key

    Dashboard

    1. Go to Account Settings → Provider Keys
    2. Select the provider from the dropdown
    3. Paste your API key into the field
    4. Click Save — Behest validates the key against the provider API before storing it

    API

    http
    PUT /v1/tenants/:tenantId/providers/:providerType
    Authorization: Bearer <service-JWT>
    Content-Type: application/json
     
    {
      "api_key": "sk-proj-..."
    }

    Response (200):

    json
    {
      "configured": true,
      "provider_type": "openai",
      "key_last4": "k9Zq",
      "key_set_at": "2026-03-30T12:00:00.000Z"
    }

    The api_key field is never echoed back. The response only confirms the last 4 characters.
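
The same request can be built from Python with the standard library. This is a sketch, not an official client: BASE_URL is a placeholder (the guide does not state the API host), and the function name is hypothetical.

```python
import json
import urllib.request

# Placeholder host; substitute the base URL of your Behest deployment.
BASE_URL = "https://api.example.com"


def build_put_provider_key(tenant_id, provider_type, api_key, service_jwt):
    """Build (but do not send) the PUT request that stores a provider key."""
    body = json.dumps({"api_key": api_key}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/tenants/{tenant_id}/providers/{provider_type}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {service_jwt}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the returned object to urllib.request.urlopen(request, timeout=10) and read key_last4 from the JSON response to confirm which key was stored.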

    Error codes:

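    HTTP | Code                  | Meaning
    -----|-----------------------|-------------------------------------------------
    400  | INVALID_KEY_FORMAT    | Key does not match the provider's expected format
    403  | BYOAK_REQUIRES_PRO    | Your plan does not include BYOK
    422  | KEY_VALIDATION_FAILED | Key was rejected by the provider API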

    Validation behavior: Behest performs a live validation call to the provider API (5-second timeout). If the provider API is unreachable or times out, the key is treated as valid and stored (fail-open). This prevents a provider outage from blocking your key rotation. An explicit rejection from the provider is the only signal that definitively marks a key as invalid: a 401 for most providers, with Gemini and Cohere using other codes as listed under Provider-Specific Notes.
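
The fail-open rule can be expressed as a small decision function. This is a sketch of the behavior described here, not Behest's code: the status sets are taken from the Provider-Specific Notes in this guide, while the provider identifiers and the OpenRouter entry are assumptions.

```python
from typing import Optional

# Status codes this guide lists as "invalid" for each provider.
INVALID_STATUSES = {
    "openai": {401},
    "anthropic": {401},
    "gemini": {400, 403},
    "mistral": {401},
    "cohere": {401, 403},
    "openrouter": {401},  # assumed; the guide does not list one
}


def should_store_key(provider_type: str, status: Optional[int]) -> bool:
    """Decide whether to store a key after the live validation call.

    status is the HTTP status returned by the provider, or None when the
    provider was unreachable or the 5-second timeout elapsed.
    """
    if status is None:
        return True  # fail-open: an outage must not block key rotation
    return status not in INVALID_STATUSES[provider_type]
```

The trade-off is that a bad key can be stored during a provider outage; it would then fail on the first real inference request rather than at save time.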


    Removing a Provider Key

    Dashboard

    Go to Account Settings → Provider Keys → click Remove next to the provider.

    API

    http
    DELETE /v1/tenants/:tenantId/providers/:providerType
    Authorization: Bearer <service-JWT>

    Response: 204 No Content

    Removal takes effect immediately — the Redis entry is deleted atomically with the database row. Projects that had this provider's models configured will silently fall back to the platform default (Gemini 2.5 Flash) on the next request. Their provider_model setting is preserved in the database (intent is kept); it just has no key to activate against.
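
The fallback rule amounts to a lookup with a default. A minimal sketch, assuming a hypothetical set of provider types that currently have a key configured for the tenant:

```python
from typing import Optional, Set

PLATFORM_DEFAULT_MODEL = "gemini-2.5-flash"


def effective_model(
    provider_type: Optional[str],
    provider_model: Optional[str],
    configured_providers: Set[str],
) -> str:
    """Pick the model a request is routed to.

    The project's provider_model is used only while its provider still
    has a key; otherwise the request falls back to the platform default.
    """
    if provider_type and provider_model and provider_type in configured_providers:
        return provider_model
    return PLATFORM_DEFAULT_MODEL
```

Because provider_model is preserved, re-adding a key for the same provider restores the original routing without reconfiguring the project.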


    Listing Configured Keys

    http
    GET /v1/tenants/:tenantId/providers
    Authorization: Bearer <service-JWT>

    Response:

    json
    {
      "providers": [
        {
          "provider_type": "openai",
          "key_last4": "k9Zq",
          "key_set_at": "2026-03-30T12:00:00.000Z",
          "projects_using_count": 3
        }
      ]
    }
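
One use for this response is spotting keys that are due for rotation. The sketch below assumes a 90-day rotation policy, which is a hypothetical threshold and not a Behest requirement; it works on the parsed JSON body shown above.

```python
from datetime import datetime, timedelta, timezone


def stale_providers(listing: dict, now: datetime, max_age_days: int = 90):
    """Return provider types whose key_set_at is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for entry in listing["providers"]:
        # Convert the trailing 'Z' so fromisoformat accepts it on older Pythons.
        set_at = datetime.fromisoformat(entry["key_set_at"].replace("Z", "+00:00"))
        if set_at < cutoff:
            stale.append(entry["provider_type"])
    return stale
```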

    Provider-Specific Notes

    OpenAI

    Get a key: platform.openai.com/api-keys

    Key formats accepted:

    • sk-[20+ chars]: legacy user keys
    • sk-proj-[20+ chars]: project-scoped keys (recommended)
    • sk-svcacct-[20+ chars]: service account keys

    Validation endpoint: GET https://api.openai.com/v1/models with Authorization: Bearer {key}. A 401 means invalid; 403 and 429 mean the key is valid but rate-limited or restricted.

    Supported models (selected):

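    Model ID     | Context | Streaming | Vision | Tool use
    -------------|---------|-----------|--------|---------
    gpt-4o       | 128K    | Yes       | Yes    | Yes
    gpt-4o-mini  | 128K    | Yes       | Yes    | Yes
    gpt-4.1      | 1M      | Yes       | Yes    | Yes
    gpt-4.1-mini | 1M      | Yes       | No     | Yes
    o3           | 200K    | Yes       | No     | No
    o3-mini      | 200K    | Yes       | No     | No
    o4-mini      | 200K    | Yes       | No     | Yes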

    Rate limits depend on your OpenAI tier (Tier 1 through Tier 5, based on cumulative spend). New keys start at Tier 1 (500 RPM, 200K TPM). See platform.openai.com/docs/guides/rate-limits.

    Anthropic

    Get a key: console.anthropic.com/settings/keys

    Key format: sk-ant-[20+ chars]

    Validation endpoint: GET https://api.anthropic.com/v1/models with x-api-key: {key} and anthropic-version: 2023-06-01. A 401 means invalid; 403 and 529 (Anthropic's overloaded status) are treated as valid but restricted or temporarily unavailable.

    Supported models:

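    Model ID                  | Context | Streaming | Vision | Tool use
    --------------------------|---------|-----------|--------|---------
    claude-opus-4-20250514    | 200K    | Yes       | Yes    | Yes
    claude-sonnet-4-20250514  | 200K    | Yes       | Yes    | Yes
    claude-haiku-4-5-20251001 | 200K    | Yes       | Yes    | Yes
    claude-3-5-haiku-20241022 | 200K    | Yes       | Yes    | Yes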

    LiteLLM prepends anthropic/ to the model ID before forwarding (e.g., claude-sonnet-4-20250514 becomes anthropic/claude-sonnet-4-20250514).

    Note: Anthropic does not offer embedding models. Requests for embeddings must use a different provider.

    Google Gemini

    Get a key: aistudio.google.com/app/apikey

    Key format: AIza[35 alphanumeric chars] (exactly 39 characters)

    Validation endpoint: GET https://generativelanguage.googleapis.com/v1beta/models?key={key}. A 400 or 403 means invalid; 429 means valid but rate-limited.

    Supported models:

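    Model ID         | Context | Streaming | Vision | Tool use
    -----------------|---------|-----------|--------|---------
    gemini-2.5-pro   | 1M      | Yes       | Yes    | Yes
    gemini-2.5-flash | 1M      | Yes       | Yes    | Yes
    gemini-2.0-flash | 1M      | Yes       | Yes    | Yes
    gemini-1.5-pro   | 1M      | Yes       | Yes    | Yes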

    LiteLLM prepends gemini/ to the model ID (e.g., gemini-2.5-flash becomes gemini/gemini-2.5-flash).

    The Google AI Studio free tier offers 1,500 requests/day and 1M TPM — useful for development. Pay-as-you-go unlocks 2,000 RPM.

    Note: Google is also the platform default provider. If no BYOK key is configured for a project, Behest routes to gemini-2.5-flash using the platform's own key.

    Mistral

    Get a key: console.mistral.ai/api-keys

    Key format: Any string of 10 or more characters. Mistral does not use a predictable prefix — Behest validates only via live API call.

    Validation endpoint: GET https://api.mistral.ai/v1/models with Authorization: Bearer {key}. A 401 means invalid.

    Supported models:

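    Model ID             | Context | Streaming | Vision | Tool use
    ---------------------|---------|-----------|--------|---------
    mistral-large-latest | 128K    | Yes       | Yes    | Yes
    mistral-small-latest | 128K    | Yes       | No     | Yes
    codestral-latest     | 256K    | Yes       | No     | Yes
    pixtral-large-latest | 128K    | Yes       | Yes    | Yes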

    LiteLLM prepends mistral/ to the model ID. Mistral is a European provider (Paris); EU-based tenants may prefer it for data residency considerations.

    Cohere

    Get a key: dashboard.cohere.com/api-keys

    Key format: Any string of 10 or more characters. Cohere keys are validated only via live API call.

    Validation endpoint: GET https://api.cohere.com/v1/models with Authorization: Bearer {key}. A 401 or 403 means invalid.

    Supported models:

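    Model ID               | Context | Streaming | Vision | Tool use
    -----------------------|---------|-----------|--------|---------
    command-r-plus-08-2024 | 128K    | Yes       | No     | Yes
    command-r-08-2024      | 128K    | Yes       | No     | Yes
    command-light          | 4K      | Yes       | No     | No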

    LiteLLM prepends cohere/ to the model ID. Note: Cohere does not support vision (image) inputs.

    OpenRouter

    Get a key: openrouter.ai/keys

    Key format: sk-or-v1-[64 lowercase hex chars] (exactly 73 characters: the 9-character prefix plus 64 hex characters)

    Validation endpoint: GET https://openrouter.ai/api/v1/auth/key with Authorization: Bearer {key}.

    OpenRouter is a meta-router that provides access to hundreds of models from multiple providers under a single key. When using OpenRouter, specify model IDs in org/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514).

    LiteLLM prepends openrouter/ to the model ID. Behest adds the required HTTP-Referer: https://behest.ai and X-Title: Behest headers automatically.
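
The prefixing rule for all six providers fits in one lookup table. A sketch using the prefixes listed in this guide; the provider identifiers and the function name are assumptions:

```python
# LiteLLM prefixes from the Supported Providers table; OpenAI is native.
LITELLM_PREFIX = {
    "openai": "",
    "anthropic": "anthropic/",
    "gemini": "gemini/",
    "mistral": "mistral/",
    "cohere": "cohere/",
    "openrouter": "openrouter/",
}


def litellm_model_id(provider_type: str, model_id: str) -> str:
    """Return the model ID as forwarded to LiteLLM."""
    return LITELLM_PREFIX[provider_type] + model_id
```

Note that OpenRouter IDs already contain an org segment, so openai/gpt-4o is forwarded as openrouter/openai/gpt-4o.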

    Selected OpenRouter models:

    Model ID                          | Provider
    ----------------------------------|-------------------------
    openai/gpt-4o                     | OpenAI via OpenRouter
    anthropic/claude-sonnet-4-20250514 | Anthropic via OpenRouter
    google/gemini-2.5-flash           | Google via OpenRouter
    deepseek/deepseek-r1              | DeepSeek
    x-ai/grok-3                       | xAI
    meta-llama/llama-3.3-70b-instruct | Meta via OpenRouter

    Key Rotation

    To rotate a key, simply PUT a new key to the same endpoint. Behest performs an upsert (ON CONFLICT DO UPDATE) — the existing ciphertext is replaced atomically. The Redis entry is updated immediately. There is no downtime window; requests continue using the old key until the Redis write completes (typically under 1ms).
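
The upsert can be illustrated with an in-memory SQLite database. This is a sketch only: the guide does not say which database Behest uses, the column types are assumptions, and the ciphertext values are placeholders. What it shows is that a second save for the same tenant and provider replaces the row instead of adding one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tenant_provider_keys (
        tenant_id     TEXT NOT NULL,
        provider_type TEXT NOT NULL,
        api_key_enc   TEXT NOT NULL,
        key_last4     TEXT NOT NULL,
        key_set_at    TEXT NOT NULL,
        UNIQUE (tenant_id, provider_type)
    )
    """
)

UPSERT = """
INSERT INTO tenant_provider_keys
    (tenant_id, provider_type, api_key_enc, key_last4, key_set_at)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (tenant_id, provider_type) DO UPDATE SET
    api_key_enc = excluded.api_key_enc,
    key_last4   = excluded.key_last4,
    key_set_at  = excluded.key_set_at
"""


def save_key(tenant_id, provider_type, api_key_enc, key_last4, key_set_at):
    """Insert a new key row or replace the existing one in one statement."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            UPSERT, (tenant_id, provider_type, api_key_enc, key_last4, key_set_at)
        )
```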


    Testing a Key Before Deploying

    Use the test-token endpoint to issue a short-lived (5-minute) JWT that exercises the BYOK path without touching your production deploy:

    http
    POST /v1/projects/:projectId/settings/test-token
    Authorization: Bearer <service-JWT>

    This writes draft config keys (draft:config:{pid}:*) to Redis with a 300-second TTL. Requests using the draft token read from these keys, keeping test traffic isolated from production settings.
