    Python SDK and REST Usage — Behest

    ⚠️ This page documents pre-v1.5 patterns (OpenAI SDK + manual mint). The v1.5 behest Python package (pip install "behest-ai>=1.5") ships the Behest class with auth.mint(), dual-mode support, and typed errors — mirrors the TypeScript SDK. Prefer that for new Python code.

    The manual-REST examples below still work and are useful when you can't install a package (e.g., tight cold-start budgets). Also note: passing BEHEST_KEY as the OpenAI api_key on /v1/chat/completions will 401 — you must mint a JWT first. See the new migration guide for the correct pattern.
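
    For orientation, here is a minimal sketch of the mint-then-call flow described above. The constructor and the mint() return value are assumptions, not the documented API; follow the migration guide for the authoritative pattern.

    python
    # Illustrative only; treat every name below as an assumption. The migration
    # guide documents the real v1.5 API.
    from behest import Behest      # pip install "behest-ai>=1.5"
    from openai import OpenAI
     
    behest = Behest(api_key="bh_live_YOUR_API_KEY")
    jwt = behest.auth.mint()       # assumed: exchanges the raw key for a short-lived JWT
     
    # Post-v1.5, /v1/chat/completions accepts the minted JWT, not the bh_live_ key.
    client = OpenAI(api_key=jwt, base_url="https://api.behest.app/v1")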

    A dedicated behest Python package is planned for Phase 3. Until then, you have two immediately usable options: (1) use the OpenAI Python SDK with a custom base_url (Option 1 below), or (2) call the REST API directly with curl or any HTTP library (Options 2-4). Both approaches work today; the first needs only the openai package, and curl needs no installation at all.


    Option 1: OpenAI Python SDK with a custom base_url

    The Behest API is fully OpenAI-compatible. Point the OpenAI Python SDK at Behest by changing base_url and api_key:

    Installation

    bash
    pip install openai

    Usage

    python
    from openai import OpenAI
     
    client = OpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    response = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Say hello in three languages."}],
    )
     
    print(response.choices[0].message.content)

    That is the complete change — two constructor parameters. All other OpenAI SDK usage (streaming, function calling, response types) works without modification.
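
    As an example of that compatibility, function calling goes through the standard OpenAI tools parameter. The get_weather tool below is a made-up schema for illustration, not a Behest-provided function:

    python
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, shown only to illustrate the shape
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ]
     
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
        tools=tools,
    )
     
    # If the model decided to call the tool, the call details are on the message.
    tool_calls = response.choices[0].message.tool_calls
    if tool_calls:
        print(tool_calls[0].function.name, tool_calls[0].function.arguments)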

    With a system prompt

    python
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a concise assistant. Reply in one sentence."},
            {"role": "user", "content": "What is quantum entanglement?"},
        ],
        temperature=0.5,
        max_tokens=150,
    )
     
    print(response.choices[0].message.content)

    Streaming

    python
    stream = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Write a short poem about the sea."}],
        stream=True,
    )
     
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()  # final newline

    Async usage

    python
    import asyncio
    from openai import AsyncOpenAI
     
    client = AsyncOpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    async def main():
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Explain recursion briefly."}],
        )
        print(response.choices[0].message.content)
     
    asyncio.run(main())

    Error handling

    python
    from openai import OpenAI, APIStatusError, APIConnectionError
     
    client = OpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    try:
        response = client.chat.completions.create(
            model="gemini-2.5-flash",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(response.choices[0].message.content)
     
    except APIStatusError as e:
        print(f"Status: {e.status_code}")
        print(f"Code:   {e.code}")
        print(f"Msg:    {e.message}")
     
        if e.status_code == 401:
            print("Check your BEHEST_API_KEY environment variable.")
        elif e.status_code == 429:
            retry_after = e.response.headers.get("retry-after", "unknown")
            print(f"Rate limited. Retry after {retry_after}s.")
        elif e.status_code == 403:
            print("Guardrail blocked the request:", e.code)
        elif e.status_code == 502:
            print("Upstream provider error — not a Behest issue.")
     
    except APIConnectionError as e:
        print("Network error connecting to Behest:", e)

    Environment variable pattern

    python
    import os
    from openai import OpenAI
     
    client = OpenAI(
        api_key=os.environ["BEHEST_API_KEY"],
        base_url="https://api.behest.app/v1",
    )

    Option 2: REST API with curl

    Use curl when you want to test quickly without writing code or installing anything.

    Chat completion

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gemini-2.5-flash",
        "messages": [
          {"role": "user", "content": "Say hello in three languages."}
        ]
      }'

    Chat completion with system prompt

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-4o",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is the boiling point of water?"}
        ],
        "temperature": 0.5,
        "max_tokens": 100
      }'

    Streaming with curl

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -N \
      -d '{
        "model": "gemini-2.5-flash",
        "messages": [{"role": "user", "content": "Count to five slowly."}],
        "stream": true
      }'

    The -N flag disables buffering so tokens print as they arrive.
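
    The same stream can be consumed from Python without the SDK. The sketch below assumes the standard OpenAI SSE framing (data: lines ending with data: [DONE]), which is what an OpenAI-compatible endpoint emits:

    python
    import json
    import os
    import requests
     
    response = requests.post(
        "https://{slug}.behest.app/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['BEHEST_API_KEY']}"},
        json={
            "model": "gemini-2.5-flash",
            "messages": [{"role": "user", "content": "Count to five slowly."}],
            "stream": True,
        },
        stream=True,  # let requests yield the body incrementally
    )
    response.raise_for_status()
     
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)
    print()  # final newline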

    Example response

    json
    {
      "id": "chatcmpl-abc123",
      "object": "chat.completion",
      "created": 1743024000,
      "model": "gemini-2.5-flash",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! Hola! Bonjour!"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 8,
        "total_tokens": 20
      }
    }

    Option 3: REST API with the requests library

    python
    import os
    import requests
     
    BEHEST_API_KEY = os.environ["BEHEST_API_KEY"]
     
    response = requests.post(
        "https://{slug}.behest.app/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {BEHEST_API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "model": "gemini-2.5-flash",
            "messages": [
                {"role": "user", "content": "Summarize the water cycle in two sentences."}
            ],
        },
    )
     
    response.raise_for_status()
    data = response.json()
    print(data["choices"][0]["message"]["content"])

    Option 4: REST API with httpx (async)

    python
    import asyncio
    import os
    import httpx
     
    BEHEST_API_KEY = os.environ["BEHEST_API_KEY"]
     
    async def main():
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://{slug}.behest.app/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {BEHEST_API_KEY}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": "gemini-2.5-flash",
                    "messages": [{"role": "user", "content": "Hello from Python!"}],
                },
            )
            response.raise_for_status()
            data = response.json()
            print(data["choices"][0]["message"]["content"])
     
    asyncio.run(main())

    Python SDK (Phase 3)

    A dedicated behest Python package is planned for Phase 3. When it ships:

    bash
    pip install behest-ai openai

    python
    from behest import BehestClient
     
    client = BehestClient(
        api_key="bh_live_YOUR_API_KEY",
    )
     
    response = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Hello, Behest!"}],
    )
     
    print(response.choices[0].message.content)

    The Python SDK will follow the same pattern as the TypeScript SDK: extend the official OpenAI Python SDK with Behest authentication and header injection. Migration from the OpenAI Python SDK approach above will be a single constructor swap.
