    Python SDK and REST Usage — Behest

    ⚠️ This page documents pre-v1.5 patterns (OpenAI SDK + manual mint). The v1.5 behest Python package (pip install "behest-ai>=1.5") ships the Behest class with auth.mint(), dual-mode support, and typed errors — mirrors the TypeScript SDK. Prefer that for new Python code.

    The manual-REST examples below still work and are useful when you can't install a package (e.g., tight cold-start budgets). Also note: passing BEHEST_KEY as the OpenAI api_key on /v1/chat/completions will 401 — you must mint a JWT first. See the new migration guide for the correct pattern.
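
    For orientation, here is a minimal sketch of the mint-then-call flow described above. The constructor and the mint() return value are assumptions, not the documented API; follow the migration guide for the authoritative pattern.

    python
    # Illustrative only; treat every name below as an assumption. The migration
    # guide documents the real v1.5 API.
    from behest import Behest      # pip install "behest-ai>=1.5"
    from openai import OpenAI
     
    behest = Behest(api_key="bh_live_YOUR_API_KEY")
    jwt = behest.auth.mint()       # assumed: exchanges the raw key for a short-lived JWT
     
    # Post-v1.5, /v1/chat/completions accepts the minted JWT, not the bh_live_ key.
    client = OpenAI(api_key=jwt, base_url="https://api.behest.app/v1")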

    A dedicated behest Python package is planned for Phase 3. Until then, you have two immediately usable options: (1) use the OpenAI Python SDK with a custom base_url (Option 1 below), or (2) call the REST API directly with curl or any HTTP library (Options 2-4). Both approaches work today; the first needs only the openai package, and curl needs no installation at all.


    Option 1: OpenAI Python SDK with a custom base_url

    The Behest API is fully OpenAI-compatible. Point the OpenAI Python SDK at Behest by changing base_url and api_key:

    Installation

    bash
    pip install openai

    Usage

    python
    from openai import OpenAI
     
    client = OpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    response = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Say hello in three languages."}],
    )
     
    print(response.choices[0].message.content)

    That is the complete change — two constructor parameters. All other OpenAI SDK usage (streaming, function calling, response types) works without modification.
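
    As an example of that compatibility, function calling goes through the standard OpenAI tools parameter. The get_weather tool below is a made-up schema for illustration, not a Behest-provided function:

    python
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, shown only to illustrate the shape
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ]
     
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
        tools=tools,
    )
     
    # If the model decided to call the tool, the call details are on the message.
    tool_calls = response.choices[0].message.tool_calls
    if tool_calls:
        print(tool_calls[0].function.name, tool_calls[0].function.arguments)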

    With a system prompt

    python
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a concise assistant. Reply in one sentence."},
            {"role": "user", "content": "What is quantum entanglement?"},
        ],
        temperature=0.5,
        max_tokens=150,
    )
     
    print(response.choices[0].message.content)

    Streaming

    python
    stream = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Write a short poem about the sea."}],
        stream=True,
    )
     
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()  # final newline

    Async usage

    python
    import asyncio
    from openai import AsyncOpenAI
     
    client = AsyncOpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    async def main():
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Explain recursion briefly."}],
        )
        print(response.choices[0].message.content)
     
    asyncio.run(main())

    Error handling

    python
    from openai import OpenAI, APIStatusError, APIConnectionError
     
    client = OpenAI(
        api_key="bh_live_YOUR_API_KEY",
        base_url="https://api.behest.app/v1",
    )
     
    try:
        response = client.chat.completions.create(
            model="gemini-2.5-flash",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(response.choices[0].message.content)
     
    except APIStatusError as e:
        print(f"Status: {e.status_code}")
        print(f"Code:   {e.code}")
        print(f"Msg:    {e.message}")
     
        if e.status_code == 401:
            print("Check your BEHEST_API_KEY environment variable.")
        elif e.status_code == 429:
            retry_after = e.response.headers.get("retry-after", "unknown")
            print(f"Rate limited. Retry after {retry_after}s.")
        elif e.status_code == 403:
            print("Guardrail blocked the request:", e.code)
        elif e.status_code == 502:
            print("Upstream provider error — not a Behest issue.")
     
    except APIConnectionError as e:
        print("Network error connecting to Behest:", e)

    Environment variable pattern

    python
    import os
    from openai import OpenAI
     
    client = OpenAI(
        api_key=os.environ["BEHEST_API_KEY"],
        base_url="https://api.behest.app/v1",
    )

    Option 2: REST API with curl

    Use curl when you want to test quickly without writing code or installing anything.

    Chat completion

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gemini-2.5-flash",
        "messages": [
          {"role": "user", "content": "Say hello in three languages."}
        ]
      }'

    Chat completion with system prompt

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-4o",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is the boiling point of water?"}
        ],
        "temperature": 0.5,
        "max_tokens": 100
      }'

    Streaming with curl

    bash
    curl -X POST https://{slug}.behest.app/v1/chat/completions \
      -H "Authorization: Bearer bh_live_YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -N \
      -d '{
        "model": "gemini-2.5-flash",
        "messages": [{"role": "user", "content": "Count to five slowly."}],
        "stream": true
      }'

    The -N flag disables buffering so tokens print as they arrive.
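
    The same stream can be consumed from Python without the SDK. The sketch below assumes the standard OpenAI SSE framing (data: lines ending with data: [DONE]), which is what an OpenAI-compatible endpoint emits:

    python
    import json
    import os
    import requests
     
    response = requests.post(
        "https://{slug}.behest.app/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['BEHEST_API_KEY']}"},
        json={
            "model": "gemini-2.5-flash",
            "messages": [{"role": "user", "content": "Count to five slowly."}],
            "stream": True,
        },
        stream=True,  # let requests yield the body incrementally
    )
    response.raise_for_status()
     
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)
    print()  # final newline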

    Example response

    json
    {
      "id": "chatcmpl-abc123",
      "object": "chat.completion",
      "created": 1743024000,
      "model": "gemini-2.5-flash",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! Hola! Bonjour!"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 8,
        "total_tokens": 20
      }
    }

    Option 3: REST API with the requests library

    python
    import os
    import requests
     
    BEHEST_API_KEY = os.environ["BEHEST_API_KEY"]
     
    response = requests.post(
        "https://{slug}.behest.app/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {BEHEST_API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "model": "gemini-2.5-flash",
            "messages": [
                {"role": "user", "content": "Summarize the water cycle in two sentences."}
            ],
        },
    )
     
    response.raise_for_status()
    data = response.json()
    print(data["choices"][0]["message"]["content"])

    Option 4: REST API with httpx (async)

    python
    import asyncio
    import os
    import httpx
     
    BEHEST_API_KEY = os.environ["BEHEST_API_KEY"]
     
    async def main():
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://{slug}.behest.app/v1/chat/completions",
                headers={
                    "Authorization": f"Bearer {BEHEST_API_KEY}",
                    "Content-Type": "application/json",
                },
                json={
                    "model": "gemini-2.5-flash",
                    "messages": [{"role": "user", "content": "Hello from Python!"}],
                },
            )
            response.raise_for_status()
            data = response.json()
            print(data["choices"][0]["message"]["content"])
     
    asyncio.run(main())

    Python SDK (Phase 3)

    A dedicated behest Python package is planned for Phase 3. When it ships:

    bash
    pip install behest-ai openai

    python
    from behest import BehestClient
     
    client = BehestClient(
        api_key="bh_live_YOUR_API_KEY",
    )
     
    response = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Hello, Behest!"}],
    )
     
    print(response.choices[0].message.content)

    The Python SDK will follow the same pattern as the TypeScript SDK: extend the official OpenAI Python SDK with Behest authentication and header injection. Migration from the OpenAI Python SDK approach above will be a single constructor swap.
