
    Add AI to a React App

    A complete, working React component that calls Behest directly from the browser. Includes state management, error handling, and streaming.

    Prerequisites

    • A React app (Create React App, Vite, or any React setup)
    • A Behest AI account with a project and API key (sign up free)
    • Your app's origin added to your project's CORS settings (e.g., http://localhost:5173)

    Basic Chat Component

    This component sends a user message to Behest and displays the AI response. It calls the API directly from the browser — no backend server needed.

    import { useState } from "react";
    
    const BEHEST_URL = "https://your-project.behest.app/v1/chat/completions";
    const API_KEY = "your-api-key"; // In production, use environment variables
    
    export function AiChat() {
      const [input, setInput] = useState("");
      const [messages, setMessages] = useState([]);
      const [loading, setLoading] = useState(false);
      const [error, setError] = useState(null);
    
      async function handleSubmit(e) {
        e.preventDefault();
        if (!input.trim() || loading) return;
    
        const userMessage = { role: "user", content: input };
        setMessages((prev) => [...prev, userMessage]);
        setInput("");
        setLoading(true);
        setError(null);
    
        try {
          const response = await fetch(BEHEST_URL, {
            method: "POST",
            headers: {
              "Authorization": `Bearer ${API_KEY}`,
              "Content-Type": "application/json",
            },
            body: JSON.stringify({
              model: "gemini-2.5-flash",
              messages: [...messages, userMessage],
            }),
          });
    
          if (!response.ok) {
            if (response.status === 429) {
              throw new Error("Rate limited. Please wait a moment.");
            }
            throw new Error(`Request failed: ${response.status}`);
          }
    
          const data = await response.json();
          const assistantMessage = data.choices[0].message;
          setMessages((prev) => [...prev, assistantMessage]);
        } catch (err) {
          setError(err.message);
        } finally {
          setLoading(false);
        }
      }
    
      return (
        <div>
          <div>
            {messages.map((msg, i) => (
              <div key={i} style={{ marginBottom: 12 }}>
                <strong>{msg.role === "user" ? "You" : "AI"}:</strong>
                <p>{msg.content}</p>
              </div>
            ))}
            {loading && <p>Thinking...</p>}
            {error && <p style={{ color: "red" }}>{error}</p>}
          </div>
    
          <form onSubmit={handleSubmit}>
            <input
              type="text"
              value={input}
              onChange={(e) => setInput(e.target.value)}
              placeholder="Ask something..."
              disabled={loading}
            />
            <button type="submit" disabled={loading}>
              Send
            </button>
          </form>
        </div>
      );
    }
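
    The hard-coded API_KEY above is for readability only. In a Vite project, for example, the key would come from an environment variable instead (the variable name below is an assumed convention, not something Behest requires):

```javascript
// .env (kept out of version control):
//   VITE_BEHEST_API_KEY=your-api-key

// Vite exposes variables prefixed with VITE_ on import.meta.env at build time
const API_KEY = import.meta.env.VITE_BEHEST_API_KEY;
```

    Note that any key bundled into client-side code is still visible to end users; the CORS origin restriction from the prerequisites is what scopes where it can be used.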

    How It Works

    1. The component maintains a messages array with the full conversation history.
    2. On submit, it sends the entire message history to Behest. Behest also maintains its own server-side memory, so even if the user refreshes, context is preserved.
    3. Behest handles CORS, authenticates the request, scrubs PII, defends against prompt injection, enforces rate limits, and routes to the LLM.
    4. The response comes back in the standard OpenAI format. Extract the content from data.choices[0].message.content.
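
    For reference, a successful non-streaming response has the standard OpenAI chat-completion shape. The values below are illustrative, not real output:

```javascript
// Illustrative response body in the OpenAI chat completion format
const data = {
  choices: [
    { message: { role: "assistant", content: "Hello! How can I help?" } },
  ],
};

// Extract the assistant's text the same way the component does
const text = data.choices[0].message.content;
```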

    Streaming Response

    For real-time token-by-token output, use stream: true. The response is delivered as server-sent events (SSE):

    async function streamResponse(question, onChunk) {
      const response = await fetch(BEHEST_URL, {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          model: "gemini-2.5-flash",
          messages: [{ role: "user", content: question }],
          stream: true,
        }),
      });

      if (!response.ok) {
        throw new Error(`Request failed: ${response.status}`);
      }

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = "";
      let fullText = "";

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // SSE events can be split across network chunks, so buffer partial lines
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split("\n");
        buffer = lines.pop(); // keep any incomplete trailing line for the next read

        for (const line of lines) {
          if (!line.startsWith("data: ")) continue;
          const data = line.slice("data: ".length);
          if (data === "[DONE]") return fullText;

          try {
            const parsed = JSON.parse(data);
            fullText += parsed.choices[0]?.delta?.content ?? "";
            onChunk(fullText); // e.g. a state setter such as setResponse
          } catch {
            // Skip malformed chunks
          }
        }
      }

      return fullText;
    }
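
    The per-line SSE parsing can also be pulled into a small pure helper, which makes it easy to unit-test. This is a sketch; parseSseLine is an illustrative name, not part of the Behest API:

```javascript
// Parse one line of a chat-completion SSE stream.
// Returns the delta text, or null if the line carries no content
// (non-data lines, the [DONE] sentinel, or malformed JSON).
function parseSseLine(line) {
  if (!line.startsWith("data: ")) return null;
  const data = line.slice("data: ".length);
  if (data === "[DONE]") return null;
  try {
    const parsed = JSON.parse(data);
    return parsed.choices[0]?.delta?.content ?? null;
  } catch {
    return null; // skip malformed chunks
  }
}
```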

    Per-User Tracking

    Pass the X-End-User-Id header to enable per-user memory, rate limiting, and analytics:

    const response = await fetch(BEHEST_URL, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
        "X-End-User-Id": currentUser.id, // Your app's user ID
      },
      body: JSON.stringify({
        model: "gemini-2.5-flash",
        messages: [{ role: "user", content: input }],
      }),
    });

    With the user ID header, Behest maintains separate conversation memory, enforces per-user rate limits, and tracks per-user token usage in analytics.
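
    If several places in your app make requests, the header setup can be factored into a small helper that builds the fetch options. This is a sketch; buildChatRequest is a hypothetical name, not part of the Behest API:

```javascript
// Build fetch options for a Behest chat request tied to a specific end user
function buildChatRequest(apiKey, userId, messages) {
  return {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      "X-End-User-Id": userId, // your app's user ID
    },
    body: JSON.stringify({
      model: "gemini-2.5-flash",
      messages,
    }),
  };
}
```

    Then each call site is a one-liner: `fetch(BEHEST_URL, buildChatRequest(API_KEY, currentUser.id, messages))`.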