Events and Notifications
Behest emits structured events for guardrail detections, budget activity, and infrastructure state changes. Events are written to Redis Streams and can be consumed in real time.
Event Transport: Redis Streams
All events are written to the guardrails:events Redis Stream using XADD. The stream is capped at roughly 10,000 entries (MAXLEN 10000 with approximate trimming, so it can briefly hold slightly more). Older events are evicted automatically.
Stream key: guardrails:events
Read events with XREAD or XREADGROUP:
# Read all retained events from the beginning of the stream
XREAD COUNT 100 STREAMS guardrails:events 0-0
# Read new events since a known message ID
XREAD COUNT 100 STREAMS guardrails:events 1743340800000-0
# Block until new events arrive (60-second timeout)
XREAD COUNT 10 BLOCK 60000 STREAMS guardrails:events $
Each entry is a flat key/value map (all values are strings).
Event Types
Guardrail: PII Detection
Emitted by pii_shield_hook.py (Microsoft Presidio NER + regex) when PII is found in a user message.
Event type values:
- pii_detection: PII was found and masked or redacted (enforce mode), or found and logged (shadow mode)
- pii_block: PII with a BLOCK action was detected and the request was rejected
Fields:
| Field | Type | Description |
|---|---|---|
| type | string | "pii_detection" or "pii_block" |
| tenant_id | string | UUID of the tenant |
| project_id | string | UUID of the project |
| direction | string | "input" (pre-call scan) |
| mode | string | "shadow" or "enforce" |
| action_taken | string | "blocked", "masked", or "logged" |
| entity_types | string | JSON array of detected entity type names (e.g. ["PERSON", "EMAIL_ADDRESS"]) |
| entity_count | string | Number of PII entities detected |
| request_id | string | Value of the x-request-id header |
| timestamp | string | Unix timestamp (float, as string) |
Example entry:
type pii_detection
tenant_id 550e8400-e29b-41d4-a716-446655440000
project_id 6ba7b810-9dad-11d1-80b4-00c04fd430c8
direction input
mode enforce
action_taken masked
entity_types ["EMAIL_ADDRESS","PERSON"]
entity_count 2
request_id req_abc123
timestamp 1743340800.123
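Because every value arrives as a string, a consumer usually needs to decode entity_types, entity_count, and timestamp before using them. The following is a minimal sketch of that decoding for the PII fields listed above; the `PiiEvent` class and `parse_pii_event` function are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PiiEvent:
    type: str
    project_id: str
    action_taken: str
    entity_types: list
    entity_count: int
    occurred_at: datetime


def parse_pii_event(fields: dict) -> PiiEvent:
    """Convert a flat string map from the stream into typed values."""
    return PiiEvent(
        type=fields["type"],
        project_id=fields["project_id"],
        action_taken=fields["action_taken"],
        # entity_types is a JSON array serialized as a string.
        entity_types=json.loads(fields["entity_types"]),
        entity_count=int(fields["entity_count"]),
        # timestamp is a Unix float serialized as a string.
        occurred_at=datetime.fromtimestamp(float(fields["timestamp"]), tz=timezone.utc),
    )
```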
PII mode configuration:
- disabled (default): no scanning, no events
- shadow: scan, log events, do not modify content or block requests
- enforce: scan, mask/redact/block per entity config, log events
Configure via project settings: config:{projectId}:pii_mode and config:{projectId}:pii_entities.
The pii_entities config is a JSON object mapping entity type to action:
{
"EMAIL_ADDRESS": "MASK",
"PERSON": "REDACT",
"CREDIT_CARD": "BLOCK"
}
Supported entity types come from Microsoft Presidio's standard recognizers (PERSON, EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IBAN_CODE, IP_ADDRESS, LOCATION, URL, etc.).
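Before storing a pii_entities value, it can help to check that it is a JSON object whose actions are limited to MASK, REDACT, and BLOCK. The sketch below does only that; `parse_pii_entities` is a hypothetical helper, and it deliberately does not validate entity type names, since the full set of supported recognizers is not listed here.

```python
import json

ALLOWED_ACTIONS = {"MASK", "REDACT", "BLOCK"}


def parse_pii_entities(raw: str) -> dict:
    """Parse a pii_entities JSON string and reject unknown actions."""
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("pii_entities must be a JSON object")
    for entity_type, action in config.items():
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            raise ValueError(f"unsupported action {action!r} for {entity_type}")
    return config
```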
MASK vs REDACT vs BLOCK:
- MASK: reversible tokenization. The original value is stored in a Redis vault (pii_vault:{tid}:{pid}:{requestId}, 5-minute TTL) and restored in the LLM response before returning to the client.
- REDACT: permanently replaced with <ENTITY_TYPE> in the request. Not restored.
- BLOCK: request is rejected with HTTP 400.
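The difference between the three actions can be illustrated with a small, self-contained sketch. This is not the code in pii_shield_hook.py: the `[[PII_n]]` token format is invented for the example, a plain dict stands in for the Redis vault, detected spans are assumed to be non-overlapping `(start, end, entity_type)` tuples, and entity types without a configured action are left untouched.

```python
class BlockedRequest(Exception):
    """Raised when an entity configured as BLOCK is present."""


def apply_pii_actions(text, spans, actions):
    """Return (rewritten_text, vault) or raise BlockedRequest."""
    # Reject first so a blocked request is never partially rewritten.
    for _, _, entity_type in spans:
        if actions.get(entity_type) == "BLOCK":
            raise BlockedRequest(entity_type)
    vault = {}
    # Replace right to left so earlier offsets remain valid.
    for index, (start, end, entity_type) in enumerate(sorted(spans, reverse=True)):
        action = actions.get(entity_type)
        if action == "MASK":
            token = f"[[PII_{index}]]"  # hypothetical token format
            vault[token] = text[start:end]
            replacement = token
        elif action == "REDACT":
            replacement = f"<{entity_type}>"
        else:
            continue
        text = text[:start] + replacement + text[end:]
    return text, vault


def restore_masked(response, vault):
    """Put masked originals back into the model response."""
    for token, original in vault.items():
        response = response.replace(token, original)
    return response
```

Only MASK entries end up in the vault, which is why REDACT values cannot be restored in the response.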
Guardrail: Sentinel (Prompt Injection)
Emitted by sentinel_hook.py when jailbreak patterns or blocklist terms are detected in user messages.
Event type values:
- sentinel_jailbreak: a hardcoded jailbreak regex matched user input
- sentinel_blocklist: a custom blocklist term matched user input
Fields:
| Field | Type | Description |
|---|---|---|
| type | string | "sentinel_jailbreak" or "sentinel_blocklist" |
| tenant_id | string | UUID of the tenant |
| project_id | string | UUID of the project |
| direction | string | "input" |
| mode | string | "shadow" or "enforce" |
| action_taken | string | "blocked" or "logged" |
| request_id | string | Value of the x-request-id header |
| timestamp | string | Unix timestamp (float, as string) |
Sentinel mode configuration:
- disabled (default): no scanning
- shadow: scan and log, never block
- enforce: block requests that trigger detections
Configure via project settings: config:{projectId}:sentinel_mode and config:{projectId}:sentinel_blocklist.
Built-in jailbreak patterns (pre-compiled regexes, always active when mode is not disabled):
- Instruction override: ignore all previous instructions
- Role-play bypass: act as unrestricted, pretend to be DAN
- System prompt injection: [SYSTEM], <<SYS>>, system prompt override
- Safety bypass: bypass your safety filters, disable content restrictions
- Instruction discard: disregard your safety prompt
Custom blocklist terms are matched case-insensitively as substrings.
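The two detection paths described above can be sketched as follows. The regexes here are illustrative stand-ins based on the example phrases, not the patterns shipped in sentinel_hook.py, and checking jailbreak patterns before the blocklist is an assumption.

```python
import re

# Illustrative patterns only; the real hook's regexes are not reproduced here.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore\s+all\s+previous\s+instructions", re.IGNORECASE),
    re.compile(r"pretend\s+to\s+be\s+DAN", re.IGNORECASE),
    re.compile(r"bypass\s+your\s+safety\s+filters", re.IGNORECASE),
]


def classify(message, blocklist):
    """Return the event type a message would trigger, or None."""
    if any(pattern.search(message) for pattern in JAILBREAK_PATTERNS):
        return "sentinel_jailbreak"
    lowered = message.lower()
    # Blocklist terms match case-insensitively as substrings.
    if any(term.lower() in lowered for term in blocklist if term):
        return "sentinel_blocklist"
    return None
```

Substring matching means a term such as "acme" also matches "AcmeCorp", so short blocklist terms may produce more matches than expected.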
When sentinel fires in enforce mode, the response to the client is:
HTTP 400
"Request blocked: potential prompt injection detected"
or
HTTP 400
"Request blocked: content policy violation"
Budget Events
Budget alerts are not currently emitted as stream events. The token_budget_hook.py runs post-response and increments Redis counters, but does not emit to guardrails:events. Budget enforcement is handled pre-request by the Kong plugin (returns 429 when used >= budget).
Planned: 50%, 80%, and 100% budget alert events are on the roadmap. When implemented, they will be written to guardrails:events with type budget_alert and a threshold field.
Current token counter keys (readable directly from Redis for monitoring):
| Counter | Redis key | Window |
|---|---|---|
| Per-user daily | tokens:{pid}:{uid}:{YYYYMMDD} | UTC day |
| Per-user monthly | tokens:{pid}:{uid}:{YYYYMM} | UTC month |
| Per-project daily | tokens:{tid}:{pid}:{YYYYMMDD} | UTC day |
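For monitoring scripts, the key names in the table can be derived from the current UTC date. The helper below is a sketch that only builds the key strings; `token_counter_keys` is a hypothetical name, and how the counter values are encoded in Redis is not covered here.

```python
from datetime import datetime, timezone


def token_counter_keys(tenant_id, project_id, user_id, now=None):
    """Build the three counter keys for the UTC day and month of `now`."""
    # Pass a timezone-aware datetime; it is normalized to UTC before formatting.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    day = now.strftime("%Y%m%d")
    month = now.strftime("%Y%m")
    return {
        "user_daily": f"tokens:{project_id}:{user_id}:{day}",
        "user_monthly": f"tokens:{project_id}:{user_id}:{month}",
        "project_daily": f"tokens:{tenant_id}:{project_id}:{day}",
    }
```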
Kill Switch Events
Kill switches do not emit stream events — they are state flags that Kong reads synchronously per-request. The Kong plugin logs a warning entry when a kill switch fires:
WARN Kill switch active: {level} tid={tenantId} pid={projectId}
Kill switch Redis keys (set to "1" to activate, delete to deactivate):
| Scope | Redis key |
|---|---|
| Global | killswitch:global |
| Tenant | killswitch:tenant:{tenantId} |
| Project | killswitch:project:{projectId} |
Activation takes effect on the next request (checked at request time, no TTL).
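An operator script can check which kill switch, if any, applies to a given tenant and project by reading the three keys. The sketch below takes any `get` callable (for example the `get` method of a redis-py client created with `decode_responses=True`) so it can be exercised without a server. The function names and the global, tenant, project check order are assumptions, not the Kong plugin's documented behavior.

```python
def kill_switch_keys(tenant_id, project_id):
    """Return the kill switch keys that apply to one tenant/project pair."""
    return [
        "killswitch:global",
        f"killswitch:tenant:{tenant_id}",
        f"killswitch:project:{project_id}",
    ]


def active_kill_switch(get, tenant_id, project_id):
    """Return the first key whose value is "1", or None if none is set."""
    for key in kill_switch_keys(tenant_id, project_id):
        if get(key) == "1":
            return key
    return None
```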
Consuming Events
Polling a Consumer Group (Recommended for Production)
Consumer groups allow multiple consumers to share the stream and track acknowledged events:
import redis
import json
r = redis.from_url("redis://localhost:6379", decode_responses=True)
# Create consumer group (once)
try:
    r.xgroup_create("guardrails:events", "my-consumer-group", id="$", mkstream=True)
except redis.ResponseError as exc:
    # BUSYGROUP means the group already exists; re-raise anything else
    if "BUSYGROUP" not in str(exc):
        raise
# Consume events
while True:
    entries = r.xreadgroup(
        groupname="my-consumer-group",
        consumername="worker-1",
        streams={"guardrails:events": ">"},
        count=10,
        block=5000,  # 5-second timeout
    )
    if not entries:
        continue
    for stream_name, messages in entries:
        for msg_id, fields in messages:
            print(f"Event: {fields['type']} project={fields.get('project_id')}")
            # Acknowledge after processing
            r.xack("guardrails:events", "my-consumer-group", msg_id)
Simple Polling (Development / Low Volume)
import redis
r = redis.from_url("redis://localhost:6379", decode_responses=True)
last_id = "0-0" # Start from beginning; use "$" for new-only
while True:
    entries = r.xread(streams={"guardrails:events": last_id}, count=50, block=5000)
    if not entries:
        continue
    for stream_name, messages in entries:
        for msg_id, fields in messages:
            last_id = msg_id
            print(fields)
Event Delivery Guarantees
| Property | Value |
|---|---|
| Delivery | Best-effort, at-most-once on the write path (XADD is fire-and-forget; failures are swallowed and the event is lost) |
| Ordering | Per-stream insertion order (Redis Streams preserve insertion order) |
| Retention | Last 10,000 events (approximate MAXLEN trimming) |
| Durability | Redis AOF/RDB persistence (same as all Behest Redis data) |
| Latency | Sub-millisecond (same-process async write in LiteLLM hook) |
Events are written on a best-effort basis: if the Redis write fails (connection error, timeout), the exception is swallowed and the event is lost. The inference request succeeds regardless of event write failure.
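The best-effort behavior described above can be sketched as a small wrapper around XADD. This is a synchronous illustration, not the hook's actual code: `emit_event` is a hypothetical name, and `client` is assumed to be any object exposing a redis-py style `xadd(name, fields, maxlen=..., approximate=...)` method.

```python
import time


def emit_event(client, fields, stream="guardrails:events", maxlen=10_000):
    """Write one event; return False instead of raising if the write fails."""
    # Stream entries are flat string maps, so stringify every value.
    entry = {key: str(value) for key, value in fields.items()}
    entry.setdefault("timestamp", str(time.time()))
    try:
        client.xadd(stream, entry, maxlen=maxlen, approximate=True)
        return True
    except Exception:
        # Swallow the error so the caller's request is never affected.
        return False
```

A caller that needs visibility into dropped events could count the False returns in a metric rather than letting the exception propagate.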
Event Filtering
All events include tenant_id and project_id fields. Use these to filter to a specific project in your consumer:
if fields.get("project_id") == "6ba7b810-9dad-11d1-80b4-00c04fd430c8":
    handle_event(fields)
There is no per-project stream partition currently. All tenants share the single guardrails:events stream. Server-side filtering by project ID is on the roadmap.