Events and Notifications

Behest emits structured events for guardrail detections, budget activity, and infrastructure state changes. Events are written to Redis Streams and can be consumed in real time.

Event Transport: Redis Streams

All events are written to the guardrails:events Redis Stream using XADD. The stream is capped at 10,000 entries (MAXLEN 10000, approximate trimming). Older events are evicted automatically.

Stream key: guardrails:events

Read events with XREAD or XREADGROUP:

bash

# Read all new events from the beginning
XREAD COUNT 100 STREAMS guardrails:events 0-0
 
# Read new events since a known message ID
XREAD COUNT 100 STREAMS guardrails:events 1743340800000-0
 
# Block until new events arrive (60-second timeout)
XREAD COUNT 10 BLOCK 60000 STREAMS guardrails:events $

Each entry is a flat key/value map (all values are strings).

Event Types

Guardrail: PII Detection

Emitted by pii_shield_hook.py (Microsoft Presidio NER + regex) when PII is found in a user message.

Event type values:

pii_detection — PII was found and masked or redacted (enforce mode), or found and logged (shadow mode)
pii_block — PII with a BLOCK action was detected and the request was rejected

Fields:

Field	Type	Description
`type`	string	`"pii_detection"` or `"pii_block"`
`tenant_id`	string	UUID of the tenant
`project_id`	string	UUID of the project
`direction`	string	`"input"` (pre-call scan)
`mode`	string	`"shadow"` or `"enforce"`
`action_taken`	string	`"blocked"`, `"masked"`, or `"logged"`
`entity_types`	string	JSON array of detected entity type names (e.g. `["PERSON", "EMAIL_ADDRESS"]`)
`entity_count`	string	Number of PII entities detected
`request_id`	string	Value of the `x-request-id` header
`timestamp`	string	Unix timestamp (float, as string)

Example entry:

type        pii_detection
tenant_id   550e8400-e29b-41d4-a716-446655440000
project_id  6ba7b810-9dad-11d1-80b4-00c04fd430c8
direction   input
mode        enforce
action_taken masked
entity_types ["EMAIL_ADDRESS","PERSON"]
entity_count 2
request_id  req_abc123
timestamp   1743340800.123

PII mode configuration:

disabled (default) — no scanning, no events
shadow — scan, log events, do not modify content or block requests
enforce — scan, mask/redact/block per entity config, log events

Configure via project settings: config:{projectId}:pii_mode and config:{projectId}:pii_entities.

The pii_entities config is a JSON object mapping entity type to action:

json

{
  "EMAIL_ADDRESS": "MASK",
  "PERSON": "REDACT",
  "CREDIT_CARD": "BLOCK"
}

Supported entity types come from Microsoft Presidio's standard recognizers (PERSON, EMAIL_ADDRESS, PHONE_NUMBER, CREDIT_CARD, IBAN_CODE, IP_ADDRESS, LOCATION, URL, etc.).

MASK vs REDACT vs BLOCK:

MASK — reversible tokenization. The original value is stored in a Redis vault (pii_vault:{tid}:{pid}:{requestId}, 5-minute TTL) and restored in the LLM response before returning to the client.
REDACT — permanently replaced with <ENTITY_TYPE> in the request. Not restored.
BLOCK — request is rejected with HTTP 400.

Guardrail: Sentinel (Prompt Injection)

Emitted by sentinel_hook.py when jailbreak patterns or blocklist terms are detected in user messages.

Event type values:

sentinel_jailbreak — a hardcoded jailbreak regex matched user input
sentinel_blocklist — a custom blocklist term matched user input

Fields:

Field	Type	Description
`type`	string	`"sentinel_jailbreak"` or `"sentinel_blocklist"`
`tenant_id`	string	UUID of the tenant
`project_id`	string	UUID of the project
`direction`	string	`"input"`
`mode`	string	`"shadow"` or `"enforce"`
`action_taken`	string	`"blocked"` or `"logged"`
`request_id`	string	Value of the `x-request-id` header
`timestamp`	string	Unix timestamp (float, as string)

Sentinel mode configuration:

disabled (default) — no scanning
shadow — scan and log, never block
enforce — block requests that trigger detections

Configure via project settings: config:{projectId}:sentinel_mode and config:{projectId}:sentinel_blocklist.

Built-in jailbreak patterns (pre-compiled regexes, always active when mode is not disabled):

Instruction override: ignore all previous instructions
Role-play bypass: act as unrestricted, pretend to be DAN
System prompt injection: [SYSTEM], <<SYS>>, system prompt override
Safety bypass: bypass your safety filters, disable content restrictions
Instruction discard: disregard your safety prompt

Custom blocklist terms are matched case-insensitively as substrings.

When sentinel fires in enforce mode, the response to the client is:

HTTP 400
"Request blocked: potential prompt injection detected"

HTTP 400
"Request blocked: content policy violation"

Budget alerts are not currently emitted as stream events. The token_budget_hook.py runs post-response and increments Redis counters, but does not emit to guardrails:events. Budget enforcement is handled pre-request by the Kong plugin (returns 429 when used >= budget).

Planned: 50%, 80%, and 100% budget alert events are on the roadmap. When implemented, they will be written to guardrails:events with type budget_alert and a threshold field.

Current token counter keys (readable directly from Redis for monitoring):

Counter	Redis key	Window
Per-user daily	`tokens:{pid}:{uid}:{YYYYMMDD}`	UTC day
Per-user monthly	`tokens:{pid}:{uid}:{YYYYMM}`	UTC month
Per-project daily	`tokens:{tid}:{pid}:{YYYYMMDD}`	UTC day

Kill Switch Events

Kill switches do not emit stream events — they are state flags that Kong reads synchronously per-request. The Kong plugin logs a warning entry when a kill switch fires:

WARN Kill switch active: {level} tid={tenantId} pid={projectId}

Kill switch Redis keys (set to "1" to activate, delete to deactivate):

Scope	Redis key
Global	`killswitch:global`
Tenant	`killswitch:tenant:{tenantId}`
Project	`killswitch:project:{projectId}`

Activation takes effect on the next request (checked at request time, no TTL).

Consuming Events

Polling a Consumer Group (Recommended for Production)

Consumer groups allow multiple consumers to share the stream and track acknowledged events:

python

import redis
import json
 
r = redis.from_url("redis://localhost:6379", decode_responses=True)
 
# Create consumer group (once)
try:
    r.xgroup_create("guardrails:events", "my-consumer-group", id="$", mkstream=True)
except redis.ResponseError:
    pass  # Group already exists
 
# Consume events
while True:
    entries = r.xreadgroup(
        groupname="my-consumer-group",
        consumername="worker-1",
        streams={"guardrails:events": ">"},
        count=10,
        block=5000,  # 5-second timeout
    )
    if not entries:
        continue
    for stream_name, messages in entries:
        for msg_id, fields in messages:
            print(f"Event: {fields['type']} project={fields.get('project_id')}")
            # Acknowledge after processing
            r.xack("guardrails:events", "my-consumer-group", msg_id)

Simple Polling (Development / Low Volume)

python

import redis
 
r = redis.from_url("redis://localhost:6379", decode_responses=True)
 
last_id = "0-0"  # Start from beginning; use "$" for new-only
while True:
    entries = r.xread(streams={"guardrails:events": last_id}, count=50, block=5000)
    if not entries:
        continue
    for stream_name, messages in entries:
        for msg_id, fields in messages:
            last_id = msg_id
            print(fields)

Event Delivery Guarantees

Property	Value
Delivery	At-least-once (XADD is fire-and-forget; failures are swallowed)
Ordering	Per-stream insertion order (Redis Streams preserve insertion order)
Retention	Last 10,000 events (approximate MAXLEN trimming)
Durability	Redis AOF/RDB persistence (same as all Behest Redis data)
Latency	Sub-millisecond (same-process async write in LiteLLM hook)

Events are written with a best-effort approach — if the Redis write fails (connection error, timeout), the exception is swallowed and the event is lost. The inference request succeeds regardless of event write failure.

Event Filtering

All events include tenant_id and project_id fields. Use these to filter to a specific project in your consumer:

python

if fields.get("project_id") == "6ba7b810-9dad-11d1-80b4-00c04fd430c8":
    handle_event(fields)

There is no per-project stream partition currently. All tenants share the single guardrails:events stream. Server-side filtering by project ID is on the roadmap.