Delivery, metering, and observability
LoreOS runs character work asynchronously. That means delivery, spend control, and debugging are first-class parts of the platform.
Delivery
You can consume replies directly from session events, or attach delivery channels. The managed Telegram channel lets a developer connect a bot, bind channel users to LoreOS sessions, and let LoreOS handle delivery state.
There is one cursored event log per session, projected three ways:
- Polling: GET /v1/sessions/{id}/events?since=<cursor>, the universal fallback.
- SSE stream: GET /v1/sessions/{id}/events/stream?since=<cursor>; a connection is capped at 5 minutes, reconnect with the last cursor.
- Signed webhook push: register with POST /v1/sessions/{id}/channels {url, secret}. LoreOS POSTs each event to your URL signed (x-auto-dating-signature: sha256=HMAC(secret, body)), at-least-once with retry/backoff. Dedupe on x-auto-dating-event-id (or on the event cursor) so a redelivered event is not processed twice.
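The webhook projection needs two receiver-side checks: verify the signature, then drop redeliveries. The following is a minimal sketch, not a reference implementation. It assumes the digest after sha256= is hex-encoded and that header names arrive lowercased; neither is confirmed above. WebhookReceiver and its in-memory set are hypothetical illustration names.

```python
import hashlib
import hmac


def expected_signature(secret: bytes, body: bytes) -> str:
    # Assumption: the HMAC-SHA256 digest is hex-encoded after "sha256=".
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    # Constant-time comparison; bytes on both sides so any header text is safe.
    expected = expected_signature(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, header_value.encode("utf-8"))


class WebhookReceiver:
    """Verifies and de-duplicates pushed events (in-memory sketch)."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._seen: set[str] = set()  # a real adapter would use a durable store

    def handle(self, headers: dict[str, str], body: bytes) -> str:
        signature = headers.get("x-auto-dating-signature", "")
        if not verify_signature(self._secret, body, signature):
            return "rejected"
        event_id = headers.get("x-auto-dating-event-id", "")
        if event_id in self._seen:
            return "duplicate"  # at-least-once delivery: acknowledge, do not reprocess
        self._seen.add(event_id)
        return "accepted"
```

Verify against the raw request bytes, not a re-serialized JSON object, because any byte difference changes the HMAC.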
For the exact ordered sequence an adapter sees per message — and how to handle typing indicators, latency, failures, and process death — see Event lifecycle and reply timing.
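The polling projection comes down to a small cursor loop. This sketch keeps the HTTP call behind a caller-supplied fetch_page function (hypothetical) so the cursor handling is the only thing shown. It assumes each event carries its cursor in a field named cursor and that an empty page means nothing new; check both against the events endpoint reference.

```python
from typing import Callable, Iterator, Optional

# Hypothetical: fetch_page(cursor) performs
# GET /v1/sessions/{id}/events?since=<cursor> and returns the decoded events.
FetchPage = Callable[[Optional[str]], list[dict]]


def drain_events(fetch_page: FetchPage, cursor: Optional[str] = None) -> Iterator[dict]:
    """Yield events after `cursor` until a poll returns nothing new."""
    while True:
        events = fetch_page(cursor)
        if not events:
            return
        for event in events:
            cursor = event["cursor"]  # assumed field name; remember the last one seen
            yield event
```

Persist the last cursor you processed so a restarted adapter resumes from it instead of replaying the log.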
Metering
Usage is metered by app and external user. LoreOS reserves usage before expensive work runs, so budget caps can stop a request before model calls or image work begin.
Use:
- GET /v1/usage for current app usage (optionally group_by resource_type / model / provider / character / session / external_user);
- GET /v1/rates for the rate card;
- external-user usage and budget endpoints to inspect or cap a single user.
Cost lifecycle. LoreOS reserves credits before the provider call — a send that would
exceed a hard cap is rejected with 402 budget_exceeded before any model spend — then
settles the actual usage after the turn completes (success or failure: you are charged
what was actually spent). The usage_credits you read in a trace is the settled amount.
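The reserve-then-settle order can be illustrated with a toy ledger. This is not LoreOS's implementation. The class names, the estimate parameter, and the single hard cap are assumptions made for illustration only.

```python
class BudgetExceeded(Exception):
    """Stands in for the 402 budget_exceeded rejection."""


class CreditLedger:
    def __init__(self, hard_cap: int) -> None:
        self.hard_cap = hard_cap
        self.settled = 0   # credits actually spent
        self.reserved = 0  # holds for turns still in flight

    def reserve(self, estimate: int) -> None:
        # Reject before any provider call if the hold would cross the cap.
        if self.settled + self.reserved + estimate > self.hard_cap:
            raise BudgetExceeded("budget_exceeded")
        self.reserved += estimate

    def settle(self, estimate: int, actual: int) -> None:
        # Release the hold and record only what the turn really consumed,
        # whether the turn succeeded or failed.
        self.reserved -= estimate
        self.settled += actual
```

In this model a rejected reserve changes nothing, which matches the rule that a blocked send incurs no spend.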
Where a failed or blocked image shows up. GET /v1/usage reports settled cost only. An
image that was attempted but never generated (e.g. the image provider returns an error, or a
402 budget_exceeded blocked it before the call) costs $0, so it does not appear in
/v1/usage — there was no spend. The attempt and its failure are visible elsewhere:
- image-probe failure → GET /v1/characters/{slug}/image-probe/{probe_id}: status is "failed" with a message. (servable: true and status: "failed" is the correct-but-confusing case where the image was generated + stored but the request was then rejected.)
- inventory-generate failure → the per-item result in the inventory-generate report.
- reply image failure → a message.failed event on the session log.
So /v1/usage answers “what did I spend”; the probe / inventory report answers “did this image
attempt succeed, and if not, why”. If image generation fails with a provider billing error, that
is LoreOS’s image-provider capacity — contact LoreOS; registering an existing image via
POST .../visual-assets/register-url needs no image generation and is never billing-blocked.
Observability
A character runtime has many moving parts: message runs, model calls, delivery attempts,
image requests, Story Room state, proactive hooks, and scheduled jobs. The read-only,
app-scoped observability endpoints let you answer “why did this character reply that way,
get withheld, or cost that much” without stitching tables by hand. Every endpoint is scoped
to your app — a session that is not yours returns 404 (never 403), so it looks absent.
Find sessions
GET /v1/sessions lists your app’s sessions, newest first, with optional filters
character (slug), channel_id, and external_user_ref.
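A request URL for this listing could be built as in the sketch below. It assumes the three filters are passed as query parameters under the names given above, and base_url is a placeholder; check both against the API reference.

```python
from typing import Optional
from urllib.parse import urlencode


def sessions_url(
    base_url: str,
    character: Optional[str] = None,
    channel_id: Optional[str] = None,
    external_user_ref: Optional[str] = None,
) -> str:
    filters = {
        "character": character,
        "channel_id": channel_id,
        "external_user_ref": external_user_ref,
    }
    # Only send the filters the caller actually set.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/v1/sessions" + (f"?{query}" if query else "")
```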
Per-event trace — GET /v1/sessions/{id}/trace
One read that stitches the event log to its settled cost and its push-delivery status, newest
first. This is the “inbound → reply → cost → delivery” view for a single session. Each row
carries run_ref (the turn’s trace id), so you can group rows by the run_ref the send response
returned and attach that turn’s cost and delivery to your UI.
Notes on the fields:
- text is the event’s payload.text, and image_url is the event’s payload.image_url (set on a servable image.ready). The trace surfaces these two flattened for convenience; the full payloads remain on the events endpoint.
- usage_credits is the settled sum of metered usage for that turn’s run_ref. Rows that share a run_ref report the same turn-level settled total.
- delivery_status / delivery_attempts / delivery_error reflect the latest push delivery attempt for that event (populated for events delivered to a webhook/Telegram channel; null for inbound user events and for events you only polled).
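Because every row in a turn repeats the same settled total, summing usage_credits across rows would overcount. The sketch below groups trace rows by run_ref and takes the cost once per turn. It assumes the rows have already been decoded into a list of dicts with the field names above; the response envelope is not shown here and may differ.

```python
from collections import defaultdict


def turns_from_trace(rows: list[dict]) -> dict[str, dict]:
    grouped: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        grouped[row["run_ref"]].append(row)

    turns = {}
    for run_ref, group in grouped.items():
        turns[run_ref] = {
            # Rows sharing a run_ref report the same turn-level total,
            # so read it once instead of summing across rows.
            "usage_credits": group[0]["usage_credits"],
            "texts": [r["text"] for r in group if r.get("text")],
            # delivery_status is null for inbound and polled-only events.
            "delivery": [r["delivery_status"] for r in group if r.get("delivery_status")],
        }
    return turns
```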
Per-turn runs — GET /v1/sessions/{id}/runs
An aggregate-per-turn view: for each processed turn (keyed by trace_id, which equals the
run_ref from the send response), the model-call count, settled cost, total latency, and an
error flag — so you can self-diagnose “why is this turn slow / what did it cost / did it
error” without any internal prompt, role, or world-model detail.
status is ok, or error if any model call in the turn failed. Use total_latency_ms and
model_calls to reason about a slow turn; use cost_credits for the per-turn settled spend.
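A simple triage pass over the runs view might look like the sketch below. The 5000 ms threshold is an arbitrary illustration value, and the input is assumed to be the decoded list of per-turn rows with the field names above.

```python
def flag_turns(runs: list[dict], slow_ms: int = 5000) -> list[dict]:
    """Return the turns that errored or exceeded the latency threshold."""
    flagged = []
    for run in runs:
        reasons = []
        if run["status"] == "error":
            reasons.append("error")
        if run["total_latency_ms"] > slow_ms:
            reasons.append("slow")
        if reasons:
            flagged.append({
                "trace_id": run["trace_id"],  # same value as the send response's run_ref
                "reasons": reasons,
                "model_calls": run["model_calls"],
                "cost_credits": run["cost_credits"],
            })
    return flagged
```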
Delivery status — GET /v1/sessions/{id}/delivery
Per-event push-delivery attempts for the session’s channels, with attribution (which bot
delivered which character’s reply to whom): status (delivered / failed / dead_letter),
attempt_count, response_code, last_error, next_attempt_at, plus channel_id,
channel_type, bot_username, character_slug, and provider_chat_id.
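The delivery rows can be summarized to separate attempts that are still retrying from those that have been given up on. This sketch reads a failed row with a non-null next_attempt_at as "retry pending", which is an interpretation of the fields above rather than documented behavior.

```python
from collections import Counter


def triage_deliveries(rows: list[dict]) -> dict:
    report = {
        "counts": Counter(r["status"] for r in rows),
        "dead": [],      # dead_letter rows: these need manual follow-up
        "retrying": [],  # failed rows with another attempt scheduled (assumed meaning)
    }
    for r in rows:
        if r["status"] == "dead_letter":
            report["dead"].append(r)
        elif r["status"] == "failed" and r.get("next_attempt_at"):
            report["retrying"].append(r)
    return report
```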
What you can debug vs what is redacted
The observability surface is read-only and redacted by default. It exposes the product-level picture — what happened, what it cost, whether it delivered — and never the engine’s internal reasoning. Concretely:
This is the OS-D08 privacy boundary: a developer sees enough to test, trace, and improve character behavior, but the raw evolving world-model stays internal.
The one opt-in relaxation: a relationship meter. By default GET /v1/sessions/{id}/state
returns only lifecycle + aggregates with "redacted": true. If you opt your app in via
PATCH /v1/apps/{app_id} {expose_relational_numerics: true}, the session state additionally
includes a live per-dimension relational meter — the final 0..1 value per dimension plus
its label (a “relationship meter” / “mastery meter”). Even then, only the final values are
exposed: still no patch history, no critic reasoning, no signal events.
Support escalation — what the LoreOS team can see
Every turn carries a run_ref (its trace id), which also appears as trace_id in /runs and
on each /trace row. When you open a support escalation, include the run_ref for the
turn in question. Internally that trace id ties together the full server-side detail for that
turn — the model calls, the critic stack, the world-model and emission decisions — which the
LoreOS team can inspect to diagnose your case. That depth is not projected onto the /v1
surface (it stays behind the OS-D08 boundary); the run_ref is the handle that lets us look
it up for you.
A note on PII and logging
- Error reasons are redacted enums, not raw errors. A message.failed event reports reason: "generation_failed" | "timed_out" and never the internal exception text. Raw internal errors stay server-side.
- Secrets are never echoed. Delivery secrets (bot tokens, webhook/HMAC secrets) are stored in a secrets vault; API responses and logs hold only a reference and a fingerprint, never the raw secret. Do not put secrets in fields that round-trip (labels, metadata).
- You control end-user identity. external_user_ref is your identifier for an end-user; choose an opaque id rather than embedding raw PII (email, phone) in it, since it appears in observability reads. To remove a user’s data, use the external-user delete endpoint.
- Message text appears in the event log and trace (that is the conversation itself); treat those reads as containing user content and handle them under your own privacy policy.
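One way to get an opaque external_user_ref is to derive it from your internal user id with a keyed hash, so the same user always maps to the same ref and nothing readable leaks into observability reads. This is only a sketch: the u_ prefix, the 32-character truncation, and the app_key parameter are illustration choices, not LoreOS requirements.

```python
import hashlib
import hmac


def opaque_user_ref(app_key: bytes, internal_user_id: str) -> str:
    # Keyed hash: stable for the same user, not reversible without app_key.
    digest = hmac.new(app_key, internal_user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "u_" + digest[:32]
```

Keep app_key on your side only, and keep your own mapping if you need to go from a ref back to the user, for example when calling the external-user delete endpoint.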
Ship characters with evidence
Before launch, inspect readiness, runtime preview, image probes, usage caps, and eval runs. LoreOS is built for teams that need character behavior they can test, trace, and improve instead of guessing from a single generated reply.