Delivery, metering, and observability

LoreOS runs character work asynchronously. That means delivery, spend control, and debugging are first-class parts of the platform.

Delivery

You can consume replies directly from session events, or attach delivery channels. Telegram has two supported paths:

BYO existing bot: connect a BotFather token with POST /v1/apps/{app_id}/delivery-channels/telegram, set telegram_role="character", and bind sessions with POST /v1/sessions/{session_id}/telegram/bind-token.
Official Managed Bots: connect a manager bot with telegram_role="manager", then call POST /v1/sessions/{session_id}/telegram/managed-child-bot to create or reuse one Telegram contact for that user-character relationship.

Use the Managed Bots path when the Telegram notification should come from the character’s own contact and each relationship should keep its own Telegram chat history.

There is one cursored event log per session, projected three ways:

Polling — GET /v1/sessions/{id}/events?since=<cursor>, the universal fallback.
SSE stream — GET /v1/sessions/{id}/events/stream?since=<cursor>; a connection is capped at 5 minutes, reconnect with the last cursor.
Signed webhook push — register with POST /v1/sessions/{id}/channels {url, secret}. LoreOS POSTs each event to your URL, at-least-once with retry/backoff. Dedupe on x-auto-dating-event-id (or on the event cursor) so a redelivered event is not processed twice. Verify the signature header before trusting the body (below).
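The polling projection can be sketched as a loop that always resumes from the last cursor it saw. This is a minimal sketch, not an official client: `fetch_events` is a hypothetical callable you supply (it would wrap GET /v1/sessions/{id}/events?since=<cursor>), and the assumption that it returns a list of events plus the next cursor is illustrative only; check the real response shape before relying on it.

```python
def drain_events(fetch_events, cursor="0"):
    """Read pages until one comes back empty; return (events, last_cursor).

    fetch_events is a hypothetical callable: cursor -> (events, next_cursor).
    """
    collected = []
    while True:
        events, next_cursor = fetch_events(cursor)
        if not events:
            # Nothing new: keep the cursor so the next poll resumes here.
            return collected, cursor
        collected.extend(events)
        cursor = next_cursor


# Usage with a fake in-memory log standing in for the HTTP call.
fake_pages = {
    "0": ([{"seq": 1}, {"seq": 2}], "2"),
    "2": ([{"seq": 3}], "3"),
    "3": ([], "3"),
}
events, last_cursor = drain_events(lambda c: fake_pages[c])
```

Persist the returned cursor after you have processed the events, so a restart re-reads at most the unprocessed tail instead of skipping it.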

Webhook signature. Each push carries:

x-auto-dating-signature: sha256=<hex>

where <hex> = HMAC-SHA256(your-channel-secret, raw-request-body-bytes) — the secret you registered, keyed over the exact raw bytes of the request body (do not re-serialize the parsed JSON; the byte ordering must match). Verify by recomputing the HMAC over the raw body and doing a constant-time compare against the header value; reject on mismatch. Then dedupe on x-auto-dating-event-id (every push also carries the event id) so a redelivered event is processed once. The body is {"event": {…}} — the same event shape the events endpoint returns.
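The verification and dedupe steps above can be sketched with the Python standard library. This is a sketch under stated assumptions: the function names and the in-memory `seen_ids` set are hypothetical, header names are assumed to be lowercased by your web framework, and a real deployment would keep seen event ids in durable storage.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare to the header."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):].lower()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))


def handle_push(secret: str, raw_body: bytes, headers: dict, seen_ids: set) -> str:
    """Return 'rejected', 'duplicate', or 'accepted' for one webhook push."""
    signature = headers.get("x-auto-dating-signature", "")
    if not verify_signature(secret, raw_body, signature):
        return "rejected"
    event_id = headers.get("x-auto-dating-event-id")
    if event_id in seen_ids:
        return "duplicate"  # redelivery: acknowledge, but do not reprocess
    seen_ids.add(event_id)
    return "accepted"
```

Note that the HMAC is computed over `raw_body` exactly as received; parsing the JSON first and re-serializing it would change the bytes and make valid pushes fail verification.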

For the exact ordered sequence an adapter sees per message — and how to handle typing indicators, latency, failures, and process death — see Event lifecycle and reply timing.

Proactive delivery contract

Proactive character messages arrive as character.initiated events on the same session event log. Treat them like character message.created events for rendering; the only difference is that no user turn immediately preceded them.
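In a client, that rule reduces to one predicate shared by both event types. The sketch below assumes each event is a dict with top-level `type` and `role` keys, as the trace rows show; the function name is hypothetical, and the real event payload may nest these fields differently.

```python
def is_character_bubble(event: dict) -> bool:
    """True for events that should render as a character message in the UI."""
    if event.get("type") == "character.initiated":
        return True  # proactive message: no preceding user turn
    return event.get("type") == "message.created" and event.get("role") == "character"
```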

Production semantics:

Paused or exited sessions are skipped before spend.
Frequency and quiet-hour behavior comes from the character’s behavioral thresholds plus GET|PATCH /v1/sessions/{session_id}/proactive-preferences.
App-wide operations policy is available at GET|PATCH /v1/apps/{app_id}/proactive-policy for enabled, quiet_hours, max_per_day, allowed_reasons, channel_policy, and cost_cap_usd_per_day.
Delivered proactive messages are metered as user_proactive because they are visible user-facing work for one end user.
Proactive candidate sweeps, daily/offscreen planning, and ambient-life work are metered as character_infrastructure.
Use webhook or Telegram/managed delivery for push. Polling and SSE still read the same log but are only live while your app is connected.

Webhook push is signed and at-least-once. Verify the x-auto-dating-signature: sha256=<hex> header (the Webhook signature detail under Delivery above: HMAC-SHA256 of your channel secret over the raw body bytes, constant-time compared) and dedupe on x-auto-dating-event-id or the event cursor.

Webhook retry policy. Delivery is at-least-once with exponential backoff: the first retry waits 30s, each subsequent retry doubles the wait (60s, 120s, …) up to a cap of 1 hour, for a maximum of 8 attempts. After 8 failures the delivery status becomes dead_letter and LoreOS stops retrying that event. A dead-lettered event is not auto-redelivered, so once your endpoint is back up, re-read the log (polling or SSE) to catch up on what you missed. Inspect delivery health with:

GET /v1/sessions/{session_id}/delivery

It reports per-event status (delivered / failed / dead_letter), attempt_count, next_attempt_at, response_code, and last_error — use it for operational delivery health and support tooling, not as the gate for rendering a reply.
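To size alerting or outage tolerance, it can help to write the retry schedule out. The sketch below assumes the 8 attempts include the initial push, so there are 7 retries; if LoreOS counts retries separately, pass a different `max_attempts`. The function name and parameters are hypothetical.

```python
def retry_delays(max_attempts: int = 8, first_delay_s: int = 30, cap_s: int = 3600) -> list:
    """Seconds waited before each retry, doubling from first_delay_s up to cap_s."""
    delays = []
    delay = first_delay_s
    for _ in range(max_attempts - 1):  # assumption: attempt 1 is the initial push
        delays.append(min(delay, cap_s))
        delay *= 2
    return delays


schedule = retry_delays()
print(schedule)       # [30, 60, 120, 240, 480, 960, 1920]
print(sum(schedule))  # 3810
```

Under that reading, an endpoint that stays down for a little over an hour (3810 seconds of cumulative waiting) will see events reach dead_letter, and the 1-hour cap is never hit within 8 attempts.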

quiet_hours skips, it does not queue. A proactive send that would fire inside quiet_hours is skipped rather than held until the window ends, so it is never delivered late.
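If your own tooling needs to predict whether a send would fall inside the window, the check looks roughly like this. It is a sketch only: the shape of the quiet_hours value is not specified here, so the start/end `time` pair and the half-open interval are assumptions, and the function name is hypothetical.

```python
from datetime import time


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True if now is in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Wrapping window, e.g. 22:00 to 07:00.
    return now >= start or now < end
```

A send for which this returns True would be skipped, not rescheduled for when the window ends.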

Proactive cost is per end-user. A delivered proactive message is metered as user_proactive and attributed to the end user. All user_* (user-attributed) usage classes — including user_direct and user_proactive — draw from the same per-user budget, so proactive spend counts against the cap set with POST /v1/external-users/{external_user_ref}/budget-policy exactly like a normal reply. A proactive that would exceed a hard cap is rejected before any model spend.

Relationship score changes use the same transport. After an app opts into relationship numerics with PATCH /v1/apps/{app_id} {"expose_relational_numerics": true}, post-turn score commits emit relationship_score.changed on session_events. The payload contains final 0..1 values only; raw extractor deltas, patch history, signal events, critic reasoning, and prompt text are omitted.

Telegram Managed Bots

The official Managed Bots path is the recommended route for companion products that want one Telegram contact per user-character relationship.

Prepare your manager bot in BotFather:

@BotFather -> open BotFather Mini App -> select your manager bot -> Bot Settings
-> enable Bot Management Mode

Connect your manager bot:

POST /v1/apps/{app_id}/delivery-channels/telegram
{
  "bot_token": "123:manager-token",
  "telegram_role": "manager",
  "routing_mode": "developer_user_link"
}

The response includes management-mode evidence when Telegram exposes it:

{
  "management_mode_status": "enabled",
  "can_manage_bots": true,
  "management_mode_checked_at": "2026-06-19T09:00:00+00:00"
}

If management_mode_status is disabled, return to BotFather, enable Bot Management Mode, then call POST /v1/delivery-channels/{channel_id}/verify. If it is unknown, LoreOS could not confirm the field yet, so verify after enabling the setting.

Create or reuse the child contact for a session:

POST /v1/sessions/{session_id}/telegram/managed-child-bot

Open the returned data.telegram.deep_link from the end user’s Telegram account. If data.telegram.status is pending, Telegram will guide the user through creating the child bot. If it is linked, open the returned bot link directly.
After Telegram sends the managed-bot update to the manager webhook, LoreOS stores the child token in Vault, registers the child bot webhook, and binds it to the existing session. Messages and proactive sends then use that child bot.

Child bot tokens are never returned by the API, stored in channel config, or shown in traces.

Metering

Usage is metered by app and external user. LoreOS reserves usage before expensive work runs, so budget caps can stop a request before model calls or image work begin.

Usage & cost endpoints by credential

Which read you call depends on which credential you hold. The runtime key (ck_) is already bound to one app, so its reads are app-scoped and need no app_id. The account token (lat_) is workspace-scoped, so per-app account reads take an app_id path param.

Goal	Credential	Route
App usage, grouped	`ck_` runtime key	`GET /v1/usage?group_by=…` (app-scoped, no `app_id` needed)
Rate card	`ck_` runtime key	`GET /v1/rates`
App cost summary (pricing view)	`lat_` account token + `app_id`	`GET /v1/account/apps/{app_id}/cost-summary`
App usage (account view)	`lat_` account token + `app_id`	`GET /v1/account/apps/{app_id}/usage`
Cost across ALL apps	`lat_` account token	`GET /v1/account/cost-summary` (workspace-scoped)

Where app_id comes from. The account-plane lat_ routes need BOTH the account token and an app_id. You get that app_id from POST /v1/account/apps (the created app’s data.app_id) or, with a runtime key, from GET /v1/me (data.app_id — the runtime /v1/me, not /v1/account/me). The account-plane GET /v1/account/me returns workspace identity, not an app_id; list apps with GET /v1/account/workspace.

group_by dimensions for GET /v1/usage (default resource_type; an unknown value is a 422 invalid_group_by):

`group_by`	What it answers
`usage_class`	how cost splits across user COGS vs character infrastructure (see the table below)
`resource_type`	which resource kind (reply, image, etc.) drove spend
`model`	which model id the spend ran on
`provider`	which provider the spend ran on
`character`	which character accrued the cost; groups include `character_id`, `character_slug`, and display name
`session`	which session accrued the cost; groups include `session_id`, `character_id`, `character_slug`, `external_user_id`, and `external_user_ref`
`external_user`	which end user accrued the cost (by external user id)
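Because an unknown dimension is rejected with a 422, a client can validate the value before sending the request. A minimal sketch, assuming only the dimensions listed in the table above; the function and constant names are hypothetical.

```python
from urllib.parse import urlencode

GROUP_BY_DIMENSIONS = {
    "usage_class", "resource_type", "model", "provider",
    "character", "session", "external_user",
}


def usage_path(group_by: str = "resource_type") -> str:
    """Build the GET /v1/usage path, failing locally on an unknown dimension."""
    if group_by not in GROUP_BY_DIMENSIONS:
        raise ValueError(f"unknown group_by: {group_by!r}")
    return "/v1/usage?" + urlencode({"group_by": group_by})
```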

Use GET /v1/characters to map your app roster. Each character summary includes character_id, app_id, slug, and display_name, so a billing dashboard can join usage groups to your own character catalog without reverse-engineering slugs from sessions.

The summary reads:

GET /v1/account/apps/{app_id}/cost-summary with an account token for a pricing-oriented summary across end-user cost, character infrastructure cost, setup/import cost, active holds, and top users/characters;
GET /v1/account/cost-summary with an account token for the same projection rolled up across every app in the workspace;
external-user usage and budget endpoints to inspect or cap a single user.

Cost lifecycle. LoreOS reserves credits before the provider call — a send that would exceed a hard cap is rejected with 402 budget_exceeded before any model spend — then settles the actual usage after the turn completes (success or failure: you are charged what was actually spent). The usage_credits you read in a trace is the settled amount.
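The reserve-then-settle lifecycle can be illustrated with a small in-memory model. This is not LoreOS's implementation, only a sketch of the semantics described above: the class and method names are hypothetical, and `BudgetExceeded` stands in for a 402 budget_exceeded response.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Stands in for a 402 budget_exceeded rejection."""


@dataclass
class UserBudget:
    credit_limit: float
    used: float = 0.0
    reserved: float = 0.0

    @property
    def remaining(self) -> float:
        # Remaining budget after settled spend and in-flight holds.
        return self.credit_limit - self.used - self.reserved

    def reserve(self, estimate: float) -> None:
        """Place a hold before the provider call; reject if the cap would be exceeded."""
        if estimate > self.remaining:
            raise BudgetExceeded("hard cap would be exceeded; nothing was spent")
        self.reserved += estimate

    def settle(self, estimate: float, actual: float) -> None:
        """Release the hold and record what the turn actually cost."""
        self.reserved -= estimate
        self.used += actual


budget = UserBudget(credit_limit=25.0, used=4.25)
budget.reserve(0.5)
print(budget.remaining)  # 20.25
```

The figures mirror the external-user example on this page: a limit of 25.0 with 4.25 used and 0.5 reserved leaves 20.25 remaining.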

Cost classes. Credits are denominated in USD. The canonical ledger also classifies each event by usage_class, so your dashboard can separate user COGS from character operating cost:

`usage_class`	What it means
`user_direct`	visible work directly caused by an end-user action, such as a normal reply
`user_proactive`	a delivered character-initiated message for one end user
`user_maintenance`	session-scoped maintenance caused by user activity, including world-model, canon, relationship, user-intent, reflection, and life-arc work
`character_infrastructure`	scheduled or ambient character work, including Story Room daily runs, proactive candidate sweeps, offscreen scene realization, and ambient-life planning
`app_setup`	developer setup and migration work, such as transcript import extraction
`platform_internal_excluded`	internal platform work excluded from customer cost

Use GET /v1/usage?group_by=usage_class for the raw grouped ledger. Use GET /v1/account/apps/{app_id}/cost-summary when a console or coding agent needs the pricing view: user_attributed_cost_usd, character_infrastructure_cost_usd, app_setup_cost_usd, active_reservations_usd, top end users, top characters, and pricing_guidance.suggested_customer_cogs_usd.

The hosted developer console renders that account-plane projection in Cost overview, Per-user metering, and Cost, Usage And Caps, so developers can see user-driven cost and character infrastructure cost without building a dashboard first. Its External Users And Limits panel renders each managed end user’s spent cost, hard limit, reserved in-flight cost, remaining budget after holds, and last activity.

Where a failed or blocked image shows up. GET /v1/usage reports settled cost only. An image that was attempted but never generated (e.g. the image provider returns an error, or a 402 budget_exceeded blocked it before the call) costs $0, so it does not appear in /v1/usage — there was no spend. The attempt and its failure are visible elsewhere:

image-probe failure → GET /v1/characters/{slug}/image-probe/{probe_id}: status is "failed" with a message. (servable: true together with status: "failed" is a valid but confusing combination: the image was generated and stored, but the request was then rejected.)
inventory-generate failure → the per-item result in the inventory-generate report.
reply image failure → a message.failed event on the session log.

So /v1/usage answers “what did I spend”; the probe / inventory report answers “did this image attempt succeed, and if not, why”. If image generation fails with a provider billing error, the problem is LoreOS’s image-provider capacity rather than your account, so contact LoreOS. Registering an existing image via POST .../visual-assets/register-url needs no image generation and is never billing-blocked.

Observability

A character runtime has many moving parts: message runs, model calls, delivery attempts, image requests, Story Room state, proactive hooks, and scheduled jobs. The read-only, app-scoped observability endpoints let you answer “why did this character reply that way, get withheld, or cost that much” without stitching tables by hand. Every endpoint is scoped to your app — a session that is not yours returns 404 (never 403), so it looks absent.

For app-wide redacted runtime metrics, call:

GET /v1/apps/{app_id}/observability/metrics?window_hours=24

It returns model-call count, p50/p95/p99 latency, cost by provider/model class, failure classes, fallback rate, cache token share, and usage by class. It excludes prompts, provider payloads, raw world-model rows, and message text.

List end-users for a dashboard

GET /v1/external-users lists the end-users in the app attached to your runtime key. Use it for customer dashboards, support tools, and per-user limit screens. It is paginated with next_cursor and supports:

status=active|blocked|deleted
q=<text> to search external_user_ref and display_name
limit=1..200
cursor=<next_cursor>

By default, soft-deleted users are omitted. Pass status=deleted when you need to audit deleted records.

{
  "schema_version": "v0",
  "data": {
    "external_users": [
      {
        "external_user_ref": "u_8123",
        "display_name": "Riley",
        "metadata": { "plan": "plus" },
        "status": "active",
        "credit_limit": 25.0,
        "used_credits": 4.25,
        "metered_cost_usd": 4.25,
        "user_attributed_cost_usd": 4.25,
        "reserved_credits": 0.5,
        "remaining_credits": 20.25,
        "session_count": 3,
        "last_session_at": "2026-06-22T09:00:00Z",
        "last_activity_at": "2026-06-22T09:14:31Z",
        "created_at": "2026-06-20T09:00:00Z",
        "updated_at": "2026-06-22T09:14:31Z"
      }
    ],
    "next_cursor": "eyJleHRlcm5hbF91c2VyX3JlZiI6InVfODEyMyIsInVwZGF0ZWRfYXQiOiIuLi4ifQ",
    "has_more": true
  }
}
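Walking every page follows the usual cursor pattern: keep requesting until has_more is false, passing next_cursor back as cursor. In this sketch `fetch_page` is a hypothetical callable you supply; it takes the query parameters and returns the parsed JSON body shown above (for example by calling GET /v1/external-users with your runtime key).

```python
def list_all_external_users(fetch_page, status="active", limit=200):
    """Collect every external user across pages.

    fetch_page is a hypothetical callable: params dict -> parsed response body.
    """
    users = []
    cursor = None
    while True:
        params = {"status": status, "limit": limit}
        if cursor:
            params["cursor"] = cursor
        data = fetch_page(params)["data"]
        users.extend(data["external_users"])
        if not data.get("has_more"):
            return users
        cursor = data["next_cursor"]
```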

For one user’s conversation history, call GET /v1/external-users/{external_user_ref}/sessions, then read the event log for a specific session with GET /v1/sessions/{session_id}/events?since=0. The end-user list is a CRM and metering summary; it does not return raw transcript text.

Find sessions

GET /v1/sessions lists your app’s sessions, newest first, with optional filters character (slug), channel_id, and external_user_ref:

{
  "schema_version": "v0",
  "data": {
    "sessions": [
      {
        "session_id": "…",
        "character": "yura",
        "external_user_ref": "u_8123",
        "interaction_mode": "character_chat",
        "lifecycle_state": "active",
        "status": "open",
        "created_at": "2026-06-05T08:55:10.001Z",
        "last_event_at": "2026-06-05T09:14:31.880Z"
      }
    ]
  }
}

Per-event trace — `GET /v1/sessions/{id}/trace`

One read that stitches the event log to its settled cost and its push-delivery status, newest first. This is the “inbound → reply → cost → delivery” view for a single session. Each row carries run_ref (the turn’s trace id), so you can group rows by the run_ref the send response returned and attach that turn’s cost and delivery to your UI.

{
  "schema_version": "v0",
  "data": {
    "session_id": "…",
    "trace": [
      {
        "seq": 43,
        "event_id": "…",
        "run_ref": "0f4c1b9e-...",
        "type": "message.created",
        "role": "character",
        "created_at": "2026-06-05T09:14:31.880Z",
        "text": "I just got back from the flower market — kind of wiped, honestly.",
        "image_url": null,
        "delivery_status": "delivered",
        "delivery_attempts": 1,
        "delivery_error": null,
        "usage_credits": 0.034
      },
      {
        "seq": 42,
        "event_id": "…",
        "run_ref": "0f4c1b9e-...",
        "type": "run.status",
        "role": "character",
        "created_at": "2026-06-05T09:14:02.118Z",
        "text": null,
        "image_url": null,
        "delivery_status": null,
        "delivery_attempts": null,
        "delivery_error": null,
        "usage_credits": 0.034
      },
      {
        "seq": 41,
        "event_id": "…",
        "run_ref": "0f4c1b9e-...",
        "type": "message.created",
        "role": "user",
        "created_at": "2026-06-05T09:14:01.902Z",
        "text": "hey, what are you up to?",
        "image_url": null,
        "delivery_status": null,
        "delivery_attempts": null,
        "delivery_error": null,
        "usage_credits": 0.034
      }
    ]
  }
}

Notes on the fields:

text is the event’s payload.text, and image_url is the event’s payload.image_url (set on a servable image.ready). The trace surfaces these two flattened for convenience; the full payloads remain on the events endpoint.
usage_credits is the settled sum of metered usage for that turn’s run_ref. Rows that share a run_ref report the same turn-level settled total.
delivery_status / delivery_attempts / delivery_error reflect the latest push delivery attempt for that event (populated for events delivered to a webhook/Telegram channel; null for inbound user events and for events you only polled).

These delivery fields are not the chat rendering gate. If a character message.created or character.initiated event is present in the session log, the visible reply is ready to render even if push delivery bookkeeping is still null, pending, or settling. Use delivery_status: "delivered" only to answer whether LoreOS successfully pushed that event to a managed channel such as webhook or Telegram.
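One practical consequence of the usage_credits note: because every row in a turn repeats the same turn-level total, summing usage_credits across trace rows overcounts. A sketch of grouping by run_ref instead, assuming rows shaped like the example above; the function name is hypothetical.

```python
def cost_per_turn(trace_rows: list) -> dict:
    """Map run_ref -> settled turn cost, counting each turn once."""
    costs = {}
    for row in trace_rows:
        run_ref = row.get("run_ref")
        credits = row.get("usage_credits")
        if run_ref is None or credits is None:
            continue
        # Rows sharing a run_ref carry the same total, so overwrite rather than add.
        costs[run_ref] = credits
    return costs
```

For the three-row example above this yields a single entry of 0.034 for the turn, not 0.102.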

Telegram diagnostics - `GET /v1/observability/sessions/{id}/telegram`

For official Managed Bots, this endpoint gives one redacted support view for a single session:

child-contact request status from telegram_managed_bot_requests
active Telegram binding and child bot status
latest inbound Telegram session events
reply jobs created from those inbound events
managed-channel lifecycle markers
Telegram delivery attempts for outbound character events

Use it when a user can message the child bot but no character reply appears, or when a child contact is linked but outbound Telegram delivery looks stuck.

curl "https://api.loreos.app/v1/observability/sessions/$SESSION_ID/telegram?limit=20" \
  -H "Authorization: Bearer $LOREOS_KEY"

The response is app-scoped and redacted. It reports booleans such as provider_chat_present, telegram_owner_present, has_error, and provider_message_present, but it does not expose bot tokens, webhook secrets, raw Telegram user or chat ids, raw Telegram updates, user message text, raw provider errors, prompts, or provider payloads.

Per-turn runs — `GET /v1/sessions/{id}/runs`

An aggregate-per-turn view: for each processed turn (keyed by trace_id, which equals the run_ref from the send response), the model-call count, settled cost, total latency, an error flag, the per-turn cache-read token share, a provider fallback flag, and the public model classes used — so you can self-diagnose “why is this turn slow / what did it cost / did it error / did it hit cache / did it fall back” without any internal prompt, role, raw model id, or world-model detail.

{
  "schema_version": "v0",
  "data": {
    "runs": [
      {
        "trace_id": "0f4c1b9e-...",
        "started_at": "2026-06-05T09:14:02.000Z",
        "finished_at": "2026-06-05T09:14:31.700Z",
        "model_calls": 7,
        "cost_credits": 0.034,
        "total_latency_ms": 29680,
        "status": "ok",
        "cache_read_token_share": 0.41,
        "fallback": false,
        "model_classes": ["llm_deepseek"]
      }
    ]
  }
}

status is ok, or error if any model call in the turn failed. Use total_latency_ms and model_calls to reason about a slow turn; cost_credits for the per-turn settled spend. cache_read_token_share (0..1) is the fraction of this turn’s input tokens served from the prompt cache (higher = cheaper/faster). fallback is true if any call in the turn failed over to a backup provider. model_classes lists the public model classes used (e.g. llm_deepseek, image) — never the raw model id. (For window-wide percentiles, cache share, and fallback rate across all turns, use GET /v1/apps/{app_id}/observability/metrics.)
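A support tool might turn those fields into a short list of flags per turn. The thresholds below (20 seconds, 20% cache share) are arbitrary placeholders chosen for illustration, not LoreOS recommendations, and the function name is hypothetical.

```python
def run_flags(run: dict, slow_ms: int = 20000, low_cache_share: float = 0.2) -> list:
    """Summarize one /runs row as a list of human-readable flags."""
    flags = []
    if run["status"] == "error":
        flags.append("error")
    if run["total_latency_ms"] > slow_ms:
        flags.append("slow")
    if run["fallback"]:
        flags.append("fallback")
    if run["cache_read_token_share"] < low_cache_share:
        flags.append("low_cache")
    return flags
```

Applied to the example run above (29680 ms, status ok, no fallback, cache share 0.41), this returns only the slow flag.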

Delivery status — `GET /v1/sessions/{id}/delivery`

Per-event push-delivery attempts for the session’s channels, with attribution (which bot delivered which character’s reply to whom): status (delivered / failed / dead_letter), attempt_count, response_code, last_error, next_attempt_at, plus channel_id, channel_type, bot_username, character_slug, and provider_chat_id.

Use this endpoint for operational delivery health, retries, and support tooling. Do not poll it to decide when to show a reply in your UI; render from the events endpoint instead.

Safety controls

Your app owns its end-user policy. Drive LoreOS runtime controls without exposing internals:

Action	API	Effect
Pause / resume a session	`PATCH /v1/sessions/{id}/lifecycle {"action":"pause"|"resume"}`	Blocks new user messages + proactive work; `POST /messages` → `409 session_paused` until resumed
End one session	`POST /v1/sessions/{id}/block {"reason","report"}`	Exits ONE user×character session (`session.exited`); optional `report:true` also files a report
Report	`POST /v1/sessions/{id}/report {"category","reason"}`	Files a `safety.reported` event for your moderation workflow; no automatic enforcement
Forget one memory	`DELETE /v1/sessions/{id}/memories/{memory_id}`	Suppresses one memory card
Forget ALL memories	`DELETE /v1/sessions/{id}/memories`	Suppresses every visible memory card in the session
Block an end-user (account-level)	`POST /v1/external-users/{ref}/block` (+ `/unblock`)	Non-destructive, reversible; the end-user is rejected (`403 external_user_blocked`) from NEW sessions across the whole app
Block from one character	`POST /v1/characters/{slug}/block {"external_user_ref"}` (+ `/unblock`)	Non-destructive, reversible; rejects NEW sessions for that one (end-user × character) pair (`403 end_user_blocked_from_character`); other characters unaffected
Delete an end-user	`POST /v1/external-users/{ref}/delete`	Destructive soft-delete

Account-level and character-level block are reversible and gate new sessions; pause or end any existing sessions separately. None of these block/forget calls touch character canon or persona.
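On the client side, the rejection codes named on this page can be mapped to a next step. This sketch takes the HTTP status and the error code as separate arguments because the exact error body shape is not shown here; how you extract the code from the response is left as an assumption. The table contents paraphrase this page, and the names are hypothetical.

```python
REJECTIONS = {
    (402, "budget_exceeded"): "hard cap reached before any model spend; adjust the user's budget policy",
    (403, "external_user_blocked"): "user is blocked app-wide; unblock before opening new sessions",
    (403, "end_user_blocked_from_character"): "user is blocked from this character only; other characters are unaffected",
    (409, "session_paused"): "session is paused; resume it, then resend the message",
}


def explain_rejection(status_code: int, error_code: str) -> str:
    """Return a suggested next step for a known rejection, or a generic fallback."""
    return REJECTIONS.get(
        (status_code, error_code),
        "unrecognized rejection; inspect the response body",
    )
```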

What you can debug vs what is redacted

The observability surface is read-only and redacted by default. It exposes the product-level picture — what happened, what it cost, whether it delivered — and never the engine’s internal reasoning. Concretely:

You can read	You cannot read over `/v1`
The event log (messages, run.status, image.ready, exits, failures)	Raw prompts or model payloads for any call
Per-turn settled cost, latency, and model-call count (`/runs`)	Critic reasoning / quality-critic verdicts (the prose of why a reply was shaped a certain way)
Per-event delivery status + attribution (`/delivery`)	World-model patch history and signal events
The redacted session state (`/state`: lifecycle + aggregates)	Live relational / world-model numerics (relationship or mastery values) — unless your app opts in (below)
The redacted runtime preview (`/runtime-preview`: dimension names, asset labels, settings)	Story Room private plans, branch forecasts, and unaccepted candidates

This is the OS-D08 privacy boundary: a developer sees enough to test, trace, and improve character behavior, but the raw evolving world-model stays internal.

The one opt-in relaxation: a relationship meter. By default GET /v1/sessions/{id}/state returns only lifecycle + aggregates with "redacted": true. If you need numeric relationship state, call GET /v1/me to get your app_id, then opt your app in with PATCH /v1/apps/{app_id} {"expose_relational_numerics": true}. After that, read the dedicated GET /v1/sessions/{id}/relationship-scores endpoint, or read the same meter on GET /v1/sessions/{id}/state as relational_meter. The meter contains the final 0..1 value per dimension plus its label (a “relationship meter” / “mastery meter”). The dedicated score endpoint also returns score_contract, updated_at, and version so product logic can distinguish stale vs current values. Scores are not monotonic and may move down or decay. Even then, only the final values are exposed: still no patch history, no critic reasoning, no signal events.

Support escalation — what the LoreOS team can see

Every turn carries a run_ref (its trace id), which also appears as trace_id in /runs and on each /trace row. When you open a support escalation, include the run_ref for the turn in question. Internally that trace id ties together the full server-side detail for that turn — the model calls, the critic stack, the world-model and emission decisions — which the LoreOS team can inspect to diagnose your case. That depth is not projected onto the /v1 surface (it stays behind the OS-D08 boundary); the run_ref is the handle that lets us look it up for you.

A note on PII and logging

Error reasons are redacted enums, not raw errors. A message.failed event reports reason: "generation_failed" | "timed_out" and never the internal exception text. Raw internal errors stay server-side.
Secrets are never echoed. Delivery secrets (bot tokens, webhook/HMAC secrets) are stored in a secrets vault; API responses and logs hold only a reference and a fingerprint, never the raw secret. Do not put secrets in fields that round-trip (labels, metadata).
You control end-user identity. external_user_ref is your identifier for an end-user; choose an opaque id rather than embedding raw PII (email, phone) in it, since it appears in observability reads. To remove a user’s data, use the external-user delete endpoint.
Message text appears in the event log and trace (that is the conversation itself) — treat those reads as containing user content and handle them under your own privacy policy.

Ship characters with evidence

Before launch, inspect readiness, runtime preview, image probes, usage caps, and eval runs. LoreOS is built for teams that need character behavior they can test, trace, and improve instead of guessing from a single generated reply.