Event lifecycle and reply timing

The exact ordered sequence an adapter sees per message — send, run.status, the reply, settled cost — plus how to handle latency, failures, and process death.

LoreOS replies are asynchronous. POST /messages does not return the character’s reply — it accepts the user’s message, returns a cursor, and the reply lands later on the session event log. This is not request/response; it is messenger-style. This page is the precise contract for the sequence an adapter sees, the timing to design for, and the failure and recovery behavior.

If you are wiring a bot or chat UI on top of LoreOS, read this before you build your polling or webhook loop. For the transport options (poll vs SSE vs webhook) see Delivery, metering, and observability.

The ordered sequence

One user message produces this ordered sequence of events on the session log. Each event carries a monotonic cursor; you advance your poll cursor as you consume them.

adapter                                LoreOS
───────                                ──────
   │  POST /v1/sessions/{id}/messages  {"text": "..."}
   ├──────────────────────────────────▶│
   │                                   │  · record the user turn
   │                                   │  · write durability job (atomic)
   │  200 {schema_version, data:{      │  · kick off the reply (async)
   │    accepted, cursor: N,           │
   │    sent_turn_index, run_ref,      │
   │    reply_mode}, next_actions}     │
   │◀──────────────────────────────────┤
   │                                   │  (the log now has, in order:)
   │                                   │  cursor N    message.created role=user  ← your message, echoed
   │                                   │  cursor N+1  run.status role=character {status:"generating", run_ref}
   │                                   │              └─ emitted synchronously on accept
   │  GET /v1/sessions/{id}/events?since=N
   ├──────────────────────────────────▶│
   │  data:{ events:[ run.status, ... ], next_cursor }
   │◀──────────────────────────────────┤  (render a typing indicator off run.status)
   │                                   │
   │      ··· seconds-to-tens-of-seconds pass; the engine runs ···
   │                                   │
   │                                   │  cursor N+2  message.created role=character {text, bubbles:[...], turn_index}
   │                                   │  (and/or)    image.ready role=character {image_url, servable, ...}
   │  GET /v1/sessions/{id}/events?since=N+1
   ├──────────────────────────────────▶│
   │  data:{ events:[ message.created(character), ... ], next_cursor }
   │◀──────────────────────────────────┤  (clear typing; render the reply bubbles)
   │
   │  (optional) GET /v1/sessions/{id}/trace  ← settled per-turn cost + delivery, keyed by run_ref

In words:

  1. You send. POST /v1/sessions/{id}/messages with {text} (optionally reply_mode).
  2. You get an accept envelope, not the reply. It tells you the cursor your message landed at and the run_ref that correlates this whole turn.
  3. message.created (role user) appears on the log at that cursor — your own message, echoed back as an event so the log is the single source of truth.
  4. run.status {status:"generating"} (role character) appears next. It is emitted synchronously, the instant the send is accepted — render a typing indicator off it.
  5. message.created (role character) appears when the reply is ready, seconds to tens of seconds later, carrying the reply text and its ordered bubbles. Clear the typing indicator on this event.
  6. (Optional) settle the cost via GET /v1/sessions/{id}/trace or /runs, filtered by the run_ref you got in step 2.

What the send response actually contains

POST /v1/sessions/{id}/messages returns the standard versioned envelope. The data object:

{
  "schema_version": "v0",
  "data": {
    "accepted": true,
    "cursor": 41,
    "sent_turn_index": 12,
    "run_ref": "0f4c1b9e-...",
    "reply_mode": "fast"
  },
  "next_actions": [
    { "command": "GET /v1/sessions/{id}/events?since=41", "description": "Poll for the character's reply" },
    { "command": "GET /v1/sessions/{id}/trace", "description": "Per-turn settled cost + latency + delivery (filter by run_ref)" }
  ]
}

| Field | Meaning |
| --- | --- |
| accepted | true once the user turn is recorded and the reply is kicked off. |
| cursor | The cursor of your (role:user) message.created event. Start polling with since=cursor. |
| sent_turn_index | The user turn’s index in the conversation. |
| run_ref | The turn’s trace id. The same value rides on the run.status payload, on every /trace row for this turn, and as trace_id in /runs. Use it to attach this turn’s settled cost/latency/delivery to your UI. |
| reply_mode | The effective mode for this turn (fast by default, or deep if you asked). |
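
As an illustration of the fields above, here is a minimal Python sketch that pulls the polling cursor and run_ref out of an accept envelope. The SendAccept type, the parse_accept helper, and the session id sess_123 are hypothetical names for this example, not part of the LoreOS API. Raising when accepted is not true is a defensive assumption, since this page does not describe what a non-accepted response looks like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendAccept:
    """The fields an adapter needs to keep from the accept envelope."""
    cursor: int           # cursor of your own (role:user) message.created event
    sent_turn_index: int  # index of the user turn in the conversation
    run_ref: str          # correlates run.status, /trace rows, and /runs
    reply_mode: str       # effective mode for this turn: "fast" or "deep"


def parse_accept(envelope: dict) -> SendAccept:
    """Extract the polling cursor and run_ref from a send response."""
    data = envelope["data"]
    if not data.get("accepted"):
        # Assumption: treat anything other than accepted=true as an error.
        raise ValueError("send was not accepted")
    return SendAccept(
        cursor=data["cursor"],
        sent_turn_index=data["sent_turn_index"],
        run_ref=data["run_ref"],
        reply_mode=data["reply_mode"],
    )


example = {
    "schema_version": "v0",
    "data": {"accepted": True, "cursor": 41, "sent_turn_index": 12,
             "run_ref": "0f4c1b9e-...", "reply_mode": "fast"},
    "next_actions": [],
}
accept = parse_accept(example)
# The first poll starts from the cursor of your own message.
first_poll = f"/v1/sessions/sess_123/events?since={accept.cursor}"
```

Keeping run_ref next to the cursor means the adapter can later match /trace rows to this turn without re-reading the send response.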

Retry-safe sends. Pass Idempotency-Key: <unique-per-message> on POST /messages (common on serverless/Vercel, where a function can be retried). A same-key re-call returns the original cursor and sent_turn_index with idempotent_replay: true instead of creating a duplicate turn and a duplicate reply.
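
A retry-safe send can be sketched as follows. This is a minimal example under stated assumptions: build_send_request is a hypothetical helper that only describes the request (it does not send it), https://loreos.example is a placeholder base URL, the Content-Type header is assumed, and authentication is omitted because this page does not cover it. The point is that the key is generated once per user message and reused unchanged on every retry of that message.

```python
import json
import uuid
from typing import Optional


def build_send_request(base_url: str, session_id: str, text: str,
                       idempotency_key: str,
                       reply_mode: Optional[str] = None) -> dict:
    """Describe a POST /messages call; the caller sends it with any HTTP client."""
    body = {"text": text}
    if reply_mode is not None:
        body["reply_mode"] = reply_mode  # omit to get the default ("fast")
    return {
        "method": "POST",
        "url": f"{base_url}/v1/sessions/{session_id}/messages",
        "headers": {
            "Content-Type": "application/json",  # assumption, not from this page
            "Idempotency-Key": idempotency_key,
        },
        "body": json.dumps(body),
    }


# Generate the key once, store it with the outgoing message, reuse it on retry.
key = str(uuid.uuid4())
first = build_send_request("https://loreos.example", "sess_123", "hi", key)
retry = build_send_request("https://loreos.example", "sess_123", "hi", key)
```

A user-initiated retry after a failed or silent turn is a different case: that is a new message, so it gets a new key.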

The events, field by field

Each entry in GET /v1/sessions/{id}/events is one log event:

{
  "id": "…",
  "schema_version": "v0",
  "cursor": 42,
  "session_id": "…",
  "type": "run.status",
  "event_type": "run.status",
  "role": "character",
  "trace_id": "0f4c1b9e-...",
  "created_at": "2026-06-05T09:14:02.118Z",
  "payload": { "status": "generating", "run_ref": "0f4c1b9e-..." },
  "source_ref": { … }
}

type and event_type are the same string (both are present for client convenience). The event types you will see for the send flow:

| type | role | When | payload highlights |
| --- | --- | --- | --- |
| message.created | user | echoes your sent message, at the cursor the send returned | {text, bubbles:[text]} |
| run.status | character | synchronously on send-accept (the /v1 send path) | {status:"generating", run_ref} |
| message.created | character | the reply is ready | {text, bubbles:[…], turn_index} |
| image.ready | character | a selfie/image was delivered for the turn | {image_url, servable, image_request_id, image_kind, visible_description, caption, turn_id} |
| character.initiated | character | the character messaged first: the authored greeting on session create, or a proactive turn | {bubbles:[…], …} |
| session.exited | system | the character left the chat (it will not reply again in this session) | {reason_code} |
| message.failed | system | the reply pipeline errored for this turn | {turn_index, reason, recoverable} |

payload.bubbles is an ordered list. A character reply is frequently multiple bubbles, the way a person sends two or three short messages in a row — render them in order as separate bubbles, not as one concatenated string.
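
A small Python sketch of that rendering rule: bubbles_of is a hypothetical helper that returns the bubbles of an event in order, one string per chat bubble. Falling back to the single text field when bubbles is missing or empty is an assumption made for robustness, not documented behavior.

```python
from typing import List


def bubbles_of(event: dict) -> List[str]:
    """Return the ordered bubbles to render for a message event."""
    payload = event.get("payload", {})
    bubbles = payload.get("bubbles")
    if bubbles:
        return list(bubbles)  # keep the order; do not join into one string
    # Assumption: fall back to the full text as a single bubble.
    text = payload.get("text")
    return [text] if text else []


reply = {
    "type": "message.created",
    "role": "character",
    "payload": {"text": "hey you there?", "bubbles": ["hey", "you there?"]},
}
for bubble in bubbles_of(reply):
    print(bubble)  # each bubble is rendered as its own message
```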

Does run.status always arrive?

On the /v1 send path, yes. run.status {status:"generating"} is written to the log synchronously, in the same transaction that accepts the message — before POST /messages returns. So for any send you make through POST /v1/sessions/{id}/messages, the run.status event is already on the log by the time you start polling.

It is an additive event type. Build your client to ignore unknown event types (so the contract can grow) and to act only on the types you handle — typically message.created and image.ready. run.status is a progress hint, not a delivery: it is the signal you turn a typing indicator on with.

Managed Telegram differs. If you let LoreOS deliver to a Telegram bot via the managed channel, the inbound-Telegram path goes straight from the user’s message.created to the reply — it does not emit a run.status event (Telegram has its own native “typing…” chat action, which the managed path drives directly). run.status is specific to the /v1 event-log send path. Do not assume run.status on the managed-Telegram log.

When does message.failed arrive, and is it retryable?

message.failed (role system) is emitted when the reply pipeline itself errors for a turn — so your poll/SSE loop sees a terminal outcome instead of waiting forever. Its payload:

{ "turn_index": 12, "reason": "generation_failed", "recoverable": true }
  • reason is a redacted enum: generation_failed or timed_out. It is never the raw internal error (that stays server-side; see the PII note in observability).
  • recoverable is true: the user turn is recorded, so you can resend the message (a new POST /messages) to try again. Treat message.failed as “this attempt failed, you may retry,” and surface a retry affordance in your UI.

message.failed is for genuine pipeline errors, not for process death. A crash/deploy that kills the API process mid-reply does not emit message.failed — that case is handled by the durability safety net below (the reply still lands). message.failed means the pipeline ran and failed; the durability net means the pipeline never finished and gets re-run.

Withheld / no-reply turns

A reply is not guaranteed for every user message. The character may deliberately withhold a reply (an emission decision — for example it is mid-task, or chose silence), or it may exit the session. So:

  • A run.status for a turn is not a promise that a message.created (character) follows.
  • If the character withheld, no character message.created appears for that turn — and there is no message.failed either (nothing failed; it was a deliberate decision). Your typing indicator should time out on its own (see latency UX below), not hang forever.
  • If the character exited, you get a session.exited event. After that, sends to the session return 409 session_exited; start a new session to continue.

Design your client so a turn that produces no character reply is a normal, handled outcome — clear the typing indicator on a timeout and move on.

How far to advance the polling cursor

Always advance since to the maximum cursor you have seen across all event types — not just the cursor of the last message you rendered. The events endpoint returns next_cursor (the cursor of the last event in the page, or your since if the page was empty); poll again with since=next_cursor. Because you advance past run.status, image.ready, and every other event too, you never re-read events you have already consumed, and you never skip one.

The cursor is also your dedupe key for at-least-once delivery: if you use the webhook projection (which is at-least-once with retry), dedupe incoming events by cursor (or by the x-auto-dating-event-id header) so a redelivered event is not rendered twice.
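
The cursor rules above can be sketched as a small tracker. CursorTracker is a hypothetical class for illustration. It keeps the highest cursor seen (the value to pass as since on the next poll) and a set of cursors already handed to the UI, so a redelivered webhook event or an overlapping page is dropped. The seen set grows without bound in this sketch; a real adapter would likely prune or persist it.

```python
from typing import Iterable, List, Set


class CursorTracker:
    """Tracks the poll cursor and dedupes events by cursor."""

    def __init__(self, start: int) -> None:
        self._floor = start          # events at or below this were already handled
        self.since = start           # pass this as ?since= on the next poll
        self._seen: Set[int] = set() # cursors already handed to the caller

    def consume(self, events: Iterable[dict]) -> List[dict]:
        """Return only events not seen before, advancing since past all of them."""
        fresh = []
        for event in events:
            cursor = event["cursor"]
            if cursor <= self._floor or cursor in self._seen:
                continue  # redelivery or overlap: do not render twice
            self._seen.add(cursor)
            fresh.append(event)
            # Advance across every event type, not just rendered messages.
            self.since = max(self.since, cursor)
        return fresh


tracker = CursorTracker(start=41)  # 41 = cursor returned by the send
page_one = tracker.consume([
    {"cursor": 42, "type": "run.status"},
    {"cursor": 43, "type": "message.created"},
])
page_two = tracker.consume([
    {"cursor": 43, "type": "message.created"},  # redelivered
    {"cursor": 44, "type": "image.ready"},
])
```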

The durability safety net (the reply lands even if the process dies)

When you send a message, LoreOS writes a small durability job in the same transaction that records your user turn and the message.created + run.status events. The reply then runs in-process. If the API process restarts, deploys, or crashes after accepting your message but before the reply lands, the in-process work dies — but the durability job does not. A background watchdog finds reply jobs that never reached a decision (no character turn followed, past a staleness threshold) and re-runs them on a durable worker.

What this means for you as an adapter:

  • A momentary API restart/deploy mid-reply does not lose the reply. It arrives on the event log a bit later, via recovery — your normal poll/SSE loop picks it up with no special handling.
  • The recovery is designed to land before the SSE connection window closes, so even a streaming client generally sees the recovered reply on the same connection.
  • You do not need to implement your own “did the reply get dropped?” reconciliation for process death. Keep your message-send retries idempotent (the Idempotency-Key header) and let the event log be your source of truth.

This safety net covers process death (the pipeline never finished). It is distinct from message.failed, which covers a pipeline that ran and errored. Together they mean a turn you sent always reaches a terminal outcome: a character reply on the log (possibly recovered), a session.exited event, a message.failed you can retry, or a deliberate withhold, which leaves no event and is handled by your own timeout.

Latency: what to design for

Because replies are asynchronous and go through the full character engine (grounding, safety, knowledge, emission, and — in deep mode — a voice-quality critic and rewrite), a reply takes seconds to tens of seconds, not milliseconds. This is the central UX fact to design around: it is messenger latency, not API latency.

These are early, single-sample, pre-optimization numbers — not authoritative percentiles. An early end-to-end measurement saw a fast reply ≈ 38s and a deep reply ≈ 55s of total reply latency on a single sample, before latency tuning. Treat them as rough order-of-magnitude only; formal p50/p95 latency figures are pending. Do not hardcode them as SLAs. Design your timeouts generously and let the event log — not a stopwatch — be the source of truth for whether a reply arrived.

reply_mode: fast vs deep

POST /messages takes an optional reply_mode:

| reply_mode | Default? | What it does | Trade-off |
| --- | --- | --- | --- |
| fast | yes (when omitted) | Messenger-grade latency. Skips only the advisory voice-quality critic. Grounding, safety, knowledge, and emission gates all stay on, so factual accuracy is unchanged. This matches the managed-Telegram path. | Lower latency, slightly less voice polish. |
| deep | no | Opts into the full critic stack: adds the quality critic and its one-shot voice rewrite. | Maximum voice polish, noticeably slower (the rewrite adds time). |

Either way, the deeper world-model / relational update runs asynchronously after the reply is sent, so it never blocks the reply you show the user. Default to fast for chat; reach for deep when voice quality matters more than latency for a given turn.

A robust adapter treats the wait as a first-class state:

  1. Show typing on run.status. The moment you read the run.status {status:"generating"} event for a turn, render a typing indicator (or, on managed Telegram, LoreOS drives the native typing action for you). On the /v1 path that event is already on the log when your first poll returns.
  2. Soft-warn around ~30s. If no character message.created (or image.ready) has arrived after roughly 30 seconds, keep the typing indicator but consider a soft, in-UI hint that the reply is taking a moment. Do not error out; the reply is very likely still coming.
  3. Offer a retry around 60–120s. If the turn still has no character reply after ~60–120 seconds, offer the user a retry. The original turn is recorded, so a retry is a fresh POST /messages (use a new Idempotency-Key). Importantly: even if you offer a retry, keep consuming the event log. A delayed or recovered reply for the original turn can still arrive, and you should render it if it does.
  4. Clear typing on the reply (or on a withhold timeout). Clear the typing indicator when the character message.created arrives. If the character deliberately withheld (no reply and no message.failed), clear it on your own timeout so the UI does not hang.
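
The wait states above can be sketched as a pure function of elapsed time. The state names and the wait_state helper are hypothetical. The thresholds come from the rough guidance on this page, and using 60s as the retry point and 120s as the withhold timeout is this sketch's own reading of the 60–120s range, not a documented value.

```python
SOFT_WARN_AFTER_S = 30.0    # "soft-warn around ~30s"
OFFER_RETRY_AFTER_S = 60.0  # assumption: lower bound of the 60-120s range
GIVE_UP_AFTER_S = 120.0     # assumption: upper bound, used as the withhold timeout


def wait_state(elapsed_s: float, reply_arrived: bool = False) -> str:
    """Map time since run.status to a UI state for the pending turn."""
    if reply_arrived:
        return "idle"         # clear typing and render the reply
    if elapsed_s >= GIVE_UP_AFTER_S:
        return "timed_out"    # clear typing; keep reading the event log
    if elapsed_s >= OFFER_RETRY_AFTER_S:
        return "offer_retry"  # typing stays on, retry affordance shown
    if elapsed_s >= SOFT_WARN_AFTER_S:
        return "typing_slow"  # typing plus a soft "taking a moment" hint
    return "typing"


states = [wait_state(t) for t in (5, 45, 75, 130)]
```

Because the function only decides what to display, the adapter keeps consuming the event log in every state, including after timed_out.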

Streaming partial bubbles is on the roadmap, not yet shipped. Today a character reply appears as a single message.created event with the full bubbles list once it is ready — there is no token-by-token partial stream. The run.status event is the “it’s coming” signal; render a typing indicator off it rather than waiting for partial text. (The SSE transport streams events, but each message.created is still a complete reply, not a partial one.)

Putting it together

The minimal correct loop, regardless of transport:

  1. POST /messages → keep cursor and run_ref.
  2. Poll GET /events?since=<max cursor seen> (or open the SSE stream / receive webhooks).
  3. On run.status → show typing. On character message.created → render bubbles, clear typing. On image.ready (with servable:true) → render image_url. On session.exited → end the session. On message.failed → offer retry.
  4. Advance your cursor to the largest cursor you have seen, every time.
  5. Don’t block on a stopwatch — the event log is the truth, the durability net guarantees the reply lands, and a withheld turn is a normal outcome you time out gracefully.
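
The loop above can be sketched as a transport-independent reducer: whatever delivers the events (poll, SSE, or webhook), each one is applied to a view state. ChatView and apply_event are hypothetical names. Ignoring the echoed user message (on the assumption that the adapter already rendered it locally) and clearing the typing indicator on session.exited and message.failed are choices made in this sketch, not requirements stated on this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChatView:
    typing: bool = False
    bubbles: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)
    ended: bool = False
    retry_offered: bool = False
    since: int = 0  # largest cursor seen; use as ?since= on the next poll


def apply_event(view: ChatView, event: dict) -> ChatView:
    """Apply one log event to the view; unknown types are ignored."""
    kind = event["type"]
    role = event.get("role")
    payload = event.get("payload", {})
    if kind == "run.status" and payload.get("status") == "generating":
        view.typing = True
    elif kind == "message.created" and role == "character":
        view.bubbles.extend(payload.get("bubbles", []))
        view.typing = False
    elif kind == "image.ready" and payload.get("servable"):
        view.images.append(payload["image_url"])
    elif kind == "session.exited":
        view.ended = True
        view.typing = False
    elif kind == "message.failed":
        view.retry_offered = True
        view.typing = False
    # Every other type, including the user echo, falls through untouched.
    view.since = max(view.since, event["cursor"])  # advance on every event
    return view


view = ChatView(since=41)
apply_event(view, {"cursor": 42, "type": "run.status", "role": "character",
                   "payload": {"status": "generating", "run_ref": "r"}})
apply_event(view, {"cursor": 43, "type": "some.future.type", "payload": {}})
apply_event(view, {"cursor": 44, "type": "message.created", "role": "character",
                   "payload": {"text": "a b", "bubbles": ["a", "b"]}})
```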

For the per-turn cost and delivery view keyed by run_ref, and exactly what you can and cannot inspect, see Delivery, metering, and observability.