Event lifecycle and reply timing
The exact ordered sequence an adapter sees per message — send, run.status, the reply, settled cost — plus how to handle latency, failures, and process death.
LoreOS replies are asynchronous. POST /messages does not return the character’s
reply — it accepts the user’s message, returns a cursor, and the reply lands later on the
session event log. This is not request/response; it is messenger-style. This page is the
precise contract for the sequence an adapter sees, the timing to design for, and the failure
and recovery behavior.
If you are wiring a bot or chat UI on top of LoreOS, read this before you build your polling or webhook loop. For the transport options (poll vs SSE vs webhook) see Delivery, metering, and observability.
The ordered sequence
One user message produces this ordered sequence of events on the session log. Each event
carries a monotonic cursor; you advance your poll cursor as you consume them.
In words:
- You send.
POST /v1/sessions/{id}/messageswith{text}(optionallyreply_mode). - You get an accept envelope — not the reply. It tells you the cursor your message
landed at and the
run_refthat correlates this whole turn. message.created(roleuser) appears on the log at that cursor — your own message, echoed back as an event so the log is the single source of truth.run.status{status:"generating"}(rolecharacter) appears next. It is emitted synchronously, the instant the send is accepted — render a typing indicator off it.message.created(rolecharacter) appears when the reply is ready, seconds to tens of seconds later, carrying the reply text and its orderedbubbles. Clear the typing indicator on this event.- (Optional) settle the cost via
GET /v1/sessions/{id}/traceor/runs, filtered by therun_refyou got in step 2.
What the send response actually contains
POST /v1/sessions/{id}/messages returns the standard versioned envelope. The data object:
Retry-safe sends. Pass Idempotency-Key: <unique-per-message> on POST /messages
(common on serverless/Vercel, where a function can be retried). A same-key re-call returns
the original cursor and sent_turn_index with idempotent_replay: true instead of
creating a duplicate turn and a duplicate reply.
The events, field by field
Each entry in GET /v1/sessions/{id}/events is one log event:
type and event_type are the same string (both are present for client convenience). The
event types you will see for the send flow:
payload.bubbles is an ordered list. A character reply is frequently multiple bubbles,
the way a person sends two or three short messages in a row — render them in order as
separate bubbles, not as one concatenated string.
Does run.status always arrive?
On the /v1 send path, yes. run.status {status:"generating"} is written to the log
synchronously, in the same transaction that accepts the message — before POST /messages
returns. So for any send you make through POST /v1/sessions/{id}/messages, the
run.status event is already on the log by the time you start polling.
It is an additive event type. Build your client to ignore unknown event types (so
the contract can grow) and to act only on the types you handle — typically message.created
and image.ready. run.status is a progress hint, not a delivery: it is the signal you
turn a typing indicator on with.
Managed Telegram differs. If you let LoreOS deliver to a Telegram bot via the managed
channel, the inbound-Telegram path goes straight from the user’s message.created to the
reply — it does not emit a run.status event (Telegram has its own native “typing…”
chat action, which the managed path drives directly). run.status is specific to the
/v1 event-log send path. Do not assume run.status on the managed-Telegram log.
When does message.failed arrive, and is it retryable?
message.failed (role system) is emitted when the reply pipeline itself errors for a
turn — so your poll/SSE loop sees a terminal outcome instead of waiting forever. Its payload:
reasonis a redacted enum —generation_failedortimed_out. It is never the raw internal error (that stays server-side; see the PII note in observability).recoverableistrue: the user turn is recorded, so you can resend the message (a newPOST /messages) to try again. Treatmessage.failedas “this attempt failed, you may retry,” and surface a retry affordance in your UI.
message.failed is for genuine pipeline errors, not for process death. A crash/deploy
that kills the API process mid-reply does not emit message.failed — that case is handled
by the durability safety net below (the reply still lands). message.failed means the
pipeline ran and failed; the durability net means the pipeline never finished and gets
re-run.
Withheld / no-reply turns
A reply is not guaranteed for every user message. The character may deliberately withhold a reply (an emission decision — for example it is mid-task, or chose silence), or it may exit the session. So:
- A
run.statusfor a turn is not a promise that amessage.created (character)follows. - If the character withheld, no character
message.createdappears for that turn — and there is nomessage.failedeither (nothing failed; it was a deliberate decision). Your typing indicator should time out on its own (see latency UX below), not hang forever. - If the character exited, you get a
session.exitedevent. After that, sends to the session return409 session_exited; start a new session to continue.
Design your client so a turn that produces no character reply is a normal, handled outcome — clear the typing indicator on a timeout and move on.
How far to advance the polling cursor
Always advance since to the maximum cursor you have seen across all event types —
not just the cursor of the last message you rendered. The events endpoint returns
next_cursor (the cursor of the last event in the page, or your since if the page was
empty); poll again with since=next_cursor. Because you advance past run.status,
image.ready, and every other event too, you never re-read events you have already consumed,
and you never skip one.
The cursor is also your dedupe key for at-least-once delivery: if you use the webhook
projection (which is at-least-once with retry), dedupe incoming events by cursor (or by the
x-auto-dating-event-id header) so a redelivered event is not rendered twice.
The durability safety net (the reply lands even if the process dies)
When you send a message, LoreOS writes a small durability job in the same transaction
that records your user turn and the message.created + run.status events. The reply then
runs in-process. If the API process restarts, deploys, or crashes after accepting your
message but before the reply lands, the in-process work dies — but the durability job does
not. A background watchdog finds reply jobs that never reached a decision (no character turn
followed, past a staleness threshold) and re-runs them on a durable worker.
What this means for you as an adapter:
- A momentary API restart/deploy mid-reply does not lose the reply. It arrives on the event log a bit later, via recovery — your normal poll/SSE loop picks it up with no special handling.
- The recovery is designed to land before the SSE connection window closes, so even a streaming client generally sees the recovered reply on the same connection.
- You do not need to implement your own “did the reply get dropped?” reconciliation for
process death. Keep your message-send retries idempotent (the
Idempotency-Keyheader) and let the event log be your source of truth.
This safety net covers process death (the pipeline never finished). It is distinct from
message.failed, which covers a pipeline that ran and errored. Together they mean: a turn
you sent always reaches a terminal state on the log — a character reply (possibly recovered),
a deliberate withhold/exit, or a message.failed you can retry.
Latency: what to design for
Because replies are asynchronous and go through the full character engine (grounding, safety,
knowledge, emission, and — in deep mode — a voice-quality critic and rewrite), a reply takes
seconds to tens of seconds, not milliseconds. This is the central UX fact to design
around: it is messenger latency, not API latency.
These are early, single-sample, pre-optimization numbers — not authoritative percentiles.
An early end-to-end measurement saw a fast reply ≈ 38s and a deep reply ≈ 55s of
total reply latency on a single sample, before latency tuning. Treat them as rough
order-of-magnitude only; formal p50/p95 latency figures are pending. Do not hardcode them
as SLAs. Design your timeouts generously and let the event log — not a stopwatch — be the
source of truth for whether a reply arrived.
reply_mode: fast vs deep
POST /messages takes an optional reply_mode:
Either way, the deeper world-model / relational update runs asynchronously after the
reply is sent, so it never blocks the reply you show the user. Default to fast for chat;
reach for deep when voice quality matters more than latency for a given turn.
Recommended client behavior
A robust adapter treats the wait as a first-class state:
Show typing on run.status
The moment you read the run.status {status:"generating"} event for a turn, render a typing
indicator (or, on managed Telegram, LoreOS drives the native typing action for you). On the
/v1 path that event is already on the log when your first poll returns.
Soft-warn around ~30s
If no character message.created (or image.ready) has arrived after roughly 30 seconds,
keep the typing indicator but consider a soft, in-UI hint that the reply is taking a moment.
Do not error out — the reply is very likely still coming.
Offer a retry around 60–120s
If the turn still has no character reply after ~60–120 seconds, offer the user a retry. The
original turn is recorded, so a retry is a fresh POST /messages (use a new
Idempotency-Key). Importantly: even if you offer a retry, keep consuming the event log —
a delayed or recovered reply for the original turn can still arrive, and you should render it
if it does.
Streaming partial bubbles is on the roadmap, not yet shipped. Today a character reply
appears as a single message.created event with the full bubbles list once it is ready —
there is no token-by-token partial stream. The run.status event is the “it’s coming”
signal; render a typing indicator off it rather than waiting for partial text. (The SSE
transport streams events, but each message.created is still a complete reply, not a
partial one.)
Putting it together
The minimal correct loop, regardless of transport:
POST /messages→ keepcursorandrun_ref.- Poll
GET /events?since=<max cursor seen>(or open the SSE stream / receive webhooks). - On
run.status→ show typing. On charactermessage.created→ renderbubbles, clear typing. Onimage.ready(withservable:true) → renderimage_url. Onsession.exited→ end the session. Onmessage.failed→ offer retry. - Advance your cursor to the largest
cursoryou have seen, every time. - Don’t block on a stopwatch — the event log is the truth, the durability net guarantees the reply lands, and a withheld turn is a normal outcome you time out gracefully.
For the per-turn cost and delivery view keyed by run_ref, and exactly what you can and
cannot inspect, see
Delivery, metering, and observability.