Commit d5baeca

perf(session): cache messages across prompt loop to preserve prompt cache byte-identity

OpenCode updates tool part states in place (pending → completed + output) between consecutive API calls in the tool-execution loop. When the next API call serializes the conversation, the previous assistant message has different bytes (completed state + output vs. a pending/error placeholder), breaking Anthropic's prompt cache from that point forward. On real sessions this causes ~20% of turns to re-write the entire context at the cache-write price (12.5× the cache-read price). On April 21st alone, this cost $2,264 in cache writes vs. $1,234 in cache reads.

Fix: cache the conversation array across prompt-loop iterations. On tool-call continuation steps, append only genuinely NEW messages instead of reloading all messages from the DB. Existing messages retain their original part states (as the API last saw them), preserving byte-identity for the prompt cache.

Full reloads still happen after compaction, subtask handling, and overflow recovery, since these operations structurally change the conversation.
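The failure mode and the fix can be sketched in a few lines of TypeScript. This is a minimal illustration with hypothetical, simplified types (`Msg` and `ToolPart` are stand-ins, not OpenCode's real `MessageV2` shapes): reloading from a store whose tool parts mutated in place changes the serialized prefix, while an append-only cache keeps it byte-identical.

```typescript
// Hypothetical, simplified types -- not OpenCode's real MessageV2 shapes.
type ToolPart = { state: "pending" | "completed"; output?: string }
type Msg = { id: string; parts: ToolPart[] }

const clone = <T>(x: T): T => JSON.parse(JSON.stringify(x))

// "Database": tool parts are updated in place between API calls.
const db: Msg[] = [{ id: "m1", parts: [{ state: "pending" }] }]

// First API call serializes the conversation exactly as the model sees it.
const cached: Msg[] = clone(db)
const firstCallPrefix = JSON.stringify(cached)

// The tool finishes: the stored part mutates pending -> completed,
// and a new message is appended.
db[0].parts[0] = { state: "completed", output: "42" }
db.push({ id: "m2", parts: [] })

// Naive reload: the serialized prefix changes, so the prompt cache misses.
const reloaded = clone(db)
const reloadedPrefix = JSON.stringify(reloaded.slice(0, 1))

// Append-only: keep cached messages untouched, add only genuinely new ids.
const existing = new Set(cached.map((m) => m.id))
for (const m of db) if (!existing.has(m.id)) cached.push(clone(m))
const appendOnlyPrefix = JSON.stringify(cached.slice(0, 1))

console.log(reloadedPrefix === firstCallPrefix)   // false: prefix changed, cache miss
console.log(appendOnlyPrefix === firstCallPrefix) // true: prefix preserved, cache hit
```

The actual patch below uses the same Set-of-existing-ids dedup (`existingIds`) on tool-call continuation steps, and falls back to a full reload only when the conversation changes structurally.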
1 parent 276d162

1 file changed: packages/opencode/src/session/prompt.ts (28 additions, 1 deletion)
--- a/packages/opencode/src/session/prompt.ts
+++ b/packages/opencode/src/session/prompt.ts
@@ -1279,11 +1279,21 @@ NOTE: At any point in time through this workflow you should feel free to ask the
     let step = 0
     const session = yield* sessions.get(sessionID)
 
+    // Cache conversation across prompt loop iterations to preserve prompt
+    // cache byte-identity. Full reload only on first iteration and after
+    // compaction/subtask. Tool-call continuation appends only NEW messages
+    // to avoid re-reading updated tool parts that break the cache prefix.
+    let msgs: MessageV2.WithParts[] | undefined
+    let needsFullReload = true
+
     while (true) {
       yield* status.set(sessionID, { type: "busy" })
       yield* slog.info("loop", { step })
 
-      let msgs = yield* MessageV2.filterCompactedEffect(sessionID)
+      if (needsFullReload || !msgs) {
+        msgs = yield* MessageV2.filterCompactedEffect(sessionID)
+        needsFullReload = false
+      }
 
       let lastUser: MessageV2.User | undefined
       let lastAssistant: MessageV2.Assistant | undefined
@@ -1335,6 +1345,7 @@ NOTE: At any point in time through this workflow you should feel free to ask the
 
       if (task?.type === "subtask") {
         yield* handleSubtask({ task, model, lastUser, sessionID, session, msgs })
+        needsFullReload = true
         continue
       }
 
@@ -1347,6 +1358,7 @@ NOTE: At any point in time through this workflow you should feel free to ask the
           overflow: task.overflow,
         })
         if (result === "stop") break
+        needsFullReload = true
        continue
      }
 
@@ -1356,6 +1368,7 @@ NOTE: At any point in time through this workflow you should feel free to ask the
        (yield* compaction.isOverflow({ tokens: lastFinished.tokens, model }))
      ) {
        yield* compaction.create({ sessionID, agent: lastUser.agent, model: lastUser.model, auto: true })
+        needsFullReload = true
        continue
      }
 
@@ -1489,6 +1502,20 @@ NOTE: At any point in time through this workflow you should feel free to ask the
        auto: true,
        overflow: !handle.message.finish,
      })
+      needsFullReload = true
+    } else {
+      // Tool-call continuation: append only the new assistant message
+      // instead of reloading all messages from DB. This preserves the
+      // byte-identity of prior messages (especially tool parts that
+      // transitioned pending -> completed after the previous API call)
+      // for Anthropic's prompt cache.
+      const fresh = yield* MessageV2.filterCompactedEffect(sessionID)
+      const existingIds = new Set(msgs!.map((m) => m.info.id))
+      for (const msg of fresh) {
+        if (!existingIds.has(msg.info.id)) {
+          msgs!.push(msg)
+        }
+      }
     }
     return "continue" as const
   }).pipe(Effect.ensuring(instruction.clear(handle.message.id)))
