State consistency · Write-side coherence for collaborating agents
v0.7.1 · Apache-2.0 · PyPI
A stale read is how shared memory pollution starts: one agent's hallucination becomes a "fact" the next one reasons from, and the error cascades downstream. agent-coherence makes that moment visible and serves the current version on the next read instead of rebroadcasting the full artifact every turn. Same library, same protocol, across LangGraph, CrewAI, AutoGen, and any custom orchestrator. Same behavior regardless of model provider (Anthropic, OpenAI, Google, Mistral, open-source).
"This asynchronicity adds challenges in result coordination, state consistency, and error propagation across the subagents."
— Anthropic Engineering, How we built our multi-agent research system (June 2025), on what blocks async multi-agent execution at scale.
Anthropic named the problem. agent-coherence is the protocol that addresses it.
$ pip install "agent-coherence[langgraph]"
If your agents only read from sources you don't control, you need a freshness pipeline. If your agents write to each other's state, you need a coherence protocol. They're different problems — and the wrong tool for one is silent failure in the other.
Read-side freshness
The world writes (commits, Slack, docs, tickets); agents read. You need an index pipeline that keeps the corpus current as sources change — incremental embeddings, knowledge graphs, retrieval.
Write-side coherence
The agents write — they collaborate on shared plans, edit specs, mutate memory, hand off scratchpads. You need a coherence protocol that detects stale reads and enforces single-writer ordering when one agent commits.
Failure modes caught: stale reads · shared memory pollution · cascading errors · context handoff drift · concurrent-write conflicts.
Both layers are needed in a real production system. agent-coherence focuses on the write side.
Benchmarks below are reproducible in CI with GenericFakeChatModel — no live LLM API calls. Run them yourself: make benchmark.
| Workload | Agents | Read:write ratio | Hit rate | Token savings |
|---|---|---|---|---|
| Planning (read-heavy) | 4 | 12:1 | 75% | 69% |
| Code review (moderate) | 3 | 8:3 | 60% | 47% |
| High-churn (write-heavy) | 4 | 8:4 | 50% | 29% |
MESI cache coherence — the protocol every modern CPU uses to share memory — adapted for LLM agents sharing artifacts.
Each shared artifact is cached locally per agent. Reads serve from the local cache when valid — no re-broadcast.
Writes commit to a coordinator, which sends ~12-token invalidation signals instead of rebroadcasting the full artifact.
Single-writer-multiple-reader per artifact with bounded staleness. Peers re-fetch on next read, guaranteed.
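The adaptation described above can be sketched in a few dozen lines. This is an illustrative, self-contained model of the protocol idea — per-agent cache state plus tiny invalidation signals — not the library's internals; every name here (Coordinator, Agent, State) is hypothetical, and the Exclusive state is folded into Shared for brevity.

```python
# MESI-style coherence for shared artifacts, in miniature.
# Hypothetical names -- not the agent-coherence API.
from enum import Enum

class State(Enum):
    MODIFIED = "M"   # this agent holds the only current copy (it just wrote)
    SHARED = "S"     # valid local copy; serve reads locally, no re-broadcast
    INVALID = "I"    # a peer wrote since our last read; re-fetch required

class Coordinator:
    """Authoritative artifact versions; sends invalidations, not artifacts."""
    def __init__(self):
        self.store = {}      # key -> (version, value)
        self.readers = {}    # key -> set of agents caching it

    def commit(self, writer, key, value):
        ver = self.store.get(key, (0, None))[0] + 1
        self.store[key] = (ver, value)
        # ~12-token invalidation signal to peers, not a full rebroadcast
        for peer in self.readers.get(key, set()) - {writer}:
            peer.cache[key] = (State.INVALID, None)
        self.readers[key] = {writer}

    def fetch(self, reader, key):
        self.readers.setdefault(key, set()).add(reader)
        return self.store[key][1]

class Agent:
    def __init__(self, coord):
        self.coord, self.cache = coord, {}

    def read(self, key):
        state, value = self.cache.get(key, (State.INVALID, None))
        if state is State.INVALID:           # stale or missing: re-fetch
            value = self.coord.fetch(self, key)
            self.cache[key] = (State.SHARED, value)
            return value, "fetched"
        return value, "cache-hit"            # valid copy: zero extra tokens

    def write(self, key, value):
        self.coord.commit(self, key, value)
        self.cache[key] = (State.MODIFIED, value)

coord = Coordinator()
planner, executor = Agent(coord), Agent(coord)
planner.write("plan", "v1: draft API")
print(executor.read("plan"))   # fetched on first read
print(executor.read("plan"))   # cache-hit: served locally
planner.write("plan", "v2: add auth")
print(executor.read("plan"))   # invalidated, so fetched fresh: v2
```

The guarantee in the line above falls out directly: an invalidated peer cannot serve its cache, so its next read is forced through the coordinator.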
Five synchronization strategies ship out of the box: lazy (default), eager, lease (TTL-based), access_count, and broadcast — pick the one matching your workload's read/write ratio and staleness tolerance.
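To see why the read/write ratio should drive that choice, here is a toy cost model under assumed token counts — a single reader, made-up artifact size, and a simplified lazy strategy where only the first read after each write re-fetches. It is not the shipped benchmark methodology; the five strategies differ mainly in when the re-fetch is paid, not whether.

```python
# Toy cost model (assumed numbers, single reader) -- not the real benchmark.
ARTIFACT_TOKENS = 800   # cost of re-sending the full artifact into context
SIGNAL_TOKENS = 12      # cost of a lazy invalidation signal

def naive_cost(reads):
    """Baseline: the artifact is rebroadcast on every read."""
    return reads * ARTIFACT_TOKENS

def coherent_cost(reads, writes):
    """Lazy coherence: first read after each write re-fetches; all other
    reads are cache hits. Each write costs one invalidation signal."""
    refetches = min(reads, writes + 1)   # +1 for the initial fetch
    return refetches * ARTIFACT_TOKENS + writes * SIGNAL_TOKENS

def savings(reads, writes):
    return 1 - coherent_cost(reads, writes) / naive_cost(reads)

print(f"read-heavy 12:1 -> {savings(12, 1):.0%} saved")   # 83% with these toy numbers
print(f"write-heavy 8:4 -> {savings(8, 4):.0%} saved")    # 37% with these toy numbers
```

Even this crude model reproduces the shape of the benchmark table: the fewer writes per read, the more reads turn into free cache hits, and the larger the savings.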
Same library, same protocol, same behavior — regardless of orchestrator or model provider.
```python
# LangGraph drop-in — one import change, no node code changes
from langgraph.store.memory import InMemoryStore  # before
from ccs.adapters import CCSStore                 # after

store = CCSStore(strategy="lazy")
graph = builder.compile(store=store)
```
"Subagent output to a filesystem to minimize the 'game of telephone' [...] implement artifact systems where specialized agents can create outputs that persist independently."
— Anthropic Engineering, multi-agent research system (Appendix, June 2025). CCSStore is exactly that pattern — plus coherence semantics so subagents know when their cached view is stale.
Provider-neutral: same behavior with Anthropic, OpenAI, Google, Mistral, or open-source models. The protocol operates on artifacts, not model responses.
Building coding sub-agents?
See the recorded planner-executor demo →
ccs-diagnose — zero-network stale-read detector for existing graphs
Anthropic's engineering team, after shipping their multi-agent Research system to production, named state consistency as one of three challenges blocking async multi-agent execution at scale. agent-coherence is the protocol that addresses it.
Architecturally, this is the layer QuantumBlack/McKinsey describes as agentic shared services — the protocol-first, composable substrate between agent runtimes and enterprise data. agent-coherence is the state-consistency primitive that lives there.
Agentic systems & runtimes
Interfaces & agentic orchestration
Agentic shared services (agent-coherence is here)
In-house systems & external data
Layer naming follows "Creating a future-proof enterprise agentic platform architecture" (QuantumBlack/McKinsey). agent-coherence is composable by design: it slots alongside your existing evaluations, observability, and memory layers — same library across LangGraph, CrewAI, AutoGen, and custom runtimes, vendor-neutral across Anthropic, OpenAI, Google, Mistral, and open-source models. Multi-vendor workflows, minimum lock-in.
The audience signal is consistent: 32% of agent teams cite quality — "hallucinations and consistency of outputs" — as the #1 production blocker. LangChain, State of Agent Engineering 2026.
Common questions about stale-read detection and multi-agent coherence across LangGraph, CrewAI, AutoGen, and custom orchestrators.
When one agent reads an artifact — a plan, a document, a result — that another agent has already updated, the reader gets a stale copy. If the reader then writes back, it overwrites the current version with logic that was based on stale state. MLflow's multi-agent observability team calls this shared memory pollution: one agent's hallucination becomes a "fact" subsequent agents reason from, producing cascading errors that compound across reasoning steps. Trace-only tools can see the calls but not the staleness; agent-coherence detects the exact moment of divergence.
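The lost-update failure described in that answer fits in a few lines. This is a deliberately coherence-free miniature with illustrative names, followed by the version-tag check that makes the divergence detectable at read time:

```python
# The lost update, in miniature (no coherence layer; names are illustrative).
# Agent B has already pushed v2, but agent A still holds a cached v1.
shared = {"plan": "v2: reviewed and approved"}   # current version (B's write)
cached_by_a = "v1: rough draft"                  # A's stale local copy

# A reasons from its stale copy and writes back, clobbering B's update:
shared["plan"] = cached_by_a + " + A's edits"
assert "reviewed and approved" not in shared["plan"]   # B's work is gone

# A per-artifact version tag catches the divergence before the write-back:
versions = {"plan": 2}       # coordinator's current version
a_read_version = 1           # version A's copy was taken from
is_stale = a_read_version < versions["plan"]           # True -> flag the read
```

A trace of the same run shows two ordinary read and write calls — nothing anomalous — which is why a state model, not a call log, is needed to catch it.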
LangSmith and Braintrust show you what your agents did. agent-coherence shows you when one of them was wrong because it read stale state from another. The difference is structural — we track per-agent ownership of shared artifacts (MESI states), so the tool can flag a read that returned an outdated copy. Trace-only tools cannot detect this because they lack the state model.
No. Drop-in adapters ship for LangGraph (CCSStore), CrewAI, AutoGen, and any custom orchestrator via CoherenceAdapterCore. The protocol operates on artifacts, not model responses, so it works the same with Anthropic, OpenAI, Google, Mistral, AWS Bedrock, Azure OpenAI, and open-source models.
The protocol enforces single-writer exclusivity. Only one agent holds write permission at a time; concurrent writes are prevented at the protocol level, not "resolved" by an append-only reducer that produces duplicates or unexpected list nesting. For workloads where concurrent writes are semantically composable, CRDTs are the right tool — see the Why Coherence Matters doc for the layered model.
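The single-writer invariant can be sketched as a write grant plus a monotonic version check at commit. This is a hypothetical illustration of the invariant, not the library's API:

```python
# Single-writer exclusivity with monotonic versions -- illustrative only.
class WriteGuard:
    def __init__(self):
        self.owner, self.version = None, 0

    def acquire(self, agent):
        """Grant write permission; refuse while another agent holds it."""
        if self.owner not in (None, agent):
            raise PermissionError(f"{self.owner} holds the write grant")
        self.owner = agent
        return self.version            # version the writer is building on

    def commit(self, agent, base_version):
        """Commit only if the writer built on the current version."""
        if agent != self.owner or base_version != self.version:
            raise RuntimeError("stale or unauthorized write rejected")
        self.version += 1
        self.owner = None              # release the grant
        return self.version

guard = WriteGuard()
base = guard.acquire("planner")        # planner holds the grant at version 0
try:
    guard.acquire("executor")          # a concurrent writer is refused...
except PermissionError:
    pass                               # ...at the protocol level
guard.commit("planner", base)         # version advances; grant released
```

Note the contrast with an append-only reducer: the second writer is rejected up front rather than merged into a duplicated or nested result after the fact.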
69% token reduction on read-heavy workloads (12:1 read/write ratio), 47% on moderate (8:3), 29% on write-heavy (8:4). The lever is invalidation signals (~12 tokens) replacing full-artifact rebroadcasts. Run the benchmarks yourself: pip install "agent-coherence[langgraph,benchmark]" then make benchmark. CI uses GenericFakeChatModel — no live API calls required.
Apache-2.0, on PyPI, 165 tests, TLA+/TLC model-checked safety properties (single-writer, monotonic versioning, no torn reads), PyPI Trusted Publishers with PEP 740 attestations and CycloneDX SBOM published with every release. An opt-in crash-recovery sweep reclaims stale grants when agents are OOM-killed or livelock. ccs-diagnose runs as a zero-network static analyzer on existing graphs before adoption.
15-minute call. We'll look at your graph and tell you whether agent-coherence will move the number.