[ PRISM · WMA-V ]

v1.0.0 · production release

arXiv preprint 2026

[ VΔLK Research ]

Institute for Complex Cognitive Systems

World models predict. PRISM verifies.

A verifiable world-model agent architecture where prediction errors drive constraint-checked state updates and trace-consolidated memory. The policy never writes state directly — a verified update cycle does.

PRISM_LIVE → PREDICT s_{t+1} → OBSERVE o_t → ERROR ε_t → ADMIT → PRESSURE → GATE_τ → UPDATE Δs → COMMIT τ → REPLAY ✓ → POLICY π → CONSOLIDATE → ROLLOUT

PRISM · World-Model Agent · Verifiable ∮ Prediction error drives update ∮ 9 typed error channels · Replay-Exact · Hash-chained memory

[ 01 / architecture ]

the core problem

why current agents silently drift

Policy should not write state.

PRISM · update rule · §3 formal definition

// existing LLM agents — ReAct, Reflexion, Voyager:

s_{t+1} ← π_LLM(s_t, o_t)  // policy mutates state directly

// PRISM — the policy never writes state:

ε_t = d(ô_{t+1}, o_{t+1}) → admit(ε_t) → Δs → commit

// prediction error is the only update signal

// every Δs passes admissibility before commit

// the chain is hash-verified before replay
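
The update rule above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the scalar `State`, the identity predictor, the signed difference standing in for d, the `bound` threshold in `admit`, and the `rate` used to derive Δs are hypothetical and not taken from the PRISM reference implementation. The point is only the shape of the rule: the state changes through `cycle`, and only when an admissible prediction error exists.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    """Immutable belief state; frozen so nothing can mutate it in place."""
    value: float


def predict(state: State) -> float:
    # Hypothetical predictor: expects the next observation to equal the belief.
    return state.value


def error(predicted: float, observed: float) -> float:
    # Stand-in for d(ô, o): a signed difference between prediction and observation.
    return observed - predicted


def admit(eps: float, bound: float = 1.0) -> bool:
    # Hypothetical admissibility rule: reject errors larger than `bound`.
    return abs(eps) <= bound


def cycle(state: State, observed: float, rate: float = 0.5) -> State:
    eps = error(predict(state), observed)
    if eps == 0.0 or not admit(eps):
        return state                      # no error, or inadmissible: no update
    delta = rate * eps                    # Δs is derived only from the error
    return replace(state, value=state.value + delta)  # commit a new state


s0 = State(value=0.0)
s1 = cycle(s0, observed=0.8)   # admissible error: belief moves toward the observation
s2 = cycle(s1, observed=5.0)   # error exceeds the bound: update rejected
```

In this sketch a rejected update returns the previous state object unchanged, so the caller can tell that nothing was committed.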

“A model that predicts well
is not yet a model that commits well.”

— §1 motivation

No write authority

I. Policy network · via token sampling ⨯
II. Structured parser · via extraction output ⨯
III. Retriever / RAG · via top-k context ⨯
IV. World-model latent · via raw rollout ⨯
V. Tool callback · via function return ⨯
VI. Operator · via manual override ⨯

State writes only via the verified update cycle.
∎

[ 02 / cycle ]

The update cycle, eight steps.

Every observation passes through the full inference cycle, enforced as a durable workflow.

[ step 01 ]

The agent's current latent belief. A typed structure with entities, claims, and a posterior over the world.

formal

s_t = (entities, claims, posterior)

BELIEF_TYPED·enforced
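
As a rough illustration of what a typed belief structure could look like, the sketch below models s_t = (entities, claims, posterior) with frozen dataclasses. The field names (`id`, `kind`, `subject`, `predicate`, `confidence`) and both validation rules are assumptions made for this example; the page does not specify the actual schema, and the stack section suggests the real contracts are written with Pydantic v2 rather than the standard library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str
    kind: str


@dataclass(frozen=True)
class Claim:
    subject: str        # id of an Entity in the same state
    predicate: str
    confidence: float


@dataclass(frozen=True)
class BeliefState:
    entities: tuple[Entity, ...]
    claims: tuple[Claim, ...]
    posterior: tuple[tuple[str, float], ...]   # (hypothesis, probability) pairs

    def __post_init__(self) -> None:
        # Hypothetical typing rule 1: every claim refers to a known entity.
        ids = {e.id for e in self.entities}
        for c in self.claims:
            if c.subject not in ids:
                raise ValueError(f"claim about unknown entity: {c.subject}")
        # Hypothetical typing rule 2: the posterior is a probability distribution.
        total = sum(p for _, p in self.posterior)
        if abs(total - 1.0) > 1e-9:
            raise ValueError("posterior must sum to 1")


s_t = BeliefState(
    entities=(Entity("door_1", "door"),),
    claims=(Claim("door_1", "is_open", 0.9),),
    posterior=(("door_open", 0.9), ("door_closed", 0.1)),
)
```

Because the dataclasses are frozen and use tuples, an instance cannot be edited in place; a new belief has to be constructed, which is where a verified update cycle would sit.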

[ 03 / properties ]

Six architectural guarantees.

Each one is checked at compile time and verified at runtime. Remove any one and the architecture degrades to a vanilla retrieval-augmented loop.

[ property_01 ]

Replay-exact memory

commits since genesis: 847,291 · LIVE · last hash: 7c3a...b9e1

State at any t is bit-exactly reconstructible from genesis.

Memory is a hash-chained replay log of typed transitions. Vector store is retrieval — not memory. Graph store is projection — not belief.
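
A minimal sketch of a hash-chained replay log, assuming SHA-256 as the hash, canonical JSON as the serialization, and a key/value dictionary as the state. These choices, along with the names `chain_hash`, `append`, `commit`, and `replay`, are illustrative assumptions and not the PRISM ledger format. The sketch shows the property claimed above: the state at any step is recomputed purely from the genesis state and the logged transitions.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the hash before the first entry


def chain_hash(prev_hash: str, transition: dict) -> str:
    # h = H(h_prev || serialize(transition)); sorted keys keep serialization stable.
    payload = json.dumps(transition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, transition: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"transition": transition, "hash": chain_hash(prev, transition)})


def commit(state: dict, transition: dict) -> dict:
    # Pure and deterministic: returns a new state and never mutates the input.
    return {**state, transition["key"]: transition["value"]}


def replay(genesis: dict, log: list, upto: int) -> dict:
    # Rebuild the state after the first `upto` transitions, starting from genesis.
    state = genesis
    for entry in log[:upto]:
        state = commit(state, entry["transition"])
    return state


log: list = []
append(log, {"key": "door", "value": "open"})
append(log, {"key": "light", "value": "on"})
append(log, {"key": "door", "value": "closed"})

state_at_2 = replay({}, log, upto=2)   # state after the first two transitions
state_at_3 = replay({}, log, upto=3)   # state after all three
```

Nothing in `replay` consults a vector index or a graph store; in this sketch those would be derived views that can be rebuilt from the log.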

[ property_02 ]

Latent world model

Prediction error becomes the update signal.

The latent predictor forecasts ẑ_t+1 before observation. Mismatch enters as one typed error channel — never as direct state write.

[ property_03 ]

Four-gate action commit

ADMISSIBILITY GATE ✓ → EVIDENCE GATE ✓ → POLICY GATE ✓ → SAFETY GATE ✓ → EXTERNAL_ACTION ●

Four gates. Then the world changes.

Every action emits a post-execution observation. The agent always closes the loop.
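
The gate sequence can be sketched as an ordered list of checks that must all pass before anything is executed. The `Action` fields, the thresholds (0.5 for evidence, 0.2 for risk), and the `fake_execute` callback are hypothetical; the page names the four gates but does not say what each one tests.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    name: str
    evidence: float          # hypothetical support score in [0, 1]
    risk: float              # hypothetical risk score in [0, 1]
    allowed: bool            # hypothetical policy decision
    well_formed: bool = True


# Gates run in the order shown on the page; each predicate is an assumed stand-in.
GATES: list[tuple[str, Callable[[Action], bool]]] = [
    ("admissibility", lambda a: a.well_formed),
    ("evidence", lambda a: a.evidence >= 0.5),
    ("policy", lambda a: a.allowed),
    ("safety", lambda a: a.risk <= 0.2),
]


def commit_action(
    action: Action, execute: Callable[[Action], str]
) -> tuple[Optional[str], Optional[str]]:
    """Return (observation, None) if every gate passes, else (None, failed_gate)."""
    for name, check in GATES:
        if not check(action):
            return None, name        # stop at the first failing gate; nothing runs
    # The execution result comes back as an observation, closing the loop.
    return execute(action), None


def fake_execute(action: Action) -> str:
    return f"observed result of {action.name}"


ok = commit_action(Action("open_door", evidence=0.9, risk=0.05, allowed=True), fake_execute)
blocked = commit_action(Action("delete_all", evidence=0.9, risk=0.8, allowed=True), fake_execute)
```

Returning the name of the failing gate keeps a rejected action auditable, which fits the page's emphasis on traceability, though the real system may record rejections differently.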

[ property_04 ]

Local-only inference

vLLM · SGLang · PyTorch · llama.cpp · sentence-T · CUDA

No external API in the update path.

Deterministic audit. Reproducible provenance. Sovereign data.

[ property_05 ]

Strict cycle

cycle steps enforced as durable workflow

No skips.

14_INVARIANTS

✓ no_update_without_error
✓ no_direct_state_write
✓ no_policy_authority_over_state
✓ no_parser_authority_over_state
✓ no_retriever_authority_over_state
✓ no_memory_storage_collapse
✓ no_commit_without_trace
✓ no_admission_after_error_violation
✓ no_action_without_gate
✓ no_action_without_transition
✓ no_latent_to_state_without_error
✓ no_operator_to_state_mutation
✓ no_action_result_to_state_without_observation
✓ predictor_decorrelation > 0.05
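
To make the idea of runtime invariant checks concrete, here is a small sketch covering three of the listed invariants. The record fields (`writer`, `error`, `delta`, `committed`, `trace_id`) and the reading of each invariant are guesses based on the invariant names alone; the actual checks live in the project's `invariants/` test suite and may differ.

```python
def check_invariants(transitions: list) -> list:
    """Return (index, invariant_name) pairs for every violated check."""
    violations = []
    for i, t in enumerate(transitions):
        # A state delta must be backed by a recorded prediction error.
        if t.get("delta") is not None and t.get("error") is None:
            violations.append((i, "no_update_without_error"))
        # Only the update cycle may be the writer of a transition.
        if t.get("writer") != "update_cycle":
            violations.append((i, "no_direct_state_write"))
        # A committed transition must carry a trace identifier.
        if t.get("committed") and not t.get("trace_id"):
            violations.append((i, "no_commit_without_trace"))
    return violations


good = {"writer": "update_cycle", "error": 0.3, "delta": 0.1,
        "committed": True, "trace_id": "t-001"}
bad = {"writer": "policy", "error": None, "delta": 0.1,
       "committed": True, "trace_id": None}

report = check_invariants([good, bad])
```

Here `report` lists three violations, all at index 1, and none for the well-formed record.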

[ 04 / live ]

Live inference console

Watch the model commit, step by step.

Live simulation of PRISM's telemetry. Every transition is hash-chained, every metric streams in real time. Connect your own predictor — same surface.

prism.valkresearch.org / inference / live

RUNNING

Cycle pipeline

PREDICT ✓ → OBSERVE ✓ → ERROR ✓ → ADMIT ● (processing...) → COMMIT ○

UPDATE TEMP τ = 0.624

τ_{t+1} = τ_t · d + u · g − r · k

Replay log · hash-chained

CHAIN_VERIFIED ✓

ε channels · current step · ‖Δ‖ = 0.348

struct · ent · rel · claim · temp · caus · evid · telos · lat
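
One way to read the channel display is as a nine-component error vector with a single aggregate magnitude. The sketch below assumes that ‖Δ‖ is the Euclidean norm over the nine channels; the page does not state which norm is used or how channels are weighted, so treat this as an illustration only. The channel keys are the abbreviations shown in the console.

```python
import math

# Channel abbreviations as displayed in the console.
CHANNELS = ("struct", "ent", "rel", "claim", "temp", "caus", "evid", "telos", "lat")


def error_norm(errors: dict) -> float:
    """Euclidean norm over all nine channels (an assumed definition of ‖Δ‖)."""
    missing = [c for c in CHANNELS if c not in errors]
    unknown = [c for c in errors if c not in CHANNELS]
    if missing or unknown:
        raise KeyError(f"missing={missing} unknown={unknown}")
    return math.sqrt(sum(errors[c] ** 2 for c in CHANNELS))


# Hypothetical per-channel errors for one step: only two channels are non-zero.
step_errors = {c: 0.0 for c in CHANNELS}
step_errors["ent"] = 0.3
step_errors["claim"] = 0.4

magnitude = error_norm(step_errors)   # approximately 0.5
```

Requiring every channel to be present, even when its error is zero, keeps the vector typed: an untyped or missing channel is rejected instead of being silently treated as zero.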

Metrics report

novelty 0.71
relevance 0.89
evidence 0.62
contradiction 0.18
risk 0.09
pressure 0.58

TELOS_ALIGNMENT

truth ●●●●○
provenance ●●●●●
safety ●●●●●
coherence ●●●●○

The console is a live simulation. In production it streams from the PostgreSQL ledger via server-sent events (SSE).

[ 05 / theorem ]

§4 · formal guarantee

bit-exact reconstruction

Replay-Exact.

Let A be a PRISM-compliant agent. Under assumptions A1–A3, for any time t:

I. State sₜ is bit-exactly determined by the genesis state s₀ and the trace sequence (τ₀, τ₁, …, τ_{t−1}).
II. Every state update has a witness: each transition Δs corresponds to exactly one logged τᵢ, so there is no silent drift.
III. Tampering is detectable: any modification to τᵢ breaks the chain h_τ = H(h_prev ∥ serialize(τ)) w.h.p.
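
Clause III can be checked with a few lines of code. The sketch assumes SHA-256 for H, sorted-key JSON for serialize, and the literal string "genesis" as the initial h_prev; none of these are specified on the page. The helper names `link`, `build_chain`, and `first_broken_link` are hypothetical.

```python
import hashlib
import json


def link(prev_hash: str, trace: dict) -> str:
    # h = H(h_prev || serialize(trace)), with SHA-256 and sorted-key JSON assumed.
    body = json.dumps(trace, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(traces: list) -> list:
    hashes, prev = [], "genesis"
    for t in traces:
        prev = link(prev, t)
        hashes.append(prev)
    return hashes


def first_broken_link(traces: list, hashes: list):
    """Return the index of the first trace whose recomputed hash differs, else None."""
    prev = "genesis"
    for i, (t, h) in enumerate(zip(traces, hashes)):
        if link(prev, t) != h:
            return i
        prev = h
    return None


traces = [{"step": i, "delta": i * 10} for i in range(4)]
hashes = build_chain(traces)

intact = first_broken_link(traces, hashes)   # chain verifies before tampering
traces[2]["delta"] = 999                     # modify one logged trace after the fact
broken = first_broken_link(traces, hashes)   # verification now fails at that entry
```

Detection here relies on the stored hashes being held somewhere the tamperer cannot rewrite, which corresponds to assumption A3 (append-only, tamper-evident log).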

[ corollary · necessity ]

“Any agent satisfying replay-exact verifiability must satisfy the PRISM separation. The architecture is implied, not chosen.”

Assumptions

A1. commit: S × Δs → S is deterministic and pure
A2. Hash H is collision-resistant
A3. Replay log {τᵢ} is append-only, tamper-evident

What the theorem does NOT claim

T1 guarantees auditability and reconstructibility — not improved task success, not predictor accuracy. PRISM bounds error commit, not error occurrence.

[ 06 / comparison ]

Five axes. Nine architectures. One row closes all five.

ReAct & descendants ignore the write boundary entirely. Active Inference and JEPA address prediction but not commit authority. Cognitive architectures satisfy parts. No prior LLM-agent design combines all five.

Axes (column · criterion):
Write boundary · separated
Prediction · required
Typed delta · structurally
Gated action · 4× gates
Replay · bit-exact

Architectures (row · source):
ReAct · Yao et al. 2023
Reflexion · Shinn et al. 2023
Voyager · Wang et al. 2023
Toolformer · Schick et al. 2023
Constitutional AI · Bai et al. 2022
Active Inference · Friston 2010 · write boundary: N/A
JEPA / V-JEPA · LeCun 2022 · write boundary: N/A
Soar / ACT-R · Laird 2019
PRISM · this work

Table 1 · §2 of preprint

[ 07 / benchmarks ]

An order of magnitude less silent drift.

On suites measuring what PRISM was designed for — error-driven update fidelity, silent commit of unsupported claims, replay traceability — the architecture dominates baselines by 10–30×.

CDR · Contradiction Detection · ↑ better
PRISM v1.0.0
ReAct 47
RAG agent 31

SCR · Silent Commit Rate · ↓ better
PRISM v1.0.0
ReAct 38
RAG agent 52

TR · Traceability · ↑ better
PRISM v1.0.0
ReAct 0
RAG agent 0

RVP · Replay-Validation Pass · ↑ better
PRISM v1.0.0
ReAct 0
RAG agent 0

[ honest disclosure · §7.3 ]

Latency is the price. We pay it openly.

PRISM pays a per-cycle cost. On the B1/B2 suites that cost is approximately 2.4× ReAct p95. For chat-only workloads this is prohibitive. For agents that must replay their own decisions, this is the price of verifiability.

ReAct p95: 1800 ms
PRISM p95: approximately 2.4× ReAct p95

[ 08 / stack ]

Standard tools. Verifiable wiring.

PRISM uses production ML components — V-JEPA, GLiNER-Relex, DeBERTa-MNLI, vLLM — as encoders and serving layers, never as state-writing actors. All inference runs locally.

Inference

Python 3.12 · runtime
PyTorch · core
vLLM · LLM serving
SGLang · structured gen
llama.cpp · edge

Encoders

sentence-transformers · text emb
V-JEPA · visual latent
GLiNER-Relex · entity / relation
DeBERTa-v3-MNLI · claim NLI
TimeMoE · temporal

State / Memory

PostgreSQL · hash-chained log
Qdrant · retrieval index
Neo4j · graph projection
Redis · STM cache
Apache WAL · durability

Runtime / Obs

Temporal · durable workflow
FastAPI · control plane
Pydantic v2 · contracts
OpenTelemetry · tracing
structlog · audit

$ production gate
✓ ruff check .
✓ mypy --strict
✓ pytest invariants/
✓ docker compose up
✓ replay --verify-chain
all green · sealed

[ 09 / paper ]

Read. Replay. Refute.

PRISM v1.0.0 reference implementation, the full B1/B2 benchmark suites, baseline configs, and replay logs from all benchmark runs — open-sourced for independent verification. PRISM is a discipline applied to existing models, not a new model. Plug in your own predictor. Audit the result.

Read preprint ↗ · github / valk-research / prism ↗ · research@valkresearch.org ↗

Authors: [anon. for review]
Affiliation: VΔLK Research
Status: preprint
Year: 2026

bibtex / cite

@article{prism2026wmav,
  title   = {PRISM: Predictive Recurrent Inference
             State Machine — A Verifiable World-Model
             Agent Architecture},
  author  = {[Anonymized for Review]},
  journal = {Preprint},
  year    = {2026},
  note    = {V{\Delta}LK Research,
             Institute for Complex Cognitive Systems}
}

When to adopt PRISM

▸ long-horizon agents where memory must replay
▸ multi-turn reasoning where drift is a known failure mode
▸ regulated domains: finance, healthcare, law, audit
▸ embodied / robotic systems with safety constraints
▸ research benchmarks that score traceability

v1.0.0 sealed · reference release
