worldRAG + LVM Architecture (v1)
Author: Trent Carter + ChatGPT • Date: 2025‑10‑16 • Target: LNSP / worldRAG blueprint

Legend
High‑Level Architecture (ASCII)
┌──────────────────────────────────────────────────────┐
│ CONTROL PLANE │
│ • Ingestion P1–P12 • Nightly compaction │
│ • Index retrain • Lane health + SLOs │
└──────────────────────────────────────────────────────┘
▲ ▲
│ │
telemetry + metrics admin APIs
│ │
│ │
User ──prompt──▶ Host LLM ──────────────────────────────────────────────────────────────────────────┐
│ │
│ (A) Query Understanding │
▼ │
┌────────────────────┐ │
│ TMD Classifier │ → tmd_bits (uint16) + tmd_dense (16D) + lane_index │
└─────────┬──────────┘ │
│ │
│ (B) Vectorization │
▼ │
┌────────────────────┐ │
│ Encoder (768D) │ Q = embed(query) │
│ (GTR‑T5 / Stella) │ │
└─────────┬──────────┘ │
│ │
│ (C) Lane Routing │
▼ │
┌────────────────────┐ soft/hard route by lane_index │
│ Lane Router │────────────────────────────────────────────────────┐ │
└─────────┬──────────┘ │ │
│ │ │
┌───────────┴────────────────────────────────────────────────────────────────▼───────────┐
│ RETRIEVAL FABRIC │
│ │
│ ┌──────────────────────────────┐ ┌──────────────────────────────┐ │
│ │ Vector Index (per‑lane) │ │ Graph DB (Neo4j) │ │
│ │ Faiss / pgvector (F784) │ │ edges: REL{type,confidence} │ │
│ └───────────┬─────────────────┘ └───────────┬─────────────────┘ │
│ │ (1) ANN top‑K by Cos(Q⊕TMD, F784) │ (2) expand hops │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ K CPE_ID hits │──────────────┬──────────────▶│ Neighbor IDs │ │
│ └──────────────────┘ │ └──────────────────┘ │
│ │ (3) hydrate │
│ ▼ │
│ ┌────────────────────────────┐ │
│ │ Text/Meta DB (Postgres) │ (mission, concept, probe, │
│ │ + cpe_vectors (pgvector) │ expected, tmd_bits, etc.) │
│ └────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────────────────┘
│
│ (4) Echo Validation + Rank Fusion
▼
┌────────────────────────────────────────────┐
│ Echo Validator (P13) │
│ • Cos(question_vec, concept_vec) ≥ τ │
│ • Drop low‑echo, rescore by: │
│ w1·cos + w2·echo + w3·graph_degree │
└──────────────────────────┬─────────────────┘
│ top‑K context packs (vectors + text)
▼
┌───────────────────────────────────────────────────────────────────┐
│ LVM (Vector‑Native Reasoner) │
│ Mamba/MoE over vectors: consumes context pack (F784 + graph pri) │
│ • Compositional reasoning in latent space │
│ • Produces answer vector(s) │
└─────────────────────┬─────────────────────────────────────────────┘
│ (optional decode for humans / host LLM)
▼
┌────────────────────────────────────────────┐
│ Vec2Text & Response Synthesizer │
│ • Decode vector answers to text │
│ • Host LLM finalizes style/format │
└────────────────────────────────────────────┘
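The numbered steps (1)–(4) in the retrieval fabric above can be sketched end‑to‑end. This is a minimal illustration, not the implementation: the in‑memory dicts stand in for the per‑lane Faiss index, the Neo4j graph, and the Postgres meta store, and every function and field name here is an assumption.

```python
# Toy sketch of the four retrieval-fabric steps. `index`, `graph`, and `meta`
# are in-memory stand-ins for Faiss/Neo4j/Postgres; names are illustrative.
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def retrieve(q768, tmd16, index, graph, meta, k=2, tau=0.3, w=(0.6, 0.3, 0.1)):
    f784 = np.concatenate([q768, tmd16])  # Q ⊕ TMD fused query vector
    # (1) ANN top-K by Cos(Q⊕TMD, F784) — brute force here, Faiss in practice
    scored = sorted(index.items(), key=lambda kv: -cos(f784, kv[1]))[:k]
    hits = [cpe_id for cpe_id, _ in scored]
    # (2) expand graph hops to neighbor IDs
    neighbors = {n for h in hits for n in graph.get(h, [])}
    # (3) hydrate meta (concept vec, text, tmd_bits, ...) for hits + neighbors
    packs = []
    for cpe_id in set(hits) | neighbors:
        c = cos(f784, index[cpe_id])
        # (4) echo validation: Cos(question_vec, concept_vec) >= tau, then
        #     rank fusion: w1*cos + w2*echo + w3*graph_degree
        echo = cos(q768, meta[cpe_id]["concept_vec"])
        if echo < tau:
            continue  # drop low-echo candidates
        deg = len(graph.get(cpe_id, []))
        packs.append((w[0] * c + w[1] * echo + w[2] * deg, cpe_id, meta[cpe_id]))
    packs.sort(key=lambda p: -p[0])
    return packs
```

In a real deployment the brute-force loop in step (1) would be a per‑lane Faiss/pgvector query, and step (3) a single hydrating SQL fetch keyed by CPE_ID.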
Retrieval Algorithm (precise order)
• Classify the query → tmd_bits, lane_index, tmd_dense.
• Fuse tmd_dense into the query vector if doing query‑time fusion (Q⊕TMD).
• Rescore: score = w1·cos + w2·echo + w3·deg + w4·recency (optional).

Storage & IDs (inter‑DB linking)
• tmd_bits kept as uint16 plus learned 16D tmd_dense.

Observability (minimum viable SLOs)
Three Novel Upgrades (high‑impact, implementable)
1) Adaptive Semantic‑GPS Router (ASGR)
Goal: Replace hard lane gating with a learnable _multi‑lane mixture_ that preserves precision but boosts recall.
• Learn π(lane|Q) via a small MLP over (Q, tmd_dense); route to the top‑m lanes (m ∈ {2..4}) with soft quotas; add an entropy regularizer to avoid collapse.
• Use tmd_bits as a prior.

2) Echo‑Weighted Contrastive Tuning (EWCT) for the LVM
Goal: Continually align the LVM to _what actually retrieves well_.

3) Vector‑Delta Patching (VDP) for Knowledge Maintenance
Goal: Cheap updates and compositional synthesis without re‑embedding the world.
• Patch stale entries with learned deltas (v_new = v_old + Δ_t). Compose deltas at query time.

Integration Notes
• Keep tmd_bits as deterministic routing and tmd_dense as a learnable feature; do not conflate them.
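The tmd_bits/tmd_dense separation can be made concrete with a small sketch. The 6+5+5 bit field layout, field names, and lane count below are illustrative assumptions, not the spec; the lookup table stands in for a trained 16D embedding.

```python
# Illustrative split: tmd_bits drives deterministic routing, tmd_dense is a
# learned feature. Field layout (6-bit domain, 5-bit task, 5-bit modifier)
# and the 16-lane count are assumptions for the sketch.
import numpy as np

def pack_tmd(domain, task, modifier):
    """Pack three TMD fields into a uint16 (assumed 6+5+5 layout)."""
    assert 0 <= domain < 64 and 0 <= task < 32 and 0 <= modifier < 32
    return np.uint16((domain << 10) | (task << 5) | modifier)

def unpack_tmd(bits):
    b = int(bits)
    return (b >> 10) & 0x3F, (b >> 5) & 0x1F, b & 0x1F

def lane_index(bits, n_lanes=16):
    # Deterministic hard routing: lane derived purely from tmd_bits.
    domain, _, _ = unpack_tmd(bits)
    return domain % n_lanes

class TMDDense:
    """Learned 16D feature keyed by tmd_bits; a stand-in for a trained embedding."""
    def __init__(self, dim=16, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.table = {}
    def __call__(self, bits):
        # Lazily materialize a stable vector per tmd_bits value.
        return self.table.setdefault(
            int(bits), self.rng.standard_normal(self.dim).astype(np.float32))
```

Keeping the two representations in separate code paths, as above, is one way to honor the "do not conflate" rule: routing stays reproducible from the uint16 alone, while the dense feature is free to be retrained.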