Product Requirements Document: The Cloud Lexicon Architecture
Trent Carter + Gemini 2.5 pro
8/25/2025
Document Version: 1.1
Status: Development Ready
Date: 2025-08-25
Maintained By: AI Assistant + User Collaboration
Core Vectors: TMD (Task, Modifier, Data) vectors
Combined Vector: TMD-I (Integrated) vector

1. Executive Summary & Vision
1.1. The Vision: An Open, Thinking Web
The Cloud Lexicon is a foundational infrastructure project to create a decentralized, universal, and dynamic repository of human concepts. This is not merely a database; it is a public good designed to serve as the vocabulary and long-term memory for a new generation of AI that "thinks" directly in a high-dimensional latent space.
Our mission is to decouple conceptual reasoning from linguistic expression, leading to a monumental leap in AI efficiency, capability, and transparency. By making this lexicon an open, community-governed resource, we will create a powerful network effect, establishing it as the invaluable, de facto standard for a new AI paradigm.
1.2. Strategic Goals & Key Differentiators
2. System Architecture & Core Components
The architecture is a hybrid model that combines centralized speed for lookups with decentralized trust for writes, and distributes the heaviest computational load to the client.
2.1. Data Flow Diagrams
Ingress Data Flow (Client -> Cloud -> DB)

[CLIENT DEVICE]                              [CLOUD SERVER]                                [CLOUD DATABASE]
+------------------------------------------+ +--------------------------------------------+ +----------------------+
1. Text Input ("Summarize quantum foam")
2. Client-Side GTR-T5 Encoding
- V_Task ("Summarize")
- V_Mod ("default")
- V_Data ("quantum foam")
3. Submits (Text, Vector) triplet -------> 4. Receives Submission
5. FAST PATH: ANN Vector Search ------> 6. Vector DB Lookup
7. ROUGE-L Verification on Text <------
8. IF NO MATCH -> GENERATIVE PATH
9. "Trust, but Verify" Check (1%)
10. Batches for Blockchain Commit
11. Writes new (Text, Vector) pair ------> 12. Commit to DB
+------------------------------------------+ +--------------------------------------------+ +----------------------+
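Step 7's ROUGE-L check can be implemented without external dependencies. The following is a minimal pure-Python sketch of the token-level, LCS-based F-measure used to confirm that a submitted text matches the stored canonical text:

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference: str, candidate: str) -> float:
    # Token-level ROUGE-L F-measure between the stored canonical text
    # and the client-submitted text.
    ref, cand = reference.lower().split(), candidate.lower().split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

The acceptance threshold (e.g., requiring F1 above some bar before taking the fast path) is a tuning choice not fixed by this PRD.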
Egress Data Flow (DB -> Cloud -> Client)
[CLOUD DATABASE]      [CLOUD SERVER]                                     [CLIENT DEVICE]
+-----------------+   +--------------------------------------------+    +-----------------------------------------+
                      1. Receives V_Response triplet from AI Core
2. Vector DB     <--- 2. FAST PATH: ANN Vector Search
   Lookup             3. IF NO MATCH -> GENERATIVE PATH
                      4. vec2text Decoding (for novel vectors)
5. Returns Text ----> 5. Returns Decoded Text Triplet ------------->    6. Receives Raw Text Triplet
                                                                        7. Client-Side Lightweight LLM Smoother
                                                                        8. Final Natural Language Response
+-----------------+   +--------------------------------------------+    +-----------------------------------------+
2.2. The Universal Concept Lexicon (The Lexicon)
The Lexicon is a repository of (Canonical_Text, High_Fidelity_Vector) pairs.
- Read: Freely and openly accessible via a public API for fast, read-only vector lookups.
- Write: Governed by the Bi-Directional Hybrid Interface and validated via the Blockchain Governance Layer.
2.3. The Blockchain Governance Layer (The Trust Layer)
1. Transaction Fee: A micro-fee (gas) is required for all write operations, preventing spam.
2. Batching: Validated new concepts are batched into a Merkle tree.
3. On-Chain Commit: The root hash of the Merkle tree is committed to the blockchain in a single transaction.
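Steps 2 and 3 can be sketched with stdlib hashing. The leaf serialization and the odd-node duplication rule are assumptions; only the root hash is committed on-chain:

```python
import hashlib

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a batch of serialized (Text, Vector) records into one root hash.

    Leaf serialization and the odd-node duplication rule are assumptions;
    only the resulting root is committed on-chain.
    """
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

A validator can later prove membership of any single concept in a committed batch with an O(log n) Merkle path, without replaying the whole batch.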
2.3.1. Estimated Blockchain Costs (Solana)
_Assumes average SOL price of $150 and base fee of 0.000005 SOL per transaction._
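Under those assumptions, the per-concept cost amortizes with batch size; a back-of-the-envelope sketch:

```python
SOL_PRICE_USD = 150.0    # assumed average SOL price (from the note above)
BASE_FEE_SOL = 0.000005  # assumed base fee per transaction

def usd_per_concept(batch_size: int) -> float:
    # One on-chain commit covers the whole batch, so the fee amortizes.
    return BASE_FEE_SOL * SOL_PRICE_USD / batch_size
```

At these rates a single commit costs $0.00075; batching 1,000 new concepts per commit brings the per-concept cost down to $0.00000075.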
2.4. The Client-Side Compute Model
1. Client Submission: The client submits the text and its self-computed vector.
2. Versioning: The client's model version is included in the API call.
3. Stochastic Verification: The server re-computes the vector for a small, random percentage of submissions (e.g., 1%).
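The "Trust, but Verify" flow above can be sketched as follows. `reencode` is a hypothetical hook standing in for the server's own GTR-T5 encoder, and the cosine floor is an assumed threshold, not a number this PRD fixes:

```python
import random

VERIFY_RATE = 0.01    # re-check roughly 1% of submissions
COSINE_FLOOR = 0.99   # assumed threshold for "same vector"

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def verify_submission(client_vec, text, reencode, rng):
    """Accept or reject a client-computed vector.

    `reencode` is a hypothetical hook for the server's own GTR-T5 encoder;
    most submissions are accepted without server-side re-computation.
    """
    if rng.random() >= VERIFY_RATE:
        return True  # not sampled: trust the client
    return cosine(client_vec, reencode(text)) >= COSINE_FLOOR
```

Clients that fail the sampled check can be rate-limited or have their model version flagged for mismatch.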
3. Cognitive Core Integration Model
This section outlines how the Lexicon interfaces with a latent-space reasoning model (referred to as the Cognitive Core, e.g., a VMMoE or Mamba-based model).
3.1. Instruction Fusion & Response Deconstruction
The system uses a triplet format for precise control. A dedicated module on the Cognitive Core fuses the input triplet into a single instruction vector for processing and deconstructs the final thought vector back into a triplet.
      INPUT                                COGNITIVE CORE                                      OUTPUT
+-------------------+   +---------------------------------------------------------+   +---------------------+
  V_Task (Verb)                                                                        V_Task_Response
  V_Modifier (Adj) ----> Instruction Fusion Module (Cross-Attention) -> V_Inst  ---->  V_Modifier_Response
  V_Data (Noun)         |                          |                              |    V_Data_Response
+-------------------+   |                          v                              |   +---------------------+
                        | [Mamba/Jamba Blocks] -> Processes sequence of V_Inst    |
                        |                          |                              |
                        |                          v                              |
                        | Final V_Thought -> Response Deconstruction (3 MLP Heads)|
                        +---------------------------------------------------------+
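The fusion step can be illustrated with a toy single-head cross-attention in plain Python. The dimensions and the residual wiring are assumptions for illustration; the production module would be a learned network:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse(v_task, v_mod, v_data):
    """Single-head cross-attention: V_Task queries the (modifier, data) pair
    and the attended mix is added back as a residual to form V_Inst."""
    keys = [v_mod, v_data]
    d = len(v_task)
    scores = [dot(v_task, k) / math.sqrt(d) for k in keys]
    w = softmax(scores)
    attended = [w[0] * m + w[1] * x for m, x in zip(v_mod, v_data)]
    return [t + a for t, a in zip(v_task, attended)]
```

The three MLP deconstruction heads on the output side would simply be three independent learned projections of the final V_Thought.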
3.2. End-to-End Generative Flow
This diagram shows the complete lifecycle of a generative query, integrating all components.
+-------------+   +------------------+   +----------------------+   +--------------------+   +---------------------+
| User Text   |-->| Client GTR-T5    |-->| Lexicon (this PRD)   |-->| Cognitive Core     |-->| Lexicon (this PRD)  |
| "Summarize  |   | Encodes Triplet  |   | Ingress: Txt->Vector |   | (VMMoE / Mamba)    |   | Egress: Vector->Txt |
|  briefly X" |   | V_T, V_M, V_D    |   | (Lookup or Forge)    |   | Processes V_Inst   |   | (Lookup or Decode)  |
+-------------+   +------------------+   +----------------------+   | Generates V_Resp   |   +---------------------+
                                                                    +--------------------+              |
                                                                                                        v
                                                                                          +---------------------+
                                                                                          | Client Lightweight  |
                                                                                          | LLM (Smoother)      |
                                                                                          | --> Final Response  |
                                                                                          +---------------------+
4. Success Metrics & Scope
_(Sections 4 and 5 remain unchanged from v1.0)_
4.1. Platform Growth & Adoption
4.2. System Performance & Quality
4.3. Decentralization & Community
5. Out of Scope for v1.0
6. Primary System States
Here is a table that captures the four primary states of the system:

| Concept State | Direction | Processing Path |
| --- | --- | --- |
| Known | Ingress | 1. Client submits (Text, Vector). 2. Server performs ANN search. 3. Server verifies with ROUGE-L. |
| Novel | Ingress | 1. ANN search finds no match. 2. Client submits (Text, Vector). 3. Server verifies & adds to blockchain batch. |
| Known | Egress | 1. Server performs ANN search on the response vector. 2. Lookup returns the canonical text. |
| Novel | Egress | 1. Server decodes the vector with vec2text. 2. Client LLM smoothes the raw text. 3. New concept is added to the database. |
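The four states reduce to a simple dispatch on direction and lookup result; a minimal sketch (the path names are illustrative labels, not API identifiers):

```python
def route(direction: str, match_found: bool) -> str:
    # Map each of the four primary states to its processing path.
    if direction == "ingress":
        return "fast_path_lookup" if match_found else "forge_and_commit"
    if direction == "egress":
        return "fast_path_lookup" if match_found else "vec2text_decode"
    raise ValueError(f"unknown direction: {direction}")
```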
Here’s what’s publicly available today for vec2text-style decoders at 768D that interoperate with the GTR-T5 embedding space (cosine ≥ 0.85). Short version: the only battle-tested, openly documented 768D option that natively targets GTR-T5 is the official Vec2Text GTR-base corrector (plus faithful reproductions). Everything else at 768D (e.g., DPR/BERT/MPNet spaces) either lacks a released decoder or needs a learned bridge into GTR before decoding.
768D vec2text models compatible with GTR-T5
- Official Vec2Text GTR-base corrector: load via `load_pretrained_corrector("gtr-base")` (HF example checkpoint `jxm/gtr__nq__32__correct`). Encoder: `sentence-transformers/gtr-t5-base`. Input: `torch.Tensor` (fp32/fp16), shape `(B, 768)`; runs on GPU or CPU. Decoding knobs: `num_steps`, optional sequence-level beam search. Public pip release: `vec2text==0.0.13`; GTR examples and training scripts in the README.
- IELab reproduction/defense work: install `vec2text` in editable mode; same `torch.Tensor` (fp32/fp16), `(B, 768)` input; uses `load_pretrained_corrector("gtr-base")` and the HF GTR checkpoints; provides code and reports cosine improvements in some settings. (GitHub, arXiv)
Practical compatibility notes (what actually differs in use)
- Device: the examples call `.cuda()`, but CPU works (slower). (GitHub)
- Packaging: public pip release (`vec2text==0.0.13`) with GTR examples in the README; some pretrained GTR checkpoints are on HF under `jxm/`. The repo notes most released weights are for OpenAI ada-002, but GTR inverters are provided (paper models; examples and aliases are documented). (PyPI, GitHub)

What's _not_ (yet) public at 768D
Suggested env matrix (works in practice)
These versions reflect a mix of the vec2text repo state, ST docs, and real-world reports (e.g., GTR-T5-large threads listing torch 2.6.0 / transformers 4.44.2 / sbert 3.0.1; transformers 4.44.x is a safe pin). (GitHub, Hugging Face, PyPI)
References (key)
- Vec2Text repository: `load_pretrained_corrector("gtr-base")`, GTR training/eval recipes, HF alias `jxm/gtr__nq__32__correct`. (GitHub)
If you encode with GTR-T5 (768D) and decode with both a Vec2Text GTR-768 corrector (the “jxm/” HF checkpoint you meant by “jxe”) and the IELab reproduction that targets the same GTR space, you should get semantically very similar outputs. On text metrics (ROUGE-L / BLEU) they’ll usually be in the same ballpark; on semantic metrics (re-embed → cosine) they’ll be very close.
Where they might differ a bit:
How I’d verify “similar on ROUGE/BLEU” (and what to expect)
- Encode the original string with `sentence-transformers/gtr-t5-base` → v_src ∈ ℝ^{768}.
- cos(v_src, v_decodedA) and cos(v_src, v_decodedB) → these should be very close (both typically ≥ your 0.85 bar when settings are sane).
- BLEU/ROUGE-L between A and B (and optionally vs the original string). Expect similar scores; minor paraphrases can nudge BLEU/ROUGE a bit even when semantics match.
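The re-embed check above can be wrapped in a small harness. Here `embed` is a stand-in for the `sentence-transformers/gtr-t5-base` encoder, and `text_a`/`text_b` are the two decoded strings (e.g., the `jxm/` checkpoint vs the IELab reproduction):

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / ((sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5))

def compare_decoders(v_src, text_a, text_b, embed, bar=0.85):
    """Re-embed both decoders' outputs and check each clears the cosine bar.

    `embed` is a stand-in for the real GTR-T5 encoder; `text_a`/`text_b`
    are the two decoded strings being compared.
    """
    cos_a = cosine(v_src, embed(text_a))
    cos_b = cosine(v_src, embed(text_b))
    return cos_a, cos_b, (cos_a >= bar and cos_b >= bar)
```

Plug in the real encoder and the two correctors' outputs to confirm both clear 0.85 on your machine before comparing ROUGE-L/BLEU between them.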
Practical tips to make them match more closely
Bottom line
If both decoders are the GTR-768 variants and you hold decoding/correction constant, they’ll usually produce paraphrases with similar ROUGE-L/BLEU, and their re-embedded cosine to your source vector will be very close. Any residual differences are mostly from training/checkpoint nuances and decoding hyperparams, not from fundamental incompatibility.