PRD: Power Law Engine (PLE)

File: docs/PRDs/PRD_Power_Law_Engine_PLE.md

Owner: Architect + PLMS Lead

Status: Draft v0

Trent Carter

⸻

1. Purpose & Concept

The Power Law Engine (PLE) is a cross-cutting analytics module that:

Pre-execution (Blueprint / PLMS):

• Models fat-tailed risk (time, tokens, cost) from historical runs.

• Adjusts estimates & confidence intervals to reflect avalanche / outlier behavior.

During execution (PAS Runtime):

• Monitors cascading failures (“avalanches”) and outlier events.

• Raises early warnings and suggests load shedding / controlled burns.

Post-execution:

• Updates lane-level and system-level tail statistics (exponents, thresholds).

• Feeds back into Blueprint, pricing, and PLE-driven dashboards.

⸻

2. Goals / Non-Goals

Goals

• Capture heavy-tailed behavior in PAS runs (cost, time, retries, cascades).

• Provide tail-aware estimates to Blueprint / PLMS (90–99% CI, not just averages).

• Detect avalanches (large cascades of tasks/actions) in real time.

• Provide lane-level risk scores (tail-risk) consumable by:

• PLMS Blueprint

• PAS Resource Manager

• HMI dashboards

Non-Goals

• PLE does not change core PAS orchestration logic directly.

• PLE does not define product pricing by itself (it informs pricing).

• PLE does not perform ML training; it ingests telemetry and outputs analytics.

⸻

3. High-Level Architecture

3.1 Placement in Verdict System

flowchart LR

subgraph User/HMI

HMI[Verdict HMI\nDashboards & Blueprint UI]

end

subgraph Planning

BP[Blueprint / PLMS\nEstimation & Scoping]

end

subgraph Runtime

PAS[ PAS Root & Agents\n(Dir/Mgr/Prog) ]

RM[Resource Manager]

end

subgraph Analytics

LOGS[(Action / Run Logs)]

PLE[Power Law Engine\n(Analytics Service)]

end

PAS --> LOGS

RM --> LOGS

LOGS --> PLE

PLE --> BP

PLE --> RM

PLE --> HMI

Type: Separate internal service (FastAPI / gRPC style), stateless, reading from logs/warehouse, writing back summaries/flags.

⸻

4. Core Concepts & Data

4.1 Event & Avalanche

Event = a root-cause–grouped unit of work, typically:

• One Blueprint job or PAS project_run.

• Can contain 1..N actions / job cards across agents.

Avalanche = a large event where:

• A single triggering failure / mis-estimate leads to:

• Many retries

• Rollbacks / re-plans

• Test re-runs

• Large token/time overrun

Key metrics per event:

• event_id

• root_cause_id (if available)

• lane (Code, Docs, Data, QA, etc.)

• num_actions (job cards / tool calls / steps)

• tokens_actual, tokens_expected, tokens_delta

• time_actual, time_expected, time_delta

• cost_actual, cost_expected, cost_delta

• num_rollbacks, num_retries

PLE uses distributions of these to detect power-law tails.
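A minimal sketch of the per-event record, assuming Python dataclasses. The field names follow the list above; the types, the units, and the convention delta = actual − expected are assumptions that this PRD does not fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventSummary:
    """One root-cause-grouped unit of work (hypothetical schema)."""
    event_id: str
    lane: str
    num_actions: int
    tokens_actual: int
    tokens_expected: int
    time_actual: float       # seconds (assumed unit)
    time_expected: float
    cost_actual: float       # currency units (assumed unit)
    cost_expected: float
    num_rollbacks: int = 0
    num_retries: int = 0
    root_cause_id: Optional[str] = None   # "if available"

    # Deltas are derived rather than stored, so they cannot drift
    # out of sync with the actual/expected pairs.
    @property
    def tokens_delta(self) -> int:
        return self.tokens_actual - self.tokens_expected

    @property
    def time_delta(self) -> float:
        return self.time_actual - self.time_expected

    @property
    def cost_delta(self) -> float:
        return self.cost_actual - self.cost_expected


event = EventSummary(
    event_id="run-001", lane="Code", num_actions=42,
    tokens_actual=180_000, tokens_expected=60_000,
    time_actual=900.0, time_expected=300.0,
    cost_actual=4.5, cost_expected=1.5,
    num_rollbacks=2, num_retries=7,
)
```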

⸻

5. Functional Requirements

5.1 Pre-Execution: Blueprint / PLMS Integration

Inputs:

• Historical event summaries (per lane, per project type).

• Draft Blueprint estimate: (time_est, cost_est, token_est, lane_mix, risk_profile).

Behavior:

Tail Characterization

• For each lane and project type, PLE maintains:

• Approximate tail exponent (α) for cost/time/overrun.

• Thresholds for “normal” vs “extreme” events per metric.
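One possible way to obtain the approximate tail exponent is the Hill estimator, which Section 10 names as a candidate method. The sketch below assumes the metric is positive and that the number of tail points `k` is chosen by the caller; the function name and the synthetic sample are illustrative only.

```python
import math
from typing import Sequence


def hill_tail_exponent(values: Sequence[float], k: int) -> float:
    """Hill estimate of the tail exponent alpha from the k largest values.

    alpha_hat = k / sum_{i=1..k} ln(x_(i) / x_(k+1)),
    where x_(1) >= x_(2) >= ... are the values in descending order.
    """
    xs = sorted((v for v in values if v > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = xs[k]  # the (k+1)-th largest value acts as the tail cutoff
    log_sum = sum(math.log(x / threshold) for x in xs[:k])
    if log_sum == 0:
        raise ValueError("top-k values are all equal to the threshold")
    return k / log_sum


# Synthetic Pareto sample with alpha = 2, built from evenly spaced quantiles
# (deterministic stand-in for historical cost overruns).
n = 1000
sample = [(1 - (i + 0.5) / n) ** (-1 / 2.0) for i in range(n)]
alpha_hat = hill_tail_exponent(sample, k=100)
```

The estimate is sensitive to `k`: too small gives a noisy value, too large pulls in non-tail data. In practice one would inspect the estimate across a range of `k` before trusting it.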

Tail-Aware Estimate Adjustment

• Given a new Blueprint:

• Adjust cost / time estimates with tail-aware confidence bands (e.g. P50, P90, P99).

• Produce a Tail Risk Score (0–100) for the project.
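As an illustration of tail-aware bands, the sketch below extrapolates P90 and P99 from a Pareto tail fitted to historical overrun ratios (actual / expected). Everything here is an assumption for illustration: the parameter names, the Pareto tail model, and in particular the mapping from the P99 ratio to a 0–100 score, which this PRD does not define.

```python
from typing import Dict


def pareto_tail_quantile(p: float, xmin: float, alpha: float,
                         tail_fraction: float) -> float:
    """Quantile at level p, assuming P(X > x) = tail_fraction * (x/xmin)^-alpha
    for x >= xmin. Only valid when 1 - p <= tail_fraction."""
    if not 0 < 1 - p <= tail_fraction:
        raise ValueError("p must lie inside the fitted tail region")
    return xmin * ((1 - p) / tail_fraction) ** (-1 / alpha)


def tail_bands(estimate: float, median_ratio: float, xmin: float,
               alpha: float, tail_fraction: float) -> Dict[str, float]:
    """Scale a point estimate by historical overrun-ratio quantiles."""
    return {
        "P50": estimate * median_ratio,
        "P90": estimate * pareto_tail_quantile(0.90, xmin, alpha, tail_fraction),
        "P99": estimate * pareto_tail_quantile(0.99, xmin, alpha, tail_fraction),
    }


def tail_risk_score(p99_ratio: float) -> int:
    """Hypothetical 0-100 score: 0 when P99 equals the estimate,
    approaching 100 as the P99 overrun ratio grows."""
    if p99_ratio <= 1:
        return 0
    return round(100 * (1 - 1 / p99_ratio))


# Example lane: 20% of past runs overran by >= 1.5x, with tail exponent 2.
bands = tail_bands(estimate=100.0, median_ratio=1.0,
                   xmin=1.5, alpha=2.0, tail_fraction=0.2)
score = tail_risk_score(bands["P99"] / 100.0)
```

With these inputs the P90 band is about 2.1× the estimate and the P99 band about 6.7×, which is the kind of gap an average-only estimate hides.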

Output to Blueprint UI / PLMS

• Display:

• “Expected” vs “worst-case (P90 / P95)” ranges.

• Lane contributions to tail risk (which lanes dominate the tail).

• Optionally: Recommend rehearsal intensity or sandboxing level for high tail-risk jobs.

5.2 During Execution: PAS Runtime Monitoring

Inputs:

• Streaming (or batched) action_logs from PAS and Resource Manager:

• Action type, lane, timestamps.

• Tokens, retries, rollbacks, errors.

Behavior:

Event Aggregation

• Group actions into ongoing events (by project_run, run_id, or root_cause_id).

Real-time Avalanche Detection

• Monitor for:

• num_actions > k × median for similar jobs.

• tokens_delta, time_delta, num_retries, num_rollbacks exceeding lane-specific thresholds.

• When crossing thresholds:

• Emit “Avalanche Warning” to:

• Resource Manager (for load shedding / pause / replan).

• HMI (visual alert and context).
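The detection rules above can be sketched as a single check over a live event. This is a sketch under stated assumptions: the alert levels, the rule that two or more breaches escalate to "critical", the suggested action names, and the default `k = 3` are all hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional, Sequence


@dataclass
class AvalancheAlert:
    event_id: str
    lane: str
    level: str               # "warning" or "critical" (assumed levels)
    reasons: List[str]
    suggested_action: str    # advisory only; PLE does not enforce it


def check_avalanche(event_id: str, lane: str, live: Dict[str, float],
                    peer_num_actions: Sequence[float],
                    thresholds: Dict[str, float],
                    k: float = 3.0) -> Optional[AvalancheAlert]:
    """Return an alert if the live event breaches any rule, else None."""
    reasons: List[str] = []
    baseline = median(peer_num_actions)
    # Rule 1: event size relative to the median of similar jobs.
    if live.get("num_actions", 0) > k * baseline:
        reasons.append(f"num_actions {live['num_actions']:g} > {k:g} x median {baseline:g}")
    # Rule 2: lane-specific thresholds on deltas, retries, rollbacks.
    for metric, limit in thresholds.items():
        value = live.get(metric, 0.0)
        if value > limit:
            reasons.append(f"{metric} {value:g} > threshold {limit:g}")
    if not reasons:
        return None
    level = "critical" if len(reasons) >= 2 else "warning"
    action = "pause_and_replan" if level == "critical" else "shed_load"
    return AvalancheAlert(event_id, lane, level, reasons, action)


code_lane_thresholds = {"tokens_delta": 50_000, "num_retries": 10, "num_rollbacks": 3}
alert = check_avalanche(
    "run-001", "Code",
    live={"num_actions": 120, "tokens_delta": 90_000, "num_retries": 4},
    peer_num_actions=[20, 25, 30, 35, 40],
    thresholds=code_lane_thresholds,
)
quiet = check_avalanche(
    "run-002", "Code",
    live={"num_actions": 28, "tokens_delta": 1_000, "num_retries": 1},
    peer_num_actions=[20, 25, 30, 35, 40],
    thresholds=code_lane_thresholds,
)
```

The `reasons` list doubles as the explanation text the HMI needs to show why an event was flagged.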

Lane Load & Sandpile Height

• Maintain a per-lane load metric (“pile height”):

• queued complexity, active events, historical risk.

• When load exceeds safe band:

• Suggest slowing new intake to that lane.

• Optionally mark lane as “critical” in scheduling.
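A minimal sketch of a "pile height" computation, assuming the load index is a weighted sum of the three inputs named above, each normalized to roughly 0..1. The weights, the capacity parameters, and the 0.7 / 0.9 band edges are placeholders, not values from this PRD.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LaneLoadState:
    lane: str
    load_index: float   # ~0.0 (idle) to 1.0 or more (saturated)
    risk_flag: str      # "ok" | "slow_intake" | "critical" (assumed labels)


def lane_load(lane: str, queued_complexity: float, active_events: int,
              historical_risk: float, *, capacity_complexity: float,
              capacity_events: int,
              weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),
              safe_band: float = 0.7,
              critical_band: float = 0.9) -> LaneLoadState:
    """Combine queue depth, active events, and historical risk (0..1)."""
    w_queue, w_active, w_risk = weights
    index = (w_queue * queued_complexity / capacity_complexity
             + w_active * active_events / capacity_events
             + w_risk * historical_risk)
    if index >= critical_band:
        flag = "critical"        # mark lane as critical in scheduling
    elif index >= safe_band:
        flag = "slow_intake"     # suggest slowing new intake
    else:
        flag = "ok"
    return LaneLoadState(lane, round(index, 3), flag)


state = lane_load("Code", queued_complexity=80, active_events=6,
                  historical_risk=0.9,
                  capacity_complexity=100, capacity_events=10)
```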

Outputs:

• avalanche_alerts (event_id, level, lane, suggested action).

• lane_load_state (per-lane load index, risk flag).

• run_risk_summary (for current project).

⸻

6. Non-Functional Requirements

• Low overhead:

• PLE must not block PAS execution.

• Operates on logs / async streams; outputs are advisory.

• Configurable sensitivity:

• Tail thresholds & alert levels tunable per lane.

• Explainable:

• HMI must show _why_ a job is high risk (e.g. “overrun 5× lane median for similar complexity”).

⸻

7. User Experience (HMI)

Two primary UI surfaces:

Blueprint / Estimation Panel

• Shows:

• “Average vs Tail” bars for cost/time.

• Tail Risk Score badge (e.g., Low / Medium / High).

• Tooltips: short explanations like:

“Historically, 5% of similar Code+QA projects cost ≥4× the median. We’ve widened your budget band accordingly.”

Runtime Risk Dashboard

• Per-project view:

• Current event size vs typical.

• Avalanche alerts (timeline).

• Lane view:

• Load gauge (pile height).

• Tail-risk indicator (how often this lane produces big avalanches).

⸻

8. Data Flow (Simplified)

flowchart TD

subgraph Logs

A[Action Logs\n(tokens, time, errors,...)]

E[Event Summaries\n(historical)]

end

subgraph PLE

B[PLE Batch\nTail Modeling]

C[PLE Runtime\nAvalanche Monitor]

end

subgraph Consumers

PLMS[PLMS / Blueprint]

RM[Resource Manager]

HMI[HMI UI]

end

A --> C

A --> B

E --> B

B --> PLMS

B --> HMI

C --> RM

C --> HMI

⸻

9. Three Additional Power-Law Features for Verdict

These are extra PLE-driven features that exploit power-law dynamics beyond basic risk modeling:

9.1 Customer Value & “Whale Radar”

Observation: Customer revenue and usage typically follow a power law: a few whales contribute most of the revenue and usage, while the long tail contributes the rest.

Feature:

• PLE ingests:

• Per-account usage (tokens, projects, active seats).

• Revenue / plan tier.

• Models the tail of account value distribution.

• Outputs to:

• Growth / Success dashboards:

• Identify whales early.

• Warn when whales are doing high-tail-risk projects (combine PLE project risk × account value).

• Blueprint:

• For high-value customers, suggest more conservative risk bands or extra rehearsal.

⸻

9.2 Blueprint Component / Template “Hit Map”

Observation: Usage frequency of Blueprint components, templates, and code recipes is heavy-tailed: a small set accounts for most usage.

Feature:

• PLE tracks:

• Which Blueprint components are used in which projects.

• Downstream outcomes: overruns, bugs, SLO violations.

• Identifies:

• Power components: the roughly 20% of components used in roughly 80% of successful projects.

• Toxic components: components that disproportionately appear in tail failures.

• Uses:

• HMI marks power components with a “Golden Path” badge.

• PLMS preferentially suggests power components in auto-generated plans.

• PAS flags toxic components for refactor / replacement work (controlled burns).
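One way to make "disproportionately appear in tail failures" concrete is a lift ratio: how often a component shows up in tail failures relative to how often it shows up overall. The sketch below assumes each project is summarized as a set of component names plus a tail-failure flag; the lift threshold, the minimum-use cutoff, and the component names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def component_hit_map(projects: Iterable[Tuple[Set[str], bool]],
                      toxic_lift: float = 2.0,
                      min_uses: int = 3) -> Dict[str, dict]:
    """projects: (components used, was the project a tail failure?)."""
    uses: Counter = Counter()
    tail_uses: Counter = Counter()
    total = tail_total = 0
    for components, is_tail in projects:
        total += 1
        tail_total += is_tail
        for name in components:
            uses[name] += 1
            tail_uses[name] += is_tail
    result = {}
    for name, count in uses.items():
        base_rate = count / total                      # share of all projects
        tail_rate = tail_uses[name] / tail_total if tail_total else 0.0
        lift = tail_rate / base_rate                   # > 1: over-represented in failures
        result[name] = {
            "uses": count,
            "usage_share": base_rate,
            "lift": lift,
            # min_uses guards against flagging rarely used components on noise
            "toxic": count >= min_uses and lift >= toxic_lift,
        }
    return result


projects = [
    ({"auth_template", "legacy_orm"}, True),
    ({"legacy_orm"}, True),
    ({"auth_template", "legacy_orm"}, False),
] + [({"auth_template"}, False)] * 7

hit_map = component_hit_map(projects)
```

In this toy data `legacy_orm` is used in 30% of projects but appears in every tail failure, so it is flagged, while the widely used `auth_template` is not.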

⸻

9.3 Incident Severity & Controlled Burn Planner

Observation: Incident severity (downtime, bug impact, hotfix cost) typically follows a power law: a few incidents dominate total pain.

Feature:

• PLE ingests:

• Incident logs (severity, impacted users, time-to-resolve, cost).

• Models:

• Severity distribution tail (α).

• Which lanes, components, and providers dominate catastrophic incidents.

• Outputs to:

• Controlled Burn Planner:

• Suggests specific maintenance / refactor projects with maximal tail-risk reduction per unit effort.

• Blueprint:

• For critical components (high incident tail), suggests extra testing / redundancy budgets.

• HMI:

• “Top 10 sources of catastrophic incidents” list.
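"Maximal tail-risk reduction per unit effort" can be sketched as a greedy ranking within a capacity budget. The candidate fields, their units, and the example numbers below are hypothetical; greedy selection is a simple heuristic here, not an optimal knapsack solution.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BurnCandidate:
    name: str
    expected_tail_cost_reduction: float   # e.g. incident cost avoided (assumed unit)
    effort: float                         # e.g. engineer-days (assumed unit)

    @property
    def value_per_effort(self) -> float:
        return self.expected_tail_cost_reduction / self.effort


def plan_controlled_burns(candidates: Sequence[BurnCandidate],
                          capacity: float) -> List[BurnCandidate]:
    """Pick candidates by reduction-per-effort until capacity runs out."""
    plan: List[BurnCandidate] = []
    remaining = capacity
    ranked = sorted(candidates, key=lambda c: c.value_per_effort, reverse=True)
    for candidate in ranked:
        if candidate.effort <= remaining:
            plan.append(candidate)
            remaining -= candidate.effort
    return plan


candidates = [
    BurnCandidate("refactor legacy_orm", 50_000, 10),
    BurnCandidate("replace flaky provider", 30_000, 3),
    BurnCandidate("rewrite QA harness", 40_000, 20),
]
plan = plan_controlled_burns(candidates, capacity=15)
```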

⸻

10. Open Questions & Next Steps

Event grouping semantics

• Do we define event_id strictly as project_run, or allow cross-project root causes (e.g., shared infra outage)?

Tail modeling approach

• Exact method (e.g., Hill estimator + goodness-of-fit checks) can be decided later; PLE PRD only requires:

• Ability to identify heavy-tailed vs non-heavy-tailed metrics.

• Approximate exponents & thresholds.
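For the first requirement, one simple check (an assumption, not a decided method) is to compare how well a Pareto tail and an exponential tail explain the data above a cutoff `xmin`, using maximum-likelihood fits of each. A positive log-likelihood difference favors the heavy-tailed model. The function name and the synthetic samples are illustrative.

```python
import math
from typing import Sequence


def pareto_vs_exponential(values: Sequence[float], xmin: float) -> float:
    """Log-likelihood of a Pareto fit minus that of an exponential fit,
    both fitted by maximum likelihood to the values >= xmin."""
    tail = [v for v in values if v >= xmin]
    n = len(tail)
    if n < 2:
        raise ValueError("need at least two tail points")
    sum_log = sum(math.log(v / xmin) for v in tail)
    sum_excess = sum(v - xmin for v in tail)
    if sum_log <= 0:
        raise ValueError("tail values have no spread above xmin")
    alpha = n / sum_log        # Pareto MLE of the tail exponent
    rate = n / sum_excess      # exponential MLE on the excesses over xmin
    ll_pareto = n * math.log(alpha / xmin) - (alpha + 1) * sum_log
    ll_exponential = n * math.log(rate) - rate * sum_excess
    return ll_pareto - ll_exponential


# Deterministic synthetic samples built from evenly spaced quantiles.
n = 200
quantiles = [(i + 0.5) / n for i in range(n)]
pareto_sample = [(1 - u) ** (-1 / 1.5) for u in quantiles]    # alpha = 1.5, xmin = 1
thin_sample = [1 - math.log(1 - u) for u in quantiles]        # 1 + Exp(1)

heavy = pareto_vs_exponential(pareto_sample, xmin=1.0)
thin = pareto_vs_exponential(thin_sample, xmin=1.0)
```

A raw difference like this is only a first screen; a production check would also need a significance test and a principled choice of `xmin`.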

Alert routing

• Which alerts are auto-surfaced to the user vs only used internally by Resource Manager?

⸻

11. Tron Integration (System-of-Systems Layer)

Objective: Expose PLE’s tail-awareness and avalanche signals to Tron, so Tron can make global, cross-project decisions (intake, routing, safety modes, maintenance scheduling).

11.1 Tron’s Role w.r.t. PLE

At the Tron level, PLE is treated as a global risk oracle:

• Provides lane- and project-level tail risk.

• Surfaces avalanche patterns and fuel load (tech-debt / incident tail).

• Informs policy switches:

• When to accept, reject, or defer incoming work.

• How aggressively to run rehearsal / sandboxing.

• When to schedule controlled burns.

High-level:

flowchart LR

subgraph Tron

T[Tron\nGlobal Policy & Governance]

end

subgraph Estimation & Planning

BP[Blueprint / PLMS]

end

subgraph Runtime

PAS[ PAS Root & Agents ]

RM[Resource Manager]

end

subgraph Analytics

PLE[Power Law Engine]

end

PAS --> PLE

RM --> PLE

BP --> PLE

PLE --> BP

PLE --> RM

PLE --> T

T --> BP

T --> RM

T --> PAS

11.2 Tron–PLE Interaction Points

1. Intake & Project Gating

• Input to Tron:

• Project-level Tail Risk Score from PLE (0–100, bucketed as Low/Med/High).

• P50 / P90 / P99 cost & time bands.

• Tron policy:

• For High tail risk + high-value customer:

• Require rehearsal mode and/or manual approval.

• For High tail risk + low-value / trial user:

• Throttle size/scope or defer.

• For Low tail risk:

• Auto-approve, minimal guardrails.
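The gating policy above can be sketched as a small decision function. The score cut points (34 / 67) and the handling of Medium risk are assumptions, since the policy above only specifies the High and Low cases.

```python
from enum import Enum


class IntakeDecision(Enum):
    AUTO_APPROVE = "auto_approve"
    STANDARD_GUARDRAILS = "standard_guardrails"   # assumed: Medium is not specified above
    REHEARSAL_OR_MANUAL_APPROVAL = "rehearsal_or_manual_approval"
    THROTTLE_OR_DEFER = "throttle_or_defer"


def risk_band(tail_risk_score: int) -> str:
    """Bucket a 0-100 score; the 34 / 67 cut points are assumptions."""
    if not 0 <= tail_risk_score <= 100:
        raise ValueError("tail_risk_score must be in 0..100")
    if tail_risk_score >= 67:
        return "High"
    if tail_risk_score >= 34:
        return "Med"
    return "Low"


def intake_decision(tail_risk_score: int,
                    high_value_customer: bool) -> IntakeDecision:
    band = risk_band(tail_risk_score)
    if band == "Low":
        return IntakeDecision.AUTO_APPROVE
    if band == "High":
        if high_value_customer:
            return IntakeDecision.REHEARSAL_OR_MANUAL_APPROVAL
        return IntakeDecision.THROTTLE_OR_DEFER
    return IntakeDecision.STANDARD_GUARDRAILS


whale_decision = intake_decision(85, high_value_customer=True)
trial_decision = intake_decision(85, high_value_customer=False)
```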

2. Global Lane Pressure & Intake Throttling

• Input to Tron:

• Per-lane load index (“pile height”) and recent avalanche frequency.

• Tron policy:

• If a lane’s load/risk exceeds threshold:

• Reduce intake from Blueprint into that lane.

• Prefer routing new work to alternative lanes / templates when possible.

• If a lane is consistently low risk & underutilized:

• Allow Tron to steer more experimental / high-variance work there.

3. Controlled Burn Scheduling

• Input to Tron:

• PLE’s incident tail analysis and fuel load metrics:

• Components / repos / lanes driving the largest catastrophic incidents.

• Tron policy:

• Create maintenance / refactor epics and feed them into Blueprint.

• Reserve a fixed fraction of global capacity (e.g., 10–20%) for Tron-initiated controlled burns targeted at:

• Top incident sources.

• Lanes with worst tail exponents.

11.3 Configuration Ownership

• PLE implementation & metrics: owned by PLE/Analytics.

• PLE thresholds, policies, and “what to do when…”:

• Owned by Tron config:

• Tail-risk level → required rehearsal level.

• Lane load index → intake multiplier (e.g., 1.0, 0.5, 0.0).

• Incident tail severity → maintenance budget rules.
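As an illustration of the lane load → intake multiplier mapping, the sketch below uses a step table that Tron config could own. The breakpoints (0.7 and 0.9) are placeholders; only the multiplier values 1.0, 0.5, and 0.0 come from the example above.

```python
from bisect import bisect_right
from typing import List, Tuple

# (lower bound of load index, intake multiplier); breakpoints are assumed.
INTAKE_STEPS: List[Tuple[float, float]] = [(0.0, 1.0), (0.7, 0.5), (0.9, 0.0)]


def intake_multiplier(load_index: float,
                      steps: List[Tuple[float, float]] = INTAKE_STEPS) -> float:
    """Return the multiplier of the highest step whose bound <= load_index."""
    bounds = [bound for bound, _ in steps]
    position = bisect_right(bounds, load_index) - 1
    return steps[max(position, 0)][1]


normal = intake_multiplier(0.5)
slowed = intake_multiplier(0.7)
closed = intake_multiplier(0.95)
```

Keeping the table as data rather than code lets thresholds change through configuration without touching PLE itself.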

Tron remains the source of truth for:

• Global safety posture.

• How aggressive or conservative Verdict should be, given current tail risk and resource posture.

11.4 Open Questions (Tron-specific)

• Does Tron treat PLE as a hard gate (can block work), or as a strong advisory (humans can override)?

• Where do we surface Tron+PLE decisions in HMI so users understand:

• “This project was throttled because global tail-risk is elevated.”

• “This maintenance epic was auto-created by Tron based on PLE incident data.”

⸻

11.5 Alert vs Action Policies & Tron UI

11.5.1 Alert vs Action: Two Independent Settings

Tron treats PLE signals with two separate policy axes, both user-configurable:

Alert Level (how noisy the UI is)

Dropdown in Settings → “Tron / PLE Notifications”:

• Silent:

• No popups.

• Status bar only (icon + color).

• Tron View still records everything.

• Normal (default):

• Status bar indicator.

• Inline toast for High-severity events (e.g., PLE avalanche, lane overload).

• Tron View error list entries.

• Verbose:

• All of the above plus extra warnings for Medium-level anomalies.

• Suitable for debugging / power users.

Action Level (how aggressive Tron is)

Separate dropdown: “Tron Actions on PLE Events”:

• Off

• Tron never blocks or throttles based on PLE.

• Only logs & visualizes (pure advisory).

• Advisory Only

• Tron recommends actions (slow lane, rehearse, split project) but does not enforce.

• UI shows “Recommended by Tron” badges; user may apply or ignore.

• Soft Gate

• For high tail-risk projects or active avalanches:

• Tron pauses execution at decision points and prompts the user:

“Tron recommends pausing intake to Code lane. Proceed anyway?”

• User can override on a per-case basis.

• Hard Gate

• Tron blocks risky operations unless explicitly overridden with a strong action (e.g., “Force Run Anyway”).

• Intended for high-stakes / enterprise / compliance-heavy environments.
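The four Action Levels can be sketched as an enum plus a resolver that turns a PLE signal into a Tron outcome. The outcome strings and the `user_override` flag are hypothetical names; policy locks at the workspace level are not modeled here.

```python
from enum import IntEnum


class ActionLevel(IntEnum):
    OFF = 0
    ADVISORY_ONLY = 1
    SOFT_GATE = 2
    HARD_GATE = 3


def resolve_action(level: ActionLevel, high_risk: bool,
                   user_override: bool = False) -> str:
    """Map (action level, PLE risk signal, override) to a Tron outcome."""
    if not high_risk:
        return "proceed"
    if level is ActionLevel.OFF:
        return "log_only"                  # never blocks or throttles
    if level is ActionLevel.ADVISORY_ONLY:
        return "recommend"                 # badge in UI, user may ignore
    if user_override:
        return "proceed_with_override"     # override is logged for audit
    if level is ActionLevel.SOFT_GATE:
        return "prompt_user"               # pause at decision point and ask
    return "block"                         # HARD_GATE without explicit override


soft = resolve_action(ActionLevel.SOFT_GATE, high_risk=True)
hard = resolve_action(ActionLevel.HARD_GATE, high_risk=True)
forced = resolve_action(ActionLevel.HARD_GATE, high_risk=True, user_override=True)
```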

Design note: Both settings are per-user (for UX noise) but can also be workspace-locked by policy in enterprise mode.

⸻

11.5.2 Status Bar Indicators

During project execution (in Verdict IDE / HMI main window):

• A dedicated Tron/PLE status item appears in the status bar with:

• Icon state:

• ✅ Green: No active Tron warnings.

• 🟡 Yellow: Warnings (medium tail-risk or lane load).

• 🔴 Red: Active Tron errors / hard gates.

• Badge count: number of active Tron events affecting current project.

Behavior:

• Clicking the Tron/PLE status item opens the Tron View, focused on:

• The current project_run.

• The most recent warning/error if any.

⸻

11.5.3 Tron View (Global PLE / Tron Console)

The Tron View is a dedicated panel / window that complements Sequence and Tree views:

• Entry points:

• Status bar Tron icon.

• Clicking any “Tron error” or “Tron warning” link from:

• Run logs.

• Notifications.

• Blueprint / PLMS risk panel.

• Layout (conceptual):

flowchart TD

A[Tron Error Toast / Status Bar] --> B[Tron View\n(Global Control Panel)]

B --> C[Project Tab\n(Current run-focused)]

B --> D[Lanes Tab\nLane load & tail risk]

B --> E[Incidents Tab\nHistorical PLE/Tron events]

Project Tab (default when opened during a run):

• Summary strip:

• Tail Risk Score for the project.

• Current Action Level (Off / Advisory / Soft / Hard).

• Current Alert Level (Silent / Normal / Verbose).

• Event timeline:

• PLE signals (avalanche warnings, lane overloads, incident risk spikes).

• Tron decisions (throttled intake, enforced rehearsal, blocked operation).

• Details panel:

• Clicking an event shows:

• Raw PLE metrics (event size, Δtokens/time, lane load).

• Tron’s decision (what it did / suggested).

• If applicable, user override logged (“Force run at 12:32 by Trent”).

Lanes Tab:

• Table / dashboard:

• Each lane with:

• Load index (“pile height”).

• Tail-risk rating (Low / Med / High).

• Recent avalanche count.

• Buttons or links to:

• Open lane configuration.

• Schedule maintenance (controlled burns) if PLE suggests.

Incidents Tab:

• Historical list of Tron+PLE events:

• Sortable by severity, date, lane, project, customer.

• Clicking any row jumps to the detailed incident view (same as Project Tab detail panel, but historical).

⸻

11.5.4 Tron Error Click-Through Behavior

Whenever Tron generates an error or hard gate:

• UI shows a clickable message, e.g.:

“Tron: Hard gate activated for Project — PLE detected extreme tail-risk in Code lane. [View Details]”

• Clicking “View Details”:

• Opens Tron View → Project Tab, focused on:

• That project.

• That specific Tron/PLE event highlighted.

• User can:

• See full context: metrics, distributions, why it triggered.

• Change Action Level (e.g., temporarily drop from Hard Gate → Soft Gate).

• Choose to override (where allowed by policy).

⸻
