Infra Agent Factory - Architecture & Behaviour Overview

Overview

The Agent Factory is CueCrux’s evidence-driven upgrade system for autonomous agents.
It evaluates agents, diagnoses regressions, proposes improvements, and prepares safe updates, all within strict limits enforced by CROWN receipts, WatchCrux audits, OpsCrux budgets, and SDKCrux contracts.

It enables agents to self-heal and self-improve while remaining auditable, reversible, and safe.

High-Level System Diagram

flowchart TD

    subgraph WatchCrux["WatchCrux (Audits & Drift)"]
        WC1["Health & Ready Checks"]
        WC2["Version Drift Rules"]
        WC3["Audit Outcomes"]
    end

    subgraph OpsCrux["OpsCrux (Control Tower)"]
        OP1["Budgets (C3)"]
        OP2["Runtime Config"]
        OP3["Synthetic Journeys"]
    end

    subgraph AgentRuntime["AgentCrux Runtime"]
        AR1["Agent Runs"]
        AR2["Failure Streaks"]
        AR3["Cost & Time Metrics"]
        AR4["Receipts"]
    end

    subgraph Engine["CueCrux Engine"]
        EN1["CROWN Receipts"]
        EN2["Retrieval Metrics"]
        EN3["Contradiction Rate"]
        EN4["Snapshot Freshness"]
    end

    subgraph AgentFactory["Agent Factory (This Component)"]
        AF1["Diagnostic Workflow"]
        AF2["Dependency Analysis"]
        AF3["Patch / Config Generation"]
        AF4["Validation Suite"]
        AF5["Proposal Builder"]
    end

    subgraph Release["Release Agent (Auto PR)"]
        RL1["Create PR"]
        RL2["Test Matrix"]
        RL3["Merge & Version Bump"]
    end

    %% Edges
    WatchCrux --> AF1
    OpsCrux --> AF1
    AgentRuntime --> AF1
    Engine --> AF1

    AF1 --> AF2
    AF2 --> AF3
    AF3 --> AF4
    AF4 --> AF5

    AF5 --> OpsCrux
    AF5 --> Release

    Release --> OpsCrux

What the Agent Factory Does

The Agent Factory provides a structured, audit-ready loop for improving agents:

Function	Meaning
Evaluation	Inspect agent failures, p95 regressions, drift, contradictions.
Diagnosis	Classify root causes: cost, logic, drift, evidence failure, SDK mismatch.
Patch Generation	Propose config updates or generate code-level PRs.
Validation	Run synthetic journeys, receipt replay, dependency checks.
Proposal	Produce auditable upgrade suggestions with evidence.
Approval Path	Safe configs auto-apply; code patches go through the Release Agent PR workflow.

Upgrade Flow (Narrative)

1. Trigger

An agent enters remediation due to:

WatchCrux detecting drift or SLO breach
Agent runtime hitting failure streak thresholds
OpsCrux or developer requesting optimisation
New dependency or Engine version drop
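
As a rough illustration, the trigger conditions above could be collapsed into a single check like the one below. This is a minimal sketch, not the Factory's actual logic: the `AgentSignals` fields, the trigger names, and the streak threshold of 3 are all hypothetical, since the source does not specify them.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical threshold; in practice this would presumably come from runtime config.
FAILURE_STREAK_THRESHOLD = 3


@dataclass
class AgentSignals:
    """Illustrative snapshot of the signals that can trigger remediation."""
    drift_detected: bool = False          # WatchCrux drift rule fired
    slo_breached: bool = False            # WatchCrux SLO breach
    failure_streak: int = 0               # consecutive failed agent runs
    optimisation_requested: bool = False  # OpsCrux or developer request
    new_version_available: bool = False   # new dependency or Engine version


def remediation_triggers(signals: AgentSignals) -> list[str]:
    """Return the name of every trigger that currently applies."""
    triggers = []
    if signals.drift_detected or signals.slo_breached:
        triggers.append("watchcrux")
    if signals.failure_streak >= FAILURE_STREAK_THRESHOLD:
        triggers.append("failure_streak")
    if signals.optimisation_requested:
        triggers.append("optimisation_request")
    if signals.new_version_available:
        triggers.append("version_change")
    return triggers
```

An agent would enter remediation whenever the returned list is non-empty; keeping the trigger names makes the reason available to later steps.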

2. Evaluate

The Factory reads:

CROWN receipts
Retrieval p95
Contradiction rate
Mode freshness
Cost envelopes
Dependency requirements via SDKCrux
Synthetic journey results from OpsCrux
WatchCrux PASS/WARN/FAIL

This produces a reproducible diagnostic packet.
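
One way to make "reproducible" concrete is to serialise the packet canonically and hash it, so the same inputs always yield the same identifier. The sketch below assumes a packet is a plain JSON-compatible dictionary; the field names are hypothetical and the source does not say how packets are actually identified.

```python
import hashlib
import json


def packet_digest(packet: dict) -> str:
    """Hash a diagnostic packet deterministically.

    Sorting keys and fixing the separators means two packets with the
    same content produce the same digest regardless of key order.
    """
    canonical = json.dumps(packet, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical packet contents, loosely following the inputs listed above.
example_packet = {
    "agent": "researcher",
    "retrieval_p95_ms": 180,
    "contradiction_rate": 0.02,
    "watchcrux_status": "WARN",
}
example_digest = packet_digest(example_packet)
```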

3. Diagnose

The Factory determines whether the problem is:

A dependency break
A logic fault
A performance regression
A cost overrun
A receipt/provenance failure
A version skew
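
The six categories above map naturally onto an enumeration plus a rule-based classifier. The following is only a sketch under stated assumptions: the packet keys (`sdk_contract_ok`, `version_drift`, `receipts_valid`, and so on) and the rules themselves are invented for illustration, and the real diagnosis may well be more nuanced.

```python
from __future__ import annotations

from enum import Enum


class RootCause(Enum):
    DEPENDENCY_BREAK = "dependency_break"
    LOGIC_FAULT = "logic_fault"
    PERFORMANCE_REGRESSION = "performance_regression"
    COST_OVERRUN = "cost_overrun"
    PROVENANCE_FAILURE = "provenance_failure"
    VERSION_SKEW = "version_skew"


def diagnose(packet: dict) -> list[RootCause]:
    """Classify a diagnostic packet using simple, hypothetical rules."""
    causes = []
    if not packet.get("sdk_contract_ok", True):
        causes.append(RootCause.DEPENDENCY_BREAK)
    if packet.get("version_drift", False):
        causes.append(RootCause.VERSION_SKEW)
    if not packet.get("receipts_valid", True):
        causes.append(RootCause.PROVENANCE_FAILURE)
    if packet.get("cost", 0) > packet.get("budget", float("inf")):
        causes.append(RootCause.COST_OVERRUN)
    if packet.get("p95_ms", 0) > packet.get("p95_baseline_ms", float("inf")):
        causes.append(RootCause.PERFORMANCE_REGRESSION)
    # Assumption: failures with no other explanation are treated as logic faults.
    if not causes and packet.get("failure_streak", 0) > 0:
        causes.append(RootCause.LOGIC_FAULT)
    return causes
```

Returning a list rather than a single value allows for compound findings such as the "p95 regression & SDK drift" reason that appears in the proposal example.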

4. Generate a Proposed Fix

Two safe pathways:

A) Config Update (auto-applicable)

Budget, timeout, retry, provider selection, mode tuning.

B) Code Patch PR (requires human approval)

Includes:

SDK regeneration
Fixing type errors
Adding guards
Prompt tweaks
Logic corrections
Updating test suites
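
The split between the two pathways can be expressed as a small routing function. In this sketch, `"config_update"` matches the `fix_type` value used in the proposal example, while `"code_patch"`, the key names, and the returned path labels are hypothetical. Rejecting unrecognised config keys is a conservative assumption, not documented behaviour.

```python
# Config areas the document lists as auto-applicable; the key spellings are assumed.
AUTO_APPLICABLE_KEYS = {"budget", "timeout", "retry", "provider", "mode"}


def approval_path(fix_type, changed_keys=()):
    """Decide how a proposed fix should be approved."""
    if fix_type == "code_patch":
        # Code changes always go through a pull request with human review.
        return "release_pr"
    if fix_type == "config_update":
        unknown = set(changed_keys) - AUTO_APPLICABLE_KEYS
        if unknown:
            raise ValueError(f"config keys not auto-applicable: {sorted(unknown)}")
        return "auto_apply"
    raise ValueError(f"unknown fix_type: {fix_type!r}")
```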

5. Validate

The Factory runs:

Unit tests
Replay of receipts
Synthetic journey checks
Drift consistency checks
C3 envelope verification
ATAM checks (retraction, injection anomalies)

If validation passes → proposal is generated.
If it fails → WatchCrux escalates.
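
The pass/fail gate above amounts to running every check and only proceeding when none fail. A minimal sketch follows, assuming each check is a zero-argument callable returning a boolean; the check names and the choice to treat a crashing check as a failure are assumptions.

```python
from __future__ import annotations

from typing import Callable


def run_validation(checks: dict[str, Callable[[], bool]]) -> tuple[bool, list[str]]:
    """Run every check and report whether all passed, plus the names that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # Assumption: a check that raises counts as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    return (not failed, failed)
```

Running all checks instead of stopping at the first failure gives the escalation path the complete list of what went wrong.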

6. Proposal Output

A structured, auditable artifact:

{
  "proposal_id": "prop_01JX8S3",
  "agent": "researcher",
  "fix_type": "config_update",
  "reason": "p95 regression & SDK drift",
  "evidence": ["receipt_51...", "wc_drift_4...", "sj_fail_9..."],
  "expected_improvement_ms": -42,
  "rollback_plan": "restore-config-v17"
}

OpsCrux shows this in the “Proposed Fixes” section of the agent card.
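
A consumer of this artifact might want to check its shape before displaying or applying it. The sketch below derives its required fields and types purely from the example above; whether every field is mandatory, and the rule that evidence must be non-empty, are assumptions.

```python
import json

# Field names and types inferred from the example proposal; not an official schema.
REQUIRED_FIELDS = {
    "proposal_id": str,
    "agent": str,
    "fix_type": str,
    "reason": str,
    "evidence": list,
    "expected_improvement_ms": (int, float),
    "rollback_plan": str,
}


def validate_proposal(raw: str) -> dict:
    """Parse a proposal and verify that the expected fields are present and typed."""
    proposal = json.loads(raw)
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in proposal:
            raise ValueError(f"missing field: {field}")
        if not isinstance(proposal[field], expected_type):
            raise TypeError(f"wrong type for field: {field}")
    if not proposal["evidence"]:
        # Assumption: a proposal without evidence is not auditable.
        raise ValueError("a proposal must cite at least one piece of evidence")
    return proposal
```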

7. Approval → Apply

Config fixes apply instantly (logged in OpsCrux).
Code patches become Release Agent PRs for manual review.
Successful merges trigger an audit re-run.
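
Because config fixes apply without review, the rollback plan carries much of the safety burden. One possible shape for a reversible config store is an append-only version history, sketched below. The `ConfigHistory` class and its methods are hypothetical and only illustrate how a plan such as `restore-config-v17` could be honoured while keeping the change log intact.

```python
class ConfigHistory:
    """Append-only history of config versions, numbered from 1."""

    def __init__(self, initial: dict):
        self._versions = [dict(initial)]

    @property
    def version(self) -> int:
        return len(self._versions)

    @property
    def current(self) -> dict:
        return dict(self._versions[-1])

    def apply(self, changes: dict) -> int:
        """Record a new version with the changes merged in; return its number."""
        self._versions.append({**self._versions[-1], **changes})
        return self.version

    def restore(self, version: int) -> int:
        """Roll back by re-recording an old version as the newest one."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"no such config version: {version}")
        self._versions.append(dict(self._versions[version - 1]))
        return self.version
```

Restoring appends rather than truncates, so the rollback itself shows up in the history alongside the change it reverted.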

Boundaries & Guarantees

The Agent Factory never:

Hot-patches production code
Exceeds OpsCrux budgets
Ignores provenance rules
Applies unsafe or unvalidated fixes
Mutates Engine/WebCrux directly
Bypasses human approval for code changes

Every action is recorded through:

CROWN receipts
Intent Ledger entries
WatchCrux audit logs
OpsCrux change history
GitHub PR metadata

Why It Matters

The Agent Factory gives CueCrux:

Self-healing stability
Predictable agent evolution
Low operational overhead
Receipts-backed audit trails
Fail-safe dependency upgrades
Light-touch automation with human guardrails

It lets CueCrux scale agents without losing trust, safety, or observability.
