Infra Agent Factory - Architecture & Behaviour Overview

Overview

The Agent Factory is CueCrux’s evidence-driven upgrade system for autonomous agents.
It evaluates agents, diagnoses regressions, proposes improvements, and prepares safe updates, all within strict limits enforced by CROWN receipts, WatchCrux audits, OpsCrux budgets, and SDKCrux contracts.

It enables agents to self-heal and self-improve while remaining auditable, reversible, and safe.


High-Level System Diagram

flowchart TD

    subgraph WatchCrux["WatchCrux (Audits & Drift)"]
        WC1["Health & Ready Checks"]
        WC2["Version Drift Rules"]
        WC3["Audit Outcomes"]
    end

    subgraph OpsCrux["OpsCrux (Control Tower)"]
        OP1["Budgets (C3)"]
        OP2["Runtime Config"]
        OP3["Synthetic Journeys"]
    end

    subgraph AgentRuntime["AgentCrux Runtime"]
        AR1["Agent Runs"]
        AR2["Failure Streaks"]
        AR3["Cost & Time Metrics"]
        AR4["Receipts"]
    end

    subgraph Engine["CueCrux Engine"]
        EN1["CROWN Receipts"]
        EN2["Retrieval Metrics"]
        EN3["Contradiction Rate"]
        EN4["Snapshot Freshness"]
    end

    subgraph AgentFactory["Agent Factory (This Component)"]
        AF1["Diagnostic Workflow"]
        AF2["Dependency Analysis"]
        AF3["Patch / Config Generation"]
        AF4["Validation Suite"]
        AF5["Proposal Builder"]
    end

    subgraph Release["Release Agent (Auto PR)"]
        RL1["Create PR"]
        RL2["Test Matrix"]
        RL3["Merge & Version Bump"]
    end

    %% Edges
    WatchCrux --> AF1
    OpsCrux --> AF1
    AgentRuntime --> AF1
    Engine --> AF1

    AF1 --> AF2
    AF2 --> AF3
    AF3 --> AF4
    AF4 --> AF5

    AF5 --> OpsCrux
    AF5 --> Release

    Release --> OpsCrux

What the Agent Factory Does

The Agent Factory provides a structured, audit-ready loop for improving agents:

| Function | Meaning |
| --- | --- |
| Evaluation | Inspect agent failures, p95 regressions, drift, contradictions. |
| Diagnosis | Classify root causes: cost, logic, drift, evidence failure, SDK mismatch. |
| Patch Generation | Propose config updates or generate code-level PRs. |
| Validation | Run synthetic journeys, receipt replay, dependency checks. |
| Proposal | Produce auditable upgrade suggestions with evidence. |
| Approval Path | Safe configs auto-apply; code patches go through Release agent PR workflow. |

Upgrade Flow (Narrative)

1. Trigger

An agent enters remediation due to:

  • WatchCrux detecting drift or SLO breach
  • Agent runtime hitting failure streak thresholds
  • OpsCrux or developer requesting optimisation
  • New dependency or Engine version drop
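
The trigger conditions above can be sketched as a simple predicate. This is a minimal illustration, not the Factory's actual logic: the `AgentSignals` fields, the `FAILURE_STREAK_THRESHOLD` value, and the reason strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold; a real value would presumably live in OpsCrux runtime config.
FAILURE_STREAK_THRESHOLD = 3


@dataclass
class AgentSignals:
    """Assumed shape of the signals that can push an agent into remediation."""
    drift_or_slo_breach: bool = False           # reported by WatchCrux
    failure_streak: int = 0                     # consecutive failed runs in the agent runtime
    optimisation_requested: bool = False        # from OpsCrux or a developer
    dependency_or_engine_changed: bool = False  # new dependency or Engine version


def remediation_triggers(signals: AgentSignals) -> List[str]:
    """Return every trigger that fired; an empty list means no remediation."""
    reasons: List[str] = []
    if signals.drift_or_slo_breach:
        reasons.append("watchcrux_drift_or_slo_breach")
    if signals.failure_streak >= FAILURE_STREAK_THRESHOLD:
        reasons.append("failure_streak")
    if signals.optimisation_requested:
        reasons.append("optimisation_request")
    if signals.dependency_or_engine_changed:
        reasons.append("dependency_or_engine_change")
    return reasons
```

Returning all reasons that fired, rather than only the first, keeps the trigger record complete for the evaluation step that follows.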

2. Evaluate

The Factory reads:

  • CROWN receipts
  • Retrieval p95
  • Contradiction rate
  • Mode freshness
  • Cost envelopes
  • Dependency requirements via SDKCrux
  • Synthetic journey results from OpsCrux
  • WatchCrux PASS/WARN/FAIL

This produces a reproducible diagnostic packet.
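
One way to make a diagnostic packet reproducible is to give it a content digest, so that two evaluations over the same inputs can be compared byte for byte. The sketch below assumes a small, hypothetical packet schema (`DiagnosticPacket` and its fields are not defined anywhere in this document) and hashes a canonical JSON encoding of it.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class DiagnosticPacket:
    """Hypothetical packet; the real schema is not specified in this document."""
    agent: str
    receipt_ids: Tuple[str, ...]
    retrieval_p95_ms: float
    contradiction_rate: float
    watchcrux_status: str  # "PASS", "WARN", or "FAIL"


def packet_digest(packet: DiagnosticPacket) -> str:
    """Hash a canonical JSON encoding so equal inputs give equal digests."""
    # sort_keys and fixed separators remove formatting differences between runs.
    canonical = json.dumps(asdict(packet), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two packets built from the same receipts and metrics produce the same digest, so a later audit could confirm that a proposal was derived from the packet it cites.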


3. Diagnose

The Factory determines whether the problem is:

  • A dependency break
  • A logic fault
  • A performance regression
  • A cost overrun
  • A receipt/provenance failure
  • A version skew
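
The six categories above map naturally onto an enumeration with an ordered set of rules. The following is only a sketch: the packet keys (`receipts_valid`, `sdk_contract_ok`, `p95_baseline_ms`, and so on) and the order in which rules are tried are assumptions made for illustration, not the Factory's real classifier.

```python
from enum import Enum
from typing import Any, Dict, Optional


class RootCause(Enum):
    DEPENDENCY_BREAK = "dependency_break"
    LOGIC_FAULT = "logic_fault"
    PERFORMANCE_REGRESSION = "performance_regression"
    COST_OVERRUN = "cost_overrun"
    RECEIPT_FAILURE = "receipt_failure"
    VERSION_SKEW = "version_skew"


def diagnose(packet: Dict[str, Any]) -> Optional[RootCause]:
    """Return the first matching cause, or None if nothing looks wrong.

    All keys and the rule order are illustrative assumptions.
    """
    if not packet.get("receipts_valid", True):
        return RootCause.RECEIPT_FAILURE
    if not packet.get("sdk_contract_ok", True):
        return RootCause.DEPENDENCY_BREAK
    if packet.get("agent_engine_version") != packet.get("deployed_engine_version"):
        return RootCause.VERSION_SKEW
    if packet.get("cost", 0.0) > packet.get("budget", float("inf")):
        return RootCause.COST_OVERRUN
    if packet.get("p95_ms", 0.0) > packet.get("p95_baseline_ms", float("inf")):
        return RootCause.PERFORMANCE_REGRESSION
    if packet.get("failed_tests", 0) > 0:
        return RootCause.LOGIC_FAULT
    return None
```

Provenance and dependency rules are checked first in this sketch because a broken receipt or contract would make the later metrics untrustworthy; a real classifier might weigh the signals differently or report several causes at once.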

4. Generate a Proposed Fix

Two safe pathways:

A) Config Update (auto-applicable)

Budget, timeout, retry, provider selection, mode tuning.

B) Code Patch PR (requires human approval)

Includes:

  • SDK regeneration
  • Fixing type errors
  • Adding guards
  • Prompt tweaks
  • Logic corrections
  • Updating test suites
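
The split between the two pathways can be sketched as a routing function over an allowlist of config keys. The key names in `AUTO_APPLICABLE_KEYS` are assumptions derived from the list under pathway A, and the decision to send unrecognised config keys to review is a design choice of this example, not a documented rule.

```python
from typing import Any, Dict

# Config keys treated as auto-applicable, loosely following the list under pathway A.
# The exact key names are assumptions.
AUTO_APPLICABLE_KEYS = frozenset({"budget", "timeout", "retry", "provider", "mode"})


def choose_pathway(config_changes: Dict[str, Any], has_code_changes: bool) -> str:
    """Return "config_update" only when the fix is purely allowlisted config."""
    if has_code_changes:
        return "code_patch_pr"  # code changes always require human approval
    if not config_changes:
        raise ValueError("a fix must change something")
    if set(config_changes) - AUTO_APPLICABLE_KEYS:
        # Unknown config keys are not provably safe, so fall back to human review.
        return "code_patch_pr"
    return "config_update"
```

For example, `choose_pathway({"timeout": 30}, has_code_changes=False)` returns `"config_update"`, while any fix that touches code returns `"code_patch_pr"` regardless of its config changes.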

5. Validate

The Factory runs:

  • Unit tests
  • Replay of receipts
  • Synthetic journey checks
  • Drift consistency checks
  • C³ envelope verification
  • ATAM checks (retraction, injection anomalies)

If validation passes → proposal is generated.
If it fails → WatchCrux escalates.
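
The pass/fail branching above can be sketched as a small runner that executes named checks and reports which ones failed. The check names, the `run_validation` function, and the outcome strings are hypothetical; the real suite would call the actual test, replay, and journey tooling.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[], bool]


def run_validation(checks: Dict[str, Check]) -> Tuple[str, List[str]]:
    """Run every check and return (outcome, names of failed checks)."""
    failed: List[str] = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that crashes is treated as a failure, never as a pass.
            ok = False
        if not ok:
            failed.append(name)
    outcome = "generate_proposal" if not failed else "escalate_to_watchcrux"
    return outcome, failed
```

Running all checks instead of stopping at the first failure gives the escalation a complete list of what went wrong.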


6. Proposal Output

A structured, auditable artifact:

{
  "proposal_id": "prop_01JX8S3",
  "agent": "researcher",
  "fix_type": "config_update",
  "reason": "p95 regression & SDK drift",
  "evidence": ["receipt_51...", "wc_drift_4...", "sj_fail_9..."],
  "expected_improvement_ms": -42,
  "rollback_plan": "restore-config-v17"
}

OpsCrux shows this in the “Proposed Fixes” section of the agent card.
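
A consumer of this artifact might check that a proposal is well-formed before displaying or applying it. The sketch below is an assumption about what such a check could look like: the required-field list is read off the example above, `"code_patch"` is a guessed second `fix_type` value, and `proposal_problems` is a hypothetical helper.

```python
import json
from typing import List

REQUIRED_FIELDS = ("proposal_id", "agent", "fix_type", "reason", "evidence", "rollback_plan")
# "config_update" appears in the example above; "code_patch" is an assumed second value.
KNOWN_FIX_TYPES = {"config_update", "code_patch"}


def proposal_problems(raw: str) -> List[str]:
    """Return the problems found in a proposal; an empty list means it looks well-formed."""
    try:
        proposal = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(proposal, dict):
        return ["proposal must be a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in proposal]
    if "fix_type" in proposal and proposal["fix_type"] not in KNOWN_FIX_TYPES:
        problems.append(f"unknown fix_type: {proposal['fix_type']}")
    if "evidence" in proposal and not proposal["evidence"]:
        # A proposal without evidence cannot be audited.
        problems.append("evidence must not be empty")
    return problems
```

Requiring both `evidence` and `rollback_plan` reflects the document's emphasis on fixes being auditable and reversible.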


7. Approval → Apply

  • Config fixes apply instantly (logged in OpsCrux).
  • Code patches become Release agent PRs for manual review.
  • Successful merges trigger an audit re-run.

Boundaries & Guarantees

The Agent Factory never:

  • Hot-patches production code
  • Exceeds OpsCrux budgets
  • Ignores provenance rules
  • Applies unsafe or unvalidated fixes
  • Mutates Engine/WebCrux directly
  • Bypasses human approval for code changes

Every action is recorded through:

  • CROWN receipts
  • Intent Ledger entries
  • WatchCrux audit logs
  • OpsCrux change history
  • GitHub PR metadata

Why It Matters

The Agent Factory gives CueCrux:

  • Self-healing stability
  • Predictable agent evolution
  • Low operational overhead
  • Receipts-backed audit trails
  • Fail-safe dependency upgrades
  • Light-touch automation with human guardrails

It lets CueCrux scale agents without losing trust, safety, or observability.