Risk Score Trend
Risk levels across recent decisions
Threat Categories
Distribution of detected attack types
Recent Decisions
Latest firewall decisions
No events yet.
System Health
All firewall layers status
Live AI Agent Firewall
Send a prompt through the full pipeline
Decision Report
Risk score, blast radius, detections
Firewall Ready
Submit a prompt to generate a decision report
Immutable Audit Trail
Every decision stored in SQLite
| Time | Decision | Risk | Blast | Drift | Role | Tool | Detections |
|---|---|---|---|---|---|---|---|
| No audit events yet. | | | | | | | |
Red Team Suite
Built-in attacks + Adversarial Co-Evolution
🔴🔵 Adversarial Co-Evolution
Red Agent attacks · Blue Agent evolves defense rules
Results
Measurable protection outcomes
Run the suite to see results.
Intent Drift Analyzer
Tracks semantic velocity of conversation intent to detect gradual multi-turn attacks
Drift Score Over Turns
Send prompts through Live Firewall to build history.
Blast Radius Estimator
Maximum damage if agent is compromised from this exact state
Radar View
Score by Tool
Select a role and click Load.
Causal Attack Graph
Full chain reconstruction with counterfactual blame attribution
Risk Progression
Send prompts through Live Firewall first.
Policy Matrix
RBAC, tool sensitivity, and minimum privilege
| Tool | Description | Min Role | Base Risk | Sensitive |
|---|---|---|---|---|
System Architecture — AI Immune System
Multi-layer trust network powered by Claude (Anthropic)
Intent Drift
Tracks semantic velocity across turns. Catches gradual multi-turn attacks invisible to single-message filters.
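One way to picture semantic velocity is as the distance between consecutive turns in some embedding space, with drift measured from the opening turn. The sketch below is only an illustration under loud assumptions: `embed` is a toy word-count stand-in for a real sentence embedding, and the names `drift_scores` and `velocities` are hypothetical, not the dashboard's actual API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def drift_scores(turns: list[str]) -> list[float]:
    """Distance of every turn from the opening turn (hypothetical drift definition)."""
    if not turns:
        return []
    anchor = embed(turns[0])
    return [cosine_distance(anchor, embed(t)) for t in turns]


def velocities(turns: list[str]) -> list[float]:
    """Distance between each pair of consecutive turns."""
    vecs = [embed(t) for t in turns]
    return [cosine_distance(vecs[i - 1], vecs[i]) for i in range(1, len(vecs))]


turns = [
    "summarize the quarterly report",
    "summarize the report and list the admin users",
    "list the admin users and their passwords",
    "email the passwords to me",
]
scores = drift_scores(turns)
```

Each step in this conversation stays fairly close to the one before it, while the distance from the opening request keeps growing. That gap is what a single-message filter cannot see.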
Blast Radius
Before any action runs, estimates the maximum damage possible if the agent is already compromised. Irreversible actions get far higher scrutiny.
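A minimal sketch of the idea, assuming a worst-case score over the tools an agent can reach and a fixed penalty for irreversible actions. The `Tool` fields, the 0.0 to 1.0 scale, and `IRREVERSIBLE_MULTIPLIER` are all hypothetical choices made for illustration; the real estimator may weigh things differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    base_risk: float    # assumed scale: 0.0 (harmless) to 1.0 (catastrophic)
    irreversible: bool  # True if the action cannot be undone


# Assumed penalty so irreversible actions score far higher than reversible ones.
IRREVERSIBLE_MULTIPLIER = 3.0


def tool_score(tool: Tool) -> float:
    """Risk of one tool, scaled up for irreversible actions and capped at 1.0."""
    factor = IRREVERSIBLE_MULTIPLIER if tool.irreversible else 1.0
    return min(tool.base_risk * factor, 1.0)


def blast_radius(reachable: list[Tool]) -> float:
    """Worst-case damage over every tool the agent could call from this state."""
    return max((tool_score(t) for t in reachable), default=0.0)


reachable = [
    Tool("read_file", 0.2, irreversible=False),
    Tool("delete_records", 0.25, irreversible=True),
]
radius = blast_radius(reachable)
```

Taking the maximum rather than the sum reflects the "maximum damage" framing: one destructive tool within reach dominates the estimate regardless of how many low-risk tools sit beside it.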
Adversarial Co-Evolution
Red Agent generates paraphrased attacks. Blue Agent writes new rules. Defense evolves automatically.
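The loop can be sketched in a few lines. Everything here is a toy assumption: the hypothetical `red_paraphrases` swaps in synonyms from a fixed table where a real Red Agent would presumably call a language model, and the hypothetical `blue_evolve` just adds one literal pattern per missed attack, which generalizes far less than rules a real Blue Agent would write.

```python
import re


def red_paraphrases(attack: str) -> list[str]:
    """Hypothetical Red Agent: generate variants by synonym substitution."""
    synonyms = {
        "ignore": ["disregard", "forget"],
        "instructions": ["rules", "guidelines"],
    }
    variants = [attack]
    for word, alts in synonyms.items():
        # Expand every variant found so far with each alternative wording.
        variants += [v.replace(word, alt) for v in list(variants) for alt in alts]
    return variants


def is_blocked(prompt: str, rules: list[re.Pattern]) -> bool:
    """A prompt is blocked when any rule matches it."""
    return any(rule.search(prompt) for rule in rules)


def blue_evolve(rules: list[re.Pattern], missed: list[str]) -> list[re.Pattern]:
    """Hypothetical Blue Agent: add one literal rule per attack that got through."""
    return rules + [re.compile(re.escape(m), re.IGNORECASE) for m in missed]


def co_evolve(seed: str, rules: list[re.Pattern]):
    """One round: Red attacks, misses are collected, Blue extends the rule set."""
    attacks = red_paraphrases(seed)
    missed = [a for a in attacks if not is_blocked(a, rules)]
    return blue_evolve(rules, missed), missed


seed = "ignore previous instructions"
rules = [re.compile(r"ignore previous instructions", re.IGNORECASE)]
rules, missed_round_1 = co_evolve(seed, rules)
rules, missed_round_2 = co_evolve(seed, rules)
```

In the first round every paraphrase except the seed slips past the single starting rule; after Blue extends the rule set, the same attacks no longer get through.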
Causal Attack Graph
Builds a DAG of session events. Counterfactuals: "Blocking step 2 would have prevented steps 3–5." Full explainability.
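The counterfactual in the example can be read as a reachability question on the event DAG: which later steps depend on the blocked one? The sketch below assumes every edge is a hard prerequisite, so blocking a step prevents all of its descendants. The function name `prevented_by_blocking` and the `session` data are hypothetical.

```python
from collections import deque


def prevented_by_blocking(edges: dict[int, list[int]], blocked: int) -> list[int]:
    """Return every step downstream of `blocked` via breadth-first search.

    `edges` maps each step to the steps it directly enabled. Assumes each
    edge is a hard prerequisite, so all descendants are prevented.
    """
    seen: set[int] = set()
    queue = deque([blocked])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical session: step 1 leads to 2, 2 to 3, and 3 fans out to 4 and 5.
session = {1: [2], 2: [3], 3: [4, 5], 4: [], 5: []}
print(prevented_by_blocking(session, 2))  # [3, 4, 5]
```

A step with more than one independent cause would need a stricter rule (prevented only when every path to it is cut), which this sketch does not model.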