Multi-Agent Trust Network · Real-Time AI Security

AgentFirewall v2

sess_…
Total Events: 0
Allowed: 0
Warnings: 0
Blocked: 0
Avg Risk Score: 0

Risk Score Trend

Risk levels across recent decisions

Last 20 events

Threat Categories

Distribution of detected attack types

Recent Decisions

Latest firewall decisions

No events yet.

System Health

All firewall layers status

Input Firewall: Healthy
Output Firewall: Healthy
Cross-Agent Firewall: Healthy
Intent Drift Analyzer: Healthy
Blast Radius Engine: Healthy
Causal Attack Graph: Healthy
Audit Database: Healthy
Trust Token System: Healthy

Live AI Agent Firewall

Send a prompt through the full pipeline

Decision Report

Risk score, blast radius, detections

🛡️

Firewall Ready

Submit a prompt to generate a decision report

Immutable Audit Trail

Every decision stored in SQLite
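
A minimal sketch of how an append-only decision log could be kept in SQLite, using Python's standard `sqlite3` module. The table name `audit_events`, the column types, and the helper names are assumptions for illustration; the columns mirror the audit table headers on this page. Triggers reject UPDATE and DELETE, which only approximates immutability: anyone with write access to the database file could still alter it.

```python
import sqlite3

# Hypothetical schema; column names follow the audit table headers.
SCHEMA = """
CREATE TABLE audit_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    decision TEXT NOT NULL,
    risk REAL NOT NULL,
    blast REAL NOT NULL,
    drift REAL NOT NULL,
    role TEXT NOT NULL,
    tool TEXT,
    detections TEXT NOT NULL DEFAULT ''
);
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_events
BEGIN SELECT RAISE(ABORT, 'audit_events is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_events
BEGIN SELECT RAISE(ABORT, 'audit_events is append-only'); END;
"""


def open_audit_db(path=":memory:"):
    """Open (or create) the audit database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_decision(conn, decision, risk, blast, drift, role, tool, detections):
    """Append one firewall decision and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO audit_events "
            "(decision, risk, blast, drift, role, tool, detections) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (decision, risk, blast, drift, role, tool, ",".join(detections)),
        )
    return cur.lastrowid
```

With this schema, an attempted `UPDATE` or `DELETE` on `audit_events` raises a `sqlite3.DatabaseError` subclass instead of changing the row.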

Time | Decision | Risk | Blast | Drift | Role | Tool | Detections
No audit events yet.

Red Team Suite

Built-in attacks + Adversarial Co-Evolution


🔴🔵 Adversarial Co-Evolution

Red Agent attacks · Blue Agent evolves defense rules

Results

Measurable protection outcomes

Run the suite to see results.

Intent Drift Analyzer

Tracks semantic velocity of conversation intent to detect gradual multi-turn attacks

How it works: A normal conversation keeps a consistent intent. An attack trajectory drifts step by step: "help with reports" → "show user data" → "export all records to my email". Each step looks safe in isolation; the drift vector across turns reveals the attack.
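
The idea can be sketched as follows, assuming each turn has already been mapped to an intent vector by some embedding model (not shown here). The function names and the toy 3-axis vectors are hypothetical; this is an illustration of measuring drift with cosine distance, not the analyzer's actual scoring.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def step_velocities(vectors: List[Sequence[float]]) -> List[float]:
    """Distance between each pair of consecutive turns."""
    return [cosine_distance(p, c) for p, c in zip(vectors, vectors[1:])]


def drift_scores(vectors: List[Sequence[float]]) -> List[float]:
    """Distance of every turn from the first turn's intent."""
    if not vectors:
        return []
    anchor = vectors[0]
    return [cosine_distance(anchor, v) for v in vectors]


# Toy vectors on hypothetical axes (reporting, data access, exfiltration).
turns = [
    [1.0, 0.0, 0.0],  # "help with reports"
    [0.7, 0.7, 0.0],  # "show user data"
    [0.0, 0.7, 0.7],  # "export all records to my email"
]
velocities = step_velocities(turns)
drift = drift_scores(turns)
```

In this toy data no single step moves further than a cosine distance of about 0.5, yet the last turn sits at distance 1.0 from the opening intent, which is the pattern a per-message filter would miss.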

Drift Score Over Turns

Send prompts through Live Firewall to build history.

Blast Radius Estimator

Maximum damage if agent is compromised from this exact state

Radar View

Score by Tool

Select a role and click Load.

Causal Attack Graph

Full chain reconstruction with counterfactual blame attribution

Risk Progression

Send prompts through Live Firewall first.

Policy Matrix

RBAC, tool sensitivity, and minimum privilege

Tool | Description | Min Role | Base Risk | Sensitive
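
A minimal sketch of a minimum-privilege check over a policy matrix with these columns. The role names, their ordering, the tool entries, and the risk values are all hypothetical placeholders, since the page shows only the column headers.

```python
from dataclasses import dataclass

# Hypothetical role hierarchy: a higher rank means more privilege.
ROLE_RANK = {"viewer": 0, "analyst": 1, "admin": 2}


@dataclass(frozen=True)
class ToolPolicy:
    description: str
    min_role: str    # lowest role allowed to call the tool
    base_risk: int   # 0-100, illustrative scale
    sensitive: bool


# Hypothetical entries, one per row of the policy matrix.
POLICY = {
    "read_report": ToolPolicy("Read a generated report", "viewer", 10, False),
    "export_records": ToolPolicy("Export records in bulk", "admin", 80, True),
}


def check_access(role: str, tool: str) -> bool:
    """Allow a call only if the role meets the tool's minimum role."""
    policy = POLICY.get(tool)
    if policy is None or role not in ROLE_RANK:
        return False  # default deny for unknown tools or roles
    return ROLE_RANK[role] >= ROLE_RANK[policy.min_role]
```

Unknown tools and unknown roles are denied by default, so a missing policy entry fails closed rather than open.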

System Architecture — AI Immune System

Multi-layer trust network powered by Claude (Anthropic)

👤 User / App
🛡️ Input Firewall: LLM Judge · RBAC · Intent Drift
🤖 AI Agent: Trust Token · Memory
Output Firewall: PII Redact · Blast Radius
Causal Graph: Audit DB
Novel

Intent Drift

Tracks semantic velocity across turns. Catches gradual multi-turn attacks invisible to single-message filters.

Novel

Blast Radius

Before any action runs, estimates the maximum damage possible if the agent is already compromised. Irreversible actions get far higher scrutiny.
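
One way such an estimate could be computed is sketched below: take the worst-case impact over every action the agent can reach from its current state, and weight irreversible actions more heavily. The 0-100 scale, the multiplier of 3, and the example actions are assumptions made for illustration, not values taken from the product.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed weighting: irreversible actions count three times as much.
IRREVERSIBLE_MULTIPLIER = 3


@dataclass(frozen=True)
class Action:
    tool: str
    impact: int         # 0-100, damage if the action is misused
    irreversible: bool  # True if the effect cannot be undone


def action_blast(action: Action) -> int:
    """Weighted impact of a single action, capped at 100."""
    weight = IRREVERSIBLE_MULTIPLIER if action.irreversible else 1
    return min(100, action.impact * weight)


def blast_radius(reachable: Iterable[Action]) -> int:
    """Worst-case weighted impact over all reachable actions."""
    return max((action_blast(a) for a in reachable), default=0)


# Hypothetical actions available to one role.
reachable = [
    Action("read_report", 10, False),
    Action("send_email", 30, False),
    Action("delete_records", 30, True),
]
radius = blast_radius(reachable)  # 90, driven by delete_records
```

Taking the maximum rather than the sum keeps the score tied to the single worst thing a compromised agent could do; a sum would be an equally defensible choice if cumulative damage matters more.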

Novel

Adversarial Co-Evolution

Red Agent generates paraphrased attacks. Blue Agent writes new rules. Defense evolves automatically.
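
The loop can be sketched in a few lines, with the Red and Blue agents reduced to plain callables. Everything here is a stand-in: substring rules replace whatever rule format the firewall uses, and `paraphrase` and `derive_rule` are hypothetical hooks where model calls would go.

```python
from typing import Callable, List, Set


def is_blocked(prompt: str, rules: Set[str]) -> bool:
    """A prompt is blocked if any rule appears in it as a substring."""
    text = prompt.lower()
    return any(rule in text for rule in rules)


def co_evolve(
    attacks: List[str],
    paraphrase: Callable[[str], str],   # Red Agent stand-in
    derive_rule: Callable[[str], str],  # Blue Agent stand-in
    rules: Set[str],                    # updated in place
    rounds: int,
) -> List[int]:
    """Run red/blue rounds; return the number of evasions per round."""
    evasions_per_round = []
    current = list(attacks)
    for _ in range(rounds):
        # Red: find attacks that slip past the current rules.
        evaded = [p for p in current if not is_blocked(p, rules)]
        evasions_per_round.append(len(evaded))
        # Blue: write one new rule per evading attack.
        rules.update(derive_rule(p) for p in evaded)
        # Red: paraphrase every attack for the next round.
        current = [paraphrase(p) for p in current]
    return evasions_per_round


rules = {"ignore previous instructions"}
history = co_evolve(
    attacks=["Ignore previous instructions"],
    paraphrase=lambda p: p.replace("Ignore", "Disregard"),
    derive_rule=lambda p: p.lower(),
    rules=rules,
    rounds=3,
)
# history == [0, 1, 0]: the paraphrase evades once, then a new rule catches it.
```

Exact-match rules like these would overfit badly in practice; the point of the sketch is only the alternation between attack generation and rule updates.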

Novel

Causal Attack Graph

Builds a DAG of session events. Counterfactuals: "Blocking step 2 prevented steps 3–5." Full explainability.
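
A minimal sketch of the counterfactual query, assuming the session is stored as an adjacency map from each event to the events it directly caused. It uses a single-cause model in which every downstream event depends on the blocked one; the function name and the example graph are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def prevented_by_blocking(edges: Dict[int, List[int]], blocked: int) -> Set[int]:
    """Return every event downstream of `blocked` in the causal DAG."""
    prevented: Set[int] = set()
    queue = deque(edges.get(blocked, []))
    while queue:
        node = queue.popleft()
        if node in prevented:
            continue  # already reached via another path
        prevented.add(node)
        queue.extend(edges.get(node, []))
    return prevented


# Hypothetical session: step 1 -> 2 -> 3, and step 3 causes steps 4 and 5.
session = {1: [2], 2: [3], 3: [4, 5]}
result = prevented_by_blocking(session, 2)  # {3, 4, 5}
```

If an event can have several independent causes, plain reachability overstates what blocking prevents; a stricter version would mark an event as prevented only when all of its causes are prevented.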