KRAKEN v1.2.0 agentic pentester
model online region: multi queue: idle uptime: 99.97%

An agentic pentester
that thinks like
the adversary_

Kraken is an agentic pentester built for security consultants, MSSPs and MSPs. Paste a target, ship the engagement, hand the chained dossier to your client. No fixed checklist. No template report. Real attacker-grade reasoning, on tap.

Target
kraken@ops ~ $
demo targets:

By authorizing the scan I agree to ThreatMate's Privacy Policy, Terms of Use, and Acceptable Use Policy.

All Kraken Pentests
--
scans run
--
avg iterations
--
compromises
--
domains tested
My Kraken Pentests
Sign in to view your pentest stats
L0
No Access
Reconnaissance only. Target surface mapped but no exploitable vulnerabilities confirmed.
scans
L1
Information Leak
Sensitive data exposed — credentials, API keys, internal paths, or configuration files retrieved.
scans
L2
Authenticated Access
Gained authenticated access via credential reuse, default creds, session hijacking, or auth bypass.
scans
L3
Significant Access
Deep access achieved — database dumps, admin panels, cloud resource enumeration, or lateral movement.
scans
L4
Full Compromise
Complete system takeover — remote code execution, IAM privilege escalation, or full cloud account control.
scans
Pioneer Model
Leads complex multi-step cloud attack chains. Handles lateral movement, privilege escalation, and chained exploits.
Executor Model
Handles generic web application testing. Fast reconnaissance, endpoint probing, and vulnerability validation.
Iteration Max
Each scan runs up to the selected number of attack iterations. The agent adapts its strategy based on findings from prior steps.
Supported Targets
Web Applications AWS Azure
engagement::active COMPROMISE ACHIEVEDLevel 0
graph:// attack-tree
nodes: 0
kraken:// stdout
[SYS] Kraken ready. Waiting for target...
Key Findings
🔒
Full pentest report ready
Enter your details to unlock the complete attack chain,
evidence, and remediation recommendations.

No spam. ThreatMate privacy policy applies.


    
OWASP A01
Broken Access Control
IDOR, privilege escalation, forced browsing, CORS misconfig. Tests every role boundary and object reference for unauthorized access paths.
OWASP A03
Injection & SSRF
SQL injection, command injection, SSRF to internal metadata services. Crafts context-aware payloads, chains SSRF to cloud IMDS for credential extraction.
OWASP A07
Authentication Flaws
Weak credentials, JWT forgery, session fixation, OAuth misconfigurations. Tests default creds, forges tokens, and exploits auth bypass chains.
CLOUD
Cloud Privilege Escalation
IAM policy abuse, role chaining, Lambda code extraction, storage key leaks. Enumerates AWS & Azure attack paths from initial foothold to full compromise.
OWASP A05
Security Misconfiguration
Exposed admin panels, verbose errors, directory listings, missing security headers. Probes every endpoint for configuration weaknesses.
OWASP A08
Data Exposure & Secrets
Leaked API keys, hardcoded credentials in source, exposed .env files, certificate transparency recon. Chains leaked secrets to deeper access.
CHAIN
Multi-Step Exploit Chains
Combines low-severity findings into critical attack paths. Staging subdomain CORS + token replay, SSRF + IMDS + role assumption. Paths no scanner finds.
OWASP A09
Logging & Monitoring Gaps
Identifies where your detection fails. Tests whether exploit activity triggers alerts, verifies audit trails, and maps blind spots in your SIEM coverage.
OWASP A03
XSS & Template Injection
Reflected, stored, and DOM-based XSS. Server-side template injection via Jinja2, Twig, and ERB. Escalates SSTI to remote code execution on the host.
OWASP A08
Insecure Deserialization
Python pickle, PHP object injection, Java deserialization. Crafts serialized payloads that achieve RCE through untrusted data unmarshalling.
OWASP A02
Cryptographic Failures
Padding oracle attacks, AES-CBC without MAC, weak JWT signing. Identifies exploitable crypto weaknesses and recovers plaintext or forges tokens.
UPLOAD
File Upload & LFI-to-RCE
Bypasses extension filters, MIME checks, and magic-byte validation. Chains local file inclusion with log poisoning to achieve remote code execution.
GRAPHQL
GraphQL & API Abuse
Introspection leaks, query depth attacks, field-level authorization bypass. Enumerates hidden schemas and extracts data through nested query manipulation.
CVE
Known CVE Exploitation
Apache path traversal (CVE-2021-41773), CGI-bin RCE, and other known vulns. Fingerprints server versions and deploys targeted, version-specific exploits.
RACE
Race Conditions & Logic Flaws
TOCTOU exploits, concurrent request abuse, business logic bypass. Fires parallel requests to exploit timing windows in state-changing operations.
RECON
Information Disclosure
Exposed .env and .git directories, directory listings, verbose stack traces, certificate transparency recon. Discovers secrets that unlock deeper attack paths.
TRIAL
free
1 scan · limited tokens
See Kraken on a real target. One engagement, capped token budget, full dossier.
--single-scan --token-capped --sandbox-ok --no-cc
▶ deploy
FIRM
$2,500
/ month · up to 30 scans
For consultancies delivering recurring engagements. ~$50/scan effective.
--30-scans-mo --whitelabel --slack-oncall --api-access
▶ deploy
MSSP/MSP
Scale
multi-tenancy · uncapped scans
For MSSPs and security teams that need multi-tenancy, uncapped scan volume, and dedicated support.
--multi-tenant --uncapped-scans --whitelabel --dedicated-support
▶ schedule

Pentest reports from your paid engagements.

Kraken: How We Built a Pentester That Thinks Like an Attacker

Traditional scanners run checklists. Kraken runs attack chains.

In 28 iterations and zero human input, Kraken found an SSRF vulnerability in an Azure Function App, extracted storage account credentials from a configuration file the server should never have served, enumerated every blob container in the account, discovered an SSH private key in a dev container nobody remembered existed, logged into a virtual machine, activated its managed identity, found an Automation Account with Owner-level permissions, and wrote a PowerShell runbook that promoted itself to subscription Owner.

Level 4. Full tenant compromise. No human touched the keyboard.

This is not a hypothetical. It is what Kraken did, autonomously, against AzureGoat — the industry-standard intentionally vulnerable Azure environment. And the mechanism behind it is fundamentally different from anything a traditional scanner does.


Scanners Find Vulnerabilities. Attackers Chain Them.

Every security team knows the workflow: run Nessus or OpenVAS, get a PDF with hundreds of findings sorted by CVSS score, hand it to engineering, argue about priorities. Run Burp Suite against the web layer, get a list of reflected XSS and missing headers. These tools are useful. They are also fundamentally limited.

Traditional scanners execute a predetermined set of tests. Each plugin or rule tests for one thing. The output is a list of individual findings. An SSRF is reported as a medium-severity issue. A publicly listable storage container is a separate finding. An exposed SSH key is another. The scanner has no concept that these three findings, chained in the right order, constitute a full infrastructure compromise.

Real attackers don’t work from checklists. They reason. An SSRF is not a finding to report — it is a tool to steal credentials. A storage key is not an endpoint — it is a stepping stone to SSH keys. An SSH key leads to a VM. A VM has a managed identity. This is the OODA loop: Observe, Orient, Decide, Act — repeated until the objective is reached.

No commercial scanner chains a web vulnerability into cloud privilege escalation. Web scanners and cloud security posture tools occupy different market categories entirely. But the attacker doesn’t care about your tool categories.

Traditional ScannersKraken
Decision makingFixed rules and pluginsAI reasoning per iteration
Attack chainingReports individual findingsChains web vulns into cloud escalation
LearningNone across scansCross-scan knowledge vault with AI synthesis
False positivesHigh (version fingerprinting)Low (heuristic validation + report grounding)
Cloud expertiseGeneric pluginsCloud-specific playbooks (Azure, AWS)
AdaptabilitySame tests regardless of resultsEvery result shapes the next action

Architecture: Three Phases of Autonomous Pentesting

Kraken runs a three-phase pipeline against every target.

Phase 1: Cloud Detection

Before any scanning begins, Kraken fingerprints the target’s cloud provider. It inspects the hostname (.azurewebsites.net, .amazonaws.com, .run.app), HTTP response headers (x-azure-ref, x-ms-request-id), and page source (storage SDK URLs, identity provider references). The result — Azure, AWS, GCP, or generic — determines which specialized attack playbook and system prompt Claude receives.

This matters because cloud infrastructure attacks follow fundamentally different paths. An Azure SSRF targets the Function App metadata and blob storage. An AWS SSRF targets the EC2 Instance Metadata Service. A generic web target gets tested for IDOR, SQLi, and application-layer vulns. The right playbook for the right infrastructure.

Phase 2: Reconnaissance

Four parallel probes run simultaneously: nmap port scanning on web ports, endpoint enumeration against 15+ standard paths (plus cloud-specific paths when a cloud provider is detected), page source scraping for JavaScript bundles and hardcoded secrets, and cloud-specific enumeration like SQL injection probing on Azure user-facing APIs.

The raw data from all probes is sent to Claude for summarization into a structured JSON: open ports, API endpoints, storage URLs, discovered users, and interesting files. This mirrors what a human pentester does in the first hour — compressed to seconds.

Phase 3: The AI Attack Loop

This is the core of Kraken, and it is where everything changes.


The ReAct Loop: An AI That Reasons About What It Finds

Kraken uses a ReAct (Reasoning + Acting) loop. Claude AI acts as the attacker’s brain. Python functions are the hands.

On each iteration, Claude receives the full conversation history of every action taken and every result received. It outputs two things: reasoning text explaining what it found, what it means, and what to do next, and a tool call — the specific action to take. Python executes the tool, captures the result, and sends it back to Claude. Claude reasons again.

Purpose-Built Offensive Tools

Claude has over 20 purpose-built tools at its disposal. These are not generic wrappers — each encodes offensive security knowledge:

  • Cloud infrastructure: Azure CLI execution, AzureHound tenant graph enumeration, Automation Account runbook escalation, AWS S3 enumeration, IMDS credential extraction, IAM enumeration, role assumption, Lambda code retrieval, permission simulation
  • Web exploitation: SSRF firing with blob URL following, HTTP probing, file download, SSH execution
  • Vulnerability-specific: JWT decoding and forging, command injection with six bypass techniques (;, &&, |, backticks, $(), time-based blind), XXE injection via DOCTYPE/ENTITY, boolean-based blind SQLi with filter bypass, LFI probing across path traversal depths, file upload with JPEG magic byte bypass and null-byte injection, default credential testing against 15 common pairs

Each tool returns structured results (success, data, error), truncated to 2,000 characters to prevent context bloat.

Guardrails

The loop terminates on Level 4 achievement, 50 iterations, 5 consecutive failures, stuck-loop detection (same tool + arguments called 3 times), or a configurable per-scan cost limit. Destination guardrails block loopback and link-local addresses in production, preventing the tool from inadvertently attacking its own infrastructure.


Cloud-Native Attack Chains

This is Kraken’s sharpest differentiator. No other automated tool chains web-layer vulnerabilities into cloud infrastructure compromise.

Azure: Web App to Tenant Owner in Seven Steps

The Azure playbook, encoded in Claude’s system prompt and validated end-to-end on AzureGoat:

  1. SSRF confirmation — fire an SSRF payload to read /etc/passwd, confirming the vector exists
  2. Credential extraction — SSRF to read local.settings.json, extracting the storage account name and key
  3. Storage enumeration — Azure CLI to list all blob containers in the storage account
  4. SSH key retrieval — download a private key from a dev container
  5. VM access — SSH into the VM, run az login -i to activate the managed identity
  6. Privilege discovery — enumerate Automation Accounts, find one with Owner-level permissions
  7. Privilege escalation — create and execute a PowerShell runbook that assigns Owner role to the VM’s managed identity

Each step unlocks the next. No single step is a “finding” in isolation. The chain is the finding.

AWS: Metadata to Administrator

The AWS playbook follows a parallel pattern:

  1. S3 bucket enumeration — derive bucket names from the target hostname, test for public listing and ACL misconfigurations
  2. IMDS exploitation — SSRF to the EC2 metadata service (169.254.169.254) to extract temporary IAM credentials
  3. IAM enumeration — map the caller’s identity, attached policies, and accessible services
  4. Permission simulation — test 22 high-value IAM actions (like iam:CreatePolicyVersion, iam:AttachUserPolicy, iam:PassRole) via the IAM policy simulator — stealthier than brute-force enumeration
  5. Privilege escalation — exploit dangerous IAM permissions to achieve AdministratorAccess or EC2 shell

Not Scripts — Knowledge

These playbooks are not hardcoded in Python. They are encoded as knowledge in Claude’s system prompt. Claude uses reasoning to decide when to follow the playbook and when to deviate. If the SSRF path is blocked, it adapts — tries Azure CLI directly, looks for exposed .env files, checks for default credentials. The system prompt says: “When HTTP is blocked, use Azure CLI tools.” The AI interprets this guidance in context.


Compound Learning: The Vault

Kraken’s second architectural layer is inspired by Andrej Karpathy’s concept of a “second brain” — a persistent, AI-curated knowledge base that compounds intelligence across every engagement.

After each scan, Kraken writes a raw record to the vault. Credentials are stripped. Techniques, tool sequences, and outcomes are preserved. Claude generates a 3-5 sentence synthesis of what worked, what failed, and why. The vault stores patterns, not secrets.

Every 10 scans, Claude rewrites the entire wiki from scratch — deduplicating patterns, removing contradictions, and weighting Level 3-4 results and frontier model scans more heavily. This is not appending logs. It is AI-curated knowledge synthesis.

Before each new scan, query_vault() retrieves relevant learnings filtered by cloud type and vulnerability tags, and injects them into Claude’s initial message. The agent starts each engagement knowing what worked last time against similar targets — and what didn’t.

The practical result: Kraken’s first scan against a new target class is good. Its tenth is meaningfully better. Its hundredth reflects accumulated intelligence from every prior engagement.


Cost-Aware by Design

AI-driven pentesting could be expensive. Kraken is engineered to keep costs practical.

The system prompt and tool declarations — identical on every iteration — are cached using Anthropic’s prompt cache API. On a 50-iteration scan, this eliminates 49 redundant re-reads, reducing input token cost by roughly 90%. Model routing sends cloud-specific targets (which need deeper reasoning) to Claude Opus and generic web scans to Claude Sonnet. A configurable per-scan cost cap automatically terminates scans that exceed budget.


What This Means for Your Security Program

Kraken is not replacing human pentesters. It is extending what a security team can do — running continuous, adaptive assessments at a fraction of the time and cost of a manual engagement.

For organizations running on Azure or AWS, the cloud-native attack chains represent a capability that does not exist in any other automated tool. No scanner chains a web SSRF into cloud tenant compromise. Kraken does, because that is what a real attacker would do.

The compound learning vault means the system improves with use. Every scan contributes to the next. Techniques that work are reinforced. Dead ends are catalogued and avoided. The tool gets smarter the more you use it.

This is the difference between an AI that reasons and an AI that learns. Kraken does both.

Kraken is built by ThreatMate. Authorized security testing only.