Agentic Blackbox Pentester

True black-box pentesting. We never see your code. Agents probe your application from the outside, exactly the way an attacker would.

Schedule a Technical Demo →

Point Keygraph at a URL and let it attack. Agents autonomously navigate your application with a real browser and terminal, just like a human pentester. Test as often as you like. No source code access, ever.

What we test.

Coverage spans the OWASP Top 10, validated with working exploits, not theoretical warnings. The classes below are where the agents spend most of their time. No Exploit, No Report.

A01

Broken Access Control

IDOR, privilege escalation, tenant isolation failures, and horizontal authorization bypasses across multiple roles.

A05

Security Misconfiguration

Exposed endpoints, default credentials, verbose errors, and permissive CORS or header policies.
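To make the CORS and header items concrete, here is a minimal sketch of the kind of check involved. It is not Keygraph's implementation: the function name, the issue labels, and the list of expected headers are assumptions chosen for illustration, and a wildcard origin is only something to review, since public APIs may use it on purpose.

```python
from typing import Dict, List

# Assumption: a small, illustrative set of headers a review might expect.
EXPECTED_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-content-type-options",
)

def audit_response_headers(probe_origin: str, headers: Dict[str, str]) -> List[str]:
    """Return issue labels for one response to a request sent with `probe_origin`."""
    # Header names are case-insensitive, so normalize before looking them up.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    issues: List[str] = []

    allow_origin = h.get("access-control-allow-origin")
    allow_creds = h.get("access-control-allow-credentials", "").lower() == "true"
    if allow_origin == "*":
        issues.append("cors-wildcard-origin")
    elif allow_origin == probe_origin and allow_creds:
        # The server echoed an untrusted origin and also allows credentials.
        issues.append("cors-reflected-origin-with-credentials")

    for name in EXPECTED_HEADERS:
        if name not in h:
            issues.append("missing-" + name)
    return issues
```

Calling it with an attacker-controlled origin such as `https://evil.example` and the response headers the server returned yields a list of labels to investigate, not confirmed findings.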

A07

Auth Failures

Auth bypass, session handling flaws, weak MFA, credential-stuffing exposure, and token lifecycle issues.

Business Logic

Custom invariants

Application-specific rules and assumptions you supply as context, tested for ways an attacker could violate them.

The hypothesis-driven testing loop.

The pentester runs three phases autonomously, adapting to what it discovers, and follows the arc of a human pentest: discover, attack, report.

Phase 01

Reconnaissance

Fingerprints the target's tech stack, maps endpoints using WhatWeb and Playwright-driven browser automation, captures HTTP traffic via mitmproxy, and optionally generates an OpenAPI specification from observed traffic.
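As a rough illustration of turning observed traffic into an OpenAPI skeleton, the sketch below groups captured requests into templated paths. It assumes the traffic has already been reduced to `(method, url)` pairs and that purely numeric path segments are IDs. Both are simplifications made for this example, not a description of how the product does it, and the output is a skeleton rather than a validated specification.

```python
import re
from typing import Dict, Iterable, Tuple
from urllib.parse import urlsplit

def paths_from_traffic(observed: Iterable[Tuple[str, str]]) -> Dict:
    """Build a minimal OpenAPI-style skeleton from (method, url) pairs."""
    paths: Dict[str, Dict] = {}
    for method, url in observed:
        segments = urlsplit(url).path.strip("/").split("/")
        # Assumption: an all-digit segment is a resource ID, so template it.
        templated = "/" + "/".join(
            "{id}" if re.fullmatch(r"\d+", seg) else seg for seg in segments
        )
        paths.setdefault(templated, {})[method.lower()] = {
            "responses": {"default": {"description": "observed in traffic"}}
        }
    return {
        "openapi": "3.0.3",
        "info": {"title": "Observed API (sketch)", "version": "0.0.0"},
        "paths": paths,
    }

# Hypothetical captured requests against a placeholder host.
spec = paths_from_traffic([
    ("GET", "https://app.example.test/api/users/42"),
    ("DELETE", "https://app.example.test/api/users/7"),
    ("GET", "https://app.example.test/api/health"),
])
```

Here the two `/api/users/<number>` requests collapse into a single `/api/users/{id}` entry with `get` and `delete` operations.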

Phase 02

Testing Loop

A strategy agent selects a vulnerability hypothesis, dispatches exploit agents to test it via real browser automation, evaluates outcomes, updates confidence scores, and tracks what's been attempted. The loop continues until high-confidence exploits are confirmed or the iteration budget is reached.
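The loop described above can be sketched in a few lines. This is a toy model under stated assumptions: the `Hypothesis` class, the fixed confidence step, the confirmation threshold, and the stubbed `attempt` callback are all hypothetical, and a real exploit attempt would drive a browser rather than return a canned boolean.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Hypothesis:
    name: str
    confidence: float  # belief, between 0 and 1, that the flaw is exploitable
    attempts: int = 0  # how many times this hypothesis has been tried

def run_testing_loop(
    hypotheses: List[Hypothesis],
    attempt: Callable[[Hypothesis], bool],
    budget: int = 10,
    confirm_at: float = 0.9,
    step: float = 0.25,
) -> List[str]:
    """Pick the most promising open hypothesis, test it, update confidence."""
    confirmed: List[str] = []
    for _ in range(budget):
        open_items = [
            h for h in hypotheses
            if h.name not in confirmed and h.confidence > 0.0
        ]
        if not open_items:
            break  # everything is either confirmed or ruled out
        current = max(open_items, key=lambda h: h.confidence)
        current.attempts += 1
        if attempt(current):
            current.confidence = min(1.0, current.confidence + step)
        else:
            current.confidence = max(0.0, current.confidence - step)
        if current.confidence >= confirm_at:
            confirmed.append(current.name)
    return confirmed

# Stubbed exploit agent: only the IDOR hypothesis ever succeeds.
candidates = [Hypothesis("idor", 0.5), Hypothesis("sqli", 0.25)]
confirmed = run_testing_loop(candidates, lambda h: h.name == "idor")
```

With these inputs the IDOR hypothesis is confirmed after two successful attempts, the SQL injection hypothesis is dropped after one failure, and the loop exits well inside the budget.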

Phase 03

Reporting

Synthesizes confirmed exploits into a pentest-grade report with business impact, proof-of-impact evidence, and step-by-step reproduction. Every vulnerability is marked as identified, validated, and recorded, so each run stands as penetration test evidence. Exports as PDF and Markdown.

Gray-box context configuration.

All fields optional. Leave blank for pure black box, or fill in to give the agents targeted guidance.

Credentials

Up to 4 login credentials. Supports Google OAuth, GitHub, and custom auth. Multiple credentials enable privilege escalation and IDOR testing across different roles and organizations.
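To show why more than one credential matters, here is a minimal sketch of a cross-account IDOR check. Everything in it is illustrative: the `fetch` callback stands in for an authenticated HTTP request, and the user names, resource IDs, and the deliberately unprotected invoice are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

def find_idor_candidates(
    owned: Dict[str, List[str]],
    fetch: Callable[[str, str], int],
) -> List[Tuple[str, str, str]]:
    """Return (requester, owner, resource_id) where a non-owner got HTTP 200."""
    findings = []
    for owner, resource_ids in owned.items():
        for requester in owned:
            if requester == owner:
                continue  # owners reading their own data is expected
            for resource_id in resource_ids:
                if fetch(requester, resource_id) == 200:
                    findings.append((requester, owner, resource_id))
    return findings

# Toy stand-in for a target app: "inv-2" is missing its ownership check.
OWNED = {"alice": ["inv-1"], "bob": ["inv-2"]}
UNPROTECTED = {"inv-2"}

def fake_fetch(user: str, resource_id: str) -> int:
    if resource_id in OWNED[user] or resource_id in UNPROTECTED:
        return 200
    return 403

idor_findings = find_idor_candidates(OWNED, fake_fetch)
```

With a single credential there is no second identity to replay requests as, which is why this class of test depends on supplying at least two accounts.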

Focus

Aim the agents at specific areas: your auth flow, a particular feature, a sub-service, or endpoints you're nervous about. You can even tell them: "I built this quickly and I'm not sure how secure it is."

Context

Describe business logic invariants, assumptions your app makes, legacy infrastructure, or tech stack details so the logic testing aims at the right places.

Avoid

Define scope exclusions for the assessment: specific features, flows, or sub-services the agents will not test. Useful for protecting production data, skipping known-good areas, or avoiding destructive operations.
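One way to picture a scope exclusion is as a filter applied before any request is sent. The sketch below assumes exclusions are written as shell-style glob patterns over URL paths. That format is an assumption for this example, and the function name and URLs are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List
from urllib.parse import urlsplit

def filter_in_scope(urls: Iterable[str], avoid: Iterable[str]) -> List[str]:
    """Keep only URLs whose path matches none of the avoid patterns."""
    patterns = list(avoid)
    kept = []
    for url in urls:
        path = urlsplit(url).path or "/"
        # fnmatchcase keeps matching case-sensitive on every platform.
        if not any(fnmatchcase(path, pattern) for pattern in patterns):
            kept.append(url)
    return kept

# Hypothetical discovered endpoints and exclusions.
discovered = [
    "https://staging.example.test/api/users",
    "https://staging.example.test/billing/refund",
    "https://staging.example.test/admin/delete-account",
]
in_scope = filter_in_scope(discovered, ["/billing/*", "*/delete-*"])
```

Only `/api/users` survives the filter here. The billing flow and the destructive delete endpoint are never touched.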

OpenAPI Spec

For the most thorough coverage, provide an OpenAPI specification. It gives the agents a complete inventory of your endpoints for attack surface enumeration.
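As a sketch of what an endpoint inventory looks like in practice, the function below walks the `paths` object of an already-parsed OpenAPI document and lists each operation. The function name and the sample spec are illustrative. The set of method keys follows the OpenAPI 3.0 Path Item Object.

```python
from typing import Dict, List, Tuple

# Operation keys defined on an OpenAPI 3.0 Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def enumerate_operations(spec: Dict) -> List[Tuple[str, str]]:
    """Return a sorted list of (METHOD, path) pairs found in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)

# Hypothetical spec fragment, already loaded into a dict.
sample_spec = {
    "openapi": "3.0.3",
    "paths": {
        "/users": {"get": {}, "post": {}},
        "/users/{id}": {"parameters": [], "get": {}, "delete": {}},
    },
}
attack_surface = enumerate_operations(sample_spec)
```

Each pair is a starting point for testing, including endpoints that no page in the UI links to and that crawling alone might miss.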

Deployment targets

Test against localhost dev setups via ngrok, or target staging and sandbox URLs after verifying domain ownership through DNS TXT record confirmation.
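DNS TXT confirmation generally works by asking you to publish a token that only the domain owner could add. The sketch below covers just the comparison step, with the DNS lookup left out and the records passed in as strings. The `example-site-verification=` prefix and the token are placeholders, not the record format Keygraph actually uses.

```python
import hmac
from typing import Iterable

def txt_record_proves_ownership(
    txt_records: Iterable[str],
    expected_token: str,
    prefix: str = "example-site-verification=",  # hypothetical record prefix
) -> bool:
    """Return True if any TXT record equals prefix + expected_token."""
    expected = (prefix + expected_token).encode()
    for record in txt_records:
        # Resolvers often return TXT values wrapped in double quotes.
        value = record.strip().strip('"').encode()
        if hmac.compare_digest(value, expected):
            return True
    return False

# Hypothetical records returned for a staging domain.
records = ['"v=spf1 -all"', '"example-site-verification=3f9a1c"']
verified = txt_record_proves_ownership(records, "3f9a1c")
```

Unrelated records such as SPF entries are ignored, and a mismatched token leaves the domain unverified.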

Diminishing Returns Detection

Testing stops when confidence plateaus, rather than running until the iteration budget is spent.

The agent tracks its own progress and detects the knee in the curve, the point where additional testing iterations start returning diminishing value.
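For intuition about finding a knee, here is a simple heuristic: normalize the progress curve and pick the point that rises furthest above the straight line from first to last value. This is a common simplification used here for illustration, not necessarily the method the product uses. The function name and sample numbers are made up.

```python
from typing import Optional, Sequence

def knee_index(values: Sequence[float]) -> Optional[int]:
    """Index where a rising, flattening curve bends most, or None if no bend."""
    n = len(values)
    if n < 3:
        return None
    first, last = values[0], values[-1]
    if last == first:
        return None  # flat curve, nothing to normalize against
    best_index, best_gap = None, 0.0
    for i, value in enumerate(values):
        x_norm = i / (n - 1)
        y_norm = (value - first) / (last - first)
        gap = y_norm - x_norm  # height above the first-to-last chord
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index

# Hypothetical cumulative progress per iteration: fast gains, then a plateau.
progress = [0, 4, 7, 8, 8, 8, 8]
knee = knee_index(progress)
```

For this curve the knee lands at iteration index 2. A perfectly linear curve has no knee and returns `None`.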

See it attack your stack.

Point Keygraph at a staging URL. Get a pentest-grade report in hours, not weeks, with proof-of-impact evidence and step-by-step reproduction.