Product

Agentic Blackbox Pentester

True black box pentesting. We never see your code. Agents probe your application from the outside, exactly the way an attacker would.


Point Keygraph at a URL and let it attack. Agents autonomously navigate your application with a real browser and terminal, just like a human pentester. Test as often as you like. No source code access, ever.

What we test.

OWASP Top 10 (2025) coverage, validated with working exploits. No theoretical warnings.

A01
Broken Access Control

IDOR, privilege escalation, tenant isolation failures, and horizontal authorization bypasses across multiple roles.

A02
Security Misconfiguration

Exposed endpoints, default credentials, verbose errors, and permissive CORS or header policies.
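As an illustration of what a permissive CORS policy looks like from the outside, here is a minimal sketch of one check. It assumes a probe request was already sent with an untrusted Origin header; the function name and finding strings are hypothetical, not part of Keygraph.

```python
def check_cors(probe_origin: str, response_headers: dict[str, str]) -> list[str]:
    """Flag permissive CORS behavior in the response to a probe request.

    probe_origin is the untrusted Origin value that was sent with the request.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allow_origin = headers.get("access-control-allow-origin", "")
    allow_credentials = headers.get("access-control-allow-credentials", "").lower() == "true"

    findings = []
    if allow_origin == "*":
        findings.append("wildcard origin allowed")
    elif allow_origin == probe_origin:
        findings.append("untrusted origin reflected")
        if allow_credentials:
            # A reflected origin plus credentials lets another site read
            # authenticated responses in the victim's browser.
            findings.append("reflected origin allowed with credentials")
    return findings
```

A reflected origin combined with credentials is usually the case worth escalating; a bare wildcard on public, unauthenticated data may be intentional.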

A07
Auth Failures

Auth bypass, session handling flaws, weak MFA, poor resistance to credential stuffing, and token lifecycle issues.

Business Logic
Custom invariants

Application-specific rules and assumptions you supply as context, tested for ways an attacker could violate them.

The hypothesis-driven testing loop.

The pentester runs three phases autonomously, adapting to what it discovers.

Phase 01
Reconnaissance

Fingerprints the target's tech stack with WhatWeb, maps endpoints with Playwright-driven browser automation, captures HTTP traffic via mitmproxy, and optionally generates an OpenAPI specification from the observed traffic.
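To make the last step concrete, here is a rough sketch of turning observed requests into a skeleton OpenAPI document. It assumes the traffic has already been captured and reduced to (method, URL, status) tuples; the function names and the ID heuristic are illustrative assumptions, and the output omits parameter declarations a complete spec would need.

```python
import re
from urllib.parse import urlsplit

# Path segments that look like identifiers (integers or UUIDs) become parameters.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)

def templatize(path: str) -> str:
    """Collapse concrete IDs so /users/42 and /users/43 map to one template."""
    parts, count = [], 0
    for segment in path.split("/"):
        if _ID_SEGMENT.match(segment):
            count += 1
            parts.append(f"{{id{count}}}")
        else:
            parts.append(segment)
    return "/".join(parts)

def build_openapi(observations, title="Observed API"):
    """observations: iterable of (method, url, status_code) tuples (hypothetical input shape)."""
    paths = {}
    for method, url, status in observations:
        template = templatize(urlsplit(url).path or "/")
        operation = paths.setdefault(template, {}).setdefault(method.lower(), {"responses": {}})
        operation["responses"].setdefault(
            str(status), {"description": "Observed in captured traffic"}
        )
    return {"openapi": "3.0.3", "info": {"title": title, "version": "0.0.0"}, "paths": paths}
```

A spec built this way only covers what the crawl happened to touch, which is why supplying your own specification gives broader coverage.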

Phase 02
Testing Loop

A strategy agent selects a vulnerability hypothesis, dispatches exploit agents to test it via real browser automation, evaluates outcomes, updates confidence scores, and tracks what's been attempted. Continues until high-confidence exploits are confirmed or the iteration budget is reached.
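The loop described above can be sketched in a few lines. This is a simplified stand-in, not Keygraph's implementation: the Hypothesis class, the fixed confidence increments, and the attempt_exploit callback (which takes the place of dispatching an exploit agent) are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    name: str
    confidence: float = 0.5  # starting belief that the issue is exploitable
    attempts: int = 0
    confirmed: bool = False

def run_testing_loop(hypotheses, attempt_exploit, budget=20,
                     confirm_at=0.85, gain=0.4, loss=0.2, give_up_below=0.05):
    """attempt_exploit(hypothesis) -> bool is a hypothetical callback."""
    for _ in range(budget):
        # Only hypotheses that are neither confirmed nor abandoned stay open.
        open_items = [h for h in hypotheses
                      if not h.confirmed and h.confidence > give_up_below]
        if not open_items:
            break
        # Strategy step: pick the most promising open hypothesis.
        current = max(open_items, key=lambda h: h.confidence)
        current.attempts += 1
        # Evaluate the outcome and update the confidence score.
        if attempt_exploit(current):
            current.confidence = min(1.0, current.confidence + gain)
        else:
            current.confidence = max(0.0, current.confidence - loss)
        current.confirmed = current.confidence >= confirm_at
    return [h for h in hypotheses if h.confirmed]
```

The loop ends either when every hypothesis is confirmed or abandoned, or when the iteration budget is spent, whichever comes first.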

Phase 03
Reporting

Synthesizes confirmed exploits into a pentest-grade report with business impact, proof-of-impact evidence, and step-by-step reproduction. Exports as PDF and Markdown.

Gray-box context configuration.

All fields optional. Leave blank for pure black box, or fill in to give the agents targeted guidance.

Credentials

Up to 4 login credentials. Supports Google OAuth, GitHub, and custom auth. Multiple credentials enable privilege escalation and IDOR testing across different roles and organizations.
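To show why more than one credential matters, here is a minimal sketch of a cross-account check: each user requests resources that belong to someone else. The fetch_as callback and the resource list are hypothetical; a real test would also compare response bodies, since a 200 status alone does not prove data was exposed.

```python
def find_idor_candidates(resources, fetch_as):
    """Return (user, url) pairs where a user could read another user's resource.

    resources: list of (owner, url) pairs.
    fetch_as(user, url) -> int is a hypothetical callback returning the HTTP
    status code seen when `user` requests `url` with their own session.
    """
    findings = []
    users = sorted({owner for owner, _ in resources})
    for owner, url in resources:
        for other in users:
            # A non-owner getting 200 is a candidate for manual confirmation.
            if other != owner and fetch_as(other, url) == 200:
                findings.append((other, url))
    return findings
```

With a single credential this check has nothing to compare against, which is why supplying accounts in different roles or organizations widens what can be tested.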

Focus

Direct testing toward specific areas: your auth flow, a particular feature, a sub-service, or endpoints you're nervous about. For example: "I built this quickly and I'm not sure how secure it is."

Context

Describe business logic invariants, assumptions your app makes, legacy infrastructure details, or tech stack info so the agents can test business logic more accurately.

Avoid

Exclude specific features, flows, or sub-services from testing. Useful for protecting production data, skipping known-good areas, or avoiding destructive operations.
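One way to picture an exclusion list is as a set of path patterns checked before any request is sent. The sketch below assumes glob-style patterns and a hypothetical is_in_scope helper; how Keygraph actually interprets the Avoid field is not specified here.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

def is_in_scope(url: str, avoid_patterns: list[str]) -> bool:
    """Return False when the URL's path matches any excluded glob pattern."""
    path = urlsplit(url).path or "/"
    # fnmatchcase keeps matching case-sensitive on every platform.
    return not any(fnmatchcase(path, pattern) for pattern in avoid_patterns)
```

For example, the pattern "/admin/billing/*" would keep every request under that prefix out of the test run.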

OpenAPI Spec

For the most thorough coverage, provide an OpenAPI specification. It gives the agents complete documentation of your endpoints for comprehensive analysis.

Deployment targets

Test against localhost dev setups via ngrok, or target staging and sandbox URLs after verifying domain ownership through DNS TXT record confirmation.
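DNS TXT verification generally works by asking you to publish a one-time token and then looking for it in the domain's TXT records. The sketch below covers only the comparison step, assuming the records were already fetched by a resolver; the "example-verification=" prefix and function name are placeholders, not Keygraph's actual record format.

```python
import hmac

def ownership_verified(txt_records, expected_token, prefix="example-verification="):
    """Return True if any TXT record carries the expected verification token.

    txt_records: TXT record strings as returned by a resolver (lookup not shown).
    prefix: hypothetical record prefix used for illustration.
    """
    expected = (prefix + expected_token).encode()
    for record in txt_records:
        # Resolvers often return TXT data wrapped in double quotes.
        value = record.strip().strip('"').encode()
        # Constant-time comparison avoids leaking how much of the token matched.
        if hmac.compare_digest(value, expected):
            return True
    return False
```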

Diminishing Returns Detection
Knee detection avoids wasteful redundant testing.

The agent tracks its own progress and detects when additional iterations are returning diminishing value. Testing stops once confidence plateaus instead of always running to the full iteration budget.
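A simple way to find a knee is to look for the point on the progress curve that sits farthest above the straight line joining its first and last values. The sketch below uses that chord-distance heuristic as a stand-in; the function names and the patience parameter are assumptions, and the product's actual detection method may differ.

```python
def knee_index(values):
    """Index of the point farthest above the line from the first to the last value.

    values: cumulative progress per iteration (e.g. confirmed findings so far).
    Returns None when the curve has no point above that line.
    """
    n = len(values)
    if n < 3:
        return None
    first, last = values[0], values[-1]
    best_index, best_gap = None, 0.0
    for i in range(1, n - 1):
        chord = first + (last - first) * i / (n - 1)
        gap = values[i] - chord
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index

def should_stop(values, patience=3):
    """Stop once at least `patience` iterations have passed since the knee."""
    knee = knee_index(values)
    return knee is not None and len(values) - 1 - knee >= patience
```

With progress values of 0, 4, 7, 8, 8, 8, 8 the knee falls at the third iteration, and four flat iterations after it are enough for should_stop to return True with the default patience.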

See it attack your stack.

Point Keygraph at a staging URL. Get a pentest-grade report in hours, not weeks.