Free tool

URL Parser Confusion Analyzer

Paste a URL and see how different parsers read it side by side, with the host each one extracts highlighted. The bug class is parsers disagreeing on the host, which defeats an allowlist or denylist check and opens the door to server-side request forgery (SSRF) and open redirects.

The URL stays here. Everything runs in your browser. Nothing is sent to any server.
Confusion examples (invented hosts, evil.example is the attacker)

Why parsers disagree on the host

A URL looks simple, but the rules for splitting it into a scheme, an authority, a path, and the rest are full of edge cases. The authority is the dangerous part, because that is where the host lives, and the host is what every allowlist and denylist check is really trying to pin down. When the code that checks the URL reads the host one way and the code that sends the request reads it another way, an attacker lives in that gap. This tool shows the same URL through more than one parser so the disagreement is visible. The behavior shown for the browser-native view follows the WHATWG URL standard, and the userinfo and authority rules referenced below come from RFC 3986.
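As a rough illustration of the pieces named above, the sketch below splits an invented URL with the built-in URL class. The URL and the `parts` object are made up for this example and are not taken from the tool.

```javascript
// Break an invented URL into the parts discussed above.
// host.example and the credentials are placeholders for illustration only.
const url = new URL("https://user:secret@host.example:8443/docs/page?x=1#top");

const parts = {
  scheme: url.protocol,                         // "https:" (includes the colon)
  userinfo: `${url.username}:${url.password}`,  // "user:secret"
  host: url.hostname,                           // "host.example"
  port: url.port,                               // "8443"
  path: url.pathname,                           // "/docs/page"
  query: url.search,                            // "?x=1"
  fragment: url.hash,                           // "#top"
};

console.log(parts);
```

The userinfo, host, and port together make up the authority. Only `hostname` is the value an allowlist or denylist should compare against.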

The WHATWG view is what the browser uses

The first parser is the browser-native new URL(). This is the same logic the fetch API uses to decide where a request goes, so for anything that runs in a browser, the host it returns is usually the one that actually gets contacted. It applies the full standard, which means it treats a backslash like a forward slash for the special schemes (http, https, ws, wss, ftp, and file), removes tab and newline characters anywhere in the input, trims leading and trailing control characters and spaces, and converts an international host into its punycode form. The other views exist to show how a less careful parser drifts away from this one.
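The behaviors described above can be observed directly with new URL(). This is a small sketch using invented hosts, with evil.example standing in for the attacker. Note that "\\" inside a JavaScript string literal is a single backslash character.

```javascript
// Userinfo: everything before "@" in the authority is credentials, not the host.
const withUserinfo = new URL("https://trusted.example@evil.example/");
console.log(withUserinfo.hostname); // "evil.example"

// Backslash: for a special scheme such as https, "\" ends the authority like "/".
const withBackslash = new URL("https://trusted.example\\@evil.example/");
console.log(withBackslash.hostname); // "trusted.example"
console.log(withBackslash.pathname); // "/@evil.example/"

// Tab and newline characters are removed from the input before parsing.
const withTab = new URL("https://evil\t.example/");
console.log(withTab.hostname); // "evil.example"

// An international host is converted to its punycode form.
const international = new URL("https://bücher.example/");
console.log(international.hostname); // "xn--bcher-kva.example"
```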

The naive split is what a careless check does

The second parser is a deliberately simple one. It does what a quick validation routine often does: take the text after the scheme separator and before the first forward slash and call that the host. It does not know about userinfo, it does not fold a backslash into a slash, and it does not apply punycode. That is the point. When this naive host and the WHATWG host differ, the URL passes a check aimed at the naive host while the request goes to the WHATWG host. That single disagreement is the engine behind a large share of server-side request forgery and open redirect findings.
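A minimal sketch of that disagreement follows. The `naiveHost` and `compareHosts` functions are hypothetical stand-ins written for this example, not the tool's actual code, and the hosts are invented.

```javascript
// Naive view: the text after "://" and before the first "/".
// It ignores userinfo, backslashes, and punycode on purpose.
function naiveHost(input) {
  const start = input.indexOf("://");
  if (start === -1) return "";
  const rest = input.slice(start + 3);
  const slash = rest.indexOf("/");
  return slash === -1 ? rest : rest.slice(0, slash);
}

// WHATWG view: whatever new URL() reports, or null if the input does not parse.
function whatwgHost(input) {
  try {
    return new URL(input).hostname;
  } catch {
    return null;
  }
}

// Put both views side by side and report whether they agree.
function compareHosts(input) {
  const naive = naiveHost(input);
  const whatwg = whatwgHost(input);
  return { naive, whatwg, agree: naive === whatwg };
}

const samples = [
  "https://trusted.example/path",
  "https://trusted.example@evil.example/",
  "https://evil.example\\@trusted.example/",
];
for (const sample of samples) {
  console.log(sample, compareHosts(sample));
}
```

In the last sample the naive host ends with trusted.example, so a careless suffix check would pass it, while new URL() reports evil.example as the host.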

The confusion primitives

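The tool names each trick it spots in the input and explains what it does. The tricks this page refers to are userinfo placed before the host, a backslash read as a forward slash, stripped control characters, and an international host converted to punycode.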

How to read the verdict

The headline result is whether the WHATWG host and the naive host agree. When they disagree, the tool says so plainly and shows both, because that is the bypassable case: a check that trusts one host while the request reaches the other. When they agree, the input is not automatically safe. It only means these two parsers line up on this string. The durable fix is to parse a URL once with a strict parser, validate the host it returns, and then pass that same parsed result to whatever makes the request, so no second parser ever gets a chance to disagree. Reasoning about which parser wins in a real stack is the kind of context an AI security testing approach works through rather than matching one string. For the wider set of trust boundaries attackers probe, see the access control writing.
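The parse-once pattern could look roughly like the sketch below. `ALLOWED_HOSTS` and `parseAndValidate` are hypothetical names, the allowed hosts are invented, and the https-only and no-credentials rules are assumptions added for strictness rather than requirements stated on this page.

```javascript
// Hypothetical allowlist of exact host names.
const ALLOWED_HOSTS = new Set(["trusted.example", "api.trusted.example"]);

// Parse once, validate the parsed host, and return the parsed URL object.
function parseAndValidate(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    throw new Error("not a valid absolute URL");
  }
  // Assumption: this sketch only accepts https.
  if (url.protocol !== "https:") {
    throw new Error(`scheme not allowed: ${url.protocol}`);
  }
  // Assumption: this sketch rejects embedded credentials outright.
  if (url.username !== "" || url.password !== "") {
    throw new Error("userinfo not allowed");
  }
  // Exact match against the host the parser returned, not the raw string.
  if (!ALLOWED_HOSTS.has(url.hostname)) {
    throw new Error(`host not allowed: ${url.hostname}`);
  }
  return url;
}

// Usage: hand the validated URL object to the request, never the raw input.
//   const url = parseAndValidate(userInput);
//   const response = await fetch(url);
console.log(parseAndValidate("https://trusted.example/report").href);
```

Because the request receives the same object that was validated, there is no second parse of the original string that could reach a different host.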
