Question 1

What is URL parser confusion?

Accepted Answer

URL parser confusion occurs when two pieces of code read the same URL and disagree about which host it points to. A validation layer might use a quick string check to decide that a URL is in scope, while the library that actually sends the request follows the full WHATWG rules. When the two see different hosts, the check approves a host the request never visits, and the request goes somewhere else. That gap is the root of many server-side request forgery (SSRF) and open redirect bugs.

Question 2

Does this tool send my URL anywhere?

Accepted Answer

No. The parsing and every check run entirely in your browser with JavaScript. The URL you paste is never uploaded, logged, or sent to any server. There are no network requests at all, so you can safely test internal links and real production URLs without them leaving your machine.

Question 3

Why does the host before the at sign not count?

Accepted Answer

In a URL, everything between the double slash that follows the scheme and the at sign is userinfo: the optional username and password. The real host is what comes after the at sign. So in https://acme.example@evil.example the host is evil.example, and acme.example is just a username. A naive check that looks for a trusted name anywhere in the string is fooled, because the trusted name sits in the credential slot while the request goes to the attacker's host.
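A minimal sketch of that mismatch, using the standard URL class available in browsers and Node.js. The helper names naiveIsTrusted and parsedIsTrusted are hypothetical, chosen only for illustration; the sample URL is the one from the answer above.

```javascript
// Sample URL from the answer: the trusted name sits in the userinfo slot.
const userinfoUrl = "https://acme.example@evil.example";

// Hypothetical naive check: is the trusted name anywhere in the string?
function naiveIsTrusted(url, trustedName) {
  return url.includes(trustedName);
}

// Stricter check: parse first, then compare the hostname exactly.
function parsedIsTrusted(url, trustedName) {
  return new URL(url).hostname === trustedName;
}

const parsedUserinfoUrl = new URL(userinfoUrl);
console.log(parsedUserinfoUrl.username); // "acme.example"
console.log(parsedUserinfoUrl.hostname); // "evil.example"
console.log(naiveIsTrusted(userinfoUrl, "acme.example"));  // true
console.log(parsedIsTrusted(userinfoUrl, "acme.example")); // false
```

The substring check approves the URL, while the parsed hostname shows the request would go to evil.example.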

Question 4

Why do browsers treat a backslash like a slash?

Accepted Answer

The WHATWG URL standard, which browsers and fetch follow, treats a backslash as if it were a forward slash for its special schemes, such as http and https. So a backslash can end the authority early and leave the real host somewhere a naive parser does not expect. A check that simply splits on the forward slash does not apply that rule, so the two disagree on where the host ends.
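A short sketch of the backslash rule in action. The input URL and the naiveHostBySlash helper are assumptions made up for illustration; the helper stands in for a check that splits on the forward slash only.

```javascript
// Illustrative input (an assumption): a backslash placed before the at sign.
const backslashUrl = "https://evil.example\\@acme.example/";

// Hypothetical naive parser: split on "/" only, take the authority,
// then drop anything up to the last "@" as userinfo.
function naiveHostBySlash(url) {
  const authority = url.split("/")[2] ?? "";
  return authority.split("@").pop();
}

const parsedBackslashUrl = new URL(backslashUrl);
console.log(naiveHostBySlash(backslashUrl)); // "acme.example"
console.log(parsedBackslashUrl.hostname);    // "evil.example"
console.log(parsedBackslashUrl.pathname);    // "/@acme.example/"
```

The naive split never sees the backslash as a separator, so it reads evil.example as userinfo. The WHATWG parser ends the authority at the backslash, so the at sign and the trusted name fall into the path.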

Question 5

Is the WHATWG result always the one that matters?

Accepted Answer

It is what a browser and the fetch API use, so for client-side requests it is usually the host that actually gets contacted. On a server the answer depends on the language and library making the request, and several of them follow older rules or their own logic. The safe approach is to parse the URL once with a strict parser, validate the host it returns, and pass that same parsed result to whatever makes the request, so no second parser can disagree.
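One way to sketch the parse-once approach in JavaScript, assuming the request is made with fetch so that the validator and the requester share the WHATWG parser. The ALLOWED_HOSTS set and the function names parseAndValidate and fetchValidated are hypothetical.

```javascript
// Hypothetical allowlist; replace with the hosts your application trusts.
const ALLOWED_HOSTS = new Set(["acme.example"]);

// Parse once and validate the host the parser returns.
// Returns the parsed URL object, or throws if the URL is out of scope.
function parseAndValidate(input) {
  const url = new URL(input); // throws TypeError on unparseable input
  if (!ALLOWED_HOSTS.has(url.hostname)) {
    throw new Error(`host not allowed: ${url.hostname}`);
  }
  return url;
}

// Hand the parsed object to the request, never the original string.
async function fetchValidated(input) {
  const url = parseAndValidate(input);
  return fetch(url);
}
```

Because fetchValidated passes on the URL object that was validated, the raw input string is never interpreted a second time by different logic. On a server whose HTTP client uses another parser, the same idea applies only if that client accepts the already-parsed components.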

URL Parser Confusion Analyzer

Why parsers disagree on the host

The WHATWG view is what the browser uses

The naive split is what a careless check does

The confusion primitives

How to read the verdict
