http://./ is a valid url #146

Closed
anonrig opened this issue Jan 28, 2023 · 17 comments
Labels: bug, specification issue

Comments

@anonrig
Member

anonrig commented Jan 28, 2023

The input http://./ is accepted as valid by both Safari and Chrome, but our parser rejects it.

@anonrig anonrig added the bug label Jan 28, 2023
@anonrig anonrig closed this as not planned Jan 28, 2023
@anonrig anonrig reopened this Jan 28, 2023
@anonrig
Member Author

anonrig commented Jan 28, 2023

I'm reopening this because we were skipping this URL in the input, but we shouldn't have.

@anonrig anonrig changed the title http://./ is invalid http://./ is a valid url Jan 28, 2023
@lemire
Member

lemire commented Jan 28, 2023

Why? The standard is clear on this. Labels must be between 1 and 63 bytes. If we misread the standard, can you quote the relevant section?

@anonrig
Member Author

anonrig commented Jan 28, 2023

I haven't looked into the spec for this, but there is a section in WPT labeled "domains with empty labels": https://github.com/web-platform-tests/wpt/blob/master/url/resources/urltestdata.json#L3889

@lemire
Member

lemire commented Jan 28, 2023

RFC 1034: Internally, programs that manipulate domain names should represent them
as sequences of labels, where each label is a length octet followed by
an octet string. Because all domain names end at the root, which has a
null string for a label, these internal representations can use a length
byte of zero to terminate a domain name.
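
To make the quoted passage concrete, here is a minimal C++17 sketch of that internal representation: each label becomes a length octet followed by its octets, and a zero length octet for the root ends the name. The function name `encode_labels` is hypothetical and not part of any library discussed here, and rejecting empty or over-long labels follows the 1 to 63 byte reading given earlier in this thread. The sketch does not check any limit on the total name length.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: encode a sequence of labels into the internal form
// described in the RFC 1034 passage (length octet + label octets, ending
// with a zero length octet for the root).
// Returns std::nullopt if any label is empty or longer than 63 octets.
std::optional<std::vector<std::uint8_t>> encode_labels(
    const std::vector<std::string>& labels) {
  std::vector<std::uint8_t> out;
  for (const std::string& label : labels) {
    // The zero-length label is reserved for the root, so it may only
    // appear as the terminator appended below.
    if (label.empty() || label.size() > 63) {
      return std::nullopt;
    }
    out.push_back(static_cast<std::uint8_t>(label.size()));
    out.insert(out.end(), label.begin(), label.end());
  }
  out.push_back(0);  // root label: a length byte of zero terminates the name
  return out;
}
```

Under these assumptions, the labels `www`, `example`, `com` encode to 17 octets ending in a zero byte, and any sequence containing an empty label is rejected.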

@lemire
Member

lemire commented Jan 28, 2023

One label is reserved, and that is
the null (i.e., zero length) label used for the root.

@lemire
Member

lemire commented Jan 28, 2023

A label may contain zero to 63 characters. The null label, of length zero, is reserved for the root zone. https://en.m.wikipedia.org/wiki/Domain_Name_System

@anonrig
Member Author

anonrig commented Jan 28, 2023

Can you open an issue in the web-platform-tests repository? Even though you're right, we can't remove this test from the Node repository without changing WPT first.

@lemire
Member

lemire commented Jan 28, 2023

"The hierarchy of domains descends from the right to the left label in the name; each label to the left specifies a subdivision, or subdomain of the domain to the right. For example: the label example specifies a node example.com as a subdomain of the com domain, and www is a label to create www.example.com, a subdomain of example.com. Each label may contain from 1 to 63 octets. The empty label is reserved for the root node and when fully qualified is expressed as the empty label terminated by a dot. The full domain name may not exceed a total length of 253 ASCII characters in its textual representation." https://en.m.wikipedia.org/wiki/Domain_name

@lemire
Member

lemire commented Jan 28, 2023

Seems related to this: servo/rust-url#554

So something can be an invalid URL, but still pass through the algorithm.

@miguelteixeiraa
Contributor

About the link that Yagiz provided (WPT tests):

None of the tests/examples in the section mentioned (domains with empty labels) are FQDNs (fully qualified domain names).
To be considered an FQDN, the domain name must include a Second-Level Domain (SLD) and a Top-Level Domain (TLD).

An example of an FQDN is www.example.com, where "www" is the hostname (not required), "example" is the second-level domain (SLD), and ".com" is the top-level domain (TLD).

@miguelteixeiraa
Contributor

I'm looking for references that state these limits/sizes/rules only apply to FQDNs.

@miguelteixeiraa
Contributor

miguelteixeiraa commented Jan 28, 2023

> I'm looking for references that state these limits/sizes/rules only apply to FQDNs.

I couldn't find anything. But we could consider applying the rules only when the name has the basic structure of an FQDN (at least 2 non-empty labels) 🤔

@lemire
Member

lemire commented Jan 28, 2023

@miguelteixeiraa I think we should add a method to the URL struct, such as bool is_fully_qualified_domain_name() const, that checks that the host is indeed a fully qualified domain name, possibly along with other checks.

> I'm looking for references that state these limits/sizes/rules only apply to FQDNs.

I don't think they do.

> apply the rules only when there is the basic structure to be a fqdn (at least 2 non-zero labels) )

My proposal is rather to parse the URL successfully, irrespective of label lengths and so forth, but to have a method like is_valid_domain() const that does additional checks.
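
A minimal C++17 sketch of what such a separate check could look like, assuming the rules quoted earlier in this thread (each label 1 to 63 octets, at most 253 characters in total). The `parsed_url` struct is hypothetical and stands in for the real URL struct; only the method name `is_valid_domain()` comes from the proposal above. Treating a host made up of only the root label, such as `.`, as failing the check is also an assumption, following the reading argued in this thread.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the URL struct: the parser stores the host
// as-is, and the DNS-style checks live in a separate, opt-in method.
struct parsed_url {
  std::string host;  // kept exactly as parsed, with no label checks applied

  // Assumed rules: every label is 1 to 63 octets long and the textual name
  // (without an optional trailing dot) is at most 253 characters.
  bool is_valid_domain() const {
    std::string_view name = host;
    // One trailing dot only marks the name as fully qualified.
    if (!name.empty() && name.back() == '.') {
      name.remove_suffix(1);
    }
    if (name.empty() || name.size() > 253) {
      return false;
    }
    while (true) {
      const std::size_t dot = name.find('.');
      const std::string_view label = name.substr(0, dot);
      if (label.empty() || label.size() > 63) {
        return false;
      }
      if (dot == std::string_view::npos) {
        return true;  // last label checked
      }
      name.remove_prefix(dot + 1);
    }
  }
};
```

With this split, a host such as `.` can still be parsed and stored, while `is_valid_domain()` reports `false` for it; hosts such as `example.com` or `www.example.com.` pass the check.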

@lemire
Member

lemire commented Jan 28, 2023

@anonrig I don't think that http://./ is a valid URL. To prove me wrong, please register it. I don't think you can.

It is a valid URL string as per https://url.spec.whatwg.org/, but that specification mentions RFC 1034 only once, in passing, and does not appear to try to abide by it at all.

@lemire
Member

lemire commented Jan 28, 2023

To be clear, I still think it is fine to accept and parse it…

@lemire
Member

lemire commented Jan 31, 2023

@anonrig

Is this resolved?

@anonrig
Member Author

anonrig commented Jan 31, 2023

We can close this for now 👍

@anonrig anonrig closed this as completed Jan 31, 2023
3 participants