@davishmcclurg
Contributor

My implementation doesn't yet support all idn-hostname label separators properly, but (for multiple reasons) it is still passing the tests added here: #760

These are the tests I came up with while fixing/reproducing the issues.

@davishmcclurg davishmcclurg requested a review from a team as a code owner June 16, 2025 00:20
@karenetheridge
Member

A lot of these fail in my implementation (they show as valid where the test says it's invalid), but that likely just means I need to mark these as TODO, due to a gap in the library I'm using.

@davishmcclurg
Contributor Author

Looking at the IDNA specifications some more, it's unclear if the extended set of label separators is included in RFC 5890. The only reference I can find is in an appendix of RFC 5891 ("Summary of Major Changes from IDNA2003" (RFC 3490)):

> Remove the dot separator from the mandatory part of the protocol.

Not sure what that means.

They're in RFC 3490 (as noted in #760), but I don't know how relevant that is since JSON schema uses 5890 for idn-hostname.

They're also included in Unicode technical standard 46, which references both IDNA versions. 🤷

Leading/trailing separators are similarly confusing.

Member

@jdesrosiers jdesrosiers left a comment

I've been looking into IDN validation requirements myself recently and I came across this PR. I'm sure I saw it before, but I wasn't in a place to understand it at the time.

Good call on adding tests for empty labels. That's definitely a gap we want to fill.

I hadn't noticed that the separator tests reference the wrong specification. I'm not sure either if the extended set of separators applies to IDNA2008. I'm ok with leaving them there and even adding to them until we can find some confirmation that it's not correct. It appears to be correct according to UTS #46 which is what most implementations are probably using anyway.

Comment on lines 447 to 451
{
"description": "dot separator with label that is too long when separator is respected",
"data": "παράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμα.com",
"valid": false
},
Member

I don't think these tests have anything to do with separators. If the label is too long, it's going to fail regardless of the separator. I think these just amount to duplicates of label length tests that we already have.
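
To illustrate the point about length, here is a minimal Python sketch (not taken from the test suite or from any implementation in this thread). It measures the first label of the test data in its Punycode form. `a_label_length` and `MAX_LABEL_OCTETS` are hypothetical names, and the sketch assumes the 63-octet DNS label limit is checked against the `xn--` form.

```python
# Illustrative only: a hypothetical helper, not part of the test suite.
MAX_LABEL_OCTETS = 63  # DNS label limit, assumed to apply to the "xn--" form


def a_label_length(label: str) -> int:
    """Length of a label as it would be encoded for DNS."""
    if label.isascii():
        return len(label)
    # Non-ASCII labels are Punycode-encoded and prefixed with "xn--".
    return len("xn--") + len(label.encode("punycode"))


hostname = "παράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμαπαράδειγμα.com"
label = hostname.split(".")[0]

# The label exceeds the limit on its own, whichever separator follows it.
print(a_label_length(label) > MAX_LABEL_OCTETS)
```

Because the first label is already too long, swapping the `.` for another separator would not change the outcome for this particular instance.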

These are mostly meant to test that the additional `idn-hostname` label separators are treated like `.` and are only allowed between labels.

Regular `hostname` should not support the extended label separators used in `idn-hostname`.

This tests that the extended label separators used in `idn-hostname` properly validate individual labels and don't treat the whole instance as a single label.
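
The behaviour these commit messages describe can be sketched roughly as follows. This is an assumption-laden illustration, not the code of any implementation discussed here. The separator set is the one listed in RFC 3490 section 3.1, and whether IDNA2008 requires it is the open question in this thread. `split_labels` and `separators_well_placed` are hypothetical names, and the sketch only checks separator placement, not the labels themselves.

```python
import re

# Separators from RFC 3490 section 3.1: full stop, ideographic full stop,
# fullwidth full stop, halfwidth ideographic full stop.
# Assumption: idn-hostname treats all four like "."; hostname only uses ".".
SEPARATORS = "\u002E\u3002\uFF0E\uFF61"
_SPLIT = re.compile(f"[{re.escape(SEPARATORS)}]")


def split_labels(hostname: str, extended: bool = True) -> list[str]:
    """Split on the extended separator set, or on "." only."""
    return _SPLIT.split(hostname) if extended else hostname.split(".")


def separators_well_placed(hostname: str, extended: bool = True) -> bool:
    """True when every separator sits between two non-empty labels."""
    return all(split_labels(hostname, extended))
```

With `extended=False`, a string such as `"a\u3002b"` stays a single label, which mirrors the expectation that regular `hostname` does not recognise the extra separators. Leading, trailing, and doubled separators all produce an empty label and are rejected, following the "only allowed between labels" wording above.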
@jdesrosiers jdesrosiers force-pushed the idn-hostname-separators branch from bef99a3 to ef212f5 Compare October 29, 2025 21:01
@jdesrosiers jdesrosiers force-pushed the idn-hostname-separators branch from ef212f5 to aab0875 Compare October 29, 2025 21:04
@jdesrosiers jdesrosiers merged commit 4cf5599 into json-schema-org:main Oct 29, 2025
3 checks passed