-
-
Notifications
You must be signed in to change notification settings - Fork 35.4k
url: ignore IDN errors when domainname have two hyphens #12966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
2ee246f
9dee867
311667b
5e0f58c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are valid domain names with hyphens at 3 and 4th position, new node WHATWG URL parser was failing for it assume its an invalid IDN. Also filters IDN errors when domain label start or end with hyphen. Fixes: #12965 Refs: https://www.icann.org/news/announcement-2000-01-07-en
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -468,6 +468,21 @@ int32_t ToUnicode(MaybeStackBuffer<char>* buf, | |
| // #46. Therefore, explicitly filters out that error here. | ||
| info.errors &= ~UIDNA_ERROR_EMPTY_LABEL; | ||
|
|
||
| // These error conditions are mandated unconditionally by UTS #46 version | ||
| // 9.0.0 (rev. 17), but were found to be incompatible with many actual domain | ||
| // names in the wild. As such, in the current UTS #46 draft (rev. 18) these | ||
| // checks are made optional depending on the CheckHyphens flag, which will be | ||
| // disabled in WHATWG URL's "domain to ASCII" algorithm as soon as the UTS #46 | ||
| // draft becomes standard. | ||
| // Refs: | ||
| // - https://github.com/whatwg/url/issues/53 | ||
| // - http://www.unicode.org/review/pri317/ | ||
| // - http://www.unicode.org/reports/tr46/tr46-18.html | ||
| // - https://www.icann.org/news/announcement-2000-01-07-en | ||
| info.errors &= ~UIDNA_ERROR_HYPHEN_3_4; | ||
| info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN; | ||
| info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN; | ||
|
|
||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately I was mistaken in my understanding of ToUnicode, which means that the entire block (ll. 464-485) shouldn't be necessary. While at it, can you delete this block, the erroneous
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
| if (U_FAILURE(status) || (!lenient && info.errors != 0)) { | ||
| len = -1; | ||
| buf->SetLength(0); | ||
|
|
@@ -516,7 +531,21 @@ int32_t ToASCII(MaybeStackBuffer<char>* buf, | |
| info.errors &= ~UIDNA_ERROR_EMPTY_LABEL; | ||
| info.errors &= ~UIDNA_ERROR_LABEL_TOO_LONG; | ||
| info.errors &= ~UIDNA_ERROR_DOMAIN_NAME_TOO_LONG; | ||
|
|
||
| // These error conditions are mandated unconditionally by UTS #46 version | ||
| // 9.0.0 (rev. 17), but were found to be incompatible with many actual domain | ||
| // names in the wild. As such, in the current UTS #46 draft (rev. 18) these | ||
| // checks are made optional depending on the CheckHyphens flag, which will be | ||
| // disabled in WHATWG URL's "domain to ASCII" algorithm as soon as the UTS #46 | ||
| // draft becomes standard. | ||
| // Refs: | ||
| // - https://github.com/whatwg/url/issues/53 | ||
| // - http://www.unicode.org/review/pri317/ | ||
| // - http://www.unicode.org/reports/tr46/tr46-18.html | ||
| // - https://www.icann.org/news/announcement-2000-01-07-en | ||
| info.errors &= ~UIDNA_ERROR_HYPHEN_3_4; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest the following code, which filters out several more conditions mandated by the UTS#46 draft, and adds an explicit comment explaining why this is: // These error conditions are mandated unconditionally by UTS #46 version
// 9.0.0 (rev. 17), but were found to be incompatible with many actual domain
// names in the wild. As such, in the current UTS #46 draft (rev. 18) these
// checks are made optional depending on the CheckHyphens flag, which will be
// disabled in WHATWG URL's "domain to ASCII" algorithm as soon as the UTS #46
// draft becomes standard.
// Refs:
// - https://github.com/whatwg/url/issues/53
// - http://www.unicode.org/review/pri317/
// - http://www.unicode.org/reports/tr46/tr46-18.html
info.errors &= ~UIDNA_ERROR_HYPHEN_3_4;
info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN;
info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN;I also suspect that these lines of code need to be duplicated for the ToUnicode method in the file, since there is a set of identical changes to ToUnicode in the UTS#46 draft.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. Will do.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @TimothyGu : Made changes as suggested. Filtering errors for trailing and leading hyphens. |
||
| info.errors &= ~UIDNA_ERROR_LEADING_HYPHEN; | ||
| info.errors &= ~UIDNA_ERROR_TRAILING_HYPHEN; | ||
|
|
||
| if (U_FAILURE(status) || (!lenient && info.errors != 0)) { | ||
| len = -1; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain to Unicode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. done all changes.