Skip to content

Handle URLs with a colon after host but no port#5108

Merged
ethomson merged 3 commits intomasterfrom
ethomson/urlparse_empty_port
Jun 13, 2019
Merged

Handle URLs with a colon after host but no port#5108
ethomson merged 3 commits intomasterfrom
ethomson/urlparse_empty_port

Conversation

@ethomson
Copy link
Copy Markdown
Member

Core git copes with URLs that have a colon after the port, but no actual numeric value. eg http://example.com:/foo.git or http://example.com:. That's horrible, but RFC 3986 says:

URI producers and normalizers should omit the port component and its
":" delimiter if port is empty or if its value would be the same as
that of the scheme's default.

Which indicates that they may and therefore we must accept it.

Test that we can handle URLs with a colon but no following port number, fix our http-parser implementation, and require use of our http-parser which corrects this issue.

Fixes #5100.

ethomson added 2 commits June 11, 2019 21:53
Core git copes with URLs that have a colon after the port, but no actual
numeric value.  eg `http://example.com:/foo.git` or
`http://example.com:`.  That's horrible, but RFC 3986 says:

> URI producers and normalizers should omit the port component and its
> ":" delimiter if port is empty or if its value would be the same as
> that of the scheme's default.

Which indicates that they may and therefore we must accept it.

Test that we can handle URLs with a colon but no following port number.
When the end of the host is reached, and we're at the colon separating
the host with the port (ie, there is no numeric port) then do not error.
This is allowed by RFC 3986.
@ethomson
Copy link
Copy Markdown
Member Author

I suspect this will be contentious. 😀

@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch 2 times, most recently from f135d57 to d8d0ac6 Compare June 12, 2019 12:41
@ethomson
Copy link
Copy Markdown
Member Author

I updated this to prefer out http-parser but allow users to specify the system http-parser if they want, even sans a version check, because this bug really isn't that serious. Basically, it allows them to opt-in to very slightly broken URL parsing if they want.

I updated the CMake option for this so that users with an existing option wouldn't accidentally opt-in to it, which also allowed us to align closer to the zlib option.

Copy link
Copy Markdown
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading RFC3986 several times I agree that your fix is correct. Some bikeshedding on the CMake option name, otherwise I'm fine with your changes. Thanks!

CMakeLists.txt Outdated
OPTION(DEBUG_POOL "Enable debug pool allocator" OFF)
OPTION(ENABLE_WERROR "Enable compilation with -Werror" OFF)
OPTION(USE_BUNDLED_ZLIB "Use the bundled version of zlib" OFF)
OPTION(USE_BUNDLED_HTTP_PARSER "Use the bundled HTTP Parser" ON)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't know we already had an option. If we're about to change, then we should use a generic USE_HTTP_PARSER=(bundled|system), if you ask me. The reasoning is that this would be much easier to extend. For example at a future point, we'd be able to add an option USE_HTTP_PARSER=/path/to/http_parser.

http-parser that we have fixed in our implementation. The bundled
library is now the default, but if you wish to force the use of the
system http-parser implementation despite incompatibilities, you can
specify `-DUSE_BUNDLED_HTTP_PARSER=OFF` to CMake.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should treat changes to CMake options as breaking changes and thus highlight this in a separate section

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we have a section titled "Breaking API Changes" at the bottom. CMake changes don't feel like they fit there, so I created a new section "Breaking CMake configuration changes" and slotted this in there. I also moved the "breaking" stuff to the top. We should be done breaking APIs, but if not, it should be the first thing people see in the release notes, IMO.

@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch from d8d0ac6 to 8b151b0 Compare June 13, 2019 18:57
@pks-t
Copy link
Copy Markdown
Member

pks-t commented Jun 13, 2019 via email

@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch 2 times, most recently from 531d9b3 to 3fb6d3d Compare June 13, 2019 19:50
Our bundled http-parser includes bugfixes, therefore we should prefer
our http-parser until such time as we can identify that the system
http-parser has these bugfixes (using a version check).

Since these bugs are - at present - minor, retain the ability for users
to force that they want to use the system http-parser anyway.  This does
change the cmake specification so that people _must_ opt-in to the new
behavior knowingly.
@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch from 3fb6d3d to fb529a0 Compare June 13, 2019 20:24
@ethomson ethomson merged commit e277ff4 into master Jun 13, 2019
@ethomson
Copy link
Copy Markdown
Member Author

Upstream: nodejs/http-parser#483

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handling of port component in URLs (with schema) differs from git

2 participants