Fix issues with Proxy Authentication after httpclient refactor#5852
Conversation
68ed80b to
1e0dfbb
Compare
|
Sorry about the force pushes, there was a slight debate over whether it should be |
|
ehhh... We found another unrelated proxy issue with the |
|
I've updated the original description with the second issue. This seems to alleviate the issue we're seeing. |
|
|
||
| if ((error = apply_server_credentials(buf, client, request)) < 0 || | ||
| (error = apply_proxy_credentials(buf, client, request)) < 0) | ||
| (!client->proxy_connected && (error = apply_proxy_credentials(buf, client, request)) < 0)) |
There was a problem hiding this comment.
So thinking through this. You mention:
we continue to apply proxy-authorization headers to all requests even after we've already built a tunnel
which describes the CONNECT case. From what you're saying, I agree that we're definitely doing the wrong thing here: we would be sending Proxy-Authentication through the tunnel to the git server, which is clearly wrong.
But this auth step is for the non-CONNECT proxy case - when you're sending a GET http://foo.com/ through the proxy itself. If you're authenticating with a mechanism like HTTP Basic to the proxy, then you would need to supply the credentials on every proxy request.
The CONNECT proxy case would have supplied the credentials when it sent the CONNECT request itself, in generate_connect_request. So this is explicitly just for non-CONNECT proxies.
(I realize that it's remarkably uncommon to not speak over a CONNECT proxy, but here we are.)
I think that proxy_connected is not the right test here for that case. proxy_connected is set by proxy_connect which is only called in http_client_connect when:
use_proxy = client->proxy.url.host &&
!strcmp(client->server.url.scheme, "https");
So I think that proxy_connected will always be false when we're talking to a non-CONNECT proxy.
So my reading of this is that we're no longer passing proxy authentication to non-CONNECT proxies.
(But I'm surprised that we don't test this in our proxy integration tests. 😢 I know that speaking to non-https git servers is uncommon, but it's not a thing that is never done, and I think that we should support it - and support it through a proxy, so that means that we should test it. So I want to think through this a little bit more before I claim that I'm super confident of this analysis.)
I think that we might want to pull out the use_proxy test in http_client_connect into its own function, eg:
static bool use_connect_proxy(git_http_client *client)
{
return client->proxy.url.host && !strcmp(client->server.url.scheme, "https");
}
And make this line:
if ((error = apply_server_credentials(buf, client, request)) < 0 ||
(!use_connect_proxy(client) && (error = apply_proxy_credentials(buf, client, request)) < 0))
But - again - I'm not 100% on my analysis. Does this sound right to you?
There was a problem hiding this comment.
So I think that proxy_connected will always be false when we're talking to a non-CONNECT proxy.
So my reading of this is that we're no longer passing proxy authentication to non-CONNECT proxies.
I believe if proxy_connected is always false, then we always pass the Proxy-Authentication headers through, because we perform !client->proxy_connected aka stop sending the Proxy-Authentication headers if we're already connected.
I am not positive with your analysis, but I will go back to our test bed to confirm. I am not positive, because prior to the httpclient refactor, we always attached Proxy-Authentication headers unless we had successfully connected to the proxy. I believe that is what I've updated httpclient to do as well, since proxy_connected is not set until we have established a connection to the proxy.
My intuition with your suggestion is that we would actually regress again, because use_connect_proxy would always be set for an httpclient that has a connect proxy, therefore we would continue sending the Proxy-Authentication headers through after connect has occurred. However, I don't understand the area fully, so maybe the use_connect_proxy will return false on the direct-line to the git server and that's where I am confused with your analysis.
I'll get back to you today.
There was a problem hiding this comment.
Well I'll be damned. Yea, that totally works in testing. In fact, it works better than what I had originally wrote in that there is less traffic to the git server. This change you suggested removes one round trip... Basically:
In my PR as written:
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\nProxy-Authorization: Negotiate {omitted}\r\n\r\n"
"GET /libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"POST /libgit2/TestGitRepository/git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: application/x-git-upload-pack-result\r\nContent-Type: application/x-git-upload-pack-request\r\nContent-Length: 429\r\n\r\n"
After your suggestion
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\nProxy-Authorization: Negotiate {omitted}\r\n\r\n"
"GET /libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"POST /libgit2/TestGitRepository/git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: application/x-git-upload-pack-result\r\nContent-Type: application/x-git-upload-pack-request\r\nContent-Length: 429\r\n\r\n"
There was a problem hiding this comment.
I'm going to try and get an intuition for that now :).
There was a problem hiding this comment.
My intuition with your suggestion is that we would actually regress again, because use_connect_proxy would always be set for an httpclient that has a connect proxy
Yes I would expect use_connect_proxy to always be true when set for an httpclient that had a connect proxy. My thinking is that we should always set proxy credentials in this method when not using a connect proxy and never set proxy credentials in this method when we are.
There was a problem hiding this comment.
I see. Thanks for explaining that. I understand and agree with your take on this. We leverage our proxy credentials in a connect proxy in generate_connect_request, and thus don't need to ever add them in this method when we are a connect proxy. This makes sense. (looks like I'm the one who missed a ! 😄).
I still think I standby this approach I took (though I can see how it is inferior), because proxy_connected will never be set for non CONNECT proxies, therefore we will always send the Proxy-Authentication when in generate_request. I believe that it works for the same reason that the original flow worked (there is external mutable state that signals whether we have leveraged CONNECT or not).
Are you certain that the approach I took will have a regression for non-CONNECT proxies? (asking, because we're shipping soon and debating on whether we need to take your suggestions before or if they can be taken later 😄)
There was a problem hiding this comment.
Apologies for the ping, I edited that last one and want to make sure that you get the notification @ethomson.
There was a problem hiding this comment.
Are you certain that the approach I took will have a regression for non-CONNECT proxies?
🤔 No, I'm not certain. I'm armchair reviewing from my phone. 🙃
I appreciate you explaining this, I think that you're right and that your approach should work. I had trouble reasoning about proxy_connected.
I'd like to spin up some tests to play with this more but I wouldn't have too many hesitations about shipping hours as if if I were you. Worst case scenario: http (not https) auth breaks to git servers through a proxy. I'm actually not even sure that we support auth over http at the layer above this.
There was a problem hiding this comment.
I had trouble reasoning about
proxy_connected.
Mutable state was easy to reason about? 😂
I am going to take your solution, because it ultimately relies not on whether a connection was ever established, but on the fact that if one was ever established, that check must also hold. I prefer the non mutable approach because it's much easier to reason about.
Thanks for the prompt reply, that helped us make a decision.
a2cb816 to
3473a08
Compare
|
Thanks for this! |
Issue 1
When skipping the body, we used to wait for the
on_message_completecallback to have been triggered, which setparse_finished. After the refactor withhttpclient, we stopped relying on this value, and thus we no longer consume the entire body when skipping it in the client. Here's where we used to consume the bodylibgit2/src/transports/http.c
Line 956 in bf55fac
This shows up as an issue primarily when working with the proxy connection workflow, where we need to discard the entire body (completely drain the socket of the response) before we start the retry with credentials.
Issue 2
We used to mutate
proxy_opts.typetoGIT_PROXY_NONEafter successfully connecting the tunnel. After the httpclient refactor, we continue to apply proxy-authorization headers to all requests even after we've already built a tunnel. This is fixed by checking the status of the proxy tunnel before applying proxy credentials. I think this roughly translates to the same behavior as before.This is where we used to mutate the
proxy_opts.typebeforelibgit2/src/transports/http.c
Line 1032 in bf55fac
libgit2/src/transports/http.c
Lines 233 to 235 in bf55fac