Fix flaky test `test_horrible_queries` #3857
Conversation
`test_horrible_queries` can be flaky, but it's very hard to detect.
This commit fixes the flakiness. To understand the flakiness, let me
put the original test code below:
```ruby
# then that large mangled field values are caught
10.times do |c|
get = "GET /#{rand_data(10,120)} HTTP/1.1\r\nX-Test: #{rand_data(1024, 1024+(c*1024), false)}\r\n\r\n"
assert_raises Puma::HttpParserError do
parser.execute({}, get, 0)
parser.reset
end
end
```
The way I understand `rand_data(xxx, false)` is that the function should
_always_ return invalid bytes (IOW cause the parser to raise an
exception).
However, it is possible for that function to _sometimes_ return valid
bytes. The above test asserts that an error is raised when
`parser.execute` is called, but if an exception gets raised, then
`parser.reset` isn't called. If `parser.reset` isn't called, then
subsequent calls to `parser.execute` will raise an exception
_regardless_ of whether the input is parse-able or not. In other words,
if the first iteration caused an exception, subsequent iterations are
guaranteed to raise an exception.
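This failure mode can be reproduced outside of Puma with a toy parser that, like the C extension, stays in an error state until it is reset. `StickyParser` and `ParserError` here are hypothetical stand-ins for illustration, not Puma classes:

```ruby
class ParserError < StandardError; end

# Hypothetical stand-in for the C parser: once it raises, it keeps
# raising until #reset is called.
class StickyParser
  def initialize
    @broken = false
  end

  def execute(data)
    raise ParserError, "still in error state" if @broken
    if data.include?("\x00") # pretend a NUL byte is the only invalid input
      @broken = true
      raise ParserError, "bad byte"
    end
    :ok
  end

  def reset
    @broken = false
  end
end

parser = StickyParser.new

begin
  parser.execute("bad\x00input") # raises and leaves the parser broken
  parser.reset                   # never runs, exactly like the test above
rescue ParserError
end

results = 2.times.map do
  parser.execute("perfectly valid input")
  :ok
rescue ParserError
  :error
end
# results == [:error, :error] -- valid input still "fails" without a reset
```

Once the first `execute` raises, every later iteration passes `assert_raises` for the wrong reason.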
This commit does two things:
1. Pulls `parser.reset` outside of the assert_raises so that the flake
is more obvious / reproducible
2. Changes `rand_data` to always add an invalid byte so that if the
random data happens to contain parsable data, we'll at least have one
byte guaranteed to fail.
btw, it looks like we're seeing timeouts in Truffle? This could be because of my change. This loop:

```ruby
10.times do |c|
  get = "GET /#{rand_data(10,120)} HTTP/1.1\r\nX-Test: #{rand_data(1024, 1024+(c*1024), false)}\r\n\r\n"
  # ....
end
```

Is generating more and more data on each iteration (because the upper bound passed to `rand_data` grows with `c`).
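To put rough numbers on that, the maximum X-Test payload per iteration grows linearly with `c`, so the worst-case total across the loop is substantially larger than any single iteration:

```ruby
# Maximum X-Test payload per iteration, taken from the loop above:
# rand_data(1024, 1024 + (c * 1024)) can return up to 1024 + c * 1024 bytes.
max_sizes = 10.times.map { |c| 1024 + c * 1024 }

max_sizes.first # => 1024  (first iteration)
max_sizes.last  # => 10240 (tenth iteration)
max_sizes.sum   # => 56320 bytes of header data, worst case, per test run
```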
One question for the maintainers: is this test valuable? I mean, is the 10th iteration any more valuable than the 1st? Genuine question because I can't really tell (and I don't think it's my call). Especially since before my patch, the first iteration usually made subsequent iterations do nothing (unless we're testing the behavior of not calling `parser.reset`).
First off. Thank you for contributing! You're the best.
Just merged a fix on main. I triggered a rebase of your PR, it's green now.
I have resisted touching the HTTP parser code personally. This test almost looks like a cross between fuzzing and regression protection. I'm aware of fuzzing, but haven't actually invested in trying it on a codebase for real. Also, as I understand it, it's a lengthy, expensive process that you probably wouldn't want to run in CI. Might be some opportunity there. Running `git blame`, it was added circa 2006 (3c804d5).
IMHO, no. If you want to clean it up, splitting it into several real and distinct tests with names instead of comments, and removing the iteration, seems good. Since you mention it, it also doesn't look like we have a test that asserts that calling `parser.reset` actually clears the parser's error state.
Yes, that's what it looks like to me.
As long as the current test isn't causing timeouts, I feel it's probably fine to keep as-is.
Ok. I'll send a regression test! This PR inspired me to try my hand at a pure Ruby implementation (which is why I'm looking at this stuff). I want to study the "real" uses of the parser inside of Puma and try to add tests around that. For example, I suspect that internally, Puma will either always call `parser.reset` after a successful parse, or never reuse a parser instance at all.
The parser is used in the `Client` class. So after a request is parsed successfully, the client is reset. This happens in a loop (but only when using keepalive connections):

```ruby
while can_loop
  # ...
  case handle_request(client, requests + 1)
  # ...
  when :keep_alive
    requests += 1
    client.reset
```

Which clears the state of the parser among other things:

```ruby
def reset
  @parser.reset
```

Not having to allocate is a benefit of doing it this way, but at the cost of having more book-keeping and logic. In general we've leaned towards performance at the cost of complexity/maintenance (with the caveat that correctness and reliability or other guarantees take precedence over raw perf).
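The reuse pattern described above can be sketched with toy stand-ins. `ToyParser` and `ToyClient` are hypothetical, not Puma's real classes; the point is only that keep-alive reuses one client/parser pair and resets it between requests instead of allocating fresh objects:

```ruby
# Hypothetical stand-in: a parser whose state is cleared by #reset.
class ToyParser
  attr_reader :env

  def initialize
    reset
  end

  def execute(line)
    @env["REQUEST"] = line
  end

  def reset
    @env = {}
  end
end

# Hypothetical stand-in: a client that owns one parser for its lifetime.
class ToyClient
  def initialize
    @parser = ToyParser.new
  end

  def handle_request(line)
    @parser.execute(line)
    @parser.env.dup
  end

  # Mirrors the pattern above: clear parser state instead of reallocating.
  def reset
    @parser.reset
  end
end

client = ToyClient.new
requests = ["GET /a", "GET /b"].map do |line|
  env = client.handle_request(line)
  client.reset # keep-alive: reuse the same client/parser for the next request
  env
end
# requests == [{"REQUEST" => "GET /a"}, {"REQUEST" => "GET /b"}]
```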
That makes sense. The parser I've designed is a stateless singleton (with no side effects), so there isn't a reset. It takes an offset into a buffer (same as the current parser) and returns the new offset. For example (pseudo code):

```ruby
offset = 0
loop do
  req = {}
  offset = AaronParser.execute(req, buff, offset)
  # If it's a POST, the post body can be extracted from buff via the `offset` + content-length if desired
  # do stuff with req
end
```

I added a stateful wrapper around my stateless implementation only so that I can make the tests pass. Once I get the tests passing I may send a PR, but honestly it's just for fun and possibly a ZJIT benchmark.