Conversation
When porting, we overlooked that the difference between git's and our's time representation and copied their way of getting the max value. Unfortunately git was using unsigned integers, so `~0ll` does correspond to their max value, whereas for us it corresponds to `-1`. This means that we always consider the last date to be smaller than the current commit's and always think commits are interesting. Change the initial value to the macro that gives us the maximum value on each platform so we can accurately consider commits interesting or not.
…tamp This is not a big deal, but it does make us match git more closely by checking only the first. The lists are sorted already, so there should be no functional difference other than removing a possible check from every iteration in the loop.
|
I've just ran and timed the reproducers from #4428 and a trimmed version of the reproducer in #4740 (single iteration) on the linux kernel repo. The times reported for libgit2 built from master (bc34cb6): For the current tip (12a1790) Honestly, I thought I did something wrong when I performed the second measurement for the #4428 reproducer, but apparently this particular regression is not resolved. However, the performance is much better for #4740. Edit: As expected, the performance of the #4428 reproducer also greatly increases if the commits are not sorted by time: |
|
#4428 doesn't actually seem like a regression, but a fix and a misunderstanding of what different flags ought to do. |
|
Ouch! Thanks for the fix, @carlosmn. |
| { | ||
| int error, slop = SLOP; | ||
| int64_t time = ~0ll; | ||
| int64_t time = INT64_MAX; |
There was a problem hiding this comment.
I'm surprised none of our tests catched this :(
There was a problem hiding this comment.
It's just a performance regression, as all this does is make us think that we never went back far enough. If we had perf tests, maybe we would have caugh tit.
|
On Sat, Sep 22, 2018 at 12:32:52PM -0700, Carlos Martín Nieto wrote:
carlosmn commented on this pull request.
> @@ -405,7 +405,7 @@ static int still_interesting(git_commit_list *list, int64_t time, int slop)
static int limit_list(git_commit_list **out, git_revwalk *walk, git_commit_list *commits)
{
int error, slop = SLOP;
- int64_t time = ~0ll;
+ int64_t time = INT64_MAX;
It's just a performance regression, as all this does is make us think that we never went back far enough. If we had perf tests, maybe we would have caugh tit.
Yeah, I realized that at a later point, too. But thanks for
clarifying!
|
When porting, we overlooked that the difference between git's and our's time
representation and copied their way of getting the max value.
Unfortunately git was using unsigned integers, so
~0lldoes correspond totheir max value, whereas for us it corresponds to
-1. This means that wealways consider the last date to be smaller than the current commit's and always
think commits are interesting.
Change the initial value to the macro that gives us the maximum value on each
platform so we can accurately consider commits interesting or not.
The second commit is mostly just to reduce the drift from git, the actual perf difference should be negligible.
This fixes #4740 and we should backport it as it fixes a regression.