Remove unnecessary reversed() from random.shuffle().#25698
Conversation
There was a problem hiding this comment.
OK by me either way, but there are points on both sides. It's not just us - virtually everyone runs the loop from highest to lowest, presumably because that makes correctness easier to see. Python is unique in that it requires "more words" to go in the decreasing direction.
More troublesome to me: this can change shuffle results for people who force the seed. It's a related algorithm, not the same algorithm. For that reason I'm -0 overall - "marginal potential speedup under PyPy, but potentially changes results under CPython" isn't an attractive tradeoff to me.
|
For me, correctness seems equally easy to see. The loop invariant is that |
|
Yes, both ways fold in a new item on each iteration - but only the backwards way also removes an item from consideration on each iteration. In the standard way, after the first iteration each item has an equal chance of being swapped into the final position, and that's obvious, and it's thereafter impossible for the final position to change value again. That's what makes it very obvious - you can forget about it, it's done. Each iteration determines the final value at one position. Go forward instead, and no position is eliminated as iterations go on. Even on the last iteration, you still can't know coming in what the final value of any position will be - you can only state an invariant involving all positions. Try writing formal proofs, and I bet you too will find that harder to account for. |
|
Would it be faster to replace |
Under PyPy, almost certainly. But I happen to know that Raymond has an aversion to It so happens I prefer the |
Originally, I was going to just add a comment, but then thought we might as well remove the unnecessary code.
In CPython, a reversed range runs as fast a forward range:
With PyPy, reversed() does add some overhead: