Remove unnecessary reversed() from random.shuffle(). by rhettinger · Pull Request #25698 · python/cpython

rhettinger · 2021-04-28T17:36:48Z

Originally, I was going to just add a comment, but then thought we might as well remove the unnecessary code.

# The use of reversed() in the code below isn't necessary.
# The algorithm works equally well when running forward.
# We run in reverse only because "we've always done it this way".

In CPython, a reversed range runs as fast as a forward range:

$ python3.10 -m timeit 'sum(range(10000))'
1000 loops, best of 5: 199 usec per loop
$ python3.10 -m timeit 'sum(reversed(range(10000)))'
2000 loops, best of 5: 199 usec per loop

With PyPy, reversed() does add some overhead:

$ pypy3 -m timeit 'sum(range(10000))'
50000 loops, average of 7: 9.66 +- 0.0393 usec per loop
$ pypy3 -m timeit 'sum(reversed(range(10000)))'
10000 loops, average of 7: 25.9 +- 0.321 usec per loop

tim-one

OK by me either way, but there are points on both sides. It's not just us - virtually everyone runs the loop from highest to lowest, presumably because that makes correctness easier to see. Python is unique in that it requires "more words" to go in the decreasing direction.

More troublesome to me: this can change shuffle results for people who force the seed. It's a related algorithm, not the same algorithm. For that reason I'm -0 overall - "marginal potential speedup under PyPy, but potentially changes results under CPython" isn't an attractive tradeoff to me.
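The reproducibility concern can be illustrated with a small sketch. The helper names below (`shuffle_backward`, `shuffle_forward`, `compare_directions`, `count_differing`) are hypothetical, and the public `rng.randrange()` is used as a stand-in for the private `_randbelow()` mentioned in this thread. This only mirrors the loop shapes being discussed; it is not the library's code.

```python
import random

def shuffle_backward(x, rng):
    # Loop shape under discussion: indices run from high to low.
    for i in reversed(range(1, len(x))):
        j = rng.randrange(i + 1)  # stand-in for _randbelow(i + 1)
        x[i], x[j] = x[j], x[i]

def shuffle_forward(x, rng):
    # Proposed shape: same swap rule, indices run from low to high.
    for i in range(1, len(x)):
        j = rng.randrange(i + 1)
        x[i], x[j] = x[j], x[i]

def compare_directions(seed, n):
    """Shuffle list(range(n)) both ways, each from a fresh generator with the same seed."""
    a, b = list(range(n)), list(range(n))
    shuffle_backward(a, random.Random(seed))
    shuffle_forward(b, random.Random(seed))
    return a, b

def count_differing(seeds, n):
    """Count the seeds for which the two directions give different orderings."""
    differing = 0
    for seed in seeds:
        a, b = compare_directions(seed, n)
        if a != b:
            differing += 1
    return differing

if __name__ == "__main__":
    print(compare_directions(12345, 10))
    print(count_differing(range(50), 10), "of 50 seeds differ")
```

Both functions produce a valid permutation, but the random draws are requested in a different order, so a fixed seed generally no longer maps to the same ordering.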

rhettinger · 2021-04-28T20:22:49Z

For me, correctness seems equally easy to see. The loop invariant is that x[:i] is shuffled, and a new element is mixed in at every step. It's just an insertion shuffle vs. a selection shuffle. The reverseless code is a bit easier because reasoning about reverse loops is more awkward than reasoning about forward loops. I hear what you're saying about changing existing seeded shuffles. That is my only misgiving. Thanks for looking at this.
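One way to sanity-check the forward variant for small inputs is to enumerate every possible sequence of swap choices and confirm that each permutation is reached exactly once. The sketch below does that; `forward_result` and `forward_outcomes` are hypothetical helper names, and the check covers only the sizes actually enumerated, so it is an illustration rather than a proof.

```python
from collections import Counter
from itertools import product

def forward_result(n, js):
    """Run the forward loop on list(range(n)), using js[i - 1] as the choice at step i."""
    x = list(range(n))
    for i in range(1, n):
        j = js[i - 1]  # any value in range(i + 1)
        x[i], x[j] = x[j], x[i]
    return tuple(x)

def forward_outcomes(n):
    """Count how often each permutation arises over all possible choice sequences."""
    choices = [range(i + 1) for i in range(1, n)]
    return Counter(forward_result(n, js) for js in product(*choices))

if __name__ == "__main__":
    counts = forward_outcomes(4)
    # 2 * 3 * 4 = 24 choice sequences, to be compared with 4! = 24 permutations.
    print(len(counts), set(counts.values()))
```

If every choice sequence is equally likely, each permutation being hit exactly once means the result is uniform for that size.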

tim-one · 2021-04-29T02:35:24Z

Yes, both ways fold in a new item on each iteration - but only the backwards way also removes an item from consideration on each iteration. In the standard way, after the first iteration each item has an equal chance of being swapped into the final position, and that's obvious, and it's thereafter impossible for the final position to change value again. That's what makes it very obvious - you can forget about it, it's done. Each iteration determines the final value at one position.

Go forward instead, and no position is eliminated as iterations go on. Even on the last iteration, you still can't know coming in what the final value of any position will be - you can only state an invariant involving all positions. Try writing formal proofs, and I bet you too will find that harder to account for.
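The difference described above can be observed by recording the list after every iteration. In this sketch, `trace_shuffle` and `suffix_is_settled` are hypothetical names, and `rng.randrange()` again stands in for `_randbelow()`. The check asks whether, right after the step for index `i`, the slice `x[i:]` already matches the final result.

```python
import random

def trace_shuffle(n, rng, forward):
    """Shuffle list(range(n)) and record (i, snapshot) after each iteration."""
    x = list(range(n))
    order = range(1, n) if forward else reversed(range(1, n))
    snapshots = []
    for i in order:
        j = rng.randrange(i + 1)
        x[i], x[j] = x[j], x[i]
        snapshots.append((i, list(x)))
    return snapshots, x

def suffix_is_settled(snapshots, final):
    """True if, after the step for index i, x[i:] never changes again."""
    return all(snap[i:] == final[i:] for i, snap in snapshots)

if __name__ == "__main__":
    for forward in (False, True):
        settled = [
            suffix_is_settled(*trace_shuffle(6, random.Random(seed), forward))
            for seed in range(20)
        ]
        print("forward" if forward else "backward", settled.count(True), "of 20 settled")
```

Running backward, later iterations only touch indices below `i`, so the check always holds. Running forward, a later iteration may swap any earlier position, so the check usually fails.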

iritkatriel · 2021-04-29T10:24:00Z

Would it be faster to replace reversed(range(1, len(x))) by range(len(x)-1, 0, -1)?
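For reference, the two spellings in the question visit the same indices in the same order. A minimal check, with hypothetical helper names:

```python
def reversed_spelling(n):
    # Indices n-1 down to 1, written with reversed().
    return list(reversed(range(1, n)))

def negative_step_spelling(n):
    # The same indices, written with a negative step.
    return list(range(n - 1, 0, -1))

if __name__ == "__main__":
    for n in range(20):
        assert reversed_spelling(n) == negative_step_spelling(n)
    print(reversed_spelling(5))
```

Both are empty for `n` of 0 or 1, so the choice between them affects only speed and readability, not which indices the loop sees.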

tim-one · 2021-04-29T17:33:57Z

> Would it be faster to replace reversed(range(1, len(x))) by range(len(x)-1, 0, -1)?

Under PyPy, almost certainly. But I happen to know that Raymond has an aversion to range() with a negative step so strong that it wasn't worth mentioning 😉.

It so happens I prefer the reversed() spelling too, but then "speed" means nothing to me here - the cost of iteration in this context is bound to be insignificant compared to the cost of calling _randbelow() on each iteration, even under PyPy.
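The claim about relative cost could be measured with a rough sketch like the one below, which times the bare loop against the same loop with one random draw per iteration. The function name `measure` and the sizes are arbitrary assumptions, `rng.randrange()` stands in for `_randbelow()`, and the numbers will vary by interpreter and machine.

```python
import random
import timeit

def measure(n, repeats):
    """Return (loop_only_seconds, loop_plus_draw_seconds) for `repeats` passes."""
    rng = random.Random(0)

    def iterate_only():
        for i in reversed(range(1, n)):
            pass

    def iterate_and_draw():
        for i in reversed(range(1, n)):
            rng.randrange(i + 1)  # one draw per iteration, as in the shuffle loop

    loop_time = timeit.timeit(iterate_only, number=repeats)
    draw_time = timeit.timeit(iterate_and_draw, number=repeats)
    return loop_time, draw_time

if __name__ == "__main__":
    loop_time, draw_time = measure(10_000, 20)
    print(f"loop only: {loop_time:.4f}s, loop + draw: {draw_time:.4f}s")
```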

Remove unnecessary reversed() from random.shuffle().

445a538

rhettinger requested a review from tim-one April 28, 2021 17:36

the-knights-who-say-ni added the CLA signed label Apr 28, 2021

bedevere-bot added the awaiting core review label Apr 28, 2021

tim-one reviewed Apr 28, 2021

rhettinger closed this Apr 30, 2021

rhettinger wants to merge 1 commit into python:master from rhettinger:shuffle_direction
