Make PyStr.hash an AtomicI64#2586
Conversation
youknowone
left a comment
There was a problem hiding this comment.
do we need to copy this behavior for real compatibility issues? when i worked on hash()es, I couldn't find any reason except for internal hard compitibility at the moment.
(but yes, hard compatibility is good when possible)
|
There are instances in the CPython source where a return value of |
|
Yes, they do it to use -1 as marker for 'error raised' meaning. But in RustPython, we returns |
|
Yeah, or just |
dafdeee to
f31f88f
Compare
|
@youknowone I changed |
f31f88f to
3889529
Compare
|
Pretty solid albeit small improvement: no-threading: I should run these again, my machine is pretty busy atm. |
|
Here's standalone benchmarks for checking |
|
I'm actually really happy with this, cause now we make a tradeoff the same way CPython does. For CPython, if you have a string that has just one non-ascii character, its size in memory can double or even quadruple depending on how big the codepoint is. Here, we use utf-8 for strings, which is much nicer on space due to being a variable-width encoding, but if a string is non-ascii we lose time rather than space, and only when we directly index into the string. I feel that this is very comparable in terms of costs/benefits to the tradeoff CPython makes, since it really just depends on the type of program running. If a Python program just passes strings around as user input or treats them as atoms, then it would benefit from lower memory usage for all the strings it has. However, if a program has to do text processing on non-ascii characters, it'll take a hit when it needs to index/slice a string, so may want to make the tradeoff to make indexing |
fb35318 to
1b7ccdc
Compare
|
Merging this, since I suppose it's already approved. |
cc @fanninpm / #2585