Commit 09a7c72

Merge from 3.1: Issue python#13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated. The environment variable PYTHONHASHSEED and the new command line flag -R control this behavior.
2 parents fee358b + 2daf6ae commit 09a7c72

34 files changed: 676 additions and 162 deletions.

Doc/library/sys.rst

Lines changed: 4 additions & 0 deletions

@@ -253,11 +253,15 @@ always available.
    :const:`verbose`              :option:`-v`
    :const:`bytes_warning`        :option:`-b`
    :const:`quiet`                :option:`-q`
+   :const:`hash_randomization`   :option:`-R`
    ============================= =============================

    .. versionchanged:: 3.2
       Added ``quiet`` attribute for the new :option:`-q` flag.

+   .. versionadded:: 3.2.3
+      The ``hash_randomization`` attribute.
+

 .. data:: float_info

Doc/reference/datamodel.rst

Lines changed: 2 additions & 0 deletions

@@ -1272,6 +1272,8 @@ Basic customization
    inheritance of :meth:`__hash__` will be blocked, just as if :attr:`__hash__`
    had been explicitly set to :const:`None`.

+   See also the :option:`-R` command-line option.
+

 .. method:: object.__bool__(self)

Doc/using/cmdline.rst

Lines changed: 46 additions & 1 deletion

@@ -24,7 +24,7 @@ Command line

 When invoking Python, you may specify any of these options::

-    python [-bBdEhiOsSuvVWx?] [-c command | -m module-name | script | - ] [args]
+    python [-bBdEhiORqsSuvVWx?] [-c command | -m module-name | script | - ] [args]

 The most common use case is, of course, a simple invocation of a script::

@@ -227,6 +227,29 @@ Miscellaneous options
    .. versionadded:: 3.2


+.. cmdoption:: -R
+
+   Turn on hash randomization, so that the :meth:`__hash__` values of str, bytes
+   and datetime objects are "salted" with an unpredictable random value.
+   Although they remain constant within an individual Python process, they are
+   not predictable between repeated invocations of Python.
+
+   This is intended to provide protection against a denial-of-service caused by
+   carefully-chosen inputs that exploit the worst case performance of a dict
+   insertion, O(n^2) complexity. See
+   http://www.ocert.org/advisories/ocert-2011-003.html for details.
+
+   Changing hash values affects the order in which keys are retrieved from a
+   dict. Although Python has never made guarantees about this ordering (and it
+   typically varies between 32-bit and 64-bit builds), enough real-world code
+   implicitly relies on this non-guaranteed behavior that the randomization is
+   disabled by default.
+
+   See also :envvar:`PYTHONHASHSEED`.
+
+   .. versionadded:: 3.2.3
+
+
 .. cmdoption:: -s

    Don't add the :data:`user site-packages directory <site.USER_SITE>` to
@@ -350,6 +373,7 @@ Options you shouldn't use

 .. _Jython: http://jython.org

+
 .. _using-on-envvars:

 Environment variables
@@ -458,6 +482,27 @@ These environment variables influence Python's behavior.
    option.


+.. envvar:: PYTHONHASHSEED
+
+   If this variable is set to ``random``, the effect is the same as specifying
+   the :option:`-R` option: a random value is used to seed the hashes of str,
+   bytes and datetime objects.
+
+   If :envvar:`PYTHONHASHSEED` is set to an integer value, it is used as a fixed
+   seed for generating the hash() of the types covered by the hash
+   randomization.
+
+   Its purpose is to allow repeatable hashing, such as for selftests for the
+   interpreter itself, or to allow a cluster of python processes to share hash
+   values.
+
+   The integer must be a decimal number in the range [0,4294967295]. Specifying
+   the value 0 will lead to the same hash values as when hash randomization is
+   disabled.
+
+   .. versionadded:: 3.2.3
+
+
 .. envvar:: PYTHONIOENCODING

    If this is set before running the interpreter, it overrides the encoding used
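
The PYTHONHASHSEED behavior documented in this file can be checked from a parent process. The following is a minimal sketch and not part of the commit: the helper name `child_hash` is hypothetical, and it assumes an interpreter that honors PYTHONHASHSEED (3.2.3 or later).

```python
import os
import subprocess
import sys


def child_hash(text, seed):
    """Return hash(text) as computed by a fresh interpreter.

    Hypothetical helper for illustration only. The child process is
    started with PYTHONHASHSEED set to `seed` (an integer or "random").
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    # A fixed integer seed should give the same hash in every process.
    print(child_hash("spam", 42) == child_hash("spam", 42))
    # With "random", two separate runs will almost always disagree.
    print(child_hash("spam", "random"), child_hash("spam", "random"))
```

Each call starts a new interpreter because the seed is read once at startup; changing the variable inside a running process has no effect on that process.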

Include/object.h

Lines changed: 6 additions & 0 deletions

@@ -517,6 +517,12 @@ PyAPI_FUNC(Py_hash_t) _Py_HashDouble(double);
 PyAPI_FUNC(Py_hash_t) _Py_HashPointer(void*);
 #endif

+typedef struct {
+    Py_hash_t prefix;
+    Py_hash_t suffix;
+} _Py_HashSecret_t;
+PyAPI_DATA(_Py_HashSecret_t) _Py_HashSecret;
+
 /* Helper for passing objects to printf and the like */
 #define PyObject_REPR(obj) _PyUnicode_AsString(PyObject_Repr(obj))

Include/pydebug.h

Lines changed: 1 addition & 0 deletions

@@ -20,6 +20,7 @@ PyAPI_DATA(int) Py_DivisionWarningFlag;
 PyAPI_DATA(int) Py_DontWriteBytecodeFlag;
 PyAPI_DATA(int) Py_NoUserSiteDirectory;
 PyAPI_DATA(int) Py_UnbufferedStdioFlag;
+PyAPI_DATA(int) Py_HashRandomizationFlag;

 /* this is a wrapper around getenv() that pays attention to
    Py_IgnoreEnvironmentFlag. It should be used for getting variables like

Include/pythonrun.h

Lines changed: 2 additions & 0 deletions

@@ -248,6 +248,8 @@ typedef void (*PyOS_sighandler_t)(int);
 PyAPI_FUNC(PyOS_sighandler_t) PyOS_getsig(int);
 PyAPI_FUNC(PyOS_sighandler_t) PyOS_setsig(int, PyOS_sighandler_t);

+/* Random */
+PyAPI_FUNC(int) _PyOS_URandom (void *buffer, Py_ssize_t size);

 #ifdef __cplusplus
 }

Lib/json/__init__.py

Lines changed: 3 additions & 1 deletion

@@ -31,7 +31,9 @@
 Compact encoding::

     >>> import json
-    >>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',', ':'))
+    >>> from collections import OrderedDict
+    >>> mydict = OrderedDict([('4', 5), ('6', 7)])
+    >>> json.dumps([1,2,3,mydict], separators=(',', ':'))
     '[1,2,3,{"4":5,"6":7}]'

 Pretty printing::
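
The doctest change in this file swaps a dict literal for an `OrderedDict`, presumably so that the expected output no longer depends on the hash order of the keys. A minimal sketch of the same idea, not taken from the commit (the variable names are illustrative):

```python
import json
from collections import OrderedDict

# An OrderedDict is serialized in insertion order, so the text output
# does not depend on the hash values of the keys '4' and '6'.
pairs = OrderedDict([('4', 5), ('6', 7)])
compact = json.dumps([1, 2, 3, pairs], separators=(',', ':'))
print(compact)

# When key order is irrelevant, comparing decoded values instead of the
# serialized text avoids the ordering question altogether.
same = json.loads(compact) == [1, 2, 3, {'6': 7, '4': 5}]
print(same)
```

Either approach keeps a test stable when hash randomization is enabled.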

Lib/os.py

Lines changed: 0 additions & 17 deletions

@@ -761,23 +761,6 @@ def _pickle_statvfs_result(sr):
 except NameError: # statvfs_result may not exist
     pass

-if not _exists("urandom"):
-    def urandom(n):
-        """urandom(n) -> str
-
-        Return a string of n random bytes suitable for cryptographic use.
-
-        """
-        try:
-            _urandomfd = open("/dev/urandom", O_RDONLY)
-        except (OSError, IOError):
-            raise NotImplementedError("/dev/urandom (or equivalent) not found")
-        bs = b""
-        while len(bs) < n:
-            bs += read(_urandomfd, n - len(bs))
-        close(_urandomfd)
-        return bs
-
 # Supply os.popen()
 def popen(cmd, mode="r", buffering=-1):
     if not isinstance(cmd, str):

Lib/test/mapping_tests.py

Lines changed: 1 addition & 1 deletion

@@ -14,7 +14,7 @@ class BasicTestMappingProtocol(unittest.TestCase):
     def _reference(self):
         """Return a dictionary of values which are invariant by storage
         in the object under test."""
-        return {1:2, "key1":"value1", "key2":(1,2,3)}
+        return {"1": "2", "key1":"value1", "key2":(1,2,3)}
     def _empty_mapping(self):
         """Return an empty mapping object"""
         return self.type2test()

Lib/test/regrtest.py

Lines changed: 5 additions & 0 deletions

@@ -496,6 +496,11 @@ def main(tests=None, testdir=None, verbose=0, quiet=False,
         except ValueError:
             print("Couldn't find starting test (%s), using all tests" % start)
     if randomize:
+        hashseed = os.getenv('PYTHONHASHSEED')
+        if not hashseed:
+            os.environ['PYTHONHASHSEED'] = str(random_seed)
+            os.execv(sys.executable, [sys.executable] + sys.argv)
+            return
         random.seed(random_seed)
         print("Using random seed", random_seed)
         random.shuffle(selected)
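
The lines added to regrtest set PYTHONHASHSEED from the test seed and restart the interpreter, since the hash seed is only read at startup. The sketch below isolates that decision in a pure function so it can be inspected without actually replacing the process. It is an illustration under stated assumptions, not the commit's code: `plan_reexec` is a hypothetical name, and it returns the arguments a caller could pass to `os.execve` rather than mutating `os.environ` and calling `os.execv` as the commit does.

```python
import sys


def plan_reexec(environ, random_seed, executable=sys.executable, argv=None):
    """Return (executable, args, new_environ) if a re-exec is needed, else None.

    Hypothetical helper mirroring the added regrtest lines: when
    PYTHONHASHSEED is unset or empty, it is set to the test seed and the
    interpreter has to be restarted for the seed to take effect.
    """
    argv = list(sys.argv if argv is None else argv)
    if environ.get('PYTHONHASHSEED'):
        return None  # a seed is already in place; continue in this process
    new_environ = dict(environ, PYTHONHASHSEED=str(random_seed))
    return executable, [executable] + argv, new_environ


if __name__ == "__main__":
    # No seed in the environment: a restart is planned.
    print(plan_reexec({}, 1234, executable="python", argv=["regrtest.py", "-r"]))
    # A seed is already set: nothing to do.
    print(plan_reexec({"PYTHONHASHSEED": "7"}, 1234))
```

Checking the variable before restarting is what prevents an endless re-exec loop: the restarted process sees the seed and proceeds normally.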
