| 1 | +# Truthiness: Boolean Evaluations |
| 2 | + |
| 3 | +Python has a real Boolean type, `bool`, with exactly two values, `True` and `False`, but many other values are also treated as Booleans in ways that can be odd and confusing. First off, here are the actual `True` and `False` values: |
| 4 | + |
| 5 | +```` |
| 6 | +>>> True == True |
| 7 | +True |
| 8 | +>>> False == False |
| 9 | +True |
| 10 | +```` |
| 11 | + |
| 12 | +But `bool` is a subclass of `int`, so they also compare equal to the integers `1` and `0`: |
| 13 | + |
| 14 | +```` |
| 15 | +>>> True == 1 |
| 16 | +True |
| 17 | +>>> False == 0 |
| 18 | +True |
| 19 | +```` |
| 20 | + |
| 21 | +Which means, oddly, that you can add them: |
| 22 | + |
| 23 | +```` |
| 24 | +>>> True + True |
| 25 | +2 |
| 26 | +>>> True + True + False |
| 27 | +2 |
| 28 | +```` |
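The arithmetic above works because `bool` is a subclass of `int`. Here is a minimal sketch of that fact and of one place it turns out to be handy, counting how many items pass a test. The list `readings` is made up purely for illustration:

```python
# bool is a subclass of int, so True and False behave like 1 and 0
assert issubclass(bool, int)
assert isinstance(True, int)

# Hypothetical data, only for illustration
readings = [3, 0, 7, 0, 12]

# Each comparison yields True or False; sum() adds them up as 1s and 0s
num_zero = sum(r == 0 for r in readings)
print(num_zero)  # 2
```

So adding Booleans is odd, but it is not an accident of the implementation.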
| 29 | + |
| 30 | +Lots of things are `False`-ey when they are evaluated in a Boolean context. The `int` `0`, the `float` `0.0`, the empty string, an empty list, and the special value `None` are all considered `False`-ey: |
| 31 | + |
| 32 | +```` |
| 33 | +>>> 'Hooray!' if 0 else 'Shucks!' |
| 34 | +'Shucks!' |
| 35 | +>>> 'Hooray!' if 0. else 'Shucks!' |
| 36 | +'Shucks!' |
| 37 | +>>> 'Hooray!' if [] else 'Shucks!' |
| 38 | +'Shucks!' |
| 39 | +>>> 'Hooray!' if '' else 'Shucks!' |
| 40 | +'Shucks!' |
| 41 | +>>> 'Hooray!' if None else 'Shucks!' |
| 42 | +'Shucks!' |
| 43 | +```` |
| 44 | + |
| 45 | +But note: |
| 46 | + |
| 47 | +```` |
| 48 | +>>> 'Hooray!' if 'None' else 'Shucks!' |
| 49 | +'Hooray!' |
| 50 | +```` |
| 51 | + |
| 52 | +There are quotes around `'None'`, so it's the literal string "None" and not the special value `None`. Since this is not an empty string, it is not `False`-ey, and so *in a Boolean context* it is treated as `True`. |
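If you are ever unsure how a value will behave in a Boolean context, the built-in `bool()` function applies the same coercion that `if` and `or` apply. The following is only a sketch; the sample `values` are chosen for illustration and are not an exhaustive list of `False`-ey things:

```python
# Hypothetical sample values, chosen only for illustration
values = [0, 0.0, '', [], None, 'None', '0', [0], 1, -1]

# bool() performs the same coercion that `if` and `or` perform
truthiness = {repr(v): bool(v) for v in values}

for text, result in truthiness.items():
    print(f'{text:>8} -> {result}')
```

Note that the string `'0'` and the list `[0]` both come out `True`: a string is judged by whether it is empty, and a list by whether it has any elements, not by what they contain.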
| 53 | + |
| 54 | +This behavior can introduce extremely subtle logical bugs into your programs that neither the Python interpreter nor a typical linter will flag, because the code is perfectly valid. Consider the `dict.get` method, which safely returns the value for a given key in a dictionary, returning `None` if the key does not exist. Given this dictionary: |
| 55 | + |
| 56 | +```` |
| 57 | +>>> d = {'foo': 0, 'bar': None} |
| 58 | +```` |
| 59 | + |
| 60 | +If we access a key that doesn't exist, Python raises a `KeyError` exception that, if not caught in our code, would crash the program: |
| 61 | + |
| 62 | +```` |
| 63 | +>>> d['baz'] |
| 64 | +Traceback (most recent call last): |
| 65 | +  File "<stdin>", line 1, in <module> |
| 66 | +KeyError: 'baz' |
| 67 | +```` |
| 68 | + |
| 69 | +But we can use `d.get()` to do this safely: |
| 70 | + |
| 71 | +```` |
| 72 | +>>> d.get('baz') |
| 73 | +```` |
| 74 | + |
| 75 | +Hmm, that seems unhelpful! What did we get back? |
| 76 | + |
| 77 | +```` |
| 78 | +>>> type(d.get('baz')) |
| 79 | +<class 'NoneType'> |
| 80 | +```` |
| 81 | + |
| 82 | +Ah, we got `None`! |
| 83 | + |
| 84 | +We could use an `or` to define a default value: |
| 85 | + |
| 86 | +```` |
| 87 | +>>> d.get('baz') or 'NA' |
| 88 | +'NA' |
| 89 | +```` |
| 90 | + |
| 91 | +It turns out the `get` method accepts a second, optional argument of the default value to return: |
| 92 | + |
| 93 | +```` |
| 94 | +>>> d.get('baz', 'NA') |
| 95 | +'NA' |
| 96 | +```` |
| 97 | + |
| 98 | +Great! So let's use that on the other values: |
| 99 | + |
| 100 | +```` |
| 101 | +>>> d.get('foo', 'NA') |
| 102 | +0 |
| 103 | +>>> d.get('bar', 'NA') |
| 104 | +```` |
| 105 | + |
| 106 | +The call for `bar` printed nothing. The key exists, so the default is ignored, and the value we stored there is an actual `None`, which the REPL does not display: |
| 107 | + |
| 108 | +```` |
| 109 | +>>> type(d.get('bar', 'NA')) |
| 110 | +<class 'NoneType'> |
| 111 | +```` |
| 112 | + |
| 113 | +OK, so we go back to this: |
| 114 | + |
| 115 | +```` |
| 116 | +>>> d.get('bar') or 'NA' |
| 117 | +'NA' |
| 118 | +```` |
| 119 | + |
| 120 | +Which seems to work, but notice this: |
| 121 | + |
| 122 | +```` |
| 123 | +>>> d.get('foo') or 'NA' |
| 124 | +'NA' |
| 125 | +```` |
| 126 | + |
| 127 | +The value for `foo` is actually `0`, which is `False`-ey, so the `or` discards it and returns `'NA'` instead. If this were a measurement like the amount of sodium in water, then the string `NA` would indicate that no value was recorded, whereas `0` indicates that sodium was measured and none was detected. If some important analysis rested on our interpretation of the strings in a spreadsheet, we might inadvertently introduce missing values because of the way Python coerces non-Boolean values into Boolean values. |
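It may help to see exactly what `or` hands back. It does not return `True` or `False`; it returns its left operand if that is `True`-ey and otherwise its right operand, whatever that is. A small sketch with made-up operands:

```python
# `or` returns its left operand if that is truthy, otherwise its right operand
results = [
    0 or 'NA',       # 0 is falsey    -> 'NA'
    0.0 or 'NA',     # 0.0 is falsey  -> 'NA'
    '' or 'NA',      # '' is falsey   -> 'NA'
    None or 'NA',    # None is falsey -> 'NA'
    5 or 'NA',       # 5 is truthy    -> 5
    None or 0,       # both falsey    -> 0 (the right operand)
]
print(results)  # ['NA', 'NA', 'NA', 'NA', 5, 0]
```

This is why `x or default` is only a safe idiom when every `False`-ey value of `x` really should be replaced by the default.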
| 128 | + |
| 129 | +Perhaps a safer way to access these values would be: |
| 130 | + |
| 131 | +```` |
| 132 | +>>> for key in ['foo', 'bar', 'baz']: |
| 133 | +...     val = d[key] if key in d else 'NA' |
| 134 | +...     val = 'NA' if val is None else val |
| 135 | +...     print(key, val) |
| 136 | +... |
| 137 | +foo 0 |
| 138 | +bar NA |
| 139 | +baz NA |
| 140 | +```` |
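If this lookup is needed in more than one place, the same logic could be wrapped in a small function. The name `get_or_na` and its `default` parameter are hypothetical, invented here for illustration. The key point of the sketch is that it tests `is None` explicitly instead of relying on truthiness, so a legitimate `0` survives:

```python
def get_or_na(data, key, default='NA'):
    """Return data[key], or default if the key is missing or its value is None."""
    val = data.get(key)  # None when the key is absent
    # Compare against None explicitly; `val or default` would also discard 0
    return default if val is None else val


d = {'foo': 0, 'bar': None}
results = [get_or_na(d, key) for key in ['foo', 'bar', 'baz']]
print(results)  # [0, 'NA', 'NA']
```

This deliberately treats a missing key and a stored `None` the same way, as the loop above does. If your analysis needs to tell those two cases apart, check `key in data` first.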