In the first part we looked at some of the consequences of Python being a dynamically-typed language. Let’s have a look at other implications, such as the concept of class, how things work under the hood, and performance.
The data descriptor exception
We have previously seen how an object returns an attribute from its dictionary if available, and if not returns the class attribute (still if available). There is however an exception to that attribute lookup rule. If a class attribute is a data descriptor, it will be used even if the object dictionary contains an attribute of the same name. A data descriptor is an object which has both a __get__ and a __set__ method defined.
>>> class Attr(object):
...     def __get__(self, obj, type=None):
...         return 42
...     def __set__(self, obj, value):
...         pass
...
>>> class MyClass(object):
...     attr = Attr()
...
>>> obj = MyClass()
>>> obj.__dict__['attr'] = 'New value'
>>> obj.__dict__
{'attr': 'New value'}
>>> obj.attr
42
We’re using a hack on line 11 to update the object dictionary directly, as using “obj.attr = 'New value'” would call Attr.__set__(). But even though we’ve modified the object dictionary, calling “obj.attr” returns the result of the class attribute lookup.
The implication is that, no matter what, Python needs to perform a class attribute lookup even if the object dictionary contains the attribute name.
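The counterpart to this rule is the non-data descriptor: an object that defines __get__ but not __set__. A minimal sketch (the class names here are invented for illustration) shows that, unlike a data descriptor, a non-data descriptor is shadowed by the object dictionary:

```python
class NonDataAttr(object):
    # Only __get__ is defined, so this is a non-data descriptor
    def __get__(self, obj, type=None):
        return 42

class MyClass(object):
    attr = NonDataAttr()

obj = MyClass()
print(obj.attr)                     # 42 -- resolved via the class lookup
obj.__dict__['attr'] = 'New value'
print(obj.attr)                     # New value -- the object dictionary wins
```

Only data descriptors take precedence over the object dictionary; with __set__ removed, the lookup order reverts to the usual object-first rule.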
A different notion of class
More generally, the concept of class is also affected by the type system.
In a statically-typed language such as C++ or Java, one of the roles of a class is to define the attributes of its instances, whether or not those attributes have a default value. As a result, each instance has its own copy of all the attributes defined in the class. If a class defines an integer attribute “attr”, instantiating this class a million times will create a million copies of this attribute – all with the same default value. The methods, however, are shared among all instances (i.e. it’s the exact same code which is executed regardless of the instance).
In a dynamically-typed language, there is no need to set such attributes at the class level as it can be done after the objects are created (whether in the constructor or later on). In Python, if a class defines an integer attribute “attr”, instantiating this class a million times will not create copies of the attributes. The million instances will share this attribute – unless the constructor sets a value to this attribute, in which case each instance will have its own copy.
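This sharing can be verified directly; a small sketch (the names are illustrative):

```python
class MyClass(object):
    attr = 42

instances = [MyClass() for _ in range(1000)]

# No per-instance storage was created for "attr"...
assert all('attr' not in obj.__dict__ for obj in instances)
# ...and every lookup resolves to the very same class-level object.
assert all(obj.attr is MyClass.attr for obj in instances)
print("all 1000 instances share MyClass.attr")
```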
Python creates the illusion that a class attribute is defining an individual attribute for each object with a default value. After all, if “MyClass” has an “attr” attribute, “my_object.attr” will return the default value and can be updated at will, independently of other instances. However, “my_object.attr = … ” will create a new value in the object dictionary that will override the class attribute.
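The mechanics can be seen by inspecting the object dictionaries before and after an assignment (a minimal sketch; the names are illustrative):

```python
class MyClass(object):
    attr = 42

a = MyClass()
b = MyClass()
print(a.__dict__, b.__dict__)  # {} {} -- nothing stored per instance yet
a.attr = 100                   # creates an entry in a's own dictionary
print(a.attr, b.attr)          # 100 42
print(MyClass.attr)            # 42 -- the class attribute is untouched
```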
This illusion can however be broken in the case of mutable attributes:
>>> class MyClass(object):
...     attr = [1, 2]
...
>>> obj1 = MyClass()
>>> obj2 = MyClass()
>>> obj1.attr.append(3)
>>> obj2.attr
[1, 2, 3]
“obj1.attr” and “obj2.attr” look like per-object attributes until the shared list is modified.
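When genuinely per-instance state is wanted, the usual fix is to create the mutable value in the constructor, so each instance gets its own copy. A sketch:

```python
class MyClass(object):
    def __init__(self):
        # Created at construction time, so each instance owns its list
        self.attr = [1, 2]

obj1 = MyClass()
obj2 = MyClass()
obj1.attr.append(3)
print(obj1.attr)  # [1, 2, 3]
print(obj2.attr)  # [1, 2] -- obj2 is unaffected
```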
Attribute lookup deep dive
Let’s now look at how attribute lookup is implemented in the CPython source code. In particular, let’s look at what happens at line 8 in the code below when we evaluate the attribute “attr” of the object. Note how “attr” is defined both at the class level and at the object level:
>>> class MyClass(object):
...     attr = 42
...
>>> obj = MyClass()
>>> obj.attr  # MyClass.attr
42
>>> obj.attr = "New value"
>>> obj.attr  # obj.attr
'New value'
When we step through the source code, here is what happens:
The bytecode instruction behind “obj.attr” that interests us is LOAD_ATTR (Python/ceval.c, lines 2415-2424) as its purpose is to find the desired attribute at runtime. Following a series of function calls(*), _PyObject_GenericGetAttrWithDict gets called. This function (defined in Objects/object.c, lines 1013-1098) is the heart of the attribute lookup:
- At line 1036, it first calls _PyType_Lookup (defined in Objects/typeobject.c, lines 2695-2745). This function performs a class attribute lookup following the MRO. In the present case it returns the object 42.
- If the value returned by the class attribute lookup happens to be a data descriptor, the function immediately returns that value.
- Otherwise, the function then looks in the object dictionary (line 1071).
- In the present case it finds the “New value” string and returns it.
- If the object dictionary lookup didn’t return anything, the function returns what was returned by _PyType_Lookup().
- If neither lookup finds any attribute, it raises an AttributeError (line 1091).
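The sequence above can be sketched in pure Python. This is a simplified model, not CPython’s actual implementation: it ignores __getattr__, __slots__, the type attribute cache and metaclass descriptors, and the names (generic_getattr, _MISSING) are invented for illustration.

```python
# Sentinel marking "not found" (illustrative, not CPython's)
_MISSING = object()

def generic_getattr(obj, name):
    # 1. Class attribute lookup along the MRO (what _PyType_Lookup does)
    meta_attr = _MISSING
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            meta_attr = klass.__dict__[name]
            break
    # 2. A data descriptor (both __get__ and __set__) wins immediately,
    #    even if the object dictionary also contains the name
    if meta_attr is not _MISSING:
        meta_type = type(meta_attr)
        if hasattr(meta_type, '__get__') and hasattr(meta_type, '__set__'):
            return meta_type.__get__(meta_attr, obj, type(obj))
    # 3. Otherwise the object dictionary is consulted
    instance_dict = getattr(obj, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]
    # 4. Fall back to the result of the class lookup, invoking __get__
    #    on a non-data descriptor (such as a plain function)
    if meta_attr is not _MISSING:
        meta_type = type(meta_attr)
        if hasattr(meta_type, '__get__'):
            return meta_type.__get__(meta_attr, obj, type(obj))
        return meta_attr
    # 5. Neither lookup found anything: raise AttributeError
    raise AttributeError(name)
```

Running it against the earlier Attr example reproduces the data descriptor exception: the class lookup is performed first, and its result is kept whenever it is a data descriptor.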
What about performance?
The first reaction one may have when seeing all the processing performed here is: what about performance? Not only are dictionary lookups expensive, but CPython goes through the MRO chain no matter what, even if the result eventually gets discarded in favor of what is in the object dictionary.
This is why I tried a performance test: how long does it take to loop through 10 million class attribute lookups, and does it matter whether the object’s class has a deep hierarchy or not?
In the following tests, I checked that the same test with 1 million elements ran roughly 10 times faster than with 10 million, to make sure no caching of some sort was skewing the results.
>>> import time
>>> class A(object): a = 1
...
>>> class B(A): b = 2
...
>>> class C(B): c = 3
...
>>> class D(C): d = 4
...
>>> class E(D): e = 5
...
>>> class F(E): f = 6
...
>>> def speed(myclass, nb):
...     objects = []
...     for _ in range(nb):
...         objects.append(myclass.__new__(myclass))
...     t1 = time.time()
...     for o in objects:
...         tmp = o.a
...     t2 = time.time()
...     print(t2 - t1)
...
>>> speed(A, 10000000)
0.4630260467529297
>>> speed(F, 10000000)
0.4720268249511719
As we can see, there is not much difference whether we’re using an object with a large hierarchy (F) or only “object” as a parent (A). Around 460-470 milliseconds for 10 million objects.
As a comparison, let’s see how long it takes to go through 10 million simple assignments, without any attribute lookup:
>>> def speed0(myclass, nb):
...     objects = []
...     for _ in range(nb):
...         objects.append(myclass.__new__(myclass))
...     t1 = time.time()
...     for o in objects:
...         tmp = 42
...     t2 = time.time()
...     print(t2 - t1)
...
>>> speed0(A, 10000000)
0.30701780319213867
>>> speed0(A, 10000000)
0.2980170249938965
So around 300 milliseconds. If we replace line 7 with “pass” to get an empty loop, the test takes around 180 milliseconds. So to summarize:
- Looping through an array of 10 million objects: 180 milliseconds
- 10 million assignments: 300 – 180 = 120 milliseconds
- 10 million attribute lookups: 470 – 300 = 170 milliseconds
As a comparison, an “equivalent” program takes 7 milliseconds in C++ and 14 milliseconds in Java.
So yes, dynamic typing has an impact on performance. Does it matter? Not always, as we will see in the next post.
(*) Here is how it goes:
- At line 2418, LOAD_ATTR calls PyObject_GetAttr() (defined in Objects/object.c, lines 862-884)
- PyObject_GetAttr() calls the object’s attribute lookup method. It does that by getting the object’s type and calling its tp_getattro() method (defined in Objects/typeobject.c, line 4208)
- PyObject_GenericGetAttr() actually gets called (defined in Objects/object.c, lines 1101-1103), which in turn calls _PyObject_GenericGetAttrWithDict (defined in Objects/object.c, lines 1013-1098)