From 7c4ee67b6252c4afe85cc8a23528c8e52f0ef954 Mon Sep 17 00:00:00 2001 From: Yaroslav Pankovych <31005942+P-Alban@users.noreply.github.com> Date: Sun, 2 Aug 2020 11:34:01 +0300 Subject: [PATCH 1/7] Fix docs according to sources and devguide I've noticed that the Python docs differ from the devguide (https://github.com/python/devguide/blob/master/garbage_collector.rst#collecting-the-oldest-generation) and the source code (https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L1409). So I decided to fix the Python docs according to those sources. This is my first contribution, so any criticism is welcome. --- Doc/library/gc.rst | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index 0c33c865304591e..1edc5a689384131 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -106,9 +106,18 @@ The :mod:`gc` module provides the following functions: allocations minus the number of deallocations exceeds *threshold0*, collection starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been - examined, then generation ``1`` is examined as well. Similarly, *threshold2* - controls the number of collections of generation ``1`` before collecting - generation ``2``. + examined, then generation ``1`` is examined as well. + In addition to the various configurable thresholds, the GC only triggers a full collection of the oldest generation + if the ratio ``long_lived_pending / long_lived_total`` is above a given value (hardwired to ``25%``). + The reason is that, while "non-full" collections (i.e., collections of the young and middle generations) + will always examine roughly the same number of objects (determined by the aforementioned thresholds) + the cost of a full collection is proportional to the total number of long-lived objects, which is virtually unbounded.
+ Indeed, it has been remarked that doing a full collection every ```` of object creations + entails a dramatic performance degradation in workloads which consist of creating and storing + lots of long-lived objects (e.g. building a large list of GC-tracked objects would show quadratic performance, + instead of linear as expected). Using the above ratio, instead, yields amortized linear performance + in the total number of objects (the effect of which can be summarized thusly: + "each full garbage collection is more and more costly as the number of objects grows, but we do fewer and fewer of them"). .. function:: get_count() From 1b1ab98690d215b2c5f37085020b451db4a36f95 Mon Sep 17 00:00:00 2001 From: Yaroslav Pankovych <31005942+P-Alban@users.noreply.github.com> Date: Sun, 2 Aug 2020 18:23:47 +0300 Subject: [PATCH 2/7] Fix line wrapping --- Doc/library/gc.rst | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index 1edc5a689384131..df79424deef6a59 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -107,17 +107,24 @@ The :mod:`gc` module provides the following functions: starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been examined, then generation ``1`` is examined as well. - In addition to the various configurable thresholds, the GC only triggers a full collection of the oldest generation - if the ratio ``long_lived_pending / long_lived_total`` is above a given value (hardwired to ``25%``). - The reason is that, while "non-full" collections (i.e., collections of the young and middle generations) - will always examine roughly the same number of objects (determined by the aforementioned thresholds) - the cost of a full collection is proportional to the total number of long-lived objects, which is virtually unbounded. 
- Indeed, it has been remarked that doing a full collection every ```` of object creations - entails a dramatic performance degradation in workloads which consist of creating and storing - lots of long-lived objects (e.g. building a large list of GC-tracked objects would show quadratic performance, - instead of linear as expected). Using the above ratio, instead, yields amortized linear performance - in the total number of objects (the effect of which can be summarized thusly: - "each full garbage collection is more and more costly as the number of objects grows, but we do fewer and fewer of them"). + In addition to the various configurable thresholds, the GC only triggers + a full collection of the oldest generation + if the ratio ``long_lived_pending / long_lived_total`` is above a given value + (hardwired to ``25%``). The reason is that, while "non-full" collections + (i.e., collections of the young and middle generations) will always examine + roughly the same number of objects + (determined by the aforementioned thresholds) + the cost of a full collection is proportional to the total number + of long-lived objects, which is virtually unbounded. + Indeed, it has been remarked that doing a full collection every + ```` of object creations + entails a dramatic performance degradation in workloads which consist + of creating and storing lots of long-lived objects (e.g. building a large + list of GC-tracked objects would show quadratic performance, + instead of linear as expected). Using the above ratio, instead, yields + amortized linear performance in the total number of objects (the effect + of which can be summarized thusly: "each full garbage collection is more + and more costly as the number of objects grows, but we do fewer and fewer of them"). .. 
function:: get_count() From 9e2cff19edd1d87f44a103485ab49dc61e089a6a Mon Sep 17 00:00:00 2001 From: Yaroslav Pankovych <31005942+P-Alban@users.noreply.github.com> Date: Sun, 2 Aug 2020 18:24:39 +0300 Subject: [PATCH 3/7] Fix line wrapping --- Doc/library/gc.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index df79424deef6a59..ba155796699b883 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -124,7 +124,8 @@ The :mod:`gc` module provides the following functions: instead of linear as expected). Using the above ratio, instead, yields amortized linear performance in the total number of objects (the effect of which can be summarized thusly: "each full garbage collection is more - and more costly as the number of objects grows, but we do fewer and fewer of them"). + and more costly as the number of objects grows, but we do + fewer and fewer of them"). .. function:: get_count() From 2bf9032900786a37961d832a3d2bbe005d63e1ec Mon Sep 17 00:00:00 2001 From: Yaroslav Pankovych <31005942+P-Alban@users.noreply.github.com> Date: Sun, 2 Aug 2020 18:29:15 +0300 Subject: [PATCH 4/7] Remove trailing whitespaces. --- Doc/library/gc.rst | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index ba155796699b883..12f6c4e3c253a9e 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -107,24 +107,24 @@ The :mod:`gc` module provides the following functions: starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been examined, then generation ``1`` is examined as well. 
- In addition to the various configurable thresholds, the GC only triggers - a full collection of the oldest generation - if the ratio ``long_lived_pending / long_lived_total`` is above a given value + In addition to the various configurable thresholds, the GC only triggers + a full collection of the oldest generation + if the ratio ``long_lived_pending / long_lived_total`` is above a given value (hardwired to ``25%``). The reason is that, while "non-full" collections (i.e., collections of the young and middle generations) will always examine - roughly the same number of objects - (determined by the aforementioned thresholds) - the cost of a full collection is proportional to the total number + roughly the same number of objects + (determined by the aforementioned thresholds) + the cost of a full collection is proportional to the total number of long-lived objects, which is virtually unbounded. Indeed, it has been remarked that doing a full collection every - ```` of object creations + ```` of object creations entails a dramatic performance degradation in workloads which consist - of creating and storing lots of long-lived objects (e.g. building a large - list of GC-tracked objects would show quadratic performance, + of creating and storing lots of long-lived objects (e.g. building a large + list of GC-tracked objects would show quadratic performance, instead of linear as expected). Using the above ratio, instead, yields amortized linear performance in the total number of objects (the effect - of which can be summarized thusly: "each full garbage collection is more - and more costly as the number of objects grows, but we do + of which can be summarized thusly: "each full garbage collection is more + and more costly as the number of objects grows, but we do fewer and fewer of them"). 
From ec591c9288c49770f04b4bbfaf9a2623520cc711 Mon Sep 17 00:00:00 2001 From: Yaroslav Pankovych <31005942+P-Alban@users.noreply.github.com> Date: Sat, 8 Aug 2020 16:19:27 +0300 Subject: [PATCH 5/7] Update gc.rst --- Doc/library/gc.rst | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index 12f6c4e3c253a9e..a0cd50a3d1fe379 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -107,25 +107,8 @@ The :mod:`gc` module provides the following functions: starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been examined, then generation ``1`` is examined as well. - In addition to the various configurable thresholds, the GC only triggers - a full collection of the oldest generation - if the ratio ``long_lived_pending / long_lived_total`` is above a given value - (hardwired to ``25%``). The reason is that, while "non-full" collections - (i.e., collections of the young and middle generations) will always examine - roughly the same number of objects - (determined by the aforementioned thresholds) - the cost of a full collection is proportional to the total number - of long-lived objects, which is virtually unbounded. - Indeed, it has been remarked that doing a full collection every - ```` of object creations - entails a dramatic performance degradation in workloads which consist - of creating and storing lots of long-lived objects (e.g. building a large - list of GC-tracked objects would show quadratic performance, - instead of linear as expected). Using the above ratio, instead, yields - amortized linear performance in the total number of objects (the effect - of which can be summarized thusly: "each full garbage collection is more - and more costly as the number of objects grows, but we do - fewer and fewer of them"). 
+ With the third generation things a bit complicated, + see `Collecting the oldest generation `_ for more information. .. function:: get_count() From f99124c81d38eed74693730155a1a4e9a0cb110d Mon Sep 17 00:00:00 2001 From: Pablo Galindo Date: Sat, 8 Aug 2020 19:30:57 +0100 Subject: [PATCH 6/7] Update Doc/library/gc.rst --- Doc/library/gc.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index a0cd50a3d1fe379..d009505930c52f6 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -107,7 +107,7 @@ The :mod:`gc` module provides the following functions: starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been examined, then generation ``1`` is examined as well. - With the third generation things a bit complicated, + With the third generation, things are a bit more complicated, see `Collecting the oldest generation `_ for more information. From 2d288fec9489ab1c6d1a781d814e3b9255e4bf2e Mon Sep 17 00:00:00 2001 From: Pablo Galindo Date: Sat, 8 Aug 2020 19:37:58 +0100 Subject: [PATCH 7/7] Update Doc/library/gc.rst --- Doc/library/gc.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index d009505930c52f6..2d85cd3431711ab 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -107,7 +107,7 @@ The :mod:`gc` module provides the following functions: starts. Initially only generation ``0`` is examined. If generation ``0`` has been examined more than *threshold1* times since generation ``1`` has been examined, then generation ``1`` is examined as well. - With the third generation, things are a bit more complicated, + With the third generation, things are a bit more complicated, see `Collecting the oldest generation `_ for more information.
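
Reviewer note: the threshold rules discussed across this patch series can be modelled roughly as follows. This is a simplified sketch for illustration only, not the CPython implementation. The function name `pick_generation` and its parameters are hypothetical; only the names `long_lived_pending` and `long_lived_total` and the 25% figure come from the doc text in the patches. The exact comparison used for the ratio (skip the full collection when pending is below a quarter of the total) is an assumption based on that text.

```python
def pick_generation(counts, thresholds, long_lived_pending, long_lived_total):
    """Return the oldest generation that should be collected, or None.

    counts and thresholds are per-generation sequences, youngest first.
    Hypothetical model of the rule described in the docs, not real gc internals.
    """
    oldest = len(counts) - 1
    # Walk from the oldest generation down to the youngest.
    for gen in range(oldest, -1, -1):
        if counts[gen] > thresholds[gen]:
            # Assumed rule: skip a full collection while fewer than 25% of
            # the long-lived objects are still pending examination.
            if gen == oldest and long_lived_pending < long_lived_total // 4:
                continue
            return gen
    return None


# The oldest generation is over its threshold, but only 1% of the long-lived
# objects are pending, so the middle generation is collected instead.
print(pick_generation((701, 11, 11), (700, 10, 10), 10, 1000))
# With 30% pending, the full collection goes ahead.
print(pick_generation((701, 11, 11), (700, 10, 10), 300, 1000))
```

To see the thresholds and counters of a running interpreter, call `gc.get_threshold()` and `gc.get_count()` from the standard `gc` module. The values vary between Python versions, so none are shown here.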