micro-ROS · pablogs9 · Apr 30, 2021 · Apr 29, 2021
diff --git a/_docs/concepts/benchmarking/memo_prof/3mems.png b/_docs/concepts/benchmarking/memo_prof/3mems.png
diff --git a/_docs/concepts/benchmarking/memo_prof/index.md b/_docs/concepts/benchmarking/memo_prof/index.md
@@ -45,13 +45,12 @@ Communication streams regulate how messages flow between the Clients and the Age
 
 ### Memory management of the RMW
 
-The `rmw-microxrcedds` layer uses static memory for allocating the resources associated with the ROS client support libraries, such as nodes, publishers, subscribers etc. This static memory is managed by independent memory pools for each kind of entity, whose size can be set at build time through CMake flags which allow to tune several parameters, for instance the maximum number of entities, which ultimately control how much memory a micro-ROS application can consume.
-
-Another important memory resource managed by the `rmw-microxrcedds` library is the message history (also referred to as RMW_history in the rest of the text). `rmw-microxrcedds` uses a static-memory message queue where to keep the subscription messages before the user reads them, that functions as a buffer ring while it is being populated with the incoming messages and depleted by means of users’ callbacks coming from the higher layers.
+The `rmw-microxrcedds` layer uses static memory for allocating the resources associated with the ROS client support libraries, such as nodes, publishers, subscribers etc. This memory is managed by static memory pools that are shared among all the entities of a given application. The number of pools is fixed by the RMW message history (also referred to as RMW_history in the rest of the text), a parameter which is chosen by the user as a CMake flag.
+These RMW pools act as message queues where to keep the subscription messages before the user reads them. The size of each pool is given by the MTU &#215; XRCE_history.
 
 ### Additional considerations
 
-It should be clear by now that the two history buffers involved in the Client-Agent communication are different in nature: the XRCE-DDS history is used to store chunks of fragmented messages if Reliable communication is implemented. Thanks to this, in Reliable mode one can send or receive up to MTU &#215; XRCE_history minus the memory reserved for headers, whereas Best-Effort communication streams can only exchange messages of size smaller or equal to the MTU.  The history of the RMW, in turn, controls the reception of data in the case of subscription and services. In this case, a buffer ring is generated in the RMW to store and cushion the data received from the XRCE-DDS library while handling the `take` calls received from the user’s interface to fetch these data and send them through the upper layers all the way up to the user’s application.
+It should be clear by now that the two history buffers involved in the Client-Agent communication are different in nature: the XRCE-DDS history is used to store chunks of fragmented messages if Reliable communication is implemented. Thanks to this, in Reliable mode one can send or receive up to MTU &#215; XRCE_history minus the memory reserved for headers, whereas Best-Effort communication streams can only exchange messages of size smaller or equal to the MTU.  The history of the RMW, in turn, controls the reception of data in the case of subscription and services. In this case, a buffer ring is generated in the RMW to store and cushion the data received from the XRCE-DDS library while handling the `take` calls received from the user’s interface to fetch the data and send them through the higher layers all the way up to the user’s application.
 
 Notice that this reflects the different behaviours of the XRCE-DDS library and of the RMW under subscription. While the XRCE-DDS library functions with callbacks, by warning the user whenever a new message comes in, the RMW functions by polling: this library listens to the topics the user has subscribed to and stores them until they are explicitly requested.
 
@@ -117,7 +116,7 @@ In the fifth set of measurements we measure the footprint of applications of req
 
 In this section, we detail the methodology employed for the memory profiling of the experimental configurations described above, and for each of them we present the results obtained.
 
-The measurements are conducted on a micro-ROS Client application with a varying number of entities: either publishers/subscribers (1, 5, 10, 15) or client/server (1, 2, 5, 10).
+The measurements are conducted on a micro-ROS Client application with a varying number of entities: either publishers/subscribers (from 1 to 15) or client/server (from 1 to 10).
 
 All the tested apps run on top of FreeRTOS and inside of an ESP32 board. The board is connected by either UDP or serial transport to a micro-ROS Agent running on a Linux machine. As explained above, the choice of FreeRTOS has been by virtue of its memory management functionalities, which easily allow to compute the memory used by applications.
 
@@ -127,30 +126,30 @@ In order to provide an assessment as much realistic as possible, the following p
 
 *Total memory as a function of entities number and message size*
 
-In this section, we report the total memory used by either publisher or subscriber applications in both Best-Effort and Reliable modes, using UDP transport, an RMW history of 4 slots, an MTU of 512 B and an XRCE-DDS history of 4 slots. The total memory consumption is reported as a function of the entity number and message size.
+In this section, we report the total memory used by either publisher or subscriber applications in both Best-Effort and Reliable modes, using UDP transport, an RMW history of 8 shared slots, an MTU of 512 B and an XRCE-DDS history of 4 slots. The total memory consumption is reported as a function of the entity number and message size.
 
 The number of publishers/subscribers has been varied, which is equivalent to changing the number of topics, since in our design of the set-up we associate each publisher/subscriber with just one topic.
 
-In principle, in the Reliable case one can occupy the generated buffers with message sizes up to MTU &#215; XRCE_history, whereas in the Best-Effort case it can be filled with messages with size up to MTU, which correspond respectively to to 512 B &#215; 4 = 2048 B and 512 B with our default chosen values. This is due to the absence of fragmentation in Best-Effort communication streams, while, thanks to fragmentation, an entity functioning in Reliable mode can send/receive a message opportunely chunked in a number of pieces equal to the XRCE_history, each of the size of the MTU. However, from table 1 one can see the message size only ranges from 0 and 1366 B in the case of Reliable entities, and between 0 and 490 B for entities in Best-Effort mode. This is due to the fact that in both cases some memory is consumed by headers and, most importantly, in the Reliable case some is consumed by confirmation messages such as heartbeats and acknacks.
+In principle, in the Reliable case one can occupy the generated buffers with message sizes up to MTU &#215; XRCE_history, whereas in the Best-Effort case it can be filled with messages with size up to MTU, which correspond respectively to to 512 B &#215; 4 = 2048 B and 512 B with our default chosen values. This is due to the absence of fragmentation in Best-Effort communication streams, while, thanks to fragmentation, an entity communicating in Reliable mode can send/receive a message opportunely chunked in a number of pieces equal to the XRCE_history, each of the size of the MTU. However, from table 1 one can see the message size only ranges from 0 and 1366 B in the case of Reliable entities, and between 0 and 490 B for entities in Best-Effort mode. This is due to the fact that in both cases some memory is consumed by headers and, most importantly, in the Reliable case, some is consumed by confirmation messages such as heartbeats and acknacks.
 
 <img alt="Total memory" src="overall.png" class="center">
 
 <p align="center">
   Fig 3: Total memory usage (in Bytes) of micro-ROS publisher and subscription applications in both Best-Effort and Reliable modes with UDP transport, default parameters and as a function of the entities number (x axis) and of the message size (legend).
 </p>
 
-
 From these plots, we can draw some conclusions and observe trends.
 
 First of all it appears clear that the total memory consumption varies with the number of entities but not with the message size. The reason for this is that all message sizes explored fit into the static buffers pre-allocated by the program at compile-time. We therefore expect that the memory consumption would only vary with the message size when the total space occupied by the topic plus the confirmation messages (in the reliable case) and the overhead exceeds the buffer size. In the case of increasing the number of entities, instead, the overall memory grows (as we’ll see below, this is driven by an increase in both the static and the dynamic memories, while the stack is not affected).
 
-By performing a simple calculation, we can see that the memory occupied by one publisher under the above experimental conditions is of ~ 400 B, while that occupied by one subscriber is ~ 8700 B. This huge difference is rooted in that subscriptions make full use of both the XRCE-DDS and the RMW histories, which respectively control the data storage and the data passing between the lower layers and the user interface by means of buffer rings where data are stored and retrieved during the subscription process. This makes subscriptions consume an additional memory of size MTU * XRCE_history * RMW_history (in this case, 512 * 4 * 4) with respect to the memory used by a publisher app, which is merely associated with the entity creation.
+By performing a simple calculation, we can see that the memory occupied by one publisher under the above experimental conditions is of ~ 400 B, while that occupied by one subscriber is ~ 500 B.
+The fact that there is virtually no substantial difference between the memory usage of these two entities, notwithstanding the fact that subscribers have a RMW_history associated, is ascribable to the fact that the memory pools of the RMW are shared among all the entities participating in a given application, and therefore it doesn't mark a difference between subscribers (in need to store messages before they are fetched from the higher layers) and publishers.
 
 Finally, we see that there is no substantial difference between Reliable and Best-Effort modes, exception made for the upper threshold of the message size that can be sent in these two modes, as explained at the beginning of this section.
 
 *Memory breakdown*
 
-To get a better insight into the type of memory consumed by these applications, below we provide the same data but broken down into its constituent memory chunks. We do so for just one message size (1 B), since, as we have seen, this number doesn’t affect the total memory consumed (nor its constituents).
+To get a better insight on the type of memory consumed by these applications, below we provide the same data but broken down into its constituent memory chunks. We do so for just one message size (1 B), since, as we have seen, this number doesn’t affect the total memory consumed (nor its constituents).
 
 <img alt="Memory breakdown" src="3mems.png" class="center">
 
@@ -160,45 +159,39 @@ To get a better insight into the type of memory consumed by these applications,
 
 From these results we see that both the static and the dynamic memories change with the entity number, while the stack stays constant.
 
-*Serial vs UDP transports*
-
-In this scenario we have performed a reduced set of measurements, for a configuration very similar to the one reported in the first scenario, but with a different transport protocol, that is, serial in spite of UDP. We have tested an application with a single publisher and another with a single subscriber communicating on Reliable streams, with both the RMW and the XRCE-DDS histories set to 4 units, and the MTU set to its default value of 512 B.  Because of this, the relevant figures produced as outputs to these measurements can be summarized by simple numbers and a proper plot is not provided. The results obtained are: 45590 B of total memory for one publication, and 52643 B for one subscription. To be able to perform a quick comparison between serial and UDP transports, we recall that in the UDP case tested at the beginning of this section, we had obtained 42869 B for one publisher app and 50843 subscriber app. Given the results closeness, we deduce that the transport does not influence the results (or, at most, it does so to a very small, pointing towards a slightly less consumption in the case of serial transport).
-
 *Role of the RMW history*
 
-In this scenario, we have measured the static memory consumed as a function of the RMW history, when this ranges from 1 to 20 units, for a single subscriber application to a message of fixed size (again, as seen above this size doesn’t affect the memory consumption as long as it’s smaller than the pre-allocated buffer size), with UDP transport and an XRCE-DDS history of 4, using Reliable communication. The results are summarized in the plot below:
+In this scenario, we have measured the static memory consumed as a function of the RMW history, when this ranges from 1 to 20 units, for a single subscriber application and with a message of fixed size (again, as seen above this size doesn’t affect the memory consumption as long as it’s smaller than the pre-allocated buffer size), with UDP transport and an XRCE-DDS history of 4, using Reliable communication. The results are summarized in the plot below:
 
 <img alt="RMW history" src="rmw_history.png" class="center" width="60%">
 
 <p align="center">
    Fig 5: Static memory usage (in Bytes) of a micro-ROS subscription application in reliable mode with UDP transport, default parameters and fixed message size as a function of the RMW history.
 </p>
 
-From this plot, we see that the total static memory used changes by MTU * RMW_history (which is equal to 512 * 4 for the parameters employed) for each unit of RMW memory that we add to the application.
+From this plot, we see that the total static memory used changes by MTU &#215; RMW_history (which is equal to 512 &#215; 4 for the parameters employed) for each unit of RMW memory that we add to the application.
 
 #### Client-Server apps
 
-We now pass to investigate our last case-scenario, where in spite of pub/sub apps, we consider a different kind of ROS object, that of services, in which the communication between entities follow a request/reply pattern. See below the results for the memory consumed, for a number of servers and clients ranging between 1, 2, 5 and 10. Notice that we report both the behaviour and values of the individual consituents (static, stack, and dynamic) and of the total memory.
+We now pass to investigate our last case-scenario, where in spite of pub/sub apps, we consider a different kind of ROS object, that of services, in which the communication between entities follow a request/reply pattern. See below the results for the memory consumed, for a number of servers and clients ranging from 1 to 10. Notice that we report both the behaviour and values of the individual consituents (static, stack, and dynamic) and of the total memory.
 
 <img alt="Services" src="servcli.png" class="center">
 
 <p align="center">
    Fig 6: Total memory usage (in Bytes) of micro-ROS service and clients applications as a function of the number of servers and clients.
 </p>
 
-As already done in the case of publishers and subscribers, we can calculate the total memory consumed by a single entity. From this calculation it results that the memory occupied by one server or one client is on the order of ~ 8800 B. From this figure, we see that the memory occupied by a server and that occupied by a client is virtually identical, and it is on the same order of magnitude as that occupied by a subscriber. This is understandable if one thinks that in the case of these applications, entities always need to ‘subscribe’ to either a request, in the case of a server, and to a reply, in the case of a client. In both cases, messages need to be stored in the memory slots allocated in both the XRCE-DDS and RMW layers.
+As already done in the case of publishers and subscribers, we can calculate the total memory consumed by a single entity. From this calculation it results that the memory occupied by one server or one client is on the order of ~ 300 B. From this figure, we see that the memory occupied by a server and that occupied by a client is virtually identical, and it is on the same order of magnitude as that occupied by a publisher or subscriber application.
 
 ### Conclusions
 
 To sum up, we have seen that:
 
-* Memory consumption doesn’t vary with message size as long as the sum of the latter and overheads can be accommodated by the static buffer pre-allocated at compile-time.
+* Memory consumption doesn’t vary with message size as long as the sum of the latter plus the overheads can be accommodated by the static buffer pre-allocated at compile-time.
 * Static and Dynamic memories vary with the entity number, while the stack remains constant.
-* A single publisher app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 400 B of total memory, which corresponds to the memory needed for creating the entity.
-* A single subscriber app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 8700 B of total memory, which represents the sum of the memory needed for creating the entity and the static buffers allocated to store and handle the messages in both the XRCE-DDS and RMW. It is important to note that this memory size is obtained when using the default configuration, and it can be heavily reduced to fit use cases with a lower memory profile.
-* A single client/server app with default configuration parameters and with UDP transport consumes ~ 8800 B of total memory. This reflects the fact that in a request/response pattern messages need to be stored and handled as in the subscription case.
-* In the case of a single subscription, the total static memory used changes by MTU * XRCE_history for each unit of RMW history that is added to the application.
-* The handling of the RMW history for subscriptions and services, which are the most critical cases, should be revisited in order to reduce the growth of the static consumption. The current implementation reserves RMW_history * XRCE_history * MTU Bytes for each subscriber/client/service. Our most recent considerations are directed towards reducing this number by using a shared RMW history pool for every entity.
+* A single publisher/subscriber app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 400-500 B of total memory.
+* A single client/server app with default configuration parameters and with UDP transport consumes ~ 300 B of total memory, on the same order of magnitude of pub/sub applications.
+* In the case of a single subscription, the total static memory used changes by MTU &#215; XRCE_history for each unit of RMW history that is added to the application.
 
 <style type="text/css">
 

diff --git a/_docs/concepts/benchmarking/memo_prof/overall.png b/_docs/concepts/benchmarking/memo_prof/overall.png
diff --git a/_docs/concepts/benchmarking/memo_prof/rmw_history.png b/_docs/concepts/benchmarking/memo_prof/rmw_history.png
diff --git a/_docs/concepts/benchmarking/memo_prof/servcli.png b/_docs/concepts/benchmarking/memo_prof/servcli.png