Skip to content

Commit 5ae91aa

Browse files
authored
Reworked memory profiling section. (#306)
1 parent a5b3aa7 commit 5ae91aa

5 files changed

Lines changed: 17 additions & 24 deletions

File tree

-12.7 KB
Loading

_docs/concepts/benchmarking/memo_prof/index.md

Lines changed: 17 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -45,13 +45,12 @@ Communication streams regulate how messages flow between the Clients and the Age
4545

4646
### Memory management of the RMW
4747

48-
The `rmw-microxrcedds` layer uses static memory for allocating the resources associated with the ROS client support libraries, such as nodes, publishers, subscribers etc. This static memory is managed by independent memory pools for each kind of entity, whose size can be set at build time through CMake flags which allow to tune several parameters, for instance the maximum number of entities, which ultimately control how much memory a micro-ROS application can consume.
49-
50-
Another important memory resource managed by the `rmw-microxrcedds` library is the message history (also referred to as RMW_history in the rest of the text). `rmw-microxrcedds` uses a static-memory message queue where to keep the subscription messages before the user reads them, that functions as a buffer ring while it is being populated with the incoming messages and depleted by means of users’ callbacks coming from the higher layers.
48+
The `rmw-microxrcedds` layer uses static memory for allocating the resources associated with the ROS client support libraries, such as nodes, publishers, subscribers etc. This memory is managed by static memory pools that are shared among all the entities of a given application. The number of pools is fixed by the RMW message history (also referred to as RMW_history in the rest of the text), a parameter which is chosen by the user as a CMake flag.
49+
These RMW pools act as message queues where to keep the subscription messages before the user reads them. The size of each pool is given by the MTU × XRCE_history.
5150

5251
### Additional considerations
5352

54-
It should be clear by now that the two history buffers involved in the Client-Agent communication are different in nature: the XRCE-DDS history is used to store chunks of fragmented messages if Reliable communication is implemented. Thanks to this, in Reliable mode one can send or receive up to MTU × XRCE_history minus the memory reserved for headers, whereas Best-Effort communication streams can only exchange messages of size smaller or equal to the MTU. The history of the RMW, in turn, controls the reception of data in the case of subscription and services. In this case, a buffer ring is generated in the RMW to store and cushion the data received from the XRCE-DDS library while handling the `take` calls received from the user’s interface to fetch these data and send them through the upper layers all the way up to the user’s application.
53+
It should be clear by now that the two history buffers involved in the Client-Agent communication are different in nature: the XRCE-DDS history is used to store chunks of fragmented messages if Reliable communication is implemented. Thanks to this, in Reliable mode one can send or receive up to MTU × XRCE_history minus the memory reserved for headers, whereas Best-Effort communication streams can only exchange messages of size smaller or equal to the MTU. The history of the RMW, in turn, controls the reception of data in the case of subscription and services. In this case, a buffer ring is generated in the RMW to store and cushion the data received from the XRCE-DDS library while handling the `take` calls received from the user’s interface to fetch the data and send them through the higher layers all the way up to the user’s application.
5554

5655
Notice that this reflects the different behaviours of the XRCE-DDS library and of the RMW under subscription. While the XRCE-DDS library functions with callbacks, by warning the user whenever a new message comes in, the RMW functions by polling: this library listens to the topics the user has subscribed to and stores them until they are explicitly requested.
5756

@@ -117,7 +116,7 @@ In the fifth set of measurements we measure the footprint of applications of req
117116

118117
In this section, we detail the methodology employed for the memory profiling of the experimental configurations described above, and for each of them we present the results obtained.
119118

120-
The measurements are conducted on a micro-ROS Client application with a varying number of entities: either publishers/subscribers (1, 5, 10, 15) or client/server (1, 2, 5, 10).
119+
The measurements are conducted on a micro-ROS Client application with a varying number of entities: either publishers/subscribers (from 1 to 15) or client/server (from 1 to 10).
121120

122121
All the tested apps run on top of FreeRTOS and inside of an ESP32 board. The board is connected by either UDP or serial transport to a micro-ROS Agent running on a Linux machine. As explained above, the choice of FreeRTOS has been by virtue of its memory management functionalities, which easily allow to compute the memory used by applications.
123122

@@ -127,30 +126,30 @@ In order to provide an assessment as much realistic as possible, the following p
127126

128127
*Total memory as a function of entities number and message size*
129128

130-
In this section, we report the total memory used by either publisher or subscriber applications in both Best-Effort and Reliable modes, using UDP transport, an RMW history of 4 slots, an MTU of 512 B and an XRCE-DDS history of 4 slots. The total memory consumption is reported as a function of the entity number and message size.
129+
In this section, we report the total memory used by either publisher or subscriber applications in both Best-Effort and Reliable modes, using UDP transport, an RMW history of 8 shared slots, an MTU of 512 B and an XRCE-DDS history of 4 slots. The total memory consumption is reported as a function of the entity number and message size.
131130

132131
The number of publishers/subscribers has been varied, which is equivalent to changing the number of topics, since in our design of the set-up we associate each publisher/subscriber with just one topic.
133132

134-
In principle, in the Reliable case one can occupy the generated buffers with message sizes up to MTU × XRCE_history, whereas in the Best-Effort case it can be filled with messages with size up to MTU, which correspond respectively to to 512 B × 4 = 2048 B and 512 B with our default chosen values. This is due to the absence of fragmentation in Best-Effort communication streams, while, thanks to fragmentation, an entity functioning in Reliable mode can send/receive a message opportunely chunked in a number of pieces equal to the XRCE_history, each of the size of the MTU. However, from table 1 one can see the message size only ranges from 0 and 1366 B in the case of Reliable entities, and between 0 and 490 B for entities in Best-Effort mode. This is due to the fact that in both cases some memory is consumed by headers and, most importantly, in the Reliable case some is consumed by confirmation messages such as heartbeats and acknacks.
133+
In principle, in the Reliable case one can occupy the generated buffers with message sizes up to MTU × XRCE_history, whereas in the Best-Effort case it can be filled with messages with size up to MTU, which correspond respectively to to 512 B × 4 = 2048 B and 512 B with our default chosen values. This is due to the absence of fragmentation in Best-Effort communication streams, while, thanks to fragmentation, an entity communicating in Reliable mode can send/receive a message opportunely chunked in a number of pieces equal to the XRCE_history, each of the size of the MTU. However, from table 1 one can see the message size only ranges from 0 and 1366 B in the case of Reliable entities, and between 0 and 490 B for entities in Best-Effort mode. This is due to the fact that in both cases some memory is consumed by headers and, most importantly, in the Reliable case, some is consumed by confirmation messages such as heartbeats and acknacks.
135134

136135
<img alt="Total memory" src="overall.png" class="center">
137136

138137
<p align="center">
139138
Fig 3: Total memory usage (in Bytes) of micro-ROS publisher and subscription applications in both Best-Effort and Reliable modes with UDP transport, default parameters and as a function of the entities number (x axis) and of the message size (legend).
140139
</p>
141140

142-
143141
From these plots, we can draw some conclusions and observe trends.
144142

145143
First of all it appears clear that the total memory consumption varies with the number of entities but not with the message size. The reason for this is that all message sizes explored fit into the static buffers pre-allocated by the program at compile-time. We therefore expect that the memory consumption would only vary with the message size when the total space occupied by the topic plus the confirmation messages (in the reliable case) and the overhead exceeds the buffer size. In the case of increasing the number of entities, instead, the overall memory grows (as we’ll see below, this is driven by an increase in both the static and the dynamic memories, while the stack is not affected).
146144

147-
By performing a simple calculation, we can see that the memory occupied by one publisher under the above experimental conditions is of ~ 400 B, while that occupied by one subscriber is ~ 8700 B. This huge difference is rooted in that subscriptions make full use of both the XRCE-DDS and the RMW histories, which respectively control the data storage and the data passing between the lower layers and the user interface by means of buffer rings where data are stored and retrieved during the subscription process. This makes subscriptions consume an additional memory of size MTU * XRCE_history * RMW_history (in this case, 512 * 4 * 4) with respect to the memory used by a publisher app, which is merely associated with the entity creation.
145+
By performing a simple calculation, we can see that the memory occupied by one publisher under the above experimental conditions is of ~ 400 B, while that occupied by one subscriber is ~ 500 B.
146+
The fact that there is virtually no substantial difference between the memory usage of these two entities, notwithstanding the fact that subscribers have a RMW_history associated, is ascribable to the fact that the memory pools of the RMW are shared among all the entities participating in a given application, and therefore it doesn't mark a difference between subscribers (in need to store messages before they are fetched from the higher layers) and publishers.
148147

149148
Finally, we see that there is no substantial difference between Reliable and Best-Effort modes, exception made for the upper threshold of the message size that can be sent in these two modes, as explained at the beginning of this section.
150149

151150
*Memory breakdown*
152151

153-
To get a better insight into the type of memory consumed by these applications, below we provide the same data but broken down into its constituent memory chunks. We do so for just one message size (1 B), since, as we have seen, this number doesn’t affect the total memory consumed (nor its constituents).
152+
To get a better insight on the type of memory consumed by these applications, below we provide the same data but broken down into its constituent memory chunks. We do so for just one message size (1 B), since, as we have seen, this number doesn’t affect the total memory consumed (nor its constituents).
154153

155154
<img alt="Memory breakdown" src="3mems.png" class="center">
156155

@@ -160,45 +159,39 @@ To get a better insight into the type of memory consumed by these applications,
160159

161160
From these results we see that both the static and the dynamic memories change with the entity number, while the stack stays constant.
162161

163-
*Serial vs UDP transports*
164-
165-
In this scenario we have performed a reduced set of measurements, for a configuration very similar to the one reported in the first scenario, but with a different transport protocol, that is, serial in spite of UDP. We have tested an application with a single publisher and another with a single subscriber communicating on Reliable streams, with both the RMW and the XRCE-DDS histories set to 4 units, and the MTU set to its default value of 512 B. Because of this, the relevant figures produced as outputs to these measurements can be summarized by simple numbers and a proper plot is not provided. The results obtained are: 45590 B of total memory for one publication, and 52643 B for one subscription. To be able to perform a quick comparison between serial and UDP transports, we recall that in the UDP case tested at the beginning of this section, we had obtained 42869 B for one publisher app and 50843 subscriber app. Given the results closeness, we deduce that the transport does not influence the results (or, at most, it does so to a very small, pointing towards a slightly less consumption in the case of serial transport).
166-
167162
*Role of the RMW history*
168163

169-
In this scenario, we have measured the static memory consumed as a function of the RMW history, when this ranges from 1 to 20 units, for a single subscriber application to a message of fixed size (again, as seen above this size doesn’t affect the memory consumption as long as it’s smaller than the pre-allocated buffer size), with UDP transport and an XRCE-DDS history of 4, using Reliable communication. The results are summarized in the plot below:
164+
In this scenario, we have measured the static memory consumed as a function of the RMW history, when this ranges from 1 to 20 units, for a single subscriber application and with a message of fixed size (again, as seen above this size doesn’t affect the memory consumption as long as it’s smaller than the pre-allocated buffer size), with UDP transport and an XRCE-DDS history of 4, using Reliable communication. The results are summarized in the plot below:
170165

171166
<img alt="RMW history" src="rmw_history.png" class="center" width="60%">
172167

173168
<p align="center">
174169
Fig 5: Static memory usage (in Bytes) of a micro-ROS subscription application in reliable mode with UDP transport, default parameters and fixed message size as a function of the RMW history.
175170
</p>
176171

177-
From this plot, we see that the total static memory used changes by MTU * RMW_history (which is equal to 512 * 4 for the parameters employed) for each unit of RMW memory that we add to the application.
172+
From this plot, we see that the total static memory used changes by MTU &#215; RMW_history (which is equal to 512 &#215; 4 for the parameters employed) for each unit of RMW memory that we add to the application.
178173

179174
#### Client-Server apps
180175

181-
We now pass to investigate our last case-scenario, where in spite of pub/sub apps, we consider a different kind of ROS object, that of services, in which the communication between entities follow a request/reply pattern. See below the results for the memory consumed, for a number of servers and clients ranging between 1, 2, 5 and 10. Notice that we report both the behaviour and values of the individual consituents (static, stack, and dynamic) and of the total memory.
176+
We now pass to investigate our last case-scenario, where in spite of pub/sub apps, we consider a different kind of ROS object, that of services, in which the communication between entities follow a request/reply pattern. See below the results for the memory consumed, for a number of servers and clients ranging from 1 to 10. Notice that we report both the behaviour and values of the individual consituents (static, stack, and dynamic) and of the total memory.
182177

183178
<img alt="Services" src="servcli.png" class="center">
184179

185180
<p align="center">
186181
Fig 6: Total memory usage (in Bytes) of micro-ROS service and clients applications as a function of the number of servers and clients.
187182
</p>
188183

189-
As already done in the case of publishers and subscribers, we can calculate the total memory consumed by a single entity. From this calculation it results that the memory occupied by one server or one client is on the order of ~ 8800 B. From this figure, we see that the memory occupied by a server and that occupied by a client is virtually identical, and it is on the same order of magnitude as that occupied by a subscriber. This is understandable if one thinks that in the case of these applications, entities always need to ‘subscribe’ to either a request, in the case of a server, and to a reply, in the case of a client. In both cases, messages need to be stored in the memory slots allocated in both the XRCE-DDS and RMW layers.
184+
As already done in the case of publishers and subscribers, we can calculate the total memory consumed by a single entity. From this calculation it results that the memory occupied by one server or one client is on the order of ~ 300 B. From this figure, we see that the memory occupied by a server and that occupied by a client is virtually identical, and it is on the same order of magnitude as that occupied by a publisher or subscriber application.
190185

191186
### Conclusions
192187

193188
To sum up, we have seen that:
194189

195-
* Memory consumption doesn’t vary with message size as long as the sum of the latter and overheads can be accommodated by the static buffer pre-allocated at compile-time.
190+
* Memory consumption doesn’t vary with message size as long as the sum of the latter plus the overheads can be accommodated by the static buffer pre-allocated at compile-time.
196191
* Static and Dynamic memories vary with the entity number, while the stack remains constant.
197-
* A single publisher app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 400 B of total memory, which corresponds to the memory needed for creating the entity.
198-
* A single subscriber app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 8700 B of total memory, which represents the sum of the memory needed for creating the entity and the static buffers allocated to store and handle the messages in both the XRCE-DDS and RMW. It is important to note that this memory size is obtained when using the default configuration, and it can be heavily reduced to fit use cases with a lower memory profile.
199-
* A single client/server app with default configuration parameters and with UDP transport consumes ~ 8800 B of total memory. This reflects the fact that in a request/response pattern messages need to be stored and handled as in the subscription case.
200-
* In the case of a single subscription, the total static memory used changes by MTU * XRCE_history for each unit of RMW history that is added to the application.
201-
* The handling of the RMW history for subscriptions and services, which are the most critical cases, should be revisited in order to reduce the growth of the static consumption. The current implementation reserves RMW_history * XRCE_history * MTU Bytes for each subscriber/client/service. Our most recent considerations are directed towards reducing this number by using a shared RMW history pool for every entity.
192+
* A single publisher/subscriber app with default configuration parameters and with both transport protocols explored (UDP and serial) consumes ~ 400-500 B of total memory.
193+
* A single client/server app with default configuration parameters and with UDP transport consumes ~ 300 B of total memory, on the same order of magnitude of pub/sub applications.
194+
* In the case of a single subscription, the total static memory used changes by MTU &#215; XRCE_history for each unit of RMW history that is added to the application.
202195

203196
<style type="text/css">
204197

-40.6 KB
Loading
3.19 KB
Loading
-49.4 KB
Loading

0 commit comments

Comments
 (0)