Commit fd87465

Josh Gachnang, jayofdoom, and aweeks committed
Add metrics support to IPA
This utilizes the new metrics support in ironic-lib to allow the agent to report timing metrics for agent API methods as configured in ironic-lib.

Additionally, this adds developer docs on how to use metrics in IPA, including some caveats specific to ironic-lib.metrics use in IPA.

Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Co-Authored-By: Alex Weeks <alex.weeks@gmail.com>
Change-Id: Ic08d4ff78b6fb614b474b956a32eac352a14262a
Partial-bug: #1526219
1 parent ad60806 commit fd87465

7 files changed

+99 -27 lines changed

doc/source/index.rst

Lines changed: 2 additions & 1 deletion
@@ -17,7 +17,8 @@ Index
 
 .. toctree::
 
-   troubleshooting
+   troubleshooting
+   metrics
 
 How it works
 ============

doc/source/metrics.rst

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+.. _metrics:
+
+===============================================
+Emitting metrics from Ironic-Python-Agent (IPA)
+===============================================
+
+This document describes how to emit metrics from IPA, from timers and
+counters in code to directly emitting hardware metrics from a custom
+HardwareManager.
+
+Overview
+========
+IPA uses the metrics implementation from ironic-lib, with a few caveats due
+to the dynamic configuration done at lookup time. You cannot cache the metrics
+instance, as the MetricsLogger returned will change after lookup if configs
+different from the default setting have been used. This also means that the
+method decorator supported by ironic-lib cannot be used in IPA.
+
+Using a context manager
+=======================
+Using the context manager is the recommended way for sending metrics that time
+or count sections of code. However, given that you cannot cache the
+MetricsLogger, you have to explicitly call get_metrics_logger() from
+ironic-lib every time. For example:
+
+    from ironic_lib import metrics_utils
+
+    def my_method():
+        with metrics_utils.get_metrics_logger(__name__).timer():
+            return _do_work()
+
+As a note, these metric collectors do work for custom HardwareManagers as
+well. However, you may want to measure the portions of a method that determine
+compatibility separately from the portions that actually do work, in
+order to ensure the metrics are relevant and useful on all hardware.
+
+Explicitly sending metrics
+==========================
+A feature that may be particularly helpful for deployers writing custom
+HardwareManagers is the ability to explicitly send metrics. As an example,
+you could add a cleaning step which would retrieve metrics about a device and
+ship them using the provided metrics library. For example:
+
+    from ironic_lib import metrics_utils
+
+    def my_cleaning_step():
+        for name, value in _get_smart_data():
+            metrics_utils.get_metrics_logger(__name__).send_gauge(name, value)
+
+References
+==========
+For more information, please read the source of the metrics module in
+`ironic-lib <http://git.openstack.org/cgit/openstack/ironic-lib/tree/ironic_lib>`_.

ironic_python_agent/agent.py

Lines changed: 13 additions & 1 deletion
@@ -20,6 +20,7 @@
 import time
 
 from oslo_concurrency import processutils
+from oslo_config import cfg
 from oslo_log import log
 import pkg_resources
 from six.moves.urllib import parse as urlparse
@@ -35,7 +36,6 @@
 from ironic_python_agent import ironic_api_client
 from ironic_python_agent import utils
 
-
 LOG = log.getLogger(__name__)
 
 # Time(in seconds) to wait for any of the interfaces to be up
@@ -45,6 +45,9 @@
 # Time(in seconds) to wait before reattempt
 NETWORK_WAIT_RETRY = 5
 
+cfg.CONF.import_group('metrics', 'ironic_lib.metrics_utils')
+cfg.CONF.import_group('metrics_statsd', 'ironic_lib.metrics_statsd')
+
 
 def _time():
     """Wraps time.time() for simpler testing."""
@@ -340,6 +343,15 @@ def run(self):
             hardware.cache_node(self.node)
             self.heartbeat_timeout = content['heartbeat_timeout']
 
+            # Update config with values from Ironic
+            config = content.get('config', {})
+            if config.get('metrics'):
+                for opt, val in config.items():
+                    setattr(cfg.CONF.metrics, opt, val)
+            if config.get('metrics_statsd'):
+                for opt, val in config.items():
+                    setattr(cfg.CONF.metrics_statsd, opt, val)
+
         wsgi = simple_server.make_server(
             self.listen_address[0],
             self.listen_address[1],
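
The config update in `run()` loops over `config.items()`, which yields the top-level group names rather than the options inside each group. A minimal sketch of applying per-group options follows. It assumes the lookup payload is shaped like `{'config': {'metrics': {...}, 'metrics_statsd': {...}}}`. The `SimpleNamespace` objects, the option names, and `apply_remote_config` are hypothetical stand-ins for `cfg.CONF` and are not part of this commit.

```python
from types import SimpleNamespace

# Hypothetical stand-in for cfg.CONF with two option groups.
# Option names and defaults here are illustrative assumptions.
conf = SimpleNamespace(
    metrics=SimpleNamespace(backend="noop", prepend_host=False),
    metrics_statsd=SimpleNamespace(statsd_host="localhost", statsd_port=8125),
)


def apply_remote_config(conf, content, groups=("metrics", "metrics_statsd")):
    """Copy each group's options from the lookup response onto conf."""
    config = content.get("config", {})
    for group in groups:
        # Iterate over the options *inside* the group, not the top-level dict.
        for opt, val in (config.get(group) or {}).items():
            setattr(getattr(conf, group), opt, val)
    return conf


# Assumed shape of a lookup response that carries metrics settings.
content = {
    "heartbeat_timeout": 300,
    "config": {
        "metrics": {"backend": "statsd"},
        "metrics_statsd": {"statsd_host": "10.0.0.5"},
    },
}

apply_remote_config(conf, content)
```

Options absent from the payload keep their defaults, and a missing or empty group is skipped without error.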

ironic_python_agent/api/controllers/root.py

Lines changed: 3 additions & 2 deletions
@@ -12,9 +12,9 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
+from ironic_lib import metrics_utils
 import pecan
 from pecan import rest
-
 from wsme import types as wtypes
 import wsmeext.pecan as wsme_pecan
 
@@ -81,7 +81,8 @@ def get(self):
         # NOTE: The reason why convert() it's being called for every
         #       request is because we need to get the host url from
         #       the request object to make the links.
-        return Root.convert()
+        with metrics_utils.get_metrics_logger(__name__).timer('get'):
+            return Root.convert()
 
     @pecan.expose()
     def _route(self, args):

ironic_python_agent/api/controllers/v1/command.py

Lines changed: 23 additions & 19 deletions
@@ -13,6 +13,7 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
+from ironic_lib import metrics_utils
 import pecan
 from pecan import rest
 from wsme import types
@@ -78,9 +79,10 @@ class CommandController(rest.RestController):
     @wsme_pecan.wsexpose(CommandResultList)
     def get_all(self):
         """Get all command results."""
-        agent = pecan.request.agent
-        results = agent.list_command_results()
-        return CommandResultList.from_results(results)
+        with metrics_utils.get_metrics_logger(__name__).timer('get_all'):
+            agent = pecan.request.agent
+            results = agent.list_command_results()
+            return CommandResultList.from_results(results)
 
     @wsme_pecan.wsexpose(CommandResult, types.text, types.text)
     def get_one(self, result_id, wait=None):
@@ -91,13 +93,14 @@ def get_one(self, result_id, wait=None):
         :returns: a :class:`ironic_python_agent.api.controller.v1.command.
                   CommandResult` object.
         """
-        agent = pecan.request.agent
-        result = agent.get_command_result(result_id)
+        with metrics_utils.get_metrics_logger(__name__).timer('get_one'):
+            agent = pecan.request.agent
+            result = agent.get_command_result(result_id)
 
-        if wait and wait.lower() == 'true':
-            result.join()
+            if wait and wait.lower() == 'true':
+                result.join()
 
-        return CommandResult.from_result(result)
+            return CommandResult.from_result(result)
 
     @wsme_pecan.wsexpose(CommandResult, types.text, body=Command)
     def post(self, wait=None, command=None):
@@ -109,14 +112,15 @@ def post(self, wait=None, command=None):
         :returns: a :class:`ironic_python_agent.api.controller.v1.command.
                   CommandResult` object.
         """
-        # the POST body is always the last arg,
-        # so command must be a kwarg here
-        if command is None:
-            command = Command()
-        agent = pecan.request.agent
-        result = agent.execute_command(command.name, **command.params)
-
-        if wait and wait.lower() == 'true':
-            result.join()
-
-        return result
+        with metrics_utils.get_metrics_logger(__name__).timer('post'):
+            # the POST body is always the last arg,
+            # so command must be a kwarg here
+            if command is None:
+                command = Command()
+            agent = pecan.request.agent
+            result = agent.execute_command(command.name, **command.params)
+
+            if wait and wait.lower() == 'true':
+                result.join()
+
+            return result
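
Because `result.join()` sits inside the `with` block in `get_one` and `post`, the reported time includes however long the caller chose to wait. The sketch below shows that effect with stand-ins only: `timer`, `execute_command`, and the `delay` parameter are hypothetical and model a command as a sleeping thread. It does not use the agent or ironic-lib.

```python
import threading
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timer(name):
    """Hypothetical stand-in for a metrics timer; stores elapsed seconds."""
    start = time.monotonic()
    try:
        yield
    finally:
        timings[name] = time.monotonic() - start


def execute_command(delay):
    """Model an asynchronous command as a background thread that sleeps."""
    worker = threading.Thread(target=time.sleep, args=(delay,))
    worker.start()
    return worker


def post(wait=None, delay=0.2):
    with timer("post"):
        result = execute_command(delay)
        if wait and wait.lower() == "true":
            result.join()  # blocking here is counted by the timer
        return result


post(wait="true").join()
waited = timings["post"]

post(wait=None).join()
not_waited = timings["post"]
```

Here `waited` covers the full command duration while `not_waited` covers only dispatch, so dashboards built on these timers may need to separate the two cases.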

ironic_python_agent/api/controllers/v1/status.py

Lines changed: 5 additions & 3 deletions
@@ -13,6 +13,7 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
+from ironic_lib import metrics_utils
 import pecan
 from pecan import rest
 from wsme import types
@@ -48,6 +49,7 @@ class StatusController(rest.RestController):
     @wsme_pecan.wsexpose(AgentStatus)
     def get_all(self):
         """Get current status of the running agent."""
-        agent = pecan.request.agent
-        status = agent.get_status()
-        return AgentStatus.from_agent_status(status)
+        with metrics_utils.get_metrics_logger(__name__).timer('get_all'):
+            agent = pecan.request.agent
+            status = agent.get_status()
+            return AgentStatus.from_agent_status(status)

ironic_python_agent/extensions/standby.py

Lines changed: 0 additions & 1 deletion
@@ -533,7 +533,6 @@ def prepare_image(self,
         stream_raw_images = image_info.get('stream_raw_images', False)
         # don't write image again if already cached
         if self.cached_image_id != image_info['id']:
-
             if self.cached_image_id is not None:
                 LOG.debug('Already had %s cached, overwriting',
                           self.cached_image_id)
