Commit 7db8e10

feat: Add materialization, feature freshness, request latency, and push metrics to feature server
Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>
1 parent a1a160d commit 7db8e10

7 files changed, +1746 -215 lines changed

docs/reference/feature-servers/python-feature-server.md

Lines changed: 114 additions & 0 deletions
@@ -311,6 +311,120 @@ requests.post(
```python
    data=json.dumps(materialize_data))
```

## Prometheus Metrics

The Python feature server can expose Prometheus-compatible metrics on a dedicated
HTTP endpoint (default port `8000`). Metrics are **opt-in** and carry zero overhead
when disabled.

### Enabling metrics

**Option 1: CLI flag** (useful for one-off runs):

```bash
feast serve --metrics
```

**Option 2: `feature_store.yaml`** (recommended for production):

```yaml
feature_server:
    type: local
    metrics:
        enabled: true
```

Either option is sufficient. When both are set, metrics are enabled.

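One way to confirm that metrics are being served is to fetch the endpoint and list the `feast_`-prefixed metric names. The sketch below is illustrative only: the helper names are hypothetical, and the `localhost` host and `/metrics` path are assumptions (the text above states only the default port `8000`).

```python
from urllib.request import urlopen


def feast_metric_names(exposition_text):
    """Return the set of sample names starting with 'feast_' found in
    Prometheus text-format output. Histograms show up under their
    '_bucket', '_sum' and '_count' sample names."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like 'name{labels} value' or 'name value'.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith("feast_"):
            names.add(name)
    return names


def fetch_metrics(url="http://localhost:8000/metrics", timeout=5.0):
    """Fetch the raw metrics text. The URL is an assumption; adjust it
    to wherever your feature server exposes metrics."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage against a running server:
#   print(sorted(feast_metric_names(fetch_metrics())))
```

If the returned set is empty while the server is running, metrics are probably not enabled through either option.
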
### Per-category control

By default, enabling metrics turns on **all** categories. You can selectively
disable individual categories within the same `metrics` block:

```yaml
feature_server:
    type: local
    metrics:
        enabled: true
        resource: true          # CPU / memory gauges
        request: false          # disable endpoint latency & request counters
        online_features: true   # online feature retrieval counters
        push: true              # push request counters
        materialization: true   # materialization counters & duration
        freshness: true         # feature freshness gauges
```

Any category set to `false` emits no metrics and starts no background
threads (e.g., setting `freshness: false` prevents the registry polling
thread from starting). All categories default to `true`.

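The defaulting rules above can be modelled in a few lines. This is a hypothetical sketch, not Feast's actual configuration class: `MetricsSettings` and `active_categories` are made-up names, and the only behavior taken from the text is that every category defaults to `true` and that either the CLI flag or `enabled: true` switches metrics on.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MetricsSettings:
    """Hypothetical mirror of the `metrics` block in feature_store.yaml."""
    enabled: bool = False
    resource: bool = True
    request: bool = True
    online_features: bool = True
    push: bool = True
    materialization: bool = True
    freshness: bool = True


def active_categories(metrics_block, cli_flag=False):
    """Return the category names that would emit metrics.

    metrics_block is the parsed `metrics` mapping (possibly empty);
    cli_flag models `feast serve --metrics`. Unknown keys raise TypeError.
    """
    settings = MetricsSettings(**metrics_block)
    if not (settings.enabled or cli_flag):
        return []  # metrics are opt-in: nothing is active
    return [
        f.name
        for f in fields(settings)
        if f.name != "enabled" and getattr(settings, f.name)
    ]


# The YAML example above (request: false) leaves five categories active.
print(active_categories({"enabled": True, "request": False}))
```

Keeping the per-category switches separate from `enabled` means a category only has to be mentioned when it is being turned off.
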
### Available metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `feast_feature_server_cpu_usage` | Gauge | (none) | Process CPU usage (%) |
| `feast_feature_server_memory_usage` | Gauge | (none) | Process memory usage (%) |
| `feast_feature_server_request_total` | Counter | `endpoint`, `status` | Total requests per endpoint |
| `feast_feature_server_request_latency_seconds` | Histogram | `endpoint`, `feature_count`, `feature_view_count` | Request latency with p50/p95/p99 support |
| `feast_online_features_request_total` | Counter | (none) | Total online feature retrieval requests |
| `feast_online_features_entity_count` | Histogram | (none) | Entity rows per online feature request |
| `feast_push_request_total` | Counter | `push_source`, `mode` | Push requests by source and mode |
| `feast_materialization_total` | Counter | `feature_view`, `status` | Materialization runs (success/failure) |
| `feast_materialization_duration_seconds` | Histogram | `feature_view` | Materialization duration per feature view |
| `feast_feature_freshness_seconds` | Gauge | `feature_view`, `project` | Seconds since last materialization |

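The p50/p95/p99 support of the latency histogram comes from how Prometheus histograms work: the server exports cumulative bucket counts, and quantiles are estimated from them at query time. The sketch below is a simplified illustration of that estimate using linear interpolation inside the matching bucket, similar in spirit to PromQL's `histogram_quantile`. It is not Feast code, the function name is hypothetical, and the bucket boundaries are made up for the example.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, ending with the +Inf bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # The quantile lies above the highest finite bound,
                # so that bound is the best available answer.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            if count == prev_count:
                return prev_bound  # empty bucket, nothing to interpolate
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return math.nan


# Illustrative latency buckets in seconds (made-up numbers).
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(estimate_quantile(0.95, latency_buckets))
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies you care about.
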
### Scraping with Prometheus

```yaml
scrape_configs:
  - job_name: feast
    static_configs:
      - targets: ["localhost:8000"]
```

### Kubernetes / Feast Operator

Set `metrics: true` in your FeatureStore CR:

```yaml
spec:
  services:
    onlineStore:
      server:
        metrics: true
```

The operator automatically exposes port 8000 and creates the corresponding
Service port so Prometheus can discover it.

### Multi-worker and multi-replica (HPA) support

Feast uses Prometheus **multiprocess mode** so that metrics are correct
regardless of the number of Gunicorn workers or Kubernetes replicas.

**How it works:**

* Each Gunicorn worker writes metric values to shared files in a
  temporary directory (`PROMETHEUS_MULTIPROCESS_DIR`). Feast creates
  this directory automatically; you can override it by setting the
  environment variable yourself.
* The metrics HTTP server on port 8000 aggregates all workers'
  metric files using `MultiProcessCollector`, so a single scrape
  returns accurate totals.
* Gunicorn hooks clean up dead-worker files automatically
  (`child_exit` → `mark_process_dead`).
* CPU and memory gauges use `multiprocess_mode=liveall`, so Prometheus
  shows per-worker values distinguished by a `pid` label.
* Feature freshness gauges use `multiprocess_mode=max`, so Prometheus
  shows the worst-case staleness (all workers compute the same value).
* Counters and histograms (request counts, latency, materialization)
  are automatically summed across workers.

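The three aggregation behaviors in the list above can be illustrated with a toy model. This is a simplified sketch of the semantics only, not the `prometheus_client` implementation: the function name is hypothetical, and it assumes the input holds values from live workers only (files from dead workers having already been cleaned up).

```python
def aggregate_across_workers(samples_by_pid, mode):
    """Combine one metric's per-worker values into scrape output.

    samples_by_pid maps a live worker's PID to the value it recorded.
    Returns a list of (labels, value) pairs as a scrape would expose them.
    """
    if not samples_by_pid:
        return []
    if mode == "sum":
        # Counters and histogram buckets: one series with the total.
        return [({}, sum(samples_by_pid.values()))]
    if mode == "max":
        # Freshness gauges: one series with the largest (worst) value.
        return [({}, max(samples_by_pid.values()))]
    if mode == "liveall":
        # CPU / memory gauges: one series per worker, tagged with its pid.
        return [
            ({"pid": str(pid)}, value)
            for pid, value in sorted(samples_by_pid.items())
        ]
    raise ValueError(f"unsupported mode: {mode}")


# Two workers, PIDs 101 and 102 (made-up values).
requests_handled = {101: 3, 102: 5}
print(aggregate_across_workers(requests_handled, "sum"))
print(aggregate_across_workers({101: 12.5, 102: 40.0}, "liveall"))
```

Summing suits counters because each worker only sees its own share of the traffic, whereas a per-worker gauge such as CPU usage would be meaningless if added up.
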
**Multiple replicas (HPA):** Each pod runs its own metrics endpoint.
Prometheus adds an `instance` label per pod, so there is no
duplication. Use `sum(rate(...))` or `histogram_quantile(...)` across
instances as usual.

## Starting the feature server in TLS(SSL) mode

Enabling TLS mode ensures that data between the Feast client and server is transmitted securely. For an ideal production environment, it is recommended to start the feature server in TLS mode.

docs/reference/feature-store-yaml.md

Lines changed: 8 additions & 0 deletions
@@ -36,6 +36,14 @@ An example configuration:
```yaml
feature_server:
    type: local
    metrics:                      # Prometheus metrics configuration. Also achievable via `feast serve --metrics`.
        enabled: true             # Enable Prometheus metrics server on port 8000
        resource: true            # CPU / memory gauges
        request: true             # endpoint latency histograms & request counters
        online_features: true     # online feature retrieval counters
        push: true                # push request counters
        materialization: true     # materialization counters & duration histograms
        freshness: true           # per-feature-view freshness gauges
    offline_push_batching_enabled: true  # Enables batching of offline writes processed by /push. Online writes are unaffected.
    offline_push_batching_batch_size: 100  # Maximum number of buffered rows before writing to the offline store.
    offline_push_batching_batch_interval_seconds: 5  # Maximum time rows may remain buffered before a forced flush.
```
