google datasource support #2

Merged
ktsaou merged 1 commit into master from groogle_datasource
Apr 14, 2014

Conversation

@ktsaou ktsaou commented Apr 14, 2014

Member

Support for Google datasources with JSONP and the variant of JSON used by Google Charts.

The URL is

/datasource/...

It works exactly like

/data/...

but returns a JSONP response with the specifics required by the Google datasource specification.
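
For illustration, the envelope a Google datasource response uses can be sketched as below. This is a minimal sketch, not the code merged in this PR: the function names, the sample columns and the way the handler name and request id reach the function are all assumptions. Only the general shape (a call to a response handler, by default `google.visualization.Query.setResponse`, carrying `version`, `reqId`, `status` and a `table` of `cols` and `rows`) follows the Google Visualization datasource wire protocol.

```rust
/// Wraps an already-serialized data table in the JSONP envelope that the
/// Google Visualization API expects from a datasource.
///
/// `response_handler` and `req_id` normally come from the request; how the
/// server parses them is not shown in this PR, so they are plain arguments
/// here (an assumption of this sketch).
fn google_datasource_response(response_handler: &str, req_id: &str, table_json: &str) -> String {
    format!(
        "{}({{version:'0.6',reqId:'{}',status:'ok',table:{}}});",
        response_handler, req_id, table_json
    )
}

/// Builds a tiny two-column table in the Google Charts JSON dialect:
/// column descriptors under `cols`, one `{c:[{v:...},...]}` object per row.
/// The column ids and values are made up for the example.
fn sample_table() -> String {
    let rows = [(1u64, 0.5f64), (2, 0.75)];
    let body: Vec<String> = rows
        .iter()
        .map(|(t, v)| format!("{{c:[{{v:{}}},{{v:{}}}]}}", t, v))
        .collect();
    format!(
        "{{cols:[{{id:'time',label:'time',type:'number'}},{{id:'value',label:'value',type:'number'}}],rows:[{}]}}",
        body.join(",")
    )
}
```

Calling `google_datasource_response("google.visualization.Query.setResponse", "0", &sample_table())` yields a single JavaScript call that a browser can load through a `<script>` tag, which is what makes the response JSONP rather than plain JSON.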

ktsaou added a commit that referenced this pull request Apr 14, 2014
@ktsaou ktsaou merged commit 2f9e464 into master Apr 14, 2014
@ktsaou ktsaou deleted the groogle_datasource branch April 14, 2014 23:46
@danfinn danfinn mentioned this pull request Jul 26, 2016
2 tasks
ktsaou pushed a commit that referenced this pull request Sep 23, 2016
andvgal added a commit to tmpfork/netdata that referenced this pull request Apr 19, 2019
ilyam8 pushed a commit that referenced this pull request Apr 27, 2019
* NEW: Energi Core daemon monitoring, suits other Bitcoin forks

* Attempt to make Codacy checks happy

* Added energid sample configuration, enabled by default

* Post-review fixes & revised charts

* Added default Energi, Bitcoin and Dash to energid.conf configuration

* Fighting with Codacy markdown check

* Added energid protocol description comment

* Added JSON_RPC_VERSION as module variable

* Screw you Codacy markdown

* Screw you Codacy Markdown #2

* Screw you Codacy Markdown #3

* Finally, fixed what remark-lint wants

* Strict local remark-lint + plugins pass for README.md

* Attempt with another remark-lint configuration
thiagoftsm referenced this pull request in thiagoftsm/netdata May 5, 2019
…a#5894)

* NEW: Energi Core daemon monitoring, suits other Bitcoin forks

* Attempt to make Codacy checks happy

* Added energid sample configuration, enabled by default

* Post-review fixes & revised charts

* Added default Energi, Bitcoin and Dash to energid.conf configuration

* Fighting with Codacy markdown check

* Added energid protocol description comment

* Added JSON_RPC_VERSION as module variable

* Screw you Codacy markdown

* Screw you Codacy Markdown #2

* Screw you Codacy Markdown #3

* Finally, fixed what remark-lint wants

* Strict local remark-lint + plugins pass for README.md

* Attempt with another remark-lint configuration
vlegakis pushed a commit to vlware/netdata that referenced this pull request May 10, 2019
stelfrag referenced this pull request in stelfrag/netdata Nov 5, 2019
jackyhuang85 pushed a commit to jackyhuang85/netdata that referenced this pull request Jan 1, 2020
…a#5894)

* NEW: Energi Core daemon monitoring, suits other Bitcoin forks

* Attempt to make Codacy checks happy

* Added energid sample configuration, enabled by default

* Post-review fixes & revised charts

* Added default Energi, Bitcoin and Dash to energid.conf configuration

* Fighting with Codacy markdown check

* Added energid protocol description comment

* Added JSON_RPC_VERSION as module variable

* Screw you Codacy markdown

* Screw you Codacy Markdown netdata#2

* Screw you Codacy Markdown netdata#3

* Finally, fixed what remark-lint wants

* Strict local remark-lint + plugins pass for README.md

* Attempt with another remark-lint configuration
underhood referenced this pull request in underhood/netdata Feb 5, 2020
minor - Error already logs errno
erdem2000 pushed a commit that referenced this pull request Dec 20, 2021
erdem2000 pushed a commit to erdem2000/netdata that referenced this pull request Jan 20, 2022
erdem2000 added a commit to erdem2000/netdata that referenced this pull request Feb 7, 2022
erdem2000 pushed a commit to erdem2000/netdata that referenced this pull request Feb 20, 2022
erdem2000 added a commit to erdem2000/netdata that referenced this pull request Apr 9, 2022
erdem2000 added a commit to erdem2000/netdata that referenced this pull request Apr 18, 2022
erdem2000 added a commit to erdem2000/netdata that referenced this pull request May 19, 2022
vkalintiris added a commit that referenced this pull request Jan 11, 2024
* initial version

* basic GitHub Actions CI

* Create README.md

* run GitHub Action at least once weekly
to check that tests still pass when the distro was updated but the project
has not changed for a long time

* add test for rbuf_find_bytes

* minor - add .vscode to .gitignore

* minor - add CFLAGS to makefile

* minor - silence unused var warning

* minor - update readme

* add submodules

* initial commit

* add README

* allow setting MQTT LWT

* allow setting LWT QOS

* handle MQTT keep-alives properly

* allow choosing keep alive time

* handle WS_OP_CONNECTION_CLOSE

* allow sending other frames than WS_OP_BINARY_FRAME

* minor readability improvements

* work on graceful disconnect

* reset ws_client state on subsequent connections

* implement mqtt_wss_destroy

* more descriptive RC for mqtt_wss_service

* properly free/destroy SSL

* work on reconnect

* set return code for WS disconnect

* Less logging under normal operation

* readme point to `test.c` as how-to for now

* OpenSSL certificate checking by default

* reset poll_fds on reconnect

* fix older SSL versions

* empty install and dist targets

* fix LGTM warning

* add rbuf_get_capacity

* gh actions test

* ws_ping impl

* test.c - port as cmd line param

* properly handle WebSocket disconnect packet

* initial HTTPS proxy support

* remove base64 submodule, use OpenSSL

* change to urandom

* ws_client_process WS_RAW retval fix

* coverity fixes

* CID 1448838

* CID 1448836

* CID 1448829

* test check for error on init

* make it play nicer with automake projects

* MQTT-C coverity fixes

* Create codeql-analysis.yml

* Make typedef introduce a new name for struct. (#2)

* add autotools related files to gitignore

* minor - silence -Wmaybe-uninitialized warning

* reinit on MQTT clean session connect

* fix apple endianness functions

* add libcrypto for macos

* MacOS compatibility

* fix FreeBSD build

* use TLS SNI

* propagate buffer full EC to app layer

* store buffer sizes in mqtt_wss_client

* allow auto buffer growth on buffer too small

* quicker connection drop on BUFFER FULL

* always clear last_ec on connect

* parse all HTTP headers

* minor - rename constant for clarity

* rename idx and idx2 for code readability

* limit response header field count

* flush not needed, replace with descriptive err msg

* bump MQTT-C

* Blocking publish and inflight MQTT-C buffer growth

* update obsolete comment

* add missing unlock in error case

* fix EINTR

* initial MQTT 5 implementation

Implements minimal MQTT 5 features. Up to QoS1

* fixes for bugs in initial MQTT5 implementation

* initial statistics support

* fix vbi parser

* initial base http proxy auth support

* fix base64 helper for longer credentials

* minor - create with single allocation

* allow defining custom alloc functions for user

* Allow custom memory fncs and macros by user (#5)

* update crbuf module
* allow custom malloc, free, strdup, calloc ...

* use long long for till_next_keep_alive

* dont send PUBACK on QOS0 (#6)

* fix publish parser (#7)

* mark QoS0 as GC on send (#8)

* log extra info in case of OpenSSL error (#9)

* log extra info in case of SSL error

* fix build error with older SSL (#10)

* adds possibility to decrypt traffic with wireshark for debugging (#13)

* initial commit

* add fncs key:uint64 and data:opaque ptr

* add uint64_t key iterator

* add github test runner

* start working on proper tests

* Initial support for topic aliases (#12)

Add support for topic alias functionality for PUBLISH packets.
Also add support for parsing all MQTT properties, as opposed to just skipping and ignoring them (what we did previously).

* implement c_rhash_iter_str_keys + tests

* Fixes of Topic Alias implementation (#14)

* initial removal of mqtt-c support (#15)

* honor max msglen for server (#19)

* Update README.md (#20)

* memory align fragments (#21)

* Remove mqtt_websocket submodule

* Remove c-rbuf and c_rhash submodules

* Exclude mqtt_websockets from Codacy

It seems that it was excluded before merging
the mqtt_websockets submodule.

---------

Co-authored-by: Timotej Šiškovič <timotejs@gmail.com>
Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com>
Co-authored-by: Emmanuel Vasilakis <mrzammler@mm.st>
thiagoftsm pushed a commit that referenced this pull request May 14, 2024
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 1 to 2.3.1.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v1...v2.3.1)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@ktsaou ktsaou mentioned this pull request Oct 1, 2025
kanelatechnical pushed a commit that referenced this pull request Feb 26, 2026
ktsaou added a commit that referenced this pull request May 8, 2026
* sow: reopen SOW-0014 (network flows documentation regression)

The 2026-05-07 closure of SOW-0014 was premature. The learn netlify
deploy preview for PR #2852 surfaced major content errors that the
prior validation pass missed: multiple statements that contradict
the source code, generic flow-monitoring advice imported from research
notes that does not apply to Netdata, several invented behaviours, and
structural choices that read as academic / generic guidance rather
than as a practical Netdata-specific guide.

Move file from done/ back to current/, set Status: in-progress, and
append a `## Regression - 2026-05-07` section with:

- 21 findings transcribed verbatim from the user (F1..F21);
- code-citation verdicts for each (per-flow sampling multiplication
  at decoder/record/core/record.rs:24-26 confirms F4/F5/F15 wrong;
  template persistence at decoder/protocol/{v9,ipfix}/templates.rs
  confirms F14 wrong; etc.);
- root cause analysis (subagents extracted data accurately but missed
  behavioural framing claims; codex review focused on autocomplete
  code, not docs prose; validation evidence was structural, not
  semantic; closure was driven by "all phases done" rather than
  "all claims true");
- a three-phase repair plan: per-finding fixes one at a time with
  no batching (R1), per-page audit subagents that re-run until clean
  (R2), final close with a whole-section review (R3).

The SOW remains in current/ until every F1..F21 has a fix landed
with code citation, every page passes a per-page audit, and a
whole-section review returns no new findings.

* F1 docs/.map: render Network Flows Overview as the section landing page

The /docs/network-flows URL rendered as an auto-generated tile grid
because the section root meta block carried only `label:` -- no leaf
content. Learn's `get_dir_make_file_and_recurse` then synthesised a
category index page from the children.

Hoist `edit_url:` and `description:` to the section root, pointing at
the existing README.md. Drop the redundant child "Overview" entry that
pointed to the same file.

Pattern reference: every section that renders an Overview at its root
URL on learn.netdata.cloud (Collecting Metrics, Dashboards and Charts,
Netdata Cloud, Welcome to Netdata, etc.) carries `edit_url:` directly
on the section root.

Logged as F1 in SOW-0014 regression log with code references.

* F2+F3 docs(network-flows): correct doubling vs bidirectional symmetry

Two related findings, fixed together because they touched the same
paragraphs.

- The "doubling" effect (per-packet ingress+egress accounting on a
  single router) was conflated with bidirectional traffic symmetry.
- The doubling fix said "filter by one exporter, one interface, in
  one direction". The "in one direction" is redundant on top of "one
  interface" and misleads readers into expecting another 50% halving.
- The bidirectional-traffic explanation said "when you see traffic
  X-to-Y and Y-to-X of similar volume, that's one conversation, not
  two". Bidirectional conversations are usually asymmetric (downloads
  vs ACKs), so "similar volume" is wrong as an identification
  heuristic.

Rewritten:

- Doubling fix is now: one exporter + one interface (Input Interface
  OR Output Interface, pick one). Each packet crossing that interface
  produces exactly one record on it.
- The mirror-conversation section is renamed and reframed: separate
  packets in each direction, separate records, typically asymmetric
  volumes. Per-direction accounting, not duplication.

Files touched:
- README.md, quick-start.md: paragraphs rewritten.
- summary-sankey.md, anti-patterns.md, validation.md: "in one
  direction" lines fixed in place. Anti-patterns / validation will
  be rewritten more broadly under F14-F17 but the wrong claims
  are removed now.

Logged as F2+F3 in SOW-0014 regression log.

* F4+F5 docs(network-flows): correct sampling-rate framing (uniform-rate myth)

Two related findings, fixed together because they are the same wrong
claim repeated across the documentation. Source-code reality:

  src/crates/netflow-plugin/src/decoder/record/core/record.rs:24-26
  let sampling_rate = rec.sampling_rate.max(1);
  rec.bytes = rec.bytes.saturating_mul(sampling_rate);
  rec.packets = rec.packets.saturating_mul(sampling_rate);

`sampling_rate` is set per-record from each protocol's appropriate
source (legacy header, v9 IE / Sampling Options Template, IPFIX IE /
options, sFlow per-sample rate, or static override). Multiplication
runs PER FLOW at decode time. Mixed sampling rates across exporters,
interfaces, or time are handled correctly automatically.
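
A self-contained sketch of the per-flow scaling described above. The three lines inside `apply_sampling` are the ones quoted from record.rs; the `FlowRecord` struct, its `u64` field types and the `estimated_total_bytes` helper are assumptions added so the example compiles on its own.

```rust
/// Minimal stand-in for a decoded flow record (hypothetical; the real
/// struct carries many more fields).
#[derive(Debug, Clone, PartialEq)]
struct FlowRecord {
    bytes: u64,
    packets: u64,
    /// Rate attached to this record; 0 means the exporter reported none.
    sampling_rate: u64,
}

/// Scales one record's counters by its own sampling rate.
/// A missing rate (0) is clamped to 1, so the counters stay unchanged.
fn apply_sampling(rec: &mut FlowRecord) {
    let sampling_rate = rec.sampling_rate.max(1);
    rec.bytes = rec.bytes.saturating_mul(sampling_rate);
    rec.packets = rec.packets.saturating_mul(sampling_rate);
}

/// Scales every record in place and returns the estimated total bytes.
/// Records with different rates mix freely because each one is scaled
/// before any aggregation happens.
fn estimated_total_bytes(records: &mut [FlowRecord]) -> u64 {
    for rec in records.iter_mut() {
        apply_sampling(rec);
    }
    records
        .iter()
        .fold(0u64, |acc, rec| acc.saturating_add(rec.bytes))
}
```

With one record sampled at 1:1000, one unsampled and one with no rate at all, the total is simply the sum of the three individually scaled values. The third record is the undercounting case kept in the docs: a rate of 0 is clamped to 1, so the plugin cannot scale it up.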

Removed false claims wherever they appeared:

- README.md "What sampling does to your numbers" -- "works correctly
  only if all your exporters use the same sampling rate" and "the
  clean path: keep sampling rates uniform across your network".
  Rewrote the paragraph to state per-flow multiplication, explain
  why the UI does not surface a single rate (mixed rates have no
  meaningful display value; uniform rates are already known to the
  operator), and keep the real statistical-floor caveat (sampling
  can miss small / short flows regardless of rate uniformity).

- field-reference.md and anti-patterns.md -- `RAW_BYTES` no longer
  framed as "use when sampling is uniform". Now correctly framed as
  the literal pre-multiplication value the exporter sent.

- troubleshooting.md "Bandwidth doesn't match SNMP" -- "Mixed
  sampling rates ... isn't comparable to any single SNMP
  measurement" replaced with the actual mistake (comparing
  aggregates of many interfaces to a single interface SNMP counter).
  Per-flow multiplication is correct regardless of rate uniformity.

- validation.md -- "undocumented sampling rate changes" dropped from
  the silent-failure intro; the "Sampling rate change" monitoring
  table row removed (per-flow multiplication absorbs rate changes).

- investigation-playbooks.md -- "Sampling rate of the exporter (so
  the numbers can be interpreted)" deliverable removed and "A
  change in sampling rate during the analysis window invalidates
  the trend" caveat removed. Both wrong under per-flow scaling.

- anti-patterns.md cross-protocol-counts section -- "Same goes for
  sampling-rate differences across exporters" removed; the
  protocol-counts-not-comparable point stays.

The F2/F3 doubling-fix wording ("filter by exporter + interface +
direction") was also wrong and got cleaned in the same anti-patterns
summary table row.

What stays: NetFlow v7 / v5 with rate=0 / v9 / IPFIX without a
Sampling Options Template are real cases where the plugin sees no
rate and undercounts. Those remain documented as the actual silent
failure mode.

Items deferred to F14 / F15 which rewrite their containing sections:
- validation.md silent-failure items #2, #3, #5 (F14 removes them as
  a block).
- anti-patterns.md "Ignoring the sampling rate" section + its
  summary-table row (F15 removes the section entirely).

Logged as F4+F5 in SOW-0014 regression log with code references.

* F6 docs(network-flows): remove "Globe less useful for analysis" judgement

The globe and city map render the same query response with the same
table beneath. The 3D projection is not "less useful for analysis";
it is a different rendering with different strengths.

Replaced both occurrences with a neutral framing that the 2D city map
is best for in-continent precision and the 3D globe is best when
distance and great-circle paths matter (transcontinental traffic,
undersea cables, intercontinental CDN routing).

While here, fixed the same page's "Mirroring" subsection to drop the
F2 symmetry myth ("25 top-N = 12 conversations" implied a 1:1 pairing
of A->B and B->A by volume; bidirectional traffic is usually
asymmetric).

Logged as F6 in SOW-0014 regression log.

* F7 docs(network-flows): correct UI location of the Network Flows view

The docs claimed "the Network Flows tab should appear in the top
navigation". The actual Netdata UI exposes Network Flows as a
Function under the **Live** tab. Verified against
docs/dashboards-and-charts/live-tab.md.

Adopted convention:
- "Open Network Flows" (verb)
- "the Network Flows view" (noun)
- "Click the Live tab in the top navigation; Network Flows appears
  in the Functions list on the right" (setup context).

Swept every "Network Flows tab" reference in:
- installation.md (the original bad sentence + follow-up).
- troubleshooting.md, investigation-playbooks.md, anti-patterns.md.
- visualization/dashboard-cards.md (4 occurrences).
- visualization/summary-sankey.md.

Logged as F7 in SOW-0014 regression log.

* F8 netflow-plugin: per-tier retention only; remove journal-level globals

Cleans up an unjustified schema redundancy. Today the plugin accepts
both top-level `size_of_journal_files` / `duration_of_journal_files`
under `journal` AND per-tier values under `journal.tiers.<tier>`. The
runtime already uses only per-tier values; the global was just a
default that flowed through `retention_for_tier()`. The two paths gave
operators a configuration surface with no underlying behavioural
difference.

After:
- `JournalConfig` carries only `tiers: JournalTierRetentionOverrides`
  (plus journal_dir and the query guardrails). No globals.
- Each `JournalTierRetentionConfig` is `Option<ByteSize> /
  Option<Duration>` directly. Omitted fields fall back to the
  built-in tier defaults (uniform 10GB / 7d, preserving today's
  default behaviour). Explicit `null` disables that limit on that
  tier; validation still requires at least one positive limit.
- The `RetentionLimitOverride<T>` enum, its serializer, deserializer,
  and `resolve` helpers are removed -- no longer reachable.
- The orphan `parse_bytesize` helper that fed the removed clap
  `value_parser` is gone.
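
A rough sketch of the per-tier semantics listed above, for readers who want to see the three cases (omitted, explicit `null`, explicit value) side by side. The type and function names are hypothetical, sizes are plain byte counts instead of `ByteSize`, and the decimal reading of "10GB" and "100MB" is an assumption; this is not the crate's code.

```rust
use std::time::Duration;

/// Assumed decimal interpretation of the 100MB minimum tier size.
const MIN_TIER_SIZE_BYTES: u64 = 100_000_000;

/// Retention limits for one tier. `None` means the limit is disabled,
/// which is what an explicit `null` in the config stands for.
#[derive(Debug, Clone, PartialEq)]
struct TierRetention {
    size_bytes: Option<u64>,
    duration: Option<Duration>,
}

impl Default for TierRetention {
    /// What an omitted tier block falls back to: 10GB / 7d.
    fn default() -> Self {
        TierRetention {
            size_bytes: Some(10_000_000_000),
            duration: Some(Duration::from_secs(7 * 24 * 60 * 60)),
        }
    }
}

/// Requires at least one positive limit and rejects sizes below 100MB.
fn validate(tier: &TierRetention) -> Result<(), String> {
    let size_ok = matches!(tier.size_bytes, Some(s) if s > 0);
    let time_ok = matches!(tier.duration, Some(d) if !d.is_zero());
    if !size_ok && !time_ok {
        return Err("at least one positive retention limit is required".to_string());
    }
    if let Some(size) = tier.size_bytes {
        if size < MIN_TIER_SIZE_BYTES {
            return Err("tier size must be at least 100MB".to_string());
        }
    }
    Ok(())
}
```

A time-only tier (`size_bytes: None` with the default duration) passes validation, while a tier with both limits disabled or with a size of a few kilobytes is rejected.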

Tests rewritten to exercise the per-tier-only schema:
- `journal_tier_retention_uses_built_in_tier_defaults`
- `journal_tier_retention_uses_per_tier_values_when_present`
- `journal_rotation_size_derives_from_tier_size_budget`
- `journal_rotation_size_uses_100mb_for_time_only_retention`
- `journal_validation_rejects_tier_size_below_100mb`
- `journal_validation_allows_time_only_retention_when_size_is_disabled`
- `journal_tier_retention_null_disables_size_limit_for_that_tier_only`

Plus mechanical updates to memory_tests / startup_memory_tests to use
the new schema. Full crate: 427 passed, 0 failed.

Documentation:
- configuration.md: rewrote the `## journal` section with a
  per-tier-only schema. Updated the production retention profile
  example. Replaced the existing "Top-level retention" subsection
  with a "Per-tier retention" subsection.
- retention-querying.md: dropped the global-form example; per-tier
  example only; cross-link to configuration.md.

Breaking change notice: any existing user config using
`journal.size_of_journal_files` / `journal.duration_of_journal_files`
at the top level will now fail to deserialize (deny_unknown_fields).
Migrate by moving those values under `journal.tiers.<tier>.*`. The
plugin was shipped only recently (PR #22439, 2026-05-07), so the
breaking-change risk is low.

Logged as F8 in SOW-0014 regression log.

* F9 netflow-plugin: remove dead query_1m_max_window / query_5m_max_window

Code investigation: these two journal config knobs were declared,
validated, and exposed in YAML, but nothing in `src/.../query/`
ever read them. The actual tier auto-pick logic in
`query/planner/spans.rs::plan_query_tier_spans_recursive` selects
the coarser tier strictly from window / bucket-duration alignment
math -- it does not consult either knob.

Verdict: dead schema. Removed both:
- field declarations on `JournalConfig`
- non-zero / ordering validation in validate_journal
- two YAML test fixtures that mentioned them
- the two configuration.md table rows + code-block lines + the
  "query-window limits" explanation
- the retention-querying.md sentence that referenced them

Updated the retention-querying explanation of tier auto-pick to
match the actual behaviour: the planner uses bucket alignment, not
config-driven window caps.

Build + tests: 427 passed, 0 failed.

Breaking change notice: any user config that set
`journal.query_1m_max_window` or `journal.query_5m_max_window` will
now fail to deserialize (deny_unknown_fields). The keys had no
effect before; migration is delete-only.

Logged as F9 in SOW-0014 regression log.

* F10 netflow-plugin: keep query_max_groups, drop dead query_facet_max_values_per_field; document properly

Two journal config knobs, opposite verdicts after code investigation:

- query_max_groups: REAL. Read at query/service.rs:52 and threaded into
  the projected group accumulator via query/projected/apply.rs:48. When
  the accumulator's grouped_total() exceeds the limit, additional group
  keys are folded into a synthetic __overflow__ bucket
  (query/grouping/labels.rs:17, query/grouping/model/compact.rs:35) and
  the response carries a warning ("Group accumulator limit reached;
  additional groups were folded into __overflow__" at
  query/timeseries.rs:124). Bounds memory on accidentally wide group-by
  combinations. Keep; document properly.

- query_facet_max_values_per_field: DEAD. Declared, validated for
  non-zero, but the consumer at query/facets/render.rs:19,27 uses the
  hardcoded constant DEFAULT_FACET_ACCUMULATOR_MAX_VALUES_PER_FIELD
  (query/request/constants.rs:17) instead of the config knob. The two
  coincidentally have the same default value (5000) but the config knob
  is never threaded to the consumer. Remove.
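
The overflow behaviour of query_max_groups can be pictured with the sketch below. It is an illustration only: `GroupAccumulator` and its methods are hypothetical names, the per-group value is reduced to a byte counter, and whether the real limit counts the overflow bucket itself is not stated in this commit. The `__overflow__` key and the warning text are the ones quoted above.

```rust
use std::collections::HashMap;

const OVERFLOW_KEY: &str = "__overflow__";

/// Accumulates bytes per group key while capping the number of distinct groups.
struct GroupAccumulator {
    max_groups: usize,
    groups: HashMap<String, u64>,
    overflowed: bool,
}

impl GroupAccumulator {
    fn new(max_groups: usize) -> Self {
        GroupAccumulator { max_groups, groups: HashMap::new(), overflowed: false }
    }

    /// Number of real groups, not counting the synthetic overflow bucket.
    fn real_groups(&self) -> usize {
        self.groups.len() - usize::from(self.groups.contains_key(OVERFLOW_KEY))
    }

    /// Adds `bytes` to `key`. A key that is new once the limit has been
    /// reached is folded into the overflow bucket instead.
    fn add(&mut self, key: &str, bytes: u64) {
        let known = self.groups.contains_key(key);
        let target = if known || self.real_groups() < self.max_groups {
            key
        } else {
            self.overflowed = true;
            OVERFLOW_KEY
        };
        *self.groups.entry(target.to_string()).or_insert(0) += bytes;
    }

    /// Warning to attach to the response when any group was folded.
    fn warning(&self) -> Option<&'static str> {
        if self.overflowed {
            Some("Group accumulator limit reached; additional groups were folded into __overflow__")
        } else {
            None
        }
    }
}
```

The design point is that memory stays bounded by the limit while totals stay correct: nothing is dropped, the excess is only reported under one synthetic key.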

Code:
- types/journal.rs: removed the dead field; added a doc comment for
  query_max_groups explaining the __overflow__ bucket and the warning.
- defaults.rs: removed default_query_facet_max_values_per_field().
- validation/journal.rs: removed the non-zero check for the dead knob.
- plugin_config_tests.rs: removed
  validate_rejects_zero_query_facet_max_values_per_field test entirely;
  cleaned YAML fixtures.

Stock config + README:
- src/crates/netflow-plugin/configs/netflow.yaml: rewrote the journal
  block to use the per-tier retention form (carries over the F8 schema
  in the stock config); dropped both dead knobs; added clear comments
  for query_max_groups.
- src/crates/netflow-plugin/README.md: example updated, explanatory
  paragraph rewritten to describe what query_max_groups actually does.

Docs:
- configuration.md: Query guardrails table now lists only
  query_max_groups, with full description of overflow behaviour.
- retention-querying.md: Group-by limit section consolidated.
- visualization/filters-facets.md: removed the "Facet limits"
  subsection that documented the dead knob.

Build + tests: 426 passed, 0 failed (one dead-knob validation test
removed).

Breaking change notice: any user config setting
journal.query_facet_max_values_per_field will now fail to deserialize.
The key had no effect before; migration is delete-only.

Logged as F10 in SOW-0014 regression log.

* F11 docs(network-flows): author the empty IP Intelligence concept page

The file existed as 0 bytes since the original documentation rewrite.
The netlify deploy preview rendered it as an empty page. Multiple other
pages cross-link to it (asn-resolution, static-metadata, network-identity,
README, configuration, the four provider integration cards), so the
empty placeholder was both a UX failure and a coverage gap.

Authored from scratch, code-grounded against:
- src/.../plugin_config/types/enrichment/geoip.rs (config schema)
- src/.../plugin_config/runtime.rs (auto-detect path)
- src/.../enrichment/data/geoip/resolver.rs (load + 30s refresh + per-IP
  composing-multiple-databases lookup)
- src/.../enrichment/data/network/asn.rs (AS-name rendering)

Page covers: the fields IP intelligence populates (with tier-preservation
notes), the configuration schema, auto-detection, refresh cadence, lookup
order vs the broader ASN provider chain, the four provider integration
cards (DB-IP / MaxMind / IPtoASN / Custom), private-IP rendering,
IPv6/IPv4 database split behaviour, staleness and accuracy caveats, and a
failure-modes table.

Frontmatter `learn_rel_path` matches the bgp-routing / network-identity
siblings ("Network Flows/Enrichment Concepts") for now -- the source
frontmatter is informational; the actual sidebar position derives from
docs/.map/map.yaml, and F20 will rename the section consistently.

Logged as F11 in SOW-0014 regression log.

* F20 docs(network-flows): rename "Enrichment Concepts" to "Flows Enrichment"

User: "'Encrichement Concepts' is a wrong title. 'Flows Enrichement' is
the right one."

Renamed in:
- docs/.map/map.yaml line 499 (the section label that drives the actual
  sidebar position on Learn).
- All seven `learn_rel_path` frontmatter values across
  docs/network-flows/enrichment/*.md. Prior state was inconsistent (4
  files had "Network Flows/Enrichment", 2 had "Network Flows/Enrichment
  Concepts", 1 had the F11-introduced "Flows Enrichment"). Settled on
  the canonical "Network Flows/Flows Enrichment" everywhere.

Logged as F20 in SOW-0014 regression log.

* F21 docs(integrations): rename "Sources" sub-category to "Flow Protocols"

User: "'Sources' is too generic. 'Flow Protocols' is the right one."

Renamed in:
- integrations/categories.yaml: flows.sources.name now "Flow Protocols".
- The three protocol-card frontmatter values
  (`netflow.md`, `ipfix.md`, `sflow.md`) now declare
  `learn_rel_path: "Network Flows/Flow Protocols"`.
- src/crates/netflow-plugin/metadata.yaml: removed three
  self-referencing learn URLs that pointed at
  /docs/network-flows/sources/{netflow,ipfix,sflow}. These links
  were broken before the rename (no /docs/network-flows/sources
  directory exists in source) and would stay broken under the new
  label. Replaced with the surviving "Network Flows Overview"
  anchor.
- Re-ran integrations/gen_integrations.py + gen_docs_integrations.py
  to regenerate the three protocol cards. Both exit clean.

Logged as F21 in SOW-0014 regression log.

* F18 docs(network-flows): journalctl --namespace netdata everywhere

User: "Netdata logs in namespace 'netdata'. Journalctl needs
`--namespace netdata`."

`-u netdata` selects the systemd UNIT and captures only stdout/stderr
the unit emits to the journal. Netdata writes structured logs into a
journal NAMESPACE called `netdata`. Without `--namespace netdata`,
users see at most unit-level startup/shutdown messages -- not the
plugin output that helps with debugging.

Swept all `journalctl -u netdata` invocations to
`journalctl --namespace netdata` across:
- quick-start.md
- troubleshooting.md (5 occurrences)
- installation.md
- enrichment/network-identity.md

Grep clean afterwards.

Logged as F18 in SOW-0014 regression log.

* F15 docs(network-flows): remove "Ignoring the sampling rate" anti-pattern

User: "How is it possible for users to ignore the sampling rate if we
calculate the estimated volume at ingestion? You invented reasons for
it. ... section must be removed."

The premise was wrong on multiple counts: per-flow multiplication is
always consistent (each record carries its own rate), users CAN'T
"ignore" the rate because Netdata applies it automatically at decode
time, and the "uniform rates required" framing is exactly the myth F4
+ F5 already corrected.

The two real concerns the section conflated remain documented
elsewhere:
- small flows missed at high sampling rates -- preserved in the
  Overview's "What sampling does to your numbers" section and in
  investigation-playbooks "Caveats".
- exporter sends no rate (v7 / v5 rate=0 / v9-IPFIX without Sampling
  Options Template) -- preserved in troubleshooting "Bandwidth doesn't
  match SNMP" and in validation.md.

Removed the entire section. The section renumbering will land with
F17 once all three section removals have completed.

Logged as F15 in SOW-0014 regression log.

* F16 docs(network-flows): remove "Trusting GeoIP for internal IPs" anti-pattern

User: "Geolocation does not position internal IPs on the map. ...
section must be removed."

Code-verified at
src/crates/netflow-plugin/src/enrichment/data/geoip/decode.rs:40-72.
`apply_geo_record` writes country/state/city/latitude/longitude only
when the MMDB record carries non-empty values for those fields. For
RFC 1918 / private IPs, the MMDB either has no entry or has one
tagged `ip_class: "private"` with no country/city/coords. Internal
IPs simply do not appear on geographic maps. The "in random
countries" claim was invented.
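
The "write only when non-empty" rule can be sketched as follows. The struct layouts and helper names are hypothetical, and the field set is a subset (state is left out); the sketch only mirrors the behaviour described above and is not the contents of decode.rs.

```rust
/// Subset of what an MMDB lookup might return (hypothetical layout).
/// Every field may be absent or empty.
#[derive(Default)]
struct MmdbRecord {
    country: Option<String>,
    city: Option<String>,
    latitude: Option<f64>,
    longitude: Option<f64>,
}

/// Geographic fields attached to a flow endpoint (hypothetical layout).
#[derive(Debug, Default, PartialEq)]
struct FlowGeo {
    country: Option<String>,
    city: Option<String>,
    coordinates: Option<(f64, f64)>,
}

/// Returns the trimmed string only when it carries content.
fn non_empty(value: &Option<String>) -> Option<String> {
    value
        .as_deref()
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
}

/// Copies only the fields the record actually provides.
/// No record, or a record with empty fields, leaves the flow untouched.
fn apply_geo_record(flow: &mut FlowGeo, record: Option<&MmdbRecord>) {
    if let Some(record) = record {
        if let Some(country) = non_empty(&record.country) {
            flow.country = Some(country);
        }
        if let Some(city) = non_empty(&record.city) {
            flow.city = Some(city);
        }
        if let (Some(lat), Some(lon)) = (record.latitude, record.longitude) {
            flow.coordinates = Some((lat, lon));
        }
    }
}

/// A flow endpoint can be placed on a map only when coordinates were written.
fn can_be_mapped(flow: &FlowGeo) -> bool {
    flow.coordinates.is_some()
}
```

Under this rule a private address, whether it has no database entry or an entry without location fields, ends up with no coordinates and therefore never reaches the map.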

Removed the entire section. The renumbering lands with F17.

The troubleshooting.md "Internal IPs in random countries" subsection
carries the same invented claim and will be addressed under F19.

Logged as F16 in SOW-0014 regression log.

* F17 docs(network-flows): remove "Alerting on absolute volume thresholds" + renumber

User: "Netdata does not support alerting of flows yet. Remove this
section."

The section's own footnote already acknowledged the issue: "Netdata's
alerting on flow data is in development; for now this pattern lives in
your monitoring practice, not in the plugin." So it was advice for
users to apply outside Netdata, not a Netdata anti-pattern.

Plus: with F15, F16, F17 all deleting sections, this commit renumbers
the anti-patterns.md sections to be sequential again (1 ... 9). Dropped
three rows from the summary table at the bottom (Ignored sampling,
GeoIP for internal IPs, Absolute thresholds) that referred to the
removed sections. Updated the cross-link in
visualization/time-series.md that called out
"time-shifted comparison beats absolute thresholds" -- replaced with a
general anti-patterns reference since the specific section is gone.

Logged as F17 in SOW-0014 regression log.

* F19 docs(network-flows): troubleshooting cumulative cleanup

User: "This page has a mix of all the above issues: sampling, geoip,
etc."

Surgical fixes after F2-F18 land:

- Removed "Internal IPs in random countries" subsection -- same
  invented claim as F16 (GeoIP does not position internal IPs on
  maps; code-verified at
  src/.../enrichment/data/geoip/decode.rs:40-72).
- Rewrote "Things that look like bugs but aren't" entries to remove
  the F2/F3 "filter to one direction" framing and the F16 GeoIP
  myth. The two doubling-related bullets now point at
  exporter+interface filtering and at Source/Destination ASN
  filtering for bidirectional. Renamed "tier-0" -> "raw-tier" for
  consistency with the field reference.

Items kept (framing already correct after earlier findings):
- "Sampling rate not honoured by the exporter" -- correct (F4/F5).
- Doubling references in the SNMP-mismatch table (F2/F3).
- ASN provider chain debug recipe.
- Decapsulation destructive-on-non-tunnel.

Items deferred to per-page audit (R2):
- "Cisco's default template refresh is 30 minutes" -- vendor-
  specific claim; verify against current Cisco IOS-XE/XR docs in R2.

Logged as F19 in SOW-0014 regression log.

* F14 docs(network-flows): validation.md rewrite to remove invented risks

User: "I think the entire 'Validation and Data Quality' is completely
off. It mentions again sampling rates, etc."

Code-verified facts driving the rewrite:
- Per-flow sampling multiplication at decode time
  (decoder/record/core/record.rs:24-26): users don't need to monitor
  "sampling rate change" or "sampling rate misinterpretation".
- Template persistence across restarts
  (decoder/protocol/v9/templates.rs:106 +
  decoder/protocol/ipfix/templates/data.rs:67): users don't need to
  monitor "template loss after collector restart".
- UDP buffer overflow alert already exists
  (src/health/health.d/udp_errors.conf:6-19): kernel-level UDP drops
  are signalled by an existing system alert, not a silent failure.

Rewrite:
- New "What you actually need to watch" table -- five real failure
  modes (kernel UDP drops via existing alert, exporter stopped
  sending, wrong interfaces being exported, exporter sampling without
  communicating the rate, stale MMDB).
- Removed the three invented silent-failure items (sampling
  misinterpretation, sampling change, template loss).
- Removed the "Internal IP enrichment validation" section (F16
  confirmed GeoIP does not position internal IPs).
- Renamed "Sampling rate sanity check" to "Sampling rate
  verification" with the uniform-rate myth gone; kept the practical
  RAW_BYTES vs BYTES comparison.
- Removed the "Template cache health" subsection.
- Renamed the alerting table to "Plugin-side signals worth alerting
  on" and clarified these are signals the plugin exposes for the
  operator, not "silent failures" the dashboard hides.

Logged as F14 in SOW-0014 regression log.

* F13 docs(network-flows): rewrite Sizing and Capacity Planning as a practical guide

User: "People want sizing and planning directions. This is not an
academic paper, not a blog."

Rewrote the page from scratch around the user's seven requirements:
- plugin cap (single-thread post-decode; ~25k flows/s sustained;
  ISP-scale anchor)
- how ingestion rate maps to storage (single table, 4 rows;
  ~800 bytes/flow empirical)
- raw tier dominates; bound it; example per-tier production config
- fast NVMe is the right call for the raw tier; slow storage means
  shorter retention
- memory: routing-trie footprint + page-cache headroom
- query speed: indexed fields fast; FTS = full scan of raw tier
- distributed deployment as the scaling answer (one agent per
  router/site; federated via Netdata Cloud; no central aggregation
  needed for flow data)

Removed:
- All benchmark tables and methodology. Engineering benchmark
  numbers remain in src/crates/netflow-plugin/README.md.
- The "Bounding storage for capacity planning" formula derivation
  (ignored tier rollover and dedup; partly invalid).

Logged as F13 in SOW-0014 regression log.

* F12+F22 docs(network-flows): split retention/querying; add Visualization Overview

User on F12: "Retention is closer to configuration and querying is
closer to visualization. ... If you need to put generic visualization
rules, these should be a generic 'Visualization/Overview' page, to
explain FTS, sharing, grouping, etc."
User on F22: "The 'Section index' in the overview page is not needed.
Learn already shows the index as a side bar."

F12:
- New visualization/overview.md page collects "how queries work",
  "group-by limit and overflow", "full-text search", "URL sharing",
  filtering pointers, "picking the right view".
- retention-querying.md slimmed to retention-only (tiers, what
  survives rollup, tier auto-pick, "no data", what forces raw tier,
  default retention misconfig). Sidebar label renamed to "Retention
  and Tiers".
- map.yaml: Visualization sub-section root now carries edit_url +
  description pointing at visualization/overview.md (matches the F1
  pattern). Retention sidebar label renamed.

F22:
- Removed the "## Section index" block from README.md. The Learn
  sidebar already renders the same hierarchy. The "Where to start"
  role-based pointer block stays (not a sidebar duplicate). The
  "specific feature in depth" bullet now points readers at the
  sidebar.

Logged as F12 + F22 in SOW-0014 regression log.

* docs(network-flows): drop the trailing "use the sidebar" bullet from Overview

Per user: the bullet was redundant with the F22 cleanup. Sidebar
guidance is implicit; the four role-based bullets above are the
intended "where to start" entry points.

* docs(network-flows): address Phase R2 Round-1 audit findings

Documentation audits against source code surfaced a small number of
inaccuracies and a few low-severity polish items across the most-edited
pages. Fixed in-place, with the same surgical-edit policy as Phase R1.

Critical/high severity:

- configuration.md: drop the false claim that listener / protocols /
  journal keys can also appear at the top level. The flatten attribute
  is clap-only (CLI flag flatten), not serde; with deny_unknown_fields
  the YAML schema rejects unknown top-level keys. Stock file uses the
  nested form.
- validation.md: replace the wrong `dRcv` ss column reference with the
  actual `d<N>` value inside the `skmem:(...)` line (the sock_drop
  counter from iproute2 ss output).
- validation.md: replace the unreachable RAW_BYTES vs BYTES dashboard
  comparison with the supported approach -- group by the Sampling Rate
  field. RAW_BYTES is filtered from supported_flow_field_names and is
  not surfaced as a default table column.

Medium severity:

- README.md: rename "Source ASN" to "Source AS Name" in the default
  group-by description; the actual default uses SRC_AS_NAME, which the
  dashboard renders as "Source AS Name". Same fix swept through
  quick-start.md, investigation-playbooks.md, troubleshooting.md,
  visualization/time-series.md, visualization/summary-sankey.md so the
  doc text matches the dashboard label.
- README.md, quick-start.md, anti-patterns.md, validation.md,
  visualization/summary-sankey.md: soften the "doubling by default"
  framing. Both ingress + egress export is a common configuration but
  not a property of the protocol, and vendor best practice is
  ingress-only. Wording now reflects that.
- validation.md: note the udp_errors alert ships as `to: silent` by
  default; operators must override `to:` to receive notifications.
- validation.md: add a `du -sh` example for cross-checking on-disk
  tier sizes (cross-link to sizing-capacity.md).

Low-severity polish:

- README.md: classifier expression language is an Akvorado-compatible
  subset (matches classifiers.md framing).
- README.md: rollup tier note now mentions the dropped fields, so the
  tier auto-pick claim is not over-broad.
- sizing-capacity.md: drop the "after rotation and compression" qualifier
  on the 800-bytes/flow figure (the bench window is too short to reflect
  rotation cycles); rephrase the ingest description and the "spinning
  rust" sentence; soften the BMP/BioRIS RSS guidance to a rough estimate
  with bench numbers anchored.
- configuration.md: document the query_max_groups / query-max-groups
  alias; document enrichment.geoip.optional and the abort-vs-warn
  semantics; clarify that default_sampling_rate and override_sampling_rate
  both accept a single integer or a per-prefix map; mention the 100 MB
  rotation-size fallback when size_of_journal_files is null.
- anti-patterns.md: add the missing "What it costs" line to sections 8
  and 9 for shape consistency with sections 1-7.

* docs(network-flows): address Phase R2 Round-2 audit findings

Second round of per-page audits against source code surfaced several
critical inaccuracies plus the usual long tail of low-severity polish.

Critical / high:

- retention-querying.md: rewrite the tier auto-pick rules. The previous
  thresholds were inverted ("8h20m and longer -> 1-hour") -- the actual
  planner walks coarsest first and accepts the first tier with at least
  100 aligned buckets, so >=100h -> 1h, 8h20m..<100h -> 5m,
  100min..<8h20m -> 1m. Verified at
  src/crates/netflow-plugin/src/query/planner/timeseries.rs:34-46 and
  TIMESERIES_MIN_BUCKETS=100 at src/crates/netflow-plugin/src/query/request/constants.rs:18.

- retention-querying.md: rewrite the rollup-preserved field list. The
  previous list claimed AS path, BGP communities, MPLS labels, MACs, and
  post-NAT addresses survive into rollups -- they do not. The actual
  rollup tier carries only the fields defined in
  src/crates/netflow-plugin/src/tiering/rollup/schema/fields/defs/{core,exporter,interface,network,presence}.rs;
  every other field is raw-only and forces the query to the raw tier.

- retention-querying.md: correct the "no data" / fallback semantics. The
  planner does NOT fall back to a coarser tier for raw-only queries --
  rollups don't carry the field, so the span returns empty. Confirmed
  at src/crates/netflow-plugin/src/query/planner/prepare.rs:25-28 and
  src/crates/netflow-plugin/src/query/planner/spans.rs:99-105
  (lower_fallback_candidate_tiers returns &[] for Raw).

- validation.md: replace the "group by Sampling Rate field" verification
  with an SNMP-magnitude cross-check. The SAMPLING_RATE field is
  filtered out of supported_flow_field_names
  (src/crates/netflow-plugin/src/query/request/constants.rs:80),
  excluded from the groupable set
  (src/crates/netflow-plugin/src/query/fields/rules.rs:33), and not
  available as a facet
  (src/crates/netflow-plugin/src/facet_catalog.rs:123). Users cannot
  pick it from the dashboard. The honest verification path is SNMP
  magnitude or a per-prefix override.

- validation.md: correct the alert threshold framing. RcvbufErrors is
  read with RRD_ALGORITHM_INCREMENTAL
  (src/collectors/proc.plugin/proc_net_netstat.c:400-434), so the value
  Netdata stores is per-second. The "lookup: average -1m absolute" plus
  "$this > 10" in src/health/health.d/udp_errors.conf means >10
  errors/SECOND averaged over 1 minute, not >10/minute as previously
  stated.

- quick-start.md: correct the field labels used in the doubling fix
  step from "Input Interface Name" / "Output Interface Name" to
  "Ingress Interface Name" / "Egress Interface Name", matching the
  display labels in src/crates/netflow-plugin/src/presentation/display.rs:39-40.
  Same fix swept across anti-patterns.md, troubleshooting.md,
  validation.md, investigation-playbooks.md, and
  visualization/summary-sankey.md.

- quick-start.md: drop the false "60-second template refresh" claim for
  softflowd. softflowd's `expint` flag controls expiry-check interval,
  not template refresh; the NetFlow v9 template interval in softflowd
  is a compile-time default
  (NF9_DEFAULT_TEMPLATE_INTERVAL=16 in netflow9.c) with no CLI knob.

- quick-start.md: complete the Juniper J-Flow snippet. The previous
  example defined a sampling instance but never bound it to a
  forwarding card and never set a sampling rate, so it would not
  produce flows. Add `set chassis fpc 0 sampling-instance NETDATA` and
  `set forwarding-options sampling instance NETDATA input rate 1000`,
  with a short note explaining the FPC binding requirement.

- quick-start.md: correct the dashboard navigation step from "click the
  Network Flows tab" to "open the Live tab and select Network Flows
  from the Functions list".

- ip-intelligence.md: correct the GeoLite2 / DB-IP / IPtoASN cadence
  claims. DB-IP Lite is monthly. MaxMind GeoLite2 City/Country update
  on weekdays; GeoLite2 ASN updates daily since June 2024. IPtoASN is
  not MMDB -- it is a public-domain TSV feed that includes both ASN
  and country and must be converted to MMDB before the plugin can read
  it (the plugin only supports MMDB).

- ip-intelligence.md: correct the dual-stack guidance. Most current
  providers ship a single dual-stack MMDB; the previous "configure
  both an IPv4 file and a separate IPv6 file" advice was misleading.

- ip-intelligence.md: clarify the asn_providers chain semantics. The
  `geoip` provider is a terminal "use 0" shortcut -- when reached the
  AS number is forced to 0 (the AS name still comes from the MMDB
  lookup independently). Confirmed at
  src/crates/netflow-plugin/src/enrichment/asn/resolve.rs:75-109.

- ip-intelligence.md: distinguish the database-composition rules. ASN
  fields use pure last-wins; geo fields are written only when the
  matching record has a non-empty value, so a later database with an
  empty city does not overwrite an earlier database's city
  (src/crates/netflow-plugin/src/enrichment/data/geoip/decode.rs:40-72).

Medium / low:

- sizing-capacity.md: rewrite the example raw-tier YAML so the size cap
  matches the page's own 25k flows/s framing. The previous example used
  200GB / 24h; at 25k flows/s the size cap would fire after ~2.8h, not
  24h. Now uses 2TB / 24h with a paragraph on how to scale down for
  lighter loads, and explains the size-vs-duration relationship.

- validation.md: drop the misleading `decoder_state_dir` config-key
  reference (it is a derived path, not a user-facing key); add the
  `-n` flag to the ss command to keep the port numeric in the output.

- anti-patterns.md / troubleshooting.md / validation.md /
  visualization/summary-sankey.md: consistent doubling-framing hedge
  ("a common configuration; vendor best practice is ingress-only")
  across the docs that mention doubling.

- anti-patterns.md: summary-table row "Doubled aggregate" qualified
  with "(when ingress + egress are both exported)".

- configuration.md: correct the `override_sampling_rate` default
  example from `{}` to `~` (the actual schema default is None).
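
The size-vs-duration relationship behind the sizing-capacity.md fix
above can be checked with a few lines of arithmetic. A minimal sketch,
assuming the ~800 bytes/flow empirical figure and the 25k flows/s rate
quoted on the page, and decimal units (1 GB = 10^9 bytes). The helper
name is hypothetical and is not part of the plugin.

```python
BYTES_PER_FLOW = 800        # empirical per-flow figure quoted on the page
FLOWS_PER_SECOND = 25_000   # sustained ingestion rate used as the anchor


def hours_until_size_cap(cap_bytes: float,
                         flows_per_second: float = FLOWS_PER_SECOND,
                         bytes_per_flow: float = BYTES_PER_FLOW) -> float:
    """Hours of raw-tier data that fit under a size cap (hypothetical helper)."""
    bytes_per_second = flows_per_second * bytes_per_flow
    return cap_bytes / bytes_per_second / 3600


print(round(hours_until_size_cap(200e9), 1))  # 200 GB cap -> 2.8
print(round(hours_until_size_cap(2e12), 1))   # 2 TB cap   -> 27.8
```

With these inputs the 200GB cap is reached long before the 24h
duration limit, while the 2TB cap leaves the duration limit as the
binding one.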

* docs(network-flows): apply Phase R2 Round-3 critical fixes

retention-querying.md:
- correct the Time-Series sub-100-min fallback. The planner walks coarsest-first
  and falls back to the 1-minute tier (TierKind::Minute1) when no tier has
  >=100 aligned buckets, not to raw. Verified at
  src/crates/netflow-plugin/src/query/planner/timeseries.rs:39-46.
- split the field-eligibility list. The "force raw" set is exactly
  RAW_ONLY_FIELDS plus V9_*/IPFIX_* prefixes
  (src/crates/netflow-plugin/src/query/fields/rules.rs:5-11 +
  src/crates/netflow-plugin/src/query/request/constants.rs:46-57). The
  previously included AS path / BGP communities / MPLS labels / MAC addresses /
  NAT addresses do NOT switch tier; they are dropped from rollup output and
  return null on rollup queries. Page now describes both classes separately.
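
The tier auto-pick rule described above (walk coarsest-first, accept
the first tier with at least 100 buckets, otherwise fall back to the
1-minute tier) can be sketched as follows. This is an illustration
only, not the planner's code: it ignores bucket alignment to
wall-clock boundaries and the raw-only field rules, and the tier
labels and function name are assumptions.

```python
from datetime import timedelta

MIN_BUCKETS = 100  # mirrors TIMESERIES_MIN_BUCKETS from the commit text

# Coarsest first, as the commit describes; labels are illustrative.
TIERS = [
    ("1h", timedelta(hours=1)),
    ("5m", timedelta(minutes=5)),
    ("1m", timedelta(minutes=1)),
]


def pick_tier(window: timedelta) -> str:
    """Return the first (coarsest) tier that yields >= MIN_BUCKETS buckets."""
    for name, bucket in TIERS:
        # timedelta // timedelta gives the whole number of buckets
        if window // bucket >= MIN_BUCKETS:
            return name
    return "1m"  # windows shorter than 100 minutes fall back to 1-minute


print(pick_tier(timedelta(hours=100)))            # 1h
print(pick_tier(timedelta(hours=8, minutes=20)))  # 5m
print(pick_tier(timedelta(minutes=30)))           # 1m
```

The thresholds fall out of the bucket count: 100 x 1h = 100h,
100 x 5m = 8h20m, 100 x 1m = 100min.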

troubleshooting.md:
- correct Cisco's default template refresh from "30 minutes" to
  "600 seconds (10 minutes)" -- this is the IOS / IOS-XE Flexible NetFlow
  `template data timeout` default.
- replace the wrong "/proc/net/udp ... RcvbufErrors column" recipe. The
  /proc/net/udp file lists open sockets without per-socket drop counters; the
  kernel-wide RcvbufErrors total lives under the Udp: line of
  /proc/net/snmp, which is what Netdata's proc.plugin reads
  (src/collectors/proc.plugin/proc_net_netstat.c:1521). Also dropped the
  contradictory "30-60 seconds" template-refresh hint.
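
For readers who want to check the kernel-wide counter by hand, the
Udp: section of /proc/net/snmp is a header line followed by a value
line. A minimal sketch of reading RcvbufErrors from it; the function
name is hypothetical and the sample numbers are made up for
illustration.

```python
def udp_counters(snmp_text: str) -> dict[str, int]:
    """Parse the two 'Udp:' lines (header, then values) of /proc/net/snmp."""
    # "UdpLite:" lines do not match the "Udp:" prefix, so they are skipped.
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        raise ValueError("no Udp: header/value pair found")
    names, values = rows[0], rows[1]
    return dict(zip(names, map(int, values)))


# Illustrative sample only; real values come from the running kernel.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors InCsumErrors IgnoredMulti MemErrors
Udp: 1200 3 42 900 42 0 0 0 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors InCsumErrors IgnoredMulti MemErrors
UdpLite: 0 0 0 0 0 0 0 0 0
"""

print(udp_counters(SAMPLE)["RcvbufErrors"])  # 42
```

On a live host the same function can be fed
`open("/proc/net/snmp").read()`; the counter is cumulative, so two
readings are needed to derive a rate.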

ip-intelligence.md:
- correct the ASN composition rule. The previous "last-wins for ASN, but
  geo writes only when non-empty" framing was wrong --
  src/crates/netflow-plugin/src/enrichment/data/geoip/decode.rs:3-28
  filters empty / zero values for ASN fields just like geo, so both sets
  follow the same "last database with a non-empty value wins" rule.
- correct the GeoLite2 cadence. MaxMind's documentation publishes City and
  Country twice weekly (Tuesday and Friday); GeoLite2 ASN moved to every
  weekday in June 2024.
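
The "last database with a non-empty value wins" rule can be
illustrated with a small merge over per-database lookup results. A
minimal sketch, not the decoder's code: the field names and sample
values are assumptions chosen for the example.

```python
def merge_records(records: list[dict]) -> dict:
    """Compose lookups from several databases, in configured order.

    A later database overwrites a field only when it carries a
    non-empty, non-zero value for that field.
    """
    merged: dict = {}
    for record in records:
        for field, value in record.items():
            if value in ("", 0, None):
                continue  # empty / zero values never overwrite
            merged[field] = value
    return merged


# Hypothetical results for one IP from two databases, in config order.
first = {"as_number": 64500, "as_name": "EXAMPLE-NET", "city": "Athens"}
second = {"as_number": 0, "as_name": "", "city": "", "country": "GR"}

print(merge_records([first, second]))
```

Here the second database contributes only the country; its zero AS
number and empty strings leave the first database's values in place.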

* docs(network-flows): collapse flows sub-categories to Flow Protocols + Enrichment Methods

The previous category tree split enrichment-method integrations across three
separate sub-categories (IP Intelligence, BGP Routing, Network Identity Sources),
which made the integrations page navigation harder than it needed to be and did
not match the conceptual model: the operator is choosing a *data source* for
enrichment, regardless of what kind of data it produces.

Collapsed under flows.enrichment-methods:

- ip-intelligence: dbip, maxmind, iptoasn, custom-mmdb
- bgp-routing: bmp, bioris
- network-identity: aws-ip-ranges, gcp-ip-ranges, azure-ip-ranges, netbox,
  generic-ipam

flows.sources keeps its existing membership (netflow / ipfix / sflow) and the
"Flow Protocols" name unchanged.

Both YAML files validated. Per-card content merges and the new cross-cutting
"Enrichment" + "Enrichment Intel Downloader" pages land in subsequent commits.

* docs(network-flows): merge concept-page content into the 11 enrichment integration cards

Round-by-round, agent-per-card merges. Each card absorbed the durable
provider-specific content from its corresponding concept page; cross-cutting
content was extracted for the new "Enrichment" page (separate commit). The
agents verified every behavioural claim against current source code at file:line
and every upstream URL by WebFetch -- not a mechanical sweep.

IP Intelligence:

- dbip: framed as the auto-detected default, monthly Lite cadence, CC-BY-4.0,
  populated-fields breakdown (geo + ASN), AS0 Private/Unknown labels driven by
  the DB-IP-built ip_class flag, raw-tier-only city/lat/lon.
- maxmind: GeoLite2 vs commercial GeoIP2 split, account-id + license-key auth,
  twice-weekly Tuesday/Friday cadence for City/Country, every-weekday for ASN
  since June 2024, geoipupdate setup. Important correction: the bundled
  topology-ip-intel-downloader does NOT support MaxMind (only dbip and iptoasn);
  the previous card's hint at the netdata downloader was misleading.
- iptoasn: PDDL public-domain feed, hourly TSV cadence (the previous card
  said "daily" -- wrong), bundled topology-ip-intel-downloader natively
  supports the TSV->MMDB conversion (correcting an earlier prompt assumption
  to the contrary), three setup examples including ASN-only and combined
  with DB-IP geo.
- custom-mmdb: reframed as the escape hatch for operators producing custom
  MMDBs (CIDR overlays, internal AS labels). Lists the field names the
  decoder reads from any MMDB, with file:line evidence; cites mmdbwriter
  libraries; recommends `optional: true` during build iteration.

BGP Routing:

- bmp: BMP-v3-only handling -- v1/v2 silently dropped (previously implicit).
  RFC 8671 cited for JunOS post-policy support since 18.3R1, separately
  from RFC 7854 which was previously lumped together. Cisco IOS-XE BMP
  added (was missing). Nokia SR OS added. JunOS minimum 13.3 documented.
  No IANA-registered port for BMP.
- bioris: corrected the topology -- Netdata connects to a USER-RUN bio-rd
  cmd/ris/ daemon over user-supplied gRPC, NOT directly to RIPE RIS. The
  user's bio-rd daemon does the BGP/BMP peering with upstream sources. No
  shipped collector list (ris_instances is required and operator-supplied).
  Memory cost (~hundreds of MB per peer for full-table feeds), no eviction,
  raw-tier-only AS path / communities.

Network Identity:

- aws-ip-ranges: schema reference (top-level + per-entry), live cadence
  softened ("whenever AWS IP space changes, often several times per day"
  rather than the folklore "every 15 minutes" -- AWS docs do not promise
  any fixed schedule). Three jq examples including network_border_group as
  site. Plugin's actual config key is `transform`, not `jq_program` (was
  wrong in the migration prompt).
- gcp-ip-ranges: cloud.json vs goog.json comparison; cloud.json today
  reports `service: "Google Cloud"` uniformly so per-service pivots are
  not possible from this file. No fixed Google cadence in the docs.
- azure-ip-ranges: URL rotates weekly. Service Tag Discovery REST API as
  authoritative alternative. API data lags JSON file by up to four weeks;
  new IPs aren't used for at least one week after publication. Three
  workaround patterns documented honestly.
- netbox: documented breaking change in NetBox 4.2 -- the `site` foreign
  key on Prefix was replaced with the generic `scope` field; the previous
  example used `(.site.name // "")` which silently breaks on 4.2+. New
  card ships scope-aware (4.2+) and legacy (3.x/4.0/4.1) examples plus a
  fallback `(.scope.name // .site.name // "")`. Two token formats
  documented (legacy v1 hex and v2 nbt_<key>.<token>).
- generic-ipam: full RemoteNetworkSourceConfig schema (13 options including
  proxy, tls.enable, tls.verify, tls.skip_verify with explicit "rejected
  by validation" notes). Honest call-outs: POST is sent without a body
  (fetch.rs:11-17), interval floored at 60s by service.rs:73, TLS
  verification cannot be disabled (validation/enrichment.rs:183-192).
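
The NetBox fallback `(.scope.name // .site.name // "")` relies on jq's
alternative operator, which yields the first operand that is neither
null nor false. A minimal Python rendering of that behaviour, with
hypothetical helper names and made-up payloads that carry only the
keys mentioned above:

```python
def alt(*candidates):
    """Emulate jq's `a // b`: first value that is neither null nor false."""
    for value in candidates:
        if value is not None and value is not False:
            return value
    return None


def get_name(prefix: dict, key: str):
    """Like jq `.key.name`: null when the parent object is missing or null."""
    parent = prefix.get(key)
    return parent.get("name") if isinstance(parent, dict) else None


def site_of(prefix: dict) -> str:
    # Python rendering of (.scope.name // .site.name // "")
    return alt(get_name(prefix, "scope"), get_name(prefix, "site"), "")


# Illustrative payloads only; real NetBox responses carry many more keys.
scoped = {"prefix": "10.0.0.0/24", "scope": {"name": "AMS1"}}
legacy = {"prefix": "10.0.1.0/24", "site": {"name": "LON2"}}
unscoped = {"prefix": "10.0.2.0/24", "scope": None}

print(site_of(scoped), site_of(legacy), repr(site_of(unscoped)))
```

The same expression therefore works against both payload shapes and
degrades to an empty string when neither key is set.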

Categories: each card now under flows.enrichment-methods (the previous
ip-intelligence / bgp-routing / network-identity sub-categories were
collapsed in the previous commit). Generated .md files updated via
integrations/gen_docs_integrations.py.

* docs(network-flows): add 3 new enrichment integration cards + Intel Downloader page

Three new cards under flows.enrichment-methods, completing the "every
enrichment method is an integration" model. Each card was authored by an
agent that read the corresponding concept page and the source code, then
produced a metadata.yaml entry with every claim cited at file:line. The
agents flagged real inconsistencies between the concept pages and the
code; corrections were absorbed into the cards.

static_metadata:

- Three configuration surfaces: enrichment.metadata_static.exporters,
  enrichment.networks, and enrichment.override_sampling_rate (plus
  default_sampling_rate for the distinction).
- Field-population table tied to source at apply/metadata.rs:41-53,
  data/network/write.rs:93-125, apply/metadata.rs:78-97.
- Corrections vs the previous concept page: boundary "undefined" vs
  numeric 0 are byte-identical in output; lookup priority is dominated
  by prefix specificity, not source-kind; override_sampling_rate matches
  the UDP datagram source IP; the "networks merges last and wins"
  framing was overstated.

classifiers:

- Two evaluation surfaces: exporter_classifiers + interface_classifiers,
  the latter called twice per flow (once per interface side).
- Akvorado-compatible expression-language *subset* -- explicitly only the
  operators and actions implemented at enrichment/classifiers/parse.rs;
  every example in the card uses syntax verified against the parser and
  the existing test suite.
- Output normalisation includes "+" (concept page only listed ". -");
  static metadata short-circuits classifier evaluation
  (enrichment/classify.rs:117-119, :150-154); first-write-wins per slot
  (runtime/eval/action.rs:43-46); default cache 5m with >=1s validation
  (defaults.rs:46-48, validation/enrichment.rs:10-12).

decapsulation:

- Two modes (srv6, vxlan) per protocol.rs:50-57; default none.
- Three transport feeders: NetFlow v9 IE 104 (decoder.rs:80), IPFIX
  IE 315 (decoder.rs:74), sFlow SampledHeader (sflow/record.rs:44-69).
- Inner parsers at decoder/common.rs:3-18 (VXLAN port 4789) and
  :35-63 (SRv6 SRH walker); merge points at packet/transport.rs:21-33
  and record/packet/parse/transport.rs:14-21.
- Vendor-verification finding: Cisco IOS-XE / IOS-XR collect datalink
  frame-section could NOT be verified (cisco.com 403'd anonymous
  WebFetch and Akvorado's IOS-XE recipe deliberately omits L2 frame-
  section export). The card flags Cisco support as unverified and
  instructs operators to validate by template inspection. Juniper
  inline-monitoring with datalink-frame-size confirmed via the
  Akvorado mirror; sFlow header sampling confirmed via the project's
  decoder.

intel-downloader.md (new operator-tool page):

- Documents /usr/sbin/topology-ip-intel-downloader -- supported sources,
  CLI flags, atomic replacement, auto-detect integration with the
  netflow plugin's 30s reload window.
- Findings: no packaged systemd timer or cron file (operators must
  install their own; page provides a starter unit + timer pair);
  MaxMind support confirmed absent (no license_key field anywhere in
  config.go; only iptoasn:combined and dbip:asn-lite/country-lite/
  city-lite are recognised by builtInSource); MaxMind users directed to
  geoipupdate as the alternative.
- Hidden capability documented: interesting_cidrs config knob lets
  operators stamp public CIDRs as netdata.ip_class = "interesting" in
  both ASN and geo MMDBs (write.go:228-246).

The categories.yaml category for all 14 enrichment methods is now
flows.enrichment-methods (previously split across ip-intelligence /
bgp-routing / network-identity).

* docs(network-flows): replace 7 concept pages with one Enrichment page; update map.yaml

Restructure phase 5+8+9: collapse the docs/network-flows/enrichment/ directory
(7 concept pages: asn-resolution, bgp-routing, classifiers, decapsulation,
ip-intelligence, network-identity, static-metadata) into ONE consolidated
cross-cutting page at docs/network-flows/enrichment.md, plus per-method
integration cards (already added in previous commits).

The new Enrichment page (447 lines) is the single home for cross-cutting
concepts that span every enrichment method:

- Order of evaluation per flow record (8-step pipeline cited at
  apply/resolve.rs:5-50 and init.rs:50-64).
- The two provider chains (asn_providers / net_providers, the geoip
  terminal "use 0" shortcut, the AS-number-vs-AS-name distinction).
- Composition rules: specificity dominates, ties to static, per-field
  non-empty-wins merge.
- The MMDB shared mechanism (auto-detect path order, last-non-empty-wins
  composition, 30s signature reload, IPv4/IPv6 dual-stack handling).
- Network sources operational properties (fetch loop, 60s floor, jq
  schema, deny_unknown_fields, TLS-no-disable, no pagination, no auth
  helpers, POST-without-body, journal diagnostics).
- Static-metadata-blocks-classifiers semantics.
- Classifier evaluation surfaces and ordering.
- Decapsulation inner-packet override.
- Routing overlay (BMP+BioRIS shared trie).
- Cross-method operational properties: refresh windows, restart behaviour,
  no in-process freshness signal, empty-tree disables enricher, rollup
  tier survival table, geographic accuracy, sampling-rate knobs,
  integration test gap.

The page resolved several discrepancies the original concept pages had
against the source code -- e.g. the "static metadata > classifiers >
network sources > GeoIP > BGP routing" precedence claim was misleading
(actual rule: specificity dominates, ties to static, merge primitive is
non-empty-wins); the bmp alias for routing lives at providers.rs:10,12,
not validation/enrichment.rs; the GeoIP terminal shortcut sets the AS
number to 0 but the AS *name* still comes from the MMDB independently;
POST is sent without a body (fetch.rs:11-17); interval is silently
floored at 60s (service.rs:73). Each correction is cited at file:line.

map.yaml: removed the "Flows Enrichment" sub-section (which contained
the 7 deleted pages); added two new entries between Configuration and
Field Reference:
- "Enrichment" -> docs/network-flows/enrichment.md
- "Enrichment Intel Downloader" -> docs/network-flows/intel-downloader.md

Cross-references updated in 8 surviving pages to point at either the new
Enrichment page (for cross-cutting concepts) or the relevant integration
cards under src/crates/netflow-plugin/integrations/ (for per-method
specifics): configuration, validation, intel-downloader, quick-start,
installation, troubleshooting, visualization/maps-globe.

The 7 deleted concept pages had their durable content fully absorbed
into the Enrichment page (cross-cutting) and the 14 integration cards
(per-method); each integration card cites file:line evidence for every
behavioural claim and was re-verified against current source code by
its merge agent.

* docs(network-flows): repoint cross-references to the new Enrichment page

The merge agents preserved Learn-URL cross-references from the original
concept pages (e.g.
https://learn.netdata.cloud/docs/network-flows/enrichment/ip-intelligence).
Those URLs now 404 because the seven concept pages were collapsed into
one. Sweep them all to point at the new consolidated
https://learn.netdata.cloud/docs/network-flows/enrichment page, which
covers the cross-cutting concepts (MMDB shared mechanism, asn_providers
chain, network-source operational properties, etc.) that the per-method
references were calling out.

Regenerated the generated cards via gen_docs_integrations.py to flush
the new URL into the .md outputs.

* Repair Network Flows documentation

* Format netflow plugin tests

* Address Network Flows documentation regressions

* Fix Network Flows review regressions

* Move raw rebuild scan off async startup path