[pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on Scan Cost#18102
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #18102 +/- ##
============================================
+ Coverage 63.58% 63.62% +0.04%
+ Complexity 1659 1653 -6
============================================
Files 3245 3252 +7
Lines 197441 197760 +319
Branches 30564 30614 +50
============================================
+ Hits 125536 125826 +290
- Misses 61856 61874 +18
- Partials 10049 10060 +11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // CPU time based killing threshold | ||
| private final long _cpuTimeBasedKillingThresholdNs; | ||
|
|
||
| private final boolean _queryKilledMetricEnabled; |
There was a problem hiding this comment.
Side q: Are we honoring this flag during emitting the metics in future PRs? worth a look.
There was a problem hiding this comment.
No, this flag doesn't make sense tbh. I always publish metrics when the feature is enabled, and the metric differentiates between logOnly (QUERY_KILLED_DRY_RUN) and enforce (QUERY_KILLED). Not sure why we would want to disable the metrics when the feature is enabled and seems like a redundant extra config.
| public class QueryMonitorConfig { | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(QueryMonitorConfig.class); | ||
|
|
||
| private static final Set<String> VALID_SCAN_KILLING_MODES = Set.of( |
There was a problem hiding this comment.
Can we replace this and _scanBasedKillingMode with public enum ScanKillingMode { DISABLED, LOG_ONLY, ENFORCE }
There was a problem hiding this comment.
Makes sense, updated
| * (disabled, logOnly, enforce), logs an error and falls back to "disabled" so the server | ||
| * continues to start normally. | ||
| */ | ||
| private static String validateScanKillingMode(String mode) { |
There was a problem hiding this comment.
This can be removed if we use enum.
|
|
||
| if (changedConfigs.contains( | ||
| CommonConstants.Accounting.CONFIG_OF_SCAN_BASED_KILLING_MAX_ENTRIES_SCANNED_POST_FILTER)) { | ||
| _scanBasedKillingMaxEntriesScannedPostFilter = Long.parseLong( |
There was a problem hiding this comment.
Guard against invalid string values (empty string for example).
There was a problem hiding this comment.
thanks, handled
| factory.getName()); | ||
| } | ||
| } catch (Exception e) { | ||
| LOGGER.error("Failed to initialize scan-based killing strategy. " |
There was a problem hiding this comment.
Can we also add metrics here for production monitoring?
There was a problem hiding this comment.
Hmm, this metric will only emit one time and not really a timeseries, dont see a reason for a metric here. The log should be enough to let me know that the strategy failed to initialize, and we will continue with query killing being a no-op
| public class QueryKillingManager { | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(QueryKillingManager.class); | ||
|
|
||
| private static volatile QueryKillingManager _instance; |
There was a problem hiding this comment.
Any particular reason we are keeping a static volatile variable and not passing it to relevant classes through DI?
There was a problem hiding this comment.
The consumer for this (to be added in next diff) is BaseOperator.checkTermination(), a method called on every nextBlock() invocation across many operator subclasses. Threading the manager through constructors would mean modifying the constructor of every operator, much larger change to review.
2. The manager is also server-scoped, we want to create one instance per JVM, initialized once at startup (BaseServerStarter), and then use it everywhere.
…rror metric Replace the three string constants (disabled/logOnly/enforce) with a proper ScanKillingMode enum for compile-time safety and cleaner comparisons. The enum supports case-insensitive parsing from config values for backward compatibility. Also add QUERIES_KILLED_SCAN_ERROR metric emitted when scan-based killing evaluation throws an unexpected exception. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add parseLongOrDefault() helper that safely handles null, empty string, whitespace, and non-numeric values when parsing scan killing thresholds from ZK cluster config updates. Falls back to the default (Long.MAX_VALUE) with an error log instead of throwing NumberFormatException. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implementation for #18043
This PR introduces all core classes for a proactive, scan-cost-based query killing framework. No existing query execution path is modified in this PR. A follow-up PR will wire these into BaseOperator.checkTermination() and the operator instrumentation points.
New classes:
QueryScanCostContext(pinot-spi) — LongAdder-based, thread-safe accumulator for numEntriesScannedInFilter, numDocsScanned, andnumEntriesScannedPostFilter. One instance per query, shared across segment worker threads.
QueryKillingStrategy(pinot-core) — Interface for pluggable kill decisions. Includes shouldTerminate(), buildKillReport(),getErrorCode(), and forQuery() for table-level threshold overrides.
ScanEntriesThresholdStrategy(pinot-core) — Default strategy. Kills queries exceeding maxEntriesScannedInFilter or maxDocsScanned.Threshold of Long.MAX_VALUE disables that metric. Includes a nested Factory that validates config at init and logs a warning if no
thresholds are set.
CompositeQueryKillingStrategy(pinot-core) — Combines strategies with AND/OR semantics.QueryKillingStrategyFactory(pinot-core) — Factory interface for config-driven strategy loading. Custom strategies are plugged in bysetting accounting.scan.based.killing.strategy.factory.class.name to a factory class
QueryKillReport(pinot-core) — Immutable snapshot of a kill event. Captures all metrics at creation timeProduces customer-facing error messages with actionable advice and structured internal log messages.
QueryKillingManager(pinot-core) — Singleton manager. Builds strategy once at init via factory, delegatesshouldTerminate()/buildKillReport() to the strategy.
Table-Level overrides supported
Table-level overrides via QueryConfig. Table config takes precedence over cluster config; null means use cluster default.
Error code: QUERY_SCAN_LIMIT_EXCEEDED (246, HTTP 400) - distinguishes scan kills (client problem) from OOM kills (server problem, 503).
Minimum config to enable