Skip to content

[pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on Scan Cost#18102

Open
anuragrai16 wants to merge 13 commits intoapache:masterfrom
anuragrai16:proactiveQueryKillingSPI
Open

[pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on Scan Cost#18102
anuragrai16 wants to merge 13 commits intoapache:masterfrom
anuragrai16:proactiveQueryKillingSPI

Conversation

@anuragrai16
Copy link
Copy Markdown
Contributor

@anuragrai16 anuragrai16 commented Apr 6, 2026

Implementation for #18043

This PR introduces all core classes for a proactive, scan-cost-based query killing framework. No existing query execution path is modified in this PR. A follow-up PR will wire these into BaseOperator.checkTermination() and the operator instrumentation points.

New classes:

  • QueryScanCostContext (pinot-spi) — LongAdder-based, thread-safe accumulator for numEntriesScannedInFilter, numDocsScanned, and
    numEntriesScannedPostFilter. One instance per query, shared across segment worker threads.
  • QueryKillingStrategy (pinot-core) — Interface for pluggable kill decisions. Includes shouldTerminate(), buildKillReport(),
    getErrorCode(), and forQuery() for table-level threshold overrides.
  • ScanEntriesThresholdStrategy (pinot-core) — Default strategy. Kills queries exceeding maxEntriesScannedInFilter or maxDocsScanned.
    Threshold of Long.MAX_VALUE disables that metric. Includes a nested Factory that validates config at init and logs a warning if no
    thresholds are set.
  • CompositeQueryKillingStrategy (pinot-core) — Combines strategies with AND/OR semantics.
  • QueryKillingStrategyFactory (pinot-core) — Factory interface for config-driven strategy loading. Custom strategies are plugged in by
    setting accounting.scan.based.killing.strategy.factory.class.name to a factory class
  • QueryKillReport (pinot-core) — Immutable snapshot of a kill event. Captures all metrics at creation time
    Produces customer-facing error messages with actionable advice and structured internal log messages.
  • QueryKillingManager (pinot-core) — Singleton manager. Builds strategy once at init via factory, delegates
    shouldTerminate()/buildKillReport() to the strategy.

Table-Level overrides supported
Table-level overrides via QueryConfig. Table config takes precedence over cluster config; null means use cluster default.

Error code: QUERY_SCAN_LIMIT_EXCEEDED (246, HTTP 400) - distinguishes scan kills (client problem) from OOM kills (server problem, 503).

Minimum config to enable

pinot.query.scheduler.accounting.scan.based.killing.enabled=true
pinot.query.scheduler.accounting.scan.based.killing.max.entries.scanned.in.filter=500000000

@anuragrai16 anuragrai16 changed the title [pinot-server/ proactive-query-killing] add initial SPI implementation for supporting query killing based on … [pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on … Apr 6, 2026
@anuragrai16 anuragrai16 changed the title [pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on … [pinot-server/ proactive-query-killing] (1/2) add initial SPI implementation for supporting query killing based on Scan Cost Apr 6, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.62%. Comparing base (78e38eb) to head (5757849).
⚠️ Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
.../pinot/core/query/killing/QueryKillingManager.java 74.13% 12 Missing and 3 partials ⚠️
...killing/strategy/ScanEntriesThresholdStrategy.java 90.00% 0 Missing and 4 partials ⚠️
...ache/pinot/core/accounting/QueryMonitorConfig.java 93.87% 1 Missing and 2 partials ⚠️
...e/query/killing/CompositeQueryKillingStrategy.java 92.00% 1 Missing and 1 partial ⚠️
...va/org/apache/pinot/spi/utils/CommonConstants.java 86.66% 1 Missing and 1 partial ⚠️
...pinot/core/query/killing/QueryKillingStrategy.java 66.66% 1 Missing ⚠️
...org/apache/pinot/spi/config/table/QueryConfig.java 91.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18102      +/-   ##
============================================
+ Coverage     63.58%   63.62%   +0.04%     
+ Complexity     1659     1653       -6     
============================================
  Files          3245     3252       +7     
  Lines        197441   197760     +319     
  Branches      30564    30614      +50     
============================================
+ Hits         125536   125826     +290     
- Misses        61856    61874      +18     
- Partials      10049    10060      +11     
Flag Coverage Δ
custom-integration1 ?
integration ?
integration1 ?
integration2 ?
java-11 55.60% <88.88%> (+0.09%) ⬆️
java-21 63.59% <88.88%> (+0.03%) ⬆️
temurin 63.62% <88.88%> (+0.04%) ⬆️
unittests 63.62% <88.88%> (+0.04%) ⬆️
unittests1 55.62% <88.88%> (+0.08%) ⬆️
unittests2 35.00% <5.15%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

// CPU time based killing threshold
private final long _cpuTimeBasedKillingThresholdNs;

private final boolean _queryKilledMetricEnabled;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side q: Are we honoring this flag during emitting the metics in future PRs? worth a look.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this flag doesn't make sense tbh. I always publish metrics when the feature is enabled, and the metric differentiates between logOnly (QUERY_KILLED_DRY_RUN) and enforce (QUERY_KILLED). Not sure why we would want to disable the metrics when the feature is enabled and seems like a redundant extra config.

public class QueryMonitorConfig {
private static final Logger LOGGER = LoggerFactory.getLogger(QueryMonitorConfig.class);

private static final Set<String> VALID_SCAN_KILLING_MODES = Set.of(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this and _scanBasedKillingMode with public enum ScanKillingMode { DISABLED, LOG_ONLY, ENFORCE }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, updated

* (disabled, logOnly, enforce), logs an error and falls back to "disabled" so the server
* continues to start normally.
*/
private static String validateScanKillingMode(String mode) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed if we use enum.


if (changedConfigs.contains(
CommonConstants.Accounting.CONFIG_OF_SCAN_BASED_KILLING_MAX_ENTRIES_SCANNED_POST_FILTER)) {
_scanBasedKillingMaxEntriesScannedPostFilter = Long.parseLong(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guard against invalid string values (empty string for example).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, handled

factory.getName());
}
} catch (Exception e) {
LOGGER.error("Failed to initialize scan-based killing strategy. "
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add metrics here for production monitoring?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this metric will only emit one time and not really a timeseries, dont see a reason for a metric here. The log should be enough to let me know that the strategy failed to initialize, and we will continue with query killing being a no-op

public class QueryKillingManager {
private static final Logger LOGGER = LoggerFactory.getLogger(QueryKillingManager.class);

private static volatile QueryKillingManager _instance;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason we are keeping a static volatile variable and not passing it to relevant classes through DI?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consumer for this (to be added in next diff) is BaseOperator.checkTermination(), a method called on every nextBlock() invocation across many operator subclasses. Threading the manager through constructors would mean modifying the constructor of every operator, much larger change to review.
2. The manager is also server-scoped, we want to create one instance per JVM, initialized once at startup (BaseServerStarter), and then use it everywhere.

anuragrai16 and others added 8 commits April 22, 2026 22:55
…rror metric

Replace the three string constants (disabled/logOnly/enforce) with a proper
ScanKillingMode enum for compile-time safety and cleaner comparisons. The enum
supports case-insensitive parsing from config values for backward compatibility.

Also add QUERIES_KILLED_SCAN_ERROR metric emitted when scan-based killing
evaluation throws an unexpected exception.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add parseLongOrDefault() helper that safely handles null, empty string,
whitespace, and non-numeric values when parsing scan killing thresholds
from ZK cluster config updates. Falls back to the default (Long.MAX_VALUE)
with an error log instead of throwing NumberFormatException.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants