feat: Add OnlineStore for Aerospike #6532
Open
vkagamlyk wants to merge 18 commits into
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
The Aerospike Python client mishandles bytes user keys (hashes only the first byte), collapsing all entities onto the same digest. Wrap keys in bytearray on write and read. Also pair BatchRecord responses with original input keys via zip rather than trusting br.key[2], which the client returns in a different representation on reads.
Add two integration tests: cross-FV Map CDT coexistence and update(tables_to_delete=...) background scan.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
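The key wrapping and order-based pairing described in this commit can be sketched roughly as follows. This is a pure-Python illustration, not the PR's code: `make_key`, `pair_results`, and `FakeBatchRecord` are hypothetical names, and the only thing assumed about the client is the `(namespace, set, user_key)` key tuple shape that `br.key[2]` implies.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Key = Tuple[str, str, bytearray]


def make_key(namespace: str, set_name: str, serialized_entity_key: bytes) -> Key:
    """Build a key tuple whose user key is a bytearray rather than bytes."""
    return (namespace, set_name, bytearray(serialized_entity_key))


@dataclass
class FakeBatchRecord:
    """Hypothetical stand-in for one per-record entry of a batch response."""
    key: tuple
    record: Optional[dict]


def pair_results(
    entity_keys: List[bytes], batch_records: List[FakeBatchRecord]
) -> List[Tuple[bytes, Optional[dict]]]:
    """Pair responses with the original inputs by position.

    The i-th response is matched to the i-th request key; the key echoed
    back in the response (br.key[2]) is deliberately ignored.
    """
    return [(ek, br.record) for ek, br in zip(entity_keys, batch_records)]
```

Pairing by position only works because the batch call preserves request order, which is the property the commit relies on.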
…r-record errors
Two review blockers rolled into one commit because they share the same code path.
1. Server-side projection. online_read now builds a map_get_by_key_list op nested into the feature-view submap via cdt_ctx_map_key when requested_features is provided, instead of fetching the whole FV slot and filtering in Python. For wide feature views this ships only the requested columns over the wire. The response shape (flat [k,v,k,v] list vs. dict) is normalized through _normalize_projected_features.
2. Per-record error surfacing. Both batch_write and batch_operate only raise when the whole request is rejected; partial failures (single-partition timeout, replica quorum miss) are otherwise silent and present downstream as missing features. online_read now distinguishes RECORD_NOT_FOUND (2) and OP_NOT_APPLICABLE (26, = nested ctx miss when FV slot is absent) from transient errors, which are raised. online_write_batch inspects every per-record result code after the batch call.
Unit tests cover all four paths: projected read, not-found, op-not-applicable (nested ctx miss), and a simulated TIMEOUT that must raise. The docker-backed cross-FV and update() integration tests still pass, so server-side projection is verified end-to-end against a real Aerospike server.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
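As a rough sketch of the two behaviours in this commit, the snippet below normalizes a projected response and classifies a per-record result code. It is an illustration under assumptions, not the PR's implementation: `normalize_projected`, `features_or_none`, and `PerRecordError` are hypothetical names, the codes 2 and 26 come from the commit message, and 9 is used here as an example timeout code.

```python
from typing import Any, Dict, List, Optional, Union

OK = 0
RECORD_NOT_FOUND = 2      # record does not exist
OP_NOT_APPLICABLE = 26    # nested ctx miss: the feature-view slot is absent
TIMEOUT = 9               # example of a transient failure (assumed code)

MISSING_CODES = frozenset({RECORD_NOT_FOUND, OP_NOT_APPLICABLE})


class PerRecordError(RuntimeError):
    """Hypothetical error raised for a transient per-record failure."""


def normalize_projected(value: Union[None, Dict[str, Any], List[Any]]) -> Dict[str, Any]:
    """Accept either a dict or a flat [k, v, k, v] list and return a dict."""
    if value is None:
        return {}
    if isinstance(value, dict):
        return dict(value)
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(value[0::2], value[1::2]))


def features_or_none(result_code: int, projected: Any) -> Optional[Dict[str, Any]]:
    """Return features on success, None for 'missing', raise on anything else."""
    if result_code == OK:
        return normalize_projected(projected)
    if result_code in MISSING_CODES:
        return None
    raise PerRecordError(f"per-record failure, result code {result_code}")
```

Treating only the two "missing" codes as None keeps a partial failure from being mistaken for an absent feature.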
…nd add socket_timeout_ms
Review feedback: total_timeout_ms was ambiguous (users read it as a global/end-to-end timeout) and the timeout surface was missing socket_timeout, which is the per-attempt trigger that lets max_retries actually fire within the total budget.
* total_timeout_ms -> batch_total_timeout_ms. Now explicitly named after the Aerospike batch policy it maps to, matches read_timeout_ms / write_timeout_ms in framing (each targets one policy scope).
* Add socket_timeout_ms (optional). Applies uniformly to read, write and batch policies when set. Leaves the Aerospike client default in place when unset.
BREAKING CHANGE: total_timeout_ms is renamed to batch_total_timeout_ms. Config files using the old name must be updated. No default value change.
Docs updated (reference + perf-tuning guide) with a short explainer on the per-attempt vs total deadline distinction. Two new unit tests pin the policy wiring: socket_timeout_ms propagates to all three scopes, and is omitted (not injected as None) when unset.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
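The policy wiring described above might look roughly like this. It is a minimal sketch: `build_policies` is a hypothetical helper, and mapping each `*_timeout_ms` option onto that scope's `total_timeout` is an assumption about the store. The `total_timeout` and `socket_timeout` names follow the Aerospike Python client's policy dictionaries.

```python
from typing import Dict, Optional


def build_policies(
    read_timeout_ms: int,
    write_timeout_ms: int,
    batch_total_timeout_ms: int,
    socket_timeout_ms: Optional[int] = None,
) -> Dict[str, Dict[str, int]]:
    """Build per-scope policy dicts; socket_timeout is added only when set."""
    policies = {
        "read": {"total_timeout": read_timeout_ms},
        "write": {"total_timeout": write_timeout_ms},
        "batch": {"total_timeout": batch_total_timeout_ms},
    }
    if socket_timeout_ms is not None:
        # Per-attempt deadline, applied uniformly to all three scopes.
        for scope in policies.values():
            scope["socket_timeout"] = socket_timeout_ms
    return policies
```

Leaving the key out entirely, rather than writing None, is what lets the client's own default stay in effect when the option is unset.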
…oped client
Cheap-win cleanups flagged in review, all touching the same small patch of write-path and lifecycle code.
* Map CDTs are now created with MAP_KEY_ORDERED. map_get_by_key / map_remove_by_key on an ordered map are O(log N) in the map size instead of O(N); matters on reads of wide feature views and on the update() background scan (which walks every record in the project's set).
* Writes drop POLICY_KEY_SEND and rely on the client default (POLICY_KEY_DIGEST). The serialized entity key is no longer stored alongside each record, saving per-record storage the read path never consumes (batch_operate preserves request order; results are paired back by zip in online_read).
* _client moves from a class attribute to an instance attribute (set in __init__). Previously two AerospikeOnlineStore instances could share the cached client through class state until one wrote self._client. With the instance attribute the state is always per-instance from construction.
* Drop MongoDB references from class docstrings and comments (they referred to how the storage layout was derived rather than documenting current behavior). Also rewrite the _build_batch_writes docstring to describe the policies applied on the write path.
Unit test assertions for the write-path record are updated: bw.policy is now None (client default applies) and map ops carry map_policy={'map_order': MAP_KEY_ORDERED}. All three docker-backed integration tests still pass end-to-end (cross-FV upsert, update() background scan, full feature-store round-trip), so the read/write shape survives the ordering and policy changes against a real server.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
Adds three configuration knobs to AerospikeOnlineStoreConfig:
- namespace_overrides: pin individual feature views to a different Aerospike namespace (e.g. RAM-only vs. SSD-backed) without splitting the project across stores.
- set_overrides: place a feature view in its own set so admin ops on it (truncate, scan-based deletes during `feast apply`) do not touch records of other views.
- prewriting_hook: import-string-resolved callable invoked once per online_write_batch with the rows about to be written, returning the rows that actually go on the wire. Resolved and cached on first use; returning [] short-circuits the wire call.
Read, write, update and teardown paths all honour the per-FV ns/set resolution. update() groups dropped feature views by their resolved (ns, set) pair and issues one background scan per group. teardown() truncates every unique (ns, set) pair the project may have written to, including the store-level default.
Adds 22 unit tests for the new behaviour and updates 3 existing call sites of _build_batch_writes for the new namespace= parameter. Adds a sample hook module under examples/online_store/aerospike_overrides_and_hooks/ and corresponding sections in docs/reference/online-stores/aerospike.md.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
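The prewriting hook flow described in this commit could be sketched as below. Everything here is illustrative: the dotted `package.module.attr` import-string format, the simplified `(entity_key, features)` row shape, and the names `resolve_hook`, `write_with_hook`, and `mask_email` are assumptions, not the PR's actual signatures.

```python
import importlib
from functools import lru_cache
from typing import Callable, List, Sequence, Tuple

# Simplified row shape for illustration only: (entity_key, {feature: value}).
Row = Tuple[str, dict]


@lru_cache(maxsize=None)
def resolve_hook(import_string: str) -> Callable[[Sequence[Row]], List[Row]]:
    """Resolve a dotted import string to a callable once, then reuse it."""
    module_name, _, attr = import_string.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def write_with_hook(
    import_string: str, rows: Sequence[Row], send: Callable[[List[Row]], None]
) -> int:
    """Run the hook once per batch; skip the wire call if it returns no rows."""
    rows_out = list(resolve_hook(import_string)(rows))
    if not rows_out:
        return 0
    send(rows_out)
    return len(rows_out)


def mask_email(rows: Sequence[Row]) -> List[Row]:
    """Example hook body: replace any 'email' feature with a fixed mask."""
    masked = []
    for key, features in rows:
        if "email" in features:
            features = {**features, "email": "***"}
        masked.append((key, features))
    return masked
```

In this sketch `send` stands in for the batch write, so returning an empty list from the hook means nothing reaches the server.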
Adding aerospike to the feast-operator enum shifted the allowlisted SecretRef entry in api/v1/featurestore_types.go by one line.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
8e819d7 to
2c37c0f
What this PR does / why we need it
Adds a first-class Aerospike online store integration at
feast.infra.online_stores.aerospike_online_store, backed by the official aerospike>=19 Python client and targeting Aerospike server 6.0+ (developed and tested against CE 8.x).
Aerospike is a low-latency distributed key-value store widely deployed for
real-time ML feature serving. This integration gives Feast users a managed path
to Aerospike for online retrieval.
Storage layout
Each entity is stored as a single Aerospike record keyed by the serialized
entity key. Multiple feature views for the same entity share one record using
Map CDT bins:
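A simplified in-memory model of that layout, offered as an illustration only. It assumes each feature view occupies its own map "slot" inside the entity's record and takes no position on bin naming; `upsert_feature_view` and `read_features` are hypothetical helpers that mimic map-put and map-get semantics with plain dicts.

```python
from typing import Any, Dict, Iterable, Optional

# One record per entity: feature-view slot -> {feature name: value}.
Record = Dict[str, Dict[str, Any]]


def upsert_feature_view(record: Record, fv_name: str, features: Dict[str, Any]) -> None:
    """Merge features into one feature view's slot, leaving other slots alone."""
    record.setdefault(fv_name, {}).update(features)


def read_features(
    record: Record, fv_name: str, requested: Optional[Iterable[str]] = None
) -> Optional[Dict[str, Any]]:
    """Return one slot, optionally projected to the requested feature names."""
    slot = record.get(fv_name)
    if slot is None:
        return None  # the feature view has never been written for this entity
    if requested is None:
        return dict(slot)
    return {name: slot[name] for name in requested if name in slot}
```

Because each write touches only its own slot, two writers updating different feature views of the same entity do not overwrite each other in this model.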
Default: one set per project ({project}_latest). Per-feature-view isolation is opt-in via set_overrides (see below).
Writes use Aerospike batch_write with Map CDT map_put_items / map_put ops so concurrent writers touching different feature views on the same entity never clobber each other. Reads use batch_operate with server-side Map-get projection when requested_features is provided.
Implementation highlights
* online_write_batch, online_read, online_write_batch_async, online_read_async, initialize(), and close(). Async methods wrap the blocking client via run_in_executor. Native asyncio client adoption is tracked as a follow-up.
* requested_features projection: the read path builds a Map CDT map_get_by_key_list nested into the feature-view submap via cdt_ctx_map_key so only requested columns are returned from the server.
* Per-record error surfacing: batch_write / batch_operate only raise when the whole request is rejected; this store inspects per-record result codes and raises on transient failures (RECORD_NOT_FOUND and OP_NOT_APPLICABLE are treated as missing features).
* User keys are wrapped in bytearray, not bytes. The client hashes only the first byte of bytes keys during digest computation, causing silent collisions; bytearray uses the intended binary-key digest.
* batch_operate preserves input order; responses are paired back via zip rather than re-parsing br.key[2].
* Map CDTs are created with MAP_KEY_ORDERED, giving O(log N) lookups on wide feature views and on the update() background scan.
* Writes rely on the client default POLICY_KEY_DIGEST, so the serialized entity key is not stored redundantly on the server.
* Entity keys are serialized with serialize_entity_key; no schema changes for multi-entity keys.
* ttl_seconds: unset → namespace default, 0 → never expire, >0 → explicit seconds.
* namespace_overrides and set_overrides pin individual feature views to different namespaces or sets without splitting the project across stores. update() and teardown() honour resolved (namespace, set) pairs.
* prewriting_hook: a callable invoked once per online_write_batch for PII masking, encryption, coercion, etc.
* Timeouts: read_timeout_ms, write_timeout_ms, batch_total_timeout_ms (per-batch total deadline including retries), and optional socket_timeout_ms (per-attempt, propagates to all three policies).
* user / password auth and TLS (tls / tls_ca_file) are plumbed through the config but are config-only in v1 on Aerospike CE (not exercised in GitHub Actions). Documented in the reference page.
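How the per-feature-view (namespace, set) resolution and the scan grouping in update() / teardown() fit together can be sketched as follows. This is a hedged illustration: `StoreConfig` is a cut-down, hypothetical stand-in for AerospikeOnlineStoreConfig, the overrides are assumed to be dicts keyed by feature-view name, and the helper names are invented for the example. The `{project}_latest` default set comes from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple

NsSet = Tuple[str, str]


@dataclass
class StoreConfig:
    """Hypothetical, simplified stand-in for the store configuration."""
    namespace: str
    namespace_overrides: Dict[str, str] = field(default_factory=dict)
    set_overrides: Dict[str, str] = field(default_factory=dict)


def resolve(config: StoreConfig, project: str, fv_name: str) -> NsSet:
    """Pick the override if one exists, else the store-level default."""
    namespace = config.namespace_overrides.get(fv_name, config.namespace)
    set_name = config.set_overrides.get(fv_name, f"{project}_latest")
    return namespace, set_name


def group_for_scan(
    config: StoreConfig, project: str, dropped: Iterable[str]
) -> Dict[NsSet, List[str]]:
    """Group dropped feature views so each (namespace, set) gets one scan."""
    groups: Dict[NsSet, List[str]] = defaultdict(list)
    for fv_name in dropped:
        groups[resolve(config, project, fv_name)].append(fv_name)
    return dict(groups)


def teardown_targets(
    config: StoreConfig, project: str, fv_names: Iterable[str]
) -> Set[NsSet]:
    """Every unique pair to truncate, always including the default pair."""
    targets = {(config.namespace, f"{project}_latest")}
    targets.update(resolve(config, project, fv) for fv in fv_names)
    return targets
```

Grouping by the resolved pair means feature views that share a set are cleaned up in a single pass rather than one scan each.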
Tests
~57 pure-Python unit tests (no Docker required) covering:
* _build_batch_writes shape: Map bins, TTL meta, bytearray keys, MAP_KEY_ORDERED policy, empty-batch short-circuit
* online_read projection, not-found / op-not-applicable / transient errors
* update(tables_to_delete=...) scan coalescing by (namespace, set)
* teardown truncates all resolved pairs and closes the client
* initialize / close lifecycle
3 Docker-backed integration tests via testcontainers (Aerospike CE 8, aerospike/aerospike-server:8.0.0.10_1):
* test_aerospike_online_features: end-to-end write + read + projection + missing-key → None across int, string, and composite entity keys
* test_aerospike_cross_fv_map_cdt_upsert: two feature views on the same entity coexist (Map CDT upsert semantics)
* test_aerospike_update_strips_dropped_feature_view: update() background scan removes a dropped FV slot while the surviving FV stays intact
Universal online-store suite wiring: ships AerospikeOnlineStoreCreator plus a FULL_REPO_CONFIGS module for opt-in via FULL_REPO_CONFIGS_MODULE=feast.infra.online_stores.aerospike_online_store.aerospike_repo_configuration. Default-matrix registration is deferred to a follow-up PR (MongoDB, Couchbase, Cassandra, MySQL pattern).
Documentation
* docs/reference/online-stores/aerospike.md: reference page (config, data model, CE vs EE matrix, overrides, prewriting hook, tuning).
* docs/reference/online-stores/README.md + docs/SUMMARY.md: linked.
* docs/roadmap.md: Aerospike marked [x] under Online Stores.
* docs/how-to-guides/online-server-performance-tuning.md: Aerospike row and tuning subsection.
* examples/online_store/aerospike_overrides_and_hooks/: sample hook + override config.
feast-operator
Adds aerospike to supported online store types in feast-operator CRDs and bundled manifests.
Out of scope for v1 (follow-ups)
* Native asyncio client adoption (async methods are currently run_in_executor wrappers).
* async_supported on AerospikeOnlineStore so the feature server routes to online_read_async / online_write_batch_async (methods exist; property still inherits the base read=False, write=False default).
Known follow-ups before upstream PR
* Preview banner in aerospike.md per feast-dev convention (reviewer note: same pattern as MongoDB preview banner).
* upstream benchmarks PR lands.
Checks
* Tests pass (pytest unit + Docker integration tests).
* Commits are signed off (git commit -s).
Testing strategy
Misc
Aerospike client dependency is gated behind the aerospike optional extra (pip install 'feast[aerospike]'), so existing installs that don't need it pay no import cost.
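One common way to keep an optional dependency off the import path is to import it lazily and point at the extra when it is missing. The snippet below is a generic sketch of that pattern, not code from this PR; `import_optional` is a hypothetical helper name.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module on first use; if absent, name the pip extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is not installed; try: pip install 'feast[{extra}]'"
        ) from exc
```

A store module written this way would call `import_optional("aerospike", "aerospike")` only when the store is actually constructed, so other installs never attempt the import.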