
feat: Add retrieve_online_documents_v2 support to Qdrant online store#6484

Open
patelchaitany wants to merge 1 commit into
feast-dev:masterfrom
patelchaitany:feat/qdrant-retrieve-online-documents-v2

Conversation

@patelchaitany

@patelchaitany patelchaitany commented Jun 9, 2026

Contributor

What this PR does / why we need it:

This PR adds retrieve_online_documents_v2 for the Qdrant online store so vector search returns all requested features per document, not just the embedding hit. It implements dense embedding search (with a join across Qdrant’s one-point-per-feature layout), plus optional hybrid search when text_search_enabled: true. v1 is unchanged; hybrid is off by default so existing setups don’t need migration. Files touched: qdrant.py, qdrant.md, alpha-vector-database.md, test_qdrant_online_retrieval.py, test_qdrant_retrieve_online_documents_v2.py, and manual_qdrant_v2.py.
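The join across the one-point-per-feature layout mentioned above can be pictured with a small sketch. It is illustrative only: the payload field names (`entity_key`, `feature_name`, `value`) and the helper `join_features_by_entity` are hypothetical and are not taken from qdrant.py.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def join_features_by_entity(
    payloads: Iterable[Dict[str, Any]],
    requested_features: List[str],
) -> Dict[str, Dict[str, Any]]:
    """Group one-point-per-feature payloads into one row per entity."""
    rows: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for payload in payloads:
        name = payload["feature_name"]  # assumed payload field
        if name in requested_features:
            # "entity_key" and "value" are assumed payload fields as well
            rows[payload["entity_key"]][name] = payload["value"]
    return dict(rows)


# Three points, two entities: doc-1 has two feature points, doc-2 has one.
payloads = [
    {"entity_key": "doc-1", "feature_name": "embedding", "value": [0.1, 0.2]},
    {"entity_key": "doc-1", "feature_name": "title", "value": "Intro"},
    {"entity_key": "doc-2", "feature_name": "title", "value": "Setup"},
]
rows = join_features_by_entity(payloads, ["embedding", "title"])
```

Each entity ends up with a single row holding every requested feature that was stored for it, which is the shape a v2 retrieval result needs.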

Which issue(s) this PR fixes:

#6445

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Misc



@patelchaitany patelchaitany requested a review from a team as a code owner June 9, 2026 09:20

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.


Comment on lines +567 to +612
    response = client.query_points(
        collection_name=collection_name,
        prefetch=prefetches,
        query=models.FusionQuery(fusion=models.Fusion.RRF),
        limit=top_k,
        with_payload=True,
    )
    points = response.points
elif embedding is not None:
    response = client.query_points(
        collection_name=collection_name,
        query=embedding,
        query_filter=vector_feature_filter,
        limit=top_k,
        with_payload=True,
        using=dense_using,
    )
    points = response.points
else:
    assert query_string is not None
    response = client.query_points(
        collection_name=collection_name,
        query=models.Document(
            text=query_string,
            model=online_store_config.sparse_embedding_model,
        ),
        using=online_store_config.sparse_vector_name,
        limit=top_k,
        with_payload=True,
    )
    points = response.points

for point in points:
    payload = point.payload or {}
    entity_key_bin = _entity_key_bytes_from_payload(payload.get("entity_key"))
    if entity_key_bin is None:
        continue
    if entity_key_bin not in hit_order:
        hit_order.append(entity_key_bin)
        hit_payload_by_entity[entity_key_bin] = payload
    if point.score is not None:
        score = float(point.score)
        if embedding is not None:
            dense_score_by_entity[entity_key_bin] = score
        if query_string is not None:
            sparse_score_by_entity[entity_key_bin] = score

Contributor

🔴 Hybrid RRF fusion returns fewer unique entities than top_k and overwrites best scores

In hybrid search mode (both embedding and query_string provided), Qdrant's RRF fusion operates on points, not entities. Because Feast stores one Qdrant point per feature (embedding point and text_field point have different point IDs), the same entity can appear twice in the fused results — once from the dense prefetch (the "embedding" point) and once from the sparse prefetch (the "text_field" point). Since limit=top_k constrains the total number of points returned, not unique entities, the effective number of unique entities can be as low as ceil(top_k / 2) in the worst case.

Additionally, the score-tracking loop unconditionally overwrites scores when the same entity reappears (lines 610–612). Results are ordered by descending score, so later duplicate points carry lower scores, and the entity ends up with a degraded score.

Score overwrite and dedup issue

For entity A appearing twice in fusion results:

  1. First point (embedding, score=0.9): added to hit_order, score 0.9 stored in both dense_score_by_entity and sparse_score_by_entity
  2. Second point (text_field, score=0.7): NOT added to hit_order (the entity is already present, so the dedup check skips it), but score 0.7 overwrites 0.9 in both dicts

Result: entity A gets score 0.7 instead of 0.9.

The fix should (a) request a larger limit (e.g., top_k * 2) from the fusion query, (b) keep the first (best) score per entity rather than overwriting, and (c) truncate hit_order to top_k unique entities.

Prompt for agents
In retrieve_online_documents_v2, the hybrid search path at line 567-612 has two related issues:

1. The fusion query uses limit=top_k, but because Feast stores one Qdrant point per feature (embedding point and text_field point have different IDs), the same entity can consume two slots. This means fewer unique entities are returned than the user requested. Fix: increase the limit in the fusion query_points call to top_k * 2 (or more), and then truncate hit_order to top_k after deduplication.

2. In the score-processing loop (lines 599-612), when the same entity appears multiple times, the score dicts (dense_score_by_entity and sparse_score_by_entity) are unconditionally overwritten. Since results are ordered by descending score, later appearances have lower scores, degrading the entity's score. Fix: use setdefault or a conditional to only keep the first (highest) score per entity, e.g.:
   dense_score_by_entity.setdefault(entity_key_bin, score)
   sparse_score_by_entity.setdefault(entity_key_bin, score)

Also consider capping hit_order after the loop: hit_order = hit_order[:top_k]

Files: sdk/python/feast/infra/online_stores/qdrant_online_store/qdrant.py, function retrieve_online_documents_v2
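The three suggested changes (over-fetch, keep the first score, truncate to top_k) can be sketched in isolation. This is a minimal illustration, not the PR's code: `Hit` is a hypothetical stand-in for a scored Qdrant point, `dedupe_hits` is an invented helper, and the input is assumed to be sorted by descending score, as if it had been fetched with a limit of top_k * 2.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Hit:
    """Hypothetical stand-in for a scored Qdrant point."""
    entity_key: bytes
    score: Optional[float]


def dedupe_hits(
    points: Sequence[Hit], top_k: int
) -> Tuple[List[bytes], Dict[bytes, float]]:
    """Collapse per-feature points to unique entities, keeping first scores."""
    hit_order: List[bytes] = []
    score_by_entity: Dict[bytes, float] = {}
    seen = set()
    for point in points:  # assumed sorted by descending score
        key = point.entity_key
        if key not in seen:
            seen.add(key)
            hit_order.append(key)
        if point.score is not None:
            # setdefault keeps the first (highest) score instead of overwriting
            score_by_entity.setdefault(key, float(point.score))
    hit_order = hit_order[:top_k]  # cap to top_k unique entities
    return hit_order, {k: score_by_entity[k] for k in hit_order if k in score_by_entity}


top_k = 2
# Pretend these four points came back from a fusion query with limit=top_k * 2.
fetched = [Hit(b"A", 0.9), Hit(b"B", 0.8), Hit(b"A", 0.7), Hit(b"C", 0.6)]
order, scores = dedupe_hits(fetched, top_k)
```

With this input, entity A keeps its first score of 0.9 even though a second A point with 0.7 appears later, and the result is cut to two unique entities.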

Comment on lines +153 to +155
client.set_sparse_model(config.sparse_embedding_model)
sparse_vectors: List[models.SparseVector] = []
for embedding in client._sparse_embed_documents(texts):

Contributor

🚩 _encode_sparse_texts uses private API _sparse_embed_documents

_encode_sparse_texts calls client._sparse_embed_documents(texts) (line 155) — a private/internal method of the Qdrant client (note the leading underscore). While this appears to be the only way to access client-side sparse embedding, this API is not part of the public contract and could break with future qdrant-client upgrades without notice. Additionally, client.set_sparse_model(config.sparse_embedding_model) is called on every invocation rather than once during client initialization. This is a maintainability concern rather than a current bug.


@patelchaitany patelchaitany force-pushed the feat/qdrant-retrieve-online-documents-v2 branch 3 times, most recently from ad6ef63 to 6b06175 Compare June 9, 2026 10:00
@patelchaitany patelchaitany changed the title feat: Add retrieve_online_documents_v2 support to Qdrant online store feat: Add retrieve_online_documents_v2 support to Qdrant online store Jun 9, 2026
Comment thread docs/reference/alpha-vector-database.md Outdated

Contributor

@patelchaitany I think the matrix table in alpha-vector-database.md is inconsistent with the note text.

{% endhint %}

- **Note**: Milvus and SQLite implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag.
+ **Note**: Milvus, SQLite, PostgreSQL, Elasticsearch, MongoDB, and Qdrant implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag.

Contributor

The note at line 33 was updated to include Qdrant, but the table at line 17 still shows [] for V2 Support. Need to update the table row to [x] for V2 Support (and Online Read).

Contributor Author

Done, thanks for pointing that out.

@patelchaitany patelchaitany force-pushed the feat/qdrant-retrieve-online-documents-v2 branch from 6b06175 to 8c0d332 Compare June 9, 2026 17:44
Comment on lines +159 to +160
sparse_vectors: List[models.SparseVector] = []
for embedding in client._sparse_embed_documents(texts):

Contributor

@patelchaitany _encode_sparse_texts has a likely bug and uses deprecated internals

client._sparse_embed_documents(texts) at line 155 is called without passing embedding_model_name, so it defaults to "BAAI/bge-small-en" (a dense model). The official usage in qdrant-client's own add() method explicitly passes self.sparse_embedding_model_name. This means the sparse encoding at write time would attempt to use a dense model name to initialize a sparse model, which will either error or produce wrong results.

Additionally, _sparse_embed_documents, _FASTEMBED_INSTALLED, and the surrounding fastembed APIs were deprecated in qdrant-client. The modern approach is to use models.Document objects and let the client handle embedding internally.

Contributor

Suggested fix: if you must use this private API, pass the model name explicitly:

client._sparse_embed_documents(texts, embedding_model_name=config.sparse_embedding_model)

Or better yet, consider using the non-deprecated models.Document approach for sparse vector creation (as the retrieval path already does at line 559).
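The default-argument pitfall behind this suggestion can be shown with a stand-in function. This is a hedged sketch only: `sparse_embed_documents` below is a hypothetical mock that reports which model name would be used, not the qdrant-client method, and "Qdrant/bm25" is an assumed example of a configured sparse model name.

```python
from typing import List, Tuple

DENSE_DEFAULT = "BAAI/bge-small-en"  # default model name cited in the review


def sparse_embed_documents(
    texts: List[str], embedding_model_name: str = DENSE_DEFAULT
) -> List[Tuple[str, str]]:
    """Hypothetical mock: returns (text, model name that would be used)."""
    return [(text, embedding_model_name) for text in texts]


configured_sparse_model = "Qdrant/bm25"  # assumed example config value

# Omitting the argument silently falls back to the dense default.
implicit = sparse_embed_documents(["hello"])
# Passing the configured name makes the intended model explicit.
explicit = sparse_embed_documents(
    ["hello"], embedding_model_name=configured_sparse_model
)
```

The implicit call ends up with the dense model name, which is the mismatch the review describes; the explicit call carries the configured sparse model through.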

Contributor Author

Done

Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>
@patelchaitany patelchaitany force-pushed the feat/qdrant-retrieve-online-documents-v2 branch from 8c0d332 to e33a37a Compare June 9, 2026 18:23

2 participants