feat: Add retrieve_online_documents_v2 support to Qdrant online store#6484
patelchaitany wants to merge 1 commit into
| response = client.query_points( | ||
| collection_name=collection_name, | ||
| prefetch=prefetches, | ||
| query=models.FusionQuery(fusion=models.Fusion.RRF), | ||
| limit=top_k, | ||
| with_payload=True, | ||
| ) | ||
| points = response.points | ||
| elif embedding is not None: | ||
| response = client.query_points( | ||
| collection_name=collection_name, | ||
| query=embedding, | ||
| query_filter=vector_feature_filter, | ||
| limit=top_k, | ||
| with_payload=True, | ||
| using=dense_using, | ||
| ) | ||
| points = response.points | ||
| else: | ||
| assert query_string is not None | ||
| response = client.query_points( | ||
| collection_name=collection_name, | ||
| query=models.Document( | ||
| text=query_string, | ||
| model=online_store_config.sparse_embedding_model, | ||
| ), | ||
| using=online_store_config.sparse_vector_name, | ||
| limit=top_k, | ||
| with_payload=True, | ||
| ) | ||
| points = response.points | ||
|
|
||
| for point in points: | ||
| payload = point.payload or {} | ||
| entity_key_bin = _entity_key_bytes_from_payload(payload.get("entity_key")) | ||
| if entity_key_bin is None: | ||
| continue | ||
| if entity_key_bin not in hit_order: | ||
| hit_order.append(entity_key_bin) | ||
| hit_payload_by_entity[entity_key_bin] = payload | ||
| if point.score is not None: | ||
| score = float(point.score) | ||
| if embedding is not None: | ||
| dense_score_by_entity[entity_key_bin] = score | ||
| if query_string is not None: | ||
| sparse_score_by_entity[entity_key_bin] = score |
🔴 Hybrid RRF fusion returns fewer unique entities than top_k and overwrites best scores
In hybrid search mode (both embedding and query_string provided), Qdrant's RRF fusion operates on points, not entities. Because Feast stores one Qdrant point per feature (embedding point and text_field point have different point IDs), the same entity can appear twice in the fused results — once from the dense prefetch (the "embedding" point) and once from the sparse prefetch (the "text_field" point). Since limit=top_k constrains the total number of points returned, not unique entities, the effective number of unique entities can be as low as ceil(top_k / 2) in the worst case.
Additionally, the score-tracking loop unconditionally overwrites scores when the same entity reappears (lines 610–612). Results are ordered by descending score, so later duplicate points have lower scores, meaning the entity ends up with a degraded score.
Score overwrite and dedup issue
For entity A appearing twice in fusion results:
- First point (embedding, score=0.9): added to hit_order, score 0.9 stored in both dense_score_by_entity and sparse_score_by_entity
- Second point (text_field, score=0.7): NOT added to hit_order (the entity is already present, so the dedup check skips it), but score 0.7 overwrites 0.9 in both dicts
Result: entity A gets score 0.7 instead of 0.9.
The fix should (a) request a larger limit (e.g., top_k * 2) from the fusion query, (b) keep the first (best) score per entity rather than overwriting, and (c) truncate hit_order to top_k unique entities.
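A minimal sketch of points (b) and (c), assuming the fused results arrive in descending score order. The `Hit` tuple and `collect_unique_hits` are hypothetical stand-ins for the Qdrant points and the loop in the PR, not the actual implementation; the caller would still need to request a larger limit (point (a)) for the truncation to yield top_k entities.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a scored Qdrant point: (entity_key, score).
Hit = Tuple[Optional[bytes], Optional[float]]


def collect_unique_hits(
    points: List[Hit], top_k: int
) -> Tuple[List[bytes], Dict[bytes, float]]:
    """Deduplicate score-ordered points by entity, keeping the first score seen."""
    hit_order: List[bytes] = []
    score_by_entity: Dict[bytes, float] = {}
    for entity_key, score in points:
        if entity_key is None:
            continue
        if entity_key not in hit_order:
            hit_order.append(entity_key)
        if score is not None:
            # setdefault keeps the first (highest) score instead of overwriting it.
            score_by_entity.setdefault(entity_key, float(score))
    # Cap to top_k unique entities after deduplication.
    kept = hit_order[:top_k]
    return kept, {k: score_by_entity[k] for k in kept if k in score_by_entity}


# Entity A appears twice; its first score (0.9) is the one that survives.
fused = [(b"A", 0.9), (b"B", 0.8), (b"A", 0.7), (b"C", 0.6)]
order, scores = collect_unique_hits(fused, top_k=2)
```

With `limit=top_k` on the fusion query, the same input truncated to two points would have yielded only A and B by luck; with A duplicated in the first two slots it would have yielded A alone, which is why over-fetching matters.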
Prompt for agents
In retrieve_online_documents_v2, the hybrid search path at lines 567-612 has two related issues:
1. The fusion query uses limit=top_k, but because Feast stores one Qdrant point per feature (embedding point and text_field point have different IDs), the same entity can consume two slots. This means fewer unique entities are returned than the user requested. Fix: increase the limit in the fusion query_points call to top_k * 2 (or more), and then truncate hit_order to top_k after deduplication.
2. In the score-processing loop (lines 599-612), when the same entity appears multiple times, the score dicts (dense_score_by_entity and sparse_score_by_entity) are unconditionally overwritten. Since results are ordered by descending score, later appearances have lower scores, degrading the entity's score. Fix: use setdefault or a conditional to only keep the first (highest) score per entity, e.g.:
dense_score_by_entity.setdefault(entity_key_bin, score)
sparse_score_by_entity.setdefault(entity_key_bin, score)
Also consider capping hit_order after the loop: hit_order = hit_order[:top_k]
Files: sdk/python/feast/infra/online_stores/qdrant_online_store/qdrant.py, function retrieve_online_documents_v2
| client.set_sparse_model(config.sparse_embedding_model) | ||
| sparse_vectors: List[models.SparseVector] = [] | ||
| for embedding in client._sparse_embed_documents(texts): |
🚩 _encode_sparse_texts uses private API _sparse_embed_documents
_encode_sparse_texts calls client._sparse_embed_documents(texts) (line 155) — a private/internal method of the Qdrant client (note the leading underscore). While this appears to be the only way to access client-side sparse embedding, this API is not part of the public contract and could break with future qdrant-client upgrades without notice. Additionally, client.set_sparse_model(config.sparse_embedding_model) is called on every invocation rather than once during client initialization. This is a maintainability concern rather than a current bug.
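One way to avoid calling set_sparse_model on every invocation is to remember which model was last configured. The sketch below assumes only that the client exposes `set_sparse_model(name)` as used in the diff; `SparseModelSetup` is a hypothetical helper, not something in the PR or in qdrant-client.

```python
from typing import Any, Optional


class SparseModelSetup:
    """Remembers which sparse model was last configured on one client."""

    def __init__(self) -> None:
        self._configured: Optional[str] = None

    def ensure(self, client: Any, model_name: str) -> bool:
        """Call client.set_sparse_model only when the model name changes.

        Returns True if the client was (re)configured, False if skipped.
        """
        if self._configured == model_name:
            return False
        client.set_sparse_model(model_name)
        self._configured = model_name
        return True
```

The helper tracks a single client, so it would need to live alongside that client (for example, be created wherever the client is created) and be discarded when the client is replaced.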
ad6ef63 to 6b06175
@patelchaitany I think the matrix table in alpha-vector-database.md is inconsistent with the note text.
| {% endhint %} | ||
|
|
||
| **Note**: Milvus and SQLite implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag. | ||
| **Note**: Milvus, SQLite, PostgreSQL, Elasticsearch, MongoDB, and Qdrant implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag. |
The note at line 33 was updated to include Qdrant, but the table at line 17 still shows [] for V2 Support. The table row needs to be updated to [x] for V2 Support (and Online Read).
Done, thanks for pointing that out.
6b06175 to 8c0d332
| sparse_vectors: List[models.SparseVector] = [] | ||
| for embedding in client._sparse_embed_documents(texts): |
@patelchaitany _encode_sparse_texts has a likely bug and uses deprecated internals
client._sparse_embed_documents(texts) at line 155 is called without passing embedding_model_name, so it defaults to "BAAI/bge-small-en" (a dense model). The official usage in qdrant-client's own add() method explicitly passes self.sparse_embedding_model_name. This means the sparse encoding at write time would attempt to use a dense model name to initialize a sparse model, which will either error or produce wrong results.
Additionally, _sparse_embed_documents, _FASTEMBED_INSTALLED, and the surrounding fastembed APIs were deprecated in qdrant-client. The modern approach is to use models.Document objects and let the client handle embedding internally.
Suggested fix: if you must use this private API, pass the model name explicitly:
client._sparse_embed_documents(texts, embedding_model_name=config.sparse_embedding_model)
Or better yet, consider using the non-deprecated models.Document approach for sparse vector creation (as the retrieval path already does at line 559).
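A rough sketch of what the models.Document approach could look like on the write path. It assumes the installed qdrant-client accepts a `models.Document` as a named vector inside `models.PointStruct` and embeds it on upsert (this depends on the client version and on fastembed being available); `build_sparse_point` and its parameters are hypothetical, and the `models` module is passed in so the sketch runs without importing the library.

```python
from typing import Any, Dict


def build_sparse_point(
    models: Any,
    point_id: str,
    text: str,
    sparse_vector_name: str,
    sparse_model: str,
    payload: Dict[str, Any],
) -> Any:
    """Build a point whose sparse vector is a Document the client embeds itself.

    `models` is expected to be `qdrant_client.models` (an assumption of this
    sketch); no private embedding helpers are called here.
    """
    return models.PointStruct(
        id=point_id,
        vector={
            # The model name is passed explicitly, mirroring the retrieval path.
            sparse_vector_name: models.Document(text=text, model=sparse_model),
        },
        payload=payload,
    )
```

Because the model name travels with each Document, the write path and the retrieval path would use the same sparse model by construction, which is the mismatch the comment above is worried about.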
Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>
8c0d332 to e33a37a
What this PR does / why we need it:
This PR adds retrieve_online_documents_v2 for the Qdrant online store so vector search returns all requested features per document, not just the embedding hit. It implements dense embedding search (with a join across Qdrant’s one-point-per-feature layout), plus optional hybrid search when text_search_enabled: true. v1 is unchanged; hybrid is off by default so existing setups don’t need migration. Files touched: qdrant.py, qdrant.md, alpha-vector-database.md, test_qdrant_online_retrieval.py, test_qdrant_retrieve_online_documents_v2.py, and manual_qdrant_v2.py.
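The join across the one-point-per-feature layout can be pictured roughly as follows. This is an illustrative sketch, not the code in qdrant.py: the payload keys `feature_name` and `feature_value` are assumptions (only `entity_key` appears in the diff), and `join_features_by_entity` is a hypothetical name.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def join_features_by_entity(
    hit_order: List[str],
    payloads: Iterable[Dict[str, Any]],
    requested: List[str],
) -> List[Dict[str, Any]]:
    """Group per-feature point payloads into one row per entity, in hit order."""
    by_entity: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for payload in payloads:
        key = payload.get("entity_key")
        name = payload.get("feature_name")  # assumed payload field
        if key is None or name not in requested:
            continue
        by_entity[key][name] = payload.get("feature_value")  # assumed payload field
    # Features missing for an entity are returned as None.
    return [
        {"entity_key": key, **{f: by_entity[key].get(f) for f in requested}}
        for key in hit_order
    ]


payloads = [
    {"entity_key": "a", "feature_name": "embedding", "feature_value": [0.1]},
    {"entity_key": "a", "feature_name": "title", "feature_value": "x"},
    {"entity_key": "b", "feature_name": "title", "feature_value": "y"},
]
rows = join_features_by_entity(["a", "b"], payloads, ["title", "embedding"])
```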
Which issue(s) this PR fixes:
#6445