| 1 | +# ElasticSearch online store (contrib) |
| 2 | + |
| 3 | +## Description |
| 4 | + |
| 5 | +The ElasticSearch online store provides support for materializing tabular feature values, as well as embedding feature vectors, into an ElasticSearch index for serving online features. \ |
| 6 | +The embedding feature vectors are stored as dense vectors, and can be used for similarity search. More information on dense vectors can be found [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html). |
| 7 | + |
| 8 | +## Getting started |
| 9 | +To use this online store, install Feast with the Elasticsearch extra: `pip install 'feast[elasticsearch]'`. Then create a new feature repository from the template with `feast init -t elasticsearch`. |
| 10 | + |
| 11 | +## Example |
| 12 | + |
| 13 | +{% code title="feature_store.yaml" %} |
| 14 | +```yaml |
| 15 | +project: my_feature_repo |
| 16 | +registry: data/registry.db |
| 17 | +provider: local |
| 18 | +online_store: |
| 19 | + type: elasticsearch |
| 20 | + host: ES_HOST |
| 21 | + port: ES_PORT |
| 22 | + user: ES_USERNAME |
| 23 | + password: ES_PASSWORD |
| 24 | + vector_len: 512 |
| 25 | + write_batch_size: 1000 |
| 26 | +``` |
| 27 | +{% endcode %} |
| 28 | + |
| 29 | +The full set of configuration options is available in [ElasticsearchOnlineStoreConfig](https://rtd.feast.dev/en/master/#feast.infra.online_stores.contrib.elasticsearch.ElasticsearchOnlineStoreConfig). |
| 30 | + |
| 31 | +## Functionality Matrix |
| 32 | + |
| 33 | + |
| 34 | +| | ElasticSearch | |
| 35 | +| :-------------------------------------------------------- | :------- | |
| 36 | +| write feature values to the online store | yes | |
| 37 | +| read feature values from the online store | yes | |
| 38 | +| update infrastructure (e.g. tables) in the online store | yes | |
| 39 | +| teardown infrastructure (e.g. tables) in the online store | yes | |
| 40 | +| generate a plan of infrastructure changes | no | |
| 41 | +| support for on-demand transforms | yes | |
| 42 | +| readable by Python SDK | yes | |
| 43 | +| readable by Java | no | |
| 44 | +| readable by Go | no | |
| 45 | +| support for entityless feature views | yes | |
| 46 | +| support for concurrent writing to the same key | no | |
| 47 | +| support for ttl (time to live) at retrieval | no | |
| 48 | +| support for deleting expired data | no | |
| 49 | +| collocated by feature view | yes | |
| 50 | +| collocated by feature service | no | |
| 51 | +| collocated by entity key | no | |
| 52 | + |
| 53 | +To compare this set of functionality against other online stores, please see the full [functionality matrix](overview.md#functionality-matrix). |
| 54 | + |
| 55 | +## Retrieving online document vectors |
| 56 | + |
| 57 | +The ElasticSearch online store supports retrieving document vectors for a given list of entity keys. The document vectors are returned as a dictionary where the key is the entity key and the value is the document vector. The document vector is a dense vector of floats. |
| 58 | + |
| 59 | +{% code title="python" %} |
| 60 | +```python |
| 61 | +from feast import FeatureStore |
| 62 | + |
| 63 | +feature_store = FeatureStore(repo_path=".")  # directory that contains feature_store.yaml |
| 64 | + |
| 65 | +query_vector = [1.0, 2.0, 3.0, 4.0, 5.0] |
| 66 | +top_k = 5 |
| 67 | + |
| 68 | +# Retrieve the top k closest features to the query vector |
| 69 | + |
| 70 | +feature_values = feature_store.retrieve_online_documents( |
| 71 | + feature="my_feature", |
| 72 | + query=query_vector, |
| 73 | + top_k=top_k |
| 74 | +) |
| 75 | +``` |
| 76 | +{% endcode %} |
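
To make the ranking step concrete, the following is a minimal sketch of what a top-k similarity lookup computes, written in plain Python rather than against Elasticsearch. The helper names `cosine_similarity` and `top_k_closest`, the sample vectors, and the choice of cosine similarity are all illustrative assumptions and not part of Feast; the online store delegates the actual ranking to Elasticsearch using the configured similarity metric.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_closest(
    query: List[float], documents: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k (key, score) pairs with the highest similarity to query."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical stored vectors, keyed by a made-up document identifier.
documents = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
ranked = top_k_closest([1.0, 0.1], documents, k=2)
```

Here `ranked` holds `doc_a` first and `doc_c` second, because the query points almost exactly along the first axis.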
| 77 | + |
| 78 | +## Indexing |
| 79 | +Currently, the ElasticSearch online store creates its index with the following mapping: |
| 80 | + |
| 81 | +{% code title="indexing_mapping" %} |
| 82 | +```json |
| 83 | +"properties": { |
| 84 | + "entity_key": {"type": "binary"}, |
| 85 | + "feature_name": {"type": "keyword"}, |
| 86 | + "feature_value": {"type": "binary"}, |
| 87 | + "timestamp": {"type": "date"}, |
| 88 | + "created_ts": {"type": "date"}, |
| 89 | + "vector_value": { |
| 90 | + "type": "dense_vector", |
| 91 | + "dims": config.online_store.vector_len, |
| 92 | + "index": "true", |
| 93 | + "similarity": config.online_store.similarity, |
| 94 | + }, |
| 95 | +} |
| 96 | +``` |
| 97 | +{% endcode %} |
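
The mapping above is a fragment that refers to two configuration values. As a rough sketch of how those values slot in, the snippet below builds the same structure as a Python dictionary and serializes it to JSON. The helper name `build_index_mapping` is hypothetical, and `"cosine"` is an assumed value for the similarity setting; `512` matches the `vector_len` in the example `feature_store.yaml`.

```python
import json
from typing import Any, Dict


def build_index_mapping(vector_len: int, similarity: str) -> Dict[str, Any]:
    """Return a mapping shaped like the one shown above (hypothetical helper)."""
    if vector_len <= 0:
        raise ValueError("vector_len must be a positive integer")
    return {
        "properties": {
            "entity_key": {"type": "binary"},
            "feature_name": {"type": "keyword"},
            "feature_value": {"type": "binary"},
            "timestamp": {"type": "date"},
            "created_ts": {"type": "date"},
            "vector_value": {
                "type": "dense_vector",
                "dims": vector_len,  # must equal the length of every stored vector
                "index": True,  # written as a boolean here; serializes to JSON true
                "similarity": similarity,
            },
        }
    }


# Assumed example values, see the lead-in.
mapping = build_index_mapping(vector_len=512, similarity="cosine")
mapping_json = json.dumps(mapping, indent=2)
```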
| 98 | +The `online_read` API queries the index with: |
| 99 | + |
| 100 | +{% code title="online_read_mapping" %} |
| 101 | +```json |
| 102 | +"query": { |
| 103 | + "bool": { |
| 104 | + "must": [ |
| 105 | + {"terms": {"entity_key": entity_keys}}, |
| 106 | + {"terms": {"feature_name": requested_features}}, |
| 107 | + ] |
| 108 | + } |
| 109 | +}, |
| 110 | +``` |
| 111 | +{% endcode %} |
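
Both `terms` clauses sit under `must`, so a document is returned only when its entity key appears in the first list and its feature name appears in the second. A small sketch of assembling that body follows; the helper name `build_online_read_query` and the sample keys and feature names are placeholders chosen for illustration, not values taken from Feast.

```python
from typing import Any, Dict, List


def build_online_read_query(
    entity_keys: List[str], requested_features: List[str]
) -> Dict[str, Any]:
    """Build a bool query matching any listed key AND any listed feature name."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {"entity_key": entity_keys}},
                    {"terms": {"feature_name": requested_features}},
                ]
            }
        }
    }


# Placeholder inputs: two serialized entity keys and two feature names.
body = build_online_read_query(
    entity_keys=["a2V5MQ==", "a2V5Mg=="],
    requested_features=["feature_a", "feature_b"],
)
```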
| 112 | + |
| 113 | +The similarity search API issues the following vector search: |
| 114 | + |
| 115 | +{% code title="similarity_search_mapping" %} |
| 116 | +```json |
| 117 | +{ |
| 118 | + "field": "vector_value", |
| 119 | + "query_vector": embedding_vector, |
| 120 | + "k": top_k, |
| 121 | +} |
| 122 | +``` |
| 123 | +{% endcode %} |
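
Elasticsearch expects the query vector to have the same number of dimensions as the `dims` of the `dense_vector` field, so it can help to check the length before sending a search. The sketch below builds the clause shown above with that check; `build_knn_clause` is a hypothetical helper, and the three-element vector is a toy value.

```python
from typing import Any, Dict, List


def build_knn_clause(
    embedding_vector: List[float], top_k: int, vector_len: int
) -> Dict[str, Any]:
    """Validate the inputs, then return the vector search clause as a dict."""
    if len(embedding_vector) != vector_len:
        raise ValueError(
            f"query vector has {len(embedding_vector)} dims, expected {vector_len}"
        )
    if top_k <= 0:
        raise ValueError("top_k must be a positive integer")
    return {
        "field": "vector_value",
        "query_vector": list(embedding_vector),
        "k": top_k,
    }


# Toy example: a 3-dimensional index and a request for the 5 nearest vectors.
clause = build_knn_clause([0.1, 0.2, 0.3], top_k=5, vector_len=3)
```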
| 124 | + |
| 125 | +These APIs are subject to change in future versions of Feast to improve performance and usability. |