-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat: Adding blog on RAG with Milvus #5161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
946ffa3
13e4330
65211f5
a9b2a1a
027cb4b
6d7e1ac
4685750
ee53d42
7e1e3d2
ce1b114
5de9019
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,167 @@ | ||
| --- | ||
| title: Retrieval Augmented Generation with Feast | ||
| description: How Feast can empower you to customize your RAG applications for maximum flexibility. | ||
| date: 2025-03-17 | ||
| authors: ["Francisco Javier Arceo"] | ||
| --- | ||
|
|
||
| <div class="hero-image"> | ||
| <img src="/images/blog/space.jpg" alt="Exploring the Possibilities of AI" loading="lazy"> | ||
| </div> | ||
|
|
||
| # Is a Feature Store a good fit for GenAI use cases? | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| Yes, a Feature Store is a great fit for GenAI use cases! | ||
|
|
||
| That's because Feature Stores were developed over the last 10 years to explicitly handle the problems AI | ||
| practitioners faced when working with data. The Feast maintainers will continue to invest in making the GenAI development experience a first-class | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| citizen so that you can rely on Feast to customize your AI applications. If you have thoughts or ideas | ||
| feel free to reach out! | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
|
|
||
| # What is Retrieval Augmented Generation (RAG)? | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This section is fine, but can probably be simplified. We can assume readers know what RAG is and dont need to be educated on it. |
||
| [RAG](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) is a technique that combines generative models | ||
| (e.g., LLMs) with retrieval systems to generate contextually relevant output for a particular goal (e.g., | ||
| question and answering). | ||
|
|
||
| The typical RAG process involves: | ||
| 1. Sourcing text data relevant for your application | ||
| 2. Transforming each text document into smaller chunks of text | ||
| 3. Transforming those chunks of text into embeddings | ||
| 4. Inserting those chunks of text along with some identifier for the chunk and document in some database | ||
| 5. Retrieving those chunks of text along with the identifiers at run-time to inject that text into the LLM's context | ||
| 6. Calling some API to run inference with your LLM to generate contextually relevant output | ||
| 7. Returning the output to some end user | ||
|
|
||
| Implicit in (1)-(4) is the potential of scaling to large amounts of data (i.e., using some form of distributed computing), | ||
| orchestrating that scaling through some batch or streaming pipeline, and customization of key transformation decisions | ||
| (e.g., tokenization, model, chunking, data format, etc.). | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
|
|
||
| # What is the scope of Retrieval? | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| So, the Retrieval in RAG also includes: | ||
| 1. Ingestion | ||
| 1. Transformation | ||
| 1. Indexing | ||
| 1. Retrieval | ||
|
|
||
| RAG patterns often use vector similarity search for the retrieval step, but it is worth emphasizing that this is not the | ||
| only retrieval pattern that can be useful. In fact, standard entity-based retrieval can be very powerful for | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| applications where relevant user-context is necessary. | ||
|
|
||
| # How does that relate to Feast? | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd add a section on how RAG falls apart today with only a vector store and no feature store.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good call.
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| Feast is focused on supporting the lifecycle of data for AI/ML applications. Feast was built to support the | ||
| offline training, online serving, and metadata management so that users can successfully scale their production AI | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. very hand wavy "scale" -> I'd make it more clear what Feast helps with specifically.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I kind of want to balance brevity and Feast semantics (it's not always obvious to non-Feast users what an offline vs online store is). I'll try to clarify. |
||
| applications. | ||
|
|
||
| What that means is that Feast can help you not only serve your documents, user data, and other metadata for production | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| RAG applications, but it can also help you scale your embeddings on large amounts of data (e.g,. using Spark to embed | ||
| gigabytes of documents), re-use the same code online and offline, prepare your datasets so you can fine tune your | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| embedding, generative, or retrieval models later, and track changes to your transformations, data sources, and | ||
| RAG-sources to provide you with replayability and data lineage. | ||
|
|
||
| Historically, Feast catered to Data Scientists and ML Engineers who implemented their own types of data/feature transformations but, now, | ||
| many RAG providers handle this out of the box for you. We will invest in creating extendable implementations to make it easier | ||
| to customize your applications. | ||
|
|
||
| # How might I want to customize retrieval? | ||
| As mentioned, Feast allows you to customize: | ||
| - Chunking | ||
| - Tokenization | ||
| - Embedding | ||
| - The offline engine to embed huge amounts of documents | ||
| - The metadata you may want to structure or weight differently during retrieval | ||
| - Hybrid Retrieval strategies combining dense and sparse retrieval | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| - Reranking mechanisms | ||
| - Fine Tuning embedding, retrieval, and generative models | ||
|
|
||
| # Feast Powered by Milvus | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd spell out what the user should be getting. "Here's an example of using Feast to do X Y and Z. Milvus provides B, while Feast makes it easier to do A" |
||
|
|
||
| ## Step 0: Configure Milvus | ||
| Configure milvus in a simple `yaml` file. | ||
| ```yaml | ||
| project: rag | ||
| provider: local | ||
| registry: data/registry.db | ||
| online_store: | ||
| type: milvus | ||
| path: data/online_store.db | ||
| vector_enabled: true | ||
| embedding_dim: 384 | ||
| index_type: "IVF_FLAT" | ||
|
|
||
|
|
||
| offline_store: | ||
| type: file | ||
| entity_key_serialization_version: 3 | ||
| # By default, no_auth for authentication and authorization, other possible values kubernetes and oidc. Refer the documentation for more details. | ||
| auth: | ||
| type: no_auth | ||
| ``` | ||
|
|
||
| ## Step 1: Define your Data Sources and Views | ||
| You define your data declaratively using Feast's FeatureView and entity objects: | ||
| ```python | ||
| document = Entity( | ||
| name="document_id", | ||
| description="Document ID", | ||
| value_type=ValueType.INT64, | ||
| ) | ||
|
|
||
| source = FileSource( | ||
| file_format=ParquetFormat(), | ||
| path="./data/my_data.parquet", | ||
| timestamp_field="event_timestamp", | ||
| ) | ||
|
|
||
| # Define the view for retrieval | ||
| city_embeddings_feature_view = FeatureView( | ||
| name="city_embeddings", | ||
| entities=[document], | ||
| schema=[ | ||
| Field( | ||
| name="vector", | ||
| dtype=Array(Float32), | ||
| vector_index=True, # Vector search enabled | ||
| vector_search_metric="COSINE", # Distance metric configured | ||
| ), | ||
| Field(name="state", dtype=String), | ||
| Field(name="sentence_chunks", dtype=String), | ||
| Field(name="wiki_summary", dtype=String), | ||
| ], | ||
| source=source, | ||
| ttl=timedelta(hours=2), | ||
| ) | ||
| ``` | ||
|
|
||
| ## Step 2: Ingest your Data | ||
|
|
||
| ```python | ||
| store.write_to_online_store(feature_view_name='city_embeddings', df=df) | ||
| ``` | ||
|
|
||
| ## Step 3: Retrieve your Data | ||
|
|
||
| ```python | ||
| context_data = store.retrieve_online_documents_v2( | ||
| features=[ | ||
| "city_embeddings:vector", | ||
| "city_embeddings:document_id", | ||
| "city_embeddings:state", | ||
| "city_embeddings:sentence_chunks", | ||
| "city_embeddings:wiki_summary", | ||
| ], | ||
| query=query_embedding, | ||
| top_k=3, | ||
| distance_metric='COSINE', | ||
| ).to_df() | ||
| ``` | ||
|
|
||
| ## What are the benefits from using Feast? | ||
|
franciscojavierarceo marked this conversation as resolved.
Outdated
|
||
| You get a lot for free. | ||
| 1. Data dictionary/metadata catalog autogenerated from code | ||
| 3. UI exposing the metadata catalog | ||
| 2. FastAPI Server to serve your data | ||
| 3. Role Based Access Control (RBAC) to govern access to your data | ||
| 6. Deployment on Kubernetes using our Helm Chart or our Operator | ||
| 7. Multiple vector database providers | ||
| 8. Multiple data warehouse providers | ||
| 9. Support for different data formats | ||
| 10. Support for stream and batch processors (e.g., Spark and Flink) | ||
Uh oh!
There was an error while loading. Please reload this page.