Merged
updated readme
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
franciscojavierarceo committed Jan 30, 2025
commit c629a42fe58cbe7a7344b5f9516cc5a4b4b81323
18 changes: 11 additions & 7 deletions examples/rag/README.md
# 🚀 Quickstart: Retrieval-Augmented Generation (RAG) using Feast and Large Language Models (LLMs)

This project demonstrates how to use **Feast** to power a **Retrieval-Augmented Generation (RAG)** application.
The RAG architecture combines document retrieval (using vector search) with in-context learning (ICL) through a
**Large Language Model (LLM)** to answer user questions accurately using structured and unstructured data.

## 💡 Why Use Feast for RAG?

- **Online retrieval of features:** Ensure real-time access to precomputed document embeddings and other structured data.
- **Declarative feature definitions:** Define feature views and entities in a Python file, empowering data scientists to ship scalable RAG applications with all of the existing benefits of Feast.
- **Vector search:** Leverage Feast’s integration with vector databases like **Milvus** to find relevant documents based on a similarity metric (e.g., cosine).
- **Structured and unstructured context:** Retrieve both embeddings and traditional features, injecting richer context into LLM prompts.
- **Versioning and reusability:** Collaborate across teams with discoverable, versioned data pipelines.
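The similarity lookup behind the vector-search bullet can be sketched in plain Python. This is a minimal illustration of cosine-similarity top-k retrieval, not Feast's or Milvus's actual implementation; the toy documents and query embedding below are made up:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=2):
    """Return the ids of the k documents most similar to the query embedding."""
    scored = sorted(docs, key=lambda d: cosine_similarity(query, d["vector"]), reverse=True)
    return [d["id"] for d in scored[:k]]

# Toy 3-dimensional embeddings (real embeddings have hundreds of dimensions).
docs = [
    {"id": "doc_nyc", "vector": [0.9, 0.1, 0.0]},
    {"id": "doc_albany", "vector": [0.7, 0.3, 0.1]},
    {"id": "doc_paris", "vector": [0.0, 0.2, 0.9]},
]
query = [1.0, 0.0, 0.0]

print(top_k(query, docs))  # → ['doc_nyc', 'doc_albany']
```

A vector database does the same ranking, but over millions of embeddings with an approximate-nearest-neighbor index instead of a full sort.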

```python
context_data = store.retrieve_online_documents_v2(
    # …call arguments elided in this excerpt…
)
display(context_data)
```

## 🔑 Explanation of Core Concepts
- **Feature View:** Defines the schema of features and how they are retrieved from the offline store.
- **Entity:** Represents primary keys like `item_id` for indexing and lookup.
- **Vector Search:** Uses Milvus Lite to retrieve document embeddings based on cosine similarity.
- **LLM Context Injection:** Retrieved documents are formatted and injected into the LLM prompt to provide grounded responses.
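The context-injection step amounts to string formatting: retrieved rows are flattened into the prompt before the LLM call. A minimal sketch, where the `text` field and the prompt template are illustrative assumptions rather than the exact ones used in this example:

```python
def build_prompt(question, retrieved_docs):
    """Inject retrieved document text into an LLM prompt as grounding context."""
    context = "\n".join(f"- {doc['text']}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical retrieved documents, standing in for rows returned by Feast.
docs = [
    {"text": "New York City had an estimated population of 8,335,897 in 2022."},
    {"text": "Albany is the capital of New York State."},
]
prompt = build_prompt("Which city has the largest population in New York?", docs)
print(prompt)
```

The resulting string is what gets sent to the LLM, so the model's answer is grounded in the retrieved features rather than in its parametric memory alone.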
## 📊 Example Output

When querying: *"Which city has the largest population in New York?"*

The model provides:

```
The largest city in New York is New York City, often referred to as NYC. It is the most populous city in the United States, with an estimated population of 8,335,897 in 2022.
```