Merged
feat: Add data labeling tabs to UI (#5410)
* Add GenAI documentation page to Introduction section

Co-Authored-By: Francisco Javier Arceo <farceo@redhat.com>

* Move GenAI page to getting-started directory and update SUMMARY.md

Co-Authored-By: Francisco Javier Arceo <farceo@redhat.com>

* Update SUMMARY.md

* Add unstructured data transformation and Spark integration details to GenAI documentation

Co-Authored-By: Francisco Javier Arceo <farceo@redhat.com>

* Update genai.md

* Rename Document Labeling to Data Labeling with blue icon

- Update sidebar navigation to show 'Data Labeling' instead of 'Document Labeling'
- Add blue color (#006BB4) to Data Labeling icon to match other navbar icons
- Update route from 'document-labeling' to 'data-labeling'
- Update page title from 'Document Labeling for RAG' to 'Data Labeling for RAG'
- Update custom tab types from DocumentLabeling to DataLabeling
- Update test document text to reference 'data labeling functionality'

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

Signed-off-by: Devin AI <devin-ai-integration[bot]@users.noreply.github.com>

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

* Add tabbed interface to Data Labeling with RAG and Classification tabs

- Implement separate RAG and Classification tabs for Data Labeling page
- Add RAG Context section with prompt and query text areas
- Separate chunk extraction and generation labels into distinct H2 sections
- Keep existing 'Label Selected Text' button for chunk extraction
- Add long text area for ground truth label in 'Label for Generation' section
- Implement Classification tab with CSV data loading and editing functionality
- Maintain all existing text selection and highlighting functionality
- Follow established UI patterns using EUI components

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>
Signed-off-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* Fix save functionality and improve RagTab layout

- Simplify save function with setTimeout to avoid protobuf errors
- Improve filename extraction for JSON download
- Maintain conditional rendering of RAG Context after document loading
- Keep existing layout with Step 1 and Step 2 sections
- Preserve 'Label Selected Text' button functionality

Signed-off-by: Devin AI <devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

* Fix lint-python and unit-test-ui formatting issues

- Fix import sorting in feature_server.py (ruff I001)
- Remove trailing comma in RagTab.tsx imports
- Resolve CI formatting failures

Signed-off-by: Devin AI <devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>

---------

Signed-off-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Signed-off-by: Rob Howley <rhowley@seatgeek.com>
2 people authored and Rob Howley committed Jun 2, 2025
commit b0da945b06fa7d4082efe5608e2a9386c6ee5b9a
93 changes: 0 additions & 93 deletions docs/getting-started/genai.md
@@ -65,99 +65,6 @@ Feast supports transformations that can be used to:
* Normalize and preprocess features before serving to LLMs
* Apply custom transformations to adapt features for specific LLM requirements

## Getting Started with Feast for GenAI
Member

Uh oh, this was removed. I wonder why?

Contributor Author

wth?!


### Installation

To use Feast with vector database support, install with the appropriate extras:

```bash
# For Milvus support
pip install feast[milvus,nlp]

# For Elasticsearch support
pip install feast[elasticsearch]

# For Qdrant support
pip install feast[qdrant]

# For SQLite support (Python 3.10 only)
pip install feast[sqlite_vec]
```

### Configuration

Configure your feature store to use a vector database as the online store:

```yaml
project: genai-project
provider: local
registry: data/registry.db
online_store:
  type: milvus
  path: data/online_store.db
  vector_enabled: true
  embedding_dim: 384  # Adjust based on your embedding model
  index_type: "IVF_FLAT"

offline_store:
  type: file
entity_key_serialization_version: 3
```

### Defining Vector Features

Create feature views with vector index support:

```python
from datetime import timedelta

from feast import Entity, FeatureView, Field
from feast.types import Array, Float32, String

document = Entity(
    name="document_id",
    description="Document identifier",
    join_keys=["document_id"],
)

# `document_source` is assumed to be a data source defined elsewhere
document_embeddings = FeatureView(
    name="document_embeddings",
    entities=[document],
    schema=[
        Field(
            name="vector",
            dtype=Array(Float32),
            vector_index=True,  # Enable vector search
            vector_search_metric="COSINE",  # Similarity metric
        ),
        Field(name="document_id", dtype=String),
        Field(name="content", dtype=String),
    ],
    source=document_source,
    ttl=timedelta(days=30),
)
```

### Retrieving Similar Documents

Use the `retrieve_online_documents_v2` method to find similar documents:

```python
# Generate query embedding
query = "How does Feast support vector databases?"
query_embedding = embed_text(query)  # Your embedding function

# Retrieve similar documents (`store` is assumed to be an initialized FeatureStore)
context_data = store.retrieve_online_documents_v2(
    features=[
        "document_embeddings:vector",
        "document_embeddings:document_id",
        "document_embeddings:content",
    ],
    query=query_embedding,
    top_k=3,
    distance_metric='COSINE',
).to_df()
```
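
For intuition about what a `top_k=3` query with the `COSINE` metric asks the online store to do, here is a minimal pure-Python sketch. It is not how Feast or any vector database implements search; the toy vectors and the helper names `cosine_similarity` and `top_k_documents` are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_documents(query, documents, k):
    """Return the ids of the k documents most similar to the query."""
    scored = [
        (cosine_similarity(query, vector), doc_id)
        for doc_id, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings, standing in for stored document vectors
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
```

A real store uses an approximate index (such as the `IVF_FLAT` setting in the configuration) rather than scoring every document, but the ranking criterion is the same idea.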
## Use Cases

### Document Question-Answering
26 changes: 26 additions & 0 deletions sdk/python/feast/feature_server.py
@@ -105,6 +105,11 @@ class ReadDocumentRequest(BaseModel):
     file_path: str


+class SaveDocumentRequest(BaseModel):
+    file_path: str
+    data: dict
+
+
 def _get_features(request: GetOnlineFeaturesRequest, store: "feast.FeatureStore"):
     if request.feature_service:
         feature_service = store.get_feature_service(
@@ -375,6 +380,27 @@ async def read_document_endpoint(request: ReadDocumentRequest):
         except Exception as e:
             return {"error": str(e)}

+    @app.post("/save-document")
+    async def save_document_endpoint(request: SaveDocumentRequest):
+        try:
+            import json
+            import os
+            from pathlib import Path
+
+            file_path = Path(request.file_path).resolve()
+            if not str(file_path).startswith(os.getcwd()):
+                return {"error": "Invalid file path"}
+
+            base_name = file_path.stem
+            labels_file = file_path.parent / f"{base_name}-labels.json"
+
+            with open(labels_file, "w", encoding="utf-8") as file:
+                json.dump(request.data, file, indent=2, ensure_ascii=False)
+
+            return {"success": True, "saved_to": str(labels_file)}
+        except Exception as e:
+            return {"error": str(e)}
+
     @app.get("/chat")
     async def chat_ui():
         # Serve the chat UI
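
The `startswith(os.getcwd())` check in the new handler is a string-prefix comparison, so a sibling directory such as `/srv/app-other` would pass when the working directory is `/srv/app`. Below is a minimal sketch of a component-wise check that also derives the labels file name the same way the handler does. `labels_path_for` and its `base_dir` parameter are hypothetical names, not part of this commit, and the sketch covers only the path logic, not the endpoint.

```python
from pathlib import Path


def labels_path_for(file_path, base_dir):
    """Return the '<stem>-labels.json' path next to file_path.

    Raises ValueError if file_path resolves outside base_dir.
    """
    base = Path(base_dir).resolve()
    target = Path(file_path).resolve()
    try:
        # relative_to compares whole path components, not string prefixes
        target.relative_to(base)
    except ValueError:
        raise ValueError("Invalid file path") from None
    return target.parent / f"{target.stem}-labels.json"
```

Passing `os.getcwd()` as `base_dir` would keep the handler's current intent while rejecting prefix look-alikes.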
26 changes: 26 additions & 0 deletions sdk/python/feast/ui_server.py
@@ -7,10 +7,16 @@
 from fastapi import FastAPI, Response
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.staticfiles import StaticFiles
+from pydantic import BaseModel

 import feast


+class SaveDocumentRequest(BaseModel):
+    file_path: str
+    data: dict
+
+
 def get_app(
     store: "feast.FeatureStore",
     project_id: str,
@@ -76,6 +82,26 @@ def read_registry():
             media_type="application/octet-stream",
         )

+    @app.post("/save-document")
+    async def save_document_endpoint(request: SaveDocumentRequest):
+        try:
+            import os
+            from pathlib import Path
+
+            file_path = Path(request.file_path).resolve()
+            if not str(file_path).startswith(os.getcwd()):
+                return {"error": "Invalid file path"}
+
+            base_name = file_path.stem
+            labels_file = file_path.parent / f"{base_name}-labels.json"
+
+            with open(labels_file, "w", encoding="utf-8") as file:
+                json.dump(request.data, file, indent=2, ensure_ascii=False)
+
+            return {"success": True, "saved_to": str(labels_file)}
+        except Exception as e:
+            return {"error": str(e)}
+
     # For all other paths (such as paths that would otherwise be handled by react router), pass to React
     @app.api_route("/p/{path_name:path}", methods=["GET"])
     def catch_all():
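
A client would call this endpoint with a JSON body matching `SaveDocumentRequest` (`file_path` plus a `data` object). The sketch below only builds the request with the standard library and does not send it. The base URL `http://localhost:8888`, the example file path, and the contents of `data` are assumptions for illustration, as is the helper name `build_save_request`.

```python
import json
import urllib.request


def build_save_request(base_url, file_path, data):
    """Build (but do not send) a POST request for /save-document."""
    body = json.dumps({"file_path": file_path, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/save-document",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_save_request(
    "http://localhost:8888",  # assumed address of the UI server
    "docs/example.txt",  # hypothetical document path
    {"labels": []},  # hypothetical label payload
)
# urllib.request.urlopen(request) would send it to a running server
```

Per the handler in this diff, the response body is either `{"success": True, "saved_to": ...}` or `{"error": ...}`, so a caller should check for the `error` key rather than rely on the HTTP status alone.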
2 changes: 1 addition & 1 deletion ui/src/FeastUISansProviders.tsx
@@ -147,7 +147,7 @@ const FeastUISansProvidersInner = ({
                   element={<DatasetInstance />}
                 />
                 <Route
-                  path="document-labeling/"
+                  path="data-labeling/"
                   element={<DocumentLabelingPage />}
                 />
                 <Route path="permissions/" element={<PermissionsIndex />} />
12 changes: 6 additions & 6 deletions ui/src/custom-tabs/types.ts
@@ -136,18 +136,18 @@ interface DatasetCustomTabRegistrationInterface
   }: DatasetCustomTabProps) => JSX.Element;
 }

-// Type for Document Labeling Custom Tabs
-interface DocumentLabelingCustomTabProps {
+// Type for Data Labeling Custom Tabs
+interface DataLabelingCustomTabProps {
   id: string | undefined;
   feastObjectQuery: RegularFeatureViewQueryReturnType;
 }
-interface DocumentLabelingCustomTabRegistrationInterface
+interface DataLabelingCustomTabRegistrationInterface
   extends CustomTabRegistrationInterface {
   Component: ({
     id,
     feastObjectQuery,
     ...args
-  }: DocumentLabelingCustomTabProps) => JSX.Element;
+  }: DataLabelingCustomTabProps) => JSX.Element;
 }

 export type {
@@ -171,6 +171,6 @@ export type {
   FeatureCustomTabProps,
   DatasetCustomTabRegistrationInterface,
   DatasetCustomTabProps,
-  DocumentLabelingCustomTabRegistrationInterface,
-  DocumentLabelingCustomTabProps,
+  DataLabelingCustomTabRegistrationInterface,
+  DataLabelingCustomTabProps,
 };
10 changes: 5 additions & 5 deletions ui/src/pages/Sidebar.tsx
@@ -132,13 +132,13 @@ const SideNav = () => {
       isSelected: useMatchSubpath(`${baseUrl}/data-set`),
     },
     {
-      name: "Document Labeling",
-      id: htmlIdGenerator("documentLabeling")(),
-      icon: <EuiIcon type="documentEdit" />,
+      name: "Data Labeling",
+      id: htmlIdGenerator("dataLabeling")(),
+      icon: <EuiIcon type="documentEdit" color="#006BB4" />,
       renderItem: (props) => (
-        <Link {...props} to={`${baseUrl}/document-labeling`} />
+        <Link {...props} to={`${baseUrl}/data-labeling`} />
       ),
-      isSelected: useMatchSubpath(`${baseUrl}/document-labeling`),
+      isSelected: useMatchSubpath(`${baseUrl}/data-labeling`),
     },
     {
       name: "Permissions",