
Add ability to group pipelines into "folders" #6229

Open
Karakatiza666 wants to merge 2 commits into main from group-pipelines

@Karakatiza666
Contributor

pipeline-manager changes

The unused description field is being replaced by a metadata field that is not bound to the pipeline deployment state; clients can use it to store arbitrary data, such as keys for organizing pipelines (folder paths and tags). metadata can be updated while the pipeline is running, and updating it increments neither version nor refresh_version.
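
As an illustration of the update semantics described above, here is a minimal TypeScript sketch (the real logic lives in the Rust pipeline-manager and is not reproduced here). It assumes, for simplicity, that every change other than a metadata-only one bumps both counters; `PipelineRecord`, `PipelinePatch`, `isMetadataOnly` and `applyPatch` are hypothetical names introduced for this example.

```typescript
// Hypothetical, simplified model of a pipeline row. Only `metadata`,
// `version` and `refresh_version` are named in the PR description.
interface PipelineRecord {
  name: string
  programCode: string
  metadata: string
  version: number
  refreshVersion: number
}

type PipelinePatch = Partial<Pick<PipelineRecord, 'name' | 'programCode' | 'metadata'>>

// True when the patch touches `metadata` and nothing else.
function isMetadataOnly(patch: PipelinePatch): boolean {
  const keys = Object.keys(patch)
  return keys.length > 0 && keys.every((key) => key === 'metadata')
}

// Applies a patch. Assumption: any non-metadata change bumps both counters,
// while a metadata-only change leaves them untouched.
function applyPatch(record: PipelineRecord, patch: PipelinePatch): PipelineRecord {
  if (Object.keys(patch).length === 0) {
    return record
  }
  const updated: PipelineRecord = { ...record, ...patch }
  if (isMetadataOnly(patch)) {
    return updated
  }
  return {
    ...updated,
    version: record.version + 1,
    refreshVersion: record.refreshVersion + 1
  }
}
```

With this model, patching only `metadata` on a record at version 3 returns a record that is still at version 3, while patching `programCode` returns version 4.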

web-console changes

The new tree view UI supports intuitive drag-and-drop actions for organizing pipelines into folders: grab a pipeline by the handle icon on the left, drag one pipeline onto another to create a new folder, or drag a folder or pipeline into another folder or a scope to move it there.
The new view drops per-column sorting because it does not seem to be very useful, and enforces lexicographical ordering of pipelines.

Also added the tree view and search to pipeline edit page for feature parity.

I implemented the folder UI before the tags UI because the tags UI needs additional UI/UX design; folders are a more straightforward feature. The metadata field makes it possible to add the tag functionality seamlessly later.

Screencast.from.2026-05-14.02-39-31.webm

Testing: manual

…ndependent of deployment status, deprecate 'description'

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>
…a' field

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>
@Karakatiza666 Karakatiza666 requested a review from snkas May 13, 2026 22:52
@mihaibudiu
Contributor

so this is not per user, it's per pipeline manager.
what happens to your view if someone else is moving pipelines from another browser?

@gz
Contributor

gz commented May 13, 2026

I'm a bit confused why we went with folders when users were asking for tags and in the issue everyone seemed to like tags?

Contributor

@gz gz left a comment

Let's just build the version people asked for (tags) from the beginning; no need to confuse users with different models.

@@ -0,0 +1,6 @@
-- Add metadata field to the pipeline table.
-- This is arbitrary, optional text provided by the user.
ALTER TABLE pipeline ADD COLUMN metadata TEXT NOT NULL DEFAULT '';
Contributor

I don't think it's good to have a catch-all JSON element like this. We're already strictly structured; we don't have to switch to the dark side of using unstructured data on top of a relational database.

pub struct PipelineInfo {
pub id: PipelineId,
pub name: String,
/// Deprecated: use `metadata` instead.
Contributor

I don't think we need to mark it deprecated; nothing has changed. We weren't using it before, so we don't need to use it now?

Contributor Author

I want to discuss this more with Simon once he's back; I think it's beneficial to fold the "description" column into the "metadata" column model I propose here.

&& program_config.is_none())
);

// Metadata-only fast path. `metadata` is a free-form client-side annotation
Contributor

I don't understand how it can be free-form and at the same time serve as a web-ui folder structure.

If we use it for that, IMO it deserves a well-structured field that is reserved for use by webconsole so we don't have arbitrary clients overwriting/using it differently.

@Karakatiza666
Contributor Author

@gz

We're already strictly structured; we don't have to switch to the dark side of using unstructured data on top of a relational database.

The model I am proposing is not "unstructured"; it just doesn't tie the relational schema to the shape of a family of pipeline information fields with similar semantics: fields such as "description", "tags" and "path" that the backend does not use, that are irrelevant to the pipeline deployment lifecycle, and that only clients read and write. JSON operations on a dataset this small (one row per pipeline) are fine in Postgres.

Are you saying you would prefer e.g. a "tags" column in PipelineDescriptor, with the same semantics as the "metadata" field I added (it can be changed while the pipeline is running, without triggering pipeline version updates)?

@Karakatiza666
Contributor Author

Karakatiza666 commented May 14, 2026

... a well-structured field that is reserved for use by webconsole so we don't have arbitrary clients overwriting/using it differently.

Two points I want to make here:

  • database structure =/= API structure; if we wanted to harden the API contract, we would add validation at the endpoint level, not modify the relational schema
  • I don't expect rogue clients that ignore the structure or overwrite the field, losing web-console's information. I guess I misused the term "free-form"; I propose "metadata" as a JSON field that contains various fields clients can set, e.g. for organizing the pipelines, like "tags" or "path". Clients would respect this convention and not overwrite the field with e.g. plaintext, but individual clients can add whatever unique metadata fields they want without affecting metadata set by other clients, and without having to add support for it in pipeline-manager.
    With the current, simpler model, in which pipeline-manager does not validate the "metadata" field, it is easier for clients to implement related features, and, again, we don't expect rogue clients that just overwrite other clients' data.
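
For comparison, the endpoint-level validation mentioned in the first point could look roughly like the TypeScript sketch below. This is not code from the PR (which deliberately does not validate the field), and the actual endpoint is written in Rust. `validateMetadata` and `ValidationResult` are hypothetical names; the only assumptions about the convention are that "path" is a string, "tags" is an array of strings, and unknown keys from other clients are allowed.

```typescript
type ValidationResult = { ok: true } | { ok: false; error: string }

// Hypothetical check of the "metadata" convention discussed in this thread.
function validateMetadata(raw: string): ValidationResult {
  // The migration in this PR defaults the column to an empty string.
  if (raw === '') {
    return { ok: true }
  }
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return { ok: false, error: 'metadata is not valid JSON' }
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: 'metadata must be a JSON object' }
  }
  const fields = parsed as Record<string, unknown>
  if ('path' in fields && typeof fields['path'] !== 'string') {
    return { ok: false, error: '"path" must be a string' }
  }
  if ('tags' in fields) {
    const tags = fields['tags']
    if (!Array.isArray(tags) || !tags.every((tag) => typeof tag === 'string')) {
      return { ok: false, error: '"tags" must be an array of strings' }
    }
  }
  // Keys other than "path" and "tags" are accepted so that individual
  // clients can keep their own fields.
  return { ok: true }
}
```

Under these assumptions, `{"path":"etl/daily","custom":1}` passes, while plain text or `{"tags":"prod"}` is rejected.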

@Karakatiza666
Contributor Author

Karakatiza666 commented May 14, 2026

The customer did not ask for tags specifically; they listed tags and folders as ways to solve their UX problem. I need to discuss tags with Anna because the feature is not trivial: we need a good location to show and edit tags, and e.g. the version could be displayed as another tag, which could improve the design and UX.

@Karakatiza666
Contributor Author

Whether tags or folders, the feature is likely to be used by tooling around Feldera some of our customers have built, not just web-console

@Karakatiza666
Contributor Author

what happens to your view if someone else is moving pipelines from another browser?

They see the change almost immediately: within the polling period of web-console, which is currently 2 seconds.

@Karakatiza666
Contributor Author

What I would like to do is merge this PR with folders as a simpler solution, and then open a separate PR for user-defined tags and search-by-tag. Tags and folders are orthogonal, both are optional, both are useful.
The folder feature is not invasive, it is only "visible" through the small drag icon at the left of the pipeline list item

Contributor

@snkas snkas left a comment

My design suggestion for the backend is:

  • Adding a pipeline field tags, of type Vec<String>
  • Adding a pipeline table field tags with type VARCHAR[] or TEXT[]
  • The tags field can be edited like any other field using PATCH /v0/pipelines/<name>
  • Only tricky part is allowing the tags to be editable while the pipeline is running.

I can get to it sometime mid next week.

Reasoning for my design:

  • In future, the backend can potentially filter by tag without having to do JSON parsing
  • In future, the tags can still be migrated as the type is known to the database, without having to do JSON operations
  • Having a separate pipeline_tag table is likely too much as they are always part of the pipeline
  • Deprecating description should be a separate PR
  • It seems unnecessary to nest these fields under metadata, as what counts as "metadata" is not well defined, and at least for now, besides tags, I'm not sure how many more of these fields we would add

The above does assume that we don't support, for instance, defining colors of tags, as that would require them to become independently managed objects with their own API endpoint.
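
From a client's point of view, the proposed `tags` field could be handled along the lines of the TypeScript sketch below. It is only an illustration of the design above, which is not implemented in this PR. The `{ tags: [...] }` body shape for `PATCH /v0/pipelines/<name>` and the normalization rules (trim, drop empty entries, deduplicate, sort) are assumptions, and `TaggedPipeline`, `normalizeTags`, `buildTagsPatch` and `filterByTag` are hypothetical names.

```typescript
// Hypothetical client-side view of a pipeline with the proposed field.
interface TaggedPipeline {
  name: string
  tags: string[]
}

// Assumed normalization: trim, drop empty entries, deduplicate, sort.
function normalizeTags(tags: string[]): string[] {
  const cleaned = tags.map((tag) => tag.trim()).filter((tag) => tag.length > 0)
  return Array.from(new Set(cleaned)).sort()
}

// Body a client might send to PATCH /v0/pipelines/<name> (shape assumed).
function buildTagsPatch(tags: string[]): { tags: string[] } {
  return { tags: normalizeTags(tags) }
}

// Client-side filtering; the design notes that the backend could later
// filter by tag itself without JSON parsing.
function filterByTag(pipelines: TaggedPipeline[], tag: string): TaggedPipeline[] {
  return pipelines.filter((pipeline) => pipeline.tags.includes(tag))
}
```

Because each tag is a plain string, a pipeline can carry several tags at once without any of them needing an endpoint of its own.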

@snkas
Contributor

snkas commented May 15, 2026

Regarding tags vs. nested directory structure: it seems simplest to just have tags, as they allow a pipeline to belong to multiple groups, and they are a concept users are already familiar with (e.g., GitHub). A directory structure is also a big departure from the current list.

@snkas
Contributor

snkas commented May 15, 2026

For design of UI, this seems most straightforward (mock-up):
[image: tags]

  • In search bar, you can hint Search pipelines... e.g., "example", "tag:example"
  • There could be a dropdown filter by tags as well, that shows all the tags and once selected shows all pipelines with that tag
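
The search-bar hint above could be backed by a small parser like the TypeScript sketch below. The `tag:` prefix comes from the mock-up text; everything else is an assumption made for illustration: whitespace-separated terms, case-insensitive name matching, exact tag matching, and all terms required to match. `SearchQuery`, `parseSearchQuery` and `matchesQuery` are hypothetical names.

```typescript
interface SearchQuery {
  text: string[] // lower-cased free-text terms matched against the name
  tags: string[] // values of "tag:<value>" terms
}

// Splits the input on whitespace and separates "tag:" terms from free text.
function parseSearchQuery(input: string): SearchQuery {
  const query: SearchQuery = { text: [], tags: [] }
  for (const token of input.trim().split(/\s+/)) {
    if (token === '') {
      continue
    }
    if (token.toLowerCase().startsWith('tag:')) {
      const tag = token.slice('tag:'.length)
      if (tag !== '') {
        query.tags.push(tag)
      }
    } else {
      query.text.push(token.toLowerCase())
    }
  }
  return query
}

// A pipeline matches when every text term occurs in its name and it
// carries every requested tag.
function matchesQuery(pipeline: { name: string; tags: string[] }, query: SearchQuery): boolean {
  const name = pipeline.name.toLowerCase()
  return (
    query.text.every((term) => name.includes(term)) &&
    query.tags.every((tag) => pipeline.tags.includes(tag))
  )
}
```

A dropdown filter by tag would then amount to calling `matchesQuery` with a query that contains only the selected tag.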

@lalithsuresh
Contributor

lalithsuresh commented May 15, 2026

NACK from me on folders too. Folders are used in environments where there is a natural filesystem hierarchy for projects and a mix of content types (e.g. workspace, catalogs, schemas and separate files for sql like in dbt and other warehouse cloud products). It's not a good fit here.

Tags are the most appropriate here. There are many other things in the backlog we should get to first, but let's go through UI design here as well before implementing anything.

@Karakatiza666
Contributor Author

I'll remove the folders and make this PR pipeline-manager only. Tags will be waiting for the UI design


@mythical-fred mythical-fred left a comment

The backend (pipeline-manager) changes are well-structured and thoroughly tested — the PipelineFieldUpdates struct with exhaustive destructuring is a nice pattern, and the metadata-only fast-path that skips version/refresh_version bumps is well-reasoned.

However, the web-console side adds ~1,060 lines of new behavioral code (DraggableTreeView, PipelineTreeView, SidebarPipelineTreeView, plus the Table.svelte rewrite) with zero tests. This is a hard-rule violation — behavior changes without tests get REQUEST_CHANGES, no exceptions. See inline comments for specifics.

Gerd's concerns about the metadata field design are still open; I won't duplicate them here.

@@ -0,0 +1,588 @@
<script lang="ts" generics="T extends { name: string }">

This is 588 lines of new interactive behavior — drag-and-drop, folder creation, folder renaming, expand/collapse state, multi-selection propagation. None of it has tests.

Good first targets for Vitest + @testing-library/svelte:

  1. Tree-building logic (ensureFolder, sortNode) — extract into a pure function and unit-test the tree shape for various folder-path inputs.
  2. folderCheckState — pure logic over a set; easy to test without DOM.
  3. is_metadata_only-style classification on the JS side: parseMetadata, buildMetadata, getFolderPath — these are already pure functions, just not tested.
  4. Component tests: render the tree view with a few items, simulate a drag from one row onto another, assert the onCreateFolderFor callback fires with the correct arguments.

The recommended stack is Vitest + @testing-library/svelte — runs in Node, no browser, no flake.
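
As a rough idea of what targets 1 and 2 could look like once extracted, here is a TypeScript sketch of a pure tree builder and a check-state helper. The names `ensureFolder`, `sortNode` and `folderCheckState` come from the comment above, but their signatures, the `FolderNode` shape, the "/" path separator and the helpers `buildTree` and `collectNames` are assumptions, not the PR's actual code.

```typescript
interface FolderNode<T> {
  name: string
  folders: FolderNode<T>[]
  items: T[]
}

// Returns the child folder with the given name, creating it if missing.
function ensureFolder<T>(parent: FolderNode<T>, name: string): FolderNode<T> {
  let child = parent.folders.find((folder) => folder.name === name)
  if (!child) {
    child = { name, folders: [], items: [] }
    parent.folders.push(child)
  }
  return child
}

// Sorts folders and items by name (code-unit order), recursively.
function sortNode<T extends { name: string }>(node: FolderNode<T>): void {
  const byName = (a: { name: string }, b: { name: string }): number =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0
  node.folders.sort(byName)
  node.items.sort(byName)
  for (const folder of node.folders) {
    sortNode(folder)
  }
}

// Builds a sorted tree from items and their "/"-separated folder paths.
function buildTree<T extends { name: string }>(
  items: T[],
  getPath: (item: T) => string
): FolderNode<T> {
  const root: FolderNode<T> = { name: '', folders: [], items: [] }
  for (const item of items) {
    let node = root
    for (const segment of getPath(item).split('/')) {
      if (segment !== '') {
        node = ensureFolder(node, segment)
      }
    }
    node.items.push(item)
  }
  sortNode(root)
  return root
}

// Names of all items in a folder, including nested folders.
function collectNames<T extends { name: string }>(node: FolderNode<T>): string[] {
  const names = node.items.map((item) => item.name)
  for (const folder of node.folders) {
    names.push(...collectNames(folder))
  }
  return names
}

type CheckState = 'checked' | 'unchecked' | 'indeterminate'

// Checkbox state of a folder given the set of selected item names.
function folderCheckState<T extends { name: string }>(
  node: FolderNode<T>,
  selected: ReadonlySet<string>
): CheckState {
  const names = collectNames(node)
  const count = names.filter((name) => selected.has(name)).length
  if (count === 0) {
    return 'unchecked'
  }
  return count === names.length ? 'checked' : 'indeterminate'
}
```

Since none of these functions touch the DOM, their output can be asserted directly on plain objects in a unit test.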

if (!metadata) {
return {}
}
try {

DRY: parseMetadata is duplicated here and in Table.svelte (lines ~57-64). Extract it into a shared utility (e.g. $lib/functions/pipelines/metadata.ts) — along with getFolderPath and buildMetadata — so the parsing logic lives in one place and can be unit-tested once.
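
A shared utility of the kind suggested here might look like the TypeScript sketch below. Only the function names and the empty-input early return are taken from this thread; the signatures, the fallback to `{}` on invalid JSON, and the use of a "path" key for the folder path are assumptions.

```typescript
// Hypothetical contents of a shared module such as
// $lib/functions/pipelines/metadata.ts
type PipelineMetadata = Record<string, unknown>

// Tolerant parse: empty, invalid or non-object input yields {}.
function parseMetadata(metadata: string | null | undefined): PipelineMetadata {
  if (!metadata) {
    return {}
  }
  try {
    const parsed: unknown = JSON.parse(metadata)
    if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
      return parsed as PipelineMetadata
    }
    return {}
  } catch {
    return {}
  }
}

// Folder path stored under the assumed "path" key, or '' if absent.
function getFolderPath(metadata: string | null | undefined): string {
  const path = parseMetadata(metadata)['path']
  return typeof path === 'string' ? path : ''
}

// Returns new metadata with the folder path set (or removed when empty),
// leaving every other key untouched.
function buildMetadata(existing: string | null | undefined, folderPath: string): string {
  const fields = { ...parseMetadata(existing) }
  if (folderPath === '') {
    delete fields['path']
  } else {
    fields['path'] = folderPath
  }
  return JSON.stringify(fields)
}
```

Keeping unknown keys intact in `buildMetadata` follows the convention described earlier in the thread, where several clients may store their own fields in the same metadata value.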

6 participants