feat(file): add Compress and Decompress operations to the File block #5100

Merged
waleedlatif1 merged 8 commits into staging from perf/file-compress
Jun 16, 2026

@waleedlatif1

@waleedlatif1 waleedlatif1 commented Jun 16, 2026

Collaborator

Summary

  • Add Compress and Decompress operations to the File block — a symmetric zip pair, ZIP-only (the universal interchange format; gzip/tar add nothing for this audience)
  • Compress bundles one or more workspace files (or upstream file outputs) into a single .zip stored in the workspace; returns the archive in file/files + id/name/size/url. Auto-names from the source file (single) or archive.zip (multiple); explicit Archive Name optional; dedupes entry names
  • Decompress extracts a .zip back into the workspace, preserving folder structure, and returns the extracted files in files — ready to chain into Get Content or downstream blocks
  • New file_compress / file_decompress tools + compress/decompress branches on /api/tools/file/manage, archiving via JSZip (DEFLATE)
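
The auto-naming and entry dedupe rules in the Compress bullet can be sketched as two pure functions. This is a minimal sketch, not the PR's code: `resolveArchiveName` and `dedupeEntryNames` are hypothetical names, and both the "replace the source extension with .zip" rule and the " (n)" suffix format are assumptions, since the description only says the archive is named from the source file and that entry names are deduped.

```typescript
// Hypothetical helpers illustrating the naming rules; not taken from the PR.

/** Explicit name wins; else the single source's name; else "archive.zip". */
function resolveArchiveName(sourceNames: string[], explicitName?: string): string {
  const ensureZip = (name: string): string =>
    name.toLowerCase().endsWith('.zip') ? name : `${name}.zip`

  const trimmed = explicitName?.trim()
  if (trimmed) return ensureZip(trimmed)

  const [only] = sourceNames
  if (sourceNames.length === 1 && only !== undefined) {
    // Assumption: drop the last extension, so "report.pdf" becomes "report.zip".
    const base = only.replace(/\.[^./]+$/, '')
    return ensureZip(base || 'archive')
  }
  return 'archive.zip'
}

/** Make entry names unique by inserting " (n)" before the extension. */
function dedupeEntryNames(names: string[]): string[] {
  const used = new Set<string>()
  return names.map((name) => {
    const dot = name.lastIndexOf('.')
    const stem = dot > 0 ? name.slice(0, dot) : name
    const ext = dot > 0 ? name.slice(dot) : ''
    let candidate = name
    let counter = 1
    while (used.has(candidate)) {
      candidate = `${stem} (${counter})${ext}`
      counter += 1
    }
    used.add(candidate)
    return candidate
  })
}
```

Deduping matters because two workspace files with the same name would otherwise overwrite each other inside the archive once their names are flattened.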

Safety

  • Compress: per-file + total input caps (100 MB) to bound in-memory archiving
  • Decompress: zip-slip guard (rejects entry paths that contain .. or are absolute), symlink entries skipped, and zip-bomb caps (max entries, per-entry + total uncompressed size, with a declared-size pre-check mirroring the existing pptx zip-parser)
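
A zip-slip guard and a symlink check of the kind listed above might look like the following. This is a sketch under assumptions: `sanitizeEntryPathSketch` and `isSymlinkMode` are hypothetical names (the PR's own helper is `sanitizeArchiveEntryPath`, whose body is not shown here), and the symlink check assumes the archive library exposes each entry's Unix mode bits as a number.

```typescript
/**
 * Hypothetical zip-slip guard: returns safe path segments, or null when the
 * entry must be rejected. Traversal is rejected outright, never resolved.
 */
function sanitizeEntryPathSketch(entryName: string): string[] | null {
  const normalized = entryName.replace(/\\/g, '/')
  // Absolute paths ("/etc/passwd") and drive-letter paths ("C:/x") are rejected.
  if (normalized.startsWith('/') || /^[a-zA-Z]:/.test(normalized)) return null
  const segments = normalized.split('/').filter((s) => s !== '' && s !== '.')
  if (segments.some((s) => s === '..')) return null
  return segments.length > 0 ? segments : null
}

const S_IFMT = 0o170000 // mask for the file-type bits of a Unix mode
const S_IFLNK = 0o120000 // file-type value for a symbolic link

/** True when the Unix mode marks the entry as a symlink; null means unknown. */
function isSymlinkMode(mode: number | null): boolean {
  return mode !== null && (mode & S_IFMT) === S_IFLNK
}
```

Returning segments rather than a joined string keeps the caller from re-parsing the path when it recreates folders one level at a time.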

Note

Zip doesn't help the original "get an LLM attachment under the 50 MB provider limit" case — models can't read archives. The value here is general file handling: bundling for outbound transfer/storage (compress) and consuming inbound archives (decompress), which was the more common pain.

Type of Change

  • New feature

Testing

  • Unit tests for both tools (request body construction + success/failure transforms), 7 passing
  • bun run check:api-validation passes; typecheck clean; lint clean

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Jun 16, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 16, 2026 7:13pm

@cursor

cursor Bot commented Jun 16, 2026


PR Summary

Medium Risk
Decompress writes arbitrary archive bytes into workspace storage with substantial in-memory handling; mitigations are present but archive parsing and extraction remain a sensitive surface area.

Overview
Adds Compress and Decompress to the File v5 block and wires them through new file_compress / file_decompress tools to the compress / decompress branches of /api/tools/file/manage (JSZip DEFLATE).

Compress takes one or more workspace files (IDs or file objects), optional archive name, enforces 100 MB per-file and combined input caps, flattens entry names and dedupes collisions, uploads a .zip to the workspace, and returns the archive in files plus id/name/size/url.

Decompress accepts a single .zip, validates and extracts with zip-slip path sanitization, symlink skipping, entry-count and size limits (including a declared-uncompressed-size pre-check and validate-before-write so failures do not leave partial extracts), recreates folder structure via ensureWorkspaceFileFolderPath, and returns extracted files in files.

API contracts, block UI sub-blocks/params, tool registry exports, application/zip extension mapping, block tests, and unit tests for tool request/response transforms are included.

Reviewed by Cursor Bugbot for commit cdf521b. Configure here.

Comment thread apps/sim/app/api/tools/file/manage/route.ts Outdated
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Greptile Summary

This PR adds Compress and Decompress operations to the File block, backed by a new JSZip-based implementation in the /api/tools/file/manage route. The feature is well-scoped and follows the existing block/tool/route patterns throughout the codebase.

  • Compress bundles one or more workspace files into a DEFLATE-compressed .zip, with per-file and total 100 MB input caps, flat-name sanitisation via toFlatFileName, entry deduplication, and auto-naming from the source filename or archive.zip.
  • Decompress extracts a single .zip back into the workspace with zip-slip protection (sanitizeArchiveEntryPath), symlink-entry skipping, and a two-pass zip-bomb guard (declared-size pre-check + materialized-size cap); extracted files land in workspace folders mirroring the archive's directory structure.
  • Both operations ship with file_compress/file_decompress tool definitions, Zod contracts, unit tests (7 passing), and are wired into the FileV5Block alongside the existing read/write/append/fetch/get-content operations.

Confidence Score: 5/5

Safe to merge — the compress and decompress paths are well-guarded with size caps, zip-slip protection, and symlink skipping, and the changes are additive with no impact on existing operations.

The implementation is a clean additive feature that correctly follows the existing block/tool/route patterns. All safety-critical paths (zip-slip, zip-bomb, input size) are guarded, the previous review feedback has been addressed, and the only remaining note is a fragility concern around an internal JSZip property used for the bomb pre-check — not a correctness issue since the materialized fallback remains intact.

No files require special attention; route.ts carries the most logic and is worth a quick read-through, but no blocking concerns remain.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/app/api/tools/file/manage/route.ts | Adds ~300 lines for compress and decompress cases; includes zip-slip guard, symlink skip, two-pass zip-bomb cap (declared + materialised), and deduplication. The pre-check for bomb detection relies on an internal JSZip _data field that could silently stop working across versions. |
| apps/sim/tools/file/compress.ts | New tool definitions for file_compress and file_decompress; well-structured with correct param visibility, request builder, and transformResponse handlers. |
| apps/sim/blocks/blocks/file.ts | Adds Compress/Decompress to FileV5Block with correct subblock definitions, tool routing, and params mapping; consistent with existing operation patterns. |
| apps/sim/lib/api/contracts/tools/file.ts | New Zod schemas for compress (multi-file ID or input, optional archiveName) and decompress (single string fileId); refine guards enforce at least one source is present. |
| apps/sim/tools/file/compress.test.ts | 7 tests covering request body construction, success/failure response transforms for both tools; straightforward and complete. |
| apps/sim/lib/uploads/utils/file-utils.ts | Adds zip and gz MIME type mappings to the extension map; small, correct addition. |
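
The request builder and transformResponse handlers noted for compress.ts can be sketched roughly as below. The `operation`, `fileId`, `archiveName`, and `files` field names come from this PR's sequence diagram; everything else (the function names, the `ArchiveFile` type, and the shape of a failure response) is an assumption for illustration rather than the tool's actual code.

```typescript
// Hypothetical shapes; only operation/fileId/archiveName/files appear in the PR.
interface ArchiveFile {
  id: string
  name: string
  size: number
  url: string
}

type ManageResponse =
  | { success: true; data: { files: ArchiveFile[] } }
  | { success: false; error: string } // assumed failure shape

interface CompressBody {
  operation: 'compress'
  fileId: string[]
  archiveName?: string
}

function buildCompressBody(fileIds: string[], archiveName?: string): CompressBody {
  const body: CompressBody = { operation: 'compress', fileId: fileIds }
  const trimmed = archiveName?.trim()
  // Omit the key entirely when no name was given, so the route can auto-name.
  if (trimmed) body.archiveName = trimmed
  return body
}

function buildDecompressBody(fileId: string): { operation: 'decompress'; fileId: string } {
  return { operation: 'decompress', fileId }
}

function transformManageResponse(res: ManageResponse): {
  success: boolean
  output: { files: ArchiveFile[] }
  error?: string
} {
  if (!res.success) return { success: false, output: { files: [] }, error: res.error }
  return { success: true, output: { files: res.data.files } }
}
```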

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant B as File Block (file.ts)
    participant T as Tool (compress.ts)
    participant R as Route (/api/tools/file/manage)
    participant S as Storage

    Note over B,S: Compress flow
    B->>T: params (compressInput, archiveName)
    T->>R: "POST {operation: compress, fileId[], archiveName}"
    R->>S: getWorkspaceFile(id) x N
    S-->>R: WorkspaceFile[]
    R->>S: downloadFileFromStorage(file) x N
    S-->>R: Buffer[] (capped at 100 MB total)
    R->>R: JSZip.generateAsync(DEFLATE)
    R->>S: uploadWorkspaceFile(zipBuffer)
    S-->>R: result (id, name, url)
    R-->>T: "{success, data: {id, name, size, url, files}}"
    T-->>B: "output.files = [archive]"

    Note over B,S: Decompress flow
    B->>T: params (decompressInput)
    T->>R: "POST {operation: decompress, fileId}"
    R->>S: getWorkspaceFile(id)
    S-->>R: WorkspaceFile
    R->>S: downloadFileFromStorage(archive)
    S-->>R: archiveBuffer (100 MB max)
    R->>R: "JSZip.loadAsync -> validate entries"
    R->>R: sanitizeArchiveEntryPath (zip-slip guard)
    R->>R: declared-size pre-check + materialized-size cap
    loop each safe entry
        R->>S: ensureWorkspaceFileFolderPath(segments)
        R->>S: uploadWorkspaceFile(entryBuffer)
        S-->>R: UserFile
    end
    R-->>T: "{success, data: {files: UserFile[]}}"
    T-->>B: "output.files = extractedFiles"

Reviews (6): Last reviewed commit: "docs(file): correct compress description..." | Re-trigger Greptile

Comment thread apps/sim/app/api/tools/file/manage/route.ts
Comment thread apps/sim/app/api/tools/file/manage/route.ts
Adds the inbound half of the archive pair: extracts a .zip back into the
workspace with zip-slip path sanitization, symlink skipping, and entry/
size caps to bound zip-bomb expansion. Extracted files are returned in the
files output, ready to chain downstream.
@waleedlatif1 waleedlatif1 changed the title from "feat(file): add Compress operation to bundle files into a .zip archive" to "feat(file): add Compress and Decompress operations to the File block" Jun 16, 2026
- Drop the single 'file' output reintroduced for compress/decompress; v5
  intentionally exposes only 'files' (plus id/name/size/url scalars), so
  compress/decompress reuse the existing surface with no new block output
- Add zip/gz to EXTENSION_TO_MIME (previously only in the reverse map), so
  archive extensions resolve to a real mime instead of octet-stream
- Update File v5 block test for the two new operations
- Flatten zip entry names to a safe basename so untrusted fileInput names
  with .. or / cannot produce zip-slip entry paths (cursor)
- Treat archiveName as a flat name landing at the workspace root instead of
  passing it through splitWorkspaceFilePath, which silently created folders
  for names with separators (greptile)
- Add the upfront empty-input guard before any DB calls, matching the read
  and content operations (greptile)
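
Flattening an untrusted name to a safe basename, as the commit above describes for zip entry names and archiveName, might be sketched like this. `toFlatNameSketch` is a hypothetical stand-in for the PR's `toFlatFileName`, and the fallback name is an assumption.

```typescript
/**
 * Hypothetical sketch: keep only the last path segment of an untrusted name,
 * so ".." or "/" in the input cannot shape the entry path or create folders.
 */
function toFlatNameSketch(untrusted: string, fallback = 'file'): string {
  // Treat both separator styles as path boundaries and keep the final segment.
  const last = untrusted.replace(/\\/g, '/').split('/').pop() ?? ''
  const cleaned = last.trim()
  // "." and ".." are directory references, not usable file names.
  if (cleaned === '' || cleaned === '.' || cleaned === '..') return fallback
  return cleaned
}
```

For example, `toFlatNameSketch('../../etc/passwd')` yields `passwd`, so the entry lands at the archive root rather than escaping it.
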
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/file/manage/route.ts
Comment thread apps/sim/app/api/tools/file/manage/route.ts
Comment thread apps/sim/app/api/tools/file/manage/route.ts
- Read and validate every entry before writing any file, so hitting a size
  cap no longer leaves partially-extracted files in the workspace (cursor)
- Enforce the per-entry cap on the materialized buffer in addition to the
  declared size, covering entries that omit an uncompressed size (cursor)
- Pre-check declared sizes up front to reject standard zip bombs before
  materializing, and return 422 when no files could be extracted (cursor)
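
The validate-before-write ordering in the commit above can be sketched as a planning step that either returns every staged file or a rejection, with no writes in between. This is a simplified illustration: the types, the limit values, and the `planExtraction` name are hypothetical, and reads are synchronous here for brevity where a real route would await them.

```typescript
// Hypothetical types; a sketch of the ordering, not the route's implementation.
interface ArchiveEntry {
  path: string
  declaredSize: number | null // uncompressed size from the header, null if omitted
  read: () => Uint8Array // materializes the entry's bytes
}
interface ExtractionLimits {
  maxEntries: number
  maxEntryBytes: number
  maxTotalBytes: number
}
type Rejection = 'nothing_to_extract' | 'too_many_entries' | 'entry_too_large' | 'total_too_large'
type ExtractionPlan =
  | { ok: true; staged: Array<{ path: string; bytes: Uint8Array }> }
  | { ok: false; reason: Rejection }

function planExtraction(entries: ArchiveEntry[], limits: ExtractionLimits): ExtractionPlan {
  // The PR returns 422 for this case; other status codes are not stated.
  if (entries.length === 0) return { ok: false, reason: 'nothing_to_extract' }
  if (entries.length > limits.maxEntries) return { ok: false, reason: 'too_many_entries' }

  // Pass 1: declared sizes only, so a standard zip bomb is rejected before any read.
  let declaredTotal = 0
  for (const entry of entries) {
    const declared = entry.declaredSize ?? 0
    if (declared > limits.maxEntryBytes) return { ok: false, reason: 'entry_too_large' }
    declaredTotal += declared
    if (declaredTotal > limits.maxTotalBytes) return { ok: false, reason: 'total_too_large' }
  }

  // Pass 2: re-check real sizes, covering entries that omit or understate a size.
  const staged: Array<{ path: string; bytes: Uint8Array }> = []
  let total = 0
  for (const entry of entries) {
    const bytes = entry.read()
    if (bytes.byteLength > limits.maxEntryBytes) return { ok: false, reason: 'entry_too_large' }
    total += bytes.byteLength
    if (total > limits.maxTotalBytes) return { ok: false, reason: 'total_too_large' }
    staged.push({ path: entry.path, bytes })
  }
  return { ok: true, staged }
}
```

A caller would upload `staged` only when `ok` is true, which is what keeps a late cap violation from leaving partially extracted files behind.
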
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/file/manage/route.ts
Comment thread apps/sim/blocks/blocks/file.ts
… decompress

- Resolve safe (sanitized) zip entries up front so unsafe/skipped entries
  no longer count toward the per-entry and total uncompressed-size caps (cursor)
- Reject decompress input that resolves to more than one archive with a clear
  error instead of silently extracting only the first (cursor)
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/file/manage/route.ts
The block already rejects multiple archives, but the manage route is the
real boundary (callable directly and by the LLM tool) and still took the
first of multiple resolved inputs. Add the empty-input and >1-archive guards
in the route so extra archives are rejected with a clear error rather than
silently ignored (cursor).
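
A route-level guard of the kind this commit describes could be as small as the following. It is a sketch only: `requireSingleArchive` and the error wording are hypothetical, and the real route's response format is not shown in this PR page.

```typescript
type SingleArchiveResult<T> = { ok: true; archive: T } | { ok: false; error: string }

/** Hypothetical guard: accept exactly one resolved input, reject zero or many. */
function requireSingleArchive<T>(resolved: T[]): SingleArchiveResult<T> {
  if (resolved.length === 0) {
    return { ok: false, error: 'No archive provided for decompress' }
  }
  if (resolved.length > 1) {
    // Fail loudly instead of extracting only the first input.
    return { ok: false, error: `Decompress accepts exactly one archive, received ${resolved.length}` }
  }
  return { ok: true, archive: resolved[0] as T }
}
```

Placing the check in the route rather than only in the block means direct API callers and the LLM tool hit the same rejection.
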
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 128a80a. Configure here.

…nces

- Drop the misleading 'under provider upload limits' claim from the compress
  tool description (models cannot read zip archives)
- Fix bestPractices to reference the 'files' output, not a non-existent 'file'
- Remove the stale 'file' property from the compress test fixture so it
  matches the real API response (greptile)
@waleedlatif1

Collaborator Author

Addressed the three doc/test accuracy findings from the latest review in cdf521b:

  • Removed the misleading "under provider upload limits" claim from the file_compress description (models can't read zip archives).
  • Fixed the block bestPractices to reference the files output instead of a non-existent file output.
  • Removed the stale file property from the compress test fixture so it matches the real API response.

No runtime behavior changed.

@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit cdf521b. Configure here.

@waleedlatif1 waleedlatif1 merged commit f238184 into staging Jun 16, 2026
16 checks passed
@waleedlatif1 waleedlatif1 deleted the perf/file-compress branch June 16, 2026 19:25