feat(file): add Compress and Decompress operations to the File block #5100
PR Summary
Medium Risk
Overview
Compress takes one or more workspace files (IDs or file objects), optional archive name, enforces 100 MB per-file and combined input caps, flattens entry names and dedupes collisions, uploads a
Decompress accepts a single
API contracts, block UI sub-blocks/params, tool registry exports,
Reviewed by Cursor Bugbot for commit cdf521b.
Adds the inbound half of the archive pair: extracts a .zip back into the workspace with zip-slip path sanitization, symlink skipping, and entry/size caps to bound zip-bomb expansion. Extracted files are returned in the files output, ready to chain downstream.
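For readers following along, the zip-slip sanitization and symlink skipping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ZipEntryMeta` and `sanitizeEntryPath` are hypothetical names, and the assumption that symlink status is already known per entry (for example from the entry's Unix mode bits) is mine.

```typescript
// Hypothetical shape for an archive entry; not the PR's actual type.
interface ZipEntryMeta {
  path: string; // raw entry name as stored in the archive
  isSymlink: boolean; // assumed to be derived by the caller
  isDirectory: boolean;
}

/** Returns a safe relative path, or null when the entry should be skipped. */
function sanitizeEntryPath(entry: ZipEntryMeta): string | null {
  // Symlinks are skipped entirely; directories carry no file content.
  if (entry.isSymlink || entry.isDirectory) return null;

  // Normalize Windows separators so "..\\" cannot bypass the checks below.
  const normalized = entry.path.replace(/\\/g, "/");

  // Reject absolute paths and drive-letter paths outright.
  if (normalized.startsWith("/") || /^[A-Za-z]:/.test(normalized)) return null;

  // Drop empty and "." segments; folder structure is otherwise preserved.
  const segments = normalized.split("/").filter((s) => s !== "" && s !== ".");

  // Any ".." segment could climb out of the extraction root.
  if (segments.length === 0 || segments.includes("..")) return null;

  return segments.join("/");
}
```

Returning null rather than throwing lets the caller skip unsafe entries and keep extracting the rest, which matches the "skipped" wording used for symlinks here.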
- Drop the single 'file' output reintroduced for compress/decompress; v5 intentionally exposes only 'files' (plus id/name/size/url scalars), so compress/decompress reuse the existing surface with no new block output
- Add zip/gz to EXTENSION_TO_MIME (previously only in the reverse map), so archive extensions resolve to a real mime instead of octet-stream
- Update File v5 block test for the two new operations
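As a rough picture of the EXTENSION_TO_MIME change, an extension lookup with an octet-stream fallback might look like the sketch below. The map contents and the `mimeForFilename` helper are assumptions for illustration; only the map's name and the zip/gz additions come from the commit message.

```typescript
// Illustrative subset only; the real map in the repo is not shown in this PR page.
const EXTENSION_TO_MIME = new Map<string, string>([
  ["pdf", "application/pdf"],
  ["zip", "application/zip"], // assumed value for the newly added entry
  ["gz", "application/gzip"], // assumed value for the newly added entry
]);

// Hypothetical helper: resolves a mime type from a filename's extension.
function mimeForFilename(filename: string): string {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : "";
  // Unknown extensions fall back to the generic binary type.
  return EXTENSION_TO_MIME.get(ext) ?? "application/octet-stream";
}
```

Without the zip and gz entries, the same lookup would return application/octet-stream for archives, which is the behavior the commit fixes.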
- Flatten zip entry names to a safe basename so untrusted fileInput names with .. or / cannot produce zip-slip entry paths (cursor)
- Treat archiveName as a flat name landing at the workspace root instead of passing it through splitWorkspaceFilePath, which silently created folders for names with separators (greptile)
- Add the upfront empty-input guard before any DB calls, matching the read and content operations (greptile)
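The flattening and collision handling on the compress side could look something like this sketch. The helper names, the "file" fallback, and the "name (n).ext" suffix scheme are all assumptions; the PR only states that entry names are flattened to a safe basename and deduped.

```typescript
// Hypothetical helper: reduces an untrusted name to its last path segment.
function toSafeBasename(name: string, fallback = "file"): string {
  const base = name.replace(/\\/g, "/").split("/").pop() ?? "";
  // "", "." and ".." are not usable entry names.
  return base === "" || base === "." || base === ".." ? fallback : base;
}

// Hypothetical helper: flattens every name, then suffixes repeats so that
// no two entries in the archive share a name.
function dedupeEntryNames(names: string[]): string[] {
  const used = new Set<string>();
  return names.map((raw) => {
    const safe = toSafeBasename(raw);
    const dot = safe.lastIndexOf(".");
    const stem = dot > 0 ? safe.slice(0, dot) : safe;
    const ext = dot > 0 ? safe.slice(dot) : "";
    let candidate = safe;
    // Assumed scheme: "report.pdf", "report (1).pdf", "report (2).pdf", ...
    for (let n = 1; used.has(candidate); n++) {
      candidate = `${stem} (${n})${ext}`;
    }
    used.add(candidate);
    return candidate;
  });
}
```

Because every name is reduced to a basename before it becomes an entry path, an input such as "../report.pdf" ends up as a plain "report.pdf" entry (or a suffixed variant) rather than a traversal path.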
@greptile
@cursor review
- Read and validate every entry before writing any file, so hitting a size cap no longer leaves partially-extracted files in the workspace (cursor)
- Enforce the per-entry cap on the materialized buffer in addition to the declared size, covering entries that omit an uncompressed size (cursor)
- Pre-check declared sizes up front to reject standard zip bombs before materializing, and return 422 when no files could be extracted (cursor)
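The validate-before-write ordering in these three fixes can be illustrated with a two-pass check. This is a synchronous sketch under stated assumptions: `PendingEntry`, `Limits`, `ArchiveLimitError` and `validateThenCollect` are hypothetical names, the limit values are up to the caller, and the real route presumably reads entries asynchronously.

```typescript
// Hypothetical types for illustration; not taken from the PR.
interface PendingEntry {
  name: string;
  declaredSize: number | null; // null when the archive omits an uncompressed size
  read: () => Uint8Array; // materializes the entry's bytes
}

interface Limits {
  maxEntries: number;
  maxEntryBytes: number;
  maxTotalBytes: number;
}

class ArchiveLimitError extends Error {}

function validateThenCollect(entries: PendingEntry[], limits: Limits): Map<string, Uint8Array> {
  if (entries.length > limits.maxEntries) throw new ArchiveLimitError("too many entries");

  // Pass 1: cheap pre-check on declared sizes, before anything is decompressed.
  let declaredTotal = 0;
  for (const e of entries) {
    const size = e.declaredSize ?? 0;
    if (size > limits.maxEntryBytes) throw new ArchiveLimitError(`entry too large: ${e.name}`);
    declaredTotal += size;
    if (declaredTotal > limits.maxTotalBytes) throw new ArchiveLimitError("archive too large");
  }

  // Pass 2: materialize and re-check real sizes, covering missing or false declarations.
  const buffers = new Map<string, Uint8Array>();
  let actualTotal = 0;
  for (const e of entries) {
    const data = e.read();
    if (data.byteLength > limits.maxEntryBytes) throw new ArchiveLimitError(`entry too large: ${e.name}`);
    actualTotal += data.byteLength;
    if (actualTotal > limits.maxTotalBytes) throw new ArchiveLimitError("archive too large");
    buffers.set(e.name, data);
  }

  // Only after every entry has passed does the caller write anything.
  return buffers;
}
```

Since nothing is written until the function returns, a cap violation on the last entry leaves the workspace untouched, at the cost of holding all validated buffers in memory at once.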
@greptile
@cursor review
… decompress
- Resolve safe (sanitized) zip entries up front so unsafe/skipped entries no longer count toward the per-entry and total uncompressed-size caps (cursor)
- Reject decompress input that resolves to more than one archive with a clear error instead of silently extracting only the first (cursor)
@greptile
@cursor review
The block already rejects multiple archives, but the manage route is the real boundary (callable directly and by the LLM tool) and still took the first of multiple resolved inputs. Add the empty-input and >1-archive guards in the route so extra archives are rejected with a clear error rather than silently ignored (cursor).
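A guard of the kind described here might be shaped like the following. It is only a sketch: `requireSingleArchive`, the `GuardResult` type, the error wording and the 400 status are assumptions, since this page does not state which status the route returns for these two cases.

```typescript
// Hypothetical result type: either the one archive, or an error to send back.
type GuardResult<T> =
  | { ok: true; archive: T }
  | { ok: false; status: number; error: string };

// Hypothetical guard, meant to run in the route before any extraction work.
function requireSingleArchive<T>(resolved: T[]): GuardResult<T> {
  if (resolved.length === 0) {
    // Status code is an assumption.
    return { ok: false, status: 400, error: "No archive provided for decompress" };
  }
  if (resolved.length > 1) {
    return {
      ok: false,
      status: 400, // assumption, as above
      error: `Decompress accepts a single archive, received ${resolved.length}`,
    };
  }
  return { ok: true, archive: resolved[0] as T };
}
```

Placing the check in the route rather than only in the block means direct API callers and the LLM tool get the same rejection instead of a silent first-archive extraction.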
@greptile
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 128a80a.
…nces
- Drop the misleading 'under provider upload limits' claim from the compress tool description (models cannot read zip archives)
- Fix bestPractices to reference the 'files' output, not a non-existent 'file'
- Remove the stale 'file' property from the compress test fixture so it matches the real API response (greptile)
Addressed the three doc/test accuracy findings from the latest review in cdf521b:
No runtime behavior changed.
@greptile
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit cdf521b.
Summary
- .zip stored in the workspace; returns the archive in file/files + id/name/size/url. Auto-names from the source file (single) or archive.zip (multiple); explicit Archive Name optional; dedupes entry names
- .zip back into the workspace, preserving folder structure, and returns the extracted files in files, ready to chain into Get Content or downstream blocks
- file_compress/file_decompress tools + compress/decompress branches on /api/tools/file/manage, archiving via JSZip (DEFLATE)

Safety
../absolute entry paths), symlink entries skipped, and zip-bomb caps (max entries, per-entry + total uncompressed size, with declared-size pre-check mirroring the existing pptx zip-parser)

Note
Zip doesn't help the original "get an LLM attachment under the 50 MB provider limit" case — models can't read archives. The value here is general file handling: bundling for outbound transfer/storage (compress) and consuming inbound archives (decompress), which was the more common pain.
Type of Change
Testing
bun run check:api-validation passes; typecheck clean; lint clean

Checklist