Status: Proposed Updated: 2026-04-25
This document captures the initial direction for a desktop application for Jcode under these constraints:
- no Electron/Tauri/web-app shell
- no general UI framework
- very high performance
- low idle resource use
- very custom product UI
- primary developer machine may be Linux
- most early users are expected to be on macOS
The goal is to make the desktop client a first-class Jcode surface without forking the Jcode runtime or turning the app into a heavyweight IDE clone.
See also:
DESKTOP_SUPERAPP_WORKSPACE.mdDESKTOP_CODEBASE_ARCHITECTURE.mdCLIENT_CORE_PRESENTATION_SPLIT_PLAN.mdMULTI_SESSION_CLIENT_ARCHITECTURE.mdSERVER_ARCHITECTURE.mdMEMORY_BUDGET.md
Build Jcode Desktop as a small Rust desktop client with a custom GPU-rendered UI. The app should connect to a local Jcode server/daemon that owns sessions, tools, agent execution, persistence, and permissions.
The frontend should be optimized as a render/input surface:
- Linux should be a first-class development platform.
- macOS should be the first-class product/distribution platform.
- The UI should not depend on Linux-only desktop concepts.
- The UI should not be a web view.
- The UI should not embed the agent runtime directly.
- Rendering should be on-demand, virtualized, and measurable from day one.
Recommended initial stack:
| Area | Decision |
|---|---|
| Frontend language | Rust |
| Backend/runtime | Existing Rust Jcode server/session runtime |
| Process model | Desktop frontend + local Jcode daemon/server |
| Window/input layer | Thin platform layer, likely winit initially |
| Rendering | wgpu with a custom 2D renderer |
| UI architecture | Retained UI tree with dirty tracking |
| Layout | Small custom layout system, not CSS/DOM |
| Text | Dedicated text layout/raster cache, likely cosmic-text/swash or platform-backed text later |
| Protocol | Versioned typed local event protocol |
| Persistence | Server-owned session/event persistence |
| Product identity | Agent operating console / mission control |
Jcode Desktop should not start as a full IDE and should not look like a conventional chatbot.
The differentiated product is a keyboard-driven, Niri-like agent workspace superapp for local development. The first-class object is not a chat window, but a workspace containing many navigable surfaces:
- agent sessions
- activity/task views
- diffs and changed files
- file/diff/tool surfaces
- optional future surfaces
- settings/debug/tool surfaces
The app should help users:
- supervise autonomous coding work
- inspect tool activity
- manage background tasks
- review changed files
- respond to permission prompts
- resume and coordinate sessions
- navigate many related surfaces spatially
The desktop client should complement the TUI/CLI, not replace it.
Linux should support the fastest inner loop:
- launch the desktop client locally
- run renderer stress tests
- run protocol integration tests
- benchmark memory/frame/layout/text performance
- debug the UI engine without a Mac in the loop
The Linux build should be real, not a fake simulator. It should render through the same UI engine and exercise the same protocol/view-model paths as macOS.
Most early users are expected to be on macOS, so macOS polish should be a product requirement even if day-to-day development happens on Linux.
Mac-specific work that should not be postponed too long:
- native
.appbundle - app icon and menu bar integration
- command-key shortcuts
- system light/dark appearance
- Retina rendering correctness
- trackpad scrolling quality
- native clipboard behavior
- file/open-with integration
- code signing and notarization path
- good behavior under Mission Control, Spaces, and full-screen windows
Because the developer may use Linux, the architecture should explicitly avoid baking in assumptions that work well only with a Linux window manager.
Do not make these hard dependencies:
- Niri-style external spatial window management
- X11-specific APIs
- Wayland-only behavior
- terminal-first session workflows
- Linux notification semantics
- global shortcuts that are unavailable or hostile on macOS
The existing Linux/Niri workflow should remain excellent, but desktop product quality should be judged primarily against macOS expectations.
Use a split process architecture:
Jcode Desktop Frontend
- window/input
- custom rendering
- local view model
- transient UI state
- surface-local state
- protocol client
Jcode Server/Daemon
- sessions
- agent runtime
- tool runtime
- background tasks
- persistence
- permissions
- model/provider configuration
The server remains the source of truth for:
- canonical session history
- streaming events
- tool execution
- file edits
- background tasks
- permission state
- persisted configuration
The desktop frontend owns only surface-local state:
- selected session/surface
- draft input
- cursor and text selection
- scroll offsets
- pane sizes
- focused panel
- local command palette state
- render caches
This aligns with the multi-session model where a server-owned session can be shown by different clients or surfaces over time.
The desktop app should consume a versioned, typed event stream rather than periodically fetching complete session snapshots.
Early protocol properties:
- local-first transport
- explicit protocol version
- capability negotiation
- append-only session events
- streaming deltas for assistant/tool output
- resumable subscriptions by event cursor
- compact events for high-volume tool output
- server-owned permission requests
Possible transports:
- Existing Jcode server channel, if compatible with desktop needs.
- Unix domain socket on Linux/macOS and named pipe on Windows.
- Stdio JSON protocol for early prototypes and test harnesses.
Avoid localhost HTTP as the default unless there is a strong reason. It creates a larger local security surface than a user-owned socket/pipe.
Example event families:
session.created
session.updated
surface.attached
message.created
message.delta
message.completed
tool.started
tool.output.delta
tool.completed
task.started
task.progress
task.completed
workspace.changed
git.changed
permission.requested
permission.resolved
error
Use a custom renderer rather than a native widget hierarchy or web view.
Recommended layers:
Platform window/input
-> input normalizer
-> app state/view model
-> retained UI tree
-> layout
-> text layout/cache
-> display list
-> GPU renderer
Core rules:
- no continuous render loop when idle
- render only on input, data events, animations, or explicit invalidation
- virtualize every unbounded list
- separate layout cost from paint cost
- cache shaped text by content/font/width
- use stable IDs for dirty tracking
- make debug/performance counters visible in-app
The renderer should initially support:
- rectangles
- rounded rectangles
- borders
- solid fills
- clipping
- scroll containers
- text runs
- monospaced blocks
- simple icons or vector-like primitives
- image support later
Defer:
- blur effects
- complex shadows
- animation framework
- SVG-heavy rendering
- full markdown renderer
- full terminal emulator
- embedded code editor
Use a retained UI tree with immediate-style builder ergonomics.
Rationale:
- transcripts are long-lived and streamed incrementally
- tool outputs can be large
- panes need stable focus/selection state
- dirty tracking matters for resource use
- accessibility will eventually need stable semantic nodes
- multi-session surfaces need stable identity
The model should not imitate the DOM/CSS stack. A small product-specific layout system is enough:
- row
- column
- stack
- split pane
- fixed size
- flex fill
- scroll container
- virtual list
- overlay/modal
- intrinsic text measurement
Text is one of the hardest parts of this project and should be treated as a core system, not a detail.
The desktop client needs:
- Unicode shaping
- font fallback
- monospace code/tool output
- wrapping
- incremental append layout
- selection/copy
- input cursor behavior
- command palette text input
- markdown-ish transcript styling
- ANSI-like tool output styling eventually
Initial recommendation:
- use a Rust text stack such as
cosmic-text/swashif dependency review is acceptable - maintain a GPU glyph atlas
- cache shaped lines/runs by stable block ID and available width
- specialize streamed append paths so new output does not re-layout the whole transcript
Mac-specific text quality should be evaluated early. If Rust text rendering is not good enough on macOS, consider platform-backed text for macOS while preserving the same higher-level text layout API.
Initial budgets should be measured on both Linux development machines and representative macOS hardware.
| Metric | MVP target | Long-term target |
|---|---|---|
| Cold launch to visible window | < 500 ms | < 150 ms |
| Frontend idle CPU | ~0% | ~0% |
| Frontend idle RSS | < 100 MiB | < 50 MiB |
| Input-to-paint latency | < 32 ms | < 16 ms |
| Scrolling | 60 fps | 120 fps-capable |
| Fake transcript stress case | 100k blocks usable | 100k blocks smooth |
| Full transcript re-layout on append | forbidden | forbidden |
| Unbounded retained visible nodes | forbidden | forbidden |
| Renderer frame when idle | forbidden | forbidden |
Required early instrumentation:
- frame time
- layout time
- text shaping time
- display-list build time
- GPU submit time
- visible node count
- total retained node count
- glyph atlas size
- text cache size
- protocol event backlog
- daemon round-trip latency
- frontend RSS if available
A debug HUD should exist in the prototype before real Jcode integration is considered complete.
Example HUD:
frame 1.8ms | layout 0.3ms | text 0.6ms | gpu 0.4ms
nodes 812 | visible 47 | glyph atlas 12.4 MiB | events 0 | daemon 2ms
The first UI milestone should prove the engine before proving every product workflow.
Success criteria:
- launches a native desktop window from Linux
- renders through the custom GPU pipeline
- shows session sidebar, transcript, composer, and activity panel
- handles mouse, keyboard, focus, and scrolling
- renders fake streamed transcript data
- virtualizes a 100k-block transcript
- idles at near-zero CPU
- exposes performance/debug HUD
- has screenshot or golden-state tests where practical
Success criteria:
- connects to local Jcode server/daemon
- lists sessions
- attaches to a session/surface
- subscribes to event stream
- sends a user prompt
- streams assistant/tool events into the transcript
- can stop/cancel an active run
- recovers from daemon restart or disconnect gracefully enough for development use
Success criteria:
- activity center for background tasks/tool calls
- permission request overlay
- workspace/git status panel
- changed-file list
- open external editor/diff action
- session search/filter
- macOS app bundle prototype
Do not put the whole desktop app in the root crate.
Suggested structure:
crates/
jcode-desktop-protocol/ # shared protocol/event types if not already covered by server types
jcode-desktop-ui/ # UI tree, layout, text/cache abstractions, renderer-agnostic pieces
jcode-desktop-renderer/ # wgpu renderer and GPU resources
jcode-desktop/ # app shell, platform window, protocol client, product UI
If compile time becomes a problem, keep protocol/UI crates lightweight and gate GPU/window dependencies behind the final app crate.
“No frameworks” does not have to mean “no libraries.” It should mean no heavyweight app framework and no web-shell product architecture.
Likely acceptable dependencies:
wgpufor rendering abstraction- a very thin window/input layer such as
winitfor bootstrapping cosmic-text/swashor equivalent for text shaping/rasterization- small serialization/protocol crates already consistent with Jcode
Avoid:
- Electron
- Tauri
- Qt
- Flutter
- GTK as the app framework
- WebView UI shell
- React/Vue/Svelte-style UI stack
- CSS/DOM-based architecture
If winit becomes limiting for macOS polish, the platform layer can grow direct AppKit support while preserving the renderer and UI model.
Because macOS is the primary user target, validate these early even if development happens on Linux:
- Retina scale factor correctness
- trackpad inertial scrolling
- text clarity compared with native apps
- keyboard shortcuts use Command rather than Control where appropriate
- system dark/light mode follows user preference
- window resizing and full-screen behavior feels native
- app menu and close/minimize/quit semantics are correct
- clipboard round-trips rich enough for code and transcripts
- local socket permissions are safe
- app bundle can launch/find the daemon reliably
These should be resolved before implementation moves past the fake-data prototype:
- Use
winitinitially or write direct platform shells from the start? - Use
wgpuor direct Metal-first rendering? - Use
cosmic-text/swashor platform text APIs? - Reuse the existing Jcode server protocol or introduce a desktop-specific event protocol crate?
- Should the first desktop binary support multi-surface mode or only one active surface?
- What is the minimum macOS version to support?
- What is the first distribution path: local
.app, Homebrew cask, or signed/notarized DMG?
Create a fake-data desktop prototype that runs on Linux but measures the exact performance characteristics required by the eventual macOS product.
The prototype should not wait for a perfect daemon API. It should validate the expensive UI systems first:
- window creation
- renderer startup
- retained tree
- layout
- text cache
- virtualized transcript
- on-demand repaint
- debug HUD
Only after that should the real Jcode event stream be connected.