
Ifcviewer - an ultra fast ifcopenshell viewer and app #7930

Draft
Moult wants to merge 62 commits into datamodel-v1.0 from ifcviewer

Conversation

Contributor

@Moult Moult commented Apr 11, 2026

From https://community.osarch.org/discussion/3386/addressing-some-core-ifcopenshell-issues

A desktop viewer. Right now, Blender is really not optimised for a viewer, and people don't realise how fast IfcOpenShell really is because Blender itself imposes a crazy overhead for loading Blender meshes. We need a way on the desktop to view 50 models for simple coordination.

So the high level proposal is to make:

  1. A big viewer tool. Basically Bonsai but not for authoring, it's for coordination. It's for viewing, BCF issue tracking, clash detection, viewing with drawings, connection to CDEs, all the stuff that we currently have to do with proprietary software (BIMVision, Revizto, ...).
  2. A framework for other people to build their own tools. So if you want to build your own kiosk app, tablet app, desktop app, phone app, ... you have a starting point. Kind of like the non-web equivalent of "npm install web-ifc-component"

My general impression is that all the strategies for making a high-performance viewer are well established and heavily documented (i.e. well covered in AI training data). So I'm fairly confident we can quickly get to a state where all the basic tricks are implemented with vibe coding, and then start to approach the more cutting edge, like whatever Nanite is doing.

Also, I think the general approach of a Qt app with dockable windows and standard property viewers, checkboxes, a settings window, a status bar, etc. is very well established, and AI will do a pretty good job in a greenfield situation, so I hope to vibe away.

I hope to get to a milestone where the bare viewer is ready with a good foundation. After that, hopefully building on the work done with IfcZero, the datamodel refactoring, new kernels, etc. (especially the Python utils upstreamed to C++), and with the guidance of the Bonsai data classes and UI layer, I can replicate some of the more important read-only properties. I think this is somewhat low risk: so long as the AI helps with all the Qt stuff, I'll be able to look after the IFC-related domain logic.

Obviously, don't merge :) Comments very very welcome.

See README.md for latest full explanation:

https://github.com/IfcOpenShell/IfcOpenShell/tree/ifcviewer/src/ifcviewer

(Also, ignore the Python stuff and build hacks; that's just a mess I made in my local dev.)

@Moult Moult marked this pull request as draft April 11, 2026 10:44
@Moult Moult mentioned this pull request Apr 11, 2026
Member

aothms commented Apr 11, 2026

For me either way is fine, but maybe in light of "release early, release often" I think doing this with daily releases enabled would be even cooler. Just tell Claude/Codex/... that releases are built with build_win.yml, build_rocky.yml, build_rocky_arm.yml and build_osx.yml; it should be able to figure it out.

@Moult Moult force-pushed the ifcviewer branch 2 times, most recently from d000002 to 91c8e46 Compare April 14, 2026 10:05
Contributor Author

Moult commented Apr 18, 2026

This bears a bit of commentary. After commit 196f984 landed, I basically decided to "do whatever AI thought necessary in GPU shader land".

It was not a good move.

All those commits up until 01dd8d5 were basically just GPU experiments, and they didn't achieve much at all. Most of the work was actually reverted; it looked like the AI was going nuts.

I then went back and decided to fix two things. Essentially, most of the time is spent when navigating. So: 1) get HiZ to work properly (it was previously not implemented correctly and so basically always off), and 2) just turn off small objects when orbiting (kind of like an aggressive contribution culling).

After those two tweaks, I think the results speak for themselves: basically a huge speed boost. (I never got HiZ to be absolutely perfect; there were always edge cases, I believe due to the huge scene distance combined with lots of very thin objects like pipes, i.e. extreme aspect-ratio AABBs, but it was good enough whilst orbiting.)

Note that right now these are mostly configured via env vars. There are also new benchmarking and camera args so I can consistently measure stats across scenes, plus flags that turn features on and off.
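For context, the contribution cull that IFC_MIN_PX_MOTION drives comes down to a projected-radius test. A minimal sketch, assuming a perspective camera and per-object bounding spheres (the function name and parameters are illustrative, not the viewer's actual API):

```cpp
#include <cmath>

// Projected pixel radius of a bounding sphere under a perspective camera.
// An object is skipped while orbiting when this falls below the threshold
// (IFC_MIN_PX_MOTION). Hypothetical helper; the real code may differ.
float pixelRadius(float world_radius, float dist,
                  float viewport_h_px, float fov_y_rad) {
    // Pixels per world unit at this distance along the view axis.
    float px_per_unit = viewport_h_px / (2.0f * dist * std::tan(0.5f * fov_y_rad));
    return world_radius * px_per_unit;
}
```

With a 1000 px viewport and a 90° vertical FOV, a 1 m sphere 10 m away projects to 50 px; at IFC_MIN_PX_MOTION=10, anything below 10 px is dropped while the camera moves.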

IFC_GPU_CULL=0 IFC_HIZ_MOTION=0 IFC_MIN_PX_MOTION=0
=== BENCHMARK (200 frames, orbit 103° at 0.5°/frame) ===
  avg: 61.25 ms (16.3 fps)
  median: 59.88 ms (16.7 fps)
  p1: 54.60 ms  p99: 73.61 ms
  last frame: obj 254097  tri 44719843  sub_draws 155073  hiz_rej 0
=== END BENCHMARK ===

IFC_GPU_CULL=0 IFC_HIZ_MOTION=0 IFC_MIN_PX_MOTION=10
=== BENCHMARK (200 frames, orbit 103° at 0.5°/frame) ===
  avg: 37.67 ms (26.5 fps)
  median: 37.56 ms (26.6 fps)
  p1: 33.81 ms  p99: 44.90 ms
  last frame: obj 70408  tri 21657373  sub_draws 55952  hiz_rej 0
=== END BENCHMARK ===

IFC_GPU_CULL=0 IFC_HIZ_MOTION=1 IFC_MIN_PX_MOTION=0
=== BENCHMARK (200 frames, orbit 103° at 0.5°/frame) ===
  avg: 21.44 ms (46.6 fps)
  median: 20.76 ms (48.2 fps)
  p1: 12.40 ms  p99: 30.23 ms
  last frame: obj 33479  tri 8836541  sub_draws 17495  hiz_rej 28067
=== END BENCHMARK ===

IFC_GPU_CULL=0 IFC_HIZ_MOTION=1 IFC_MIN_PX_MOTION=10
=== BENCHMARK (200 frames, orbit 103° at 0.5°/frame) ===
  avg: 19.62 ms (51.0 fps)
  median: 18.51 ms (54.0 fps)
  p1: 11.30 ms  p99: 30.13 ms
  last frame: obj 11388  tri 6935091  sub_draws 8683  hiz_rej 11488
=== END BENCHMARK ===

IFC_GPU_CULL=1 IFC_HIZ_MOTION=1 IFC_MIN_PX_MOTION=10
=== BENCHMARK (200 frames, orbit 103° at 0.5°/frame) ===
  avg: 19.22 ms (52.0 fps)
  median: 18.35 ms (54.5 fps)
  p1: 10.94 ms  p99: 29.98 ms
  last frame: obj 11304  tri 6912165  sub_draws 8584  hiz_rej 58963
=== END BENCHMARK ===

@Moult Moult force-pushed the ifcviewer branch 5 times, most recently from 6394034 to 8c41011 Compare April 22, 2026 20:39
Moult and others added 19 commits April 23, 2026 06:39
All AI generated slop. Do NOT trust these "fixes". It's just to get it
working on my machine.
Track per-object AABB and index range during upload. Each frame,
extract frustum planes from the view-projection matrix and cull
objects whose AABB is entirely outside any plane. Draw only visible
objects via glMultiDrawElements. Document the three-phase rendering
performance strategy in README.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
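The plane-extraction and AABB test this commit describes is the standard Gribb/Hartmann approach. A self-contained sketch, assuming a row-major view-projection matrix with clip = M·v (the shipped code's conventions may differ):

```cpp
#include <array>

struct Plane { float a, b, c, d; };            // a*x + b*y + c*z + d >= 0 is "inside"
struct Aabb  { float min[3], max[3]; };

// Gribb/Hartmann: each frustum plane is a sum or difference of matrix rows.
std::array<Plane, 6> extractFrustumPlanes(const float m[4][4]) {
    std::array<Plane, 6> p;
    for (int i = 0; i < 3; ++i) {              // i = x, y, z clip axes
        p[2*i]   = { m[3][0]+m[i][0], m[3][1]+m[i][1], m[3][2]+m[i][2], m[3][3]+m[i][3] };
        p[2*i+1] = { m[3][0]-m[i][0], m[3][1]-m[i][1], m[3][2]-m[i][2], m[3][3]-m[i][3] };
    }
    return p;
}

// Cull when the AABB lies entirely on the negative side of any plane.
// "Positive vertex" trick: only test the corner farthest along the normal.
bool aabbOutside(const Aabb& b, const std::array<Plane, 6>& planes) {
    for (const Plane& pl : planes) {
        float x = pl.a >= 0 ? b.max[0] : b.min[0];
        float y = pl.b >= 0 ? b.max[1] : b.min[1];
        float z = pl.c >= 0 ? b.max[2] : b.min[2];
        if (pl.a*x + pl.b*y + pl.c*z + pl.d < 0) return true;
    }
    return false;
}
```

The survivors of this test are the objects whose draw records get handed to glMultiDrawElements.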
Show FPS, frame time, visible/total objects, and visible/total
triangles in the status bar. Toggled via Settings > Show Performance
Stats, persisted in app settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce ModelHandle and per-model GeometryStreamers so multiple IFC
files can be loaded simultaneously. Object IDs are globally unique
(monotonically increasing across models). File picker is now multiselect.
Each model gets a top-level tree node. Property lookup uses the correct
model's ifcopenshell::file. ViewportWindow supports hide/show/remove
per model via model_id filtering in the frustum cull pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reflect current architecture: per-model streamers, glMultiDrawElements
with frustum culling, 32-byte vertex format with color, multiselect
file picker, settings/stats files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…load

Phase 2 performance: BVH acceleration with median-split build, per-model
trees, and EBO re-sorting for GPU cache coherence. Raw binary .ifcview
sidecar stores full geometry + BVH for instant subsequent loads (skip
tessellation entirely).

Per-model GPU buffers (VAO/VBO/EBO per model) eliminate cross-model buffer
copies on growth. Sidecar reads happen on a background thread. Bulk GPU
uploads are progressive (48 MB/frame chunks) so the viewport stays
interactive while multi-GB models stream in.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
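The median-split build mentioned above can be sketched in a few dozen lines. This is an illustrative reconstruction (kMaxLeafSize standing in for the viewer's BVH_MAX_LEAF_SIZE), not the shipped implementation:

```cpp
#include <algorithm>
#include <vector>

struct Aabb {
    float mn[3] = {  1e30f,  1e30f,  1e30f };
    float mx[3] = { -1e30f, -1e30f, -1e30f };
    void grow(const Aabb& o) {
        for (int i = 0; i < 3; ++i) {
            mn[i] = std::min(mn[i], o.mn[i]);
            mx[i] = std::max(mx[i], o.mx[i]);
        }
    }
    float centroid(int a) const { return 0.5f * (mn[a] + mx[a]); }
    int longestAxis() const {
        float dx = mx[0]-mn[0], dy = mx[1]-mn[1], dz = mx[2]-mn[2];
        return dx > dy ? (dx > dz ? 0 : 2) : (dy > dz ? 1 : 2);
    }
};

struct BvhNode { Aabb bounds; int left = -1, right = -1, first = 0, count = 0; };

constexpr int kMaxLeafSize = 8;

// Median split: partition object indices at the centroid median along the
// node's longest axis; leaves keep up to kMaxLeafSize contiguous objects.
int buildBvh(std::vector<BvhNode>& nodes, std::vector<int>& order,
             const std::vector<Aabb>& items, int first, int count) {
    int idx = (int)nodes.size();
    nodes.emplace_back();
    for (int i = 0; i < count; ++i) nodes[idx].bounds.grow(items[order[first + i]]);
    if (count <= kMaxLeafSize) {
        nodes[idx].first = first; nodes[idx].count = count;
        return idx;
    }
    int axis = nodes[idx].bounds.longestAxis();
    int mid = count / 2;
    std::nth_element(order.begin() + first, order.begin() + first + mid,
                     order.begin() + first + count,
                     [&](int a, int b) { return items[a].centroid(axis) < items[b].centroid(axis); });
    int l = buildBvh(nodes, order, items, first, mid);
    int r = buildBvh(nodes, order, items, first + mid, count - mid);
    nodes[idx].left = l; nodes[idx].right = r;
    return idx;
}
```

std::nth_element partitions around the median in linear time per level, so the whole build stays O(n log n); the resulting per-leaf contiguity is also what makes the EBO re-sorting mentioned above possible.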
Per-second frame log reports fps/ms, visible/total object & triangle
ratios, VRAM breakdown (VBO+EBO), model count, and pending uploads.

Upload-complete log includes per-model VBO/EBO MB and scene total VRAM.

Streamer runs an instancing analysis keyed on geom.id(): total shapes,
unique representations, dedup ratio, theoretical VBO/EBO/SSBO sizes if
instanced, potential savings, and top-5 most-duplicated representations.
Used to validate whether GPU instancing is worth the architectural
rewrite for a given dataset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a BVH leaf passes the frustum test, emit a single glMultiDrawElements
record covering the leaf's entire index range instead of one per object.
Leaves are contiguous in the EBO after reorderEbo, so the range is just
[first_object.index_offset, sum(index_count)]. Cuts draw calls by ~8x
(BVH_MAX_LEAF_SIZE) and shifts the bottleneck from CPU/driver per-draw
overhead toward GPU vertex throughput.

Per-object features (selection highlight, per-vertex color, object_id
picking) are unchanged — they operate on vertex attributes, not draw
state. Future per-object hide/override will use SSBO lookups sampled
by object_id in the fragment shader.

Slight overdraw from skipping per-object frustum tests within a leaf is
negligible given median-split BVH tightness and spare tri throughput.

Also adds visible_objects_ counter so stats still report true object
counts (not leaf counts), plus leaf_draws/model_draws breakdown in the
per-second frame log.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
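The leaf-merge arithmetic is simple once a leaf's objects are EBO-contiguous; an illustrative sketch with hypothetical struct names, mirroring the [first_object.index_offset, sum(index_count)] description above:

```cpp
#include <cstddef>
#include <vector>

struct ObjectDraw { size_t index_offset; int index_count; };
struct DrawRange  { size_t first_index;  int index_count; };

// One multi-draw record for a whole BVH leaf: start at the first object's
// EBO offset and cover the summed index counts of all objects in the leaf.
DrawRange leafDrawRange(const std::vector<ObjectDraw>& objs, int first, int count) {
    DrawRange r{ objs[first].index_offset, 0 };
    for (int i = 0; i < count; ++i)
        r.index_count += objs[first + i].index_count;
    return r;
}
```

With kMaxLeafSize-sized leaves this is where the ~8x draw-call reduction comes from: one record per leaf instead of one per object.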
Commit A of the instancing migration (Phase 3a).  The streamer now runs
the iterator with use-world-coords=false and dedupes by the geometry's
representation id, emitting a MeshChunk once per unique geometry and an
InstanceChunk per placement.  The viewport keeps geometry in local
coordinates (28 B/vertex, down from 32) and applies the per-instance
transform in the vertex shader via an std430 SSBO indexed by
gl_InstanceID + a per-draw uniform offset.  After streaming finishes
finalizeModel() stable-sorts instances by mesh_id, assigns each mesh a
contiguous range, and uploads the SSBO; render then issues one
glDrawElementsInstancedBaseVertex per mesh.

BvhAccel is reshaped to operate on a generic BvhItem (world AABB +
model_id) so it can drive instance-level culling, but the path is not
wired in yet -- every instance is drawn every frame in this commit.
Progressive-during-streaming rendering is likewise disabled: a model
appears when its SSBO is uploaded, not incrementally.  Sidecar cache
is stubbed (reads miss, writes are no-ops); the v4 on-disk format with
MeshInfo + InstanceGpu sections lands in Commit B.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
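The finalizeModel() step described above amounts to a stable sort plus a range scan. A sketch with hypothetical struct names (transform payload omitted for brevity):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Instance  { uint32_t mesh_id; uint32_t object_id; };
struct MeshRange { uint32_t first = 0, count = 0; };

// Stable-sort instances by mesh_id so each mesh owns a contiguous SSBO
// range, then record [first, count) per mesh for the per-mesh draw calls.
std::vector<MeshRange> sortAndRange(std::vector<Instance>& instances,
                                    uint32_t mesh_count) {
    std::stable_sort(instances.begin(), instances.end(),
                     [](const Instance& a, const Instance& b) {
                         return a.mesh_id < b.mesh_id;
                     });
    std::vector<MeshRange> ranges(mesh_count);
    for (uint32_t i = 0; i < instances.size(); ++i) {
        MeshRange& r = ranges[instances[i].mesh_id];
        if (r.count == 0) r.first = i;
        ++r.count;
    }
    return ranges;
}
```

The stable sort preserves insertion order within a mesh, which keeps object IDs deterministic across runs; each range then feeds one glDrawElementsInstancedBaseVertex with its per-draw uniform offset.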
Commit B of the instancing migration.  The sidecar on-disk format is
reintroduced at version 4 with MeshInfo + InstanceCpu sections in place
of v3's flat per-object draw-info array.

After streaming finishes, MainWindow asks the viewport for a post-
finalise snapshot (VBO + EBO are read back from the GPU, meshes and
instances come from the CPU-side arrays) and writes it alongside
PackedElementInfo + the string table.  On a subsequent load,
readSidecar rehydrates the whole struct and ViewportWindow::
applyCachedModel uploads VBO/EBO/SSBO in a single step, bypassing the
iterator entirely.

Staleness check is still by source file size.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Re-wires the BVH acceleration structure on top of the new instanced
renderer.  Per model, build a BVH over per-instance world AABBs at
finalize (and on sidecar apply).  Each frame, traverse the BVH against
the camera frustum to produce a visible-instance index list, bucket by
mesh_id, and upload to a per-model SSBO at binding=1.  The main and
pick vertex shaders do a double-indirection
`instances[visible[u_offset + gl_InstanceID]]` so draws only touch
instances that passed the frustum test.

Models with fewer than BVH_MIN_OBJECTS instances skip the BVH build
and fall back to a linear per-instance frustum test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pre-allocate the instance SSBO on model creation (4 MB, grow-on-demand)
and append each arriving InstanceChunk directly to the GPU-side
InstanceGpu array in uploadInstanceChunk.  This makes a model drawable
as soon as its first mesh + first instance chunk land, rather than
waiting for finalizeModel.

The visible-list architecture already decouples SSBO order from the
draw path, so appending in insertion order is correct — no sorting
required.  finalizeModel collapses to:
  - compute per-mesh instance counts (for stats + sidecar round-trip)
  - build the per-model BVH over instance world AABBs

Render / pick loops now gate on ssbo_instance_count > 0 rather than
the finalized flag.  Stats include in-progress models in totals
(excluding only hidden).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each visible model now issues a single glMultiDrawElementsIndirect
call instead of one glDrawElementsInstancedBaseVertex per mesh.  The
CPU BVH cull populates an array of DrawElementsIndirectCommand
records plus the flat visible-instance list, uploads both, and draws
the whole model in one GL call.

Vertex shaders switch from a uniform u_instance_offset to
gl_BaseInstanceARB (ARB_shader_draw_parameters), so per-draw offset
comes from the indirect command's baseInstance field.

Draw-call counts for BIM scenes with hundreds of unique meshes drop
from hundreds-per-frame to one-per-model, cutting driver overhead.
This also sets up the plumbing for the follow-up compute-shader cull
that will populate the indirect buffer entirely on-GPU.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs conflated as "weird colors":

1. Two-sided lighting.  IFC placements often embed reflection
   matrices (mirrored families).  Transforming a_normal by
   mat3(inst.transform) produces a normal pointing the wrong way
   on those instances, and max(n·L, 0) then clamps the surface to
   pure ambient — reads as dark / washed out.  Use gl_FrontFacing
   to flip n in the fragment shader so both winding orientations
   shade correctly.  The proper fix (ship an inverse-transpose
   normal matrix or a det-sign bit per instance) is still owed;
   that would unlock re-enabling GL_CULL_FACE for a big fragment-
   work win on closed solids.

2. Stats label "inst_draws" was counting indirect sub-draws, not
   actual GL draw calls — misleading since MDI collapses N sub-
   draws into one glMultiDrawElementsIndirect.  Split into
   gl_draw_calls (real GL calls, = drawn-model count) and
   indirect_sub_draws (packed sub-commands).  For a BIM model
   with 47k unique meshes at full view this now correctly reads
   "1 gl_draws (47092 sub)" rather than suggesting 47k driver
   dispatches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
IFC files routinely have IfcConnectedFaceSets whose faces point
inconsistently within the same shell — the result under per-vertex
normals is dark inside-out patches, and under GL_CULL_FACE it's
swiss-cheese.  reorient-shells fixes the face winding at geometry
generation time, which is the only place it can be fixed correctly;
no shader trick can recover from a mesh whose triangles disagree
among themselves.

Off by default in IfcOpenShell because it adds iterator time, but
we cache the result in the sidecar so it's a one-shot cost per file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enables GL_CULL_FACE by default (user-toggleable in Settings) so
closed solids skip shading their back halves.  The catch is that
IFC placements can contain reflections (mat4 with det<0 — mirrored
families, symmetric instances).  Naively culling would make every
mirrored instance vanish because the rasterizer sees its screen-space
winding as backwards.

Fix: detect reflections at upload time via determinant sign, bucket
visible instances into forward (det>=0) and reverse (det<0) per mesh
during culling, and issue two glMultiDrawElementsIndirect calls per
model with glFrontFace toggled CCW/CW between them.  The indirect
buffer is still one buffer — just split into a forward slice followed
by a reverse slice, with m.indirect_forward_count recording the split.

Vertex shader flips the normal when the transform has negative
determinant, keeping lighting correct on mirrored instances.  The
fragment shader keeps the gl_FrontFacing fallback as a safety net
when culling is disabled (e.g. for files with open shells).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
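Detecting a mirrored placement is a one-liner on the upper-left 3x3 determinant. A sketch assuming OpenGL's column-major mat4 layout:

```cpp
// Upper-left 3x3 determinant of a column-major 4x4 transform.
// A mirrored placement (reflection) has a negative determinant, so its
// triangles rasterize with flipped screen-space winding.
float upperDet3(const float m[16]) {
    return m[0] * (m[5] * m[10] - m[6] * m[9])
         - m[4] * (m[1] * m[10] - m[2] * m[9])
         + m[8] * (m[1] * m[6]  - m[2] * m[5]);
}

bool isReflected(const float m[16]) { return upperDet3(m) < 0.0f; }
```

At upload time each instance gets this bit; the cull pass then buckets by it, and the draw loop toggles glFrontFace between the forward and reverse indirect slices.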
Moult and others added 29 commits April 23, 2026 06:39
Two stability bugs:

1. Clicking an object left the scene with wrong shading until the camera
   moved.  The pick pass re-culls every model with its own parameters
   (min_pixel_radius=0, no HiZ) and overwrites each model's visible_ssbo
   and indirect buffer.  The next render() saw an unchanged camera,
   skipped the cull via the have_cached_cull_ shortcut, and drew the
   stale pick-pass buffers.  Fix: invalidate have_cached_cull_ at the
   end of pickObjectAt().

2. Loading two sidecar-cached models made the second model's picked
   properties resolve to the first model's elements.  Sidecars store raw
   object_id / model_id values from the session that wrote them, and
   both files start at object_id=1, so element_map_ entries collided.
   Fix: on load, rebase every PackedElementInfo and InstanceCpu by
   (next_object_id_ - min_id_in_sidecar) and overwrite model_id with
   the freshly-assigned handle before the elements hit element_map_.

Also document both in the README — the pick-pass note under 3A
contribution culling, the sidecar rebase under the sidecar format
section.
The 'Known caveats' bullet still described the old 1-frame-stale
behavior.  Since 6b496d8 the cull compares hiz_vp_ to the current VP
and drops HiZ rejection whenever they differ, so HiZ only helps on
still frames — orbiting gets no benefit.  Call out the tradeoff and
the planned same-frame-depth-pre-pass fix slated for Phase 3E.
Scaffolding for Phase 3E (GPU compute cull).  After finalizeModel /
applyCachedModel, pack each InstanceCpu's world AABB + mesh_id +
reflection bit into a std430-friendly 32 B record and push it to a
per-model aabb_ssbo.  No consumer yet — the CPU cull still drives
rendering — but the next commits will point a compute shader at this
buffer and have it produce the visible list + indirect commands
directly on the GPU.

Cost: 32 B per instance, ~18 MB for the 569 k-instance test scene.
One-shot upload at finalize time; streaming-time appends aren't
mirrored (the CPU cull doesn't need the SSBO, and finalizeModel
rebuilds the whole thing in one go).
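The 32 B figure works out exactly: six floats of world AABB plus two uint32 words. A sketch of a layout consistent with the commit's numbers (field names hypothetical):

```cpp
#include <cstdint>

// std430-friendly 32-byte per-instance record: world AABB, mesh id, and a
// flags word carrying the reflection bit. All members are 4-byte scalars,
// so the std430 array stride equals sizeof with no hidden padding.
struct InstanceAabbGpu {
    float    min_xyz[3];
    float    max_xyz[3];
    uint32_t mesh_id;
    uint32_t flags;      // bit 0: reflected (det < 0)
};
static_assert(sizeof(InstanceAabbGpu) == 32, "must match std430 stride");
```

At 32 B per instance, the 569 k-instance test scene costs 569000 × 32 ≈ 18.2 MB, matching the estimate above.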
First Phase 3E milestone: a compute shader that reads the per-instance
world-AABB SSBO added in the last commit, tests each instance against
the 6 frustum planes, and atomicAdds a global counter.  No visible list
or indirect-buffer writes yet — the output is just a survivor count,
cross-checked each frame against the CPU cull's numbers in the stats
line (`gpu_cull[Xms in=A surv=B]`) so we can verify the plumbing end-
to-end before we hand the GPU responsibility for the actual render data.

Dispatched from render() after the CPU cull completes, only when
IFC_GPU_CULL=1 and the camera moved (the skipped-cull still-frame path
doesn't re-check either).  The readback is synchronous — that's fine
for a validation path; it'll go away once the GPU writes indirect
commands directly.

Expected invariant: gpu_cull.surv >= cpu_cull.visible_objects, since
the GPU path does frustum-only and CPU adds contribution + HiZ cuts on
top.  A large mismatch (orders of magnitude, or surv < visible) means
the SSBO upload or shader logic is wrong.

No shader/buffer bindings overlap with the draw path (compute uses
bindings 0/1, restored before drawing; draw programs rebind 0/1/2).
Promote the compute cull from a validation shader to the actual draw
driver.  With the gate on, the CPU cull fan-out is skipped and MDI
consumes gpu_indirect_buffer / gpu_visible_ssbo directly.

- uploadGpuCullStaticBuffers() pre-fills per-mesh DrawElementsIndirect
  commands and a mesh_base prefix sum so the compact shader can scatter
  survivors into a fixed per-mesh range.  Instance count for each
  command is zeroed by a tiny reset dispatch, then the compact shader
  atomically writes survivors and increments instanceCount.
- Draw loop branches on the gate: single CCW MDI with all mesh
  commands.  Fwd/rev winding split, LOD selection, and HiZ are still
  CPU-path-only; reflected instances render with wrong winding under
  this gate (step 3b).
- Once-per-second readback of each model's indirect buffer populates
  the survivor / visible-object / visible-triangle stats so the
  [frame] line reflects what the GPU actually drew.

Known regression: sub_draws is the full mesh count per model (~172k on
the test dataset) vs the handful of non-empty commands the CPU path
produces.  Command-processor overhead from zero-instance sub-draws is
what drives the FPS drop, not the cull itself (0.05 ms).  Compacting
non-empty commands requires glMultiDrawElementsIndirectCount, a GL 4.6
entrypoint not exposed by Qt's QOpenGLFunctions_4_5_Core; deferring to
3a-followup so we don't bolt a getProcAddress loader into the renderer
mid-restructure.

IFC_GPU_CULL is off by default, so this does not affect normal runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend the GPU-cull indirect buffer from M to 2M commands: the first M
are the forward (non-reflected, CCW) bucket, the second M are the
reverse (reflected, CW) bucket.  The compact shader reads flags bit 0
from the AABB SSBO and routes each survivor to the appropriate bucket
via bucket = reflected ? mesh_id + M : mesh_id.

uploadGpuCullStaticBuffers() now precomputes exact per-mesh fwd/rev
instance counts so each bucket reserves only the slots it needs
(total visible_ssbo size unchanged — sum of fwd + rev = total).

Draw loop issues two MDIs per model under IFC_GPU_CULL: first M
commands CCW, next M commands CW.

Sub-draws doubled (172k → 345k) which further regresses FPS due to
command-processor overhead from zero-instance sub-draws — the same
issue noted in 3a.  MDI compaction remains the fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The compact shader now computes per-instance pixel radius and routes
survivors to LOD1 buckets when the projected sphere falls below the
LOD1 threshold (default 30 px, same as CPU path, tunable via
IFC_LOD1_PX).

Layout expanded from 2 to 4 buckets per mesh:
  [0..M)   fwd_lod0   [M..2M)   fwd_lod1
  [2M..3M) rev_lod0   [3M..4M)  rev_lod1

Two MDIs per model: CCW for [0..2M), CW for [2M..4M).  Per-mesh
has_lod1 flags live in a new gpu_mesh_flags_ssbo (binding 4).

Contribution cull refactored: the compact shader now computes
pixelRadius() once and uses it for both the min_pixel_radius rejection
and LOD routing, matching the CPU path's logic.

Visible-buffer worst case is 2 × total_instances (each LOD bucket
reserves the full fwd/rev capacity per mesh, since LOD selection is
dynamic).

Tri count drops ~60% on the test dataset (53M → 22M) thanks to LOD1
decimated meshes.  FPS recovers from 16 to 36 despite 690k sub_draws
(4M layout).  MDI compaction remains the final perf fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
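The 4-bucket routing above reduces to index arithmetic. A sketch of the mapping, with M = mesh_count:

```cpp
#include <cstdint>

// Indirect-buffer bucket layout with M meshes:
//   [0..M)   fwd_lod0   [M..2M)   fwd_lod1
//   [2M..3M) rev_lod0   [3M..4M)  rev_lod1
uint32_t bucketIndex(uint32_t mesh_id, uint32_t mesh_count,
                     bool reflected, bool lod1) {
    uint32_t b = mesh_id;
    if (lod1)      b += mesh_count;       // LOD1 slice of the winding half
    if (reflected) b += 2u * mesh_count;  // reverse-winding half
    return b;
}
```

The compact shader computes the same index from the AABB SSBO's reflection flag and the pixelRadius()-based LOD decision; the two MDIs then cover [0..2M) CCW and [2M..4M) CW.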
Two-phase compute-cull dispatch when IFC_GPU_CULL=1:

  Phase 1  frustum + contribution + LOD, no HiZ  → survivors
  Depth    render survivors depth-only into half-viewport FBO
  Build    GPU compute max-reduce depth → R32F mip pyramid
  Phase 2  same cull + HiZ test                  → final survivors
  Color    render final survivors

The compact shader's new hizOccluded() projects 8 AABB corners to
screen space, picks the mip level where the covered rect fits in ≤2×2
texels, and rejects when the AABB's near-depth exceeds the pyramid's
max depth.

New GPU resources (per-window):
  hiz_gpu_fbo_ / hiz_gpu_depth_tex_  — depth-only FBO at half viewport
  hiz_gpu_pyramid_tex_                — R32F mipmapped pyramid
  hiz_gpu_copy_prog_                  — compute: depth → pyramid L0
  hiz_gpu_reduce_prog_                — compute: max-reduce L(n-1)→L(n)
  hiz_gpu_depth_prog_                 — vertex + trivial fragment

On a dense 18-model BIM dataset:
  survivors:  140k → 65k  (HiZ rejects ~50%)
  triangles:  22M  → 13M
  gpu_cull:   0.06ms → 22.5ms  (depth pre-pass CP overhead)

The depth pre-pass suffers the same empty-sub-draws CP overhead as the
color pass (690k commands, most with instanceCount=0).  Once MDI
compaction lands, both passes will be fast.  For now, net FPS is flat
(savings on color ≈ cost of depth pre-pass).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
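The "fits in ≤2×2 texels" mip pick is the usual HiZ level formula: texels at level L cover 2^L pixels, so choose the smallest L where the rect's larger extent divided by 2^L is at most 2. A sketch of the selection math (hizMipLevel is an illustrative name, not the shader's):

```cpp
#include <algorithm>
#include <cmath>

// Pick the coarsest pyramid level at which the AABB's projected screen rect
// spans at most 2 texels per axis, so the occlusion test reads a fixed
// 2x2 neighbourhood of max-depth values.
int hizMipLevel(float rect_w_px, float rect_h_px, int num_levels) {
    float extent = std::max(rect_w_px, rect_h_px);
    // Need extent / 2^level <= 2, i.e. level >= log2(extent / 2).
    int level = (int)std::ceil(std::log2(std::max(extent, 1.0f) / 2.0f));
    return std::clamp(level, 0, num_levels - 1);
}
```

The AABB is then rejected when its nearest projected depth is still behind the pyramid's max depth at that level.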
Pack compute shader compacts non-empty indirect commands into
contiguous fwd/rev ranges, eliminating ~690k empty sub-draws that
dominated command-processor overhead.  GL 4.6 entrypoint loaded via
getProcAddress with ARB fallback; graceful degradation to uncompacted
MDI when unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the CPU BVH traversal + frustum + contribution stages with a
GPU compute path (IFC_GPU_CULL=1).  A single scene-wide dispatch tests
all instances against frustum planes and screen-space contribution
threshold, compacting survivors into a flat uint32 buffer via atomicAdd.

Uses one-frame-late async readback: frame N dispatches and fences,
frame N+1 polls the fence (non-blocking) and reads the persistent-
mapped result buffer with zero GPU sync cost.  CPU still handles HiZ,
LOD selection, winding bucketing, and indirect command generation from
the compact survivor list; draw path is unchanged.

On a 1M-instance / 111-model scene (GTX 1650):
  GPU dispatch:  0.70 ms  (frustum + contribution, brute-force)
  Readback:      0.00 ms  (fence already signaled, persistent map)
  CPU consume:   5.7–6.7 ms  (parallel emit across models)
  Cull wall:     5.8–6.9 ms  (vs 9.6–15.2 ms CPU-only path)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cull

Only clear and emit mesh buckets that received survivors in the previous
frame, converting both phases from O(total_meshes) to O(active_meshes).
Adds per-sub-phase timing (bin/clr/class/emit) to the stats line.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The HiZ pipeline had two bugs causing false occlusions:

1. The scaling depth blit (glBlitFramebuffer from window-size to HiZ-size)
   produced GL_INVALID_VALUE on some drivers. Replace with a fullscreen-
   triangle shader that samples the resolved depth and writes gl_FragDepth.

2. The resolve texture used GL_DEPTH_COMPONENT24 but Qt's default FBO uses
   D24S8 (depth+stencil). Mismatched formats cause the MSAA resolve blit
   to fail. Fix by using GL_DEPTH24_STENCIL8 for the resolve texture.

Additionally, the occlusion test was too aggressive for scenes with
compressed depth ranges (entire scene in 0.99-1.0). Change from
"max over coarse mip texels" to "reject only if ALL fine-mip texels
agree the AABB is behind them", with early-out on first non-occluding
texel and a 64-sample cap.

Also fix IFC_HIZ_MOTION=0 being treated as enabled (checked env var
existence, not value).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
During camera motion, use a larger pixel-radius threshold (IFC_MIN_PX_MOTION)
to aggressively cull small objects, dramatically reducing sub_draws and
improving orbit fps (e.g. 29→67 fps on 1M-instance scene).  When the camera
stops, automatically re-cull at the base threshold to restore full detail.

Key behaviors:
- IFC_MIN_PX_MOTION=N sets the motion threshold (0 = disabled)
- Settle recull fires on the first still frame after motion
- HiZ pyramid invalidated on settle (stale from sparse motion frame)
- GPU cull results skipped on settle (dispatched at motion threshold)
- requestUpdate() ensures the settle frame actually runs

Also adds IFC_SUBDRAW_DIAG=1 diagnostic for sub-draw composition analysis
and documents Phase 3E/3F experiment results in README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add --camera tx,ty,tz,dist,yaw,pitch and --benchmark N CLI args for
reproducible performance measurement.  The benchmark orbits the camera
(0.5°/frame yaw) for N frames after a 5-frame warmup, prints
avg/median/p1/p99 frame times, then exits.  Press C during interactive
use to print the current camera as a --camera argument.

Fix settle recull to fire after ANY camera motion (not just when
IFC_MIN_PX_MOTION is set), ensuring HiZ artifacts from motion frames
are always cleared when the camera stops.

Document Phase 3G (motion-adaptive culling + HiZ during motion) in
README with benchmark results from 1.06M-instance scene:
  - Baseline:                    16.3 fps
  - IFC_MIN_PX_MOTION=10:       26.5 fps (1.6x)
  - IFC_HIZ_MOTION=1:           46.6 fps (2.9x)
  - Both combined:              51.0 fps (3.1x)
  - + GPU_CULL:                 52.0 fps (3.2x, negligible gain)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benchmarks showed negligible gain (52 vs 51 fps) — the CPU BVH path
already culls efficiently, and the GPU path still read back to CPU for
LOD/winding/HiZ. Removes ~570 lines of dead weight: compute shader,
async readback, one-frame-late consume, per-model AABB SSBOs, and
profiling counters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use $<TARGET_FILE_DIR:IfcGeom> instead of hardcoded
${CMAKE_BINARY_DIR}/ifcgeom/$<CONFIG> for plugin runtime dirs — the
old path was wrong on non-MSVC generators where $<CONFIG> expands
empty. Add explicit add_dependencies for kernel/mapping plugins so
IfcViewer waits for them to build, and drop the redundant direct link
against ${kernel_libraries}.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace i16x2 octahedral normals with i8x2, filling the 2-byte padding
after position and saving 4 bytes per vertex. int8 gives ~1.4 deg
worst-case angular error — invisible for BIM geometry which is
overwhelmingly axis-aligned. 25% VBO reduction; sidecar files shrink
~15% overall (5.4 GB -> 4.6 GB on a 111-model test scene). Bumps
sidecar format to v7.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
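Octahedral mapping folds the lower hemisphere onto the corners of the unit diamond, so two snorm values reconstruct a full unit normal. A CPU reference of the i8x2 encode/decode (illustrative; the shader-side decode would mirror the same math):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

static float signNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Encode a unit normal to two snorm8 values (the i8x2 vertex format).
void octEncode8(Vec3 n, int8_t& ox, int8_t& oy) {
    float inv = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
    float u = n.x * inv, v = n.y * inv;
    if (n.z < 0.0f) {  // fold lower hemisphere over the diagonals
        float uu = (1.0f - std::fabs(v)) * signNotZero(u);
        float vv = (1.0f - std::fabs(u)) * signNotZero(v);
        u = uu; v = vv;
    }
    ox = (int8_t)std::lround(std::clamp(u, -1.0f, 1.0f) * 127.0f);
    oy = (int8_t)std::lround(std::clamp(v, -1.0f, 1.0f) * 127.0f);
}

Vec3 octDecode8(int8_t ox, int8_t oy) {
    float u = ox / 127.0f, v = oy / 127.0f;
    float z = 1.0f - std::fabs(u) - std::fabs(v);
    if (z < 0.0f) {    // unfold the lower hemisphere
        float uu = (1.0f - std::fabs(v)) * signNotZero(u);
        float vv = (1.0f - std::fabs(u)) * signNotZero(v);
        u = uu; v = vv;
    }
    float len = std::sqrt(u * u + v * v + z * z);
    return { u / len, v / len, z / len };
}
```

The worst-case angular error comes from the 1/127 quantization step; axis-aligned normals, which dominate BIM geometry, round-trip exactly or nearly so.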
Edge-collapse decimation (meshopt_simplify) returns BIM meshes unchanged
due to per-triangle vertex duplication and non-manifold topology. The
sloppy voxel-clustering decimator is faster, needs no shadow index
welding, and produces good results at the sub-30px LOD1 threshold.
Remove the non-sloppy branch, shadow buffer, IFC_LOD_SLOPPY and
IFC_LOD_LOCK_BORDER env vars.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Turn src/ifcviewer into libIfcViewer.so holding the rendering engine +
geometry pipeline (ViewportWindow, GeometryStreamer, BvhAccel,
InstancedGeometry, SidecarCache, LodBuilder, AppSettings).  Move the
existing UI shell (MainWindow, SettingsWindow, main.cpp) into
src/ifcviewer-full as the IfcViewerFull executable.  Add a new
src/ifcviewer-minimal target with a MinimalWindow that hosts only the
viewport and reuses the sidecar fast-path for benchmark/debug runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MainWindow and MinimalWindow each carried ~150 lines of mirrored
load-queue, sidecar-thread, streamer-wiring, and ID-rebase code. Lift
all of it into a SceneLoader QObject in the library; both apps now
consume it via signals. Sidecar writes stay on the full-app side since
they need the consumer's element metadata strings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Handle streamer success, failure, and cancellation as distinct terminal states so failed or cancelled loads do not finalize as successful models. Clean up partial model/UI state in the full and minimal viewer apps when a load is cancelled or fails.

Generated with the assistance of an AI coding tool.
Buffer viewport model mutations until the OpenGL context is initialized so loads that start before first exposure do not silently drop geometry or model state.

Generated with the assistance of an AI coding tool.
Previously readSidecar/writeSidecar were keyed on (path, file_size) with
staleness rejected at read time.  Switch to pure path-stem keying: foo.ifc
and foo.ifcdb/ both resolve to foo.ifcview, so the same cache serves either
source format.  Staleness is user-managed (delete the sidecar to force a
rebuild), which also lets sidecars be copied or moved independently of the
source.

v8 header drops the source_file_size field.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The viewer can now open a .rdb directory (as produced by
RocksDbSerializer / convert_path_to_rocksdb) anywhere it accepts an
.ifc file. The full GUI gets an "Add Database..." File menu entry
that opens a directory chooser; the streamer lets the file
constructor autodetect the format and opens the store read-only so
multiple viewers can share a database without taking the exclusive
RocksDB lock.

Parallel mapping on RocksDB-backed files still produces
non-deterministic shape counts (the race is outside the instance
cache), so force num_threads=1 for the iterator when the storage is
RocksDB. Serial RocksDB (~2.6s) and parallel SPF (~0.7s) both
produce 107 shapes on AC20-FZK-Haus; @todo in-source points at the
remaining thread-safety work.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>