Skip to content

C++ datamodel v1.0#7534

Draft
aothms wants to merge 78 commits intov0.8.0from
datamodel-v1.0
Draft

C++ datamodel v1.0#7534
aothms wants to merge 78 commits intov0.8.0from
datamodel-v1.0

Conversation

@aothms
Copy link
Copy Markdown
Member

@aothms aothms commented Jan 5, 2026

For the past couple of days I have been rethinking what a sane v1.0 data model on the C++ side would look like that balances safety, readability, performance and a bit more modern language features. I still have work todo but wanted to share a bit of what's going on.

This has huge implications on client C++ code, so to feel the pain I went through most of the code base already. @RickBrice as one of the more active C++ devs lately I'm curious what your thoughts are.

Outcome:

>>> import ifcopenshell
>>> f = ifcopenshell.file(schema='ifc4x3_add2')
>>> f.createIfcWall()
#1=IfcWall($,$,$,$,$,$,$,$,$)
>>> inst = f[1]
>>> f.remove(f[1])
>>> inst
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tkrij\miniconda3\envs\ifopsh08\lib\site-packages\ifcopenshell\ifcopenshell_wrapper.py", line 5999, in __repr__
    return _ifcopenshell_wrapper.entity_instance___repr__(self)
RuntimeError: Trying to access deleted instance reference

Before:

	IfcSchema::IfcOpenShell* shell = new IfcSchema::IfcOpenShell(faces);
	IfcSchema::IfcConnectedFaceSet::list::ptr shells(new IfcSchema::IfcConnectedFaceSet::list);
	shells->push(shell);
	IfcSchema::IfcFaceBasedSurfaceModel* surface_model = new IfcSchema::IfcFaceBasedSurfaceModel(shells);

	IfcSchema::IfcRepresentation::list::ptr reps(new IfcSchema::IfcRepresentation::list);
	IfcSchema::IfcRepresentationItem::list::ptr items(new IfcSchema::IfcRepresentationItem::list);

	items->push(surface_model);

	IfcSchema::IfcShapeRepresentation* rep = new IfcSchema::IfcShapeRepresentation(
		0, std::string("Facetation"), std::string("SurfaceModel"), items);

	reps->push(rep);
	IfcSchema::IfcProductDefinitionShape* shapedef = new IfcSchema::IfcProductDefinitionShape(boost::none, boost::none, reps);

	return shapedef;

After:

    auto shell = f.create<IfcSchema::IfcOpenShell>();
    shell.setCfsFaces(faces);

    auto surface_model = f.create<IfcSchema::IfcFaceBasedSurfaceModel>();
    surface_model.setFbsmFaces({shell});

	auto rep = f.create<IfcSchema::IfcShapeRepresentation>();
    rep.setRepresentationIdentifier("Tessellation");
    rep.setRepresentationType("SurfaceModel");
    rep.setItems({surface_model});

	auto shapedef = f.create<IfcSchema::IfcProductDefinitionShape>();
    shapedef.setRepresentations({rep});

	return shapedef;

Rationale

reason for weak_ptr encapsulated in object:

  • safety and matches semantics (the instance pointers always had their lifetime scoped to the file - this is now explicit and enforced)
  • objects can cast, so variants (EXPRESS SELECT) can work with implicit casts instead of virtual inheritance
  • elimination of virtual methods might help performance (although the declaration* is essentially the same mechanism) and reduce dependency on rtti in case that ever becomes a concern
  • elimination of virtual methods unifies latebound and earlybound entity definitions
  • which means it's possible now to do geometry interpretation on late bound schemas (previously not possible because of dependency on dynamic_cast / rtti)
  • cleaner separation between early bound statically typed layer and attribute storage container (which was a bit muddy in earlier releases but essentially present)
  • ability to redeclare the type of entity instances by assigning a new declaration* which was previously not possible as it was baked into the typeid

reason for creation as part of file

  • might be more efficient than create() + add()
  • necessary for rocksdb anyway, because file context is necessary for attribute storage in key-value store
  • instances not stored in file leak easily and have missing behaviour regarding inverses which depend on the file maps

reasons for no constructor:

  • readability because no named args
  • easier to adapt to multiple schemas
  • anyway immutability is not an option
  • reduce binary size
  • one way to do things
  • don't like argument forwarding because is created as part of file now

no custom class anymore for aggregates (aggregate_of_instance)

  • we can use aggregate initializers now and stl features
  • but casts are a bit uglier

Comment thread src/ifcparse/express.cpp Outdated

uint32_t express::Base::id() const { return data()->id(); }

const InstanceData* express::Base::data() const {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method, as well as the non-const version, defeats using smart pointers. Once the raw pointer is returned it is no longer managed and could be deleted by the caller. Why not

auto sp = data_.lock();
if (sp) {
   return sp;
} else {
   throw std::runtime_error("Trying to access deleted instance reference")
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but I have the feeling that maybe this should be marked private instead and then it doesn't really matter anymore.

@RickBrice
Copy link
Copy Markdown
Contributor

RickBrice commented Jan 6, 2026

Does the weak_ptr encapsulated in the express::Base retain file scope lifetime of objects? It seems like shared_ptr would be required. Though, I don't have a complete understanding.

Creating entities as part of the file without constructors has a trade-off. Instead of create() + add() a user of the SDK must do create() + multiple parameter initialization steps by calling set() methods. This isn't bad, but it has these consequences:

  • Multiple set() methods must be invoked instead of a single constructor
  • Constructors had all parameters needed for proper construction and initialization. Now the caller needs to know which set() methods to call for all of the required initialization parameters. Failure (or forgetting) to initialize required parameters will likely lead to unwanted behavior.
  • Perhaps consider adding an initialize() method that has all the parameters of the old constructors. They the process is obj=f.create() + obj.initialize().

I very much like eliminating the custom classes for aggregates. I found them difficult to use. It was difficult to know which one of the classes where needed in different situations. Their interface wasn't the same as std collections, so looping wasn't as easy and they couldn't be used in the std algorithms.

This is going to break a lot of my code, but the fixes appear to be easy to deal with.

@aothms Let's set up a time for you to walk me through the bigger picture. I'm sure there are lots of details that I'm missing.

Comment thread src/examples/IfcAlignment.cpp Outdated

// Disable warnings coming from IfcOpenShell
#pragma warning(disable : 4018 4267 4250 4984 4985)
/********************************************************************************
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a different example has been applied over the IfcAlignment.cpp example. The revised file is significantly different. Is this intentional?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that was sloppy, it's back and updated for the code changes

@aothms
Copy link
Copy Markdown
Member Author

aothms commented Jan 7, 2026

Does the weak_ptr encapsulated in the express::Base retain file scope lifetime of objects? It seems like shared_ptr would be required. Though, I don't have a complete understanding.

Yes the file have the shared_ptr not so much for sharing ownership but just so that we have the option to have a weak_ptr derived from them. So that when you have instances pointing to a deleted file or to instances from the file that have been deleted.

Now the caller needs to know which set() methods to call for all of the required initialization parameters.

Yes that's very valid. The discovery is indeed not really there.

I'm also fine with forwarding the arguments from file.create and reinstate the constructor. But as I said since the arguments are not named I find them a bit unclear especially with all the std::nullopts for all the optional arguments, it's hard to interpret. Boost has this named parameters idea, but I think it's a bit lengthy to setup https://www.boost.org/doc/libs/latest/libs/graph/doc/bgl_named_params.html

Perhaps consider adding an initialize() method that has all the parameters of the old constructors. They the process is obj=f.create() + obj.initialize().

I think that's indeed also a nice compromise. Then we also have autocompletion in our IDE because I think with std::forward from file.create() I think IDE's will be pretty clueless.

Very flexible to have a call these days, let me know some times that suit you please.

@RickBrice
Copy link
Copy Markdown
Contributor

I emailed your personal account with some proposed times for next week

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Mar 28, 2026

I experienced segfaults when doing ifcopenshell.open() (after all those compilation fixes in the previous comment) on the ISSUE_159_kleine_Wohnung_R22.ifc model (from the list of profiling models). Here's an updated AI generated findings.md with an explanation of that fix, and a patch showing all the changes so far.

I'm also getting iterator initialisation errors on the advanced_model.ifc model saying "Trying to access deleted instance reference" (with detailed logging turned on). Haven't yet found a solution to that yet though.

Other than that, with those patches I can compile, and the profiling code is able to basically recreate the same numbers (apart from that initialisation error on that model) as the main branch.

ai.patch
findings.md

@aothms
Copy link
Copy Markdown
Member Author

aothms commented Mar 28, 2026

Hm yes apparently openai codex only bothered to verify compilation on windows. These template/typename things are notoriously different between compiler implementations.

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Mar 28, 2026

Woke up this morning and the "Trying to access deleted instance reference" was solved by AI with this explanation:

findings.md
ai.patch

@aothms
Copy link
Copy Markdown
Member Author

aothms commented Mar 29, 2026

Thanks! that fix we probably cannot use, because that means that every deleted instance equals every other deleted instance. We want this generally to bubble up as an exception (I think). But the root cause is super useful we just need to catch this a bit earlier, because apparently this is not a very common issue so I don't think it warrants a fix so high up in the abstraction level.

I have a bit of a backlog of other work because I missed a good couple of days due to back issues, but I'll get to this later this week. If you want to have it chew at some other tasks in the meantime, essentially:

  • ifcgeom/mapping/mapping.cpp: taxonomy::ptr mapping::map(const express::Base& inst)

    Don't loop over all mappable geometry types for every converted instance, but rather "compile" based on IfcParse::declaration* into a function table in the mapping constructor so that we can dispatch to the appropriate call in sub linear time. Take care of subtypes: entity subtypes should be handled by the most closely defined handler in the inheritance tree. This for example applies to profiles where some subtypes are handled by their parent profile handler and some have a distinct handler for a specific subtype.

  • src/ifcgeom/AbstractKernel.h: dispatch_conversion.

    Same idea, but maybe more complex. Here we do not have a #include based loop, but a recursive template-based one. I always expected the compiler to optimize this into some sort of jump table, but it doesn't seem it happens.

  • Add a third "PassThrough" geometry kernel that only accepts taxonomy::shell with faces with at most 4 linear edges (so that triangulation is trivial) or extrusions of polygonal bases without openings. That writes the taxonomy::shell essentially directly to the IfcGeom::Representation::Triangulation, tie this in too hybrid kernel so that we can call something like hybrid-passthrough-cgal-simple-opencascade. Make the shape and structure similar to the occt and cgal kernel.

  • Create a third paramatrization of the cgal geometry lib. Now we have cgal and cgal-simple which corresponds to Exact_predicates_exact_construction (necessary for Nef) Exact_predicates_inexact_construction (prevents some crashes). But actually the idea was that cgal-simple would correspond to Simple_cartesian<double> which is much lighter, but less robust. If we at least have access to all three parametrizations we can better understand the failure modes the real cgal-simple and maybe start using it again.

  • (low prio) for cgal non-simple implement an alternative boolean logic routine not based on Nef_3, but on CGAL/Polygon_mesh_processing/corefinement

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 11, 2026

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 11, 2026

Maybe bug?

● Update(src/ifcgeom/mapping/IfcOffsetCurveByDistance.cpp)
⎿  Added 1 line, removed 1 line
158
159 // at this point, next == end and prev == end-1
160 #if defined SCHEMA_HAS_IfcDistanceExpression
161 - double last_distance = (*prev)->DistanceAlong() * length_unit_;
161 + double last_distance = (*prev).DistanceAlong() * length_unit_;
162 #else
163 double last_distance = (double) prev->DistanceAlong().asIfcSchema::IfcLengthMeasure() * length_
unit_;
164 #endif

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 11, 2026

I've noticed some discrepancies between v0.8.0 and this branch in outputs from the OpenCASCADE kernel (for the moment, ignoring all the other new kernels). See file:

bug1.ifc.txt

v0.8.0 OCC:

image

Datamodel OCC:

image

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 11, 2026

The passthrough kernel seems to be a bit funky, here's the results on AC20-FZK-Haus.ifc for hybrid-passthrough-manifold-cgal-simple-opencascade:

image

For example, for STEP ID 37582:

============================================================
  ios_hybrid (hybrid-manifold-cgal-simple-opencascade)
============================================================
  Vertices: 8
  Edges:    0
  Faces:    12
  Bbox min: [-0.500000, 9.729474, 3.200000]
  Bbox max: [12.500000, 9.809474, 3.360000]
  Dims:     [13.000000, 0.080000, 0.160000]
  Centroid: [6.000000, 9.769474, 3.280000]

  Vertices (8):
    [   0]    12.500000     9.809474     3.200000
    [   1]    12.500000     9.809474     3.360000
    [   2]    12.500000     9.729474     3.200000
    [   3]    12.500000     9.729474     3.360000
    [   4]    -0.500000     9.809474     3.200000
    [   5]    -0.500000     9.809474     3.360000
    [   6]    -0.500000     9.729474     3.200000
    [   7]    -0.500000     9.729474     3.360000

============================================================
  ios_hybrid (hybrid-passthrough-manifold-cgal-simple-opencascade)
============================================================
  Vertices: 1
  Edges:    0
  Faces:    12
  Bbox min: [0.000000, 0.000000, 0.000000]
  Bbox max: [0.000000, 0.000000, 0.000000]
  Dims:     [0.000000, 0.000000, 0.000000]
  Centroid: [0.000000, 0.000000, 0.000000]

  Vertices (1):
    [   0]     0.000000     0.000000     0.000000

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 11, 2026

BTW I have rebased the work on the viewer on the datamodel-v1.0 branch: #7930

@Moult
Copy link
Copy Markdown
Contributor

Moult commented Apr 12, 2026

Another observation: I think the normals coming from manifold aren't right. At least the viewer gives me funky colours :)

@aothms
Copy link
Copy Markdown
Member Author

aothms commented Apr 13, 2026

I've noticed some discrepancies between v0.8.0 and this branch in outputs from the OpenCASCADE kernel (for the moment, ignoring all the other new kernels). See file:

bug1.ifc.txt

v0.8.0 OCC:
image

Datamodel OCC:
image

@Moult just the stupidest of mistakes in copy pasting, which is in a weird way also reassuring a44f722

@aothms
Copy link
Copy Markdown
Member Author

aothms commented Apr 13, 2026

The passthrough kernel seems to be a bit funky

Something so simple I had better built myself. Now that I better see the code I think it needs to be thrown away, but at least this particular issue is gone.

I think the normals coming from manifold aren't right.

It appears it doesn't populate anything so you're likely reading random data. Will have a look tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants