Skip to content

add DAGTraverser class and remove MultiFunctions in apply_derivatives()#365

Merged
dham merged 1 commit intoFEniCS:mainfrom
firedrakeproject:ksagiyam/add_dag_visitor
May 22, 2025
Merged

add DAGTraverser class and remove MultiFunctions in apply_derivatives()#365
dham merged 1 commit intoFEniCS:mainfrom
firedrakeproject:ksagiyam/add_dag_visitor

Conversation

@ksagiyam
Copy link
Copy Markdown
Contributor

@ksagiyam ksagiyam commented Mar 26, 2025

Attempt to implement what @wence- described in this issue.

We basically would like to introduce classes (DAGTraversers) that define node processing (using singledispatchmethod) and hold caches.

We considered two DAG traversal approaches (for post-order and mostly post-order traversals).

Approach 1. Two-step approach.

Collect nodes first post-order and then process them in order.

This is the current UFL approach (cutoff_unique_post_traversal + MultiFunction wrapped in map_expr_dags).

This "bottom-up" approach is optimal, but when we process a node, we no longer know the local DAG structure around that node (e.g., we do not know its parent). So if we want to pass down some state from the parent, we need to (indirectly) specify that that parent is a "cutoff" node (so that we do not collect the child nodes in the first step), and perform a sub-DAG traversal under that parent; see RestrictionPropagator.restricted(self, o). In the current implementation, if the number of arguments to the method is two (self and o in the above case), the corresponding node type is regarded as a cutoff node type. MultiFunction constructor currently identifies cutoff node types by inspecting the signature of each method. If we wanted to do a similar thing with singledispatchmethod, we could subclass singledispatchmethod as David suggested, but we found that we would end up overwriting singledispatchmethod.register() method relying on the current implementation of singledispatchmethod, which would be a drawback.

Approach 2. Monolithic approach.

Use recursion with caching (each node is processed once and only once) described in this issue.

I observed about a few % - 20% overhead after rewriting apply_derivatives.py (please see below) presumably due to recursion, but no special consideration of cutoff is required. This approach is claimed to be more robust.

In this PR we take Approach2, and replace MultiFunction s with DAGTraversers in apply_derivatives.py. We should be able to incrementally/systematically remove all MultiFunctions in the future.

Performance checks:

holzapfel_ogden.py (holzapfel_ogden):

import time

from ufl import (
    Coefficient,
    Constant,
    FunctionSpace,
    Identity,
    Mesh,
    TestFunction,
    derivative,
    det,
    diff,
    dot,
    dx,
    exp,
    grad,
    inner,
    ln,
    tetrahedron,
    tr,
    variable,
)
from ufl.algorithms import compute_form_data
from ufl.finiteelement import FiniteElement
from ufl.pullback import identity_pullback
from ufl.sobolevspace import H1


mesh = Mesh(FiniteElement("Lagrange", tetrahedron, 1, (3,), identity_pullback, H1))
lamda = Constant(mesh)
a = Constant(mesh)
b = Constant(mesh)
a_s = Constant(mesh)
b_s = Constant(mesh)
a_f = Constant(mesh)
b_f = Constant(mesh)
a_fs = Constant(mesh)
b_fs = Constant(mesh)
e_s = Constant(mesh, shape=(3,))
e_f = Constant(mesh, shape=(3,))

def isochoric(F):
    C = F.T*F
    I_1 = tr(C)
    I4_f = dot(e_f, C*e_f)
    I4_s = dot(e_s, C*e_s)
    I8_fs = dot(e_f, C*e_s)

    def cutoff(x):
        return 1.0/(1.0 + exp(-(x - 1.0)*30.0))

    def scaled_exp(a0, a1, argument):
        return a0/(2.0*a1)*(exp(b*argument) - 1)

    def scaled_exp(a0, a1, argument):
        return a0/(2.0*a1)*(exp(b*argument) - 1)

    E_1 = scaled_exp(a, b, I_1 - 3.)
    E_f = cutoff(I4_f)*scaled_exp(a_f, b_f, (I4_f - 1.)**2)
    E_s = cutoff(I4_s)*scaled_exp(a_s, b_s, (I4_s - 1.)**2)
    E_3 = scaled_exp(a_fs, b_fs, I8_fs**2)
    E = E_1 + E_f + E_s + E_3
    return E

elem = FiniteElement("Lagrange", tetrahedron, 1, (3,), identity_pullback, H1)
V = FunctionSpace(mesh, elem)
u = Coefficient(V)
v = TestFunction(V)
I = Identity(mesh.ufl_cell().topological_dimension())
F = grad(u) + I
F = variable(F)
J = det(F)
Fbar = J**(-1.0/3.0)*F
E_volumetric = lamda*0.5*ln(J)**2
psi = isochoric(Fbar) + E_volumetric
P = diff(psi, F)
F = inner(P, grad(v))*dx
a = derivative(F, u)

ntest = 2
time_sum = 0
for _ in range(ntest):
    start = time.time()
    fd = compute_form_data(
        a,
        do_apply_function_pullbacks=True,
        do_apply_default_restrictions=True,
        do_apply_geometry_lowering=True,
        do_apply_restrictions=True,
        complex_mode=False,
    )
    end = time.time()
    time_sum += end - start
print("average time required: ", time_sum / ntest)

main:
average time required: 0.27037227153778076

this PR:
average time required: 0.27785348892211914 (+2.8%)

With #339 (<- performance regression) reverted:

main:
average time required: 0.04610729217529297

this PR:
average time required: 0.055258870124816895 (+20%)

wence_test (#69):

import time

from ufl import (
    Coefficient,
    Constant,
    FacetNormal,
    FunctionSpace,
    Identity,
    Mesh,
    avg,
    derivative,
    det,
    diff,
    dS,
    grad,
    hexahedron,
    inner,
    ln,
    outer,
    variable,
)
from ufl.algorithms import compute_form_data
from ufl.finiteelement import FiniteElement
from ufl.pullback import identity_pullback, contravariant_piola
from ufl.sobolevspace import H1, HDiv

cell = hexahedron
mesh = Mesh(FiniteElement("Q", cell, 1, (3,), identity_pullback, H1))
V = FunctionSpace(mesh, FiniteElement("NCF", cell, 2, (3,), contravariant_piola, HDiv))
u = Coefficient(V)
mu = Constant(mesh)
psi = lambda F: (mu/2) * (inner(F, F) - u.ufl_shape[0]) - mu*ln(det(F))
flux = lambda F: diff(psi(F), F)
n = FacetNormal(mesh)
uxn = outer(u('+'), n('+')) + outer(u('-'), n('-'))
eye = Identity(u.ufl_shape[0])
grad_u = grad(u)
F_ = variable(grad_u + eye)
Fm = variable(grad_u('-') + eye)
Fp = variable(grad_u('+') + eye)
avg_flux = avg(flux(F_))
U = inner(avg_flux, uxn)*dS
a = derivative(derivative(U, u), u)

ntest = 2
time_sum = 0
for _ in range(ntest):
    start = time.time()
    fd = compute_form_data(
        a,
        do_apply_function_pullbacks=True,
        do_apply_default_restrictions=True,
        do_apply_geometry_lowering=True,
        do_apply_restrictions=True,
        complex_mode=False,
    )
    end = time.time()
    time_sum += end - start
print("average time required: ", time_sum / ntest)

main:
average time required: 0.23360061645507812

this PR:
average time required: 0.2485201358795166 (+6.4%)

With #339 reverted:

main:
average time required: 0.060405850410461426

this PR:
average time required: 0.06431174278259277 (+6.5%)

Firedrake CI
firedrakeproject/firedrake#4145 (no notable performance regression).

@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch 4 times, most recently from bffe57a to b99a902 Compare March 28, 2025 01:32
Copy link
Copy Markdown
Contributor

@connorjward connorjward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is pretty great. I have written a lot of tree traversals and the DAGTraverser.postorder (and friends) solves an issue I've been thinking about.

It's almost a shame that this only lives in UFL.

Comment thread ufl/corealg/dag_traverser.py Outdated
Comment thread ufl/corealg/dag_traverser.py Outdated
Comment thread ufl/corealg/dag_traverser.py Outdated
Comment thread ufl/corealg/dag_traverser.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py
Comment thread ufl/algorithms/apply_derivatives.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py Outdated
@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch 2 times, most recently from c16f038 to 449b44e Compare April 3, 2025 13:46
@ksagiyam ksagiyam marked this pull request as ready for review April 3, 2025 15:23
@ksagiyam
Copy link
Copy Markdown
Contributor Author

@jorgensd @mscroggs Could you have a look at this PR?

Comment thread ufl/corealg/dag_traverser.py Outdated
)

@process.register(Derivative)
def _(self, o: Expr) -> Expr:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to use

@process.register
    def _(self, o: Derivative) -> Expr:

here and in subsequent registrations, as https://docs.python.org/3/library/functools.html#functools.singledispatch states:

To add overloaded implementations to the function, use the register() attribute of the generic function, which can be used as a decorator. For functions annotated with types, the decorator will infer the type of the first argument automatically:

I guess we would have to wait for lower version Python to be 3.11 to do this for unions (python/cpython#30017). However, personally, I think this would be a cleaner solution. What do you think?

Copy link
Copy Markdown
Contributor Author

@ksagiyam ksagiyam Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a go with this as a separate commit, but I personally prefer:

    @process.register(Grad)
    @process.register(CellAvg)
    @process.register(FacetAvg)
    def _(self, o) -> Expr:
        ...

as type names are left-aligned and easier to read for me. I also do not quite see how it plays well when decorators are nested as in:

    @process.register
    @DAGTraverser.postorder
    @pending_operations_recording
    def external_operator(self, N: ExternalOperator, *dfs) -> Expr:
        ...

though functools.wraps() seems to be doing the job for us.

I revert this commit for now as Python3.10 test fails as you already pointed out. Please see the commit history for the relevant changes.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should follow the Python convention here, which is the way Jørgen has suggested.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it is good, but we indeed need Python >= 3.11, so we probably should not do this in this PR. I just dropped the relevant commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make a note of this in #176 to remind us about this in October 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will drop a note once this PR is merged.

Comment thread ufl/corealg/dag_traverser.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py
Comment thread ufl/algorithms/apply_derivatives.py Outdated
Comment thread ufl/algorithms/apply_derivatives.py
Comment thread ufl/algorithms/apply_derivatives.py Outdated
@jorgensd
Copy link
Copy Markdown
Member

In general very positive to this change. Will run the tests later today to check for potential regressions.

@jorgensd
Copy link
Copy Markdown
Member

Locally, i did not see a big regression with this PR, however, DOLFINx doesn't have the most complicated forms in the tests.
Maybe @schnellerhase , @jhale or @michalhabera has some more complicated forms in DOLFINy that we could check against.

@schnellerhase
Copy link
Copy Markdown
Contributor

schnellerhase commented Apr 23, 2025

Thanks for the ping. Quick testing on https://github.com/fenics-dolfiny/dolfiny/blob/main/demo/spectral/solid_elasticity_spectral.py - modified test case attached we see the following regression:

main 
F: 0.22513866424560547s
J: 0.5185086727142334s

https://github.com/firedrakeproject/ufl.git@449b44e67efc0bda6f5d4cd99c92dbd22dafd146
F: 0.7992544174194336s
J: 4.553445982933044s
Modified example
#!/usr/bin/env python3

import time

from mpi4py import MPI
from petsc4py import PETSc

import basix
import dolfinx
import ufl
from dolfinx import default_scalar_type as scalar

import mesh_tube3d_gmshapi as mg
import numpy as np
import plot_tube3d_pyvista as pl

import dolfiny

# Basic settings
name = "solid_elasticity_spectral"
comm = MPI.COMM_WORLD

# Geometry and mesh parameters
r, t, h = 0.04, 0.01, 0.1  # [m]
nr, nt, nh = 16, 5, 8

# Create the regular mesh of a tube with given dimensions
gmsh_model, tdim = mg.mesh_tube3d_gmshapi(name, r, t, h, nr, nt, nh, do_quads=True, order=2)

# Get mesh and meshtags
mesh, mts = dolfiny.mesh.gmsh_to_dolfin(gmsh_model, tdim)

# Get merged MeshTags for each codimension
subdomains, subdomains_keys = dolfiny.mesh.merge_meshtags(mesh, mts, tdim - 0)
interfaces, interfaces_keys = dolfiny.mesh.merge_meshtags(mesh, mts, tdim - 1)

# Define shorthands for labelled tags
surface_lower = interfaces_keys["surface_lower"]
surface_upper = interfaces_keys["surface_upper"]

# Define integration measures
dx = ufl.Measure("dx", domain=mesh, subdomain_data=subdomains, metadata={"quadrature_degree": 4})
ds = ufl.Measure("ds", domain=mesh, subdomain_data=interfaces, metadata={"quadrature_degree": 4})

# Define elements
Ue = basix.ufl.element("P", mesh.basix_cell(), 2, shape=(mesh.geometry.dim,))

# Define function spaces
Uf = dolfinx.fem.functionspace(mesh, Ue)

# Define functions
u = dolfinx.fem.Function(Uf, name="u")

u_ = dolfinx.fem.Function(Uf, name="u_")  # boundary conditions

δu = ufl.TestFunction(Uf)

# Define state as (ordered) list of functions
m, δm = [u], [δu]

# output / visualisation
vorder = mesh.geometry.cmap.degree
uo = dolfinx.fem.Function(dolfinx.fem.functionspace(mesh, ("P", vorder, (3,))), name="u")
so = dolfinx.fem.Function(dolfinx.fem.functionspace(mesh, ("P", vorder)), name="s")  # for output

# Kinematics
F = ufl.Identity(3) + ufl.grad(u)

# Strain measure: from Cauchy strain tensor to squares of principal stretches
c, _ = dolfiny.invariants.eigenstate(F.T * F)  # spectral decomposition of C
c = ufl.as_vector(c)  # squares of principal stretches
c = ufl.variable(c)
# Variation of strain measure (squares of principal stretches)
δc = dolfiny.expression.derivative(c, m, δm)

# Elasticity parameters
E = dolfinx.fem.Constant(mesh, scalar(1.0))  # [MPa]
nu = dolfinx.fem.Constant(mesh, scalar(0.4))  # [-]
mu = E / (2 * (1 + nu))
la = E * nu / ((1 + nu) * (1 - 2 * nu))

# Define boundary stress vector (torque at upper face)
x0 = ufl.SpatialCoordinate(mesh)
n0 = ufl.FacetNormal(mesh)
λ = dolfinx.fem.Constant(mesh, scalar(0.0))
t = ufl.cross(x0 - ufl.as_vector([0.0, 0.0, h]), n0) * 4 * λ  # [N/m^2]


def strain_energy(i1, i2, i3):
    """Strain energy function
    i1, i2, i3: principal invariants of the Cauchy-Green tensor
    """
    # Determinant of configuration gradient F
    J = ufl.sqrt(i3)
    #
    # Classical St. Venant-Kirchhoff
    # Ψ = la / 8 * (i1 - 3) ** 2 + mu / 4 * ((i1 - 3) ** 2 + 4 * (i1 - 3) - 2 * (i2 - 3))
    # Modified St. Venant-Kirchhoff
    # Ψ = la / 2 * (ufl.ln(J)) ** 2 + mu / 4 * ((i1 - 3) ** 2 + 4 * (i1 - 3) - 2 * (i2 - 3))
    # Compressible neo-Hooke
    Ψ = mu / 2 * (i1 - 3 - 2 * ufl.ln(J)) + la / 2 * (J - 1) ** 2
    # Compressible Mooney-Rivlin (beta = 0)
    # Ψ = mu / 4 * (i1 - 3) + mu / 4 * (i2 - 3) - mu * ufl.ln(J) + la / 2 * (J - 1) ** 2
    #
    return Ψ


# Invariants (based on spectral decomposition of C)
i1, i2, i3 = c[0] + c[1] + c[2], c[0] * c[1] + c[1] * c[2] + c[0] * c[2], c[0] * c[1] * c[2]
# Material model (isotropic)
Ψ = strain_energy(i1, i2, i3)
# Stress measure
s = 2 * ufl.diff(Ψ, c)

# von Mises stress (output)
svm = ufl.sqrt(3 / 2 * ufl.inner(ufl.dev(ufl.diag(s)), ufl.dev(ufl.diag(s))))

# Weak form: for isotropic material, eigenprojectors of C and S are identical
form = -0.5 * ufl.inner(δc, s) * dx + ufl.inner(δu, t) * ds(surface_upper)

# Overall form (as list of forms)
forms = dolfiny.function.extract_blocks(form, δm)

form = forms[0]
n = 10
start = time.time()
for _ in range(n):
    ufl.algorithms.compute_form_data(
        form,
        do_apply_function_pullbacks=True,
        do_apply_default_restrictions=True,
        do_apply_geometry_lowering=True,
        do_apply_restrictions=True,
        complex_mode=False,
    )

print(f"F: {(time.time() - start)/n}s")


J_form = ufl.derivative(forms[0], m[0], ufl.TrialFunction(m[0].function_space))

start = time.time()
for _ in range(n):
    ufl.algorithms.compute_form_data(
        J_form,
        do_apply_function_pullbacks=True,
        do_apply_default_restrictions=True,
        do_apply_geometry_lowering=True,
        do_apply_restrictions=True,
        complex_mode=False,
    )

print(f"J: {(time.time() - start)/n}s")

@jorgensd
Copy link
Copy Markdown
Member

jorgensd commented Apr 23, 2025

@schnellerhase your example is quite complex.
I've here narrowed it down to a "pure" UFL example (taken bits and pieces from dolfiny)

import ufl
import time

from ufl import (
    Coefficient,
    FunctionSpace,
    Mesh,
    TestFunction,
    hexahedron,
    Constant,
)
from ufl.finiteelement import FiniteElement
from ufl.pullback import identity_pullback
from ufl.sobolevspace import H1


def invariants_principal(A):
    """Principal invariants of (real-valued) tensor A.
    https://doi.org/10.1007/978-3-7091-0174-2_3
    """
    i1 = ufl.tr(A)
    i2 = (ufl.tr(A) ** 2 - ufl.tr(A * A)) / 2
    i3 = ufl.det(A)
    return i1, i2, i3


def eigenstate3(A):
    """Eigenvalues and eigenprojectors of the 3x3 (real-valued) tensor A.
    Provides the spectral decomposition A = sum_{a=0}^{2} λ_a * E_a
    with (ordered) eigenvalues λ_a and their associated eigenprojectors E_a = n_a^R x n_a^L.

    Note: Tensor A must not have complex eigenvalues!
    """
    if ufl.shape(A) != (3, 3):
        raise RuntimeError(
            f"Tensor A of shape {ufl.shape(A)} != (3, 3) is not supported!"
        )
    #
    eps = 3.0e-16  # slightly above 2**-(53 - 1), see https://en.wikipedia.org/wiki/IEEE_754
    #
    A = ufl.variable(A)
    #
    # --- determine eigenvalues λ0, λ1, λ2

    # Invariants
    I1, _, _ = invariants_principal(A)

    # Discriminant as sum-of-products
    Δx = [
        A[0, 1] * A[1, 2] * A[2, 0] - A[0, 2] * A[1, 0] * A[2, 1],
        A[0, 1] ** 2 * A[1, 2]
        - A[0, 1] * A[0, 2] * A[1, 1]
        + A[0, 1] * A[0, 2] * A[2, 2]
        - A[0, 2] ** 2 * A[2, 1],
        A[0, 0] * A[0, 1] * A[2, 1]
        - A[0, 1] ** 2 * A[2, 0]
        - A[0, 1] * A[2, 1] * A[2, 2]
        + A[0, 2] * A[2, 1] ** 2,
        A[0, 0] * A[0, 2] * A[1, 2]
        + A[0, 1] * A[1, 2] ** 2
        - A[0, 2] ** 2 * A[1, 0]
        - A[0, 2] * A[1, 1] * A[1, 2],
        A[0, 0] * A[0, 1] * A[1, 2]
        - A[0, 1] * A[0, 2] * A[1, 0]
        - A[0, 1] * A[1, 2] * A[2, 2]
        + A[0, 2] * A[1, 2] * A[2, 1],
        A[0, 0] * A[0, 2] * A[2, 1]
        - A[0, 1] * A[0, 2] * A[2, 0]
        + A[0, 1] * A[1, 2] * A[2, 1]
        - A[0, 2] * A[1, 1] * A[2, 1],
        A[0, 1] * A[1, 0] * A[1, 2]
        - A[0, 2] * A[1, 0] * A[1, 1]
        + A[0, 2] * A[1, 0] * A[2, 2]
        - A[0, 2] * A[1, 2] * A[2, 0],
        A[0, 0] ** 2 * A[1, 2]
        - A[0, 0] * A[0, 2] * A[1, 0]
        - A[0, 0] * A[1, 1] * A[1, 2]
        - A[0, 0] * A[1, 2] * A[2, 2]
        + A[0, 1] * A[1, 0] * A[1, 2]
        + A[0, 2] * A[1, 0] * A[2, 2]
        + A[1, 1] * A[1, 2] * A[2, 2]
        - A[1, 2] ** 2 * A[2, 1],
        A[0, 0] ** 2 * A[1, 2]
        - A[0, 0] * A[0, 2] * A[1, 0]
        - A[0, 0] * A[1, 1] * A[1, 2]
        - A[0, 0] * A[1, 2] * A[2, 2]
        + A[0, 2] * A[1, 0] * A[1, 1]
        + A[0, 2] * A[1, 2] * A[2, 0]
        + A[1, 1] * A[1, 2] * A[2, 2]
        - A[1, 2] ** 2 * A[2, 1],
        A[0, 0] * A[0, 1] * A[1, 1]
        - A[0, 0] * A[0, 1] * A[2, 2]
        - A[0, 1] ** 2 * A[1, 0]
        + A[0, 1] * A[0, 2] * A[2, 0]
        - A[0, 1] * A[1, 1] * A[2, 2]
        + A[0, 1] * A[2, 2] ** 2
        + A[0, 2] * A[1, 1] * A[2, 1]
        - A[0, 2] * A[2, 1] * A[2, 2],
        A[0, 0] * A[0, 1] * A[1, 1]
        - A[0, 0] * A[0, 1] * A[2, 2]
        + A[0, 0] * A[0, 2] * A[2, 1]
        - A[0, 1] ** 2 * A[1, 0]
        - A[0, 1] * A[1, 1] * A[2, 2]
        + A[0, 1] * A[1, 2] * A[2, 1]
        + A[0, 1] * A[2, 2] ** 2
        - A[0, 2] * A[2, 1] * A[2, 2],
        A[0, 0] * A[0, 1] * A[1, 2]
        - A[0, 0] * A[0, 2] * A[1, 1]
        + A[0, 0] * A[0, 2] * A[2, 2]
        - A[0, 1] * A[1, 1] * A[1, 2]
        - A[0, 2] ** 2 * A[2, 0]
        + A[0, 2] * A[1, 1] ** 2
        - A[0, 2] * A[1, 1] * A[2, 2]
        + A[0, 2] * A[1, 2] * A[2, 1],
        A[0, 0] * A[0, 2] * A[1, 1]
        - A[0, 0] * A[0, 2] * A[2, 2]
        - A[0, 1] * A[0, 2] * A[1, 0]
        + A[0, 1] * A[1, 1] * A[1, 2]
        - A[0, 1] * A[1, 2] * A[2, 2]
        + A[0, 2] ** 2 * A[2, 0]
        - A[0, 2] * A[1, 1] ** 2
        + A[0, 2] * A[1, 1] * A[2, 2],
        A[0, 0] ** 2 * A[1, 1]
        - A[0, 0] ** 2 * A[2, 2]
        - A[0, 0] * A[0, 1] * A[1, 0]
        + A[0, 0] * A[0, 2] * A[2, 0]
        - A[0, 0] * A[1, 1] ** 2
        + A[0, 0] * A[2, 2] ** 2
        + A[0, 1] * A[1, 0] * A[1, 1]
        - A[0, 2] * A[2, 0] * A[2, 2]
        + A[1, 1] ** 2 * A[2, 2]
        - A[1, 1] * A[1, 2] * A[2, 1]
        - A[1, 1] * A[2, 2] ** 2
        + A[1, 2] * A[2, 1] * A[2, 2],
    ]
    Δy = [
        A[0, 2] * A[1, 0] * A[2, 1] - A[0, 1] * A[1, 2] * A[2, 0],
        A[1, 0] ** 2 * A[2, 1]
        - A[1, 0] * A[1, 1] * A[2, 0]
        + A[1, 0] * A[2, 0] * A[2, 2]
        - A[1, 2] * A[2, 0] ** 2,
        A[0, 0] * A[1, 0] * A[1, 2]
        - A[0, 2] * A[1, 0] ** 2
        - A[1, 0] * A[1, 2] * A[2, 2]
        + A[1, 2] ** 2 * A[2, 0],
        A[0, 0] * A[2, 0] * A[2, 1]
        - A[0, 1] * A[2, 0] ** 2
        + A[1, 0] * A[2, 1] ** 2
        - A[1, 1] * A[2, 0] * A[2, 1],
        A[0, 0] * A[1, 0] * A[2, 1]
        - A[0, 1] * A[1, 0] * A[2, 0]
        - A[1, 0] * A[2, 1] * A[2, 2]
        + A[1, 2] * A[2, 0] * A[2, 1],
        A[0, 0] * A[1, 2] * A[2, 0]
        - A[0, 2] * A[1, 0] * A[2, 0]
        + A[1, 0] * A[1, 2] * A[2, 1]
        - A[1, 1] * A[1, 2] * A[2, 0],
        A[0, 1] * A[1, 0] * A[2, 1]
        - A[0, 1] * A[1, 1] * A[2, 0]
        + A[0, 1] * A[2, 0] * A[2, 2]
        - A[0, 2] * A[2, 0] * A[2, 1],
        A[0, 0] ** 2 * A[2, 1]
        - A[0, 0] * A[0, 1] * A[2, 0]
        - A[0, 0] * A[1, 1] * A[2, 1]
        - A[0, 0] * A[2, 1] * A[2, 2]
        + A[0, 1] * A[1, 0] * A[2, 1]
        + A[0, 1] * A[2, 0] * A[2, 2]
        + A[1, 1] * A[2, 1] * A[2, 2]
        - A[1, 2] * A[2, 1] ** 2,
        A[0, 0] ** 2 * A[2, 1]
        - A[0, 0] * A[0, 1] * A[2, 0]
        - A[0, 0] * A[1, 1] * A[2, 1]
        - A[0, 0] * A[2, 1] * A[2, 2]
        + A[0, 1] * A[1, 1] * A[2, 0]
        + A[0, 2] * A[2, 0] * A[2, 1]
        + A[1, 1] * A[2, 1] * A[2, 2]
        - A[1, 2] * A[2, 1] ** 2,
        A[0, 0] * A[1, 0] * A[1, 1]
        - A[0, 0] * A[1, 0] * A[2, 2]
        - A[0, 1] * A[1, 0] ** 2
        + A[0, 2] * A[1, 0] * A[2, 0]
        - A[1, 0] * A[1, 1] * A[2, 2]
        + A[1, 0] * A[2, 2] ** 2
        + A[1, 1] * A[1, 2] * A[2, 0]
        - A[1, 2] * A[2, 0] * A[2, 2],
        A[0, 0] * A[1, 0] * A[1, 1]
        - A[0, 0] * A[1, 0] * A[2, 2]
        + A[0, 0] * A[1, 2] * A[2, 0]
        - A[0, 1] * A[1, 0] ** 2
        - A[1, 0] * A[1, 1] * A[2, 2]
        + A[1, 0] * A[1, 2] * A[2, 1]
        + A[1, 0] * A[2, 2] ** 2
        - A[1, 2] * A[2, 0] * A[2, 2],
        A[0, 0] * A[1, 0] * A[2, 1]
        - A[0, 0] * A[1, 1] * A[2, 0]
        + A[0, 0] * A[2, 0] * A[2, 2]
        - A[0, 2] * A[2, 0] ** 2
        - A[1, 0] * A[1, 1] * A[2, 1]
        + A[1, 1] ** 2 * A[2, 0]
        - A[1, 1] * A[2, 0] * A[2, 2]
        + A[1, 2] * A[2, 0] * A[2, 1],
        A[0, 0] * A[1, 1] * A[2, 0]
        - A[0, 0] * A[2, 0] * A[2, 2]
        - A[0, 1] * A[1, 0] * A[2, 0]
        + A[0, 2] * A[2, 0] ** 2
        + A[1, 0] * A[1, 1] * A[2, 1]
        - A[1, 0] * A[2, 1] * A[2, 2]
        - A[1, 1] ** 2 * A[2, 0]
        + A[1, 1] * A[2, 0] * A[2, 2],
        A[0, 0] ** 2 * A[1, 1]
        - A[0, 0] ** 2 * A[2, 2]
        - A[0, 0] * A[0, 1] * A[1, 0]
        + A[0, 0] * A[0, 2] * A[2, 0]
        - A[0, 0] * A[1, 1] ** 2
        + A[0, 0] * A[2, 2] ** 2
        + A[0, 1] * A[1, 0] * A[1, 1]
        - A[0, 2] * A[2, 0] * A[2, 2]
        + A[1, 1] ** 2 * A[2, 2]
        - A[1, 1] * A[1, 2] * A[2, 1]
        - A[1, 1] * A[2, 2] ** 2
        + A[1, 2] * A[2, 1] * A[2, 2],
    ]
    Δd = [9, 6, 6, 6, 8, 8, 8, 2, 2, 2, 2, 2, 2, 1]
    Δ = sum(Δxk * Δdk * Δyk for Δxk, Δdk, Δyk in zip(Δx, Δd, Δy))  # discriminant as sop

    # Invariant dp as sum-of-products
    Δxp = [
        A[1, 0],
        A[2, 0],
        A[2, 1],
        -A[0, 0] + A[1, 1],
        -A[0, 0] + A[2, 2],
        -A[1, 1] + A[2, 2],
    ]
    Δyp = [
        A[0, 1],
        A[0, 2],
        A[1, 2],
        -A[0, 0] + A[1, 1],
        -A[0, 0] + A[2, 2],
        -A[1, 1] + A[2, 2],
    ]
    Δdp = [6, 6, 6, 1, 1, 1]
    dp = (
        sum(Δxpk * Δdpk * Δypk for Δxpk, Δdpk, Δypk in zip(Δxp, Δdp, Δyp)) / 2
    )  # dp as sop

    # Invariant dq as sum-of-products
    Δxq = [
        A[1, 2],
        A[2, 1],
        A[0, 1] * A[1, 0],
        A[0, 2] * A[2, 0],
        A[1, 2] * A[2, 1],
        A[1, 1] + A[2, 2] - 2 * A[0, 0],
    ]
    Δyq = [
        A[0, 1] * A[2, 0],
        A[0, 2] * A[1, 0],
        A[0, 0] + A[1, 1] - 2 * A[2, 2],
        A[0, 0] + A[2, 2] - 2 * A[1, 1],
        A[1, 1] + A[2, 2] - 2 * A[0, 0],
        (A[0, 0] + A[2, 2] - 2 * A[1, 1]) * (A[0, 0] + A[1, 1] - 2 * A[2, 2]),
    ]
    Δdq = [27, 27, 9, 9, 9, -1]
    dq = sum(Δxqk * Δdqk * Δyqk for Δxqk, Δdqk, Δyqk in zip(Δxq, Δdq, Δyq))

    # Avoid dp = 0 and disc = 0, both are known with absolute error of ~eps**2
    # Required to avoid sqrt(0) derivatives and negative square roots
    dp += eps**2
    Δ += eps**2

    phi3 = ufl.atan2(ufl.sqrt(27) * ufl.sqrt(Δ), dq)

    # sorted eigenvalues: λ0 <= λ1 <= λ2
    λ = [
        (I1 + 2 * ufl.sqrt(dp) * ufl.cos((phi3 + 2 * ufl.pi * k) / 3)) / 3
        for k in range(1, 4)
    ]
    #
    # --- determine eigenprojectors E0, E1, E2
    #
    E = [ufl.diff(λk, A).T for λk in λ]

    return λ, E


def eigenstate2(A):
    """Eigenvalues and eigenprojectors of the 2x2 (real-valued) tensor A.
    Provides the spectral decomposition A = sum_{a=0}^{1} λ_a * E_a
    with (ordered) eigenvalues λ_a and their associated eigenprojectors E_a = n_a^R x n_a^L.

    Note: Tensor A must not have complex eigenvalues!
    """
    if ufl.shape(A) != (2, 2):
        raise RuntimeError(
            f"Tensor A of shape {ufl.shape(A)} != (2, 2) is not supported!"
        )
    #
    eps = 3.0e-16  # slightly above 2**-(53 - 1), see https://en.wikipedia.org/wiki/IEEE_754
    #
    A = ufl.variable(A)
    #
    # --- determine eigenvalues λ0, λ1
    #
    I1, _, _ = invariants_principal(A)
    #
    Δ = (A[0, 0] - A[1, 1]) ** 2 + 4 * A[0, 1] * A[1, 0]  # = I1**2 - 4 * I2
    # Avoid dp = 0 and disc = 0, both are known with absolute error of ~eps**2
    # Required to avoid sqrt(0) derivatives and negative square roots
    Δ += eps**2
    # sorted eigenvalues: λ0 <= λ1
    λ = (I1 - ufl.sqrt(Δ)) / 2, (I1 + ufl.sqrt(Δ)) / 2
    #
    # --- determine eigenprojectors E0, E1
    #
    E = [ufl.diff(λk, A).T for λk in λ]

    return λ, E


def eigenstate(A):
    """Eigenvalues and eigenprojectors of the (real-valued) tensor A of dimension m = 2, 3.
    Provides the spectral decomposition A = sum_{a=0}^{m} λ_a * E_a
    with (ordered) eigenvalues λ_a and their associated eigenprojectors E_a = n_a^R x n_a^L.

    Note: Tensor A must not have complex eigenvalues!
    """
    if ufl.shape(A) == (3, 3):
        return eigenstate3(A)
    elif ufl.shape(A) == (2, 2):
        return eigenstate2(A)
    else:
        raise RuntimeError(f"Tensor A of shape {ufl.shape(A)} is not supported!")


element = FiniteElement("Lagrange", hexahedron, 2, (3,), identity_pullback, H1)
domain = Mesh(FiniteElement("Lagrange", hexahedron, 2, (3,), identity_pullback, H1))


# Define integration measures
dx = ufl.Measure("dx", domain=domain, metadata={"quadrature_degree": 4})
ds = ufl.Measure("ds", domain=domain, metadata={"quadrature_degree": 4})


Uf = FunctionSpace(domain, element)
u = Coefficient(Uf)  # displacement
u_ = Coefficient(Uf)

δu = TestFunction(Uf)


# output / visualisation

# Kinematics
F = ufl.Identity(3) + ufl.grad(u)

# Strain measure: from Cauchy strain tensor to squares of principal stretches
c, _ = eigenstate(F.T * F)  # spectral decomposition of C
c = ufl.as_vector(c)  # squares of principal stretches
c = ufl.variable(c)
# Variation of strain measure (squares of principal stretches)
δc = ufl.derivative(c, u, δu)

# Elasticity parameters
E = Constant(domain)  # [MPa]
nu = Constant(domain)
mu = E / (2 * (1 + nu))
la = E * nu / ((1 + nu) * (1 - 2 * nu))

# Define boundary stress vector (torque at upper face)
x0 = ufl.SpatialCoordinate(domain)
n0 = ufl.FacetNormal(domain)
λ = Constant(domain)
h = Constant(domain)  # [m]
t = ufl.cross(x0 - ufl.as_vector([0.0, 0.0, h]), n0) * 4 * λ  # [N/m^2]


def strain_energy(i1, i2, i3):
    """Strain energy function
    i1, i2, i3: principal invariants of the Cauchy-Green tensor
    """
    # Determinant of configuration gradient F
    J = ufl.sqrt(i3)
    #
    # Classical St. Venant-Kirchhoff
    # Ψ = la / 8 * (i1 - 3) ** 2 + mu / 4 * ((i1 - 3) ** 2 + 4 * (i1 - 3) - 2 * (i2 - 3))
    # Modified St. Venant-Kirchhoff
    # Ψ = la / 2 * (ufl.ln(J)) ** 2 + mu / 4 * ((i1 - 3) ** 2 + 4 * (i1 - 3) - 2 * (i2 - 3))
    # Compressible neo-Hooke
    Ψ = mu / 2 * (i1 - 3 - 2 * ufl.ln(J)) + la / 2 * (J - 1) ** 2
    # Compressible Mooney-Rivlin (beta = 0)
    # Ψ = mu / 4 * (i1 - 3) + mu / 4 * (i2 - 3) - mu * ufl.ln(J) + la / 2 * (J - 1) ** 2
    #
    return Ψ


# Invariants (based on spectral decomposition of C)
i1, i2, i3 = (
    c[0] + c[1] + c[2],
    c[0] * c[1] + c[1] * c[2] + c[0] * c[2],
    c[0] * c[1] * c[2],
)
# Material model (isotropic)
Ψ = strain_energy(i1, i2, i3)
# Stress measure
s = 2 * ufl.diff(Ψ, c)

# von Mises stress (output)
svm = ufl.sqrt(3 / 2 * ufl.inner(ufl.dev(ufl.diag(s)), ufl.dev(ufl.diag(s))))

# Weak form: for isotropic material, eigenprojectors of C and S are identical
form = -0.5 * ufl.inner(δc, s) * dx + ufl.inner(δu, t) * ds(1)


start = time.perf_counter()
ufl.algorithms.compute_form_data(
    form,
    do_apply_function_pullbacks=True,
    do_apply_default_restrictions=True,
    do_apply_geometry_lowering=True,
    do_apply_restrictions=True,
    complex_mode=False,
)
end = time.perf_counter()
print(f"Time taken to compute form data (residual): {(end - start):.2e}")


J = ufl.derivative(form, u)

start = time.perf_counter()
ufl.algorithms.compute_form_data(
    J,
    do_apply_function_pullbacks=True,
    do_apply_default_restrictions=True,
    do_apply_geometry_lowering=True,
    do_apply_restrictions=True,
    complex_mode=False,
)
end = time.perf_counter()
print(f"Time taken to compute form data (jacobian): {(end - start):.2e}")

which results in the following timings on main:

Time taken to compute form data (residual): 1.11e-01
Time taken to compute form data (jacobian): 2.14e-01

and the following on this branch:

Time taken to compute form data (residual): 4.97e-01
Time taken to compute form data (jacobian): 3.65e+00

@jorgensd
Copy link
Copy Markdown
Member

There was a small typo in the example above (didn't compile form data for jacobian).
This has now been updated, and the timings have been updated, showing the 10-fold increase in runtime for computing form data for the jacobian.

@ksagiyam
Copy link
Copy Markdown
Contributor Author

Thanks! I had not updated my branch after some performance fix was merged to main. After rebasing, the numbers are like the following:

main:
Time taken to compute form data (residual): 8.51e-02
Time taken to compute form data (jacobian): 1.62e-01

this PR:
Time taken to compute form data (residual): 1.01e-01 (+19%)
Time taken to compute form data (jacobian): 2.01e-01 (+24%)

Let me also update the results for the other tests in the PR description:

holzapfel_ogden:

main:
0.04760003089904785

this PR:
0.05493581295013428 (+15%)

wence_test:

main:
0.06049168109893799

this PR:
0.06576049327850342 (+8.7%)

@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch 4 times, most recently from 76e4f4a to 68f3038 Compare April 25, 2025 12:50
@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch from 68f3038 to c50f537 Compare May 8, 2025 11:59
@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch 3 times, most recently from 51e1924 to 0d9c2bf Compare May 21, 2025 13:01
- replace MultiFunction with DAGTraverser in apply_derivatives.py

Co-authored-by: Connor Ward <c.ward20@imperial.ac.uk>

Co-authored-by: Jørgen Schartum Dokken <dokken92@gmail.com>
@ksagiyam ksagiyam force-pushed the ksagiyam/add_dag_visitor branch from 0d9c2bf to 396df4a Compare May 21, 2025 14:42
@ksagiyam ksagiyam changed the title Ksagiyam/add DAGTraverser add DAGTraverser class and remove MultiFunctions in apply_derivatives() May 22, 2025
@mscroggs mscroggs self-requested a review May 22, 2025 08:59
@dham dham added this pull request to the merge queue May 22, 2025
Merged via the queue into FEniCS:main with commit 0f52b14 May 22, 2025
18 checks passed
@dham dham deleted the ksagiyam/add_dag_visitor branch May 22, 2025 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants