Skip to content

Data flow: Speedup subpaths predicate#9017

Merged
hvitved merged 4 commits into
github:mainfrom
hvitved:dataflow/subpaths-perf
May 4, 2022
Merged

Data flow: Speedup subpaths predicate#9017
hvitved merged 4 commits into
github:mainfrom
hvitved:dataflow/subpaths-perf

Conversation

@hvitved
Copy link
Copy Markdown
Contributor

@hvitved hvitved commented May 3, 2022

Before

[2022-05-02 15:47:16] (1280s) Tuple counts for DataFlowImpl::Subpaths::subpaths#656de156#ffff/4@c5f3dclb after 3m22s:
                      8389013    ~4%     {5} r1 = JOIN DataFlowImpl::Subpaths::subpaths#656de156#ffff#shared WITH DataFlowImpl::PathNode::getASuccessor#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg', Lhs.1, Lhs.2, Lhs.3, Lhs.4 'out'
                      6689751    ~0%     {4} r2 = JOIN r1 WITH DataFlowImpl::Subpaths::subpaths03#656de156#ffffff_034512#join_rhs ON FIRST 4 OUTPUT Rhs.4, Lhs.4 'out', Lhs.0 'arg', Rhs.5 'ret'

                      1513839768 ~1%     {5} r3 = JOIN r2 WITH DataFlowImpl::PathNodeImpl::getNodeEx#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'out', Lhs.2 'arg', Lhs.3 'ret', Rhs.1 'par', Lhs.3 'ret'
                      1513839768 ~1%     {5} r4 = r3 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      1513839768 ~5%     {4} r5 = SCAN r4 OUTPUT In.1 'arg', In.3 'par', In.0 'out', In.4 'ret'

                      1513839768 ~2%     {4} r6 = JOIN r2 WITH DataFlowImpl::PathNodeImpl::getNodeEx#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.3 'ret', Lhs.1 'out', Lhs.2 'arg', Rhs.1 'par'
                      0          ~0%     {5} r7 = JOIN r6 WITH boundedFastTC(DataFlowImpl::Subpaths::localStepToHidden#656de156#ff_10#higher_order_body,DataFlowImpl::Subpaths::subpaths#656de156#ffff#higher_order_body) ON FIRST 1 OUTPUT Lhs.1 'out', Lhs.2 'arg', Lhs.0, Lhs.3 'par', Rhs.1 'ret'
                      0          ~0%     {5} r8 = r7 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      0          ~0%     {4} r9 = SCAN r8 OUTPUT In.1 'arg', In.3 'par', In.0 'out', In.4 'ret'

                      1513839768 ~5%     {4} r10 = r5 UNION r9
                      6689751    ~0%     {4} r11 = JOIN r10 WITH DataFlowImpl::PathNode::getASuccessor#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'arg', Lhs.1 'par', Lhs.3 'ret', Lhs.2 'out'
                                         return r11

After

[2022-05-04 15:06:05] (925s) Tuple counts for DataFlowImplForContentDataFlow::Subpaths::subpaths#d678ef22#ffff/4@6c67a2ka after 8.1s:
                      8372517 ~0%     {3} r1 = JOIN DataFlowImplForContentDataFlow::PathNode::getASuccessor#dispred#f0820431#ff_10#join_rhs WITH DataFlowImplForContentDataFlow::PathNodeImpl::getNodeEx#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'arg', Rhs.1, Rhs.0
                      6673799 ~3%     {9} r2 = JOIN r1 WITH DataFlowImplForContentDataFlow::Subpaths::subpaths03#d678ef22#fffffffff ON FIRST 2 OUTPUT Rhs.3, Rhs.4, Rhs.5, Rhs.7, Rhs.6, Rhs.8, Lhs.2 'par', Lhs.0 'arg', Rhs.2 'ret'
                      
                      6637884 ~0%     {4} r3 = JOIN r2 WITH project#DataFlowImplForContentDataFlow::pathNode#d678ef22#ffffffff_1234560#join_rhs ON FIRST 6 OUTPUT Lhs.7 'arg', Lhs.6 'par', Lhs.8 'ret', Rhs.6 'out'
                      
                      6637884 ~1%     {4} r4 = JOIN r2 WITH project#DataFlowImplForContentDataFlow::pathNode#d678ef22#ffffffff_1234560#join_rhs ON FIRST 6 OUTPUT Rhs.6 'out', Lhs.6 'par', Lhs.7 'arg', Lhs.8 'ret'
                      51867   ~1%     {4} r5 = JOIN r4 WITH DataFlowImplForContentDataFlow::PathNodeMid::projectToSink#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2 'arg', Lhs.1 'par', Lhs.3 'ret', Rhs.1 'out'
                      
                      6689751 ~1%     {4} r6 = r3 UNION r5
                                      return r6

hvitved added 2 commits May 3, 2022 11:45
Before
```
[2022-05-02 15:47:16] (1280s) Tuple counts for DataFlowImpl::Subpaths::subpaths#656de156#ffff/4@c5f3dclb after 3m22s:
                      8389013    ~4%     {5} r1 = JOIN DataFlowImpl::Subpaths::subpaths#656de156#ffff#shared WITH DataFlowImpl::PathNode::getASuccessor#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg', Lhs.1, Lhs.2, Lhs.3, Lhs.4 'out'
                      6689751    ~0%     {4} r2 = JOIN r1 WITH DataFlowImpl::Subpaths::subpaths03#656de156#ffffff_034512#join_rhs ON FIRST 4 OUTPUT Rhs.4, Lhs.4 'out', Lhs.0 'arg', Rhs.5 'ret'

                      1513839768 ~1%     {5} r3 = JOIN r2 WITH DataFlowImpl::PathNodeImpl::getNodeEx#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'out', Lhs.2 'arg', Lhs.3 'ret', Rhs.1 'par', Lhs.3 'ret'
                      1513839768 ~1%     {5} r4 = r3 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      1513839768 ~5%     {4} r5 = SCAN r4 OUTPUT In.1 'arg', In.3 'par', In.0 'out', In.4 'ret'

                      1513839768 ~2%     {4} r6 = JOIN r2 WITH DataFlowImpl::PathNodeImpl::getNodeEx#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.3 'ret', Lhs.1 'out', Lhs.2 'arg', Rhs.1 'par'
                      0          ~0%     {5} r7 = JOIN r6 WITH boundedFastTC(DataFlowImpl::Subpaths::localStepToHidden#656de156#ff_10#higher_order_body,DataFlowImpl::Subpaths::subpaths#656de156#ffff#higher_order_body) ON FIRST 1 OUTPUT Lhs.1 'out', Lhs.2 'arg', Lhs.0, Lhs.3 'par', Rhs.1 'ret'
                      0          ~0%     {5} r8 = r7 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      0          ~0%     {4} r9 = SCAN r8 OUTPUT In.1 'arg', In.3 'par', In.0 'out', In.4 'ret'

                      1513839768 ~5%     {4} r10 = r5 UNION r9
                      6689751    ~0%     {4} r11 = JOIN r10 WITH DataFlowImpl::PathNode::getASuccessor#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'arg', Lhs.1 'par', Lhs.3 'ret', Lhs.2 'out'
                                         return r11
```

After
```
[2022-05-03 11:44:10] (969s) Tuple counts for DataFlowImpl::Subpaths::subpaths#656de156#ffff/4@b26b969r after 11.8s:
                      8372525 ~0%     {3} r1 = JOIN DataFlowImpl::PathNode::getASuccessor#dispred#f0820431#ff_10#join_rhs WITH DataFlowImpl::PathNodeImpl::getNodeEx#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'arg', Rhs.1, Rhs.0
                      6673799 ~6%     {9} r2 = JOIN r1 WITH DataFlowImpl::Subpaths::subpaths03#656de156#fffffffff ON FIRST 2 OUTPUT Rhs.3, Rhs.4, Rhs.5, Rhs.7, Rhs.6, Rhs.8, Lhs.2 'par', Lhs.0 'arg', Rhs.2 'ret'

                      6637884 ~0%     {5} r3 = JOIN r2 WITH project#DataFlowImpl::pathNode#656de156#ffffffff_1234560#join_rhs ON FIRST 6 OUTPUT Lhs.6 'par', Lhs.7 'arg', Lhs.8 'ret', Rhs.6 'out', Lhs.8 'ret'

                      6637884 ~0%     {4} r4 = JOIN r2 WITH project#DataFlowImpl::pathNode#656de156#ffffffff_1234560#join_rhs ON FIRST 6 OUTPUT Rhs.6 'out', Lhs.6 'par', Lhs.7 'arg', Lhs.8 'ret'

                      51867   ~0%     {5} r5 = JOIN r4 WITH DataFlowImpl::PathNodeMid::projectToSink#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'par', Lhs.2 'arg', Lhs.3 'ret', Rhs.1 'out', Lhs.3 'ret'

                      6689751 ~0%     {5} r6 = r3 UNION r5
                      6689751 ~0%     {5} r7 = r6 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      6689751 ~0%     {4} r8 = SCAN r7 OUTPUT In.1 'arg', In.0 'par', In.4 'ret', In.3 'out'

                      6637884 ~0%     {4} r9 = JOIN r2 WITH project#DataFlowImpl::pathNode#656de156#ffffffff_1234560#join_rhs ON FIRST 6 OUTPUT Lhs.8 'ret', Lhs.6 'par', Lhs.7 'arg', Rhs.6 'out'

                      51867   ~0%     {4} r10 = JOIN r4 WITH DataFlowImpl::PathNodeMid::projectToSink#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.3 'ret', Lhs.1 'par', Lhs.2 'arg', Rhs.1 'out'

                      6689751 ~0%     {4} r11 = r9 UNION r10
                      0       ~0%     {5} r12 = JOIN r11 WITH boundedFastTC(DataFlowImpl::Subpaths::localStepToHidden#656de156#ff_10#higher_order_body,DataFlowImpl::Subpaths::subpaths#656de156#ffff#higher_order_body) ON FIRST 1 OUTPUT Lhs.1 'par', Lhs.2 'arg', Lhs.0, Lhs.3 'out', Rhs.1 'ret'
                      0       ~0%     {5} r13 = r12 AND NOT DataFlowImpl::PathNodeImpl::isHidden#dispred#f0820431#f(Lhs.4 'ret')
                      0       ~0%     {4} r14 = SCAN r13 OUTPUT In.1 'arg', In.0 'par', In.4 'ret', In.3 'out'

                      6689751 ~0%     {4} r15 = r8 UNION r14
                                      return r15
```
@hvitved hvitved added the no-change-note-required This PR does not need a change note label May 3, 2022
@hvitved hvitved marked this pull request as ready for review May 4, 2022 08:02
@hvitved hvitved requested review from a team as code owners May 4, 2022 08:02
@aschackmull
Copy link
Copy Markdown
Contributor

So the listed "before" join-order highlights a clear problem with a (in my mind) straightforward fix. Yet, the code change seems a lot more involved. Is there something else being fixed here as well, or could we stick to a simpler fix? The simpler fix being a separately materialised join of arg.getASuccessor() = par and par.getNodeEx() = p.

@aschackmull
Copy link
Copy Markdown
Contributor

An even simpler and possibly sufficient fix is to unbind p, but the arg-to-par join might then have some fan-out, which we can avoid with the above sketched solution if it's too much.

@hvitved
Copy link
Copy Markdown
Contributor Author

hvitved commented May 4, 2022

Updated the PR to use binding pragmas instead, as suggested by @aschackmull (thanks!).

@hvitved hvitved merged commit 8e33653 into github:main May 4, 2022
@hvitved hvitved deleted the dataflow/subpaths-perf branch May 4, 2022 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants