Skip to content

Go: Improve bad join order in guardingCall#18831

Merged
owen-mc merged 2 commits into
github:mainfrom
owen-mc:go/join-order-fix-2
Feb 24, 2025
Merged

Go: Improve bad join order in guardingCall#18831
owen-mc merged 2 commits into
github:mainfrom
owen-mc:go/join-order-fix-2

Conversation

@owen-mc
Copy link
Copy Markdown
Contributor

@owen-mc owen-mc commented Feb 21, 2025

Old code:

guardingFunction(g, f, inp, outp, p) and
c = f.getACall() and
nd = inp.getNode(c) and
localFlow(pragma[only_bind_out](outp.getNode(c)), resNode)

Old join order (note that FunctionOutput.getExitNode comes from ``outp.getNode`, which is just a wrapper for it):

Pipeline standard for DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb@dfedaw8n was evaluated in 2 iterations totaling 7ms (delta sizes total: 4).
             6   ~0%    {5} r1 = SCAN `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingFunction/5#1f27d9d8#fffbf#prev_delta` OUTPUT In.3, In.0, In.1, In.2, In.4
        722106   ~6%    {7}    | JOIN WITH `FunctionInputsAndOutputs::FunctionOutput.getExitNode/1#dispred#4c5ce77c` ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.1, Lhs.3, Lhs.0, Lhs.4, Rhs.2
            19   ~0%    {7}    | JOIN WITH `Scopes::Function.getACall/0#ab99e5da` ON FIRST 2 OUTPUT Lhs.3, Lhs.1, Lhs.2, Lhs.0, Lhs.4, Lhs.5, Lhs.6
            19   ~0%    {8}    | JOIN WITH `FunctionInputsAndOutputs::FunctionInput.getNode/1#dispred#bca71294` ON FIRST 2 OUTPUT Lhs.6, Lhs.2, Lhs.3, Lhs.0, Lhs.4, Lhs.5, Lhs.1, Rhs.2
                    
             4   ~0%    {8} r2 = JOIN r1 WITH `project#Properties::Property.checkOn/3#dispred#43ad13f2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7, Lhs.0
                    
            14   ~0%    {8} r3 = JOIN r1 WITH `#DataFlowUtil::localFlowStep/2#edbc0f9aPlus` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7
             0   ~0%    {8}    | JOIN WITH `project#Properties::Property.checkOn/3#dispred#43ad13f2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7, Lhs.0
                    
             4   ~0%    {8} r4 = r2 UNION r3
             4   ~0%    {8}    | AND NOT `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb#prev`(FIRST 8)
                        return r4

It is finding the exit node corresponding to inp of all calls c, and only afterwards restricting c to be a call to f. This doesn't seem to be taking a long time, but it seems worth fixing the join order in case it takes a long time on some other database.

New code:

guardingFunction(g, f, inp, outp, p) and
c = pragma[only_bind_into](f).getACall() and
nd = inp.getNode(c) and
localFlow(outp.getNode(c), resNode)

New join order:

Pipeline standard for DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb@fc684wgo was evaluated in 2 iterations totaling 0ms (delta sizes total: 4).
         6   ~0%    {5} r1 = SCAN `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingFunction/5#1f27d9d8#fffbf#prev_delta` OUTPUT In.1, In.0, In.2, In.3, In.4
        19   ~0%    {6}    | JOIN WITH `Scopes::Function.getACall/0#ab99e5da` ON FIRST 1 OUTPUT Lhs.3, Rhs.1, Lhs.0, Lhs.1, Lhs.2, Lhs.4
        19   ~0%    {7}    | JOIN WITH `FunctionInputsAndOutputs::FunctionOutput.getExitNode/1#dispred#4c5ce77c` ON FIRST 2 OUTPUT Lhs.4, Lhs.1, Lhs.2, Lhs.3, Lhs.0, Lhs.5, Rhs.2
        19   ~0%    {8}    | JOIN WITH `FunctionInputsAndOutputs::FunctionInput.getNode/1#dispred#bca71294` ON FIRST 2 OUTPUT Lhs.6, Lhs.1, Lhs.2, Lhs.3, Lhs.0, Lhs.4, Lhs.5, Rhs.2
                
         4   ~0%    {8} r2 = JOIN r1 WITH `#DataFlowUtil::localFlowStep/2#edbc0f9aPlus#sinkBound#3` ON FIRST 1 OUTPUT Lhs.3, Lhs.2, Lhs.4, Lhs.5, Lhs.6, Lhs.1, Lhs.7, Lhs.0
                
         0   ~0%    {8} r3 = JOIN r1 WITH `#DataFlowUtil::localFlowStep/2#edbc0f9aPlus#bounded` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7
         0   ~0%    {8}    | JOIN WITH `#DataFlowUtil::localFlowStep/2#edbc0f9aPlus#sinkBound#3` ON FIRST 1 OUTPUT Lhs.3, Lhs.2, Lhs.4, Lhs.5, Lhs.6, Lhs.1, Lhs.7, Lhs.0
                
         4   ~0%    {8} r4 = r2 UNION r3
         4   ~0%    {8}    | AND NOT `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb#prev`(FIRST 8)
                    return r4

Copilot AI review requested due to automatic review settings February 21, 2025 10:44
@owen-mc owen-mc requested a review from a team as a code owner February 21, 2025 10:44
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Tip: Copilot code review supports C#, Go, Java, JavaScript, Markdown, Python, Ruby and TypeScript, with more languages coming soon. Learn more

@github-actions github-actions Bot added the Go label Feb 21, 2025
@owen-mc owen-mc added the no-change-note-required This PR does not need a change note label Feb 21, 2025
@owen-mc owen-mc requested a review from smowton February 21, 2025 10:44
@aschackmull
Copy link
Copy Markdown
Contributor

This seems very brittle and even though the result works as intended, I don't see much reason for that - I'm actually somewhat confused about why this even works - it seems that the added pragma ought to prohibit the good order. It would seem safer to use inline_late on both inp.getNode(c) and outp.getNode(c) with bindingsets requiring both the qualifier and the call to be present, as that should ensure that they're joined ON FIRST 2 as seen in the good join-order.

@smowton
Copy link
Copy Markdown
Contributor

smowton commented Feb 21, 2025

I'm also confused about why that binding hint doesn't actually cue it to go the other way around, via the output/input first and only then getting to a call node.

I note that the old order shows Property.checkOn magic'd in from the use-site, whereas the new one doesn't. I don't know off-hand if the magic is a good thing or not here.

@owen-mc
Copy link
Copy Markdown
Contributor Author

owen-mc commented Feb 21, 2025

My intention was to make it evaluate c = f.getACall() first, since I think it has some info about f (in the recursive case at least). My thinking was that it would then be okay to evaluate outp.getNode(c) as there are fewer call nodes. I may well have messed it up though, and accidentally perturbed the order to something better. I'll have another go when I next have time.

Copy link
Copy Markdown
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a quick check to convince myself I wasn't going nuts, and surrounding any of f and c with bind-in or bind-out annotations has no effect. I would expect some to serve as a cue to go inp -> c -> f or f -> c -> inp, but apparently not. Since the annotations don't seem to be doing anything, I suggest simply dropping them.

) {
guardingFunction(g, f, inp, outp, p) and
c = f.getACall() and
c = pragma[only_bind_into](f).getACall() and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c = pragma[only_bind_into](f).getACall() and
c = f.getACall() and

@owen-mc
Copy link
Copy Markdown
Contributor Author

owen-mc commented Feb 22, 2025

New RA looks good:

Pipeline standard for DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb@10408wls was evaluated in 2 iterations totaling 0ms (delta sizes total: 4).
         6   ~0%    {5} r1 = SCAN `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingFunction/5#1f27d9d8#fffbf#prev_delta` OUTPUT In.1, In.0, In.2, In.3, In.4
        19   ~0%    {6}    | JOIN WITH `Scopes::Function.getACall/0#ab99e5da` ON FIRST 1 OUTPUT Lhs.3, Rhs.1, Lhs.1, Lhs.0, Lhs.2, Lhs.4
        19   ~0%    {7}    | JOIN WITH `FunctionInputsAndOutputs::FunctionOutput.getExitNode/1#dispred#4c5ce77c` ON FIRST 2 OUTPUT Lhs.4, Lhs.1, Lhs.2, Lhs.3, Lhs.0, Lhs.5, Rhs.2
        19   ~0%    {8}    | JOIN WITH `FunctionInputsAndOutputs::FunctionInput.getNode/1#dispred#bca71294` ON FIRST 2 OUTPUT Lhs.6, Lhs.2, Lhs.3, Lhs.0, Lhs.4, Lhs.5, Lhs.1, Rhs.2
                
         4   ~0%    {8} r2 = JOIN r1 WITH `project#Properties::Property.checkOn/3#dispred#43ad13f2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7, Lhs.0
                
        14   ~0%    {8} r3 = JOIN r1 WITH `#DataFlowUtil::localFlowStep/2#edbc0f9aPlus` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7
         0   ~0%    {8}    | JOIN WITH `project#Properties::Property.checkOn/3#dispred#43ad13f2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7, Lhs.0
                
         4   ~0%    {8} r4 = r2 UNION r3
         4   ~0%    {8}    | AND NOT `DataFlowUtil::BarrierGuard<WrappedErrorAlwaysNil::nilTestGuard>::guardingCall/8#aa0defc4#fffffffb#prev`(FIRST 8)
                    return r4

DCA looks good.

Thought: should the actual member predicate FunctionOutput.getExitNode be inline_late? If we don't know the call then we'll get lots of results. If we don't know which FunctionOutput it is we'll also get quite a few extra. I guess I should try it and run DCA. Edit: I did that and the DCA results were awful, so I'll abandon that idea.

Copy link
Copy Markdown
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. This makes the join-order very clear and stable.

@owen-mc owen-mc merged commit 0d994c1 into github:main Feb 24, 2025
@owen-mc owen-mc deleted the go/join-order-fix-2 branch February 24, 2025 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Go no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants