Skip to content

Python: Add Node::getPreUpdateNode#6079

Closed
tausbn wants to merge 3 commits into
github:mainfrom
tausbn:python-add-getpreupdatenode
Closed

Python: Add Node::getPreUpdateNode#6079
tausbn wants to merge 3 commits into
github:mainfrom
tausbn:python-add-getpreupdatenode

Conversation

@tausbn

@tausbn tausbn commented Jun 15, 2021

Copy link
Copy Markdown
Contributor

This is mostly as a convenience, as
foo.(PostUpdateNode).getPreUpdateNode() can be a bit of a mouthful.

This does not change the external API in any meaningful way, so I don't
think a change note is necessary.

This is mostly as a convenience, as
`foo.(PostUpdateNode).getPreUpdateNode()` can be a bit of a mouthful.

This does not change the external API in any meaningful way, so I don't
think a change note is necessary.
@tausbn tausbn added Python no-change-note-required This PR does not need a change note labels Jun 15, 2021
@tausbn tausbn requested a review from a team as a code owner June 15, 2021 12:00
@tausbn tausbn mentioned this pull request Jun 15, 2021
tausbn added 2 commits June 15, 2021 12:20
I had forgotten that `DataFlowImplCommon.qll` is shared across
languages, and so any change would have to be mirrored there.

But that would also mean that other languages would have to add the same
setup as us for these nodes, which I think is going to be a hard sell.

I'm not really sure where this leaves this proposal... :/
Turns out I had only looked for `(PostUpdateNode).get...`
and not `(DataFlow::PostUpdateNode).get...` which had a bunch more
instances that could be cleaned up.

@RasmusWL RasmusWL left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly split about this (even after thinking it over for some days). On one hand, I really like foo.(DataFlow::PostUpdateNode).getPreUpdateNode() since it so explicitly says that foo is in fact a PostUpdateNode. On the other hand, for foo.getPreUpdateNode() I have to remember that this implies that foo is now PostUpdateNode, which is slightly subtle, and doing this mental gymnastics feels draining to me.

foo.(PostUpdateNode).getPreUpdateNode() is a bit of a mouthful, but from my point of view, this explicitness makes it much easier to figure out the pre- and post update node setup. So I think overall I'm not in favor of this change 😐

EDIT: I see this might seem a bit strange to oppose when I'm suggesting to introduce .getPostUpdateNode in #5926... But I think it has to do with the fact that in bar.getPostUpdateNode(), I can still think of bar as a "normal" data-flow node, and in foo.getPreUpdateNode() I have to remember that foo is in fact a "special" data-flow node, namely a PostUpdateNode.

@yoff

yoff commented Jun 21, 2021

Copy link
Copy Markdown
Contributor

I also feel split, and I have not really found a principle to guide me.
I like the explicit "demand it is a PostUpdateNode, then ask for its PreUpdateNode". I then mentally move into a world where I can expect the PreUpdateNode to exist. But all predicates can have 0 (or even more than one) results. I wonder if there should be a way to mark a predicate as functional (say writing function instead of predicate), then the predicate on PostUpdateNode would be functional and the convenience would not.
I can also follow @RasmusWL in the implicit conversion, although I would point out that not all PostUpdateNodes are special, at least they are not all synthetic nodes.
All in all, I am not sure. It feels like convenience should somehow always be allowed. Would it work better as an external predicate (getPreUpdateNode(PostUpdateNode pun)) to would that be even worse?

@tausbn

tausbn commented Jun 23, 2021

Copy link
Copy Markdown
Contributor Author

I think whatever we decide, it should be consistent: either we have both, or we have neither. Otherwise, we end up in a situation where one direction is nice to express, and the other one is just... seemingly missing (and requires a "magic" cast to work).

Also, my desire for adding these "convenience" functions is to help combat that "magic". If the interface behind which we access pre- and post-update nodes requires casting, then it's basically impossible to encounter by chance. No one sits down and goes "hey I wonder what happens if I cast to PostUpdateNode?".
By having these methods available on Node, there is at least some hope that people will encounter them and learn about pre- and post-update nodes.

Seeing as getPreUpdateNode (from #5926) is only used in one place, I wonder if we should punt on this issue for the moment, and consider whether there's a better overall solution.

I'm still very tempted by the idea of being able to access the pre- and post-update node for any node (and these nodes always existing), but having these be equal to the node in question if no synthesised node exists. That way, I don't ever have to remember that in some cases it's the PreUpdateNode that is synthesised, and in some cases it is the PostUpdateNode that is synthesised -- I can simply write underlying_node.getPreUpdateNode() and know that this will be the node for "before the state change", but I don't need to know or care whether that's actually equal to the node itself.

I believe this would be as simple as defining these methods as the identity relation on Node itself, and then overriding them appropriately on the relevant subclasses. (Though it may be that there's some subtlety I'm overlooking.)

@tausbn tausbn closed this Sep 2, 2021
@tausbn tausbn deleted the python-add-getpreupdatenode branch September 2, 2021 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants