Skip to content

Remove LUB calculation#3669

Merged
tlively merged 5 commits into
WebAssembly:mainfrom
tlively:remove-lubs
Mar 11, 2021
Merged

Remove LUB calculation#3669
tlively merged 5 commits into
WebAssembly:mainfrom
tlively:remove-lubs

Conversation

@tlively

@tlively tlively commented Mar 10, 2021

Copy link
Copy Markdown
Member

Since correct LUB calculation for recursive types is complicated, stop depending
on LUBs throughout the code base. This also fixes a validation bug in which the
validator required blocks to be typed with the LUB of all the branch types, when
in fact any upper bound should have been valid. In addition to fixing that bug,
this PR simplifies the code for break handling by not storing redundant
information about the arity of types.

@tlively tlively requested a review from kripken March 10, 2021 03:09

@kripken kripken left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

The switch from single values to sets of values in some places may have a perf downside. I think it is unavoidable given this is PR is the right solution, but it's worth measuring to see if we need to look into solutions there.

Comment thread src/ir/ReFinalize.cpp Outdated
Type supertype;
if (Type::getSuperType(iter->second, supertype)) {
curr->type = supertype;
}

@kripken kripken Mar 10, 2021

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about in an else of the if, asserting that the current type, that we assume is correct, is a supertype of all the break types? (the validator will catch it later, but it might nice to be alerted here)

also, this pattern (call + if + assign) happens several times in this PR. perhaps there could be a function that does all the above, including the assert? getSuperTypeOrGiven([] types, Type& given), called here via Type::getSuperTypeOrGiven(iter->second, curr->type)?

In fact for efficiency reasons it could avoid the sort and first check if the given value is a supertype of all the othrers, and just return it if so. If not, it would look among them, and in case of failure assert.

Comment thread src/wasm/wasm.cpp Outdated
Type supertype;
if (Type::getSuperType(seeker.types, supertype)) {
type = supertype;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as before

Comment thread src/wasm-type.h Outdated
// types.
// FIXME: This is not necessarily deterministic as long as we are not
// canonicalizing structurally identical recursive types because it may end up
// choosing any of multiple identical candidates.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we do a stable sort here to make this deterministic? then the result would be determined by the order of the inputs, which is enough.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that if we use a stable_sort here and use std::set instead of std::unordered_set to collect types before calling this function, that would be sufficient to make it deterministic. I'll do that.

@kripken

kripken commented Mar 10, 2021

Copy link
Copy Markdown
Member

This may require some fuzzing, I see it finds bugs, e.g. seed 14395034353793924628 which reduces to

(module
 (type $none_=>_none (func))
 (type $f64_f32_i32_externref_=>_f32 (func (param f64 f32 i32 externref) (result f32)))
 (global $global$0 (mut i32) (i32.const 10))
 (func $0 (result funcref)
  (if (result funcref)
   (global.get $global$0)
   (ref.null $f64_f32_i32_externref_=>_f32)
   (ref.null $none_=>_none)
  )
 )
)

on wasm-opt w.wat -O4 -all.

@tlively

tlively commented Mar 10, 2021

Copy link
Copy Markdown
Member Author

Yeah, I'm splitting out the problematic part (treating unreachable as the bottom type in Type::isSubType) into a separate PR and fixing the OptimizeInstructions issue.

tlively added 2 commits March 10, 2021 16:20
Since correct LUB calculation for recursive types is complicated, stop depending
on LUBs throughout the code base. This also fixes a validation bug in which the
validator required blocks to be typed with the LUB of all the branch types, when
in fact any upper bound should have been valid. In addition to fixing that bug,
this PR simplifies the code for break handling by not storing redundant
information about the arity of types.
@tlively

tlively commented Mar 11, 2021

Copy link
Copy Markdown
Member Author

@kripken this should be ready for another look.

@kripken kripken left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm aside from comments.

Comment thread src/ir/utils.h

std::map<Name, Type> breakValues;
// TODO: Switch to std::unordered_set once types are properly canonicalized so
// determinism isn't an issue.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not do the change now? i'm not sure what's preventing it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have an unordered_set, the stable sort in ensureSuperType is no better than an unstable sort. If we have two equivalent-but-distinct types A and B that are supertypes of all the other types being considered, then whether A or B is chosen will be nondeterministic. Using a std::map uses < rather than == to check uniqueness, so only A or B will be sent to ensureSuperType, not both.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, I missed that part...

Another option could be an unordered set that remembers the insertion order. That wouldn't need to wait on anything.

Comment thread src/wasm/wasm-validator.cpp Outdated

std::unordered_map<Name, BreakInfo> breakInfos;
// TODO: Switch to std::unordered_set once types are properly canonicalized
// so determinism isn't an issue.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think determinism matters in the validator? Reordered validation errors don't seem that bad.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I guess it would matter more if we had more tests for validation failures. I'll just update this now.

Comment thread src/wasm/wasm.cpp
: Type::none;
if (ifFalse) {
// TODO: Calculate a proper LUB.
Type::ensureSuperType(std::array<Type, 2>{{ifTrue->type, ifFalse->type}},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the std::array necessary? A shame...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately we need to assign a type to the braced initializer list so ensureSuperType knows what kind of iterators it is dealing with.

Comment thread src/wasm/wasm.cpp
if (type == Type::none && condition->type == Type::unreachable) {
if ((type == Type::none && condition->type == Type::unreachable) ||
(ifTrue->type == Type::unreachable && ifFalse &&
ifFalse->type == Type::unreachable)) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an unrelated bugfix, is that right? If so could it be split out?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is necessary due to the change to use ensureSuperType above. Previously, if the two arms were unreachable, the LUB would handle setting the if type to unreachable. ensureSuperType OTOH never changes the if type when the two arms are unreachable because the pre-existing type is always already a supertype of unreachable.

@tlively tlively enabled auto-merge (squash) March 11, 2021 20:39
@tlively tlively merged commit 4845637 into WebAssembly:main Mar 11, 2021
@tlively tlively deleted the remove-lubs branch March 11, 2021 21:53
kripken added a commit that referenced this pull request Mar 12, 2021
@aheejin

aheejin commented Mar 17, 2021

Copy link
Copy Markdown
Member

Just saw this. I haven't reviewed recursive type PRs so I'm not sure what's going on there, but it also seems to remove LUB for normal use cases for normal types when computing things such as block return types.. Can someone explain why?

@tlively

tlively commented Mar 17, 2021

Copy link
Copy Markdown
Member Author

We discovered last week that LUBs of recursive types are difficult to compute and that with our half-baked implementation of recursive types it basically wasn't possible to do correctly. We decided to remove LUBs entirely from the code base and see how far we could get that way, but that broke some optimization passes. The current plan is to finish properly implementing recursive types and put the LUBs back in with a proper implementation that can handle recursive LUBs as well, partially reverting this PR.

@kripken

kripken commented Mar 17, 2021

Copy link
Copy Markdown
Member

To add a little more detail, the hope was that we don't need full LUBs because optimization passes generally don't create the need for new types. A pass might move one value around, replacing something with a more specific type (say the item at the end of a block) but the supertype would remain valid (the type of that block doesn't need to change). And this is true, but passes do need to shuffle around information. That turns out to be significant work in RemoveUnusedBrs which moves around types quite a lot. It's possible to fix, but if full LUBs are possible in a few days, that might be better since we'll need it anyhow.

Another concern was that tracking stuff in passes so LUB calculations are not needed can end up using a supertype that is not the LUB (say if two if arms are replaced by more specific types, perhaps there is a new more specific LUB than the original if type). And it seems like that might be bad for other optimizations - it's better to have the most specific type possible, though we might not use it now. So that's another reason for going back to full LUBs. (The one possible downside is that more specific types may increase the type section. In principle a set of recursive types may have a LUB that does not appear anywhere in the code. But the hope is that is rare...)

@aheejin

aheejin commented Mar 17, 2021

Copy link
Copy Markdown
Member

Thanks for the explanation. So how are we calculating block return types in the meantime, when there are multiple different types exist for branches and the return type?

@kripken

kripken commented Mar 18, 2021

Copy link
Copy Markdown
Member

The idea was that we don't need to do it "from scratch". That is, when we read the input wasm we have a proper block return type for each block. An optimization pass might later specialize one of the values going to the block, but that existing block return type would remain a valid supertype. So whenever we compute a block return value we'd check the existing value, and also that of the values arriving, and assume that there is a valid supertype among them. Only a pass that really computes new types entirely (which we don't have yet) would need to do something more clever.

kripken added a commit that referenced this pull request Mar 22, 2021
I'm not entirely sure how LUB removal made this noticeable, as it seems
to be a pre-existing bug. However, somehow before #3669 it was not
noticable - perhaps the finalize code worked around it.

The bug is that RemoveUnusedBrs was moving code around and
finalizing the parent before the child. The correct pattern is always to
work from the children outwards, as otherwise the parent is trying to
finalize itself based on non-finalized children.

The fix is to just not finalize in the stealSlice method. The caller can
do it after finishing any other work it has. As part of this refactoring,
move stealSlice into the single pass that uses it; aside from that being
more orderly, this method is really not a general-purpose tool, it is
quite specific to what RemoveUnusedBrs does, and it might easily
be used incorrectly elsewhere.
tlively added a commit to tlively/binaryen that referenced this pull request Mar 25, 2021
This is a partial revert of WebAssembly#3669, which removed the old implementation of
Type::getLeastUpperBound that did not correctly handle recursive types. The new
implementation in this PR uses a TypeBuilder to construct LUBs and for recursive
types, it returns a temporary HeapType that has not yet been fully constructed
to break what would otherwise be infinite recursions.
@tlively tlively mentioned this pull request Mar 25, 2021
tlively added a commit that referenced this pull request Mar 29, 2021
This is a partial revert of #3669, which removed the old implementation of
Type::getLeastUpperBound that did not correctly handle recursive types. The new
implementation in this PR uses a TypeBuilder to construct LUBs and for recursive
types, it returns a temporary HeapType that has not yet been fully constructed
to break what would otherwise be infinite recursions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants