Remove LUB calculation by tlively · Pull Request #3669 · WebAssembly/binaryen

tlively · 2021-03-10T03:09:52Z

Since correct LUB calculation for recursive types is complicated, stop depending
on LUBs throughout the code base. This also fixes a validation bug in which the
validator required blocks to be typed with the LUB of all the branch types, when
in fact any upper bound should have been valid. In addition to fixing that bug,
this PR simplifies the code for break handling by not storing redundant
information about the arity of types.

kripken

Nice work!

The switch from single values to sets of values in some places may have a perf downside. I think it is unavoidable given this is PR is the right solution, but it's worth measuring to see if we need to look into solutions there.

kripken · 2021-03-10T03:58:35Z

+      Type supertype;
+      if (Type::getSuperType(iter->second, supertype)) {
+        curr->type = supertype;
+      }


how about in an else of the if, asserting that the current type, that we assume is correct, is a supertype of all the break types? (the validator will catch it later, but it might nice to be alerted here)

also, this pattern (call + if + assign) happens several times in this PR. perhaps there could be a function that does all the above, including the assert? getSuperTypeOrGiven([] types, Type& given), called here via Type::getSuperTypeOrGiven(iter->second, curr->type)?

In fact for efficiency reasons it could avoid the sort and first check if the given value is a supertype of all the othrers, and just return it if so. If not, it would look among them, and in case of failure assert.

kripken · 2021-03-10T04:04:54Z

+    Type supertype;
+    if (Type::getSuperType(seeker.types, supertype)) {
+      type = supertype;
    }


same comment as before

kripken · 2021-03-10T04:16:10Z

+  // types.
+  // FIXME: This is not necessarily deterministic as long as we are not
+  // canonicalizing structurally identical recursive types because it may end up
+  // choosing any of multiple identical candidates.


could we do a stable sort here to make this deterministic? then the result would be determined by the order of the inputs, which is enough.

Yes, I think that if we use a stable_sort here and use std::set instead of std::unordered_set to collect types before calling this function, that would be sufficient to make it deterministic. I'll do that.

kripken · 2021-03-10T05:05:56Z

This may require some fuzzing, I see it finds bugs, e.g. seed 14395034353793924628 which reduces to

(module
 (type $none_=>_none (func))
 (type $f64_f32_i32_externref_=>_f32 (func (param f64 f32 i32 externref) (result f32)))
 (global $global$0 (mut i32) (i32.const 10))
 (func $0 (result funcref)
  (if (result funcref)
   (global.get $global$0)
   (ref.null $f64_f32_i32_externref_=>_f32)
   (ref.null $none_=>_none)
  )
 )
)

on wasm-opt w.wat -O4 -all.

tlively · 2021-03-10T22:02:06Z

Yeah, I'm splitting out the problematic part (treating unreachable as the bottom type in Type::isSubType) into a separate PR and fixing the OptimizeInstructions issue.

Since correct LUB calculation for recursive types is complicated, stop depending on LUBs throughout the code base. This also fixes a validation bug in which the validator required blocks to be typed with the LUB of all the branch types, when in fact any upper bound should have been valid. In addition to fixing that bug, this PR simplifies the code for break handling by not storing redundant information about the arity of types.

tlively · 2021-03-11T02:16:39Z

@kripken this should be ready for another look.

kripken

lgtm aside from comments.

kripken · 2021-03-11T02:26:48Z


-  std::map<Name, Type> breakValues;
+  // TODO: Switch to std::unordered_set once types are properly canonicalized so
+  // determinism isn't an issue.


why not do the change now? i'm not sure what's preventing it.

If we have an unordered_set, the stable sort in ensureSuperType is no better than an unstable sort. If we have two equivalent-but-distinct types A and B that are supertypes of all the other types being considered, then whether A or B is chosen will be nondeterministic. Using a std::map uses < rather than == to check uniqueness, so only A or B will be sent to ensureSuperType, not both.

Oh right, I missed that part...

Another option could be an unordered set that remembers the insertion order. That wouldn't need to wait on anything.

kripken · 2021-03-11T02:30:30Z

-
-  std::unordered_map<Name, BreakInfo> breakInfos;
+  // TODO: Switch to std::unordered_set once types are properly canonicalized
+  // so determinism isn't an issue.


I don't think determinism matters in the validator? Reordered validation errors don't seem that bad.

Sure, I guess it would matter more if we had more tests for validation failures. I'll just update this now.

kripken · 2021-03-11T02:35:40Z

-                 : Type::none;
+  if (ifFalse) {
+    // TODO: Calculate a proper LUB.
+    Type::ensureSuperType(std::array<Type, 2>{{ifTrue->type, ifFalse->type}},


Is the std::array necessary? A shame...

Yes, unfortunately we need to assign a type to the braced initializer list so ensureSuperType knows what kind of iterators it is dealing with.

kripken · 2021-03-11T02:37:17Z

-  if (type == Type::none && condition->type == Type::unreachable) {
+  if ((type == Type::none && condition->type == Type::unreachable) ||
+      (ifTrue->type == Type::unreachable && ifFalse &&
+       ifFalse->type == Type::unreachable)) {


This looks like an unrelated bugfix, is that right? If so could it be split out?

No, this is necessary due to the change to use ensureSuperType above. Previously, if the two arms were unreachable, the LUB would handle setting the if type to unreachable. ensureSuperType OTOH never changes the if type when the two arms are unreachable because the pre-existing type is always already a supertype of unreachable.

aheejin · 2021-03-17T00:24:57Z

Just saw this. I haven't reviewed recursive type PRs so I'm not sure what's going on there, but it also seems to remove LUB for normal use cases for normal types when computing things such as block return types.. Can someone explain why?

tlively · 2021-03-17T01:20:50Z

We discovered last week that LUBs of recursive types are difficult to compute and that with our half-baked implementation of recursive types it basically wasn't possible to do correctly. We decided to remove LUBs entirely from the code base and see how far we could get that way, but that broke some optimization passes. The current plan is to finish properly implementing recursive types and put the LUBs back in with a proper implementation that can handle recursive LUBs as well, partially reverting this PR.

kripken · 2021-03-17T03:55:55Z

To add a little more detail, the hope was that we don't need full LUBs because optimization passes generally don't create the need for new types. A pass might move one value around, replacing something with a more specific type (say the item at the end of a block) but the supertype would remain valid (the type of that block doesn't need to change). And this is true, but passes do need to shuffle around information. That turns out to be significant work in RemoveUnusedBrs which moves around types quite a lot. It's possible to fix, but if full LUBs are possible in a few days, that might be better since we'll need it anyhow.

Another concern was that tracking stuff in passes so LUB calculations are not needed can end up using a supertype that is not the LUB (say if two if arms are replaced by more specific types, perhaps there is a new more specific LUB than the original if type). And it seems like that might be bad for other optimizations - it's better to have the most specific type possible, though we might not use it now. So that's another reason for going back to full LUBs. (The one possible downside is that more specific types may increase the type section. In principle a set of recursive types may have a LUB that does not appear anywhere in the code. But the hope is that is rare...)

aheejin · 2021-03-17T23:28:42Z

Thanks for the explanation. So how are we calculating block return types in the meantime, when there are multiple different types exist for branches and the return type?

kripken · 2021-03-18T15:20:38Z

The idea was that we don't need to do it "from scratch". That is, when we read the input wasm we have a proper block return type for each block. An optimization pass might later specialize one of the values going to the block, but that existing block return type would remain a valid supertype. So whenever we compute a block return value we'd check the existing value, and also that of the values arriving, and assume that there is a valid supertype among them. Only a pass that really computes new types entirely (which we don't have yet) would need to do something more clever.

I'm not entirely sure how LUB removal made this noticeable, as it seems to be a pre-existing bug. However, somehow before #3669 it was not noticable - perhaps the finalize code worked around it. The bug is that RemoveUnusedBrs was moving code around and finalizing the parent before the child. The correct pattern is always to work from the children outwards, as otherwise the parent is trying to finalize itself based on non-finalized children. The fix is to just not finalize in the stealSlice method. The caller can do it after finishing any other work it has. As part of this refactoring, move stealSlice into the single pass that uses it; aside from that being more orderly, this method is really not a general-purpose tool, it is quite specific to what RemoveUnusedBrs does, and it might easily be used incorrectly elsewhere.

This is a partial revert of WebAssembly#3669, which removed the old implementation of Type::getLeastUpperBound that did not correctly handle recursive types. The new implementation in this PR uses a TypeBuilder to construct LUBs and for recursive types, it returns a temporary HeapType that has not yet been fully constructed to break what would otherwise be infinite recursions.

This is a partial revert of #3669, which removed the old implementation of Type::getLeastUpperBound that did not correctly handle recursive types. The new implementation in this PR uses a TypeBuilder to construct LUBs and for recursive types, it returns a temporary HeapType that has not yet been fully constructed to break what would otherwise be infinite recursions.

tlively requested a review from kripken March 10, 2021 03:09

kripken reviewed Mar 10, 2021

View reviewed changes

tlively added 2 commits March 10, 2021 16:20

Fix in validator

b60c874

tlively mentioned this pull request Mar 11, 2021

Revert "Make unreachable a subtype of everything" #3676

Closed

ensureSuperType

43861ae

tlively force-pushed the remove-lubs branch from e9aaecc to 43861ae Compare March 11, 2021 02:15

kripken reviewed Mar 11, 2021

View reviewed changes

kripken approved these changes Mar 11, 2021

View reviewed changes

Update validator

935a9c7

tlively enabled auto-merge (squash) March 11, 2021 20:39

Fix unreachable try

fd49725

tlively merged commit 4845637 into WebAssembly:main Mar 11, 2021

tlively deleted the remove-lubs branch March 11, 2021 21:53

kripken added a commit that referenced this pull request Mar 12, 2021

fix a regression from #3669

6d08d42

kripken mentioned this pull request Mar 12, 2021

[Wasm GC] Provide the local type in SimplifyLocals return value optimizations #3681

Closed

kripken mentioned this pull request Mar 22, 2021

Fix a fuzz regression from #3669 #3715

Merged

tlively mentioned this pull request Mar 25, 2021

LUBs #3731

Merged

Conversation

tlively commented Mar 10, 2021

Uh oh!

kripken left a comment

Choose a reason for hiding this comment

Uh oh!

kripken Mar 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kripken commented Mar 10, 2021

Uh oh!

tlively commented Mar 10, 2021

Uh oh!

tlively commented Mar 11, 2021

Uh oh!

kripken left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aheejin commented Mar 17, 2021

Uh oh!

tlively commented Mar 17, 2021

Uh oh!

kripken commented Mar 17, 2021

Uh oh!

aheejin commented Mar 17, 2021

Uh oh!

kripken commented Mar 18, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kripken Mar 10, 2021 •

edited

Loading