Intern HeapTypes and clean up types code by tlively · Pull Request #3428 · WebAssembly/binaryen

tlively · 2020-12-05T06:49:13Z

Interns HeapTypes using the same patterns and utilities already used to intern
Types. This allows HeapTypes to efficiently be compared for equality and hashed,
which may be important for very large struct types in the future. This change
also has the benefit of increasing symmetry between the APIs of Type and
HeapType, which will make the developer experience more consistent. Finally,
this change will make TypeBuilder (#3418) much simpler because it will no longer
have to introduce TypeInfo variants to refer to HeapTypes indirectly.
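
As a rough illustration of the interning pattern described above, here is a minimal sketch. It is not Binaryen's code: `StructInfo`, `StructInfoHash`, `InternStore`, and `InternedType` are hypothetical names, and the only assumption carried over from the description is that each structurally distinct definition is stored once and referred to by a small ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a structural type definition (not HeapTypeInfo).
struct StructInfo {
  std::vector<std::string> fields;
  bool operator==(const StructInfo& other) const {
    return fields == other.fields;
  }
};

// Structural hash: cost grows with the number of fields.
struct StructInfoHash {
  std::size_t operator()(const StructInfo& info) const {
    std::size_t digest = info.fields.size();
    for (const auto& field : info.fields) {
      digest ^= std::hash<std::string>{}(field) + 0x9e3779b9 + (digest << 6) +
                (digest >> 2);
    }
    return digest;
  }
};

// Hypothetical store: one canonical copy per distinct definition.
class InternStore {
  std::mutex mutex;
  std::vector<std::unique_ptr<StructInfo>> owned;
  std::unordered_map<StructInfo, std::uintptr_t, StructInfoHash> ids;

public:
  std::uintptr_t canonicalize(const StructInfo& info) {
    std::lock_guard<std::mutex> lock(mutex);
    auto it = ids.find(info);
    if (it != ids.end()) {
      return it->second;
    }
    owned.push_back(std::make_unique<StructInfo>(info));
    // The address of the canonical copy serves as the ID.
    auto id = reinterpret_cast<std::uintptr_t>(owned.back().get());
    ids.emplace(info, id);
    return id;
  }
};

// Handle that is cheap to copy; equality is a single integer comparison.
class InternedType {
  std::uintptr_t id;

public:
  explicit InternedType(std::uintptr_t id) : id(id) {}
  bool operator==(const InternedType& other) const { return id == other.id; }
  bool operator!=(const InternedType& other) const { return id != other.id; }
  std::uintptr_t getID() const { return id; }
};

// Usage sketch: structurally equal definitions intern to the same handle.
inline void internDemo() {
  InternStore store;
  StructInfo a;
  a.fields = {"i32", "f64"};
  StructInfo b = a;
  assert(InternedType(store.canonicalize(a)) ==
         InternedType(store.canonicalize(b)));
}
```

The structural comparison and hash are paid once, at canonicalization time. After that, comparing or hashing a handle costs the same no matter how large the underlying struct is.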

kripken · 2020-12-06T17:47:40Z

+
+  Signature getSignature() const;
+  const Struct& getStruct() const;
+  Array getArray() const;


Suggested change

-  Array getArray() const;
+  const Array& getArray() const;

Is the reason Arrays and Structs differ in this way that Arrays are "small" and Structs are possibly of unbounded size?

Yes, that's what I was thinking 👍 Arrays contain only one Field each, which in turn contains only three items taking up two machine words. That seems small enough to pass around by value.

That does make sense, but it does make the API asymmetrical, which seems a little surprising. Or do you think for users of the API the difference wouldn't be noticeable?

A comment might be good either way.

Yeah, the asymmetry is a little unfortunate, but I think it will be ok. What would you think of deleting operator= to prevent folks from accidentally doing auto s = getStruct() and making a copy of the struct? Would that be worth it or too inconvenient?

Will add comments.

That seems good, assuming I understand correctly and it would just force people to add the & in auto& struct_ = type.getStruct(); - ? If so sgtm. But regardless, that could also be separate.
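
A minimal sketch of the idea discussed above, using hypothetical `BigStruct` and `Holder` types rather than Binaryen's `Struct` and `HeapType`. One detail worth noting: `auto s = getStruct()` invokes the copy constructor rather than `operator=`, so this sketch deletes both.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a potentially large struct definition.
struct BigStruct {
  std::vector<int> fields;
  explicit BigStruct(std::vector<int> fields) : fields(std::move(fields)) {}
  // Deleting the copy operations turns accidental copies into compile errors.
  BigStruct(const BigStruct&) = delete;
  BigStruct& operator=(const BigStruct&) = delete;
};

// Hypothetical owner that hands out a reference, like getStruct() above.
class Holder {
  BigStruct data{std::vector<int>{1, 2, 3}};

public:
  const BigStruct& getStruct() const { return data; }
};

inline std::size_t countFields(const Holder& holder) {
  const auto& s = holder.getStruct(); // OK: binds a reference, no copy.
  // auto copy = holder.getStruct();  // Would not compile: copy ctor deleted.
  return s.fields.size();
}

static_assert(!std::is_copy_constructible<BigStruct>::value,
              "BigStruct must not be copyable");

inline void copyGuardDemo() {
  Holder holder;
  assert(countFields(holder) == 3);
}
```

With this in place, callers have to write `const auto&` (or `auto&`) explicitly, which is the behavior described in the exchange above.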

kripken · 2020-12-06T18:16:42Z

  std::string toString() const;
 };

+class HeapType {


I think it would be good to add a comment explaining what a HeapType is, and why it's separate from Type.

Also, it would be good to add a comment noting that Type and HeapType are basically interned IDs and so are meant to be passed and used by value.

aheejin

It's much clearer this way!

aheejin · 2020-12-07T21:48:51Z

  std::string toString() const;
 };

+class HeapType {


Also, it would be good to add a comment noting that Type and HeapType are basically interned IDs and so are meant to be passed and used by value.

aheejin · 2020-12-07T22:39:11Z

-        break;
+      } else {
+        if (info.ref.heapType == HeapType::i31) {
+          return Type::i31ref;


Any reason to change this from map lookup to switch-case jump table? Wouldn't map lookup be cheaper?

I'm not sure about the performance - the map lookup at least requires taking a lock and calculating a hash. I wouldn't expect this to have a measurable impact (but I haven't measured it). The reason I changed it was that what we had before was kind of a hack. It used tuple TypeInfos to store individual BasicTypes even though empty and singleton tuples were otherwise not allowed anywhere. In contrast, there is no need for HeapTypeInfo to be able to represent a BasicHeapType and it seemed cleaner to make that separation of concerns explicit for TypeInfo as well.

Hmm, I thought it would be simpler if every Type has its matching TypeInfo and we can get one using the other, but as long as it is implementation detail and well encapsulated from outside I think either way should be fine.

Thanks :) This change should not be observable outside of this function.

aheejin · 2020-12-07T22:41:57Z

-      case HeapType::ArrayKind:
-        assert(heapType.array.element.type.isSingle());
-        break;
+      } else {


How about adding an assertion failure in case heapType is not nullable and is not i31, because we don't support it yet?

I agree that erroring out when we use non-nullable types would make sense (including for i31ref, which we don't support any better than other non-nullable types). But I think the validator would be a better place for that error since we support creating and querying non-nullable types, just not using them in the IR.

What I was wondering is, even if the type is basic, if it is not nullable, it will fall out of all these ifs and call Store<TypeInfo>::canonicalize(info), which will create TypeInfo for it, in which case it can cause an unintended consequence which is not easy to catch later. Maybe I'm missing something?

I don't think the act of making a TypeInfo for a non-nullable type can make anything bad happen. We also depend on being able to do that in test/example/typeinfo.cpp.

The bad things only start happening once you use a non-nullable type for locals in the IR.

But didn't you say you don't create TypeInfos anymore with (nullable) basic types?

Oh, I see what you're saying. We only provide BasicTypes for types explicitly given their own names in the spec proposals, so (ref null extern), which is equivalent to externref, is canonicalized into Type::externref, but on the other hand (ref extern) would be backed by a TypeInfo. This is working as intended. i31ref is weird because it is non-nullable by definition.
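
The canonicalization rule described in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the code in src/wasm/wasm-type.cpp: the enums, `RefInfo`, `Canonical`, and `canonicalizeRef` are hypothetical, and `ext` stands in for `extern`, which is a C++ keyword.

```cpp
#include <cassert>

// Hypothetical, reduced sets of basic heap types and basic value types.
enum class BasicHeapType { func, ext, i31 };
enum class BasicType { funcref, externref, i31ref };

// Hypothetical description of a reference type: (ref null? heapType).
struct RefInfo {
  BasicHeapType heapType;
  bool nullable;
};

// Result: either a named basic type, or "needs an interned TypeInfo".
struct Canonical {
  bool isBasic;
  BasicType basic; // Meaningful only when isBasic is true.
};

inline Canonical canonicalizeRef(const RefInfo& info) {
  if (info.nullable) {
    // Nullable references to func/extern have their own names in the spec.
    switch (info.heapType) {
      case BasicHeapType::func:
        return {true, BasicType::funcref};
      case BasicHeapType::ext:
        return {true, BasicType::externref};
      case BasicHeapType::i31:
        break;
    }
  } else {
    // i31ref is the odd one out: it is non-nullable by definition.
    if (info.heapType == BasicHeapType::i31) {
      return {true, BasicType::i31ref};
    }
  }
  // Everything else would be backed by an interned TypeInfo (not shown).
  return {false, BasicType::funcref};
}

inline void canonicalizeDemo() {
  assert(canonicalizeRef({BasicHeapType::ext, true}).isBasic);
  assert(!canonicalizeRef({BasicHeapType::ext, false}).isBasic);
}
```

In this sketch `(ref null extern)` maps to `externref`, while `(ref extern)` falls through to the interned path, matching the behavior described as working as intended.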

aheejin · 2020-12-07T22:50:09Z

+
 } // namespace wasm
+
+namespace std {


Any reason we put hash functions in two separate chunks? Can we merge them into one chunk?

There are a couple unfortunate things going on here.

The declaration of specializations of std::hash for TypeInfo and HeapTypeInfo needs to be done before the declaration of Store, otherwise the declaration of Store::typeIDs tries to use the default instantiation and fails.

There's no way to declare specializations from a different namespace, so there's no way to avoid leaving the anonymous and wasm namespaces and entering std just for these two declarations.

There's nothing stopping us from moving the hash definitions from down below up here, but I thought it would still be better to keep them separate despite the extra verbosity.
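
The layout described above can be sketched as follows, with hypothetical `demo::Info` and `demo::Store` types standing in for `TypeInfo` and `Store`. The `std::hash` specialization is declared before the store that uses it as a map key, and the hash body is defined further down.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

namespace demo {
// Hypothetical key type.
struct Info {
  int kind;
  int payload;
  bool operator==(const Info& other) const {
    return kind == other.kind && payload == other.payload;
  }
};
} // namespace demo

// First chunk: declare the specialization before any use as a map key.
namespace std {
template<> struct hash<demo::Info> {
  size_t operator()(const demo::Info& info) const;
};
} // namespace std

namespace demo {
// The map member needs std::hash<Info> to be declared at this point.
struct Store {
  std::unordered_map<Info, int> ids;

  int intern(const Info& info) {
    auto it = ids.find(info);
    if (it != ids.end()) {
      return it->second;
    }
    int id = static_cast<int>(ids.size());
    ids.emplace(info, id);
    return id;
  }
};
} // namespace demo

// Second chunk: the definition can live after the code that uses it.
namespace std {
size_t hash<demo::Info>::operator()(const demo::Info& info) const {
  size_t digest = hash<int>{}(info.kind);
  digest ^= hash<int>{}(info.payload) + 0x9e3779b9 + (digest << 6) +
            (digest >> 2);
  return digest;
}
} // namespace std

inline void hashOrderDemo() {
  demo::Store store;
  assert(store.intern({1, 2}) == store.intern({1, 2}));
}
```

Moving the definition up into the first `namespace std` block would also compile; keeping the two chunks separate trades some verbosity for keeping the declarations near the store and the hashing logic elsewhere.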

tlively · 2020-12-07T23:32:40Z

It's much clearer this way!

Thanks, I'm glad you think so :D

kripken

lgtm % comments we discussed

Co-authored-by: Heejin Ahn <aheejin@gmail.com>

aheejin

LGTM % comments

tlively · 2020-12-08T02:41:19Z

+  // This type annotation is unused. Beware it needing to be used in the future!
+  getHeapType();


cc @kripken here's where I fixed that unused variable error I was getting.

kripken reviewed Dec 6, 2020

aheejin reviewed Dec 7, 2020

kripken approved these changes Dec 7, 2020

Update src/wasm/wasm-type.cpp

2906cd5

Co-authored-by: Heejin Ahn <aheejin@gmail.com>

aheejin approved these changes Dec 8, 2020

tlively added 4 commits December 7, 2020 17:46

Add comments about passing by value/reference and about HeapType vs Type

1effa82

Merge remote-tracking branch 'origin/master' into pointer-to-heap-type

eef587f

Remove unused variables

37930dc

Change for CI's clang-format

ef2e585

tlively commented Dec 8, 2020

tlively merged commit 2a0059d into WebAssembly:master Dec 8, 2020

tlively deleted the pointer-to-heap-type branch December 8, 2020 03:32

This was referenced Dec 9, 2020

TypeBuilder #3418

Merged

Small RTT cleanups #3434

Merged
