Skip to content

Intern HeapTypes and clean up types code#3428

Merged
tlively merged 6 commits into
WebAssembly:masterfrom
tlively:pointer-to-heap-type
Dec 8, 2020
Merged

Intern HeapTypes and clean up types code#3428
tlively merged 6 commits into
WebAssembly:masterfrom
tlively:pointer-to-heap-type

Conversation

@tlively

@tlively tlively commented Dec 5, 2020

Copy link
Copy Markdown
Member

Interns HeapTypes using the same patterns and utilities already used to intern
Types. This allows HeapTypes to efficiently be compared for equality and hashed,
which may be important for very large struct types in the future. This change
also has the benefit of increasing symmetry between the APIs of Type and
HeapType, which will make the developer experience more consistent. Finally,
this change will make TypeBuilder (#3418) much simpler because it will no longer
have to introduce TypeInfo variants to refer to HeapTypes indirectly.

Interns HeapTypes using the same patterns and utilities already used to intern
Types. This allows HeapTypes to efficiently be compared for equality and hashed,
which may be important for very large struct types in the future. This change
also has the benefit of increasing symmetry between the APIs of Type and
HeapType, which will make the developer experience more consistent. Finally,
this change will make TypeBuilder (WebAssembly#3418) much simpler because it will no longer
have to introduce TypeInfo variants to refer to HeapTypes indirectly.
Comment thread src/wasm-type.h

Signature getSignature() const;
const Struct& getStruct() const;
Array getArray() const;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Array getArray() const;
const Array& getArray() const;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reason Arrays and Structs differ in this way that Arrays are "small" and Structs are possibly of unbounded size?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I was thinking 👍 Arrays contain only one Field each, which in turn contains only three items taking up two machine words. That seems small enough to pass around by value.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does make sense, but it does make the API asymmetrical, which seems a little surprising. Or do you think for users of the API the difference wouldn't be noticeable?

A comment might be good either way.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the asymmetry is a little unfortunate, but I think it will be ok. What would you think of deleting operator= to prevent folks from accidentally doing auto s = getStruct() and making a copy of the struct? Would that be worth it or too inconvenient?

Will add comments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good, assuming I understand correctly and it would just force people to add the & in auto& struct_ = type.getStruct(); - ? If so sgtm. But regardless, that could also be separate.

Comment thread src/wasm-type.h
std::string toString() const;
};

class HeapType {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to add a comment explaining what a HeapType is, and why it's separate from Type.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it would be good to comment that Type and HeapType are basically interned IDs so are supposed to be passed/used by values.

@aheejin aheejin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much clearer this way!

Comment thread src/wasm-type.h
std::string toString() const;
};

class HeapType {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it would be good to comment that Type and HeapType are basically interned IDs so are supposed to be passed/used by values.

Comment thread src/wasm/wasm-type.cpp Outdated
Comment thread src/wasm/wasm-type.cpp
break;
} else {
if (info.ref.heapType == HeapType::i31) {
return Type::i31ref;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to change this from map lookup to switch-case jump table? Wouldn't map lookup be cheaper?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the performance - the map lookup at least requires taking a lock and calculating a hash. I wouldn't expect this to have a measurable impact (but I haven't measured it). The reason I changed it was that what we had before was kind of a hack. It used tuple TypeInfos to store individual BasicTypes even though empty and singleton tuples were otherwise not allowed anywhere. In contrast, there is no need for HeapTypeInfo to be able to represent a BasicHeapType and it seemed cleaner to make that separation of concerns explicit for TypeInfo as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I thought it would be simpler if every Type has its matching TypeInfo and we can get one using the other, but as long as it is implementation detail and well encapsulated from outside I think either way should be fine.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :) This change should not be observable outside of this function.

Comment thread src/wasm/wasm-type.cpp
case HeapType::ArrayKind:
assert(heapType.array.element.type.isSingle());
break;
} else {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding an assertion failure in case heapType is not nullable and is not i31, because we don't support it yet?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that erroring out when we use non-nullable types would make sense (including for i31ref, which we don't support any better than other non-nullable types). But I think the validator would be a better place for that error since we support creating and querying non-nullable types, just not using them in the IR.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was wondering is, even if the type is basic, if it is not nullable, it will fall out of all these ifs and call Store<TypeInfo>::canonicalize(info), which will create TypeInfo for it, in which case it can cause an unintended consequence which is not easy to catch later. Maybe I'm missing something?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the act of making a TypeInfo for a non-nullable type can make anything bad happen. We also depend on being able to do that in test/example/typinfo.cpp.

The bad things only start happening once you use a non-nullable type for locals in the IR.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But didn't you say you don't create TypeInfos anymore with (nullable) basic types?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you're saying. We only provide BasicTypes for types explicitly given their own names in the spec proposals, so (ref null extern), which is equivalent to externref, is canonicalized into Type::externref, but on the other hand (ref extern) would be backed by a TypeInfo. This is working as intended. i31ref is weird because it is non-nullable by definition.

Comment thread src/wasm/wasm-type.cpp

} // namespace wasm

namespace std {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we put hash functions in two separate chunks? Can we merge them into one chunk?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple unfortunate things going on here.

  1. The declaration of specializations of std::hash for TypeInfo and HeapTypeInfo needs to be done before the declaration of Store, otherwise the declaration of Store::typeIDs tries to use the default instantiation and fails.

  2. There's no way to declare specializations from a different namespace, so there's no way to avoid leaving the anonymous and wasm namespaces and entering std just for these two declarations.

There's nothing stopping us from moving the hash definitions from down below up here, but I thought it would still be better to keep them separate despite the extra verbosity.

@tlively

tlively commented Dec 7, 2020

Copy link
Copy Markdown
Member Author

It's much clearer this way!

Thanks, I'm glad you think so :D

@kripken kripken left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments we discussed

Co-authored-by: Heejin Ahn <aheejin@gmail.com>

@aheejin aheejin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments

Comment thread src/wasm/wasm-binary.cpp
Comment on lines +5633 to +5634
// This type annotation is unused. Beware it needing to be used in the future!
getHeapType();

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @kripken here's where I fixed that unused variable error I was getting.

@tlively tlively merged commit 2a0059d into WebAssembly:master Dec 8, 2020
@tlively tlively deleted the pointer-to-heap-type branch December 8, 2020 03:32
This was referenced Dec 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants