fix and implement more unary ops#1442
Conversation
| builder->makeUnary(EqZInt32, builder->makeGetLocal(highBits, i32)), | ||
| builder->makeUnary(EqZInt32, curr->value) | ||
| TempVar lowBits = getTemp(); | ||
| SetLocal* setLow = builder->makeSetLocal(lowBits, curr->value); |
There was a problem hiding this comment.
I may be misunderstanding this. Why is a temporary needed here? Also, setLow just copies the value in curr->value, which is 64-bit, don't we need to truncate it to get the low bits?
There was a problem hiding this comment.
My (limited) understanding is that the corresponding get_local generated later will perform the truncation. The reasons for this are not clear to me, but it is at least consistent with what all the other i64 lowering does.
The asm.js looks like this before the patch:
function $5($0, $0$hi) {
$0 = $0 | 0;
$0$hi = $0$hi | 0;
var i64toi32_i32$0 = 0, $3 = 0, $4 = 0, $5 = 0, $6 = 0, $7 = 0, $8 = 0, $9 = 0, $10 = 0;
$4 = (i64toi32_i32$0 | 0) == (0 | 0);
i64toi32_i32$0 = $0$hi;
return $4 & ($0 | 0) == (0 | 0) | 0 | 0;
}Notice that we've completely dropped anything having to do with the high bits. After the patch (without the eqz (or x y) optimization introduced later in the patch series), it looks like:
function $5($0, $0$hi) {
$0 = $0 | 0;
$0$hi = $0$hi | 0;
var i64toi32_i32$0 = 0, i64toi32_i32$1 = 0, $4 = 0, $5 = 0, $6 = 0, $7 = 0, $8 = 0, $9 = 0, $10 = 0, $11 = 0, $12 = 0, $13 = 0, $14 = 0, wasm2asm_i32$0 = 0;
return ($0$hi | 0) == (0 | 0) & ($0 | 0) == (0 | 0) | 0 | 0;
return wasm2asm_i32$0 | 0;
}which is much more sensible.
There was a problem hiding this comment.
I think I see, thanks. What I believe happens is the lowering was already done, so this node is 32-bit, and the high 32 bits are received in fetchOutParam.
If that's right, then you don't need the set_local/get_local (or the block) here, since the low bits are used exactly once, you can just use them directly.
There was a problem hiding this comment.
I mean, I believe there is no need for the temporary variable. It could be auto lowBits = curr->value;
There was a problem hiding this comment.
How is that any different from the original code, which used curr->value directly?
There was a problem hiding this comment.
I think the original code got the part of receiving the low and high bits right, but it just did an AND operation on them, which your PR fixes, it should indeed be an EQZ of an OR.
There was a problem hiding this comment.
The original code wanted to do (and (eqz $hi) (eqz $lo)) which is a valid translation of i64.eqz: if both 32-bit parts are equal to zero, the 64-bit integer is equal to zero. But the code as written was generating the above asm.js, which was wrong.
Even with the (eqz (or $hi $lo)) algorithm, if you change it to use a bare curr->value:
TempVar highBits = fetchOutParam(curr->value);
Block* result = builder->blockify(
builder->makeUnary(
EqZInt32,
builder->makeBinary(
OrInt32,
builder->makeGetLocal(highBits, i32),
curr->value
)
)
);
replaceCurrent(result);You get incorrect results, because the asm.js looks like:
function $5($0, $0$hi) {
$0 = $0 | 0;
$0$hi = $0$hi | 0;
var i64toi32_i32$0 = 0, $3 = 0, $4 = 0, $5 = 0, $6 = 0, $7 = 0, $8 = 0, $9 = 0, $10 = 0, $11 = 0;
$3 = i64toi32_i32$0;
i64toi32_i32$0 = $0$hi;
return ($3 | $0 | 0 | 0) == (0 | 0) | 0;
}which is pretty bogus.
There was a problem hiding this comment.
I think that's due to an ordering bug, see 3572e57 . The issue is that curr->value sets up both the low and high bits, so it needs to execute first before we use either. And in an or the order doesn't matter so we can just flip it.
Btw, I think it's easier to understand looking at the pass by itself as in that commit, and testing the pass directly, what do you think?
b7105d9 to
54d0118
Compare
| builder->makeUnary(EqZInt32, builder->makeGetLocal(highBits, i32)), | ||
| builder->makeUnary(EqZInt32, curr->value) | ||
|
|
||
| Block* result = builder->blockify( |
There was a problem hiding this comment.
no need for the block here, can just do auto* result = and then the makeUnary
There was a problem hiding this comment.
this kind of gets to my point, that when the low bits are just used once, the situation is much simpler - no need for a temp, no need for a block, which makes it shorter to write and at least in my opinion easier to understand.
but reading the other cases, i do see your point that in many or most you do need to read the low bits more than once, in which case for consistency maybe it's better for this to be that way too. i don't feel strongly either way, up to you.
54d0118 to
a219992
Compare
|
Looks good to me. Should we merge this or did you have more thoughts on refactoring? |
I have no additional thoughts on refactoring. I can see the point about shorter/simpler code though; I think there are fewer local variables generated with this version than my original blocked version. |
|
Ok, cool. Let's merge this. We can always improve things later if we find a good way. |
This series takes care of all the easy unary ops, fixing ones that were broken and adding ones that weren't implemented. Float conversions and bit reinterpretations are left for another day.
Passes
wasm2asmtests locally (node tests only, no spidermonkey).