Skip to content

Avoid __popcnt and __popcnt64 intrinsics for MSC#2944

Merged
kripken merged 3 commits into
WebAssembly:masterfrom
MaxGraey:fix-2942
Jul 6, 2020
Merged

Avoid __popcnt and __popcnt64 intrinsics for MSC#2944
kripken merged 3 commits into
WebAssembly:masterfrom
MaxGraey:fix-2942

Conversation

@MaxGraey

@MaxGraey MaxGraey commented Jul 5, 2020

Copy link
Copy Markdown
Contributor

Potentially fix #2942

Unfortunately not but still relevant for other environments

@MaxGraey MaxGraey changed the title Avoid __popcnt and __popcnt64 intrinsics for MSVC Avoid __popcnt and __popcnt64 intrinsics for MSC Jul 5, 2020
@kripken

kripken commented Jul 5, 2020

Copy link
Copy Markdown
Member
  1. @RReverser perhaps you can check if this fixes that issue?

  2. But why do we think this might be the issue - is there a reason to suspect these intrinsics are broken on msvc?

@MaxGraey

MaxGraey commented Jul 5, 2020

Copy link
Copy Markdown
Contributor Author

But why do we think this might be the issue - is there a reason to suspect these intrinsics are broken on msvc?

Some of intrinsics like __popcnt or __lzcnt on MSC required cpuid check before usage. It's just direct asm injection I guess without any fallbacks for generic arch. But with _BitScanForward / _BitScanReverse we have different story and that's instructions should be hardware independent)

Comment thread src/ir/bits.h
// otherwise, see if adding one turns this into a 1-bit thing, 00011111 + 1
// => 00100000
if (PopCount(mask + 1) != 1) {
if (mask & (mask + 1)) {

@MaxGraey MaxGraey Jul 6, 2020

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(mask & (mask + 1)) != 0 is not fully equivalent to PopCount(mask + 1) != 1 because not handle case when mask == uint32_t(-1) but this already checked earlier so we can use more simplify version without this special case here

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. It would be good to fix the comment to reflect the new calculation, since it currently suggests that we are using popcount.

@tlively tlively left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if this doesn't fix the original issue, this still seems like a good change if __popcnt is not guaranteed to work for all CPUs on windows.

@kripken

kripken commented Jul 6, 2020

Copy link
Copy Markdown
Member

Sounds good, let's merge this then.

@RReverser if you get a chance, it would still be good to confirm if this fixes that issue.

@kripken kripken merged commit 18095a6 into WebAssembly:master Jul 6, 2020
@MaxGraey MaxGraey deleted the fix-2942 branch July 6, 2020 22:43
@RReverser

Copy link
Copy Markdown
Member

@kripken I already mentioned on the original issue that I tried it and it didn't change anything for me (neither size nor the reproduction): #2942 (comment)

@kripken

kripken commented Jul 6, 2020

Copy link
Copy Markdown
Member

Got it, thanks @RReverser

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misoptimization causing a runtime panic in Rust code

4 participants