Here's an ugly result for BLAKE2 testing with Crypto++ and Botan on ARMv8/AArch64 with a Cortex-A57: the NEON implementation is slower than the plain C++ (CXX) one. The Cortex-A53 is OK, meaning NEON does not slow it down. The A53 runs at about the same speed for both CXX and NEON.
A57, Crypto++ (3 second benchmark):
- CXX implementation: 5.7 cpb (cycles per byte)
- NEON implementation: 12.6 cpb
A57, Botan (speed test, 3000 ms):
- CXX implementation: 315.197 MiB/sec (945.594 MiB in 3000.008 ms)
- NEON implementation: 148.028 MiB/sec (444.086 MiB in 3000.014 ms)
The astute reader will realize those numbers should be inverted :(
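To put a number on the regression, here is a minimal sketch that turns the figures above into slowdown factors. The helper names (`slowdown_from_cpb`, `slowdown_from_throughput`) are made up for illustration and are not part of Crypto++ or Botan. Note that cpb is lower-is-better while MiB/sec is higher-is-better, so the two ratios are formed in opposite directions. With the numbers above, NEON comes out roughly 2.2x slower in the Crypto++ run and roughly 2.1x slower in the Botan run.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helpers for illustration only; not a Crypto++ or Botan API.

// Cycles per byte: lower is better, so a result > 1 means NEON is slower.
inline double slowdown_from_cpb(double cxx_cpb, double neon_cpb) {
    assert(cxx_cpb > 0.0 && neon_cpb > 0.0);
    return neon_cpb / cxx_cpb;
}

// Throughput (MiB/sec): higher is better, so the ratio is flipped.
// A result > 1 again means NEON is slower.
inline double slowdown_from_throughput(double cxx_mib_s, double neon_mib_s) {
    assert(cxx_mib_s > 0.0 && neon_mib_s > 0.0);
    return cxx_mib_s / neon_mib_s;
}

// Prints the slowdown factors for the A57 measurements quoted in this post.
inline void print_a57_slowdowns() {
    std::printf("Crypto++ NEON slowdown: %.2fx\n",
                slowdown_from_cpb(5.7, 12.6));
    std::printf("Botan NEON slowdown:    %.2fx\n",
                slowdown_from_throughput(315.197, 148.028));
}
```

Calling `print_a57_slowdowns()` from `main` prints both factors. That the two libraries agree to within a few percent suggests the slowdown is tied to the A57 and the shared NEON approach rather than to one library's benchmark harness, though that is an inference from these two data points only.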