Skip to content

Improve ByteBufUtil#lastIndexOf#13942

Merged
chrisvest merged 1 commit intonetty:4.1from
jchrys:4.1-bytebuf-last-index
Apr 8, 2024
Merged

Improve ByteBufUtil#lastIndexOf#13942
chrisvest merged 1 commit intonetty:4.1from
jchrys:4.1-bytebuf-last-index

Conversation

@jchrys
Copy link
Copy Markdown
Contributor

@jchrys jchrys commented Mar 30, 2024

Motivation:
The performance of #lastIndexOf could be enhanced by applying SWAR.

Modification:
Utilized SWARUtil for byte search.

Result:
Enhanced performance.

@jchrys jchrys force-pushed the 4.1-bytebuf-last-index branch 3 times, most recently from 9643584 to a9c4bf6 Compare March 30, 2024 19:16
@jchrys
Copy link
Copy Markdown
Contributor Author

jchrys commented Mar 30, 2024

Benchmark result on below env shows max 83% performance boost.
1X10X2, Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz, openjdk 17.0.8 2023-07-18, Ubuntu 22.04.3 LTS, tuend network low-latency, no turbo boost.

Benchresult

@jchrys jchrys marked this pull request as ready for review March 30, 2024 19:53
@jchrys jchrys force-pushed the 4.1-bytebuf-last-index branch from a9c4bf6 to ff4d295 Compare March 30, 2024 19:55
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While searching backward, we need to check the last occurrence of the needle in the long batch, which means basically working with the opposite endianness. I don't see any mention about it (in a comment too)

Copy link
Copy Markdown
Contributor Author

@jchrys jchrys Mar 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@franz1981
I added the comment.

                if (result != 0) {
                    // used the oppoiste endianness since we are looking for the last index.
                    return offset - 1 - SWARUtil.getIndex(result, !isNative);
                }

@jchrys jchrys force-pushed the 4.1-bytebuf-last-index branch 3 times, most recently from 54fefbb to f857129 Compare March 30, 2024 21:10
Copy link
Copy Markdown
Member

@chrisvest chrisvest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AbstractByteBufTest.testSWARIndexOf only covers forward searching. Please add test coverage for backward searching as well.

Comment thread buffer/src/main/java/io/netty/buffer/ByteBufUtil.java Outdated
return -1;
}

private static int unrolledLastIndexOf(final AbstractByteBuf buffer, final int fromIndex, final int byteCount,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you compare this unrolledLastIndexOf to calling linearLastIndexOf with adjusted range?

Copy link
Copy Markdown
Contributor Author

@jchrys jchrys Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous research has shown that manually unrolled loops improves performance for size=7 benchmark case(#10737 (comment)).

I will add an updated comparison.

Copy link
Copy Markdown
Contributor Author

@jchrys jchrys Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual unrolling results in better performance compared to a linear approach. (size > 1)

1X10X2, Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz, openjdk 17.0.8 2023-07-18, Ubuntu 22.04.3 LTS, tuend network low-latency, no turbo boost.
benchmark

linear benchmark source code
manual unroll benchmark source code

Motivation:
The performance of `#lastIndexOf` could be enhanced by applying SWAR.

Modification:
Utilized `SWARUtil` for byte search.

Result:
Enhanced performance.
@jchrys jchrys force-pushed the 4.1-bytebuf-last-index branch from 98a2c2a to b39fa54 Compare April 7, 2024 06:33
@jchrys jchrys requested review from chrisvest and franz1981 April 7, 2024 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants