Use intrinsics for bit twiddling? #7042
Comments
Remember that it's got to be supported by all platforms, but I don't see why not otherwise |
We have lots of legacy code that was never updated with newer features, so if you have some improvements that
then go ahead |
Hmm... well I would need three times the existing amount of code (one for MSVC, one for GCC/Clang, and one for fallback/unrecognized compiler)
Definitely won't be worse
Won't be incompatible, because I can keep the existing code as a fallback for unrecognized compilers. |
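The three-way split mentioned above (MSVC, GCC/Clang, and a fallback for unrecognized compilers) could look roughly like this for a population count. This is only a minimal sketch, not the code in bitmath_func.hpp: the name CountBits32 is hypothetical, and the MSVC branch assumes an x86/x64 target whose CPU supports the POPCNT instruction, which __popcnt does not check for.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && !defined(__clang__) && (defined(_M_IX86) || defined(_M_X64))
#    include <intrin.h>
#endif

// Hypothetical helper for illustration: counts the set bits in a 32-bit value.
inline unsigned CountBits32(std::uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    // GCC/Clang builtin; the compiler picks an instruction or a software routine.
    return static_cast<unsigned>(__builtin_popcount(x));
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    // MSVC intrinsic; assumes the CPU actually has POPCNT (not verified here).
    return static_cast<unsigned>(__popcnt(x));
#else
    // Portable fallback: clear the lowest set bit until none remain.
    unsigned count = 0;
    for (; x != 0; x &= x - 1) count++;
    return count;
#endif
}
```

Because the fallback branch is kept, a compiler that matches neither check still builds the same behaviour, only with a loop that runs once per set bit.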
Well, you don't necessarily need to fulfill all of these; you will just have more arguing to do. When you deviate from one, you must argue that it is better on another one. |
Before embarking on using intrinsics, I would look at the code generated by common compilers to check if they might recognize the pattern and generate identical code. |
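As an illustration of the kind of pattern a compiler may recognize, here is a portable rotate written with plain shifts. It is a sketch under assumptions: RotateLeft32 is a hypothetical name, and whether a given compiler and optimization level turns it into a single rotate instruction should be confirmed by inspecting the generated assembly rather than taken for granted.

```cpp
#include <cstdint>

// Hypothetical helper for illustration: rotates x left by n bits.
// The shift counts are masked so that n == 0 or n >= 32 never shifts by 32,
// which would be undefined behaviour for a 32-bit operand.
inline std::uint32_t RotateLeft32(std::uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}
```

If the compiler output for this form already matches what an intrinsic would produce, the intrinsic adds compiler-specific code without a speed benefit.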
@btzy There are SIMD wrapper libraries to simplify this task. They use intrinsics if possible and provide fallbacks when necessary. |
@fsimonis I don't think we are looking at SIMD intrinsics like SSE/AVX here. Instead, we are looking at operations like popcount and find-first-set/find-last-set, which have existed for a very long time (Intel i386 and ARM both have such instructions) and operate on a single 32- or 64-bit integer. |
My mistake. I thought I read about mask operations representing SIMD versions of Intel's _BitScanForward and _BitScanReverse. I'll get back to you if I find something suitable. |
I tried implementing I also tried implementing For testing I used ProZone Game 13, which has 2400 trains running, and Wentboune Transport, which has 4833 trains, 5499 rvs, 2818 ships, 749 aircraft. |
Would it still be reasonable to change the code to use intrinsics then (perhaps on grounds that intrinsics might be clearer to the reader than the existing algorithms)? Also, my PR #7080 uses |
This issue has been automatically marked as stale because it has not had any activity in the last two months. |
I noticed that in bitmath_func.hpp there is some code at the bottom that uses compiler intrinsics. However, there are other functions in that file that are not using intrinsics but could be made faster with intrinsics, e.g. FindFirstBit, FindLastBit, and CountBits. Using intrinsics would lead to an algorithmic speedup for these three functions. Things like ROL and ROR would also benefit from intrinsics, but there will not be any improvement in time complexity.

Would intrinsics be preferred here? I can modify this file to use compiler intrinsics where available if the consensus is to use intrinsics.
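To make the proposal concrete, a bit-scan pair using intrinsics where available might be sketched as follows. The names FindFirstBit32 and FindLastBit32 are hypothetical and are not the existing functions in bitmath_func.hpp; the sketch also assumes the caller never passes zero, since the GCC/Clang builtins are undefined for a zero argument.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && !defined(__clang__)
#    include <intrin.h>
#endif

// Hypothetical helper: index of the lowest set bit. Precondition: x != 0.
inline unsigned FindFirstBit32(std::uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<unsigned>(__builtin_ctz(x));
#elif defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, x);
    return static_cast<unsigned>(index);
#else
    // Portable fallback: shift right until the lowest bit is set.
    unsigned pos = 0;
    while ((x & 1u) == 0) { x >>= 1; pos++; }
    return pos;
#endif
}

// Hypothetical helper: index of the highest set bit. Precondition: x != 0.
inline unsigned FindLastBit32(std::uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return 31u - static_cast<unsigned>(__builtin_clz(x));
#elif defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, x);
    return static_cast<unsigned>(index);
#else
    // Portable fallback: count how many shifts empty the value.
    unsigned pos = 0;
    while ((x >>= 1) != 0) pos++;
    return pos;
#endif
}
```

For example, FindFirstBit32(0x18) yields 3 and FindLastBit32(0x18) yields 4, since 0x18 has bits 3 and 4 set.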