Because of different alignment requirements and VEX encoding/VZEROUPPER woes, and because it would be so low-level and pervasive (we cannot gate every operation on R3Element behind a CPUID test), we need a separate DLL for that, with an intermediate layer that chooses which DLL to use based on CPUID.
We could generate the intermediate layer from journal.proto, and have it use LoadLibrary/GetProcAddress etc. to bind to the chosen DLL.
This blocks FMA usage on Linux & macOS, see #3010 (comment).
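For illustration, here is a minimal sketch of what such an intermediate layer might look like. It is not generated from journal.proto, and the DLL names (`physics_baseline.dll`, `physics_fma.dll`), the exported symbol `DotProduct`, and the helper names are all hypothetical. It only shows the general pattern: query CPUID once, pick a library, then resolve entry points with LoadLibrary/GetProcAddress.

```cpp
#include <cstdint>
#include <string>

#if defined(_WIN32)
#include <windows.h>
#endif
#if defined(_MSC_VER)
#include <immintrin.h>
#include <intrin.h>
#endif

// Hypothetical summary of the CPU features that matter for the choice.
struct CpuFeatures {
  bool avx;
  bool fma;
};

// Queries the CPU once.  Reports false for everything on targets where this
// sketch does not know how to ask.
inline CpuFeatures DetectCpuFeatures() {
  CpuFeatures features{false, false};
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
  int regs[4];
  __cpuid(regs, 1);
  std::uint32_t const ecx = static_cast<std::uint32_t>(regs[2]);
  // OSXSAVE (bit 27) must be set before XGETBV may be executed; XCR0 bits 1
  // and 2 say that the OS saves the XMM and YMM state.
  bool const osxsave = ((ecx >> 27) & 1) != 0;
  bool const os_saves_ymm = osxsave && (_xgetbv(0) & 0x6) == 0x6;
  features.avx = os_saves_ymm && ((ecx >> 28) & 1) != 0;
  features.fma = features.avx && ((ecx >> 12) & 1) != 0;
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  features.avx = __builtin_cpu_supports("avx") != 0;
  features.fma = __builtin_cpu_supports("fma") != 0;
#endif
  return features;
}

// Picks the library to load.  The names are placeholders.
inline std::string ChooseLibraryName(CpuFeatures const& features) {
  return features.fma ? "physics_fma.dll" : "physics_baseline.dll";
}

#if defined(_WIN32)
// Hypothetical exported function; a real layer would have one such pointer
// per entry point.
using DotProductFn = double (*)(double const*, double const*);

// Loads the chosen DLL and resolves one entry point.  Returns nullptr if the
// DLL or the symbol cannot be found.
inline DotProductFn LoadDotProduct(CpuFeatures const& features) {
  HMODULE const dll = ::LoadLibraryA(ChooseLibraryName(features).c_str());
  if (dll == nullptr) {
    return nullptr;
  }
  return reinterpret_cast<DotProductFn>(::GetProcAddress(dll, "DotProduct"));
}
#endif
```

The CPUID test happens once, at load time, so the code inside each DLL can be compiled with a single instruction set and needs no per-operation checks.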
Just curious: is this relevant only for Linux & macOS? Or is there something here that might also be useful as an alternative to FMA3 on Windows 10 machines with older chips, for users like me who are stuck (for a few more years) with Ivy Bridge processors that support AVX but not FMA3/AVX2?