Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gf2x: disable hardware-specific code #45299

Merged
merged 1 commit into from Aug 18, 2018
Merged

Conversation

timokau
Copy link
Member

@timokau timokau commented Aug 18, 2018

Motivation for this change

This was causing "Illegal instruction" issues in the ntl testsuite when
it was executed on a different cpu than the one gf2x was built on
(43679).

Thanks @grahamc @dezgeg and clever from IRC (don't know their github handle) for tracking this issue down!

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

This was causing "Illegal instruction" issues in the ntl testsuite when
it was executed on a different cpu than the one gf2x was built on
(43679).
@GrahamcOfBorg
Copy link

Success on x86_64-darwin (full log)

Attempted: gf2x

Partial log (click to expand)

make[2]: Entering directory '/private/tmp/nix-build-gf2x-1.2.drv-0/gf2x-1.2/tests'
make[2]: Nothing to be done for 'install-exec-am'.
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/private/tmp/nix-build-gf2x-1.2.drv-0/gf2x-1.2/tests'
make[1]: Leaving directory '/private/tmp/nix-build-gf2x-1.2.drv-0/gf2x-1.2/tests'
post-installation fixup
strip is /nix/store/bg39mxc401qqxgjasxfgardi6xl2ikmc-cctools-binutils-darwin/bin/strip
stripping (with command strip and flags -S) in /nix/store/7pvdvljlhmzm0m95n62h9qm2sdri0bmc-gf2x-1.2/lib
patching script interpreter paths in /nix/store/7pvdvljlhmzm0m95n62h9qm2sdri0bmc-gf2x-1.2
/nix/store/7pvdvljlhmzm0m95n62h9qm2sdri0bmc-gf2x-1.2

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: gf2x

Partial log (click to expand)

make[2]: Leaving directory '/build/gf2x-1.2/tests'
make[1]: Leaving directory '/build/gf2x-1.2/tests'
post-installation fixup
shrinking RPATHs of ELF executables and libraries in /nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2
shrinking /nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2/lib/libgf2x.so.1.0.2
strip is /nix/store/4md2i310zklkkl5j41yw70gcwgn4kav5-binutils-2.30/bin/strip
stripping (with command strip and flags -S) in /nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2/lib
patching script interpreter paths in /nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2
checking for references to /build in /nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2...
/nix/store/dwy361jcqkw4syyzyrqbfgwhjs16yvda-gf2x-1.2

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: gf2x

Partial log (click to expand)

make[2]: Leaving directory '/build/gf2x-1.2/tests'
make[1]: Leaving directory '/build/gf2x-1.2/tests'
post-installation fixup
shrinking RPATHs of ELF executables and libraries in /nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2
shrinking /nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2/lib/libgf2x.so.1.0.2
strip is /nix/store/553rihc190vsyy8b22iqcq25a6489h8y-binutils-2.30/bin/strip
stripping (with command strip and flags -S) in /nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2/lib
patching script interpreter paths in /nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2
checking for references to /build in /nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2...
/nix/store/iih9w4f49azan2md11il9qk3qys5wb8c-gf2x-1.2

@timokau timokau merged commit fe0c07a into NixOS:master Aug 18, 2018
@timokau timokau deleted the gf2x-purity branch August 18, 2018 12:00
@vcunat
Copy link
Member

vcunat commented Aug 18, 2018

The 1.1 version in 18.03 is missing this configure option, so it's probably just fine.

@timokau
Copy link
Member Author

timokau commented Aug 19, 2018

The issue remained undetected until I enabled the tests for ntl. It also requires a specific builder combination. So it is probably present in 18.03.

@timokau
Copy link
Member Author

timokau commented Aug 19, 2018

I think the configure option was just added later when people complained. See https://trac.sagemath.org/ticket/15316.

@vcunat
Copy link
Member

vcunat commented Aug 19, 2018

Well, 1.2 could be backported with this fix, if that's better. Though I've seen noone complaining about a problem on 18.03...

@timokau
Copy link
Member Author

timokau commented Aug 19, 2018

Yeah I think the problem would only occur when the wrong builder picks up gf2x and then only when the users machine is pretty old. Given the short remaining lifetime of 18.03, I'm not sure if a backport is worth it. In either case I recently dislocated my shoulder and won't be able to do it myself.

vcunat pushed a commit to vcunat/nixpkgs that referenced this pull request Aug 20, 2018
(cherry picked from commit fe0c07a)
On nixpkgs master we were running into "Illegal instruction" problems,
and apparently the same might happen on 18.03 (without this cherry-pick).
@vcunat
Copy link
Member

vcunat commented Aug 20, 2018

Hmm, cherry-picks themselves were easy enough, getting the same expression as on nixpkgs master, but the one adding this option resulted into a build error, so I dropped the effort for now :-/

In file included from check_small_size.c:7:0:
../lowlevel/mul2t.c:43:2: error: #error "This code needs sse-2 support"
 #error "This code needs sse-2 support"

https://github.com/vcunat/nixpkgs/commits/p/gf2x-18.03

@timokau
Copy link
Member Author

timokau commented Aug 20, 2018

During the gf2x-1.2 build? That is very weird, since it has basically no dependencies and should behave just as on master. Does it build on master?

@vcunat
Copy link
Member

vcunat commented Aug 20, 2018

Yes, it's very weird; master works both locally and on Hydra. I now tried on two machines (AMD and Intel), both are sandboxed x86_64 NixOS.

@timokau
Copy link
Member Author

timokau commented Aug 20, 2018

Maybe 18.03 has a different gcc version? There was some discussion about weather or not "no hardware specific codes" should include sse2. I think they ended up disabling it but someone said that at least debians compiler made use of sse2 anyways IIRC.

@vcunat
Copy link
Member

vcunat commented Aug 21, 2018

I had checked that, and it's 7.3.0 for both.

@timokau
Copy link
Member Author

timokau commented Aug 21, 2018

Thats really weird, no idea what other stdenv change might cause this. Apparently there also is a --disable-sse2 flag, you might try adding that one too.

@vcunat
Copy link
Member

vcunat commented Aug 21, 2018

x86_64 implies SSE2 support in principle.

@timokau
Copy link
Member Author

timokau commented Aug 21, 2018

Yes, but I thought it may be worth a try anyways since the problem appears to be with sse2 support.

@vcunat
Copy link
Member

vcunat commented Aug 21, 2018

It didn't make the error go away. BTW, I expect there are a few more flags to disable by default, right?

  --enable-sse3           Turn on sse-3 code (default is yes)
  --enable-ssse3          Turn on ssse3 code (default is yes)
  --enable-sse41          Turn on sse-4.1 code (default is yes)
  --enable-pclmul         Turn on pclmul code (default is yes)

@timokau
Copy link
Member Author

timokau commented Aug 21, 2018

Thats a shame. As far as I understand, --disable-hardware-specific-codes is intended to automatically disable all of them.

It is probably a good idea to involve upstream here.

@timokau
Copy link
Member Author

timokau commented Aug 21, 2018

@emmanuelthome, you opened the upstream ticket about --disable-hardware-specific-codes. Do you have any idea what may cause the error described here?

@emmanuelthome
Copy link

emmanuelthome commented Sep 4, 2019

Hi. Sorry I haven't touched gf2x for a (long) while.

I understand you followed the path of using released gf2x-1.2 + cherry-picking some patches.
Which ones exactly ?

Do you have a test procedure to validate a new tarball of gf2x, so as to see if it fixes the problems you encounter ?

In the git version, I really don't see how mul2t.c could be compiled in if you passed --disable-hardware-specific-code ; FWIW it's spelled without an s, but I assume that was just a typo in your comment.

@timokau
Copy link
Member Author

timokau commented Sep 4, 2019

We're using released 1.2 without any patches. We don't test gf2x directly, only indirectly through the tests of reverse-dependencies.

However it has been a year and I don't quite remember the details of this problem.

@emmanuelthome
Copy link

ok. I'd like to know if there's any misbehaviour identified in this bug report that I must make sure to fix before I release a new version.
What I see is that under circumstances that are not clear to me, you get a compilation error with mul2t complaining about missing sse2 support, even though you passed --disable-hardware-specific-code (whence mul2t should not be a compiled unit at all).
If this happens with released gf2x-1.2, then I don't see how this can happen, and I can't reproduce it.

@vcunat
Copy link
Member

vcunat commented Sep 5, 2019

@emmanuelthome: fortunately, git+nix give very good reproducibility, so I did the same as described in my comment. Log: n07pqhjv2jj1vkkrybay8pxzc3h0zynf-gf2x-1.2.log, and I can easily retry the same with a different tarball/version.

@emmanuelthome
Copy link

Thanks. Then yes, it's a bug in 1.2 ; fixed by later commits (58beff3 -- unfortunately no further release ever happened), and also in the master branch.

I'm going to release the master branch (9c3a76f passes all tests), pending maybe a few last-minute fixes if something pops up (nothing comes to mind, but I've got a few further integration tests to make).

If you could give a go to git version 9c3a76f and confirms that things work well, that would be great,

@vcunat
Copy link
Member

vcunat commented Sep 21, 2019

Possible deadlock detected; I'm sorry, I was confused apparently.

It's interesting that in current nixpkgs the error doesn't happen for me anymore, even though gf2x wasn't touched (gcc upgrade or something?). Still on that previous nixpkgs commit I do get that error, and when I use gf2x master (b7a5f021) there, that error goes away. I didn't try to understand the whys.

@emmanuelthome
Copy link

FYI I released gf2x-1.3.0 yesterday.

@timokau
Copy link
Member Author

timokau commented Dec 11, 2019

Thanks for the heads-up! Unfortunately NTL is not yet compatible with the new release, so we'll have to wait a bit. I've contacted the NTL author via email.

Keep in mind that updating gf2x for nix (and probably other distros) would be a lot smoother if there were a predictable download link. In that case our automated tooling could suggest an update and we would just need to approve it.

@emmanuelthome
Copy link

emmanuelthome commented Dec 11, 2019 via email

@vcunat
Copy link
Member

vcunat commented Dec 11, 2019

GitLab is very nice.

timokau added a commit to timokau/nixpkgs that referenced this pull request Jan 2, 2020
Switch source to fetchgit since it removes the requirements to manually
find the correct link for each version.

More information at
NixOS#45299
@timokau
Copy link
Member Author

timokau commented Jan 2, 2020

The NTL author Victor Shoup has pushed a fix. Continuation in #76829. I've fixed the URL problem for now by just checking out the sources directly from git.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants