
Remove limits on output file size. #256

Merged
merged 3 commits into NixOS:master on Sep 13, 2021

Conversation

andrewla
Contributor

@andrewla andrewla commented Dec 5, 2020

The use of self-pointers into a vector<> meant that the vector could never be relocated, so any growth in the file had to fit within the initial reservation.

Now, instead of storing a pointer to the header and a pointer to the contents, those addresses are computed on demand from the underlying vector.
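
Roughly, the shape of the change is as follows (an illustrative sketch with simplified stand-in names, not the actual patchelf.cc code):

// Sketch only: ElfFileSketch and its members are simplified stand-ins, not the real patchelf types.
#include <cstddef>
#include <vector>
#include <elf.h>

struct ElfFileSketch {
    std::vector<unsigned char> fileContents;   // may reallocate as the file grows

    // Before: a cached Elf64_Ehdr * member would dangle once fileContents reallocated.
    // After: recompute the header address from the vector on every access.
    Elf64_Ehdr * hdr() {
        return reinterpret_cast<Elf64_Ehdr *>(fileContents.data());
    }

    void growFile(std::size_t newSize) {
        if (newSize > fileContents.size())
            fileContents.resize(newSize, 0);   // safe: nothing holds pointers into the old buffer
    }
};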

I made an attempt at writing a test that would provide coverage, but couldn't make it work. If you have suggestions for writing a test that exercises this functionality, please let me know and I'll give it a shot.

@andrewla
Contributor Author

andrewla commented Feb 2, 2021

Fixes #88
Fixes #62

@domenkozar
Member

@andrewla I know this is probably not helpful, but PRs with tests have a higher chance of getting merged.

I don't have any ideas for how to create a dummy large output, though.

@andrewla
Contributor Author

andrewla commented Feb 5, 2021

I'm going to put this on hold. As a stress test, I used this as an overlay for patchelf in nix, and I'm getting errors trying to build gcc.

It's not clear whether this is patchelf-related or some configuration problem, but for the moment I don't consider this safe to merge.

@dstahlke

FWIW, this patch seems to work for me. I tried it, and also tried just boosting the buffer size to 256MB; the output of each method was identical. Unfortunately I am not able to share my test case, for IP reasons. I haven't reviewed the changes to make sure there are no lingering pointers, but at least it passes valgrind.

@andrewla
Contributor Author

I'm all but positive that the errors encountered are related to how nix does bootstrapping of the compiler, rather than related to this change specifically.

@Mic92
Member

Mic92 commented Aug 21, 2021

Hey. I recently joined the patchelf maintainer team; if you could resolve the merge conflicts, I'll give this a test.

@andrewla
Contributor Author

andrewla commented Sep 3, 2021

I applied the latest changes. I get a strange test failure locally because it can't find the "strip" utility, but that appears to be an autoconf issue that is probably local to my machine; running strip manually (to produce libbig-dynstr.debug) makes the test pass.

@Mic92
Member

Mic92 commented Sep 4, 2021

Does make STRIP=strip fix that? It might be because of

STRIP ?= strip

@Mic92
Member

Mic92 commented Sep 4, 2021

So one way to test this is to link in a large binary blob (which can be created with dd): https://balau82.wordpress.com/2012/02/19/linking-a-binary-blob-with-gcc/ and then create a case where this section needs to be replaced.
I am currently working on a deadline for a different project, so I might look at this later unless you come up with a test first.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-work-around-patchelf-maximum-file-size-exceeded-error/14920/2

@trofi

trofi commented Sep 8, 2021

I think the problem is not the original binary size, but the limited growth that patchelf allows for. We can pick a tiny hello world and try to add 32MB of data to it. Say, 32MB of NEEDED libraries:

$ nix-shell -p patchelf

# very clumsy way to achieve many long "--add-needed AAAAAAAAA...." strings.
[nix-shell:/tmp]$ echo 'int main(){}' | gcc -x c - -o a -fPIC -pie; ./a; while patchelf --add-needed $(perl -e 'print "a" x 100000;') --add-needed $(perl -e 'print "b" x 100000;') --add-needed $(perl -e 'print "c" x 100000;') --add-needed $(perl -e 'print "d" x 100000;') --add-needed $(perl -e 'print "e" x 100000;') --add-needed $(perl -e 'print "f" x 100000') --add-needed $(perl -e 'print "g" x 100000') --add-needed $(perl -e 'print "h" x 100000') --add-needed $(perl -e 'print "i" x 100000') --add-needed $(perl -e 'print "j" x 100000') --add-needed $(perl -e 'print "k" x 100000') --add-needed $(perl -e 'print "l" x 100000') --add-needed $(perl -e 'print "m" x 100000') --add-needed $(perl -e 'print "m" x 100000') --add-needed $(perl -e 'print "o" x 100000') --add-needed $(perl -e 'print "p" x 100000') --add-needed $(perl -e 'print "q" x 100000') --add-needed $(perl -e 'print "r" x 100000') --add-needed $(perl -e 'print "s" x 100000') --add-needed $(perl -e 'print "t" x 100000') a; do echo again; done
again
again
again
again
again
again
again
again
again
again
again
again
again
again
again
again
again
patchelf: maximum file size exceeded

@samuela
Member

samuela commented Sep 8, 2021

Ran into this issue while attempting to package tensorflow 2.6.0:

Rewriting #!/nix/store/1g54cwl6m78rgrqsnn1pp29dbvldpvsd-python3-3.9.6/bin/python3.9 to #!/nix/store/1g54cwl6m78rgrqsnn1pp29dbvldpvsd-python3-3.9.6
wrapping `/nix/store/hcghxmsg8smr0fmdpq9nr2pfifgcz586-python3.9-tensorflow-gpu-2.6.0/bin/tflite_convert'...
patchelf: maximum file size exceeded
builder for '/nix/store/gkp1bdbkya9p5zgxrbbp70qks31zmsz3-python3.9-tensorflow-gpu-2.6.0.drv' failed with exit code 1

@samuela
Member

samuela commented Sep 8, 2021

Oddly enough, I've not been able to reproduce the failure outside of the nix build environment; I've been trying to provide a test case, but patchelf isn't failing outside of nix-build. In my postFixup:

for lib in ...; do
      echo "processing patchelf"
      cp "$lib" "$lib.orig"
      echo "$lib"
      echo $(patchelf --print-rpath "$lib")
      patchelf --set-rpath "${cudatoolkit}/lib:${cudatoolkit.lib}/lib:${cudnn}/lib:${nccl}/lib:$(patchelf --print-rpath "$lib")" "$lib"
done

And I get output:

processing patchelf
/nix/store/i5ywd496x0ppp1lifkfwncsmixdsvm9s-python3.9-tensorflow-gpu-2.6.0/lib/python3.9/site-packages/tensorflow/python/_pywrap_toco_api.so
/run/opengl-driver/lib:$ORIGIN/../../_solib_local/_U_S_Stensorflow_Spython_C_Upywrap_Utoco_Uapi.so___Utensorflow:$ORIGIN/../../_solib_local/_U_S_Stensorflow_Spython_C_Upywrap_Utensorflow_Uinternal_Ulinux___Utensorflow_Spython:$ORIGIN/:$ORIGIN/..:/nix/store/g3h82za6g1j3gwz1br0spzpx1f20hc3c-tensorflow-gpu-2.6.0/lib:/nix/store/9bh3986bpragfjmr32gay8p95k91q4gy-glibc-2.33-47/lib:/nix/store/ql152marawkw62kvkjskndxccffdfns1-gcc-9.3.0-lib/lib
patchelf: maximum file size exceeded
exitHandler
_callImplicitHook 0 failureHook
builder for '/nix/store/1cvkfzsnpiz31qp0wpimqgiqhgj95xf3-python3.9-tensorflow-gpu-2.6.0.drv' failed with exit code 1

(You can find the full code to reproduce in NixOS/nixpkgs#137069, nix-build -A tensorflowWithCuda.)

But when I try to test patchelf outside of nix-build:

lib="/nix/store/i5ywd496x0ppp1lifkfwncsmixdsvm9s-python3.9-tensorflow-gpu-2.6.0/lib/python3.9/site-packages/tensorflow/python/_pywrap_toco_api.so.orig"
cp "$lib" /tmp/deleteme.so
patchelf --set-rpath "/nix/store/7z8w0varw65bfvgqb82lm4isggvfap23-cudatoolkit-11.0.3/lib:/nix/store/94nh9ffcg1fxx9915nqyfv1928q3k1g0-cudatoolkit-11.0.3-lib/lib:/nix/store/hcddc6qcb7mdv61qd4i5cgjz0yb43cf6-cudatoolkit-11.0-cudnn-8.1.1/lib:/nix/store/mqnx8hl70sras9szypmbn6y8bq6fcb92-nccl-2.7.8-1-cuda-11.2/lib:$(patchelf --print-rpath "/tmp/deleteme.so")" "/tmp/deleteme.so"

But then... it works oddly enough 😕

But @trofi's test case looks simpler anyhow!

samuela added a commit to samuela/nixpkgs that referenced this pull request Sep 8, 2021
@samuela
Member

samuela commented Sep 9, 2021

Ok, I tried running this patched version of patchelf on my build, and I'm still seeing the same error. So either there's a bug in this PR, or I'm experiencing a separate failure mode altogether. I find the bug more likely, since the error message is exactly the same as what's reported here.

Here's a branch with a full reproduction: https://github.com/samuela/nixpkgs/tree/samuela/pr-256-does-not-work. Just run nix-build -A python3Packages.tensorflowWithCuda to see the failure. Beware that building tensorflow is not for the faint of heart. I would recommend leveraging existing builds from my public cachix cache: https://app.cachix.org/cache/ploop. This way you can avoid having to build tensorflow from scratch entirely.

EDIT: I'm thoroughly confused how I'm still seeing this error message at all considering the word "maximum" doesn't even show up anywhere in patchelf.cc after this commit...

EDIT 2: Mystery solved... the subtlety was that it was failing in addOpenGLRunpath, not in the explicit patchelf call. The .so that it's failing on is 781M, but I'd be happy to upload it if someone thinks this test case would still be useful somehow.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-work-around-patchelf-maximum-file-size-exceeded-error/14920/3

@Mic92
Member

Mic92 commented Sep 9, 2021

> I think the problem is not the original binary size, but the limited growth that patchelf allows for. We can pick a tiny hello world and try to add 32MB of data to it. [...]

Thanks. I'll try to add this test case as soon as I can.

@andrewla
Contributor Author

andrewla commented Sep 10, 2021

To be honest I'm surprised that the test works, because each resize is for the same amount, which is less than 32MB. I guess the shifting or alignment makes it a cumulative effect.

Regardless, it takes a long time for this to run, and the produced executable doesn't run because the needed libraries aren't found.

If we use add-rpath instead, we get the same effect (overflow), and we can run the produced executable, but we can only add one rpath per command line so it's a little less efficient.

I'm considering a patch that would allow text arguments to be prefixed with @ to indicate that the argument should be read from a file instead of directly from the command line; that way we can just pass in a 40MB rpath string in one go.
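
For illustration, expanding such an @-prefixed argument could look roughly like this (a sketch of the idea only; the actual syntax and implementation live in #316):

// Sketch of the proposal only; expandArg is a hypothetical helper, see #316 for the real behaviour.
#include <fstream>
#include <sstream>
#include <string>

// If an argument starts with '@', read its real value from the named file.
static std::string expandArg(const std::string & arg) {
    if (arg.empty() || arg[0] != '@')
        return arg;
    std::ifstream in(arg.substr(1));
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}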

@andrewla
Contributor Author

Created #316 for the @file syntax. That makes testing this much simpler, and is also nice because these rpaths can get crazy.

@Mic92
Member

Mic92 commented Sep 11, 2021

The @-pr was merged.

@andrewla
Contributor Author

andrewla commented Sep 13, 2021

Added a test that attempts to grow the file. I've confirmed that this fails on the current master:

$ git checkout origin-master
...
$ git cherry-pick f9ebafc7a7cb964af62d53bf3ae8e7c6c20b7d1a
...
$ make
$ cd tests
$ ./grow-file.sh 
patchelf: maximum file size exceeded

@Mic92
Member

Mic92 commented Sep 13, 2021

We now also build with clang for regression testing. I fixed the tests for that.

@Mic92 Mic92 merged commit f40a896 into NixOS:master Sep 13, 2021
@domenkozar domenkozar mentioned this pull request Sep 13, 2021
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-work-around-patchelf-maximum-file-size-exceeded-error/14920/5
