New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some performance optimizations to dockerTools.build{,Layered}Image #87154
Conversation
…ing layers Appending to an existing tar archive repeatedly seems to be a quadratic operation, since tar seems to traverse the existing archive even using the `-r, --append` flag. This commit avoids that by passing the list of files to a single tar invocation.
Calculating the tarsum after creating a layer is inefficient, since we have to read the tarball we've just written from the disk. This commit simultaneously calculates the tarsum while creating the tarball.
Adding a little more context, some of the reason for the original slowness is the inefficiency of appending to TAR files - see http://tiamat.name/blogposts/fast-appending-files-to-tar-archive-is-impossible/ The portions of this change that eliminate the separate pass with We have a case with a client where a sizeable image built with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, lgtm.
# Compute a checksum of the tarball. | ||
tarhash=$(tarsum < $layerPath/layer.tar) | ||
--transform 's,^nix(/|$),/nix/,' \ | ||
--transform 's,^[^/],/nix/store/\0,rS' | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, it seems my sed
powers are now powerful enough ;)
- what is the purpose of
^[^/]
? Why^
is not sufficient? - I also don't understand why
rS
flags are added? What is the default behavior, especially regarding ther
flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems my sed powers are now powerful enough ;)
I think that's only because they're tar
's variant of sed
:).
what is the purpose of ^[^/]? Why ^ is not sufficient?
The transforms get applied one after another, and since the first one converts ^nix
to ^/nix
, I had to make sure that the second transform won't match to that. Otherwise that nix/
directory at the root would become /nix/store/nix
.
I also don't understand why rS flags are added? What is the default behavior, especially regarding the r flag?
The previous code was only replacing the exact store path(s,$n,/nix/store/$n
), but since we now archive multiple paths at the same time, we need to prepend /nix/store
to everything. But turns out those transformations are applied to symlink targets too, and even when they are relative (eg. ...libpng.so.1 -> libpng.so
would become ...libpng.so.1 -> /nix/store/libpng.so
). That S
flag disables transforming the symlinks. However, just setting S
overrides all of the default flags(rsh
is the default), so it doesn't apply the transformations to the regular files either. Only setting both r
and S
makes it so that the transformation is applied to regular files, but not to the symlinks.
Here's the doc I used: https://www.gnu.org/software/tar/manual/html_section/tar_51.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow... ok. Thanks for explanations!
# Compute a checksum of the tarball. | ||
tarhash=$(tarsum < $layerPath/layer.tar) | ||
--transform 's,^nix(/|$),/nix/,' \ | ||
--transform 's,^[^/],/nix/store/\0,rS' | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow... ok. Thanks for explanations!
@GrahamcOfBorg test docker-tools |
Motivation for this change
This PR aims to speed up creation of Docker images using the
dockerTools
utilities by reducing the disk I/O performed while creating layers.The first commit reduces the number of of
tar -r
invocations, and the second one calculatestarsum
's on the fly while creating the tarball. These two changes seem to make thebuildLayeredImage
function almost twice as fast when building large (> 4GB) images when disk I/O is the bottleneck (eg. on cloud machines).Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)