importPaths: Don't copy imported NAR into memory. #2206
Conversation
Fixes `error: out of memory` of `nix-store --serve --write` when receiving packages via SSH (and perhaps other sources). See NixOS#1681 NixOS#1969 NixOS#1988 NixOS/nixpkgs#38808.

Performance improvement on `nix-store --import` of a 2.2 GB cudatoolkit closure:

When the store path already exists:

```
Before: 10.82user 2.66system 0:20.14elapsed 66%CPU (0avgtext+0avgdata 12556maxresident)k
After:  11.43user 2.94system 0:16.71elapsed 86%CPU (0avgtext+0avgdata 4204664maxresident)k
```

When the store path doesn't yet exist (after `nix-store --delete`):

```
Before: 11.15user 2.09system 0:13.26elapsed 99%CPU (0avgtext+0avgdata 4204732maxresident)k
After:  5.27user 1.48system 0:06.80elapsed 99%CPU (0avgtext+0avgdata 12032maxresident)k
```

The reduction is 4200 MB -> 12 MB RAM usage, and it also takes less time.
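The core of the change is to stop accumulating the whole NAR in a memory buffer before importing it. The difference can be sketched like this (a Python illustration only, not Nix's actual C++ code; the function names and chunk size are made up for the example). Peak memory for the streamed version is bounded by the chunk size, regardless of how large the NAR is:

```python
import io

CHUNK_SIZE = 64 * 1024  # illustrative; peak memory of streaming is O(CHUNK_SIZE)

def import_buffered(src, dst):
    # Old behaviour (sketch): read the entire NAR into memory first.
    # For a 4 GB closure this allocates ~4 GB before a single byte is imported.
    data = src.read()
    dst.write(data)

def import_streamed(src, dst):
    # New behaviour (sketch): copy chunk by chunk, never holding the full NAR.
    while True:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:
            break
        dst.write(chunk)
```

This is why `maxresident` drops from gigabytes to megabytes: the working set no longer scales with the size of the imported closure.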
Convenience override to try this patch out (stick it into the machine config of the target machine, e.g. I put it in …):

```nix
{
  # Patch nix to avoid problem running out of RAM when sending closures to the machine.
  # See
  # https://github.com/NixOS/nix/issues/1681
  # https://github.com/NixOS/nix/issues/1969
  # https://github.com/NixOS/nixpkgs/issues/38808
  nixpkgs.config.packageOverrides = pkgs: {
    nix = pkgs.nixUnstable.overrideAttrs (oldAttrs: {
      src = pkgs.fetchFromGitHub {
        owner = "NixOS";
        repo = "nix";
        rev = "54b1c596435b0aaf3a2557652ad4bf74d5756514";
        sha256 = "0g7knsfj445r50rk0d9hm5n1pv20k542bz6xf5c47qmkgvfa40x4";
      };
      patches = [
        (pkgs.fetchpatch {
          url = "https://github.com/nh2/nix/commit/d31a4410d92790e2c27110154896445d99d7abfe.patch";
          sha256 = "08gcw2xw8yc61zz2nr1j3cnd6wagp5qs02mjfazrd9wa045y26hg";
        })
      ];
      # Changes cherry-picked from upstream nix `release-common.nix` that
      # aren't in `pkgs.nixUnstable` yet:
      buildInputs = oldAttrs.buildInputs ++ [
        (pkgs.aws-sdk-cpp.override {
          apis = ["s3" "transfer"];
          customMemoryManagement = false;
        })
      ];
    });
  };
}
```
Once this is fixed, ideally somebody would also update the 18.03 EC2 images so that people can actually use them as a base on EC2 machines with <= 4 GB RAM.
This doubles the amount of I/O and disk space needed on the target machine.
Yes, based on the idea that most machines have more than 2x as much disk as they have RAM (especially when they have a …).

In terms of IO, my measurements show it to be faster overall; I'm not sure how that happens. But in general, for machines with reasonable RAM, all the data written to disk will be in the OS buffer cache anyway (so it's practically in RAM if you have enough to store it there).

I think the best solution would be to stream the incoming data right into its final location in the nix store, but that was beyond my time allocation for this fix to make it work at all.
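The "stream right into its final location" idea could look roughly like the following (a Python sketch under the assumption that a temp-file-plus-atomic-rename scheme is acceptable for the store; the function name, chunk size, and temp-file prefix are all invented for illustration). It avoids the second full copy that going through a separate on-disk NAR file implies, while still never exposing a partially written path:

```python
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # illustrative chunk size

def stream_to_store(src, final_path):
    # Write chunks to a temporary file in the same directory as the
    # destination, then rename it into place. rename() is atomic on POSIX,
    # so readers never observe a half-written store path.
    store_dir = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=store_dir, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                out.write(chunk)
        os.rename(tmp_path, final_path)
    except BaseException:
        os.unlink(tmp_path)  # don't leave partial files behind on failure
        raise
```

The temp file must live on the same filesystem as the destination for the rename to stay atomic, which is why it is created inside the store directory rather than in `/tmp`.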
@edolstra, in my opinion doubling the necessary disk storage in exchange for constant-memory operations is absolutely the correct trade-off. Disk space is generally more plentiful than memory and is far easier to provision. Even in the short nine months I have used …
Do we know if/when this is getting merged?
It's worth considering that while reducing disk space by stripping down installed packages is possible as an end user, this isn't really true for memory usage; it's very easy to hit the limit even when installing relatively few things. This suggests that the disk-space-instead-of-RAM tradeoff is worth it.
FWIW, this allowed a build to complete when it otherwise couldn't here. The disk-space-for-RAM tradeoff is important when dealing with big (gigabytes) products on machines with a smaller amount of RAM, like SBCs (Raspberry Pi, etc.).
I'm going to chime in here and say this is a very annoying problem for us at work, where we tend to have very large objects we push into the store. We have one binary object that's over 30 GB of raw data, and right now we have to manually register it into the store with mount/remount hacks. This is really nasty and troublesome because we have to do it on nearly a dozen machines individually.

The disk/memory tradeoff here is absolutely the correct one to make, as others have mentioned. There will always be hierarchies in the storage layer: slower, larger storage that's further away, and faster, less plentiful storage (DRAM) that's available locally. The fact that this increases I/O on the lower tier while decreasing the requirements on the less plentiful, far more expensive tier is the only argument you need, because it's far more sustainable and scalable than expecting DRAM to magically increase in density or $/GB overnight.

You could probably even argue that the ratio of DRAM prices to storage and compute prices has skewed so much over the past few years that trading off disk space isn't just faster and doesn't just make Nix more usable: it's actually a superior architecture for the way modern workloads and system capacity have progressed.
What's the holdup with this?
I've gone for a different solution (2825e05) that doesn't require writing the NAR to disk. |
Thank you @edolstra! I'm excited to try this since we actually ran into NAR import issues again just this week (copying large local files into the store). I'll see how this works for our use case of extremely large objects. |
Closing as it's fixed.