Decompress tar files using libarchive #3259

Merged: 10 commits, Dec 19, 2019

@yorickvP
Contributor

yorickvP commented Dec 9, 2019

Context

Since #2748, nix has been able to decompress tar files without any runtime dependencies. However, that change dropped support for .gz files, among other things. While we strongly encourage porting more stuff to Rust, the current nix Rust FFI library has some issues that should be looked at before it is ready to take over the world.
There are also some security properties that may need to be taken care of, such as preventing archives from extracting files into parent directories.
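To make the parent-directory concern concrete, here is a minimal C++17 sketch of a purely lexical check on archive member names. It is an illustration only, not code from this PR or from libarchive; the helper name `isSafeMemberPath` is hypothetical, and a lexical check like this does not protect against symlinks that already exist in the destination.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of nix or libarchive): returns true when an
// archive member name cannot escape the extraction directory by lexical means.
bool isSafeMemberPath(const std::string & memberName)
{
    // Collapse "." and "name/.." pairs without touching the filesystem.
    fs::path normal = fs::path(memberName).lexically_normal();

    // Empty names and absolute paths are rejected outright.
    if (normal.empty() || normal.is_absolute()) return false;

    // After normalization, any remaining ".." must sit at the front,
    // which means the path climbs above the extraction directory.
    return normal.begin()->string() != "..";
}
```

Under these assumptions, `isSafeMemberPath("pkg/default.nix")` is accepted, while `"../etc/passwd"`, `"a/../../b"` and `"/etc/passwd"` are rejected.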

Fortunately, there exists a solution in the form of libarchive, a well-supported, well-maintained and widely used library that can extract and write almost any sort of archive. You may know it from utilities such as bsdtar, the default tar implementation on macOS.

This PR

This PR changes the unpackTarfile function to use libarchive instead of the Rust FFI library. The main user-facing change is support for arbitrary archive formats in nix-channel and builtins.fetchTarball.

Future possibilities

libarchive can autodetect, compress and decompress a lot of compression formats, such as bzip2, compress, grzip, gzip, lrzip, lz4, lzip, lzma, lzop, uuencode, xz and zstd. Using this, we can eventually drop the direct dependency on liblzma, libz and libbz2, while supporting compression using (some of) those formats.
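As a rough illustration of what autodetection involves, the C++17 sketch below matches the leading "magic" bytes of a stream against a small hand-written table. This is not how libarchive implements detection (it is considerably more thorough); the function name `detectCompression` and the table are assumptions made for this example, and only a handful of the formats listed above are covered.

```cpp
#include <string>
#include <string_view>

// One table row: a format name and the bytes its streams start with.
struct Magic {
    std::string_view name;
    std::string_view bytes;
};

// Hypothetical helper: guess the compression format from the first bytes.
std::string detectCompression(std::string_view head)
{
    // Explicit lengths are used because some signatures contain NUL bytes.
    static const Magic table[] = {
        {"gzip",  std::string_view("\x1f\x8b", 2)},
        {"bzip2", std::string_view("BZh", 3)},
        {"xz",    std::string_view("\xfd" "7zXZ\0", 6)},
        {"zstd",  std::string_view("\x28\xb5\x2f\xfd", 4)},
        {"lz4",   std::string_view("\x04\x22\x4d\x18", 4)},
        {"lzip",  std::string_view("LZIP", 4)},
    };

    for (const Magic & m : table) {
        // substr clamps to the available length, so short input is safe.
        if (head.substr(0, m.bytes.size()) == m.bytes)
            return std::string(m.name);
    }
    return "unknown";
}
```

A caller would read the first few bytes of the download and pick a decompressor based on the returned name, falling back to treating the data as uncompressed when the result is `"unknown"`.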

Potential problems

Ubuntu 16.04, which still seems to be supported by nix, ships with libarchive 3.1.2, from 4 years ago. I've been unable to test compilation with this version.

cc: @puckipedia

@edolstra
Member

edolstra commented Dec 9, 2019

I would prefer to add gzip support to our decompression code, rather than add ad hoc tarfile decompression support. That way it will also benefit things like NAR compression.

@yorickvP
Contributor Author

I have a PoC for decompression and compression support on the source/sinks (for gz,bz2,xz,lzma,zstd). This will drop the direct dependency on liblzma, libbz2 and simplify the code.

But extracting e.g. zip files will require integration between the layers, which is why the decompression and extraction steps are handled at the same time here.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rust-in-nix-discussion-thread/5092/5

@edolstra edolstra merged commit f765e44 into NixOS:master Dec 19, 2019
@edolstra
Member

Thanks!

@tomberek
Contributor

tomberek commented Jan 6, 2020

> I have a PoC for decompression and compression support on the source/sinks (for gz,bz2,xz,lzma,zstd). This will drop the direct dependency on liblzma, libbz2 and simplify the code.

@yorickvP need assistance?
