New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buildenv: read propagated-user-env-packages line-by-line #27427
Conversation
@ttuegel, thanks for your PR! By analyzing the history of the files in this pull request, we identified @edolstra, @aszlig and @rasendubi to be potential reviewers. |
@ttuegel Sorry! I've been unsure of exactly how much testing staging PRs are supposed to receive---exhaustive testing is easier without a mass rebuild of course, and rebuilding both stdenvs as I have been doing already takes a while. In this specific case, I tried to grep for all instances of the changed files' names and audit them, but regrettably forgot that one. Part of the problem might have been that I originally I didn't intend to mess with I don't know Perl, but your change looks correct based on what is there before, and certainly the idea is right---hence my approval. |
This patch is not correct. Apparently, Perl does not trim trailing newlines when reading a file line-wise. 🙄 |
Since 3cb745d, the format of propagated-user-env-packages has changed and propagated packages have not been included by buildenv, including in the system environment. The buildenv builder is modified to read propagated-user-env-packages line-by-line, instead of expecting all packages on one line.
380c5ca
to
dce958a
Compare
There's also a buildenv script in |
Yes. Both the Perl script in the released version and the unreleased C++ version are now broken. |
Oh jeeze, did not realize I was changing an interface of Nix itself. Well, if we want to keep my change, Nix can be compatible with both the old and the new by reading in the entire file and breaking on white-space (in a double loop for example). I would be happy to make the patch or the reversion. |
I think it would be better just to use Arguably, Nix should not depend on Nixpkgs at all, but if it's going to, it should at least depend on high-level interfaces.
I may be alone, but I would prefer not to revert this. I think newline-separated lists is the correct way to do it and ultimately the way we must go. We have already had one rebuild due to this, I would hate to revert it, incur another rebuild, and then eventually bring it back and get another rebuild. |
OK, this works correctly on NixOS. Because it's a platform independent change, I expect it will work on macOS, too. It would be difficult to check every package that depends on this, because it's a mass rebuild, but if it's still broken, it's at least less broken now. |
Ugh. I bisected for hours to find those new `printLines` everywhere.
I think 3cb745d and anything else that
introduced new `printLines` on master needs to be reverted ASAP.
Current master is completely broken. Both `environment.systemPackages`
and `nix-env -i` are unusable ATM.
After those changes are reverted this patch should be applied to master,
then nix's buildenv should be fixed (I see such PR in progress), then we
should wait a while for users to update their nix, and only then reapply
file format changes.
|
NixOS/nix#1480 and NixOS/nix#1482 are the Nix 1.12 and 1.11 PRs, respectably, for being compatible with space-separated or line-separated entries. |
Right. Merging those two PRs into nix, merging this PR into master and
updating nix in nixpkgs would probably work for most users, but not in
every possible case.
The problem is that 3cb745d silently
breaks nix semantics.
For instance, if one uses nix on non-NixOS system or doesn't use
nixos-rebuild on NixOS-like system then doing all of the above actions
will still leave one with an old version of nix that will build unusable
things without producing any errors. That's bad.
I see no other way to fix that except to revert
3cb745d first, and then maybe pick it
back when everything else gets merged (note that you'll have to update
nixpkgs nix version check to ask for the new nix since that's a
backwards-incompatible change).
I puzzles me why you care about those newlines so much, just
`sed 's/ /\n/'` them on display or something if you read those
generated files so often.
/cc @edolstra because of his seal of approval on #27284
|
What I'm trying to say is this:
Current treatment of newlines in both buildEnv and buildenv can be
treated as a bug that needs to be fixed. True. (I imagine there're more
such parsing bugs for files stored in `nix-support`. What would happen
if I concatenate some junk to a random file there? Nothing too good, I
suspect.) Silently doing the wrong thing is bad.
But before 3cb745d that bug was never
triggered.
Since 3cb745d is not a killer-feature
by any measure, it triggers that bug for many sinless packages, and that
bug is rather hard to debug, 3cb745d
needs to be reverted ASAP.
It can be cherry-picked back when the bug is dealt with, but not before.
|
@oxij I'm not against reverting---it would be more work for me as other commits of mine on top rely on it and also would have to be reverted but I have already caused more work and pain than than that from the reversion for many other people. But I do think the line-based way is ultimately better, which I think everyone agrees. Insofar that the patches move us one step closer to that, and do not interfere with a revert, I see no issues with merging them once I can test them. Also, keep in mind
|
I'm not against reverting---it would be more work for me but I have
already caused more work and pain than than that from the reversion
for many other people.
Why? I reverted it locally without any conflicts and no problems since.
But I do think the line-based way is ultimately better, which I think
everyone agrees.
I don't. Why not go all the way to `IFS="\0"` then? Paths can have
newlines in them too. Since neither space nor newline can be used in
shebangs I don't see any difference between those two for building
executable packages with nix. And if I wanted to use `nix-store` as a
generic storage interface or something I would go for '\0'.
Insofar that the patches move us one step closer to that, and do not
interfere with a revert, I see no issues with merging them once I can
test them.
I don't understand what you are talking about. I don't see any
substantial followups in master ATM. All I ask is to revert
3cb745d because of the
pain
From my point of view it looked like this: an old revision worked
perfectly, master failed, hundreds commits in between. "Okay, then."
Debug the package in question until "aha, it doesn't propagate packages
anymore" discovery (minutes), check history for anything obvious (~half
an hour), nothing, "Okay, then.", write a simple script that checks that
the package propagates its deps into `result` (minutes), bisect nixpkgs
with that script, wait awhile (days), as there was a mass rebuild on
almost every step, revert the culprit, "Aaaaah, finally.", argue that
the source of that misery needs to be reverted in master so that other
people won't need to repeat that (hours).
Also, keep in mind for users that *can* update Nix manually, doing so
is a more immediate solution than waiting for the channels to catch up
to the revert.
Except.
* How do I know that this new silent "behavior" is supposed to be
fixed by updating nix?
* What if I use nix provided by my distro and have no idea how to build
it myself?
* Is it normal that `nix-env -i <package>` for nix before those two PRs
for nix and the same `nix-env -i <package>` after produce completely
different environments on the same revision of nixpkgs?
* (Because produced environments don't depend on nix version. Should
packages depend on nix version they are build with?)
* How do I know that I should not only update my nix, but also to rebuild
all my user environments to get the fix?
My point is: if you want to keep
3cb745d in master and have everything
work as it should (without any craziness itemized above) you should
update `lib/minver.nix` in nixpkgs to require a future nix version.
This, of course, will break evaluation for everyone until that future
version comes out. Which strongly suggests that
3cb745d needs to be reverted for
everyone now.
I've already spent my time bisecting the problem and reverted its source
(and will keep reverting any followups until `lib/minver.nix` gets
updated, because we don't just break things in SLNOS). But you are free
to do whatever, of course.
In any case, you have to consider that there are some poor nixpkgs users
struggling with the side effects of that change right now. And some of
them would feel those effects for quite awhile after everything gets
"fixed" because nix with 3cb745d
propagates at least some of the problems directly into /nix/store and
you'll have to --gc and rebuild those environments to fix that. (Which,
by the way, is another reason to use buildEnv from nixpkgs in nix
instead of the builtin buildenv, but I digress.)
|
@oxij Ah, I didn't realize that was an easy revert---I got confused with the Darwin problems I also caused which are less so. edit Ah but it causes a merge conflict with
Woah. Ok that's real bad CC @edolstra. I'm reverting now.
Well, it is interesting and not coincident that the breakage I cause both this this and Darwin |
Done in b087618 |
As @oxij points out in [1], this breakage is especially serious because it changes the contents of built environments without a corresonding change in their hashes. Also, the revert is easier than I thought. This reverts commit 3cb745d. [1]: #27427 (comment)
@oxij Now that that is done, I would like to talk about changing the format in a more controlled process. @ttuegel This unfortunately would mean yet another mass rebuild, but are you still interested in switching away from space-separated? I wouldn't be opposed to |
@Ericson2314 so will you merge master into staging now then? I've just encountered an odd bug on staging that doesn't happen on master.
Another failure without any helpful output. Note that b087618 doesn't matter. |
The only thing I have to say is: if we are going to use something weird (i.e. not |
Reverted
Yay! Thanks!
I wouldn't be opposed to \0 as a deliminator. My reasoning for '\n'-deliminated was that, as we don't support arbitrary path, it is both easiest to read as a human, and easiest to work with in lousy scripting languages.
GNU tools support `-z` and similar options for millennia.
E.g. I regularly do something like
`find . -iname '*.jpg' -print0 | sort -zR | head -zn 1 | xargs -0 -I{} qiv -if {}`
in directories with some crazy filenames and it just works (this shows a
random jpg with qiv).
The only thing I have to say is: if we are going to use something weird (i.e. not `\n` and not ` `) then we should use one of the [ASCII delimiter characters](https://en.wikipedia.org/wiki/C0_and_C1_control_codes#Field_separators).
Should not. Any char except '\0' and '/' can be in a file system name as
per [POSIX 1003.1-2001](http://pubs.opengroup.org/onlinepubs/007904875/basedefs/xbd_chap03.html#tag_03_169).
|
@oxij So I'm now thinking it's import that when reading we support both Given that plan, however, I find using '\0' somewhat unpalatable because that would mean using |
The more I think about it the more I like the '\0'. However, mixing ' '
and '\0' doesn't feel right, I agree. The solution is to allow '\0' xor
' ' (not both) in a single file.
Down the road, when Nixpkgs has been using the new deliminator for a
major release or two, we can *then* stop accepting it on read---a
breaking change but one causing little breakage in practice.
I would prefer if nix/buildEnv would start complaining about the ' '
early (something like "Your environment is using a package `${name}'
with a package metadata format that is deprecated. Please upgrade
`${name}' or remove it from your environment.").
Also, there's a case of really old nix trying to use internal buildenv
with a new package format (use case: build on a testing machine with a
new nix, copy closure to a stable production machine, `nix-env -i` the
result).
I see no graceful way to handle that except make packages to have a kind
of "metadata version" which nix should start checking as soon as
possible.
|
Noting here that both b087618 and f6f40e3 were nominal reverts of 3cb745d. They were merged in 9be4084, where the approach of f6f40e3 was taken. So the net result on both I'm not familiar enough with nix to really say whether that matters. However, I'm in the process of building a nix expr for OpenJDK 9 over in #27194, and need to know whether I should be using |
@jerith666 it probably doesn't matter, but try printWords. |
* openjdk 8: code cleanup as recommended by 0xABAB in NixOS#27194 * openjdk 9: init at ea build 176 this starts with copy of 8.nix and just updates hashes and replaces 8 with 9. it also tweaks the version handling because we aren't dealing with an update version yet. * openjdk 9: adapt patches from openjdk 8 fix-java-home: surrounding code changed slightly swing-use-gtk-jdk9: location of the file being patched changed due to modularization read-truststore-from-env: the code that handles the trustStore was refactored out into a helper class in upstream commit http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/904861872c0e adlc_updater: this isn't present anymore * openjdk 9: make two more warnings-as-errors non-fatal this requires that we switch to configureFlagsArray to deal with whitespace the errors being suppressed are show below: * For target support_native_java.desktop_libawt_xawt_awt_Robot.o: /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c: In function 'isXCompositeDisplay': /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c:152:50: error: embedded '\0' in format [-Werror=format-contains-nul] snprintf(NET_WM_CM_Sn, sizeof(NET_WM_CM_Sn), "_NET_WM_CM_S%d\0", screenNumber); ^ /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c:152:50: error: embedded '\0' in format [-Werror=format-contains-nul] cc1: all warnings being treated as errors * For target support_native_jdk.hotspot.agent_libsa_ps_core.o: /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/hotspot/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c: In function 'read_exec_segments': /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/hotspot/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c:834:7: error: ignoring return value of 'pread', declared with attribute warn_unused_result [-Werror=unused-result] pread(ph->core->exec_fd, interp_name, exec_php->p_filesz, exec_php->p_offset); ^ cc1: all warnings being treated as errors * openjdk 9: ea+176 -> ea+180 * openjdk 9: TODO disable infinality patches, at least to start the code being patched here seems to have changed substantially or perhaps even disappeared altogether. need to investigate whether these patches are still relevant. * openjdk 9: update installPhase for modularization * separate jdk and jre images are now present under build/*/images * samples have been removed (JEP 298) -- TODO that JEP says demos will be gone too, but it seems some are still present? * bina directory is no longer present * openjdk 9: TODO handle *.pf files or purge this code completely * openjdk 9: update minimal jre components in particular, the name of the config option for headless has changed, per https://bugs.openjdk.java.net/browse/JDK-8163102 * TODO about echo -n vs printWords, NixOS#27427
* openjdk 8: code cleanup as recommended by 0xABAB in #27194 * openjdk 9: init at ea build 176 this starts with copy of 8.nix and just updates hashes and replaces 8 with 9. it also tweaks the version handling because we aren't dealing with an update version yet. * openjdk 9: adapt patches from openjdk 8 fix-java-home: surrounding code changed slightly swing-use-gtk-jdk9: location of the file being patched changed due to modularization read-truststore-from-env: the code that handles the trustStore was refactored out into a helper class in upstream commit http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/904861872c0e adlc_updater: this isn't present anymore * openjdk 9: make two more warnings-as-errors non-fatal this requires that we switch to configureFlagsArray to deal with whitespace the errors being suppressed are show below: * For target support_native_java.desktop_libawt_xawt_awt_Robot.o: /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c: In function 'isXCompositeDisplay': /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c:152:50: error: embedded '\0' in format [-Werror=format-contains-nul] snprintf(NET_WM_CM_Sn, sizeof(NET_WM_CM_Sn), "_NET_WM_CM_S%d\0", screenNumber); ^ /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/jdk/src/java.desktop/unix/native/libawt_xawt/awt/awt_Robot.c:152:50: error: embedded '\0' in format [-Werror=format-contains-nul] cc1: all warnings being treated as errors * For target support_native_jdk.hotspot.agent_libsa_ps_core.o: /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/hotspot/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c: In function 'read_exec_segments': /tmp/nix-build-openjdk-9ea-b176.drv-0/jdk9-jdk-9+176/hotspot/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c:834:7: error: ignoring return value of 'pread', declared with attribute warn_unused_result [-Werror=unused-result] pread(ph->core->exec_fd, interp_name, exec_php->p_filesz, exec_php->p_offset); ^ cc1: all warnings being treated as errors * openjdk 9: ea+176 -> ea+180 * openjdk 9: TODO disable infinality patches, at least to start the code being patched here seems to have changed substantially or perhaps even disappeared altogether. need to investigate whether these patches are still relevant. * openjdk 9: update installPhase for modularization * separate jdk and jre images are now present under build/*/images * samples have been removed (JEP 298) -- TODO that JEP says demos will be gone too, but it seems some are still present? * bina directory is no longer present * openjdk 9: TODO handle *.pf files or purge this code completely * openjdk 9: update minimal jre components in particular, the name of the config option for headless has changed, per https://bugs.openjdk.java.net/browse/JDK-8163102 * TODO about echo -n vs printWords, #27427 (cherry picked from commit 02fe120)
Motivation
Since 3cb745d, the format of
propagated-user-env-packages has changed and propagated packages have not been
included by buildenv, including in the system environment.
The buildenv builder is modified to read propagated-user-env-packages
line-by-line, instead of expecting all packages on one line.
Testing
(nix.useSandbox on NixOS,
or option
build-use-sandbox
innix.conf
on non-NixOS)
nix-shell -p nox --run "nox-review wip"
./result/bin/
)This is a rebuild-the-world bugfix, so nobody is going to merge this into
staging
until those boxes above are checked.Input from any Perl experts is welcome. I think this will work now, but even with 16 cores it will take me a minute or two to find out. 🙄
@Ericson2314 Several things were broken by this. I expect if you had run the NixOS tests before merging, you would have discovered that and prevented this.