linuxPackages: 4.14 -> 4.19 #57641

Merged
merged 2 commits into from Mar 18, 2019

aszlig commented Mar 14, 2019

This should bring back kernel 4.19 as our default kernel by applying a patch that fixes our overlayfs regression (#54509).

I'm basing this against master first and we can backport it to stable later once we've got it tested well enough™.

Tested this only against the kernel-latest NixOS test so far, so we should probably add a Hydra job to make sure this doesn't introduce additional breakages.

@lheckemann

Not sure I understand why the Hydra job would be necessary. Since 19.03 is still in beta, we can just merge it into release-19.03 and see if the regular tests still work?

bachp commented Mar 15, 2019

@aszlig I think #54508 should be merged too, to make sure the regression is gone.

aszlig commented Mar 15, 2019

@bachp: I'll merge both, but there is something which I have missed in this patch. It doesn't affect us, but I would like to wait for an answer from upstream before merging both.

aszlig commented Mar 16, 2019

Just updated the patch to the latest version I submitted upstream.

Our VM tests and everything related to our virtualisation infrastructure
is currently broken if used with kernel 4.19 or later.

The reason for this is that since 4.19, overlayfs uses the O_NOATIME
flag when opening files in lowerdir and this doesn't play nice with the
way we pass the Nix store to our QEMU guests.

On a NixOS system, paths in the Nix store are typically owned by root
but the QEMU process is usually run by an ordinary user. Opening a file
with O_NOATIME when you're neither its owner nor the superuser fails
with EPERM (Operation not permitted).
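
To make the failure mode concrete, here is a minimal Python sketch of the ownership check described above. It is not taken from the patch or from the VM tests; the helper name `try_open_noatime` is hypothetical, and the sketch assumes a Linux host where `os.O_NOATIME` is available (it degrades to a plain open elsewhere).

```python
import errno
import os

# O_NOATIME is Linux-specific; fall back to 0 so the sketch still imports elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def try_open_noatime(path):
    """Open path read-only with O_NOATIME.

    Returns "ok" on success, otherwise the symbolic errno name
    (for example "EPERM" or "ENOENT").
    """
    try:
        fd = os.open(path, os.O_RDONLY | O_NOATIME)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


if __name__ == "__main__":
    import sys

    # Pass any paths to probe, e.g. a root-owned store path as an ordinary user.
    for candidate in sys.argv[1:]:
        print(candidate, try_open_noatime(candidate))
```

Run as an ordinary user against a file owned by root, this is expected to report EPERM on Linux, while a file owned by the calling user opens fine.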

This is exactly what happens in our VM tests, because we're using
overlayfs in the guests to allow writes to the store.

Another implication of this is that the default kernel version for NixOS
19.03 has been reverted to Linux 4.14.

Work on getting this upstream is still ongoing and the patch I posted
previously was incomplete, needs rework and also some more review from
upstream maintainers - in summary: This will take a while.

So instead of rushing in a kernel patch to nixpkgs, which will affect
all users of overlayfs, not just NixOS VM tests, I opted to patch QEMU
for now to ignore the O_NOATIME flag in 9p.

I think this is also the least impacting change, because even if you
care about whether access times are written or not, you get the same
behaviour as with Linux 4.19 in conjunction with QEMU.
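
The idea of ignoring the flag on the host side can be sketched as follows. This is only an illustration in Python, not the actual QEMU patch (which lives in QEMU's 9p code); `strip_noatime` and `host_open` are hypothetical names introduced for this example.

```python
import os

# O_NOATIME is Linux-specific; fall back to 0 so the sketch still imports elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def strip_noatime(flags):
    """Return flags with the O_NOATIME bit cleared and all other bits intact."""
    return flags & ~O_NOATIME


def host_open(path, guest_flags):
    """Open path on behalf of a guest request, ignoring any O_NOATIME bit.

    Because the flag never reaches the host's open call, a non-owner
    process no longer gets EPERM just for having asked for O_NOATIME.
    """
    return os.open(path, strip_noatime(guest_flags))
```

The trade-off is that access times may be updated on the host even though the guest asked for them not to be.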

Signed-off-by: aszlig <aszlig@nix.build>
Fixes: NixOS#54509

This reverts commit 048c36c.

With the patch applied for fixing the overlayfs bug in QEMU, nothing
should really stand in our way of using 4.19 as the default
kernel.

Signed-off-by: aszlig <aszlig@nix.build>

aszlig commented Mar 18, 2019

@samueldr, @lheckemann: Since this needs more time to resolve upstream, I decided to go with a QEMU patch for now, simply because it has the lowest impact on users. We can remove the patch as soon as this is fixed in the next round of stable kernels.

@LnL7 LnL7 merged commit 9a395a4 into NixOS:master Mar 18, 2019
aszlig added a commit that referenced this pull request Mar 18, 2019
In Linux 4.19 there has been a major rework of the overlayfs
implementation and it now opens files in lowerdir with O_NOATIME, which
in turn caused issues in our VM tests because the process owner of QEMU
doesn't match the file owner of the lowerdir.

The crux here is that 9p propagates the O_NOATIME flag to the host and
the guest kernel has no way of verifying whether that flag will lead to
any problems beforehand.

There is ongoing work to possibly fix this in the kernel, but it will
take a while until there is a working patch and consensus.

So in order to bring our default kernel back to 4.19 and of course make
it possible to run newer kernels in VM tests, I'm merging a small QEMU
patch as an interim solution, which we can drop once we have a working
fix in the next round of stable kernels.

Now we already had Linux 4.19 set as the default kernel, but that was
subsequently reverted in 048c36c
because the patch we used was a revert of the commit I bisected a
while ago.

This patch broke overlayfs in other ways, so I'm also merging in a VM
test by @bachp, which only tests whether overlayfs is working, just to
be on the safe side that something like this won't happen in the future.

Even though this change could be considered a moderate mass-rebuild at
least for GNU/Linux, I'm merging this to master, mainly to give us some
time to get it into the current 19.03 release branch (and subsequent
testing window) once we see no newly broken builds from Hydra.

Cc: @samueldr, @lheckemann

Fixes: #54509
Fixes: #48828
Merges: #57641
Merges: #54508
aszlig added a commit that referenced this pull request Mar 21, 2019
(cherry picked from commit 12efcc2)