Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix inconsistent caching of VM xchg dirs #82258

Merged
merged 2 commits into from Jun 1, 2020

Conversation

erikarvstedt
Copy link
Member

xchg is advertised as a bidirectional exchange dir, but file content transfer from host to VM fails due to caching:
If a file is read in the VM and then modified on the host, subsequent re-reads in the VM can yield old, cached data.
This is caused by the use of 9p's cache=loose mode that is explicitly meant for read-only mounts.

TODO

Would it be sensible to use cache=mmap (doc) instead?

Reproduce

This script reproduces the bug.

Ping @domenkozar, @aszlig

@erikarvstedt erikarvstedt changed the title fix inconsistent caching for VM xchg dirs fix inconsistent caching of VM xchg dirs Mar 10, 2020
@flokli
Copy link
Contributor

flokli commented May 24, 2020

@JJJollyjim, can you take a look at this?

@erikarvstedt We might also then be able to drop the sync commands from the copy_from_host method introduced in 798fcaa.

@flokli flokli requested a review from tfc May 24, 2020 17:55
@flokli
Copy link
Contributor

flokli commented May 24, 2020

It's probably fine to not cache - I doubt there's anything performance critical with a lot of IO exchanged via these folders.

@erikarvstedt
Copy link
Member Author

@flokli removing the sync in the test driver will lead to bugs in case we add a cached backend in the future. I'll let you decide whether to drop it.

@flokli
Copy link
Contributor

flokli commented May 25, 2020

@flokli removing the sync in the test driver will lead to bugs in case we add a cached backend in the future. I'll let you decide whether to drop it.

I consider the sync a workaround implemented to explicitly flush caches, which hides away the broken caching behaviour. If other applications in the test use /tmp/xchg directly, we also can't tell them they need to use sync.

If we add back some caching, it should properly invalidate caches on external changes (like an nfs share would do), so we should remove these workarounds - please drop the sync.

@erikarvstedt
Copy link
Member Author

erikarvstedt commented May 27, 2020

I've had a thorough look at these syncs now.

  • with xchg dir caching, this sync is needed to commit the VM data to the host. This sync can be removed when removing caching.
    I'm now removing this sync in my original commit.
  • the 3 other syncs in copy_from_host and copy_from_vm try to transfer host data changes to the VM, but this is pointless (irrespective of xchg dir caching) because 1) syncing in the VM can't possibly pull in host data and 2) 9p is accessing the host filesystem on the cached layer anyways, so even syncing on the host would have no effect in the VM.
    (@flokli, would you consider this reasoning to be correct?)
    I'm removing these syncs in a separate commit.

These syncs have the goal to transfer host filesystem changes to the VM,
but they have no effect because 1) syncing in the VM can't possibly pull
in host data and 2) 9p is accessing the host filesystem on the cached
layer anyways, so even syncing on the host would have no effect in the
VM.
@flokli
Copy link
Contributor

flokli commented May 31, 2020

I think your reasoning is correct. Can you prefix the second commit, too (test-driver, virtualization: maybe)?

Also, slightly off-topic, but did you see #85568?

xchg is advertised as a bidirectional exchange dir, but file content
transfer from host to VM fails due to caching:
If a file is read in the VM and then modified on the host, subsequent
re-reads in the VM can yield old, cached data.
This is caused by the use of 9p's cache=loose mode that is explicitly
meant for read-only mounts.

9p doesn't provide any suitable cache modes, so fix this by disabling
caching.

Also, remove a now unnecessary sync in the test driver.
@erikarvstedt
Copy link
Member Author

I'll keep an eye on #85568 and jump in when I see new caching-related issues arise.

@flokli flokli merged commit 8a388c8 into NixOS:master Jun 1, 2020
@flokli
Copy link
Contributor

flokli commented Jun 1, 2020

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants