New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test-driver.py: randomize VM tmp state dir name #90663
Conversation
The NixOS test VMs state directory is currently hardcoded to /tmp/vm-state-$machineName. While this is perfectly fine when building the test from a sandboxed environment, we can run into some stray state problem across various runs when we use the test interactively. These stray state-induced errors can be pretty subtle to spot and debug. We fix this problem by using a randomized and unique state dir name for each run. This should prevent any stray state across several runs while keeping the state of the previous ones for a potential post-run analysis.
I think a goal of this state file is to be able to restart a VM (when interactively used). In theory, this should work well, but this is not what I'm observing in practice! So, I always have to remove it. |
No opinion on this matter, however if a change is going to be made the documentation in |
The use of a deterministic name is intentional so we can keep state across reboots. This is useful both for interactive use and in some tests to see whether state it kept correctly across reboots (e.g. uid assignments). |
@edolstra I think a lot of people find this pretty confusing – a lot of tests fail when the state is persisted, and it's easy to chalk that up to a different change and spend a long time debugging. |
@edolstra well it is obvious that it is intentional right now. But instead of just stating this and closing the issue you could have at least tried to relate to the problems that multiple people are stating here - which i have also run into when debugging something. You don't seem to even acknowledge the possibility that there is an issue that people are running into. The fact that it hasn't been a problem for you doesn't mean it doesn't exist. I would appreciate if in scenarios like this there could at least be some sort of discourse: What are the issue that people are facing? What is the suggested solution and how does it collide with existing goals? Is there something that can be done about this that you or others are less opposed to but still addresses the issue described? I think this would give a less hostile and more encouraging impression. PS: Just to clarify: i am not saying you are wrong, my concern is purely about the communication and not ignoring concerns:
versus
|
I for one would appreciate a |
@edolstra is a flag something that you could find acceptable? |
Personally I would just add a warning like BTW, since NixOS deployments should generally be congruent (i.e. not depend on the previous state of the system), if a test fails because the previous contents of the disk, then there might be an issue with the test or the service that it's testing. |
@edolstra This won't work. If you terminate an instance during the test, you might end up with a screwed up file system, and NixOS shouldn't try to automatically fix this and break it further. If have been bitten by some state on the machines in other occurences as well. IMHO, The sane default here is to put the qcow image in a tmpdir that's cleaned up after invocation - with some flags to alternate behaviour. |
Stdout gets flooded by the VMs boot sequence when starting up the test driver. I'm not sure I'd notice such a warning. On top of that I'm generally more in enclined to prevent people from using a footgun altogether rather than warn them they're about to use a footgun.
That's a fair point! I could add a ^ Does that sound like a fair middle-ground to you? Would you be up to merge such PR? |
I would name that something more specific, like |
Implemented this solution there: picnoir@052e4f5 @domenkozar good suggestion, I used this together with @mmilata good point regarding the doc. Updated. I also added a line to the release note about this behaviour's change. Could we re-open this PR to talk about these changes? |
@NinjaTrappeur github doesn't allow to reopen this PR, as the branch was deleted. But yes, please open a new PR, and cross-link it from here. |
👍 See #91046 |
You can run with |
@ali-abrar, #91046 has been merged. It'll delete the VMs state on the test runner startup. |
Motivation for this change
The NixOS test VMs state directory is currently hardcoded to
/tmp/vm-state-$machineName.
While this is perfectly fine when building the test from a sandboxed
environment, we can run into some stray state problem across various
runs when we use the test interactively. These stray state-induced
errors can be pretty subtle to spot and debug.
We fix this problem by using a randomized and unique state dir name
for each run. This should prevent any stray state across several runs
while keeping the state of the previous ones for a potential post-run
analysis.
After a run:
Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)