Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/version: obfuscate stateVersion to discourage modification #80907

Closed
wants to merge 1 commit into from

Conversation

tilpner
Copy link
Member

@tilpner tilpner commented Feb 23, 2020

Motivation for this change

Many users get the impression of having to update stateVersion when
upgrading their channel, and cause subtle or obvious breakage.

This happened repeatedly, despite nixos-generate-config adding
5 lines of comment around the definition.

Because the value of stateVersion is a NixOS version, e.g. "19.09",
people were tricked into believing this option sets the state ("system
version") of their system, when instead it sets the system version
of the state to stay compatible with.

This commit changes nothing about stateVersion itself, to ensure
backwards compatibility with uses of stateVersion inside and outside
of nixpkgs.

Instead, it now sets stateVersion to an expression that doesn't look
like a NixOS version, and should not invoke the notion of needing to be
updated.

Example configuration snippet:
system.stateVersion = lib.nixos.decodeStateVersion 2989;

Some effort is made to prevent decoding to invalid versions, but
those can still be set.

This is only for newly generated configurations, and the obfuscation
can be bypassed by just setting stateVersion to e.g. "20.03" directly,
without encoding/decoding.

Things done
  • Generated ISO and installed a new system
  • Tried a few encoded values, passed sanity check as expected
  • Good error handling for values like "unstable" and "nixos-unstable"
  • Revise the comment above it?
Things to question
  • Do we never need a stateVersion newer than config.system.nixos.release?
  • Do we want lib.nixos, or rather utils?
  • Is this a good encoding scheme (it would be a permanent choice)?
  • Is this still needed after c1d7850?
  • Is this condescending to our users (this protects you from yourself, we know better, etc.)?

I only discovered the new comment when I was almost done. I'm okay with this being closed,
but I still wanted to bring it up.

Related contributors

@infinisil @maralorn @nh2 @grahamc

Many users get the impression of having to update stateVersion when
upgrading their channel, and cause subtle or obvious breakage.

This happened repeatedly, despite nixos-generate-config adding
5 lines of comment around the definition.

Because the value of stateVersion is a NixOS version, e.g. "19.09",
people were tricked into believing this option sets the state ("system
version") of their system, when instead it sets the system version
of the state to stay compatible with.

This commit changes nothing about stateVersion itself, to ensure
backwards compatibility with uses of stateVersion inside and outside
of nixpkgs.

Instead, it now sets stateVersion to an expression that doesn't look
like a NixOS version, and should not invoke the notion of needing to be
updated.

Example configuration snippet:
system.stateVersion = lib.nixos.decodeStateVersion 2989;

Some effort is made to prevent decoding to invalid versions, but
those can still be set.

This is only for newly generated configurations, and the obfuscation
can be bypassed by just setting stateVersion to e.g. "20.03" directly,
without encoding/decoding.
@tilpner
Copy link
Member Author

tilpner commented Feb 23, 2020

This year, I've seen two people set stateVersion to "unstable" or "nixos-unstable", which would be caught by this with an out-of-bounds error from versions.minor. Not great, but better than silently accepting it, as NixOS currently does. I might add better error handling later.

@bhipple
Copy link
Contributor

bhipple commented Feb 23, 2020

I get where you're coming from, and I've confused myself once on this front, and had the same thought; but after reading the docs and understanding what the state version actually is, it hasn't been a problem, and one nice thing about it is that if you're moving from state version 19.03 to 20.03 you know which release notes and changelogs to read for changes at least ... and have a sense of how out of date you are. It's possible the confusion from the obfuscation and complexity may be worse than the simple issue it resolves.

That said I don't have a super strong opinion one way or the other.

@tilpner
Copy link
Member Author

tilpner commented Feb 23, 2020

@bhipple I'm aware it's a weird "solution", to a problem you don't have.

This is not supposed to help people familiar with what stateVersion does, but those who skip or misinterpret the comment above stateVersion, and end up feeling like they need to change it. And that does happen, more often than you might think. This PR was prompted by another person running into hard-to-diagnose issues that occur when you don't know what stateVersion does.

Either way, something needs to improve about error handling of stateVersion = "nixos-unstable";.

Being able to read changelogs is a good point, I hadn't considered that before.
Do you think it would be sufficient to add a comment about the initially installed version above?

@rycee
Copy link
Member

rycee commented Feb 23, 2020

In Home Manager I just made the state version have an enum type, currently

type = types.enum [ "18.09" "19.03" "19.09" "20.03" ];

It doesn't fix the general problem but at least one would get a reasonable error message if one tries to put unstable or something like that there.

Another thing I've been contemplating is to write a file, ~/.local/share/home-manager/init-version or something, on the first activation containing the value of stateVersion so that HM can detect any subsequent state version change. The activation could then fail unless you set an option i.really.know.what.stateVersion.does = true. But my impression is that people haven't been so confused about state version in HM so I haven't put any effort into this. Of course, it may very well be that there is wide spread confusion, just I am blissfully unaware 🙂

@infinisil
Copy link
Member

The problem with such a stateful init-version file with NixOS at least is that the build host isn't always the target host and evaluation could be delayed to deployment too. This means that in general there's no way to read the correct init-version for influencing evaluation.

@edolstra
Copy link
Member

Not in favor of this since it makes it harder to understand the configuration (i.e. if I see system.stateVersion = lib.nixos.decodeStateVersion 2989 I'll have a hard time figuring out exactly what that does). Also, it's actually perfectly legal to change system.stateVersion if you're willing to deal with the consequences (e.g. your Postgres version changing).

Regardless, we shouldn't have lib/nixos.nix. lib should be largely NixOS-independent. It definitely shouldn't have functions that are only useful in the implementation of one particular NixOS option.

👍 on validating the value of system.stateVersion.

@tilpner
Copy link
Member Author

tilpner commented Feb 26, 2020

@edolstra I agree that this makes it harder to understand what's happening. A nix repl session will quickly clear that up, but it's still weird and confusing.

Changes to stateVersion are still possible, and don't need to use this encoding at all.

It felt very wrong to add NixOS-specific things to lib, but I was encouraged to do so anyway. There is also utils, where these functions might live.

@maralorn
Copy link
Member

I don‘t think I am qualified to weigh into the discussion, but I also don‘t think the obscuring the option will lead to better understanding of the users, which is in the end what we should be aiming that. The concept of the stateVersion is not too complicated.

@tilpner
Copy link
Member Author

tilpner commented Feb 26, 2020

@edolstra Validation was extracted to #81135, please review there.

@rycee
Copy link
Member

rycee commented Mar 7, 2020

@infinisil My idea was that the init-version file handling would be done entirely at activation time, which would happen on the deployed machine in the case of nixops. If the file is missing the script writes the value of stateVersion to it. If it exists, the script reads the file content and checks it against the current value of stateVersion and errors out if they differ. If the configuration contains i.really.know.what.stateVersion.does = true then this check would be omitted from the activation script.

Note, this is just a half baked idea that I have no plan of implementing anywhere but have as a backup plan in case I start getting inundated with complaints related to stateVersion. It is my "nuke it from orbit" plan 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants