Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Example NixOps State Backends #1264

Merged
merged 80 commits into from Jul 7, 2021

Conversation

grahamc
Copy link
Member

@grahamc grahamc commented Mar 24, 2020

Store in S3:

{
  network = {
    storage.s3 = {
      profile = "rootlol";
      region = "us-east-1";
      bucket = "grahamc-nixops-state";
      key = "personal.nixops";
      kms_keyid = "166c5cbe-b827-4105-bdf4-a2db9b52efb4";
    };
  };

  flexo = {
    deployment = {
      targetHost = "147.75.105.137";
    };
    ...
  };
}

Store in Memory:

{
  network = {
    storage.memory = {};
  };

  flexo = {
    deployment = {
      targetHost = "147.75.105.137";
    };
    ...
  };
}

Legacy, exactly as master works now:

{
  network = {
    storage.legacy = {};
  };

  flexo = {
    deployment = {
      targetHost = "147.75.105.137";
    };
    ...
  };
}

@grahamc grahamc force-pushed the context-manager-states-rebase branch 3 times, most recently from cb46a0a to d105b44 Compare March 26, 2020 16:01
# Making it part of the type definition allows adding new
# arguments later.
def fetchToFile(self, path: str, **kwargs) -> None:
os.symlink(os.path.abspath(self.state_location()), path)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This at one point did not abspath, which was causing problems with legacy paths to state files.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me like it would be a bit safer to make an actual copy rather than a symlink.

That way we know that if something goes wrong in the middle of writing the state file to disk, the "proper" statefile won't be corrupted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to copy, we would need to implement our own implementation of locking on top. The legacy backend already supports its own locking. My thinking is it would be better to find a way to "escape" the legacy backend, or implement a "backup" feature in the Legacy backend, than implement copying. What do you think?

@grahamc grahamc force-pushed the context-manager-states-rebase branch from d105b44 to 63b64d0 Compare March 26, 2020 17:21
This was referenced Mar 26, 2020
ConditionExpression="attribute_not_exists(LockID)",
)

def unlock(self, path: str) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool to use contextmanager for locking as well to ensure a bit of safety

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way it can be moved into Storage abstraction and then backend can implement it or not

Copy link
Member

@adisbladis adisbladis Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should completely conflate looking & storage.
S3, Backblaze B2 & friends don't have any locking mechanisms themselves, so you may want to utilise a dynamodb table for locking.

It's fine if a storage backend implements a default locking mechanism though, as long as it can be overriden by the user.

@grahamc grahamc force-pushed the context-manager-states-rebase branch from 63b64d0 to 2834cbd Compare March 26, 2020 21:10
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/using-nixops-in-a-team/6459/4

@grahamc
Copy link
Member Author

grahamc commented Apr 1, 2020

Right now if you don't define network, you get a bad error:

RuntimeError: error: attribute 'network' in selection path 'network' not found

@grahamc
Copy link
Member Author

grahamc commented Apr 1, 2020

If you define network.storage = {} you get a bad error:

    key = list(storage.keys()).pop()
IndexError: pop from empty list

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update/6525/1

@ip1981
Copy link
Contributor

ip1981 commented Apr 20, 2020

A PR can be (and converted to) to a draft, so you wouldn't need a "do not merge" warning.

@grahamc grahamc marked this pull request as draft April 23, 2020 14:04
@grahamc
Copy link
Member Author

grahamc commented Apr 23, 2020

todo:

  • split the locking from the storage
  • treat these as context managers decided not to, to keep the explicit API's actions very clear. they're used within context managers already.
  • remove the S3 one and relocate it to nixops-aws
  • error messages / data validation is terrible, reimplement on top of Move from xml intermediate Nix representation to JSON #1275
  • validate the network parameters wit the module system (todo: new PR) not going to do this
  • tests, somehow (see: tests: Add functional tests using NixOS in Podman (Docker) #1326)
  • docs. lots, lol
  • talk to users who use nixops with many networks in one state file
  • accept a deployment ID for the "legacy" backend type. Should we support it for other types, too? Not sure. Migration might be tricky. not going to do this
  • a way to force-unlock a network
  • copy data from one backend to another (docs, command, otherwise)

this PR removes the idea of "global " networks and the "-d ..." argument, because of the dependence on ./network.nix. So, some thinking of the implications of this are needed.

@grahamc grahamc force-pushed the context-manager-states-rebase branch 2 times, most recently from 6d13093 to 9f3d08e Compare April 29, 2020 19:22
@adisbladis adisbladis force-pushed the context-manager-states-rebase branch from 9f3d08e to 8becbe2 Compare June 4, 2020 14:10
@adisbladis adisbladis force-pushed the context-manager-states-rebase branch from 8becbe2 to 35a0a89 Compare July 10, 2020 14:51
@grahamc grahamc force-pushed the context-manager-states-rebase branch from 35a0a89 to 7157e91 Compare July 10, 2020 15:16
@adisbladis adisbladis force-pushed the context-manager-states-rebase branch from e9bb679 to 3f7d77a Compare July 7, 2021 10:42
@adisbladis adisbladis merged commit 9b3ac91 into NixOS:master Jul 7, 2021
kanban automation moved this from In progress to Done Jul 7, 2021
@adisbladis
Copy link
Member

We're never going to move forward with this without getting this change out to more people for review.

Therefore I'm merging this, with the caveat that it might be somewhat changed before a final release.

1. Open %s
2. Add:
network.storage.legacy = {
databasefile = "~/.nixops/deployments.nixops"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a reference to databasefile here that doesn't seem to exist in the code

@rehno-lindeque
Copy link

  • remove the S3 one and relocate it to nixops-aws

Looks like the relocation part of this one still needs doing if I'm not mistaken? I couldn't find any reference to StorageBackend in nixops-aws.

@roberth
Copy link
Member

roberth commented Jul 30, 2021

Whew, I finally got around to playing with this. Good call merging this though!

My observations so far:

The interface is quite coarse grained, dealing with the whole db only and having no knowledge of which deployment is used. This coarseness means that state file history becomes noisy and locking more problematic as the number of deployments increases. Users can choose distinct backend configurations to compensate for this, but that's effectively duplicating the -d feature that already exists. I'd be ok removing -d altogether, but that's perhaps too radical. Making the interface fine grained seems like the way to go.

State is always written, even for read-only operations like nixops info. I could compensate for this in a backend, but that's not the right solution. The same problem applies to locking, where it can't be worked around in the backend. So if NixOps needs to know what it's about to do anyway, it may as well use that knowledge in order not to write state redundantly.

Sqlite isn't really an interchange format. If something goes wrong, it's hard to read. I don't know about its format/version strategy (maybe it's good though?) and it's not easy to diff. Perhaps the backends could be based on nixops import/export instead?

Which brings us to: nixops import --include-keys-like functionality doesn't seem to be included. I haven't tested it and I may have read the code wrong, but it also seems like a good idea to have a single code path for this stuff anyway.

@Mic92
Copy link
Member

Mic92 commented Sep 20, 2021

Is the memory backend functional at all? We tried to use both legacy backend and memory backend in nix-community/infra#123 but they both seem not to work. nixops stable is also broken, which means there is no functional nixops left.

@talyz
Copy link
Contributor

talyz commented Oct 21, 2021

Yeah, the legacy backend was broken by this, since it now ignores the nixAttrs state attribute. I've solved this by using the require attribute to import the deployment files from the network file. Specifying the network file is however cumbersome, since the --network flag currently expects a directory, not a file. To make it easier, I've opened #1480.

With that in place, I think we could essentially drop deployments altogether without losing granularity: instead of running nixops deploy -d my_deployment you'd get the same result by running nixops deploy -n my_deployment.nix, as long as my_deployment.nix requires the appropriate files and specifies the backend to use. Note however that this strategy doesn't work with the current legacy backend, since it doesn't allow choosing where the state is stored. A simple json file backend based on import/export, like @roberth is suggesting, would solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
kanban
  
Done
Development

Successfully merging this pull request may close these issues.

None yet