google-compute-image: Remove fetch-ssh-keys.service #33004

Closed
4e6 wants to merge 2 commits

Contributor

@4e6 4e6 commented Dec 23, 2017

A NixOps deployment to GCE fails when the image tries to fetch the SSH keys metadata:

$ nixops create -d hydra hydra-gce.nix

$ nixops deploy -d hydra
slave1...> starting the following units: audit.service, fetch-ssh-keys.service, kmod-static-nodes.service, network-local-commands.service, network-setup.service, nix-daemon.socket, nscd.service, systemd-journal-catalog-update.service, systemd-modules-load.service, systemd-sysctl.service, systemd-timesyncd.service, systemd-tmpfiles-clean.timer, systemd-tmpfiles-setup-dev.service, systemd-udev-trigger.service, systemd-udevd-control.socket, systemd-udevd-kernel.socket
slave1...> Job for fetch-ssh-keys.service failed because the control process exited with error code.
slave1...> See "systemctl  status fetch-ssh-keys.service" and "journalctl  -xe" for details.
slave1...> the following new units were started: configure-forwarding-rules.service, google-accounts-daemon.service, google-clock-skew-daemon.service, google-ip-forwarding-daemon.service, google-shutdown-scripts.service
slave1...> warning: the following units failed: fetch-ssh-keys.service
slave1...> 
slave1...> ● fetch-ssh-keys.service - Fetch host keys and authorized_keys for root user
slave1...>    Loaded: loaded (/nix/store/gmxkx5i7057jwid351k49nvvdp4d9v3b-unit-fetch-ssh-keys.service/fetch-ssh-keys.service; enabled; vendor preset: enabled)
slave1...>    Active: failed (Result: exit-code) since Sat 2017-12-23 09:07:24 UTC; 1s ago
slave1...>   Process: 3459 ExecStart=/nix/store/fdhj4xv6sj4pk5g8gavjm8ddvmwyq4d4-unit-script/bin/fetch-ssh-keys-start (code=exited, status=8)
slave1...>  Main PID: 3459 (code=exited, status=8)
slave1...> 
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: Obtaining SSH keys...
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: --2017-12-23 09:07:24--  http://metadata.google.internal/computeMetadata/v1/project/attributes/sshKeys
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: Resolving metadata.google.internal (metadata.google.internal)... 169.254.169.254
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: Connecting to metadata.google.internal (metadata.google.internal)|169.254.169.254|:80... connected.
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: HTTP request sent, awaiting response... 404 Not Found
slave1...> Dec 23 09:07:24 slave1 fetch-ssh-keys-start[3459]: 2017-12-23 09:07:24 ERROR 404: Not Found.
slave1...> Dec 23 09:07:24 slave1 systemd[1]: fetch-ssh-keys.service: Main process exited, code=exited, status=8/n/a
slave1...> Dec 23 09:07:24 slave1 systemd[1]: Failed to start Fetch host keys and authorized_keys for root user.
slave1...> Dec 23 09:07:24 slave1 systemd[1]: fetch-ssh-keys.service: Unit entered failed state.
slave1...> Dec 23 09:07:24 slave1 systemd[1]: fetch-ssh-keys.service: Failed with result 'exit-code'.
slave1...> error: Traceback (most recent call last):
  File "/nix/store/h6af8fcjqcbyny6499pzxszwgq406ayj-nixops-1.5.2/lib/python2.7/site-packages/nixops/deployment.py", line 705, in worker
    raise Exception("unable to activate new configuration")
Exception: unable to activate new configuration

A quick inspection of the deployment in the GCE console showed that the SSH keys are attached as instance metadata, but the script tries to fetch the project-wide project/attributes/sshKeys, which results in a 404.
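To illustrate the mismatch, here is a minimal sketch of a lookup that tries the instance scope first and falls back to the project scope. It is not the script shipped in the image: the function names (metadata_url, fetch_metadata, fetch_ssh_keys) and the FETCH override are hypothetical, and only the URLs and the Metadata-Flavor header come from the log and the curl call in this description.

```shell
#!/usr/bin/env bash
# Base URL of the GCE metadata server, as it appears in the log above.
METADATA_BASE="http://metadata.google.internal/computeMetadata/v1"

# Build the URL of an attribute at a given scope ("instance" or "project").
metadata_url() {
  local scope="$1" key="$2"
  printf '%s/%s/attributes/%s\n' "$METADATA_BASE" "$scope" "$key"
}

# Fetch one URL from the metadata server.
# -f makes curl exit non-zero on HTTP errors such as a 404.
fetch_metadata() {
  curl -fsS -H 'Metadata-Flavor: Google' "$1"
}

# Hypothetical helper: print the sshKeys attribute, preferring the
# instance scope and falling back to the project scope.
# FETCH may name any command that takes a URL (useful for dry runs).
fetch_ssh_keys() {
  local fetch="${FETCH:-fetch_metadata}" scope
  for scope in instance project; do
    if "$fetch" "$(metadata_url "$scope" sshKeys)" 2>/dev/null; then
      return 0
    fi
  done
  echo "sshKeys not found at instance or project scope" >&2
  return 1
}
```

On an instance, calling fetch_ssh_keys would print whichever attribute exists, so a deployment that only sets instance metadata would no longer fail on the project-level 404.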

After SSHing into the instance, fetching instance/attributes/sshKeys succeeded (the response body containing the SSH keys is omitted):

$ curl -v -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/attributes/sshKeys
*   Trying 169.254.169.254...
* TCP_NODELAY set
* Connected to metadata.google.internal (169.254.169.254) port 80 (#0)
> GET /computeMetadata/v1/instance/attributes/sshKeys HTTP/1.1
> Host: metadata.google.internal
> User-Agent: curl/7.57.0
> Accept: */*
> Metadata-Flavor: Google
> 
< HTTP/1.1 200 OK
< Metadata-Flavor: Google
< Content-Type: application/text
< ETag: e7e5bb219660806f
< Date: Sat, 23 Dec 2017 09:40:51 GMT
< Server: Metadata Server for VM
< Content-Length: 838
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< 
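For context on what the omitted body looks like, the sketch below assumes the documented sshKeys layout of one USERNAME:KEY entry per line and turns it into authorized_keys lines. The function name ssh_keys_to_authorized_keys is hypothetical, and this is not necessarily how the service in the image processes the response.

```shell
#!/usr/bin/env bash
# Convert sshKeys metadata on stdin ("USERNAME:KEY" per line, an assumed
# format) into authorized_keys lines on stdout.
# Blank lines are dropped, then everything up to the first colon is removed.
ssh_keys_to_authorized_keys() {
  sed -e '/^[[:space:]]*$/d' -e 's/^[^:]*://'
}
```

The output of the curl command above could be piped through this function to inspect the keys without the username prefixes.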

Testing done

I was able to deploy to GCE successfully using an -I nixpkgs override:

$ nixops create -d hydra hydra-gce.nix -I nixpkgs=/home/dbushev/projects/nixos/nixpkgs-channels

$ nixops deploy -d hydra

$ nixops info -d hydra
Network name: hydra
Network UUID: 5b2bdaa1-e7cb-11e7-9ebf-0a0027000000
Network description: Unnamed NixOps network
Nix expressions: /home/dbushev/projects/Aviora/nixops-typeable/hydra/hydra-gce.nix
Nix path: -I nixpkgs=/home/dbushev/projects/nixos/nixpkgs-channels

+-----------+-----------------+-------------------------------+----------------------------------------------+----------------+
| Name      |      Status     | Type                          | Resource Id                                  | IP address     |
+-----------+-----------------+-------------------------------+----------------------------------------------+----------------+
| slave1    | Up / Up-to-date | gce [us-central1-c; g1-small] | n-5b2bdaa1e7cb11e79ebf0a0027000000-slave1    | 35.192.219.125 |
| bootstrap | Up / Up-to-date | gce-image                     | n-5b2bdaa1e7cb11e79ebf0a0027000000-bootstrap |                |
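| hydra-net | Up / Up-to-date | gce-network [192.168.4.0/24]  | n-5b2bdaa1e7cb11e79ebf0a0027000000-hydra-net |                |
+-----------+-----------------+-------------------------------+----------------------------------------------+----------------+
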
$ cat hydra-gce.nix
let

  credentials = {
    project = "xxxxxx";
    serviceAccount = "xxxxxx@xxxxxxxx.gserviceaccount.com";
    accessKey = builtins.readFile ../xxxxxxx.pem;
  };

  gce = { resources, ... }: {
    deployment.targetEnv = "gce";
    deployment.gce = credentials // {
      region = "us-central1-c";
      tags = [ "public-http" ];
      network = resources.gceNetworks.hydra-net;
    };
  };

in {

  # create a network that allows SSH traffic(by default), pings
  # and HTTP traffic for machines tagged "public-http"
  resources.gceNetworks.hydra-net = credentials // {
    addressRange = "192.168.4.0/24";
    firewall = {
      allow-http = {
        targetTags = [ "public-http" ];
        allowed.tcp = [ 80 3000 ];
      };
    };
  };

  slave1 = gce;
}

Update 2017-12-29

After a thorough investigation, I found that fetch-ssh-keys.service duplicates google-accounts-daemon.service (which the image already starts, as the deploy log above shows) and probably uses obsolete APIs, so it can be removed.

related #24273

Testing

I was able to successfully deploy a minimal working configuration with fetch-ssh-keys.service disabled. Here is its journalctl boot log.

@4e6 4e6 changed the title from "Fix: Nixops gce image fetch-ssh-keys" to "google-compute-image: Fix fetch-ssh-keys service" Dec 23, 2017
@grahamc
Member

grahamc commented Dec 23, 2017

@GrahamcOfBorg eval

(sorry for the noise, master was broken by a merge)

@8573
Contributor

8573 commented Dec 29, 2017

Will this only fix the Google Compute image for use with NixOps?

This patch doesn't seem to address #24273 (comment).

@4e6 4e6 changed the title from "google-compute-image: Fix fetch-ssh-keys service" to "google-compute-image: Remove fetch-ssh-keys.service" Dec 29, 2017
jbboehr added a commit to BitKitchen/nixpkgs-channels that referenced this pull request May 11, 2018
@flokli
Contributor

flokli commented Dec 6, 2018

superseded by #51566.

@flokli flokli closed this Dec 6, 2018
5 participants