NixOS EC2 AMI: Support IMDSv2 #104193
Conversation
These two APIs have diverged over time and are no longer compatible.
Nice! Thank you for digging into this 👍
When you are done can you ensure that the commit messages have an equal amount of information as the PR description has right now? This will make it much easier for someone to debug this in the future as most editors support showing commit messages inline.
Force-pushed from 6badc53 to b1c1087
Done, @andir. I've tested this with real AWS machines and it works correctly, fixing the issue that prevented our AMI from working with IMDSv2.
I didn't run this, but the solution looks really straightforward. Given that this is for AWS instances and not for my kettle's IoT interface, the additional 30 MB does not seem too bad. If, in fact, this becomes an issue, we could still use a slimmer version of curl or revisit our options here.
There are a few comments that I'd like to see clarified before I can "approve" this.
Oh, I see my review was requested. Is the goal to have this in the stable distro?
Yeah, I want to backport it so NixOS-on-AWS users can improve the security of their instances.
Got a +1 from @andir on IRC.
FYI @endgame and @jonringer, this implements #96593 a bit differently, I think.
I provided Amine a copy of the AMI to test with NixOps, so let's merge once that is ready.
AWS's metadata service has two versions. Version 1 allowed plain HTTP requests to get metadata. However, this was frequently abused when a user could trick an AWS-hosted server into proxying requests to the metadata service. Since the metadata service is frequently used to generate AWS access keys, this is pretty gnarly. Version 2 is identical except it requires the caller to request a token and provide it on each request.

Today, starting a NixOS AMI in EC2 where the metadata service is configured to only allow v2 requests fails: the user's SSH key is not placed, and the configuration provided via user-data is not applied. The server is useless. This patch addresses that.

Note that the dependency on curl is not a joyful one, and it expands the initrd by 30 MB. However, see the added comment for more information about why this is needed. Note that the idea of using `echo` and `nc` is laughable. Don't do that.
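The token flow described above can be sketched in shell. This is a hedged illustration, not the patch itself: the endpoint and header names follow AWS's published IMDSv2 interface, and 169.254.169.254 only answers from inside an EC2 instance, so the functions are shown but not called here.

```shell
# Hedged sketch of the IMDSv2 "token dance". The link-local address only
# resolves inside EC2, so this is illustration rather than a runnable probe.
imds_base="http://169.254.169.254"

# Step 1: a PUT to the token endpoint, with a mandatory TTL header,
# returns a short-lived session token.
get_imds_token() {
  curl --silent --fail --request PUT \
    --header "X-aws-ec2-metadata-token-ttl-seconds: 60" \
    "$imds_base/latest/api/token"
}

# Step 2: every subsequent metadata GET must present that token.
# $1 = token, $2 = metadata path (e.g. "public-keys/0/openssh-key").
get_metadata() {
  curl --silent --fail \
    --header "X-aws-ec2-metadata-token: $1" \
    "$imds_base/latest/meta-data/$2"
}
```

Without the token header, an IMDSv2-only instance answers the GET with 401 Unauthorized, which is exactly why the v1-style fetcher fails.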
Force-pushed from e8551fd to ca95acf
try=1
while [ $try -le 10 ]; do
  echo "(attempt $try/10) validating the EC2 instance metadata service v2 token..."
  preflight_imds_token && break
  try=$((try + 1))
  sleep 1
done
I added a preflight check of the IMDS token, as apparently sometimes the instance metadata service takes a second to figure itself out.
I hope we do not need more of these fallbacks… This will soon grow beyond the scope where a shell script seems like the right tool to me.
The commit message indicates it can take minutes for the metadata service to respond properly, yet we are only waiting up to 30 s for a response. Is that a trade-off between slow boot and unusable machines?
I wonder if the metadata service might also fail to hand out the "current" token during startup. Since we are already retrieving the instance-id, shouldn't we validate something?
Ok, we talked on IRC and it is moments, not minutes. I guess 30 s is fine then?!
"minutes" was a mistake on my part. I updated it to say "moments". I too hope we don't need more, it is already growing to the limits of comfort. I think once we get evidence of a working token, we're good to go, but I'm happy to be shown wrong.
@andir I'm not sure there's a "current" token:
https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/ under "what's new in IMDSv2":
The token is never stored by IMDSv2 and can never be retrieved by subsequent calls, so a session and its token are effectively destroyed when the process using the token terminates. There’s no limit on the number of requests within a single session, and there’s no limit on the number of IMDSv2 sessions.
So I guess it generates some cryptographic doohickey each time you PUT?
According to Freenode's ##AWS, the metadata server can sometimes take a few moments to get its shoes on, and the very first boot of a machine can see failed requests for a few moments.
Force-pushed from ca95acf to 0d87ce6
preflight_imds_token() {
  # retry-delay of 1 selected to give the system a second to get going,
  # but not add a lot to the bootup time
  ${curl}/bin/curl \
At this point we might consider setting PATH=${curl}/bin:$PATH or something. This will save us ~100 bytes or so ;-)
LGTM! Thanks @grahamc
I appreciate the ping. There are a couple of small things that I think need to change. Fixing IMDSv2 on NixOS was actually a yakshave: I need the public IPv4 address from IMDS for something else I'm working on, and tried to fix IMDS while I was there.
Also, the metadata dir should probably be cleared on each boot, because otherwise it can become outdated w.r.t. actions people can take in the management console.
I think there's enough difference in #96593 that it's probably worth @grahamc having a look and seeing if there's anything he wants to lift: I constructed the PUT request using busybox nc, so I don't pull additional packages into the initramfs, and the fetcher is parameterised so the EC2/OpenStack common parts have been DRYed up. Some of the comments might also be worth pulling across. I don't have a preflight check on the IMDS, though.
Sidenote: I must confess that I'm a little frustrated by what feels like a process failure here: I'd been trying to get help testing my PR for months, and there's so much to do on NixOS that duplicating work isn't great. But at the end of the day, progress is excellent and I'm glad to see this about to land 👍 🎉
if ! [ -e "$metaDir/ami-manifest-path" ]; then
    wget ${wgetExtraOptions} -O "$metaDir/ami-manifest-path" http://169.254.169.254/1.0/meta-data/ami-manifest-path
fi

if ! [ -e "$metaDir/user-data" ]; then
    wget ${wgetExtraOptions} -O "$metaDir/user-data" http://169.254.169.254/1.0/user-data && chmod 600 "$metaDir/user-data"
fi

if ! [ -e "$metaDir/hostname" ]; then
    wget ${wgetExtraOptions} -O "$metaDir/hostname" http://169.254.169.254/1.0/meta-data/hostname
fi

if ! [ -e "$metaDir/public-keys-0-openssh-key" ]; then
    wget ${wgetExtraOptions} -O "$metaDir/public-keys-0-openssh-key" http://169.254.169.254/1.0/meta-data/public-keys/0/openssh-key
Can you please start using http://169.254.169.254/2009-04-04 and download the public IPv4 address? This is a version that OpenStack and EC2 both support (I checked the docs), and it supports the http://169.254.169.254/2009-04-04/public-ipv4 endpoint, which I need for some other stuff.
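A minimal sketch of what fetching that value could look like, in the style of the existing wget-based fetcher. The comment quotes /2009-04-04/public-ipv4; EC2's documented layout nests such values under meta-data/, so the exact path is worth verifying against both EC2 and OpenStack, and the call only resolves from inside an instance:

```shell
# Hedged sketch: fetch the public IPv4 address from the versioned metadata
# tree. The path is an assumption based on the EC2 metadata layout
# (meta-data/public-ipv4 under the 2009-04-04 version); verify before use.
# wgetExtraOptions mirrors the option the surrounding script already uses.
fetch_public_ipv4() {
  wget ${wgetExtraOptions:-} -O - "http://169.254.169.254/2009-04-04/meta-data/public-ipv4"
}
```

Pinning a dated version like 2009-04-04 rather than "latest" is what makes the fetcher portable across EC2 and OpenStack, since both freeze that snapshot of the API.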
    wget ${wgetExtraOptions} -O "$metaDir/ami-manifest-path" http://169.254.169.254/1.0/meta-data/ami-manifest-path
fi

if ! [ -e "$metaDir/user-data" ]; then
For both EC2 and OpenStack, I think these calls should not be guarded, and that it is correct to unconditionally replace fetched files. It might even be correct to clear the contents of metaDir on boot and re-fetch everything.
Example: currently, I can stop an EC2 instance in the management console, change its user data, and reboot. Then the updated user data is not applied. I think it's also possible to change image key pairs, and if I change the machine's subnet to start or stop assigning public IP addresses, then those files could appear or disappear.
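The "clear on boot" idea amounts to a few lines. This is a hedged sketch, not the PR's code: $metaDir stands for the cache directory the fetcher writes into, and a demo path is substituted here so the snippet is self-contained.

```shell
# Minimal sketch: wipe the cached metadata directory before fetching, so
# stale user-data, key files, or a vanished public-ipv4 cannot linger after
# console-side changes. $metaDir defaults to a throwaway demo path here.
metaDir="${metaDir:-/tmp/demo-metadata}"
rm -rf "$metaDir"
mkdir -p "$metaDir"
chmod 700 "$metaDir"   # metadata can contain secrets (user-data is chmod 600)
```

With the directory freshly emptied each boot, the per-file existence guards become redundant and every value reflects what the console currently reports.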
{ curl, targetRoot, wgetExtraOptions }:
# Note: be very cautious about dependencies, each dependency grows
# the closure of the initrd. Ideally we would not even require curl,
# but there is no reasonable way to send an HTTP PUT request without
This might not be "reasonable", but you can construct a PUT with busybox nc:

echo -ne 'PUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nX-aws-ec2-metadata-token-ttl-seconds: 60\r\n\r\n' | nc 169.254.169.254 80 | tail -n 1
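For clarity, the raw request that one-liner sends can be built with printf, which makes the HTTP/1.1-mandated CRLF line endings explicit. This is an illustration of the same bytes, not code from either PR:

```shell
# Build the raw IMDSv2 token request by hand. $1 is the token TTL in
# seconds. Piping the output into `nc 169.254.169.254 80` inside EC2
# would return the token in the response body.
build_put_request() {
  printf 'PUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nX-aws-ec2-metadata-token-ttl-seconds: %s\r\n\r\n' "$1"
}
```

The fragility concern is visible here: the nc approach must also parse the response (the `tail -n 1` above) and gets no help with timeouts, retries, or non-200 statuses, all of which curl handles for free.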
I wonder why it needs to be part of the very early boot process, and not fetched as part of a service pre-start or something? We could add it, but I fear growing what is already a fairly gnarly script in a gnarly and unfortunate location.
This sounds like a good idea, but I'm a bit unsure about backporting to stable. It would be a pretty big change in behavior, potentially breaking workflows. Maybe an additional PR which only hits master, plus release notes. What do you think?
I think the HTTP protocol is complicated enough to deserve using curl, and that implementing it with `echo` and `nc` would be too fragile.
Since they are different APIs with different implementations, I don't like the idea of trying to make it "dry" -- they're doing different things with APIs that look similar but evolve differently. I thought it made sense to split out the two, creating a less dynamic script, as opposed to a highly parameterized script in stage-1. I'd rather a bit of duplication here, as opposed to risking subtle bugs from dryness.
Yes. I agree with you.
Thanks for the quick response, @grahamc.
I have a service that needs to know its public IP address, and was setting it up as a NixOS module. So I want that file present on disk so that I can refer to it in
Agree with your suggestion - this should happen later and go into release notes, so we don't blow up someone's cloud deployment by accident. I don't think that's that likely, but likelihood × severity = risk, and it's high enough that it's worth being cautious. I can do a separate PR for this.
OpenStack explicitly advertises their IMDS as EC2-compatible, though I'm not sure whether they have a token service on their IMDS; maybe we'd say that means they've failed in that goal? In any case, maybe add a comment to keep the two fetchers in "reasonable" sync?
re the IP address: would it be possible to define your service like this? :
If you can send that PR shortly after this one, we can merge it soon while we're all on top of it. BTW, I've started #nixos-cloud on Freenode if you're interested in that sort of thing. W.r.t. OpenStack, I've found it really hit or miss whether a deployment has an up-to-date set of components, or has even configured and deployed all the components. The variability there leaves me grayer than I was before ... 😬
Pending the discussion here, a backport PR: #104302
Probably; I hadn't thought about doing it that way. It's a low-pressure side project, so I'll give it a go at some point. It definitely should not hold up this PR. I'll look at clearing out the metadata directory on boot this weekend, hopefully. That's all my concerns addressed, I think. Should we ship this, then?
Thanks, @endgame! Let's go!
fixup breakage from NixOS#104193 (cherry picked from commit 1ef139f)
Motivation for this change
AWS's metadata service has two versions. Version 1 allowed plain HTTP requests to get metadata. However, this was frequently abused when a user could trick an AWS-hosted server into proxying requests to the metadata service. Since the metadata service is frequently used to generate AWS access keys, this is pretty gnarly. Version 2 is identical except it requires the caller to request a token and provide it on each request.
Today, starting a NixOS AMI in EC2 where the metadata service is configured to only allow v2 requests fails: the user's SSH key is not placed, and the configuration provided via user-data is not applied. The server is useless. This patch addresses that.
OpenStack "borrowed" this metadata fetcher when it happened to be the same API, but this new version is different enough, imo, to warrant undoing this borrow.
This PR is not yet tested in an actual deployment, and I'll mark it as ready once I have. It should be backported to stable and new AMIs should be built.

Things done

- Tested using sandboxing (the sandbox option in nix.conf, or nix-sandbox on non-NixOS Linux)
- nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
- Tested execution of all binary files (usually in ./result/bin/)
- Determined the impact on package closure size (by running nix path-info -S before and after)