Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nginx: Fix ETag patch to ignore realpath(3) error #80671

Closed
wants to merge 1 commit into from

Conversation

aszlig
Copy link
Member

@aszlig aszlig commented Feb 20, 2020

While our ETag patch works pretty fine if it comes to serving data off store paths, it unfortunately broke something that might be a bit more common, namely when using regexes to extract path components of location directives for example.

Recently, @devhell has reported a bug with a nginx location directive like this:

location ~^/\~([a-z0-9_]+)(/.*)?$" {
  alias /home/$1/public_html$2;
}

While this might look harmless at first glance, it does however cause issues with our ETag patch. The alias directive gets broken up by nginx like this:

*2 http script copy: "/home/"
*2 http script capture: "foo"
*2 http script copy: "/public_html/"
*2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path from ngx_http_core_loc_conf_s.root, which returns the configured value from the root or alias directive. So in the example above, realpath(3) boils down to the following syscalls:

lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review of the initial patch, I didn't actually notice that what we're doing here is returning NGX_ERROR if the realpath(3) call fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional syscalls) necessary, we really shouldn't introduce an additional error, so let's - at least for now - silently skip return value if realpath(3) has failed.

However since we're using the unaltered root from the config we have another issue, consider a root like /nix/store/...-abcde/$1:

Calling realpath(3) on this path will fail (except if there's a file called $1 of course), so even this fix is not enough because it results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a regression test.

It seems that my memory is failing me these days, since apparently I knew about this issue, since when digging for existing issues in nixpkgs I found this similar pull request which I even reviewed 🤦‍♂️

However, since the comments weren't addressed and the author hasn't responded to the pull request, I decided to open a follow-up pull request.

While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.

Recently, @devhell has reported a bug with a nginx location directive
like this:

  location ~^/\~([a-z0-9_]+)(/.*)?$" {
    alias /home/$1/public_html$2;
  }

While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:

  *2 http script copy: "/home/"
  *2 http script capture: "foo"
  *2 http script copy: "/public_html/"
  *2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:

  lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
  lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.

However since we're using the unaltered root from the config we have
another issue, consider this root:

  /nix/store/...-abcde/$1

Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a
regression test.

It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:

NixOS#66532

However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.

[1]: NixOS#48337

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
@aszlig
Copy link
Member Author

aszlig commented Mar 5, 2020

Cc: @domenkozar, @7c6f434c, @cransom, @jglukasik who also commented on #66532

Copy link
Contributor

@yorickvP yorickvP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is better than returning 500 errors.

cc @edef1c

@aanderse
Copy link
Member

(triage) ping as this seems important 😃

@7c6f434c
Copy link
Member

Hmmm. If we ignore ENAMETOOLONG, what happens if a user-owned static content directory contains a symlink to /nix/store/a…a- of length PATH_MAX? Won't * ptr2 = '\0'; lead to out-of-bounds write in non-user-owned Nginx process?

(GitHub UI is always broken, so I cannot put the comment where it belongs, as the key point is the unchanged part of the patch that is not intended for the new context).

@aszlig
Copy link
Member Author

aszlig commented Mar 26, 2020

Hmmm. If we ignore ENAMETOOLONG, what happens if a user-owned static content directory contains a symlink to /nix/store/a…a- of length PATH_MAX? Won't * ptr2 = '\0'; lead to out-of-bounds write in non-user-owned Nginx process?

ptr2 is accessing/modifying memory allocated by ngx_realpath and what we're overwriting here is the - character plus the following \0 via *ptr2.

So let's say real is /nix/store/foo- then we have the following:

// /nix/store/foo-\0
//           ^ ptr1
*ptr1 = '"';
// /nix/store"foo-\0
//           ^ ptr1
ptr2 = ngx_strchr(ptr1, '-');
if (ptr2 == NULL) { /* abort... */ }
//           v ptr1
// /nix/store"foo-\0
//               ^ ptr2
*ptr2++ = '"';
//           v ptr1
// /nix/store"foo"\0
//                ^^ ptr2
*ptr2 = '\0';
//           v ptr1
// /nix/store"foo"\0
//                ^^ ptr2

This of course is under the assumption that real is always 0-terminated, which I think it's safe to assume since it's documented in the realpath(3) manpage (ngx_realpath is an alias):

realpath() expands all symbolic links and resolves references to /./, /../ and extra / characters in the null-terminated string named by path to produce a canonicalized absolute pathname. The resulting pathname is stored as a null-terminated string, up to a maximum of PATH_MAX bytes, in the buffer pointed to by resolved_path. The resulting path will have no symbolic link, /./ or /../ components.

If you look at the glibc implementation, the terminating \0 is written unconditionally before a successful return.

(GitHub UI is always broken, so I cannot put the comment where it belongs, as the key point is the unchanged part of the patch that is not intended for the new context).

No problem, I usually work on the checked out branch (in a worktree) anyway, so it's easy enough to follow. :-)

@7c6f434c
Copy link
Member

7c6f434c commented Mar 26, 2020 via email

@aszlig
Copy link
Member Author

aszlig commented Mar 26, 2020

Yes, I mostly wondered how reliable is null termination in the exact PATH_MAX case

Even if we do not pass NULL (POSIX.1-2008, realpath, see issue 7), the null-termination needs to occur, otherwise it's a bug in the C library, see below.

Should we check any other libc? Is Darwin libc inside the open part?

It seems so and while the implementation has a few more indirections, the behaviour in this regard is the same.

Since this is also specified in POSIX I'd consider it to be a bug of the C library implementation if we do not get a null-terminated string.

To also make sure that this is really the case, I made a small test:

{ pkgs ? import <nixpkgs> {}, lib ? pkgs.lib }:

pkgs.runCommandLocal "test" {
  nativeBuildInputs = lib.singleton (pkgs.writers.writeCBin "testrpath" {} ''
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <limits.h>

    int main(int argc, char **argv) {
      char buf[PATH_MAX];
      if (argc != 2) return EXIT_FAILURE;
      if (realpath(argv[1], buf) == NULL) {
        perror("realpath");
        return EXIT_FAILURE;
      }
      printf("PATH_MAX: %u; len: %zu\n", PATH_MAX, strlen(buf));
      return EXIT_SUCCESS;
    }
  '');
} ''
  root="$(for i in $(seq 2040); do echo -n a/; done)"
  mkdir -p "$root"
  fname=b
  touch "$root$fname"
  while testrpath "$root$fname"; do
    fname="''${fname}b"
    (cd "$root" && touch "$fname")
  done
''

Output:

PATH_MAX: 4096; len: 4088
PATH_MAX: 4096; len: 4089
PATH_MAX: 4096; len: 4090
PATH_MAX: 4096; len: 4091
PATH_MAX: 4096; len: 4092
PATH_MAX: 4096; len: 4093
PATH_MAX: 4096; len: 4094
PATH_MAX: 4096; len: 4095
realpath: File name too long

@7c6f434c
Copy link
Member

I looked at POSIX definition and though «ah but of course some libc is bound to return a PATH_MAX-sized prefix of a null-terminated string». Sorry for being too pessimistic.

aszlig added a commit that referenced this pull request Mar 28, 2020
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.

Recently, @devhell has reported a bug with a nginx location directive
like this:

  location ~^/\~([a-z0-9_]+)(/.*)?$" {
    alias /home/$1/public_html$2;
  }

While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:

  *2 http script copy: "/home/"
  *2 http script capture: "foo"
  *2 http script copy: "/public_html/"
  *2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:

  lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
  lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.

However since we're using the unaltered root from the config we have
another issue, consider this root:

  /nix/store/...-abcde/$1

Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a
regression test.

It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:

#66532

However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.

[1]: #48337

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: #80671
Fixes: #66532
@aszlig
Copy link
Member Author

aszlig commented Mar 28, 2020

Merged as e1d63ad.

@aszlig aszlig closed this Mar 28, 2020
@aszlig aszlig deleted the nginx-etag-fix-take2 branch March 28, 2020 02:01
aszlig added a commit that referenced this pull request Mar 28, 2020
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.

Recently, @devhell has reported a bug with a nginx location directive
like this:

  location ~^/\~([a-z0-9_]+)(/.*)?$" {
    alias /home/$1/public_html$2;
  }

While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:

  *2 http script copy: "/home/"
  *2 http script capture: "foo"
  *2 http script copy: "/public_html/"
  *2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:

  lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
  lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.

However since we're using the unaltered root from the config we have
another issue, consider this root:

  /nix/store/...-abcde/$1

Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a
regression test.

It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:

#66532

However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.

[1]: #48337

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: #80671
Fixes: #66532
(cherry picked from commit e1d63ad)
aszlig added a commit that referenced this pull request Mar 28, 2020
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.

Recently, @devhell has reported a bug with a nginx location directive
like this:

  location ~^/\~([a-z0-9_]+)(/.*)?$" {
    alias /home/$1/public_html$2;
  }

While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:

  *2 http script copy: "/home/"
  *2 http script capture: "foo"
  *2 http script copy: "/public_html/"
  *2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:

  lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
  lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.

However since we're using the unaltered root from the config we have
another issue, consider this root:

  /nix/store/...-abcde/$1

Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a
regression test.

It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:

#66532

However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.

[1]: #48337

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: #80671
Fixes: #66532
(cherry picked from commit e1d63ad)
@risicle
Copy link
Contributor

risicle commented Apr 24, 2020

Well done for this fix - if I'd been using master I might not have lost half the afternoon to this issue, following the request though nginx in gdb ("how can ngx_http_set_etag be making a call to realpath?").

I'll look at doing the stable backport(s). Facepalm. You've already done the backports but I was working from a stale nixpkgs tree.

stigok pushed a commit to stigok/nixpkgs that referenced this pull request Jun 12, 2020
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.

Recently, @devhell has reported a bug with a nginx location directive
like this:

  location ~^/\~([a-z0-9_]+)(/.*)?$" {
    alias /home/$1/public_html$2;
  }

While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:

  *2 http script copy: "/home/"
  *2 http script capture: "foo"
  *2 http script copy: "/public_html/"
  *2 http script capture: "bar.txt"

In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:

  lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
  lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)

During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.

Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.

However since we're using the unaltered root from the config we have
another issue, consider this root:

  /nix/store/...-abcde/$1

Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.

While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.

I added a small NixOS VM test, which uses the example above as a
regression test.

It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:

NixOS#66532

However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.

[1]: NixOS#48337

Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: NixOS#80671
Fixes: NixOS#66532
(cherry picked from commit e1d63ad)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants