Skip to content

Commit

Permalink
boostrap: Enable mksh by default on macOS 10.11+.
Browse files Browse the repository at this point in the history
El Capitan (10.11) introduced System Integrity Protection (SIP), and one of
the side effects of this protection is that system shells (i.e. /bin/*sh)
unset any variables that may affect the security of the system.  This causes
problems with packages that rely on e.g. LD_LIBRARY_PATH.

Using a shell outside of the system paths allows us to work around this, at
least for now.

Tested in bulk builds on macOS Catalina, though with SIP disabled (as there
is no way to run sandboxed builds with SIP enabled).
  • Loading branch information
jperkin committed Jul 6, 2020
1 parent 671976c commit 9545f5f
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion bootstrap/bootstrap
@@ -1,6 +1,6 @@
#! /bin/sh

# $NetBSD: bootstrap,v 1.285 2020/07/06 10:25:29 jperkin Exp $
# $NetBSD: bootstrap,v 1.286 2020/07/06 10:43:47 jperkin Exp $
#
# Copyright (c) 2001-2011 Alistair Crooks <agc@NetBSD.org>
# All rights reserved.
Expand Down Expand Up @@ -589,6 +589,15 @@ Darwin)
need_sed=yes
fi

# Avoid system shells on macOS versions that enable System Integrity
# Protection (SIP) as it affects packages that rely on variables such
# as LD_LIBRARY_PATH. SIP unsets any variables that may affect
# security when using system binaries, i.e. /bin/*sh, but using a
# non-system shell is unaffected, at least for now.
if [ $macos_version -ge 1011 ]; then
need_mksh=yes
fi

case "$macos_version" in
100[7-9])
packagemaker=/Applications/PackageMaker.app/Contents/MacOS/PackageMaker
Expand Down

13 comments on commit 9545f5f

@Mitsos101
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What reasons are there for making mksh the default shell on Darwin besides the use of DYLD_* variables? No other reasons are explicitly stated in the comments or the commit message.

@jperkin
Copy link
Collaborator Author

@jperkin jperkin commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the reason. There have been a number of issues caused by this that we've so far had to work around in individual packages, either by inserting LD_LIBRARY_PATH directly into the build process or by depending on shells/pdksh which we know has issues.

@Mitsos101
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not pkgsrc's responsibility to fix such issues. This should be left to upstream. DYLD_LIBRARY_PATH is allowed to be set for unprotected processes; if it doesn't work, it's because it is being wrongly set for the parent shell. (See #66 (comment))

@jperkin
Copy link
Collaborator Author

@jperkin jperkin commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, if you want to go around and convince meson, ninja, gobject-introspection and others to fix their build processes, be my guest ;-)

Unfortunately, as is so often the case, upstreams do not always do the things that we want. Our job is to provide software for our users regardless.

@Mitsos101
Copy link

@Mitsos101 Mitsos101 commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unable to find a source that states that upstream meson, ninja, or gobject-introspection require SIP to be disabled. I was also unable to find any relevant patches on MacPorts or Homebrew. Could you point me to it?

@jperkin
Copy link
Collaborator Author

@jperkin jperkin commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There have been a number of threads recently on tech-pkg regarding this issue, I'd try there. Note that some of these issues are specific to pkgsrc as it can sometimes depend on individual philosophies on how to build software. For example, we disable use of $ORIGIN which then means we need to add LD_LIBRARY_PATH if a build process has previously relied on $ORIGIN to find its internal shared libraries when running some code generation tool.

While I'm happy to continue answering questions, I'd like to ask one of my own - what problem are you trying to solve? It's not really clear what your issue is, and it would help a lot if you could be specific about what failures you are seeing with pkgsrc so that we can reproduce and fix them, otherwise we may just end up going around in circles and wasting everyone's time. Thanks.

@Mitsos101
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not trying to solve any problem in this case. Importing 60,000 lines of code to get around a reasonable security restriction seems unreasonable to me. The fact that packages don't build with SIP indicates that pkgsrc is incorrectly passing environment variables to the shell instead of the child, not that the shell should be replaced.

@jperkin
Copy link
Collaborator Author

@jperkin jperkin commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I explained in the commit message for the mksh import, our current bootstrap shell (shells/pdksh) has been unmaintained for many years and has known bugs. pkgsrc needs a bootstrap shell regardless of any specific macOS issues, and this was as good a time as any to start migrating the various OS that need a bootstrap shell away from pdksh, as I have already done for Solaris, so that it can eventually be removed. I didn't import mksh solely for this macOS issue.

You may be right that we aren't patching things correctly everywhere, unfortunately this is difficult to test across pkgsrc as there is currently, at least as far as I am aware, no way of building in sandboxes without disabling SIP, as I have done on my bulk build hosts. Obviously this means that any SIP-related issues are hidden, which is why it took us a while to initially diagnose this problem. I'm also not convinced that we can avoid doing this in a way that completely avoids being able to require a non-system shell, see for example the ninja build process that hardcodes spawning /bin/sh which required switching out to a non-system shell.

In summary:

  • We needed to import a new bootstrap shell anyway, so the objection to 60,000 lines of code is somewhat irrelevant.
  • Switching macOS to mksh fixes problems and allows packages to build that previously did not.
  • There may be a cleaner solution, but performing the necessary testing to prove it is correct is very difficult.

I don't mean this as a dismissive "patches welcome", but would genuinely be happy if you or someone else is able to do the necessary work to prove that we can go back to using a system shell on macOS while ensuring that there are no regressions across pkgsrc when building on SIP-enabled systems, or if you have ideas on how to perform sandboxed SIP-enabled builds on Catalina. The main issue is that DNS resolution does not work as you are not able to modify mDNSResponder to listen on additional sockets, but I think Catalina introduced others too.

@Mitsos101
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be right that we aren't patching things correctly everywhere,

And the solution to this is another incorrect patch?

Switching macOS to mksh fixes problems and allows packages to build that previously did not.

My point is that it does not fix any problems, it simply ignores them. Environment variables are still being incorrectly passed to the shell. This is wrong, and this is why building packages fails on SIP. It seems obvious to me that this is what should be fixed, not the shell itself. Replacing the shell on macOS so that DYLD_LIBRARY_PATH is allowed distracts from the real issue.

@jperkin
Copy link
Collaborator Author

@jperkin jperkin commented on 9545f5f Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, if all you're going to do is complain then I'm done here, sorry. If at any point in the future you want to come back and provide some specific patches then I'm more than happy to look at them, but continually asserting "you are wrong" without contributing any meaningful alternative or even evidence to show that specific packages have not been fixed by this change is not generally a good way to interact with an open source project and those who are volunteering their time to help.

@Mitsos101
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, if all you're going to do is complain then I'm done here, sorry.

It is not my intention to pointlessly complain, nor to nitpick. I have explained why this isn't a good solution to the DYLD_* issue.

If at any point in the future you want to come back and provide some specific patches

The patch I am suggesting here is to revert this commit.

but continually asserting "you are wrong"

I repeat: The issue here is that environment variables are being incorrectly passed to the shell when they should only be passed to the child. This is always wrong. In this case, the dynamic libraries that the shell loads are being surreptitiously changed. This is not something that pkgsrc should turn a blind eye to.

without contributing any meaningful alternative

The alternative here is clear: Fix all the scripts that incorrectly pass environment variables to the shell.

or even evidence to show that specific packages have not been fixed by this change

From #66 (comment):

This will lead eventually to problems for trying to do things like DYLD_LIBRARY_PATH=/path_with_dylibs /usr/local/pkg//bin/ctest which is pretty common.

Quoting from dlopen(3):

When path contains a slash but is not a framework path (i.e. a full path
or a partial path to a dylib), dlopen() searches the following until it
finds a compatible Mach-O file: $DYLD_LIBRARY_PATH (with leaf name from
path), then the supplied path (using current working directory for relative paths), then $DYLD_FALLBACK_LIBRARY_PATH (with leaf name from path).

Here, dyld will load the libraries in /path_with_dylibs for the shell. If /path_with_dylibs contains a library with the same name as one the shell needs, dyld will load that one instead. This is bound to change the shell's behavior and is clearly not the user's intention.

not generally a good way to interact with an open source project and those who are volunteering their time to help.

I don't understand this. It is not my intention to waste your time. I have also volunteered my time to explain why this commit is wrong.

@alarixnia
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mitsos101 It's very unlikely this will be reverted. pdksh is crufty, effectively unmaintained for at least 10 years.

The alternative here is clear: Fix all the scripts that incorrectly pass environment variables to the shell.

If you truly believe this, you should submit a patch that fixes all instances of this, testing with a bulk build in the process. If your objection is the size of mksh, you can also test a shell you deem more acceptable for bootstrapping in the process. Any fallout (broken packages) will of course need to be fixed.

Volunteering time to explain why you object to this commit isn't enough, we have to worry about actually delivering working packages on an array of platforms.

@Mitsos101
Copy link

@Mitsos101 Mitsos101 commented on 9545f5f Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your response.

The problem with this commit is that it doesn't fix the DYLD_LIBRARY_PATH=/path_with_dylibs sh issue, as it is reliant on /path_with_dylibs not containing any libraries with the same name as one the shell requires. As this cannot be guaranteed, the fix doesn't work.

It's very unlikely this will be reverted. pdksh is crufty, effectively unmaintained for at least 10 years.

See #66. mksh does not replace pdksh, but the macOS system shells.

If you truly believe this, you should submit a patch that fixes all instances of this, testing with a bulk build in the process.
Any fallout (broken packages) will of course need to be fixed.

Bulk builds cannot be performed on SIP-enabled platforms, see 9545f5f#commitcomment-40422281. Nevertheless, as the justification provided for this commit is invalid, it shouldn't have been pushed in the first place. Therefore, any packages that would break after this commit is reverted should not be considered to have regressed, as this would place an undue burden on me: The reason provided for this commit is invalid, but I would have to fix every broken package to get it reverted. The reason the use of DYLD_* in shells is disallowed in the first place is because it is dangerous and deviates from the user's intention, as can be seen in the example I gave here: 9545f5f#commitcomment-40452532. No justification is provided as to why the problems illustrated in the example are not relevant to this commit. I am unable to find a proper fix to the DYLD_* issue at the moment, but it's clear that this commit doesn't fix this issue either.

If your objection is the size of mksh, you can also test a shell you deem more acceptable for bootstrapping in the process.

I originally mistakenly believed that mksh was imported solely for this commit. I no longer disagree with importing it into the pkgsrc tree. The reason I disagree with this commit is that it doesn't solve the issue it's supposed to solve. Therefore, there is no reason to replace the macOS system shells.

Please sign in to comment.