dbus: Add NSS modules path to dbus system bus service #41635

Merged
merged 1 commit into from Jun 29, 2018

Conversation

spacefrogg
Contributor

Motivation for this change

DBus seems to resolve user IDs directly via glibc, circumventing nscd. In more
advanced setups this leads to users coming from LDAP or SSSD not being
resolved by the dbus system bus daemon. The effect for such users is that all
access to the system bus (e.g. busctl or nmcli) is denied.

Adding the respective NSS modules to the service's environment solves the issue
the same way it does for nscd.
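
To check whether a given UID resolves from inside a process, a minimal sketch follows. It assumes Python's standard `pwd` module, which calls glibc's `getpwuid` and therefore takes the same NSS path (nscd or directly loaded modules) as any other glibc client. The helper name `resolve_uid` is hypothetical, and this is not how the dbus daemon itself is implemented.

```python
import os
import pwd
from typing import Optional


def resolve_uid(uid: int) -> Optional[str]:
    """Return the user name for uid, or None if NSS cannot resolve it."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # pwd raises KeyError when no passwd entry is found for the uid.
        return None


if __name__ == "__main__":
    uid = os.getuid()
    name = resolve_uid(uid)
    if name is None:
        print(f"uid {uid} does not resolve in this environment")
    else:
        print(f"uid {uid} resolves to {name}")
```

Running this as an LDAP or SSSD user inside the affected service's environment should show whether the lookup fails there while succeeding in a login shell.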

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@arianvp
Member

arianvp commented Sep 25, 2020

Sorry for the super late (2 years late!) comment.

It would have been very useful if this commit had said what exactly was breaking.

@spacefrogg do you have any more context on what breakage exactly was fixed with this? What exactly do you mean by "dbus seems to resolve user IDs directly via glibc"?
glibc is responsible for talking to nscd, not dbus. So the rationale of this commit doesn't make too much sense to me. dbus calls glibc, and glibc calls nscd. Why would nscd be bypassed?
If it does get bypassed, that's a serious bug somewhere else, and this kind of "masks" the original source of the issue.

Adding the respective NSS modules to the service's environment solves the issue
the same way it does for nscd.

The sole reason why we run nscd is so that all services resolve NSS modules through it, such that we don't have to set LD_LIBRARY_PATH everywhere. So we shouldn't be "fixing things the same way as we did for nscd", as that means that nscd wasn't working as expected, which is a serious bug.
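
To see which running services actually carry an LD_LIBRARY_PATH, a rough sketch could read the process environment from `/proc`. This assumes Linux, and the helper names `parse_environ` and `library_path_entries` are hypothetical; reading another user's `/proc/<pid>/environ` usually requires root.

```python
from pathlib import Path
from typing import Dict, List


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse a NUL-separated environment block as found in /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        # Skip empty trailing entries and anything without a KEY=VALUE shape.
        if not entry or b"=" not in entry:
            continue
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def library_path_entries(pid: int) -> List[str]:
    """Return the LD_LIBRARY_PATH entries of a running process (Linux only)."""
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    value = parse_environ(raw).get("LD_LIBRARY_PATH", "")
    return [p for p in value.split(":") if p]
```

Calling `library_path_entries` with the PID of the system bus daemon and of nscd would show whether both were started with the NSS module paths.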

I'm kind of inclined to revert this; as nscd should be the sole source of nss_* calls to avoid ABI issues.
I have a feeling this patch might blow up in our faces sometime in the future.

@edolstra thoughts on this patch?

@spacefrogg
Contributor Author

spacefrogg commented Sep 25, 2020 via email

@arianvp
Member

arianvp commented Sep 26, 2020

Thank you for your detailed response.

Maybe, for some kind of optimisation, dbus does/did not go through the nscd call path.

dbus is unaware of nscd's existence. All the logic for delegating NSS lookups (the getpwnam/getpwuid family that getent exercises) to nscd is in glibc itself. If the nscd socket is present, glibc redirects those lookups to nscd instead of dlopen'ing NSS modules by itself. This is how we make sure that all processes in NixOS have a consistent view of what NSS modules are there without any ABI issues.

What we did observe in the past is that glibc will fall back to non-nscd lookups if nscd doesn't answer within 5 seconds. In that case nscd is bypassed and getent would indeed fail (#55276). I wonder if that's what was going on here.
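
A quick way to probe the first half of that condition is to test whether the nscd socket exists and accepts connections. The sketch below assumes glibc's default socket path `/var/run/nscd/socket` and a hypothetical helper name `nscd_reachable`; it only checks that a connection can be opened, not that nscd answers a query in time.

```python
import os
import socket
import stat

# Assumption: glibc's default nscd socket path; adjust if the system differs.
NSCD_SOCKET = "/var/run/nscd/socket"


def nscd_reachable(path: str = NSCD_SOCKET, timeout: float = 5.0) -> bool:
    """Return True if a Unix stream socket at path accepts a connection."""
    try:
        if not stat.S_ISSOCK(os.stat(path).st_mode):
            return False
    except OSError:
        # Path missing or not accessible to this user.
        return False
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # Covers refused connections, permission errors and timeouts.
            return False
    return True


if __name__ == "__main__":
    print("nscd socket reachable:", nscd_reachable())
```

Run as the same user and in the same namespace as the affected service, a `False` result would suggest glibc falls back to loading NSS modules itself.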

I am, of course, open for suggestions and further testing. I am not open for a revert based on speculation. But then, who am I.

Totally agree. And I won't revert it until I have tested my assumptions. Don't worry. I'll see if I can make a reproducer in the form of a NixOS test. Worst case is that we have a test that will make sure we do not regress on this issue again, and best case is that we find out the issue is not an issue anymore.

Do you maybe still remember what your LDAP setup looked like? Were you using sssd by any chance?

Also, do you remember what part of dbus was misbehaving? Was it a dbus service? Or specifically dbus itself?

@arianvp
Member

arianvp commented Sep 26, 2020

Note: since I see you are responding by email, and GitHub doesn't propagate comment edits through its email interface (sigh), I semi-heavily edited my initial comment to add additional context.

@spacefrogg
Contributor Author

dbus is unaware of nscd's existence. All the logic for delegating NSS lookups (the getpwnam/getpwuid family that getent exercises) to nscd is in glibc itself. If the nscd socket is present, glibc redirects those lookups to nscd instead of dlopen'ing NSS modules by itself. This is how we make sure that all processes in NixOS have a consistent view of what NSS modules are there without any ABI issues.

That is how I understood the architecture as well. Then reality happened and it no longer matched.

Do you maybe still remember what your LDAP setup looked like? Were you using sssd by any chance?

I used (and still do) sssd.

The configuration is along these lines:

        [sssd]
        services = nss, pam

        [domain/mydomain]
        id_provider = ldap
        auth_provider = krb5
        access_provider = ldap
        chpass_provider = none
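
As a sanity check that such a configuration enables the `nss` responder, a small sketch can parse the fragment. It assumes that Python's `configparser` is close enough to the INI dialect of `sssd.conf` for this purpose; the names `SAMPLE` and `enabled_services` are hypothetical.

```python
import configparser
import textwrap
from typing import List

# The fragment from above, dedented so section headers start in column 0.
SAMPLE = textwrap.dedent("""\
    [sssd]
    services = nss, pam

    [domain/mydomain]
    id_provider = ldap
    auth_provider = krb5
    access_provider = ldap
    chpass_provider = none
""")


def enabled_services(conf_text: str) -> List[str]:
    """Return the responders listed under [sssd] services."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("sssd", "services", fallback="")
    return [s.strip() for s in raw.split(",") if s.strip()]


if __name__ == "__main__":
    print(enabled_services(SAMPLE))
```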

Also do you remember what part of dbus was misbehaving? was it a dbus service? Or specifically dbus itself?

My commit message (in our corporate environment) reads: dbus system daemon resolves UIDs by hand without asking nscd.

So, maybe the socket is not (reliably) present or not accessible to it. If I remember correctly (big if), the effect was that normal users could no longer use NetworkManager and the like due to access restrictions (based on errors in dbus).
