Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the hash modulo in the derivation outputs #4348

Merged
merged 2 commits into from Dec 16, 2020
Merged

Conversation

thufschmitt
Copy link
Member

Rather than storing the derivation outputs as drvPath!outputName internally,
store them as drvHashModulo!outputName (or outputHash!outputName for
fixed-output derivations).

This makes the storage slightly more opaque, but enables an earlier
cutoff in cases where a fixed-output dependency changes (but keeps the
same output hash) − same as what we already do for input-addressed
derivations.

Depends on: #4330
Fix: #4347

@thufschmitt thufschmitt added the ca-derivations Derivations with content addressed outputs label Dec 11, 2020
@thufschmitt
Copy link
Member Author

A few notes on that:

  1. We could go a bit further for the sake of consistency, and use the exact same hash that input-addressed derivations use for computing the output path (in which case we could get rid of the output name as it's included in the path, or just keep it for information as input-addressed drv outputs do). But I'm not sure the added opacity is worth it − and it would kind-of conflict with Should we allow spitting the derivation output map per output? #4344 as the different outptus would get a different hash
  2. I couldn't find a way of storing this hash in the drv files themselves without breaking backwards compatibility, so we have to recompute it whenever we need it
  3. As a consequence of 2., this relies a lot on the in-memory caching of drv hashes modulo to not be absurdly inefficient. Not a big issue I think, but not utterly clean either

if (!goal.isAllowed(info.id.drvPath))
throw InvalidPath("cannot register unknown drv output '%s' in recursive Nix", printStorePath(info.id.drvPath));
// XXX: Should we check for something here? Probably, but I'm not sure
// how
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably registerDrvOutput shouldn't be allowed in RestrictedStore at all, since the caller is completely untrusted and supposed to be sandboxed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess once we take the time to properly implement this, we should allow it if the drv output matches an allowed path, but that's complicated to check as we only know its hash modulo and not the drv path. So I'll just forbid everything for now

if (!goal.isAllowed(id.drvPath))
throw InvalidPath("cannot query the output info for unknown derivation '%s' in recursive Nix", printStorePath(id.drvPath));
// XXX: Should we check for something here? Probably, but I'm not sure
// how
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryRealisation() should only succeed in RestrictedStore if DrvOutput refers to a derivation that the client added (via addToStore) and where it has built the output (i.e. has done a buildDerivation()). Otherwise it can leak info to which the sandboxed build isn't supposed to have access.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed… like #4348 (comment) I'll just restrict everything for now as it's slightly tricky to implement and we might change it eventually anyways.

I've opened #4353 to track it

@thufschmitt
Copy link
Member Author

Needs to double-check that this doesn't change the daemon protocol for queryDrvOutputMap

Rather than storing the derivation outputs as `drvPath!outputName` internally,
store them as `drvHashModulo!outputName` (or `outputHash!outputName` for
fixed-output derivations).

This makes the storage slightly more opaque, but enables an earlier
cutoff in cases where a fixed-output dependency changes (but keeps the
same output hash) − same as what we already do for input-addressed
derivations.
There's currently no way to properly filter them, so disallow them
altogether instead.
@thufschmitt
Copy link
Member Author

Needs to double-check that this doesn't change the daemon protocol for queryDrvOutputMap

This isn't the case, queryDrvOutputMap only manipulates StorePaths, so we're good on that front

@dpulls
Copy link

dpulls bot commented Dec 14, 2020

🎉 All dependencies have been resolved !

Base automatically changed from ca/properly-store-outputs to master December 14, 2020 14:01
@edolstra edolstra merged commit 3765174 into master Dec 16, 2020
@edolstra edolstra deleted the ca/use-hashmodulo branch December 16, 2020 11:48
thufschmitt added a commit that referenced this pull request Dec 17, 2020
PRs #4370 and #4348 had a bad interaction in that the second broke the fist
one in a not trivial way.

The issue was that since #4348 the logic for detecting whether a
derivation output is already built requires some logic that was specific
to the `LocalStore`.

It happens though that most of this logic could be upstreamed to any `Store`,
which is what this commit does.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ca-derivations Derivations with content addressed outputs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use the hash modulo for the derivation outputs
2 participants