Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix the hash rewriting for ca-derivations #4282

Merged
merged 4 commits into from Jun 14, 2023

Conversation

thufschmitt
Copy link
Member

Add self-hash rewriting for ca-derivations.
This had apparently been forgotten in #3883, and contrary to what I though there was no test for it (and as it's mostly orthogonal to all the rest I didn't notice it otherwise until now).

@thufschmitt
Copy link
Member Author

/cc @Ericson2314

@thufschmitt thufschmitt added the ca-derivations Derivations with content addressed outputs label Nov 25, 2020
Path tmpPath = actualPath + ".tmp";
restorePath(tmpPath, *source);
deletePath(actualPath);
movePath(tmpPath, actualPath);
Copy link
Member

@edolstra edolstra Nov 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a call to canonicalisePathMetaData() is needed here. (See rewriteOutput() which does so after doing a rewrite.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we give rewriteOutput a references parameter above, I think we can deduplciate this?

  1. rewriteOutput(outputRewrites);
  2. calc hahes
  3. if (scratchPath != newInfo0.path)
        rewriteOutput({
            std::string { scratchPath.hashPart() },
            std::string { newInfo0.path.hashPart() },
        });
    

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duh indeed. Thanks, should be fixed now

@Ericson2314
Copy link
Member

Wait isn't

/* Substitute output placeholders with the scratch output paths.
We'll use during the build. */
inputRewrites[hashPlaceholder(outputName)] = worker.store.printStorePath(scratchPath);

supposed to do it?

@thufschmitt
Copy link
Member Author

Wait isn't

/* Substitute output placeholders with the scratch output paths.
We'll use during the build. */
inputRewrites[hashPlaceholder(outputName)] = worker.store.printStorePath(scratchPath);

supposed to do it?

This line is a rewriting of the input of the build. What's missing is the final rewrite of the self-references after the build

@Ericson2314
Copy link
Member

@regnat sorry, here's the output one

outputRewrites[std::string { scratchPath.hashPart() }] = std::string { finalStorePath.hashPart() };

But I don't think the finish one is being called in enough places?

@thufschmitt
Copy link
Member Author

@regnat sorry, here's the output one

outputRewrites[std::string { scratchPath.hashPart() }] = std::string { finalStorePath.hashPart() };

But I don't think the finish one is being called in enough places?

I think it is, but it's called after rewriteOutput() (and needs to be, because getting the final hash can only happen after we've rewritten the hashes of the dependencies). We could just call rewriteOutput() (once for rewriting the dependencies and once for rewriting the self-references) that should be equivalent − except that we'd need to make sure that we only warn once about the hash rewriting.

@thufschmitt thufschmitt force-pushed the fix-ca-hash-rewriting branch 2 times, most recently from 836d457 to 9715a41 Compare November 30, 2020 16:43
Comment on lines 3126 to 3129
rewriteOutput({{
std::string(scratchPath.hashPart()),
std::string(newInfo0.path.hashPart())
}});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to contradict my own prior suggestion here, but I forgot about the two sets of {..}, maybe this would be better to visually distinguish input from output?

Suggested change
rewriteOutput({{
std::string(scratchPath.hashPart()),
std::string(newInfo0.path.hashPart())
}});
rewriteOutput({
{ std::string(scratchPath.hashPart()), std::string(newInfo0.path.hashPart()) }
});

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've actually reformated it in yet-another-way 😛 (which makes me wish for #3697 to be a think), with an explicit call to the StringMap constructor because it's indeed to easy to forget what's going on otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, how does that work, isn't StringMap the outer set of braces?

(Fine with the idea though!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, how does that work, isn't StringMap the outer set of braces?

I guess that's why people say that one shouldn't write code too late in the evening 🥴

Comment on lines 3124 to 3125
if (scratchPath != newInfo0.path) {
// Also rewrite the output path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it just occurred to me, refs.first is whether we have a self reference, so we can use this to optimize slightly.

I also threw in a comment explaining. Feel free to reword :).

Suggested change
if (scratchPath != newInfo0.path) {
// Also rewrite the output path
if (refs.first) {
// patch has self-references, so should also rewrite them.
// Note that this doesn't change the end-goal CA we ca calculated above, since that calculation skips the contents of self-references.
// But rather, if we have self-references, relocate from `scratchPath`, and then *don't* do this,
// our self-references to `scratchPath` become non-self references, and then the CA is wrong!
// Instead we "corrupt" the path at `scratchPath` so when we later relocate it it's thereby uncorrupted, the opposite of the above problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, res.first is a nicer option (I kept the schatchPath != newInfo0.path condition too because I think it's useful to prevent rewriting CA derivations).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this mistake of mine @regnat :)

@Ericson2314
Copy link
Member

See #4284 (comment), I guess this one can just be the code de-dup.

@Ericson2314
Copy link
Member

Still keen on this one, lest the rewrite logic drift apart.

@thufschmitt thufschmitt removed this from the ca-derivations-mvp milestone May 5, 2021
@edolstra
Copy link
Member

@regnat I guess this is still needed? If so, could you resolve the merge conflict please?

@Ericson2314
Copy link
Member

It would be good to get this in before 2.4 too, I think.

@stale
Copy link

stale bot commented Apr 16, 2022

I marked this as stale due to inactivity. → More info

@stale stale bot added the stale label Apr 16, 2022
@stale stale bot removed the stale label Mar 11, 2023
@Ericson2314
Copy link
Member

@thufschmitt I rebased this https://github.com/Ericson2314/nix/commits/fix-ca-hash-rewriting it would be good to get it merged!

@thufschmitt
Copy link
Member Author

Thanks a lot @Ericson2314 ! I've re-rebased them and force-pushed

@Ericson2314 Ericson2314 self-assigned this Apr 17, 2023
@Ericson2314
Copy link
Member

Weird this failed, but I rebased it yet again.

@thufschmitt
Copy link
Member Author

The darwin CI failure looks quite legit and related:

nix-tests> error: ca hash mismatch importing path '/build/nix-test/tests/fetchClosure/store/y0af26shqb7ag6jplsjffw3lkbq4391k-dependencies-top';
nix-tests> specified: sha256:1k6givbgikzbx4a399vn6y5lvnysa2iw2gfm71fmihqkaqdspdd6
nix-tests> got: sha256:0jdn54c8rcjpv1y5wn1rwqfrfrxli4m2lgiba8qidahmi6mr3slv

Can you have another look?

@Ericson2314
Copy link
Member

Ericson2314 commented Apr 19, 2023

Agreed; but I was thinking it might be easier to fix post rebase?

If you like I can also just open a fresh PR and take this one over if you are busy. Maybe I can make a pr from this repo so we can both push to it.

@thufschmitt
Copy link
Member Author

Mh sorry, my brain broke down, ignore my last comment. Thanks for the rebase, I'll look at the failing test :)

@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label May 10, 2023
Giving it the same semantics as `rewriteStrings`.
Also add some tests for it
Possibly this will make it stream
Broken because of the change introduced by NixOS#4282
@thufschmitt
Copy link
Member Author

CI is green now, but I had to disable some test or older daemons because this changes the hash of CA paths with self-references (which is a shame, but also very much needed)

Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the belated review!

@Ericson2314 Ericson2314 merged commit 61a3e1f into NixOS:master Jun 14, 2023
8 checks passed
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-50/29793/1

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/content-addressed-nix-call-for-testers/12881/216

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug ca-derivations Derivations with content addressed outputs with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants