Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libutils/hash: rename Base to HashEncoding #3680

Closed
wants to merge 3 commits into from

Conversation

zimbatm
Copy link
Member

@zimbatm zimbatm commented Jun 10, 2020

Compress the (Base, usePrefix) tuple to a single field that represents
the hash encoding. This makes more sense because SRI always has a
prefix.

This is a continuation of the work started at #3655

If this PR gets merged, the next PR will be introducing a ParsedHash subclass which retains the information on the original hash and allows hash.to_string(hash.original_encoding)

@zimbatm
Copy link
Member Author

zimbatm commented Jun 10, 2020

/cc @Ericson2314 @matthewbauer who have conflicting PRs open

@zimbatm zimbatm changed the title libutil: Rename Base to HashEncoding libutils/hash: rename Base to HashEncoding Jun 10, 2020
@zimbatm
Copy link
Member Author

zimbatm commented Jun 10, 2020

got this locally, not sure why Perl isn't finding the header (or how to debug this):

lib/Nix/Store.xs:11:10: fatal error: derivations.hh: No such file or directory
   11 | #include "derivations.hh"
      |          ^~~~~~~~~~~~~~~~

@zimbatm
Copy link
Member Author

zimbatm commented Jun 10, 2020

TODO: Discuss the naming a little bit. Does "PrefixedBase32" make sense for the old prefixed format?

@Ericson2314
Copy link
Member

The idea looks good to me! I just would take the opportunity to switch to enum struct. Besides what I think is nicer syntax (bu that's just an opinion) per https://en.cppreference.com/w/cpp/language/enum there are fewer implicit conversions with the underlying scalar type, which helps avoid improper values.

@@ -12,6 +12,18 @@ MakeError(BadHash, Error);

enum HashType : char { htUnknown, htMD5, htSHA1, htSHA256, htSHA512 };

enum HashEncoding : int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you could have

enum struct Encoding { Base16, Base32, Base64, };

struct LegacyHashEncoding { Encoding base, Bool prefix, };
struct SRI {};
typedef std::variant<LegacyHashEncoding, SRI> HashEncoding;

It's more work, but I think this would be the very best.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have an opinion about that. @edolstra ?

Note that PrefixedBase16 and PrefixedBase32 are here to stay as they are being used in core parts like the narinfo file format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @edolstra likes this, I'll happily do the extra work myself if you like, @zimbatm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about it for a while, my preference would be to keep the enumeration flat. Considering FFI and how it will work with Perl and Rust.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I do still think the enum struct is still a good idea, however---Eelco was concerned about churn in the other thread but here we are changing it anyways.

[Also, for the future, I opened dtolnay/cxx#217 as in principle, std::variant ought to work great with Rust at least.]

@zimbatm zimbatm requested a review from edolstra June 12, 2020 07:39
src/nix/hash.cc Outdated Show resolved Hide resolved
@edolstra
Copy link
Member

It's not entirely clear to me what the practical advantage of these changes is. OTOH the disadvantage (a lot of code churn) is clear.

@zimbatm
Copy link
Member Author

zimbatm commented Jun 12, 2020

Agreed, without the next change, the PR doesn't bring a lot of value. Let me then bolt on the next PR so the value becomes clearer.

@Ericson2314
Copy link
Member

@edolstra I would have thought one of the benefits of not having a stable interfaces is we need not shun churn, and can iteratively refactor things until they are nicer. (Last I checked, everyone agreed Nix was less than ideal when used as a library.) Even this PR without the follow-up I value for removing some redundancy from the printing options.

I would be more than happy to contribute and review dozens of these PRs if you change your mind about churn, so that their value would add up into something quite significant.

@zimbatm
Copy link
Member Author

zimbatm commented Jun 12, 2020

Generally, I think the idea would be to move more things into Rust and have more of the churn go there. But that's really a discussion we need to have separately.

@Ericson2314
Copy link
Member

Sure, the first thing I tried was #3416, but by now I've gotten more familiar with C++ and am happy to refactor things in either language.

@zimbatm zimbatm force-pushed the prefixed-hashes branch 3 times, most recently from bda9c2b to 2b071f1 Compare June 17, 2020 22:16
Compress the (Base, usePrefix) tuple to a single field that represents
the hash encoding. This makes more sense because SRI always has a
prefix.

This is a continuation of the work started at NixOS#3655
This will come in handy later when comparing desired and actual hashes.
@zimbatm
Copy link
Member Author

zimbatm commented Jun 18, 2020

I pushed 2 more commits to finalize the design. I didn't know how to sub-class the Hash struct so the original encoding is embedded in there.

@Ericson2314
Copy link
Member

I think this will work.

struct HashWithEncoding : Hash
{
    HashEncoding encoding;
    HashWithEncoding(std::string_view s, HashEncoding encoding HashType type = htUnknown)
        : Hash(s, type), encoding(encoding);
    std::string to_string() const {
        return to_string(encoding);
    }
}

@zimbatm
Copy link
Member Author

zimbatm commented Jun 18, 2020

related to #3650

@Ericson2314
Copy link
Member

Ericson2314 commented Jun 18, 2020

Hmm? I think #3650 is a good change, but I better made it not contain the enum struct churn PR if it's going to be merged. Not sure what it has to do with HashWithEncoding?

@@ -30,6 +41,7 @@ struct Hash
unsigned char hash[maxHashSize] = {};

HashType type = htUnknown;
HashEncoding encoding = heUnknown;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this field. The Hash type should not store an encoding. It creates all sorts of issues like whether operator == should compare the encoding.

@zimbatm
Copy link
Member Author

zimbatm commented Nov 16, 2020

Closing as this PR has completely rotted. I am not sure the change makes sense in the new code.

@zimbatm zimbatm closed this Nov 16, 2020
@zimbatm zimbatm deleted the prefixed-hashes branch November 16, 2020 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants