Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Digest consolidation #4037

Merged
merged 3 commits into from Feb 16, 2017
Merged

Conversation

Papierkorb
Copy link
Contributor

@Papierkorb Papierkorb commented Feb 14, 2017

Hello,

Here's the Digest consolidation as promised in #4021 :)

The original MD5 implementation used a public Context class and a private ContextImpl struct. I guess the original creators intention was to save a memory allocation in the non-yielding digest methods, and use memory allocation only for the yielding digest methods.

After a discussion in IRC, we came to the conclusion to further improve upon it by make it not use any dynamic memory allocation. As this is not the first time I needed something like this (And likely, others did too), we agreed to put a generic wrapper into the stdlib, called ValueRef(T). This wrapper basically mimics a C++ variable reference.

I'm eager to hear y'all opinions, if this is ok, or if the ValueRef(T) should not be there at all, or be part of its own PR :)

(Side note: with that there are now 5000 stdlib unit tests. Yay)

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wary of ValueRef. It's unsafe since it keeps a pointer to a stack allocated value. It's good for speed, but the pointer is passed, not kept internally, so the unsafety leaks.

If we eventually require unsafe blocks (as Rust), then accessing ValueRef will be unsafe too, and the block API will be ugly/painful to use.

Maybe dropping the block form would be simpler?

require "value_ref"
require "base64"

module Digest::Base(C)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use T for denoting a generic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Naming standards we all agree on (i.e. dont use 1 letter variable names) seem to get routinely ignored when using generics. Wouldn't ContextType be a better name? T is a good generic name for generic containers, this is not a generic container.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention.

Reading C.new below made we wondered what the hell it was, and why someone would call a struct/class like that, until I noticed it denoted a generic. Reading ContextType I would never guess it was a generic.

Reading T I immediately understand it as a generic type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an awful shame to add the feature so that our generic variables can have descriptive names, but not use it. Non-generic type names tend not to end in Type for obvious reasons, but if we made it the convention that multi-char generic names ended in Type would that make it more readable for you?

@Papierkorb
Copy link
Contributor Author

Papierkorb commented Feb 15, 2017

Maybe dropping the block form would be simpler?

The block form allows one to calculate a hash of large data easily, without having everything in memory. Use-case would be hashing large files.

Edit I'd rather have a class Context(T) in that case, similar to how it was before.

@asterite
Copy link
Member

I'm not entirely sure about the current design of digest. Providing only the block form is not good, we should provide a way to create a digest, maybe store it in an instance variable and update it as data comes. For this we need to use a class. But writing @digest : ValueRef(Digest::MD5::Context) is not very nice. I think we should just make Digest::MD5::Context a class, without having a structure. Later the compiler could remove this extra allocation with escape analysis. But we shouldn't complicate the design of the standard library because of this small allocation.

@Papierkorb
Copy link
Contributor Author

But writing @digest : ValueRef(Digest::MD5::Context) is not very nice.

That would've been inherently unsafe too, ValueRef wasn't intended to be stored.

Later the compiler could remove this extra allocation with escape analysis.

Ah, that sounds really good! In that case, I don't have much qualms about using a class there "for now", when the compiler will get rid of the expensive parts "later on" anyway.

But we shouldn't complicate the design of the standard library because of this small allocation.

Agreed.

Latest changes: Removed ValueRef again and renamed the ContextImpl structs to class Context

@asterite
Copy link
Member

I'm also not sure we should design this in a PR. For example there's no reason to use a generic module here.

We'll some time in the future design this and get back to this.

@Papierkorb
Copy link
Contributor Author

So while we wait for the CI to finish, the current PR contains:

  1. Renamed Crypto::MD5 to Digest::MD5
  2. Moved the class methods from MD5 and SHA1 into Digest::Base (#digest, #hexdigest, #base64digest, each with and without yield)
  3. Both MD5 and SHA1 inherit from Digest::Base
  4. MD5 and SHA1 have their logic right in the class, and not in a ContextImpl sub-struct
  5. The Context class has been removed and basically replaced by the classes themself

@spalladino spalladino merged commit 6ea1129 into crystal-lang:master Feb 16, 2017
@spalladino
Copy link
Contributor

Merged, thanks for the contribution @Papierkorb!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants