[2.4] Hash key randomization, universal hashing, new Hash impl #4708

headius · 2017-07-07T00:10:21Z

MRI has implemented a new Hash that has some innovative features:

- Open addressing for better cache locality
- Automatically switching over to a secure hash when a "hash DoS" attack is detected

We currently have the original "chained buckets" implementation of Hash with no generalized hash randomization except for strings. This is open work to be done for JRuby.
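As a rough illustration of what "hash randomization" means here, the sketch below mixes a per-process random seed into a hash computed over a key's bytes, so the same key hashes differently in each process. The class name `SeededHashSketch` and the choice of FNV-1a are assumptions made only for this example. Neither MRI nor JRuby is claimed to work this way, and a seeded FNV-1a is not DoS-resistant on its own; a real implementation would use a keyed hash designed for that purpose.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

/** Hypothetical sketch of seed-dependent hashing; not JRuby or MRI code. */
final class SeededHashSketch {
    // Chosen once per process, so hash values differ between runs.
    private static final long PROCESS_SEED = new SecureRandom().nextLong();

    /** FNV-1a (64-bit) with the seed folded into the initial state. */
    public static long hash(byte[] data, long seed) {
        long h = seed ^ 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;             // FNV 64-bit prime
        }
        return h;
    }

    /** Hashes a string with the per-process seed. */
    public static long hash(String s) {
        return hash(s.getBytes(StandardCharsets.UTF_8), PROCESS_SEED);
    }
}
```

Within one process the result is stable for equal keys, which is all a Hash needs; only the values across processes (or across seeds) change.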

See https://bugs.ruby-lang.org/issues/12142
See https://bugs.ruby-lang.org/issues/13002

Part of TBD work for 2.4 support in #4293.

We do not randomize all hash calculation, even though MRI made the decision to do that at some point. I have opened #4708 to track this and a possible reimplementation of our Hash to compare to the open-addressing version introduced in MRI 2.4.

headius · 2017-07-07T00:13:28Z

Tag #4687 for other unimplemented features.

See jruby#4708

This commit implements two approaches to improve the performance of RubyHash. First, it switches from closed addressing (doubly linked list) to open addressing for better cache locality. Second, it removes almost all RubyHashEntry objects to reduce memory allocation, which also helps cache locality. In addition, small hashes (fewer than 8 entries) are now implemented via a linear search, which reduces memory allocation for buckets. For fast bucket skips in this case, we maintain a hashes array that caches the hash values. Implements jruby#4708 & jruby#2989
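To make the open addressing idea concrete, here is a minimal sketch in which keys and values live directly in flat arrays and a collision is resolved by probing the next slot, so no per-entry node object is allocated. The class `OpenAddressingSketch`, the linear probing strategy, and the 0.5 load factor are all assumptions for illustration; this is not the RubyHash code from the commit. It also leaves out deletion and does not preserve insertion order, both of which a real Ruby Hash needs.

```java
import java.util.Objects;

/** Hypothetical open addressing map with linear probing; not the RubyHash code. */
final class OpenAddressingSketch<K, V> {
    private Object[] keys = new Object[8];    // capacity is always a power of two
    private Object[] values = new Object[8];
    private int size;

    @SuppressWarnings("unchecked")
    public V get(K key) {
        Objects.requireNonNull(key);
        int i = findSlot(key, keys);
        return keys[i] == null ? null : (V) values[i];
    }

    public void put(K key, V value) {
        Objects.requireNonNull(key);
        // Keep the table at most half full so probing always reaches an empty slot.
        if ((size + 1) * 2 > keys.length) grow();
        int i = findSlot(key, keys);
        if (keys[i] == null) {
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    public int size() {
        return size;
    }

    /** Returns the slot holding the key, or the first empty slot on its probe path. */
    private static int findSlot(Object key, Object[] table) {
        int mask = table.length - 1;
        int h = key.hashCode();
        int i = (h ^ (h >>> 16)) & mask;      // fold high bits into the index
        while (table[i] != null && !table[i].equals(key)) {
            i = (i + 1) & mask;               // linear probe: try the adjacent slot
        }
        return i;
    }

    /** Doubles the capacity and reinserts every existing entry. */
    private void grow() {
        Object[] oldKeys = keys;
        Object[] oldValues = values;
        Object[] newKeys = new Object[oldKeys.length * 2];
        Object[] newValues = new Object[oldKeys.length * 2];
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldKeys[j] != null) {
                int i = findSlot(oldKeys[j], newKeys);
                newKeys[i] = oldKeys[j];
                newValues[i] = oldValues[j];
            }
        }
        keys = newKeys;
        values = newValues;
    }
}
```

Because a probe walks adjacent array slots instead of following pointers to separately allocated nodes, a lookup tends to touch fewer cache lines, which is the locality benefit the commit message refers to.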

eregon · 2018-07-13T16:38:09Z

> Automatically switching over to a secure hash when a "hash DoS" attack is detected.

AFAIK this was removed later, as it was discovered to be a user-visible change (2 different objects with same hash and eql? could no longer find the same key in a Hash IIRC, mentioned in https://bugs.ruby-lang.org/issues/13002).

headius · 2018-07-13T20:14:41Z

@eregon Thanks, makes sense.


... to improve performance by leveraging better cache locality. This switches from a closed addressing hash algorithm (linked list) to open addressing, which has better cache locality on modern CPU architectures. Furthermore, we removed almost all RubyHashEntry objects to reduce memory allocation. MRI has implemented this since 2.4; see https://bugs.ruby-lang.org/issues/12142. Small hashes (fewer than 8 entries) are now implemented via a linear search, which reduces memory allocation in this case and has almost no performance impact. For a fast bucket skip in this case, we maintain a hashes array that caches the hash values. Implements jruby#4708 & jruby#2989
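The small-hash case described above can be sketched as follows: no bucket table at all, only a linear scan over a handful of entries, with a parallel `int[]` of cached hash values so that most non-matching entries are skipped by an integer comparison before `equals` is called. The class `SmallHashSketch`, the flat key/value layout, and the limit of 7 entries (taken from "fewer than 8") are assumptions for illustration, not the actual RubyHash layout.

```java
/** Hypothetical small-hash representation using linear search; not the RubyHash code. */
final class SmallHashSketch {
    // "Fewer than 8 entries" in the commit message; the exact threshold is an assumption.
    static final int MAX_SMALL_ENTRIES = 7;

    private final int[] hashes = new int[MAX_SMALL_ENTRIES];          // cached hash per entry
    private final Object[] entries = new Object[MAX_SMALL_ENTRIES * 2]; // key at 2*i, value at 2*i+1
    private int size;

    public Object get(Object key) {
        int i = indexOf(key, key.hashCode());
        return i < 0 ? null : entries[2 * i + 1];
    }

    /**
     * Inserts or updates an entry. Returns false when the small representation
     * is full, at which point a caller would switch to a bucketed table.
     */
    public boolean put(Object key, Object value) {
        int h = key.hashCode();
        int i = indexOf(key, h);
        if (i >= 0) {
            entries[2 * i + 1] = value;
            return true;
        }
        if (size == MAX_SMALL_ENTRIES) return false;
        hashes[size] = h;
        entries[2 * size] = key;
        entries[2 * size + 1] = value;
        size++;
        return true;
    }

    public int size() {
        return size;
    }

    /** Linear scan; the cheap int comparison filters entries before equals() runs. */
    private int indexOf(Object key, int h) {
        for (int i = 0; i < size; i++) {
            if (hashes[i] == h && entries[2 * i].equals(key)) return i;
        }
        return -1;
    }
}
```

Appending to the end of the arrays also keeps entries in insertion order, which matches Ruby's Hash semantics for this simple insert-only case.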

kares · 2018-09-07T08:57:01Z

guess we can call this a day for 9.2.1 (with @ChrisBr excellent open-addressing work)
... although the description mentions some more things such as specialization?

ChrisBr · 2018-09-07T09:13:35Z

According to @eregon this was removed from CRuby later on... So yep, should be fine to close

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

Inital implementation of open addressing hash algorithm

7d86859

See jruby#4708

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

Inital implementation of open addressing hash algorithm

9a35611

See jruby#4708

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

Inital implementation of open addressing hash algorithm

2c0cd9f

See jruby#4708

ChrisBr mentioned this issue Jun 11, 2018

Open addressing algorithm for RubyHash #5215

Merged

kares added this to the JRuby 9.2.1.0 milestone Sep 7, 2018

kares closed this as completed Sep 7, 2018
