Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Set#&, Set#clone and Set.new #4047

Merged
merged 1 commit into from Feb 20, 2017
Merged

Optimize Set#&, Set#clone and Set.new #4047

merged 1 commit into from Feb 20, 2017

Conversation

karlseguin
Copy link
Contributor

@karlseguin karlseguin commented Feb 17, 2017

UPDATE:
I also init the size on a Union now (as the size has to be at least the Max size of the two sets, this has similar gains as changes to clone and new)

Changes to clone and new involve setting the initial capacity. Always in the case of clone, and an overload for Indexable(T) in the case of new for access to an efficient size. (I also expect this to generate less garbage, so the real world gain from should be even better)

0.25.0
new 100 items 243.51k (  4.11µs) (± 7.58%) 
new 1 000 000 items  22.69  ( 44.08ms) (± 3.39%)
clone 100 items 239.94k (  4.17µs) (± 5.79%)
clone 1 000 000 items  21.42  ( 46.69ms) (± 3.56%)
PR
new 100 items 282.04k (  3.55µs) (± 9.10%)
new 1 000 000 items  25.95  ( 38.54ms) (± 5.07%)
clone 100 items  280.8k (  3.56µs) (± 5.76%)
clone 1 000 000 items  24.84  ( 40.26ms) (± 4.88%)

For & it's the same optimization as what intersects? does: iterating over the smaller of the two:

0.25.0
sparse & small  17.54k ( 57.01µs) (± 2.78%)
dense with small  26.63k ( 37.56µs) (± 2.99%)
small with sparse    9.4M (106.34ns) (± 3.39%)
small with dense   9.34M (107.01ns) (± 2.34%)

PR
sparse & small    7.8M ( 128.2ns) (± 2.59%)
dense with small   8.03M (124.61ns) (± 2.46%)
small with sparse   7.85M (127.32ns) (± 2.88%)
small with dense   8.07M (123.98ns) (± 2.38%)

small is 1 item, and both sparse and dense are in the 10K range.

@sdogruyol
Copy link
Member

Am i reading this correct for &?

Sparse & small 17.54k vs 7.8M is 444 times faster 😮 ?

@karlseguin
Copy link
Contributor Author

@sdogruyol

small = Set.new([1])
large = Set.new(Array.new(1_000_000) {|i| i})

In the the existing code, if you did small & large you'd only iterate through smalls single item and check its existence in large. However, if you did large & small you'd iterate all million items and check all million of them. But in the end, both give the same result.

With the PR, it will always let the small one "lead". The cost is figuring out which is smaller (self.size < other.size)...and while intersecting 1 to 1_000_000 is extreme, the gain should be fairly linear relative to the size difference.

@sdogruyol
Copy link
Member

@karlseguin thanks for the detailed explanation 🙏

src/set.cr Outdated
@@ -37,6 +37,11 @@ struct Set(T)
@hash = Hash(T, Nil).new(initial_capacity: initial_capacity)
end

# Optimized version of `new` used when `other` is also an `Indexable`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small correction: other in this case is a parameter, so use asterisks to wrap instead of backticks *other*.

Mode details on the documentation style here: https://crystal-lang.org/docs/conventions/documenting_code.html

@luislavena
Copy link
Contributor

Excellent work @karlseguin, let's wait for dev team to review! 😸

Set#& will now iterate over the smaller of the two sets (similar to what
Set#intersects? does).

The new set created by Set#clone and Set#| now gets a proper initial
capacity.

Added overload to Set.new for Indexable(T) to use Indexable#size to set
an initial capacity.

improve union performance by sizing the resulting set
@asterite asterite added this to the 0.21.0 milestone Feb 20, 2017
@asterite asterite merged commit 9d04cf3 into crystal-lang:master Feb 20, 2017
@asterite
Copy link
Member

Thank you again! 💚

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants