-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Hash#transform_keys and Hash#transform_values #4385
Conversation
303e966
to
fd1e266
Compare
@deepj the bang version can be implemented, as long as they don't change the type of the values nor keys. Would you prefer to implement those in this PR as well? |
@bcardiff I meant the bang versions cannot work in the same way (including type casting) Yes, I can add them including a notice they don't work with type casting. |
I tried to implement class Hash
def transform_keys!(&block : K -> K)
each do |key, value|
self[yield(key)] = value
delete(key)
end
end
end
h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.upcase)
puts h But |
@deepj it's because the hash in changed inside the iterator. For example if you swap I think we will need to duplicate the keys. Unless we go down with something that deals with the hash internals for reallocating the entries. So something as follow could work: class Hash
def transform_keys!(&block : K -> K)
each.to_a.each do |(key, value)|
delete(key)
self[yield(key)] = value
end
end
end
h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.upcase)
puts h Note that deleting the key should happen before inserting. Just in case the key transformation did nothing for that key. But, now i am thinking of the corner cases: h = {'a' => 1, 'b' => 2}
h.transform_keys!(&.succ)
puts h The above change leads of So we would need something like class Hash
def transform_keys!(&block : K -> K)
pairs = each.to_a
clear
pairs.each do |(key, value)|
self[yield(key)] = value
end
end
end which is essentially So, unless we really do it in-place there is not real value for bang of keys. Let's leave that out. But the |
This looks like map for keys/values on hash. Why not call it |
@RX14 One reason would be familiarity, since there's already |
I'm all for The implementation needs to use internals I think: https://carc.in/#/r/1zwm class Hash
def map_keys!
current = @first
while current
current.key = yield current.key
current = current.fore
end
end
private class Entry
setter key : K # key has a getter only, adding a setter here
end
end
h = {'a' => 1, 'b' => 2}
h.map_keys!(&.succ)
puts h # => {'b' => 1, 'c' => 2} This impl has no copy (well maybe when copying from the block result to the Entry, but I'm not sure..), so it should be the fastest and the safest (no allocations or anything that could fail) |
What if two keys would be mapped to the same value? It should raise, not create inconsistent UB'ish state imho. https://carc.in/#/r/200a |
b2e8dc4
to
fba15fb
Compare
src/hash.cr
Outdated
# | ||
# ``` | ||
# hash = {"a" => 1, "b" => 2, "c" => 3} | ||
# hash.transform_key! { |key| key.succ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be transform_keys!
(you missed the s
)
src/hash.cr
Outdated
# hash = {"a" => 1, "b" => 2, "c" => 3} | ||
# hash.transform_key! { |key| key.succ } | ||
# hash # => {"b" => 1, "c" => 2, "d" => 3} | ||
def transform_keys!(&block : K -> K) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is breaking the Hash invariant. A call to rehash
is needed at the end or the final state won't be useful for user.
I don't fully like that the Entry#key
is changed from getter
to property
since that was preventing the invariant to be broken.
I would say to either base the implementation on rehash
(zeroing the buckets, creating new Entry, iterate the old entries, storing new entries in the right place) or remove this implementation.
If any implementation is added, I would say to assert that actual.should eq(expected)
and expected.should eq(actual)
so we can detect if the code messes up with the invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From @konovod comment I would say that insert_in_bucket
should be used also so we avoid having more than one entry with the same key.
If the user provides a function that collide two keys into one there is no guarantee which value will be preserved. It's that or raising.
I thought this was merged, @deepj do you want to continue working on it? If we can't find a consensus for mutating methods, at least it would be great to have this for non-mutating ones. |
I think you can also keep |
fba15fb
to
378262a
Compare
8266e31
to
1527126
Compare
LGTM, can this have some reviews? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR adds similar functionality in Ruby 2.4 which is
Hash#transform_keys
andHash#transform_values
.Hash#transform_keys
returns a new hash with all keys converted using the block operation.Hash#transform_values
returns a new hash with the results of running block once for every value.I haven't added bang equivalents since what I know it's not possible to modify types of keys/values in self.