Add named tuples #2573

Merged
merged 1 commit into from
May 10, 2016


asterite (Member) commented May 9, 2016

This PR adds named tuples to the language. What a Tuple is for an Array, a NamedTuple is for a Hash. You can read the docs about the new type here.

A small example (from the docs):

language = {name: "Crystal", year: 2011} # NamedTuple(name: String, year: Int32)
language[:name]  # => "Crystal" (String)
language[:year]  # => 2011      (Int32)
language[:other] # compile time error

Some things to note:

  1. A value is fetched with []. This doesn't work: language.name. There are two reasons, one technical and one for consistency. The first is that you can't currently add methods to a generic instance type (only to a non-instantiated generic type). The second is that this, in my opinion, looks much more consistent with Tuple: indexing with a number literal works for Tuple, indexing with a symbol literal works for NamedTuple. And since NamedTuple is like a compile-time Hash, it makes sense. There's also the fact that this way you can have a name like "size": {size: 1}[:size] always works, but {size: 2}.size would be... confusing.
  2. The syntax "conflicts" with that of a Hash with symbol keys. In fact, this syntax will now be a NamedTuple literal, and never a Hash literal. This is a breaking change, but the fix is easy: use :foo => 1 instead of foo: 1. I think this is also consistent with other elements in the language, especially named arguments, because after this PR we'll add support for splatting a named tuple into a method's arguments.
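
To make the second point concrete, here is a minimal sketch of the two literal forms side by side. It assumes a compiler that includes this PR; the variable names are only illustrative.

```crystal
# With this change, `key: value` inside braces is a NamedTuple literal...
tuple = {foo: 1}

# ...and a Hash with symbol keys has to be written with `=>`.
hash = {:foo => 1}

puts typeof(tuple) # NamedTuple(foo: Int32)
puts typeof(hash)  # Hash(Symbol, Int32)

# Both are indexed with a symbol, but only the Hash lives on the heap.
puts tuple[:foo] # 1
puts hash[:foo]  # 1
```

Code that relied on {foo: 1} being a Hash would need the => form after the change.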

At runtime, a named tuple is represented much like a tuple: a contiguous series of values (it's a struct), so it's very efficient. Creating a named tuple involves no heap allocation, unlike a Hash. Accessing a key known at compile time is O(1); if the key is only known at runtime it's O(N), with N the number of names, though LLVM will likely optimize this to a jump table.
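
The difference between a compile-time key and a runtime key can be sketched as follows. This assumes the NamedTuple API as it exists in current Crystal (in particular the `[]?` lookup), which may differ in detail from the version in this PR.

```crystal
language = {name: "Crystal", year: 2011}

# Literal key: resolved at compile time, so the result has an exact type.
name = language[:name]
puts typeof(name) # String

# Key held in a variable: looked up at runtime by comparing names, so the
# static type of the result is a union of the value types.
key = :year
value = language[key]
puts value # 2011

# A missing runtime key raises with `[]`; `[]?` returns nil instead.
missing = :other
puts language[missing]?.inspect # nil
```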

I believe named tuples will be an essential part of the language. Not in this PR, but you'll be able to splat named arguments. You can read a bit about that here.

Another example: in some places a tuple or nil is returned, and the indices are not very clear. For example:

def some_method
  if some_condition
    {"Crystal", 2011}
  else
    nil
  end
end

value = some_method
if value
  name, year = value
  # or use value[0], value[1]
end

That's not very clear: we have to remember what's in each position. We could create a record, but it's tedious. With named tuples:

def some_method
  if some_condition
    {name: "Crystal", year: 2011}
  else
    nil
  end
end

value = some_method
if value
  puts value[:name]
  puts value[:year]
end

And, of course, if we do value[:bar] we'll get a compile-time error, so it's as safe as using a tuple, and also as efficient :-)

Also included in this PR, as an example (but a very useful one), are to_json and to_yaml for named tuples. With that we can do:

require "json"

{language: "Crystal", year: 2011}.to_json # => %({"language":"Crystal","year":2011})

The best thing is that this doesn't involve creating a Hash. I always see this in Ruby and I think "Ugh, here we are creating a hash (memory allocation), assigning its elements and then dumping them to json and the Hash is immediately discarded, that's so ugly". Well, that won't be the case anymore in Crystal ;-)

However, there's a problem: what if we want to quickly generate a JSON object but the key has spaces in it? There's currently no way to create a named tuple with such key. The most consistent way would be:

{"key with spaces": 1}

However, that's currently a Hash literal with a string key. I'd like to change that to also be a named tuple literal. That maximizes consistency: a hash literal is always written with =>, and a named tuple literal is always written with :, which matches named arguments.

In fact, with the above we could do something like this, once we have splat for named arguments:

# Obviously a dummy method, just to show the usage
def to_json(**args)
  args.to_json
end

tuple = {"hello world": 1}
to_json(**tuple)

So we can actually pass named arguments to a method, and they can have spaces. This can be useful for example in gelf-crystal, in which you currently have to do:

logger.debug({ "short_message" => "some short message", "_extra_var" => "some var"})

But you'll be able to also use it like this:

logger.debug short_message: "some short message", _extra_var: "some var", "another value": "foo"

And this last way won't involve creating a hash in the heap.
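
A minimal sketch of that calling style, assuming a compiler where double splats (`**`) are available, which is not part of this PR. `debug_line` is a hypothetical stand-in for a logger method, not gelf-crystal's actual API.

```crystal
# Hypothetical helper: captures all named arguments as a NamedTuple.
def debug_line(**fields)
  # `fields` is a stack-allocated NamedTuple, not a Hash.
  fields.map { |key, value| "#{key}=#{value}" }.join(" ")
end

# Named arguments are collected directly.
puts debug_line(short_message: "some short message", _extra_var: "some var")
# short_message=some short message _extra_var=some var

# An existing named tuple can be splatted into the call.
options = {short_message: "from a tuple", level: 7}
puts debug_line(**options)
# short_message=from a tuple level=7
```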

This, of course, is also a breaking change. For example we have HTTP::Headers{"Content-Type": "text/plain"} in a few places, but I'd like to change it anyway. I still have to discuss all this with @waj, though :-)

@@ -104,6 +104,9 @@ module Crystal
types["Tuple"] = tuple = @tuple = TupleType.new self, self, "Tuple", value, ["T"]
tuple.allowed_in_generics = false

types["NamedTuple"] = named_tuple = @named_tuple = NamedTupleType.new self, self, "NamedTuple", value, ["T"]
tuple.allowed_in_generics = false
Contributor

It should maybe be named_tuple.allowed_in_generics = false

Member Author

Indeed, thanks!

@asterite asterite force-pushed the feature/named_tuples branch from 9bc1104 to 11b81a5 Compare May 9, 2016 17:03
refi64 (Contributor) commented May 9, 2016

At runtime, a tuple is represented in a similar way to a tuple

Do you mean "a named tuple"?

will (Contributor) commented May 9, 2016

If the values in named tuples are only accessible by name, and not by position, they could be rearranged and packed to cut out alignment slop.

sdogruyol (Member)

I just want to see some benchmarks for Named Tuple vs Hash.

Also I'm pretty sure that most of the newcomers will be really confused about this.

sdogruyol (Member)

Meanwhile I did some quick benchmarking for JSON serialization. It seems named tuples are really efficient.

require "benchmark"
require "json"

Benchmark.ips do |x|

  x.report "named tuple" do
    language = {name: "Crystal", year: 2011}.to_json
  end

  x.report "hash" do
    language = {"name" => "Crystal", "year" => 2011}.to_json
  end
end
named tuple   2.15M (± 2.09%)       fastest
       hash 973.86k (± 6.77%)  2.20× slower

asterite (Member Author) commented May 9, 2016

@will We were actually thinking of making a NamedTuple remember the names' order inside it. So {x: 1, y: 'a'} and {y: 'a', x: 1} would be different types, but they would be compatible (though I'm still not sure about that, mostly because it's a lot harder to implement, but I'll try).

@asterite asterite closed this in adc2d37 May 9, 2016
@asterite asterite reopened this May 9, 2016
@asterite asterite force-pushed the feature/named_tuples branch from 11b81a5 to de6925e Compare May 9, 2016 18:10
ozra (Contributor) commented May 9, 2016

Mmmm, I like named tuples! Not a fan of making order have significance though, seems like added complexity for nothing.

rmosolgo (Contributor) commented May 9, 2016

I'm also curious about ordering the keys: what is the advantage of doing that? Why distinguish between {x: 1, y: 'a'} and {y: 'a', x: 1}?

asterite (Member Author) commented May 9, 2016

The main benefit of ordering is that this:

{
  id: 1,
  name: "Hello",
  extra: true
}.to_pretty_json 

will have, as an output:

{
  "id": 1,
  "name": "Hello",
  "extra": true
}

And not this:

{
  "extra": true,
  "id": 1,
  "name": "Hello"
}

The second output would be like that because keys would be sorted lexicographically.

A similar example:

{foo: 1, bar: 2}.to_s # => "{foo: 1, bar: 2}"

# But without order: "{bar: 2, foo: 1}"

In any case, it's a small thing, but if order can be preserved it's more intuitive. It's also compatible with how Hash preserves order of insertion.

And tuples with same names and types but different order will still be compatible. When doing the union of such compatible types, the first one will be preserved (so that if you have an instance variable with such type it will remain with such type).
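
A short sketch of the behavior described here, assuming the order-preserving design as it works in current Crystal (the details at the time of this PR may have differed):

```crystal
a = {foo: 1, bar: 2}
b = {bar: 2, foo: 1}

# Each value remembers the order in which its names were written.
puts a.to_s # {foo: 1, bar: 2}
puts b.to_s # {bar: 2, foo: 1}
p a.keys    # {:foo, :bar}
p b.keys    # {:bar, :foo}

# Equality only looks at names and values, not at their order.
puts a == b # true
```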

asterite (Member Author) commented May 9, 2016

We also decided that we won't change {"hello world": 1} to be a named tuple. It will remain a Hash with a string key, mostly because a named tuple should be a representation of a method's arguments. That they can be used for JSON is a nice feature, but not the main reason for their existence.

@asterite asterite force-pushed the feature/named_tuples branch from de6925e to 3cc7f2d Compare May 9, 2016 20:01
ysbaddaden (Contributor)

I can only see a bright future for this! An immediate use case is Rails-like view helpers, for example.

The syntax is a little confusing with Ruby, but I guess with enough newcomer documentation and practice this will fade out.

@asterite shouldn't { "some space": value } break with "invalid NamedTuple key" instead of being coerced into a Hash? Maybe Hash should always use => now, so both syntaxes are distinct?

kumpelblase2 (Contributor)

I think named tuples are great, but I also have doubts about making position matter. Preserving the order of defined properties in a JSON object gains nothing: JSON doesn't care, and the resulting objects are identical.

I'd only like ordering if it serves the purpose of identifying elements (such as in arrays, lists, tuples ...) but for named tuples we already have a way of identifying an element: the name. {foo: 1, bar: 2} and {bar: 2, foo: 1} should be the exact same thing just like foo(x: 1, y: 2) and foo(y: 2, x: 1) call the method foo with the same exact arguments.

ysbaddaden (Contributor)

A NamedTuple is merely a Tuple with a second list for keys, hence it's an enhanced Tuple and definitely not a Hash.

Not retaining order would make the #values method useless —it would return values in whatever order, breaking the Tuple contract. The #hash method takes care to sort keys before hashing, so NamedTuples with the same keys and values but in different orders are actually considered equal. Perfect.
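
The two properties mentioned here can be checked with a small sketch. It assumes `#values` and `#hash` behave as described in this comment (declaration order for values, keys sorted before hashing), which matches current Crystal as far as I know.

```crystal
a = {id: 1, name: "Hello"}
b = {name: "Hello", id: 1}

# #values follows declaration order, like the elements of a plain Tuple.
p a.values # {1, "Hello"}
p b.values # {"Hello", 1}

# #hash ignores order, so both could serve as the same Hash key.
puts a.hash == b.hash
```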

Definitely 👍 for me.

@asterite asterite force-pushed the feature/named_tuples branch from 3cc7f2d to 01ef76a Compare May 10, 2016 13:06
asterite (Member Author)

@ysbaddaden We discussed using {"foo": 1} for named tuples with @waj and he said that:

  1. The JSON example where I need keys with spaces isn't very real, APIs don't use spaces for identifiers.
  2. Since a named tuple will be able to match method arguments, and since method arguments don't allow names with spaces, it doesn't make a lot of sense for them to have spaces.
  3. Hashes with string keys are very common and useful, so being able to write {"foo": 1} or HTTP::Headers{"Content-Type": "text/plain"} instead of {"foo" => 1} or HTTP::Headers{"Content-Type" => "text/plain"} is nice. So he'd like to keep this syntax.

As for me, I only have doubts about point 3. Maybe hash keys with string literals are common, and being able to write them with colons instead of arrows is probably a bit faster and maybe easier to read. However, consistency is lost because hashes now use => and : (well, that's also the case now, but now I'm not sure I like it that way). I know some Ruby programmers that don't like : being used in hashes for this same reason: having two ways to do the same thing, and having to learn two things. It might be a small thing, but I really appreciate consistency in a language, and with time your brain will start to recognize : as a named something, and => as a hash.

But I don't have a very strong opinion about this last point, so for now string keys with : will stay like that.

@asterite asterite mentioned this pull request May 10, 2016
@asterite asterite merged commit 5024487 into master May 10, 2016
@asterite asterite deleted the feature/named_tuples branch May 10, 2016 14:44
Gangwolf

Hello, I would appreciate any thoughts on #2629.

cristianoliveira referenced this pull request in marceloboeira/bojack Aug 12, 2016
Instead of giving key and value to commands, give them the params
structure, for them to lookup for the params.

Yes, the way it is right now may lead to several consistency errors, but
it is a step between now and #12.