Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] JSON.mapping: add store_other_attributes_in flag #2834

Closed
wants to merge 1 commit into from

Conversation

asterite
Copy link
Member

This PR isn't finished but I'd like to start and discuss the design around this.

Check the added specs to see what this is all about. This idea originated here

The things to discuss are:

  • The name of the JSON.mapping parameter
  • What if we want to store the extra attributes as raw strings, to later parse them as we want? Of course we can just call to_s on each hash value, but that's not memory efficient if we will lazily parse the string. So, an option to invoke pull.read_raw instead of creating a JSON::Any. I can't come up with a good API for this. Maybe a second parameter, store_other_attributes_as_raw_strings: true?

We could then copy this to YAML.mapping.

@jhass
Copy link
Member

jhass commented Jun 14, 2016

My biggest concern is that extra isn't the most conflict free name one can imagine. But it's admittedly hard to come up with a good one. Which leads my thoughts towards API that lets me pick it:

 JSON.mapping({
   x: Int32, y: Int32,
 }, extra_attributes: {key: "extra", type: Hash(String, JSON::Any)})


 JSON.mapping({
   x: Int32, y: Int32,
 }, extra_attributes: {key: "unparsed", type: String::RawConverter})

key defines the name of the attribute to collect extra attributes in. The idea is that the passed named tuple is identical to or at least strongly resembles that passed to "regular" attributes of the mapping. A bit more verbose but the usecases for this are not that common anyway.

@asterite
Copy link
Member Author

I like that, but since type is always going to be either that has or the raw string, maybe we can have:

 JSON.mapping({
   x: Int32, y: Int32,
 }, extra_attributes: {key: "extra"}) # same as raw: false


 JSON.mapping({
   x: Int32, y: Int32,
 }, extra_attributes: {key: "extra", raw: true})

Using extra_attributes as a name was only my initial though and had the same concerns as you, but I guess it's fine (shorter and kind of obvious).

We could even support just: extra_attributes: "extra" as an equivalent to passing raw: false (similar to how in the mapping you can specify a type or a "hash" with options).

@jhass
Copy link
Member

jhass commented Jun 14, 2016

I'm actually not sure it'll stay "just those two" for ever, I can't say for sure that using a user defined "collector" can't make sense there.

@jessedoyle
Copy link
Contributor

I really like the concept behind this PR. Just thought I'd chime in on the parameter name:

What about other?:

 JSON.mapping({
   x: Int32, y: Int32,
 }, other: {key: "extra"})


 JSON.mapping({
   x: Int32, y: Int32,
 }, other: {key: "extra", raw: true})

It would be short and (I believe) more readable.

@asterite
Copy link
Member Author

I see only one use case for this: the JSON object you get has properties you are not interested about and you just want to keep them to forward them when serializing the object back to JSON, or when passing it somewhere else. So, they are opaque. If they weren't opaque you would mention them in the mapping itself. So I can't imagine why you'd like to treat them in a special way other than a hash of opaque data or a hash of opaque string data.

We could probably have converters, or have a callback invoked on each unknown key, passing the pull parser, but that's not really the point, and it would complicate the API a lot.

In the case of sidekiq these properties are opaque to the class defining the mapping, and they are passed to plugins which know what type to expect there. Other cases will be similar.

Also, the name key doesn't look good to me. It's not a key in the JSON mapping, it's in which instance variable or property we want to store the extra data. I also like other because it's shorter and also kind of obvious as to what it does, specially when followed by the mapping itself. Maybe then:

JSON.mapping({
   x: Int32, y: Int32,
 }, other: extra)

JSON.mapping({
   x: Int32, y: Int32,
 }, other: {property: extra})

 JSON.mapping({
   x: Int32, y: Int32,
 }, otra: {property: "extra", raw: true})

@asterite
Copy link
Member Author

On the other hand, let me take that back. Here's the use case we are trying to solve. A plugin would like to register an extra attribute to be stored in the job, and later get that attribute back. So maybe we do want to have a callback of some sort, maybe specify which methods on the instance gets called, when an unknown attribute is reached, and pass the pull parser from there, so it's as type safe as possible.

@asterite asterite mentioned this pull request Jul 13, 2016
@spalladino spalladino removed the RFC label Jan 9, 2017
@asterite asterite closed this May 24, 2017
@asterite asterite deleted the feature/json_mapping_unknown_attributes branch April 27, 2018 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants