-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Truffle] Encoding#replicate #4161
Conversation
👍 |
private final DynamicObject[] ENCODING_LIST_BY_ENCODING_LIST_INDEX = new DynamicObject[EncodingDB.getEncodings().size()]; | ||
private final DynamicObject[] ENCODING_LIST_BY_ENCODING_INDEX = new DynamicObject[EncodingDB.getEncodings().size()]; | ||
private final List<DynamicObject> ENCODING_LIST_BY_ENCODING_LIST_INDEX = new ArrayList<DynamicObject>(EncodingDB.getEncodings().size()); | ||
private final Map<Integer, DynamicObject> ENCODING_LIST_BY_ENCODING_INDEX = new HashMap<Integer,DynamicObject>(EncodingDB.getEncodings().size()); | ||
private final Map<String, DynamicObject> LOOKUP = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These will probably need some synchronization if encodings can be added after startup.
Maybe a CopyOnWriteArrayList
and a ConcurrentHashMap
, or synchronize methods accessing those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the synchronized
keyword as we discussed.
It is pretty difficult to make updates in the Also, I don't think JRuby replicates correctly currently because I don't see the Encoding appear in the Encoding.name_list or Encoding.list. |
You've probably worked this out by now, but |
@eregon would know better than me, but I think |
@nirvdrum, @eregon would know better than me too here, correct me if I am wrong ... I discussed choosing I thought I agree that |
I don't think you really need to worry about consistency all that much in general. Replicating an encoding doesn't affect the indices or name mappings of existing encodings in EncodingDB or EncodingManager. So, if a thread is looking at an older version of one of the already bootstrapped data structures, they should still get valid data. The contention is really around one thread replicating an encoding and another thread trying to use it, which should be handled by user code in my opinion. As long as a single thread's view doesn't become inconsistent, I think we're okay. |
return ENCODING_LIST_BY_ENCODING_LIST_INDEX; | ||
} | ||
|
||
public DynamicObject getRubyEncoding(Encoding encoding) { | ||
DynamicObject rubyEncoding = ENCODING_LIST_BY_ENCODING_INDEX[encoding.getIndex()]; | ||
@TruffleBoundary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HashMap.get caused Too deep inlining, probably caused by recursive inlining.
@@ -55,44 +57,46 @@ public static DynamicObject newRubyEncoding(RubyContext context, Encoding encodi | |||
return Layouts.ENCODING.createEncoding(context.getCoreLibrary().getEncodingFactory(), encoding, string, dummy); | |||
} | |||
|
|||
public DynamicObject[] getUnsafeEncodingList() { | |||
public synchronized List<DynamicObject> getUnsafeEncodingList() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This synchronized is useless though :p
Since the full list is returned, there could be ConcurrentModificationException when iterating while it's being modified.
So either use a CopyOnWriteArrayList here or replace with a helper that just returns one encoding and not the whole list.
@nirvdrum @bjfish I'd like to have a look in my IDE and make this thread-safe after your PR, could you remove the |
|
||
} | ||
|
||
@Primitive(name = "encoding_get_object_encoding_by_index", needsSelf = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just encoding_get_encoding_by_index
?
Since anyway primitives return Ruby-level values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed at 79e5a8a
79e5a8a
to
a8367e1
Compare
Let's get this merged as soon as it's green! |
private final DynamicObject[] ENCODING_LIST_BY_ENCODING_LIST_INDEX = new DynamicObject[EncodingDB.getEncodings().size()]; | ||
private final DynamicObject[] ENCODING_LIST_BY_ENCODING_INDEX = new DynamicObject[EncodingDB.getEncodings().size()]; | ||
private final List<DynamicObject> ENCODING_LIST_BY_ENCODING_LIST_INDEX = new ArrayList<DynamicObject>(EncodingDB.getEncodings().size()); | ||
private final Map<Integer, DynamicObject> ENCODING_LIST_BY_ENCODING_INDEX = new HashMap<Integer,DynamicObject>(EncodingDB.getEncodings().size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this become a map? That's fairly expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the List API doesn't support adding at an index > size I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aliases resolve to their aliased encoding:
irb(main):002:0> 'abc'.encode('ASCII-8BIT').encoding.name
=> "ASCII-8BIT"
irb(main):003:0> 'abc'.encode('BINARY').encoding.name
=> "ASCII-8BIT"
Since we associate only the jcodings Encoding
with the rope, we need to look up the associated DynamicObject
. Prior to this change, there was essentially a 1:1 mapping, since aliases always resolved to their aliased encoding.
Encoding#replicate
, however, works differently.
irb(main):004:0> 'abc'.encode('BINARY').encoding.name
=> "ASCII-8BIT"
irb(main):005:0> 'abc'.encode(Encoding::BINARY.replicate('blah')).encoding.name
=> "blah"
I'm pretty sure he made it a map so he could store multiple DynamicObject
s with the same encoding index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, this works, but I am not sure why:
jt ruby -e 'Encoding::BINARY.replicate("blah"); p e=Encoding.find("blah"); p "".encode(e).encoding; p "".encode(Encoding::BINARY).encoding'
#<Encoding:blah>
#<Encoding:blah>
#<Encoding:ASCII-8BIT>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It creates another Encoding instance, lazily, see Encoding.replicate
.
This is correct. This could probably be done with an
My understanding was that |
@bjfish Thanks for the clarification. |
Work In Progress.
I've got one known failure here which I think is due to the following from
encoding.rb
not recognizing the encodings added at runtime. I'm still thinking about a fix for this. I'll probably replace these with a primitive for the few places these are used.Please review @nirvdrum