-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Truffle] Lazily initialised, shared string literals. #3687
Conversation
@@ -32,7 +32,7 @@ public Object execute(VirtualFrame frame) { | |||
|
|||
@Override | |||
public Object isDefined(VirtualFrame frame) { | |||
return create7BitString("nil", UTF8Encoding.INSTANCE); | |||
return stringLiterals().NIL.getString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the motivation for this work - we want to be able to refer to a compilation final immutable rope which we can package up in a tiny, escape analysable string object very quickly.
This looks good. My only real gripe is the naming. We have a |
|
||
return context.getRopeTable().getRope( | ||
literal.getBytes(StandardCharsets.US_ASCII), | ||
USASCIIEncoding.INSTANCE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want this to UTF8Encoding.INSTANCE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've constrained to 7-bit ASCII so we can use the advantageous 7-bit rope (see the assert in the constructor). We aren't likely to need unicode string literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be for correctness, however. In the defined?
case, for instance, MRI would return a UTF-8 string. We can have a 7-bit UTF-8 rope (that's the dominant case in normal operation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take that back. In the defined?
case, MRI actually returns ASCII-8BIT
, which would be ASCIIEncoding.INSTANCE
(NB: that's different yet from USASCIIEncoding.INSTANCE
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I'll check for more cases as I go. Others include exception message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it looks like defined?
strings are all Encoding::ASCII_8BIT
, whatever the magic comment or internal/external encodings. I'll switch to that for this use case and I'll check as I start to use it for other use cases.
I renamed it |
Can you take a look again please @nirvdrum |
Other than my cautionary comment, this looks good to me. |
# Conflicts: # truffle/src/main/java/org/jruby/truffle/language/RubyNode.java
[Truffle] Lazily initialised, shared string literals.
Is this worth the extra complexity to initialize lazily? |
Also, maybe the leaf ropes should be weakly cached like Symbols so this explicit global state is not necessary? |
We want to be able to create a Ruby string with a literal value as easily as possible in our Java code, but we only want to create one rope (our equivalent of the
char[]
) per VM, as these are immutable and sharing them improves caching elsewhere.@nirvdrum