-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String from literal sometimes not reused with frozen_string_literal enabled #4970
Comments
Why are you testing that two literals are exactly the same? |
There's likely a race here in As far as I know, there's nothing specified in any MRI documentation that says two different sites with the same content should return the same frozen string object. We can probably fix this, but I think it's a bad side effect to test for. |
You're right, for this test we can just use My main worry was the randomness of the behavior, and if it was a symptom of a bigger issue. |
I've located a likely culprit in the code and I'm working on a fix. |
The logic here goes like this: * Attempt to lookup the dedup'ed string. * If found, return it. * Try to insert new dedup'ed string. * If nobody beat us to it, return it. * If someone beat us to it and the reference is still good, return it. * If someone beat us to it and the reference is vacated, atomically replace it or retrieve any updates that beat us. * Keep trying until we succeed in updating or get back a populated reference. Previously, there could be a race if two threads try to dedup the same string at the same time. Only one would get in the cache, but the n+1 threads might not return the already-cached value.
I believe I've got this fixed, but please review and test locally. I could not reproduce the issue with your script, but I believe you :-) |
Thanks for the quick turnaround! I'll test and report back asap. |
Hey @headius, I've been able to re-test and I'm problem is still there. (I can't reopen the ticket myself, it seems). I've debugged it a bit, it's definitely something around # frozen_string_literal: true
puts RUBY_DESCRIPTION
DUMMY_VALUE = 'hello bug'
puts "object_id for DUMMY_VALUE: #{DUMMY_VALUE.object_id}"
java.lang.System.gc unless ENV['NOGC']
OTHER_VALUE = 'hello bug'
puts "object_id for OTHER_VALUE: #{OTHER_VALUE.object_id}" I can reproduce every time: it's the GC that triggers it. That's probably why you couldn't reproduce --- you probably have more RAM than I do :)
Looking into it, now that it's consistently reproduceable, I found that these strings are being created as part of the at org.jruby.ir.runtime.IRRuntimeHelpers.newFrozenString(IRRuntimeHelpers.java:1912)
at org.jruby.ir.operands.FrozenString.createCacheObject(FrozenString.java:109)
at org.jruby.ir.operands.ImmutableLiteral.cachedObject(ImmutableLiteral.java:63)
at org.jruby.ir.operands.ImmutableLiteral.retrieve(ImmutableLiteral.java:84)
at org.jruby.ir.instructions.PutConstInstr.interpret(PutConstInstr.java:26)
// Never use value as key
deduped = string.strDup(this);
deduped.setFrozen(true);
final WeakReference<RubyString> weakref = new WeakReference<>(deduped);
// try to insert new
dedupedRef = dedupMap.computeIfAbsent(string, new Function<RubyString, WeakReference<RubyString>>() { ...and then return the deduped frozen copy in the next step. So the problem is that the I believe the fix is simple: the new So my question is --- why? In this case it seems like exactly what we want: the deduped string is weakly referenced as a key AND weakly referenced as a value, because the behavior we want out of this map (I think...) is that it should behave as a |
Actually you're right...the comment is misleading. What I'm trying to avoid here is ever using the incoming string as the dedup key, since we don't know how else that string is being used. You are correct; the actual deduped string should be both weak key and weak value (WeakHashSet would be nice) and the hard reference keeping it alive is the responsibility of Also discovered: I will have to break the atomicity of the patch on JRuby 9.1 since it supports Java 7 and the |
* Use deduped as both key and value. Hard reference to key is the responsibility of callers of freezeAndDedupString. * Remove use of Java 8+ atomic Map.compute API calls. This means 9.1 will have a race, but that should only affect concurrent loads that attempt to deduplicate the same currently-not- deduplicated strings.
I have pushed a modified fix to jruby-9.1 that uses the deduped value as the key and passes all your examples. We'll want those to get into ruby/spec. I'm also doing a fully Java 8 lambda-fied version of the atomic logic for master. |
Thanks for the awesome fix! Indeed the |
This seems to be happening again for the "freeze" optimization. This recently modified spec now fails to assert two identical frozen strings are the same object. I've filed #5070. No theories yet. |
Environment
jruby 9.1.15.0 (2.3.3) 2017-12-07 929fde8 Java HotSpot(TM) 64-Bit Server VM 25.151-b12 on 1.8.0_151-b12 [linux-x86_64]
Linux u186024434db159d25c92 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 16.04.3 LTS
Expected Behavior
With
frozen_string_literal
enabled, always MRI reuses repeated copies of the same literal.Testcase:
Output on MRI:
Actual Behavior
JRuby's behavior for this is pretty bizarre. With
THREADS
set to a small number, it works the same as MRI:But eventually as you raise the number of threads, it stops reusing the literal:
The example above is from a test suite that was randomly failing on JRuby sometimes, as it randomly ordered threads and thus it would fail when the tests that played with threads ran before a test that tested that two literals were exactly the same.
The text was updated successfully, but these errors were encountered: