-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrency Issue with Lambdas and String Split #4868
Comments
seems like the back-ref isn't stored/getting back properly, trace :
|
@kares update :) Note the last line I just added (sorry for not adding that here). Single threaded, this works fine. |
Backref is not thread-safe, unfortunately. The problem here is due to the threads all sharing the same backref ref (in this case, the top-level frame). This is documented in other issues, and there's no easy solution. The simplest workaround is to start each thread in its own method body. The issue with the ClassCastException can probably be improved; we're obviously expecting the backref to keep pointing at a MatchData from the beginning of the search to the end, and don't re-check it. |
This modification makes the script work ok:
|
@headius yea that's what I figured as well, as a matter of fact it even works if I simply make it: def splittt(strr)
begin
res = strr.split("")
res
rescue => e
puts("error")
puts(e)
end
end and call that instead of the lambda. The issue from our perspective is rather, that this change from Thanks for looking into this! Should I close here, if this is tracked elsewhere already anyhow? |
I think the regression here actually is more of a compatibility improvement :-( It's likely we made improvements in 9k to frame sharing that make it more compatible but less thread-safe. I have proven that MRI also has this issue, and they have no solution for it either. You want it isolated sometimes, but not always; often the intent is exactly to share the backref with a thread you've just spawned. https://bugs.ruby-lang.org/issues/12689 My perspective is that each thread should get its own backref frame, cloned from the parent (holding that parent's value but never writing back to it). Unfortunately this isn't a simple fix either; if you have backref sharing in an arbitrary proc, we don't currently specialize anything from that proc for each thread it runs on. That issue is detailed here: #3031 |
@original-brownbear Yes, that is a simpler change; all that really matters is that the nearest non-block frame doesn't get shared, which you've done here by putting the split call in a separate method. The bottom line here is that, as we've always known, globals are bad. Even when ostensibly isolated by method frames, we can still run into sharing cases like this. I'm going to close this, since it doesn't document any new buggy behavior and it's now linked to #3031. I don't know what the solution should be. |
This changes all places we read the frame-local backref to use the thread-local match instead. None of these places used the backref for its content; they only used it to reuse the object, which is now handled more correctly by the thread-local match and the setBackRef method that marks matches as in-use. This should eliminate all internal data races against the $~ variable, since no core methods depend upon its value. This does not make $~ thread-safe but it pushes the issue to the edge, where the user will have to decide to use it knowing it may be updated across threads. See jruby#4868 for the original issue that this partially fixes, and see jruby#6640 for an issue caused by these internal data races.
This changes all places we read the frame-local backref to use the thread-local match instead. None of these places used the backref for its content; they only used it to reuse the object, which is now handled more correctly by the thread-local match and the setBackRef method that marks matches as in-use. This should eliminate all internal data races against the $~ variable, since no core methods depend upon its value. This does not make $~ thread-safe but it pushes the issue to the edge, where the user will have to decide to use it knowing it may be updated across threads. See jruby#4868 for the original issue that this partially fixes, and see jruby#6640 for an issue caused by these internal data races.
Updating this because I did not notice when #6640 was filed that it is actually the same issue (split concurrency problem leading to ClassCastException). This is now FIXED in 9.2.17.0 by #6644. The general problem is also FIXED for 9.3 by #6647 which avoids ever reading |
We had this issue elastic/logstash#8727 reported and it seems doing
string.split
inside a lambda that is used across multiple threads runs into some concurrency issue in9k
.Testing this code:
9k
dies:while running fine with
1.7.26
.Also this is running just fine if just running a single thread => must be some concurrency issue.
The text was updated successfully, but these errors were encountered: