Concurrency Issue with Lambdas and String Split #4868

original-brownbear · 2017-11-27T14:36:58Z

We had this issue elastic/logstash#8727 reported and it seems doing string.split inside a lambda that is used across multiple threads runs into some concurrency issue in 9k.

Testing this code:

str = '{"fld_1": "a`b ", "fld_2": "-", "fld_3": 7}'

running = true

splittt = lambda do |strr|
  begin
    res = strr.split("")
    res
  rescue => e
    puts("error")
    puts(e)
  end
end

one = Thread.new do
  puts "starting 1"
  while running do
    break if splittt.call(str).nil?
  end
end
two = Thread.new do
  puts "starting 2"
  while running do
    break if splittt.call(str).nil?
  end
end

while one.alive? && two.alive?
  sleep(1)
  puts "ping"
end

9k dies:

➜  logstash git:(repro-race-jruby-regex) ✗ ruby regexrace.rb 
starting 1
starting 2
error
org.jruby.RubyNil cannot be cast to org.jruby.RubyMatchData
ping

while running fine with 1.7.26.

Also this is running just fine if just running a single thread => must be some concurrency issue.

The text was updated successfully, but these errors were encountered:

kares · 2017-11-27T16:56:37Z

seems like the back-ref isn't stored/getting back properly, trace :

error: java.lang.ClassCastException: org.jruby.RubyNil cannot be cast to org.jruby.RubyMatchData
  org.jruby.RubyString.regexSplit19(RubyString.java:3592)
  org.jruby.RubyString.splitCommon19(RubyString.java:3549)
  org.jruby.RubyString.split19(RubyString.java:3516)
  org.jruby.RubyString$INVOKER$i$split19.call(RubyString$INVOKER$i$split19.gen)
  org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:153)
  split_threads.invokeOther1:split(split_threads.rb:7)

original-brownbear · 2017-11-27T17:12:28Z

@kares update :)

Note the last line I just added (sorry for not adding that here). Single threaded, this works fine.
Also works fine if using a method defined via simply def instead of the lambda.

headius · 2017-11-27T18:22:27Z

Backref is not thread-safe, unfortunately. The problem here is due to the threads all sharing the same backref ref (in this case, the top-level frame). This is documented in other issues, and there's no easy solution.

The simplest workaround is to start each thread in its own method body.

The issue with the ClassCastException can probably be improved; we're obviously expecting the backref to keep pointing at a MatchData from the beginning of the search to the end, and don't re-check it.

headius · 2017-11-27T18:30:43Z

This modification makes the script work ok:

def one
  Thread.new do
    puts "starting 1"
    while running do
      break if splittt.call(str).nil?
    end
  end
end
one = self.one

def two
  Thread.new do
    puts "starting 2"
    while running do
      break if splittt.call(str).nil?
    end
  end
end
two = self.two

original-brownbear · 2017-11-27T18:34:54Z

@headius yea that's what I figured as well, as a matter of fact it even works if I simply make it:

def splittt(strr)
  begin
    res = strr.split("")
    res
  rescue => e
    puts("error")
    puts(e)
  end
end

and call that instead of the lambda. The issue from our perspective is rather, that this change from 1.7.x to 9k breaks a plugin of ours. But if there's no quick fix here, we'll fix that plugin accordingly.

Thanks for looking into this! Should I close here, if this is tracked elsewhere already anyhow?

headius · 2017-11-27T22:47:31Z

I think the regression here actually is more of a compatibility improvement :-( It's likely we made improvements in 9k to frame sharing that make it more compatible but less thread-safe.

I have proven that MRI also has this issue, and they have no solution for it either. You want it isolated sometimes, but not always; often the intent is exactly to share the backref with a thread you've just spawned.

https://bugs.ruby-lang.org/issues/12689

My perspective is that each thread should get its own backref frame, cloned from the parent (holding that parent's value but never writing back to it). Unfortunately this isn't a simple fix either; if you have backref sharing in an arbitrary proc, we don't currently specialize anything from that proc for each thread it runs on. That issue is detailed here: #3031

headius · 2017-11-27T22:53:50Z

@original-brownbear Yes, that is a simpler change; all that really matters is that the nearest non-block frame doesn't get shared, which you've done here by putting the split call in a separate method.

The bottom line here is that, as we've always known, globals are bad. Even when ostensibly isolated by method frames, we can still run into sharing cases like this.

I'm going to close this, since it doesn't document any new buggy behavior and it's now linked to #3031. I don't know what the solution should be.

This changes all places we read the frame-local backref to use the thread-local match instead. None of these places used the backref for its content; they only used it to reuse the object, which is now handled more correctly by the thread-local match and the setBackRef method that marks matches as in-use. This should eliminate all internal data races against the $~ variable, since no core methods depend upon its value. This does not make $~ thread-safe but it pushes the issue to the edge, where the user will have to decide to use it knowing it may be updated across threads. See jruby#4868 for the original issue that this partially fixes, and see jruby#6640 for an issue caused by these internal data races.

headius · 2021-04-01T20:45:06Z

Updating this because I did not notice when #6640 was filed that it is actually the same issue (split concurrency problem leading to ClassCastException). This is now FIXED in 9.2.17.0 by #6644.

The general problem is also FIXED for 9.3 by #6647 which avoids ever reading $~ within our core methods that are aware of it. Those methods now only SET the $~ value, avoiding internal races. Users explicitly opting into access of $~, either by accessing the variable or by using methods like last_match that access it, will still need to take care that other threads are not updating it at the same time it is being read.

original-brownbear changed the title ~~Concurrency Issue with Lambdas~~ Concurrency Issue with Lambdas and String Split Nov 27, 2017

original-brownbear mentioned this issue Nov 27, 2017

Logstash 6.0 breaks ruby code if running more than 1 worker elastic/logstash#8727

Closed

headius closed this as completed Nov 27, 2017

headius added this to the Invalid or Duplicate milestone Nov 27, 2017

original-brownbear mentioned this issue Nov 28, 2017

BUG: Fix concurrency issues arising from sharing lambda across worker threads logstash-plugins/logstash-filter-ruby#37

Closed

This was referenced Apr 29, 2019

Indexing concurrency problem psu-libraries/psulib_traject#132

Closed

Using regex inside a proc isn't thread safe traject/traject#225

Open

This was referenced Mar 31, 2021

java.lang.ClassCastException: org.jruby.RubyNil cannot be cast to org.jruby.RubyMatchData #6640

Closed

Improve concurrency of $~ adjacent methods #6647

Merged

headius modified the milestones: Invalid or Duplicate, JRuby 9.2.17.0 Apr 1, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

Concurrency Issue with Lambdas and String Split #4868

Concurrency Issue with Lambdas and String Split #4868

original-brownbear commented Nov 27, 2017 •

edited

Loading

kares commented Nov 27, 2017

original-brownbear commented Nov 27, 2017

headius commented Nov 27, 2017

headius commented Nov 27, 2017

original-brownbear commented Nov 27, 2017

headius commented Nov 27, 2017 •

edited

Loading

headius commented Nov 27, 2017

headius commented Apr 1, 2021

Concurrency Issue with Lambdas and String Split #4868

Concurrency Issue with Lambdas and String Split #4868

Comments

original-brownbear commented Nov 27, 2017 • edited Loading

kares commented Nov 27, 2017

original-brownbear commented Nov 27, 2017

headius commented Nov 27, 2017

headius commented Nov 27, 2017

original-brownbear commented Nov 27, 2017

headius commented Nov 27, 2017 • edited Loading

headius commented Nov 27, 2017

headius commented Apr 1, 2021

original-brownbear commented Nov 27, 2017 •

edited

Loading

headius commented Nov 27, 2017 •

edited

Loading