Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose two Enumerator-related bugs #3814

Merged
merged 3 commits into from
Aug 24, 2016
Merged

Conversation

nightscape
Copy link
Contributor

JRuby seems to have two bugs related to Enumerators yielding Arrays. They can be reproduced when comparing the following code snippets which correctly return true for MRI, but false on JRuby.

  1. << should be an alias for .yield

    Enumerator.new {|y| y.yield([1])}.to_a == [[1]] # MRI: true, JRuby: true
    Enumerator.new {|y| y << [1]}.to_a ==  [[1]] # MRI: true, JRuby: false ([[[1]]] instead)
  2. map on lazy Enumerators should not unwrap Arrays

    Enumerator.new {|y| y.yield([1])}.map {|e| e}.to_a == [[1]] # MRI: true, JRuby: true
    Enumerator.new {|y| y.yield([1])}.lazy.map {|e| e}.to_a == [[1]] # MRI: true, JRuby: false ([1] instead)

I added corresponding specs, but I'm not sure if they're in the right place and format.

@headius headius added this to the JRuby 9.1.1.0 milestone Apr 28, 2016
@headius
Copy link
Member

headius commented Apr 28, 2016

These are definitely still true and probably related to our block arg-processing logic.

We can merge this but until we fix it we'd need to tag them for JRuby.

@nightscape
Copy link
Contributor Author

So, this is the commit that breaks << vs yield. It doesn't contain a test so it's hard to see what the erroneous behavior without it was.
Does anyone remember?

@nightscape
Copy link
Contributor Author

Ok, so the corresponding PR for the above commit is this one and the corresponding issue
The functionality that gets fixed by the commit is the following:

Enumerator.new{|y| y << [42]}.first == [42] # Without the commit it seems to be 42

I'll add a test for this case as well and hopefully find an implementation that passes all tests.

@@ -5,4 +5,7 @@

describe "Enumerator::Lazy#map" do
it_behaves_like :enumerator_lazy_collect, :map
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be an extra blank line here.

@nightscape
Copy link
Contributor Author

I added a spec for the bug that the commit fixes.
Does anyone have an idea how to properly fix this?

@headius
Copy link
Member

headius commented May 2, 2016

@nightscape We've realized for a while we need a rework of how Enumera{ble,tor} passes values through. The current structure tends to lose the incoming argument layout and not match it properly with parameter binding.

9.1.1 might be a good time for us to look into fixing this.

I'm curious: do you have an app or library failing to work because of the current bugs in Enumera{ble,tor}?

@nightscape
Copy link
Contributor Author

Yes, we have a part of our code where we use Enumerators to efficiently stream data through a calculation pipeline to keep memory consumption low. We were running on MRI without issues but moved over to JRuby (we've ported parts over to Spark in order to speed up things) and suddenly things were failing in strange ways. We're not using lazy enumerators anymore now and convert the Enumerator into a real array in order to workaround this issue.

@headius
Copy link
Member

headius commented May 2, 2016

@nightscape Were the problems you had only in lazy Enumerator or in regular Enumerator too?

@nightscape
Copy link
Contributor Author

Both. We ran into the two problems that I've mentioned in the first post. They almost "cancel each other out" but then again not really ;)

@headius
Copy link
Member

headius commented May 2, 2016

Ok, I'll try to prioritize this for the first (hopefully quick) maintenance release of 9.1. Thank you for the test cases...they will help us sort out exactly what to fix.

@nightscape
Copy link
Contributor Author

Cool, thanks a lot! If I can help with anything where one doesn't have to be an expert in JRuby internals let me know!

@headius headius modified the milestones: JRuby 9.1.1.0, JRuby 9.1.2.0 May 11, 2016
@enebo enebo modified the milestones: JRuby 9.1.2.0, JRuby 9.1.3.0 May 25, 2016
@codeforkjeff
Copy link

To chime in, I encountered this bug in a data processing script I was trying to run on JRuby.

I have an understanding of this problem being caused by splats and arg destructuring in /core/src/main/ruby/jruby/kernel/enumerator.rb (and possibly RubyYielder.java?) but haven't come up with a fix. If anyone has any suggestions for an approach to fixing this, i could try to implement it and submit a PR. But as it stands, it's a bit dizzying to puzzle through.

@headius
Copy link
Member

headius commented Aug 23, 2016

Sadly still broken in 9.1.3.0, even with arg fixes. We need a rework of Enumerable/Enumerator to fix this, I suspect.

@headius
Copy link
Member

headius commented Aug 23, 2016

This may simply be a bit of bad behavior in the Yielder#<< implementation. We do the re-wrapping dance...MRI does not.

/* :nodoc: */
static VALUE
yielder_yield(VALUE obj, VALUE args)
{
    struct yielder *ptr = yielder_ptr(obj);

    return rb_proc_call(ptr->proc, args);
}

/* :nodoc: */
static VALUE
yielder_yield_push(VALUE obj, VALUE args)
{
    yielder_yield(obj, args);
    return obj;
}

This patch gets your examples working. I am checking tests now.

diff --git a/core/src/main/java/org/jruby/RubyYielder.java b/core/src/main/java/org/jruby/RubyYielder.java
index 6494e4d..1376db4 100644
--- a/core/src/main/java/org/jruby/RubyYielder.java
+++ b/core/src/main/java/org/jruby/RubyYielder.java
@@ -96,17 +96,13 @@ public class RubyYielder extends RubyObject {
     }

     @JRubyMethod(rest = true)
-    public IRubyObject yield(ThreadContext context, IRubyObject[]args) {
+    public IRubyObject yield(ThreadContext context, IRubyObject[] args) {
         checkInit();
         return proc.call19(context, args, Block.NULL_BLOCK);
     }

     @JRubyMethod(name = "<<", rest = true)
-    public IRubyObject op_lshift(ThreadContext context, IRubyObject[]args) {
-        if (args.length == 1 &&
-                args[0] instanceof RubyArray &&
-                ((RubyArray) args[0]).getLength() == 1)
-            args[0] = RubyArray.newArray(context.runtime, args[0]);
+    public IRubyObject op_lshift(ThreadContext context, IRubyObject[] args) {
         yield(context, args);
         return this;
     }

@headius
Copy link
Member

headius commented Aug 23, 2016

Stuffed my change into a PR at #4106. We'll see how it looks.

@headius
Copy link
Member

headius commented Aug 24, 2016

Well I realized after reading @nightscape's comment above, I realized that #4106 basically would just revert #2458, which fixed #2376. It's obvious the fix isn't quite right, but we regress a different way if we remove it.

@headius
Copy link
Member

headius commented Aug 24, 2016

Replacing our Yielder impl (ed: with a pure Ruby version) does not appear to fix the issue either. The problem seems to lie in the proc logic.

@headius
Copy link
Member

headius commented Aug 24, 2016

Ok, replacing both Yielder#yield and Enumerable#first makes it behave the way we expect, for all the original cases plus the one from #2376:

class Enumerator::Yielder
    def yield(*args)
      @proc.call(*args)
    end
end

module Enumerable
  def first
    val = nil
    catch(:foo) do
      each do |x|
        val = x
        throw :foo
      end
    end
    val
  end
end

p Enumerator.new {|y| y.yield([1])}.to_a
p Enumerator.new {|y| y << [1]}.to_a
p Enumerator.new {|y| y.yield([1])}.map {|e| e}.to_a
p Enumerator.new {|y| y.yield([1])}.lazy.map {|e| e}.to_a
p Enumerator.new {|y| y.yield [1]}.first

Output

$ jruby blah.rb
[[1]]
[[1]]
[[1]]
[[1]]
[1]

This is more demonstration that our Java-based Enumerable has several hard-to-fix flaws left in it. I think we're going to need to move to a Ruby-based impl sooner rather than later.

@headius
Copy link
Member

headius commented Aug 24, 2016

This PR kinda became a place to discuss the fix, but I'd like to get these specs in. I'm going to merge, tag, and open a separate bug for actually fixing the issue.

@headius headius merged commit 3ce1432 into jruby:master Aug 24, 2016
headius added a commit that referenced this pull request Aug 24, 2016
eregon pushed a commit to ruby/spec that referenced this pull request Sep 27, 2016
kares added a commit that referenced this pull request Aug 2, 2019
also from comments GH-3814 which are now passing

`Enumerator.new { |y| y.yield([1]) }.lazy.map { |e| e }.to_a`
seems to be left not working as in MRI
Adithya-copart pushed a commit to Adithya-copart/jruby that referenced this pull request Aug 25, 2019
also from comments jrubyGH-3814 which are now passing

`Enumerator.new { |y| y.yield([1]) }.lazy.map { |e| e }.to_a`
seems to be left not working as in MRI
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants