Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy enumerable select eagerly evaluates next item #4212

Closed
headius opened this issue Oct 7, 2016 · 23 comments
Closed

Lazy enumerable select eagerly evaluates next item #4212

headius opened this issue Oct 7, 2016 · 23 comments

Comments

@headius
Copy link
Member

headius commented Oct 7, 2016

Discovered by @twalpole in SeleniumHQ/selenium#2880. It appears our lazy enumerable logic is trying to evaluate the next item eagerly, or something...

class A
  include Enumerable

  def initialize()
    @result_enum = ["Text", "Text", "Text"].lazy.select do |t|
      puts "evaluating lazy select"
      true
    end
  end

  def each(&block)
    loop do
      block.call(@result_enum.next)
    end
  end
end

A.new().first

Running this on MRI prints "evaluating lazy select" once, while JRuby prints it twice.

@headius headius added this to the JRuby 9.1.6.0 milestone Oct 7, 2016
@headius
Copy link
Member Author

headius commented Oct 7, 2016

Printing out caller(0) shows that this logic is called twice with pretty much the same stack.

blah.rb:6:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:81:in `block in select'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:24:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:23:in `each'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:23:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:22:in `catch'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:22:in `block in initialize'
blah.rb:6:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:81:in `block in select'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:24:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:23:in `each'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:23:in `block in initialize'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:22:in `catch'
/Users/headius/projects/jruby/core/src/main/ruby/jruby/kernel/enumerator.rb:22:in `block in initialize'

@headius
Copy link
Member Author

headius commented Oct 7, 2016

Reduced case; it's back to being a problem with Enumerator#next eagerly evaluating the next element:

enum = ["Text1", "Text2", "Text3"].lazy.select do |t|
  puts caller(0)
  true
end

enum.next
sleep

@headius
Copy link
Member Author

headius commented Oct 7, 2016

This may be a good time to get rid of the native and separate fiber-like logic from Enumerator and just use Fiber.

@headius
Copy link
Member Author

headius commented Oct 8, 2016

Ok, so here's the flaw in the design of our threaded Enumerator#next. In order to support Enumerator#peek, we eagerly try to get the next element. This causes the nexting thread to continue running after yielding a value, so it can acquire the next value, and as a result you have two threads operating against the same state at the same time.

Looking over MRI's implementation of peek, I can see they do this lazily. If you simply next your way through a collection, each yield stops the nexting thread (or Fiber in MRI's case). Only if you call Enumerator#peek does MRI attempt to grab the next value, which it then saves for any subsequent call to Enumerator#next.

This shouldn't be a difficult change to make. It might be less risk in a minor release than switching wholesale over to a Fiber-based next.

twalpole added a commit to teamcapybara/capybara that referenced this issue Oct 8, 2016
JRuby has an issue with lazy enumerator evaluation - disable Capybara::Result optimization when using JRuby
  See: jruby/jruby#4212
@headius
Copy link
Member Author

headius commented Oct 8, 2016

Here's an implementation of Enumerator's next, peek, and rewind using Fibers and lazily acquiring the peek value. It works as expected for all above examples, but doesn't quite pass specs yet. It's probably close.

diff --git a/core/src/main/ruby/jruby/kernel/enumerator.rb b/core/src/main/ruby/jruby/kernel/enumerator.rb
index 4036949..d5cf978 100644
--- a/core/src/main/ruby/jruby/kernel/enumerator.rb
+++ b/core/src/main/ruby/jruby/kernel/enumerator.rb
@@ -1,4 +1,33 @@
 class Enumerator
+  UNDEF = Object.new
+
+  def next
+    if defined?(@__peek_value) && @__peek_value != UNDEF
+      ret, @__peek_value = @__peek_value, UNDEF
+      return ret
+    end
+
+    fiber = @fiber ||= Fiber.new do
+      each {|x| Fiber.yield x}
+    end
+
+    fiber.resume
+  end
+
+  def peek
+    if defined?(@__peek_value) && @__peek_value != UNDEF
+      return @__peek_value
+    end
+
+    @__peek_value = self.next
+  end
+
+  def rewind
+    @__peek_value = UNDEF
+    @fiber = nil
+    self
+  end
+
   class Yielder
     # Current API for Lazy Enumerator does not provide an easy way
     # to handle internal state. We "cheat" and use yielder to hold it for us.

Known issues:

  • I believe this version ends up rooting the Enumerator and Fiber so they won't GC properly if abandoned. We need more isolation between the Fiber and the Enumerator.
  • next_values, peek_values are not implemented, and various flow-control aspects of next and peek are missing (e.g. throwing StopIteration).
  • Our shortcut logic to avoid threaded nexting on known collection types (like Array and Hash) is not wired in here. That would need to be reimplemented in Ruby alongside this logic.

Fiber has become reasonably stable in JRuby at this point, so having two separate Fiber-like implementations isn't really necessary, if we can address the above issues.

@kares
Copy link
Member

kares commented Oct 10, 2016

najs, but we'll somehow loose Enumerator's recent java.util.Iterator-like behavior :(

@enebo
Copy link
Member

enebo commented Oct 10, 2016

@kares would it be possible to include java.util.Iterator and provide methods or will that no longer work? I guess I should ask whether peek is different semantically than hasNext and same for next (between ruby and java)...

Oh sorry for re-editing over and over but we can still dyncall to next and peek from Java versions of next and hasNext internally if we need the java signatures statically reachable.

@kares
Copy link
Member

kares commented Oct 10, 2016

@enebo hopefully that would work but some state needs to be managed for Iterator (hasNext still needs to act as look-ahead) ... also thought there might be a collision with the next method (which I was wrong since I assumed it might receive *args)

@enebo
Copy link
Member

enebo commented Oct 10, 2016

@kares peek also does look-ahead but I guess the problem is we cannot tell if peek is returning nil from next or it really has no value?

@headius what happens if lazy is stored and called by two separate threads? Isn't there a race for the fiber being created?

@headius
Copy link
Member Author

headius commented Oct 11, 2016

@kares if RubyEnumerator still implements Iterator, then it should be fine. We'd only be moving a few methods from Java to Ruby and ditching the custom fiber logic in RubyEnumerator/Nexter.

Peek and hasNext would both trigger the next item to load. That's just how it has to work to satisfy both specs.

@enebo We spoke over IM but to close the loop: yes, this is a naive proof-of-concept and I have not filled out all features nor made any attempt to keep it thread-safe. I did this to show that a correct lazy impl does indeed fix the reported issue, and because reworking the Java-based Enumerator#next would take more work to be made lazy (but probably will be less risk).

@headius
Copy link
Member Author

headius commented Oct 11, 2016

I looked over the Truffle impl of this (based on the Rubinius 2.x impl) and it also appears to have numerous race conditions.

@kares
Copy link
Member

kares commented Oct 11, 2016

did not realize some of the RubyEnumerator parts are to be kept. sorry for the confusion on my part than.

jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Oct 18, 2016
#2.10.1
Release date: 2016-10-08

### Fixed
* App errors are now correctly raised with the explanatory cause in JRuby [Thomas Walpole]
* Capybara::Result optimization disabled in JRuby due to issue with lazy enumerator evaluation [Thomas Walpole]
  See: jruby/jruby#4212

#2.10.0
Release date: 2016-10-05

### Added
* Select `<button>` elements with descendant images with `alt` attributes matching the locator [Ian Lesperance]
* Locator string is optional in selector based matchers [Thomas Walpole]
* Selectors can specify their default visible setting [Thomas Walpole]
* Selector based finders and matchers can be passed a block to filter the results within the retry behavior [Thomas Walpole]

#Version 2.9.2
Release date: 2016-09-29

### Fixed
* :label built-in selector finds nested label/control by control id if the label has no 'for' attribute [Thomas Walpole]
* Warning issued if an unknown selector type is specified

#Version 2.9.1
Release date: 2016-09-23

### Fixed
* allow_label_click option did not work in some cases with Poltergeist - Issue #1762 [Thomas Walpole]
* matches_selector? should have access to all of a selectors options except the count options [Thomas Walpole]

#Version 2.9.0
Release date: 2016-09-19

### Fixed
* Issue with rack-test driver and obselete mime-types when using `#attach_file` - Issue #1756 [Thomas Walpole]

### Added
* `:class` option to many of the built-in selectors [Thomas Walpole]
* Removed need to specify value when creating `:boolean` filter type in custom selectors [Thomas Walpole]
* Filters can now be implemented through the XPath/CSS expressions in custom selectors [Thomas Walpole]
* `Element#matches_xpath?` and `Element#matches_css?` [Thomas Walpole]

#Version 2.8.1
Release data: 2016-08-25

###Fixed
* Fixed error message from have_text when text is not found but contains regex special characters [Ryunosuke Sato]
* Warn when :exact option is passed that has no effect [Thomas Walpole]

# Version 2.8.0
Release date: 2016-08-16

### Fixed
* Issue with modals present when closing the page using selenium - Issue #1696 [Jonas Nicklas, Thomas Walpole]
* Server errors raised in test code have the cause set to an explanatory exception
  in rubies that support Exception#cause rather than a confusing ExpectationNotMet - Issue #1719 [Thomas Walpole]
* background/given/given! RSoec aliases will work if RSpec config.shared_context_metadata_behavior == :apply_to_host_groups [Thomas Walpole]
* Fixed setting of unexpectedAlertError now that Selenium will be freezing the Capabilities::DEFAULTS [Thomas Walpole]

### Added
* 'check', 'uncheck', and 'choose' can now optionally click the associated label if the checkbox/radio button is not visible [Thomas Walpole]
* Raise error if Capybara.app_host/default_host are specified incorrectly [Thomas Walpole]
* Capybara::Selector::FilterSet allows for sharing filter definitions between selectors [Thomas Walpole]
* Remove need to pass nil locator in most node actions when locator is not needed [Thomas Walpole]
* New frames API for drivers - Issue #1365 [Thomas Walpole]
* Deprecated Element#parent in favor of Element#query_scope to better indicate what it is [Thomas Walpole]
* Improved error messages for have_text matcher [Alex Chaffee, Thomas Walpole]
* The `:with` option for the field selector now accepts a regular expression for matching the field value [Uwe Kubosch]
* Support matching on aria-label attribute when finding fields/links/buttons - Issue #1528 [Thomas Walpole]
* Optimize Capybara::Result to only apply fields as necessary in common use-case of `.all[idx]` [Thomas Walpole]

#Version 2.7.1
Release date: 2016-05-01

### Fixed
* Issue where within_Frame would fail with Selenium if the frame is removed from within itself [Thomas Walpole]
* Reset sessions in reverse order so sessions with active servers are reset last - Issue #1692 [Jonas Nicklas, Thomas Walpole]

# Version 2.7.0
Release date: 2016-04-07

### Fixed
* Element#visible?/checked?/disabled?/selected? Now return boolean
  as expected when using the rack_test driver [Thomas Walpole]
* The rack_test driver now considers \<input type="hidden"> elements as non-visible [Thomas Walpole]
* A nil locator passed to the built-in html type selectors now behaves consistently, and finds elements of the expected types [Thomas Walpole]
* Capybara::Server now searches for available ports on the same interface it binds to [Aaron Stone]
* Selenium Driver handles system modals that appear when page is unloading [Thomas Walpole]
* Warning output if unused parameters are passed to a selector query [Thomas Walpole]

### Added
* Capybara now waits for requests to Capybaras server to complete while restting the session [John Hawthorn, Thomas Walpole]
* Capybara.reuse_server option to allow disabling of sharing server instance between sessions [Thomas Walpole]
* :multiple filter added to relevant selectors [Thomas Walpole]
* Provided server registrations for :webrick and :puma. Capybara.server = :puma for testing with Rails 5 [Thomas Walpole]
* Deprecate passing a block to Capybara::server user Capybara::register_server instead [Thomas Walpole]
* :option selector supports :selected and :disabled filters [Thomas Walpole]
* Element#matches_selector? and associated matchers (match_selector, match_css, etc) for comparing an element to a selector [Thomas Walpole]
* Deprecated Driver#browser_initialized? - Driver#reset! is required to be synchronous [Jonas Nicklas, Thomas Walpole]
* Deprecated Capybara.save_and_open_page_path in favor of Capybara.save_path with slightly different behavior when using relative paths with
  save_page/save_screenshot [Thomas Walpole]
* :label selector [Thomas Walpole]
@enebo enebo modified the milestones: JRuby 9.2.0.0, JRuby 9.1.6.0 Nov 7, 2016
@headius headius mentioned this issue Mar 27, 2018
15 tasks
@headius headius modified the milestones: JRuby 9.2.0.0, JRuby 9.2.1.0 May 15, 2018
@headius
Copy link
Member Author

headius commented Oct 11, 2018

See also #4903.

@headius
Copy link
Member Author

headius commented Oct 11, 2018

Marked #5007 as dupe.

@twalpole
Copy link
Contributor

twalpole commented Apr 11, 2019

Any update on if or when this may get some attention? Would be nice to be able to remove JRuby specific limitations from the Capybara code base.

@headius
Copy link
Member Author

headius commented Apr 11, 2019

@twalpole Could you test out #5672 and see if it works better? I still have some work to do there, but that's the direction. Once we can be assured that it doesn't leak fibers, it can get merged.

@headius
Copy link
Member Author

headius commented Apr 11, 2019

@twalpole Actually I took your comment as a call to action to finish the PR. I should have something landed on master shortly.

@headius
Copy link
Member Author

headius commented Apr 11, 2019

FWIW the new logic works properly.

@twalpole
Copy link
Contributor

@headius Ok - I built the distribution files from your branch and then installed it into my rbenv - and now I get a bunch of errors like -

An error occurred while loading ./spec/session_spec.rb.

Failure/Error: # frozen_string_literal: true

LoadError:
  load error: childprocess/jruby -- java.lang.reflect.InaccessibleObjectException: Unable to make field private final java.io.FileDescriptor sun.nio.ch.FileChannelImpl.fd accessible: module java.base does not "opens sun.nio.ch" to module org.jruby.dist
# ./spec/spec_helper.rb:1:in `block in (root)'
# ./spec/spec_helper.rb:6:in `<main>'
# ./spec/session_spec.rb:1:in `(root)'
# ./spec/session_spec.rb:3:in `<main>

When attempting to run the Capybara specs -- so not sure if it's my building/installing (probably) or your branch that has issues - I'll wait for it to go into master and then try again

@headius
Copy link
Member Author

headius commented Apr 11, 2019

That looks like a Java 9 issue...and also an issue that was fixed in the release but probably not on this branch. Hold off and I'll merge to master shortly.

@headius
Copy link
Member Author

headius commented Apr 11, 2019

I've merged master back into #5672 and just waiting to make sure CI looks good.

@headius
Copy link
Member Author

headius commented Apr 12, 2019

Fixed by #5672.

@twalpole Could you come up with some simple specs that test this behavior? You can submit as a PR to JRuby or https://github.com/ruby/spec under spec/ruby/core/enumerator.

@headius headius closed this as completed Apr 12, 2019
@twalpole
Copy link
Contributor

Ok - so using the master branch with Java 8 (can't use 9+ because the current release of cucumber doesn't support child_process 1.0.x which is required for Java 9+...) this problem is fixed - but a bunch of new stuff in the Capybara test suite breaks. I'm going to have to dig deeper into it tomorrow, and will try and get some simple tests for the lazy enumerator behavior submitted tomorrow too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants