Web Scraping with 9.1.5.0 much much slower than 9.0.4.0 #4150
Comments
@asdf1337 do you know which version of Java you are using? Someone else reported slower performance, and the cause turned out to be Java 7 being much slower than Java 8 (we have not fully figured out why, but we think it may involve invokedynamic, which we have used more heavily since 9.1.2.0).
I thought it might be the Java version as well, so I upgraded today, but I've been using Java 8 the whole time: I went from java "1.8.0_91" to java "1.8.0_101". Running the program with or without invokedynamic doesn't seem to make much difference either.
@asdf1337 thanks for checking. Some uses of invokedynamic are always on now, but the fact that you see this with Java 8 makes me hopeful that we just regressed in some other area.
@asdf1337 Ok, we really want to fix this, but we need a way to reproduce it so we can profile it. If that's not possible, you'll have to do the profiling and we'll talk through it. We have a wiki page on profiling here: https://github.com/jruby/jruby/wiki/Profiling-jruby. Thank you for reporting this! We definitely should not regress this badly on performance, so we'll want to fix this up for 9.1.6.0.
Awesome! I'm going to put together a minimal version of my code that hopefully still exhibits the same effects.
It may not be possible to get it down to a version I can share easily.
This isn't going to make 9.1.6.0, and I apologize for the lack of progress here. I have written up a wiki page on various methods of profiling JRuby here: https://github.com/jruby/jruby/wiki/Troubleshooting-Performance. The flags described there will give you a good start on profiling. Keep in mind that the allocation profile will be very slow, so you only need to make sure the main work of your application has been running for a while; it doesn't have to complete to be useful. We're looking for really high allocation counts in the resulting output. You can also stop by IRC (perhaps after RubyConf this week) and some of us will be there to help you.
@asdf1337 - Is it possible that this is a manifestation of jruby/jruby-openssl#111?
Still an issue? We haven't revisited this in over a year.
I don't know if there is still a performance degradation, because we implemented a workaround (described above) in our application some time ago and it remains in place today. It consists of an initializer in our Rails app that defines a constant:

    CERTIFICATE_STORE = OpenSSL::X509::Store.new.freeze
    CERTIFICATE_STORE.set_default_paths

In every place where we use HTTParty:

    HTTParty.get url, cert_store: CERTIFICATE_STORE

In every place where we use RestClient:

    RestClient::Request.execute method: :get, url: url, ssl_cert_store: CERTIFICATE_STORE

I still believe this issue is related to jruby/jruby-openssl/issues/111. I haven't tried dropping the workarounds since that issue remains open.
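For anyone who wants to try the same idea without HTTParty or RestClient, here is a minimal sketch using only the standard library's Net::HTTP. It is an illustration, not the code from our application: `build_https_client` is a hypothetical helper name, and the store is frozen after `set_default_paths` rather than before. The point is simply that the store is built once and passed explicitly to every client.

```ruby
require "net/http"
require "openssl"
require "uri"

# Build the trust store once at load time. set_default_paths points it at the
# system's default CA certificate locations.
CERTIFICATE_STORE = OpenSSL::X509::Store.new
CERTIFICATE_STORE.set_default_paths
CERTIFICATE_STORE.freeze

# Hypothetical helper (not part of any library): returns an unstarted
# Net::HTTP client for `url` that uses the shared store.
def build_https_client(url)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  http.cert_store = CERTIFICATE_STORE # same store object for every client
  http
end
```

A request would then look like `build_https_client(url).request_get(URI(url).request_uri)`. Whether this removes the slowdown in a given setup is something to measure rather than assume.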
Not sure what progress has been made on this. Mostly pending jruby-openssl work I believe.
Environment
Tried using both OpenURI and Manticore for HTTP GET requests. Same results.
Expected Behavior
Scraping around 200,000 pages with many threads.

Using 9.0.4.0 it is very fast.
Actual Behavior
When I switch to any newer version of JRuby the performance hit is huge.

The code I am running is exactly the same.
I'd love to start using a more up to date version of this amazing project,
but this huge issue is preventing me from upgrading.
I know I'm probably not sharing enough information to help much but I'm not sure what would be useful.
Please let me know what else I should provide and I will do my best.
Thank you.
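The scenario described above (many threads fetching pages, compared across JRuby versions) could be timed with a small harness along these lines. This is only a sketch under assumptions: `timed_fetch` and its `thread_count` parameter are hypothetical names, and the fetch step is passed in as a block so the same harness can wrap OpenURI, Manticore, or anything else.

```ruby
require "benchmark"

# Hypothetical helper: runs the given block over every URL using
# `thread_count` worker threads and returns [elapsed_seconds, results],
# where results is an array of [url, block_return_value] pairs.
def timed_fetch(urls, thread_count: 8, &fetch)
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # once drained, pop returns nil and the workers exit
  results = Queue.new

  elapsed = Benchmark.realtime do
    workers = Array.new(thread_count) do
      Thread.new do
        while (url = queue.pop)
          results << [url, fetch.call(url)]
        end
      end
    end
    workers.each(&:join)
  end

  [elapsed, Array.new(results.size) { results.pop }]
end
```

Running the same call, for example `timed_fetch(urls, thread_count: 32) { |url| Net::HTTP.get(URI(url)) }` after `require "net/http"`, under each JRuby version gives a wall-clock number to compare and a loop that can be left running while profiling.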