Web Scraping with 9.1.5.0 much much slower than 9.0.4.0 #4150

Open
asdf1337 opened this issue Sep 12, 2016 · 11 comments

@asdf1337

Environment

  • JRuby-9.0.4.0 & JRuby-9.1.5.0
  • Ubuntu 16

Tried using both OpenURI as well as Manticore for HTTP get requests. Same results.

Expected Behavior

Scraping around 200,000 pages with many threads.
Using 9.0.4.0 it is very fast.
Image of 9.0.4.0

Actual Behavior

When I switch to any newer version of JRuby the performance hit is huge.
The code I am running is exactly the same.
Image of 9.0.4.0


I'd love to start using a more up to date version of this amazing project,
but this huge issue is preventing me from upgrading.

I know I'm probably not sharing enough information to help much but I'm not sure what would be useful.
Please let me know what else I should provide and I will do my best.

Thank you.

@enebo
Member

enebo commented Sep 12, 2016

@asdf1337 do you know which version of Java you are using? Someone else reported slower performance, and it turned out that Java 7 was much slower than Java 8 (we have not fully figured out why, but we think it may involve invokedynamic, which we have used more heavily since 9.1.2.0).

@asdf1337
Author

I thought it might be the Java version as well, so today I upgraded, but I've been using Java 8 the whole time. I went from java "1.8.0_91" to java "1.8.0_101".

Running the program with or without invokedynamic doesn't seem to do much either.
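
When comparing runs, it may help to record exactly which runtime produced each number. A minimal sketch, assuming JRuby's `ENV_JAVA` hash of Java system properties is available; the `runtime_report` helper name is hypothetical and not part of the original report:

```ruby
# Hypothetical helper: collects the runtime details worth attaching
# to a performance report.
def runtime_report
  info = { ruby_engine: RUBY_ENGINE, ruby_version: RUBY_VERSION }
  if RUBY_ENGINE == 'jruby'
    # JRUBY_VERSION and ENV_JAVA are only defined when running on JRuby.
    info[:jruby_version] = JRUBY_VERSION
    info[:java_version] = ENV_JAVA['java.version']
  end
  info
end

puts runtime_report.inspect
```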

@enebo
Member

enebo commented Sep 13, 2016

@asdf1337 thanks for checking. Some uses of invokedynamic are always on now, but the fact that you see this with Java 8 makes me hopeful that we just regressed in some other area.

@headius
Member

headius commented Sep 13, 2016

@asdf1337 Ok, we really want to fix this, but we need a way to reproduce so we can profile it.

If that's not possible, you'll have to do the profiling and we'll talk through it.

We have a wiki page on profiling here: https://github.com/jruby/jruby/wiki/Profiling-jruby.

Thank you for reporting this! We definitely should not regress this badly on performance, so we'll want to fix this up for 9.1.6.0.

@asdf1337
Author

asdf1337 commented Sep 13, 2016

Awesome! I'm going to put together a minimal version of my code that hopefully still exhibits the same effects.
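
A minimal reproduction of this kind might look like the sketch below. It is not the reporter's code: `timed_fetch` is a hypothetical helper, and the fetch step is passed in as a block so the same harness can wrap OpenURI, Manticore, or anything else. It only measures how many pages a fixed number of threads complete per second.

```ruby
require 'benchmark'

# Hypothetical helper: runs the given fetch block over every URL using
# thread_count worker threads and reports throughput.
def timed_fetch(urls, thread_count:, &fetch)
  pending = Queue.new
  urls.each { |url| pending << url }
  done = Queue.new # thread-safe collector for finished URLs

  seconds = Benchmark.realtime do
    workers = Array.new(thread_count) do
      Thread.new do
        loop do
          url = begin
            pending.pop(true) # non-blocking; raises ThreadError when empty
          rescue ThreadError
            break
          end
          fetch.call(url)
          done << url
        end
      end
    end
    workers.each(&:join)
  end

  { pages: done.size, seconds: seconds, pages_per_second: done.size / seconds }
end
```

For example, after `require 'net/http'`, calling `timed_fetch(urls, thread_count: 50) { |url| Net::HTTP.get(URI(url)) }` once under each JRuby version would give two throughput figures to compare. The thread count of 50 is an arbitrary placeholder.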

@asdf1337
Author

It may not be possible to get it down to a version I can share easily.
When you have time let's go through the profiling procedure.
Thanks again.

@headius
Member

headius commented Nov 8, 2016

This isn't going to make 9.1.6.0, and I apologize for the lack of progress here.

I have written up a wiki page on various methods of profiling JRuby here: https://github.com/jruby/jruby/wiki/Troubleshooting-Performance

These flags will give you a good start on profiling. Keep in mind the allocation profile will be very slow, so what you really need is to just ensure the main work of your application has been running for a while; it doesn't have to complete to be useful. We're looking for really high allocation counts in the resulting output.

You can also stop by IRC (perhaps after RubyConf this week) and some of us will be there to help you.

@headius headius modified the milestones: JRuby 9.2.0.0, JRuby 9.1.6.0 Nov 8, 2016
@JasonLunn
Contributor

@asdf1337 - Is it possible that this is a manifestation of jruby/jruby-openssl#111?
If so, we had success instantiating a certificate store, freezing it, assigning it to a constant, and then passing it into every HTTP request we made. The way to pass the certificate store varies depending on what library you use, but we've gotten it working with both httparty and rest-client. An engineer on our team has also claimed that it doesn't matter whether the request is HTTP or HTTPS, and that the certificate store was being constantly reconstructed regardless, but I don't know if that assertion is true across all libraries.

@headius
Member

headius commented May 15, 2018

Still an issue? We haven't revisited this in over a year.

@headius headius modified the milestones: JRuby 9.2.0.0, JRuby 9.2.1.0 May 15, 2018
@JasonLunn
Contributor

JasonLunn commented May 15, 2018

I don't know if there is still a performance degradation because we implemented a workaround (described above) in our application some time ago and it remains in place today. It consists of an initializer in our Rails app that defines a constant:

CERTIFICATE_STORE = OpenSSL::X509::Store.new.freeze
CERTIFICATE_STORE.set_default_paths

In every place where we use HTTParty to make outbound connections, we pass in the constant as the value of the cert_store parameter like so:

HTTParty.get url, cert_store: CERTIFICATE_STORE

In every place where we use RestClient, we use a similar strategy:

RestClient::Request.execute method: :get, url: url, ssl_cert_store: CERTIFICATE_STORE

I still believe this issue is related to jruby/jruby-openssl/issues/111 - I haven't tried dropping the workarounds since that issue remains open.
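
The same idea can be sketched for Ruby's standard-library Net::HTTP, which accepts a store through `cert_store=`. This was not tested in this thread and is only an illustration: `build_client` and `SHARED_CERT_STORE` are hypothetical names, example.com is a placeholder, and here the store is configured before it is frozen.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Build the store once at load time and share it across all requests.
SHARED_CERT_STORE = OpenSSL::X509::Store.new
SHARED_CERT_STORE.set_default_paths
SHARED_CERT_STORE.freeze

# Hypothetical helper: returns a Net::HTTP client that reuses the
# shared store instead of letting each request build its own.
def build_client(url)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.cert_store = SHARED_CERT_STORE
  http
end

# No connection is opened until a request is made, e.g.:
#   build_client('https://example.com/').get('/')
```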

@headius headius modified the milestones: JRuby 9.2.1.0, JRuby 9.2.2.0 Sep 19, 2018
@headius
Member

headius commented Sep 19, 2018

Not sure what progress has been made on this. Mostly pending jruby-openssl work I believe.

@enebo enebo modified the milestones: JRuby 9.2.5.0, JRuby 9.2.6.0 Dec 6, 2018
@headius headius removed this from the JRuby 9.2.6.0 milestone Dec 18, 2018