-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
require 'everything' inside a jar in 9.1.5.0 seems very slow #4185
Comments
require 'everything'
inside a jar in 9.1.5.0 seems very slow
Current suspect for me would be the changes we made to classloader resource loading and searching in 9k. Something isn't getting cached or we're doing more work to load those resources now than in 1.7. Hopefully -Xprof or -Djruby.debug.loadService.timing will tell us something. |
The first thing i did was turn on This caused me to realize another difference between the two jars. The 1.7 jar is custom built, after The J9K jar uses |
Now I have to apologize for being a java novice. While I can use Maybe something in the way the java main class that warbler provides is written? Off to look at that. |
Here is more story. Our jar builder was apparently a hacked up copy of a much older warbler which someone pounded on until it made a runnable JRuby jar. I threw it out for the j9k port and moved us to unmodified warbler 2.0.3, I never even looked at the old code. However, going back ... there is this file Don't know how this would relate to "J9000 jar runs won't save or print profile data" which is also something that I can't understand |
Here's FasterJar.java ... no idea where it came from, but it is sitting mixed into the hacked up warbler. I think this is a red herring, but I am still coming up to speed on how all this works, maybe I am wrong. import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import org.jruby.Ruby;
import org.jruby.RubyArray;
import org.jruby.RubyHash;
import org.jruby.RubyModule;
import org.jruby.RubyString;
import org.jruby.anno.JRubyMethod;
import org.jruby.runtime.Block;
import org.jruby.runtime.ThreadContext;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.util.ByteList;
import org.jruby.util.JRubyFile;
public class FasterJar {
public static void create(Ruby runtime) {
RubyModule task = runtime.getClassFromPath("Jar::Builder");
task.defineAnnotatedMethods(FasterJar.class);
}
@JRubyMethod
public static IRubyObject create_jar(ThreadContext context, IRubyObject recv, IRubyObject jar_path, IRubyObject entries) {
final Ruby runtime = recv.getRuntime();
if (!(entries instanceof RubyHash)) {
throw runtime.newArgumentError("expected a hash for the second argument");
}
RubyHash hash = (RubyHash) entries;
try {
FileOutputStream file = newFile(jar_path);
try {
ZipOutputStream zip = new ZipOutputStream(file);
addEntries(context, zip, hash);
zip.finish();
} finally {
close(file);
}
} catch (IOException e) {
if (runtime.isDebug()) {
e.printStackTrace();
}
throw runtime.newIOErrorFromException(e);
}
return runtime.getNil();
}
@JRubyMethod
public static IRubyObject entry_in_jar(ThreadContext context, IRubyObject recv, IRubyObject jar_path, IRubyObject entry) {
final Ruby runtime = recv.getRuntime();
try {
InputStream entryStream = getStream(jar_path.convertToString().getUnicodeValue(),
entry.convertToString().getUnicodeValue());
try {
byte[] buf = new byte[16384];
ByteList blist = new ByteList();
int bytesRead = -1;
while ((bytesRead = entryStream.read(buf)) != -1) {
blist.append(buf, 0, bytesRead);
}
IRubyObject stringio = runtime.getModule("StringIO");
return stringio.callMethod(context, "new", runtime.newString(blist));
} finally {
close(entryStream);
}
} catch (IOException e) {
if (runtime.isDebug()) {
e.printStackTrace();
}
throw runtime.newIOErrorFromException(e);
}
}
private static void addEntries(ThreadContext context, ZipOutputStream zip, RubyHash entries) throws IOException {
RubyArray keys = entries.keys().sort(context, Block.NULL_BLOCK);
for (int i = 0; i < keys.getLength(); i++) {
IRubyObject key = keys.entry(i);
IRubyObject value = entries.op_aref(context, key);
addEntry(context, zip, key.convertToString().getUnicodeValue(), value);
}
}
private static void addEntry(ThreadContext context, ZipOutputStream zip, String entryName, IRubyObject value) throws IOException {
if (value.respondsTo("read")) {
RubyString str = (RubyString) value.callMethod(context, "read").checkStringType();
byte[] contents = str.getByteList().getUnsafeBytes();
zip.putNextEntry(new ZipEntry(entryName));
zip.write(contents);
} else {
File f;
if (value.isNil() || (f = getFile(value)).isDirectory()) {
zip.putNextEntry(new ZipEntry(entryName + "/"));
} else {
String path = f.getPath();
if (!f.exists()) {
path = value.convertToString().getUnicodeValue();
}
InputStream inFile = getStream(path, null);
try {
zip.putNextEntry(new ZipEntry(entryName));
byte[] buf = new byte[16384];
int bytesRead = -1;
while ((bytesRead = inFile.read(buf)) != -1) {
zip.write(buf, 0, bytesRead);
}
} finally {
close(inFile);
}
}
}
}
private static FileOutputStream newFile(IRubyObject jar_path) throws IOException {
return new FileOutputStream(getFile(jar_path));
}
private static File getFile(IRubyObject path) {
return JRubyFile.create(path.getRuntime().getCurrentDirectory(),
path.convertToString().getUnicodeValue());
}
private static void close(Closeable c) {
try {
c.close();
} catch (Exception e) {
}
}
private static Pattern PROTOCOL = Pattern.compile("^[a-z][a-z0-9]+:");
private static InputStream getStream(String jar, String entry) throws IOException {
Matcher m = PROTOCOL.matcher(jar);
while (m.find()) {
jar = jar.substring(m.end());
m = PROTOCOL.matcher(jar);
}
String[] path = jar.split("!/");
InputStream stream = new FileInputStream(path[0]);
for (int i = 1; i < path.length; i++) {
stream = entryInJar(stream, path[i]);
}
if (entry == null) {
return stream;
}
return entryInJar(stream, entry);
}
private static InputStream entryInJar(InputStream jar, String entry) throws IOException {
ZipInputStream jstream = new ZipInputStream(jar);
ZipEntry zentry = null;
while ((zentry = jstream.getNextEntry()) != null) {
if (zentry.getName().equals(entry)) {
return jstream;
}
jstream.closeEntry();
}
throw new FileNotFoundException("entry '" + entry + "' not found in " + jar);
}
} |
jruby 9K usually just uses classloader lookup when finding something and there is no caching involved like "directory path does not exits". I think there some caching on jruby side can speedup things. looking at FasterJar is assuming the war-file is used with a regular jar-protocol or at least uses "!/" inside the path. that might not work with custom protocol handlers used by websphere, wildfly, osgi, etc |
ok ... i figured out how to use "jvisualvm" to grab a profile of just the require processing (and of course all the code that fires up as part of the require-ing) i turned on sampling just before the first require, and the program exits right after the last one. here's the export of the data file ... tried to ALSO do a profile run
|
ok ... updated to 1.8.0_102 and now the profiler doesn't segv the jvm ... is the profile output useful? |
looks like Let me know if you need anything else, I think I've grabbed the first set of data. I wish I understood more about this, feel awkward filing a bug which will probably turn out to be my fault. |
This is the last stumbling block for us switching to 9000. If I wanted to look at JRuby code for why this is slow, where would I be looking? |
@the-michael-toy jruby-9k almost everywhere (there are only two minor exceptions) uses uri:classloader to find its resources on JRubyClassloader |
@the-michael-toy Thank you for your continued exploration on this. We understand this is holding you back and will try to get back to this. I'm marking it for 9.1.6.0 to help prioritize it. Is it possible to get access to your application, so we can do our own profiling too? Your posts have been helpful, but it still seems to point to just general classloading as the choke point. |
I can probably figure out how to give you something, I'll look at that this afternoon, thanks. |
Well I couldn't figure out how to give you a jar without getting lawyers involved, but I have done different and possibly enlightening experiment. I wrote a ruby program which generates a source project roughly similar in layout and complexity to our project. The jar in this case has the ruby app code compiled, but the gems in source, because that is how our app is packaged at the moment.
The slowness is interesting because the But the slowness is nothing like the slowness in our app, and if our app were only this much slower, I would already have switched to JRuby9000. I will have to think more about what our actual app is doing differently than the fake app. |
For giggles I added
to the java options, which i didn't expect to make much difference On my fast mac, this cut 10 seconds off the 120 startup time. |
I tried |
So I re-ran the profiler today and tried to dig around a little bit more today as if I knew what I was doing. As the profile is running, JarCache.isValid is a hotspot shown as the data is being collected. In 9.1.5.0 this is Flailing, but trying to find some helpful data. |
Trying to figure out how to run with a custom jruby so I can experiment with "isValid" always returning true. Got the jruby part down, can't get bundler to find the right jruby-jars ... working on that. |
yeah ... red herring ... making JarCache.isValid simply return true didn't speed anything up. sigh. |
... but maybe because i used jruby-jars 9.1.5.0 in my Gemfile, because i couldn't figure out how to write a gemfile that used my custom jruby jar that would build with warbler (because warbler doesn't like :path gems) ... the version of the class loader i ended up using was the one in jruby 9.1.5.0 and not my patched one, so maybe it isn't a red herring. |
You may need to set it in an environment variable so it propagates to child processes. If you set it and did not see output, it's either not getting into JRuby or the output is getting clobbered by something in your app. Thanks for the script to create a big jar! This will be helpful in our investigation. Will try to look into it this week. |
I'm still thinking I didn't correctly stub out JarCache.JarIndex.isValid |
OK, I was correct, I had NOT patched out isValid properly. Based on my limited ability to understand the profile output, I made this change ...
and after I figured out how to use the custom JRuby I built with that change, it takes the load time from 110 seconds to 65 seconds. |
@kares I am not particularly keen on writing the cache so I am not pushing it. but I really do not like to see all those not sure about the fatjar, with the jruby-gradle-plugin that is default way to bundle up a jruby application and it runs off the jar - even rails application do work (now) with it. |
Sorry if this is a dumb idea, or if it exists already, but since a one-size-fits-all approach to jar access and caching doesn't exist, maybe some sort of pluggable API so that an app could make it's own more informed choices about which objects to cache and how long to keep the cache? |
I apparently live in the dark ages. If I used this plugin to build my jar instead of warble, that might fix everything ... I will look into that as well. |
I think this might slip to 9.1.7.0, since we need to get a release out with recent fixes before RubyConf this week. Bumping for now. @mkristian You are concerned about your port of the caching stuff? How much different is it from mine? |
@headius the difference is to keep the copying of nested jars and keep this CompoundJArHandler out. this means all dealing with classes business is done by the underlying classloader only those for me the caching was always missing in the URLResource, we have caching in JarResource, so I still think it is worth doing it, especially since I personally put a lot of effort to get applications like rails run off a packed fat-jars. but since I need to "make" time for it, a consensus among the core team to go this way is important for me. |
Any concerns with shipping an app with a forked-jruby running the patch? On Tue, Nov 8, 2016 at 11:20 PM Christian Meier notifications@github.com
|
No concerns from me :-) It's just your challenge to maintain. We do need to get this caching in, though. |
will prepare a PR then . . . |
am i crazy or is this fixed in 9.1.6.0? i feel like i must be measuring wrong. i'm seeing 10-12 seconds for the |
I am definitely not measuring wrong. 9.1.6.0 has fixed this problem for our app. I am moving our internal builds to 9000 next week and if all goes well next rollout to customers will all be 9000 based. Thanks. Slightly curious if this was intentionally fixed, or if it is an accidental benefit of some other change. |
curious though what commit did the little wonder ? |
I could bisect if that would be helpful, would probably take an afternoon. Want me to? |
@the-michael-toy hey if takes an afternoon - no. it is not so important, just nice to know. as you were the driving force on this issue and you are saying things are fine now - that counts (for me). would it make sense to close this issue ? |
@mkristian it can be closed. i too would like to know what made it better, since apparently it was a unintended benefit of some other fix, but i am mostly happy that we are off 1.7. thanks to everyone, including the mysterious masked bug fixer. |
OK, I took the time to bisect this down because it's holiday weekend and I am not allowed to change anything in our code, and I know way more than before. First of all, 9.1.5.0, with master of our project, is also fast, so the speedup that I measured when I switched to 9.1.6.0 was NOT due to 9.1.6.0. So I started bisecting on our project's history to find out at what point it become fast, and I found the culprit. When I started working on this, there was a bug which I originally thought was in warbler, but I eventually decided must be in bundler. The bug was an exception when you did The workaround was to iterate over the gems and add their Recently as I was preparing to ship the 9000 upgrade I have been reviewing patches I made to get it running with 9000, and removing patches that were no longer needed, and this was one of them. I did not re-check performance at that time, I just did the check, sort of casually when I upgraded to 9.1.6.0. Which caused me to mis-attribute the speed to 9.1.6.0. My test app shows that it isn't just a long I'm still happy with this being closed, thought you might appreciate the extra info. |
@the-michael-toy Thank you for the followup and extra work on this! So I'm on the same page...the problem was basically that a very large load path severely impacted load performance, to an extreme degree when running inside a jar. Correct? If all those gems were going to be loaded anyway, I'd expect that eventually all their libs would be in load path. However, a few things lend some insight to why it's faster when you don't eagerly add them:
This also gives us something that might be worth looking into: why does our load speed degrade so badly when there's many load path entries running from within a jar file? And it may still be worth having some sort of caching, but perhaps we have a free hand to reimagine it now. |
Exercise for those who have looked into this: knowing what we know from @the-michael-toy, could we duplicate this performance problem? Perhaps just generate a big jar with a few thousand load path entries and try requiring files that come earlier or later in those paths? I'm going to move this to a 9.2 activity since it sounds like we're actually doing better than 1.7. |
close, i have a "fake very big project" app, which generates a fake app
sized about as large as my app, and it isn't slow.
the difference is that our app has one gem which has a jar in the gem. what
i GUESS is that the nested jar is what triggers the badness, somehow.
but yeah, 9.2 sounds fine.
…On Tue, Dec 13, 2016 at 3:00 AM Charles Oliver Nutter < ***@***.***> wrote:
Exercise for those who have looked into this: knowing what we know from
@the-michael-toy <https://github.com/the-michael-toy>, could we duplicate
this performance problem? Perhaps just generate a big jar with a few
thousand load path entries and try requiring files that come earlier or
later in those paths?
I'm going to move this to a 9.2 activity since it sounds like we're
actually doing better than 1.7.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#4185 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACHuW57wJF4qx0S0l05DSCB7eV_y-Qqoks5rHnrWgaJpZM4KIODk>
.
|
Don't need anything fancy to trigger just, just shuffle activerecord inside a jar and try to:
|
Can someone retest this with JRuby 9.2? I'm not sure, but I feel like we made improvements to loading from within a jar. |
Over a year since last followup so I'm going to close this. If this is still an issue let us know. |
Environment
eventually gets to the line of code which loads the loaders, which then load the bulk of the classes which make up our app
Expected Behavior
In 1.7.19 this completes in about 10 seconds when run from a jar
Actual Behavior
In 9.1.5.0 this runs acceptably fast from the file system (but still slower than the jar load time of 1.7), but takes over 107 seconds when run from a jar where all the ruby files are compiled. On a typical AWS instance, this expands to well over three minutes.
Working on getting
-Xprof
data now.The text was updated successfully, but these errors were encountered: