Unable to override classpath for ScriptingContainer #3461

Closed
bigsur0 opened this issue Nov 15, 2015 · 19 comments

Comments

@bigsur0
Contributor

bigsur0 commented Nov 15, 2015

In one of my target deployment environments, the Java classpath contains another, older version of JRuby. Our application uses ScriptingContainer to execute Ruby code, and when it loads YAML files it hits a classpath collision between the two JRuby versions, which surfaces as an error related to Psych. To work around this, I tried using ScriptingContainer#setLoadPaths to control the classpath provided to the Ruby "environment" by excluding the older jruby-complete jar, but it doesn't seem to take effect: the container always inherits the classpath from the parent Java environment.

Maybe this isn't supposed to work in this way, but the documentation below led me to believe this would be possible. If this isn't a bug, is there another way to override the CLASSPATH or class loading behavior to exclude certain jars at runtime?

From: org.jruby.embed.ScriptingContainer

public void setLoadPaths(List<String> paths)
Changes a list of load paths Ruby scripts/libraries. The default value is an empty array. 
If no paths is given, the list is created from java.class.path System property. 
This value can be set by org.jruby.embed.class.path System property, also. 
Call this method before you use put/get, runScriptlet, and parse methods so that the given paths will be used.

Specified by:
setLoadPaths in interface EmbedRubyInstanceConfigAdapter

Parameters:
paths - a new list of load paths.

Since:
JRuby 1.5.0.
@mkristian
Member

@r6p the only thing you can set with the ScriptingContainer is the classloader:

    scriptingContainer.setClassLoader(something)

and this classloader needs to contain the org.jruby:jruby dependencies (or jruby-complete). so you can always collect the "jars" and set up a clean classloader hierarchy with only ONE jruby in it.

if you use the IsolatedScriptingContainer then you get one which does not inherit GEM_HOME, GEM_PATH, JARS_HOME from the outer environment but assumes everything is embedded.

you can also try making the jruby classloader look at itself before going to the parent, but this has problems with the nokogiri gem (and maybe others). still, there is a chance this solves your problem:
scriptingContainer.setClassloaderDelegate(false)
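
One way to "get the jars" for a clean hierarchy is to filter the entries of java.class.path before building a classloader. The following is a minimal sketch using only the JDK; the class name `ClasspathFilter`, the method `withoutJars`, and the file-name pattern are illustrative assumptions, not part of JRuby or of this thread.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathFilter {

    /**
     * Splits a classpath string and returns URLs for every entry whose
     * file name does NOT match the given pattern. (Hypothetical helper.)
     */
    static List<URL> withoutJars(String classpath, String separator, Pattern unwanted)
            throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            File file = new File(entry);
            if (unwanted.matcher(file.getName()).matches()) {
                continue; // drop the conflicting jar
            }
            urls.add(file.toURI().toURL());
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        // Assumed pattern: any jruby-complete 1.7.x jar counts as "the old one".
        Pattern oldJRuby = Pattern.compile("jruby-complete-1\\.7\\..*\\.jar");
        List<URL> urls = withoutJars(
                System.getProperty("java.class.path"), File.pathSeparator, oldJRuby);
        urls.forEach(System.out::println);
    }
}
```

The resulting list could then be passed as `urls.toArray(new URL[0])` to a `URLClassLoader` constructor, and that loader handed to `setClassLoader`. Whether this is enough depends on which classloader already holds the unwanted jar.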

@bigsur0
Contributor Author

bigsur0 commented Nov 16, 2015

@mkristian Thanks. But I actually already tried what you suggested, i.e. IsolatedScriptingContainer#setClassLoader, and had no luck with it. Adding scriptingContainer.setClassloaderDelegate(false) doesn't seem to change the behavior either; the parent classloader is always in play.

Here's what I tried; if I am doing something wrong or have the order wrong, let me know. Keep in mind that this did work when testing via rspec, but not at runtime in my target deployment environment.

    ruby.setClassLoader(new URLClassLoader(urls));
    ruby.setClassloaderDelegate(false);

@mkristian
Member

new URLClassLoader(urls) uses the "application classloader" as its parent. you are better off using
new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent()), whose parent is the extension classloader, the best place to start your own classloader hierarchy.

but this will not work if jruby is loaded by the bootstrap classloader, which is what bin/jruby from the binary distribution does for faster loading.
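
The difference between the two constructors can be checked with the JDK alone. This is a small sketch, not JRuby code; `ParentLoaderDemo` and its method names are made up for illustration. Note that on Java 9 and later, `ClassLoader.getSystemClassLoader().getParent()` is the platform class loader, which took over the role of the extension class loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ParentLoaderDemo {

    /** True when the one-argument constructor delegates to the application (system) loader. */
    static boolean defaultParentIsSystemLoader() throws Exception {
        try (URLClassLoader loader = new URLClassLoader(new URL[0])) {
            return loader.getParent() == ClassLoader.getSystemClassLoader();
        }
    }

    /**
     * Tries to load a class through a loader whose parent is the system
     * loader's parent, so nothing from java.class.path is visible to it.
     */
    static boolean visibleWithoutAppClasspath(String className) throws Exception {
        ClassLoader parent = ClassLoader.getSystemClassLoader().getParent();
        try (URLClassLoader loader = new URLClassLoader(new URL[0], parent)) {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(defaultParentIsSystemLoader());
        // A JDK class stays visible; an application class does not.
        System.out.println(visibleWithoutAppClasspath("java.lang.String"));
        System.out.println(visibleWithoutAppClasspath("ParentLoaderDemo"));
    }
}
```

With the one-argument constructor, every jar on java.class.path, including an unwanted jruby-complete, stays reachable through parent delegation, which is why filtering the URL list alone does not isolate anything.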

@bigsur0
Contributor Author

bigsur0 commented Nov 16, 2015

@mkristian new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent()) didn't seem to work either. I am going to give new URLClassLoader(urls, null) a shot, unless you can think of a better option.

@mkristian
Member

@r6p this sounds like something is wrong :(

where is this jruby-complete.jar located? in which classloader? ClassLoader.getSystemClassLoader()?

take this
new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent()), load the ScriptingContainer class from it via loadClass, and execute it via reflection.
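
A sketch of the "loadClass plus reflection" approach follows. It avoids casting to a compile-time type, because a class defined by the isolated loader is a different type from the one the calling code was compiled against. `ReflectiveCall` and its two helpers are hypothetical names; the demo in `main` uses `java.lang.StringBuilder` only so that it runs without JRuby on the classpath.

```java
import java.lang.reflect.Method;

public class ReflectiveCall {

    /** Instantiates className from the given loader via its no-argument constructor. */
    static Object newInstance(ClassLoader loader, String className)
            throws ReflectiveOperationException {
        Class<?> type = Class.forName(className, true, loader);
        return type.getDeclaredConstructor().newInstance();
    }

    /** Invokes a public method by name and parameter types, without any cast on the target. */
    static Object invoke(Object target, String name, Class<?>[] parameterTypes, Object... args)
            throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(name, parameterTypes);
        return method.invoke(target, args);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in demo with a JDK class. With an isolated loader that contains
        // JRuby, the analogous (untested here) calls would be:
        //   Object c = newInstance(isolated, "org.jruby.embed.IsolatedScriptingContainer");
        //   invoke(c, "runScriptlet", new Class<?>[] { String.class }, "puts 1");
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        Object builder = newInstance(loader, "java.lang.StringBuilder");
        invoke(builder, "append", new Class<?>[] { String.class }, "hello");
        System.out.println(invoke(builder, "toString", new Class<?>[0])); // prints "hello"
    }
}
```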

@bigsur0
Contributor Author

bigsur0 commented Nov 16, 2015

@mkristian, both our jar with 9K embedded and the conflicting older jruby-complete.jar are in the SystemClassLoader. That is why it seems that #getParent() should work as you describe.

I've also tried the code below and within an 'rspec' context the resulting $CLASSPATH still contains jars that come from the SystemClassLoader that I had filtered out when creating the isolatedClassLoader. Let me try it in the true runtime context to see if anything is different there.

    IsolatedScriptingContainer ruby = null;
    ClassLoader isolatedClassLoader = new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent());
    try {
      ruby = (IsolatedScriptingContainer) isolatedClassLoader.loadClass("org.jruby.embed.IsolatedScriptingContainer").newInstance();
    } catch (ClassNotFoundException ex) {
      // Unable to resolve class for some reason, fallback to default and move on
    } catch (InstantiationException ex) {
      // Unable to instantiate class for some reason, fallback to default and move on
    } catch (IllegalAccessException ex) {
      // No access to instantiate class for some reason, fallback to default and move on
    }
    ruby.setClassloaderDelegate(false);
    ruby.setClassLoader(isolatedClassLoader);

    ruby.runScriptlet("require 'pp'; pp $CLASSPATH");

@mkristian
Member

@r6p what is the output of this?

FYI $CLASSPATH holds the urls from JRuby.runtime.jruby_class_loader; this jruby-classloader has the classloader of the ScriptingContainer as its parent.

now I had the thought that the problem might be that by default those ScriptingContainers create a singleton runtime, i.e. they use the global runtime, which might be why things get shared between the two runtimes (are there two runtimes? it sounds like it from what you described so far).

@mkristian
Member

@r6p hey, any luck with this? I know it is tricky and remote debugging is difficult, but I am sure there is a way of getting it to work (I have had similar cases in the past).

@bigsur0
Contributor Author

bigsur0 commented Dec 4, 2015

@mkristian, I've tried a few things and none worked. My last major effort was to use an IsolatedScriptingContainer created via reflection, calling setClassloaderDelegate(false) and setClassLoader(customClassLoader), and then launching the Ruby code that kicks off our Map/Reduce jobs. This produces a number of problems caused by code being loaded by multiple class loaders: the same types are not equivalent because they were loaded under different class loaders, mostly in relation to Hadoop dependencies (see stack trace below). I am going to try to create a simple test case that reproduces the problem and post back.

With regard to your thoughts on two runtimes, can you give me more information on how I can either ensure that there are two separate runtimes or verify that a single runtime is being shared? Is there a JRuby test case that validates that each ScriptingContainer has a unique runtime? If so, can you point me to it?

Any help working through this would be appreciated; it is blocking our upgrade to JRuby 9.0.3.0 at the moment.

Caused by: java.lang.RuntimeException: class org.apache.hadoop.security.ShellBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2138)
  at org.apache.hadoop.security.Groups.<init>(Groups.java:78)
  at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
  at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
  at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
  at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
  at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
  at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
  at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
  at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:72)
  at org.apache.hadoop.mapreduce.Job.<init>(Job.java:144)
  at org.apache.hadoop.mapreduce.Job.<init>(Job.java:131)
  at org.apache.hadoop.mapreduce.Job.<init>(Job.java:139)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.jruby.javasupport.JavaConstructor.newInstanceDirect(JavaConstructor.java:246)
  at org.jruby.java.invokers.ConstructorInvoker.call(ConstructorInvoker.java:58)
  at org.jruby.java.invokers.ConstructorInvoker.call(ConstructorInvoker.java:138)
  at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:273)
  at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:79)
  at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:83)
  at org.jruby.java.proxies.ConcreteJavaProxy$InitializeMethod.call(ConcreteJavaProxy.java:48)
  at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:273)
  at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:79)
  at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:83)
  at org.jruby.RubyClass.newInstance(RubyClass.java:891)
  at org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen)
  at org.jruby.java.proxies.ConcreteJavaProxy$NewMethod.call(ConcreteJavaProxy.java:105)
  at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:273)
  at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:79)
  at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:83)
  at org.jruby.ir.instructions.CallBase.interpret(CallBase.java:419)
  at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:322)
  at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:77)
  at org.jruby.internal.runtime.methods.MixedModeIRMethod.INTERPRET_METHOD(MixedModeIRMethod.java:127)
  at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:113)
  at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:171)
  at org.jruby.RubyClass.finvoke(RubyClass.java:737)
  at org.jruby.runtime.Helpers.invoke(Helpers.java:409)
  at org.jruby.embed.internal.EmbedRubyObjectAdapterImpl.callEachType(EmbedRubyObjectAdapterImpl.java:356)
  at org.jruby.embed.internal.EmbedRubyObjectAdapterImpl.call(EmbedRubyObjectAdapterImpl.java:309)
  at org.jruby.embed.internal.EmbedRubyObjectAdapterImpl.callMethod(EmbedRubyObjectAdapterImpl.java:250)
  at org.jruby.embed.ScriptingContainer.callMethod(ScriptingContainer.java:1412)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at foo.MRDriver.run(MRDriver.java:79)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at foo.MRDriver.main(MRDriver.java:28)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:57)
  at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:46)
  at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:38)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:228)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
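
The "class X not Y" failure at the top of this trace is what one would expect when an implementation and its interface are defined by different class loaders. A minimal JDK-only sketch of that effect is below. It assumes the demo class was compiled to a directory or jar on disk, so that its code source location can be reopened; `LoaderIdentityDemo` and `Payload` are illustrative names standing in for the Hadoop types.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderIdentityDemo {

    /** Stand-in for a shared type such as a Hadoop interface. */
    public static class Payload {}

    /**
     * Loads a second copy of the given class from the same location on disk,
     * through a loader that does not delegate to the application loader.
     */
    static Class<?> loadIsolatedCopy(Class<?> type) throws Exception {
        URL location = type.getProtectionDomain().getCodeSource().getLocation();
        ClassLoader parent = ClassLoader.getSystemClassLoader().getParent();
        try (URLClassLoader isolated = new URLClassLoader(new URL[] { location }, parent)) {
            return isolated.loadClass(type.getName());
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadIsolatedCopy(Payload.class);
        // Same fully qualified name...
        System.out.println(copy.getName().equals(Payload.class.getName()));
        // ...but a different runtime type, so assignment checks fail.
        System.out.println(copy == Payload.class);
        System.out.println(Payload.class.isAssignableFrom(copy));
    }
}
```

Under the stated assumption, the name comparison should hold while the identity and assignability checks should not. That is the same kind of mismatch a check like the one in `Configuration.getClass` would report.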

@bigsur0
Contributor Author

bigsur0 commented Dec 6, 2015

@mkristian, I was able to get something working using multiple classloaders (not ideal). The key was to create a new classloader that doesn't include the conflicting jruby-complete jar, load the IsolatedScriptingContainer from it via reflection, call setClassloaderDelegate(false), and finally, in the Map/Reduce Ruby code executed via the scripting container, call java.lang.Thread.currentThread().setContextClassLoader(JRuby.runtime.jruby_class_loader). This differs slightly from my previous attempt, in which I called setClassLoader(isolatedClassLoader) when setting up the IsolatedScriptingContainer instance.

All this was required to avoid a YAML conflict on require 'yaml', which was loading YAML code from both versions of the JRuby jars on the classpath even though the preferred version of JRuby was in a jar earlier on the classpath. I was hoping there would be a simpler way to avoid this conflict by preventing the 2nd JRuby jar from loading any code at all, but I understand that this is likely an edge case for most people.
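
For reference, the context-classloader step can also be done on the Java side. This is a sketch under assumptions: `ContextLoaderScope` and `callWith` are hypothetical names, and the restore in `finally` is an addition that the workaround above does not mention. The step presumably helps because libraries such as Hadoop resolve classes by name through the thread context class loader; the thread does not confirm that.

```java
import java.util.concurrent.Callable;

public class ContextLoaderScope {

    /**
     * Runs body with the given loader installed as the thread context
     * class loader, then restores whatever was set before.
     */
    static <T> T callWith(ClassLoader loader, Callable<T> body) throws Exception {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return body.call();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the isolated loader that holds the single JRuby jar.
        ClassLoader isolated = new ClassLoader(ClassLoader.getSystemClassLoader().getParent()) {};
        ClassLoader seen = callWith(isolated, () -> Thread.currentThread().getContextClassLoader());
        System.out.println(seen == isolated);
    }
}
```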

I "think" this bit of irb code illustrates the root issue that I was facing.

cmd: pwd
/tmp/jruby-collision

cmd: ls
jruby-complete-1.7.19.jar
jruby-complete-9.0.3.0.jar

cmd: java -Xmx500m -Xss1024k -jar jruby-complete-9.0.3.0.jar -e 'load "META-INF/jruby.home/bin/jirb"'

>> puts $CLASSPATH
["file:/Users/rasik_pandey/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-16288/jruby1727575212859602471readline.jar"]
=> nil

>> puts $LOAD_PATH
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib
=> nil

>> c = org.jruby.embed.IsolatedScriptingContainer.new
=> #<Java::OrgJrubyEmbed::IsolatedScriptingContainer:0x6dab01d9>

>> c.run_scriptlet("puts $LOAD_PATH")
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib
=> nil

>> c.run_scriptlet("puts $CLASSPATH")
["file:/Users/me/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-15276/jruby6743288289275416610readline.jar"]
=> nil

>> $CLASSPATH << File.join(Dir.pwd, 'jruby-complete-1.7.19.jar') # After this point the 2nd JRuby jar is available on the $CLASSPATH and can cause conflicts
=> ["file:/Users/me/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-15276/jruby6743288289275416610readline.jar", "file:/private/tmp/jruby-collision/jruby-complete-1.7.19.jar"]

>> puts $LOAD_PATH
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib
=> nil

>> puts $CLASSPATH
["file:/Users/me/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-15276/jruby6743288289275416610readline.jar", "file:/private/tmp/jruby-collision/jruby-complete-1.7.19.jar"]
=> nil

>> c.run_scriptlet("puts $LOAD_PATH")
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib
=> nil

>> c.run_scriptlet("puts $CLASSPATH") # Notice the 2nd JRuby jar is on this classpath  
["file:/Users/me/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-15276/jruby6743288289275416610readline.jar", "file:/private/tmp/jruby-collision/jruby-complete-1.7.19.jar"]
=> nil


>> c = org.jruby.embed.IsolatedScriptingContainer.new
=> #<Java::OrgJrubyEmbed::IsolatedScriptingContainer:0x68ee3b6d>
>> c.setClassloaderDelegate(false)
=> nil

>> c.run_scriptlet("puts $LOAD_PATH")
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib
=> nil

>> c.run_scriptlet("puts $CLASSPATH") # Again, even w/ a new container instance,  notice the 2nd JRuby jar is on this classpath
["file:/Users/me/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-15276/jruby6743288289275416610readline.jar", "file:/private/tmp/jruby-collision/jruby-complete-1.7.19.jar"]
=> nil

>> c = org.jruby.embed.IsolatedScriptingContainer.new
=> #<Java::OrgJrubyEmbed::IsolatedScriptingContainer:0x593a6726>

>> c.setClassloaderDelegate(false)
=> nil

>> c.run_scriptlet("puts $CLASSPATH") # Once again, notice that even w/ a new container instance and setClassloaderDelegate the 2nd JRuby jar is on this classpath
["file:/Users/rasik_pandey/.m2/repository/jline/jline/2.11/jline-2.11.jar", "file:/var/folders/20/fx2s3vsn3tgf_88k5yq6qpyw0000gn/T/jruby-16216/jruby7850591616193323552readline.jar", "file:/private/tmp/jruby-collision/jruby-complete-1.7.19.jar"]
=> nil

>> c.run_scriptlet("puts $LOAD_PATH")
uri:classloader:/META-INF/jruby.home/lib/ruby/2.2/site_ruby
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib

>> c.run_scriptlet("require 'yaml'") # Boom the YAML loading conflict is triggered
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/psych.rb:229: warning: already initialized constant LIBYAML_VERSION

@bigsur0
Contributor Author

bigsur0 commented Dec 9, 2015

@headius @enebo, any chance I can get you guys to have a look at this as well? I basically have two colliding versions of jruby-complete on a classpath that I don't control. The primary version that I want to use is bundled into my own jar, but there is another, older version of the jruby-complete jar on the classpath. I want all 9000 resources from stdlib and site_ruby, such as require 'yaml', to load from my jar instead of from both jars.

@mkristian
Member

@r6p if by classpath you mean $CLASSPATH, as in your sample code: you have snakeyaml.jar as part of jruby-complete-1.7.19.jar, and the psych gem from jruby-9k uses a different version of snakeyaml. both will use the same $CLASSPATH, so you end up with two different versions of snakeyaml on the same classloader, and there is no way to fix this besides keeping jruby-complete-1.7.19.jar off the classpath. btw jruby-complete-1.7.23.jar uses the same snakeyaml version as psych from jruby-9k

@bigsur0
Contributor Author

bigsur0 commented Dec 10, 2015

@mkristian this is true whether it is $CLASSPATH or java.class.path. So I guess the short answer is that there is no way to keep the 2nd version of JRuby from interfering, at least when Ruby files are being loaded from the $LOAD_PATH. I guess I'll have to go with the custom ClassLoader option for now.

@mkristian
Member

@r6p what I said is about $CLASSPATH, and adding a jruby-9k jar there will blow up as well.

as for java.class.path - let me try this.

@mkristian
Member

@r6p I just played around a bit with having both jruby-complete-1.7.19 and jruby-complete-9.0.4.0 on the java.class.path. this starts using org.jruby.gen.org$jruby$ext$psych$PsychParser$POPULATOR, which no longer exists in jruby-9k, and it uses methods which only exist in jruby-1.7.x. so yes, there is no way to run both jrubies in one classloader

@bigsur0
Contributor Author

bigsur0 commented Dec 10, 2015

@mkristian, ok many thanks for "all" your help. Is it worth a feature request to be able to isolate the ScriptingContainer environment to a single version of JRuby? Or is this just too hard or impossible to do? I mean doing so w/o having to create a custom ClassLoader, i.e. the first JRuby jar on the classpath containing ScriptingContainer or IsolatedScriptingContainer wins and is used, ignoring any other versions later on the classpath.

@mkristian
Member

@r6p the problem here is really that you have two versions of the same library on one classloader - in general this is a source of all kinds of classloader problems: NoSuchMethodError, ClassCastException. this is true for any library. with a classpath there is an order and you could do something with it, but when the same jars are put into a WEB-INF/lib directory there is no defined order - maybe lexical, maybe creation time. IMO there is no way to achieve this.
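
To check whether one classloader can see more than one copy of a library, `ClassLoader.getResources` can list every location that provides a given class file. A small diagnostic sketch follows; `DuplicateResourceFinder` and `locationsOf` are made-up names, and `org.jruby.Ruby` is used only as an example of a class that every JRuby jar should contain.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class DuplicateResourceFinder {

    /** Lists every location from which the loader could serve the named class file. */
    static List<URL> locationsOf(ClassLoader loader, String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        List<URL> copies = locationsOf(loader, "org.jruby.Ruby");
        System.out.println(copies.size() + " location(s) provide org.jruby.Ruby");
        // More than one entry means two JRuby jars are visible to this loader.
        copies.forEach(System.out::println);
    }
}
```

This only diagnoses the collision. As the comment above says, nothing on the JRuby side can pick a winner once both jars are visible to the same loader.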

@bigsur0
Contributor Author

bigsur0 commented Dec 11, 2015

@mkristian ok thanks again for all your help.

@mkristian
Member

@r6p closing the issue since, as far as I understand the problem, there is nothing we can do on the JRuby side of things

@mkristian mkristian added this to the Invalid or Duplicate milestone Jan 5, 2016