use of jruby causes “Errno::EBADF: Bad file descriptor” error #5249
EBADF would generally mean that one of the IO streams to the subprocess has been shut down, perhaps prematurely. Can you put together a small git repo that we can clone and run to see the problem locally? |
Alright. I'll work on trying to put together a bit of code that reproduces the error. Might take me into next week. I didn't post until I'd pretty much exhausted google and cross tested every other alternative. Stay tuned. |
OK, so it didn't take too long to cook up a demo. Here's a file set on GitHub that should allow you to reproduce the problem. Run the .rb file in jruby and you should get the following error:
As I said before, the code runs fine with system ruby and rubinius (no error). With either of those, you'll get a tiff file you can open with any image viewer program. Thanks for the help in figuring this out. |
I think there's a bug in the demo script you provided; it can't find the decode executable unless it is invoked as ./decode. Unfortunately, I also did not get EBADF:
With my change, does the script give you an EBADF, or do we need to keep looking for a reproduction? |
I've updated my script with your change, inserting a "./" in front of decode. I get the same error.
Here's a file listing in case there's something wrong there (I can't see what)
And for good measure I've run the executable manually as follows:
It produced the expected tiff file without issue. Switching over to system ruby with rbenv global system and running the script again produces the following result:
Ditto for rbx (rubinius) which I also have installed. Clearly, there's something wrong with my installation of jruby. PS. I've updated the GIT repo with the minor tweak to bugdemo.rb. |
I found this explanation from a post of yours quite some time ago. I don't know if it is relevant or not.
My point in using pipes is to improve performance over std disk-based file I/O. The code I've abstracted for the demo is part of a larger system that performs repeated conversions, hundreds of thousands of them in fact, and needs to do so as quickly as possible. That's why the code that actually does the converting is written in C and compiled. That's just step 1 though as I need to get the data to convert to the binary, and the converted data back from it, as quickly and as efficiently as possible. The earlier version of the code that used temporary files worked, but it also occasionally would throw an "Errno::EBADF: Bad file descriptor" error. I'm using multiple threads in a worker pool (Celluloid) to fire off as many of these conversions at once as the system can handle. It smelled to me in the temp-file version that I was running into a resource problem, but given that std file I/O is not as efficient as pipes theoretically, I moved to that. But I'm not even getting off first base with the new pipe-based code. If there's a better way to do what I need to do in jRuby, I'm open. |
Just tried the bug demo code with jruby 9.2.1... no joy. |
I am traveling but will try to look at this again soon. I suspect it is a difference in how we do IO or handle errors on Linux versus MacOS (I'm testing on MacOS). @enebo Can you reproduce this? |
Thanks for the reply, Charles. I'll do whatever I can to help you resolve this, including writing additional proof-of-concept code as necessary. I have a 16-core Mac and am running Linux in VMs under VirtualBox, so it's pretty easy for me to spin up custom test environments if needed. Would it be useful for me to try reproducing the problem natively on my Mac? I have brew installed, so in theory I should be able to get rbenv and jruby working natively.
|
@voyager131 @headius ok something fascinating here...If I run it using Java 8 it runs quickly. If I run with Java 9 or Java 10 it just sits there. Here is a dump:
I do not seem to get EBADF but it just doesn't work... |
That is indeed interesting. I am running java 10.
You guys will have to be the judges though of what that might mean or how best to proceed with this new information. I know next to nothing about java, and only use it. |
Ah-ha, Java 10...I think that's a detail I missed, but it could be important. Could you try downloading a Java 8 SDK and see if it works? The problem on Java 9+ is that we need to conform to some new standards about reflectively digging around inside the JDK classes (something we call "cracking them open"). In the case of process IO, this may mean that we are unable to properly represent native IO because we have to use the JDK IO classes as-is. I'm pulling 9 to my VM now over a slow connection and then I should be able to confirm this. |
Ok, I get a slightly different error than you do when I use Java 9, but it's failing:
|
Interestingly, it sometimes succeeds for me. |
Ok, some good news! If I modify your script to just use IO.popen (which doesn't provide access to stderr) it appears to work every time.

    stdin = IO.popen("./decode", 'r+');
    tap do
      stdin.write(raw_data)
      stdin.close_write
      $image_data = stdin.read
      #$status_message = stderr.read
      #exit_status = wait_thr.value
    end

The problem you're seeing seems to stem from the implementation of popen3 and how we handle it. |
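Since IO.popen on its own gives no handle on stderr, one possible workaround is to hand the child a separate pipe for it. The following is only a sketch under assumptions: `run_with_stderr` is a hypothetical helper name, `sh -c 'cat; ...'` stands in for ./decode, and nothing in this thread confirms how it behaves on JRuby with Java 9+.

```ruby
# Sketch: IO.popen for stdin/stdout plus a separate pipe for the child's stderr.
# `run_with_stderr` is a hypothetical helper, not part of the demo repo.
def run_with_stderr(cmd, input)
  err_r, err_w = IO.pipe
  out = nil
  IO.popen(cmd, "r+", err: err_w) do |io|
    err_w.close        # the parent only needs the read end
    io.write(input)
    io.close_write     # signal EOF so the child can finish
    out = io.read
  end
  status = $?          # set once the popen block has closed the pipe
  err = err_r.read
  [out, err, status]
ensure
  err_r.close if err_r && !err_r.closed?
  err_w.close if err_w && !err_w.closed?
end

# Stand-in for ./decode: echo stdin back and write a message to stderr.
out, err, status = run_with_stderr(
  ["sh", "-c", "cat; echo 'decode: warning' >&2"],
  "raw image bytes"
)
```

Note that stderr is only read after stdout has been drained, so a child that writes a lot to stderr could still stall on a full pipe; reading both streams on separate threads avoids that.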
Ok more information... It does appear that popen3 is getting a proper native PID for a directly-launched subprocess, rather than one faked through JDK classes. So my original theory about Java 9 interfering with the process launch seems to not be true. However if I pause your script inside the popen3 block and check whether a thread has actually been started for wait_thr, I see only the main thread running. This may not be related to your issue, but it seems that Process.detach is having some kind of trouble. |
So I have java 8 installed and working alongside java 10. I can switch between them on the command line just fine using
However this does not change the version of java jruby uses, as jruby --version yields the same result (see above) regardless of the setting I choose for java using the above commands. Working on figuring out how to fix that. Since I use rbenv, it might be nice to have another version of jruby linked to a different java or something. |
Good news!
For those (like me) who aren't as familiar with switching out the version of java used by jruby, that is done using the JAVA_HOME environment variable. To use jruby with java 10, $JAVA_HOME is set (on my system) to:
To use it with java 8, it must be set as follows:
The correct paths will likely vary based on your specific installation. In addition JAVA_OPTS must contain options compatible with the version of java set by JAVA_HOME (I had a problem with that, as java 10 requires some options in my case that java 8 doesn't support). Thanks to @headius and @enebo, I think it is known what the basic issue is with jruby and java10 using popen3 specifically. I'm not sure it matters to me which java version I use as long as I can get the performance I'm after, and I do need popen3 as stderr is vital to what I'm doing. I'll do a bit more work with java8 and post some additional feedback in a day or so. |
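To double-check from inside a script which Java the interpreter actually picked up (in addition to jruby --version), something like the following sketch may help. `java_runtime_info` is a hypothetical helper name; the Java system properties are only consulted when running under JRuby, and under other interpreters the Java fields are simply nil.

```ruby
# Sketch: report which Java runtime (if any) the current Ruby is running on.
# `java_runtime_info` is a hypothetical helper introduced for illustration.
def java_runtime_info
  info = { engine: RUBY_ENGINE, java_version: nil, java_home: nil,
           env_java_home: ENV["JAVA_HOME"] }
  return info unless RUBY_ENGINE == "jruby"

  require "java"
  # Standard Java system properties describing the running JVM.
  info[:java_version] = java.lang.System.getProperty("java.version")
  info[:java_home]    = java.lang.System.getProperty("java.home")
  info
end

info = java_runtime_info
puts info.inspect
```

If java_home does not match what JAVA_HOME was set to, the launcher is picking up a different JVM than intended.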
So my code appears to work quite well under jruby/java8. So problem solved as far as my immediate need is concerned. This leaves things unresolved for users of java 10 (9?) however. I'll leave it to you guys to sort that out. Some final comments by one or both of you are probably appropriate prior to closing the thread, so I'll leave it to one of you to do that. I also don't know if this conversation results in some to-do on the jruby development side. Another reason I'll let one of you close this. Thanks again for the help. It is much appreciated. |
Relates to modularity work in #4835 which will continue in 9.2.2. |
it also breaks many tests on java 9+ Line 97 in 49df0b4
the problem is that sun.nio.ch.SelChImpl.getFD fails to load and Pipes are using fake FD
java 8
java 9+
|
I just confirmed that #5235 now appears to work fine with JRuby 9.2.8 and Java 11. I think we can call this resolved after the past year of module-fiddling changes to JRuby and its launchers.
Can you retest? |
@headius I can confirm it works fine with these flags, thanks |
still getting this error with jruby-9.1.17.0 and java 11. |
@sjt003 You should upgrade to a recent JRuby 9.2.x build like 9.2.14.0. |
9.2.8.0 worked |
What was the actual issue? |
@sjt003 In order to support all Ruby features we sometimes have to open up core Java objects, for example to get a real file descriptor. When running on Java 9+, we are not able to open those objects unless we pass specific flags to Java or work around the limitations in other ways. Those sorts of changes are in more recent JRuby versions. I would not recommend using Java 9+ with anything earlier than JRuby 9.2.1, and of course we always recommend running the latest JRuby release for best compatibility and security. |
This is posted to StackOverflow as well.
I've got a little executable we'll call "decode" (written in C) that takes a block of data on stdin (an image file), converts it, and spits it back to stdout. So from the linux command line the following command works just fine:
I'm trying to wrap this binary in some ruby code and eliminate the need for the use of regular file IO using Open3.popen3. Here's the relevant section of Ruby code:
Variable f contains the block of data to convert. I was writing it to file and then calling decode on the written file. When trying to run the above code from irb using jruby, one gets the following error traceback (slightly sanitized):
The funny thing is that the exact same code works fine unchanged in irb if I'm using the system ruby interpreter, or rubinius (both of which I have installed and can switch between using rbenv).
Can anyone tell me what gives? I'm running Ubuntu Linux 18.04 LTS, and jruby 9.2.0.0 (2.5.0). JRuby is the platform of choice because of speed and other considerations, so I need to get this working.
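For readers without the demo repo, here is a rough sketch of the kind of Open3.popen3 wrapper being described. It is not the original code: `convert` is a hypothetical helper name and `cat` stands in for the decode binary. It drains stdout and stderr on separate threads, which is an addition here to avoid stalling on large payloads.

```ruby
require "open3"

# Sketch: pipe raw_data through an external converter and collect
# stdout, stderr and the exit status. `convert` is a hypothetical helper.
def convert(cmd, raw_data)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.binmode
    stdout.binmode
    # Read both output streams concurrently so a full pipe buffer
    # cannot block the child while we are still writing its input.
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }
    stdin.write(raw_data)
    stdin.close          # signal EOF to the child
    [out_reader.value, err_reader.value, wait_thr.value]
  end
end

# `cat` simply echoes its input, standing in for ./decode.
image_data, status_message, exit_status = convert(["cat"], "\x00\x01raw".b)
```

Open3.capture3 with its stdin_data option does much the same in a single call, if per-stream control is not needed.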