Java IllegalBlockingModeException thrown in 9K but not in 1.7 or MRI #2420

Closed
iconara opened this issue Jan 4, 2015 · 8 comments

Comments

@iconara
Contributor

iconara commented Jan 4, 2015

I'm trying to isolate this issue, but so far all I have are Travis tests that pass in JRuby 1.7 and MRI but fail in jruby-head because of a Java exception (java.nio.channels.IllegalBlockingModeException):

https://travis-ci.org/iconara/ione/jobs/45286311#L924

You can run the same code yourself like this:

$ git clone git@github.com:iconara/ione
$ cd ione
$ bundle
$ rspec spec/integration/io_spec.rb

Curiously, even more tests in the same suite break with the same Java exception when running JRuby with --dev.

Here is the full stack trace from one of the failing tests: https://gist.github.com/iconara/56f97bcd9a9ea3f37a40

I will try to isolate the code that triggers the error.

@headius
Member

headius commented Jan 6, 2015

Generally this error is from trying to select against a stream that has not been set non-blocking. This will almost certainly be a bug in the new IO subsystem. I'll take it.
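
For reference, the general case mentioned above can be reproduced in plain Java. This is only a minimal sketch, not JRuby's code: a java.nio Pipe stands in for the socket, and the class and method names are hypothetical. It shows that java.nio refuses to register a channel with a Selector while the channel is still in blocking mode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class; a Pipe stands in for the socket in the failing tests.
class BlockingRegisterDemo {
    // Returns true if registering a channel in the given mode is rejected
    // with IllegalBlockingModeException.
    static boolean registerFails(boolean blocking) {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open()) {
                Pipe.SinkChannel sink = pipe.sink();
                sink.configureBlocking(blocking);
                try {
                    sink.register(selector, SelectionKey.OP_WRITE);
                    return false;
                } catch (IllegalBlockingModeException e) {
                    return true;
                }
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking channel rejected: " + registerFails(true));
        System.out.println("non-blocking channel rejected: " + registerFails(false));
    }
}
```

Only the blocking-mode registration is rejected; the same channel registers normally once it has been switched to non-blocking mode.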

@headius added this to the JRuby 9.0.0.0 milestone Jan 6, 2015
@headius self-assigned this Jan 6, 2015
@headius
Member

headius commented Jan 6, 2015

Hmm ok, it's a little weirder than I thought. write_nonblock does set the stream nonblocking, but does not use any select logic. The error is coming from a finally{} call to setBlock to restore blocking status.

It would help a lot if you can narrow this down a bit more, at least to a specific type of stream. I'm guessing it's a concurrency issue where two threads are attempting to tweak blocking status at the same time.

@headius
Member

headius commented Jan 6, 2015

FWIW, MRI does not appear to set the stream blocking again. I'm not sure if this is a bug or intentional behavior, but it would avoid the race to set it blocking:

static VALUE
io_write_nonblock(VALUE io, VALUE str, int no_exception)
{
    rb_io_t *fptr;
    long n;

    if (!RB_TYPE_P(str, T_STRING))
        str = rb_obj_as_string(str);

    io = GetWriteIO(io);
    GetOpenFile(io, fptr);
    rb_io_check_writable(fptr);

    if (io_fflush(fptr) < 0)
        rb_sys_fail(0);

    rb_io_set_nonblock(fptr);
    n = write(fptr->fd, RSTRING_PTR(str), RSTRING_LEN(str));

    if (n == -1) {
        if (errno == EWOULDBLOCK || errno == EAGAIN) {
            if (no_exception) {
                return ID2SYM(rb_intern("wait_writable"));
            }
            else {
                rb_readwrite_sys_fail(RB_IO_WAIT_WRITABLE, "write would block");
            }
        }
        rb_sys_fail_path(fptr->pathv);
    }

    return LONG2FIX(n);
}

@iconara
Contributor Author

iconara commented Jan 7, 2015

I have a shorter script that reliably replicates the issue, and I also have the exact conditions that cause it.

Here's the script: https://gist.github.com/iconara/4e41edb000a1305a9c28. It has some inline explanations of what's going on and what causes the problem.

What happens in my example is that for various reasons the server thread is selecting the client connection sockets for writability, at the same time as another thread calls #write_nonblock on them.

Obviously this isn't very good code. It's a quick and dirty thing that's used in a few tests, and explaining it like that makes it sound strange that it even works in JRuby 1.7 or MRI. It shouldn't raise a Java exception, though. I'm not sure what it should do, actually.
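
The conflict between selecting on a socket and restoring its blocking mode can be sketched in plain Java. The assumptions here are that the select registers the channel with a java.nio Selector, and that the write path restores blocking mode in a finally block, as described earlier in the thread. The class and method names are hypothetical, a Pipe stands in for the socket, and the registration is done on the same thread so the outcome is deterministic rather than a race.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not JRuby's implementation.
class RestoreBlockingDemo {
    // Mimics the pattern from the thread: go non-blocking, write,
    // then restore blocking mode in a finally block.
    static int writeThenRestore(Pipe.SinkChannel sink, ByteBuffer data) throws IOException {
        sink.configureBlocking(false);
        try {
            return sink.write(data);
        } finally {
            // Throws IllegalBlockingModeException if the channel is
            // currently registered with a selector.
            sink.configureBlocking(true);
        }
    }

    // Returns true if restoring blocking mode fails while the channel
    // is registered with a selector.
    static boolean restoreFailsWhileRegistered() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open()) {
                Pipe.SinkChannel sink = pipe.sink();
                sink.configureBlocking(false);
                // Stands in for the server thread selecting for writability.
                sink.register(selector, SelectionKey.OP_WRITE);
                ByteBuffer data = ByteBuffer.wrap("hi".getBytes(StandardCharsets.UTF_8));
                try {
                    writeThenRestore(sink, data);
                    return false;
                } catch (IllegalBlockingModeException e) {
                    return true;
                }
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("restore failed: " + restoreFailsWhileRegistered());
    }
}
```

The write itself succeeds; the exception comes from the attempt to switch the channel back to blocking mode while a selector still holds a valid key for it.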

@iconara
Contributor Author

iconara commented Jan 7, 2015

I should also add that at the time of #2102 this was not a problem. When that issue was resolved, the tests that fail now were passing.

@headius
Member

headius commented Jan 8, 2015

I've just pushed a series of improvements to nonblocking IO: 830c8e6...1436786

Part of the issue, at least, was that in MRI the _nonblock methods do not attempt to set the stream blocking again. I've known about this behavior and never implemented it in JRuby, but it appears there are reasons for it (like these races to set blocking status). So that change is in here.
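
The MRI-style behavior described above can be sketched in plain Java as follows. This is not the code from the commits, only an illustration under stated assumptions: the class and method names are hypothetical, and a java.nio Pipe stands in for the socket. The write helper switches the channel to non-blocking mode if needed and then leaves it that way, so a later selector registration and a second write do not conflict.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not JRuby's implementation.
class NonblockWriteDemo {
    // Switches to non-blocking mode if needed, writes, and does NOT
    // restore blocking mode afterwards.
    static int writeNonblock(Pipe.SinkChannel sink, ByteBuffer data) throws IOException {
        if (sink.isBlocking()) {
            sink.configureBlocking(false);
        }
        return sink.write(data);
    }

    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Writes once, registers the channel with a selector, writes again,
    // and returns the total number of bytes written.
    static int writeAroundRegistration() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open()) {
                Pipe.SinkChannel sink = pipe.sink();
                int written = writeNonblock(sink, bytes("hi"));
                // Registration succeeds because the channel was left non-blocking.
                sink.register(selector, SelectionKey.OP_WRITE);
                // No mode change happens here, so there is nothing to conflict with.
                written += writeNonblock(sink, bytes("hi"));
                return written;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + writeAroundRegistration());
    }
}
```

The trade-off is that later blocking reads and writes on the same stream need their own handling, which is the kind of supporting change mentioned in the next paragraph.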

I also made tweaks to support that logic, like ensuring blocking reads after a nonblocking read continue to block properly, selection logic works correctly, and files are ignored for purposes of nonblocking behavior.

All specs and tests pass as before, including a few new ones from MRI's suite.

The script @iconara provided runs to completion with no errors.

I attempted to clone ione but I'm having some connectivity problems locally. Hopefully @iconara can verify the fixes.

Copying @tobiassvn since he reported similar issues running Celluloid/nio4r-based apps on 9k.

I'm going to optimistically mark this fixed.

@headius closed this as completed Jan 8, 2015
@iconara
Contributor Author

iconara commented Jan 8, 2015

I can confirm that the fix works.

@headius
Member

headius commented Jan 8, 2015

Closing the loop: @tobiassvn reported that things are working well for him too.

✌️


2 participants