Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: nonblocking standard file descriptors using timers #6068

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented May 5, 2018

This is an attempt at fixing #2713 and #5830. With this patch all examples in #2713 plus the failing example in #5380 are now working.

WARNING:
Proof of concept / work in progress!

The issue:
Setting O_NONBLOCK doesn't only affect the file descriptor of the application, but the whole file description which is shared with other programs. This is always the case for STDIN, STDOUT and STDERR that can be changed at any time by other processes. This makes Crystal applications a bad citizen too, since it changes those file descriptors in unexpected ways.

Proposed solution:
This patch introduces a new class, IO::StdFileDescriptor. It still relies on libevent to determine whether a file descriptor is ready, but doesn't set O_NONBLOCK and instead arms a timer then uses a blocking syscalls that will be interrupted by the timer (through the SIGALRM signal) which resumes to libevent, achieving a (somewhat) nonblocking call —i.e. it's blocking but not for too long.

The SIGALRM signal must be trapped using sigaction to set the SA_NODEFER and more importantly not set the SA_RESTART flag that would prevent syscalls from failing with EINTR.

All syscalls that may be interrupted must now check both EAGAIN and EINTR before deciding to retry or fail.

Caveats:
Relies on a magic number to setup the timer (1 microsecond); the value must be greater than the timer precision, yet small enough to not block operations for too long...

Some syscalls are now interrupted by SIGALRM which can happen at any time, and in situations unrelated to STDIN/OUT/ERR read/write. It's possible that some syscalls will now start failing with EINTR, and shall be retried. The linux signal(7) manpage lists the following syscalls that we use as interruptible for example:

  • read(2), write(2) (as expected by this pull request);
  • open(2);
  • wait(2), waitpid(2);
  • accept(2), connect(2), recv(2), recvfrom(2), recvmsg(2), send(2), sendto(2), sendmsg(2)
  • flock(2), fcntl(2) F_SETLKW.

Hijacks SIGALRM —we already hijack SIGCHLD which is more common.

WIP:

  • Only C bindings for x86_64-linux-gnu have been added (for the time being);
  • some targets (darwin, openbsd) don't support POSIX timers but provide the older setitimer(2);
  • must check and retry interuptible syscalls that failed with EINTR;
  • POSIX only, the class doesn't exist on Win32 (still directly uses IO::FileDescriptor);
  • Not isolated into Crystal::System.


def initialize(fd, blocking = false)
ret = LibC.timer_create(LibC::CLOCK_MONOTONIC, nil, out @read_timer_id)
raise Errno.new("timer_create") if ret == -1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since STDIN, STDOUT and STDERR can only be read from or written to, I guess we could use a single timer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, darwin and openbsd only provide setitimer, where we can't have different timers, so I'm now using only one.

@ysbaddaden ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 88a6ddf to 5bcb801 Compare May 5, 2018 21:03
require "../signal"
require "c/time"

class IO::StdFileDescriptor < IO::FileDescriptor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about IO::BlockingFileDescriptor? Since we should describe the class's function (somewhat) - not it's current usecase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that it doesn't block. If we want to be explicit, it could be SharedFileDescriptor, InterruptibleFileDescriptor, ...

I'm more interested in the general idea, whether we like this PoC, then how it should be implemented, over naming choices.

For example this could be included in Crystal::System:: FileDescriptor and triggered in some way, instead of adding yet another class.

end

private def with_timer(timer_id)
return yield if blocking?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blocking = true, blocking= is a no-op, so this method always does nothing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blocking argument is delegated in #initialize to FileDescriptor that will call #blocking=.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, #blocking= isn't a noop, it changes @blocking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this is why @foo syntax in method args should be limited to constructors...

@ysbaddaden
Copy link
Contributor Author

I added missing C bindings and a fallback to use setitimer when POSIX timers aren't available (namely: OpenBSD and Darwin).

Copy link
Contributor

@RX14 RX14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to bother about windows for now, it can keep using IO::FileDescriptor.new (everything's blocking). Might need a {% skip_file if flag?(:win32) %} on the StdFileDescriptor class though.

require "c/sys/time"
require "c/time"

class IO::StdFileDescriptor < IO::FileDescriptor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please :nodoc: this actually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even push it under Crystal::System?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like that idea.

require "c/time"

class IO::StdFileDescriptor < IO::FileDescriptor
@blocking = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Never set `O_NONBLOCK` on standard file descriptors! See:
# - https://github.com/crystal-lang/crystal/issues/3674
# - http://cr.yp.to/unix/nonblock.html
property? blocking = true

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't property? implies a nilable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysbaddaden nope.

Copy link
Contributor

@RX14 RX14 May 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property! is nillable/raise-on-nil. property? is a boolean property.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call property? more like query property since it doesn't necessarily have to be boolean.

src/socket.cr Outdated
@@ -240,7 +240,7 @@ class Socket < IO
if client_fd == -1
if closed?
return
elsif Errno.value == Errno::EAGAIN
elsif Errno.value == Errno::EAGAIN || Errno::EINTR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errno.value == Errno::EAGAIN || Errno.value == Errno::EINTR

ditto below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{EAGAIN, EINTR}.contains?(Errno.value).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I just noticed while fixing interruptible syscalls :)

@ysbaddaden ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 80e56d3 to 77fb289 Compare May 6, 2018 16:11
# - http://cr.yp.to/unix/nonblock.html

# :nodoc:
class IO::StdFileDescriptor < IO::FileDescriptor
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lost comment: move to Crystal::System.

@ysbaddaden ysbaddaden removed the pr:wip label May 6, 2018
@ysbaddaden
Copy link
Contributor Author

I removed the wip label. Please test and report errors and performance issues :)

action.sa_flags = SA_NODEFER;
action.sa_sigaction = &alarm_handler;

sigaction(SIGALRM, &action, NULL);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I could avoid the C function, but this involves different C structs & macros for each target, which is beyond my competence and patience to properly port to Crystal...

@ysbaddaden ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 77fb289 to d722da4 Compare May 6, 2018 20:01
Setting O_NONBLOCK doesn't only affect the file descriptor of the
application, but the whole file description which may be shared with
other programs. This is always the case for STDIN, STDOUT and STDERR
that can be changed at any time by other processes. This makes
Crystal applications a bad citizen too, since it changes those file
descriptors in unexpected ways.

This patch introduces a new class, IO::StdFileDescriptor. It still
relies on libevent to determine whether a file descriptor is ready,
but doesn't set O_NONBLOCK and instead uses blocking syscalls that
shall be interrupted with a timer triggering a SIGALRM signal, thus
achieving a (somewhat) nonblocking operation.

The SIGALRM signal must be trapped using sigaction to set the
SA_NODEFER and more importantly not set the SA_RESTART flag that
would prevent syscalls from failing with EINTR.

All syscalls that may be interrupted must now check both EAGAIN and
EINTR before deciding to retry or fail.
Adds missing C binding definitions for the different targets. Adds C
bindings for setitimer and use it on targets that don't support
timer_create (namely: OpenBSD and Darwin).
POSIX timers (and `setitimer(2)`) trigger SIGALRM that is configured
without `SA_RESTART` to interrupt some syscalls. This change affects
a few other syscalls that can now fail with `EINTR`.

This patch ensures that we do.
@ysbaddaden ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from d722da4 to 425382a Compare July 22, 2018 13:51
@ysbaddaden
Copy link
Contributor Author

Now moved to Crystal::System.

Also separates POSIX timer_create(2) and setitimer(2)
implementations.
@ysbaddaden ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 425382a to 1a61a45 Compare July 22, 2018 14:40
@RX14
Copy link
Contributor

RX14 commented Jul 22, 2018

Why not avoid the whole timers thing in the fast path by asking libevent if the FD is ready before we try to read/write. For the case of when crystal has exclusive use of the FD, this should always work. If it's racing then yeah you still need timers to stop you blocking when you loose the race.

@RX14
Copy link
Contributor

RX14 commented Jul 22, 2018

Also I don't like how all the other syscalls get interrupted, wouldn't the timer be reset if those syscalls are running?

@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Jul 24, 2018

Why not avoid the whole timers thing in the fast path (...) when crystal has exclusive use of the FD

The whole issue is that a process never has exclusive control of standard file descriptors. A fast path would lead to a blocking syscall, but maybe a sub-process will have already read/written it, and we'd hung the crystal process 😢

Or just with threads: we'd have concurrent accesses, and could hung a thread.

I don't like how all the other syscalls get interrupted

Other syscalls won't be running until we get threads, it's kind of a safe guard to retry on EINTR. Since we can't control what users do, like enabling some signals to interrupt syscalls... I believe we should retry on EINTR anyway.

Wouldn't the timer be reset if those syscalls are running?

I don't understand. Even with threads those syscalls don't care about the timers: they're just interrupted when the kernel sends the SIGALRM signal.

With threads, I guess setitimer could lead to the timer being reset in thread A while thread B still needs it. We'll need some emulation of timer_create or something, like a thread with a queue of timers that will (dis)arm setitimer accordingly (or nanosleep and signal(SIGALRM) directly in a loop).

@RX14
Copy link
Contributor

RX14 commented Jul 24, 2018

Okay, so this patch doesn't cause the other syscalls to start returning EINTR, but it could if threads were in use? Then yeah I'm fine with adding these checks for future-proofing.

And re: checking with libevent that was in addition to the timers. Given that stdout/stdin are probably more often to not block than block, my suggestion is probably not worth it.

@ysbaddaden
Copy link
Contributor Author

I reanalyzed, and yes: #with_timer only englobs the read(2) and write(2) calls, so the arm -> syscall -> disarm is always synchronous within a single thread. The #wait_writable and #wait_readable that would trigger the event loop (and run whatever code) happens after disarming the timer.

I'm not very fond of this solution, but it's the best I have at hand.

I tried using a thread that would execute the blocking read/write calls, but the thread safe communication / sync kills the overall performance (e.g. 50 concurrent writes was twice slower).

@RX14
Copy link
Contributor

RX14 commented Jul 24, 2018

I wonder how many C libraries handle EINTR correctly. This might not interact well with multithreading and C libraries.

I'm not sure that we should really care too much about performance of stdin/out/err right now, so perhaps the thread solution is better.

@ysbaddaden
Copy link
Contributor Author

The timer_create variant ain't fork-safe too, we must recreate the timer_id in forked processes...

@Timbus
Copy link
Contributor

Timbus commented Aug 6, 2018

I have been wondering how libUV handles this, so I took a cursory glance over the source and found they.. reopen the fds?
https://github.com/libuv/libuv/blob/v1.x/src/unix/tty.c#L114

I'm familiar with this problem, but not this solution.

@ysbaddaden
Copy link
Contributor Author

Oh, that's a good idea, thanks @Timbus !

@ysbaddaden
Copy link
Contributor Author

@Timbus sadly that doesn't work. I tried to dup the file descriptors, but the snippet in #5830 is still broken.

@Timbus
Copy link
Contributor

Timbus commented Aug 8, 2018

Hmm. Maybe it has something to do with the extra 'master/slave tty' code in there? I didn't even know tty's had such a concept.
The dup code they use is also fairly complicated, too: https://github.com/libuv/libuv/blob/619937c783a05b51fba95cc9a62543deeffe5fa7/src/unix/core.c#L1004

Just seems odd that it wouldn't work, especially with how relevant their code comments were.

@Timbus
Copy link
Contributor

Timbus commented Aug 9, 2018

I'm still at work so I haven't really looked into it much but I'd guess you also need to set the FD_CLOEXEC flag on the handle as well.. I might try it when I get home. Thanks for the easy test case.

@Timbus
Copy link
Contributor

Timbus commented Aug 9, 2018

Yeah, so after fully reading the libuv code, it seems the key is to reopen the FD first, not just dup

# The top of main.cr
lib LibC
  fun ttyname_r(fd : Int32, buf : Char*, buffersize : Int32) : Int32
  fun dup3(oldfd : Int32, newfd : Int32, flags : Int32) : Int32
end

#... and at the top of self.main
    buf = UInt8[255]
    [STDIN, STDOUT, STDERR].each do |handle|
      fd = handle.fd
      ret = LibC.ttyname_r(fd, buf, 254)
      if ret == 0
        path = String.new(buf.to_unsafe)
        puts "Got a TTY called #{path}, cloning FD #{fd}"

        # Reopen and tie the FDs together, basically.
        newfd = LibC.open(path, LibC::O_RDWR | LibC::O_CLOEXEC)
        # Don't dup for now.. Crystal doesn't handle it right
        # LibC.dup3(newfd, fd, LibC::O_CLOEXEC)

        # Hacked this in..
        handle.fd = newfd
        handle.blocking = false
      end
    end

    if STDOUT.tty?
      STDOUT.puts "All working I think? BTW stdout is now #{STDOUT.fd}"
      # ^ stdout is now 4
    end

I assume the dup3 is to 'tie' the filehandles together in some way, but it breaks atm when crystal sets blocking IO on its own handles(?), to spawn the child process (see: reopen_io). If you're willing to treat the original handles separately it's all good though:

#in src/process.cr - self.exec_internal, dupe these handles instead
    reopen_io(input, IO::FileDescriptor.new(0), "r")
    reopen_io(output, IO::FileDescriptor.new(1), "w")
    reopen_io(error, IO::FileDescriptor.new(2), "w")

And, now it seems to work:

./test
Got a TTY called /dev/pts/1, cloning FD 0
Got a TTY called /dev/pts/1, cloning FD 1
Got a TTY called /dev/pts/1, cloning FD 2
All working I think? BTW stdout is now 4
call a subprocess
Press CTRL+C
^CCanceled by user

So that's some sort of progress, maybe?

@j8r
Copy link
Contributor

j8r commented Aug 9, 2018

Awesome, good work @Timbus !

@Timbus
Copy link
Contributor

Timbus commented Aug 9, 2018

Thanks, but since I spent a long time just going round in circles with this one, at this point my brain hurts and I'm not 100% sure it's a clean/full fix, or it just happens to just look like a fix..
Hope it helps pave a way forward, though.

My possibly unwanted opinion: I think switching to libuv could make a lot of these small problems go away, since it handles ttys, and the subprocess fork/exec stuff.. Plus it's generally known to be faster

@j8r
Copy link
Contributor

j8r commented Aug 9, 2018

I have a dumb question: since libuv handle this case, why libevent don't?

@Timbus
Copy link
Contributor

Timbus commented Aug 9, 2018

libuv has many abstractions for dealing with IO. You use things like uv_open to open a file, uv_spawn to fork a process, etc.. libevent is -- in my limited experience -- more focused on being a convenient wrapper around polling. That means you still need to handle some of the 'extra work'. Compare these two echo servers: https://github.com/eddieh/libevent-echo-server/blob/master/echo-server.c + https://github.com/eddieh/libuv-echo-server/blob/master/echo-server.c


Back on topic: I turned off the filehandle duping on a subprocess exec, performed a dup3 to overwrite the original FDs + set CLOEXEC on them, and stopped setting NONBLOCK on the original file handles, essentially leaving them alone.. And it seems to work for all the scenarios I could test. I could make an initial PR tonight (work don't pay me to hack on crystal -- yet), or maybe @ysbaddaden wants to do it?

@ysbaddaden
Copy link
Contributor Author

Please push your PR.

Not the place to discuss that, but livuv isn't threadsafe, does lots too many things and is deeply tied to nodejs, and thus very opinionated. libevent is threadsafe, merely wraps the underlying poll syscalls, and we only use timers and socket polling, not even DNS (bypasses getaddrinfo or signal handling (don't remember why). We wouldn't use anything from libuv. It's probable that we'll skip it someday for something custom —just like we replaced libpcl with custom assembly for context switch. Having something light will only make it simpler.

Copy link
Member

@sdogruyol sdogruyol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a rebase and it's good to go 👍

@RX14
Copy link
Contributor

RX14 commented Oct 15, 2018

This has been obsoleted by #6518.

@RX14 RX14 closed this Oct 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants