Fix: nonblocking standard file descriptors using timers #6068

ysbaddaden · 2018-05-05T20:17:05Z

This is an attempt at fixing #2713 and #5830. With this patch all examples in #2713 plus the failing example in #5830 are now working.

WARNING:
Proof of concept / work in progress!

The issue:
Setting O_NONBLOCK doesn't only affect the application's file descriptor, but the whole file description, which is shared with other programs. This is always the case for STDIN, STDOUT and STDERR, whose flags can be changed at any time by other processes. It also makes Crystal applications bad citizens, since they change those file descriptors in unexpected ways.
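To make the descriptor/description distinction concrete, here is a minimal C sketch (not part of the patch). The helper names `is_nonblocking` and `set_nonblocking` are made up for illustration; only `fcntl(2)` with `F_GETFL`/`F_SETFL` is real API.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int is_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper: sets O_NONBLOCK on fd; 0 on success, -1 on error.
 * The flag is stored on the open file description, so every descriptor
 * that shares the description (dup, fork, inherited stdio) sees it too. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

For instance, after `int alias = dup(fd); set_nonblocking(alias);`, `is_nonblocking(fd)` reports 1 as well, even though `fd` itself was never touched. The same thing happens across processes that inherited the standard descriptors.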

Proposed solution:
This patch introduces a new class, IO::StdFileDescriptor. It still relies on libevent to determine whether a file descriptor is ready, but doesn't set O_NONBLOCK. Instead it arms a timer, then uses a blocking syscall that will be interrupted by the timer (through the SIGALRM signal), which returns control to libevent. This achieves a (somewhat) nonblocking call, i.e. it's blocking but not for too long.

The SIGALRM signal must be trapped using sigaction so that we can set the SA_NODEFER flag and, more importantly, not set the SA_RESTART flag, which would prevent syscalls from failing with EINTR.

All syscalls that may be interrupted must now check both EAGAIN and EINTR before deciding to retry or fail.

Caveats:
Relies on a magic number to set up the timer (1 microsecond); the value must be greater than the timer precision, yet small enough not to block operations for too long.

Some syscalls are now interrupted by SIGALRM, which can happen at any time and in situations unrelated to STDIN/OUT/ERR reads and writes. Some syscalls may now start failing with EINTR and must be retried. For example, the Linux signal(7) manpage lists the following syscalls that we use as interruptible:

read(2), write(2) (as expected by this pull request);
open(2);
wait(2), waitpid(2);
accept(2), connect(2), recv(2), recvfrom(2), recvmsg(2), send(2), sendto(2), sendmsg(2);
flock(2), fcntl(2) F_SETLKW.
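As a rough illustration of what "retry on EINTR" means for one of the syscalls listed above, here is a small C wrapper around `write(2)`. The name `write_all` is hypothetical and the policy is an assumption of this sketch: `EINTR` is retried immediately, while `EAGAIN` and every other error are returned to the caller (which, in Crystal's case, would wait on the event loop).

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: writes all `len` bytes, retrying when write(2)
 * is interrupted by a signal. Returns len on success, -1 on error. */
ssize_t write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted: just retry */
            return -1;                     /* EAGAIN etc.: caller decides */
        }
        p += n;                            /* partial write: advance */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```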

Hijacks SIGALRM (we already hijack SIGCHLD, which is more common).

WIP:

~~Only C bindings for x86_64-linux-gnu have been added (for the time being);~~
~~some targets (darwin, openbsd) don't support POSIX timers but provide the older setitimer(2);~~
~~must check and retry interruptible syscalls that failed with EINTR;~~
POSIX only, the class doesn't exist on Win32 (still directly uses IO::FileDescriptor);
Not isolated into Crystal::System.

ysbaddaden · 2018-05-05T20:19:39Z

src/io/std_file_descriptor.cr

+
+  def initialize(fd, blocking = false)
+    ret = LibC.timer_create(LibC::CLOCK_MONOTONIC, nil, out @read_timer_id)
+    raise Errno.new("timer_create") if ret == -1


Since STDIN, STDOUT and STDERR can only be read from or written to, I guess we could use a single timer.

Well, darwin and openbsd only provide setitimer, where we can't have different timers, so I'm now using only one.

RX14 · 2018-05-05T21:35:42Z

src/io/std_file_descriptor.cr

+require "../signal"
+require "c/time"
+
+class IO::StdFileDescriptor < IO::FileDescriptor


How about IO::BlockingFileDescriptor? We should (somewhat) describe the class's function, not its current use case.

Except that it doesn't block. If we want to be explicit, it could be SharedFileDescriptor, InterruptibleFileDescriptor, ...

I'm more interested in the general idea, whether we like this PoC, then how it should be implemented, over naming choices.

For example, this could be included in Crystal::System::FileDescriptor and triggered in some way, instead of adding yet another class.

RX14 · 2018-05-05T21:49:11Z

src/io/std_file_descriptor.cr

+  end
+
+  private def with_timer(timer_id)
+    return yield if blocking?


@blocking = true, blocking= is a no-op, so this method always does nothing?

The blocking argument is delegated in #initialize to FileDescriptor that will call #blocking=.

Oh, #blocking= isn't a noop, it changes @blocking.

Ugh, this is why @foo syntax in method args should be limited to constructors...

ysbaddaden · 2018-05-06T14:33:28Z

I added missing C bindings and a fallback to use setitimer when POSIX timers aren't available (namely: OpenBSD and Darwin).

RX14

We don't need to bother about windows for now, it can keep using IO::FileDescriptor.new (everything's blocking). Might need a {% skip_file if flag?(:win32) %} on the StdFileDescriptor class though.

RX14 · 2018-05-06T15:02:45Z

src/io/std_file_descriptor.cr

+require "c/sys/time"
+require "c/time"
+
+class IO::StdFileDescriptor < IO::FileDescriptor


Please :nodoc: this actually.

Maybe even push it under Crystal::System?

Yeah I like that idea.

Sija · 2018-05-06T15:12:26Z

src/io/std_file_descriptor.cr

+require "c/time"
+
+class IO::StdFileDescriptor < IO::FileDescriptor
+  @blocking = true


# Never set `O_NONBLOCK` on standard file descriptors! See:
# - https://github.com/crystal-lang/crystal/issues/3674
# - http://cr.yp.to/unix/nonblock.html
property? blocking = true

?

Doesn't property? imply a nilable?

@ysbaddaden nope.

property! is nilable/raise-on-nil. property? is a boolean property.

I'd call property? more like query property since it doesn't necessarily have to be boolean.

Sija · 2018-05-06T15:15:18Z

src/socket.cr

@@ -240,7 +240,7 @@ class Socket < IO
      if client_fd == -1
        if closed?
          return
-        elsif Errno.value == Errno::EAGAIN
+        elsif Errno.value == Errno::EAGAIN || Errno::EINTR


Errno.value == Errno::EAGAIN || Errno.value == Errno::EINTR

ditto below

{EAGAIN, EINTR}.includes?(Errno.value).

Thanks, I just noticed while fixing interruptible syscalls :)
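The bug caught in this thread has a direct C analogue, sketched here with hypothetical function names. In `a == X || Y` the right-hand operand is evaluated on its own, and a nonzero errno constant is always truthy, so the condition can never be false. (In Crystal the expression evaluates to the truthy `Errno::EINTR` value rather than to `1`, but the effect on the `elsif` is the same.)

```c
#include <errno.h>

/* Mirrors the reviewed line: `EINTR` stands alone on the right of `||`,
 * and since it is a nonzero constant the whole test is always true. */
int should_retry_buggy(int err) {
    return err == EAGAIN || EINTR;
}

/* The intended check: compare the error code against both constants. */
int should_retry(int err) {
    return err == EAGAIN || err == EINTR;
}
```

With the buggy form every failure of `accept(2)`, including real errors such as `ENOENT` or `EBADF`, would be treated as "try again".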

ysbaddaden · 2018-05-06T16:12:20Z

src/io/std_file_descriptor.cr

+# - http://cr.yp.to/unix/nonblock.html
+
+# :nodoc:
+class IO::StdFileDescriptor < IO::FileDescriptor


Lost comment: move to Crystal::System.

ysbaddaden · 2018-05-06T16:14:34Z

I removed the wip label. Please test and report errors and performance issues :)

ysbaddaden · 2018-05-06T16:18:45Z

src/ext/sigfault.c

+  action.sa_flags = SA_NODEFER;
+  action.sa_sigaction = &alarm_handler;
+
+  sigaction(SIGALRM, &action, NULL);


I wish I could avoid the C function, but this involves different C structs & macros for each target, which is beyond my competence and patience to properly port to Crystal...

Setting O_NONBLOCK doesn't only affect the file descriptor of the application, but the whole file description which may be shared with other programs. This is always the case for STDIN, STDOUT and STDERR that can be changed at any time by other processes. This makes Crystal applications a bad citizen too, since it changes those file descriptors in unexpected ways. This patch introduces a new class, IO::StdFileDescriptor. It still relies on libevent to determine whether a file descriptor is ready, but doesn't set O_NONBLOCK and instead uses blocking syscalls that shall be interrupted with a timer triggering a SIGALRM signal, thus achieving a (somewhat) nonblocking operation. The SIGALRM signal must be trapped using sigaction to set the SA_NODEFER and more importantly not set the SA_RESTART flag that would prevent syscalls from failing with EINTR. All syscalls that may be interrupted must now check both EAGAIN and EINTR before deciding to retry or fail.

Adds missing C binding definitions for the different targets. Adds C bindings for setitimer and use it on targets that don't support timer_create (namely: OpenBSD and Darwin).

POSIX timers (and `setitimer(2)`) trigger SIGALRM that is configured without `SA_RESTART` to interrupt some syscalls. This change affects a few other syscalls that can now fail with `EINTR`. This patch ensures that we retry them.

ysbaddaden · 2018-07-22T13:53:35Z

Now moved to Crystal::System.

Also separates POSIX timer_create(2) and setitimer(2) implementations.

RX14 · 2018-07-22T17:29:06Z

Why not avoid the whole timers thing in the fast path by asking libevent if the FD is ready before we try to read/write? When crystal has exclusive use of the FD, this should always work. If it's racing then yeah, you still need timers to stop you blocking when you lose the race.
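A readiness check of this kind could look like the C sketch below. It uses `poll(2)` with a zero timeout as a stand-in for asking libevent (an assumption; the PR never shows such code), and the names `readable_now` and `read_if_ready` are hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd has data to read right now, 0 if not,
 * -1 on error. A zero timeout makes poll(2) return immediately. */
int readable_now(int fd) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, 0);
    if (rc == -1) return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Fast path: only issue the blocking read when poll says data is there.
 * Reports "not ready" as -1 with errno == EAGAIN. */
ssize_t read_if_ready(int fd, void *buf, size_t len) {
    int ready = readable_now(fd);
    if (ready == -1) return -1;
    if (ready == 0) {
        errno = EAGAIN;
        return -1;
    }
    /* Race window: another process sharing the description may consume
     * the data between poll() and read(), and this call then blocks. */
    return read(fd, buf, len);
}
```

The comment on the final `read` marks exactly the case the timers are meant to cover.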

RX14 · 2018-07-22T17:45:30Z

Also, I don't like how all the other syscalls get interrupted. Wouldn't the timer be reset if those syscalls are running?

ysbaddaden · 2018-07-24T09:38:31Z

Why not avoid the whole timers thing in the fast path (...) when crystal has exclusive use of the FD

The whole issue is that a process never has exclusive control of standard file descriptors. A fast path would lead to a blocking syscall, but maybe a sub-process will have already read/written it, and we'd hang the crystal process 😢

Or just with threads: we'd have concurrent accesses, and could hang a thread.

I don't like how all the other syscalls get interrupted

Other syscalls won't be running until we get threads; retrying on EINTR is kind of a safeguard. Since we can't control what users do, like enabling some signals to interrupt syscalls, I believe we should retry on EINTR anyway.

Wouldn't the timer be reset if those syscalls are running?

I don't understand. Even with threads those syscalls don't care about the timers: they're just interrupted when the kernel sends the SIGALRM signal.

With threads, I guess setitimer could lead to the timer being reset in thread A while thread B still needs it. We'll need some emulation of timer_create or something, like a thread with a queue of timers that will (dis)arm setitimer accordingly (or nanosleep and signal(SIGALRM) directly in a loop).

RX14 · 2018-07-24T11:21:32Z

Okay, so this patch doesn't cause the other syscalls to start returning EINTR, but it could if threads were in use? Then yeah I'm fine with adding these checks for future-proofing.

And re: checking with libevent, that was in addition to the timers. Given that stdout/stdin are probably more likely not to block than to block, my suggestion is probably not worth it.

ysbaddaden · 2018-07-24T13:17:59Z

I reanalyzed, and yes: #with_timer only wraps the read(2) and write(2) calls, so the arm -> syscall -> disarm sequence is always synchronous within a single thread. The #wait_writable and #wait_readable calls that would trigger the event loop (and run whatever code) happen after disarming the timer.

I'm not very fond of this solution, but it's the best I have at hand.

I tried using a thread that would execute the blocking read/write calls, but the thread-safe communication / sync kills the overall performance (e.g. 50 concurrent writes were twice as slow).

RX14 · 2018-07-24T15:51:01Z

I wonder how many C libraries handle EINTR correctly. This might not interact well with multithreading and C libraries.

I'm not sure that we should really care too much about performance of stdin/out/err right now, so perhaps the thread solution is better.

ysbaddaden · 2018-07-24T16:47:19Z

The timer_create variant isn't fork-safe either: we must recreate the timer_id in forked processes...

Timbus · 2018-08-06T21:27:53Z

I have been wondering how libUV handles this, so I took a cursory glance over the source and found they.. reopen the fds?
https://github.com/libuv/libuv/blob/v1.x/src/unix/tty.c#L114

I'm familiar with this problem, but not this solution.

ysbaddaden · 2018-08-08T20:21:42Z

Oh, that's a good idea, thanks @Timbus !

ysbaddaden · 2018-08-08T21:18:53Z

@Timbus sadly that doesn't work. I tried to dup the file descriptors, but the snippet in #5830 is still broken.

Timbus · 2018-08-08T21:58:21Z

Hmm. Maybe it has something to do with the extra 'master/slave tty' code in there? I didn't even know ttys had such a concept.
The dup code they use is also fairly complicated: https://github.com/libuv/libuv/blob/619937c783a05b51fba95cc9a62543deeffe5fa7/src/unix/core.c#L1004

Just seems odd that it wouldn't work, especially with how relevant their code comments were.

Timbus · 2018-08-09T02:36:40Z

I'm still at work so I haven't really looked into it much but I'd guess you also need to set the FD_CLOEXEC flag on the handle as well.. I might try it when I get home. Thanks for the easy test case.

Timbus · 2018-08-09T09:49:22Z

Yeah, so after fully reading the libuv code, it seems the key is to reopen the FD first, not just dup it:

# The top of main.cr
lib LibC
  fun ttyname_r(fd : Int32, buf : Char*, buffersize : Int32) : Int32
  fun dup3(oldfd : Int32, newfd : Int32, flags : Int32) : Int32
end

#... and at the top of self.main
    buf = Bytes.new(255) # 255 zeroed bytes; UInt8[255] would build a one-element Array
    [STDIN, STDOUT, STDERR].each do |handle|
      fd = handle.fd
      ret = LibC.ttyname_r(fd, buf, 254)
      if ret == 0
        path = String.new(buf.to_unsafe)
        puts "Got a TTY called #{path}, cloning FD #{fd}"

        # Reopen and tie the FDs together, basically.
        newfd = LibC.open(path, LibC::O_RDWR | LibC::O_CLOEXEC)
        # Don't dup for now.. Crystal doesn't handle it right
        # LibC.dup3(newfd, fd, LibC::O_CLOEXEC)

        # Hacked this in..
        handle.fd = newfd
        handle.blocking = false
      end
    end

    if STDOUT.tty?
      STDOUT.puts "All working I think? BTW stdout is now #{STDOUT.fd}"
      # ^ stdout is now 4
    end

I assume the dup3 is to 'tie' the filehandles together in some way, but it breaks atm when crystal sets blocking IO on its own handles(?), to spawn the child process (see: reopen_io). If you're willing to treat the original handles separately it's all good though:

#in src/process.cr - self.exec_internal, dupe these handles instead
    reopen_io(input, IO::FileDescriptor.new(0), "r")
    reopen_io(output, IO::FileDescriptor.new(1), "w")
    reopen_io(error, IO::FileDescriptor.new(2), "w")

And, now it seems to work:

./test
Got a TTY called /dev/pts/1, cloning FD 0
Got a TTY called /dev/pts/1, cloning FD 1
Got a TTY called /dev/pts/1, cloning FD 2
All working I think? BTW stdout is now 4
call a subprocess
Press CTRL+C
^CCanceled by user

So that's some sort of progress, maybe?

j8r · 2018-08-09T10:38:15Z

Awesome, good work @Timbus !

Timbus · 2018-08-09T11:03:32Z

Thanks, but since I spent a long time just going round in circles with this one, at this point my brain hurts and I'm not 100% sure whether it's a clean/full fix or just happens to look like one.
Hope it helps pave a way forward, though.

(My possibly unwanted opinion: I think switching to libuv could make a lot of these small problems go away, since it handles ttys and the subprocess fork/exec stuff. Plus it's generally known to be faster.)

j8r · 2018-08-09T12:25:32Z

I have a dumb question: since libuv handles this case, why doesn't libevent?

Timbus · 2018-08-09T21:41:53Z

libuv has many abstractions for dealing with IO. You use things like uv_open to open a file, uv_spawn to fork a process, etc.. libevent is -- in my limited experience -- more focused on being a convenient wrapper around polling. That means you still need to handle some of the 'extra work'. Compare these two echo servers: https://github.com/eddieh/libevent-echo-server/blob/master/echo-server.c + https://github.com/eddieh/libuv-echo-server/blob/master/echo-server.c

Back on topic: I turned off the filehandle duping on a subprocess exec, performed a dup3 to overwrite the original FDs + set CLOEXEC on them, and stopped setting NONBLOCK on the original file handles, essentially leaving them alone.. And it seems to work for all the scenarios I could test. I could make an initial PR tonight (work don't pay me to hack on crystal -- yet), or maybe @ysbaddaden wants to do it?

ysbaddaden · 2018-08-10T08:08:14Z

Please push your PR.

Not the place to discuss that, but libuv isn't threadsafe, does far too many things, and is deeply tied to nodejs, and thus very opinionated. libevent is threadsafe, merely wraps the underlying poll syscalls, and we only use timers and socket polling, not even DNS (it bypasses getaddrinfo) or signal handling (I don't remember why). We wouldn't use anything from libuv. It's probable that we'll drop libevent someday for something custom, just like we replaced libpcl with custom assembly for context switching. Having something light will only make it simpler.

sdogruyol

Needs a rebase and it's good to go 👍

RX14 · 2018-10-15T13:55:01Z

This has been obsoleted by #6518.

ysbaddaden added the pr:wip label May 5, 2018

ysbaddaden mentioned this pull request May 5, 2018

Bug with signal trap #5830

Closed

ysbaddaden commented May 5, 2018

ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 88a6ddf to 5bcb801 Compare May 5, 2018 21:03

RX14 reviewed May 5, 2018

RX14 approved these changes May 6, 2018

RX14 reviewed May 6, 2018

Sija reviewed May 6, 2018

ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 80e56d3 to 77fb289 Compare May 6, 2018 16:11

ysbaddaden commented May 6, 2018

ysbaddaden removed the pr:wip label May 6, 2018

ysbaddaden commented May 6, 2018

ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 77fb289 to d722da4 Compare May 6, 2018 20:01

wooster0 approved these changes Jun 10, 2018

Timbus mentioned this pull request Jul 10, 2018

[bug] Crystal programs crash if STDIN or STDOUT is closed #6359

Closed

ysbaddaden mentioned this pull request Jul 20, 2018

Process.run() not returning in forked process #6416

Closed

ysbaddaden added 4 commits July 22, 2018 15:16

Drop remember/restore_blocking_state

87dd63c

fixup: missing C bindings for IO::StdFileDescriptor + setitimer fallback

7a96a7f

Adds missing C binding definitions for the different targets. Adds C bindings for setitimer and use it on targets that don't support timer_create (namely: OpenBSD and Darwin).

Always retry interruptible syscalls

7f3a314

POSIX timers (and `setitimer(2)`) trigger SIGALRM that is configured without `SA_RESTART` to interrupt some syscalls. This change affects a few other syscalls that can now fail with `EINTR`. This patch ensures that we retry them.

ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from d722da4 to 425382a Compare July 22, 2018 13:51

Move StdFileDescriptor to crystal/system

1a61a45

Also separates POSIX timer_create(2) and setitimer(2) implementations.

ysbaddaden force-pushed the fix/nonblocking-standard-file-descriptors-with-timers branch from 425382a to 1a61a45 Compare July 22, 2018 14:40

Timbus mentioned this pull request Aug 10, 2018

Reopen standard handles when they are a TTY #6518

Merged

sdogruyol approved these changes Oct 15, 2018

RX14 closed this Oct 15, 2018

ysbaddaden deleted the fix/nonblocking-standard-file-descriptors-with-timers branch September 13, 2024 13:29
