-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Vectored I/O #6078
[WIP] Vectored I/O #6078
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs specs.
I'm a little concerned that the cost of the heap allocation for the to_a
call will be larger than the performance gained using scatter-gather IO. Looking at writev_syscall_heaper
makes me think that write_syscall_helper
might have been the wrong abstraction and we should have encapsulated only the loop and the errno check, not any of the slice accounting. (i.e. i'd like to use write_syscall_helper
for writev as well)
I think the best effort to avoid heap allocs should be made if this is a performance-sensitive syscall.
private def unbuffered_write(*slices : Bytes) | ||
writev_syscall_helper(slices.to_a, "Error writing file") do |slices| | ||
iovecs = slices.map do |slice| | ||
iovec = LibC::IoVec.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LibC::IoVec.new(iov_base: slice.to_unsafe, iov_len: slice.size)
should work and be a lot shorter.
src/io/buffered.cr
Outdated
return unbuffered_write(*slices) | ||
end | ||
|
||
count = slices.sum &.size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: this is changing terminology from size to count, how about just total_size
?
@@ -38,4 +43,5 @@ lib LibC | |||
fun sysconf(x0 : Int) : Long | |||
fun unlink(x0 : Char*) : Int | |||
fun write(x0 : Int, x1 : Void*, x2 : SizeT) : SSizeT | |||
fun writev(x0 : Int, x1 : IoVec*, x2 : SizeT) : SSizeT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will obviously need to be copied to all the other libcs.
@@ -26,6 +26,22 @@ module Crystal::System::FileDescriptor | |||
end | |||
end | |||
|
|||
private def unbuffered_write(*slices : Bytes) | |||
writev_syscall_helper(slices.to_a, "Error writing file") do |slices| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This to_a
defeats the purpose of the method (it allocates). Does it work without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, but if I could convert a Tuple
to a StaticArray
it would.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could add Tuple#static_array
, as Number#static_array
Lines 65 to 71 in 80a975c
macro static_array(*nums) | |
%array = uninitialized StaticArray({{@type}}, {{nums.size}}) | |
{% for num, i in nums %} | |
%array.to_unsafe[{{i}}] = {{@type}}.new({{num}}) | |
{% end %} | |
%array | |
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a splat argument with type like *slices : Bytes
should just be a StaticArray
?
yes, to my understanding |
@carlhoerberg write should always write everything because partial writes are always the wrong behaviour, read shouldn't read everything because usually full reads are often the wrong behaviour (you can't implement most protocols efficiently without partial read) |
It would be nice to have a longer description in the pull request:
def write(slices : Enumerable(Bytes))
slices.each { |slice| write(slice) }
end
def write(*slices : Bytes)
write(slices)
end |
@ysbaddaden The array allocation can also be skipped if the splash argument is converted to a |
Still a naive solution is 6 lines of code, doesn't need external bindings, accepts any Enumerable without allocating, ... Using |
writev must be worth it for some usecases to be added as a syscall. @carlhoerberg what is your usecase exactly and can you benchmark it for us? Before obsessing over the implementation of this PR it'd be nice to see some benchmarks. |
Worth reading: https://stackoverflow.com/a/10520793/4990850 The use cases are very specific. In Ruby there's a gem for that: https://github.com/tenderlove/writev If it's not too difficult to implement, we can have it in the standard library. |
Hence why I think it'd be interesting to hear @carlhoerberg's usecase :) |
Benchmark:
Output:
When the combined writes are larger than the buffer's 4096 byte limit the faster (relatively) |
Benchmarking TCP writes:
output:
|
@@ -81,6 +81,34 @@ module IO::Syscall | |||
end | |||
end | |||
|
|||
def writev_syscall_helper(slices, errno_msg : String) : Nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errno_msg
argument here is IMHO pointless, why not just use Error writing to IO
?
Benchmark TCP writes, where I compare it with using a IO::Memory as buffer:
output:
|
@@ -26,6 +26,19 @@ module Crystal::System::FileDescriptor | |||
end | |||
end | |||
|
|||
private def unbuffered_write(slices : Indexable(Bytes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be Enumerable
instead of Indexable
? (ditto all below)
Thanks for taking the time to benchmark! |
Should someone else maybe continue this PR? |
FWIW, looking at the glibc implementation, it looks like they already to a solution that looks quite similar to what RX14 talked about in the puts thread: https://github.com/lattera/glibc/blob/master/sysdeps/posix/writev.c . That one should work fine and atomically with buffered io, I think. The variant provided by the kernel is on the other hand used by musl: https://github.com/davidlazar/musl/blob/master/src/unistd/writev.c on the other hand. That variant should probably not be mixed with any buffered io as it will write directly to the underlying file descriptor, so any competing buffered writes may end up being applied in an unexpected order. (I accdentally pinged ysbaddaden in an earlier edit) |
@yxhuvud writev is a linux syscall, i'm 100% certain that glibc implementation is just a fallback for ancient kernels with no writev support. |
Optimize writes by using writev
WIP, feedback appreciated