Optimise IO::Buffered for reads 4096 <= size < 8192
When count is more than half the buffer size, more than half of the read
calls are likely to trigger an unbuffered read anyway, so the extra buffer
copy is unlikely to be worth it.

Benchmarks confirm this (figures are iterations per second, so higher is better):

before
2048   7.74  (± 0.46%)  1.24× slower
4096   7.39  (± 0.45%)  1.30× slower
6144   7.52  (± 0.57%)  1.28× slower
8192   9.59  (± 0.79%)       fastest

after
2048   7.73  (± 0.64%)  1.24× slower
4096   8.11  (± 0.76%)  1.18× slower
6144   8.61  (± 0.66%)  1.11× slower
8192    9.6  (± 0.89%)       fastest

Sizes 2048 and 8192 behave the same before and after this commit, so their
speeds are similar. Sizes 4096 and 6144, however, are roughly 10% and 14%
faster, as they now read directly into the slice and skip the buffer copy.
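
To make the trade-off concrete, here is a minimal sketch that models the read path from the diff and counts underlying reads and bytes copied out of the internal buffer. It is not the `IO::Buffered` implementation: `simulate` and its parameters are hypothetical names, no real I/O happens, the buffer size of 8192 is taken from the commit title, and the final end-of-file read is ignored.

```crystal
# Hypothetical model of the buffered read path; it moves no real data and
# only counts underlying reads and bytes copied out of the internal buffer.
BUFFER_SIZE = 8192 # assumed from the commit title (4096 is half of it)

# Returns {underlying_reads, bytes_copied} for reading `total` bytes in
# `count`-byte requests, bypassing the buffer when count >= threshold.
def simulate(total : Int32, count : Int32, threshold : Int32) : Tuple(Int32, Int32)
  remaining = total # bytes left in the simulated file
  buffered = 0      # bytes currently held in the internal buffer
  underlying_reads = 0
  bytes_copied = 0

  while remaining > 0 || buffered > 0
    if buffered == 0
      if count >= threshold
        # Direct read into the caller's slice: one underlying read, no copy.
        remaining -= Math.min(count, remaining)
        underlying_reads += 1
        next
      end
      # Otherwise fill the internal buffer with one underlying read.
      filled = Math.min(BUFFER_SIZE, remaining)
      remaining -= filled
      buffered = filled
      underlying_reads += 1
    end
    # Serve the request from the internal buffer (this is the extra copy).
    to_copy = Math.min(count, buffered)
    buffered -= to_copy
    bytes_copied += to_copy
  end

  {underlying_reads, bytes_copied}
end

total = 1 << 20 # 1 MiB keeps the arithmetic easy to check by hand
[2048, 4096, 6144, 8192].each do |count|
  before = simulate(total, count, BUFFER_SIZE)
  after = simulate(total, count, BUFFER_SIZE // 2)
  puts "#{count}: before=#{before} after=#{after}"
end
```

Under these assumptions, the 4096 case goes from 128 underlying reads with 1 MiB copied to 256 underlying reads with nothing copied, while 2048 and 8192 are unchanged. The model says nothing about timing; it only shows which work the new threshold removes and which it adds.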

Benchmark code:
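```
require "benchmark"

# Defines one benchmark entry that reads the whole file in chunks of `size` bytes.
macro benchmark(size)
  b.report("{{size}}") do
    # Read a 1 gigabyte file; the block form closes the file when done
    File.open("1gb", "r") do |file|
      # Stack-allocated read buffer, left uninitialized since it is overwritten
      buffer = uninitialized UInt8[{{size}}]
      # Keep reading until read returns 0 (end of file)
      while file.read(buffer.to_slice) > 0
      end
    end
  end
end

Benchmark.ips do |b|
  benchmark(2048)
  benchmark(4096)
  benchmark(6144)
  benchmark(8192)
end
```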
RX14 authored and Ary Borenszweig committed Oct 13, 2016
1 parent 89da3a7 commit bb208bc
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions src/io/buffered.cr
@@ -156,9 +156,10 @@ module IO::Buffered
     return 0 if count == 0

     if @in_buffer_rem.empty?
-      # If we are asked to read more than the buffer's size,
-      # read directly into the slice.
-      if count >= BUFFER_SIZE
+      # If we are asked to read more than half the buffer's size,
+      # read directly into the slice, as it's not worth the extra
+      # memory copy.
+      if count >= BUFFER_SIZE / 2
         return unbuffered_read(slice[0, count]).to_i
       else
         fill_buffer
