Optimise IO::Buffered for reads 4096 <= size < 8192 #3416

RX14 · 2016-10-12T15:55:51Z

When count is at least half the buffer size, more than half of the read
calls are likely to end up doing an unbuffered read anyway, so the extra
buffer copy is unlikely to be worth it.

Benchmarks show this to be true:

before

```
2048   7.74  (± 0.46%)  1.24× slower
4096   7.39  (± 0.45%)  1.30× slower
6144   7.52  (± 0.57%)  1.28× slower
8192   9.59  (± 0.79%)       fastest
```

after

```
2048   7.73  (± 0.64%)  1.24× slower
4096   8.11  (± 0.76%)  1.18× slower
6144   8.61  (± 0.66%)  1.11× slower
8192    9.6  (± 0.89%)       fastest
```

Sizes 2048 and 8192 have the same behaviour before and after this commit,
so we see similar speeds. However, we see a speedup of roughly 10% for 4096
and 14% for 6144 bytes (7.39 → 8.11 and 7.52 → 8.61 iterations per second),
as these reads now skip the extra buffer copy.
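
For anyone who wants to poke at the trade-off, here is a minimal toy model of the read path described at the top of this PR. It is a sketch, not the stdlib `IO::Buffered` code: `ToyBufferedReader`, `direct_threshold` and the two counters are hypothetical names made up for illustration, and the 8192-byte buffer size is assumed from the PR title.

```crystal
# Toy model of a buffered reader. NOT the stdlib IO::Buffered implementation;
# all names here are hypothetical and exist only for illustration.
class ToyBufferedReader
  BUFFER_SIZE = 8192 # assumed from the sizes in the PR title

  getter unbuffered_reads = 0 # reads issued against the underlying IO
  getter bytes_copied = 0     # bytes copied out of the internal buffer

  def initialize(@source : IO, @direct_threshold : Int32)
    @buffer = Bytes.new(BUFFER_SIZE)
    @remaining = Bytes.new(0) # unread part of @buffer
  end

  def read(slice : Bytes)
    return 0 if slice.empty?

    if @remaining.empty?
      # Nothing buffered and a large request: read straight into the
      # caller's slice, skipping the internal buffer and its copy.
      if slice.size >= @direct_threshold
        @unbuffered_reads += 1
        return @source.read(slice).to_i
      end

      # Otherwise refill the internal buffer first.
      @unbuffered_reads += 1
      filled = @source.read(@buffer).to_i
      @remaining = @buffer[0, filled]
      return 0 if filled == 0
    end

    # Serve the request from the internal buffer (this is the extra copy).
    to_copy = Math.min(slice.size, @remaining.size)
    slice.copy_from(@remaining.to_unsafe, to_copy)
    @bytes_copied += to_copy
    @remaining += to_copy
    to_copy
  end
end

# Compare a threshold of the full buffer size with a threshold of half of it.
data = Bytes.new(64 * 1024)
[8192, 4096].each do |threshold|
  reader = ToyBufferedReader.new(IO::Memory.new(data), threshold)
  chunk = Bytes.new(6144)
  while reader.read(chunk) > 0
  end
  puts "threshold=#{threshold} unbuffered_reads=#{reader.unbuffered_reads} bytes_copied=#{reader.bytes_copied}"
end
```

In this toy model, lowering the threshold to half the buffer size means a 6144-byte request never touches the internal buffer, so `bytes_copied` stays at zero. Whether that is a net win on a real file depends on how expensive the underlying read is, which is what the benchmark measures.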

Benchmark code:

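```
require "benchmark"

# Defines one benchmark case that reads the whole file in chunks of `size` bytes.
macro benchmark(size)
  b.report("{{size}}") do
    # Read a 1 gigabyte file
    file = File.open("1gb", "r")
    buffer = uninitialized UInt8[{{size}}]
    # Keep reading until read returns 0 (end of file)
    while file.read(buffer.to_slice) > 0
    end
    file.close
  end
end

Benchmark.ips do |b|
  benchmark(2048)
  benchmark(4096)
  benchmark(6144)
  benchmark(8192)
end
```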
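
The benchmark expects a file named `1gb` in the working directory; the PR does not say how that file was produced. One possible way to create a stand-in, assuming zero-filled contents are acceptable for this measurement (`create_zero_file` is a hypothetical helper, not part of the PR):

```crystal
# Hypothetical helper: writes `mebibytes` MiB of zero bytes to `path`.
def create_zero_file(path : String, mebibytes : Int32) : Nil
  chunk = Bytes.new(1024 * 1024) # Bytes.new returns zero-initialised memory
  File.open(path, "w") do |file|
    mebibytes.times { file.write(chunk) }
  end
end

# Example call (commented out because it writes about 1 GiB to disk):
# create_zero_file("1gb", 1024)
```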

RX14 · 2016-10-12T18:40:50Z

Looks like I broke the compiler in this travis build.

asterite · 2016-10-12T18:44:10Z

@RX14 Ah, no, it was me in a middle commit that I already changed. It's a separate bug but it's minor so I'll fix it later. If you rebase against master it should be green in travis.


RX14 · 2016-10-12T18:50:57Z

Rebased!

asterite · 2016-10-13T12:44:46Z

@RX14 Looks good, thank you!!

RX14 force-pushed the optimise-buffered-io branch from fb5a62e to a520004 Compare October 12, 2016 15:56

RX14 changed the title ~~Optimise IO::Buffered for reads 2048 <= size < 8192~~ Optimise IO::Buffered for reads 4096 <= size < 8192 Oct 12, 2016

RX14 force-pushed the optimise-buffered-io branch from a520004 to d23ecd9 Compare October 12, 2016 16:02

RX14 force-pushed the optimise-buffered-io branch from d23ecd9 to b1f40f4 Compare October 12, 2016 18:46

asterite merged commit bb208bc into crystal-lang:master Oct 13, 2016

asterite added kind:feature topic:stdlib labels Oct 15, 2016

asterite added this to the 0.20.0 milestone Oct 15, 2016
