Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance bug with JDBC/sqlite driver under multi-threaded conditions #3398

Closed
chuckremes opened this issue Oct 15, 2015 · 14 comments
Closed

Comments

@chuckremes
Copy link

When writing data in a single thread to a single database, performance is excellent. As soon as there is 2+ threads using the database driver, performance falls off a cliff.

Notes:

  • The reproduction has 4 threads each writing to a separate database and database file. There is no contention for the file itself between threads.
  • Reducing the number of threads improves performance. Increasing worsens it.
  • I recommend running with DEBUG_PRINT=true at least once. It will show the inserts/sec in the low 100s but as the data is exhausted and the threads exit you'll see performance rise to the many 1000s.
require 'sequel'
require 'csv'
require 'benchmark'

Db_string = if RUBY_PLATFORM =~ /java/
  "jdbc:sqlite:/tmp/"
else
  "sqlite:///tmp/"
end

def elapsed_time(&blk)
  start = Time.now
  blk.call
  elapsed = Time.now - start
end

class DBHandle
  def initialize
    @connections = {}
  end

  def setup_db_connection(key)
    connection_string = "#{Db_string}#{key}.db"

    db = Sequel.connect(connection_string, :single_threaded => true)
    db.create_table? :options do
      primary_key :id

      String :underlying_symbol, :size => 15
      Float :underlying_price
      String :exchange, :fixed => true, :size => 1
      String :option_symbol, :size => 20
      String :option_ext, :size => 5
      String :type, :fixed => true, :size => 1
      Date :expiration
      Date :trade_date
      Float :strike
      Float :last
      Float :bid
      Float :ask
      Integer :volume
      Integer :open_interest
      Float :implied_vol
      Float :delta
      Float :gamma
      Float :theta
      Float :vega
      String :aka, :size => 15

      index :trade_date
    end    

    db
  end

  def setup_db_thread(key)
    db_queue = Queue.new
    thread = Thread.new do
      work(key, db_queue)
    end

    return({:queue => db_queue, :thread => thread})
  end

  def work(key, queue)
    headers = [
        :underlying_symbol, :underlying_price, :exchange, :option_symbol, :option_ext,
        :type, :expiration, :trade_date, :strike, :last, :bid, :ask, :volume,
        :open_interest, :implied_vol, :delta, :gamma, :theta, :vega, :aka
      ]

    loop do
      index, rows = queue.pop
      break if rows == "done"
      inserts = 0
      elapsed = 0

      unless @connections.has_key?(index)
        @connections[index] = setup_db_connection(index)
      end

      db = @connections[index]

      inserts += rows.size

      elapsed += elapsed_time do
        db[:options].import(headers, rows, :commit_every => 500)
      end

      unless elapsed.zero?
        persec = "%.2f" % (inserts / elapsed)
        puts "[child] db-conn Inserted [#{inserts}] rows at a rate of [#{persec}] rows per second." if DEBUG_PRINT
      end
    end # loop

    puts "[child] db-conn Database connection exiting..." if DEBUG_PRINT
  end
end # class DBHandle

handles = []
connection = DBHandle.new
handles << connection.setup_db_thread(0)
handles << connection.setup_db_thread(1)
handles << connection.setup_db_thread(2)
handles << connection.setup_db_thread(3)

rows = []
all_rows = []
file = File.open("slow.csv")
io = CSV.new(file, :converters => :all)

count = 0
while row = io.gets
  rows << row
  count += 1

  if count > rand(500)
    all_rows << rows
    count = 0
    rows = []
  end
end


DEBUG_PRINT = false

Benchmark.bmbm(22) do |x|
  x.report('insert rows') do
    all_rows.each do |rows|
      index = rand(4)
      handles[index][:queue] << [index, rows]
    end

    handles.each { |hsh| hsh[:queue] << [nil, "done"] }
    handles.each { |hsh| hsh[:thread].join }
  end
end

io.close

The datafile to use is here:
https://www.dropbox.com/s/sawaxksmlkep674/slow.zip?dl=0

Benchmarks:
jruby 9.0.2.0-SNAPSHOT (2.2.2) 2015-10-13 f2242d0 Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [darwin-x86_64]

ruby --server slow.rb 
Rehearsal ----------------------------------------------------------
insert rows             13.910000  25.840000  39.750000 ( 23.120468)
------------------------------------------------ total: 39.750000sec

                             user     system      total        real
insert rows              0.010000   0.000000   0.010000 (  0.002492)

ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]

ruby slow.rb 
Rehearsal ----------------------------------------------------------
insert rows              1.360000   0.940000   2.300000 (  2.721511)
------------------------------------------------- total: 2.300000sec

                             user     system      total        real
insert rows              0.000000   0.000000   0.000000 (  0.000299)
@headius
Copy link
Member

headius commented Oct 15, 2015

Some findings:

  • Almost all of a sampled profile (--sample) appears to be in the native downcall to SQLite, NativeDB._exec. After multiple loops, it was 90% of all samples.
  • A dtruss output showed pmutex (OS X mutex) calls as by far the biggest number of calls.
  • I could find no locking in the Java code of sqlite-jdbc.
  • I did find mention of locking modes in the config options for SQLite, both in Java and in C.

My current theory is that the JDBC driver for SQLite is either defaulting to or being configure to use some locking mode that's causing global contention. On my system I was unable to max out four cores with four threads.

I don't really know enough about sqlite or the sequel library by @jeremyevans to know if there's a way we could push some config options through to the native connection establishment.

@headius
Copy link
Member

headius commented Oct 15, 2015

Another peculiar finding: it performance much better with smaller insert batches.

If I set the row aggregator to do 10 at a time, performance goes from around 7.5s down to 3s. After some warmup (a couple iterations through main benchmark) that improves as low as 2.16s. Anything below ten-at-a-time does not improve or degrades due to extra Ruby and object overhead.

This would be consistent with the thought that there's some global locking going on at the driver level, preventing all threads from running concurrently.

@headius
Copy link
Member

headius commented Oct 15, 2015

Also worth pointing out that a single thread is able to complete the bench in 6s, somewhat better than four threads doing 500 at a time. It's just not parallelizing like it should.

@headius
Copy link
Member

headius commented Oct 16, 2015

Ok, I figured it out. And it wasn't us or Sequel causing the problem (you're off the hook, @jeremyevans!).

The problem seems to be that the sqlite-jdbc driver compiles in its own sqlite using the default threadsafety mode. This defaults to serializing all calls, and it seems that in this case it's serializing calls even across database connections. I'm not sure if there's a way around that or not.

My fix was to recompile sqlite-jdbc using -DSQLITE_THREADSAFE=0 and replace the jar in the jdbc-sqlite3 gem.

I modified your benchmark to do exactly 500-wide batches (rather than rand(500)) and distribute them evenly between all threads.

...
  if count > 500
    all_rows << rows
...
    all_rows.each_with_index do |rows, i|
      index = i % handles.size

Before results:

[] ~/projects/jruby $ jruby -J-Xmx2g sqlite3_slow.rb 
                             user     system      total        real
insert rows             16.780000  10.160000  26.940000 ( 12.013076)
                             user     system      total        real
insert rows              7.290000   9.100000  16.390000 ( 10.325927)
                             user     system      total        real
insert rows              5.830000   9.520000  15.350000 ( 10.670606)
                             user     system      total        real
insert rows              4.860000   9.560000  14.420000 ( 10.467786)
                             user     system      total        real
insert rows              4.660000   9.060000  13.720000 ( 10.040496)

After results, with recompiled sqlite-jdbc:

[] ~/projects/jruby $ jruby -J-Xmx2g sqlite3_slow.rb 
                             user     system      total        real
insert rows             13.150000   0.430000  13.580000 (  2.127813)
                             user     system      total        real
insert rows              6.060000   0.250000   6.310000 (  1.022566)
                             user     system      total        real
insert rows              3.960000   0.160000   4.120000 (  0.764608)
                             user     system      total        real
insert rows              2.600000   0.120000   2.720000 (  0.631406)
                             user     system      total        real
insert rows              2.930000   0.150000   3.080000 (  0.686241)
                             user     system      total        real
insert rows              2.690000   0.130000   2.820000 (  0.655170)

Here's MRI 2.2.2 on my system for comparison. We are now roughly 2x faster for this particular benchmark.

[] ~/projects/jruby $ ruby sqlite3_slow.rb 
                             user     system      total        real
insert rows              1.450000   0.100000   1.550000 (  1.605332)
                             user     system      total        real
insert rows              1.230000   0.090000   1.320000 (  1.376512)
                             user     system      total        real
insert rows              1.210000   0.080000   1.290000 (  1.342696)
                             user     system      total        real
insert rows              1.190000   0.080000   1.270000 (  1.314342)

My system has four real cores and four hyperthreaded cores, so for kicks, I also tried eight handles, but performance did not improve considerably. This is expected, but I would hope to see a machine with eight real cores to scale well:

[] ~/projects/jruby $ jruby -J-Xmx2g sqlite3_slow.rb 
                             user     system      total        real
insert rows             16.260000   0.510000  16.770000 (  2.339669)
                             user     system      total        real
insert rows              7.460000   0.300000   7.760000 (  1.021627)
                             user     system      total        real
insert rows              5.960000   0.230000   6.190000 (  0.813012)
                             user     system      total        real
insert rows              4.690000   0.200000   4.890000 (  0.653922)
                             user     system      total        real
insert rows              5.280000   0.280000   5.560000 (  0.769821)
...
                             user     system      total        real
insert rows              3.380000   0.140000   3.520000 (  0.513625)
                             user     system      total        real
insert rows              3.370000   0.140000   3.510000 (  0.506076)
                             user     system      total        real
insert rows              3.360000   0.150000   3.510000 (  0.496591)
                             user     system      total        real
insert rows              3.510000   0.160000   3.670000 (  0.547823)
                             user     system      total        real
insert rows              3.290000   0.140000   3.430000 (  0.535552)

So it seems we need to have a discussion about how and whether to ship our own sqlite-jdbc that's compiled with thread-safety off.

@headius
Copy link
Member

headius commented Oct 16, 2015

I expect there's more room for improvement here, but we may be limited by going through the JDBC driver (every string has to be marshalled across that boundary to a Java string and then back out to a C string). Going directly through FFI would probably be faster (as fast as C ext, likely), but we don't have such a library right now.

@headius
Copy link
Member

headius commented Oct 16, 2015

Notable: https://github.com/qoobaa/sqlite3 by @qoobaa.

Also sparklemotion/sqlite3-ruby#22 where @sgoings talks about forking that library and using it successfully.

@headius
Copy link
Member

headius commented Oct 16, 2015

My diff for sqlite-jdbc:

diff --git a/Makefile b/Makefile
index de1f1d3..31cd2d5 100644
--- a/Makefile
+++ b/Makefile
@@ -62,6 +62,7 @@ $(SQLITE_OUT)/sqlite3.o : $(SQLITE_UNPACKED)
            -DSQLITE_ENABLE_FTS3_PARENTHESIS \
            -DSQLITE_ENABLE_RTREE \
            -DSQLITE_ENABLE_STAT2 \
+           -DSQLITE_THREADSAFE=0 \
            $(SQLITE_FLAGS) \
            $(SQLITE_OUT)/sqlite3.c

@headius
Copy link
Member

headius commented Oct 16, 2015

I have pushed a copy of my build of sqlite-jdbc here: https://www.dropbox.com/s/qpa910t65d8jwxt/sqlite-jdbc-3.8.11.3-SNAPSHOT.jar?dl=0

I just copied it over the jar in the current jdbc-sqlite3 gem. Keep in mind this makes all connections lockless and thread-unsafe. We'll need a more robust option long term.

@jeremyevans
Copy link
Contributor

Making sqlite-jdbc thread-unsafe will work with Sequel, since Sequel's thread pool already only allows a single thread access to a connection. That may fix this performance issue, but I agree that it isn't a good general solution. If it is at all possible to use a different embedded database with jdbc (e.g. h2/derby/hsqldb), I would recommend that.

@headius
Copy link
Member

headius commented Oct 16, 2015

I have filed xerial/sqlite-jdbc#59 to address the build issue in sqlite-jdbc. My suggestion to them is that they compile it this way but by default ask for thread safety when opening connections. Users (e.g. sequel) can then specify lower levels of thread-safety when it is appropriate via connection properties.

@jeremyevans My suggestion would not reduce the overall thread-safety of the sqlite-jdbc driver, but it would allow libraries like yours to reduce locking when it is appropriate. I think it's a good balance. Of course I agree that using a more concurrency-friendly library is the best solution.

@chuckremes
Copy link
Author

Great work, @headius. Would you like me to leave this ticket open to track the issue long-term, or are you satisfied that the problem is resolved?

@headius
Copy link
Member

headius commented Oct 20, 2015

I'd like to leave it open for now. I'm not satisfied that this is exactly the right solution, since all resources I've read indicate that the default sqlite on OS X (the one MRI and Rubinius would be using) is compiled threadsafe. They should be seeing the same performance impact we see, but they don't.

I've emailed the sqlite-users list to dig deeper. It's possible there's still something we are doing wrong that's keeping performance down.

@headius
Copy link
Member

headius commented Jun 10, 2016

Still waiting on xerial/sqlite-jdbc#59 but I pinged it again today.

@headius
Copy link
Member

headius commented Jun 10, 2016

I'm going to transfer this over to ar-jdbc so we can track it as part of the release process for the drivers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants