-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance bug with JDBC/sqlite driver under multi-threaded conditions #3398
Comments
Some findings:
My current theory is that the JDBC driver for SQLite is either defaulting to or being configure to use some locking mode that's causing global contention. On my system I was unable to max out four cores with four threads. I don't really know enough about sqlite or the sequel library by @jeremyevans to know if there's a way we could push some config options through to the native connection establishment. |
Another peculiar finding: it performance much better with smaller insert batches. If I set the row aggregator to do 10 at a time, performance goes from around 7.5s down to 3s. After some warmup (a couple iterations through main benchmark) that improves as low as 2.16s. Anything below ten-at-a-time does not improve or degrades due to extra Ruby and object overhead. This would be consistent with the thought that there's some global locking going on at the driver level, preventing all threads from running concurrently. |
Also worth pointing out that a single thread is able to complete the bench in 6s, somewhat better than four threads doing 500 at a time. It's just not parallelizing like it should. |
Ok, I figured it out. And it wasn't us or Sequel causing the problem (you're off the hook, @jeremyevans!). The problem seems to be that the sqlite-jdbc driver compiles in its own sqlite using the default threadsafety mode. This defaults to serializing all calls, and it seems that in this case it's serializing calls even across database connections. I'm not sure if there's a way around that or not. My fix was to recompile sqlite-jdbc using -DSQLITE_THREADSAFE=0 and replace the jar in the jdbc-sqlite3 gem. I modified your benchmark to do exactly 500-wide batches (rather than rand(500)) and distribute them evenly between all threads. ...
if count > 500
all_rows << rows
...
all_rows.each_with_index do |rows, i|
index = i % handles.size Before results:
After results, with recompiled sqlite-jdbc:
Here's MRI 2.2.2 on my system for comparison. We are now roughly 2x faster for this particular benchmark.
My system has four real cores and four hyperthreaded cores, so for kicks, I also tried eight handles, but performance did not improve considerably. This is expected, but I would hope to see a machine with eight real cores to scale well:
So it seems we need to have a discussion about how and whether to ship our own sqlite-jdbc that's compiled with thread-safety off. |
I expect there's more room for improvement here, but we may be limited by going through the JDBC driver (every string has to be marshalled across that boundary to a Java string and then back out to a C string). Going directly through FFI would probably be faster (as fast as C ext, likely), but we don't have such a library right now. |
Notable: https://github.com/qoobaa/sqlite3 by @qoobaa. Also sparklemotion/sqlite3-ruby#22 where @sgoings talks about forking that library and using it successfully. |
My diff for sqlite-jdbc: diff --git a/Makefile b/Makefile
index de1f1d3..31cd2d5 100644
--- a/Makefile
+++ b/Makefile
@@ -62,6 +62,7 @@ $(SQLITE_OUT)/sqlite3.o : $(SQLITE_UNPACKED)
-DSQLITE_ENABLE_FTS3_PARENTHESIS \
-DSQLITE_ENABLE_RTREE \
-DSQLITE_ENABLE_STAT2 \
+ -DSQLITE_THREADSAFE=0 \
$(SQLITE_FLAGS) \
$(SQLITE_OUT)/sqlite3.c
|
I have pushed a copy of my build of sqlite-jdbc here: https://www.dropbox.com/s/qpa910t65d8jwxt/sqlite-jdbc-3.8.11.3-SNAPSHOT.jar?dl=0 I just copied it over the jar in the current jdbc-sqlite3 gem. Keep in mind this makes all connections lockless and thread-unsafe. We'll need a more robust option long term. |
Making sqlite-jdbc thread-unsafe will work with Sequel, since Sequel's thread pool already only allows a single thread access to a connection. That may fix this performance issue, but I agree that it isn't a good general solution. If it is at all possible to use a different embedded database with jdbc (e.g. h2/derby/hsqldb), I would recommend that. |
I have filed xerial/sqlite-jdbc#59 to address the build issue in sqlite-jdbc. My suggestion to them is that they compile it this way but by default ask for thread safety when opening connections. Users (e.g. sequel) can then specify lower levels of thread-safety when it is appropriate via connection properties. @jeremyevans My suggestion would not reduce the overall thread-safety of the sqlite-jdbc driver, but it would allow libraries like yours to reduce locking when it is appropriate. I think it's a good balance. Of course I agree that using a more concurrency-friendly library is the best solution. |
Great work, @headius. Would you like me to leave this ticket open to track the issue long-term, or are you satisfied that the problem is resolved? |
I'd like to leave it open for now. I'm not satisfied that this is exactly the right solution, since all resources I've read indicate that the default sqlite on OS X (the one MRI and Rubinius would be using) is compiled threadsafe. They should be seeing the same performance impact we see, but they don't. I've emailed the sqlite-users list to dig deeper. It's possible there's still something we are doing wrong that's keeping performance down. |
Still waiting on xerial/sqlite-jdbc#59 but I pinged it again today. |
I'm going to transfer this over to ar-jdbc so we can track it as part of the release process for the drivers. |
When writing data in a single thread to a single database, performance is excellent. As soon as there is 2+ threads using the database driver, performance falls off a cliff.
Notes:
The datafile to use is here:
https://www.dropbox.com/s/sawaxksmlkep674/slow.zip?dl=0
Benchmarks:
jruby 9.0.2.0-SNAPSHOT (2.2.2) 2015-10-13 f2242d0 Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [darwin-x86_64]
ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]
The text was updated successfully, but these errors were encountered: