-
-
Notifications
You must be signed in to change notification settings - Fork 924
review String -> RubyString UTF (8) encoding #5239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
+304
−171
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
AR-JDBC 5.1 numbersJRuby 9.1
JRuby 9.1 + with newDefaultInternalString replacement
JRuby 9.2.1 SNAPSHOT
|
Numbers look good...I'll review the change. |
... this likely isn't used that much since it might have failed for cases where String's char[] is shared (int offset being > 0) also this would need special care in Java 10 where its a byte[]
... for non-direct ByteBuffer we can extract bytes directly
doing toString() does not make a difference in micro-benchmarks thus could as well not char[] copy esp. since one doesn't know what kind of CharSequence objects might come along ...
and no longer fill in null encoding - we always pass it down interestingly, with micro-benchmarks, this seems to run better passing a StringBuilder down seems to get a very noticeable speed improvement, while String cases stay around the same performance
cb57a76
to
5105163
Compare
BEFORE: ``` Benchmark Mode Cnt Score Error Units EncodingBenchmark.benchLongRubyStringNew thrpt 5 7104.064 ± 252.231 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence thrpt 5 6882.044 ± 133.946 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence2 thrpt 5 7059.163 ± 208.203 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence3 thrpt 5 7177.851 ± 188.033 ops/ms EncodingBenchmark.benchShortRubyStringNew thrpt 5 15108.288 ± 282.496 ops/ms EncodingBenchmark.benchShortRubyStringNewCharSequence thrpt 5 14342.470 ± 101.090 ops/ms EncodingBenchmark.benchVeryLongRubyStringNew thrpt 5 1173.092 ± 8.716 ops/ms EncodingBenchmark.benchVeryLongRubyStringNewCharSequence thrpt 5 1017.636 ± 58.843 ops/ms EncodingBenchmark.benchVeryLongRubyStringNewCharSequence2 thrpt 5 1065.907 ± 26.763 ops/ms ``` AFTER: ``` Benchmark Mode Cnt Score Error Units EncodingBenchmark.benchLongRubyStringNew thrpt 5 7205.086 ± 474.930 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence thrpt 5 9239.360 ± 338.284 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence2 thrpt 5 4425.827 ± 246.294 ops/ms EncodingBenchmark.benchLongRubyStringNewCharSequence3 thrpt 5 7661.631 ± 418.873 ops/ms EncodingBenchmark.benchShortRubyStringNew thrpt 5 15875.130 ± 926.360 ops/ms EncodingBenchmark.benchShortRubyStringNewCharSequence thrpt 5 16137.382 ± 1024.177 ops/ms EncodingBenchmark.benchVeryLongRubyStringNew thrpt 5 1149.699 ± 27.375 ops/ms EncodingBenchmark.benchVeryLongRubyStringNewCharSequence thrpt 5 1982.773 ± 133.350 ops/ms EncodingBenchmark.benchVeryLongRubyStringNewCharSequence2 thrpt 5 634.528 ± 224.842 ops/ms ```
... makes no difference for micro-benchmarks but we at least won't copy char[] buffers around - its clearly a user intent to encode the buffer
let's ship before it gets behind ... CI didn't spot any regressions. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
first UTF-16 is broken on Java 10+ since it assumes
char[]
value String internals (using Unsafe)doing that, spotted a few less copy
byte[]
/char[]
opportunities and started micro-benchmarkingby separating String/CharSequence paths down the encoding pipe there seems to be gains (maybe its likely due HotSpot's JIT assuming the CS method as mono-morphic as they only get a
StringBuilder
)but since the micro benchmark wasn't assuring (non StringBuilder CharSequence paths do get slower in JMH) I did some AR-JDBC "raw" AR benchmarking ...
surprisingly this seems to make sense as there seems to be a noticeable almost 5% speed gain