Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: Dry up RubyArray and improve performance in a few spots #4751

Merged
merged 1 commit into from
Sep 5, 2017

Conversation

original-brownbear
Copy link
Contributor

@original-brownbear original-brownbear commented Aug 26, 2017

Some cleanups I came across debugging RubyArray performance recently :)

  • remove dead imports from RubyArray
  • Extract the USE_PACKED_ARRAYS == true paths to a single if branch to be more JIT friendly here
    • especially isPackedArray will be nicely dead code eliminated now for the Collection case and we save an expensive size call here and there (+ bytecode becomes smaller :))
  • Make 2 methods static that had no instance reference anywhere
  • Remove dead cast
  • Removed redundant array creation for varargs method with one arguement
  • Remove dead imports from packed array types
  • Replaced slower collection to array conversion list.toArray(new IRubyObject[list.size()])) with the faster version collection.toArray(IRubyObject.NULL_ARRAY)

Just a few suggestions, let me know what you think :)

return new RubyArray(runtime, values);
}

public static RubyArray newArray(Ruby runtime, List<? extends IRubyObject> list) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still'd like to have the List version around for binary compatibility - it also does not need toArray for packed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kares right bringing it back sec :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brought it back here https://github.com/jruby/jruby/pull/4751/files#diff-008115c4ee37b136355736c9aa675fa3R264 with nicely separate packed and not packed paths :)

@@ -4100,7 +4077,9 @@ public IRubyObject sample(ThreadContext context, IRubyObject[] args) {
if (args.length > 0) {
IRubyObject hash = TypeConverter.checkHashType(context.runtime, args[args.length - 1]);
if (!hash.isNil()) {
IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash, new String[] { "random" });
IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep it consistently a one liner maybe :) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kares sure :)

@@ -4792,7 +4771,7 @@ public IRubyObject min(ThreadContext context, IRubyObject num, Block block) {
}

private static final int optimizedCmp(ThreadContext context, IRubyObject a, IRubyObject b, int token, CachingCallSite op_cmp, CallSite op_gt, CallSite op_lt) {
if (token == ((RubyBasicObject) a).getMetaClass().generation) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this cast is all over the place - it used to help inlining ... without the JIT wasn't just smart-enough to get it "fast"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kares ah I see, phew I'm not sure about this point, but in Java 8 this should not help anything anymore right. I'd rather expect this to potentially be harmful because it increases the size of the method and the type profiler will figure out that this is always a RubyBasicObjecteventually right? (I could investigate this a little with JitWatch I guess :))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kk, but Java 7 still supported.
so it might be valuable to measure using that as well, although Java 8 first I guess
(9K is already considerably slower on 7)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kares wow I stand 100% corrected here, check this out:
Tried to use a slightly deeper (in the sense of inheritance) case and benchmarked this:

    @Benchmark
    @OperationsPerInvocation(EVENTS_PER_INVOCATION)
    public final void castPerformanceImpactWithout(final Blackhole blackhole) throws Exception {
        for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
            blackhole.consume(random().isTrue());
        }
    }

    @Benchmark
    @OperationsPerInvocation(EVENTS_PER_INVOCATION)
    public final void castPerformanceImpactWith(final Blackhole blackhole) throws Exception {
        for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
            blackhole.consume(((RubyBoolean) random()).isTrue());
        }
    }

Results are:

# Run complete. Total time: 00:01:03

Benchmark                                              Mode  Cnt      Score      Error   Units
CastExperimentBenchmark.castPerformanceImpactWith     thrpt   20  92950.858 ± 1169.586  ops/ms
CastExperimentBenchmark.castPerformanceImpactWithout  thrpt   20  69318.454 ± 2354.502  ops/ms

didn't expect this at all (same effect for JDK 7 and 8 btw.) => will revert + thanks for the very educational moment :)

Copy link
Contributor Author

@original-brownbear original-brownbear Aug 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    private static IRubyObject random() {
        return RubyUtil.RUBY.newBoolean(RANDOM.nextBoolean());
    }
    private static final Random RANDOM = new Random();

was my object generator :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overhead in these cases seems to be the keeping of the type profile itself, inlining is not affected as such and method size is even a byte larger from the cast as I expected:

   5189 2084       4       org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWith (28 bytes)
                              @ 9   org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes)   inline (hot)
                                @ 6   java.util.Random::nextBoolean (14 bytes)   inline (hot)
                                  @ 2   java.util.Random::next (47 bytes)   inline (hot)
                                    @ 8   java.util.concurrent.atomic.AtomicLong::get (5 bytes)   inline (hot)
                                    @ 32   java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)   inline (hot)
                                      @ 9   sun.misc.Unsafe::compareAndSwapLong (0 bytes)   (intrinsic)
                                @ 9   org.jruby.Ruby::newBoolean (16 bytes)   inline (hot)
                              @ 15   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                              @ 18   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

   2917 2053       4       org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWithout (27 bytes)
                              @ 9   org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes)   inline (hot)
                                @ 6   java.util.Random::nextBoolean (14 bytes)   inline (hot)
                                  @ 2   java.util.Random::next (47 bytes)   inline (hot)
                                    @ 8   java.util.concurrent.atomic.AtomicLong::get (5 bytes)   inline (hot)
                                    @ 32   java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)   inline (hot)
                                      @ 9   sun.misc.Unsafe::compareAndSwapLong (0 bytes)   (intrinsic)
                                @ 9   org.jruby.Ruby::newBoolean (16 bytes)   inline (hot)
                              @ 12   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                              @ 12   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                               \-> TypeProfile (140240/280503 counts) = org/jruby/RubyBoolean$False
                               \-> TypeProfile (140263/280503 counts) = org/jruby/RubyBoolean$True
                              @ 17   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, and yes it surprising (and IDE gives warnings all over) also did not know about that.
... but I recall @headius lecture on some of the issues along the way of making JRuby.
think he also raised the issue to the HotSpot team - maybe for 9 thay might do smt about it.
or maybe they won't and eventually just push GraalVM's optimizations for all as Java 10 :)

@original-brownbear original-brownbear force-pushed the cleanup-ruby-array branch 2 times, most recently from 03cd174 to 51cd6a2 Compare August 27, 2017 07:37
@original-brownbear
Copy link
Contributor Author

@kares thanks for the swift review, addressed all comments I think/hope :)
List special case handling is back and should inline very clean now too.

@original-brownbear
Copy link
Contributor Author

@kares reverted the removed cast and added some benchmarks here #4751 (comment) (just in case GH hides it now that I pushed over the change, kind of interesting imo :))

@kares
Copy link
Member

kares commented Sep 5, 2017

no one complained in a week or so, thus let's ship it ...

@kares kares merged commit 3cbf5ff into jruby:master Sep 5, 2017
@kares kares added this to the JRuby 9.2.0.0 milestone Sep 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants