
Fixnum/Float 'fast-ops' working even when re-opened #4736

Merged
merged 12 commits on Jun 27, 2019

Conversation

kares
Member

@kares kares commented Aug 9, 2017

instead of relying on isFixnumReopened and similar flags,
we check at the call-site whether the cached method isBuiltin

while it's slightly slower, it does not degrade whenever Fixnum/Float gets re-opened ...
... thus it also works (the fast-op path still gets triggered) with active_support/core_ext loaded

for polymorphic-style behaviour (e.g. with +) deciding on a Fixnum/Float argument, there are now 2 caches used, so it should play nicely (not invalidate) if a site does 1+1, 1+1.0, 1+1 in a row ... let me know if it's premature :)
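The two-cache idea can be illustrated with a toy dispatcher (all names here are hypothetical, not the actual JRuby call-site classes): with a single monomorphic slot, a 1+1, 1+1.0, 1+1 sequence evicts the cache on every type switch, while two per-type slots miss only once per type.

```java
// Hypothetical sketch of why a bimorphic (two-slot) call-site cache helps:
// alternating Fixnum (Long) and Float (Double) arguments no longer evict
// each other. Not JRuby code; miss counters stand in for real re-caching.
public class BimorphicCacheDemo {
    static int monoMisses = 0, biMisses = 0;
    static Class<?> monoSlot;                  // single shared cache slot
    static Class<?> biFixnumSlot, biFloatSlot; // one slot per expected kind

    static void callMono(Object arg) {
        if (monoSlot != arg.getClass()) { monoSlot = arg.getClass(); monoMisses++; }
    }

    static void callBi(Object arg) {
        if (arg instanceof Long) {
            if (biFixnumSlot != Long.class) { biFixnumSlot = Long.class; biMisses++; }
        } else if (arg instanceof Double) {
            if (biFloatSlot != Double.class) { biFloatSlot = Double.class; biMisses++; }
        }
    }

    public static void main(String[] args) {
        Object[] sequence = { 1L, 1.0, 1L }; // mirrors 1+1, 1+1.0, 1+1
        for (Object arg : sequence) { callMono(arg); callBi(arg); }
        System.out.println("mono misses: " + monoMisses); // 3 (every switch misses)
        System.out.println("bi misses:   " + biMisses);   // 2 (one fill per kind)
    }
}
```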

some numbers

... really just some quick "bmbm" runs to confirm it's getting better and also not getting worse :)

  1. before the change (no AS loaded)
Rehearsal ----------------------------------------------------------------
1 + 2 + 12 + 99 [100000000x]   3.970000   0.000000   3.970000 (  3.582797)
2 + 1 [100000000x]             2.990000   0.000000   2.990000 (  2.864492)
------------------------------------------------------- total: 6.960000sec

                                   user     system      total        real
1 + 2 + 12 + 99 [100000000x]   3.320000   0.000000   3.320000 (  3.314471)
2 + 1 [100000000x]             2.530000   0.010000   2.540000 (  2.535214)
  2. before the change (AS loaded)
Rehearsal ----------------------------------------------------------------
1 + 2 + 12 + 99 [100000000x]   6.330000   0.010000   6.340000 (  6.136795)
2 + 1 [100000000x]             3.490000   0.000000   3.490000 (  3.313757)
------------------------------------------------------- total: 9.830000sec

                                   user     system      total        real
1 + 2 + 12 + 99 [100000000x]   5.930000   0.000000   5.930000 (  5.920239)
2 + 1 [100000000x]             3.090000   0.000000   3.090000 (  3.091332)
  3. after the change (AS loaded)
Rehearsal ----------------------------------------------------------------
1 + 2 + 12 + 99 [100000000x]   4.640000   0.010000   4.650000 (  4.236394)
2 + 1 [100000000x]             3.030000   0.000000   3.030000 (  2.839032)
------------------------------------------------------- total: 7.680000sec

                                   user     system      total        real
1 + 2 + 12 + 99 [100000000x]   4.150000   0.000000   4.150000 (  4.107868)
2 + 1 [100000000x]             2.680000   0.000000   2.680000 (  2.683514)
  4. after the change (AS not loaded)
Rehearsal ----------------------------------------------------------------
1 + 2 + 12 + 99 [100000000x]   4.630000   0.000000   4.630000 (  4.137725)
2 + 1 [100000000x]             3.090000   0.000000   3.090000 (  2.986892)
------------------------------------------------------- total: 7.720000sec

                                   user     system      total        real
1 + 2 + 12 + 99 [100000000x]   3.760000   0.010000   3.770000 (  3.769681)
2 + 1 [100000000x]             2.850000   0.000000   2.850000 (  2.784346)

@kares kares requested a review from headius August 9, 2017 11:17
@kares kares added this to the JRuby 9.2.0.0 milestone Aug 9, 2017
Member

@headius headius left a comment

The patch seems pretty straightforward. I guess you added a second cache so you wouldn't be blowing out the Fixnum cache with Float values? I'm unclear on that change but the rest is fine.

@kares
Member Author

kares commented Aug 17, 2017

I guess you added a second cache so you wouldn't be blowing out the Fixnum cache with Float values? I'm unclear on that change but the rest is fine.

exactly - seemed like it would make sense, since we manage that call-site manually: we know when a Fixnum and when a Float class is to be expected (unlike a generic, truly polymorphic cache).

@headius
Member

headius commented Aug 18, 2017

Seems ok to me then. I see there's no patch for indy, so that might be some additional work worth doing.

@kares
Member Author

kares commented Aug 18, 2017

thanks, will merge this one then ... will check what I can do about indy but no promises :)

@kares
Member Author

kares commented Aug 21, 2017

so I see what you meant - the fixnum/float invalidators are still based on the Fixnum/Float re-opening logic.
a good enough challenge for me to finally write some indy code ... what can be done is performing the built-in check in a method-handle guard. I'm on it.

btw. I've had a related/similar question on my mind for a while - it seems a bit more complicated, but if we wanted to get rid of the isBuiltin check completely we would need to observe changes to a RubyClass -> it could dispatch to listeners with details every time a method is added/removed. what's the flaw in general with such an idea?
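Concretely, the listener idea being asked about might look like the following rough sketch (all names are hypothetical, not JRuby APIs): the class model eagerly notifies every registered site on each method definition change.

```java
// Hypothetical sketch of "active invalidation": a class notifies registered
// listeners whenever a method is added or removed. The eager notification is
// exactly the cost discussed below - every registered site pays on each
// change, even sites that will never execute again.
import java.util.ArrayList;
import java.util.List;

public class ActiveInvalidationSketch {
    interface MethodChangeListener {
        void methodChanged(String name); // told which method was added/removed
    }

    static class RubyClassModel {
        private final List<MethodChangeListener> listeners = new ArrayList<>();
        void addListener(MethodChangeListener l) { listeners.add(l); }
        // Eager dispatch to every listener on each definition change.
        void defineMethod(String name) {
            for (MethodChangeListener l : listeners) l.methodChanged(name);
        }
    }

    static final List<String> invalidated = new ArrayList<>();

    public static void main(String[] args) {
        RubyClassModel fixnum = new RubyClassModel();
        fixnum.addListener(invalidated::add); // a call-site registering itself
        fixnum.defineMethod("+");             // e.g. active_support re-opening Fixnum
        System.out.println(invalidated);      // [+]
    }
}
```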

@enebo
Member

enebo commented Aug 21, 2017

@kares early on we tried that without much success (we called it active invalidation at the time). There are a couple of issues: 1) eager invalidation is extra execution cost for code that is never hit again; 2) concurrency consistency (unless we lock down the world, one site may be updated while the next one has not been, so it fires the wrong method version). Some inconsistency is just the indeterminism of threading, but I think this sort of invalidation goes beyond that.

A second point for invokedynamic with invalidation is switchpoint invalidation is incredibly cheap.
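The mechanism behind that second point can be shown standalone: `java.lang.invoke.SwitchPoint` is a real JDK class with exactly this shape; only the fast/slow methods here are placeholder stand-ins for a built-in op and its fallback.

```java
// java.lang.invoke.SwitchPoint is what makes indy invalidation cheap: until
// invalidated, the guarded handle dispatches straight to the fast target with
// essentially no per-call cost; after SwitchPoint.invalidateAll it permanently
// falls back. Standalone illustration, not JRuby code.
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class SwitchPointDemo {
    static int fast(int a, int b) { return a + b; }        // e.g. built-in Fixnum#+
    static int slow(int a, int b) { return a + b + 1000; } // marker for the fallback path

    static int[] demo() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType type = MethodType.methodType(int.class, int.class, int.class);
        MethodHandle fastH = lookup.findStatic(SwitchPointDemo.class, "fast", type);
        MethodHandle slowH = lookup.findStatic(SwitchPointDemo.class, "slow", type);

        SwitchPoint sp = new SwitchPoint();
        MethodHandle guarded = sp.guardWithTest(fastH, slowH);

        int before = (int) guarded.invokeExact(1, 2);       // fast path
        SwitchPoint.invalidateAll(new SwitchPoint[]{ sp }); // e.g. Fixnum re-opened
        int after = (int) guarded.invokeExact(1, 2);        // fallback path
        return new int[]{ before, after };
    }

    public static void main(String[] args) throws Throwable {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1]); // 3 1003
    }
}
```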

@headius
Member

headius commented Aug 23, 2017

I did experiment with including a switchpoint inside each method object, so that if that method were ever overwritten we could do a one-shot invalidation of all sites that might have cached it. This would reduce the call site churn caused by invalidating a single class and having all methods cached from it get kicked out, even if they're still valid.

I don't recall exactly what problems I ran into but I do remember that determining when to invalidate was tricky. You need to invalidate a method if any subclass overrides it, which meant all new methods had to walk up the hierarchy looking for the method they might be shadowing and then invalidate that method. This ended up being a very high startup cost even if it might save us walking down the hierarchy in the current model.

It's something to consider, though...there may still be an efficient way to invalidate all call sites containing a given method without invalidating the whole class.

@headius
Member

headius commented Aug 29, 2017

I pushed something similar to my experiment from years ago to the active_invalidation branch. The strategy there works a bit differently than I described above:

  • Each CacheEntry, which is immutable, gets a new, valid SwitchPoint.
  • When redefining a method, a search is done for that method name and any cached entry is invalidated.
  • All uses of CacheEntry check that it is still valid before proceeding.

This would eliminate the down-hierarchy search required to invalidate all descendant classes, replacing it with a usually shorter, finer-grained search up the hierarchy for a single cache entry.

It would also mean we are invalidating methods based both on the class they're in and the name of the method, reducing the impact on unrelated method names.
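The per-method strategy described in the bullets above might be sketched like this (the `CacheEntry` class and the map are illustrative stand-ins, not the branch's actual code; only `SwitchPoint` is the real JDK class):

```java
// Sketch of per-method invalidation: each immutable cache entry carries its
// own SwitchPoint, so redefining one method name invalidates only the sites
// that cached that method, leaving other methods of the class untouched.
import java.lang.invoke.SwitchPoint;
import java.util.HashMap;
import java.util.Map;

public class PerMethodInvalidationSketch {
    // Immutable: a redefinition creates a new entry rather than mutating this one.
    static final class CacheEntry {
        final String methodName;
        final SwitchPoint switchPoint = new SwitchPoint();
        CacheEntry(String methodName) { this.methodName = methodName; }
        boolean isValid() { return !switchPoint.hasBeenInvalidated(); }
    }

    static final Map<String, CacheEntry> cache = new HashMap<>();

    static CacheEntry lookup(String name) {
        return cache.computeIfAbsent(name, CacheEntry::new);
    }

    // Redefinition searches for the one cached name and trips only its
    // SwitchPoint, instead of bumping a class-wide serial that kicks out
    // every cached method of the class.
    static void redefine(String name) {
        CacheEntry old = cache.remove(name);
        if (old != null) SwitchPoint.invalidateAll(new SwitchPoint[]{ old.switchPoint });
    }

    public static void main(String[] args) {
        CacheEntry plus = lookup("+");
        CacheEntry minus = lookup("-");
        redefine("+");                       // only "+" sites invalidate
        System.out.println(plus.isValid());  // false
        System.out.println(minus.isValid()); // true
    }
}
```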

However, it doesn't pass. I haven't dug into the failures.

@enebo
Member

enebo commented Aug 29, 2017

@headius you are using the same term, but it is different than what @kares suggested and what I described about the past. This is what I would consider fine-grained inactive invalidation, since a CacheEntry would only invalidate and be replaced upon getting called after the method has changed. Perhaps we need different vocabulary? A method changing and notifying all sites (the old days' active invalidation) is probably better called eager. What we do today is lazy. What your branch does is method-grained, and what we do today is type-grained. This last distinction matters because runtime-grained also exists, and no doubt other interesting subsets. Seems a fair description?

@headius
Member

headius commented Sep 5, 2017

@enebo Yes, that's not a bad description.

My branch was an attempt to make invalidation work like it does today, but on a method slot basis. So only call sites that cache a method that gets replaced or overridden would invalidate.

Granularity-wise, this is close to the finest grain we can invalidate. Where currently we invalidate only based on the serial number of the class (and method changes spin all serials below), this additionally separates invalidation based on the name and where the class and method are in the hierarchy (e.g. invalidating foo in a parent class doesn't need to invalidate children that never saw it).

@headius
Member

headius commented Oct 11, 2018

@kares This is marked for 9.2.1. Can it be updated?

@kares kares force-pushed the test-call-site-2 branch 3 times, most recently from 0764282 to ff52286 Compare October 17, 2018 10:27
@headius headius modified the milestones: JRuby 9.2.1.0, JRuby 9.2.2.0 Nov 1, 2018
@enebo enebo removed this from the JRuby 9.2.5.0 milestone Dec 6, 2018
@kares kares requested review from headius and enebo June 5, 2019 11:21
Member

@enebo enebo left a comment

The two-level cache class and method names have bugged me. The 2 represents looking at the second cache instead of the first, but it reads like the 2 in rb_new_ary2. Could the three of us maybe come up with another naming convention here?

TwoLevelCachingCallsite? isbuiltin2 is one I don't have any idea on...

@kares
Member Author

kares commented Jun 6, 2019

since NormalCachingCallSite is kind of a MonomorphicCachingCallSite,
in that sense NormalCachingCallSite2 could become a BimorphicCachingCallSite
or a BiNormalCachingCallSite ... not sure if that sounds weird or not 😉

the class is an internal base class - the naming is more a reflection of having cache2 (along with cache),
so all 'duplicate' methods simply got the 2 suffix - I haven't had a better idea without refactoring NormalCachingCallSite extensively ... but I am open to suggestions

@kares
Member Author

kares commented Jun 6, 2019

as promised ... here are some numbers: https://gist.github.com/kares/7188d075d614ed023ccad31cf6d968bb
for indy=true it might not be a "fair" comparison, since when Fixnum/Float is re-opened we always invalidate - but re-opening is common, e.g. in a Rails env or even when loading some stdlib parts (bigdecimal/util.rb)

@enebo
Member

enebo commented Jun 6, 2019

@kares re-naming... yeah, I am having a hard time with a name here too. Perhaps @headius will get inspiration. 2 can be Bi or Two or Dual. The method names themselves are even more challenging to me :(

@kares It is neat that math will be faster in Rails apps with this. I wonder if we do enough math in ARJDBC to ever notice? (not that it matters - I just hope we see a positive effect there anyway).

    static boolean fixnumTestWithGen(IRubyObject self, RubyClass klass, final int gen) {
        return self instanceof RubyFixnum && klass.getGeneration() == gen;
    }
Member

This is a nice addition but ideally the indy way would be to use a switch point rather than a generation check. The generation check will always require traversing the class and the field, where the switch point is compiled as a safe point with basically zero overhead.

This is fine for now; we have continued to flip the generation since many other places can't use switch point, but the latter will optimize better.
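The two guard styles under discussion can be contrasted in plain method-handle code (a hypothetical standalone sketch, not JRuby's actual binding logic): a generation guard re-executes a field comparison on every call, while a SwitchPoint guard stays on the fast path for free until tripped.

```java
// Generation-check guard vs. SwitchPoint guard. The generation test runs on
// every dispatch; the SwitchPoint path is compiled as unconditional until
// invalidated. All class/field names here are illustrative.
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class GuardStylesSketch {
    static volatile int generation = 0; // bumped whenever the class is modified
    static final int CACHED_GEN = 0;    // generation observed when the cache was filled

    // Generation-style guard: active work on every single call.
    static boolean genTest() { return generation == CACHED_GEN; }

    static String fast() { return "fast"; }
    static String fallback() { return "fallback"; }

    static String[] demo() throws Throwable {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType str = MethodType.methodType(String.class);
        MethodHandle fastH = l.findStatic(GuardStylesSketch.class, "fast", str);
        MethodHandle fallH = l.findStatic(GuardStylesSketch.class, "fallback", str);
        MethodHandle testH = l.findStatic(GuardStylesSketch.class, "genTest",
                MethodType.methodType(boolean.class));

        // Style 1: explicit test handle, executed per call.
        MethodHandle genGuarded = MethodHandles.guardWithTest(testH, fastH, fallH);
        // Style 2: SwitchPoint; zero overhead while still valid.
        MethodHandle spGuarded = new SwitchPoint().guardWithTest(fastH, fallH);

        String first = (String) genGuarded.invokeExact();
        generation++;                                      // simulate the class changing
        String second = (String) genGuarded.invokeExact(); // generation check now fails
        String third = (String) spGuarded.invokeExact();   // switch point never tripped
        return new String[]{ first, second, third };
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(String.join(",", demo())); // fast,fallback,fast
    }
}
```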

Member Author

@kares kares Jun 25, 2019

this was an addition on top of the method invalidation, as explained:
https://github.com/jruby/jruby/pull/4736/files#r297135999
we can get rid of it and see how it goes.

    target = ((SwitchPoint) context.runtime.getFloatInvalidator().getData())
            .guardWithTest(target, fallback);

    target = ((SwitchPoint) classFloat.getInvalidator().getData()).guardWithTest(target, fallback);
Member

This is a bit confusing to me...why do we need to check both the SwitchPoint and the generation? They should both invalidate on class hierarchy changes.

Member Author

okay, will remove.
this is something I wasn't sure about due to the comment history, where it was noted that the callback invalidation approach hadn't worked well in the past. so I did the generation check as well, as an attempt to invalidate methods "early" (besides the class-level method invalidator)

Member

@headius headius left a comment

No objections for merging. Only concern is with the use of both generation and switchpoint to invalidate the math operations. Unless I've missed some detail, I think only one should be required.

@headius
Member

headius commented Jun 24, 2019

I have submitted a technical review. As for the naming...I'm with @enebo that "2" is not descriptive enough.

I'm thinking along these lines:

NormalCachingCallSite becomes a common abstract parent to MonomorphicCallSite and BimorphicCallSite ("caching" seems irrelevant at this level), or goes away entirely (a binary compat issue; perhaps it just becomes an empty class).

MonomorphicCallSite keeps the methods named without any numbering, since those are definitely used inside and outside of JRuby.

BimorphicCallSite methods are trickier. I would think of the second cache entry as "secondary" but maybe that's a little long? isSecondaryBuiltin, retrieveSecondaryCache, cacheAndGetSecondary...maybe it's not too bad for code that's intended only to be used internally?

@kares
Member Author

kares commented Jun 25, 2019

okay, PR feedback should be addressed by now,
will need to do a rebase due to OracleJDK 11 failing on CI + a conflict resolution, but otherwise hopefully all good ...

@kares kares requested a review from enebo June 25, 2019 20:13
@kares kares force-pushed the test-call-site-2 branch from a6c878c to 563106a Compare June 26, 2019 06:18