Apply frozen string optimizations as in MRI #2990

headius · 2015-05-26T19:30:00Z

We are not doing the frozen string optimization (deduplicating them using a weak internal set) in as many places as MRI. We do it for the literal case of "string".freeze, and until I remove it (for compat reasons) we attempted to do it for all string keys given to hashes. MRI does it in the following places, based on current MRI trunk:

For the pieces of a dstr, since they don't need to be recreated every time (compile_dstr_fragments in MRI).
Literal strings passed to when cases (case_when_optimizable_literal, when_vals in MRI).
The literal freeze case ("string".freeze).
For any calls to [] or []= with a literal string that end up in a builtin class that doesn't care if it's frozen (e.g. Hash#[] or String#[], but need to confirm in opt_aref_with and opt_aset_with instrs).
For the cached string object inside NODE_STR, presumably dup'ed on execution (we already cache ByteList).
For the string backing a simple xstr (this always was frozen, and I don't believe it checks for Kernel#`).

The text was updated successfully, but these errors were encountered:

enebo · 2015-05-26T19:37:39Z

{
    if (!SPECIAL_CONST_P(recv) && RBASIC_CLASS(recv) == rb_cHash && BASIC_OP_UNREDEFINED_P(BOP_AREF, HASH_REDEFINED_OP_FLAG)) {
    val = rb_hash_aref(recv, key);
    }
    else {
    PUSH(recv);
    PUSH(rb_str_resurrect(key));
    CALL_SIMPLE_METHOD(recv);
    }
}

for Hash#[] specifically they will make opt_aref_with which will remove callsite logic for the conditional in this instr which does not seem like a big deal (probably a wash vs callsite logic), but it does keep reusing a single frozen string.

@subbuss Any thoughts on how you would consider attacking this (if builtin then use a single frozen string otherwise use a mutable dup every call?

headius · 2015-05-26T19:40:17Z

FYI, overzealous string deduping caused #2976. It was a combination of deduping non-natural String (e.g. subclasses) and doing so for any string-like Hash keys (not just for literal String keys).

enebo · 2015-05-26T19:48:58Z

Idea for snippet above would be a special call instr (e.g. call_builtin_with_frozen)? It makes me wonder though if this should just use our instructions.specialized.* and Call.create() will make the appropriate specialized version. We do have a pattern for this already.

Fixes #2976. See #2990 for dedup cases from MRI we may want to add to JRuby.

enebo · 2015-05-26T21:08:07Z

Note: CallInstr.create() is very unlikely to be able to use this information past knowing the method is named "[]". The receiver and the single argument will be hidden behind variables so if we want to replace the call with a more efficient one we may have to do it during a pass?

IRBuilder.buildCall() might be a better place since we can trivially see name and whether its single argument is a strnode. This feels ickier to me but the info is more readily handy.

subbuss · 2015-05-27T12:56:37Z

As far as I can tell: "For any calls to [] or []= with a literal string that end up in a builtin class that doesn't care if it's frozen (e.g. Hash#[] or String#[], but need to confirm in opt_aref_with and opt_aset_with instrs)." is the only case that is difficult and what you are discussing. In the presence of monkey-patching, I don't know how you can tell a priori that [] will be to a builtin class.

enebo · 2015-05-27T14:01:00Z

@subbuss yeah we cannot a priori tell it will be builtin but the logic up above is the logic of their special instr for this case. So they have an if/else instr which de-freezes+dups if the method is no longer builtin. By end of yesterday I realized this is much simpler to do in our impl in IRBuild using AST knowledge.

At this point my main question is for analysis analysis. Since this specialized instr is just a call and worst case an arbitrary call the frozeness is never exposed in analysis because we cannot guarantee [] will never change. So even though we sort of know something at runtime about the instr we can never rely on that knowledge in passes past what callinstr gives us. That makes sense right?

headius added a commit that referenced this issue May 26, 2015

Back off Hash String key dedup and make fstring cache strict.

d4b4a26

Fixes #2976. See #2990 for dedup cases from MRI we may want to add to JRuby.

headius mentioned this issue May 26, 2015

Regression: Sequel's literals contain additional quotation marks under JRuby 9k head but not under pre2 #2976

Closed

enebo added performance JRuby 9000 labels May 26, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

Apply frozen string optimizations as in MRI #2990

Apply frozen string optimizations as in MRI #2990

headius commented May 26, 2015

enebo commented May 26, 2015

headius commented May 26, 2015

enebo commented May 26, 2015

enebo commented May 26, 2015

subbuss commented May 27, 2015

enebo commented May 27, 2015

Apply frozen string optimizations as in MRI #2990

Apply frozen string optimizations as in MRI #2990

Comments

headius commented May 26, 2015

enebo commented May 26, 2015

headius commented May 26, 2015

enebo commented May 26, 2015

enebo commented May 26, 2015

subbuss commented May 27, 2015

enebo commented May 27, 2015