Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply frozen string optimizations as in MRI #2990

Open
headius opened this issue May 26, 2015 · 6 comments
Open

Apply frozen string optimizations as in MRI #2990

headius opened this issue May 26, 2015 · 6 comments

Comments

@headius
Copy link
Member

headius commented May 26, 2015

We are not doing the frozen string optimization (deduplicating them using a weak internal set) in as many places as MRI. We do it for the literal case of "string".freeze, and until I remove it (for compat reasons) we attempted to do it for all string keys given to hashes. MRI does it in the following places, based on current MRI trunk:

  • For the pieces of a dstr, since they don't need to be recreated every time (compile_dstr_fragments in MRI).
  • Literal strings passed to when cases (case_when_optimizable_literal, when_vals in MRI).
  • The literal freeze case ("string".freeze).
  • For any calls to [] or []= with a literal string that end up in a builtin class that doesn't care if it's frozen (e.g. Hash#[] or String#[], but need to confirm in opt_aref_with and opt_aset_with instrs).
  • For the cached string object inside NODE_STR, presumably dup'ed on execution (we already cache ByteList).
  • For the string backing a simple xstr (this always was frozen, and I don't believe it checks for Kernel#`).
@enebo
Copy link
Member

enebo commented May 26, 2015

{
    if (!SPECIAL_CONST_P(recv) && RBASIC_CLASS(recv) == rb_cHash && BASIC_OP_UNREDEFINED_P(BOP_AREF, HASH_REDEFINED_OP_FLAG)) {
    val = rb_hash_aref(recv, key);
    }
    else {
    PUSH(recv);
    PUSH(rb_str_resurrect(key));
    CALL_SIMPLE_METHOD(recv);
    }
}

for Hash#[] specifically they will make opt_aref_with which will remove callsite logic for the conditional in this instr which does not seem like a big deal (probably a wash vs callsite logic), but it does keep reusing a single frozen string.

@subbuss Any thoughts on how you would consider attacking this (if builtin then use a single frozen string otherwise use a mutable dup every call?

@headius
Copy link
Member Author

headius commented May 26, 2015

FYI, overzealous string deduping caused #2976. It was a combination of deduping non-natural String (e.g. subclasses) and doing so for any string-like Hash keys (not just for literal String keys).

@enebo
Copy link
Member

enebo commented May 26, 2015

Idea for snippet above would be a special call instr (e.g. call_builtin_with_frozen)? It makes me wonder though if this should just use our instructions.specialized.* and Call.create() will make the appropriate specialized version. We do have a pattern for this already.

@enebo
Copy link
Member

enebo commented May 26, 2015

Note: CallInstr.create() is very unlikely to be able to use this information past knowing the method is named "[]". The receiver and the single argument will be hidden behind variables so if we want to replace the call with a more efficient one we may have to do it during a pass?

IRBuilder.buildCall() might be a better place since we can trivially see name and whether its single argument is a strnode. This feels ickier to me but the info is more readily handy.

@subbuss
Copy link
Contributor

subbuss commented May 27, 2015

As far as I can tell: "For any calls to [] or []= with a literal string that end up in a builtin class that doesn't care if it's frozen (e.g. Hash#[] or String#[], but need to confirm in opt_aref_with and opt_aset_with instrs)." is the only case that is difficult and what you are discussing. In the presence of monkey-patching, I don't know how you can tell a priori that [] will be to a builtin class.

@enebo
Copy link
Member

enebo commented May 27, 2015

@subbuss yeah we cannot a priori tell it will be builtin but the logic up above is the logic of their special instr for this case. So they have an if/else instr which de-freezes+dups if the method is no longer builtin. By end of yesterday I realized this is much simpler to do in our impl in IRBuild using AST knowledge.

At this point my main question is for analysis analysis. Since this specialized instr is just a call and worst case an arbitrary call the frozeness is never exposed in analysis because we cannot guarantee [] will never change. So even though we sort of know something at runtime about the instr we can never rely on that knowledge in passes past what callinstr gives us. That makes sense right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants