A DStrNode is generated when parsing squigly heredocs #4172

jsyeo · 2016-09-23T06:06:13Z

When parsing squigly heredocs, a DStrNode is produced if there are spaces between the first heredoc line and the content. Like for example: "<<~TEST\n hello\nTEST". If I try to parse it without any spaces, only the StrNode is produced.

[23] pry(main)> parse("<<~TEST\nhello\nTEST").map { |a| a }
=> [#<Java::OrgJrubyparserAst::RootNode:0x33617539>, #<Java::OrgJrubyparserAst::StrNode:0x5db4c359>]
[24] pry(main)> parse("<<~TEST\n hello\nTEST").map { |a| a }
=> [#<Java::OrgJrubyparserAst::RootNode:0x7997b197>,
 #<Java::OrgJrubyparserAst::DStrNode:0x460f76a6>,
 #<Java::OrgJrubyparserAst::StrNode:0x11acdc30>,
 #<Java::OrgJrubyparserAst::StrNode:0x4a8ab068>]

Environment

» jruby --version
jruby 9.1.5.0 (2.3.1) 2016-09-07 036ce39 Java HotSpot(TM) 64-Bit Server VM 25.102-b14 on 1.8.0_102-b14 +jit [darwin-x86_64]

The text was updated successfully, but these errors were encountered:

jsyeo · 2016-09-23T16:41:37Z

Here's the output from ast

$ cat heredoc.rb
<<~TEST
 hello
TEST
$ ast heredoc.rb
AST:
RootNode 0
  DStrNode 0
    StrNode 0
    StrNode 1

$ cat nospace_heredoc.rb
<<~TEST
hello
TEST
$ ast nospace_heredoc.rb
AST:
RootNode 0
  StrNode 0

enebo · 2016-09-23T18:47:42Z

@jsyeo What is the bug? Or what problem are you seeing with us processing tilde heredocs this way? MRI I believe also does this same thing internally.

headius · 2016-09-23T21:08:14Z

@enebo I'm not sure if -y tells us what we want to know, but MRI appears to parse both of these identically. I think the concern here is that JRuby's creating a DStr for a non-dynamic string only when there's spaces ahead of the first heredoc line.

enebo · 2016-09-24T02:41:59Z

@headius I don't think -y sheds any light on this. It shows the lex tokens and which productions are hit. It does not report on the actual tree nodes generated. Both JRuby and MRI have a heredoc_dedent method in the lexer which keeps reading each line and concat'ing it as elements on a DStr. (update: I think MRI changed this from dstr + str concats to using memmove at some point?).

So my original question might be answered by @headius which is that this may not build in IR as efficient as it could, but I am not positive this is why @jsyeo is asking. He might just feel the AST is not what he wants for some tooling possibly?

I guess regardless how it is asked I can probably collapse it into a single string literal but I still sort of need to know whether AST is important or not? If not then I think I would fix it in IRBuilder as it will be a simpler fix.

jsyeo · 2016-09-24T07:41:47Z

He might just feel the AST is not what he wants for some tooling possibly?

Yep. I've forked the jruby-parser and updated the code with the main line parser and I was writing specs for it. While I was writing specs for #4173, I wrote something like this:

describe Parser do
  it 'parses squiggly heredoc with single quotes' do
    code = "obj = <<~'TEST'
                  hello
            TEST"
    str_node = parse(code).find_node.tap do |s|
      expect(str_node).to_not be_nil
      expect(str_node.value).to eq "hello\n"
    end
  end
end

This spec fails because the first StrNode that was found contains an empty string instead of "hello". It turns out that the parser was generating a DStrNode that wraps around two StrNodes and the string value is in the second StrNode and I don't think that's what one would expect if someone is traversing the AST.

jsyeo · 2016-09-24T07:45:49Z

I guess regardless how it is asked I can probably collapse it into a single string literal but I still sort of need to know whether AST is important or not?

I guess from a higher level point of view, as long as the semantics are right, we don't really need the generated AST to be correct. However, I think Ripper might not be behaving in the same way as MRI but I couldn't test this hypothesis. >_< I am having problems using Ripper to lex squigly heredocs. I have reported the issue in #4176.

enebo · 2016-09-24T18:48:19Z

@jsyeo ok thanks for the update. We can fix this in the AST so that it lines up and I guess we shall see how ripper should behave once it is working.

headius · 2016-09-26T16:54:28Z

I've forked the jruby-parser and updated the code with the main line parser

Nice! Hopefully you can turn this into a PR to update jruby-parser?

headius · 2016-09-26T16:55:12Z

Perhaps this issue should be moved to jruby/jruby-parser, since for standard JRuby uses, nobody will ever see that this is a DStr?

jsyeo · 2016-09-27T01:55:20Z

We can fix this in the AST so that it lines up and I guess we shall see how ripper should behave once it is working.

Now that ripper is working, I can confirm that this does not affect ripper. I will close this issue and move it to jruby/jruby-parser.

» bin/jruby -e "require 'ripper'; p Ripper.lex(File.read('heredoc.rb'))"
[[[1, 0], :on_heredoc_beg, "<<~TEST"], [[1, 7], :on_nl, "\n"], [[2, 1], :on_tstring_content, "hello\n"], [[3, 0], :on_heredoc_end, "TEST\n"]]
» bin/jruby -e "require 'ripper'; p Ripper.lex(File.read('heredoc_nospace.rb'))"
[[[1, 0], :on_heredoc_beg, "<<~TEST"], [[1, 7], :on_nl, "\n"], [[2, 0], :on_tstring_content, "hello\n"], [[3, 0], :on_heredoc_end, "TEST\n"]]

enebo · 2016-09-27T14:55:15Z

@jsyeo thanks for submitting a PR to jruby-parser. I also think if you have n lines in the heredoc each line gets emitted in ripper. So that might be worth checking to make sure they are the same.

jsyeo closed this as completed Sep 27, 2016

jsyeo mentioned this issue Sep 27, 2016

Update parser with jruby/jruby's main line parser jruby/jruby-parser#39

Closed

enebo added this to the Invalid or Duplicate milestone Nov 9, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

A DStrNode is generated when parsing squigly heredocs #4172

A DStrNode is generated when parsing squigly heredocs #4172

jsyeo commented Sep 23, 2016

jsyeo commented Sep 23, 2016 •

edited

Loading

enebo commented Sep 23, 2016

headius commented Sep 23, 2016

enebo commented Sep 24, 2016

jsyeo commented Sep 24, 2016

jsyeo commented Sep 24, 2016

enebo commented Sep 24, 2016

headius commented Sep 26, 2016

headius commented Sep 26, 2016

jsyeo commented Sep 27, 2016

enebo commented Sep 27, 2016

A DStrNode is generated when parsing squigly heredocs #4172

A DStrNode is generated when parsing squigly heredocs #4172

Comments

jsyeo commented Sep 23, 2016

Environment

jsyeo commented Sep 23, 2016 • edited Loading

enebo commented Sep 23, 2016

headius commented Sep 23, 2016

enebo commented Sep 24, 2016

jsyeo commented Sep 24, 2016

jsyeo commented Sep 24, 2016

enebo commented Sep 24, 2016

headius commented Sep 26, 2016

headius commented Sep 26, 2016

jsyeo commented Sep 27, 2016

enebo commented Sep 27, 2016

jsyeo commented Sep 23, 2016 •

edited

Loading