Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser: don't use invalid internal names for arguments like @select or @@select #6379

Merged
merged 1 commit into from Jul 14, 2018

Conversation

asterite
Copy link
Member

Fixes #5997

Right now this:

def foo(@arg)
end

is expanded to this:

def foo(arg)
  @arg = arg
end

The problem comes when we use a name that's also a keyword:

def foo(@class)
end

# gets expanded to
def foo(class)
  @class = class
end

The above actually works fine, because the parser creates this and it can create a variable name named "class", it's just the user that can't do this. However, if we pass that to a macro, for example:

macro id(x)
  {{x}}
end

id(def foo(@class)
end)

then it doesn't work because the code gets stringified and then output in the program, and then it's invalid:

Error in foo.cr:5: macro didn't expand to a valid program, it expanded to:

================================================================================
--------------------------------------------------------------------------------
   1.   def foo(class)
   2.   @class = class
   3. end
--------------------------------------------------------------------------------
Syntax error in expanded macro: id:1: cannot use 'class' as an argument name

  def foo(class)
          ^

================================================================================

id(def foo(@class)
^~

My suggestion was to change the compiler to expand it using external named arguments:

def foo(@arg)
end

# gets expanded to
def foo(arg __arg0)
  @arg = __arg0
end

The problem with that is that one can no longer do:

def foo(@arg)
  puts arg # arg doesn't exist anymore, it's now called __arg0
end

However, if we only do this with names that are keywords, then there's no problem:

def foo(@module)
  # this obviously doesn't compile, so it's no problem if the internal name is __arg0
  # because we have to always use @module to access the argument
  puts module
end

That's what this PR does.

@asterite asterite force-pushed the bug/5997-invalid-internal-names branch from d38d330 to e12a1fa Compare July 13, 2018 15:46
@jhass
Copy link
Member

jhass commented Jul 13, 2018

Let's please always rewrite to internal temporary locals. I would never have considered the access ivar arg as local by the same name more than an implementation detail, we should break this now and force people to do the assignment themselves.

@asterite
Copy link
Member Author

@jhass I'd rather not to. I tried, but it was a mess

We must simply document this expansion, and that's it. There's also this.

@straight-shoota
Copy link
Member

@asterite I don't get it. Why do you think #6009 was a mess? That's how it should be IMHO. If you need to access an argument as a local variable, you simply shouldn't use the ivar argument short cut. That's it and everything should be fine.

@asterite
Copy link
Member Author

#6007 (comment)

Making that change introduces silent segfaults and memory corruptions.

Also, things that used to refer to an instance variable (or, well, the argument behind an instance variable) will now result in a method access. In fact, when I implemented #6007 and ran the compiler, it crashed because of this, and it was very hard to track down where was the problem.

Things are fine right now. This PR is an improvement and fixes a bug. Can we merge it and continue the discussion of doing this all the time in a separate place?

@asterite
Copy link
Member Author

Like, take this line:

https://github.com/crystal-lang/crystal/pull/6007/files#diff-0e6efe33c8b047fbf52512a8cb7d2311L1697

def initialize(@name, @args = [] of Arg, @return_type = nil, @varargs = false, @body = nil, @real_name = name)

That currently works, @real_name = name refers to @name. But changing that it will now refer to the method name. And because of a compiler bug which doesn't notice that this is accessing the instance var before it being used, it basically access a null pointer.

Now, I could fix that bug. But I can't, it's difficult and I don't have time. If someone wants to fix that bug and then change the way arguments work, then go ahead. But until then, I prefer to fix existing bugs (that I find easier :-P)

@straight-shoota
Copy link
Member

It would work if it was simply changed to this:

def initialize(name, @args = [] of Arg, @return_type = nil, @varargs = false, @body = nil, @real_name = name)
   @name = name

It's just that the compiler doesn't complain about the existing line?

@asterite
Copy link
Member Author

I know, that's what I did. But without doing it, it compiles and simply tries to access name, which access @name, which produces a segmentation fault. The compiler doesn't detect this, and I have no idea how to do it (that means I tried to look at the code to implement this and I found it very hard to do/detect).

@straight-shoota
Copy link
Member

Yeah, I was just checking I understand the problem.

But this issue exists regardless of @ivar argument aliases, doesn't it? If a method is used as an argument default value. It's equally undetectable.

The only special circumstance with ivar arguments is that it currently compiles and changing the language breaks code without raising a compiler error. I'd say, this is totally acceptable before 1.0.

@asterite
Copy link
Member Author

asterite commented Jul 13, 2018

For me it's not acceptable. Segmentation faults are not fun.

But if you (community) want to go that direction, revive my PR, approve it, and merge it (I won't do it because I'm against that change).

@RX14
Copy link
Contributor

RX14 commented Jul 13, 2018

The segfaults were a completely seperate bug that should be fixed regardless of whether this change is made. The change didn't introduce silent segfaults, it just exposed an existing bug. Please, lets fix the bug then merge the original PR.

@asterite
Copy link
Member Author

My reasoning is this:

  • there are two existing bugs: one that generates invalid code with to_s, another one that produces segfault when using named args in som way
  • this PR fixes the first bug
  • what all of you are proposing is to not merge this PR, because we can change the behaviour of the language plus fix the bug that produces the segfault

So I see this as "No, let's keep the two existing bugs and wait to fix them at the same time instead of fixing one first, for which we already have a fix, and then the other".

The problem is that there's no guarantee we'll fix the bug, or when. This PR provides a fix for an existing bug, without changing existing behavior nor introducing segfaults. I can't see the harm of fixing this.

Then when that segfault bug is fixed and we can always used external names for @foo arguments, it's just about removing that if that decides when to use external arguments.

@RX14
Copy link
Contributor

RX14 commented Jul 13, 2018

yeah, you're absolutely right

@RX14 RX14 merged commit e110660 into crystal-lang:master Jul 14, 2018
@RX14 RX14 added this to the Next milestone Jul 14, 2018
@asterite
Copy link
Member Author

Thank you!

And note that I'm not against changing how @foo is implemented in arguments, it's just that the cost of fixing that without fixing the other bug is too high, in my opinion (as I said, the compiler started to segfault and it was hard to find why that happened, and I don't want to make users go through that process)

@jhass
Copy link
Member

jhass commented Jul 14, 2018

Mmh, I wanted to log an issue for the bug as a perquisite for #4332, but it looks like I didn't fully get it yet, why does this correctly error out? https://carc.in/#/r/4ij0

@asterite
Copy link
Member Author

@jhass Interesting! I was going to point you to the issue that shows how to reproduce the bug, 3413, changing the instance variable access with a method that accesses it, but it turns out I can't reproduce #3413 anymore in master, it gives a correct error. I also tried reproducing it with a method access and it gives the same error. I don't know when it got fixed...

That's good news. I will try to see if I can apply this PR then to all instance/class variable arguments and see if I don't get a segfault anymore and the errors are easy to fix.

@RX14
Copy link
Contributor

RX14 commented Jul 14, 2018

@asterite a git bisect would be interesting

@asterite
Copy link
Member Author

@RX14 Yeah, I might try that too...

I'm implementing this PR for all instance/class vars but I'm not sure I like it (like before). Sometimes there's code like this:

@x : Int32?

def foo(@x)
  z = x ? x.abs : 0
end

Basically, @x means receiving an argument and assigning it to the instance var. With this change that argument is lost and we only have access to the instance var. So we change it:

@x : Int32?

def foo(@x)
  z = @x ? @x.abs : 0
end

and now it doesn't work, because there's no type filtering for instance vars. So we must do:

@x : Int32?

def foo(x)
  @x = x
  z = x ? x.abs : 0
end

Essentially, we lost the ability to use @x in arguments if you also want to type filter them through the local variable that it gets assigned to first. And that local variable always exists, it's just that you want to hide it.

I'm not sure what's the benefit of all this... having to write more code? Also, accessing a local variable is faster than accessing an instance variable.

Someone care to tell me what are the benefits of this change? We can simply document that @x as an argument expands to @x = x inside the method, which is the current behavior.

@asterite
Copy link
Member Author

False alarm, I can still reproduce the bug:

class A
  property a : Int32
  property b : Int32

  def initialize(@a, @b = a)
  end
end

a = A.new(1)
puts a.b # => 0

@bew
Copy link
Contributor

bew commented Jul 14, 2018

To me your example above should be written:

@x : Int32?

def foo(@x)
  z = (x = @x) ? x.abs : 0
end

Assigning to a local var to have a smaller type. Right?

@jhass
Copy link
Member

jhass commented Jul 14, 2018

@asterite the point is that the behavior is entirely non-obvious as it's compiler generated code affecting (or I'd say polluting) the local scope. I would go as far as saying it leads to more brittle code as refactoring (renaming things) can easily lead to changed semantics (lvar access -> method call), as you basically experienced it yourself when trying to change this behavior.

Sometimes the longer variant (I wouldn't call it redundant) is better because it makes the code easier to read and follow. It's one less thing to keep in the back of your head when reading code, and this is a quite big one in my opinion.

@straight-shoota
Copy link
Member

Okay, I think we're discussing the general issue with ivar arguments in too many different places where it's mostly off-topic. Let's have a proper discussion reflecting the already mentioned arguments and current issues with this.

@RX14 RX14 removed this from the Next milestone Jul 30, 2018
@RX14 RX14 added this to the 0.26.0 milestone Jul 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants