Throw an error when a unicode character is parsed in an identifier #2129
Conversation
Hm, I don't understand why the lexer is currently accepting those characters at all. The only token type that matches in
I don't get it either. Here's the bison trace of the master branch:
I really don't get that part:
I don't understand why the lexer consumes the rest of the input as soon as it reaches a unicode character. Isn't the ANY match supposed to consume the characters one by one? As I said in the PR, I didn't find anything better than explicitly throwing an error while lexing a unicode identifier.
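For illustration, the "explicitly throw an error" approach could look something like the following flex rules. This is a hypothetical sketch, not the actual patch: the token name `IDENTIFIER` and the exact identifier pattern are assumptions, and the key idea is only the second rule, which matches any non-ASCII byte attached to an identifier and reports an error instead of letting a catch-all ANY rule consume it.

```lex
/* Normal ASCII identifiers (pattern assumed for this sketch). */
[a-zA-Z_][a-zA-Z0-9_]*                  { return IDENTIFIER; }

/* Sketch of the fix: an identifier containing any byte >= 0x80
   (i.e. part of a UTF-8 sequence) is rejected explicitly, rather
   than falling through to a generic ANY rule. */
[a-zA-Z0-9_\x80-\xFF]*[\x80-\xFF][a-zA-Z0-9_\x80-\xFF]* {
    yyerror("invalid Unicode character in identifier");
}
```

Note that matching `[\x80-\xFF]` only works when flex generates an 8-bit scanner; in 7-bit mode those byte classes are unreachable.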
A random guess: a missing flex option?
That didn't help, unfortunately. Apparently 8-bit is the default unless you're using certain table compression options.
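For reference, flex does generate an 8-bit scanner by default, except when the `-Cf` or `-CF` table compression flags are used, where 7-bit becomes the default. Forcing it explicitly is a one-line option in the scanner spec:

```lex
%option 8bit   /* force an 8-bit scanner even under -Cf / -CF table compression */
```

Since 8-bit was already in effect here, this option alone wouldn't change the behaviour, which matches the observation above.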
Fixes #1374.
I am a bit unsure about my approach here, but I couldn't come up with anything better... I am open to any suggestions :)
Before
After