LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

paramsingh · 2017-06-17T19:59:00Z

So I spent the day looking through the influxdb-python source code. And the reason this was so hard to debug is a combination of random weird behaviour in influx with backslashes and some problems in our code.

Influxdb-python automatically escapes measurement names when we use write_points which can be seen here, so when we send a measurement with name "para\m" to write_points, it shows up in influx as "para\\m". As a result, the line protocol only requires escaping of \n characters. On the other hand, the query language requires escaping of everything and since influx had replaced the \ in the measurement name with \\, we need FOUR backslashes in the query to get the data.

So, I have changed the code so that it does this and added a bunch of comments explaining stuff as best I could. I expect more changes will be requested to make it clearer. Also, I have upgraded to influx 1.2.4 and influxdb-python 4.1.0 .

Also, there is some weird behaviour when we backslash quotes in influx. Influx SOMETIMES totally ignores backslashes present in measurement names, if they come before quotes. This behaviour is present in influx itself (not influxdb-python) and I am not sure if this is a bug or if I am misunderstanding something. For example:

Backslash away from quotes:

> insert "random\user"",val=1 hello=1
> show measurements
name: measurements
name
----
"random\user""

> select * from "\"random\\user\"\""
name: "random\user""
time                hello val
----                ----- ---
1497719311488498867 1     1

Backslash in front of a quote gets ignored

> insert "randomuser\"",val=1 hello=1
> show measurements
name: measurements
name
----
"random\user""
"randomuser""

> select * from "\"randomuser\"\""
name: "randomuser""
time                hello val
----                ----- ---
1497719359874380564 1     1

Stuff like this is hard to write code for, unless there is something that I am missing completely...

paramsingh · 2017-06-17T20:07:55Z

Clicked Create Pull Request before it was ready. 😅

paramsingh · 2017-06-17T20:10:27Z

listenbrainz/listen.py

@@ -196,7 +197,7 @@ def to_influx(self, measurement):
        # add the user generated keys present in additional info to fields
        for key, value in self.data['additional_info'].items():
            if key not in Listen.SUPPORTED_KEYS:
-                data['fields'][key] = value
+                data['fields'][key] = escape(value)


Did this because additional_info variables may have random newlines in them, which would lead to parse errors.

Influx has different behaviour for the line protocol and for the query language, so we have to use 2 backslashes for each backslash in the line protocol and 4 backslashes for each backslash in the query language. Change the code so that it does that everywhere. Also, because newlines cause parse errors in the line protocol replace newlines in the measurement names and the values of extraneous variables in additional_info with `\n`. I've also added a unit test and an integration test with a user with a weird user name.

Setuptools might have an error that is causing build errors See here: https://travis-ci.org/paramsingh/listenbrainz-server/builds/244051410 The bug seems to be known, see this: pypa/setuptools#951

mayhem

Thank you for doing all of this painful work! 🛡 💯

paramsingh closed this Jun 17, 2017

paramsingh reopened this Jun 17, 2017

paramsingh changed the title ~~Escape users~~ LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx Jun 17, 2017

paramsingh commented Jun 17, 2017

View reviewed changes

paramsingh added 3 commits June 19, 2017 17:19

Upgrade influx to 1.2.4 and influx python client to 4.1.0

6e0013e

Fix build errors due to bug in setuptools by upgrading

3b691c2

Setuptools might have an error that is causing build errors See here: https://travis-ci.org/paramsingh/listenbrainz-server/builds/244051410 The bug seems to be known, see this: pypa/setuptools#951

paramsingh force-pushed the escape-users branch from 43c1984 to 3b691c2 Compare June 19, 2017 11:50

Fix failing test for Listen class

51632f4

paramsingh mentioned this pull request Jun 19, 2017

move the redis keys to the correct module, fix a print issue #200

Merged

mayhem approved these changes Jun 19, 2017

View reviewed changes

mayhem merged commit 81c826c into metabrainz:master Jun 19, 2017

paramsingh deleted the escape-users branch June 19, 2017 15:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

paramsingh commented Jun 17, 2017 •

edited

paramsingh commented Jun 17, 2017

paramsingh Jun 17, 2017

mayhem left a comment

LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

LB-158 / LB-170: Fix listenstore code for users with special characters, add tests and upgrade influx #199

Conversation

paramsingh commented Jun 17, 2017 • edited

paramsingh commented Jun 17, 2017

paramsingh Jun 17, 2017

Choose a reason for hiding this comment

mayhem left a comment

Choose a reason for hiding this comment

paramsingh commented Jun 17, 2017 •

edited