Use rabbitmq for queueing #169

mayhem · 2017-04-24T13:01:00Z

This PR swaps out the Redis pubsub for a proper, less memory-hungry queue. I chose RabbitMQ since we already use that in production. This PR also improves the error handling and resilience of the writer containers, which now pause on startup if a service isn't available yet.

New Redis keys were needed to keep track of queue stats, so I created a new redis_keys.py module as a central place to track them.

The per-user listen counts have a bug in this PR, but I am already working on an improved PR for counting user and total listens using influxdb's continuous queries, which make things like counting listens much easier.

paramsingh · 2017-04-24T15:00:32Z

bigquery-writer/bigquery-writer.py

+                self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=config.RABBITMQ_HOST, port=config.RABBITMQ_PORT))
+                break
+            except Exception as e:
+                self.log.error("Cannot connect to rabbitmq: %s, sleeping 2 seconds")


Seems like you forgot to add the error message here.

Good catch.

paramsingh · 2017-04-24T15:01:17Z

All tests pass, looks good to me. 👍

gentlecat · 2017-04-25T12:09:38Z

bigquery-writer/bigquery-writer.py

+                self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=config.RABBITMQ_HOST, port=config.RABBITMQ_PORT))
+                break
+            except Exception as e:
+                self.log.error("Cannot connect to rabbitmq: %s, sleeping 2 seconds" % str(e))


"retrying in 2 seconds" might be a more useful message.

In other places too.
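A minimal sketch of the retry loop discussed here, with the exception actually passed to the log call and the "retrying in" wording. The function name connect_with_retry and its parameters (connect, retry_delay, max_attempts, sleep) are assumptions for illustration; the PR makes the connection call inline.

```python
import logging
import time

log = logging.getLogger("writer")


def connect_with_retry(connect, retry_delay=2, max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, logging each failure.

    connect      -- zero-argument callable returning a connection (hypothetical
                    parameter; in the PR the call is made inline)
    retry_delay  -- seconds to wait between attempts
    max_attempts -- give up after this many failures; None means retry forever
    sleep        -- injectable so the loop can be exercised without waiting
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except Exception as e:
            # Pass the exception as a logging argument so %s is actually filled in.
            log.error("Cannot connect to rabbitmq: %s, retrying in %d seconds", e, retry_delay)
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(retry_delay)
```

With pika, connect could be a lambda wrapping the pika.BlockingConnection(pika.ConnectionParameters(...)) call shown in the diff above.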

gentlecat · 2017-04-25T12:11:40Z

bigquery-writer/bigquery-writer.py

+        return obj.callback(ch, method, properties, body)
+
+
+    def callback(self, ch, method, properties, body):


Both ch and properties arguments are unused.

gentlecat · 2017-04-25T12:15:10Z

bigquery-writer/bigquery-writer.py

+
+
+    @staticmethod
+    def static_callback(ch, method, properties, body, obj):


It might be better to make obj a first argument since it's the most important in the context of this method. The rest can be passed as keyword arguments without the need to explicitly specify them. These can be inferred from the callback method.

This is the callback called by pika -- I can't change the signature of the callback.

Assuming it's the one in the start below, it can probably be modified like this:

    self.channel.basic_consume(
        lambda ch, method, properties, body: self.static_callback(self, ch, method, properties, body),
        queue='unique'
    )

(haven't tested it)
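For illustration, a sketch of another way to bind the instance: passing the bound method self.callback directly, so no static wrapper or lambda is needed. FakeChannel and Writer are hypothetical stand-ins, not pika classes; FakeChannel only mimics the fact that the consumer callback is invoked with four positional arguments (channel, method, properties, body). The argument order of the real basic_consume differs between pika versions, so check the installed one.

```python
class FakeChannel(object):
    """Stand-in for a pika channel (hypothetical, for illustration only).

    It stores the consumer callback and invokes it with the same four
    positional arguments that pika passes: channel, method, properties, body.
    """

    def __init__(self):
        self.consumer = None

    def basic_consume(self, callback, queue):
        self.consumer = callback

    def deliver(self, body):
        self.consumer(self, "method-frame", "properties", body)


class Writer(object):
    def __init__(self, channel):
        self.channel = channel
        self.received = []

    def callback(self, ch, method, properties, body):
        # ch and properties are unused here but must stay in the signature,
        # because the caller passes them positionally.
        self.received.append(body)

    def start(self):
        # self.callback is a bound method: the instance travels with it,
        # so no static wrapper or extra obj argument is needed.
        self.channel.basic_consume(self.callback, queue="unique")
```

This keeps the signature pika expects while still giving the callback access to the writer instance.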

gentlecat · 2017-04-25T12:23:09Z

influx-writer/influx-writer.py

-class InfluxWriterSubscriber(RedisPubSubSubscriber):
-    def __init__(self, ls, influx, redis):
-        RedisPubSubSubscriber.__init__(self, redis, KEYSPACE_NAME_INCOMING, __name__)
+class InfluxWriterSubscriber(object):


I see that InfluxWriterSubscriber and BigQueryWriter have some common methods related to writing. Having an abstract ListenWriter class can be useful.

...or not even abstract since connect_to_rabbitmq is exactly the same.

I thought about that, but I didn't really want to create another py module just for one function. I'll do this should I need to add another common method.

Fair enough. My point about an abstract writer class still stands. Other methods that can go in it are callback, static_callback, and start. Here's some more information about them and why they can be useful:

https://www.python.org/dev/peps/pep-3119/

http://stackoverflow.com/a/3571946/272770

https://docs.python.org/2/library/abc.html

That last comment was great!

(still not enough to sway me, but still. :) )

I've spent over an hour trying to remove some args, but it always fails now. I'd rather spend my time doing other things than fixing code that is working.
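The shared writer class suggested in this thread could look roughly like the sketch below. ListenWriter comes from the review comment; its constructor argument connect, the InMemoryWriter subclass, and the method bodies are assumptions for illustration, not code from the PR. The sketch uses Python 3 syntax, whereas the linked Python 2 docs set `__metaclass__ = abc.ABCMeta` instead.

```python
import abc


class ListenWriter(abc.ABC):
    """Hypothetical shared base class for the two writers."""

    def __init__(self, connect):
        # connect is a zero-argument callable that returns a connection;
        # it stands in for the identical connect_to_rabbitmq bodies.
        self._connect = connect
        self.connection = None

    def connect_to_rabbitmq(self):
        self.connection = self._connect()

    def callback(self, ch, method, properties, body):
        # Shared plumbing: hand the message body to the subclass.
        return self.write(body)

    @abc.abstractmethod
    def write(self, listens):
        """Persist listens; each writer supplies its own implementation."""


class InMemoryWriter(ListenWriter):
    """Toy subclass standing in for InfluxWriterSubscriber or BigQueryWriter."""

    def __init__(self, connect):
        super().__init__(connect)
        self.stored = []

    def write(self, listens):
        self.stored.append(listens)
        return True
```

Instantiating ListenWriter directly raises TypeError, which is the main thing the abstract base adds over a plain shared module.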

gentlecat · 2017-04-25T12:25:40Z

influx-writer/influx-writer.py

+                self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=config.RABBITMQ_HOST, port=config.RABBITMQ_PORT))
+                break
+            except Exception as e:
+                self.log.error("Cannot connect to rabbitmq: %s, sleeping 2 seconds")


%s is there, but the string is not formatted.

gentlecat · 2017-04-25T12:36:59Z

bigquery-writer/bigquery-writer.py


        # if we're not supposed to run, just sleep
        if not config.WRITE_TO_BIGQUERY:
-            sleep(1000)
+            sleep(66666)


Why has the duration increased?

The actual value doesn't matter as long as it is large to prevent the container from restarting too often.

gentlecat · 2017-04-25T12:38:24Z

bigquery-writer/bigquery-writer.py

@@ -97,38 +160,37 @@ def start(self):
            sleep(1000)
            return


Can all the checks above be consolidated in one function? They all call sleep and return.

And why is sleep call necessary?

If there is no sleep, the container will just keep restarting, which just sucks resources.

I don't really see the need to consolidate them in one function -- I don't believe in creating new functions just to keep functions below an artificial length. I believe that makes the code harder to read in the end.
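For reference, one shape the consolidation asked about could take. This is only a sketch: the (ok, message) pair structure, the names idle_reason and start, and the run parameter are all hypothetical, and the PR keeps the checks inline.

```python
import time


def idle_reason(checks):
    """Return the message of the first failing check, or None if all pass.

    checks -- iterable of (ok, message) pairs (hypothetical structure)
    """
    for ok, message in checks:
        if not ok:
            return message
    return None


def start(checks, run, sleep=time.sleep, idle_seconds=1000):
    reason = idle_reason(checks)
    if reason is not None:
        # Sleep instead of exiting immediately, so the container is not
        # restarted in a tight loop.
        sleep(idle_seconds)
        return reason
    return run()
```

Whether this reads better than the inline checks is the judgment call debated above.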

gentlecat · 2017-04-25T12:41:38Z

docker/docker-compose.test.yml

@@ -18,6 +18,13 @@ services:
      INFLUXDB_HTTP_LOG_ENABLED: 'true'
      INFLUXDB_CONTINUOUS_QUERIES_LOG_ENABLED: 'false'

+  rabbitmq:
+    image: rabbitmq:3.6.9


We use version 3.6.5 in production. https://github.com/metabrainz/docker-server-configs/blob/master/scripts/services.sh#L481

gentlecat · 2017-04-25T12:44:53Z

influx-writer/influx-writer.py

+        return ret
+
+
+    def write(self, listen_dicts):


This method is way too long and complex. I'd at least split it up in multiple methods that are easier to understand and test.

I prefer to keep it as it is, in order to ensure readability.

That's my point though, it's kind of difficult to read.

Example of one of the changes that can be made:

    @staticmethod
    def listens_time_range(listens):
        """Calculate the time range that a set of listens covers.

        Args:
            listens (list): List of dictionaries with listen data.

        Returns:
            Tuple with two values: minimum time and maximum time.
        """
        min_time = 0
        max_time = 0
        for listen in listens:
            t = int(listen['listened_at'])
            if not max_time:
                min_time = max_time = t
                continue
            if t > max_time:
                max_time = t
            if t < min_time:
                min_time = t
        return min_time, max_time

    def write(self, listen_dicts):
        submit = []
        unique = []
        min_time, max_time = self.listens_time_range(listen_dicts)
        ...

I don't need to know how listens_time_range is implemented to understand how write works. listens_time_range is also very easy to test now.

You left out the part where the user_name is determined for this batch of listens, so now the clean single-purpose function needs to have two purposes or the data needs to be processed a second time.

Don't know if all the listens have user_name in them. If yes, then it can be retrieved from the first one. If not, get from the first one that's encountered.

My snippets are examples that might not work if you just copy them. I haven't tested everything.
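A sketch of how both concerns raised in this thread could be handled in a single pass over the batch. The function name summarize_listens is hypothetical, and it assumes each listen is a dict with 'listened_at' and an optional 'user_name', taking the first user_name encountered as suggested above. Unlike the snippet above, it returns None rather than 0 for an empty batch.

```python
def summarize_listens(listens):
    """Return (min_time, max_time, user_name) for a batch in one pass.

    Assumes each listen is a dict with 'listened_at' and, optionally,
    'user_name'; the first user_name encountered is used.
    """
    min_time = max_time = None
    user_name = None
    for listen in listens:
        t = int(listen['listened_at'])
        if min_time is None or t < min_time:
            min_time = t
        if max_time is None or t > max_time:
            max_time = t
        if user_name is None:
            user_name = listen.get('user_name')
    return min_time, max_time, user_name
```

The function still has no dependency on the writer, so it remains easy to test on its own.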

gentlecat · 2017-04-25T12:46:27Z

influx-writer/influx-writer.py

 DUMP_JSON_WITH_ERRORS = False
+ERROR_RETRY_DELAY = 3 # number of seconds to wait until retrying an operation


It seems like this value is used in only one place. Can it also be used in the other places in this module that retry?

gentlecat · 2017-04-25T12:51:20Z

Some of the issues I pointed out can be detected with Pylint.

No longer relevant.

mayhem added 18 commits April 11, 2017 16:56

Proof of concept of RabbitMQ!

c78d131

Finish getting rabbitmq pipeline set up provisionally. It works, but …

3dc13a1

…a lot of error handling still needs to be added.

Update current status page. Ack writing listens, make queues persiste…

db78487

…nt, improve error handling

Automatically restart the connection in case it gets dropped

ec43ac5

Store exact incoming/unique counts in redis

2fb0644

Add exception handling to the queue publisher

fbccac4

Fix a syntax error

f94f13b

Handle a missing rabbitmq definition with grace by waiting for 2 seco…

f5b1f8e

…nd and exiting, so that the new container can give the right defs

Merge branch 'rabbitmq' of github.com:metabrainz/listenbrainz-server …

19d2c1f

…into rabbitmq

Make the initial connection to rabbitmq more resilient.

b3457a7

Remove old redis_pubsub and redis-consumer cruft

05b765c

Fix integration tests.

bbc5664

Ensure that rabbitmq comes up in time

0b77049

Add redis_keys.py to collect redis keys in one place

082f020

Refactoring code to improve error handling

2cd3cbe

Merge branch 'rabbitmq' of github.com:metabrainz/listenbrainz-server …

70da744

…into rabbitmq

Finished cleaning error handling in influx writer

630c176

Finishing hardening bigquery-writer

dcdae6b

mayhem requested a review from alastair April 24, 2017 13:01

paramsingh reviewed Apr 24, 2017

Pass the missing error message

43b14e7

gentlecat previously requested changes Apr 25, 2017

mayhem added 5 commits April 25, 2017 15:38

Use the same rabbitmq as in production

d40bb54

Various PR improvements

c84c4a1

Add missing sleep

111848a

Improve error message and sleep constants

e23bdc3

Merge branch 'rabbitmq' of github.com:metabrainz/listenbrainz-server …

57130cc

…into rabbitmq

mayhem merged commit 3c92cde into master Apr 26, 2017

mayhem deleted the rabbitmq branch July 6, 2017 15:16
