Persistent database connections with async workers on Django >= 1.6 fails #996

Starefossen · 2015-03-20T21:10:32Z

I noticed something when switching from sync to async eventlet workers in Gunicorn: Postgres, which had been working perfectly so far, began complaining that the maximum connection limit was reached and refused new connections. After enabling monitoring I could clearly see that a new connection to the database was established for each HTTP request. Even though CONN_MAX_AGE was set to 60, database connections were not reused.

After some digging around I found this post on serverfault, which describes a very similar situation and concludes that persistent database connections are indeed set up, but they are never reused in subsequent HTTP requests and are eventually closed when they reach the max age.

Is this a known issue? I cannot see that it has been mentioned in this issue tracker, nor on eventlet's.

django v1.7
gunicorn v19.3
eventlet v0.17
psycopg2 v2.6


ramiro · 2015-04-15T09:36:19Z

I'd change the title to something that makes clear the report is about Django >= 1.6 persistent DB connections feature.

ramiro · 2015-04-15T10:07:04Z

I'm going to try to add some extra info with an excerpt of an internal issue in which we are exploring & testing changing CONN_MAX_AGE from its current default value 0. We use the gevent worker:

Django handles DB connections like this:

If CONN_MAX_AGE = 0 then connections are opened when needed & closed at the end of every request. This is implemented with the request_finished signal.
If CONN_MAX_AGE != 0 (n or None) they are handled per [1]process: every worker process started by the web server can have one connection which it uses for the requests it's in charge of. It reuses it (or opens it if it's closed) when needed BUT doesn't necessarily close it at the end of requests. Connections are recycled every n seconds (by closing them if the TTL has been reached; this is implemented with both the request_started and request_finished signals) or never (None).
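A rough sketch of the bookkeeping described in the two cases above, in case it helps. This is a toy model, not Django's code: `FakeConnection`, `ensure_open`, `close_if_obsolete` and `handle_request` are made-up names, and the real check (as far as I know, `close_old_connections` calling `close_if_unusable_or_obsolete()`) also handles broken connections, which is left out here.

```python
import time


class FakeConnection:
    """Toy stand-in for a Django database wrapper (hypothetical, not Django's class)."""

    def __init__(self, conn_max_age, clock=time.monotonic):
        self.conn_max_age = conn_max_age  # 0, n seconds, or None
        self.clock = clock
        self.is_open = False
        self.close_at = None

    def ensure_open(self):
        """Open lazily on first use and remember when the connection expires."""
        if not self.is_open:
            self.is_open = True
            if self.conn_max_age is None:
                self.close_at = None  # None means: never expires
            else:
                self.close_at = self.clock() + self.conn_max_age

    def close_if_obsolete(self):
        """Close the connection if its TTL has been reached."""
        expired = self.close_at is not None and self.clock() >= self.close_at
        if self.is_open and expired:
            self.is_open = False


def handle_request(conn):
    conn.close_if_obsolete()  # what request_started triggers
    conn.ensure_open()        # the view touches the database
    conn.close_if_obsolete()  # what request_finished triggers
```

With `conn_max_age=0` the expiry time equals the opening time, so the check at the end of the request always closes the connection. With any other value the connection survives the request and is only closed by a later request on the same thread.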

If the web server uses a concurrency model based on greenlets (eventlet, gevent workers) then connections are still opened when needed on every request and...:

In the default case of CONN_MAX_AGE = 0 there aren't problems as they are closed at the end of the request.
Now, if we set CONN_MAX_AGE != 0 then what happens is that connections aren't closed at the end of requests. So they start to accumulate.
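To make the accumulation concrete, here is a small stdlib-only simulation. It uses one plain thread per request as a stand-in for one greenlet per request (my understanding is that the gevent/eventlet monkey patching makes thread-local storage greenlet-local, so the effect is analogous). All names here (`open_connections`, `get_connection`, `serve`) are invented for the illustration, and the set that holds the connections is a simplification of what the database server sees, not of Python's garbage collection.

```python
import threading

open_connections = set()     # stands in for the database server's view
registry_lock = threading.Lock()
local = threading.local()    # one connection slot per thread


def get_connection():
    """Return this thread's connection, opening one if the slot is empty."""
    if not hasattr(local, "conn"):
        local.conn = object()
        with registry_lock:
            open_connections.add(local.conn)
    return local.conn


def handle_request(conn_max_age):
    conn = get_connection()
    if conn_max_age == 0:
        # Non-persistent case: closed at the end of the request.
        with registry_lock:
            open_connections.discard(conn)
        del local.conn
    # Persistent case: left open for a later request on the SAME thread,
    # which never comes when every request gets a fresh thread/greenlet.


def serve(n_requests, conn_max_age):
    """Handle each request in a fresh thread; return connections left open."""
    open_connections.clear()
    threads = [
        threading.Thread(target=handle_request, args=(conn_max_age,))
        for _ in range(n_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(open_connections)


print(serve(50, 0))   # 0
print(serve(50, 60))  # 50
```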

Some options are:

Make connections persistent with CONN_MAX_AGE != 0 but switch the gunicorn worker to the 'sync' one (process-based)
Set CONN_MAX_AGE = 0 and switch to a DB connection pooling solution:
- External like pgBouncer, or
- Something like https://github.com/jneight/django-db-geventpool which activates psycopg greenlet support with psycogreen.
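For the external pooler variant of the second option, the Django side could look roughly like this. It is only a sketch: the database name, user, password, host and port are placeholders, and it assumes pgBouncer is already running and listening on 127.0.0.1:6432. On older Django versions such as the 1.7 from the report, the engine path would be `django.db.backends.postgresql_psycopg2` instead.

```python
# settings.py fragment; every value except CONN_MAX_AGE is a placeholder
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "appdb",
        "USER": "appuser",
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",  # assumed: pgBouncer, not Postgres itself
        "PORT": "6432",       # assumed pgBouncer listen port
        "CONN_MAX_AGE": 0,    # close at the end of every request
    }
}
```

The idea is that Django still opens and closes a connection per request, but those are cheap connections to the pooler, which keeps a bounded set of real connections to Postgres.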

[1] This is actually per thread, but in actual web server deployment scenarios with e.g. Linux + apache or gunicorn, multithreading model is rarely used. Sync, multi-process (which means a one-to-one process-thread mapping) or async greenlets-based concurrency models are the usual choices.

If the above analysis is correct, I'm not sure it's in gunicorn's scope to 'fix' this. I don't think it's in Django's scope either (i.e., [trying to] detect which concurrency model the web server it has been deployed to uses and adapting its handling of these persistent DB connections accordingly).

tilgovi · 2015-04-15T22:29:09Z

@ramiro I agree with your analysis.

tilgovi · 2015-12-31T07:27:57Z

I think, given the analysis, that this will not be fixed. If multiple, concurrent connections are made by an async worker the application needs to use a thread safe connection pool to get safe, correct behavior.
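For readers wondering what such a pool amounts to, here is a minimal sketch of a bounded, thread-safe pool built only on the standard library. It is not the implementation of any package mentioned in this thread; `BoundedPool`, `factory` and `max_size` are names made up for the example, and a real pool would also need to discard broken connections and expire old ones.

```python
import queue
import threading
from contextlib import contextmanager


class BoundedPool:
    """Hand out at most `max_size` connections at a time, reusing idle ones."""

    def __init__(self, factory, max_size):
        self._factory = factory              # callable that opens a connection
        self._idle = queue.LifoQueue()       # connections waiting to be reused
        self._slots = threading.BoundedSemaphore(max_size)
        self._lock = threading.Lock()
        self.created = 0                     # how many connections were opened

    @contextmanager
    def connection(self, timeout=None):
        # Block (or time out) when all connections are checked out.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free connection in the pool")
        try:
            try:
                conn = self._idle.get_nowait()
            except queue.Empty:
                with self._lock:
                    self.created += 1
                conn = self._factory()
            try:
                yield conn
            finally:
                # Simplification: always return the connection, even after errors.
                self._idle.put(conn)
        finally:
            self._slots.release()
```

Usage would be along the lines of `with pool.connection() as conn: ...` inside each request. However many requests run concurrently, the number of open connections stays at or below `max_size`, and callers wait instead of opening more.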

ghost · 2017-02-23T17:02:02Z

Re: "Now, if we set CONN_MAX_AGE != 0 then what happens is that connections aren't closed at the end of requests. So they start to accumulate."

Why is that an issue with async workers? As soon as an async worker completes a request it still closes the connection? Why would connections accumulate then? By accumulate you mean connections are getting more and more (and finally exceeding the DB's connection limit)?

I can understand that async workers produce more connections (because a single worker can handle more than 1 request at a time, hence opening more db connections) but once a certain number of connections is reached, it should converge.

Additionally can someone explain why/how https://github.com/jneight/django-db-geventpool would help in this case?

DanielStevenLewis · 2017-05-05T07:34:32Z

Could somebody (perhaps @ramiro or @tilgovi) please confirm that https://github.com/jneight/django-db-geventpool should actually help in this case? I think I ran into this problem but it kept occurring after installing django-db-geventpool (with the import from eventlet included, and the import from gevent not included). I was getting a log of "FATAL: sorry, too many clients already" which indicates that I opened up more than the allowed limit of connections to the database. CPU was also maxing out so perhaps that's why I kept getting the same error after installing django-db-geventpool...?

DanielStevenLewis · 2017-05-05T18:18:44Z

Following up from my last post, it seems possible that the CPU max out was (at least partially) caused by the issue @Starefossen raised above. http://stackoverflow.com/a/43799635/805141 indicates this is a possibility.

Also, shouldn't this issue be documented somewhere?

DanielStevenLewis · 2017-05-06T01:11:02Z

The lack of issue resolution that I experienced may have been caused by jneight/django-db-geventpool#21

The default gunicorn worker class is 'gevent', which is not compatible with django's CONN_MAX_AGE because it leaks database connections. See: benoitc/gunicorn#996 (comment) This changes the default from 60 to 0.

jxltom · 2018-09-20T08:16:34Z

Re: "Now, if we set CONN_MAX_AGE != 0 then what happens is that connections aren't closed at the end of requests. So they start to accumulate."

Why is that an issue with async workers? As soon as an async worker completes a request it still closes the connection? Why would connections accumulate then? By accumulate you mean connections are getting more and more (and finally exceeding the DB's connection limit)?

I can understand that async workers produce more connections (because a single worker can handle more than 1 request at a time, hence opening more db connections) but once a certain number of connections is reached, it should converge.

Additionally can someone explain why/how https://github.com/jneight/django-db-geventpool would help in this case?

I think it is because, if conn_max_age is not 0, established connections are only closed when a request starts or finishes. So if async workers die for some reason such as a timeout, they won't receive new requests and their db connections can never be closed (they can only be closed when a request starts or finishes).

As for django-db-geventpool, it is used when conn_max_age is 0. When it is 0, every new request creates a new db connection, so the db pool can improve performance.

Correct me if I'm wrong. :D

benoitc · 2018-09-21T07:43:47Z

In the current design, connections are kept alive to handle HTTP keepalive before being released. Incoming requests are each handled in a new greenlet.

The Django documentation states:

Caveats

Since each thread maintains its own connection, your database must support at least as many simultaneous connections as you have worker threads.

I guess the way Django persistent connections work is to associate them with the thread id. If threads are patched in the gevent or eventlet workers, it is probably creating a new connection per greenlet, which is limited by the maximum number of connections; that may explain what you observe. If it's that, then maybe Django could be patched to actually use the real thread on which it runs? Or the number of concurrent connections could be limited.
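One way to read the last suggestion is to cap concurrency on the gunicorn side. The fragment below is a sketch of a `gunicorn.conf.py` using the existing `workers`, `worker_class` and `worker_connections` settings; the numbers are illustrative, and `expected_peak_db_connections` is just a local variable for the arithmetic, not a gunicorn setting. It assumes CONN_MAX_AGE = 0 and that each in-flight request holds at most one database connection.

```python
# gunicorn.conf.py (illustrative values)
workers = 4
worker_class = "gevent"
worker_connections = 20  # max simultaneous clients per worker

# Not a gunicorn setting: rough ceiling on concurrent DB connections,
# assuming one connection per in-flight request and CONN_MAX_AGE = 0.
expected_peak_db_connections = workers * worker_connections
```

With these values the ceiling is 80, which stays under Postgres's default `max_connections` of 100. It does not make persistent connections work with greenlets; it only bounds how many can be open at once.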

zhangi · 2019-02-08T06:44:04Z

I made a patch to allow db pooling with async workers.
https://pypi.org/project/django-db-pooling/

patrik7 · 2020-04-23T07:31:50Z

Read this thread but I am lost. Does anyone know how to get DB connections properly reused when running with uvicorn?

gunicorn Application.asgi:application --max-requests 4000 -w 4 -k uvicorn.workers.UvicornWorker

I have the exact same issue of each request resulting in a new DB connection that never clears if I set CONN_MAX_AGE != 0

sandys · 2020-05-21T07:49:14Z

@patrik7 did you figure this out ?

patrik7 · 2020-05-21T08:25:03Z

@patrik7 did you figure this out ?

Nope, I gave up. I am still opening a new connection for every single HTTP request.

Starefossen changed the title ~~Persistent database connections with async workers~~ Persistent database connections with async workers on Django >= 1.6 fails Apr 15, 2015

tilgovi closed this as completed Dec 31, 2015

sevein mentioned this issue Apr 15, 2017

Switch to Django persistent connections from django-mysqlpool artefactual/archivematica#505

Closed

DanielStevenLewis mentioned this issue May 6, 2017

Docs for MAX_CONNS need clarification jneight/django-db-geventpool#21

Closed

hhstore mentioned this issue May 1, 2018

django issue fix: orm - too many connection hhstore/blog#19

Open

mtyaka mentioned this issue Jul 3, 2018

Make edx_django_service CONN_MAX_AGE default to 0. openedx-unsupported/configuration#4666

Merged

8 tasks

snopoke added a commit to dimagi/commcare-cloud that referenced this issue Nov 21, 2018

remove CONN_MAX_AGE

dad6c88

* setting is not respected by celery < v4.2 * celery/celery#4292 * not compatible with 'gevent' gunicorn worker class * benoitc/gunicorn#996 (comment)

snopoke mentioned this issue Nov 21, 2018

remove CONN_MAX_AGE dimagi/commcare-cloud#2429

Merged

easherma mentioned this issue Jan 15, 2020

Performance/build timeout cityofaustin/techstack#3754

Open

patrik7 mentioned this issue Apr 22, 2020

Not respecting Django's CONN_MAX_AGE database setting encode/uvicorn#386

Closed

jathanism mentioned this issue Mar 15, 2021

SSL error: decryption failed or bad record mac & SSL SYSCALL error: EOF detected nautobot/nautobot#127

Closed

gabicavalcante mentioned this issue Dec 24, 2021

Set Persistent Connections to a low value by default vintasoftware/django-react-boilerplate#558

Open

AetherUnbound mentioned this issue Feb 21, 2022

Postgres connection is crashing in production WordPress/openverse-api#522

Closed

1 task

lotfyhussein mentioned this issue May 4, 2022

Number of active clients is always increasing pgbouncer/pgbouncer#710

Closed

raphaelm mentioned this issue Dec 20, 2022

gunicorn 20+ breaks django with database connections? #2730

Closed
