
Robin Winslow
on 19 November 2019

Avoiding dropped connections in nginx containers with “STOPSIGNAL SIGQUIT”


(Also published on my blog at robinwinslow.uk)

Update: The default used in the official nginx docker image was changed from SIGTERM to SIGQUIT in November 2020, so this should no longer be an issue for Docker or Kubernetes users.

nginx is a very popular web server.

It may have just become the most popular web server – Netcraft’s October 2019 survey found nginx had 33% market share. This may be thanks to growth in Kubernetes adoption (most Kubernetes installations use nginx as the default ingress controller).

Unsurprisingly, its use within Kubernetes configurations is just as popular. 33 thousand public projects on GitHub use the nginx image in configs, and at least as many run it in their own Docker images.

We run nginx in the images for our 15 static websites running on our team’s Charmed Kubernetes cluster.

Closing connections in usn.ubuntu.com

About 2 years ago we noticed a problem in downloads of our USN database: many clients were having their connections reset during the download.

This may have had a few causes, but one likely culprit is Kubernetes restarting or replacing containers. This would happen when Kubernetes reschedules pods to respond to load. It will also happen every time we release a new version of the site.

The trouble with SIGTERM and nginx

When we release a new version of a site to Kubernetes, we first build and push a new image to the registry, then we ask Kubernetes to gradually roll out new containers based off the new image.

Kubernetes will gradually switch over to the new containers, removing the old ones as it goes. It does this using the same mechanism as docker stop: It will send a SIGTERM signal to the container, allow 30 seconds for it to stop gracefully, then send SIGKILL. It expects SIGTERM to result in a graceful shutdown:

It’s important that your application handle termination gracefully so that there is minimal impact on the end user and the time-to-recovery is as fast as possible!

In practice, this means your application needs to handle the SIGTERM message and begin shutting down when it receives it. This means saving all data that needs to be saved, closing down network connections, finishing any work that is left, and other similar tasks.

From

Unfortunately, this isn’t what happens when nginx receives a TERM signal:

The master process supports the following signals:

TERM, INT    fast shutdown
QUIT         graceful shutdown

From

What that “fast shutdown” means is that any open connections will be immediately closed. So if Kubernetes sends SIGTERM to my running containers, any open connection to those containers will break.

We can illustrate this using a Dockerfile which simply uses nginx to proxy to httpbin.org/delay/10. When we stop the container, we can see the container exit in less than a second, and curl exit with “Empty reply from server”:

$ docker build -t delay-image .
…
Successfully tagged delay-image:latest

$ docker run --rm --name delay-ctnr --detach --publish 80:80 delay-image
E5F6789...

$ curl -I localhost 2>&1 | egrep 'curl:|HTTP' &
[1] 14531

$ /usr/bin/time --format '%E' docker stop delay-ctnr
curl: (52) Empty reply from server
delay-ctnr
0:00.56
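
The Dockerfile itself isn't reproduced in this post. As a rough sketch of what such a setup could look like, the script below writes a small nginx config and a Dockerfile into a directory. The directory name `delay-demo`, the file name `default.conf`, and the exact config are my assumptions for illustration, not the original files behind the transcript above.

```shell
# Hypothetical reconstruction: names and config here are assumptions,
# not the files used for the transcript.
mkdir -p delay-demo

# nginx site config that proxies every request to httpbin's 10-second delay endpoint
cat > delay-demo/default.conf <<'EOF'
server {
    listen 80;
    location / {
        proxy_pass http://httpbin.org/delay/10;
    }
}
EOF

# Dockerfile based on the official image, replacing the default site config
cat > delay-demo/Dockerfile <<'EOF'
FROM nginx
COPY default.conf /etc/nginx/conf.d/default.conf
EOF
```

With those files in place, `docker build -t delay-image delay-demo` should produce an image that can be run and stopped as in the transcript.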

SIGQUIT to the rescue

What we need instead is to send a SIGQUIT signal to ask nginx to perform a graceful shutdown, where it will wait for open connections to close before quitting.
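
For a container that is already running, one way to try this by hand is `docker kill`, which can send a signal other than the default SIGKILL via its `--signal` option. A minimal sketch follows; the function name `send_sigquit` is hypothetical.

```shell
# Send SIGQUIT to a running container so that nginx (as the container's
# main process) starts a graceful shutdown instead of a fast one.
# The function name is illustrative only.
send_sigquit() {
    docker kill --signal=QUIT "$1"
}
```

For example, `send_sigquit delay-ctnr` would signal the container used in the transcript above. Unlike `docker stop`, `docker kill` returns straight away rather than waiting for the container to exit.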

If we add STOPSIGNAL SIGQUIT to the Dockerfile then we can instead see curl exit gracefully with a successful 200 response, and docker stop wait (in this case, for 8.51 seconds) until it has done so:

$ docker build -t delay-image .
…
Successfully tagged delay-image:latest

$ docker run --rm --name delay-ctnr --detach --publish 80:80 delay-image
A1B2C3D4...

$ curl -I localhost 2>&1 | egrep 'curl:|HTTP' &
[1] 14533

$ /usr/bin/time --format '%E' docker stop delay-ctnr
HTTP/1.1 200 OK
delay-ctnr
0:08.51
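
As a sketch of where the instruction goes, the script below writes a minimal Dockerfile that only sets the stop signal on top of the official image. The directory name `graceful-nginx` is an assumption for illustration; a real Dockerfile would also copy in your own config.

```shell
# Illustrative only: a minimal Dockerfile that changes nothing but the stop signal.
mkdir -p graceful-nginx

cat > graceful-nginx/Dockerfile <<'EOF'
FROM nginx
# Make "docker stop" send SIGQUIT (graceful shutdown) instead of SIGTERM
STOPSIGNAL SIGQUIT
EOF
```

If you can't rebuild the image, `docker run` also accepts a `--stop-signal` option (for example `--stop-signal SIGQUIT`), which overrides the image's setting for that one container.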

Bear in mind that `docker stop` will only wait 10 seconds by default, and Kubernetes will only wait 30, before killing the container with `SIGKILL`. So if you might need longer than that to close connections then you may need to increase the grace period.
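
As a sketch of raising that limit in Kubernetes, the script below writes a pod manifest that sets `terminationGracePeriodSeconds`. The pod name, the file name `delay-pod.yaml`, and the 60-second value are all assumptions chosen for illustration.

```shell
# Hypothetical manifest: names and the 60-second value are illustrative.
cat > delay-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: delay-pod
spec:
  # How long Kubernetes waits after the stop signal before sending SIGKILL
  terminationGracePeriodSeconds: 60
  containers:
    - name: nginx
      image: delay-image
EOF

# With plain Docker, the timeout is given per stop command instead:
#   docker stop -t 60 delay-ctnr
```

In a Deployment, the same field sits under the pod template's `spec`.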

An exception to this rule is if you are relying on unix sockets in your nginx config. In this case, SIGQUIT will fail to close the sockets properly, resulting in containers potentially not restarting correctly. So if you’re using sockets, be careful with SIGQUIT.

Why isn’t this default?

I can’t find any reference to why nginx made the decision not to treat SIGTERM more gracefully, as a graceful termination seems to be the norm with SIGTERM:

The SIGTERM signal is sent to a process to request its termination. […] This allows the process to perform nice termination releasing resources and saving state if appropriate.

From

What is a shame is that the Dockerfile for the default nginx Docker image explicitly uses STOPSIGNAL SIGTERM, meaning that anyone using the default image (and anyone copying it) will get this connection closing issue.

They have made the decision to use SIGTERM rather than SIGQUIT because of the issue with sockets. But if you’re not using sockets, you should definitely use SIGQUIT instead.

Edit: Since November 2020, the official docker image now uses SIGQUIT, so termination should now be graceful in Docker and Kubernetes.
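
If you want to check which stop signal a particular image declares, one option is to read it from the image metadata with `docker inspect`. A minimal sketch follows; the function name `image_stop_signal` is hypothetical.

```shell
# Print the stop signal recorded in an image's config.
# Prints an empty line if the image doesn't set STOPSIGNAL.
image_stop_signal() {
    docker inspect --format '{{.Config.StopSignal}}' "$1"
}
```

For example, after `docker pull nginx`, running `image_stop_signal nginx` should show whether the copy of the image you have uses SIGQUIT or the older SIGTERM.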
