Nginx Lua scripting to reload configuration

Bear with me... I know the first thing you're probably thinking is "why the hell would he want to do that?!" Well, let me explain...

I was recently building a horizontally scalable deployment of Nginx pods on Kubernetes. They had shared storage for the certificates they used to serve HTTPS content. Those certificates were generated automatically, if required, using Let's Encrypt when the container booted. The pods also implemented a distributed locking mechanism to ensure that only a single pod was ever responsible for updating the certificates at any given time.

This led me to a requirement for these pods to reload their config if the certificates were ever updated by another pod. My immediate thought was to use inotify; however, the shared storage was a FUSE filesystem, so inotify wasn't going to work. I'll go into the how in another post, but in summary we ended up in a situation where the pod that updated the certificates was responsible for notifying the other pods (via HTTP) that their configuration needed to be reloaded.

This resulted in me wanting to expose a URL on an internal port that triggers a zero-downtime Nginx configuration reload. In my case, this was server-ip:1337/reload. By default, Nginx can't shell out to system commands, but we can add that ability with Lua scripting support.
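
The notification step could be sketched in shell along these lines. This is only an illustration, not the implementation used in the deployment described here: the reload_url and notify_peers helper names are hypothetical, and peer discovery is assumed to happen elsewhere, with the pod IPs simply passed in as arguments.

```shell
# Hypothetical helpers; the names and the way peers are discovered are assumptions.

# Build the internal reload URL for one pod IP (port defaults to 1337).
reload_url() {
  printf 'http://%s:%s/reload\n' "$1" "${2:-1337}"
}

# Ask every peer passed as an argument to reload its config.
# Returns non-zero if any peer could not be reached or answered with an error.
notify_peers() {
  failed=0
  for ip in "$@"; do
    if ! curl --silent --fail --max-time 5 --output /dev/null "$(reload_url "$ip")"; then
      echo "reload request to $ip failed" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as notify_peers 10.0.0.11 10.0.0.12 (placeholder addresses) would then hit each pod's reload URL in turn.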

Getting Started

I do all of my work in Docker containers these days; you can see my implementation of this post here. All of the steps below are based on a CentOS 7 installation, so some of the library paths may differ depending on your setup (another reason to love Docker).

You will also need to build Nginx from source with Lua support. To do that, you need to get some components in advance:

LuaJIT
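wget --quiet http://luajit.org/download/LuaJIT-2.1.0-beta2.tar.gz
tar xzf LuaJIT-2.1.0-beta2.tar.gz
# cd into the extracted directory by name; "Lua*" would also match the tarball
cd LuaJIT-2.1.0-beta2
make
make install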

Then set some environment variables to tell nginx where to find the libraries.

export LUAJIT_LIB=/usr/local/lib  
export LUAJIT_INC=/usr/local/include/luajit-2.1

NGX Dev Kit
wget --quiet https://github.com/simpl/ngx_devel_kit/archive/v0.3.0.tar.gz  
tar xzf v0.3.0*  
mv ngx_devel_kit* /usr/local/src/ngx-devel-kit

NGX_Lua Module
wget --quiet https://github.com/openresty/lua-nginx-module/archive/v0.10.7.tar.gz  
tar xzf v0.10.7*  
mv lua-nginx-module* /usr/local/src/lua-nginx-module

Nginx

At this point you've got all the prerequisites; you just need to build Nginx from source. This is highly implementation dependent, so I'll only show the lines that are mandatory for compiling in the Lua modules:

# Download the latest source and build it
export nginxVersion="1.11.10"

cd /usr/local/src  
wget --quiet http://nginx.org/download/nginx-$nginxVersion.tar.gz  
tar -xzf nginx-$nginxVersion.tar.gz  
cd nginx-$nginxVersion

./configure \
--with-ld-opt="-Wl,-rpath,$LUAJIT_LIB" \
--add-module=/usr/local/src/lua-nginx-module \
--add-module=/usr/local/src/ngx-devel-kit

# .... and the rest of your build options (append them to the ./configure command above)

make  
make install  
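
Before moving on, it can be worth confirming that the Lua module actually made it into the binary. A minimal sketch, assuming the binary ended up at /usr/sbin/nginx (the path used by the reload command later in this post) and that the module was added from a directory named lua-nginx-module as above; the has_lua_module helper name is hypothetical. nginx -V prints the configure arguments to stderr, hence the redirect in the usage line.

```shell
# Hypothetical helper: succeeds if the given `nginx -V` output mentions
# the lua-nginx-module source directory that was passed to --add-module.
has_lua_module() {
  printf '%s\n' "$1" | grep -q -- 'lua-nginx-module'
}

# Usage (assumes nginx lives at /usr/sbin/nginx):
#   has_lua_module "$(/usr/sbin/nginx -V 2>&1)" && echo "Lua support compiled in"
```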

Site Configuration

This is the easy bit. I stuck the following in /etc/nginx/conf.d/internal.conf:

server {  
  listen 1337;

  location /reload {
    content_by_lua 'os.execute("/usr/sbin/nginx -s reload")';
  }
}

And voilà, you're able to reload Nginx from a URL with no downtime:

[nginx@nginx-3759635402-zcwgx /]$ curl -v http://127.0.0.1:1337/reload
* About to connect() to 127.0.0.1 port 1337 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 1337 (#0)
> GET /reload HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:1337
> Accept: */*
> 
< HTTP/1.1 200 OK  
< Server: nginx/1.11.10  
< Date: Thu, 09 Mar 2017 11:12:41 GMT  
< Content-Type: application/octet-stream  
< Transfer-Encoding: chunked  
< Connection: keep-alive  
<  
* Connection #0 to host 127.0.0.1 left intact

In the logs:

127.0.0.1 - - [09/Mar/2017:11:12:41 +0000] "GET /reload HTTP/1.1" 200 5 "-" "curl/7.29.0"  
2017/03/09 11:12:41 [info] 131#0: *85365 client 127.0.0.1 closed keepalive connection  
2017/03/09 11:12:41 [notice] 79#0: using the "epoll" event method  
2017/03/09 11:12:41 [notice] 79#0: start worker processes  
2017/03/09 11:12:41 [notice] 79#0: start worker process 169  
2017/03/09 11:12:41 [notice] 79#0: start worker process 170  
2017/03/09 11:12:42 [notice] 130#0: gracefully shutting down  
2017/03/09 11:12:42 [notice] 130#0: exiting  
2017/03/09 11:12:42 [notice] 130#0: exit  
2017/03/09 11:12:42 [notice] 131#0: gracefully shutting down  
2017/03/09 11:12:42 [notice] 131#0: exiting  
2017/03/09 11:12:42 [notice] 131#0: exit  
2017/03/09 11:12:42 [notice] 79#0: signal 17 (SIGCHLD) received  
2017/03/09 11:12:42 [notice] 79#0: worker process 130 exited with code 0  
2017/03/09 11:12:42 [notice] 79#0: worker process 131 exited with code 0  
2017/03/09 11:12:42 [notice] 79#0: signal 29 (SIGIO) received  

Disclaimer: With great power comes great responsibility. You're exposing a system command via an HTTP request. Think about that for a minute. In my world, that port is not internet facing and sits behind two internal load balancers. I also go a step further and lock down the IP addresses that can hit that URL to the subnet of the Kubernetes Nginx pods.
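
An IP restriction of that kind can be expressed with Nginx's standard allow and deny directives. The following is a sketch rather than the exact config from this deployment: it assumes the config file is rendered by a shell step at container boot, and both the render_internal_conf helper name and the example subnet 10.244.0.0/16 are placeholders for illustration.

```shell
# Hypothetical helper: print an internal.conf that only lets the given
# subnet (CIDR notation) reach /reload; every other client gets a 403.
render_internal_conf() {
  cat <<EOF
server {
  listen 1337;

  location /reload {
    allow $1;
    deny all;
    content_by_lua 'os.execute("/usr/sbin/nginx -s reload")';
  }
}
EOF
}

# Usage (the subnet is a placeholder):
#   render_internal_conf 10.244.0.0/16 > /etc/nginx/conf.d/internal.conf
```

Because access checks run before the content handler, a request from outside the allowed subnet is rejected without the reload command ever being executed.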