Debugging Nginx Hostname – Web Service Development

Right now I’m writing a script to automatically deploy our application on a server. The simplest way to serve the application over HTTPS is generally to put an Nginx server in front of it as a reverse proxy, let it handle the Let’s Encrypt certificate, and then build the rest of the application from there.

The problem is that I’m running into a weird issue: after the HTTPS certificate is confirmed and deployed, nginx seems to ignore my server block entirely and falls back to the default server. If we check the logs, there are no accesses to our n.wsd.sh domain after that point, and any later requests seem to go to the default server.

2600:1f14:804:fd02:d76c:406b:8e38:9f15 - - [14/Jul/2021:02:39:15 +0000] "GET /.well-known/acme-challenge/tegGFP9H7z1pVmdjJXFw-03F29olh-gLMmL-Jf8-eSY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
2600:3000:1511:200::1f - - [14/Jul/2021:02:39:15 +0000] "GET /.well-known/acme-challenge/tegGFP9H7z1pVmdjJXFw-03F29olh-gLMmL-Jf8-eSY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
2600:1f16:269:da01:93df:3a01:5143:6ae0 - - [14/Jul/2021:02:39:15 +0000] "GET /.well-known/acme-challenge/tegGFP9H7z1pVmdjJXFw-03F29olh-gLMmL-Jf8-eSY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
2a05:d014:3ad:702:b6ff:6e35:49b3:a1bc - - [14/Jul/2021:02:39:15 +0000] "GET /.well-known/acme-challenge/tegGFP9H7z1pVmdjJXFw-03F29olh-gLMmL-Jf8-eSY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
93.174.93.12 - - [14/Jul/2021:02:41:10 +0000] "GET / HTTP/1.1" 400 173 "-" "-"
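
To make an excerpt like this easier to read at a glance, the log can be tallied by status code. This is only a minimal sketch, assuming nginx’s default "combined" log format; the function name count_statuses and the temporary sample file are made up for illustration and are not part of my deploy script.

```shell
#!/usr/bin/env bash

# count_statuses FILE: print "<status> <count>" for each HTTP status found in
# an access log. Assumes nginx's default "combined" log format.
count_statuses() {
    # Splitting on double quotes puts the request line in $2 and
    # " <status> <bytes> " in $3, so the status is the first word of $3.
    awk -F'"' '{ split($3, f, " "); n[f[1]]++ }
               END { for (s in n) print s, n[s] }' "$1" | sort
}

# Demo on two lines taken from the excerpt above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
2600:3000:1511:200::1f - - [14/Jul/2021:02:39:15 +0000] "GET /.well-known/acme-challenge/tegGFP9H7z1pVmdjJXFw-03F29olh-gLMmL-Jf8-eSY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
93.174.93.12 - - [14/Jul/2021:02:41:10 +0000] "GET / HTTP/1.1" 400 173 "-" "-"
EOF
count_statuses "$sample"
rm -f "$sample"
```

On a real server the same function could be pointed at whichever access log the server block writes to, for example the per-host file named in the access_log directive.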

The weird part is that, from nginx’s point of view, there shouldn’t be anything that has inherently changed between before and after certbot is called. That means we need to work out exactly where the break is happening and how to test for it. I will start with a control: install nginx and the certificate manually, and see if it works. Then we will run the install from the script I am using and see if that works. Depending on the result, we will attempt a mixed script. Last, we will assess the results of the tests and try to come up with an appropriate fix based on that information.
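
One cheap check at each step is to list the hostnames nginx has actually loaded, since a request whose Host header matches no server_name is handed to the default server for that port. The sketch below is an assumption-laden helper, not something from my script: list_server_names is a hypothetical name, and the parsing is a simple awk pass rather than a real nginx config parser.

```shell
#!/usr/bin/env bash

# list_server_names: read nginx configuration text on stdin and print every
# name given to a server_name directive, one per line.
list_server_names() {
    awk '$1 == "server_name" {
             for (i = 2; i <= NF; i++) {
                 # sub() returns 1 when this field held the closing ";"
                 end_seen = sub(/;.*$/, "", $i)
                 if ($i != "") print $i
                 if (end_seen) break   # ignore trailing comments
             }
         }'
}

# Demo on an inline sample instead of a live server.
list_server_names <<'EOF'
server {
    listen 80 default_server;
    server_name _;
}
server {
    listen 80;
    server_name control.wsd.sh;
}
EOF
```

On the server itself, the input would come from `nginx -T`, which tests the configuration and dumps everything nginx loaded, so something like `nginx -T 2>/dev/null | list_server_names` before and after running certbot should show whether the hostname is still there.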

For each test we will use a new VPS in the Japan region on Linode and create a new subdomain. As a side note, these will be temporary; I’d rather not spend $15 a month until the end of time to keep this information stored on a blog. So we leave the version numbers in case independent verification is needed at a later time.

Test 01 – control.wsd.sh

We’ll do this manually first. Here is a list of the commands we run.

# yum update -y
# yum upgrade -y
# setenforce 0
# yum install git vim yum-utils wget -y
# dnf install epel-release -y

# yum install nginx -y
# systemctl start nginx
# systemctl enable nginx
# dnf install certbot python3-certbot-nginx -y

# firewall-cmd --permanent --add-service=http
# firewall-cmd --permanent --add-service=https
# firewall-cmd --reload

# echo "server {
    listen          80;
    listen          [::]:80;
    server_name     control.wsd.sh;
    access_log      /var/log/nginx/control.wsd.sh.access.log;
    error_log       /var/log/nginx/control.wsd.sh.error.log;
    
    root            /usr/share/nginx/html;
    location ^~ /.well-known/acme-challenge/ {
        default_type \"text/plain\";
    }
    
    location / {
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header HOST \$http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_pass http://localhost:3000;
        proxy_redirect off;
    }
    
}" > /etc/nginx/conf.d/my-app.conf
# systemctl restart nginx
# certbot --nginx -d control.wsd.sh

Since we’re throwing everything back to our application server on port 3000, we expect to see a 502 error, but with https enabled.

And that’s exactly what we see. First test done, exactly as expected. In the next test we will attempt to re-create the problem.
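
Rather than eyeballing the browser each time, the same check can be done from a terminal. This is a rough sketch under my own assumptions: https_status and describe_status are hypothetical helper names, and the wording of the messages is just my reading of what each outcome means for this test.

```shell
#!/usr/bin/env bash

# https_status HOST: print the HTTP status code returned over HTTPS.
# curl prints 000 when it gets no HTTP response at all (for example when the
# connection or the TLS handshake fails).
https_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "https://$1/"
}

# describe_status CODE: translate a status code into what it means here.
describe_status() {
    case "$1" in
        502) echo "nginx answered over HTTPS; nothing is listening on port 3000 (expected)" ;;
        000) echo "no HTTPS response: certificate or connection problem" ;;
        *)   echo "unexpected status $1: the request may have hit a different server block" ;;
    esac
}

# Demo without touching the network.
describe_status 502
```

Against a live host the two would be combined as `describe_status "$(https_status control.wsd.sh)"`.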

Test 02 – Script

In this test we will try to replicate the problem. We will do almost exactly the same as in the test above, except we will use a script. If the script has the same issue we’ve been having, we will get a 404 error with no https cert.

The difference is that we will use a bash script with variables, so we will create a file named install.sh.

my_hostname="script.wsd.sh"
my_email="my_email@wsd.co.jp"

yum update -y
yum upgrade -y
setenforce 0
yum install git vim yum-utils wget -y
dnf install epel-release -y

yum install nginx -y
systemctl start nginx
systemctl enable nginx
dnf install certbot python3-certbot-nginx -y

firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --reload

echo "server {
    listen          80;
    listen          [::]:80;
    server_name     $my_hostname;
    access_log      /var/log/nginx/$my_hostname.access.log;
    error_log       /var/log/nginx/$my_hostname.error.log;
    
    root            /usr/share/nginx/html;
    location ^~ /.well-known/acme-challenge/ {
        default_type \"text/plain\";
    }
    
    location / {
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header HOST \$http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_pass http://localhost:3000;
        proxy_redirect off;
    }
    
}" > /etc/nginx/conf.d/my-app.conf
systemctl restart nginx
certbot --nginx -d "$my_hostname" --non-interactive --agree-tos -m "$my_email" --redirect

Okay, and we get a 502 with https, which is what we want and expect to happen. I was thinking there might be an issue with nginx at this point, but it’s working as I would normally expect. So I think the next step is to go back and run my exact script as-is and see if it works or not. If it doesn’t, that means I have an error somewhere in my install script.
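
As an aside, the config file in install.sh is written with one long double-quoted echo, which means every nginx variable needs a backslash and any inner double quotes are swallowed by the shell. A heredoc with a quoted delimiter sidesteps both. The following is only a sketch of that alternative, not what I actually ran: render_server_block and the __HOST__ placeholder are names I made up for the example.

```shell
#!/usr/bin/env bash

# render_server_block HOSTNAME: print an nginx server block for HOSTNAME.
render_server_block() {
    # The quoted delimiter ('EOF') switches off shell expansion inside the
    # heredoc, so $remote_addr and "text/plain" pass through untouched.
    # __HOST__ is a made-up placeholder that sed swaps for the hostname.
    sed "s/__HOST__/$1/g" <<'EOF'
server {
    listen          80;
    listen          [::]:80;
    server_name     __HOST__;
    access_log      /var/log/nginx/__HOST__.access.log;
    error_log       /var/log/nginx/__HOST__.error.log;

    root            /usr/share/nginx/html;
    location ^~ /.well-known/acme-challenge/ {
        default_type "text/plain";
    }

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header HOST $http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_pass http://localhost:3000;
        proxy_redirect off;
    }
}
EOF
}

# Demo: print the rendered block instead of writing under /etc/nginx.
render_server_block "script.wsd.sh"
```

In the install script the output would be redirected to the config file, for example `render_server_block "$my_hostname" > /etc/nginx/conf.d/my-app.conf`, followed by `nginx -t` to confirm that the result parses before restarting.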

Full Script

My original goal was to try and isolate the issue by focusing on the nginx portion of the install script. But that part didn’t seem to run into any issues. So I’ll run my full script up to the point where the https certificate is created, and we’ll see if that works or not. If it works, it means that I ran into a fluke or some kind of bad luck where I question my sanity. Otherwise, if it doesn’t work, we can try commenting lines out to identify what the issue is.

So three tests, three results that match what I would expect, but none of them replicated the error that I was running into earlier. And that makes me worried that I’m taking crazy pills, as this is a really weird error that I would never really expect to run into, but I did.

So I can think of a few possibilities. One is that I probably shouldn’t be using single-letter subdomains, as there might be an issue when testing with those. Another is that there might be an issue with git clone coming before this. Or running the entire install might put strain on the VPS and cause unexpected issues. What I think I’ll do is split the install into install_server and install_application. That way I can test and debug any nginx issues before the rest of the application is installed.
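
A rough idea of what that split could look like is below. Treat it as a sketch under assumptions rather than the finished script: only install_server and install_application come from the plan above, while run, main, and the DRY_RUN switch are hypothetical additions, the server stage lists only some of the commands from the earlier tests, and the application stage is a placeholder because those steps aren’t shown in this post.

```shell
#!/usr/bin/env bash

my_hostname="script.wsd.sh"
my_email="my_email@wsd.co.jp"

# Default to a dry run so the sketch is safe to execute anywhere;
# call with DRY_RUN=0 to really run the commands.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stage 1: web server and certificate (subset of the commands used above;
# writing /etc/nginx/conf.d/my-app.conf would also belong here).
install_server() {
    run yum install nginx -y
    run systemctl start nginx
    run systemctl enable nginx
    run dnf install certbot python3-certbot-nginx -y
    run certbot --nginx -d "$my_hostname" --non-interactive --agree-tos -m "$my_email" --redirect
}

# Stage 2: placeholder for the application steps.
install_application() {
    run echo "application install steps go here"
}

# main [server|application|all]: pick which stage to run.
main() {
    case "${1:-all}" in
        server)      install_server ;;
        application) install_application ;;
        all)         install_server && install_application ;;
        *)           echo "usage: install.sh [server|application|all]" >&2; return 1 ;;
    esac
}

main "$@"
```

Running the server stage on its own, checking that https works, and only then running the application stage would make it much clearer which half of the install the problem belongs to.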