A Strictly Practical (While Potentially Misguided) HTTPS Guide

(The following post was written at the request of a friend, who wished to set up HTTPS himself.)

If you’re reading this, your computer just asked my server for the contents of this webpage. It did so by converting the URL you typed (or, more likely, clicked) into a request, which was then sent to another computer somewhere in a data-center containing the actual text in memory, and then displaying the reply 1. If I did everything right, your computer was also able to authenticate the origin of this reply, and ensure the aforementioned exchange was private (as in, safe from tampering and prying eyes). You can make sure that this is the case by searching for a padlock near the URL, saying something like “connection secure”. I’m talking, of course, about the fact that your connection should be an HTTPS connection.

Now, while I don’t own the computer that’s hosting this webpage (indeed, I rent it from a hosting company), I do manage it 2. Maybe you also manage (or even own!) your own small computer hosting some webpages. And, if you do, you already know that these days it’s pretty much impossible not to support HTTPS.

Indeed, regardless of all the good reasons to support HTTPS, pages that fail to support HTTPS these days result in full-page warnings of insecurity and risk of fraud in most (if not all) modern browsers. Some mobile browsers, I’ve heard, will even refuse to load plain HTTP pages without a substantial amount of effort. So, if you’re the least bit concerned with your webpage being accessible to a general audience, there’s little to do but ensure HTTPS support.

At the same time, implementing HTTPS requires some understanding of the difference between a URL and your server, and of how your server serves up content when asked to do so. This, in turn, involves a number of concepts: DNS, public key cryptography, Certificate Authorities, and your web server (e.g., Nginx or Apache). So, overall, providing an HTTPS connection – even for a very small self-hosted page – may become daunting.

This tutorial aims to approach the problem of setting up HTTPS in an extremely practical way; essentially, I hope to explain how I set up HTTPS for this server in reasonable detail, so that you may do the same. I have not researched the technologies involved, and I may make technical errors. But, keeping this warning in mind, the mental models I’ve developed for the different technologies involved have served me well so far, and leveraging open-source technology allows us to operate within some level of ignorance, while achieving practical results 3.

What is HTTPS?

As outlined in the introduction, the URL is not the same as the webpage itself. Computers on the internet are themselves identified by IPs 4. DNS (Domain Name System) is a system in place to translate human-readable names (domain names) to IPs. HTTPS seeks to ensure that servers can authenticate as having some control over a domain name. (This allows, e.g., end-to-end encryption between you and the website’s server, preventing the contents of the webpage from being hijacked in transit.) This authentication is achieved via a combination of public key cryptography and the concept of Certificate Authorities (CAs): central entities that both issue certificates and provide certification services. Thus, a certain domain/server may authenticate itself to a CA as follows (the ACME protocol):

  1. The server X requests a certain CA to certify its identity as the domain example.com.
  2. The CA responds by issuing a “challenge” to the server X: a blob of data that the server should host at an agreed URL, e.g., example.com/.well-known/acme-challenge/aabbcc.... Note that the IP to which this URL resolves is determined by whoever controls the domain, not the server. Thus, if the blob is found at this location, the challenged party (server X) has demonstrated its control over both the contents of the server and the DNS resolution.
  3. The CA issues the server a certificate, signed by the CA, for a key pair that the server generated itself (the private key never leaves the server). Together, the private key and the certificate allow the server to authenticate itself to others, from that moment onwards, as example.com.
  4. This process is repeated periodically, to ensure server X maintains control over the domain.
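
The steps above can be mimicked with a toy, local-only sketch. It is not the ACME protocol itself (real ACME adds account keys, signed requests and more), and the function names and the "demo-account-thumbprint" string are made up for illustration. It only shows the shape of the exchange: the CA picks a token, the server publishes a blob under its web root, and the CA checks that the blob is there.

```shell
#!/bin/sh
# Toy simulation of the challenge steps; nothing here talks to a real CA.

# "CA" side: invent a random token for the challenge (step 2).
ca_make_token() {
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# "Server X" side: publish the blob under the web root (step 2).
server_publish() {   # args: webroot token blob
    mkdir -p "$1/.well-known/acme-challenge" &&
        printf '%s' "$3" > "$1/.well-known/acme-challenge/$2"
}

# "CA" side: look for the blob and compare it with what was expected.
# A real CA would fetch it over HTTP through the domain name;
# here we simply read the file back.
ca_verify() {        # args: webroot token expected_blob
    [ "$(cat "$1/.well-known/acme-challenge/$2" 2>/dev/null)" = "$3" ]
}

webroot=$(mktemp -d)                      # stand-in for the real web root
token=$(ca_make_token)
# Placeholder blob: in real ACME it is derived from the token and the
# account key, which this sketch does not model.
blob="$token.demo-account-thumbprint"

server_publish "$webroot" "$token" "$blob"
if ca_verify "$webroot" "$token" "$blob"; then
    echo "challenge passed: a certificate would now be issued (step 3)"
else
    echo "challenge failed"
fi
rm -rf "$webroot"
```

In practice, a tool such as acme.sh (introduced below) plays the role of "server X" in this exchange, so none of this needs to be written by hand.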

Some time ago, getting a certificate from a CA was, generally, paid (and expensive). After a push for encrypted communication, and through the efforts of entities such as Let’s Encrypt, it is nowadays possible to get a certificate for free from a number of different entities (e.g., Let’s Encrypt themselves).

Configuring your web server (I)

As hinted in the previous section, we assume that we have already configured our DNS so that example.com resolves to our server. Thus (and assuming you are using Nginx as your web server), your Nginx configuration (/etc/nginx/sites-available/example.com) might look something like the following:

# /etc/nginx/sites-available/example.com

server {
    listen 80;
    listen [::]:80; # IPv6 support

    server_name example.com;

    root /etc/www/html;
}

This configuration instructs Nginx to reply to any request of the form example.com/aaa/bbb/... arriving on port 80 5 with the contents found at /etc/www/html/aaa/bbb/.... Now, we wish to modify this configuration to reply to requests of the form example.com/.well-known/acme-challenge/aabbccdd.. with the contents of the challenge blob. If we ensure that the challenge blob is placed at /etc/www/html/.well-known/acme-challenge/, then no modification is necessary:

# /etc/nginx/sites-available/example.com

server {
    listen 80;
    listen [::]:80; # IPv6 support

    server_name example.com;

    root /etc/www/html;

    # "Specialization" for HTTPS challenges;
    # If we match this location, the rules within overrule the previous rules.

    location /.well-known/acme-challenge/ {

        # No modifications necessary;
        # The blob will be found at <root>/.well-known/acme-challenge/...
        # We will use this configuration block later.

    }
}
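
Before involving a CA, it may be worth checking by hand that Nginx really serves files from the challenge directory. The sketch below assumes the web root from the configuration above and that you are allowed to write to it; place_probe and probe.txt are names made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: drop a small probe file where a challenge blob would go.
place_probe() {   # args: webroot filename
    dir="$1/.well-known/acme-challenge"
    mkdir -p "$dir" && printf 'probe-ok\n' > "$dir/$2"
}

# Usage, on the server:
#   place_probe /etc/www/html probe.txt
#   curl http://example.com/.well-known/acme-challenge/probe.txt
#   rm /etc/www/html/.well-known/acme-challenge/probe.txt
```

If Nginx is serving the web root as configured, the curl call should print probe-ok. A 404 usually suggests that the root or the location block does not point where we think it does.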

Now, we need to actually perform the challenge, and configure the system to renew its identity periodically.

acme.sh

This can be accomplished by means of the acme.sh script. This script will authenticate the domain by means of the ACME protocol, as outlined in the introduction, via one of the free CAs, like Let’s Encrypt, ZeroSSL, and others. Although you should follow the install instructions in the script’s repository, especially if you don’t want to install acme.sh as root (which is possible!), a simple root installation is very straightforward:

# As root:
curl https://get.acme.sh | sh -s email=my@example.com

(Yes, you should inspect the contents of https://get.acme.sh first, and ensure it hasn’t been hijacked!) The email parameter “is the email used to register an account to Let’s Encrypt, you will receive a renewal notice email here.” [@].

This will install an alias for the acme.sh script, which we’ll now be using to issue our certificates. As you issue a certificate, acme.sh will also set up a cron job to renew the certificates automatically. Having configured our server, per the previous section, it is now easy to issue a certificate:

acme.sh --issue -d example.com -w /etc/www/html

Note how we’re indicating 1) the domain to certify, and, importantly, 2) the web root of our server. acme.sh will, following the ACME protocol, place the challenge blob at /etc/www/html/.well-known/acme-challenge/..., as we assumed in the previous section.

Following some logging, you should receive a success message from acme.sh; the cron job mentioned above will repeat this certification periodically. acme.sh will also now have in store the cryptographic keys that your web server can use to authenticate itself to connecting clients.

“Installing” the certificate

Now we have the cryptographic keys to authenticate our identity, but our web server is not using them. We can instruct acme.sh to place the newly-obtained key files in an agreed-upon location whenever these keys are renewed; if using Nginx, this is done with

acme.sh \
    --install-cert -d example.com \
    --key-file       /etc/nginx/certs/example.com/key.pem  \
    --fullchain-file /etc/nginx/certs/example.com/cert.pem \
    --reloadcmd     "nginx -s reload"

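where, note:

  1. --key-file is the location to which the private key will be copied;
  2. --fullchain-file is the location to which the certificate will be copied, together with the chain of intermediate certificates leading up to the CA;
  3. --reloadcmd is the command that acme.sh will run after copying the files, now and after every renewal, so that Nginx picks up the new files.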

Feel free to run this command already, but Nginx still won’t know to use the newly obtained authentication keys!
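
Once the files are installed, a quick sanity check with the openssl command-line tool can save some head-scratching later. The following is a minimal sketch, assuming openssl is available on the server; the two function names are made up for this example, and the paths in the usage comments are the ones passed to --install-cert above.

```shell
#!/bin/sh
# Do the private key and the certificate belong together?
# Compares the public key derived from the key file with the one embedded
# in the (first) certificate of the certificate file.
key_matches_cert() {   # args: key.pem cert.pem
    key_pub=$(openssl pkey -in "$1" -pubout 2>/dev/null) || return 1
    cert_pub=$(openssl x509 -in "$2" -noout -pubkey 2>/dev/null) || return 1
    [ -n "$key_pub" ] && [ "$key_pub" = "$cert_pub" ]
}

# Will the certificate still be valid N days from now?
cert_valid_for_days() {   # args: cert.pem days
    openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Usage, on the server:
#   cd /etc/nginx/certs/example.com
#   key_matches_cert key.pem cert.pem && echo "key and certificate match"
#   cert_valid_for_days cert.pem 14 || echo "certificate expires within 14 days"
#   openssl x509 -in cert.pem -noout -subject -issuer -dates
```

The last usage line simply prints who the certificate was issued to, who issued it, and its validity period.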

Configuring your web server (II)

It remains to instruct Nginx to use the authentication keys. Going back to our configuration file, let us create a server block for secure (HTTPS) connections. These should happen whenever the connection is made via port 443 (the standard port for HTTPS connections), and require the specification of the SSL certificate and certificate key, which are precisely the files that acme.sh installed for us!

# /etc/nginx/sites-available/example.com

# Regular HTTP connection block
server {
    listen 80;
    listen [::]:80; # IPv6 support

    server_name example.com;

    root /etc/www/html;

    location /.well-known/acme-challenge/ {
        # Nothing, for now.
    }
}

# HTTPS connection block
server {
    listen 443 ssl;
    listen [::]:443 ssl; # IPv6 support (the "ssl" flag is needed here too)

    server_name example.com;

    ssl_certificate /etc/nginx/certs/example.com/cert.pem;
    ssl_certificate_key /etc/nginx/certs/example.com/key.pem;

    root /etc/www/html; # Same web root as the HTTP block
}

Go ahead, save these modifications, and call nginx -t (to test the correctness of the new configuration), and nginx -s reload, to load these modifications. You should now be able to connect to https://example.com and successfully get an HTTPS connection!

Connection upgrading

However… If you connect to http://example.com, you’ll find that your connection is still regular HTTP. (Modern browsers will usually nudge you towards HTTPS, by “assuming” https:// over http:// when not present; in any case, depending on a number of factors, your connection can still end up being plain HTTP, even when HTTPS is available.) We can configure our web server to forward users connecting via plain HTTP to the corresponding HTTPS connection 6. We can do this by means of the “301: Moved permanently” HTTP code: if a browser connects to a.example.com and receives a “301: Moved to b.example.com”, it will connect to b.example.com instead (and take that new URL as the “correct” location from then on).

However, we must be careful! With the setup above, the HTTPS certification renewals will always occur over plain HTTP (because, of course, how could you establish an HTTPS connection when the certificate has expired?). Therefore, we must ensure that we still serve the challenge blob over plain HTTP.

Modifying the Nginx configuration to follow these guidelines…

# /etc/nginx/sites-available/example.com

# Regular HTTP connection block
server {
    listen 80;
    listen [::]:80; # IPv6 support

    server_name example.com;

    location /.well-known/acme-challenge/ {

        # An ACME challenge for renewal of our HTTPS certificates.
        # Serve the content as plain HTTP.

        root /etc/www/html;

    }

    location / {

        # Most general match rule; will be overridden by more specific matches.
        # Request for some other content in the server.
        # Forward to the HTTPS version of our website.
        # $request_uri is a variable that will be replaced by the path
        #  the user has requested. Thus, e.g., for
        #   example.com/aaa/bbb/ccc   $request_uri=/aaa/bbb/ccc

        return 301 https://example.com$request_uri;

    }
}

# HTTPS connection block
server {
    listen 443 ssl;
    listen [::]:443 ssl; # IPv6 support (the "ssl" flag is needed here too)

    server_name example.com;

    ssl_certificate /etc/nginx/certs/example.com/cert.pem;
    ssl_certificate_key /etc/nginx/certs/example.com/key.pem;

    root /etc/www/html; # Same web root as the HTTP block
}

Call nginx -t and nginx -s reload to apply these changes. Connecting to http://example.com now, you should be forwarded to https://example.com!
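
The redirect can also be checked from the command line, without a browser getting in the way with its own caching and “upgrades”. The helper below is a small sketch (redirect_info is a made-up name): it reads raw HTTP response headers, such as those printed by curl -sI, and reports the status code and the Location header, if any.

```shell
#!/bin/sh
# Read HTTP response headers on stdin; print "<status> <location>".
redirect_info() {
    tr -d '\r' | awk '
        NR == 1                    { status = $2 }    # e.g. "HTTP/1.1 301 Moved Permanently"
        tolower($1) == "location:" { location = $2 }  # header names are case-insensitive
        END                        { print status, location }
    '
}

# Usage:
#   curl -sI http://example.com/aaa/bbb | redirect_info
#   curl -sI http://example.com/.well-known/acme-challenge/test | redirect_info
```

With the configuration above, the first call would be expected to report a 301 pointing at https://example.com/aaa/bbb, while the second should not report a redirect at all, since challenge requests are still answered over plain HTTP.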


  1. Already this may be inaccurate: it’s quite likely that the computer your computer talked with is actually a sort of time-share of the same hardware between different “computers”. It’s also possible your computer’s request never actually reached the data-center at all, and rather a cache along the way pre-emptively provided the information, to ensure a faster page load. Or your computer itself already had the contents of this page cached. 

  2. There are a lot of aspects I don’t manage, like hardware (as mentioned in the footnote above), the network connection, memory, and a myriad of other complicated aspects of managing computers when they become very big. I pay someone – the hosting company – to be able to pretend I’m SSH-ing into a Linux PC in someone’s home that’s connected to the internet with a fixed IP. I manage this imaginary computer. 

  3. But do remember we’re operating at a high level of abstraction! If you’re doing something critical or professionally, please ensure you have a deeper understanding of the different aspects involved. In my opinion, it’s a bit like owning a car: you can take good care of your car without understanding how motors work; but it won’t cut it if you’re a mechanic, or want to make informed decisions. 

  4. Connections, in turn, involve both an IP and a “port”, which is just a number. Its purpose is to distinguish different channels of communication with the same computer. So, you may establish a connection with computer X on port 443 for an HTTPS connection, and with the same computer X but on port 22 for an SSH connection. In this sense, there’s an “outgoing port” and an “incoming port”, such that a connection from computer A to computer B is determined by (A's IP)+(A's port)+(B's IP)+(B's port). Each such 4-tuple uniquely determines a single connection. While a port is just a number, there are general conventions for what services are served on what ports – such as 443 for HTTPS and 22 for SSH, as above. 

  5. See footnote 4 for what a “port” is; port 80 is the conventional port for plain HTTP. 

  6. Whether this is a good thing is vaguely debatable; some very old equipment simply won’t support HTTPS, so you’re effectively shutting these clients out from your website. Thus, if maximizing compatibility with as many devices as possible is a concern to you, and if you’re serving only public static content (so, not doing anything like authentication, serving sensitive content, etc.), feel free to serve a parallel plain HTTP version of your site. Otherwise, end-to-end encryption will, of course, be preferable as a default.