(The following post was written at the request of a friend, who wished to set up HTTPS himself.)
If you’re reading this, your computer just asked my server for the contents of this webpage. It did so by converting the URL you typed (or, more likely, clicked) into a request, sending that request to another computer somewhere in a data-center holding the actual text in memory, and then displaying the reply 1. If I did everything right, your computer was also able to authenticate the origin of this reply, and to ensure the aforementioned exchange was private (as in, safe from tampering and prying eyes). You can make sure that this is the case by looking for a padlock near the URL, saying something like “connection secure”. I’m talking, of course, about the fact that your connection should be an HTTPS connection.
Now, while I don’t own the computer that’s hosting this webpage (indeed, I rent it from a hosting company), I do manage it 2. Maybe you also manage (or even own!) your own small computer hosting some webpages. And, if you do, you already know that these days it’s pretty much impossible not to support HTTPS.
Indeed, regardless of all the good reasons to support HTTPS, pages that fail to support HTTPS these days result in full-page warnings of insecurity and risk of fraud in most (if not all) modern browsers. Some mobile browsers, I’ve heard, will even refuse to load plain HTTP pages without a substantial amount of effort. So, if you’re the least bit concerned with your webpage being accessible to a general audience, there’s little to do but ensure HTTPS support.
At the same time, implementing HTTPS requires some understanding of the difference between a URL and your server, and of how your server serves up content as it is asked to do so. This, in turn, involves a number of concepts: DNS, public key cryptography, Certificate Authorities, and your web server (e.g., Nginx, or Apache). So, overall, providing an HTTPS connection – even for a very small self-hosted page – may become daunting.
This tutorial aims to approach the problem of setting up HTTPS in an extremely practical way; essentially, I hope to explain how I set up HTTPS for this server in a reasonably detailed manner, so that you may do the same. I have not researched the technologies involved in depth, and I may well make technical errors. But, keeping this warning in mind, the mental models I’ve developed for the different technologies involved have served me well so far, and leveraging open-source technology allows us to operate with some level of ignorance while still achieving practical results 3.
As outlined in the introduction, the URL is not the same as the webpage itself. Computers on the internet are themselves identified by IPs 4. DNS (Domain Name System) is a system in place to translate human-readable names (domain names) into IPs. HTTPS seeks to ensure that servers can authenticate themselves as having some control over a domain name. (This allows, e.g., end-to-end encryption between you and the website’s server, preventing the contents of the webpage from being hijacked in transit.) This authentication is achieved via a combination of public key cryptography and the concept of Certificate Authorities (CAs): central entities that issue certificates and provide certification services. Thus, a certain domain/server may authenticate itself to a CA as follows (the ACME protocol):
1. Server X asks the CA to certify its control over the domain example.com;
2. the CA replies with a challenge: a blob of data that must be made available at example.com/.well-known/acme-challenge/aabbcc...;
3. the CA then requests that URL and checks the reply. Note that the IP to which this URL resolves is determined by whoever controls the domain, not the server. Thus, if the blob is found at this location, the challenger (server X) certifies its control over both the contents of the server and the DNS resolution of example.com.
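To make this concrete, the CA’s check amounts to an ordinary HTTP request. The sketch below is purely illustrative – “aabbcc” stands in for the token the CA actually hands out during the exchange:

# Illustrative only: what the CA's verification amounts to.
# "aabbcc" stands in for the real token issued by the CA.
curl http://example.com/.well-known/acme-challenge/aabbcc
# If the response body matches the expected blob, the challenge succeeds.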
Some time ago, getting a certificate from a CA was, generally, paid (and expensive). After a push for encrypted communication, and through the efforts of entities such as Let’s Encrypt, it is nowadays possible to get a certificate for free from a number of different entities (e.g., Let’s Encrypt themselves).
As hinted in the previous section, we assume that we have already configured our DNS to resolve example.com to point to our server. Thus (and assuming you are using Nginx as your web server), your Nginx configuration (/etc/nginx/sites-available/example.com) might look something like the following:
# /etc/nginx/sites-available/example.com
server {
    listen 80;
    listen [::]:80; # IPv6 support
    server_name example.com;
    root /etc/www/html;
}
This configuration instructs any request of the form example.com/aaa/bbb/... arriving on port 80 5 to be replied to with the contents found at /etc/www/html/aaa/bbb/.... Now, we wish to modify this configuration to reply to requests of the form example.com/.well-known/acme-challenge/aabbccdd.. with the contents of the challenge blob. If we ensure that the challenge blob is placed under /etc/www/html/.well-known/acme-challenge/, then no modification is necessary:
# /etc/nginx/sites-available/example.com
server {
    listen 80;
    listen [::]:80; # IPv6 support
    server_name example.com;
    root /etc/www/html;
    # "Specialization" for HTTPS challenges;
    # If we match this location, the rules within overrule the previous rules.
    location /.well-known/acme-challenge/ {
        # No modifications necessary;
        # The blob will be found at <root>/.well-known/acme-challenge/...
        # We will use this configuration block later.
    }
}
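If you want, you can sanity-check this setup by hand before involving any CA. This is entirely optional, and assumes curl is available and that the paths are exactly the ones used above:

# Optional sanity check of the webroot setup (paths as assumed above):
nginx -t && nginx -s reload                                  # load the configuration
mkdir -p /etc/www/html/.well-known/acme-challenge
echo "hello" > /etc/www/html/.well-known/acme-challenge/test
curl http://example.com/.well-known/acme-challenge/test     # should print "hello"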
Now, we need to actually perform the challenge, and configure the system to renew its identity periodically. This can be accomplished by means of the acme.sh script. This script will authenticate the domain by means of the ACME protocol, as outlined in the introduction, via one of the free CAs, like Let’s Encrypt, ZeroSSL, and others. Although you should follow the install instructions in the script’s repository, especially if you don’t want to install acme.sh as root (which is possible!), a simple root installation is very straightforward:
# As root:
curl https://get.acme.sh | sh -s [email protected]
(Yes, you should inspect the contents of https://get.acme.sh first, and ensure it hasn’t been hijacked!) The email parameter, per the acme.sh documentation, “is the email used to register an account to Let’s Encrypt, you will receive a renewal notice email here.”
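If piping a script straight from the network into a shell makes you uneasy (it probably should), the following should be an equivalent, slightly more cautious variant; treat it as a sketch rather than the official install procedure:

# As root: download the installer, read it, then run it with the same argument.
curl -fsSL https://get.acme.sh -o get-acme.sh
less get-acme.sh                       # inspect what you're about to run
sh get-acme.sh [email protected]        # same argument as "sh -s" above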
This will install an alias for the acme.sh script, which we’ll now be using to issue our certificates. As you issue a certificate, acme.sh will also set up a cron job to renew the certificates automatically. Having configured our server, per the previous section, it is now easy to issue a certificate:
acme.sh --issue -d example.com -w /etc/www/html
Note how we’re indicating 1) the domain to certify, and, importantly, 2) the web root of our server. acme.sh will, following the ACME protocol, place the challenge blob at /etc/www/html/.well-known/acme-challenge/..., as we assumed in the previous section.
Following some logging, you should receive a success message from acme.sh, which will also set up a cron job to repeat this certification periodically. acme.sh will now also have stored cryptographic keys that your web server can use to authenticate itself to connecting clients.
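If you’d like to double-check what just happened, acme.sh can list the certificates it manages, and the renewal job it set up should be visible in root’s crontab (assuming the root installation from earlier):

# Optional checks after issuing:
acme.sh --list              # should list example.com and the CA that issued the certificate
crontab -l | grep acme.sh   # should show the periodic renewal entry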
Now we have the cryptographic keys to authenticate our identity, but our web server is not using them. We can instruct acme.sh to place the newly-obtained key files in an agreed-upon location whenever these keys are renewed; if using Nginx, this is done with
acme.sh \
    --install-cert -d example.com \
    --key-file /etc/nginx/certs/example.com/key.pem \
    --fullchain-file /etc/nginx/certs/example.com/cert.pem \
    --reloadcmd "nginx -s reload"
where, note:
- we again indicate the domain being certified, example.com;
- the directory /etc/nginx/certs/example.com/ must exist, and will store our cryptographic keys;
- acme.sh will call nginx -s reload to reload its configuration after placing new keys in this location.

Feel free to run this command already, but Nginx still won’t know to use the newly obtained authentication keys!
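One practical note: I don’t believe acme.sh creates the target directory for you, so it is worth creating it before running the command above; afterwards, you can check that the files landed where expected:

# Create the agreed-upon location for the key material (if it doesn't exist yet):
mkdir -p /etc/nginx/certs/example.com
# After running --install-cert, the key and certificate should show up here:
ls -l /etc/nginx/certs/example.com/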
It remains to instruct Nginx to use the authentication keys. Going back to our configuration file, let us create a server block for secure (HTTPS) connections. These should happen whenever the connection is made via the 443 port (the standard port for HTTPS connections), and require the specification of the SSL certificate and certificate key – which are precisely the files that acme.sh installed for us!
# /etc/nginx/sites-available/example.com
# Regular HTTP connection block
server {
    listen 80;
    listen [::]:80; # IPv6 support
    server_name example.com;
    root /etc/www/html;
    location /.well-known/acme-challenge/ {
        # Nothing, for now.
    }
}
# HTTPS connection block
server {
    listen 443 ssl;
    listen [::]:443 ssl; # IPv6 support
    server_name example.com;
    ssl_certificate /etc/nginx/certs/example.com/cert.pem;
    ssl_certificate_key /etc/nginx/certs/example.com/key.pem;
    root /etc/www/html;
}
Go ahead, save these modifications, and call nginx -t (to test the correctness of the new configuration), and nginx -s reload, to load these modifications. You should now be able to connect to https://example.com and successfully get an HTTPS connection!
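You can also verify this from the command line; the following checks only assume that curl and openssl are installed, and are not specific to this setup:

# Fetch the page over HTTPS; curl will complain loudly if the certificate is invalid.
curl -I https://example.com
# Inspect the certificate actually being served, including its validity dates.
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates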
However… If you connect to http://example.com, you’ll find that your connection is still regular HTTP. (Modern browsers will usually nudge you towards HTTPS, by “assuming” https:// over http:// when not present; in any case, depending on a number of factors, your connection can still end up being plain HTTP, even when HTTPS is available.) We can configure our web server to forward users connecting via plain HTTP to the corresponding HTTPS connection 6. We can do this by means of the “301: Moved permanently” HTTP code: if a browser connects to a.example.com and receives a “301: Moved to b.example.com”, it will connect to b.example.com instead (and take that new URL as the “correct” location from then on).
However, we must be careful! The certificate renewal challenges will always arrive over plain HTTP (because, of course, how could you rely on an HTTPS connection when the certificate has expired?). Therefore, we must ensure that we keep serving the challenge blob over plain HTTP, rather than redirecting it.
Modifying the Nginx configuration to follow these guidelines…
# /etc/nginx/sites-available/example.com
# Regular HTTP connection block
server {
    listen 80;
    listen [::]:80; # IPv6 support
    server_name example.com;
    location /.well-known/acme-challenge/ {
        # An ACME challenge for renewal of our HTTPS certificates.
        # Serve the content as plain HTTP.
        root /etc/www/html;
    }
    location / {
        # Most general match rule; will be overridden by more specific matches.
        # Request for some other content in the server.
        # Forward to the HTTPS version of our website.
        # $request_uri is a variable that will be replaced by the path
        # the user has requested. Thus, e.g., for
        # example.com/aaa/bbb/ccc, $request_uri=/aaa/bbb/ccc
        return 301 https://example.com$request_uri;
    }
}
# HTTPS connection block
server {
    listen 443 ssl;
    listen [::]:443 ssl; # IPv6 support
    server_name example.com;
    ssl_certificate /etc/nginx/certs/example.com/cert.pem;
    ssl_certificate_key /etc/nginx/certs/example.com/key.pem;
    root /etc/www/html;
}
Call nginx -t and nginx -s reload to load these changes. Now, by connecting to http://example.com, you should be forwarded to https://example.com!
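Again, this is something you can observe directly with curl, if you prefer the command line (the exact headers may vary slightly between Nginx versions):

# The plain-HTTP request should now be answered with a redirect:
curl -I http://example.com    # expect "301 Moved Permanently" and "Location: https://example.com/..."
# ...and following that redirect should land on the HTTPS version:
curl -IL http://example.com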
Already this may be inaccurate: it’s quite likely that the computer your computer talked with is actually a sort of time-share of the same hardware between different “computers”. It’s also possible your computer’s request never actually reached the data-center at all, and rather a cache along the way pre-emptively provided the information, to ensure a faster page load. Or your computer itself already had the contents of this page cached. ↩
There are a lot of aspects I don’t manage, like hardware (as mentioned in the footnote above), the network connection, memory, and a myriad of other complicated aspects of managing computers when they become very big. I pay someone – the hosting company – to be able to pretend I’m SSH-ing into a Linux PC in someone’s home that’s connected to the internet with a fixed IP. I manage this imaginary computer. ↩
But do remember we’re operating at a high level of abstraction! If you’re doing something critical or professionally, please ensure you have a deeper understanding of the different aspects involved. In my opinion, it’s a bit like owning a car: you can take good care of your car without understanding how motors work; but it won’t cut it if you’re a mechanic, or want to make informed decisions. ↩
Well, more precisely, connections involve an IP and a “port”, which is just some number. Its purpose is to distinguish different channels of communication with the same computer. So, you may establish a connection with computer X on port 443 for an HTTPS connection, and to the same computer X but on port 22 for an SSH connection. In this sense, there’s an “outgoing port” and an “incoming port”, such that a connection from computer A to computer B is determined by (A's IP)+(A's port)+(B's IP)+(B's port). Each such 4-tuple uniquely determines a single connection. While a port is just a number, there are general conventions for what services are served on what ports – such as 443 for HTTPS and 22 for SSH, as above. ↩
Whether this is a good thing is vaguely debatable; some very old equipment simply won’t support HTTPS, so you’re effectively shutting these clients out from your website. Thus, if maximizing compatibility with as many devices as possible is a concern to you, and if you’re serving only public static content (so, not doing anything like authentication, serving sensitive content, etc.), feel free to serve a parallel plain HTTP version of your site. Otherwise, end-to-end encryption will, of course, be preferable as a default. ↩