Igor Bubelov

How to Publish Things Online

May 13, 2021

This is an opinionated yet comprehensive guide on publishing your digital content online. The method I describe prioritizes simplicity, minimalism, vendor-independence and censorship resistance.

Photo by Bruno Martins

Isn’t Publishing a Solved Problem?

There are many ways to publish things online, so why reinvent the wheel? Most publishing platforms are easy to use, but they usually hide tremendous complexity under the hood. That hidden complexity can be far more dangerous than the visible one though. By choosing an inherently complex system managed by a third party you make yourself vulnerable to censorship and exploitation.

Go Static

Self-publishing can be easy and fun if you pick the right tools. Not every website is created the same, so it’s important to keep things simple and focus on your content instead of your infrastructure. Static websites allow you to publish your content effortlessly. You can read more about static websites here.

Websites and Webservers

Once your website is ready to be shared with the whole world, you need to figure out how to publish it. Most websites strive for easy, memorable names, so people can just type those names in their browsers when they want to see the content.

Web browsers act as HTTP clients, and they know polite ways of asking HTTP servers to serve all kinds of webpages for them. You can find a full description of the HTTP/1.1 message format in RFC 7230 (since superseded by RFC 9110 and RFC 9112).

Don’t be afraid of wordy technical documents. HTTP is actually pretty simple, and reading about it can give you a lot of insight into what your browser is actually doing and how to fix the most common issues you might bump into while browsing the web.

Client and server are popular abstractions used in many communication protocols, and HTTP is no exception. Protocols are like languages and it’s crucial for both a client and a server to speak the same language. Their actual roles can be radically different though. By publishing a website, what we really want is to make it possible for various HTTP clients (browsers) to reach our HTTP server and display its content.
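To show how simple the protocol really is, here is a minimal sketch of the text a client sends when it asks for a page. The function name http_get_request is made up for this example, and the request is deliberately bare-bones (real browsers send many more headers):

```shell
# Print a minimal HTTP/1.1 GET request for the given host and path.
# Every line ends with CRLF, and an empty line marks the end of the request.
http_get_request() {
  host="$1"
  path="${2:-/}"   # default to the root page if no path is given
  printf 'GET %s HTTP/1.1\r\n' "$path"
  printf 'Host: %s\r\n' "$host"
  printf 'Connection: close\r\n'
  printf '\r\n'
}

# Example (assumes netcat is installed and a webserver is reachable):
#   http_get_request 100.101.102.103 / | nc 100.101.102.103 80
```

Piping this text into a raw TCP connection on port 80 is, in essence, what your browser does before it renders anything.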

Websites can’t serve themselves, which means we need to install a special piece of software called a webserver and tell it to serve our websites to all interested clients. The two most popular and mature open source webservers are Apache2 and Nginx. It’s really hard to pick a winner here, because at the time of writing they both have a similar market share of about 35%, and both of them are fast and well-documented. In this guide, I’m going to use Apache2, because I use Nginx most of the time and playing with a new piece of software is fun, isn’t it?

Operating System

Every website needs a webserver and every webserver needs an operating system. The most popular operating system for running webservers is Linux. There are plenty of free and open source Linux distributions. All of them share the same kernel, and they are more similar than they are different. It’s hard to make a wrong choice here, but sticking with a popular distribution such as Ubuntu seems like a wise choice.

Some people point out that Ubuntu isn’t as free and non-binding as the other Linux distributions. I partly agree, but we shouldn’t forget that Ubuntu is based on Debian and you can always move to Debian without having to learn a completely new set of tools. Canonical offers a good product for free and there are serious checks on its powers, so I wouldn’t worry about that, for now.

Hosting Provider

Websites need webservers and webservers are programs, which means that they usually need an operating system to run on, but what does an operating system need? Well, an operating system is essentially a piece of software designed to run and manage other programs. Obviously, no software can run without hardware, but the line here isn’t that clear, in the sense that software is actually a kind of hardware. It’s not an abstract idea disconnected from the physical world. Software has a physical form: it exists on our storage devices such as hard drives and SSDs.

Philosophical questions aside, we really need some hardware in order to run an operating system, which naturally leads us to a hosting provider. This part is pretty important, because some hosting providers don’t give their customers direct access to their servers and sometimes, you can’t even choose an operating system. Those providers try to place themselves in between your website and its visitors. Their marketing departments work hard to convince you not to bother with setting up your own private server. Their goal is to sell you a custom product which will make it very hard for you to switch to another provider. This is called “managed” hosting, and it may sound like a good idea at first, until you experience customer lock-in, terrible user interfaces, and unresponsive support. Managing your own server isn’t that hard, and it will save you a lot of time and nerves in the long run.

The choice of a hosting provider isn’t that important, as long as it gives you full access to your server. I often use Digital Ocean, and their service is more or less tolerable. I also use Scaleway, and it’s a good and cheap choice if you want to host your website in Europe. As a rule of thumb, your website should be as close to its visitors as possible. Light travels fast, but not fast enough to make the distance unnoticeable when you open a website hosted halfway around the world.

Connecting to Your Server

Servers are a bit different from traditional desktop computers. They rarely have a graphical user interface, and the best way to manage them is by using a textual one. You can also use a textual interface to manage your Linux desktop by using a terminal app. Windows machines also have their own flavor of terminals, so you can also talk to your Linux servers from a Windows machine. Let’s say your hosting provider gave you an Ubuntu server with an IP 100.101.102.103. To connect to its terminal, you can issue the following command:

$ ssh root@100.101.102.103

This command assumes that you want to log in as a root user. If you don’t, just use the username supplied by your hosting provider. If you’re not comfortable with textual interfaces, take your time, explore the file system and some basic commands. Mastering the command line is a long and insightful journey, but you don’t really need to be a command line guru to set up a webserver.

You can always disconnect from your remote shell by typing exit.
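If typing the IP address every time gets tiring, an SSH client can remember it for you. Below is a small sketch which prints an entry for the ~/.ssh/config file. The function name ssh_host_entry and the alias blog are made up for this example; Host, HostName and User are standard OpenSSH client options:

```shell
# Print an OpenSSH client config entry for a server.
# Arguments: a short alias, the server address, the login user.
ssh_host_entry() {
  alias_name="$1"
  address="$2"
  user="$3"
  printf 'Host %s\n' "$alias_name"
  printf '    HostName %s\n' "$address"
  printf '    User %s\n' "$user"
}

# Example: append the entry to your config, then connect by alias:
#   ssh_host_entry blog 100.101.102.103 root >> ~/.ssh/config
#   ssh blog
```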

Setting Up Webserver Software (Apache2)

Debian-based systems keep an internal database of compatible software. You can think of it as a large spreadsheet which has a bunch of columns like name, version and so on. Folks who maintain this database tend to update it rather often, so you might end up in a situation where your copy of this “spreadsheet” is a bit outdated. Luckily for us, Debian-based systems have a special command called apt which can be invoked in order to update our local package registry:

# APT stands for Advanced Package Tool, and we tend
# to use the terms "program" and "package" interchangeably
$ apt update
5 packages can be upgraded. Run 'apt list --upgradable' to see them.

Installing new software on Debian-based systems and Linux in general is easy, especially if it’s included in the official repositories of your Linux distribution. Apache2 is popular, so this is all you need to do in order to install it:

$ apt install apache2
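If you expect to set up more than one server, the two apt steps above can be wrapped into a tiny script. This is only a sketch: the function names run and install_webserver and the DRY_RUN variable are made up for this example, and it assumes you are logged in as root, like in the rest of this guide:

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Refresh the package registry, install Apache2 and check that it runs.
install_webserver() {
  run apt update
  run apt install --yes apache2       # --yes skips the confirmation prompt
  run systemctl is-active apache2     # prints "active" if the service is up
}

# Example:
#   DRY_RUN=1 install_webserver   # only show what would be executed
#   install_webserver             # actually do it
```

The dry-run switch is there so you can see what the script is about to do before letting it touch a real machine.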

Let’s check if it works. Try opening the following URL (don’t forget to use your server’s IP):

http://100.101.102.103/

Apache2 listens for new HTTP connection attempts coming to your server, and it should show you a page with a few interesting tips and tricks. I highly recommend reading this page in full. This HTML page is located at /var/www/html/index.html. Feel free to edit this page or replace its contents with something like this:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>My test page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

After refreshing your browser, you should see the changes you’ve made.
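If you don’t feel like opening a text editor on the server, the same page can be written straight from the command line. Here is a minimal sketch; the function name write_test_page is made up for this example, and it takes the target directory as an argument so you can try it on a scratch directory first:

```shell
# Write the "Hello World!" test page into the given web root directory.
write_test_page() {
  web_root="$1"
  cat > "$web_root/index.html" <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>My test page</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>
EOF
}

# Example, on the server (this overwrites the default Apache2 page):
#   write_test_page /var/www/html
```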

Deploying a Static Website

So, it looks like everything located in /var/www/html/ is immediately visible from a browser window. What’s so special about this directory? First, don’t worry about other directories, they are NOT accessible via a browser. You should think twice before putting anything into this directory, but that’s exactly where you should put your static website in order to make it visible to the rest of the world.

So, once you set up your server, publishing your writings is as simple as copying a bunch of files into a directory. Here is an example of how to copy a directory named “website” from your computer to a remote server:

$ rsync        \
  --checksum   \
  --recursive  \
  --verbose    \
  <path_to_your_website>/ root@100.101.102.103:/var/www/html

# The actual output will depend on your data. It usually shows
# which files are changed since the last sync. It copies only
# the changed files, which makes it super fast to deploy changes

> sending incremental file list
> index.json
> index.xml
> blog/how-to-publish-things-online/index.html

> sent 74,448 bytes  received 11,884 bytes  24,666.29 bytes/sec
> total size is 163,314,189  speedup is 1,891.70

Where:

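  • <path_to_your_website> is a path to the directory with your static website on your PC, something like /home/john/website. The trailing slash tells rsync to copy the contents of that directory rather than the directory itself.
  • root@100.101.102.103 identifies the user and the server. The colon after it separates the server from the remote path. Don’t forget the leading slash after the colon: it makes the path absolute instead of relative to the remote user’s home directory.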
  • /var/www/html is a path to a public web directory on your server.

This method of deployment is extremely fast and convenient. All changes will be available to your audience in no time. It will also show you which files have changed since the last sync, which can help you to detect unexpected changes and figure out what’s going on.
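Since you’re going to deploy often, it makes sense to wrap the rsync call into a small function. The sketch below uses made-up names (with_trailing_slash and deploy) and adds the trailing slash to the source path for you, because forgetting it would copy the directory itself into /var/www/html instead of its contents:

```shell
# Make sure a path ends with exactly the slash rsync needs.
with_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

# Copy the contents of a local directory to a remote one.
# Any arguments after the first two are passed to rsync as-is.
deploy() {
  src=$(with_trailing_slash "$1")
  dest="$2"
  shift 2
  rsync --checksum --recursive --verbose "$@" "$src" "$dest"
}

# Example:
#   deploy /home/john/website root@100.101.102.103:/var/www/html --dry-run
#   deploy /home/john/website root@100.101.102.103:/var/www/html
```

The --dry-run option is a standard rsync flag which lists the files that would be transferred without actually copying anything, so it’s a cheap way to double-check a deployment.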

Setting Up DNS Records

URLs like http://100.101.102.103/ are hard to memorize. It’s one of the reasons why most websites use the Domain Name System (DNS). Memorizing names is easier than memorizing numbers, so you can think of DNS as a big spreadsheet which matches different names with different IP addresses. Let’s say you registered a domain name writings.com. Now we need to connect it to our webserver. Our end goal is to make sure that when people type writings.com in their browsers, they will be shown your website. In the end, it’s just a simpler way of typing http://100.101.102.103/, and your readers will surely appreciate the convenience.

To bind a domain name to your webserver’s IP address, you have to own both of those things first. There are plenty of companies selling domain names, and the only requirement you should have is the ability to manage DNS records. Most providers allow that. Owning a domain is not enough: you should also tell your domain where it should point all those browsers. This name-to-IP binding can be accomplished by simply adding the following DNS record:

Field        Value
Record type  A
Name         @
Value        100.101.102.103
TTL          Any sensible value. Choose 60 minutes if you can’t decide.

You might need to wait for a bit for those changes to be applied. Try to ping your domain name:

# -c is short for packet count. Personally, I don't like
# short arguments due to their steeper learning curve,
# but I guess they come in handy if you use this command
# hundreds of times

$ ping -c 4 writings.com

> PING writings.com (100.101.102.103) 56(84) bytes of data.
> 64 bytes from 100.101.102.103 (100.101.102.103): icmp_seq=1 ttl=49 time=37.7 ms
> 64 bytes from 100.101.102.103 (100.101.102.103): icmp_seq=2 ttl=49 time=39.2 ms
> 64 bytes from 100.101.102.103 (100.101.102.103): icmp_seq=3 ttl=49 time=37.4 ms
> 64 bytes from 100.101.102.103 (100.101.102.103): icmp_seq=4 ttl=49 time=39.0 ms
> --- writings.com ping statistics ---
> 4 packets transmitted, 4 received, 0% packet loss, time 3303ms
> rtt min/avg/max/mdev = 37.374/38.309/39.201/0.802 ms

If it shows the IP address of your webserver, we’re good to go. If not, don’t worry and go make some coffee, it can take a while.
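If you’d rather not keep pinging by hand, the waiting can be scripted. Here is a sketch which assumes that the dig utility is installed (on Ubuntu it comes with the dnsutils package) and that your domain uses a plain A record. The function names resolve_a and wait_for_dns are made up for this example:

```shell
# Print the first answer for the A record of a name.
resolve_a() {
  dig +short "$1" A | head -n 1
}

# Poll until the name resolves to the expected address, or give up.
# Arguments: name, expected IP, number of attempts, delay in seconds.
wait_for_dns() {
  name="$1"
  expected="$2"
  attempts="${3:-30}"
  delay="${4:-60}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(resolve_a "$name")" = "$expected" ]; then
      echo "$name now points to $expected"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name still does not point to $expected" >&2
  return 1
}

# Example: check once a minute, for up to half an hour:
#   wait_for_dns writings.com 100.101.102.103
```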

Getting a TLS Certificate From Let’s Encrypt

At this point, your website should be reachable by the following URLs:

http://100.101.102.103/

http://writings.com/

You might have noticed that your browser is not comfortable with those URLs. The thing is: they use insecure HTTP, and there are many good reasons to only use secure HTTP, or HTTPS. Insecure connections aren’t private, and they enable ISPs and other actors to basically tap all of your communications over HTTP. That’s why it’s critically important to make your website available over HTTPS, and to do that, we need to obtain a thing called a TLS certificate (often referred to as an HTTPS certificate).

In ancient times, HTTPS certificates were expensive and hard to set up. Nowadays, thanks to the Snowden revelations and the Let’s Encrypt project (run by the nonprofit Internet Security Research Group, with the EFF among its founding partners), you can get certificates for free, and they usually work out of the box.

First, I recommend reading about Let’s Encrypt and Certbot, although it’s not strictly necessary. In short, Certbot is an open-source program which can take care of setting up HTTPS certificates for you, free of charge. Now, let’s install it:

$ snap install --classic certbot

After installing Certbot, just run it and follow the instructions:

$ certbot

That’s it, now you have a website with a dedicated domain name. It also hides the contents of the traffic from anyone except your readers and yourself. With HTTPS, third parties can still see that someone is connecting to your server, but they can’t tap into the traffic and see what exactly your readers are interested in. Here is the final version of a website URL:

https://writings.com/
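If you’d rather skip the interactive questions, Certbot can also be driven entirely by flags. Below is a minimal sketch which assumes the Apache plugin that ships with Certbot. The function name obtain_certificate, the CERTBOT variable and the email address are made up for this example:

```shell
# Request and install a certificate for one domain without any prompts.
# CERTBOT can be pointed at another program (such as echo) for a dry look.
obtain_certificate() {
  domain="$1"
  email="$2"
  # --agree-tos accepts the Let's Encrypt terms of service on your behalf
  "${CERTBOT:-certbot}" --apache --non-interactive --agree-tos \
    --email "$email" --domains "$domain"
}

# Example:
#   CERTBOT=echo obtain_certificate writings.com me@example.com  # show the flags
#   obtain_certificate writings.com me@example.com               # do it for real
```

Certificates issued by Let’s Encrypt are short-lived, so it’s worth running certbot renew --dry-run afterwards to make sure that the renewal is going to work when the time comes.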

Conclusion

Publishing your writings in a self-sovereign way is really easy, but it might feel frightening if you aren’t familiar with the command line and server administration. The thing is: if writing is your work or even a hobby, being your own publisher is an investment worth making. The scheme I described allows you to update your website in no time, and it also delivers the best possible performance to your readers, saving them time and nerves.

A sceptical person might argue that all this self-sovereignty is a lie, because you’re still dependent on your hosting and DNS providers. Well, DNS is just a convenience feature, and you can live without it. The real problem is your hosting provider, and the fact that it can block your webserver and take away your IP address. This is a seemingly unavoidable dependency, but having this single dependency is still better than having many dependencies, isn’t it?

Is it even possible to make your website fully uncensorable? Yes, and it’s actually pretty easy. The method I described needs only a few little adjustments in order to make your website available via the Tor network. Tor services can be easily hosted from home, and they don’t even need a public IP address. That’s what we’re going to do next, stay tuned.