Now With Ansible!
Now's probably a good time to do things properly and stop depending on a snowflake server
2022-07-10 :: 1448 words
For nearly four years, this website has been running on the lowest tier of DigitalOcean Droplet, initially for free with some credit that I got courtesy of GitHub Education. During that time, I operated two individual servers,[1] both of which were set up entirely by hand.
The first server only contained native apps, but I moved to using Docker containers on the second server. In either case, re-deploying or updating software involved me SSHing into the server, running git pull and manually rebuilding everything. Some of the server configuration files were stored alongside the code for the website, some were untracked and being edited on-the-fly, and I'm sure there was some stuff I'd forgotten about that was quietly ticking away in the background. That's a lot of words to say that I had a snowflake server!
Fast-forward to last month, when DigitalOcean announced that they were going to be hiking their prices, effective July 2022. This meant that I'd end up spending an extra $1 a month for the same spec VPS, which is quite a lot for me and my non-existent income. Because of this, I decided to move to Hetzner, which was better value than DigitalOcean even before the price hike. It was something I'd been planning to do already; this just gave me the motivation to actually do it.
I also took this as a chance to set the server up the proper way,[2] so I could spin up a new copy of it in minutes if I need to, instead of the hours it took before. I did this with a tool called Ansible, which is maintained by Red Hat. Jeff Geerling is quite the proponent of this, and it was his Ansible 101 series that first got me interested in the tool.
Ansible
I think Ansible is really cool. At its most basic level, you use it to run commands against a remote machine over SSH. The remotes you're working with don't need any special software installed - just Python. Ansible itself only needs to be installed on your local machine.
"Playbooks" are the Ansible name for the way you define the ideal state of a remote. They're written in YAML, and leverage a number of built-in and third-party "modules" which each do certain things. Examples of this are managing Cron jobs, editing files and dealing with Docker.
At its most basic level, a playbook contains a list of idempotent tasks that each invoke a module. I ended up writing two playbooks, one called provision.yml and one called update.yml. The beginning of the former looks like this:
---
- name: Provision and set up an Ubuntu amd64 webserver
  become: yes
  hosts: webserver
  tasks:

    - name: Update apt repositories
      tags: [tools]
      ansible.builtin.apt:
        update_cache: yes
        cache_valid_time: 3600

    - name: Install tools
      tags: [tools]
      ansible.builtin.apt:
        name:
          - git
          - curl
          - sqlite3
          - python3-pip
        state: present

    - name: Add SSH login alert script
      tags: [sshAlerts]
      ansible.builtin.copy:
        dest: /usr/bin/on-ssh-login.sh
        src: on-ssh-login.sh
        mode: +x
    # ...
This is pretty simple! It was a breeze getting things set up and working in a usable state with some guesswork. All I had to do was write a long list of tasks, whack them in a playbook and add some tags so I can selectively run certain sections of it.
To run a playbook in its entirety, all I need to do is run ansible-playbook provision.yml. Only running, say, the steps to install Docker and Caddy is similarly simple - ansible-playbook provision.yml --tags "docker,caddy".
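The playbook above targets hosts: webserver, so Ansible needs an inventory that tells it which machine that name refers to. A minimal sketch of one might look like the following; the file name inventory.yml, the address and the login user are all placeholders of mine rather than anything from my actual setup.

```yaml
# inventory.yml (hypothetical file name)
all:
  hosts:
    webserver:                     # matches "hosts: webserver" in the playbook
      ansible_host: 203.0.113.10   # placeholder address from a documentation range
      ansible_user: root           # assumed login user; use whatever you SSH in as
```

Assuming that file sits next to the playbook, you'd point Ansible at it with ansible-playbook -i inventory.yml provision.yml.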
And, in all fairness - those playbooks did work like that! There were, however, a couple of shortcomings.
The first was that any files that a given playbook used had to be shoved in a single directory to keep some semblance of organisation. When you have a collection of files, many of which are called things like config.json or settings.yaml, it becomes hard to keep track of what's what. To get around that, I took the extraordinarily janky approach of prepending the service name to the filename, if it wasn't already there.
server
├── provisionAssets
│   ├── backup.sh
│   └── on-ssh-login.sh
├── provision.yml
├── updateAssets
│   ├── Caddyfile
│   ├── walrss.config.yaml
│   └── website.env
└── update.yml
Second was that it became a little tricky to deal with the outputs of each task. In a playbook, you can optionally "register" the output of a task as a variable with a name. When you have all your tasks in a big file together, you can end up with name collisions, which isn't great.
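To make the collision concrete, here's a rough sketch of how it can happen. The task names and the variable name result are made up for illustration; the point is only that the second register silently replaces the first.

```yaml
# Two unrelated tasks in the same long playbook (illustrative only).
- name: Check the installed Docker version
  ansible.builtin.command: docker --version
  register: result          # first use of the name
  changed_when: false       # a version check never changes anything

- name: Check the installed Caddy version
  ansible.builtin.command: caddy version
  register: result          # same name, so the Docker output is overwritten
  changed_when: false

- name: Print what we think is the Docker version
  ansible.builtin.debug:
    msg: "{{ result.stdout }}"   # actually prints the Caddy version
```

In a 20-line playbook that's easy to spot. Several hundred lines in, with generic names like result or output, it's much less so.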
Luckily, Ansible has a solution for this!
Enter roles
I think the closest comparison to a role in Ansible is a function in a programming language. Each role has its own collection of tasks and files, and the variables you define in it or pass to it are scoped to that role. (One caveat: values saved with register are stored against the host rather than the role, so it's still worth giving those role-specific names.) "Role" seems like an awful name for this functionality, in all honesty, but what do I know?
Using roles allowed me to split my long playbooks up into a single role for each step. I can have a role called setupDocker or configureCaddy or the name of a service, and it'll set up and install Docker, apply a Caddy configuration or update and (re)start a service, respectively.
Let's say you wanted to create a new role called bookstack that installs or updates an installation of Bookstack. Since each individual role lives in its own directory inside the ./roles directory (relative to the playbook), it's a matter of creating the appropriate directory structure, adding files in the right place and including the role in your playbooks.
$ mkdir -p roles/bookstack/tasks
$ mkdir roles/bookstack/files
$ nano roles/bookstack/tasks/main.yml # add your role's tasks here
$ cp ~/bookstack.config.php roles/bookstack/files
# Within a playbook's tasks section:
- name: Install/update Bookstack
  tags: [bookstack]
  ansible.builtin.include_role:
    name: bookstack
    apply: { tags: [bookstack] }
(The apply.tags applies the given tags to all the role's child tasks so you can still use tags to selectively run parts of a playbook.)
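For a sense of what goes in the role itself, here's a sketch of what roles/bookstack/tasks/main.yml could contain. It's just a plain list of tasks with no hosts or tasks wrapper. The destination directory and file modes are assumptions for the sake of the example, not how Bookstack actually needs to be laid out.

```yaml
# roles/bookstack/tasks/main.yml (illustrative sketch)
- name: Create the Bookstack config directory
  ansible.builtin.file:
    path: /opt/bookstack          # hypothetical install location
    state: directory
    mode: "0755"

- name: Copy the Bookstack config
  ansible.builtin.copy:
    src: bookstack.config.php     # looked up in roles/bookstack/files/
    dest: /opt/bookstack/bookstack.config.php
    mode: "0640"
```

The nice part is the src line: because the task lives in a role, Ansible looks for the file in that role's own files directory, so there's no more prefixing filenames with the service name.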
If you're interested in roles as a thing, you can take a look at the Ansible docs for them. One aspect of roles that I didn't cover here is variables that can be used as inputs to your role - most of the roles I wrote made use of these in some shape or form.
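As a quick taste of those input variables, a role can declare a default in its defaults/main.yml and the playbook can override it when including the role. This is a sketch only: the variable name bookstack_config_dir and both paths are invented for the example.

```yaml
# roles/bookstack/defaults/main.yml: fallback used when the caller passes nothing
bookstack_config_dir: /opt/bookstack

---
# In the playbook's tasks section: override the default for this include
- name: Install/update Bookstack
  ansible.builtin.include_role:
    name: bookstack
  vars:
    bookstack_config_dir: /srv/bookstack
```

Inside the role's tasks, the value is then available as "{{ bookstack_config_dir }}", much like a function argument with a default value.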
Some tricky things
While, on the whole, writing my playbooks was pretty simple and only took the best part of a day, that's not to say it was all plain sailing.
The first issue was that, if Ansible detects that a given task has failed, usually by exit code, it'll stop executing the playbook at that point. This is an issue if, say, one of the install scripts you use returns a non-zero exit code when a tool is already installed, causing it to crash your playbook. If this is the case 95% of the time you run the playbook, that's a problem.
This is why Ansible lets you override the default "not 0 means failure" behaviour by inserting a conditional expression into a field called failed_when on your task.
- name: Install rclone
  ansible.builtin.command:
    argv:
      - "bash"
      - "-c"
      - "curl https://rclone.org/install.sh | bash"
  register: command_result
  # 0 means rclone was installed or updated, 3 means it was already up-to-date
  failed_when: command_result.rc not in [0, 3]
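A related field worth knowing about is changed_when, which controls whether a task is reported as having changed anything. Command-style tasks are reported as changed on every run by default, so a sketch like the one below keeps the output honest. It assumes, as above, that exit code 3 means rclone was already up-to-date and that 0 means something was actually installed; the variable name rclone_install is my own.

```yaml
- name: Install rclone
  ansible.builtin.command:
    argv:
      - "bash"
      - "-c"
      - "curl https://rclone.org/install.sh | bash"
  register: rclone_install
  # Treat "installed" (0) and "already up-to-date" (3) as success
  failed_when: rclone_install.rc not in [0, 3]
  # Only report a change when the script actually installed something
  changed_when: rclone_install.rc == 0
```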
One of the other issues I encountered was more subtle, and only showed itself when I tried to update my website's Docker container.
Normally, when you run docker pull ghcr.io/codemicro/walrss:latest, for example, Docker will download any updated version of that image. The issue comes from the fact that Ansible's community.docker.docker_image module will only check for the presence (or lack thereof) of an image, not whether there's an update. The easiest way to fix this is to add force_source to the task, which makes it re-pull the image every single time.
- name: Pull Walrss Docker container
  community.docker.docker_image:
    name: ghcr.io/codemicro/walrss:latest
    source: pull
    force_source: yes
It's an incredibly simple thing, but it's one of those issues that takes up far more time to fix than it feels like it should do.
Another issue with deploying updates to Docker containers with Ansible like this is that if you already have a container running with a given image, any changes to the image won't take effect until the container is recreated from it.
You could just recreate the container every single time the playbook is run, which is fine, I guess, but it'll mark that task as changed every single time - even if the underlying container image never changed. That annoys me irrationally, so we can configure Ansible to only recreate the container when it doesn't already exist or if the image changed.
- name: Pull Walrss Docker image
  community.docker.docker_image:
    name: ghcr.io/codemicro/walrss:latest
    source: pull
    force_source: yes
  register: imagePull

- name: Check Walrss container state
  community.docker.docker_container_info:
    name: walrss
  register: containerInfo

- name: Ensure Walrss is running
  tags: [walrss]
  community.docker.docker_container:
    state: started
    recreate: yes
    # other container settings here
  when: imagePull.changed or (not containerInfo.exists)
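For comparison, Ansible also has a built-in mechanism for "only do this if something changed", called handlers. This isn't what I used, and it's a different technique to the when condition above, but a rough sketch of the same idea might look like this. The container settings are cut down to the bare minimum and the handler name is made up.

```yaml
- name: Update Walrss
  hosts: webserver
  become: true
  tasks:
    - name: Pull Walrss Docker image
      community.docker.docker_image:
        name: ghcr.io/codemicro/walrss:latest
        source: pull
        force_source: true
      notify: Recreate Walrss     # only fires if this task reports "changed"

    - name: Ensure a Walrss container exists and is running
      community.docker.docker_container:
        name: walrss
        image: ghcr.io/codemicro/walrss:latest
        state: started
        # other container settings here

  handlers:
    - name: Recreate Walrss
      community.docker.docker_container:
        name: walrss
        image: ghcr.io/codemicro/walrss:latest
        state: started
        recreate: true
        # other container settings here
```

Handlers run once at the end of the play, and only if a task that notifies them changed something. The trade-off is that the container settings end up written out twice, which is one reason the when approach can be the tidier of the two.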
To conclude...
Ansible is cool. It's fairly boring technology, which is also cool. It might be old, but it's easy to use and get started with (if you ignore how the docs kinda suck sometimes). I'll definitely be continuing to use it in the future.
I very much enjoyed putting my two playbooks together, and I look forward to writing more in the future, because it makes repeatable deployments a breeze.
Footnotes
1. There came a point where I wanted to update the first Droplet to a newer version of Ubuntu, but I was scared doing so might break something catastrophically, so I "just" rebuilt it on a new Droplet.
2. When I say "proper", I mean it in the sense that setting up a server from scratch should be a non-event.