# Arch Infrastructure

This repository contains the complete collection of ansible playbooks and roles for the Arch Linux infrastructure.

It also contains git submodules so you have to run `git submodule update --init
--recursive` after cloning or some tasks will fail to run.

## Requirements

Install these packages:
  - terraform

### Instructions

All systems are set up the same way. For the first time setup in the Hetzner rescue system,
run the provisioning script: `ansible-playbook playbooks/tasks/install-arch.yml -l $host`.
The provisioning script configures a sane basic systemd with sshd. By design, it is NOT idempotent.
After the provisioning script has run, it is safe to reboot.

Once in the new system, run the regular playbook: `HCLOUD_TOKEN=$(misc/get_key.py misc/vault_hetzner.yml hetzner_cloud_api_key) ansible-playbook playbooks/$hostname.yml`.
This playbook is the one regularly used for administering the server and is entirely idempotent.

When adding a new machine you should also deploy our SSH known_hosts file and update the SSH hostkeys file in this git repo.
For this you can simply run the `playbooks/tasks/sync-ssh-hostkeys.yml` playbook and commit the changes it makes to this git repository.
It will also deploy any new SSH host keys to all our machines.

#### Note about GPG keys

The `root_access.yml` file contains the `root_gpgkeys` variable, which determines the users that have access to the vault as well as the borg backup keys.
All the keys should be in the local user's GPG keyring and at **minimum** be locally signed with `--lsign-key`. This is necessary for running either the reencrypt-vault-key
or the fetch-borg-keys tasks.
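
Local signing could be sketched as follows. This assumes the keys are already imported and that their fingerprints (for example those listed in `root_gpgkeys`) are passed as arguments. The `lsign_keys` function and the `DRY_RUN` switch are hypothetical helpers for illustration and are not part of this repository.

```shell
# Locally sign each key given as an argument.
# With DRY_RUN=1 the gpg commands are only printed, not executed.
lsign_keys() {
    for fpr in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'gpg --lsign-key %s\n' "$fpr"
        else
            # gpg asks for confirmation interactively for every key.
            gpg --lsign-key "$fpr" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 lsign_keys <fingerprint> ...` first shows which keys would be signed before gpg is actually invoked.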

#### Note about Ansible dynamic inventories

We use a dynamic inventory script in order to automatically get information for
all servers directly from hcloud. You don't really have to do anything to make
this work but you should keep in mind to NOT add hcloud servers to `hosts`!
They'll be available automatically.

#### Note about first time certificates

The first time a certificate is issued, you have to do this manually. First, configure the DNS to
point to the new server, then run a playbook that includes the nginx role against the server. Then, on the server,
run the following once:

    certbot certonly --email webmaster@archlinux.org --agree-tos --rsa-key-size 4096 --renew-by-default --webroot -w /var/lib/letsencrypt/ -d <domain-name>

Note that some roles already run this automatically.

#### Note about packer

We use packer to build snapshots on hcloud to use as server base images.
In order to use this, you need to install packer and then run

    packer build -var $(misc/get_key.py misc/vault_hetzner.yml hetzner_cloud_api_key env) packer/archlinux.json

This takes some time, after which a new snapshot will have been created in the primary hcloud archlinux project.

#### Note about terraform

We use terraform to provision a part of the infrastructure on hcloud.
The very first time you run terraform on your system, you'll have to init it:

    terraform init -backend-config="conn_str=postgres://terraform:$(misc/get_key.py group_vars/all/vault_terraform.yml vault_terraform_db_password)@state.archlinux.org"

After making changes to the infrastructure in `archlinux.tf`, run

    terraform plan

This will show you planned changes between the current infrastructure and the desired infrastructure.
You can then run

    terraform apply

to actually apply your changes.
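
One optional variation, sketched here as an assumption rather than the workflow this repository prescribes, is to save the reviewed plan to a file and apply exactly that file, so the apply cannot drift from what was reviewed. The helper names, the `tfplan` file name and the `DRY_RUN` switch are hypothetical. `terraform plan -out=FILE` and `terraform apply FILE` are standard terraform usage.

```shell
# Print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write the plan to a file (default: tfplan) so it can be reviewed.
tf_plan() {
    run terraform plan -out="${1:-tfplan}"
}

# Apply exactly the plan that was saved and reviewed earlier.
tf_apply() {
    run terraform apply "${1:-tfplan}"
}
```

Keeping the two steps in separate functions leaves room to review the plan output between `tf_plan` and `tf_apply`.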

We store terraform state on a special server that is the only hcloud server NOT
managed by terraform so that we do not run into a chicken-and-egg problem. The
state server is assumed to just exist, so in the unlikely case that we have to
entirely redo this infrastructure, the state server would have to be set up
manually.

#### SMTP Configuration

All hosts should be relaying email through our primary mx host (currently 'orion'). See docs/email.txt for full details.

#### Note about opendkim

The opendkim DNS data has to be added to DNS manually. The role verifies that the DNS is correct before starting opendkim.

The file that has to be added to the zone is `/etc/opendkim/private/$selector.txt`.
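
To check by hand that the record has been published, a lookup along these lines may help. This is only a sketch and not the check the role itself performs. It relies on the DKIM convention that the key is published as a TXT record at `<selector>._domainkey.<domain>`. The function names are hypothetical, and the selector and domain have to be supplied by you.

```shell
# Build the DNS name under which a DKIM public key is published:
# <selector>._domainkey.<domain>
dkim_record_name() {
    printf '%s._domainkey.%s\n' "$1" "$2"
}

# Query the published TXT record (needs dig and network access).
dkim_lookup() {
    dig +short TXT "$(dkim_record_name "$1" "$2")"
}
```

The output of `dkim_lookup <selector> <domain>` can then be compared with the contents of the `$selector.txt` file.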

### Putting a service in maintenance mode

Most web services with an nginx configuration can be put into maintenance mode by running the playbook with a maintenance variable:

    ansible-playbook -e maintenance=true playbooks/<playbook.yml>

This also works with a tag:

    ansible-playbook -t <tag> -e maintenance=true playbooks/<playbook.yml>

As long as you pass the maintenance variable to the playbook run, the web service will stay in maintenance mode. As soon as you stop
passing it on the command line and run the playbook again, the regular nginx configuration should resume and the service should accept
requests by the end of the run.

Passing `maintenance=false` will also prevent the regular nginx configuration from resuming, but will not put the service into maintenance
mode.

Keep in mind that passing the maintenance variable to the whole playbook, without any tag, will put all web services that support
maintenance mode into maintenance mode. Use tags to affect only the services you want.

Documentation on how to add the maintenance mode to a web service is in `docs/maintenance.txt`.

### Finding servers requiring security updates

Arch-audit can be used to find servers in need of updates for security issues.

    ansible all -a "arch-audit -u"

#### Updating servers

The following steps should be used to update our managed servers:

  * pacman -Syu
  * manually update the kernel, since it is in IgnorePkg by default
  * sync
  * checkservices
  * reboot
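
The steps above could be sketched as a small script, run as root on the server being updated. This is an illustration under assumptions, not a tool shipped in this repository: the `update_server` function and `DRY_RUN` switch are hypothetical, and the kernel package name (`linux` by default here) is a guess that may differ per host. Naming a package from IgnorePkg explicitly makes pacman ask whether to install it anyway.

```shell
# Print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Update one server, stopping at the first step that fails.
update_server() {
    kernel_pkg=${1:-linux}   # assumption: name of the kernel package
    run pacman -Syu || return 1
    # The kernel is in IgnorePkg, so it is requested explicitly.
    run pacman -S "$kernel_pkg" || return 1
    run sync || return 1
    run checkservices || return 1
    run reboot
}
```

Calling `DRY_RUN=1 update_server` prints the sequence without changing anything, which is a cheap way to review it first.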

## Servers

### vostok

#### Services
  - backups

### orion

#### Services
  - repos/sync (repos.archlinux.org)
  - sources (sources.archlinux.org)
  - archive (archive.archlinux.org)
  - torrent tracker hefurd (tracker.archlinux.org)

### luna

#### Services

  - aur (aur.archlinux.org)
  - mailman
  - projects (projects.archlinux.org)

### apollo

#### Services
  - wiki (wiki.archlinux.org)
  - bugs (bugs.archlinux.org)
  - archweb
  - patchwork

### bugs.archlinux.org

#### Services
  - flyspray

### bbs.archlinux.org

#### Services
  - bbs

### phrik.archlinux.org

#### Services
   - phrik (IRC bot). Users in the phrik group are defined in
     the host vars and re-use the archusers role. Users
     in the phrik group are allowed to restart the IRC bot.

### dragon

#### Services
  - build server
  - sogrep
  - arch-boxes (packer)

### state.archlinux.org

#### Services
  - postgres server for terraform state

### quassel.archlinux.org

#### Services
  - quassel core

### homedir.archlinux.org

#### Services
  - ~/user/ webhost

### mirror.pkgbuild.com

#### Services
  - Load balancer for PIA mirrors across the world. Uses Maxmind's GeoIP City
    database for routing users to their nearest mirror. Account information is
    stored in the ansible vault.

### reproducible.archlinux.org

#### Services
  - Runs a master rebuilderd instance with two PIA workers (repro1.pkgbuild.com
    and repro2.pkgbuild.com).

## Ansible repo workflows

### Replace vault password and change vaulted passwords

  - Generate a new key and save it as ./new-vault-pw: `pwgen -s 64 1 > new-vault-pw`
  - `for i in $(ag ANSIBLE_VAULT -l); do ansible-vault rekey --new-vault-password-file new-vault-pw $i; done`
  - Change the key in misc/vault-password.gpg
  - `rm new-vault-pw`
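
Note that `ag ANSIBLE_VAULT -l` lists every file that mentions the string anywhere, which can include documentation. As a possible alternative (a sketch that swaps in GNU grep for ag and is not the command used by this repository), the hypothetical helper below lists only files that have a line starting with the `$ANSIBLE_VAULT` header.

```shell
# List files under a directory (default: current directory) that contain
# a line starting with the Ansible Vault header, skipping .git.
list_vault_files() {
    grep -rl --exclude-dir=.git '^\$ANSIBLE_VAULT' "${1:-.}" | sort
}
```

It can stand in for the `ag` call in the loop above: `for i in $(list_vault_files); do ansible-vault rekey --new-vault-password-file new-vault-pw $i; done`.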

### Re-encrypting the vault after adding or removing a GPG key

  - Make sure you have all the GPG keys **at least** locally signed
  - Run the playbooks/tasks/reencrypt-vault-key.yml playbook and make sure it does not have **any** failed task
  - Test that the vault is working by running ansible-vault view on any encrypted vault file
  - Commit and push your changes

### Fetching the borg keys for local storage

  - Make sure you have all the GPG keys **at least** locally signed
  - Run the playbooks/tasks/fetch-borg-keys.yml playbook
  - Make sure the playbook runs successfully and check the keys under the borg-keys directory

## Backup documentation

Backups should be checked now and then. Some common tasks:

### Listing current backups per server

    borg list borg@vostok.archlinux.org:/backup/<hostname>

Example

    borg list borg@vostok.archlinux.org:/backup/homedir.archlinux.org

### Listing files in a backup

    borg list borg@vostok.archlinux.org:/backup/<hostname>::<archive name>

Example

    borg list borg@vostok.archlinux.org:/backup/homedir.archlinux.org::20191127-084357
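
Since both commands share the same repository location, a pair of small helpers can save some typing. The following is a sketch only: the function names `borg_repo` and `borg_ls` are hypothetical, and it assumes every host's repository follows the `borg@vostok.archlinux.org:/backup/<hostname>` pattern shown above.

```shell
# Build the borg repository location for a host.
borg_repo() {
    printf 'borg@vostok.archlinux.org:/backup/%s\n' "$1"
}

# List all archives of a host, or the files of one archive
# if an archive name is given as the second argument.
borg_ls() {
    host=$1
    archive=${2:-}
    if [ -n "$archive" ]; then
        borg list "$(borg_repo "$host")::$archive"
    else
        borg list "$(borg_repo "$host")"
    fi
}
```

For example, `borg_ls homedir.archlinux.org` lists the archives, and `borg_ls homedir.archlinux.org 20191127-084357` lists the files in that archive.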