Make the WSL image reproducible

Set SOURCE_DATE_EPOCH

Since this commit (which is included in our v7.1.0 pacman package), pacman supports honoring SOURCE_DATE_EPOCH for packages' install date.

For consistency, we set SOURCE_DATE_EPOCH to the same date / timestamp as the archive repo snapshot (i.e. the day before the image version's date; see the rationale in the next change).

This should prevent non-deterministic install timestamps from being recorded, such as:

│ ├── ./var/lib/pacman/local/iana-etc-20251114-1/desc
│ │ @@ -16,15 +16,15 @@
│ │  %ARCH%
│ │  any
│ │
│ │  %BUILDDATE%
│ │  1763578803
│ │
│ │  %INSTALLDATE%
│ │ **-1765650975**
│ │ **+1765651076**
│ │
│ │  %PACKAGER%
│ │  Jelle van der Waa <jelle@archlinux.org>
│ │
│ │  %SIZE%
│ │  4198552
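
As an illustration only, the date arithmetic described above could look like the following Python sketch. The function names are hypothetical, the version format is assumed to be `YYYY.MM.DD.HHMMSS` (as in `archlinux-2025.12.16.154822.wsl`), and pinning the timestamp to midnight UTC is an assumption rather than something this change states.

```python
from datetime import date, datetime, timedelta, timezone


def snapshot_date(image_version: str) -> date:
    """Return the day before the date encoded in the image version."""
    # Only the first three dot-separated fields (year, month, day) are used.
    year, month, day = (int(part) for part in image_version.split(".")[:3])
    return date(year, month, day) - timedelta(days=1)


def source_date_epoch(image_version: str) -> int:
    """Return the snapshot date as a Unix timestamp (assumed: midnight UTC)."""
    day = snapshot_date(image_version)
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


if __name__ == "__main__":
    # Value that a build script could export as SOURCE_DATE_EPOCH.
    print(source_date_epoch("2025.12.16.154822"))
```

Deriving both values from the image version alone means a rebuild of the same version gets the same SOURCE_DATE_EPOCH regardless of when it runs.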

Use archived repo snapshot from archive.archlinux.org

Use an archived repo snapshot from https://archive.archlinux.org to ensure that the same exact versions / releases of packages get installed across builds.

To avoid "bad timing" issues (e.g. an image build starting before the daily snapshot has been uploaded to https://archive.archlinux.org), the archived repo snapshot is taken from the day before the date used as the image version.
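
For reference, a dated snapshot can be addressed with a `Server` URL of the form `https://archive.archlinux.org/repos/YYYY/MM/DD/$repo/os/$arch`. The Python sketch below builds such a URL; the helper name is hypothetical, and how the URL ends up in the build's pacman configuration is not shown here.

```python
from datetime import date


def archive_server_url(snapshot: date) -> str:
    """Build a pacman Server URL pointing at a dated archive snapshot."""
    # $repo and $arch are left literal: pacman expands them itself.
    return f"https://archive.archlinux.org/repos/{snapshot:%Y/%m/%d}/$repo/os/$arch"


if __name__ == "__main__":
    # Snapshot for an image versioned 2025.12.16 (one day earlier).
    print(archive_server_url(date(2025, 12, 15)))
```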

Redirect pacman logs to /dev/null

Pacman records timestamps in its log file (/var/log/pacman.log).
We don't need pacman logs from the image build, so we discard them with --logfile /dev/null.

This should prevent non-deterministic timestamps from being recorded, such as:

│ ├── ./var/log/pacman.log
│ │ @@ -1,125 +1,125 @@
│ │ -[2025-12-14T00:11:43+0000] [ALPM] installed filesystem (2025.10.12-1)
│ │ +[2025-12-14T00:13:28+0000] [ALPM] installed filesystem (2025.10.12-1)

Normalize filesystem and archives mtimes

Normalize the filesystem and archive mtimes by setting them to SOURCE_DATE_EPOCH.

This should prevent non-deterministic timestamps in the filesystem, such as:

│ ├── file list
│ │ @@ -1,582 +1,582 @@
│ │ -drwxr-xr-x   0        0        0        0 2025-12-15 17:01:30.944371 ./var/
│ │ +drwxr-xr-x   0        0        0        0 2025-12-15 17:03:31.247635 ./var/

Also, sort directories by name with tar --sort=name to avoid non-deterministic ordering in archives.
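
The combined effect of both measures (fixed mtimes plus sorted entries) can be sketched with Python's `tarfile` module. This is an illustrative stand-in, not the tar invocation used by the image build; `build_tar` is a hypothetical helper and `epoch` stands for the value of SOURCE_DATE_EPOCH.

```python
import io
import os
import tarfile


def build_tar(root: str, epoch: int) -> bytes:
    """Archive `root` with entries sorted by name and every mtime set to `epoch`."""

    def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Replace the on-disk mtime with the fixed reference timestamp.
        info.mtime = epoch
        return info

    # Collect every path first so the order does not depend on the filesystem.
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    paths.sort()

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in paths:
            # recursive=False: each entry is added exactly once, in sorted order.
            tar.add(path, arcname=os.path.relpath(path, root),
                    recursive=False, filter=normalize)
    return buf.getvalue()
```

Building the same tree twice with the same `epoch` should then yield byte-identical archives, even if file mtimes changed in between.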

Remove pacman keys

Pacman keys are non-deterministic by design, so we purge them and take advantage of the oobe script to regenerate them at the first boot of the image.

This should prevent non-deterministic data in the pacman keyring, such as:

│ ├── ./etc/pacman.d/gnupg/pubring.gpg
│ │┄ xxd not available in path. Falling back to Python hexlify.
│ │ @@ -1,45 +1,45 @@
│ │ -99020d0469408f04011000b9a3a76a44cdaaa630398dd297705ffb5b9906dca5
[...]
│ ├── ./etc/pacman.d/gnupg/tofu.db
│ │ ├── sqlite3 {} .dump
│ │ │ @@ -7,13 +7,13 @@
│ │ │    fingerprint TEXT, email TEXT, user_id TEXT, time INTEGER,
│ │ │    policy INTEGER CHECK (policy in (1, 2, 3, 4, 5)),
│ │ │    conflict STRING, effective_policy INTEGER DEFAULT 0 CHECK (effective_policy in (0, 1, 2, 3, 4, 5)),
│ │ │    unique (fingerprint, email));
│ │ │  CREATE TABLE signatures  (binding INTEGER NOT NULL, sig_digest TEXT,  origin TEXT, sig_time INTEGER, time INTEGER,  primary key (binding, sig_digest, origin));
│ │ │  CREATE TABLE encryptions (binding INTEGER NOT NULL,  time INTEGER);
│ │ │  CREATE TABLE ultimately_trusted_keys (keyid);
│ │ │ -INSERT INTO ultimately_trusted_keys VALUES('0AE7DE748E9EC852');
│ │ │ +INSERT INTO ultimately_trusted_keys VALUES('3653096D0D4F06F3');
│ │ │  CREATE INDEX bindings_fingerprint_email
│ │ │   on bindings (fingerprint, email);
│ │ │  CREATE INDEX bindings_email on bindings (email);
│ │ │  CREATE INDEX encryptions_binding on encryptions (binding);
│ │ │  COMMIT;
│ ├── ./etc/pacman.d/gnupg/trustdb.gpg
│ │┄ xxd not available in path. Falling back to Python hexlify.
│ │ @@ -1,46 +1,46 @@
│ │ -01677067030301050102000069408f07697c67e1000000000000000000000000
│ │ +01677067030301050102000069408f76697c67e1000000000000000000000000
[...]

Remove atime & ctime from archives metadata

Use --pax-option=delete=atime,delete=ctime with tar to get rid of atime & ctime from archive metadata.

This should prevent non-deterministic timestamps in archive metadata, such as:

--- /builds/archlinux/archlinux-wsl/workdir/output/archlinux-2025.12.16.154822.wsl
+++ /builds/archlinux/archlinux-wsl/repro/output/archlinux-2025.12.16.154822.wsl
├── archlinux-2025.12.16.154822.wsl-content
│┄ xxd not available in path. Falling back to Python hexlify.
│ @@ -1,25 +1,25 @@
│  2e2f506178486561646572732f2e000000000000000000000000000000000000
│  0000000000000000000000000000000000000000000000000000000000000000
│  0000000000000000000000000000000000000000000000000000000000000000
│  0000000030303030363434003030303030303000303030303030300030303030
│ -3030303030363100313531323031323034303000303130333333002078000000
│ +3030303030363200313531323031323034303000303130333334002078000000
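
One way to check that no atime / ctime records remain is to inspect the pax extended headers of each archive member. The sketch below uses Python's `tarfile` module for that; the helper name is hypothetical and this check is not part of the change itself.

```python
import io
import tarfile

# pax keywords that record access and inode-change times.
VOLATILE_KEYS = ("atime", "ctime")


def members_with_volatile_times(archive: bytes) -> list[str]:
    """Return names of members whose pax headers still record atime or ctime."""
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        return [
            member.name
            for member in tar
            if any(key in member.pax_headers for key in VOLATILE_KEYS)
        ]
```

An empty result means none of the members carry these keywords in their extended headers.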
Edited by Robin Candau