Tiered multiarch support
Idea proposal Checks
NOTE: The below check boxes must be checked before the accompanying idea will be considered.
- I have checked that the idea is not directly tied to a specific project. For example: "Show label icons in the package overview web page" must be a feature request in the ArchWeb repository.
- I have carefully checked this idea is not already covered by any open or closed ideas.
- I understand that I hold no copyright claims and that this idea can be adapted and used by Arch Linux in any arbitrary shape or form.
Summary
Arch should support architectures in a three-tiered system. Pacman and other infra are updated to also allow microarchitectures, where "any" is relegated to a special microarchitecture of all architectures.
Motivation
In the past, Arch supported multiple architectures in full: i686 and x86_64. pacman and makepkg still work with architectures and have to specify explicitly what architecture is used, even though x86_64 is the only officially supported architecture.
Nowadays, x86_64 is not the only widespread architecture. AArch64 is very widely used, even for workstations, and RISC-V and other architectures are on the horizon. It's clear that at the moment, the Arch project can't simply drop everything and switch from supporting one architecture to several, and should instead adopt a tiered system where architectures can be partially supported in the meantime.
Additionally, with the advent of standard microarchitectures for x86_64, there has been some desire to move to supporting x86_64-v3, but this would require additional work to ensure that other x86_64 packages are still usable.
The status quo right now is to have spinoff projects which offer their own pacman repos for their own architectures. This is less than desirable, because it means that there's no path for these projects to become supported by the official project, with the most prominent case being AArch64 support under Arch Linux ARM. Having support built into governance and infrastructure will make the relationship to these projects clearer.
Specification
Architecture tiers
Instead of the existing, tierless system, architectures are assigned to tiers:
- Tier 1 is the status quo. Full attention is put toward running bare-metal machines on this architecture, including installation. All packages should be built for this architecture. All supported mirrors must serve this architecture.
- Tier 2 removes the requirement for bare-metal and installation support and instead shifts the focus to supporting as many packages as possible, and keeping existing packages up-to-date. Packages in the `base` and `base-devel` groups (now metapackages) must work. Not all mirrors have to provide packages for this architecture, but at least a few prominent ones should. Official container images are provided for these architectures.
- Tier 3 is the best-effort tier, where the only requirement is that infrastructure must exist to build and serve packages for this architecture. Support is not guaranteed, and avenues for filing bugs may be restricted. In practice, this essentially only requires that pacman and other tools be aware of this architecture and its microarchitectures, and nothing else. This does not even mean that pacman needs to build under these architectures, just that it knows about them in its code.
Tiers are also specific to microarchitectures, which will be explained below. However, supporting a microarchitecture tier doesn't require all packages to target that microarchitecture, just that packages work on that microarchitecture -- this would allow, for example, supporting x86_64-v3 as tier 1 while most built packages still target x86_64-v1.
Microarchitectures
Pacman and makepkg should support the notion of microarchitectures in addition to architectures. Architectures are completely incompatible with each other, whereas a microarchitecture can run packages built for its parent microarchitectures in addition to those built for its base architecture.
Repositories for pacman will work exactly the same as they do today, where only one architecture's repository is downloaded on any given installation thanks to the inclusion of an $arch variable on the mirror list. So, in a repository for x86_64-v3, one could expect to find x86_64-v2, x86_64 (v1 implied), and "any" packages.
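As a rough illustration of the repository rule above, the following Python sketch walks a microarchitecture's parent chain to list which package architectures a repository could contain. The `PARENT` table and the `installable_arches` function are hypothetical names invented for this example; nothing like them exists in pacman today.

```python
# Hypothetical parent links: each microarchitecture points at the next-lower
# level, and a base architecture points at None. Not part of pacman.
PARENT = {
    "x86_64-v4": "x86_64-v3",
    "x86_64-v3": "x86_64-v2",
    "x86_64-v2": "x86_64",
    "x86_64": None,  # base level, i.e. x86_64-v1
    "aarch64": None,
}


def installable_arches(repo_arch):
    """List the package architectures a repository for repo_arch may hold,
    most specific first, ending with the special "any" architecture."""
    if repo_arch not in PARENT:
        raise ValueError(f"unknown architecture: {repo_arch}")
    chain = []
    current = repo_arch
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    chain.append("any")
    return chain


print(installable_arches("x86_64-v3"))
# ['x86_64-v3', 'x86_64-v2', 'x86_64', 'any']
```

A repository for a base architecture such as aarch64 would, under the same assumptions, hold only aarch64 and "any" packages.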
In a PKGBUILD, listing a microarchitecture in the architectures list indicates that at least this microarchitecture is required for the build. So, for example, listing i686 means the package will refuse to build for i586. However, if i586 is listed, this will not prevent the package from being built for i686. Makepkg should accept configuration options to determine what microarchitecture to use by default, and the default settings should target a "comfortable default" similar to how `-mtune=generic` is added to `CFLAGS`.
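The build rule for listed microarchitectures could be sketched as follows. This is only an illustration in Python, not how makepkg is implemented: the `FAMILIES` table and the `can_build` function are assumptions made for the example, with each family ordered from least to most capable level.

```python
# Microarchitecture levels per base architecture, ordered from least to most
# capable. The table is illustrative and not taken from makepkg.
FAMILIES = [
    ["i386", "i486", "i586", "i686"],
    ["x86_64", "x86_64-v2", "x86_64-v3", "x86_64-v4"],
    ["aarch64"],
]


def can_build(listed, target):
    """Return True if a PKGBUILD listing `listed` may be built for `target`.

    A listed microarchitecture is treated as a minimum requirement, so the
    build is allowed when the target is at that level or above it.
    """
    if "any" in listed:
        return True
    for family in FAMILIES:
        if target in family:
            target_level = family.index(target)
            return any(
                arch in family and family.index(arch) <= target_level
                for arch in listed
            )
    return False  # target is not a known architecture


print(can_build(["i686"], "i586"))  # False
print(can_build(["i586"], "i686"))  # True
```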
Although it could in theory be possible, this proposal does not make the distinction of "this package would not gain anything by compiling for a more specific microarchitecture" since it would greatly complicate logic, and doesn't really seem like it would work anyway. With x86_64 as an example, compilers can take advantage of vectorisation and improve code that doesn't even use SIMD normally, and this would mean that even simple programs compiled for v1 could become different and incompatible under v3. So, if `x86_64` by itself were specified in a `PKGBUILD` and a system on `x86_64-v3` were building it, then it would build for `x86_64-v3` unless it were told otherwise.
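To make the target selection concrete, here is a small Python sketch of how a build tool might pick its target under this rule. The names `X86_64_LEVELS` and `resolve_target`, and the `override` parameter standing in for "told otherwise", are all hypothetical; the sketch covers only the x86_64 family.

```python
# x86_64 microarchitecture levels, least to most capable (illustrative only).
X86_64_LEVELS = ["x86_64", "x86_64-v2", "x86_64-v3", "x86_64-v4"]


def resolve_target(listed, system_target, override=None):
    """Pick the microarchitecture a build would target.

    listed        -- architectures named in the PKGBUILD
    system_target -- the builder's configured default microarchitecture
    override      -- an explicit request to build for a different level
    """
    target = override if override is not None else system_target
    known = [arch for arch in listed if arch in X86_64_LEVELS]
    if not known or target not in X86_64_LEVELS:
        raise ValueError("this sketch only handles x86_64 levels")
    # The lowest listed level is the minimum the package requires.
    minimum = min(X86_64_LEVELS.index(arch) for arch in known)
    if X86_64_LEVELS.index(target) < minimum:
        raise ValueError(f"{target} is below the required {X86_64_LEVELS[minimum]}")
    return target


print(resolve_target(["x86_64"], "x86_64-v3"))                     # x86_64-v3
print(resolve_target(["x86_64"], "x86_64-v3", override="x86_64"))  # x86_64
```

The design choice shown here is that the PKGBUILD only states a floor; the builder's configuration, not the PKGBUILD, decides how far above that floor the package is compiled.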
Maintainers should not be expected to modify their architecture settings to be the broadest possible, and only need to include what's officially supported. This means that if Arch were to only support x86_64-v3, then maintainers would not be expected to determine whether it's appropriate to list the microarchitecture as x86_64 (-v1) or x86_64-v2 in the PKGBUILD, nor would they be expected to build for these microarchitectures.
Changes to "social" infrastructure
Here, "social" infrastructure refers to things like mailing lists, bug trackers, repos, etc.
The largest change will be that there can be multiple maintainers for the same package, if that package is to be built for multiple architectures. In these cases, the PKGBUILD will be shared, but the responsibility to build and update the package will be separate. Implicitly, the multiple maintainers for a package would ensure that any changes made to the `PKGBUILD` wouldn't break other architectures of the same (or higher) tier.
Flagging a package as out of date will be specific to an architecture, and filing bugs will have to include an architecture as well, although there are some bugs that will be architecture-independent. This is similar to how users filing bugs today should include details about their systems, even though these may not all matter.
Relations with mirrors will also have to change, where each mirror will be responsible for reporting what architectures it includes. Mirrors will be allowed to list themselves even if they don't include all of the architectures, or only include lower-tier architectures, but should be separated if they don't support tier 1 architectures.
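One way to picture the mirror rule is a classification based on the architectures a mirror reports. The Python sketch below is only an assumption about how such a check might look: the `TIERS` assignment mirrors the proposal's suggested starting point, and `classify_mirror` with its three labels is invented for illustration.

```python
# Hypothetical tier assignment following the proposal's suggested starting
# point: x86_64 at tier 1, the other architectures at tier 3.
TIERS = {"x86_64": 1, "aarch64": 3, "i686": 3}


def classify_mirror(served):
    """Classify a mirror by the set of architectures it reports serving.

    "main"     -- serves every tier 1 architecture
    "separate" -- serves only lower-tier architectures
    "unlisted" -- serves no known architecture at all
    """
    served = set(served)
    tier1 = {arch for arch, tier in TIERS.items() if tier == 1}
    if tier1 <= served:
        return "main"
    if served & set(TIERS):
        return "separate"
    return "unlisted"


print(classify_mirror(["x86_64", "aarch64"]))  # main
print(classify_mirror(["aarch64"]))            # separate
```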
Perhaps the most important change would be that maintainers of spinoff projects for other architectures, like Arch Linux ARM and Arch Linux 32, would have the option to become official maintainers supporting these projects under the official Arch umbrella, just under a lower tier. These maintainers would have a similar level to those supporting community packages, except that these architectures themselves would have a level of support that's less guaranteed.
It's important to remark that none of these changes necessitate immediately supporting any packages, but rather modifying existing infrastructure to accommodate additional architectures, and enabling these architectures to become part of the Arch project instead of requiring side-projects.
Changes to tech infrastructure
As clarified earlier, both pacman and makepkg will contain the bulk of the required changes. The mirrorlist API and potentially the security/CVE API will need changes to include architectures.
A lot of existing infrastructure already retains the ability to search by architecture, namely the package search. Although the AUR search doesn't currently filter by architecture, it could easily be modified to do so.
Proposal for supported architectures
For now, x86_64 (with proposed alias x86_64-v1) will remain the only tier 1 architecture. I propose that the following architectures be added under tier 3. Note that this would only require updating pacman and makepkg to recognise the microarchitecture hierarchy, not anything else for now. A less-than sign indicates microarchitecture relationships.
- x86_64-v2 < x86_64-v3 < x86_64-v4
- aarch64
- i386 < i486 < i586 < i686
Justification for these:
- x86_64 (implicitly v1) is currently supported, and the ability to support modern CPU features like AVX can dramatically speed up some programs. The goal would be to eventually decide on a level to promote to tier 1 (likely v3) and then bump v1 down to tier 2 or even tier 3.
- AArch64 is the most standardised version of ARM with substantially more devices having mainline kernel support. Additionally, it's becoming increasingly popular on servers. It would be a good candidate for eventual T2 or T1 promotion.
- i686 was the only other architecture supported officially by Arch, and it makes sense to at least parse and recognise it under the new system. It's unlikely this would ever gain anything beyond T3 support, but simply parsing it seems reasonable. Additionally, if there were ever a proposal to somehow incorporate multilib into this model, having some awareness of i686 would be helpful.
I intentionally omitted all the variants of ARM32 just due to the immense confusion of how these architectures are named. That can be reserved for a later discussion.
Outcomes
Although not strictly required for the proposal, it's worth talking about some of the things that could happen under this tiered model.
Although the situation is much better now than it used to be, ARM has historically had a problem with devices gaining mainline kernel support, and vendors often had their own kernel and bootloader forks to support running on their hardware. With a tier 2 level of support for ARM, the main Arch project could still build and provide a lot of the packages for the base system (namely the `base` and `base-devel` groups) while the community could pick up the remainder and offer custom repositories containing these vendor-specific packages.
Having pacman understand microarchitectures could allow, for example, the Arch Linux 32 project to offer a variety of builds of 32-bit packages at different microarchitecture levels, so that someone on a super-old machine could piece together supported packages with some added effort, custom repos, and the AUR, without having to completely depart from pacman and its existing tooling.
I don't believe that supporting all this would be a massive burden on the project, but I don't believe it would be trivial either. The main purpose of the proposal is to add a structure for supporting other architectures at a variable commitment level rather than an all-or-nothing system like the current one.