Refactor: speed up all clone operations by squashing all commit history to 1 commit

Evangeline Rome requested to merge nikelborm/pacman:master into master

I added --depth 1 to every git clone command that did not already pass --single-branch or --depth, so that only the latest state of the repo is downloaded, without the full commit history.

--single-branch is implied when using --depth, according to the git manual:

--depth <depth>
Create a shallow clone with a history truncated to the specified number of commits. Implies --single-branch unless --no-single-branch is given to fetch the histories near the tips of all branches. If you want to clone submodules shallowly, also pass --shallow-submodules.

My big problem with yay was that it always cloned the entire repo with its entire commit history. If a function, file, or class was created and later deleted, it is no longer present in the current working tree, but it is still present in the commits and is still sent to the user over the network during the clone.

--depth 1 solves the problem: it fetches only the default branch of the repo, and only its latest state. The effect is similar to squashing all commits into one, although the history is truncated rather than rewritten.
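The difference between a full and a shallow clone can be checked locally, without any network access. The following is a minimal sketch, not part of this merge request: the repository names (source, full, shallow), the file name, and the demo identity are all made up for illustration. It assumes a git version recent enough to support `git rev-parse --is-shallow-repository`.

```shell
#!/bin/sh
set -eu

# Throwaway workspace; every path and name below is illustrative.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small source repository with three commits.
git init -q source
for i in 1 2 3; do
    echo "revision $i" > source/file.txt
    git -C source add file.txt
    git -C source -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $i"
done

# A file:// URL is used on purpose: for plain local paths git ignores --depth.
git clone -q "file://$workdir/source" full
git clone -q --depth 1 "file://$workdir/source" shallow

# Count the commits reachable from HEAD in each clone.
full_commits=$(git -C full rev-list --count HEAD)
shallow_commits=$(git -C shallow rev-list --count HEAD)
is_shallow=$(git -C shallow rev-parse --is-shallow-repository)

echo "full clone commits:    $full_commits"
echo "shallow clone commits: $shallow_commits"
echo "shallow repository:    $is_shallow"
```

The full clone should report all three commits, while the shallow clone should report a single commit and identify itself as a shallow repository. Both working trees contain the same file.txt.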

There are loads and loads of packages that take a long time to download. One example in the AUR is amf-headers-git, which references the AMF repo (GPUOpen-LibrariesAndSDKs/AMF).

Cloning the AMF repo, even over a stable SSH connection, takes SIGNIFICANTLY more time and disk space than cloning it with --depth 1.

Here is a benchmark (the transfer speed shown in the git output reflects only the last few seconds of the transfer)

$ time git clone git@github.com:GPUOpen-LibrariesAndSDKs/AMF.git

Cloning into 'AMF'...
remote: Enumerating objects: 7632, done.
remote: Counting objects: 100% (2425/2425), done.
remote: Compressing objects: 100% (1138/1138), done.
remote: Total 7632 (delta 1260), reused 2390 (delta 1247), pack-reused 5207
Receiving objects: 100% (7632/7632), 848.69 MiB | 1.59 MiB/s, done.
Resolving deltas: 100% (4242/4242), done.
Updating files: 100% (5085/5085), done.

real	8m58.996s
user	0m36.857s
sys	0m16.024s


$ du -s AMF
1481388	AMF

$ time git clone --depth 1 git@github.com:GPUOpen-LibrariesAndSDKs/AMF.git amf_faster_smaller

Cloning into 'amf_faster_smaller'...
remote: Enumerating objects: 1923, done.
remote: Counting objects: 100% (1923/1923), done.
remote: Compressing objects: 100% (1504/1504), done.
remote: Total 1923 (delta 675), reused 1260 (delta 384), pack-reused 0
Receiving objects: 100% (1923/1923), 142.80 MiB | 1.37 MiB/s, done.
Resolving deltas: 100% (675/675), done.
Updating files: 100% (5085/5085), done.

real	1m49.330s
user	0m7.058s
sys	0m3.369s


$ du -s amf_faster_smaller
758384	amf_faster_smaller

Cloning the repo without --depth 1 takes 8*60 + 58 = 538 seconds

Cloning the repo with --depth 1 takes 1*60 + 49 = 109 seconds

The repo cloned without --depth 1 takes 1481388 KiB, about 1.41 GiB (du -s reports 1 KiB blocks by default on Linux, not bytes)

The repo cloned with --depth 1 takes 758384 KiB, about 741 MiB

We get a 100 - (758384 / 1481388) * 100 = 48.8% decrease in space on disk

We get a 100 - (109 / 538) * 100 = 79.7% decrease in cloning time
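The two percentages above can be reproduced with a few lines of shell. This is only a sketch of the arithmetic: the `reduction` helper and the variable names are made up for illustration, and the figures are copied from the benchmark output.

```shell
#!/bin/sh
# Figures copied from the benchmark: wall-clock seconds and du -s totals.
full_seconds=538
shallow_seconds=109
full_du=1481388
shallow_du=758384

# Percentage reduction: 100 - (new / old) * 100, rounded to one decimal place.
# (Hypothetical helper, not part of yay or pacman.)
reduction() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f", 100 - (new / old) * 100 }'
}

time_saved=$(reduction "$full_seconds" "$shallow_seconds")
space_saved=$(reduction "$full_du" "$shallow_du")

echo "clone time reduced by ${time_saved}%"
echo "disk usage reduced by ${space_saved}%"
```

Because both results are ratios, the unit that du reports in does not affect the disk-usage percentage.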

This is not my first pull request for this change. I first tried to open a PR against yay, but was redirected here.