Use codegen-units = 1 for release mode

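With a single codegen unit, each crate is compiled as one LLVM module, so
the optimizer sees the whole crate at once. This tends to yield slightly
better codegen, at the cost of slower release builds.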
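
For reference, a minimal sketch of the setting, assuming it lives in the
release profile of the top-level Cargo.toml (the file location and the
absence of other profile keys are assumptions, not taken from this commit):

    # Cargo.toml (assumed location)
    [profile.release]
    # Compile each crate as a single codegen unit instead of splitting it
    # for parallel code generation.
    codegen-units = 1

The setting only affects builds that use the release profile, such as
`cargo build --release`; debug builds keep their default parallelism.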