[v2,00/10] A couple of CI improvements

Message ID: 20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im

Patrick Steinhardt Jan. 6, 2025, 11:16 a.m. UTC
Hi,

this patch series addresses a couple of issues I've found while
investigating flaky CI jobs. Besides two more fixes for flaky jobs, it
also removes some stale code and simplifies the setup on GitHub Actions
to always use containerized jobs on Linux.

Test runs can be found for GitLab [1] and GitHub [2].

Changes in v2:

  - Expand a bit on the reasoning behind the conversion to use
    containerized jobs.
  - Fix commit message typo.
  - Properly fix the race in t7422 via pipe stuffing, as proposed by
    Peff.
  - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277
[2]: https://github.com/git/git/pull/1865

---
Patrick Steinhardt (10):
      t0060: fix EBUSY in MinGW when setting up runtime prefix
      t7422: fix flaky test caused by buffered stdout
      github: adapt containerized jobs to be rootless
      github: convert all Linux jobs to be containerized
      github: simplify computation of the job's distro
      gitlab-ci: remove the "linux-old" job
      gitlab-ci: add linux32 job testing against i386
      ci: stop special-casing for Ubuntu 16.04
      ci: use latest Ubuntu release
      ci: remove stale code for Azure Pipelines

 .github/workflows/main.yml  | 78 ++++++++++++++++++++++-----------------------
 .gitlab-ci.yml              | 19 ++++++-----
 ci/install-dependencies.sh  |  6 ++--
 ci/lib.sh                   | 34 +++-----------------
 ci/print-test-failures.sh   |  5 ---
 t/t0060-path-utils.sh       | 10 +++---
 t/t7422-submodule-output.sh | 10 ++++--
 7 files changed, 69 insertions(+), 93 deletions(-)

Range-diff versus v1:

 1:  8ef1870c39 =  1:  14a80c2683 t0060: fix EBUSY in MinGW when setting up runtime prefix
 2:  f0647aad30 <  -:  ---------- t7422: fix flaky test caused by buffered stdout
 -:  ---------- >  2:  967e76f482 t7422: fix flaky test caused by buffered stdout
 3:  2768ecb60c =  3:  bd2bae13e4 github: adapt containerized jobs to be rootless
 4:  3a8aafdc32 !  4:  bc0bf7b8d5 github: convert all Linux jobs to be containerized
    @@ Commit message
         The latter is more flexible because it allows us to freely pick whatever
         container image we want to use for a specific job, while the former only
         allows us to pick from a handful of different distros. The containerized
    -    jobs shouldn't cause a significant slowdown, either, so they do not have
    -    any significant upside to the best of my knowlegde. The only upside that
    -    they did have before the preceding commit is that they run as a non-root
    -    user, but that has been addressed now.
    +    jobs do not have any significant downsides to the best of my knowledge:
     
    -    Convert all Linux jobs to be containerized for additional flexibility.
    +      - They aren't significantly slower to start up. A quick comparison by
    +        Peff shows that the difference is mostly lost in the noise:
    +
    +                job             |  old | new
    +            --------------------|------|------
    +            linux-TEST-vars      11m30s 10m54s
    +            linux-asan-ubsan     30m26s 31m14s
    +            linux-gcc             9m47s 10m6s
    +            linux-gcc-default     9m47s  9m41s
    +            linux-leaks          25m50s 25m21s
    +            linux-meson          10m36s 10m41s
    +            linux-reftable       10m25s 10m23s
    +            linux-reftable-leaks 27m18s 27m28s
    +            linux-sha256          9m54s 10m31s
    +
    +        Some jobs are a bit faster, some are a bit slower, but there does
    +        not seem to be any significant change.
    +
    +      - Containerized jobs run as root, which keeps a couple of tests from
    +        running. This has been addressed in the preceding commit though,
    +        where we now use setpriv(1) to run tests as a separate user.
    +
    +      - GitHub injects a Node binary into containerized jobs, which is
    +        dynamically linked. This has led to some issues in the past [1], but
    +        only for our 32 bit jobs. The issues have since been resolved.
    +
    +    Overall there seem to be no downsides, but the upside is that we have
    +    more control over the exact image that these jobs use. Convert the Linux
    +    jobs accordingly.
    +
    +    [1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
 5:  a50ee3dd9a !  5:  22bd775ad0 github: simplify computation of the job's distro
    @@ Commit message
     
         There are a couple of exceptions:
     
    -      - The "linux32" job, w whose distro name is different than the image
    +      - The "linux32" job, whose distro name is different than the image
             name. This is handled by adapting all sites to use the new name.
     
           - The "alpine" and "fedora" jobs, neither of which specify a tag for
 6:  b31305597e =  6:  ddce6be0b6 gitlab-ci: remove the "linux-old" job
 7:  dfa41f5593 =  7:  40a0c1e22a gitlab-ci: add linux32 job testing against i386
 8:  bd1efb0373 =  8:  d775afb9c3 ci: stop special-casing for Ubuntu 16.04
 9:  fa505756a7 =  9:  0dd988643f ci: use latest Ubuntu release
10:  c64af8aa78 = 10:  bdca84eebd ci: remove stale code for Azure Pipelines

---
base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694
change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78