Message ID | xmqqy140o2kb.fsf@gitster.g (mailing list archive) |
---|---|
State | Accepted |
Commit | 90f2c7240ccc7fb349199b4ded6b47702b40331f |
Headers | show |
Series | ci: remove 'Upload failed tests' directories' step from linux32 jobs | expand |
Junio C Hamano <gitster@pobox.com> writes: > Linux32 jobs seem to be getting: > > Error: This request has been automatically failed because it uses a > deprecated version of `actions/upload-artifact: v1`. Learn more: > https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/ > > before doing anything useful. For now, disable the step. > > Ever since actions/upload-artifact@v1 got disabled, mentioning the > offending version of it seems to stop anything from happening. At > least this should run the same build and test. > > See > > https://github.com/git/git/actions/runs/10780030750/job/29894867249 > > for example. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > .github/workflows/main.yml | 6 ------ > 1 file changed, 6 deletions(-) I refrain from merging my own patches until somebody else at least comments on them, but I'll make an exception and merge this to 'next', as a few "container" jobs _always_ fail to start otherwise. At least with the step removed, a run without any test failures would tell us that these container tasks are OK, which is much better. If somebody finds a better solution (i.e., a way to extract trash directories of failed tests', with actions/upload-artifact@v1), that can be added later on top. > diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml > index 1ee0433acc..97f9b06310 100644 > --- a/.github/workflows/main.yml > +++ b/.github/workflows/main.yml > @@ -365,12 +365,6 @@ jobs: > with: > name: failed-tests-${{matrix.vector.jobname}} > path: ${{env.FAILED_TEST_ARTIFACTS}} > - - name: Upload failed tests' directories > - if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname == 'linux32' > - uses: actions/upload-artifact@v1 # cannot be upgraded because Node.js Actions aren't supported in this container > - with: > - name: failed-tests-${{matrix.vector.jobname}} > - path: ${{env.FAILED_TEST_ARTIFACTS}} > static-analysis: > needs: ci-config > if: needs.ci-config.outputs.enabled == 'yes'
On Wed, Sep 11, 2024 at 03:32:46PM -0700, Junio C Hamano wrote: > I refrain from merging my own patches until somebody else at least > comments on them, but I'll make an exception and merge this to 'next', > as a few "container" jobs _always_ fail to start otherwise. At least > with the step removed, a run without any test failures would tell us > that these container tasks are OK, which is much better. > > If somebody finds a better solution (i.e., a way to extract trash > directories of failed tests', with actions/upload-artifact@v1), that > can be added later on top. I looked at this a bit last night, but as I didn't come up with any useful solution, I refrained from replying. I was going to do a brief write-up of all of my dead ends, but after banging my head against the wall a bit more, I think I might actually have something. ;) I think a better path forward is to figure out how to keep up to date with the upload-artifact action for all jobs. The root of the issue is that all of the runner images are 64-bit, and the actions aren't prepared to run inside a 32-bit container. They're written in javascript and ship with their own node.js, but it doesn't work inside the container due to dynamic linking issues. So you probably saw a message like this: exec /__e/node20/bin/node: no such file or directory in the job log. That funky path is the node binary that ships with the action, and the ENOENT is because it can't find the runtime linker. Aside: I think the world would be a better place if they shipped a totally static build of node, but there may be some complications there. I found some discussion here: https://github.com/actions/setup-node/issues/922 https://github.com/actions/runner/issues/2906 There are two options here, I think: 1. Instead of running everything inside the container, we can run the actions outside (on the normal 64-bit runner image), and just do the build/test steps inside the container by manually running docker with the appropriate mounts. This was the suggested here: https://github.com/actions/runner/issues/1491#issuecomment-970418408 and apparently is how libgit2 works (I don't think it's too hard to do so, but their infrastructure wasn't totally trivial). 2. After some tricky debugging via tmate[1], I found that we can install the necessary libc bits within the container. But there's another catch! They've moved to node20, which requires glibc2.28, and the eight-year-old xenial release we're using is too old for that. We have to move to Focal, which was released in 2020. It feels like (1) is probably the more robust and flexible solution overall. But we can get to (2) a lot more easily from where we are now. Doing this: diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index cbdb677258..6b45dcad9d 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -347,8 +347,8 @@ jobs: image: alpine distro: alpine-latest - jobname: linux32 - image: daald/ubuntu32:xenial - distro: ubuntu32-16.04 + image: i386/ubuntu:focal + distro: ubuntu32-20.04 - jobname: pedantic image: fedora distro: fedora-latest @@ -358,27 +358,21 @@ jobs: runs-on: ubuntu-latest container: ${{matrix.vector.image}} steps: - - uses: actions/checkout@v4 - if: matrix.vector.jobname != 'linux32' - - uses: actions/checkout@v1 # cannot be upgraded because Node.js Actions aren't supported in this container + - name: prepare libc6 for actions + run: apt update && apt install -y libc6-amd64 lib64stdc++6 if: matrix.vector.jobname == 'linux32' + - uses: actions/checkout@v4 - run: ci/install-dependencies.sh - run: ci/run-build-and-tests.sh - name: print test failures if: failure() && env.FAILED_TEST_ARTIFACTS != '' run: ci/print-test-failures.sh - name: Upload failed tests' directories - if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname != 'linux32' + if: failure() && env.FAILED_TEST_ARTIFACTS != '' uses: actions/upload-artifact@v4 with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} - - name: Upload failed tests' directories - if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname == 'linux32' - uses: actions/upload-artifact@v1 # cannot be upgraded because Node.js Actions aren't supported in this container - with: - name: failed-tests-${{matrix.vector.jobname}} - path: ${{env.FAILED_TEST_ARTIFACTS}} static-analysis: needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' lets us eliminate the special cases and just work with the regular versions of each action. That gets us through the checkout action. There does seem to be some fallout from using the more recent image. Now running "linux32" to change the machine personality in ci/install-dependencies runs afoul of some seccomp restrictions. Loosening docker like this works: diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 6b45dcad9d..ed66c0bea4 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -356,7 +356,9 @@ jobs: jobname: ${{matrix.vector.jobname}} distro: ${{matrix.vector.distro}} runs-on: ubuntu-latest - container: ${{matrix.vector.image}} + container: + image: ${{matrix.vector.image}} + options: --security-opt=seccomp=unconfined steps: - name: prepare libc6 for actions run: apt update && apt install -y libc6-amd64 lib64stdc++6 but it's not at all clear to me why we need to run "linux32" in the first place. Sure, it's a 64-bit kernel still (we're just in a userspace container) but the system knows its 32-bit, and stuff like "getconf LONG_BIT" returns 32. So another solution may be just: diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh index 2190c82313..8a8b832a35 100755 --- a/ci/install-dependencies.sh +++ b/ci/install-dependencies.sh @@ -60,11 +60,9 @@ ubuntu-*) chmod a+x "$CUSTOM_PATH/jgit" ;; ubuntu32-*) - sudo linux32 --32bit i386 sh -c ' - apt update >/dev/null && - apt install -y build-essential libcurl4-openssl-dev \ - libssl-dev libexpat-dev gettext python >/dev/null - ' + sudo apt update >/dev/null && + sudo apt install -y build-essential libcurl4-openssl-dev \ + libssl-dev libexpat-dev gettext python >/dev/null ;; macos-*) export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1 but I'm worried I'm missing something, as it's been a while since I poked at multi-arch stuff. After that, we need this: diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh index 4781cd20bb..2190c82313 100755 --- a/ci/install-dependencies.sh +++ b/ci/install-dependencies.sh @@ -22,6 +22,9 @@ then } fi +# Required so that apt doesn't wait for user input on certain packages. +export DEBIAN_FRONTEND=noninteractive + case "$distro" in alpine-*) apk add --update shadow sudo build-base curl-dev openssl-dev expat-dev gettext \ @@ -34,8 +37,6 @@ fedora-*) dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null ;; ubuntu-*) - # Required so that apt doesn't wait for user input on certain packages. - export DEBIAN_FRONTEND=noninteractive sudo apt-get -q update sudo apt-get -q -y install \ to cover the ubuntu32 job (I guess it just wasn't needed on the ancient image we were using). After that, we get all the way to the actual build. Looks like it fails because we're missing zlib. Presumably that was included by default in the old image, but not the new, and it needs to be added to package install list. Taken together, I suspect we could just treat "ubuntu32" the same as "ubuntu" in the ci/install-dependencies script. That would fix all of those issues, and simplify the script. Assuming I'm not missing something. -Peff [1] Using tmate to diagnose failing "node" proved a bit tricky, because the tmate action also uses node, so it fails, too! For future reference, this is what I used to get it running manually: - name: Setup tmate if: always() run: | apt update && apt install -y xz-utils wget && v=tmate-2.4.0-static-linux-i386 && wget https://github.com/tmate-io/tmate/releases/download/2.4.0/$v.tar.xz && xzcat $v.tar.xz | tar xvvf - && cd $v && ./tmate -F
On Thu, Sep 12, 2024 at 03:56:32AM -0400, Jeff King wrote: > > If somebody finds a better solution (i.e., a way to extract trash > > directories of failed tests', with actions/upload-artifact@v1), that > > can be added later on top. > > I looked at this a bit last night, but as I didn't come up with any > useful solution, I refrained from replying. I was going to do a brief > write-up of all of my dead ends, but after banging my head against the > wall a bit more, I think I might actually have something. ;) > > I think a better path forward is to figure out how to keep up to date > with the upload-artifact action for all jobs. The root of the issue is > that all of the runner images are 64-bit, and the actions aren't > prepared to run inside a 32-bit container. They're written in javascript > and ship with their own node.js, but it doesn't work inside the > container due to dynamic linking issues. So obviously that other email ended up long and full of details. What I was originally going to say was: "OK, losing artifact uploads for linux32 is not the end of the world, and it is important to unblock CI now." But I _think_ all of those details I sent add up to a possible patch series. But it probably still makes sense to take your patch here, as it unblocks CI in the short term. And then I can build the bigger fix on top of that (it touches the same spot, but in the end both end up removing that special case). -Peff
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 1ee0433acc..97f9b06310 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -365,12 +365,6 @@ jobs: with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} - - name: Upload failed tests' directories - if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname == 'linux32' - uses: actions/upload-artifact@v1 # cannot be upgraded because Node.js Actions aren't supported in this container - with: - name: failed-tests-${{matrix.vector.jobname}} - path: ${{env.FAILED_TEST_ARTIFACTS}} static-analysis: needs: ci-config if: needs.ci-config.outputs.enabled == 'yes'
Linux32 jobs seem to be getting: Error: This request has been automatically failed because it uses a deprecated version of `actions/upload-artifact: v1`. Learn more: https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/ before doing anything useful. For now, disable the step. Ever since actions/upload-artifact@v1 got disabled, mentioning the offending version of it seems to stop anything from happening. At least this should run the same build and test. See https://github.com/git/git/actions/runs/10780030750/job/29894867249 for example. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- .github/workflows/main.yml | 6 ------ 1 file changed, 6 deletions(-)