Message ID | 20210222230106.7030-3-dbuono@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | gitlab-ci.yml: Add jobs to test CFI |
On 23/02/21 00:01, Daniele Buono wrote:
> +# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
> +# On gitlab runners, default JOBS of 2 sometimes end up calling 2 lds concurrently
> +# and triggers an Out-Of-Memory error

Does it make sense to test only one target instead?

> +# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
> +# System-wide version in fedora is not compiled with CFI so we recompile it using
> +# -enable-slirp=git

Can you explain what you mean, and perhaps add a check or warning for
incompatible settings?

Paolo
On 2/23/2021 3:11 AM, Paolo Bonzini wrote:
> On 23/02/21 00:01, Daniele Buono wrote:
>> +# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
>> +# On gitlab runners, default JOBS of 2 sometimes end up calling 2 lds concurrently
>> +# and triggers an Out-Of-Memory error
>
> Does it make sense to test only one target instead?

I'd prefer grouping multiple targets per job so that the number of jobs
doesn't explode, and stopping ninja from linking in parallel does solve
the issue.

There's also the issue that tests are compiled here too, so you may end
up with two linkers anyway. However, the chance that this ends in an
out-of-memory error is much smaller (possibly zero), since tests don't
link that many object files together.

>> +# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
>> +# System-wide version in fedora is not compiled with CFI so we recompile it using
>> +# -enable-slirp=git
>
> Can you explain what you mean, and perhaps add a check or warning for
> incompatible settings?

Certainly. The issue here is that there is a function in libslirp that
is used as a callback for QEMU timers: ra_timer_handler.
(There may be others, but of this one I'm sure because I traced it.)

This is not an issue when you compile slirp with QEMU, since the whole
library then has CFI information and is statically linked into the QEMU
binary. It becomes an issue if you are dynamically linking a system-wide
libslirp, as happens on Fedora.

I'd be happy to add a check in configure/meson that ends the configure
step with an error when this happens, but that would technically be an
independent patch that I'd work on in parallel to this one.
I would prefer not to automatically select the git-based libslirp,
because that may go unnoticed when configuring.

> Paolo
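To illustrate the callback pattern described above, here is a minimal sketch in C. It is not QEMU or libslirp code: `demo_timer`, `demo_handler`, `demo_timer_fire` and `run_demo` are hypothetical names, and the real signature of `ra_timer_handler` is not reproduced. The point is only that the timer stores a function pointer and later calls it indirectly. A forward-edge CFI scheme such as Clang's `-fsanitize=cfi-icall` instruments exactly that indirect call site, so, per the explanation in this thread, a handler that lives in a library built without CFI information would not be accepted as a valid target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type, standing in for a timer callback. */
typedef void (*demo_timer_cb)(void *opaque);

/* Hypothetical timer object: it only remembers what to call and with what. */
struct demo_timer {
    demo_timer_cb cb;
    void *opaque;
};

/*
 * Hypothetical handler, standing in for a library function such as
 * ra_timer_handler. Here it is compiled together with the caller; in the
 * problematic case it would come from a separately built shared library.
 */
static void demo_handler(void *opaque)
{
    int *fired = opaque;
    (*fired)++;
}

/*
 * The indirect call below is the kind of call site that an indirect-call
 * CFI check instruments: the target must be a function the CFI build
 * knows about, with a matching type.
 */
static void demo_timer_fire(struct demo_timer *t)
{
    assert(t->cb != NULL);
    t->cb(t->opaque);
}

/* Arms the hypothetical timer, fires it once, and reports the call count. */
int run_demo(void)
{
    int fired = 0;
    struct demo_timer t = { .cb = demo_handler, .opaque = &fired };

    demo_timer_fire(&t);
    return fired;
}
```

Calling `run_demo()` from a `main` function exercises the indirect call once. Built as a single unit, caller and handler are covered by the same instrumentation, which corresponds to the statically linked `--enable-slirp=git` case discussed in the thread.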
On 24/02/21 18:55, Daniele Buono wrote:
>> Does it make sense to test only one target instead?
>
> I'd prefer grouping multiple targets per job so that the number of jobs
> doesn't explode, and stopping ninja from linking in parallel does solve
> the issue.

Yeah, backend_max_links should do it. The 3 hour timeout scared me.

>> Can you explain what you mean, and perhaps add a check or warning for
>> incompatible settings?
>
> Certainly. The issue here is that there is a function in libslirp that
> is used as a callback for QEMU timers: ra_timer_handler.
> (There may be others, but of this one I'm sure because I traced it.)
>
> This is not an issue when you compile slirp with QEMU, since the whole
> library then has CFI information and is statically linked into the QEMU
> binary. It becomes an issue if you are dynamically linking a system-wide
> libslirp, as happens on Fedora.
>
> I'd be happy to add a check in configure/meson that ends the configure
> step with an error when this happens, but that would technically be an
> independent patch that I'd work on in parallel to this one.
> I would prefer not to automatically select the git-based libslirp,
> because that may go unnoticed when configuring.

Sounds good. For now just add a comment, please.

Paolo
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 5c198f05d4..f2fea8e2eb 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -479,6 +479,98 @@ clang-user:
       --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined
     MAKE_CHECK_ARGS: check-unit check-tcg
 
+# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
+# On gitlab runners, default JOBS of 2 sometimes end up calling 2 lds concurrently
+# and triggers an Out-Of-Memory error
+#
+# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
+# System-wide version in fedora is not compiled with CFI so we recompile it using
+# -enable-slirp=git
+#
+# Split in two sets of build/check/acceptance because a single build job for every
+# target creates an artifact archive too big to be uploaded
+build-cfi-set1:
+  <<: *native_build_job_definition
+  needs:
+    - job: amd64-fedora-container
+  variables:
+    JOBS: 1
+    AR: llvm-ar
+    IMAGE: fedora
+    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --enable-cfi --enable-cfi-debug
+      --enable-safe-stack --enable-slirp=git
+    TARGETS: aarch64-softmmu arm-softmmu alpha-softmmu i386-softmmu ppc-softmmu
+      ppc64-softmmu riscv32-softmmu riscv64-softmmu s390x-softmmu sparc-softmmu
+      sparc64-softmmu x86_64-softmmu
+      aarch64-linux-user aarch64_be-linux-user arm-linux-user i386-linux-user
+      ppc64-linux-user ppc64le-linux-user s390x-linux-user x86_64-linux-user
+    MAKE_CHECK_ARGS: check-build
+  timeout: 3h
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
+
+check-cfi-set1:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set1
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check
+
+acceptance-cfi-set1:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set1
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check-acceptance
+  <<: *acceptance_definition
+
+build-cfi-set2:
+  <<: *native_build_job_definition
+  needs:
+    - job: amd64-fedora-container
+  variables:
+    JOBS: 1
+    AR: llvm-ar
+    IMAGE: fedora
+    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --enable-cfi --enable-cfi-debug
+      --enable-safe-stack --enable-slirp=git
+    TARGETS: avr-softmmu cris-softmmu hppa-softmmu m68k-softmmu
+      microblaze-softmmu microblazeel-softmmu mips-softmmu mips64-softmmu
+      mips64el-softmmu mipsel-softmmu moxie-softmmu nios2-softmmu or1k-softmmu
+      rx-softmmu sh4-softmmu sh4eb-softmmu tricore-softmmu xtensa-softmmu
+      xtensaeb-softmmu
+    MAKE_CHECK_ARGS: check-build
+  timeout: 3h
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
+
+check-cfi-set2:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set2
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check
+
+acceptance-cfi-set2:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set2
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check-acceptance
+  <<: *acceptance_definition
+
 tsan-build:
   <<: *native_build_job_definition
   variables:
QEMU has had options to enable control-flow integrity features
for a few months now. Add two sets of build/check/acceptance
jobs to ensure the binary produced is working fine.

The two sets allow testing of x86_64 binaries for every target
that is not deprecated.

Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com>
---
 .gitlab-ci.yml | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)