[2/2] gitlab-ci.yml: Add jobs to test CFI flags

Message ID 20210222230106.7030-3-dbuono@linux.vnet.ibm.com (mailing list archive)
State New, archived
Series gitlab-ci.yml: Add jobs to test CFI

Commit Message

Daniele Buono Feb. 22, 2021, 11:01 p.m. UTC
QEMU has had options to enable control-flow integrity (CFI) features
for a few months now. Add two sets of build/check/acceptance
jobs to verify that the resulting binaries work correctly.

Together, the two sets test x86_64 builds of every target
that is not deprecated.

Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com>
---
 .gitlab-ci.yml | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

Comments

Paolo Bonzini Feb. 23, 2021, 8:11 a.m. UTC | #1
On 23/02/21 00:01, Daniele Buono wrote:
> +# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
> +# On GitLab runners, the default JOBS of 2 sometimes ends up running 2 lds
> +# concurrently, which triggers an Out-Of-Memory error

Does it make sense to test only one target instead?

> +# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
> +# The system-wide version in Fedora is not compiled with CFI, so we recompile it
> +# using --enable-slirp=git

Can you explain what you mean, and perhaps add a check or warning for 
incompatible settings?

Paolo
Daniele Buono Feb. 24, 2021, 5:55 p.m. UTC | #2
On 2/23/2021 3:11 AM, Paolo Bonzini wrote:
> On 23/02/21 00:01, Daniele Buono wrote:
>> +# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
>> +# On GitLab runners, the default JOBS of 2 sometimes ends up running 2 lds
>> +# concurrently, which triggers an Out-Of-Memory error
> 
> Does it make sense to test only one target instead?

I'd prefer grouping multiple targets per job so that the number of jobs 
doesn't explode, and stopping ninja from linking in parallel does solve 
the issue.

There's also the issue that tests are compiled here too, so you may end
up with two linkers running anyway. However, the chance that this will
cause an out-of-memory error is much smaller (possibly zero), since tests
don't link that many object files together.

> 
>> +# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
>> +# The system-wide version in Fedora is not compiled with CFI, so we recompile it
>> +# using --enable-slirp=git
> 
> Can you explain what you mean, and perhaps add a check or warning for 
> incompatible settings?

Certainly. The issue here is that there is a function in libslirp that
is used as a callback for QEMU timers: ra_timer_handler.
(There may be others, but I am sure of this one because I traced it.)

This is not an issue when you compile slirp with QEMU, since the whole
library then has CFI information and is statically linked into the QEMU
binary. It becomes an issue if you are dynamically linking a system-wide
libslirp, as happens on Fedora.

I'd be happy to add a check on configure/meson that ends the configure
step with an error when this happens, but that would technically be an
independent patch that I'd work on in parallel to this one.
I would prefer to not automatically select the git-based libslirp
because that may go unnoticed when configuring.

> 
> Paolo
>
Paolo Bonzini Feb. 24, 2021, 6:28 p.m. UTC | #3
On 24/02/21 18:55, Daniele Buono wrote:
>>
>> Does it make sense to test only one target instead?
> 
> I'd prefer grouping multiple targets per job so that the number of jobs doesn't explode, and stopping ninja from linking in parallel does solve the issue.

Yeah, backend_max_links should do it.  The 3 hour timeout scared me.
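
For reference, a rough sketch of what capping only the link steps could look like in the CI file. `backend_max_links` is a built-in Meson option; the `LD_JOBS` variable name and the extra script line are assumptions for illustration and are not part of this patch:

```yaml
# Hypothetical fragment, not part of this patch.
# LD_JOBS is an assumed variable name; a build job would set it in place of JOBS: 1.
variables:
  LD_JOBS: 1
# Assumed extra step for the shared build script, run from the build directory
# after configure: cap concurrent link processes, leave compile parallelism alone.
script:
  - if test -n "$LD_JOBS"; then meson configure . -Dbackend_max_links="$LD_JOBS"; fi
```

With something like this, JOBS could stay at its default so that compilation remains parallel while only one ld runs at a time.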

>> Can you explain what you mean, and perhaps add a check or warning for 
>> incompatible settings?
> 
> Certainly. The issue here is that there is a function in libslirp that
> is used as a callback for QEMU timers: ra_timer_handler.
> (There may be others, but I am sure of this one because I traced it.)
> 
> This is not an issue when you compile slirp with QEMU, since the whole
> library then has CFI information and is statically linked into the QEMU
> binary. It becomes an issue if you are dynamically linking a system-wide
> libslirp, as happens on Fedora.
> 
> I'd be happy to add a check on configure/meson that ends the configure
> step with an error when this happens, but that would technically be an
> independent patch that I'd work on in parallel to this one.
> I would prefer to not automatically select the git-based libslirp
> because that may go unnoticed when configuring.

Sounds good.  For now just add a comment, please.

Paolo
Patch

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 5c198f05d4..f2fea8e2eb 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -479,6 +479,98 @@  clang-user:
       --extra-cflags=-fsanitize=undefined --extra-cflags=-fno-sanitize-recover=undefined
     MAKE_CHECK_ARGS: check-unit check-tcg
 
+# Set JOBS=1 because this requires LTO and ld consumes a large amount of memory.
+# On GitLab runners, the default JOBS of 2 sometimes ends up running 2 lds
+# concurrently, which triggers an Out-Of-Memory error
+#
+# Because of how slirp is used in QEMU, we need to have CFI also on libslirp.
+# The system-wide version in Fedora is not compiled with CFI, so we recompile it
+# using --enable-slirp=git
+#
+# Split into two sets of build/check/acceptance because a single build job for
+# every target creates an artifact archive too big to be uploaded
+build-cfi-set1:
+  <<: *native_build_job_definition
+  needs:
+  - job: amd64-fedora-container
+  variables:
+    JOBS: 1
+    AR: llvm-ar
+    IMAGE: fedora
+    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --enable-cfi --enable-cfi-debug
+      --enable-safe-stack --enable-slirp=git
+    TARGETS: aarch64-softmmu arm-softmmu alpha-softmmu i386-softmmu ppc-softmmu
+      ppc64-softmmu riscv32-softmmu riscv64-softmmu s390x-softmmu sparc-softmmu
+      sparc64-softmmu x86_64-softmmu
+      aarch64-linux-user aarch64_be-linux-user arm-linux-user i386-linux-user
+      ppc64-linux-user ppc64le-linux-user s390x-linux-user x86_64-linux-user
+    MAKE_CHECK_ARGS: check-build
+  timeout: 3h
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
+
+check-cfi-set1:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set1
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check
+
+acceptance-cfi-set1:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set1
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check-acceptance
+  <<: *acceptance_definition
+
+build-cfi-set2:
+  <<: *native_build_job_definition
+  needs:
+  - job: amd64-fedora-container
+  variables:
+    JOBS: 1
+    AR: llvm-ar
+    IMAGE: fedora
+    CONFIGURE_ARGS: --cc=clang --cxx=clang++ --enable-cfi --enable-cfi-debug
+      --enable-safe-stack --enable-slirp=git
+    TARGETS: avr-softmmu cris-softmmu hppa-softmmu m68k-softmmu
+      microblaze-softmmu microblazeel-softmmu mips-softmmu mips64-softmmu
+      mips64el-softmmu mipsel-softmmu moxie-softmmu nios2-softmmu or1k-softmmu
+      rx-softmmu sh4-softmmu sh4eb-softmmu tricore-softmmu xtensa-softmmu
+      xtensaeb-softmmu
+    MAKE_CHECK_ARGS: check-build
+  timeout: 3h
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
+
+check-cfi-set2:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set2
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check
+
+acceptance-cfi-set2:
+  <<: *native_test_job_definition
+  needs:
+    - job: build-cfi-set2
+      artifacts: true
+  variables:
+    IMAGE: fedora
+    MAKE_CHECK_ARGS: check-acceptance
+  <<: *acceptance_definition
+
 tsan-build:
   <<: *native_build_job_definition
   variables: