Message ID | 20240529133106.1224866-1-thuth@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | io/channel-socket: Fix -fsanitize=undefined problem with latest Clang | expand |
On 29/5/24 15:31, Thomas Huth wrote: > Casting function pointers from one type to another causes undefined > behavior errors when compiling with -fsanitize=undefined with Clang v18: > > $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket > TAP version 13 > # random seed: R02S4424f4f460de783fdd3d72c5571d3adc > 1..10 > # Start of mips64 tests > # Start of netdev tests > # Start of stream tests > # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest > ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' > /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 > > Add a wrapper function to avoid the problem. > > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > io/channel-socket.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On Wed, 29 May 2024 at 14:32, Thomas Huth <thuth@redhat.com> wrote: > > Casting function pointers from one type to another causes undefined > behavior errors when compiling with -fsanitize=undefined with Clang v18: > > $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket > TAP version 13 > # random seed: R02S4424f4f460de783fdd3d72c5571d3adc > 1..10 > # Start of mips64 tests > # Start of netdev tests > # Start of stream tests > # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest > ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' > /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 It's a pity the sanitizer error doesn't tell you the actual function type as well as the incorrect one it got cast to (especially since in this case the function and its declaration are both in generated code in the build tree not conveniently findable with 'git grep'.) In this case the function being called is: void qapi_free_SocketAddress(SocketAddress *obj) and it's cast to a GDestroyNotify, which is typedef void (*GDestroyNotify) (gpointer data); (and gpointer is void*) and although you can pass a foo* to a function expecting void*, you can't treat a pointer to a function taking foo* as if it was a pointer to a function taking void*, just in case the compiler needs to do some clever trickery with the pointer value. So the wrapper function looks like it doesn't do anything, but it's doing the permitted implicit-cast of the argument. > Add a wrapper function to avoid the problem. > > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > io/channel-socket.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/io/channel-socket.c b/io/channel-socket.c > index 3a899b0608..aa2a1c8586 100644 > --- a/io/channel-socket.c > +++ b/io/channel-socket.c > @@ -193,6 +193,10 @@ static void qio_channel_socket_connect_worker(QIOTask *task, > qio_task_set_error(task, err); > } > > +static void qio_qapi_free_SocketAddress(gpointer sa) > +{ > + qapi_free_SocketAddress(sa); > +} > > void qio_channel_socket_connect_async(QIOChannelSocket *ioc, > SocketAddress *addr, > @@ -213,7 +217,7 @@ void qio_channel_socket_connect_async(QIOChannelSocket *ioc, > qio_task_run_in_thread(task, > qio_channel_socket_connect_worker, > addrCopy, > - (GDestroyNotify)qapi_free_SocketAddress, > + qio_qapi_free_SocketAddress, > context); > } Reviewed-by: Peter Maydell <peter.maydell@linaro.org> thanks -- PMM
On Wed, May 29, 2024 at 02:53:37PM +0100, Peter Maydell wrote: > On Wed, 29 May 2024 at 14:32, Thomas Huth <thuth@redhat.com> wrote: > > > > Casting function pointers from one type to another causes undefined > > behavior errors when compiling with -fsanitize=undefined with Clang v18: > > > > $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket > > TAP version 13 > > # random seed: R02S4424f4f460de783fdd3d72c5571d3adc > > 1..10 > > # Start of mips64 tests > > # Start of netdev tests > > # Start of stream tests > > # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest > > ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' > > /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here > > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 > > It's a pity the sanitizer error doesn't tell you the actual > function type as well as the incorrect one it got cast to > (especially since in this case the function and its declaration > are both in generated code in the build tree not conveniently > findable with 'git grep'.) > > In this case the function being called is: > void qapi_free_SocketAddress(SocketAddress *obj) > and it's cast to a GDestroyNotify, which is > typedef void (*GDestroyNotify) (gpointer data); > (and gpointer is void*) > > and although you can pass a foo* to a function expecting void*, > you can't treat a pointer to a function taking foo* as if it was > a pointer to a function taking void*, just in case the compiler > needs to do some clever trickery with the pointer value. > > So the wrapper function looks like it doesn't do anything, > but it's doing the permitted implicit-cast of the argument I guess that's the letter of the law in C, but does that actually matter in practice, historically ? The use of "(GDestroyNotify)blah" casts is standard practice across any application using GLib, and even in QEMU this is far from the only place that does such a cast: $ git grep '(GDestroyNotify)' chardev/char-socket.c: (GDestroyNotify)object_unref); hw/i386/kvm/xen_xenstore.c: g_list_free_full(w->events, (GDestroyNotify)free_watch_event); hw/i386/kvm/xenstore_impl.c: (GDestroyNotify)xs_node_unref); hw/i386/kvm/xenstore_impl.c: (GDestroyNotify)xs_node_unref); hw/i386/kvm/xenstore_impl.c: (GDestroyNotify)xs_node_unref); hw/mem/sparse-mem.c: (GDestroyNotify)g_free); hw/misc/xlnx-versal-cframe-reg.c: NULL, (GDestroyNotify)g_free); hw/virtio/virtio-iommu.c: NULL, (GDestroyNotify)g_free, hw/virtio/virtio-iommu.c: (GDestroyNotify)g_free); io/channel-socket.c: (GDestroyNotify)qapi_free_SocketAddress, io/net-listener.c: listener, (GDestroyNotify)object_unref, NULL); io/net-listener.c: listener, (GDestroyNotify)object_unref, context); io/net-listener.c: listener, (GDestroyNotify)object_unref, NULL); migration/block-dirty-bitmap.c: gdn = (GDestroyNotify) qapi_free_BitmapMigrationBitmapAlias; subprojects/libvhost-user/libvhost-user-glib.c: (GDestroyNotify) vug_source_destroy); system/memory.c: (GDestroyNotify) flatview_unref); tests/qtest/libqos/qgraph.c: g_test_queue_destroy((GDestroyNotify) qos_object_destroy, obj); tests/unit/test-vmstate.c: (GDestroyNotify)g_free, tests/unit/test-vmstate.c: (GDestroyNotify)g_free); tests/unit/test-vmstate.c: (GDestroyNotify)g_free, tests/unit/test-vmstate.c: (GDestroyNotify)g_free); ui/dbus-clipboard.c: (GDestroyNotify)qemu_clipboard_info_unref, ui/dbus-listener.c: (GDestroyNotify)qemu_free_displaysurface, ui/dbus-listener.c: (GDestroyNotify)pixman_image_unref, ui/dbus-listener.c: (GDestroyNotify)pixman_image_unref, pixman_image_ref(ddl->ds->image)); ui/dbus-listener.c: (GDestroyNotify)cursor_unref, ui/dbus.c: (GDestroyNotify)g_array_unref, consoles); I presume this means the rest of the code merely isn't exercised by the CI job test case this is fixing, but if we're saying this needs fixing, then we should be fixing all usage. Then there are many other function casts beyond GDestroyNotify that we use, which feel like they should have similar conceptual problems: $ git grep -E '\(\w+Func\)\w+' '*.c' | grep -v -E '(void|int|double|float|HANDLE)' chardev/char-fd.c: FEWatchFunc func = (FEWatchFunc)callback; chardev/char-fd.c: g_source_set_callback(child, (GSourceFunc)child_func, source, NULL); chardev/char-fd.c: g_source_set_callback(child, (GSourceFunc)child_func, source, NULL); chardev/char-fe.c: g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); chardev/char-socket.c: g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup, chardev/spice.c: GIOFunc func = (GIOFunc)callback; hw/i386/kvm/xenstore_impl.c: *items = g_list_insert_sorted(*items, g_strdup(key), (GCompareFunc)strcmp); io/channel-buffer.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel-null.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel-watch.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel-watch.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel-watch.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel-websock.c: QIOChannelFunc func = (QIOChannelFunc)callback; io/channel.c: g_source_set_callback(source, (GSourceFunc)func, user_data, notify); io/channel.c: (GSourceFunc)qio_channel_wait_complete, io/net-listener.c: (GSourceFunc)qio_net_listener_wait_client_func, migration/migration.c: notify->notify = (NotifierWithReturnFunc)func; migration/rdma.c: QIOChannelFunc func = (QIOChannelFunc)callback; net/colo-compare.c: (GCompareDataFunc)seq_sorter, net/colo-compare.c: (GCompareFunc)colo_old_packet_check_one)) net/colo-compare.c: (GCompareFunc)colo_old_packet_check_one)) net/colo-compare.c: (GCompareFunc)colo_old_packet_check_one_conn); net/colo-compare.c: pkt, (GCompareFunc)HandlePacket); net/net.c: g_ptr_array_sort(results, (GCompareFunc)model_cmp); qga/commands-win32.c: getifentry2_ex = (GetIfEntry2Func)func; qga/vss-win32.c: func = (QGAVSSRequesterFunc)GetProcAddress(provider_lib, func_name); subprojects/libvhost-user/libvhost-user-glib.c: g_source_set_callback(gsrc, (GSourceFunc)vu_cb, data, NULL); system/qdev-monitor.c: g_ptr_array_sort(array, (GCompareFunc)qemu_pstrcmp0); target/i386/cpu.c: names = g_list_sort(names, (GCompareFunc)strcmp); tests/unit/test-qtree.c: tree = q_tree_new_full((GCompareDataFunc)my_compare, NULL, tests/unit/test-util-filemonitor.c: (GFunc)qemu_file_monitor_test_record_free, NULL); util/filemonitor-inotify.c: g_idle_add((GSourceFunc)qemu_file_monitor_free_idle, mon); util/qemu-option.c: g_ptr_array_sort(array, (GCompareFunc)qemu_pstrcmp0); So overall, I'm not in favour of this patch unless there's a compelling functional reason why just this 1 case is special and all the others can be safely ignored. > > > Add a wrapper function to avoid the problem. > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > --- > > io/channel-socket.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c > > index 3a899b0608..aa2a1c8586 100644 > > --- a/io/channel-socket.c > > +++ b/io/channel-socket.c > > @@ -193,6 +193,10 @@ static void qio_channel_socket_connect_worker(QIOTask *task, > > qio_task_set_error(task, err); > > } > > > > +static void qio_qapi_free_SocketAddress(gpointer sa) > > +{ > > + qapi_free_SocketAddress(sa); > > +} > > > > void qio_channel_socket_connect_async(QIOChannelSocket *ioc, > > SocketAddress *addr, > > @@ -213,7 +217,7 @@ void qio_channel_socket_connect_async(QIOChannelSocket *ioc, > > qio_task_run_in_thread(task, > > qio_channel_socket_connect_worker, > > addrCopy, > > - (GDestroyNotify)qapi_free_SocketAddress, > > + qio_qapi_free_SocketAddress, > > context); > > } > > Reviewed-by: Peter Maydell <peter.maydell@linaro.org> > > thanks > -- PMM > With regards, Daniel
On 03/06/2024 14.48, Daniel P. Berrangé wrote: > On Wed, May 29, 2024 at 02:53:37PM +0100, Peter Maydell wrote: >> On Wed, 29 May 2024 at 14:32, Thomas Huth <thuth@redhat.com> wrote: >>> >>> Casting function pointers from one type to another causes undefined >>> behavior errors when compiling with -fsanitize=undefined with Clang v18: >>> >>> $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket >>> TAP version 13 >>> # random seed: R02S4424f4f460de783fdd3d72c5571d3adc >>> 1..10 >>> # Start of mips64 tests >>> # Start of netdev tests >>> # Start of stream tests >>> # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest >>> ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' >>> /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here >>> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 >> >> It's a pity the sanitizer error doesn't tell you the actual >> function type as well as the incorrect one it got cast to >> (especially since in this case the function and its declaration >> are both in generated code in the build tree not conveniently >> findable with 'git grep'.) >> >> In this case the function being called is: >> void qapi_free_SocketAddress(SocketAddress *obj) >> and it's cast to a GDestroyNotify, which is >> typedef void (*GDestroyNotify) (gpointer data); >> (and gpointer is void*) >> >> and although you can pass a foo* to a function expecting void*, >> you can't treat a pointer to a function taking foo* as if it was >> a pointer to a function taking void*, just in case the compiler >> needs to do some clever trickery with the pointer value. >> >> So the wrapper function looks like it doesn't do anything, >> but it's doing the permitted implicit-cast of the argument > > > I guess that's the letter of the law in C, but does that actually > matter in practice, historically ? > > The use of "(GDestroyNotify)blah" casts is standard practice > across any application using GLib, and even in QEMU this is > far from the only place that does such a cast: ... > I presume this means the rest of the code merely isn't exercised by the > CI job test case this is fixing, but if we're saying this needs fixing, > then we should be fixing all usage. There are more spots discovered by the CI, I just didn't tackle them yet. Another problem is e.g. the call_rcu macro from include/qemu/rcu.h... > So overall, I'm not in favour of this patch unless there's a compelling > functional reason why just this 1 case is special and all the others > can be safely ignored. We'd need to tackle them all. I thought it might be a good idea since we then could also switch to a more strict CFI mode (i.e. stop using -fsanitize-cfi-icall-generalize-pointers) ... but yes, looking at the amount of spots that need fixes, it feels like tilting at windmills. Maybe best if we really just add -fno-sanitize=function to the CFLAGS when we're compiling with the sanitizer enabled. Thomas
Thomas Huth <thuth@redhat.com> writes: > Casting function pointers from one type to another causes undefined > behavior errors when compiling with -fsanitize=undefined with Clang > v18: Queued to testing/next, thanks.
On Mon, Jun 03, 2024 at 04:38:53PM +0200, Thomas Huth wrote: > On 03/06/2024 14.48, Daniel P. Berrangé wrote: > > On Wed, May 29, 2024 at 02:53:37PM +0100, Peter Maydell wrote: > > > On Wed, 29 May 2024 at 14:32, Thomas Huth <thuth@redhat.com> wrote: > > > > > > > > Casting function pointers from one type to another causes undefined > > > > behavior errors when compiling with -fsanitize=undefined with Clang v18: > > > > > > > > $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket > > > > TAP version 13 > > > > # random seed: R02S4424f4f460de783fdd3d72c5571d3adc > > > > 1..10 > > > > # Start of mips64 tests > > > > # Start of netdev tests > > > > # Start of stream tests > > > > # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest > > > > ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' > > > > /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here > > > > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 > > > > > > It's a pity the sanitizer error doesn't tell you the actual > > > function type as well as the incorrect one it got cast to > > > (especially since in this case the function and its declaration > > > are both in generated code in the build tree not conveniently > > > findable with 'git grep'.) > > > > > > In this case the function being called is: > > > void qapi_free_SocketAddress(SocketAddress *obj) > > > and it's cast to a GDestroyNotify, which is > > > typedef void (*GDestroyNotify) (gpointer data); > > > (and gpointer is void*) > > > > > > and although you can pass a foo* to a function expecting void*, > > > you can't treat a pointer to a function taking foo* as if it was > > > a pointer to a function taking void*, just in case the compiler > > > needs to do some clever trickery with the pointer value. > > > > > > So the wrapper function looks like it doesn't do anything, > > > but it's doing the permitted implicit-cast of the argument > > > > > > I guess that's the letter of the law in C, but does that actually > > matter in practice, historically ? > > > > The use of "(GDestroyNotify)blah" casts is standard practice > > across any application using GLib, and even in QEMU this is > > far from the only place that does such a cast: > ... > > I presume this means the rest of the code merely isn't exercised by the > > CI job test case this is fixing, but if we're saying this needs fixing, > > then we should be fixing all usage. > > There are more spots discovered by the CI, I just didn't tackle them yet. > > Another problem is e.g. the call_rcu macro from include/qemu/rcu.h... > > > So overall, I'm not in favour of this patch unless there's a compelling > > functional reason why just this 1 case is special and all the others > > can be safely ignored. > > We'd need to tackle them all. I thought it might be a good idea since we > then could also switch to a more strict CFI mode (i.e. stop using > -fsanitize-cfi-icall-generalize-pointers) ... but yes, looking at the amount > of spots that need fixes, it feels like tilting at windmills. We can't rely on the sanitizers to catch all cases where we're casting functions, as we don't have good enough code coverage in tests to identify all places that way. Unless there's a warning flag we can use to get diagnosis of this problem at compile time and then fix all reported issues, I won't have any confidence in our ability to remove -fsanitize-cfi-icall-generalize-pointers for CFI. > Maybe best if we really just add -fno-sanitize=function to the CFLAGS when > we're compiling with the sanitizer enabled. That's my recommendation for the short term. With regards, Daniel
On Mon, Jun 03, 2024 at 03:47:55PM +0100, Alex Bennée wrote: > Thomas Huth <thuth@redhat.com> writes: > > > Casting function pointers from one type to another causes undefined > > behavior errors when compiling with -fsanitize=undefined with Clang > > v18: > > Queued to testing/next, thanks. Please remove, as I don't think this is a viable approach to the problem With regards, Daniel
On Mon, 3 Jun 2024 at 15:49, Daniel P. Berrangé <berrange@redhat.com> wrote: > We can't rely on the sanitizers to catch all cases where we're casting > functions, as we don't have good enough code coverage in tests to > identify all places that way. > > Unless there's a warning flag we can use to get diagnosis of this > problem at compile time and then fix all reported issues, I won't have > any confidence in our ability to remove -fsanitize-cfi-icall-generalize-pointers > for CFI. You might think that -Wcast-function-type would detect them at compile time, but that has two loopholes: 1. void(*) (void) matches everything 2. any parameter of pointer type matches any other pointer type It seems to me rather inconsistent that the sanitizers do not match up with the warning semantics here. I think I would vote for raising that with the compiler folks -- either the sanitizer should be made looser so it matches the -Wcast-function-type semantics, or else a new tighter warning option that matches the sanitizer should be provided. Ideally both, I think. But it's definitely silly to have a runtime check that flags up things that the compiler perfectly well could detect at compile time but is choosing not to. thanks -- PMM
On Mon, 3 Jun 2024 at 15:58, Peter Maydell <peter.maydell@linaro.org> wrote: > > On Mon, 3 Jun 2024 at 15:49, Daniel P. Berrangé <berrange@redhat.com> wrote: > > We can't rely on the sanitizers to catch all cases where we're casting > > functions, as we don't have good enough code coverage in tests to > > identify all places that way. > > > > Unless there's a warning flag we can use to get diagnosis of this > > problem at compile time and then fix all reported issues, I won't have > > any confidence in our ability to remove -fsanitize-cfi-icall-generalize-pointers > > for CFI. > > You might think that -Wcast-function-type would detect them at compile > time, but that has two loopholes: > 1. void(*) (void) matches everything > 2. any parameter of pointer type matches any other pointer type > > It seems to me rather inconsistent that the sanitizers do > not match up with the warning semantics here. I think I > would vote for raising that with the compiler folks -- > either the sanitizer should be made looser so it matches > the -Wcast-function-type semantics, or else a new tighter > warning option that matches the sanitizer should be > provided. Ideally both, I think. But it's definitely silly > to have a runtime check that flags up things that the > compiler perfectly well could detect at compile time but > is choosing not to. Slightly further investigation suggests clang 16 and later have -Wcast-function-type-strict for the "report all the same casts that the sanitizer does". gcc doesn't I think have that yet. I didn't spot any option in the sanitizer to loosen the set of things it considers mismatched function pointers. -- PMM
On Mon, Jun 03, 2024 at 04:12:34PM +0100, Peter Maydell wrote: > On Mon, 3 Jun 2024 at 15:58, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > On Mon, 3 Jun 2024 at 15:49, Daniel P. Berrangé <berrange@redhat.com> wrote: > > > We can't rely on the sanitizers to catch all cases where we're casting > > > functions, as we don't have good enough code coverage in tests to > > > identify all places that way. > > > > > > Unless there's a warning flag we can use to get diagnosis of this > > > problem at compile time and then fix all reported issues, I won't have > > > any confidence in our ability to remove -fsanitize-cfi-icall-generalize-pointers > > > for CFI. > > > > You might think that -Wcast-function-type would detect them at compile > > time, but that has two loopholes: > > 1. void(*) (void) matches everything > > 2. any parameter of pointer type matches any other pointer type > > > > It seems to me rather inconsistent that the sanitizers do > > not match up with the warning semantics here. I think I > > would vote for raising that with the compiler folks -- > > either the sanitizer should be made looser so it matches > > the -Wcast-function-type semantics, or else a new tighter > > warning option that matches the sanitizer should be > > provided. Ideally both, I think. But it's definitely silly > > to have a runtime check that flags up things that the > > compiler perfectly well could detect at compile time but > > is choosing not to. > > Slightly further investigation suggests clang 16 and later > have -Wcast-function-type-strict for the "report all the > same casts that the sanitizer does". gcc doesn't I think have > that yet. I didn't spot any option in the sanitizer to > loosen the set of things it considers mismatched function > pointers. I just tried that with CC=clang ./configure --target-list=x86_64-softmmu --extra-cflags="-Wcast-function-type-strict" --disable-werror and it rather shows the futility of the task, picking one reoprted warning that is repeated over & over in differnt contexts: In file included from qapi/qapi-types-block-core.c:15: qapi/qapi-types-block-core.h:3620:1: warning: cast from 'void (*)(DummyBlockCoreForceArrays *)' (aka 'void (*)(struct DummyBlockCoreForceArrays *)') to 'void (*)(void)' converts to incompatible function type [-Wcast-function-type-strict] 3620 | G_DEFINE_AUTOPTR_CLEANUP_FUNC(DummyBlockCoreForceArrays, qapi_free_DummyBlockCoreForceArrays) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/glib-2.0/glib/gmacros.h:1372:3: note: expanded from macro 'G_DEFINE_AUTOPTR_CLEANUP_FUNC' 1372 | _GLIB_DEFINE_AUTOPTR_CLEANUP_FUNCS(TypeName, TypeName, func) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/glib-2.0/glib/gmacros.h:1364:57: note: expanded from macro '_GLIB_DEFINE_AUTOPTR_CLEANUP_FUNCS' 1364 | { if (*_q) g_queue_free_full (*_q, (GDestroyNotify) (void(*)(void)) cleanup); } \ | ^~~~~~~~~~~~~~~~~~~~~~~ IOW, the GLib headers themselves have bad casts in macros which we rely on. So we'll never be cast safe until GLib changes its own code...if it ever does. With regards, Daniel
On 03/06/2024 17.52, Daniel P. Berrangé wrote: > On Mon, Jun 03, 2024 at 04:12:34PM +0100, Peter Maydell wrote: >> On Mon, 3 Jun 2024 at 15:58, Peter Maydell <peter.maydell@linaro.org> wrote: >>> >>> On Mon, 3 Jun 2024 at 15:49, Daniel P. Berrangé <berrange@redhat.com> wrote: >>>> We can't rely on the sanitizers to catch all cases where we're casting >>>> functions, as we don't have good enough code coverage in tests to >>>> identify all places that way. >>>> >>>> Unless there's a warning flag we can use to get diagnosis of this >>>> problem at compile time and then fix all reported issues, I won't have >>>> any confidence in our ability to remove -fsanitize-cfi-icall-generalize-pointers >>>> for CFI. >>> >>> You might think that -Wcast-function-type would detect them at compile >>> time, but that has two loopholes: >>> 1. void(*) (void) matches everything >>> 2. any parameter of pointer type matches any other pointer type >>> >>> It seems to me rather inconsistent that the sanitizers do >>> not match up with the warning semantics here. I think I >>> would vote for raising that with the compiler folks -- >>> either the sanitizer should be made looser so it matches >>> the -Wcast-function-type semantics, or else a new tighter >>> warning option that matches the sanitizer should be >>> provided. Ideally both, I think. But it's definitely silly >>> to have a runtime check that flags up things that the >>> compiler perfectly well could detect at compile time but >>> is choosing not to. >> >> Slightly further investigation suggests clang 16 and later >> have -Wcast-function-type-strict for the "report all the >> same casts that the sanitizer does". gcc doesn't I think have >> that yet. I didn't spot any option in the sanitizer to >> loosen the set of things it considers mismatched function >> pointers. > > I just tried that with > > CC=clang ./configure --target-list=x86_64-softmmu --extra-cflags="-Wcast-function-type-strict" --disable-werror > > and it rather shows the futility of the task, picking one reoprted > warning that is repeated over & over in differnt contexts: > > In file included from qapi/qapi-types-block-core.c:15: > qapi/qapi-types-block-core.h:3620:1: warning: cast from 'void (*)(DummyBlockCoreForceArrays *)' (aka 'void (*)(struct DummyBlockCoreForceArrays *)') to 'void (*)(void)' converts to incompatible function type [-Wcast-function-type-strict] > 3620 | G_DEFINE_AUTOPTR_CLEANUP_FUNC(DummyBlockCoreForceArrays, qapi_free_DummyBlockCoreForceArrays) > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /usr/include/glib-2.0/glib/gmacros.h:1372:3: note: expanded from macro 'G_DEFINE_AUTOPTR_CLEANUP_FUNC' > 1372 | _GLIB_DEFINE_AUTOPTR_CLEANUP_FUNCS(TypeName, TypeName, func) > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /usr/include/glib-2.0/glib/gmacros.h:1364:57: note: expanded from macro '_GLIB_DEFINE_AUTOPTR_CLEANUP_FUNCS' > 1364 | { if (*_q) g_queue_free_full (*_q, (GDestroyNotify) (void(*)(void)) cleanup); } \ > | ^~~~~~~~~~~~~~~~~~~~~~~ > > > IOW, the GLib headers themselves have bad casts in macros which we > rely on. So we'll never be cast safe until GLib changes its own > code...if it ever does. Ok, thanks for checking! So the ultimate answer to the problem (at least right now) is: Let's use -fno-sanitize=function in QEMU. Thomas
Daniel P. Berrangé <berrange@redhat.com> writes: > On Mon, Jun 03, 2024 at 03:47:55PM +0100, Alex Bennée wrote: >> Thomas Huth <thuth@redhat.com> writes: >> >> > Casting function pointers from one type to another causes undefined >> > behavior errors when compiling with -fsanitize=undefined with Clang >> > v18: >> >> Queued to testing/next, thanks. > > Please remove, as I don't think this is a viable approach to the > problem Ok - I'll drop it. I still have: .gitlab-ci.d/buildtest.yml: Use -fno-sanitize=function in the clang-system job to prevent the CI failures. > > With regards, > Daniel
On Mon, Jun 03, 2024 at 06:46:15PM +0100, Alex Bennée wrote: > Daniel P. Berrangé <berrange@redhat.com> writes: > > > On Mon, Jun 03, 2024 at 03:47:55PM +0100, Alex Bennée wrote: > >> Thomas Huth <thuth@redhat.com> writes: > >> > >> > Casting function pointers from one type to another causes undefined > >> > behavior errors when compiling with -fsanitize=undefined with Clang > >> > v18: > >> > >> Queued to testing/next, thanks. > > > > Please remove, as I don't think this is a viable approach to the > > problem > > Ok - I'll drop it. > > I still have: > > .gitlab-ci.d/buildtest.yml: Use -fno-sanitize=function in the clang-system job > > to prevent the CI failures. Yes, that's good. With regards, Daniel
diff --git a/io/channel-socket.c b/io/channel-socket.c index 3a899b0608..aa2a1c8586 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -193,6 +193,10 @@ static void qio_channel_socket_connect_worker(QIOTask *task, qio_task_set_error(task, err); } +static void qio_qapi_free_SocketAddress(gpointer sa) +{ + qapi_free_SocketAddress(sa); +} void qio_channel_socket_connect_async(QIOChannelSocket *ioc, SocketAddress *addr, @@ -213,7 +217,7 @@ void qio_channel_socket_connect_async(QIOChannelSocket *ioc, qio_task_run_in_thread(task, qio_channel_socket_connect_worker, addrCopy, - (GDestroyNotify)qapi_free_SocketAddress, + qio_qapi_free_SocketAddress, context); }
Casting function pointers from one type to another causes undefined behavior errors when compiling with -fsanitize=undefined with Clang v18: $ QTEST_QEMU_BINARY=./qemu-system-mips64 tests/qtest/netdev-socket TAP version 13 # random seed: R02S4424f4f460de783fdd3d72c5571d3adc 1..10 # Start of mips64 tests # Start of netdev tests # Start of stream tests # starting QEMU: exec ./qemu-system-mips64 -qtest unix:/tmp/qtest-1213196.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1213196.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -nodefaults -M none -netdev stream,id=st0,addr.type=fd,addr.str=3 -accel qtest ../io/task.c:78:13: runtime error: call to function qapi_free_SocketAddress through pointer to incorrect function type 'void (*)(void *)' /tmp/qemu-sanitize/qapi/qapi-types-sockets.c:170: note: qapi_free_SocketAddress defined here SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../io/task.c:78:13 Add a wrapper function to avoid the problem. Signed-off-by: Thomas Huth <thuth@redhat.com> --- io/channel-socket.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)