Message ID | 20250116062514.2391108-1-jstultz@google.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [RFC] sound/virtio: Fix cancel_sync warnings on uninitialized work_structs | expand |
On Thu, 16 Jan 2025 07:25:08 +0100, John Stultz wrote: > > Betty reported hitting the following warning: > > [ 8.709131][ T221] WARNING: CPU: 2 PID: 221 at kernel/workqueue.c:4182 > ... > [ 8.713282][ T221] Call trace: > [ 8.713365][ T221] __flush_work+0x8d0/0x914 > [ 8.713468][ T221] __cancel_work_sync+0xac/0xfc > [ 8.713570][ T221] cancel_work_sync+0x24/0x34 > [ 8.713667][ T221] virtsnd_remove+0xa8/0xf8 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276] > [ 8.713868][ T221] virtsnd_probe+0x48c/0x664 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276] > [ 8.714035][ T221] virtio_dev_probe+0x28c/0x390 > [ 8.714139][ T221] really_probe+0x1bc/0x4c8 > ... > > It seems we're hitting the error path in virtsnd_probe(), which > triggers a virtsnd_remove() which iterates over the substreams > calling cancel_work_sync() on the elapsed_period work_struct. > > Looking at the code, from earlier in: > virtsnd_probe()->virtsnd_build_devs()->virtsnd_pcm_parse_cfg() > > We set snd->nsubstreams, allocate the snd->substreams, and if > we then hit an error on the info allocation or something in > virtsnd_ctl_query_info() fails, we will exit without having > initialized the elapsed_period work_struct. > > When that error path unwinds we then call virtsnd_remove() > which as long as the substreams array is allocated, will iterate > through calling cancel_work_sync() on the uninitialized work > struct hitting this warning. > > The simplest fix seems to just check the work_struct->func ptr > is valid before calling cancel_work_sync(), but I wanted to share > in case folks had ideas for a better solution. > > I have not yet managed to reproduce the issue myself, so this > patch has had limited testing. > > Feedback or thoughts would be appreciated! The conditional canceling is a bit flaky, IMHO. Alternatively, we can initialize the whole stuff right after the allocation, e.g. something like below. thanks, Takashi --- a/sound/virtio/virtio_pcm.c +++ b/sound/virtio/virtio_pcm.c @@ -339,6 +339,16 @@ int virtsnd_pcm_parse_cfg(struct virtio_snd *snd) if (!snd->substreams) return -ENOMEM; + for (i = 0; i < snd->nsubstreams; ++i) { + struct virtio_pcm_substream *vss = &snd->substreams[i]; + + vss->snd = snd; + vss->sid = i; + INIT_WORK(&vss->elapsed_period, virtsnd_pcm_period_elapsed); + init_waitqueue_head(&vss->msg_empty); + spin_lock_init(&vss->lock); + } + info = kcalloc(snd->nsubstreams, sizeof(*info), GFP_KERNEL); if (!info) return -ENOMEM; @@ -352,12 +362,6 @@ int virtsnd_pcm_parse_cfg(struct virtio_snd *snd) struct virtio_pcm_substream *vss = &snd->substreams[i]; struct virtio_pcm *vpcm; - vss->snd = snd; - vss->sid = i; - INIT_WORK(&vss->elapsed_period, virtsnd_pcm_period_elapsed); - init_waitqueue_head(&vss->msg_empty); - spin_lock_init(&vss->lock); - rc = virtsnd_pcm_build_hw(vss, &info[i]); if (rc) goto on_exit;
diff --git a/sound/virtio/virtio_card.c b/sound/virtio/virtio_card.c index 965209e1d872..87dc49c9390a 100644 --- a/sound/virtio/virtio_card.c +++ b/sound/virtio/virtio_card.c @@ -365,7 +365,9 @@ static void virtsnd_remove(struct virtio_device *vdev) for (i = 0; snd->substreams && i < snd->nsubstreams; ++i) { struct virtio_pcm_substream *vss = &snd->substreams[i]; - cancel_work_sync(&vss->elapsed_period); + /* check we initialized the work struct before cancelling*/ + if (vss->elapsed_period.func) + cancel_work_sync(&vss->elapsed_period); virtsnd_pcm_msg_free(vss); }
Betty reported hitting the following warning: [ 8.709131][ T221] WARNING: CPU: 2 PID: 221 at kernel/workqueue.c:4182 ... [ 8.713282][ T221] Call trace: [ 8.713365][ T221] __flush_work+0x8d0/0x914 [ 8.713468][ T221] __cancel_work_sync+0xac/0xfc [ 8.713570][ T221] cancel_work_sync+0x24/0x34 [ 8.713667][ T221] virtsnd_remove+0xa8/0xf8 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276] [ 8.713868][ T221] virtsnd_probe+0x48c/0x664 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276] [ 8.714035][ T221] virtio_dev_probe+0x28c/0x390 [ 8.714139][ T221] really_probe+0x1bc/0x4c8 ... It seems we're hitting the error path in virtsnd_probe(), which triggers a virtsnd_remove() which iterates over the substreams calling cancel_work_sync() on the elapsed_period work_struct. Looking at the code, from earlier in: virtsnd_probe()->virtsnd_build_devs()->virtsnd_pcm_parse_cfg() We set snd->nsubstreams, allocate the snd->substreams, and if we then hit an error on the info allocation or something in virtsnd_ctl_query_info() fails, we will exit without having initialized the elapsed_period work_struct. When that error path unwinds we then call virtsnd_remove() which as long as the substreams array is allocated, will iterate through calling cancel_work_sync() on the uninitialized work struct hitting this warning. The simplest fix seems to just check the work_struct->func ptr is valid before calling cancel_work_sync(), but I wanted to share in case folks had ideas for a better solution. I have not yet managed to reproduce the issue myself, so this patch has had limited testing. Feedback or thoughts would be appreciated! Cc: Anton Yakovlev <anton.yakovlev@opensynergy.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: virtualization@lists.linux.dev Cc: linux-sound@vger.kernel.org Cc: kernel-team@android.com Reported-by: Betty Zhou <bettyzhou@google.com> Signed-off-by: John Stultz <jstultz@google.com> --- sound/virtio/virtio_card.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)