diff mbox series

[RFC] sound/virtio: Fix cancel_sync warnings on uninitialized work_structs

Message ID 20250116062514.2391108-1-jstultz@google.com (mailing list archive)
State Superseded
Headers show
Series [RFC] sound/virtio: Fix cancel_sync warnings on uninitialized work_structs | expand

Commit Message

John Stultz Jan. 16, 2025, 6:25 a.m. UTC
Betty reported hitting the following warning:

[    8.709131][  T221] WARNING: CPU: 2 PID: 221 at kernel/workqueue.c:4182
...
[    8.713282][  T221] Call trace:
[    8.713365][  T221]  __flush_work+0x8d0/0x914
[    8.713468][  T221]  __cancel_work_sync+0xac/0xfc
[    8.713570][  T221]  cancel_work_sync+0x24/0x34
[    8.713667][  T221]  virtsnd_remove+0xa8/0xf8 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276]
[    8.713868][  T221]  virtsnd_probe+0x48c/0x664 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276]
[    8.714035][  T221]  virtio_dev_probe+0x28c/0x390
[    8.714139][  T221]  really_probe+0x1bc/0x4c8
...

It seems we're hitting the error path in virtsnd_probe(), which
triggers a virtsnd_remove() which iterates over the substreams
calling cancel_work_sync() on the elapsed_period work_struct.

Looking at the code, from earlier in:
virtsnd_probe()->virtsnd_build_devs()->virtsnd_pcm_parse_cfg()

We set snd->nsubstreams, allocate the snd->substreams, and if
we then hit an error on the info allocation or something in
virtsnd_ctl_query_info() fails, we will exit without having
initialized the elapsed_period work_struct.

When that error path unwinds we then call virtsnd_remove()
which as long as the substreams array is allocated, will iterate
through calling cancel_work_sync() on the uninitialized work
struct hitting this warning.

The simplest fix seems to just check the work_struct->func ptr
is valid before calling cancel_work_sync(), but I wanted to share
in case folks had ideas for a better solution.

I have not yet managed to reproduce the issue myself, so this
patch has had limited testing.

Feedback or thoughts would be appreciated!

Cc: Anton Yakovlev <anton.yakovlev@opensynergy.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: virtualization@lists.linux.dev
Cc: linux-sound@vger.kernel.org
Cc: kernel-team@android.com
Reported-by: Betty Zhou <bettyzhou@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
---
 sound/virtio/virtio_card.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Takashi Iwai Jan. 16, 2025, 1:56 p.m. UTC | #1
On Thu, 16 Jan 2025 07:25:08 +0100,
John Stultz wrote:
> 
> Betty reported hitting the following warning:
> 
> [    8.709131][  T221] WARNING: CPU: 2 PID: 221 at kernel/workqueue.c:4182
> ...
> [    8.713282][  T221] Call trace:
> [    8.713365][  T221]  __flush_work+0x8d0/0x914
> [    8.713468][  T221]  __cancel_work_sync+0xac/0xfc
> [    8.713570][  T221]  cancel_work_sync+0x24/0x34
> [    8.713667][  T221]  virtsnd_remove+0xa8/0xf8 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276]
> [    8.713868][  T221]  virtsnd_probe+0x48c/0x664 [virtio_snd ab15f34d0dd772f6d11327e08a81d46dc9c36276]
> [    8.714035][  T221]  virtio_dev_probe+0x28c/0x390
> [    8.714139][  T221]  really_probe+0x1bc/0x4c8
> ...
> 
> It seems we're hitting the error path in virtsnd_probe(), which
> triggers a virtsnd_remove() which iterates over the substreams
> calling cancel_work_sync() on the elapsed_period work_struct.
> 
> Looking at the code, from earlier in:
> virtsnd_probe()->virtsnd_build_devs()->virtsnd_pcm_parse_cfg()
> 
> We set snd->nsubstreams, allocate the snd->substreams, and if
> we then hit an error on the info allocation or something in
> virtsnd_ctl_query_info() fails, we will exit without having
> initialized the elapsed_period work_struct.
> 
> When that error path unwinds we then call virtsnd_remove()
> which as long as the substreams array is allocated, will iterate
> through calling cancel_work_sync() on the uninitialized work
> struct hitting this warning.
> 
> The simplest fix seems to just check the work_struct->func ptr
> is valid before calling cancel_work_sync(), but I wanted to share
> in case folks had ideas for a better solution.
> 
> I have not yet managed to reproduce the issue myself, so this
> patch has had limited testing.
> 
> Feedback or thoughts would be appreciated!

The conditional canceling is a bit flaky, IMHO.
Alternatively, we can initialize the whole stuff right after the
allocation, e.g. something like below.


thanks,

Takashi

--- a/sound/virtio/virtio_pcm.c
+++ b/sound/virtio/virtio_pcm.c
@@ -339,6 +339,16 @@ int virtsnd_pcm_parse_cfg(struct virtio_snd *snd)
 	if (!snd->substreams)
 		return -ENOMEM;
 
+	for (i = 0; i < snd->nsubstreams; ++i) {
+		struct virtio_pcm_substream *vss = &snd->substreams[i];
+
+		vss->snd = snd;
+		vss->sid = i;
+		INIT_WORK(&vss->elapsed_period, virtsnd_pcm_period_elapsed);
+		init_waitqueue_head(&vss->msg_empty);
+		spin_lock_init(&vss->lock);
+	}
+
 	info = kcalloc(snd->nsubstreams, sizeof(*info), GFP_KERNEL);
 	if (!info)
 		return -ENOMEM;
@@ -352,12 +362,6 @@ int virtsnd_pcm_parse_cfg(struct virtio_snd *snd)
 		struct virtio_pcm_substream *vss = &snd->substreams[i];
 		struct virtio_pcm *vpcm;
 
-		vss->snd = snd;
-		vss->sid = i;
-		INIT_WORK(&vss->elapsed_period, virtsnd_pcm_period_elapsed);
-		init_waitqueue_head(&vss->msg_empty);
-		spin_lock_init(&vss->lock);
-
 		rc = virtsnd_pcm_build_hw(vss, &info[i]);
 		if (rc)
 			goto on_exit;
diff mbox series

Patch

diff --git a/sound/virtio/virtio_card.c b/sound/virtio/virtio_card.c
index 965209e1d872..87dc49c9390a 100644
--- a/sound/virtio/virtio_card.c
+++ b/sound/virtio/virtio_card.c
@@ -365,7 +365,9 @@  static void virtsnd_remove(struct virtio_device *vdev)
 	for (i = 0; snd->substreams && i < snd->nsubstreams; ++i) {
 		struct virtio_pcm_substream *vss = &snd->substreams[i];
 
-		cancel_work_sync(&vss->elapsed_period);
+		/* check we initialized the work struct before cancelling*/
+		if (vss->elapsed_period.func)
+			cancel_work_sync(&vss->elapsed_period);
 		virtsnd_pcm_msg_free(vss);
 	}