diff mbox series

[v2,09/11] ASoC: Intel: hdac_hdmi: Set ops to NULL on remove

Message ID 20190617113644.25621-10-amadeuszx.slawinski@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Fix driver reload issues | expand

Commit Message

Amadeusz Sławiński June 17, 2019, 11:36 a.m. UTC
When we unload Skylake driver we may end up calling
hdac_component_master_unbind(), it uses acomp->audio_ops, which we set
in hdmi_codec_probe(), so we need to set it to NULL in hdmi_codec_remove(),
otherwise we will dereference no longer existing pointer.

Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
---
 sound/soc/codecs/hdac_hdmi.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Ranjani Sridharan June 17, 2019, 8:51 p.m. UTC | #1
On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> When we unload Skylake driver we may end up calling
> hdac_component_master_unbind(), it uses acomp->audio_ops, which we
> set
> in hdmi_codec_probe(), so we need to set it to NULL in
> hdmi_codec_remove(),
> otherwise we will dereference no longer existing pointer.

Hi Amadeusz,

It looks like the audio_ops should be deleted snd_hdac_acomp_exit().
Also, this doesnt seem to be the case with when the SOF driver is
removed.
Could you please give a bit more context on what error you see when
this happens?

Thanks,
Ranjani
> 
> Signed-off-by: Amadeusz Sławiński <
> amadeuszx.slawinski@linux.intel.com>
> ---
>  sound/soc/codecs/hdac_hdmi.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/sound/soc/codecs/hdac_hdmi.c
> b/sound/soc/codecs/hdac_hdmi.c
> index 911bb6e2a1ac..9ee1bff548d8 100644
> --- a/sound/soc/codecs/hdac_hdmi.c
> +++ b/sound/soc/codecs/hdac_hdmi.c
> @@ -1890,6 +1890,12 @@ static void hdmi_codec_remove(struct
> snd_soc_component *component)
>  {
>  	struct hdac_hdmi_priv *hdmi =
> snd_soc_component_get_drvdata(component);
>  	struct hdac_device *hdev = hdmi->hdev;
> +	int ret;
> +
> +	ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL);
> +	if (ret < 0)
> +		dev_err(&hdev->dev, "notifier unregister failed: err:
> %d\n",
> +				ret);
>  
>  	pm_runtime_disable(&hdev->dev);
>  }
Takashi Iwai June 17, 2019, 9:36 p.m. UTC | #2
On Mon, 17 Jun 2019 22:51:42 +0200,
Ranjani Sridharan wrote:
> 
> On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> > When we unload Skylake driver we may end up calling
> > hdac_component_master_unbind(), it uses acomp->audio_ops, which we
> > set
> > in hdmi_codec_probe(), so we need to set it to NULL in
> > hdmi_codec_remove(),
> > otherwise we will dereference no longer existing pointer.
> 
> Hi Amadeusz,
> 
> It looks like the audio_ops should be deleted snd_hdac_acomp_exit().

It's a different one.  The codec driver registers / de-registers the
notifier at its probe/remove, while the controller driver takes care
of snd_hdac_acomp_init().  snd_hdac_acomp_exit() is usually not needed
to be called explicitly, as it's managed via devres.


Takashi

> Also, this doesnt seem to be the case with when the SOF driver is
> removed.
> Could you please give a bit more context on what error you see when
> this happens?
> 
> Thanks,
> Ranjani
> > 
> > Signed-off-by: Amadeusz Sławiński <
> > amadeuszx.slawinski@linux.intel.com>
> > ---
> >  sound/soc/codecs/hdac_hdmi.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/sound/soc/codecs/hdac_hdmi.c
> > b/sound/soc/codecs/hdac_hdmi.c
> > index 911bb6e2a1ac..9ee1bff548d8 100644
> > --- a/sound/soc/codecs/hdac_hdmi.c
> > +++ b/sound/soc/codecs/hdac_hdmi.c
> > @@ -1890,6 +1890,12 @@ static void hdmi_codec_remove(struct
> > snd_soc_component *component)
> >  {
> >  	struct hdac_hdmi_priv *hdmi =
> > snd_soc_component_get_drvdata(component);
> >  	struct hdac_device *hdev = hdmi->hdev;
> > +	int ret;
> > +
> > +	ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL);
> > +	if (ret < 0)
> > +		dev_err(&hdev->dev, "notifier unregister failed: err:
> > %d\n",
> > +				ret);
> >  
> >  	pm_runtime_disable(&hdev->dev);
> >  }
> 
>
Ranjani Sridharan June 18, 2019, 4:19 a.m. UTC | #3
On Mon, 2019-06-17 at 23:36 +0200, Takashi Iwai wrote:
> On Mon, 17 Jun 2019 22:51:42 +0200,
> Ranjani Sridharan wrote:
> > 
> > On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> > > When we unload Skylake driver we may end up calling
> > > hdac_component_master_unbind(), it uses acomp->audio_ops, which
> > > we
> > > set
> > > in hdmi_codec_probe(), so we need to set it to NULL in
> > > hdmi_codec_remove(),
> > > otherwise we will dereference no longer existing pointer.
> > 
> > Hi Amadeusz,
> > 
> > It looks like the audio_ops should be deleted
> > snd_hdac_acomp_exit().
> 
> It's a different one.  The codec driver registers / de-registers the
> notifier at its probe/remove, while the controller driver takes care
> of snd_hdac_acomp_init().  snd_hdac_acomp_exit() is usually not
> needed
> to be called explicitly, as it's managed via devres.

Makes sense, thanks Takashi. But I am still wondering why we havent
seen this issue with SOF yet. We run the module unload/reload stress
test as well and havent seen this yet. 

Thanks,
Ranjani
> 
> 
> Takashi
> 
> > Also, this doesnt seem to be the case with when the SOF driver is
> > removed.
> > Could you please give a bit more context on what error you see when
> > this happens?
> > 
> > Thanks,
> > Ranjani
> > > 
> > > Signed-off-by: Amadeusz Sławiński <
> > > amadeuszx.slawinski@linux.intel.com>
> > > ---
> > >  sound/soc/codecs/hdac_hdmi.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/sound/soc/codecs/hdac_hdmi.c
> > > b/sound/soc/codecs/hdac_hdmi.c
> > > index 911bb6e2a1ac..9ee1bff548d8 100644
> > > --- a/sound/soc/codecs/hdac_hdmi.c
> > > +++ b/sound/soc/codecs/hdac_hdmi.c
> > > @@ -1890,6 +1890,12 @@ static void hdmi_codec_remove(struct
> > > snd_soc_component *component)
> > >  {
> > >  	struct hdac_hdmi_priv *hdmi =
> > > snd_soc_component_get_drvdata(component);
> > >  	struct hdac_device *hdev = hdmi->hdev;
> > > +	int ret;
> > > +
> > > +	ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL);
> > > +	if (ret < 0)
> > > +		dev_err(&hdev->dev, "notifier unregister failed: err:
> > > %d\n",
> > > +				ret);
> > >  
> > >  	pm_runtime_disable(&hdev->dev);
> > >  }
> > 
> >
Takashi Iwai June 18, 2019, 5:16 a.m. UTC | #4
On Tue, 18 Jun 2019 06:19:15 +0200,
Ranjani Sridharan wrote:
> 
> On Mon, 2019-06-17 at 23:36 +0200, Takashi Iwai wrote:
> > On Mon, 17 Jun 2019 22:51:42 +0200,
> > Ranjani Sridharan wrote:
> > > 
> > > On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> > > > When we unload Skylake driver we may end up calling
> > > > hdac_component_master_unbind(), it uses acomp->audio_ops, which
> > > > we
> > > > set
> > > > in hdmi_codec_probe(), so we need to set it to NULL in
> > > > hdmi_codec_remove(),
> > > > otherwise we will dereference no longer existing pointer.
> > > 
> > > Hi Amadeusz,
> > > 
> > > It looks like the audio_ops should be deleted
> > > snd_hdac_acomp_exit().
> > 
> > It's a different one.  The codec driver registers / de-registers the
> > notifier at its probe/remove, while the controller driver takes care
> > of snd_hdac_acomp_init().  snd_hdac_acomp_exit() is usually not
> > needed
> > to be called explicitly, as it's managed via devres.
> 
> Makes sense, thanks Takashi. But I am still wondering why we havent
> seen this issue with SOF yet. We run the module unload/reload stress
> test as well and havent seen this yet. 

Usually this isn't a problem as the controller is removed at first.
But if the codec is unbound at first by some reason, it can be a
problem without unregistering, I guess.


Takashi

> 
> Thanks,
> Ranjani
> > 
> > 
> > Takashi
> > 
> > > Also, this doesnt seem to be the case with when the SOF driver is
> > > removed.
> > > Could you please give a bit more context on what error you see when
> > > this happens?
> > > 
> > > Thanks,
> > > Ranjani
> > > > 
> > > > Signed-off-by: Amadeusz Sławiński <
> > > > amadeuszx.slawinski@linux.intel.com>
> > > > ---
> > > >  sound/soc/codecs/hdac_hdmi.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > > 
> > > > diff --git a/sound/soc/codecs/hdac_hdmi.c
> > > > b/sound/soc/codecs/hdac_hdmi.c
> > > > index 911bb6e2a1ac..9ee1bff548d8 100644
> > > > --- a/sound/soc/codecs/hdac_hdmi.c
> > > > +++ b/sound/soc/codecs/hdac_hdmi.c
> > > > @@ -1890,6 +1890,12 @@ static void hdmi_codec_remove(struct
> > > > snd_soc_component *component)
> > > >  {
> > > >  	struct hdac_hdmi_priv *hdmi =
> > > > snd_soc_component_get_drvdata(component);
> > > >  	struct hdac_device *hdev = hdmi->hdev;
> > > > +	int ret;
> > > > +
> > > > +	ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL);
> > > > +	if (ret < 0)
> > > > +		dev_err(&hdev->dev, "notifier unregister failed: err:
> > > > %d\n",
> > > > +				ret);
> > > >  
> > > >  	pm_runtime_disable(&hdev->dev);
> > > >  }
> > > 
> > > 
>
Amadeusz Sławiński June 18, 2019, 11 a.m. UTC | #5
On Mon, 17 Jun 2019 13:51:42 -0700
Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:

> On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> > When we unload Skylake driver we may end up calling
> > hdac_component_master_unbind(), it uses acomp->audio_ops, which we
> > set
> > in hdmi_codec_probe(), so we need to set it to NULL in
> > hdmi_codec_remove(),
> > otherwise we will dereference no longer existing pointer.  
> 
> Hi Amadeusz,
> 
> It looks like the audio_ops should be deleted snd_hdac_acomp_exit().
> Also, this doesnt seem to be the case with when the SOF driver is
> removed.
> Could you please give a bit more context on what error you see when
> this happens?

Hi,

I get Oops. This is what happens with all other patches in this series and only this one reverted:

root@APL:~# rmmod snd_soc_sst_bxt_rt298
root@APL:~# rmmod snd_soc_hdac_hdmi
root@APL:~# rmmod snd_soc_skl
Killed

[   57.007783] BUG: unable to handle page fault for address: fffffbfff4067038
[   57.007956] #PF: supervisor read access in kernel mode
[   57.008065] #PF: error_code(0x0000) - not-present page
[   57.008173] PGD 268266067 P4D 268266067 PUD 23809a067 PMD 22b545067 PTE 0
[   57.008322] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
[   57.008453] CPU: 3 PID: 1045 Comm: rmmod Tainted: G                T 5.2.0-rc4-dev #824
[   57.008617] Hardware name: Intel Corp. Broxton P/Apollolake RVP1C, BIOS APLKRVPA.X64.0151.B25.1609151411 09/15/2016
[   57.008834] RIP: 0010:__asan_load8+0x39/0x90
[   57.008931] Code: ff ff ff ff 7f ff ff 48 39 c3 76 40 48 8d 43 07 48 89 c2 83 e2 07 48 83 fa 07 75 19 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 <0f> b6 04 10 84 c0 75 2c 5b 5d c3 48 be 00 00 00 00 00 f
c ff df 48
[   57.009299] RSP: 0018:ffff88822431fa68 EFLAGS: 00010203
[   57.009411] RAX: 1ffffffff4067038 RBX: ffffffffa03381c0 RCX: ffffffffa01bd8a4
[   57.009557] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffffffffa03381c0
[   57.009704] RBP: ffff88822431fa70 R08: ffffed1046a6d8f3 R09: ffffed1046a6d8f3
[   57.009851] R10: ffffed1046a6d8f3 R11: 0000000000000000 R12: ffff88823536c4b0
[   57.009998] R13: ffffffffa03381a0 R14: ffffffffa01bd860 R15: ffff888223108538
[   57.010147] FS:  00007fedb579f540(0000) GS:ffff888237780000(0000) knlGS:0000000000000000
[   57.010312] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.010433] CR2: fffffbfff4067038 CR3: 000000022260a000 CR4: 00000000003406e0
[   57.010580] Call Trace:
[   57.010667]  hdac_component_master_unbind+0x44/0xb0 [snd_hda_core]
[   57.010822]  ? snd_hdac_acomp_exit+0x130/0x130 [snd_hda_core]
[   57.010949]  take_down_master+0x53/0x80
[   57.011037]  component_master_del+0x76/0xa0
[   57.011144]  snd_hdac_acomp_exit+0x97/0x130 [snd_hda_core]
[   57.011275]  ? snd_hdac_display_power+0x12e/0x1d0 [snd_hda_core]
[   57.011414]  skl_free+0xbf/0xd0 [snd_soc_skl]
[   57.011519]  skl_remove+0xf1/0x110 [snd_soc_skl]
[   57.011623]  pci_device_remove+0xd9/0x1f0
[   57.011714]  ? pcibios_free_irq+0x10/0x10
[   57.011806]  ? preempt_count_sub+0x18/0xd0
[   57.011898]  ? _raw_spin_unlock_irqrestore+0x26/0x40
[   57.012009]  device_release_driver_internal+0x140/0x270
[   57.012124]  driver_detach+0x7a/0xe0
[   57.012207]  bus_remove_driver+0x95/0x160
[   57.012303]  driver_unregister+0x43/0x60
[   57.012392]  pci_unregister_driver+0x29/0x110
[   57.012501]  skl_driver_exit+0x10/0x1b [snd_soc_skl]
[   57.012610]  __x64_sys_delete_module+0x235/0x3d0
[   57.012712]  ? free_module+0x380/0x380
[   57.012804]  do_syscall_64+0xcd/0x650
[   57.012887]  ? syscall_return_slowpath+0x1e0/0x1e0
[   57.012998]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   57.013107] RIP: 0033:0x7fedb52bc1b7
[   57.013189] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
[   57.013556] RSP: 002b:00007ffcfc17ce18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[   57.013712] RAX: ffffffffffffffda RBX: 00007ffcfc17ce78 RCX: 00007fedb52bc1b7
[   57.013858] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005649a5309a98
[   57.014004] RBP: 00005649a5309a30 R08: 00007ffcfc17bd91 R09: 0000000000000000
[   57.014149] R10: 00007fedb5338cc0 R11: 0000000000000206 R12: 00007ffcfc17d040
[   57.014294] R13: 00007ffcfc17e79b R14: 00005649a5309260 R15: 00005649a5309a30
[   57.014446] Modules linked in: i2c_designware_platform i2c_designware_core snd_soc_dmic joydev x86_pkg_temp_thermal intel_powerclamp coretemp crc32c_intel serio_raw pwm_lpss_pci pwm_lpss intel_lpss_pci intel_lpss snd_soc_rt298 mei_me mei snd_soc_rt286 snd_soc_rl6347a snd_soc_skl(-) snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_hda_core snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer parport_pc lp parport ip_tables x_tables igb dca pinctrl_broxton pinctrl_intel [last unloaded: snd_soc_hdac_hdmi]
[   57.015477] CR2: fffffbfff4067038
[   57.015556] ---[ end trace 794bf9fb0862965b ]---

Amadeusz
Ranjani Sridharan June 18, 2019, 3:58 p.m. UTC | #6
On Tue, 2019-06-18 at 13:00 +0200, Amadeusz Sławiński wrote:
> On Mon, 17 Jun 2019 13:51:42 -0700
> Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:
> 
> > On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:
> > > When we unload Skylake driver we may end up calling
> > > hdac_component_master_unbind(), it uses acomp->audio_ops, which
> > > we
> > > set
> > > in hdmi_codec_probe(), so we need to set it to NULL in
> > > hdmi_codec_remove(),
> > > otherwise we will dereference no longer existing pointer.  
> > 
> > Hi Amadeusz,
> > 
> > It looks like the audio_ops should be deleted
> > snd_hdac_acomp_exit().
> > Also, this doesnt seem to be the case with when the SOF driver is
> > removed.
> > Could you please give a bit more context on what error you see when
> > this happens?
> 
> Hi,
> 
> I get Oops. This is what happens with all other patches in this
> series and only this one reverted:
> 
> root@APL:~# rmmod snd_soc_sst_bxt_rt298
> root@APL:~# rmmod snd_soc_hdac_hdmi
> root@APL:~# rmmod snd_soc_skl

Thanks, Amadeusz. I think the order in which the drivers are removed is
what's causing the oops in your case. With SOF, the order we remove is

1. rmmod sof_pci_dev
2. rmmod snd_soc_sst_bxt_rt298
3. rmmod snd_soc_hdac_hdmi

Thanks,
Ranjani
Amadeusz Sławiński June 19, 2019, 8:38 a.m. UTC | #7
On Tue, 18 Jun 2019 08:58:22 -0700
Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:

> On Tue, 2019-06-18 at 13:00 +0200, Amadeusz Sławiński wrote:
> > On Mon, 17 Jun 2019 13:51:42 -0700
> > Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:
> >   
> > > On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:  
> > > > When we unload Skylake driver we may end up calling
> > > > hdac_component_master_unbind(), it uses acomp->audio_ops, which
> > > > we
> > > > set
> > > > in hdmi_codec_probe(), so we need to set it to NULL in
> > > > hdmi_codec_remove(),
> > > > otherwise we will dereference no longer existing pointer.    
> > > 
> > > Hi Amadeusz,
> > > 
> > > It looks like the audio_ops should be deleted
> > > snd_hdac_acomp_exit().
> > > Also, this doesnt seem to be the case with when the SOF driver is
> > > removed.
> > > Could you please give a bit more context on what error you see
> > > when this happens?  
> > 
> > Hi,
> > 
> > I get Oops. This is what happens with all other patches in this
> > series and only this one reverted:
> > 
> > root@APL:~# rmmod snd_soc_sst_bxt_rt298
> > root@APL:~# rmmod snd_soc_hdac_hdmi
> > root@APL:~# rmmod snd_soc_skl  
> 
> Thanks, Amadeusz. I think the order in which the drivers are removed
> is what's causing the oops in your case. With SOF, the order we
> remove is
> 
> 1. rmmod sof_pci_dev
> 2. rmmod snd_soc_sst_bxt_rt298
> 3. rmmod snd_soc_hdac_hdmi
> 

Well, there is nothing enforcing the order in which modules can be
unloaded (and I see no reason to force it), as you can see from
following excerpt, you can either start unloading from
snd_soc_sst_bxt_rt298 or snd_soc_skl, and yes if you start from
snd_soc_skl, there is no problem.

snd_soc_sst_bxt_rt298    40960  0
snd_soc_hdac_hdmi      45056  1 snd_soc_sst_bxt_rt298
snd_soc_dmic           16384  1
snd_soc_rt298          65536  2 snd_soc_sst_bxt_rt298
snd_soc_rt286          61440  0
snd_soc_rl6347a        16384  2 snd_soc_rt298,snd_soc_rt286
snd_soc_skl           372736  0
snd_soc_sst_ipc        20480  1 snd_soc_skl
snd_soc_sst_dsp        36864  1 snd_soc_skl
snd_hda_ext_core       28672  2 snd_soc_hdac_hdmi,snd_soc_skl
snd_hda_core          139264  3
snd_hda_ext_core,snd_soc_hdac_hdmi,snd_soc_skl
snd_soc_acpi_intel_match    53248  1 snd_soc_skl snd_soc_acpi
16384  2 snd_soc_acpi_intel_match,snd_soc_skl snd_soc_core
405504  6
snd_soc_rt298,snd_soc_rt286,snd_soc_hdac_hdmi,snd_soc_skl,snd_soc_dmic,snd_soc_sst_bxt_rt298
snd_compress           36864  2 snd_soc_core,snd_soc_skl
snd_pcm_dmaengine      16384  1 snd_soc_core snd_pcm
163840  10
snd_soc_rt298,snd_soc_rt286,snd_hda_ext_core,snd_soc_hdac_hdmi,snd_compress,snd_soc_core,snd_soc_skl,snd_hda_core,snd_soc_sst_bxt_rt298,snd_pcm_dmaengine
snd_timer              53248  1 snd_pcm

Amadeusz
Ranjani Sridharan June 19, 2019, 9:09 p.m. UTC | #8
On Wed, 2019-06-19 at 10:38 +0200, Amadeusz Sławiński wrote:
> On Tue, 18 Jun 2019 08:58:22 -0700
> Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:
> 
> > On Tue, 2019-06-18 at 13:00 +0200, Amadeusz Sławiński wrote:
> > > On Mon, 17 Jun 2019 13:51:42 -0700
> > > Ranjani Sridharan <ranjani.sridharan@linux.intel.com> wrote:
> > >   
> > > > On Mon, 2019-06-17 at 13:36 +0200, Amadeusz Sławiński wrote:  
> > > > > When we unload Skylake driver we may end up calling
> > > > > hdac_component_master_unbind(), it uses acomp->audio_ops,
> > > > > which
> > > > > we
> > > > > set
> > > > > in hdmi_codec_probe(), so we need to set it to NULL in
> > > > > hdmi_codec_remove(),
> > > > > otherwise we will dereference no longer existing pointer.    
> > > > 
> > > > Hi Amadeusz,
> > > > 
> > > > It looks like the audio_ops should be deleted
> > > > snd_hdac_acomp_exit().
> > > > Also, this doesnt seem to be the case with when the SOF driver
> > > > is
> > > > removed.
> > > > Could you please give a bit more context on what error you see
> > > > when this happens?  
> > > 
> > > Hi,
> > > 
> > > I get Oops. This is what happens with all other patches in this
> > > series and only this one reverted:
> > > 
> > > root@APL:~# rmmod snd_soc_sst_bxt_rt298
> > > root@APL:~# rmmod snd_soc_hdac_hdmi
> > > root@APL:~# rmmod snd_soc_skl  
> > 
> > Thanks, Amadeusz. I think the order in which the drivers are
> > removed
> > is what's causing the oops in your case. With SOF, the order we
> > remove is
> > 
> > 1. rmmod sof_pci_dev
> > 2. rmmod snd_soc_sst_bxt_rt298
> > 3. rmmod snd_soc_hdac_hdmi
> > 
> 
> Well, there is nothing enforcing the order in which modules can be
> unloaded (and I see no reason to force it), as you can see from
> following excerpt, you can either start unloading from
> snd_soc_sst_bxt_rt298 or snd_soc_skl, and yes if you start from
> snd_soc_skl, there is no problem.
> 
I am good with this patch. I just wanted to understand why we werent
seeing this error with SOF. Sure, there's nothing enforcing the order
in which modules are unloaded  but there must be a logical order for
testing purposes. 

Pierre, can you please comment on it. I vaguely remember discussing
this with you last year.

Thanks,
Ranjani
Pierre-Louis Bossart June 20, 2019, 6:17 a.m. UTC | #9
>>>>> Could you please give a bit more context on what error you see
>>>>> when this happens?
>>>>
>>>> Hi,
>>>>
>>>> I get Oops. This is what happens with all other patches in this
>>>> series and only this one reverted:
>>>>
>>>> root@APL:~# rmmod snd_soc_sst_bxt_rt298
>>>> root@APL:~# rmmod snd_soc_hdac_hdmi
>>>> root@APL:~# rmmod snd_soc_skl
>>>
>>> Thanks, Amadeusz. I think the order in which the drivers are
>>> removed
>>> is what's causing the oops in your case. With SOF, the order we
>>> remove is
>>>
>>> 1. rmmod sof_pci_dev
>>> 2. rmmod snd_soc_sst_bxt_rt298
>>> 3. rmmod snd_soc_hdac_hdmi
>>>
>>
>> Well, there is nothing enforcing the order in which modules can be
>> unloaded (and I see no reason to force it), as you can see from
>> following excerpt, you can either start unloading from
>> snd_soc_sst_bxt_rt298 or snd_soc_skl, and yes if you start from
>> snd_soc_skl, there is no problem.

there is a fundamental dependency that you are ignoring: the module 
snd_soc_sst_bxt_rt298 is a machine driver which will be probed when 
snd_soc_skl creates a platform_device.
Sure you can remove modules in a different order, but that's a bit of an 
artificial/academic exercise isn't it?

>>
> I am good with this patch. I just wanted to understand why we werent
> seeing this error with SOF. Sure, there's nothing enforcing the order
> in which modules are unloaded  but there must be a logical order for
> testing purposes.
> 
> Pierre, can you please comment on it. I vaguely remember discussing
> this with you last year.

Our tests remove the modules by taking care of dependencies and it's 
already unveiled dozens of issues.
We could add a sequence similar to Amadeusz and unbind the modules which 
are loaded with the creation of a platform_device (machine driver, 
dmic), I am just not sure how of useful this would be.
Amadeusz Sławiński June 24, 2019, 7:50 a.m. UTC | #10
On Thu, 20 Jun 2019 08:17:33 +0200
Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:

> >>>>> Could you please give a bit more context on what error you see
> >>>>> when this happens?  
> >>>>
> >>>> Hi,
> >>>>
> >>>> I get Oops. This is what happens with all other patches in this
> >>>> series and only this one reverted:
> >>>>
> >>>> root@APL:~# rmmod snd_soc_sst_bxt_rt298
> >>>> root@APL:~# rmmod snd_soc_hdac_hdmi
> >>>> root@APL:~# rmmod snd_soc_skl  
> >>>
> >>> Thanks, Amadeusz. I think the order in which the drivers are
> >>> removed
> >>> is what's causing the oops in your case. With SOF, the order we
> >>> remove is
> >>>
> >>> 1. rmmod sof_pci_dev
> >>> 2. rmmod snd_soc_sst_bxt_rt298
> >>> 3. rmmod snd_soc_hdac_hdmi
> >>>  
> >>
> >> Well, there is nothing enforcing the order in which modules can be
> >> unloaded (and I see no reason to force it), as you can see from
> >> following excerpt, you can either start unloading from
> >> snd_soc_sst_bxt_rt298 or snd_soc_skl, and yes if you start from
> >> snd_soc_skl, there is no problem.  
> 
> there is a fundamental dependency that you are ignoring: the module 
> snd_soc_sst_bxt_rt298 is a machine driver which will be probed when 
> snd_soc_skl creates a platform_device.
> Sure you can remove modules in a different order, but that's a bit of
> an artificial/academic exercise isn't it?
> 
> >>  
> > I am good with this patch. I just wanted to understand why we werent
> > seeing this error with SOF. Sure, there's nothing enforcing the
> > order in which modules are unloaded  but there must be a logical
> > order for testing purposes.
> > 
> > Pierre, can you please comment on it. I vaguely remember discussing
> > this with you last year.  
> 
> Our tests remove the modules by taking care of dependencies and it's 
> already unveiled dozens of issues.
> We could add a sequence similar to Amadeusz and unbind the modules
> which are loaded with the creation of a platform_device (machine
> driver, dmic), I am just not sure how of useful this would be.

You work under the assumption that users will remove modules in
"correct" order. Because it is not enforced by modules dependencies you
can expect users to do everything possible at some point in time. In
this case unloading modules in not expected order will lead to kernel
Oops, which is not what should happen.
diff mbox series

Patch

diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c
index 911bb6e2a1ac..9ee1bff548d8 100644
--- a/sound/soc/codecs/hdac_hdmi.c
+++ b/sound/soc/codecs/hdac_hdmi.c
@@ -1890,6 +1890,12 @@  static void hdmi_codec_remove(struct snd_soc_component *component)
 {
 	struct hdac_hdmi_priv *hdmi = snd_soc_component_get_drvdata(component);
 	struct hdac_device *hdev = hdmi->hdev;
+	int ret;
+
+	ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL);
+	if (ret < 0)
+		dev_err(&hdev->dev, "notifier unregister failed: err: %d\n",
+				ret);
 
 	pm_runtime_disable(&hdev->dev);
 }