diff mbox series

ALSA: hda: intel: Fix Optimus when GPU has no sound

Message ID 20240904203955.245085-1-maxtram95@gmail.com (mailing list archive)
State New
Headers show
Series ALSA: hda: intel: Fix Optimus when GPU has no sound | expand

Commit Message

Maxim Mikityanskiy Sept. 4, 2024, 8:39 p.m. UTC
Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
the discrete GPU. snd_hda_intel probes the device and schedules
azx_probe_continue(), which fails at azx_first_init(). The driver ends
up probed, but calls azx_free() and stops the chip. However, from the
runtime PM point of view, the device remains active, because the PCI
subsystem makes it active on probe, and it's still bound. It prevents
vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
power management for the video and audio devices).

Fix it by forcing the device to the suspended state in azx_free().

Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
---
 sound/pci/hda/hda_intel.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

Comments

Takashi Iwai Sept. 5, 2024, 2:24 p.m. UTC | #1
On Wed, 04 Sep 2024 22:39:55 +0200,
Maxim Mikityanskiy wrote:
> 
> Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> the discrete GPU. snd_hda_intel probes the device and schedules
> azx_probe_continue(), which fails at azx_first_init(). The driver ends
> up probed, but calls azx_free() and stops the chip. However, from the
> runtime PM point of view, the device remains active, because the PCI
> subsystem makes it active on probe, and it's still bound. It prevents
> vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> power management for the video and audio devices).
> 
> Fix it by forcing the device to the suspended state in azx_free().
> 
> Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>

What if this device probe is skipped (e.g. adding your device entry to
driver_denylist[] in hda_intel.c)?  Is the device also in the
runtime-suspended state?


thanks,

Takashi

> ---
>  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> index b79020adce63..65fcb92e11c7 100644
> --- a/sound/pci/hda/hda_intel.c
> +++ b/sound/pci/hda/hda_intel.c
> @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
>  	if (use_vga_switcheroo(hda)) {
>  		if (chip->disabled && hda->probe_continued)
>  			snd_hda_unlock_devices(&chip->bus);
> -		if (hda->vga_switcheroo_registered)
> +		if (hda->vga_switcheroo_registered) {
>  			vga_switcheroo_unregister_client(chip->pci);
> +
> +			/* Some GPUs don't have sound, and azx_first_init fails,
> +			 * leaving the device probed but non-functional. As long
> +			 * as it's probed, the PCI subsystem keeps its runtime
> +			 * PM status as active. Force it to suspended (as we
> +			 * actually stop the chip) to allow GPU to suspend via
> +			 * vga_switcheroo.
> +			 */
> +			pm_runtime_disable(&pci->dev);
> +			pm_runtime_set_suspended(&pci->dev);
> +			pm_runtime_enable(&pci->dev);
> +		}
>  	}
>  
>  	if (bus->chip_init) {
> -- 
> 2.46.0
>
Maxim Mikityanskiy Sept. 6, 2024, 6:05 p.m. UTC | #2
On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> On Wed, 04 Sep 2024 22:39:55 +0200,
> Maxim Mikityanskiy wrote:
> > 
> > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on

Spotted a typo: s/520M/540M/

> > the discrete GPU. snd_hda_intel probes the device and schedules
> > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > up probed, but calls azx_free() and stops the chip. However, from the
> > runtime PM point of view, the device remains active, because the PCI
> > subsystem makes it active on probe, and it's still bound. It prevents
> > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > power management for the video and audio devices).
> > 
> > Fix it by forcing the device to the suspended state in azx_free().
> > 
> > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> 
> What if this device probe is skipped (e.g. adding your device entry to
> driver_denylist[] in hda_intel.c)?  Is the device also in the
> runtime-suspended state?

I added the following:

{ PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },

The probe was apparently skipped (the device is not attached to a
driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
DynOff.

However, I'm not sure whether it's a good idea to blacklist 540M
entirely, as there might be other laptops with this GPU that have sound,
and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.

Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
although I don't know whether it can be done without recompiling the
kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.

(Note that I need to set nouveau.modeset=2 either way, otherwise it
stays in DynPwr if the screen is on.)

> 
> 
> thanks,
> 
> Takashi
> 
> > ---
> >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > index b79020adce63..65fcb92e11c7 100644
> > --- a/sound/pci/hda/hda_intel.c
> > +++ b/sound/pci/hda/hda_intel.c
> > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> >  	if (use_vga_switcheroo(hda)) {
> >  		if (chip->disabled && hda->probe_continued)
> >  			snd_hda_unlock_devices(&chip->bus);
> > -		if (hda->vga_switcheroo_registered)
> > +		if (hda->vga_switcheroo_registered) {
> >  			vga_switcheroo_unregister_client(chip->pci);
> > +
> > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > +			 * leaving the device probed but non-functional. As long
> > +			 * as it's probed, the PCI subsystem keeps its runtime
> > +			 * PM status as active. Force it to suspended (as we
> > +			 * actually stop the chip) to allow GPU to suspend via
> > +			 * vga_switcheroo.
> > +			 */
> > +			pm_runtime_disable(&pci->dev);
> > +			pm_runtime_set_suspended(&pci->dev);
> > +			pm_runtime_enable(&pci->dev);
> > +		}
> >  	}
> >  
> >  	if (bus->chip_init) {
> > -- 
> > 2.46.0
> >
Takashi Iwai Sept. 7, 2024, 10:06 a.m. UTC | #3
On Fri, 06 Sep 2024 20:05:06 +0200,
Maxim Mikityanskiy wrote:
> 
> On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> > On Wed, 04 Sep 2024 22:39:55 +0200,
> > Maxim Mikityanskiy wrote:
> > > 
> > > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> 
> Spotted a typo: s/520M/540M/
> 
> > > the discrete GPU. snd_hda_intel probes the device and schedules
> > > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > > up probed, but calls azx_free() and stops the chip. However, from the
> > > runtime PM point of view, the device remains active, because the PCI
> > > subsystem makes it active on probe, and it's still bound. It prevents
> > > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > > power management for the video and audio devices).
> > > 
> > > Fix it by forcing the device to the suspended state in azx_free().
> > > 
> > > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > 
> > What if this device probe is skipped (e.g. adding your device entry to
> > driver_denylist[] in hda_intel.c)?  Is the device also in the
> > runtime-suspended state?
> 
> I added the following:
> 
> { PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },
> 
> The probe was apparently skipped (the device is not attached to a
> driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
> DynOff.

OK, that's good.

> However, I'm not sure whether it's a good idea to blacklist 540M
> entirely, as there might be other laptops with this GPU that have sound,
> and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.

You should specify the PCI SSID of your device instead of 0:0 in the
above, so that it's picked up only for yours.


Takashi

> Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
> although I don't know whether it can be done without recompiling the
> kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.
> 
> (Note that I need to set nouveau.modeset=2 either way, otherwise it
> stays in DynPwr if the screen is on.)
> 
> > 
> > 
> > thanks,
> > 
> > Takashi
> > 
> > > ---
> > >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > > index b79020adce63..65fcb92e11c7 100644
> > > --- a/sound/pci/hda/hda_intel.c
> > > +++ b/sound/pci/hda/hda_intel.c
> > > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> > >  	if (use_vga_switcheroo(hda)) {
> > >  		if (chip->disabled && hda->probe_continued)
> > >  			snd_hda_unlock_devices(&chip->bus);
> > > -		if (hda->vga_switcheroo_registered)
> > > +		if (hda->vga_switcheroo_registered) {
> > >  			vga_switcheroo_unregister_client(chip->pci);
> > > +
> > > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > > +			 * leaving the device probed but non-functional. As long
> > > +			 * as it's probed, the PCI subsystem keeps its runtime
> > > +			 * PM status as active. Force it to suspended (as we
> > > +			 * actually stop the chip) to allow GPU to suspend via
> > > +			 * vga_switcheroo.
> > > +			 */
> > > +			pm_runtime_disable(&pci->dev);
> > > +			pm_runtime_set_suspended(&pci->dev);
> > > +			pm_runtime_enable(&pci->dev);
> > > +		}
> > >  	}
> > >  
> > >  	if (bus->chip_init) {
> > > -- 
> > > 2.46.0
> > >
Maxim Mikityanskiy Sept. 7, 2024, 1:45 p.m. UTC | #4
On Sat, 07 Sep 2024 at 12:06:25 +0200, Takashi Iwai wrote:
> On Fri, 06 Sep 2024 20:05:06 +0200,
> Maxim Mikityanskiy wrote:
> > 
> > On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> > > On Wed, 04 Sep 2024 22:39:55 +0200,
> > > Maxim Mikityanskiy wrote:
> > > > 
> > > > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> > 
> > Spotted a typo: s/520M/540M/
> > 
> > > > the discrete GPU. snd_hda_intel probes the device and schedules
> > > > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > > > up probed, but calls azx_free() and stops the chip. However, from the
> > > > runtime PM point of view, the device remains active, because the PCI
> > > > subsystem makes it active on probe, and it's still bound. It prevents
> > > > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > > > power management for the video and audio devices).
> > > > 
> > > > Fix it by forcing the device to the suspended state in azx_free().
> > > > 
> > > > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > > 
> > > What if this device probe is skipped (e.g. adding your device entry to
> > > driver_denylist[] in hda_intel.c)?  Is the device also in the
> > > runtime-suspended state?
> > 
> > I added the following:
> > 
> > { PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },
> > 
> > The probe was apparently skipped (the device is not attached to a
> > driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
> > DynOff.
> 
> OK, that's good.
> 
> > However, I'm not sure whether it's a good idea to blacklist 540M
> > entirely, as there might be other laptops with this GPU that have sound,
> > and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.
> 
> You should specify the PCI SSID of your device instead of 0:0 in the
> above, so that it's picked up only for yours.

Where can I get it?

# cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_vendor
0x0000
# cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_device
0x0000

Is it not the right place?

> 
> Takashi
> 
> > Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
> > although I don't know whether it can be done without recompiling the
> > kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.
> > 
> > (Note that I need to set nouveau.modeset=2 either way, otherwise it
> > stays in DynPwr if the screen is on.)
> > 
> > > 
> > > 
> > > thanks,
> > > 
> > > Takashi
> > > 
> > > > ---
> > > >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > > > index b79020adce63..65fcb92e11c7 100644
> > > > --- a/sound/pci/hda/hda_intel.c
> > > > +++ b/sound/pci/hda/hda_intel.c
> > > > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> > > >  	if (use_vga_switcheroo(hda)) {
> > > >  		if (chip->disabled && hda->probe_continued)
> > > >  			snd_hda_unlock_devices(&chip->bus);
> > > > -		if (hda->vga_switcheroo_registered)
> > > > +		if (hda->vga_switcheroo_registered) {
> > > >  			vga_switcheroo_unregister_client(chip->pci);
> > > > +
> > > > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > > > +			 * leaving the device probed but non-functional. As long
> > > > +			 * as it's probed, the PCI subsystem keeps its runtime
> > > > +			 * PM status as active. Force it to suspended (as we
> > > > +			 * actually stop the chip) to allow GPU to suspend via
> > > > +			 * vga_switcheroo.
> > > > +			 */
> > > > +			pm_runtime_disable(&pci->dev);
> > > > +			pm_runtime_set_suspended(&pci->dev);
> > > > +			pm_runtime_enable(&pci->dev);
> > > > +		}
> > > >  	}
> > > >  
> > > >  	if (bus->chip_init) {
> > > > -- 
> > > > 2.46.0
> > > >
Takashi Iwai Sept. 7, 2024, 3:49 p.m. UTC | #5
On Sat, 07 Sep 2024 15:45:41 +0200,
Maxim Mikityanskiy wrote:
> 
> On Sat, 07 Sep 2024 at 12:06:25 +0200, Takashi Iwai wrote:
> > On Fri, 06 Sep 2024 20:05:06 +0200,
> > Maxim Mikityanskiy wrote:
> > > 
> > > On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> > > > On Wed, 04 Sep 2024 22:39:55 +0200,
> > > > Maxim Mikityanskiy wrote:
> > > > > 
> > > > > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> > > 
> > > Spotted a typo: s/520M/540M/
> > > 
> > > > > the discrete GPU. snd_hda_intel probes the device and schedules
> > > > > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > > > > up probed, but calls azx_free() and stops the chip. However, from the
> > > > > runtime PM point of view, the device remains active, because the PCI
> > > > > subsystem makes it active on probe, and it's still bound. It prevents
> > > > > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > > > > power management for the video and audio devices).
> > > > > 
> > > > > Fix it by forcing the device to the suspended state in azx_free().
> > > > > 
> > > > > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > > > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > > > 
> > > > What if this device probe is skipped (e.g. adding your device entry to
> > > > driver_denylist[] in hda_intel.c)?  Is the device also in the
> > > > runtime-suspended state?
> > > 
> > > I added the following:
> > > 
> > > { PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },
> > > 
> > > The probe was apparently skipped (the device is not attached to a
> > > driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
> > > DynOff.
> > 
> > OK, that's good.
> > 
> > > However, I'm not sure whether it's a good idea to blacklist 540M
> > > entirely, as there might be other laptops with this GPU that have sound,
> > > and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.
> > 
> > You should specify the PCI SSID of your device instead of 0:0 in the
> > above, so that it's picked up only for yours.
> 
> Where can I get it?
> 
> # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_vendor
> 0x0000
> # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_device
> 0x0000

Ouch, Lenovo didn't set it right.
Alternatively we may introduce a deny list based on DMI.  Hmm...


Takashi


> 
> Is it not the right place?
> 
> > 
> > Takashi
> > 
> > > Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
> > > although I don't know whether it can be done without recompiling the
> > > kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.
> > > 
> > > (Note that I need to set nouveau.modeset=2 either way, otherwise it
> > > stays in DynPwr if the screen is on.)
> > > 
> > > > 
> > > > 
> > > > thanks,
> > > > 
> > > > Takashi
> > > > 
> > > > > ---
> > > > >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > > > > index b79020adce63..65fcb92e11c7 100644
> > > > > --- a/sound/pci/hda/hda_intel.c
> > > > > +++ b/sound/pci/hda/hda_intel.c
> > > > > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> > > > >  	if (use_vga_switcheroo(hda)) {
> > > > >  		if (chip->disabled && hda->probe_continued)
> > > > >  			snd_hda_unlock_devices(&chip->bus);
> > > > > -		if (hda->vga_switcheroo_registered)
> > > > > +		if (hda->vga_switcheroo_registered) {
> > > > >  			vga_switcheroo_unregister_client(chip->pci);
> > > > > +
> > > > > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > > > > +			 * leaving the device probed but non-functional. As long
> > > > > +			 * as it's probed, the PCI subsystem keeps its runtime
> > > > > +			 * PM status as active. Force it to suspended (as we
> > > > > +			 * actually stop the chip) to allow GPU to suspend via
> > > > > +			 * vga_switcheroo.
> > > > > +			 */
> > > > > +			pm_runtime_disable(&pci->dev);
> > > > > +			pm_runtime_set_suspended(&pci->dev);
> > > > > +			pm_runtime_enable(&pci->dev);
> > > > > +		}
> > > > >  	}
> > > > >  
> > > > >  	if (bus->chip_init) {
> > > > > -- 
> > > > > 2.46.0
> > > > >
Maxim Mikityanskiy Sept. 7, 2024, 5:24 p.m. UTC | #6
On Sat, 07 Sep 2024 at 17:49:11 +0200, Takashi Iwai wrote:
> On Sat, 07 Sep 2024 15:45:41 +0200,
> Maxim Mikityanskiy wrote:
> > 
> > On Sat, 07 Sep 2024 at 12:06:25 +0200, Takashi Iwai wrote:
> > > On Fri, 06 Sep 2024 20:05:06 +0200,
> > > Maxim Mikityanskiy wrote:
> > > > 
> > > > On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> > > > > On Wed, 04 Sep 2024 22:39:55 +0200,
> > > > > Maxim Mikityanskiy wrote:
> > > > > > 
> > > > > > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> > > > 
> > > > Spotted a typo: s/520M/540M/
> > > > 
> > > > > > the discrete GPU. snd_hda_intel probes the device and schedules
> > > > > > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > > > > > up probed, but calls azx_free() and stops the chip. However, from the
> > > > > > runtime PM point of view, the device remains active, because the PCI
> > > > > > subsystem makes it active on probe, and it's still bound. It prevents
> > > > > > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > > > > > power management for the video and audio devices).
> > > > > > 
> > > > > > Fix it by forcing the device to the suspended state in azx_free().
> > > > > > 
> > > > > > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > > > > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > > > > 
> > > > > What if this device probe is skipped (e.g. adding your device entry to
> > > > > driver_denylist[] in hda_intel.c)?  Is the device also in the
> > > > > runtime-suspended state?
> > > > 
> > > > I added the following:
> > > > 
> > > > { PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },
> > > > 
> > > > The probe was apparently skipped (the device is not attached to a
> > > > driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
> > > > DynOff.
> > > 
> > > OK, that's good.
> > > 
> > > > However, I'm not sure whether it's a good idea to blacklist 540M
> > > > entirely, as there might be other laptops with this GPU that have sound,
> > > > and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.
> > > 
> > > You should specify the PCI SSID of your device instead of 0:0 in the
> > > above, so that it's picked up only for yours.
> > 
> > Where can I get it?
> > 
> > # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_vendor
> > 0x0000
> > # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_device
> > 0x0000
> 
> Ouch, Lenovo didn't set it right.
> Alternatively we may introduce a deny list based on DMI.  Hmm...

A DMI-based quirk sounds better than blocking any NVIDIA 540M, it would
also allow handling alternative GPUs on this laptop model.

But could you explain what's wrong with a generic approach that I
suggest (probe_continue failed => mark as suspended)? It doesn't require
any lists. Any drawbacks?

> 
> 
> Takashi
> 
> 
> > 
> > Is it not the right place?
> > 
> > > 
> > > Takashi
> > > 
> > > > Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
> > > > although I don't know whether it can be done without recompiling the
> > > > kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.
> > > > 
> > > > (Note that I need to set nouveau.modeset=2 either way, otherwise it
> > > > stays in DynPwr if the screen is on.)
> > > > 
> > > > > 
> > > > > 
> > > > > thanks,
> > > > > 
> > > > > Takashi
> > > > > 
> > > > > > ---
> > > > > >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> > > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > > > > > index b79020adce63..65fcb92e11c7 100644
> > > > > > --- a/sound/pci/hda/hda_intel.c
> > > > > > +++ b/sound/pci/hda/hda_intel.c
> > > > > > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> > > > > >  	if (use_vga_switcheroo(hda)) {
> > > > > >  		if (chip->disabled && hda->probe_continued)
> > > > > >  			snd_hda_unlock_devices(&chip->bus);
> > > > > > -		if (hda->vga_switcheroo_registered)
> > > > > > +		if (hda->vga_switcheroo_registered) {
> > > > > >  			vga_switcheroo_unregister_client(chip->pci);
> > > > > > +
> > > > > > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > > > > > +			 * leaving the device probed but non-functional. As long
> > > > > > +			 * as it's probed, the PCI subsystem keeps its runtime
> > > > > > +			 * PM status as active. Force it to suspended (as we
> > > > > > +			 * actually stop the chip) to allow GPU to suspend via
> > > > > > +			 * vga_switcheroo.
> > > > > > +			 */
> > > > > > +			pm_runtime_disable(&pci->dev);
> > > > > > +			pm_runtime_set_suspended(&pci->dev);
> > > > > > +			pm_runtime_enable(&pci->dev);
> > > > > > +		}
> > > > > >  	}
> > > > > >  
> > > > > >  	if (bus->chip_init) {
> > > > > > -- 
> > > > > > 2.46.0
> > > > > >
Takashi Iwai Sept. 7, 2024, 6:10 p.m. UTC | #7
On Sat, 07 Sep 2024 19:24:41 +0200,
Maxim Mikityanskiy wrote:
> 
> On Sat, 07 Sep 2024 at 17:49:11 +0200, Takashi Iwai wrote:
> > On Sat, 07 Sep 2024 15:45:41 +0200,
> > Maxim Mikityanskiy wrote:
> > > 
> > > On Sat, 07 Sep 2024 at 12:06:25 +0200, Takashi Iwai wrote:
> > > > On Fri, 06 Sep 2024 20:05:06 +0200,
> > > > Maxim Mikityanskiy wrote:
> > > > > 
> > > > > On Thu, 05 Sep 2024 at 16:24:26 +0200, Takashi Iwai wrote:
> > > > > > On Wed, 04 Sep 2024 22:39:55 +0200,
> > > > > > Maxim Mikityanskiy wrote:
> > > > > > > 
> > > > > > > Lenovo IdeaPad Z570 with NVIDIA GeForce Ge 520M doesn't have sound on
> > > > > 
> > > > > Spotted a typo: s/520M/540M/
> > > > > 
> > > > > > > the discrete GPU. snd_hda_intel probes the device and schedules
> > > > > > > azx_probe_continue(), which fails at azx_first_init(). The driver ends
> > > > > > > up probed, but calls azx_free() and stops the chip. However, from the
> > > > > > > runtime PM point of view, the device remains active, because the PCI
> > > > > > > subsystem makes it active on probe, and it's still bound. It prevents
> > > > > > > vga_switcheroo from turning off the DGPU (pci_create_device_link() syncs
> > > > > > > power management for the video and audio devices).
> > > > > > > 
> > > > > > > Fix it by forcing the device to the suspended state in azx_free().
> > > > > > > 
> > > > > > > Fixes: 07f4f97d7b4b ("vga_switcheroo: Use device link for HDA controller")
> > > > > > > Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> > > > > > 
> > > > > > What if this device probe is skipped (e.g. adding your device entry to
> > > > > > driver_denylist[] in hda_intel.c)?  Is the device also in the
> > > > > > runtime-suspended state?
> > > > > 
> > > > > I added the following:
> > > > > 
> > > > > { PCI_DEVICE_SUB(0x10de, 0x0bea, 0x0000, 0x0000) },
> > > > > 
> > > > > The probe was apparently skipped (the device is not attached to a
> > > > > driver), runtime_status=suspended, runtime_usage=0, the GPU goes to
> > > > > DynOff.
> > > > 
> > > > OK, that's good.
> > > > 
> > > > > However, I'm not sure whether it's a good idea to blacklist 540M
> > > > > entirely, as there might be other laptops with this GPU that have sound,
> > > > > and AFAIK there are variants of Lenovo Z570 with other NVIDIA GPUs.
> > > > 
> > > > You should specify the PCI SSID of your device instead of 0:0 in the
> > > > above, so that it's picked up only for yours.
> > > 
> > > Where can I get it?
> > > 
> > > # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_vendor
> > > 0x0000
> > > # cat /sys/bus/pci/devices/0000\:01\:00.1/subsystem_device
> > > 0x0000
> > 
> > Ouch, Lenovo didn't set it right.
> > Alternatively we may introduce a deny list based on DMI.  Hmm...
> 
> A DMI-based quirk sounds better than blocking any NVIDIA 540M, it would
> also allow handling alternative GPUs on this laptop model.
> 
> But could you explain what's wrong with a generic approach that I
> suggest (probe_continue failed => mark as suspended)? It doesn't require
> any lists. Any drawbacks?

As you noticed, it will leave the device bound with driver, i.e. this
looks as if it were operative.  It means that the sound driver is
still responsible for suspend/resume or whatever action even though
it's totally useless.  In that sense, it makes more sense to give it
away at the early stage before actually binding it, and the deny list
is exactly for that.

Apart from that, the suggested change itself can be applied
independently from the deny list update, too.  It'd be good for other
similar devices, too.

Though, a slight concern is the sequence of runtime PM you applied in
the patch.  Is it a standard idiom to perform pm_runtime_disable(),
set_suspended() and enable()?  Also, azx_free() is the common
destructor, hence it's called also at the regular driver unbinding.
I'm not entirely sure whether it's OK to call there also at
unbinding.


thanks,

Takashi

> 
> > 
> > 
> > Takashi
> > 
> > 
> > > 
> > > Is it not the right place?
> > > 
> > > > 
> > > > Takashi
> > > > 
> > > > > Another way to make vga_switcheroo work is to disable quirk_nvidia_hda,
> > > > > although I don't know whether it can be done without recompiling the
> > > > > kernel. In this case, 0000:01:00.1 doesn't even appear on the bus.
> > > > > 
> > > > > (Note that I need to set nouveau.modeset=2 either way, otherwise it
> > > > > stays in DynPwr if the screen is on.)
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > thanks,
> > > > > > 
> > > > > > Takashi
> > > > > > 
> > > > > > > ---
> > > > > > >  sound/pci/hda/hda_intel.c | 14 +++++++++++++-
> > > > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> > > > > > > index b79020adce63..65fcb92e11c7 100644
> > > > > > > --- a/sound/pci/hda/hda_intel.c
> > > > > > > +++ b/sound/pci/hda/hda_intel.c
> > > > > > > @@ -1361,8 +1361,20 @@ static void azx_free(struct azx *chip)
> > > > > > >  	if (use_vga_switcheroo(hda)) {
> > > > > > >  		if (chip->disabled && hda->probe_continued)
> > > > > > >  			snd_hda_unlock_devices(&chip->bus);
> > > > > > > -		if (hda->vga_switcheroo_registered)
> > > > > > > +		if (hda->vga_switcheroo_registered) {
> > > > > > >  			vga_switcheroo_unregister_client(chip->pci);
> > > > > > > +
> > > > > > > +			/* Some GPUs don't have sound, and azx_first_init fails,
> > > > > > > +			 * leaving the device probed but non-functional. As long
> > > > > > > +			 * as it's probed, the PCI subsystem keeps its runtime
> > > > > > > +			 * PM status as active. Force it to suspended (as we
> > > > > > > +			 * actually stop the chip) to allow GPU to suspend via
> > > > > > > +			 * vga_switcheroo.
> > > > > > > +			 */
> > > > > > > +			pm_runtime_disable(&pci->dev);
> > > > > > > +			pm_runtime_set_suspended(&pci->dev);
> > > > > > > +			pm_runtime_enable(&pci->dev);
> > > > > > > +		}
> > > > > > >  	}
> > > > > > >  
> > > > > > >  	if (bus->chip_init) {
> > > > > > > -- 
> > > > > > > 2.46.0
> > > > > > >
diff mbox series

Patch

diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index b79020adce63..65fcb92e11c7 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1361,8 +1361,20 @@  static void azx_free(struct azx *chip)
 	if (use_vga_switcheroo(hda)) {
 		if (chip->disabled && hda->probe_continued)
 			snd_hda_unlock_devices(&chip->bus);
-		if (hda->vga_switcheroo_registered)
+		if (hda->vga_switcheroo_registered) {
 			vga_switcheroo_unregister_client(chip->pci);
+
+			/* Some GPUs don't have sound, and azx_first_init fails,
+			 * leaving the device probed but non-functional. As long
+			 * as it's probed, the PCI subsystem keeps its runtime
+			 * PM status as active. Force it to suspended (as we
+			 * actually stop the chip) to allow GPU to suspend via
+			 * vga_switcheroo.
+			 */
+			pm_runtime_disable(&pci->dev);
+			pm_runtime_set_suspended(&pci->dev);
+			pm_runtime_enable(&pci->dev);
+		}
 	}
 
 	if (bus->chip_init) {