diff mbox series

[11/11] video/aperture: Only remove sysfb on the default vga pci device

Message ID 20230111154112.90575-11-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Series [01/11] drm/ast: Use drm_aperture_remove_conflicting_pci_framebuffers

Commit Message

Daniel Vetter Jan. 11, 2023, 3:41 p.m. UTC
This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
sysfb device registration when removing conflicting FBs"), where we
remove the sysfb when loading a driver for an unrelated pci device,
resulting in the user losing their efifb console or similar.

Note that in practice this is only a problem with the nvidia blob,
because that's the only gpu driver people might install which does not
come with an fbdev driver of its own. For everyone else the real gpu
driver will restore a working console.

Also note that in the referenced bug there's confusion that this same
bug also happens on amdgpu. But that was just another amdgpu-specific
regression, which happened to occur at roughly the same time and
with the same user-observable symptoms. That bug is fixed now, see
https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15

For the above reasons the cc: stable is just notional: this patch
will need a backport, and that's up to nvidia if they care enough.

References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Aaron Plattner <aplattner@nvidia.com>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
---
 drivers/video/aperture.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Thomas Zimmermann Jan. 11, 2023, 4:20 p.m. UTC | #1
Hi

On 11.01.23 at 16:41, Daniel Vetter wrote:
> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
> sysfb device registration when removing conflicting FBs"), where we
> remove the sysfb when loading a driver for an unrelated pci device,
> resulting in the user loosing their efifb console or similar.
> 
> Note that in practice this only is a problem with the nvidia blob,
> because that's the only gpu driver people might install which does not
> come with an fbdev driver of it's own. For everyone else the real gpu
> driver will restor a working console.
> 
> Also note that in the referenced bug there's confusion that this same
> bug also happens on amdgpu. But that was just another amdgpu specific
> regression, which just happened to happen at roughly the same time and
> with the same user-observable symptons. That bug is fixed now, see
> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
> 
> For the above reasons the cc: stable is just notionally, this patch
> will need a backport and that's up to nvidia if they care enough.
> 
> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Aaron Plattner <aplattner@nvidia.com>
> Cc: Javier Martinez Canillas <javierm@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Helge Deller <deller@gmx.de>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
> ---
>   drivers/video/aperture.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
> index ba565515480d..a1821d369bb1 100644
> --- a/drivers/video/aperture.c
> +++ b/drivers/video/aperture.c
> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>   
>   	primary = pdev == vga_default_device();
>   
> +	if (primary)
> +		sysfb_disable();
> +

There's another sysfb_disable() in aperture_remove_conflicting_devices() 
without the branch but with a long comment.  I find this slightly confusing.

I'd rather add a branched sysfb_disable() plus the comment to
aperture_detach_devices(), and then add a 'primary' parameter to
aperture_detach_devices(). In aperture_remove_conflicting_devices() the
parameter would be unconditionally true.
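
A minimal userspace sketch of the alternative suggested above, for
illustration only. It assumes a hypothetical three-argument
aperture_detach_devices() that takes a 'primary' flag; the posted patch
keeps the two-argument form. sysfb_disable() is replaced by a stub that
counts calls, resource_size_t is a local typedef, and the aperture walk
itself is elided.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's resource_size_t. */
typedef unsigned long long resource_size_t;

/* Stub for the kernel's sysfb_disable(): it only counts invocations. */
static int sysfb_disable_calls;

static void sysfb_disable(void)
{
	sysfb_disable_calls++;
}

/*
 * Hypothetical variant with the extra 'primary' parameter. The branch and
 * the explanatory comment would live here, in one place.
 */
static void aperture_detach_devices(resource_size_t base, resource_size_t size,
				    bool primary)
{
	if (primary)
		sysfb_disable();

	/* Detaching devices that overlap [base, base + size) is elided. */
	(void)base;
	(void)size;
}

/* The generic entry point would pass 'true' unconditionally. */
static int aperture_remove_conflicting_devices(resource_size_t base,
					       resource_size_t size,
					       const char *name)
{
	(void)name;
	aperture_detach_devices(base, size, true);
	return 0;
}
```

With this shape, the PCI helper would pass its own 'primary' value for
each BAR, while every other caller keeps today's behaviour.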

Best regards
Thomas

>   	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>   		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>   			continue;
>   
>   		base = pci_resource_start(pdev, bar);
>   		size = pci_resource_len(pdev, bar);
> -		ret = aperture_remove_conflicting_devices(base, size, name);
> -		if (ret)
> -			return ret;
> +		aperture_detach_devices(base, size);
>   	}
>   
>   	if (!primary)
Daniel Vetter Jan. 11, 2023, 4:37 p.m. UTC | #2
On Wed, Jan 11, 2023 at 05:20:00PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> On 11.01.23 at 16:41, Daniel Vetter wrote:
> > This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
> > sysfb device registration when removing conflicting FBs"), where we
> > remove the sysfb when loading a driver for an unrelated pci device,
> > resulting in the user loosing their efifb console or similar.
> > 
> > Note that in practice this only is a problem with the nvidia blob,
> > because that's the only gpu driver people might install which does not
> > come with an fbdev driver of it's own. For everyone else the real gpu
> > driver will restor a working console.
> > 
> > Also note that in the referenced bug there's confusion that this same
> > bug also happens on amdgpu. But that was just another amdgpu specific
> > regression, which just happened to happen at roughly the same time and
> > with the same user-observable symptons. That bug is fixed now, see
> > https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
> > 
> > For the above reasons the cc: stable is just notionally, this patch
> > will need a backport and that's up to nvidia if they care enough.
> > 
> > References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Aaron Plattner <aplattner@nvidia.com>
> > Cc: Javier Martinez Canillas <javierm@redhat.com>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: Helge Deller <deller@gmx.de>
> > Cc: Sam Ravnborg <sam@ravnborg.org>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
> > ---
> >   drivers/video/aperture.c | 7 ++++---
> >   1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
> > index ba565515480d..a1821d369bb1 100644
> > --- a/drivers/video/aperture.c
> > +++ b/drivers/video/aperture.c
> > @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
> >   	primary = pdev == vga_default_device();
> > +	if (primary)
> > +		sysfb_disable();
> > +
> 
> There's another sysfb_disable() in aperture_remove_conflicting_devices()
> without the branch but with a long comment.  I find this slightly confusing.
> 
> I'd rather add a branched sysfb_disable() plus the comment  to
> aperture_detach_devices(). And then add a 'primary' parameter to
> aperture_detach_devices(). In aperture_remove_conflicting_devices() the
> parameter would be unconditionally true.

Yeah I was on the fence, but should be easy to redo with all the prep work
out of the way. It does mean we call sysfb_disable once for every bar, but
that shouldn't matter in any reasonable case :-)
-Daniel
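
To illustrate the remark about one sysfb_disable() call per BAR, here is
a userspace sketch. It assumes the hypothetical 'primary' parameter
discussed in the quoted suggestion and a stubbed sysfb_disable() that is
safe to call repeatedly. struct demo_pci_dev and the demo_* helpers are
invented for this example; only PCI_STD_NUM_BARS and IORESOURCE_MEM
carry the kernel's values.

```c
#include <stdbool.h>

#define PCI_STD_NUM_BARS 6          /* same value as in the kernel */
#define IORESOURCE_MEM   0x00000200 /* same value as in the kernel */

/* Invented for this example; not the kernel's struct pci_dev. */
struct demo_bar {
	unsigned long flags;
	unsigned long long start;
	unsigned long long len;
};

struct demo_pci_dev {
	struct demo_bar bar[PCI_STD_NUM_BARS];
};

static int sysfb_disable_calls;
static bool sysfb_disabled;

/* Stub: assumed idempotent, so repeated calls only bump the counter. */
static void sysfb_disable(void)
{
	sysfb_disable_calls++;
	sysfb_disabled = true;
}

/* Hypothetical parameterized helper; the actual detach step is elided. */
static void demo_detach_devices(unsigned long long base,
				unsigned long long size, bool primary)
{
	if (primary)
		sysfb_disable();

	(void)base;
	(void)size;
}

/* Walks the BARs like the PCI helper; returns the memory BARs visited. */
static int demo_detach_all_bars(const struct demo_pci_dev *pdev, bool primary)
{
	int bar, visited = 0;

	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
		if (!(pdev->bar[bar].flags & IORESOURCE_MEM))
			continue;

		demo_detach_devices(pdev->bar[bar].start,
				    pdev->bar[bar].len, primary);
		visited++;
	}

	return visited;
}
```

For a primary device with two memory BARs the stub is called twice, but
the observable state (sysfb disabled) is the same as after one call.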

> 
> Best regards
> Thomas
> 
> >   	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
> >   		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
> >   			continue;
> >   		base = pci_resource_start(pdev, bar);
> >   		size = pci_resource_len(pdev, bar);
> > -		ret = aperture_remove_conflicting_devices(base, size, name);
> > -		if (ret)
> > -			return ret;
> > +		aperture_detach_devices(base, size);
> >   	}
> >   	if (!primary)
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev
Javier Martinez Canillas Jan. 11, 2023, 4:43 p.m. UTC | #3
On 1/11/23 17:20, Thomas Zimmermann wrote:

[...]

>>
>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
>> index ba565515480d..a1821d369bb1 100644
>> --- a/drivers/video/aperture.c
>> +++ b/drivers/video/aperture.c
>> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>>   
>>   	primary = pdev == vga_default_device();
>>   
>> +	if (primary)
>> +		sysfb_disable();
>> +
> 
> There's another sysfb_disable() in aperture_remove_conflicting_devices() 
> without the branch but with a long comment.  I find this slightly confusing.
> 
> I'd rather add a branched sysfb_disable() plus the comment  to 
> aperture_detach_devices(). And then add a 'primary' parameter to 
> aperture_detach_devices(). In aperture_remove_conflicting_devices() the 
> parameter would be unconditionally true.
>

Or just remove that long comment since there's already kernel-doc for the
sysfb_disable() function definition.

It feels to me that any approach to parameterizing this will lead to code
that is harder to read.

Since it is just a single function call, I would just duplicate it, as
$subject does, to be honest.
Javier Martinez Canillas Jan. 11, 2023, 4:58 p.m. UTC | #4
Hello Daniel,

On 1/11/23 16:41, Daniel Vetter wrote:
> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
> sysfb device registration when removing conflicting FBs"), where we
> remove the sysfb when loading a driver for an unrelated pci device,
> resulting in the user loosing their efifb console or similar.
> 
> Note that in practice this only is a problem with the nvidia blob,
> because that's the only gpu driver people might install which does not
> come with an fbdev driver of it's own. For everyone else the real gpu
> driver will restor a working console.

restore

> 
> Also note that in the referenced bug there's confusion that this same
> bug also happens on amdgpu. But that was just another amdgpu specific
> regression, which just happened to happen at roughly the same time and
> with the same user-observable symptons. That bug is fixed now, see

symptoms

> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
> 
> For the above reasons the cc: stable is just notionally, this patch
> will need a backport and that's up to nvidia if they care enough.
> 

Maybe add a Fixes: ee7a69aa38d8 tag here too?

> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Aaron Plattner <aplattner@nvidia.com>
> Cc: Javier Martinez Canillas <javierm@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Helge Deller <deller@gmx.de>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
> ---
>  drivers/video/aperture.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
> index ba565515480d..a1821d369bb1 100644
> --- a/drivers/video/aperture.c
> +++ b/drivers/video/aperture.c
> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>  
>  	primary = pdev == vga_default_device();
>  
> +	if (primary)
> +		sysfb_disable();
> +
>  	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>  		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>  			continue;
>  
>  		base = pci_resource_start(pdev, bar);
>  		size = pci_resource_len(pdev, bar);
> -		ret = aperture_remove_conflicting_devices(base, size, name);
> -		if (ret)
> -			return ret;
> +		aperture_detach_devices(base, size);

Maybe mention in the commit message that you are making this change, something like:

"Instead of calling aperture_remove_conflicting_devices() to remove the conflicting
devices, just call aperture_detach_devices() to detach the device that matches
the same PCI BAR / aperture range. The former is just a wrapper around the latter
plus a sysfb_disable() call, and that call is now made in this function, but only
for the primary device."
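
The behaviour described in that proposed text can be modelled in
userspace as follows. This is a sketch under stated assumptions, not the
kernel code: struct demo_pci_dev, its is_default_vga flag (standing in
for the pdev == vga_default_device() comparison) and
demo_remove_conflicting_pci_devices() are invented for illustration,
both kernel functions are stubs, and the VGA handling that follows the
loop in the real function is left out.

```c
#include <stdbool.h>

#define PCI_STD_NUM_BARS 6          /* same value as in the kernel */
#define IORESOURCE_MEM   0x00000200 /* same value as in the kernel */

/* Invented for this example; not the kernel's struct pci_dev. */
struct demo_bar {
	unsigned long flags;
	unsigned long long start;
	unsigned long long len;
};

struct demo_pci_dev {
	bool is_default_vga; /* models pdev == vga_default_device() */
	struct demo_bar bar[PCI_STD_NUM_BARS];
};

static bool sysfb_disabled;
static int detach_calls;

/* Stub for the kernel's sysfb_disable(). */
static void sysfb_disable(void)
{
	sysfb_disabled = true;
}

/* Stub for aperture_detach_devices(); it only counts invocations. */
static void aperture_detach_devices(unsigned long long base,
				    unsigned long long size)
{
	(void)base;
	(void)size;
	detach_calls++;
}

/* Mirrors the control flow of the patched hunk. */
static void demo_remove_conflicting_pci_devices(const struct demo_pci_dev *pdev)
{
	bool primary = pdev->is_default_vga;
	int bar;

	/* Only the default VGA device takes the system framebuffer down. */
	if (primary)
		sysfb_disable();

	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
		if (!(pdev->bar[bar].flags & IORESOURCE_MEM))
			continue;

		aperture_detach_devices(pdev->bar[bar].start,
					pdev->bar[bar].len);
	}
}
```

In this model, a non-default device still detaches whatever overlaps
its own BARs but leaves sysfb alone, which is the case the regression
broke.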

Patch looks good to me:

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Aaron Plattner Jan. 11, 2023, 7:21 p.m. UTC | #5
On 1/11/23 8:58 AM, Javier Martinez Canillas wrote:
> Hello Daniel,
> 
> On 1/11/23 16:41, Daniel Vetter wrote:
>> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
>> sysfb device registration when removing conflicting FBs"), where we
>> remove the sysfb when loading a driver for an unrelated pci device,
>> resulting in the user loosing their efifb console or similar.
>>
>> Note that in practice this only is a problem with the nvidia blob,
>> because that's the only gpu driver people might install which does not
>> come with an fbdev driver of it's own. For everyone else the real gpu
>> driver will restor a working console.
> 
> restore
> 
>>
>> Also note that in the referenced bug there's confusion that this same
>> bug also happens on amdgpu. But that was just another amdgpu specific
>> regression, which just happened to happen at roughly the same time and
>> with the same user-observable symptons. That bug is fixed now, see
> 
> symptoms
> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
>>
>> For the above reasons the cc: stable is just notionally, this patch
>> will need a backport and that's up to nvidia if they care enough.
>>
> 
> Maybe adding a Fixes: ee7a69aa38d8 tag here too ?
> 
>> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Aaron Plattner <aplattner@nvidia.com>
>> Cc: Javier Martinez Canillas <javierm@redhat.com>
>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>> Cc: Helge Deller <deller@gmx.de>
>> Cc: Sam Ravnborg <sam@ravnborg.org>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
>> ---
>>   drivers/video/aperture.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
>> index ba565515480d..a1821d369bb1 100644
>> --- a/drivers/video/aperture.c
>> +++ b/drivers/video/aperture.c
>> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>>   
>>   	primary = pdev == vga_default_device();
>>   
>> +	if (primary)
>> +		sysfb_disable();
>> +
>>   	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>>   		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>>   			continue;
>>   
>>   		base = pci_resource_start(pdev, bar);
>>   		size = pci_resource_len(pdev, bar);
>> -		ret = aperture_remove_conflicting_devices(base, size, name);
>> -		if (ret)
>> -			return ret;
>> +		aperture_detach_devices(base, size);
> 
> Maybe mention in the commit message that you are doing this change, something like:
> 
> "Instead of calling aperture_remove_conflicting_devices() to remove the conflicting
> devices, just call to aperture_detach_devices() to detach the device that matches
> the same PCI BAR / aperture range. Since the former is just a wrapper of the latter
> plus a sysfb_disable() call, and now that's done in this function but only for the
> primary devices"
> 
> Patch looks good to me:
> 
> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>

Thanks Daniel and Javier!

I wasn't able to reproduce the original problem on my hybrid laptop 
since it refuses to boot with the console on an external display, but I 
was able to reproduce it by switching the configuration around: booting 
with i915.modeset=0 and with an experimental version of nvidia-drm that 
registers a framebuffer console. I verified that loading nvidia-drm 
breaks the efi-firmware framebuffer on Intel on Arch's 
linux-6.1.4-arch1-1 kernel and that applying this patch series fixes it. So

Tested-by: Aaron Plattner <aplattner@nvidia.com>

FWIW, the bug ought to be reproducible with i915.modeset=0 + any other 
drm driver that registers a framebuffer.

-- Aaron
Thomas Zimmermann Jan. 12, 2023, 7:48 a.m. UTC | #6
Hi

On 11.01.23 at 17:37, Daniel Vetter wrote:
> On Wed, Jan 11, 2023 at 05:20:00PM +0100, Thomas Zimmermann wrote:
>> Hi
>>
>> On 11.01.23 at 16:41, Daniel Vetter wrote:
>>> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
>>> sysfb device registration when removing conflicting FBs"), where we
>>> remove the sysfb when loading a driver for an unrelated pci device,
>>> resulting in the user loosing their efifb console or similar.
>>>
>>> Note that in practice this only is a problem with the nvidia blob,
>>> because that's the only gpu driver people might install which does not
>>> come with an fbdev driver of it's own. For everyone else the real gpu
>>> driver will restor a working console.
>>>
>>> Also note that in the referenced bug there's confusion that this same
>>> bug also happens on amdgpu. But that was just another amdgpu specific
>>> regression, which just happened to happen at roughly the same time and
>>> with the same user-observable symptons. That bug is fixed now, see
>>> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
>>>
>>> For the above reasons the cc: stable is just notionally, this patch
>>> will need a backport and that's up to nvidia if they care enough.
>>>
>>> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Aaron Plattner <aplattner@nvidia.com>
>>> Cc: Javier Martinez Canillas <javierm@redhat.com>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: Helge Deller <deller@gmx.de>
>>> Cc: Sam Ravnborg <sam@ravnborg.org>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
>>> ---
>>>    drivers/video/aperture.c | 7 ++++---
>>>    1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
>>> index ba565515480d..a1821d369bb1 100644
>>> --- a/drivers/video/aperture.c
>>> +++ b/drivers/video/aperture.c
>>> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>>>    	primary = pdev == vga_default_device();
>>> +	if (primary)
>>> +		sysfb_disable();
>>> +
>>
>> There's another sysfb_disable() in aperture_remove_conflicting_devices()
>> without the branch but with a long comment.  I find this slightly confusing.
>>
>> I'd rather add a branched sysfb_disable() plus the comment  to
>> aperture_detach_devices(). And then add a 'primary' parameter to
>> aperture_detach_devices(). In aperture_remove_conflicting_devices() the
>> parameter would be unconditionally true.
> 
> Yeah I was on the fence, but should be easy to redo with all the prep work
> out of the way. It does mean we call sysfb_disable once for every bar, but
> that shouldn't matter in any reasonable case :-)

Or leave it as is. It's not so important. The idea of the current design
was that aperture_remove_conflicting_devices() would be the general
implementation and aperture_remove_conflicting_pci_devices() would be a
helper that only detects the correct PCI BAR. That never really worked
in practice.

Best regards
Thomas

> -Daniel
> 
>>
>> Best regards
>> Thomas
>>
>>>    	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>>>    		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>>>    			continue;
>>>    		base = pci_resource_start(pdev, bar);
>>>    		size = pci_resource_len(pdev, bar);
>>> -		ret = aperture_remove_conflicting_devices(base, size, name);
>>> -		if (ret)
>>> -			return ret;
>>> +		aperture_detach_devices(base, size);
>>>    	}
>>>    	if (!primary)
>>
>> -- 
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Ivo Totev
> 
> 
> 
>
Thomas Zimmermann Jan. 12, 2023, 7:55 a.m. UTC | #7
Hi

On 11.01.23 at 20:21, Aaron Plattner wrote:
> On 1/11/23 8:58 AM, Javier Martinez Canillas wrote:
>> Hello Daniel,
>>
>> On 1/11/23 16:41, Daniel Vetter wrote:
>>> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
>>> sysfb device registration when removing conflicting FBs"), where we
>>> remove the sysfb when loading a driver for an unrelated pci device,
>>> resulting in the user loosing their efifb console or similar.
>>>
>>> Note that in practice this only is a problem with the nvidia blob,
>>> because that's the only gpu driver people might install which does not
>>> come with an fbdev driver of it's own. For everyone else the real gpu
>>> driver will restor a working console.
>>
>> restore
>>
>>>
>>> Also note that in the referenced bug there's confusion that this same
>>> bug also happens on amdgpu. But that was just another amdgpu specific
>>> regression, which just happened to happen at roughly the same time and
>>> with the same user-observable symptons. That bug is fixed now, see
>>
>> symptoms
>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
>>>
>>> For the above reasons the cc: stable is just notionally, this patch
>>> will need a backport and that's up to nvidia if they care enough.
>>>
>>
>> Maybe adding a Fixes: ee7a69aa38d8 tag here too ?
>>
>>> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Aaron Plattner <aplattner@nvidia.com>
>>> Cc: Javier Martinez Canillas <javierm@redhat.com>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: Helge Deller <deller@gmx.de>
>>> Cc: Sam Ravnborg <sam@ravnborg.org>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the 
>>> backport)
>>> ---
>>>   drivers/video/aperture.c | 7 ++++---
>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
>>> index ba565515480d..a1821d369bb1 100644
>>> --- a/drivers/video/aperture.c
>>> +++ b/drivers/video/aperture.c
>>> @@ -321,15 +321,16 @@ int 
>>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const 
>>> char *na
>>>       primary = pdev == vga_default_device();
>>> +    if (primary)
>>> +        sysfb_disable();
>>> +
>>>       for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>>>           if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>>>               continue;
>>>           base = pci_resource_start(pdev, bar);
>>>           size = pci_resource_len(pdev, bar);
>>> -        ret = aperture_remove_conflicting_devices(base, size, name);
>>> -        if (ret)
>>> -            return ret;
>>> +        aperture_detach_devices(base, size);
>>
>> Maybe mention in the commit message that you are doing this change, 
>> something like:
>>
>> "Instead of calling aperture_remove_conflicting_devices() to remove 
>> the conflicting
>> devices, just call to aperture_detach_devices() to detach the device 
>> that matches
>> the same PCI BAR / aperture range. Since the former is just a wrapper 
>> of the latter
>> plus a sysfb_disable() call, and now that's done in this function but 
>> only for the
>> primary devices"
>>
>> Patch looks good to me:
>>
>> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
> 
> Thanks Daniel and Javier!
> 
> I wasn't able to reproduce the original problem on my hybrid laptop 
> since it refuses to boot with the console on an external display, but I 
> was able to reproduce it by switching the configuration around: booting 
> with i915.modeset=0 and with an experimental version of nvidia-drm that 
> registers a framebuffer console. I verified that loading nvidia-drm 

Thank you for testing.

One thing I'd like to note is that using DRM's fbdev emulation is the 
correct way to support a console. Nvidia-drm's current approach of 
utilizing efifb is fragile and requires workarounds from distributions 
(at least here at SUSE). Steps towards fbdev emulation are much appreciated.

Best regards
Thomas

> breaks the efi-firmware framebuffer on Intel on Arch's 
> linux-6.1.4-arch1-1 kernel and that applying this patch series fixes it. So
> 
> Tested-by: Aaron Plattner <aplattner@nvidia.com>
> 
> FWIW, the bug ought to be reproducible with i915.modeset=0 + any other 
> drm driver that registers a framebuffer.
> 
> -- Aaron
Javier Martinez Canillas Jan. 12, 2023, 8:44 a.m. UTC | #8
On 1/12/23 08:55, Thomas Zimmermann wrote:

[...]

>> Thanks Daniel and Javier!
>>
>> I wasn't able to reproduce the original problem on my hybrid laptop 
>> since it refuses to boot with the console on an external display, but I 
>> was able to reproduce it by switching the configuration around: booting 
>> with i915.modeset=0 and with an experimental version of nvidia-drm that 
>> registers a framebuffer console. I verified that loading nvidia-drm 
> 
> Thank you for testing.
> 
> One thing I'd like to note is that using DRM's fbdev emulation is the 
> correct way to support a console. Nvidia-drm's current approach of 
> utilizing efifb is fragile and requires workarounds from distributions 
> (at least here at SUSE). Steps towards fbdev emulation are much appreciated.
>
 
I was meaning to mention the same. Fedora is also carrying a workaround just
for the Nvidia proprietary driver, since all other drivers provide an emulated
fbdev device.

So getting this finally fixed will indeed be highly appreciated.

Patch

diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
index ba565515480d..a1821d369bb1 100644
--- a/drivers/video/aperture.c
+++ b/drivers/video/aperture.c
@@ -321,15 +321,16 @@  int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
 
 	primary = pdev == vga_default_device();
 
+	if (primary)
+		sysfb_disable();
+
 	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
 		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
 			continue;
 
 		base = pci_resource_start(pdev, bar);
 		size = pci_resource_len(pdev, bar);
-		ret = aperture_remove_conflicting_devices(base, size, name);
-		if (ret)
-			return ret;
+		aperture_detach_devices(base, size);
 	}
 
 	if (!primary)