diff mbox series

[2/6] PCI/VGA: Deal with PCI VGA compatible devices only

Message ID 20230711134354.755966-3-sui.jingfeng@linux.dev (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series PCI/VGA: Fix typos, comments and copyright | expand

Commit Message

Sui Jingfeng July 11, 2023, 1:43 p.m. UTC
From: Sui Jingfeng <suijingfeng@loongson.cn>

Currently, vgaarb only cares about PCI VGA-compatible class devices.

While vga_arbiter_del_pci_device() gets called unbalanced when some PCI
device is about to be removed. This happens even during the boot process.

Another reason is that the vga_arb_device_init() function is not efficient.
Since we only care about VGA-compatible devices (pdev->class == 0x030000),
We could filter the unqualified devices out in the vga_arb_device_init()
function. While the current implementation is to search all PCI devices
in a system, this is not necessary.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
---
 drivers/pci/vgaarb.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

Comments

Bjorn Helgaas July 19, 2023, 6:26 p.m. UTC | #1
On Tue, Jul 11, 2023 at 09:43:50PM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@loongson.cn>
> 
> Currently, vgaarb only cares about PCI VGA-compatible class devices.
> 
> While vga_arbiter_del_pci_device() gets called unbalanced when some PCI
> device is about to be removed. This happens even during the boot process.

The previous code calls vga_arbiter_add_pci_device() for every device
(every device present at boot and also every hot-added device).  It
only allocates a vga_device if pdev->class is 0x0300XX.

It calls vga_arbiter_del_pci_device() for every device removal.  It
does nothing unless it finds a vga_device.

This seems symmetric and reasonable to me.  Did you observe a problem
with it?

> Another reason is that the vga_arb_device_init() function is not efficient.
> Since we only care about VGA-compatible devices (pdev->class == 0x030000),
> We could filter the unqualified devices out in the vga_arb_device_init()
> function. While the current implementation is to search all PCI devices
> in a system, this is not necessary.

Optimization is fine, but the most important thing here is to be clear
about what functional change this patch makes.  As I mentioned at [1], 
if this patch affects the class codes accepted, please make that clear
here.

> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>

I do not see Mario's Reviewed-by on the list.  I do see Mario's
Reviewed-by [2] for a previous version, but that version added this in
pci_notify():

  + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
  +   return 0;

while this version adds:

  + if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
  +   return 0;

It's OK to carry a review to future versions if there are
insignificant changes, but this is a functional change that seems
significant to me.  The first matches only 0x030000, while the second
discards the low eight bits so it matches 0x0300XX.

[1] https://lore.kernel.org/r/20230718231400.GA496927@bhelgaas
[2] https://lore.kernel.org/all/5b6fdf65-b354-94a9-f883-be820157efad@amd.com/

> ---
>  drivers/pci/vgaarb.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index c1bc6c983932..021116ed61cb 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -754,10 +754,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
>  	struct pci_dev *bridge;
>  	u16 cmd;
>  
> -	/* Only deal with VGA class devices */
> -	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> -		return false;
> -
>  	/* Allocate structure */
>  	vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
>  	if (vgadev == NULL) {
> @@ -1502,6 +1498,10 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>  
>  	vgaarb_dbg(dev, "%s\n", __func__);
>  
> +	/* Deal with VGA compatible devices only */
> +	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> +		return 0;
> +
>  	/* For now we're only intereted in devices added and removed. I didn't
>  	 * test this thing here, so someone needs to double check for the
>  	 * cases of hotplugable vga cards. */
> @@ -1534,8 +1534,8 @@ static struct miscdevice vga_arb_device = {
>  
>  static int __init vga_arb_device_init(void)
>  {
> +	struct pci_dev *pdev = NULL;
>  	int rc;
> -	struct pci_dev *pdev;
>  
>  	rc = misc_register(&vga_arb_device);
>  	if (rc < 0)
> @@ -1543,13 +1543,14 @@ static int __init vga_arb_device_init(void)
>  
>  	bus_register_notifier(&pci_bus_type, &pci_notifier);
>  
> -	/* We add all PCI devices satisfying VGA class in the arbiter by
> -	 * default */
> -	pdev = NULL;
> -	while ((pdev =
> -		pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> -			       PCI_ANY_ID, pdev)) != NULL)
> -		vga_arbiter_add_pci_device(pdev);
> +	/*
> +	 * We add all PCI VGA compatible devices in the arbiter by default
> +	 */
> +	do {
> +		pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
> +		if (pdev)
> +			vga_arbiter_add_pci_device(pdev);
> +	} while (pdev);
>  
>  	pr_info("loaded\n");
>  	return rc;
> -- 
> 2.25.1
>
Sui Jingfeng July 19, 2023, 7:58 p.m. UTC | #2
Hi,


On 2023/7/20 02:26, Bjorn Helgaas wrote:
> On Tue, Jul 11, 2023 at 09:43:50PM +0800, Sui Jingfeng wrote:
[...]
>> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
>> Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
> I do not see Mario's Reviewed-by on the list.  I do see Mario's
> Reviewed-by [2] for a previous version, but that version added this in
> pci_notify():
>
>    + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
>    +   return 0;
>
> while this version adds:
>
>    + if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
>    +   return 0;
>
> It's OK to carry a review to future versions if there are
> insignificant changes, but this is a functional change that seems
> significant to me.  The first matches only 0x030000, while the second
> discards the low eight bits so it matches 0x0300XX.

Yes, you are right.

But I suddenly realized that this may deserve another patch, desperate 
trivial.

What this version adds here is *same* before this patch set is applied.

My explanation about the minor tweak being made before this version and 
previous version

is that  I want to keep my patch *less distraction*.

The major functional gains(benefit) is that we filter non VGA compatible 
devices out.

As a start point, I should keep one patch do one thing (do one thing and 
do it well).


On the other hand, even though the lest significant 8 but if pdev->class 
is really matter.

I think I still need to wait the things(a bug emerged, for example) 
became clear.

Instead of cleanup all potential problems with obvious motivation.

I think Mario will accept my explanation.

> [1] https://lore.kernel.org/r/20230718231400.GA496927@bhelgaas
> [2] https://lore.kernel.org/all/5b6fdf65-b354-94a9-f883-be820157efad@amd.com/
>
>> ---
>>   drivers/pci/vgaarb.c | 25 +++++++++++++------------
>>   1 file changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
>> index c1bc6c983932..021116ed61cb 100644
>> --- a/drivers/pci/vgaarb.c
>> +++ b/drivers/pci/vgaarb.c
>> @@ -754,10 +754,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
>>   	struct pci_dev *bridge;
>>   	u16 cmd;
>>   
>> -	/* Only deal with VGA class devices */
>> -	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
>> -		return false;
>> -
>>   	/* Allocate structure */
>>   	vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
>>   	if (vgadev == NULL) {
>> @@ -1502,6 +1498,10 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>>   
>>   	vgaarb_dbg(dev, "%s\n", __func__);
>>   
>> +	/* Deal with VGA compatible devices only */
>> +	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
>> +		return 0;
>> +
>>   	/* For now we're only intereted in devices added and removed. I didn't
>>   	 * test this thing here, so someone needs to double check for the
>>   	 * cases of hotplugable vga cards. */
>> @@ -1534,8 +1534,8 @@ static struct miscdevice vga_arb_device = {
>>   
>>   static int __init vga_arb_device_init(void)
>>   {
>> +	struct pci_dev *pdev = NULL;
>>   	int rc;
>> -	struct pci_dev *pdev;
>>   
>>   	rc = misc_register(&vga_arb_device);
>>   	if (rc < 0)
>> @@ -1543,13 +1543,14 @@ static int __init vga_arb_device_init(void)
>>   
>>   	bus_register_notifier(&pci_bus_type, &pci_notifier);
>>   
>> -	/* We add all PCI devices satisfying VGA class in the arbiter by
>> -	 * default */
>> -	pdev = NULL;
>> -	while ((pdev =
>> -		pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
>> -			       PCI_ANY_ID, pdev)) != NULL)
>> -		vga_arbiter_add_pci_device(pdev);
>> +	/*
>> +	 * We add all PCI VGA compatible devices in the arbiter by default
>> +	 */
>> +	do {
>> +		pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
>> +		if (pdev)
>> +			vga_arbiter_add_pci_device(pdev);
>> +	} while (pdev);
>>   
>>   	pr_info("loaded\n");
>>   	return rc;
>> -- 
>> 2.25.1
>>
Sui Jingfeng July 19, 2023, 8:06 p.m. UTC | #3
On 2023/7/20 03:58, Sui Jingfeng wrote:
> What this version adds here is *same* before this patch set is applied.


The filter method is *same* , in the cases of before this patch is 
applied and after this patch is applied.
Sui Jingfeng July 19, 2023, 8:08 p.m. UTC | #4
On 2023/7/20 03:58, Sui Jingfeng wrote:
> My explanation about the minor tweak being made before this version 
> and previous version
>
> is that  I want to keep my patch *less distraction*. 


The minor tweak being made between this version and previous version is 
to keep my patch *less distraction*.
Sui Jingfeng July 19, 2023, 8:16 p.m. UTC | #5
On 2023/7/20 03:58, Sui Jingfeng wrote:
> On the other hand, even though the lest significant 8 but if 
> pdev->class is really matter.


If the low eight bits of pdev->class is really matters,

maybe we should wait the potential problems became severe.

Currently, it is not obvious.
Sui Jingfeng July 19, 2023, 9:13 p.m. UTC | #6
Hi,


On 2023/7/20 02:26, Bjorn Helgaas wrote:
> On Tue, Jul 11, 2023 at 09:43:50PM +0800, Sui Jingfeng wrote:
>> From: Sui Jingfeng<suijingfeng@loongson.cn>
>>
>> Currently, vgaarb only cares about PCI VGA-compatible class devices.
>>
>> While vga_arbiter_del_pci_device() gets called unbalanced when some PCI
>> device is about to be removed. This happens even during the boot process.
> The previous code calls vga_arbiter_add_pci_device() for every device
> (every device present at boot and also every hot-added device).  It
> only allocates a vga_device if pdev->class is 0x0300XX.
>
> It calls vga_arbiter_del_pci_device() for every device removal.  It
> does nothing unless it finds a vga_device.
> This seems symmetric and reasonable to me.  Did you observe a problem
> with it?
>
Not big deal, but the vgaarb does do some useless work there.


Right,  it calls vga_arbiter_del_pci_device() for every device removal.

And it can not finds a vga_device at the most time. (Because on normal 
case, a user only have one or two GPU device in the system.)

But even it can not finds a vga_device, vga_arbiter_del_pci_device() 
still brings

additional(and it is unnecessary) overheads.


For an example, on my i3-8100 (the motherboard model is H110 D4L) machine,

The PCI device(0000:00:1f.1) will trigger the call to 
vga_arbiter_del_pci_device().


Even though it can not finds a vga_device,

vga_arbiter_del_pci_device() is *NOT* a no-op still.


```

static bool vga_arbiter_del_pci_device(struct pci_dev *pdev)
{
     struct vga_device *vgadev;
     unsigned long flags;
     bool ret = true;

     spin_lock_irqsave(&vga_lock, flags);
     vgadev = vgadev_find(pdev);
     if (vgadev == NULL) {
         ret = false;
         goto bail;
     }

     // omit ...


bail:
     spin_unlock_irqrestore(&vga_lock, flags);
     kfree(vgadev);
     return ret;
}

```


1) It call spin_lock_irqsave() and  spin_unlock_irqrestore() pair for 
complete irrelevant PCI devices

2) It try to find a vgadev with pdev pointer, which have to search the 
whole list (All nodes in the list got accessed), because it can not find.

3) It call kfree() to free NULL pointer, it's just that kfree() will 
just return if you pass a NULL, so no bug happen.


It is not efficient.

While the major contribution of my patch is to filter irrelevant PCI device.

Otherwise there 30+ noisy(useless) events got snooped. See below:


```

[    0.246077] pci 0000:01:00.0: vgaarb: setting as boot VGA device
[    0.246077] pci 0000:01:00.0: vgaarb: bridge control possible
[    0.246077] pci 0000:01:00.0: vgaarb: VGA device added: 
decodes=io+mem,owns=io+mem,locks=none
[    0.246077] vgaarb: loaded
[    0.294169] skl_uncore 0000:00:00.0: vgaarb: pci_notify: action=3
[    0.294182] skl_uncore 0000:00:00.0: vgaarb: pci_notify: action=4
[    0.301297] pcieport 0000:00:01.0: vgaarb: pci_notify: action=3
[    0.301482] pcieport 0000:00:01.0: vgaarb: pci_notify: action=4
[    0.301488] pcieport 0000:00:1c.0: vgaarb: pci_notify: action=3
[    0.301705] pcieport 0000:00:1c.0: vgaarb: pci_notify: action=4
[    1.806445] xhci_hcd 0000:00:14.0: vgaarb: pci_notify: action=3
[    1.810976] ahci 0000:00:17.0: vgaarb: pci_notify: action=3
[    1.824383] xhci_hcd 0000:00:14.0: vgaarb: pci_notify: action=4
[    1.857470] ahci 0000:00:17.0: vgaarb: pci_notify: action=4
[    4.692700] intel_pch_thermal 0000:00:14.2: vgaarb: pci_notify: action=3
[    4.693110] intel_pch_thermal 0000:00:14.2: vgaarb: pci_notify: action=4
[    4.746712] i801_smbus 0000:00:1f.4: vgaarb: pci_notify: action=3
[    4.747212] pci 0000:00:1f.1: vgaarb: pci_notify: action=0
[    4.747227] pci 0000:00:1f.1: vgaarb: pci_notify: action=1
[    4.747250] pci 0000:00:1f.1: vgaarb: pci_notify: action=2
[    4.749098] i801_smbus 0000:00:1f.4: vgaarb: pci_notify: action=4
[    4.799217] mei_me 0000:00:16.0: vgaarb: pci_notify: action=3
[    4.802503] mei_me 0000:00:16.0: vgaarb: pci_notify: action=4
[    4.874880] intel-lpss 0000:00:15.0: vgaarb: pci_notify: action=3
[    4.881227] intel-lpss 0000:00:15.0: vgaarb: pci_notify: action=4
[    4.881240] intel-lpss 0000:00:15.1: vgaarb: pci_notify: action=3
[    4.887578] intel-lpss 0000:00:15.1: vgaarb: pci_notify: action=4
[    4.985796] r8169 0000:02:00.0: vgaarb: pci_notify: action=3
[    4.991862] r8169 0000:02:00.0: vgaarb: pci_notify: action=4
[    5.404835] snd_hda_intel 0000:00:1f.3: vgaarb: pci_notify: action=3
[    5.405175] snd_hda_intel 0000:00:1f.3: vgaarb: pci_notify: action=4
[    5.405401] snd_hda_intel 0000:01:00.1: vgaarb: pci_notify: action=3
[    5.405973] snd_hda_intel 0000:01:00.1: vgaarb: pci_notify: action=4
[   10.793665] i915 0000:00:02.0: vgaarb: pci_notify: action=3
[   11.201384] i915 0000:00:02.0: vgaarb: pci_notify: action=4
[   16.135842] amdgpu 0000:01:00.0: vgaarb: pci_notify: action=3
[   16.140458] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[   16.638564] amdgpu 0000:01:00.0: vgaarb: pci_notify: action=4

```
Sui Jingfeng July 19, 2023, 9:27 p.m. UTC | #7
Hi,

On 2023/7/20 05:13, Sui Jingfeng wrote:
> Otherwise there 30+ noisy(useless) events got snooped. See below:
>
>
> ```
>
> [    0.246077] pci 0000:01:00.0: vgaarb: setting as boot VGA device
> [    0.246077] pci 0000:01:00.0: vgaarb: bridge control possible
> [    0.246077] pci 0000:01:00.0: vgaarb: VGA device added: 
> decodes=io+mem,owns=io+mem,locks=none
> [    0.246077] vgaarb: loaded
> [    0.294169] skl_uncore 0000:00:00.0: vgaarb: pci_notify: action=3
> [    0.294182] skl_uncore 0000:00:00.0: vgaarb: pci_notify: action=4
> [    0.301297] pcieport 0000:00:01.0: vgaarb: pci_notify: action=3
> [    0.301482] pcieport 0000:00:01.0: vgaarb: pci_notify: action=4
> [    0.301488] pcieport 0000:00:1c.0: vgaarb: pci_notify: action=3
> [    0.301705] pcieport 0000:00:1c.0: vgaarb: pci_notify: action=4
> [    1.806445] xhci_hcd 0000:00:14.0: vgaarb: pci_notify: action=3
> [    1.810976] ahci 0000:00:17.0: vgaarb: pci_notify: action=3
> [    1.824383] xhci_hcd 0000:00:14.0: vgaarb: pci_notify: action=4
> [    1.857470] ahci 0000:00:17.0: vgaarb: pci_notify: action=4
> [    4.692700] intel_pch_thermal 0000:00:14.2: vgaarb: pci_notify: 
> action=3
> [    4.693110] intel_pch_thermal 0000:00:14.2: vgaarb: pci_notify: 
> action=4
> [    4.746712] i801_smbus 0000:00:1f.4: vgaarb: pci_notify: action=3
> [    4.747212] pci 0000:00:1f.1: vgaarb: pci_notify: action=0
> [    4.747227] pci 0000:00:1f.1: vgaarb: pci_notify: action=1
> [    4.747250] pci 0000:00:1f.1: vgaarb: pci_notify: action=2
> [    4.749098] i801_smbus 0000:00:1f.4: vgaarb: pci_notify: action=4
> [    4.799217] mei_me 0000:00:16.0: vgaarb: pci_notify: action=3
> [    4.802503] mei_me 0000:00:16.0: vgaarb: pci_notify: action=4
> [    4.874880] intel-lpss 0000:00:15.0: vgaarb: pci_notify: action=3
> [    4.881227] intel-lpss 0000:00:15.0: vgaarb: pci_notify: action=4
> [    4.881240] intel-lpss 0000:00:15.1: vgaarb: pci_notify: action=3
> [    4.887578] intel-lpss 0000:00:15.1: vgaarb: pci_notify: action=4
> [    4.985796] r8169 0000:02:00.0: vgaarb: pci_notify: action=3
> [    4.991862] r8169 0000:02:00.0: vgaarb: pci_notify: action=4
> [    5.404835] snd_hda_intel 0000:00:1f.3: vgaarb: pci_notify: action=3
> [    5.405175] snd_hda_intel 0000:00:1f.3: vgaarb: pci_notify: action=4
> [    5.405401] snd_hda_intel 0000:01:00.1: vgaarb: pci_notify: action=3
> [    5.405973] snd_hda_intel 0000:01:00.1: vgaarb: pci_notify: action=4
> [   10.793665] i915 0000:00:02.0: vgaarb: pci_notify: action=3
> [   11.201384] i915 0000:00:02.0: vgaarb: pci_notify: action=4
> [   16.135842] amdgpu 0000:01:00.0: vgaarb: pci_notify: action=3
> [   16.140458] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
> [   16.638564] amdgpu 0000:01:00.0: vgaarb: pci_notify: action=4
>
> ``` 


After apply my patch, this events are still will notify me, it is just 
that if we found it is irrelevant, we will return immediately.

No further process is needed.
Sui Jingfeng July 22, 2023, 8:11 a.m. UTC | #8
Hi,

On 2023/7/20 02:26, Bjorn Helgaas wrote:
> Optimization is fine, but the most important thing here is to be clear
> about what functional change this patch makes.  As I mentioned at [1],
> if this patch affects the class codes accepted, please make that clear
> here.
>
>> Reviewed-by: Mario Limonciello<mario.limonciello@amd.com>
>> Signed-off-by: Sui Jingfeng<suijingfeng@loongson.cn>
> I do not see Mario's Reviewed-by on the list.  I do see Mario's
> Reviewed-by [2] for a previous version, but that version added this in
> pci_notify():
>
>    + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
>    +   return 0;
>
> while this version adds:
>
>    + if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
>    +   return 0;
>
> It's OK to carry a review to future versions if there are
> insignificant changes, but this is a functional change that seems
> significant to me.  The first matches only 0x030000, while the second
> discards the low eight bits so it matches 0x0300XX.
>
> [1]https://lore.kernel.org/r/20230718231400.GA496927@bhelgaas
> [2]https://lore.kernel.org/all/5b6fdf65-b354-94a9-f883-be820157efad@amd.com/
>
Yes,  you are right. As you already told me at [1]:


According to the "PCI Code and ID Assignment" spec, r1.15, sec 1.4,

only mentions 0x0300 programming interface 0x00 as decoding

the legacy VGA addresses.


If the programming interface is 0x01, then it is a 8514-compatible 
controller.

It is petty old card,  about 30 years old(I think, it is nearly obsolete 
for now).

I never have a chance to see such a card in real life.


Yes, we should adopt first matches method here. That is:

   + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
   +   return 0;

It seems that we are more rigorous to deal the VGA-compatible devices as 
illustrated by above code here.

But who the still have that card (8514-compatible) and the hardware to 
using such a card today ?


Please consider that the pci_dev_attrs_are_visible() function[3] also 
ignore the

programming interface (the least significant 8 bits).

Therefore, at this version of my vgaarb cleanup patch set.

I choose to keep the original filtering rule,

but do the necessary optimization only which I think is meaningful.


In the future, we may want to expand VGAARB to deal all PCI display 
class devices, with another patch.

  
if (pdev->class >> 16 == PCI_BASE_CLASS_DISPLAY)

          // accept

else

       // return immediately.


Then, we will have a more good chance to clarify the programmer interface.

Is this explanation feasible and reasonable, Bjorn and Mario ?


[3] 
https://elixir.bootlin.com/linux/latest/source/drivers/pci/pci-sysfs.c#L1551
Bjorn Helgaas July 25, 2023, 9:49 p.m. UTC | #9
On Sat, Jul 22, 2023 at 04:11:07PM +0800, suijingfeng wrote:
> ...
> In the future, we may want to expand VGAARB to deal all PCI display class
> devices, with another patch.
> 
> if (pdev->class >> 16 == PCI_BASE_CLASS_DISPLAY)
> 
>          // accept
> 
> else
> 
>       // return immediately.
> 
> 
> Then, we will have a more good chance to clarify the programmer interface.

I would prefer not to expand vgaarb to deal with all display devices
unless they actually need the legacy resources like [pci 0xa0000-0xbffff].

But maybe the consumer of these interfaces relies on vgaarb even for
devices that don't need those resources?  If so, please mention
examples of where they depend on vgaarb.

I expect the vgaarb interfaces are used by things that need to emulate
the option ROM to initialize the device.  If the device has a
programming interface other than 0000 0000b, the option ROM should not
be using the [pci 0xa0000-0xbffff] resource, so vgaarb should not be
needed.

Bjorn
Sui Jingfeng Aug. 1, 2023, 7:17 a.m. UTC | #10
Hi,

On 2023/7/26 05:49, Bjorn Helgaas wrote:
> On Sat, Jul 22, 2023 at 04:11:07PM +0800, suijingfeng wrote:
>> ...
>> In the future, we may want to expand VGAARB to deal all PCI display class
>> devices, with another patch.
>>
>> if (pdev->class >> 16 == PCI_BASE_CLASS_DISPLAY)
>>
>>           // accept
>>
>> else
>>
>>        // return immediately.
>>
>>
>> Then, we will have a more good chance to clarify the programmer interface.
> I would prefer not to expand vgaarb to deal with all display devices
> unless they actually need the legacy resources like [pci 0xa0000-0xbffff].

What if a system have multiple non VGA-compatible GPU while all of them can display?
We still need to select a default for for user-space executable program (X server).

What if the VGA goes away someday?
I means that hardware vendors may abandon the old VGA standard.
After all, snooping a fixed address aperture is not absolute necessary for modern graphic card.
Modern graphic have dedicated VRAM Bar, the occupied address range can be relocatable.
Thus avoid the address overlap (or occlusion).


> But maybe the consumer of these interfaces relies on vgaarb even for
> devices that don't need those resources? If so, please mention
> examples of where they depend on vgaarb.

Yes, there do exist some PCI*NON*  VGA-compatible display controllers,
Strictly speaking, there are not VGA-compatible in the sense that
they don't respond the fixed legacy VGA aperture.
Such a display controller also don't cares about the extension ROM (option ROM).
Loongson display controllers are one of the various examples.

Besides, Intel integrate GPU is capable switch to*NON*  VGA-compatible.
especially in a multiple GPU co-exist hardware environment.
Old BIOS of Intel platform will change its class code from 0x0300 to 0x0380.
Newer BIOS do allow us to choose which one should be the primary GPU,
but if a user don't choose the Intel integrate GPU as primary,
the BIOS still will alter its PCI class code from 0x0300 to 0x0380.


By listing examples as above, I means that a PCI(e) GPU device do not
need to be VGA-compatible to display something on screen.
This is a very important point, I think,
which lead me to consider expand vgaarb.

I'm not sure if we should handle the programming interface thing here,
there are a lot of places where just ignore the programming interface.

> I expect the vgaarb interfaces are used by things that need to emulate
> the option ROM to initialize the device.  If the device has a
> programming interface other than 0000 0000b, the option ROM should not
> be using the [pci 0xa0000-0xbffff] resource, so vgaarb should not be
> needed.

Also, I have another thought for this question.
The vga_set_default_device() function interface exported by vgaarb
is not ensure the restriction either.
I means that it does not check if a device is VGA-compatible,
it does not examine if the programming interface is 00000000b or 00000001b either.
In theory, a programmer could set a display device via the vga_set_default_device() interface.
Maybe this function is intentionally leave some space to workaround.

So, my idea is that leave programming interface related problems to the future.
I don't want to worry about a non-exist thing(programming interface == 0x01 for an example).


Back to my patch set, is this patch acceptable?
Or I still need to refine this series?
My other patches are queued up with this.


> Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index c1bc6c983932..021116ed61cb 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -754,10 +754,6 @@  static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
 	struct pci_dev *bridge;
 	u16 cmd;
 
-	/* Only deal with VGA class devices */
-	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
-		return false;
-
 	/* Allocate structure */
 	vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
 	if (vgadev == NULL) {
@@ -1502,6 +1498,10 @@  static int pci_notify(struct notifier_block *nb, unsigned long action,
 
 	vgaarb_dbg(dev, "%s\n", __func__);
 
+	/* Deal with VGA compatible devices only */
+	if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
+		return 0;
+
 	/* For now we're only intereted in devices added and removed. I didn't
 	 * test this thing here, so someone needs to double check for the
 	 * cases of hotplugable vga cards. */
@@ -1534,8 +1534,8 @@  static struct miscdevice vga_arb_device = {
 
 static int __init vga_arb_device_init(void)
 {
+	struct pci_dev *pdev = NULL;
 	int rc;
-	struct pci_dev *pdev;
 
 	rc = misc_register(&vga_arb_device);
 	if (rc < 0)
@@ -1543,13 +1543,14 @@  static int __init vga_arb_device_init(void)
 
 	bus_register_notifier(&pci_bus_type, &pci_notifier);
 
-	/* We add all PCI devices satisfying VGA class in the arbiter by
-	 * default */
-	pdev = NULL;
-	while ((pdev =
-		pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
-			       PCI_ANY_ID, pdev)) != NULL)
-		vga_arbiter_add_pci_device(pdev);
+	/*
+	 * We add all PCI VGA compatible devices in the arbiter by default
+	 */
+	do {
+		pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
+		if (pdev)
+			vga_arbiter_add_pci_device(pdev);
+	} while (pdev);
 
 	pr_info("loaded\n");
 	return rc;