diff mbox

[RFC] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

Message ID 1403171631-3452-1-git-send-email-tiejun.chen@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tiejun Chen June 19, 2014, 9:53 a.m. UTC
Originally the reason to probe ISA bridge instead of Dev31:Fun0
is to make graphics device passthrough work easy for VMM, that
only need to expose ISA bridge to let driver know the real
hardware underneath. This is a requirement from virtualization
team. Especially in that virtualized environments, XEN, there
is irrelevant ISA bridge in the system with that legacy qemu
version specific to xen, qemu-xen-traditional. So to work
reliably, we should scan through all the ISA bridge devices
and check for the first match, instead of only checking the
first one.

But actually, qemu-xen-traditional, is always enumerated with
Dev31:Fun0, 00:1f.0 as follows:

hw/pt-graphics.c:

intel_pch_init()
    |
    + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);

so this mean that isa bridge is still represented with Dev31:Func0
like the native OS. Furthermore, currently we're pushing VGA
passthrough support into qemu upstream, and with some discussion,
we wouldn't set the bridge class type and just expose this devfn.

So we just go back to check devfn to make life normal.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

Comments

Tiejun Chen June 20, 2014, 9:40 a.m. UTC | #1
Just ping, any comments?

Thanks
Tiejun

On 2014/6/19 17:53, Tiejun Chen wrote:
> Originally the reason to probe ISA bridge instead of Dev31:Fun0
> is to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real
> hardware underneath. This is a requirement from virtualization
> team. Especially in that virtualized environments, XEN, there
> is irrelevant ISA bridge in the system with that legacy qemu
> version specific to xen, qemu-xen-traditional. So to work
> reliably, we should scan through all the ISA bridge devices
> and check for the first match, instead of only checking the
> first one.
>
> But actually, qemu-xen-traditional, is always enumerated with
> Dev31:Fun0, 00:1f.0 as follows:
>
> hw/pt-graphics.c:
>
> intel_pch_init()
>      |
>      + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>
> so this mean that isa bridge is still represented with Dev31:Func0
> like the native OS. Furthermore, currently we're pushing VGA
> passthrough support into qemu upstream, and with some discussion,
> we wouldn't set the bridge class type and just expose this devfn.
>
> So we just go back to check devfn to make life normal.
>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>   1 file changed, 3 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 651e65e..cb2526e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>   		return;
>   	}
>
> -	/*
> -	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
> -	 * make graphics device passthrough work easy for VMM, that only
> -	 * need to expose ISA bridge to let driver know the real hardware
> -	 * underneath. This is a requirement from virtualization team.
> -	 *
> -	 * In some virtualized environments (e.g. XEN), there is irrelevant
> -	 * ISA bridge in the system. To work reliably, we should scan trhough
> -	 * all the ISA bridge devices and check for the first match, instead
> -	 * of only checking the first one.
> -	 */
> -	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
> +	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
> +	if (pch) {
>   		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>   			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>   			dev_priv->pch_id = id;
> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>   				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>   				WARN_ON(!IS_HASWELL(dev));
>   				WARN_ON(!IS_ULT(dev));
> -			} else
> -				continue;
> -
> -			break;
> +			}
>   		}
>   	}
>   	if (!pch)
>
Daniel Vetter June 20, 2014, 12:32 p.m. UTC | #2
Well I have no clue about forwarding the intel gpu to virtualized
hosts and also no idea who could review this really. There's been a
bit a discussion around the iommu mapping forwarding and similar
topics though. So I really wonder how well our driver works in this
use case ...
-Daniel

On Fri, Jun 20, 2014 at 11:40 AM, Chen, Tiejun <tiejun.chen@intel.com> wrote:
> Just ping, any comments?
>
> Thanks
> Tiejun
>
>
> On 2014/6/19 17:53, Tiejun Chen wrote:
>>
>> Originally the reason to probe ISA bridge instead of Dev31:Fun0
>> is to make graphics device passthrough work easy for VMM, that
>> only need to expose ISA bridge to let driver know the real
>> hardware underneath. This is a requirement from virtualization
>> team. Especially in that virtualized environments, XEN, there
>> is irrelevant ISA bridge in the system with that legacy qemu
>> version specific to xen, qemu-xen-traditional. So to work
>> reliably, we should scan through all the ISA bridge devices
>> and check for the first match, instead of only checking the
>> first one.
>>
>> But actually, qemu-xen-traditional, is always enumerated with
>> Dev31:Fun0, 00:1f.0 as follows:
>>
>> hw/pt-graphics.c:
>>
>> intel_pch_init()
>>      |
>>      + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>>
>> so this mean that isa bridge is still represented with Dev31:Func0
>> like the native OS. Furthermore, currently we're pushing VGA
>> passthrough support into qemu upstream, and with some discussion,
>> we wouldn't set the bridge class type and just expose this devfn.
>>
>> So we just go back to check devfn to make life normal.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>>   1 file changed, 3 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 651e65e..cb2526e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>>                 return;
>>         }
>>
>> -       /*
>> -        * The reason to probe ISA bridge instead of Dev31:Fun0 is to
>> -        * make graphics device passthrough work easy for VMM, that only
>> -        * need to expose ISA bridge to let driver know the real hardware
>> -        * underneath. This is a requirement from virtualization team.
>> -        *
>> -        * In some virtualized environments (e.g. XEN), there is
>> irrelevant
>> -        * ISA bridge in the system. To work reliably, we should scan
>> trhough
>> -        * all the ISA bridge devices and check for the first match,
>> instead
>> -        * of only checking the first one.
>> -        */
>> -       while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
>> +       pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
>> +       if (pch) {
>>                 if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>>                         unsigned short id = pch->device &
>> INTEL_PCH_DEVICE_ID_MASK;
>>                         dev_priv->pch_id = id;
>> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>>                                 DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>>                                 WARN_ON(!IS_HASWELL(dev));
>>                                 WARN_ON(!IS_ULT(dev));
>> -                       } else
>> -                               continue;
>> -
>> -                       break;
>> +                       }
>>                 }
>>         }
>>         if (!pch)
>>
>
Paolo Bonzini June 20, 2014, 12:48 p.m. UTC | #3
Il 19/06/2014 11:53, Tiejun Chen ha scritto:
> so this mean that isa bridge is still represented with Dev31:Func0
> like the native OS. Furthermore, currently we're pushing VGA
> passthrough support into qemu upstream, and with some discussion,
> we wouldn't set the bridge class type and just expose this devfn.

Even this is not really optimal.  It just happens to work just because 
QEMU's machine is currently a PCI machine with the ISA bridge on 00:01.0.

As soon as you'll try doing integrated graphics passthrough on a PCIe 
machine type (such as QEMU's "-M q35") things will break down just as badly.

Paolo
Tiejun Chen June 22, 2014, 8 a.m. UTC | #4
On 2014/6/20 20:32, Daniel Vetter wrote:
> Well I have no clue about forwarding the intel gpu to virtualized
> hosts and also no idea who could review this really. There's been a
> bit a discussion around the iommu mapping forwarding and similar

No, this doesn't affect IOMMU mapping.

> topics though. So I really wonder how well our driver works in this
> use case ...

In native case the truth is intel_detect_pch() needs to probe the 
00:15.0 to determine what type current pch is, then the i915 driver can 
be initialized correctly. Actually the 00:15.0 is just a ISA bridge.

In virtualized case we thought this ISA bridge mayn't be represented 
with 00:15.0 originally by qemu-xen-traditional. So instead of checking 
devfn, the i915 driver check the class type to get this ISA bridge.

But with my investigation,qemu-xen-traditional always set 00:15.0 
explicitly to represent this ISA bridge. And especially, we wouldn't 
provide that ISA bridge with an explicit class type in qemu-upstream, so 
we need to the i915 driver to probe pch by checking devfn.

This should work both on the native case and the virtualized case.

Thanks
Tiejun

> -Daniel
>
> On Fri, Jun 20, 2014 at 11:40 AM, Chen, Tiejun <tiejun.chen@intel.com> wrote:
>> Just ping, any comments?
>>
>> Thanks
>> Tiejun
>>
>>
>> On 2014/6/19 17:53, Tiejun Chen wrote:
>>>
>>> Originally the reason to probe ISA bridge instead of Dev31:Fun0
>>> is to make graphics device passthrough work easy for VMM, that
>>> only need to expose ISA bridge to let driver know the real
>>> hardware underneath. This is a requirement from virtualization
>>> team. Especially in that virtualized environments, XEN, there
>>> is irrelevant ISA bridge in the system with that legacy qemu
>>> version specific to xen, qemu-xen-traditional. So to work
>>> reliably, we should scan through all the ISA bridge devices
>>> and check for the first match, instead of only checking the
>>> first one.
>>>
>>> But actually, qemu-xen-traditional, is always enumerated with
>>> Dev31:Fun0, 00:1f.0 as follows:
>>>
>>> hw/pt-graphics.c:
>>>
>>> intel_pch_init()
>>>       |
>>>       + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>>>
>>> so this mean that isa bridge is still represented with Dev31:Func0
>>> like the native OS. Furthermore, currently we're pushing VGA
>>> passthrough support into qemu upstream, and with some discussion,
>>> we wouldn't set the bridge class type and just expose this devfn.
>>>
>>> So we just go back to check devfn to make life normal.
>>>
>>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>>>    1 file changed, 3 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index 651e65e..cb2526e 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>>>                  return;
>>>          }
>>>
>>> -       /*
>>> -        * The reason to probe ISA bridge instead of Dev31:Fun0 is to
>>> -        * make graphics device passthrough work easy for VMM, that only
>>> -        * need to expose ISA bridge to let driver know the real hardware
>>> -        * underneath. This is a requirement from virtualization team.
>>> -        *
>>> -        * In some virtualized environments (e.g. XEN), there is
>>> irrelevant
>>> -        * ISA bridge in the system. To work reliably, we should scan
>>> trhough
>>> -        * all the ISA bridge devices and check for the first match,
>>> instead
>>> -        * of only checking the first one.
>>> -        */
>>> -       while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
>>> +       pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
>>> +       if (pch) {
>>>                  if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>>>                          unsigned short id = pch->device &
>>> INTEL_PCH_DEVICE_ID_MASK;
>>>                          dev_priv->pch_id = id;
>>> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>>>                                  DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>>>                                  WARN_ON(!IS_HASWELL(dev));
>>>                                  WARN_ON(!IS_ULT(dev));
>>> -                       } else
>>> -                               continue;
>>> -
>>> -                       break;
>>> +                       }
>>>                  }
>>>          }
>>>          if (!pch)
>>>
>>
>
>
>
Tiejun Chen June 22, 2014, 8:25 a.m. UTC | #5
On 2014/6/20 20:48, Paolo Bonzini wrote:
> Il 19/06/2014 11:53, Tiejun Chen ha scritto:
>> so this mean that isa bridge is still represented with Dev31:Func0
>> like the native OS. Furthermore, currently we're pushing VGA
>> passthrough support into qemu upstream, and with some discussion,
>> we wouldn't set the bridge class type and just expose this devfn.
>
> Even this is not really optimal.  It just happens to work just because
> QEMU's machine is currently a PCI machine with the ISA bridge on 00:01.0.
>
> As soon as you'll try doing integrated graphics passthrough on a PCIe
> machine type (such as QEMU's "-M q35") things will break down just as
> badly.
>

Sorry, I can't understand why this is related to the ISA bridge, 00:01.0 
or even other PCIe machine type.

In virtualized case we always need to create this ISA bridge as a devfn, 
00:15.0, work for the i915 driver to support IGD passthrough.

In qemu-xen-traditional, this ISA bridge is already created as follows:

1> We set this ISA type explicitly;
2> We register that as 00:15.0.

In qemu-upstream, as you commented we can't create this as a ISA class 
type explicitly. So we compromise by faking this ISA bridge without ISA 
class type setting (as I recall you already said this way is slightly 
better). Maybe we will figure better way in the future. But anyway, this 
is always registered as 00:15.0, right? So I think the i915 driver can 
go back to probe the devfn like the native environment.

If I'm wrong please correct me.

Thanks
Tiejun
Zhenyu Wang June 24, 2014, 2:59 a.m. UTC | #6
On 2014.06.19 17:53:51 +0800, Tiejun Chen wrote:
> Originally the reason to probe ISA bridge instead of Dev31:Fun0
> is to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real
> hardware underneath. This is a requirement from virtualization
> team. Especially in that virtualized environments, XEN, there
> is irrelevant ISA bridge in the system with that legacy qemu
> version specific to xen, qemu-xen-traditional. So to work
> reliably, we should scan through all the ISA bridge devices
> and check for the first match, instead of only checking the
> first one.
> 
> But actually, qemu-xen-traditional, is always enumerated with
> Dev31:Fun0, 00:1f.0 as follows:
> 
> hw/pt-graphics.c:
> 
> intel_pch_init()
>     |
>     + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
> 
> so this mean that isa bridge is still represented with Dev31:Func0
> like the native OS. Furthermore, currently we're pushing VGA
> passthrough support into qemu upstream, and with some discussion,
> we wouldn't set the bridge class type and just expose this devfn.
> 
> So we just go back to check devfn to make life normal.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

This was added historically when supporting graphics device passthrough.
Looks qemu upstream can't accept multiple ISA bridge and our PCH is always
on device 31: func0 as far as I know. Looks good to me.

Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Tiejun Chen June 25, 2014, 2:28 a.m. UTC | #7
On 2014/6/24 10:59, Zhenyu Wang wrote:
> On 2014.06.19 17:53:51 +0800, Tiejun Chen wrote:
>> Originally the reason to probe ISA bridge instead of Dev31:Fun0
>> is to make graphics device passthrough work easy for VMM, that
>> only need to expose ISA bridge to let driver know the real
>> hardware underneath. This is a requirement from virtualization
>> team. Especially in that virtualized environments, XEN, there
>> is irrelevant ISA bridge in the system with that legacy qemu
>> version specific to xen, qemu-xen-traditional. So to work
>> reliably, we should scan through all the ISA bridge devices
>> and check for the first match, instead of only checking the
>> first one.
>>
>> But actually, qemu-xen-traditional, is always enumerated with
>> Dev31:Fun0, 00:1f.0 as follows:
>>
>> hw/pt-graphics.c:
>>
>> intel_pch_init()
>>      |
>>      + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>>
>> so this mean that isa bridge is still represented with Dev31:Func0
>> like the native OS. Furthermore, currently we're pushing VGA
>> passthrough support into qemu upstream, and with some discussion,
>> we wouldn't set the bridge class type and just expose this devfn.
>>
>> So we just go back to check devfn to make life normal.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>
> This was added historically when supporting graphics device passthrough.
> Looks qemu upstream can't accept multiple ISA bridge and our PCH is always
> on device 31: func0 as far as I know. Looks good to me.
>
> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
>

Thanks for your review.

Do you know when this can be applied?

Tiejun
Paolo Bonzini June 25, 2014, 6:48 a.m. UTC | #8
Il 22/06/2014 10:25, Chen, Tiejun ha scritto:
> In qemu-upstream, as you commented we can't create this as a ISA class
> type explicitly.

Note I didn't say that QEMU doesn't like having two ISA bridges.

I commented that the firmware will see two ISA bridges and will try to 
initialize both of them.  It currently doesn't just because it only 
knows of two southbridge PCI IDs, and both of them are old or relatively 
old (PIIX3/4 and ICH9).

But what would happen if you did graphics passthrough on a machine with 
an ICH9 southbridge?  Your code will create the PIIX4 ISA bridge, and 
will create an ICH9 southbridge just to please the i915 driver.  The 
BIOS will recognize the ICH9's vendor/device ids and try to initialize 
it.  It will write to the configuration space to set up PCI PIRQ[A-H] 
routing, for example, which makes no sense since your ICH9 is just a 
dummy device.

Second problem.  Your IGD passthrough code currently works with QEMU's 
PIIX4-based machine.  But what happens if you try to extend it, so that 
it works with QEMU's ICH9-based machine?  That's a more modern machine 
that has a PCIe chipset and hence has its ISA bridge at 00:1f.0.  Now 
you have a conflict.

In other words, if you want IGD passthrough in QEMU, you need a solution 
that is future-proof.

> So we compromise by faking this ISA bridge without ISA
> class type setting (as I recall you already said this way is slightly
> better).

It is only slightly better, but the right solution is to fix the driver. 
  There is absolutely zero reason why a graphics driver should know 
about the vendor/device ids of the PCH.

The right way could be to make QEMU add a vendor-specific capability to 
the video device. The driver can probe for that capability before 
looking at the PCI bus.  QEMU can add the capability to the list, it is 
easy because all accesses to the video device's configuration space trap 
to QEMU.  Then you do not need to add fake devices to the machine.

In fact, it would be nice if Intel added such a capability on the next 
generation of integrated graphics, too.  On real hardware, ACPI or some 
other kind of firmware can probe the PCH at 00:1f.0 and initialize that 
vendor-specific capability.  QEMU would just forward the value from the 
host, so that virtualized guests never depend on the PCH at 00:1f.0.

Paolo

> Maybe we will figure better way in the future. But anyway, this
> is always registered as 00:15.0, right? So I think the i915 driver can
> go back to probe the devfn like the native environment.
Tiejun Chen June 25, 2014, 7:34 a.m. UTC | #9
On 2014/6/25 14:48, Paolo Bonzini wrote:
> Il 22/06/2014 10:25, Chen, Tiejun ha scritto:
>> In qemu-upstream, as you commented we can't create this as a ISA class
>> type explicitly.
>
> Note I didn't say that QEMU doesn't like having two ISA bridges.
>
> I commented that the firmware will see two ISA bridges and will try to
> initialize both of them.  It currently doesn't just because it only
> knows of two southbridge PCI IDs, and both of them are old or relatively
> old (PIIX3/4 and ICH9).
>
> But what would happen if you did graphics passthrough on a machine with
> an ICH9 southbridge?  Your code will create the PIIX4 ISA bridge, and
> will create an ICH9 southbridge just to please the i915 driver.  The
> BIOS will recognize the ICH9's vendor/device ids and try to initialize
> it.  It will write to the configuration space to set up PCI PIRQ[A-H]
> routing, for example, which makes no sense since your ICH9 is just a
> dummy device.
>

Thanks for your detailed explanation.

> Second problem.  Your IGD passthrough code currently works with QEMU's
> PIIX4-based machine.  But what happens if you try to extend it, so that

Yes, current xen machine, xenpv, is based on pii4, and also I don't 
known if we will plan to migrate to q35 or others. So its hard to 
further say more now.

> it works with QEMU's ICH9-based machine?  That's a more modern machine
> that has a PCIe chipset and hence has its ISA bridge at 00:1f.0.  Now

But even in this case, could we set the real vendor/device ids for that 
ISA bridge at 00:1f.0? If not, what's broken?

> you have a conflict.
>
> In other words, if you want IGD passthrough in QEMU, you need a solution
> that is future-proof.
>
>> So we compromise by faking this ISA bridge without ISA
>> class type setting (as I recall you already said this way is slightly
>> better).
>
> It is only slightly better, but the right solution is to fix the driver.
>   There is absolutely zero reason why a graphics driver should know
> about the vendor/device ids of the PCH.

This means we have to fix this both on Linux and Windows but I'm not 
sure if this is feasible to us.

>
> The right way could be to make QEMU add a vendor-specific capability to
> the video device. The driver can probe for that capability before

Do you mean we can pick two unused offsets in the configuration space of 
the video device as a vendor-specific capability to hold the 
vendor/device ids of the PCH?

> looking at the PCI bus.  QEMU can add the capability to the list, it is
> easy because all accesses to the video device's configuration space trap
> to QEMU.  Then you do not need to add fake devices to the machine.
>
> In fact, it would be nice if Intel added such a capability on the next
> generation of integrated graphics, too.  On real hardware, ACPI or some

Maybe, but even this would be implemented, shouldn't we need to be 
compatible with those old generations?

Thanks
Tiejun

> other kind of firmware can probe the PCH at 00:1f.0 and initialize that
> vendor-specific capability.  QEMU would just forward the value from the
> host, so that virtualized guests never depend on the PCH at 00:1f.0.
>
> Paolo
>
>> Maybe we will figure better way in the future. But anyway, this
>> is always registered as 00:15.0, right? So I think the i915 driver can
>> go back to probe the devfn like the native environment.
>
>
>
Paolo Bonzini June 25, 2014, 7:55 a.m. UTC | #10
Il 25/06/2014 09:34, Chen, Tiejun ha scritto:
> On 2014/6/25 14:48, Paolo Bonzini wrote:
>> Second problem.  Your IGD passthrough code currently works with QEMU's
>> PIIX4-based machine.  But what happens if you try to extend it, so that
>
> Yes, current xen machine, xenpv, is based on pii4, and also I don't
> known if we will plan to migrate to q35 or others. So its hard to
> further say more now.
>
>> it works with QEMU's ICH9-based machine?  That's a more modern machine
>> that has a PCIe chipset and hence has its ISA bridge at 00:1f.0.  Now
>
> But even in this case, could we set the real vendor/device ids for that
> ISA bridge at 00:1f.0? If not, what's broken?

The config space layout changes for different vendor/device ids, so the 
guest firmware only works if you have the right vendor/device id.

>> It is only slightly better, but the right solution is to fix the driver.
>>   There is absolutely zero reason why a graphics driver should know
>> about the vendor/device ids of the PCH.
>
> This means we have to fix this both on Linux and Windows but I'm not
> sure if this is feasible to us.

You have to do it if you want this feature in QEMU in a future-proof way.

You _can_ provide the ugly PIIX4-specific hack as a compatibility 
fallback (and this patch is okay to make the compatibility fallback less 
hacky).  However, I don't think QEMU should accept the patch for IGD 
passthrough unless Intel is willing to make drivers 
virtualization-friendly.  Once you assign the IGD, it is not that 
integrated anymore and the drivers must take that into account.

It is worthwhile pointing out that neither AMD nor nVidia need any of this.

>> The right way could be to make QEMU add a vendor-specific capability to
>> the video device. The driver can probe for that capability before
>
> Do you mean we can pick two unused offsets in the configuration space of
> the video device as a vendor-specific capability to hold the
> vendor/device ids of the PCH?

Yes, either that or add a new capability (which lets you choose the 
offsets more freely).

If the IGD driver needs config space fields of the MCH, those fields 
could also be mirrored in the new capability.  QEMU would forward them 
automatically.

It could even be a new BAR, which gives even more freedom to allocate 
the fields.

>> looking at the PCI bus.  QEMU can add the capability to the list, it is
>> easy because all accesses to the video device's configuration space trap
>> to QEMU.  Then you do not need to add fake devices to the machine.
>>
>> In fact, it would be nice if Intel added such a capability on the next
>> generation of integrated graphics, too.  On real hardware, ACPI or some
>
> Maybe, but even this would be implemented, shouldn't we need to be
> compatible with those old generations?

Yes.

- old generation / old driver: use 00:1f.0 hack, only guaranteed to work 
on PIIX4-based virtual guest

- old generation / new driver: use 00:1f.0 hack on real hardware, use 
capability on 00:02.0 on virtual guest, can work on PCIe virtual guest

- new generation / old driver: doesn't exist

- new generation / new driver: always use capability on 00:02.0, can 
work on PCIe virtual guest.

Paolo
Tiejun Chen June 30, 2014, 3:13 a.m. UTC | #11
On 2014/6/25 15:55, Paolo Bonzini wrote:
> Il 25/06/2014 09:34, Chen, Tiejun ha scritto:
>> On 2014/6/25 14:48, Paolo Bonzini wrote:
>>> Second problem.  Your IGD passthrough code currently works with QEMU's
>>> PIIX4-based machine.  But what happens if you try to extend it, so that
>>
>> Yes, current xen machine, xenpv, is based on pii4, and also I don't
>> known if we will plan to migrate to q35 or others. So its hard to
>> further say more now.
>>
>>> it works with QEMU's ICH9-based machine?  That's a more modern machine
>>> that has a PCIe chipset and hence has its ISA bridge at 00:1f.0.  Now
>>
>> But even in this case, could we set the real vendor/device ids for that
>> ISA bridge at 00:1f.0? If not, what's broken?
>
> The config space layout changes for different vendor/device ids, so the
> guest firmware only works if you have the right vendor/device id.

Paolo,

After I discuss internal, we think even we just set the real 
vendor/device ids to this ISA bridge at 00:1f.0, guest firmware should 
still work well with these pair of real vendor/device ids.

So if you think something would conflict or be broken, could you tell us 
what's exactly that? Then we will double check.

Thanks
Tiejun

>
>>> It is only slightly better, but the right solution is to fix the driver.
>>>   There is absolutely zero reason why a graphics driver should know
>>> about the vendor/device ids of the PCH.
>>
>> This means we have to fix this both on Linux and Windows but I'm not
>> sure if this is feasible to us.
>
> You have to do it if you want this feature in QEMU in a future-proof way.
>
> You _can_ provide the ugly PIIX4-specific hack as a compatibility
> fallback (and this patch is okay to make the compatibility fallback less
> hacky).  However, I don't think QEMU should accept the patch for IGD
> passthrough unless Intel is willing to make drivers
> virtualization-friendly.  Once you assign the IGD, it is not that
> integrated anymore and the drivers must take that into account.
>
> It is worthwhile pointing out that neither AMD nor nVidia need any of this.
>
>>> The right way could be to make QEMU add a vendor-specific capability to
>>> the video device. The driver can probe for that capability before
>>
>> Do you mean we can pick two unused offsets in the configuration space of
>> the video device as a vendor-specific capability to hold the
>> vendor/device ids of the PCH?
>
> Yes, either that or add a new capability (which lets you choose the
> offsets more freely).
>
> If the IGD driver needs config space fields of the MCH, those fields
> could also be mirrored in the new capability.  QEMU would forward them
> automatically.
>
> It could even be a new BAR, which gives even more freedom to allocate
> the fields.
>
>>> looking at the PCI bus.  QEMU can add the capability to the list, it is
>>> easy because all accesses to the video device's configuration space trap
>>> to QEMU.  Then you do not need to add fake devices to the machine.
>>>
>>> In fact, it would be nice if Intel added such a capability on the next
>>> generation of integrated graphics, too.  On real hardware, ACPI or some
>>
>> Maybe, but even this would be implemented, shouldn't we need to be
>> compatible with those old generations?
>
> Yes.
>
> - old generation / old driver: use 00:1f.0 hack, only guaranteed to work
> on PIIX4-based virtual guest
>
> - old generation / new driver: use 00:1f.0 hack on real hardware, use
> capability on 00:02.0 on virtual guest, can work on PCIe virtual guest
>
> - new generation / old driver: doesn't exist
>
> - new generation / new driver: always use capability on 00:02.0, can
> work on PCIe virtual guest.
>
> Paolo
>
>
Paolo Bonzini June 30, 2014, 10:56 a.m. UTC | #12
Il 30/06/2014 05:13, Chen, Tiejun ha scritto:
> After I discuss internal, we think even we just set the real
> vendor/device ids to this ISA bridge at 00:1f.0, guest firmware should
> still work well with these pair of real vendor/device ids.
>
> So if you think something would conflict or be broken, could you tell us
> what's exactly that? Then we will double check.

The Xen hvmloader doesn't break since it only supports one chipset.  But 
SeaBIOS checks for the exact vendor/device ids since Q35 support was added.

If you want to add this feature, try to implement it in a way that is a 
bit more forward-looking.  I'm sure that Xen sooner or later will want a 
PCIe chipset, otherwise things such as AER forwarding are impossible.

Paolo
Michael S. Tsirkin June 30, 2014, 11:18 a.m. UTC | #13
On Thu, Jun 19, 2014 at 05:53:51PM +0800, Tiejun Chen wrote:
> Originally the reason to probe ISA bridge instead of Dev31:Fun0
> is to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real
> hardware underneath. This is a requirement from virtualization
> team. Especially in that virtualized environments, XEN, there
> is irrelevant ISA bridge in the system with that legacy qemu
> version specific to xen, qemu-xen-traditional. So to work
> reliably, we should scan through all the ISA bridge devices
> and check for the first match, instead of only checking the
> first one.
> 
> But actually, qemu-xen-traditional, is always enumerated with
> Dev31:Fun0, 00:1f.0 as follows:
> 
> hw/pt-graphics.c:
> 
> intel_pch_init()
>     |
>     + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
> 
> so this mean that isa bridge is still represented with Dev31:Func0
> like the native OS. Furthermore, currently we're pushing VGA
> passthrough support into qemu upstream, and with some discussion,
> we wouldn't set the bridge class type and just expose this devfn.
> 
> So we just go back to check devfn to make life normal.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>  1 file changed, 3 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 651e65e..cb2526e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>  		return;
>  	}
>  
> -	/*
> -	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
> -	 * make graphics device passthrough work easy for VMM, that only
> -	 * need to expose ISA bridge to let driver know the real hardware
> -	 * underneath. This is a requirement from virtualization team.
> -	 *
> -	 * In some virtualized environments (e.g. XEN), there is irrelevant
> -	 * ISA bridge in the system. To work reliably, we should scan trhough
> -	 * all the ISA bridge devices and check for the first match, instead
> -	 * of only checking the first one.
> -	 */
> -	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
> +	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
> +	if (pch) {

Then if you want to use this slot for something else, what happens?
If you want to relax the PCI_CLASS_BRIDGE_ISA requirement when
running on top of a hypervisor, just scan all devices.

>  		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>  			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>  			dev_priv->pch_id = id;
> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>  				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>  				WARN_ON(!IS_HASWELL(dev));
>  				WARN_ON(!IS_ULT(dev));
> -			} else
> -				continue;
> -
> -			break;
> +			}
>  		}
>  	}
>  	if (!pch)
> -- 
> 1.9.1
>
Tiejun Chen July 1, 2014, 1:52 a.m. UTC | #14
On 2014/6/30 19:18, Michael S. Tsirkin wrote:
> On Thu, Jun 19, 2014 at 05:53:51PM +0800, Tiejun Chen wrote:
>> Originally the reason to probe ISA bridge instead of Dev31:Fun0
>> is to make graphics device passthrough work easy for VMM, that
>> only need to expose ISA bridge to let driver know the real
>> hardware underneath. This is a requirement from virtualization
>> team. Especially in that virtualized environments, XEN, there
>> is irrelevant ISA bridge in the system with that legacy qemu
>> version specific to xen, qemu-xen-traditional. So to work
>> reliably, we should scan through all the ISA bridge devices
>> and check for the first match, instead of only checking the
>> first one.
>>
>> But actually, qemu-xen-traditional, is always enumerated with
>> Dev31:Fun0, 00:1f.0 as follows:
>>
>> hw/pt-graphics.c:
>>
>> intel_pch_init()
>>      |
>>      + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>>
>> so this mean that isa bridge is still represented with Dev31:Func0
>> like the native OS. Furthermore, currently we're pushing VGA
>> passthrough support into qemu upstream, and with some discussion,
>> we wouldn't set the bridge class type and just expose this devfn.
>>
>> So we just go back to check devfn to make life normal.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>>   1 file changed, 3 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 651e65e..cb2526e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>>   		return;
>>   	}
>>
>> -	/*
>> -	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
>> -	 * make graphics device passthrough work easy for VMM, that only
>> -	 * need to expose ISA bridge to let driver know the real hardware
>> -	 * underneath. This is a requirement from virtualization team.
>> -	 *
>> -	 * In some virtualized environments (e.g. XEN), there is irrelevant
>> -	 * ISA bridge in the system. To work reliably, we should scan trhough
>> -	 * all the ISA bridge devices and check for the first match, instead
>> -	 * of only checking the first one.
>> -	 */
>> -	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
>> +	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
>> +	if (pch) {
>
> Then if you want to use this slot for something else, what happens?

I think this slot is always occupied to be dedicated to this ISA bridge 
in the platform.

So don't worry, the drivers in Linux and Windows can live with this.

Thanks
Tiejun

> If you want to relax the PCI_CLASS_BRIDGE_ISA requirement when
> running on top of a hypervisor, just scan all devices.
>
>>   		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>>   			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>>   			dev_priv->pch_id = id;
>> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>>   				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>>   				WARN_ON(!IS_HASWELL(dev));
>>   				WARN_ON(!IS_ULT(dev));
>> -			} else
>> -				continue;
>> -
>> -			break;
>> +			}
>>   		}
>>   	}
>>   	if (!pch)
>> --
>> 1.9.1
>>
>
Michael S. Tsirkin July 2, 2014, 6:21 a.m. UTC | #15
On Thu, Jun 19, 2014 at 05:53:51PM +0800, Tiejun Chen wrote:
> Originally the reason to probe ISA bridge instead of Dev31:Fun0
> is to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real
> hardware underneath. This is a requirement from virtualization
> team. Especially in that virtualized environments, XEN, there
> is irrelevant ISA bridge in the system with that legacy qemu
> version specific to xen, qemu-xen-traditional. So to work
> reliably, we should scan through all the ISA bridge devices
> and check for the first match, instead of only checking the
> first one.
> 
> But actually, qemu-xen-traditional, is always enumerated with
> Dev31:Fun0, 00:1f.0 as follows:
> 
> hw/pt-graphics.c:
> 
> intel_pch_init()
>     |
>     + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
> 
> so this mean that isa bridge is still represented with Dev31:Func0
> like the native OS. Furthermore, currently we're pushing VGA
> passthrough support into qemu upstream, and with some discussion,
> we wouldn't set the bridge class type and just expose this devfn.
> 
> So we just go back to check devfn to make life normal.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

Making sure all relevant people are Cc'd.

I think we should wait for the hypervisor to settle
on what it wants to implement before making guest
changes. Otherwise we'll just add more guest configurations
that need to be supported.



> ---
>  drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>  1 file changed, 3 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 651e65e..cb2526e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>  		return;
>  	}
>  
> -	/*
> -	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
> -	 * make graphics device passthrough work easy for VMM, that only
> -	 * need to expose ISA bridge to let driver know the real hardware
> -	 * underneath. This is a requirement from virtualization team.
> -	 *
> -	 * In some virtualized environments (e.g. XEN), there is irrelevant
> -	 * ISA bridge in the system. To work reliably, we should scan trhough
> -	 * all the ISA bridge devices and check for the first match, instead
> -	 * of only checking the first one.
> -	 */
> -	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
> +	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
> +	if (pch) {
>  		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>  			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>  			dev_priv->pch_id = id;
> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>  				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>  				WARN_ON(!IS_HASWELL(dev));
>  				WARN_ON(!IS_ULT(dev));
> -			} else
> -				continue;
> -
> -			break;
> +			}
>  		}
>  	}
>  	if (!pch)
> -- 
> 1.9.1
>
Tiejun Chen July 2, 2014, 8:27 a.m. UTC | #16
On 2014/7/2 14:21, Michael S. Tsirkin wrote:
> On Thu, Jun 19, 2014 at 05:53:51PM +0800, Tiejun Chen wrote:
>> Originally the reason to probe ISA bridge instead of Dev31:Fun0
>> is to make graphics device passthrough work easy for VMM, that
>> only need to expose ISA bridge to let driver know the real
>> hardware underneath. This is a requirement from virtualization
>> team. Especially in that virtualized environments, XEN, there
>> is irrelevant ISA bridge in the system with that legacy qemu
>> version specific to xen, qemu-xen-traditional. So to work
>> reliably, we should scan through all the ISA bridge devices
>> and check for the first match, instead of only checking the
>> first one.
>>
>> But actually, qemu-xen-traditional, is always enumerated with
>> Dev31:Fun0, 00:1f.0 as follows:
>>
>> hw/pt-graphics.c:
>>
>> intel_pch_init()
>>      |
>>      + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
>>
>> so this mean that isa bridge is still represented with Dev31:Func0
>> like the native OS. Furthermore, currently we're pushing VGA
>> passthrough support into qemu upstream, and with some discussion,
>> we wouldn't set the bridge class type and just expose this devfn.
>>
>> So we just go back to check devfn to make life normal.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>
> Making sure all relevant people are Cc'd.

Are you saying you and Paolo? Or others I'm missing?

>
> I think we should wait for the hypervisor to settle
> on what it wants to implement before making guest
> changes. Otherwise we'll just add more guest configurations
> that need to be supported.

Agreed. Actually this is why I just send this by RFC.

Thanks
Tiejun

>
>
>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c | 19 +++----------------
>>   1 file changed, 3 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 651e65e..cb2526e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -417,18 +417,8 @@ void intel_detect_pch(struct drm_device *dev)
>>   		return;
>>   	}
>>
>> -	/*
>> -	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
>> -	 * make graphics device passthrough work easy for VMM, that only
>> -	 * need to expose ISA bridge to let driver know the real hardware
>> -	 * underneath. This is a requirement from virtualization team.
>> -	 *
>> -	 * In some virtualized environments (e.g. XEN), there is irrelevant
>> -	 * ISA bridge in the system. To work reliably, we should scan trhough
>> -	 * all the ISA bridge devices and check for the first match, instead
>> -	 * of only checking the first one.
>> -	 */
>> -	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
>> +	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
>> +	if (pch) {
>>   		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>>   			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>>   			dev_priv->pch_id = id;
>> @@ -462,10 +452,7 @@ void intel_detect_pch(struct drm_device *dev)
>>   				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
>>   				WARN_ON(!IS_HASWELL(dev));
>>   				WARN_ON(!IS_ULT(dev));
>> -			} else
>> -				continue;
>> -
>> -			break;
>> +			}
>>   		}
>>   	}
>>   	if (!pch)
>> --
>> 1.9.1
>>
>
>
Daniel Vetter July 7, 2014, 2:49 p.m. UTC | #17
On Wed, Jun 25, 2014 at 08:48:32AM +0200, Paolo Bonzini wrote:
> It is only slightly better, but the right solution is to fix the driver.
> There is absolutely zero reason why a graphics driver should know about the
> vendor/device ids of the PCH.

There is a very valid reason to know about the PCH since it _is_ part of
the gpu. You don't notice this all that much in the driver since the
hardware provides magic mmio regions in the main vga device register bar
which remap to the relevant PCH register ranges. And an awful lot of other
stuff like the MCH. So from an abstract pov you could only pass the intel
igt to a guest if you'd forward the gfx device, the mch/host bridge and
the pch. Thanks to these magic mmio windows we don't need those, except
that someone forgot to forward the pch pci id.

So the correct fix to forward intel gpus to guests is indeed to somehow
fake the pch pci ids since the driver really needs them. Gross design, but
that's how the hardware works.
-Daniel
Daniel Vetter July 7, 2014, 2:51 p.m. UTC | #18
On Wed, Jun 25, 2014 at 10:28:21AM +0800, Chen, Tiejun wrote:
> On 2014/6/24 10:59, Zhenyu Wang wrote:
> >On 2014.06.19 17:53:51 +0800, Tiejun Chen wrote:
> >>Originally the reason to probe ISA bridge instead of Dev31:Fun0
> >>is to make graphics device passthrough work easy for VMM, that
> >>only need to expose ISA bridge to let driver know the real
> >>hardware underneath. This is a requirement from virtualization
> >>team. Especially in that virtualized environments, XEN, there
> >>is irrelevant ISA bridge in the system with that legacy qemu
> >>version specific to xen, qemu-xen-traditional. So to work
> >>reliably, we should scan through all the ISA bridge devices
> >>and check for the first match, instead of only checking the
> >>first one.
> >>
> >>But actually, qemu-xen-traditional, is always enumerated with
> >>Dev31:Fun0, 00:1f.0 as follows:
> >>
> >>hw/pt-graphics.c:
> >>
> >>intel_pch_init()
> >>     |
> >>     + pci_isa_bridge_init(bus, PCI_DEVFN(0x1f, 0), ...);
> >>
> >>so this mean that isa bridge is still represented with Dev31:Func0
> >>like the native OS. Furthermore, currently we're pushing VGA
> >>passthrough support into qemu upstream, and with some discussion,
> >>we wouldn't set the bridge class type and just expose this devfn.
> >>
> >>So we just go back to check devfn to make life normal.
> >>
> >>Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> >
> >This was added historically when supporting graphics device passthrough.
> >Looks qemu upstream can't accept multiple ISA bridge and our PCH is always
> >on device 31: func0 as far as I know. Looks good to me.
> >
> >Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
> >
> 
> Thanks for your review.
> 
> Do you know when this can be applied?

I'll hold off merging until we have buy-in from upstream quemu on a given
approach (which should work for both linux and windows).
-Daniel
Paolo Bonzini July 7, 2014, 2:57 p.m. UTC | #19
Il 07/07/2014 16:49, Daniel Vetter ha scritto:
> So the correct fix to forward intel gpus to guests is indeed to somehow
> fake the pch pci ids since the driver really needs them. Gross design, but
> that's how the hardware works.

A way that could work for virtualization is this: if you find the card 
has a magic subsystem vendor id, fetch the subsystem device id and use 
_that_ as the PCH device id.

Would that work for you?

Paolo
Daniel Vetter July 7, 2014, 5:54 p.m. UTC | #20
On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote:
> Il 07/07/2014 16:49, Daniel Vetter ha scritto:
> >So the correct fix to forward intel gpus to guests is indeed to somehow
> >fake the pch pci ids since the driver really needs them. Gross design, but
> >that's how the hardware works.
> 
> A way that could work for virtualization is this: if you find the card has a
> magic subsystem vendor id, fetch the subsystem device id and use _that_ as
> the PCH device id.
> 
> Would that work for you?

I guess for quemu it also depends upon what windows does since we can't
change that. If we care about that part. Another consideration is
supporting older kernels, if that's possible at all.
-Daniel
Paolo Bonzini July 7, 2014, 5:58 p.m. UTC | #21
Il 07/07/2014 19:54, Daniel Vetter ha scritto:
> On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote:
>> Il 07/07/2014 16:49, Daniel Vetter ha scritto:
>>> So the correct fix to forward intel gpus to guests is indeed to somehow
>>> fake the pch pci ids since the driver really needs them. Gross design, but
>>> that's how the hardware works.
>>
>> A way that could work for virtualization is this: if you find the card has a
>> magic subsystem vendor id, fetch the subsystem device id and use _that_ as
>> the PCH device id.
>>
>> Would that work for you?
>
> I guess for quemu it also depends upon what windows does since we can't
> change that. If we care about that part. Another consideration is
> supporting older kernels, if that's possible at all.

Yes, but right now it's more important to get something that's not too 
gross for the future, for both Linux and Windows.  Hacks for existing 
guests can be done separately, especially since they might differ 
between Linux (check ISA bridge) and Windows (check 1f.0).

Paolo
Daniel Vetter July 7, 2014, 6:40 p.m. UTC | #22
On Mon, Jul 07, 2014 at 07:58:30PM +0200, Paolo Bonzini wrote:
> Il 07/07/2014 19:54, Daniel Vetter ha scritto:
> >On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote:
> >>Il 07/07/2014 16:49, Daniel Vetter ha scritto:
> >>>So the correct fix to forward intel gpus to guests is indeed to somehow
> >>>fake the pch pci ids since the driver really needs them. Gross design, but
> >>>that's how the hardware works.
> >>
> >>A way that could work for virtualization is this: if you find the card has a
> >>magic subsystem vendor id, fetch the subsystem device id and use _that_ as
> >>the PCH device id.
> >>
> >>Would that work for you?
> >
> >I guess for quemu it also depends upon what windows does since we can't
> >change that. If we care about that part. Another consideration is
> >supporting older kernels, if that's possible at all.
> 
> Yes, but right now it's more important to get something that's not too gross
> for the future, for both Linux and Windows.  Hacks for existing guests can
> be done separately, especially since they might differ between Linux (check
> ISA bridge) and Windows (check 1f.0).

Well old Linux also checked 1f.0, so kinda the same really. As long as
1f.0 is an isa bridge. Wrt Windows I don't really expect them to change
this (they're probably more focuesed on the windows hypervisor or whatever).

In the end if the approach is ok for quemu and isn't much worse than what
we currently have I don't mind at all about the i915.ko code. I just want
to avoid flip-flopping around on the hack du jour like we seem to do just
now.
-Daniel
Tian, Kevin July 10, 2014, 9:08 p.m. UTC | #23
> From: Daniel Vetter
> Sent: Monday, July 07, 2014 11:40 AM
> 
> On Mon, Jul 07, 2014 at 07:58:30PM +0200, Paolo Bonzini wrote:
> > Il 07/07/2014 19:54, Daniel Vetter ha scritto:
> > >On Mon, Jul 07, 2014 at 04:57:45PM +0200, Paolo Bonzini wrote:
> > >>Il 07/07/2014 16:49, Daniel Vetter ha scritto:
> > >>>So the correct fix to forward intel gpus to guests is indeed to somehow
> > >>>fake the pch pci ids since the driver really needs them. Gross design, but
> > >>>that's how the hardware works.
> > >>
> > >>A way that could work for virtualization is this: if you find the card has a
> > >>magic subsystem vendor id, fetch the subsystem device id and use _that_
> as
> > >>the PCH device id.
> > >>
> > >>Would that work for you?
> > >
> > >I guess for quemu it also depends upon what windows does since we can't
> > >change that. If we care about that part. Another consideration is
> > >supporting older kernels, if that's possible at all.
> >
> > Yes, but right now it's more important to get something that's not too gross
> > for the future, for both Linux and Windows.  Hacks for existing guests can
> > be done separately, especially since they might differ between Linux (check
> > ISA bridge) and Windows (check 1f.0).
> 
> Well old Linux also checked 1f.0, so kinda the same really. As long as
> 1f.0 is an isa bridge. Wrt Windows I don't really expect them to change
> this (they're probably more focuesed on the windows hypervisor or whatever).

discussion is also on-going with Windows driver folks. Add Allen here.

> 
> In the end if the approach is ok for quemu and isn't much worse than what
> we currently have I don't mind at all about the i915.ko code. I just want
> to avoid flip-flopping around on the hack du jour like we seem to do just
> now.
> -Daniel

actually I'm curious whether it's still necessary to __detect__ PCH. Could
we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
code the knowledge:

              } else if (IS_BROADWELL(dev)) {
                      dev_priv->pch_type = PCH_LPT;
                      dev_priv->pch_id =
                              INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
                      DRM_DEBUG_KMS("This is Broadwell, assuming "
                                    "LynxPoint LP PCH\n");

Or if there is real usage on non-fixed mapping (not majority), could it be a 
better option to have fixed mapping as a fallback instead of leaving as 
PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH,
the majority case just works.

Thanks
Kevin
Daniel Vetter July 11, 2014, 6:29 a.m. UTC | #24
On Thu, Jul 10, 2014 at 09:08:24PM +0000, Tian, Kevin wrote:
> actually I'm curious whether it's still necessary to __detect__ PCH. Could
> we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> code the knowledge:
> 
>               } else if (IS_BROADWELL(dev)) {
>                       dev_priv->pch_type = PCH_LPT;
>                       dev_priv->pch_id =
>                               INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
>                       DRM_DEBUG_KMS("This is Broadwell, assuming "
>                                     "LynxPoint LP PCH\n");
> 
> Or if there is real usage on non-fixed mapping (not majority), could it be a 
> better option to have fixed mapping as a fallback instead of leaving as 
> PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH,
> the majority case just works.

I guess we can do it, at least I haven't seen any strange combinations in
the wild outside of Intel ...
-Daniel
Konrad Rzeszutek Wilk July 11, 2014, 7:42 p.m. UTC | #25
On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> On Thu, Jul 10, 2014 at 09:08:24PM +0000, Tian, Kevin wrote:
> > actually I'm curious whether it's still necessary to __detect__ PCH. Could
> > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > code the knowledge:
> > 
> >               } else if (IS_BROADWELL(dev)) {
> >                       dev_priv->pch_type = PCH_LPT;
> >                       dev_priv->pch_id =
> >                               INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> >                       DRM_DEBUG_KMS("This is Broadwell, assuming "
> >                                     "LynxPoint LP PCH\n");
> > 
> > Or if there is real usage on non-fixed mapping (not majority), could it be a 
> > better option to have fixed mapping as a fallback instead of leaving as 
> > PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH,
> > the majority case just works.
> 
> I guess we can do it, at least I haven't seen any strange combinations in
> the wild outside of Intel ...

How big is the QA matrix for this? Would it make sense to just
include the latest hardware (say going two generations back)
and ignore the older one?
Tian, Kevin July 11, 2014, 8:30 p.m. UTC | #26
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Friday, July 11, 2014 12:42 PM
> 
> On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> > On Thu, Jul 10, 2014 at 09:08:24PM +0000, Tian, Kevin wrote:
> > > actually I'm curious whether it's still necessary to __detect__ PCH. Could
> > > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > > code the knowledge:
> > >
> > >               } else if (IS_BROADWELL(dev)) {
> > >                       dev_priv->pch_type = PCH_LPT;
> > >                       dev_priv->pch_id =
> > >
> INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> > >                       DRM_DEBUG_KMS("This is Broadwell,
> assuming "
> > >                                     "LynxPoint LP PCH\n");
> > >
> > > Or if there is real usage on non-fixed mapping (not majority), could it be a
> > > better option to have fixed mapping as a fallback instead of leaving as
> > > PCH_NONE? Then even when Qemu doesn't provide a special tweaked
> PCH,
> > > the majority case just works.
> >
> > I guess we can do it, at least I haven't seen any strange combinations in
> > the wild outside of Intel ...
> 
> How big is the QA matrix for this? Would it make sense to just
> include the latest hardware (say going two generations back)
> and ignore the older one?

suppose minimal or no QA effort on bare metal, if we only conservatively 
change the fallback path which is today not supposed to function with 
PCH_NONE. so it's only same amount of QA effort as whatever else is 
proposed in this passthru upstreaming task. I agree no need to cover 
older model, possibly just snb, ivb and hsw, but will leave Tiejun to answer 
the overall goal.

Thanks
Kevin
Daniel Vetter July 12, 2014, 10:13 a.m. UTC | #27
On Fri, Jul 11, 2014 at 08:30:59PM +0000, Tian, Kevin wrote:
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > Sent: Friday, July 11, 2014 12:42 PM
> > 
> > On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 10, 2014 at 09:08:24PM +0000, Tian, Kevin wrote:
> > > > actually I'm curious whether it's still necessary to __detect__ PCH. Could
> > > > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > > > code the knowledge:
> > > >
> > > >               } else if (IS_BROADWELL(dev)) {
> > > >                       dev_priv->pch_type = PCH_LPT;
> > > >                       dev_priv->pch_id =
> > > >
> > INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> > > >                       DRM_DEBUG_KMS("This is Broadwell,
> > assuming "
> > > >                                     "LynxPoint LP PCH\n");
> > > >
> > > > Or if there is real usage on non-fixed mapping (not majority), could it be a
> > > > better option to have fixed mapping as a fallback instead of leaving as
> > > > PCH_NONE? Then even when Qemu doesn't provide a special tweaked
> > PCH,
> > > > the majority case just works.
> > >
> > > I guess we can do it, at least I haven't seen any strange combinations in
> > > the wild outside of Intel ...
> > 
> > How big is the QA matrix for this? Would it make sense to just
> > include the latest hardware (say going two generations back)
> > and ignore the older one?
> 
> suppose minimal or no QA effort on bare metal, if we only conservatively 
> change the fallback path which is today not supposed to function with 
> PCH_NONE. so it's only same amount of QA effort as whatever else is 
> proposed in this passthru upstreaming task. I agree no need to cover 
> older model, possibly just snb, ivb and hsw, but will leave Tiejun to answer 
> the overall goal.

Yeah, I'd be ok with the approach of using defaults if we can't recognize
the pch - if anyone screams we can either quirk or figure something else
out.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 651e65e..cb2526e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -417,18 +417,8 @@  void intel_detect_pch(struct drm_device *dev)
 		return;
 	}
 
-	/*
-	 * The reason to probe ISA bridge instead of Dev31:Fun0 is to
-	 * make graphics device passthrough work easy for VMM, that only
-	 * need to expose ISA bridge to let driver know the real hardware
-	 * underneath. This is a requirement from virtualization team.
-	 *
-	 * In some virtualized environments (e.g. XEN), there is irrelevant
-	 * ISA bridge in the system. To work reliably, we should scan trhough
-	 * all the ISA bridge devices and check for the first match, instead
-	 * of only checking the first one.
-	 */
-	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
+	pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
+	if (pch) {
 		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
 			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
 			dev_priv->pch_id = id;
@@ -462,10 +452,7 @@  void intel_detect_pch(struct drm_device *dev)
 				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
 				WARN_ON(!IS_HASWELL(dev));
 				WARN_ON(!IS_ULT(dev));
-			} else
-				continue;
-
-			break;
+			}
 		}
 	}
 	if (!pch)