diff mbox series

[1/4] s390x/pci: use a reserved ID for the default PCI group

Message ID 20211202164110.326947-2-mjrosato@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series s390x/pci: some small fixes | expand

Commit Message

Matthew Rosato Dec. 2, 2021, 4:41 p.m. UTC
The current default PCI group being used can technically collide with a
real group ID passed from a hostdev.  Let's instead use a group ID that comes
from a special pool that is architected to be reserved for simulated devices.

Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/hw/s390x/s390-pci-bus.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Hildenbrand Dec. 2, 2021, 4:43 p.m. UTC | #1
On 02.12.21 17:41, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with a
> real group ID passed from a hostdev.  Let's instead use a group ID that comes
> from a special pool that is architected to be reserved for simulated devices.
> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  include/hw/s390x/s390-pci-bus.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>  } ZpciFmb;
>  QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>  
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>  typedef struct S390PCIGroup {
>      ClpRspQueryPciGrp zpci_group;
>      int id;
> 

What happens if we migrate a VM from old to new QEMU? Won't the guest be
able to observe the change?
Matthew Rosato Dec. 2, 2021, 5:11 p.m. UTC | #2
On 12/2/21 11:43 AM, David Hildenbrand wrote:
> On 02.12.21 17:41, Matthew Rosato wrote:
>> The current default PCI group being used can technically collide with a
>> real group ID passed from a hostdev.  Let's instead use a group ID that comes
>> from a special pool that is architected to be reserved for simulated devices.
>>
>> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   include/hw/s390x/s390-pci-bus.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>> index aa891c178d..2727e7bdef 100644
>> --- a/include/hw/s390x/s390-pci-bus.h
>> +++ b/include/hw/s390x/s390-pci-bus.h
>> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>>   } ZpciFmb;
>>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>>   
>> -#define ZPCI_DEFAULT_FN_GRP 0x20
>> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>>   typedef struct S390PCIGroup {
>>       ClpRspQueryPciGrp zpci_group;
>>       int id;
>>
> 
> What happens if we migrate a VM from old to new QEMU? Won't the guest be
> able to observe the change?
> 

Yes, technically --  But # itself is not really all that important, it 
is provided from CLP Q PCI FN to be subsequently used as input into Q 
PCI FNGRP -- With the fundamental notion being that all functions that 
share the same group # share the same group CLP info.  Whether the 
number is, say, 1 or 5 doesn't matter so much.

However..  0xF0 and greater are the only values reserved for hypervisor 
use.  By using 0x20 we run the risk of accidentally conflating simulated 
devices and real hardware, hence the desire to change it.

Is your concern about a migrated guest with a virtio device trying to do 
a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
simulated devices trying to use something other than the default group, 
e.g.:

if ((pbdev->fh & FH_SHM_EMUL) &&
     (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
         /* Simulated device MUST have default group */
	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
}

What do you think?
Eric Farman Dec. 2, 2021, 9:27 p.m. UTC | #3
On Thu, 2021-12-02 at 11:41 -0500, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with
> a
> real group ID passed from a hostdev.  Let's instead use a group ID
> that comes
> from a special pool that is architected to be reserved for simulated
> devices.
> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Regardless of the question regarding virtio migration, this is good.

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> ---
>  include/hw/s390x/s390-pci-bus.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-
> pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>  } ZpciFmb;
>  QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in
> ZpciFmb");
>  
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>  typedef struct S390PCIGroup {
>      ClpRspQueryPciGrp zpci_group;
>      int id;
David Hildenbrand Dec. 2, 2021, 9:55 p.m. UTC | #4
On 02.12.21 18:11, Matthew Rosato wrote:
> On 12/2/21 11:43 AM, David Hildenbrand wrote:
>> On 02.12.21 17:41, Matthew Rosato wrote:
>>> The current default PCI group being used can technically collide with a
>>> real group ID passed from a hostdev.  Let's instead use a group ID that comes
>>> from a special pool that is architected to be reserved for simulated devices.
>>>
>>> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   include/hw/s390x/s390-pci-bus.h | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>>> index aa891c178d..2727e7bdef 100644
>>> --- a/include/hw/s390x/s390-pci-bus.h
>>> +++ b/include/hw/s390x/s390-pci-bus.h
>>> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>>>   } ZpciFmb;
>>>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>>>   
>>> -#define ZPCI_DEFAULT_FN_GRP 0x20
>>> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>>>   typedef struct S390PCIGroup {
>>>       ClpRspQueryPciGrp zpci_group;
>>>       int id;
>>>
>>
>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>> able to observe the change?
>>
> 
> Yes, technically --  But # itself is not really all that important, it 
> is provided from CLP Q PCI FN to be subsequently used as input into Q 
> PCI FNGRP -- With the fundamental notion being that all functions that 
> share the same group # share the same group CLP info.  Whether the 
> number is, say, 1 or 5 doesn't matter so much.
> 
> However..  0xF0 and greater are the only values reserved for hypervisor 
> use.  By using 0x20 we run the risk of accidentally conflating simulated 
> devices and real hardware, hence the desire to change it.
> 
> Is your concern about a migrated guest with a virtio device trying to do 
> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
> simulated devices trying to use something other than the default group, 
> e.g.:
> 
> if ((pbdev->fh & FH_SHM_EMUL) &&
>      (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>          /* Simulated device MUST have default group */
> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
> }
> 
> What do you think?
> 

Good question, I'm certainly not a zPCI expert. However if you think
that we cannot really break migration in a sane use case, I'm fine with
it as well :)

The question about breaking migration just popped up in my head when I
stumbled over this patch.
Halil Pasic Dec. 2, 2021, 11:06 p.m. UTC | #5
On Thu, 2 Dec 2021 12:11:38 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> > 
> > What happens if we migrate a VM from old to new QEMU? Won't the guest be
> > able to observe the change?
> >   
> 
> Yes, technically --  But # itself is not really all that important, it 
> is provided from CLP Q PCI FN to be subsequently used as input into Q 
> PCI FNGRP -- With the fundamental notion being that all functions that 
> share the same group # share the same group CLP info.  Whether the 
> number is, say, 1 or 5 doesn't matter so much.
> 
> However..  0xF0 and greater are the only values reserved for hypervisor 
> use.  By using 0x20 we run the risk of accidentally conflating simulated 
> devices and real hardware, hence the desire to change it.
> 
> Is your concern about a migrated guest with a virtio device trying to do 
> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
> simulated devices trying to use something other than the default group, 
> e.g.:
> 
> if ((pbdev->fh & FH_SHM_EMUL) &&
>      (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>          /* Simulated device MUST have default group */
> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
> }
> 
> What do you think?

Another option, and in my opinion the cleaner one would be to tie this
change to a new machine version. That is if a post-change qemu is used
in compatibility mode, we would still have the old behavior.

What do you think?

Regards,
Halil
Matthew Rosato Dec. 3, 2021, 2:25 a.m. UTC | #6
On 12/2/21 6:06 PM, Halil Pasic wrote:
> On Thu, 2 Dec 2021 12:11:38 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>>>
>>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>>> able to observe the change?
>>>    
>>
>> Yes, technically --  But # itself is not really all that important, it
>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>> PCI FNGRP -- With the fundamental notion being that all functions that
>> share the same group # share the same group CLP info.  Whether the
>> number is, say, 1 or 5 doesn't matter so much.
>>
>> However..  0xF0 and greater are the only values reserved for hypervisor
>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>> devices and real hardware, hence the desire to change it.
>>
>> Is your concern about a migrated guest with a virtio device trying to do
>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>> simulated devices trying to use something other than the default group,
>> e.g.:
>>
>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>           /* Simulated device MUST have default group */
>> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>> }
>>
>> What do you think?
> 
> Another option, and in my opinion the cleaner one would be to tie this
> change to a new machine version. That is if a post-change qemu is used
> in compatibility mode, we would still have the old behavior.
> 
> What do you think?
> 

The problem there is that the old behavior goes against the architecture 
(group 0x20 could belong to real hardware) and AFAIU assigning this new 
behavior only to a new machine version means we can't fix old stable 
QEMU versions.

Also, wait a minute -- migration isn't even an option right now, it's 
blocked for zpci devices, both passthrough and simulated (see 
aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
let's just move to a proper default group now before we potentially 
allow migration later.
Pierre Morel Dec. 3, 2021, 9:07 a.m. UTC | #7
On 12/3/21 03:25, Matthew Rosato wrote:
> On 12/2/21 6:06 PM, Halil Pasic wrote:
>> On Thu, 2 Dec 2021 12:11:38 -0500
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>>>
>>>> What happens if we migrate a VM from old to new QEMU? Won't the 
>>>> guest be
>>>> able to observe the change?
>>>
>>> Yes, technically --  But # itself is not really all that important, it
>>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>>> PCI FNGRP -- With the fundamental notion being that all functions that
>>> share the same group # share the same group CLP info.  Whether the
>>> number is, say, 1 or 5 doesn't matter so much.
>>>
>>> However..  0xF0 and greater are the only values reserved for hypervisor
>>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>>> devices and real hardware, hence the desire to change it.
>>>
>>> Is your concern about a migrated guest with a virtio device trying to do
>>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>>> simulated devices trying to use something other than the default group,
>>> e.g.:
>>>
>>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>>           /* Simulated device MUST have default group */
>>>     pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>>>     group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>>> }
>>>
>>> What do you think?
>>
>> Another option, and in my opinion the cleaner one would be to tie this
>> change to a new machine version. That is if a post-change qemu is used
>> in compatibility mode, we would still have the old behavior.
>>
>> What do you think?
>>
> 
> The problem there is that the old behavior goes against the architecture 
> (group 0x20 could belong to real hardware) and AFAIU assigning this new 
> behavior only to a new machine version means we can't fix old stable 
> QEMU versions.
> 
> Also, wait a minute -- migration isn't even an option right now, it's 
> blocked for zpci devices, both passthrough and simulated (see 
> aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
> let's just move to a proper default group now before we potentially 
> allow migration later.

Looks right to me.
Pierre Morel Dec. 3, 2021, 9:24 a.m. UTC | #8
On 12/2/21 17:41, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with a
> real group ID passed from a hostdev.  Let's instead use a group ID that comes
> from a special pool that is architected to be reserved for simulated devices.

NIT: May be add that PCIFG between 0xF0 and 0xFF is specified for this 
reserved pool.


> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   include/hw/s390x/s390-pci-bus.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>   } ZpciFmb;
>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>   
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>   typedef struct S390PCIGroup {
>       ClpRspQueryPciGrp zpci_group;
>       int id;
> 

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
David Hildenbrand Dec. 3, 2021, 6:21 p.m. UTC | #9
On 03.12.21 03:25, Matthew Rosato wrote:
> On 12/2/21 6:06 PM, Halil Pasic wrote:
>> On Thu, 2 Dec 2021 12:11:38 -0500
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>>>
>>>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>>>> able to observe the change?
>>>>    
>>>
>>> Yes, technically --  But # itself is not really all that important, it
>>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>>> PCI FNGRP -- With the fundamental notion being that all functions that
>>> share the same group # share the same group CLP info.  Whether the
>>> number is, say, 1 or 5 doesn't matter so much.
>>>
>>> However..  0xF0 and greater are the only values reserved for hypervisor
>>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>>> devices and real hardware, hence the desire to change it.
>>>
>>> Is your concern about a migrated guest with a virtio device trying to do
>>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>>> simulated devices trying to use something other than the default group,
>>> e.g.:
>>>
>>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>>           /* Simulated device MUST have default group */
>>> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>>> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>>> }
>>>
>>> What do you think?
>>
>> Another option, and in my opinion the cleaner one would be to tie this
>> change to a new machine version. That is if a post-change qemu is used
>> in compatibility mode, we would still have the old behavior.
>>
>> What do you think?
>>
> 
> The problem there is that the old behavior goes against the architecture 
> (group 0x20 could belong to real hardware) and AFAIU assigning this new 
> behavior only to a new machine version means we can't fix old stable 
> QEMU versions.
> 
> Also, wait a minute -- migration isn't even an option right now, it's 
> blocked for zpci devices, both passthrough and simulated (see 
> aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
> let's just move to a proper default group now before we potentially 
> allow migration later.

Perfect, thanks for confirming!
diff mbox series

Patch

diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index aa891c178d..2727e7bdef 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -313,7 +313,7 @@  typedef struct ZpciFmb {
 } ZpciFmb;
 QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
 
-#define ZPCI_DEFAULT_FN_GRP 0x20
+#define ZPCI_DEFAULT_FN_GRP 0xFF
 typedef struct S390PCIGroup {
     ClpRspQueryPciGrp zpci_group;
     int id;