vfio-pci: Mask cap zero

Message ID 158836927527.9272.16785800801999547009.stgit@gimli.home (mailing list archive)
State New, archived

Commit Message

Alex Williamson May 1, 2020, 9:41 p.m. UTC
There is no PCI spec defined capability with ID 0, therefore we don't
expect to find it in a capability chain and we use this index in an
internal array for tracking the sizes of various capabilities to handle
standard config space.  Therefore if a device does present us with a
capability ID 0, we mark our capability map with nonsense that can
trigger conflicts with other capabilities in the chain.  Ignore ID 0
when walking the capability chain, handling it as a hidden capability.

Seen on an NVIDIA Tesla T4.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci_config.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
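
The conflict described in the commit message can be sketched outside the kernel. This is a minimal illustration, not the vfio code: `cap_len`, `claim_cap`, `MAP_FREE`, and the listed lengths are hypothetical, and the sketch assumes slot 0 of the length table holds the 64-byte standard header size. With that assumption, a device-supplied capability ID 0 claims 64 bytes of the map and collides with its neighbours unless it is masked.

```c
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 256   /* conventional PCI config space */
#define MAP_FREE 0xFF  /* hypothetical marker for unclaimed bytes */

/* Hypothetical length table; slot 0 is reused for the standard header. */
static const uint8_t cap_len[] = { [0x00] = 64, [0x01] = 8, [0x10] = 60 };

/*
 * Claim the bytes of one capability in the map.
 * Returns 0 on success or when the capability is hidden (length 0),
 * -1 when the range overlaps bytes that are already claimed.
 */
int claim_cap(uint8_t map[CFG_SIZE], uint8_t id, uint8_t pos, int mask_zero)
{
    int len = 0;

    /* With mask_zero set, ID 0 never reaches the table lookup. */
    if ((!mask_zero || id != 0) && id < sizeof(cap_len))
        len = cap_len[id];
    if (len == 0)
        return 0;               /* treated as a hidden capability */
    if (pos + len > CFG_SIZE)
        return -1;
    for (int i = 0; i < len; i++)
        if (map[pos + i] != MAP_FREE)
            return -1;          /* conflict with another capability */
    memset(map + pos, id, (size_t)len);
    return 0;
}
```

Starting from a map filled with `MAP_FREE`, claiming ID 0x01 at 0x40 and ID 0x10 at 0x60 succeeds; an ID 0 entry at 0x48 then conflicts when `mask_zero` is 0 and is skipped cleanly when it is 1.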

Comments

Cornelia Huck May 4, 2020, 4:09 p.m. UTC | #1
On Fri, 01 May 2020 15:41:24 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> There is no PCI spec defined capability with ID 0, therefore we don't
> expect to find it in a capability chain and we use this index in an
> internal array for tracking the sizes of various capabilities to handle
> standard config space.  Therefore if a device does present us with a
> capability ID 0, we mark our capability map with nonsense that can
> trigger conflicts with other capabilities in the chain.  Ignore ID 0
> when walking the capability chain, handling it as a hidden capability.
> 
> Seen on an NVIDIA Tesla T4.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/vfio/pci/vfio_pci_config.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> index 87d0cc8c86ad..5935a804cb88 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
>  		if (ret)
>  			return ret;
>  
> -		if (cap <= PCI_CAP_ID_MAX) {

Maybe add a comment:

/* no PCI spec defined capability with ID 0: hide it */

?

> +		if (cap && cap <= PCI_CAP_ID_MAX) {
>  			len = pci_cap_length[cap];
>  			if (len == 0xFF) { /* Variable length */
>  				len = vfio_cap_len(vdev, cap, pos);
> 

Is there a requirement for caps to be strictly ordered? If not, could
len hold a residual value from a previous iteration?
Alex Williamson May 4, 2020, 6:52 p.m. UTC | #2
On Mon, 4 May 2020 18:09:16 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 01 May 2020 15:41:24 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > There is no PCI spec defined capability with ID 0, therefore we don't
> > expect to find it in a capability chain and we use this index in an
> > internal array for tracking the sizes of various capabilities to handle
> > standard config space.  Therefore if a device does present us with a
> > capability ID 0, we mark our capability map with nonsense that can
> > trigger conflicts with other capabilities in the chain.  Ignore ID 0
> > when walking the capability chain, handling it as a hidden capability.
> > 
> > Seen on an NVIDIA Tesla T4.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/vfio/pci/vfio_pci_config.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > index 87d0cc8c86ad..5935a804cb88 100644
> > --- a/drivers/vfio/pci/vfio_pci_config.c
> > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> >  		if (ret)
> >  			return ret;
> >  
> > -		if (cap <= PCI_CAP_ID_MAX) {  
> 
> Maybe add a comment:
> 
> /* no PCI spec defined capability with ID 0: hide it */
> 

Sure.

> 
> > +		if (cap && cap <= PCI_CAP_ID_MAX) {
> >  			len = pci_cap_length[cap];
> >  			if (len == 0xFF) { /* Variable length */
> >  				len = vfio_cap_len(vdev, cap, pos);
> >   
> 
> Is there a requirement for caps to be strictly ordered? If not, could
> len hold a residual value from a previous iteration?

There is no ordering requirement for capabilities, but len is declared
non-static with an initial value within the scope of the loop, so it's
reset every iteration.  Thanks,

Alex
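
The scoping point can be checked in isolation. The following is a standalone sketch rather than the kernel loop: `lookup_len` and `lens_per_iteration` are hypothetical names, and the lengths are illustrative. It shows that an automatic variable declared with an initializer inside the loop body starts from that initializer on every pass, so a skipped ID cannot inherit the previous length.

```c
#include <stddef.h>

/* Hypothetical fixed-size lookup standing in for a capability length table. */
static int lookup_len(unsigned char id)
{
    switch (id) {
    case 0x01: return 8;
    case 0x10: return 60;
    default:   return 0;
    }
}

/* Store the value of len at the end of each pass into out[i]. */
void lens_per_iteration(const unsigned char *ids, size_t n, int *out)
{
    for (size_t i = 0; i < n; i++) {
        int len = 0;    /* automatic storage: initialized afresh on every pass */

        if (ids[i] != 0)        /* ID 0 is skipped, so len stays 0 */
            len = lookup_len(ids[i]);
        out[i] = len;
    }
}
```

Feeding the IDs 0x10, 0x00, 0x01 in that order yields 60, 0, 8: the middle entry does not carry over the 60 from the pass before it.
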
Neo Jia May 4, 2020, 10:08 p.m. UTC | #3
On Mon, May 04, 2020 at 12:52:53PM -0600, Alex Williamson wrote:
> On Mon, 4 May 2020 18:09:16 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri, 01 May 2020 15:41:24 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >
> > > There is no PCI spec defined capability with ID 0, therefore we don't
> > > expect to find it in a capability chain and we use this index in an
> > > internal array for tracking the sizes of various capabilities to handle
> > > standard config space.  Therefore if a device does present us with a
> > > capability ID 0, we mark our capability map with nonsense that can
> > > trigger conflicts with other capabilities in the chain.  Ignore ID 0
> > > when walking the capability chain, handling it as a hidden capability.
> > >
> > > Seen on an NVIDIA Tesla T4.
> > >
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > ---
> > >  drivers/vfio/pci/vfio_pci_config.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > > index 87d0cc8c86ad..5935a804cb88 100644
> > > --- a/drivers/vfio/pci/vfio_pci_config.c
> > > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > > @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> > >             if (ret)
> > >                     return ret;
> > >
> > > -           if (cap <= PCI_CAP_ID_MAX) {
> >
> > Maybe add a comment:
> >
> > /* no PCI spec defined capability with ID 0: hide it */

Hi Alex,

I think this is the NULL Capability defined in the Codes and IDs spec; perhaps we
should just add a new enum to represent that?

Thanks,
Neo

> >
> 
> Sure.
> 
> >
> > > +           if (cap && cap <= PCI_CAP_ID_MAX) {
> > >                     len = pci_cap_length[cap];
> > >                     if (len == 0xFF) { /* Variable length */
> > >                             len = vfio_cap_len(vdev, cap, pos);
> > >
> >
> > Is there a requirement for caps to be strictly ordered? If not, could
> > len hold a residual value from a previous iteration?
> 
> There is no ordering requirement for capabilities, but len is declared
> non-static with an initial value within the scope of the loop, so it's
> reset every iteration.  Thanks,
> 
> Alex
>
Alex Williamson May 4, 2020, 11:03 p.m. UTC | #4
On Mon, 4 May 2020 15:08:08 -0700
Neo Jia <cjia@nvidia.com> wrote:

> On Mon, May 04, 2020 at 12:52:53PM -0600, Alex Williamson wrote:
> > On Mon, 4 May 2020 18:09:16 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >   
> > > On Fri, 01 May 2020 15:41:24 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > >  
> > > > There is no PCI spec defined capability with ID 0, therefore we don't
> > > > expect to find it in a capability chain and we use this index in an
> > > > internal array for tracking the sizes of various capabilities to handle
> > > > standard config space.  Therefore if a device does present us with a
> > > > capability ID 0, we mark our capability map with nonsense that can
> > > > trigger conflicts with other capabilities in the chain.  Ignore ID 0
> > > > when walking the capability chain, handling it as a hidden capability.
> > > >
> > > > Seen on an NVIDIA Tesla T4.
> > > >
> > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > > ---
> > > >  drivers/vfio/pci/vfio_pci_config.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > > > index 87d0cc8c86ad..5935a804cb88 100644
> > > > --- a/drivers/vfio/pci/vfio_pci_config.c
> > > > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > > > @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> > > >             if (ret)
> > > >                     return ret;
> > > >
> > > > -           if (cap <= PCI_CAP_ID_MAX) {  
> > >
> > > Maybe add a comment:
> > >
> > > /* no PCI spec defined capability with ID 0: hide it */  
> 
> Hi Alex,
> 
> I think this is the NULL Capability defined in the Codes and IDs spec; perhaps we
> should just add a new enum to represent that?

Yes, it looks like the 1.1 version of that specification from June 2015
changed ID 0 from reserved to a NULL capability.  So my description and
this comment are wrong, but I wonder if we should do anything
different with the handling of this capability.  It's specified to
contain only the ID and next pointer, so I'd expect it's primarily a
mechanism for hardware vendors to blow fuses in config space to
maintain a capability chain while maybe hiding a feature not supported
by the product sku.  Hiding the capability in vfio is trivial, exposing
it implies some changes to our config space map that might be more
subtle.  I'm inclined to stick with this solution for now.  Thanks,

Alex
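
As a rough illustration of hiding such entries, the sketch below walks a capability chain in a raw 256-byte config space image and reports only non-zero IDs. It is an assumption-laden model, not the vfio implementation: `collect_visible_caps` and the `CAP_*` constants are hypothetical names, the walk limit is an arbitrary guard against looped chains, and the check of the status register's capabilities-list bit is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_SIZE       256   /* conventional PCI config space */
#define CAP_LIST_PTR   0x34  /* offset of the capabilities pointer */
#define CAP_FIRST_POS  0x40  /* capabilities live above the standard header */
#define CAP_WALK_LIMIT 48    /* arbitrary guard against looped chains */

/*
 * Follow the chain and copy every non-NULL capability ID into ids[].
 * Each entry starts with two bytes: the ID, then the next pointer.
 * Returns the number of IDs stored.
 */
size_t collect_visible_caps(const uint8_t cfg[CFG_SIZE],
                            uint8_t *ids, size_t max_ids)
{
    size_t n = 0;
    int ttl = CAP_WALK_LIMIT;
    /* The low two bits of capability pointers are reserved. */
    uint8_t pos = cfg[CAP_LIST_PTR] & 0xFC;

    while (pos >= CAP_FIRST_POS && ttl-- > 0) {
        uint8_t id = cfg[pos];
        uint8_t next = cfg[pos + 1] & 0xFC;

        /* ID 0 only keeps the chain linked; skip it but follow next. */
        if (id != 0 && n < max_ids)
            ids[n++] = id;
        pos = next;
    }
    return n;
}
```

For a chain 0x40 (ID 0x01) -> 0x48 (ID 0x00) -> 0x50 (ID 0x10), the function stores 0x01 and 0x10 and returns 2; the ID 0 entry is traversed but never reported.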

> > 
> > Sure.
> >   
> > >  
> > > > +           if (cap && cap <= PCI_CAP_ID_MAX) {
> > > >                     len = pci_cap_length[cap];
> > > >                     if (len == 0xFF) { /* Variable length */
> > > >                             len = vfio_cap_len(vdev, cap, pos);
> > > >  
> > >
> > > Is there a requirement for caps to be strictly ordered? If not, could
> > > len hold a residual value from a previous iteration?  
> > 
> > There is no ordering requirement for capabilities, but len is declared
> > non-static with an initial value within the scope of the loop, so it's
> > reset every iteration.  Thanks,
> > 
> > Alex
> >   
>
Cornelia Huck May 5, 2020, 6:09 a.m. UTC | #5
On Mon, 4 May 2020 17:03:54 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Mon, 4 May 2020 15:08:08 -0700
> Neo Jia <cjia@nvidia.com> wrote:
> 
> > On Mon, May 04, 2020 at 12:52:53PM -0600, Alex Williamson wrote:  
> > > On Mon, 4 May 2020 18:09:16 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >     
> > > > On Fri, 01 May 2020 15:41:24 -0600
> > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > >    
> > > > > There is no PCI spec defined capability with ID 0, therefore we don't
> > > > > expect to find it in a capability chain and we use this index in an
> > > > > internal array for tracking the sizes of various capabilities to handle
> > > > > standard config space.  Therefore if a device does present us with a
> > > > > capability ID 0, we mark our capability map with nonsense that can
> > > > > trigger conflicts with other capabilities in the chain.  Ignore ID 0
> > > > > when walking the capability chain, handling it as a hidden capability.
> > > > >
> > > > > Seen on an NVIDIA Tesla T4.
> > > > >
> > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > > > ---
> > > > >  drivers/vfio/pci/vfio_pci_config.c |    2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > > > > index 87d0cc8c86ad..5935a804cb88 100644
> > > > > --- a/drivers/vfio/pci/vfio_pci_config.c
> > > > > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > > > > @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> > > > >             if (ret)
> > > > >                     return ret;
> > > > >
> > > > > -           if (cap <= PCI_CAP_ID_MAX) {    
> > > >
> > > > Maybe add a comment:
> > > >
> > > > /* no PCI spec defined capability with ID 0: hide it */    
> > 
> > Hi Alex,
> > 
> > I think this is the NULL Capability defined in the Codes and IDs spec; perhaps we
> > should just add a new enum to represent that?
> 
> Yes, it looks like the 1.1 version of that specification from June 2015
> changed ID 0 from reserved to a NULL capability.  So my description and
> this comment are wrong, but I wonder if we should do anything
> different with the handling of this capability.  It's specified to
> contain only the ID and next pointer, so I'd expect it's primarily a
> mechanism for hardware vendors to blow fuses in config space to
> maintain a capability chain while maybe hiding a feature not supported
> by the product sku.  Hiding the capability in vfio is trivial, exposing
> it implies some changes to our config space map that might be more
> subtle.  I'm inclined to stick with this solution for now.  Thanks,
> 
> Alex

From this description, I also think that we should simply hide these
NULL capabilities.
Neo Jia May 5, 2020, 9:58 p.m. UTC | #6
On Tue, May 05, 2020 at 08:09:39AM +0200, Cornelia Huck wrote:
> On Mon, 4 May 2020 17:03:54 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Mon, 4 May 2020 15:08:08 -0700
> > Neo Jia <cjia@nvidia.com> wrote:
> >
> > > On Mon, May 04, 2020 at 12:52:53PM -0600, Alex Williamson wrote:
> > > > On Mon, 4 May 2020 18:09:16 +0200
> > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > >
> > > > > On Fri, 01 May 2020 15:41:24 -0600
> > > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > >
> > > > > > There is no PCI spec defined capability with ID 0, therefore we don't
> > > > > > expect to find it in a capability chain and we use this index in an
> > > > > > internal array for tracking the sizes of various capabilities to handle
> > > > > > standard config space.  Therefore if a device does present us with a
> > > > > > capability ID 0, we mark our capability map with nonsense that can
> > > > > > trigger conflicts with other capabilities in the chain.  Ignore ID 0
> > > > > > when walking the capability chain, handling it as a hidden capability.
> > > > > >
> > > > > > Seen on an NVIDIA Tesla T4.
> > > > > >
> > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > > > > ---
> > > > > >  drivers/vfio/pci/vfio_pci_config.c |    2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> > > > > > index 87d0cc8c86ad..5935a804cb88 100644
> > > > > > --- a/drivers/vfio/pci/vfio_pci_config.c
> > > > > > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > > > > > @@ -1487,7 +1487,7 @@ static int vfio_cap_init(struct vfio_pci_device *vdev)
> > > > > >             if (ret)
> > > > > >                     return ret;
> > > > > >
> > > > > > -           if (cap <= PCI_CAP_ID_MAX) {
> > > > >
> > > > > Maybe add a comment:
> > > > >
> > > > > /* no PCI spec defined capability with ID 0: hide it */
> > >
> > > Hi Alex,
> > >
> > > > I think this is the NULL Capability defined in the Codes and IDs spec; perhaps we
> > > > should just add a new enum to represent that?
> >
> > Yes, it looks like the 1.1 version of that specification from June 2015
> > changed ID 0 from reserved to a NULL capability.  So my description and
> > this comment are wrong, but I wonder if we should do anything
> > different with the handling of this capability.  It's specified to
> > contain only the ID and next pointer, so I'd expect it's primarily a
> > mechanism for hardware vendors to blow fuses in config space to
> > maintain a capability chain while maybe hiding a feature not supported
> > by the product sku.  Hiding the capability in vfio is trivial, exposing
> > it implies some changes to our config space map that might be more
> > subtle.  I'm inclined to stick with this solution for now.  Thanks,
> >
> > Alex
> 
> From this description, I also think that we should simply hide these
> NULL capabilities.

I don't have a strong preference either way; the current implementation looks
fine.

Thanks,
Neo

>
Patch

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 87d0cc8c86ad..5935a804cb88 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -1487,7 +1487,7 @@  static int vfio_cap_init(struct vfio_pci_device *vdev)
 		if (ret)
 			return ret;
 
-		if (cap <= PCI_CAP_ID_MAX) {
+		if (cap && cap <= PCI_CAP_ID_MAX) {
 			len = pci_cap_length[cap];
 			if (len == 0xFF) { /* Variable length */
 				len = vfio_cap_len(vdev, cap, pos);