[net-next,v2,0/2] sfc: Add EF100 BAR config support

Message ID: 165719918216.28149.7678451615870416505.stgit@palantir17.mph.net
Message

Martin Habets July 7, 2022, 1:07 p.m. UTC
The EF100 NICs allow for different register layouts of a PCI memory BAR.
This series provides the framework to switch this layout at runtime.

Subsequent patch series will use this to add support for vDPA.
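As the thread below explains, the switch is driven by writing a profile name to a per-function `bar_config` sysfs file. A minimal userspace-C sketch of the name-to-layout parsing such an interface implies (the profile names and enum are assumptions for illustration, not the driver's actual strings):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical BAR-layout profiles; names are illustrative only. */
enum ef100_bar_config {
	EF100_BAR_CONFIG_INVALID = -1,
	EF100_BAR_CONFIG_EF100,   /* default network-device layout */
	EF100_BAR_CONFIG_VDPA,    /* virtio/vDPA register layout */
};

/* Map a user-written string to a profile, as a bar_config store
 * handler would before instructing the NIC to switch layouts. */
enum ef100_bar_config parse_bar_config(const char *buf)
{
	if (!strcmp(buf, "ef100"))
		return EF100_BAR_CONFIG_EF100;
	if (!strcmp(buf, "vdpa"))
		return EF100_BAR_CONFIG_VDPA;
	return EF100_BAR_CONFIG_INVALID;  /* reject unknown layouts */
}
```

Unknown strings are rejected rather than guessed at, so a mistyped profile name cannot switch the device into an unintended layout.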

v2: Include PCI and virtio maintainers.
---

Martin Habets (2):
      sfc: Add EF100 BAR config support
      sfc: Implement change of BAR configuration


 drivers/net/ethernet/sfc/ef100_nic.c |   80 ++++++++++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/ef100_nic.h |    6 +++
 2 files changed, 86 insertions(+)

--
Martin Habets <habetsm.xilinx@gmail.com>

Comments

Bjorn Helgaas July 7, 2022, 3:55 p.m. UTC | #1
On Thu, Jul 07, 2022 at 02:07:07PM +0100, Martin Habets wrote:
> The EF100 NICs allow for different register layouts of a PCI memory BAR.
> This series provides the framework to switch this layout at runtime.
> 
> Subsequent patch series will use this to add support for vDPA.

Normally drivers rely on the PCI Vendor and Device ID to learn the
number of BARs and their layouts.  I guess this series implies that
doesn't work on this device?  And the user needs to manually specify
what kind of device this is?

I'm confused about how this is supposed to work.  What if the driver
is built-in and claims a device before the user can specify the
register layout?  What if the user specifies the wrong layout and the
driver writes to the wrong registers?

> ---
> 
> Martin Habets (2):
>       sfc: Add EF100 BAR config support
>       sfc: Implement change of BAR configuration
> 
> 
>  drivers/net/ethernet/sfc/ef100_nic.c |   80 ++++++++++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/ef100_nic.h |    6 +++
>  2 files changed, 86 insertions(+)
> 
> --
> Martin Habets <habetsm.xilinx@gmail.com>
> 
>
Martin Habets July 11, 2022, 1:38 p.m. UTC | #2
On Thu, Jul 07, 2022 at 10:55:00AM -0500, Bjorn Helgaas wrote:
> On Thu, Jul 07, 2022 at 02:07:07PM +0100, Martin Habets wrote:
> > The EF100 NICs allow for different register layouts of a PCI memory BAR.
> > This series provides the framework to switch this layout at runtime.
> > 
> > Subsequent patch series will use this to add support for vDPA.
> 
> Normally drivers rely on the PCI Vendor and Device ID to learn the
> number of BARs and their layouts.  I guess this series implies that
> doesn't work on this device?  And the user needs to manually specify
> what kind of device this is?

When a new PCI device is added (like a VF) it always starts off with
the register layout for an EF100 network device. This is hardcoded,
i.e. it cannot be customised.
The layout can be changed after bootup, and only after the sfc driver has
bound to the device.
The PCI Vendor and Device ID do not change when the layout is changed.

For vDPA specifically we return the Xilinx PCI Vendor and our device ID
to the vDPA framework via struct vdpa_config_ops.
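The identity reported to the vDPA framework comes from callbacks the parent driver fills in. A userspace mock of that callback pattern (the kernel's struct vdpa_config_ops provides get_vendor_id/get_device_id; the device ID below is a placeholder, not the real one):

```c
#include <stdint.h>

#define PCI_VENDOR_ID_XILINX 0x10ee  /* Xilinx's PCI vendor ID */
#define EF100_VDPA_DEVICE_ID 0x0100  /* placeholder device ID */

/* Userspace mock of the two identity callbacks a vDPA parent
 * driver supplies via struct vdpa_config_ops in the kernel. */
struct mock_vdpa_config_ops {
	uint32_t (*get_vendor_id)(void);
	uint32_t (*get_device_id)(void);
};

uint32_t ef100_vdpa_get_vendor_id(void)
{
	return PCI_VENDOR_ID_XILINX;
}

uint32_t ef100_vdpa_get_device_id(void)
{
	return EF100_VDPA_DEVICE_ID;
}

const struct mock_vdpa_config_ops ef100_vdpa_ops = {
	.get_vendor_id = ef100_vdpa_get_vendor_id,
	.get_device_id = ef100_vdpa_get_device_id,
};
```

This is why the PCI-level Vendor and Device ID can stay fixed while the layout changes: the vDPA framework asks the driver, not PCI config space.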

> I'm confused about how this is supposed to work.  What if the driver
> is built-in and claims a device before the user can specify the
> register layout?

The bar_config file will only exist once the sfc driver has bound to
the device. So in fact we count on that driver getting loaded.
When a new value is written to bar_config it is the sfc driver that
instructs the NIC to change the register layout.
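The store path described above can be sketched as a small state transition (userspace C, hypothetical names): writing the current value is a no-op, while a new value makes the driver instruct the NIC and re-probe the function for the new layout.

```c
#include <stdbool.h>

enum bar_config { BAR_CONFIG_EF100, BAR_CONFIG_VDPA };

struct mock_nic {
	enum bar_config cur;
	int reprobes;  /* counts tear-down/re-probe cycles */
};

/* Sketch of what a bar_config write triggers: no-op when the
 * layout is unchanged, otherwise switch and re-probe the function. */
bool bar_config_store(struct mock_nic *nic, enum bar_config want)
{
	if (want == nic->cur)
		return false;  /* nothing to do */
	nic->cur = want;       /* NIC switches its register layout */
	nic->reprobes++;       /* driver re-binds for the new layout */
	return true;
}
```

The real driver additionally has to tear down the old function (netdev, queues) before the switch, which this sketch collapses into the `reprobes` counter.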

> What if the user specifies the wrong layout and the
> driver writes to the wrong registers?

We have specific hardware and driver requirements for this sort of
situation. For example, the register layouts must have some common
registers (to ensure some compatibility).
A layout that is too different will require a separate device ID.
A driver that writes to the wrong register is a bug.

Maybe the name "bar_config" is causing most of the confusion here.
Internally we also talk about "function profiles" or "personalities",
but we thought such a name would be too vague.

Martin

> > ---
> > 
> > Martin Habets (2):
> >       sfc: Add EF100 BAR config support
> >       sfc: Implement change of BAR configuration
> > 
> > 
> >  drivers/net/ethernet/sfc/ef100_nic.c |   80 ++++++++++++++++++++++++++++++++++
> >  drivers/net/ethernet/sfc/ef100_nic.h |    6 +++
> >  2 files changed, 86 insertions(+)
> > 
> > --
> > Martin Habets <habetsm.xilinx@gmail.com>
Jakub Kicinski July 11, 2022, 6:48 p.m. UTC | #3
On Mon, 11 Jul 2022 14:38:54 +0100 Martin Habets wrote:
> > Normally drivers rely on the PCI Vendor and Device ID to learn the
> > number of BARs and their layouts.  I guess this series implies that
> > doesn't work on this device?  And the user needs to manually specify
> > what kind of device this is?  
> 
> When a new PCI device is added (like a VF) it always starts off with
> the register layout for an EF100 network device. This is hardcoded,
> i.e. it cannot be customised.
> The layout can be changed after bootup, and only after the sfc driver has
> bound to the device.
> The PCI Vendor and Device ID do not change when the layout is changed.
> 
> For vDPA specifically we return the Xilinx PCI Vendor and our device ID
> to the vDPA framework via struct vdpa_config_ops.

So it's switching between ethernet and vdpa? Isn't there a general
problem for configuring vdpa capabilities (net vs storage etc) and
shouldn't we seek to solve your BAR format switch in a similar fashion
rather than adding PCI device attrs, which I believe is not done for
anything vDPA-related?

> > I'm confused about how this is supposed to work.  What if the driver
> > is built-in and claims a device before the user can specify the
> > register layout?  
> 
> The bar_config file will only exist once the sfc driver has bound to
> the device. So in fact we count on that driver getting loaded.
> When a new value is written to bar_config it is the sfc driver that
> instructs the NIC to change the register layout.

When you say "driver bound" you mean the VF driver, right?

> > What if the user specifies the wrong layout and the
> > driver writes to the wrong registers?  
> 
> We have specific hardware and driver requirements for this sort of
> situation. For example, the register layouts must have some common
> registers (to ensure some compatibility).
> A layout that is too different will require a separate device ID.
> A driver that writes to the wrong register is a bug.
> 
> Maybe the name "bar_config" is causing most of the confusion here.
> Internally we also talk about "function profiles" or "personalities",
> but we thought such a name would be too vague.
Bjorn Helgaas July 11, 2022, 10 p.m. UTC | #4
On Mon, Jul 11, 2022 at 02:38:54PM +0100, Martin Habets wrote:
> On Thu, Jul 07, 2022 at 10:55:00AM -0500, Bjorn Helgaas wrote:
> > On Thu, Jul 07, 2022 at 02:07:07PM +0100, Martin Habets wrote:
> > > The EF100 NICs allow for different register layouts of a PCI memory BAR.
> > > This series provides the framework to switch this layout at runtime.
> > > 
> > > Subsequent patch series will use this to add support for vDPA.
> > 
> > Normally drivers rely on the PCI Vendor and Device ID to learn the
> > number of BARs and their layouts.  I guess this series implies that
> > doesn't work on this device?  And the user needs to manually specify
> > what kind of device this is?
> 
> When a new PCI device is added (like a VF) it always starts off with
> the register layout for an EF100 network device. This is hardcoded,
> i.e. it cannot be customised.
> The layout can be changed after bootup, and only after the sfc driver has
> bound to the device.
> The PCI Vendor and Device ID do not change when the layout is changed.
> 
> For vDPA specifically we return the Xilinx PCI Vendor and our device ID
> to the vDPA framework via struct vdpa_config_ops.
> 
> > I'm confused about how this is supposed to work.  What if the driver
> > is built-in and claims a device before the user can specify the
> > register layout?
> 
> The bar_config file will only exist once the sfc driver has bound to
> the device. So in fact we count on that driver getting loaded.
> When a new value is written to bar_config it is the sfc driver that
> instructs the NIC to change the register layout.
>
> > What if the user specifies the wrong layout and the
> > driver writes to the wrong registers?
> 
> We have specific hardware and driver requirements for this sort of
> situation. For example, the register layouts must have some common
> registers (to ensure some compatibility).

Obviously we have to deal with the hardware as it exists, but it seems
like a hardware design problem that you can change the register
layout but the change is not detectable via those common registers.  

Anyway, it seems weird to me, but doesn't affect the PCI core and I
won't stand in your way ;)

> A layout that is too different will require a separate device ID.
> A driver that writes to the wrong register is a bug.
> 
> Maybe the name "bar_config" is causing most of the confusion here.
> Internally we also talk about "function profiles" or "personalities",
> but we thought such a name would be too vague.
> 
> Martin
> 
> > > ---
> > > 
> > > Martin Habets (2):
> > >       sfc: Add EF100 BAR config support
> > >       sfc: Implement change of BAR configuration
> > > 
> > > 
> > >  drivers/net/ethernet/sfc/ef100_nic.c |   80 ++++++++++++++++++++++++++++++++++
> > >  drivers/net/ethernet/sfc/ef100_nic.h |    6 +++
> > >  2 files changed, 86 insertions(+)
> > > 
> > > --
> > > Martin Habets <habetsm.xilinx@gmail.com>
Martin Habets July 13, 2022, 8:40 a.m. UTC | #5
On Mon, Jul 11, 2022 at 11:48:06AM -0700, Jakub Kicinski wrote:
> On Mon, 11 Jul 2022 14:38:54 +0100 Martin Habets wrote:
> > > Normally drivers rely on the PCI Vendor and Device ID to learn the
> > > number of BARs and their layouts.  I guess this series implies that
> > > doesn't work on this device?  And the user needs to manually specify
> > > what kind of device this is?  
> > 
> > When a new PCI device is added (like a VF) it always starts off with
> > the register layout for an EF100 network device. This is hardcoded,
> > i.e. it cannot be customised.
> > The layout can be changed after bootup, and only after the sfc driver has
> > bound to the device.
> > The PCI Vendor and Device ID do not change when the layout is changed.
> > 
> > For vDPA specifically we return the Xilinx PCI Vendor and our device ID
> > to the vDPA framework via struct vdpa_config_opts.
> 
> So it's switching between ethernet and vdpa? Isn't there a general
> problem for configuring vdpa capabilities (net vs storage etc) and
> shouldn't we seek to solve your BAR format switch in a similar fashion
> rather than adding PCI device attrs, which I believe is not done for
> anything vDPA-related?

The initial support will be for vdpa net. vdpa block and RDMA will follow
later, and we also need to consider FPGA management.

When it comes to vDPA there is a "vdpa" tool that we intend to support.
This comes into play after we've switched a device into vdpa mode (using
this new file).
For a network device there is also "devlink" to consider. That could be used
to switch a device into vdpa mode, but it cannot be used to switch it
back (there is no netdev to operate on).
My current understanding is that we won't have this issue for RDMA.
For FPGA management there is no general configuration tool, just what
fpga_mgr exposes (drivers/fpga). We intend to remove the special PF
devices we have for this (PCI space is valuable), and use the normal
network device instead. I can give more details on this if you want.
Worst case a special BAR config would be needed for this, but if needed I
expect we can restrict this to the NIC provisioning stage.

So there is a general problem I think. The solution here is something at a
lower level, which is PCI in this case.
Another solution would be a proprietary tool, something we are of course
keen to avoid.

> > > I'm confused about how this is supposed to work.  What if the driver
> > > is built-in and claims a device before the user can specify the
> > > register layout?  
> > 
> > The bar_config file will only exist once the sfc driver has bound to
> > the device. So in fact we count on that driver getting loaded.
> > When a new value is written to bar_config it is the sfc driver that
> > instructs the NIC to change the register layout.
> 
> When you say "driver bound" you mean the VF driver, right?

For a VF device yes it's the VF driver.
For a PF device it would be the PF driver.

Martin

> > > What if the user specifies the wrong layout and the
> > > driver writes to the wrong registers?  
> > 
> > We have specific hardware and driver requirements for this sort of
> > situation. For example, the register layouts must have some common
> > registers (to ensure some compatibility).
> > A layout that is too different will require a separate device ID.
> > A driver that writes to the wrong register is a bug.
> > 
> > Maybe the name "bar_config" is causing most of the confusion here.
> > Internally we also talk about "function profiles" or "personalities",
> > but we thought such a name would be too vague.
Jakub Kicinski July 13, 2022, 6:48 p.m. UTC | #6
On Wed, 13 Jul 2022 09:40:01 +0100 Martin Habets wrote:
> > So it's switching between ethernet and vdpa? Isn't there a general
> > problem for configuring vdpa capabilities (net vs storage etc) and
> > shouldn't we seek to solve your BAR format switch in a similar fashion
> > rather than adding PCI device attrs, which I believe is not done for
> > anything vDPA-related?  
> 
> The initial support will be for vdpa net. vdpa block and RDMA will follow
> later, and we also need to consider FPGA management.
> 
> When it comes to vDPA there is a "vdpa" tool that we intend to support.
> This comes into play after we've switched a device into vdpa mode (using
> this new file).
> For a network device there is also "devlink" to consider. That could be used
> to switch a device into vdpa mode, but it cannot be used to switch it
> back (there is no netdev to operate on).
> My current understanding is that we won't have this issue for RDMA.
> For FPGA management there is no general configuration tool, just what
> fpga_mgr exposes (drivers/fpga). We intend to remove the special PF
> devices we have for this (PCI space is valuable), and use the normal
> network device instead. I can give more details on this if you want.
> Worst case a special BAR config would be needed for this, but if needed I
> expect we can restrict this to the NIC provisioning stage.
> 
> So there is a general problem I think. The solution here is something at a
> lower level, which is PCI in this case.
> Another solution would be a proprietary tool, something we are of course
> keen to avoid.

Okay. Indeed, we could easily bolt something onto devlink, I'd think
but I don't know the space enough to push for one solution over
another. 

Please try to document the problem and the solution... somewhere, tho.
Otherwise the chances that the next vendor with this problem follows
the same approach fall from low to none.
Martin Habets July 14, 2022, 11:32 a.m. UTC | #7
On Wed, Jul 13, 2022 at 11:48:04AM -0700, Jakub Kicinski wrote:
> On Wed, 13 Jul 2022 09:40:01 +0100 Martin Habets wrote:
> > > So it's switching between ethernet and vdpa? Isn't there a general
> > > problem for configuring vdpa capabilities (net vs storage etc) and
> > > shouldn't we seek to solve your BAR format switch in a similar fashion
> > > rather than adding PCI device attrs, which I believe is not done for
> > > anything vDPA-related?  
> > 
> > The initial support will be for vdpa net. vdpa block and RDMA will follow
> > later, and we also need to consider FPGA management.
> > 
> > When it comes to vDPA there is a "vdpa" tool that we intend to support.
> > This comes into play after we've switched a device into vdpa mode (using
> > this new file).
> > For a network device there is also "devlink" to consider. That could be used
> > to switch a device into vdpa mode, but it cannot be used to switch it
> > back (there is no netdev to operate on).
> > My current understanding is that we won't have this issue for RDMA.
> > For FPGA management there is no general configuration tool, just what
> > fpga_mgr exposes (drivers/fpga). We intend to remove the special PF
> > devices we have for this (PCI space is valuable), and use the normal
> > network device instead. I can give more details on this if you want.
> > Worst case a special BAR config would be needed for this, but if needed I
> > expect we can restrict this to the NIC provisioning stage.
> > 
> > So there is a general problem I think. The solution here is something at a
> > lower level, which is PCI in this case.
> > Another solution would be a proprietary tool, something we are of course
> > keen to avoid.
> 
> Okay. Indeed, we could easily bolt something onto devlink, I'd think
> but I don't know the space enough to push for one solution over
> another. 
> 
> Please try to document the problem and the solution... somewhere, tho.
> Otherwise the chances that the next vendor with this problem follows
> the same approach fall from low to none.

Yeah, good point. The obvious thing would be to create a
 Documentation/networking/device_drivers/ethernet/sfc/sfc.rst
Is that generic enough for other vendors to find, or is there a better place?
I can do a follow-up patch for this.

Martin
Jakub Kicinski July 14, 2022, 4:05 p.m. UTC | #8
On Thu, 14 Jul 2022 12:32:12 +0100 Martin Habets wrote:
> > Okay. Indeed, we could easily bolt something onto devlink, I'd think
> > but I don't know the space enough to push for one solution over
> > another. 
> > 
> > Please try to document the problem and the solution... somewhere, tho.
> > Otherwise the chances that the next vendor with this problem follows
> > the same approach fall from low to none.  
> 
> Yeah, good point. The obvious thing would be to create a
>  Documentation/networking/device_drivers/ethernet/sfc/sfc.rst
> Is that generic enough for other vendors to find, or is there a better place?

Documentation/vdpa.rst? I don't see any kernel-level notes on
implementing vDPA; perhaps virt folks can suggest something.
I don't think people would be looking into driver-specific docs
when trying to implement an interface, so sfc is not a great option
IMHO.

> I can do a follow-up patch for this.

Let's make it part of the same series.
Jason Wang Aug. 3, 2022, 7:57 a.m. UTC | #9
On Fri, Jul 15, 2022 at 12:05 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 14 Jul 2022 12:32:12 +0100 Martin Habets wrote:
> > > Okay. Indeed, we could easily bolt something onto devlink, I'd think
> > > but I don't know the space enough to push for one solution over
> > > another.
> > >
> > > Please try to document the problem and the solution... somewhere, tho.
> > > Otherwise the chances that the next vendor with this problem follows
> > > the same approach fall from low to none.
> >
> > Yeah, good point. The obvious thing would be to create a
> >  Documentation/networking/device_drivers/ethernet/sfc/sfc.rst
> > Is that generic enough for other vendors to find, or is there a better place?
>
> Documentation/vdpa.rst? I don't see any kernel-level notes on
> implementing vDPA; perhaps virt folks can suggest something.

Not sure, since it's not a vDPA general thing but a vendor/parent
specific thing.

Or maybe Documentation/vdpa/sfc ?

Thanks

> I don't think people would be looking into driver-specific docs
> when trying to implement an interface, so sfc is not a great option
> IMHO.
>
> > I can do a follow-up patch for this.
>
> Let's make it part of the same series.
>
Martin Habets Aug. 12, 2022, 9:38 a.m. UTC | #10
FYI, during my holiday my colleagues found a way to use the vdpa tool for this.
That means we should not need this series, at least for vDPA.
So we can drop this series.

Thanks,
Martin

On Wed, Aug 03, 2022 at 03:57:34PM +0800, Jason Wang wrote:
> On Fri, Jul 15, 2022 at 12:05 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Thu, 14 Jul 2022 12:32:12 +0100 Martin Habets wrote:
> > > > Okay. Indeed, we could easily bolt something onto devlink, I'd think
> > > > but I don't know the space enough to push for one solution over
> > > > another.
> > > >
> > > > Please try to document the problem and the solution... somewhere, tho.
> > > > Otherwise the chances that the next vendor with this problem follows
> > > > the same approach fall from low to none.
> > >
> > > Yeah, good point. The obvious thing would be to create a
> > >  Documentation/networking/device_drivers/ethernet/sfc/sfc.rst
> > > Is that generic enough for other vendors to find, or is there a better place?
> >
> > Documentation/vdpa.rst? I don't see any kernel-level notes on
> > implementing vDPA; perhaps virt folks can suggest something.
> 
> Not sure, since it's not a vDPA general thing but a vendor/parent
> specific thing.
> 
> Or maybe Documentation/vdpa/sfc ?
> 
> Thanks
> 
> > I don't think people would be looking into driver-specific docs
> > when trying to implement an interface, so sfc is not a great option
> > IMHO.
> >
> > > I can do a follow-up patch for this.
> >
> > Let's make it part of the same series.
Jakub Kicinski Aug. 12, 2022, 7:18 p.m. UTC | #11
On Fri, 12 Aug 2022 10:38:35 +0100 Martin Habets wrote:
> FYI, during my holiday my colleagues found a way to use the vdpa tool for this.
> That means we should not need this series, at least for vDPA.
> So we can drop this series.