
[RFC,10/11] nvmet: Add addr fam and trtype for mdev pci driver

Message ID 20250313052222.178524-11-michael.christie@oracle.com (mailing list archive)
State New
Series nvmet: Add NVMe target mdev/vfio driver

Commit Message

Mike Christie March 13, 2025, 5:18 a.m. UTC
This allocates 253 for mdev pci since it might not fit into any
existing value (not sure how to co-exist with pci-epf).

One of the reasons this patchset is an RFC is that I was not sure
if allocating a new number for this was best. Another approach
would be to break pci-epf up into:

1. PCI component - Common PCI and NVMe PCI code.
2. Interface/bus component - Callouts so pci-epf can use the
pci_epf_driver/pci_epf_ops and mdev-pci can use mdev and vfio
callouts.
3. Memory management component - Callouts for using DMA for pci-epf
vs vfio related memory for mdev-pci.

On one hand, by creating a core nvmet pci driver and then having
subdrivers, we could share NVMF_ADDR_FAMILY_PCI and NVMF_TRTYPE_PCI.
However, it will get messy. There is some PCI code we could share for
1, but 2 and 3 will make sharing difficult because of how differently
the drivers work (mdev-vfio vs pci-epf layers).
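
As a rough illustration of how such a split could look (the struct and
callout names below are hypothetical and not part of this series), the
interface/bus and memory management components could be a per-backend
callout table that a common nvmet PCI core invokes:

/*
 * Hypothetical sketch only: one implementation would be backed by
 * pci_epf_driver/pci_epf_ops and the DMA engine, the other by mdev/vfio.
 */
struct nvmet_pci_backend_ops {
	/* 2. Interface/bus component */
	int (*create_ctrl)(struct nvmet_port *port);
	void (*destroy_ctrl)(struct nvmet_ctrl *ctrl);
	/*
	 * 3. Memory management component: move data to/from host memory,
	 * via the DMA engine for pci-epf or vfio pinned/mapped memory
	 * for mdev-pci.
	 */
	int (*read_from_host)(struct nvmet_req *req, u64 host_addr,
			      void *buf, size_t len);
	int (*write_to_host)(struct nvmet_req *req, u64 host_addr,
			     void *buf, size_t len);
};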

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/nvme/target/configfs.c |  1 +
 drivers/nvme/target/nvmet.h    |  5 ++++-
 include/linux/nvme.h           | 14 ++++++++------
 3 files changed, 13 insertions(+), 7 deletions(-)

Comments

Christoph Hellwig March 13, 2025, 6:42 a.m. UTC | #1
On Thu, Mar 13, 2025 at 12:18:11AM -0500, Mike Christie wrote:
> This allocates 253 for mdev pci since it might not fit into any
> existing value (not sure how to co-exist with pci-epf).
> 
> One of the reasons this patchset is an RFC is that I was not sure
> if allocating a new number for this was best. Another approach
> would be to break pci-epf up into:
> 
> 1. PCI component - Common PCI and NVMe PCI code.
> 2. Interface/bus component - Callouts so pci-epf can use the
> pci_epf_driver/pci_epf_ops and mdev-pci can use mdev and vfio
> callouts.
> 3. Memory management component - Callouts for using DMA for pci-epf
> vs vfio related memory for mdev-pci.
> 
> On one hand, by creating a core nvmet pci driver and then having
> subdrivers, we could share NVMF_ADDR_FAMILY_PCI and NVMF_TRTYPE_PCI.
> However, it will get messy. There is some PCI code we could share for
> 1, but 2 and 3 will make sharing difficult because of how differently
> the drivers work (mdev-vfio vs pci-epf layers).

I think we'll need to discuss this more based on concrete code proposals
as we go along, but here's my handwavy 2 cents for now:

  - in addition to the pure software endpoint and mdev I also expect
    hardware-offloaded PCIe endpoints to show up really soon, so
    we'll have more than just the two
  - having common code for different PCIe targets where applicable is
    thus a good idea, but I'd expect it to be a set of library
    functions or conditionals in the core code, not a new layer
    with indirect calls
  - I had quite a lot of discussions with Damien about the trtype and
    related bits.  I suspect by the time we get to having multiple
    PCIe endpoints we just need to split the configfs interface naming
    from the on-wire fabrics trtype enum to not need trtype assignments.
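
To make the last point concrete, here is a rough sketch (the struct and
entry names are made up for illustration) of how the configfs name
table could be decoupled from the on-wire value, so multiple PCIe
endpoint drivers share NVMF_TRTYPE_PCI on the wire while keeping
distinct configfs names:

/*
 * Hypothetical sketch: the configfs name selects a driver, while the
 * trtype field is only what goes into discovery log page entries.
 */
struct nvmet_port_type_map {
	u8		trtype;		/* on-wire NVMF_TRTYPE_* value */
	const char	*cfs_name;	/* name written to addr_trtype */
};

static const struct nvmet_port_type_map nvmet_port_types[] = {
	{ NVMF_TRTYPE_RDMA,	"rdma" },
	{ NVMF_TRTYPE_FC,	"fc" },
	{ NVMF_TRTYPE_TCP,	"tcp" },
	{ NVMF_TRTYPE_PCI,	"pci-epf" },	/* software endpoint */
	{ NVMF_TRTYPE_PCI,	"mdev-pci" },	/* vfio/mdev endpoint */
	{ NVMF_TRTYPE_LOOP,	"loop" },
};
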
Mike Christie March 13, 2025, 5:56 p.m. UTC | #2
On 3/13/25 1:42 AM, Christoph Hellwig wrote:
> On Thu, Mar 13, 2025 at 12:18:11AM -0500, Mike Christie wrote:
>> This allocates 253 for mdev pci since it might not fit into any
>> existing value (not sure how to co-exist with pci-epf).
>>
>> One of the reasons this patchset is an RFC is that I was not sure
>> if allocating a new number for this was best. Another approach
>> would be to break pci-epf up into:
>>
>> 1. PCI component - Common PCI and NVMe PCI code.
>> 2. Interface/bus component - Callouts so pci-epf can use the
>> pci_epf_driver/pci_epf_ops and mdev-pci can use mdev and vfio
>> callouts.
>> 3. Memory management component - Callouts for using DMA for pci-epf
>> vs vfio related memory for mdev-pci.
>>
>> On one hand, by creating a core nvmet pci driver and then having
>> subdrivers, we could share NVMF_ADDR_FAMILY_PCI and NVMF_TRTYPE_PCI.
>> However, it will get messy. There is some PCI code we could share for
>> 1, but 2 and 3 will make sharing difficult because of how differently
>> the drivers work (mdev-vfio vs pci-epf layers).
> 
> I think we'll need to discuss this more based on concrete code proposals
> as we go along, but here's my handwavy 2 cents for now:
> 
>   - in addition to the pure software endpoint and mdev I also expect
>     hardware-offloaded PCIe endpoints to show up really soon, so
>     we'll have more than just the two
>   - having common code for different PCIe targets where applicable is
>     thus a good idea, but I'd expect it to be a set of library
>     functions or conditionals in the core code, not a new layer
>     with indirect calls

A library-based approach will be easier. I'll take a stab at it in the
next posting.
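
As a very rough sketch of what that could look like (the nvmet_pci_*
names below are invented for illustration and don't exist today), the
shared pieces would just be exported helpers that both pci-epf and
mdev-pci call directly, with no ops table in between:

/*
 * Hypothetical sketch of the library-style split: common queue setup
 * helpers provided by a shared nvmet PCI core. Each transport driver
 * still owns its own doorbell handling and data movement.
 */
u16 nvmet_pci_create_cq(struct nvmet_ctrl *ctrl, u16 cqid, u16 qsize,
			u64 host_cq_addr, u16 irq_vector);
u16 nvmet_pci_delete_cq(struct nvmet_ctrl *ctrl, u16 cqid);
u16 nvmet_pci_create_sq(struct nvmet_ctrl *ctrl, u16 sqid, u16 cqid,
			u16 qsize, u64 host_sq_addr);
u16 nvmet_pci_delete_sq(struct nvmet_ctrl *ctrl, u16 sqid);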

>   - I had quite a lot of discussions with Damien about the trtype and
>     related bits.  I suspect by the time we get to having multiple
>     PCIe endpoints we just need to split the configfs interface naming
>     from the on-wire fabrics trtype enum to not need trtype assignments.

Patch

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 31c484d51a69..73bab15506c2 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -39,6 +39,7 @@  static struct nvmet_type_name_map nvmet_transport[] = {
 	{ NVMF_TRTYPE_TCP,	"tcp" },
 	{ NVMF_TRTYPE_PCI,	"pci" },
 	{ NVMF_TRTYPE_LOOP,	"loop" },
+	{ NVMF_TRTYPE_MDEV_PCI,	"mdev-pci" },
 };
 
 static const struct nvmet_type_name_map nvmet_addr_family[] = {
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index a16d1c74e3d9..6c825177ee87 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -755,7 +755,10 @@  static inline bool nvmet_is_disc_subsys(struct nvmet_subsys *subsys)
 
 static inline bool nvmet_is_pci_ctrl(struct nvmet_ctrl *ctrl)
 {
-	return ctrl->port->disc_addr.trtype == NVMF_TRTYPE_PCI;
+	struct nvmf_disc_rsp_page_entry *addr = &ctrl->port->disc_addr;
+
+	return addr->trtype == NVMF_TRTYPE_PCI ||
+	       addr->trtype == NVMF_TRTYPE_MDEV_PCI;
 }
 
 #ifdef CONFIG_NVME_TARGET_PASSTHRU
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index a7b8bcef20fb..994f02158078 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -53,12 +53,13 @@  enum nvme_dctype {
 
 /* Address Family codes for Discovery Log Page entry ADRFAM field */
 enum {
-	NVMF_ADDR_FAMILY_PCI	= 0,	/* PCIe */
-	NVMF_ADDR_FAMILY_IP4	= 1,	/* IP4 */
-	NVMF_ADDR_FAMILY_IP6	= 2,	/* IP6 */
-	NVMF_ADDR_FAMILY_IB	= 3,	/* InfiniBand */
-	NVMF_ADDR_FAMILY_FC	= 4,	/* Fibre Channel */
-	NVMF_ADDR_FAMILY_LOOP	= 254,	/* Reserved for host usage */
+	NVMF_ADDR_FAMILY_PCI		= 0,	/* PCIe */
+	NVMF_ADDR_FAMILY_IP4		= 1,	/* IP4 */
+	NVMF_ADDR_FAMILY_IP6		= 2,	/* IP6 */
+	NVMF_ADDR_FAMILY_IB		= 3,	/* InfiniBand */
+	NVMF_ADDR_FAMILY_FC		= 4,	/* Fibre Channel */
+	NVMF_ADDR_FAMILY_MDEV_PCI	= 253,	/* MDEV PCI */
+	NVMF_ADDR_FAMILY_LOOP		= 254,	/* Reserved for host usage */
 	NVMF_ADDR_FAMILY_MAX,
 };
 
@@ -68,6 +69,7 @@  enum {
 	NVMF_TRTYPE_RDMA	= 1,	/* RDMA */
 	NVMF_TRTYPE_FC		= 2,	/* Fibre Channel */
 	NVMF_TRTYPE_TCP		= 3,	/* TCP/IP */
+	NVMF_TRTYPE_MDEV_PCI	= 253,	/* MDEV PCI hack */
 	NVMF_TRTYPE_LOOP	= 254,	/* Reserved for host usage */
 	NVMF_TRTYPE_MAX,
 };