diff mbox series

[net-next,v15,02/14] net: netdev netlink api to bind dma-buf to a net device

Message ID 20240628003253.1694510-3-almasrymina@google.com (mailing list archive)
State New
Headers show
Series Device Memory TCP | expand

Commit Message

Mina Almasry June 28, 2024, 12:32 a.m. UTC
API takes the dma-buf fd as input, and binds it to the netdevice. The
user can specify the rx queues to bind the dma-buf to.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>

---

v7:
- Use flags: [ admin-perm ] instead of a CAP_NET_ADMIN check.

Changes in v1:
- Add rx-queue-type to distingish rx from tx (Jakub)
- Return dma-buf ID from netlink API (David, Stan)

Changes in RFC-v3:
- Support binding multiple rx rx-queues

---
 Documentation/netlink/specs/netdev.yaml | 53 +++++++++++++++++++++++++
 include/uapi/linux/netdev.h             | 19 +++++++++
 net/core/netdev-genl-gen.c              | 19 +++++++++
 net/core/netdev-genl-gen.h              |  2 +
 net/core/netdev-genl.c                  |  6 +++
 tools/include/uapi/linux/netdev.h       | 19 +++++++++
 6 files changed, 118 insertions(+)

Comments

Donald Hunter June 28, 2024, 10:04 a.m. UTC | #1
Mina Almasry <almasrymina@google.com> writes:
> +  -
> +    name: bind-dmabuf
> +    attributes:
> +      -
> +        name: ifindex
> +        doc: netdev ifindex to bind the dma-buf to.

Minor nit:

The series uses a mix of dmabuf and dma-buf but the doc additions
(devmem.rst) consistently uses dmabuf. I think it would be helpful to be
consistent here and say 'devmem dmabuf' in the docstring to highlight
whos dmabuf it is and keep the generated netdev docs in alignment.

> +        type: u32
> +        checks:
> +          min: 1
> +      -
> +        name: queues
> +        doc: receive queues to bind the dma-buf to.

And here.

> +        type: nest
> +        nested-attributes: queue-dmabuf
> +        multi-attr: true
> +      -
> +        name: dmabuf-fd
> +        doc: dmabuf file descriptor to bind.
> +        type: u32
> +      -
> +        name: dmabuf-id
> +        doc: id of the dmabuf binding
> +        type: u32
> +        checks:
> +          min: 1
> +
>  
>    -
>      name: qstats
> @@ -579,6 +618,20 @@ operations:
>            attributes:
>              - ifindex
>          reply: *queue-get-op
> +    -
> +      name: bind-rx
> +      doc: Bind dmabuf to netdev

And here.

> +      attribute-set: bind-dmabuf
> +      flags: [ admin-perm ]
> +      do:
> +        request:
> +          attributes:
> +            - ifindex
> +            - dmabuf-fd
> +            - queues
> +        reply:
> +          attributes:
> +            - dmabuf-id
Mina Almasry July 1, 2024, 7:04 p.m. UTC | #2
On Fri, Jun 28, 2024 at 3:10 AM Donald Hunter <donald.hunter@gmail.com> wrote:
>
> Mina Almasry <almasrymina@google.com> writes:
> > +  -
> > +    name: bind-dmabuf
> > +    attributes:
> > +      -
> > +        name: ifindex
> > +        doc: netdev ifindex to bind the dma-buf to.
>
> Minor nit:
>
> The series uses a mix of dmabuf and dma-buf but the doc additions
> (devmem.rst) consistently uses dmabuf. I think it would be helpful to be
> consistent here and say 'devmem dmabuf' in the docstring to highlight
> whos dmabuf it is and keep the generated netdev docs in alignment.
>

To be honest, even the dmabuf docs mixes 'dma-buf' and 'dmabuf', to my eye:

https://docs.kernel.org/driver-api/dma-buf.html

I can edit these docs I'm adding so these are consistent.

But on 'devmem dmabuf', not sure to be honest. Technically all dmabufs
are supported, even non-devmem ones. I'm not sure non-devmem dmabufs
are common at all, the only example I can think of is udmabuf whose
primary user is qemu and testing, so it's somewhat implied that the
dmabuf is devmem, and even if it isn't, it would be supported. I
prefer to keep the docs saying just 'dmabuf' as technically all are
supported. Maybe I should add a note about this somewhere in the
dedicated docs.
Donald Hunter July 2, 2024, 9:33 a.m. UTC | #3
On Mon, 1 Jul 2024 at 20:05, Mina Almasry <almasrymina@google.com> wrote:
>
> On Fri, Jun 28, 2024 at 3:10 AM Donald Hunter <donald.hunter@gmail.com> wrote:
> >
> > Mina Almasry <almasrymina@google.com> writes:
> > > +  -
> > > +    name: bind-dmabuf
> > > +    attributes:
> > > +      -
> > > +        name: ifindex
> > > +        doc: netdev ifindex to bind the dma-buf to.
> >
> > Minor nit:
> >
> > The series uses a mix of dmabuf and dma-buf but the doc additions
> > (devmem.rst) consistently uses dmabuf. I think it would be helpful to be
> > consistent here and say 'devmem dmabuf' in the docstring to highlight
> > whos dmabuf it is and keep the generated netdev docs in alignment.
>
> To be honest, even the dmabuf docs mixes 'dma-buf' and 'dmabuf', to my eye:
>
> https://docs.kernel.org/driver-api/dma-buf.html
>
> I can edit these docs I'm adding so these are consistent.
>
> But on 'devmem dmabuf', not sure to be honest. Technically all dmabufs
> are supported, even non-devmem ones. I'm not sure non-devmem dmabufs
> are common at all, the only example I can think of is udmabuf whose
> primary user is qemu and testing, so it's somewhat implied that the
> dmabuf is devmem, and even if it isn't, it would be supported. I
> prefer to keep the docs saying just 'dmabuf' as technically all are
> supported. Maybe I should add a note about this somewhere in the
> dedicated docs.

That's a fair point. If you could mention it in the docs, that would be great.

Thanks,
Donald.
Jakub Kicinski July 3, 2024, 12:19 a.m. UTC | #4
On Fri, 28 Jun 2024 00:32:39 +0000 Mina Almasry wrote:
> API takes the dma-buf fd as input, and binds it to the netdevice. The
> user can specify the rx queues to bind the dma-buf to.
> 
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Mina Almasry <almasrymina@google.com>

> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
> index 959755be4d7f9..899ac0882a098 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -268,6 +268,45 @@ attribute-sets:
>          name: napi-id
>          doc: ID of the NAPI instance which services this queue.
>          type: u32
> +  -
> +    name: queue-dmabuf
> +    attributes:
> +      -
> +        name: type
> +        doc: rx or tx queue
> +        type: u8
> +        enum: queue-type
> +      -
> +        name: idx
> +        doc: queue index
> +        type: u32

u8 is a waste of space, since attrs are rounded up to 4B
and we don't use "idx"

How about we use a subset of queue attrs?

	name: queue-id
	subset-of: queue
	attributes:
	  -
	    name: id
	  -
	    name: type

> +  -
> +    name: bind-dmabuf

The naming is a bit too command specific, how about pp-buf ?
Or just dmabuf ?

> +    attributes:
> +      -
> +        name: ifindex
> +        doc: netdev ifindex to bind the dma-buf to.
> +        type: u32
> +        checks:
> +          min: 1
> +      -
> +        name: queues
> +        doc: receive queues to bind the dma-buf to.
> +        type: nest
> +        nested-attributes: queue-dmabuf
> +        multi-attr: true
> +      -
> +        name: dmabuf-fd
> +        doc: dmabuf file descriptor to bind.
> +        type: u32
> +      -
> +        name: dmabuf-id
> +        doc: id of the dmabuf binding
> +        type: u32
> +        checks:
> +          min: 1
> +

We need some form of introspection. Can we add both in the queue dump
and page pool dump some info (dmabuf-id?) to indicate there is a DMABUF
bound to the queue / page pool?

>    -
>      name: qstats
> @@ -579,6 +618,20 @@ operations:
>            attributes:
>              - ifindex
>          reply: *queue-get-op
> +    -
> +      name: bind-rx
> +      doc: Bind dmabuf to netdev
> +      attribute-set: bind-dmabuf
> +      flags: [ admin-perm ]
> +      do:
> +        request:
> +          attributes:
> +            - ifindex
> +            - dmabuf-fd
> +            - queues
> +        reply:
> +          attributes:
> +            - dmabuf-id

The ops end up getting rendered as an enum, so the ordering matters.
You can't insert in the middle without breaking uAPI.
For attribute sets (which you also added before qstat) it technically
doesn't matter but would be good to have them in order to match ops.

> diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
> index 43742ac5b00da..190a504a62358 100644
> --- a/include/uapi/linux/netdev.h
> +++ b/include/uapi/linux/netdev.h
> @@ -136,6 +136,24 @@ enum {
>  	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
>  };
>  
> +enum {
> +	NETDEV_A_QUEUE_DMABUF_TYPE = 1,
> +	NETDEV_A_QUEUE_DMABUF_IDX,
> +
> +	__NETDEV_A_QUEUE_DMABUF_MAX,
> +	NETDEV_A_QUEUE_DMABUF_MAX = (__NETDEV_A_QUEUE_DMABUF_MAX - 1)
> +};
> +
> +enum {
> +	NETDEV_A_BIND_DMABUF_IFINDEX = 1,
> +	NETDEV_A_BIND_DMABUF_QUEUES,
> +	NETDEV_A_BIND_DMABUF_DMABUF_FD,
> +	NETDEV_A_BIND_DMABUF_DMABUF_ID,

This does look kinda repetitive, maybe let's drop the dmabuf from attr
names?

> +	__NETDEV_A_BIND_DMABUF_MAX,
> +	NETDEV_A_BIND_DMABUF_MAX = (__NETDEV_A_BIND_DMABUF_MAX - 1)
> +};
diff mbox series

Patch

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 959755be4d7f9..899ac0882a098 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -268,6 +268,45 @@  attribute-sets:
         name: napi-id
         doc: ID of the NAPI instance which services this queue.
         type: u32
+  -
+    name: queue-dmabuf
+    attributes:
+      -
+        name: type
+        doc: rx or tx queue
+        type: u8
+        enum: queue-type
+      -
+        name: idx
+        doc: queue index
+        type: u32
+
+  -
+    name: bind-dmabuf
+    attributes:
+      -
+        name: ifindex
+        doc: netdev ifindex to bind the dma-buf to.
+        type: u32
+        checks:
+          min: 1
+      -
+        name: queues
+        doc: receive queues to bind the dma-buf to.
+        type: nest
+        nested-attributes: queue-dmabuf
+        multi-attr: true
+      -
+        name: dmabuf-fd
+        doc: dmabuf file descriptor to bind.
+        type: u32
+      -
+        name: dmabuf-id
+        doc: id of the dmabuf binding
+        type: u32
+        checks:
+          min: 1
+
 
   -
     name: qstats
@@ -579,6 +618,20 @@  operations:
           attributes:
             - ifindex
         reply: *queue-get-op
+    -
+      name: bind-rx
+      doc: Bind dmabuf to netdev
+      attribute-set: bind-dmabuf
+      flags: [ admin-perm ]
+      do:
+        request:
+          attributes:
+            - ifindex
+            - dmabuf-fd
+            - queues
+        reply:
+          attributes:
+            - dmabuf-id
     -
       name: napi-get
       doc: Get information about NAPI instances configured on the system.
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 43742ac5b00da..190a504a62358 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -136,6 +136,24 @@  enum {
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+enum {
+	NETDEV_A_QUEUE_DMABUF_TYPE = 1,
+	NETDEV_A_QUEUE_DMABUF_IDX,
+
+	__NETDEV_A_QUEUE_DMABUF_MAX,
+	NETDEV_A_QUEUE_DMABUF_MAX = (__NETDEV_A_QUEUE_DMABUF_MAX - 1)
+};
+
+enum {
+	NETDEV_A_BIND_DMABUF_IFINDEX = 1,
+	NETDEV_A_BIND_DMABUF_QUEUES,
+	NETDEV_A_BIND_DMABUF_DMABUF_FD,
+	NETDEV_A_BIND_DMABUF_DMABUF_ID,
+
+	__NETDEV_A_BIND_DMABUF_MAX,
+	NETDEV_A_BIND_DMABUF_MAX = (__NETDEV_A_BIND_DMABUF_MAX - 1)
+};
+
 enum {
 	NETDEV_A_QSTATS_IFINDEX = 1,
 	NETDEV_A_QSTATS_QUEUE_TYPE,
@@ -184,6 +202,7 @@  enum {
 	NETDEV_CMD_PAGE_POOL_CHANGE_NTF,
 	NETDEV_CMD_PAGE_POOL_STATS_GET,
 	NETDEV_CMD_QUEUE_GET,
+	NETDEV_CMD_BIND_RX,
 	NETDEV_CMD_NAPI_GET,
 	NETDEV_CMD_QSTATS_GET,
 
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index 8350a0afa9ec7..9acd0d893765a 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -27,6 +27,11 @@  const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFIND
 	[NETDEV_A_PAGE_POOL_IFINDEX] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_page_pool_ifindex_range),
 };
 
+const struct nla_policy netdev_queue_dmabuf_nl_policy[NETDEV_A_QUEUE_DMABUF_IDX + 1] = {
+	[NETDEV_A_QUEUE_DMABUF_TYPE] = NLA_POLICY_MAX(NLA_U8, 1),
+	[NETDEV_A_QUEUE_DMABUF_IDX] = { .type = NLA_U32, },
+};
+
 /* NETDEV_CMD_DEV_GET - do */
 static const struct nla_policy netdev_dev_get_nl_policy[NETDEV_A_DEV_IFINDEX + 1] = {
 	[NETDEV_A_DEV_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
@@ -58,6 +63,13 @@  static const struct nla_policy netdev_queue_get_dump_nl_policy[NETDEV_A_QUEUE_IF
 	[NETDEV_A_QUEUE_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
 };
 
+/* NETDEV_CMD_BIND_RX - do */
+static const struct nla_policy netdev_bind_rx_nl_policy[NETDEV_A_BIND_DMABUF_DMABUF_FD + 1] = {
+	[NETDEV_A_BIND_DMABUF_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
+	[NETDEV_A_BIND_DMABUF_DMABUF_FD] = { .type = NLA_U32, },
+	[NETDEV_A_BIND_DMABUF_QUEUES] = NLA_POLICY_NESTED(netdev_queue_dmabuf_nl_policy),
+};
+
 /* NETDEV_CMD_NAPI_GET - do */
 static const struct nla_policy netdev_napi_get_do_nl_policy[NETDEV_A_NAPI_ID + 1] = {
 	[NETDEV_A_NAPI_ID] = { .type = NLA_U32, },
@@ -130,6 +142,13 @@  static const struct genl_split_ops netdev_nl_ops[] = {
 		.maxattr	= NETDEV_A_QUEUE_IFINDEX,
 		.flags		= GENL_CMD_CAP_DUMP,
 	},
+	{
+		.cmd		= NETDEV_CMD_BIND_RX,
+		.doit		= netdev_nl_bind_rx_doit,
+		.policy		= netdev_bind_rx_nl_policy,
+		.maxattr	= NETDEV_A_BIND_DMABUF_DMABUF_FD,
+		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+	},
 	{
 		.cmd		= NETDEV_CMD_NAPI_GET,
 		.doit		= netdev_nl_napi_get_doit,
diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h
index 4db40fd5b4a9e..ca5a0983f2834 100644
--- a/net/core/netdev-genl-gen.h
+++ b/net/core/netdev-genl-gen.h
@@ -13,6 +13,7 @@ 
 
 /* Common nested types */
 extern const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFINDEX + 1];
+extern const struct nla_policy netdev_queue_dmabuf_nl_policy[NETDEV_A_QUEUE_DMABUF_IDX + 1];
 
 int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
@@ -26,6 +27,7 @@  int netdev_nl_page_pool_stats_get_dumpit(struct sk_buff *skb,
 int netdev_nl_queue_get_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_queue_get_dumpit(struct sk_buff *skb,
 			       struct netlink_callback *cb);
+int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
 int netdev_nl_qstats_get_dumpit(struct sk_buff *skb,
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 05f9515d2c05c..2d726e65211dd 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -721,6 +721,12 @@  int netdev_nl_qstats_get_dumpit(struct sk_buff *skb,
 	return err;
 }
 
+/* Stub */
+int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info)
+{
+	return 0;
+}
+
 static int netdev_genl_netdevice_event(struct notifier_block *nb,
 				       unsigned long event, void *ptr)
 {
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 43742ac5b00da..190a504a62358 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -136,6 +136,24 @@  enum {
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+enum {
+	NETDEV_A_QUEUE_DMABUF_TYPE = 1,
+	NETDEV_A_QUEUE_DMABUF_IDX,
+
+	__NETDEV_A_QUEUE_DMABUF_MAX,
+	NETDEV_A_QUEUE_DMABUF_MAX = (__NETDEV_A_QUEUE_DMABUF_MAX - 1)
+};
+
+enum {
+	NETDEV_A_BIND_DMABUF_IFINDEX = 1,
+	NETDEV_A_BIND_DMABUF_QUEUES,
+	NETDEV_A_BIND_DMABUF_DMABUF_FD,
+	NETDEV_A_BIND_DMABUF_DMABUF_ID,
+
+	__NETDEV_A_BIND_DMABUF_MAX,
+	NETDEV_A_BIND_DMABUF_MAX = (__NETDEV_A_BIND_DMABUF_MAX - 1)
+};
+
 enum {
 	NETDEV_A_QSTATS_IFINDEX = 1,
 	NETDEV_A_QSTATS_QUEUE_TYPE,
@@ -184,6 +202,7 @@  enum {
 	NETDEV_CMD_PAGE_POOL_CHANGE_NTF,
 	NETDEV_CMD_PAGE_POOL_STATS_GET,
 	NETDEV_CMD_QUEUE_GET,
+	NETDEV_CMD_BIND_RX,
 	NETDEV_CMD_NAPI_GET,
 	NETDEV_CMD_QSTATS_GET,