diff mbox series

[net-next,v2,6/9] devlink: support directly reading from region memory

Message ID 20221123203834.738606-7-jacob.e.keller@intel.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series support direct read from region | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 383 this patch: 383
netdev/cc_maintainers warning 5 maintainers not CCed: pabeni@redhat.com corbet@lwn.net edumazet@google.com linux-doc@vger.kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 24 this patch: 24
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 528 this patch: 528
netdev/checkpatch warning WARNING: line length of 100 exceeds 80 columns WARNING: line length of 108 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 14 this patch: 14
netdev/source_inline success Was 0 now: 0

Commit Message

Jacob Keller Nov. 23, 2022, 8:38 p.m. UTC
To read from a region, user space must currently request a new snapshot of
the region and then read from that snapshot. This can sometimes be overkill
if user space only reads a tiny portion. They first create the snapshot,
then request a read, then destroy the snapshot.

For regions which have a single underlying "contents", it makes sense to
allow supporting direct reading of the region data.

Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if
supported. Instead of reporting a missing snapshot id as invalid, check if
the region supports direct read and if so switch to the direct access
method for reading the region data.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Changes since v1:
* Update documentation to explain that direct reads are not atomic

 .../networking/devlink/devlink-region.rst     | 11 ++++
 include/net/devlink.h                         | 16 +++++
 net/core/devlink.c                            | 66 ++++++++++++++-----
 3 files changed, 76 insertions(+), 17 deletions(-)

Comments

Jiri Pirko Nov. 24, 2022, 9:05 a.m. UTC | #1
Wed, Nov 23, 2022 at 09:38:31PM CET, jacob.e.keller@intel.com wrote:
>To read from a region, user space must currently request a new snapshot of
>the region and then read from that snapshot. This can sometimes be overkill
>if user space only reads a tiny portion. They first create the snapshot,
>then request a read, then destroy the snapshot.
>
>For regions which have a single underlying "contents", it makes sense to
>allow supporting direct reading of the region data.
>
>Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if
>supported. Instead of reporting a missing snapshot id as invalid, check if
>the region supports direct read and if so switch to the direct access
>method for reading the region data.
>
>Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>---
>Changes since v1:
>* Update documentation to explain that direct reads are not atomic
>
> .../networking/devlink/devlink-region.rst     | 11 ++++
> include/net/devlink.h                         | 16 +++++
> net/core/devlink.c                            | 66 ++++++++++++++-----
> 3 files changed, 76 insertions(+), 17 deletions(-)
>
>diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
>index f06dca9a1eb6..6525a9120a4e 100644
>--- a/Documentation/networking/devlink/devlink-region.rst
>+++ b/Documentation/networking/devlink/devlink-region.rst
>@@ -31,6 +31,13 @@ in its ``devlink_region_ops`` structure. If snapshot id is not set in
> the ``DEVLINK_CMD_REGION_NEW`` request kernel will allocate one and send
> the snapshot information to user space.
> 
>+Regions may optionally allow directly reading from their contents without a
>+snapshot. Direct read requests are not atomic. In particular a read request
>+of size 256 bytes or larger will be split into multiple chunks. If atomic
>+access is required, use a snapshot. A driver wishing to enable this for a
>+region should implement the ``.read`` callback in the ``devlink_region_ops``
>+structure.
>+
> example usage
> -------------
> 
>@@ -65,6 +72,10 @@ example usage
>     $ devlink region read pci/0000:00:05.0/fw-health snapshot 1 address 0 length 16
>     0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
> 
>+    # Read from the region without a snapshot
>+    $ devlink region read pci/0000:00:05.0/fw-health address 16 length 16
>+    0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
>+
> As regions are likely very device or driver specific, no generic regions are
> defined. See the driver-specific documentation files for information on the
> specific regions a driver supports.
>diff --git a/include/net/devlink.h b/include/net/devlink.h
>index 074a79b8933f..02528f736f65 100644
>--- a/include/net/devlink.h
>+++ b/include/net/devlink.h
>@@ -650,6 +650,10 @@ struct devlink_info_req;
>  *            the data variable must be updated to point to the snapshot data.
>  *            The function will be called while the devlink instance lock is
>  *            held.
>+ * @read: callback to directly read a portion of the region. On success,
>+ *        the data pointer will be updated with the contents of the
>+ *        requested portion of the region. The function will be called
>+ *        while the devlink instance lock is held.
>  * @priv: Pointer to driver private data for the region operation
>  */
> struct devlink_region_ops {
>@@ -659,6 +663,10 @@ struct devlink_region_ops {
> 			const struct devlink_region_ops *ops,
> 			struct netlink_ext_ack *extack,
> 			u8 **data);
>+	int (*read)(struct devlink *devlink,
>+		    const struct devlink_region_ops *ops,
>+		    struct netlink_ext_ack *extack,
>+		    u64 offset, u32 size, u8 *data);
> 	void *priv;
> };
> 
>@@ -670,6 +678,10 @@ struct devlink_region_ops {
>  *            the data variable must be updated to point to the snapshot data.
>  *            The function will be called while the devlink instance lock is
>  *            held.
>+ * @read: callback to directly read a portion of the region. On success,
>+ *        the data pointer will be updated with the contents of the
>+ *        requested portion of the region. The function will be called
>+ *        while the devlink instance lock is held.
>  * @priv: Pointer to driver private data for the region operation
>  */
> struct devlink_port_region_ops {
>@@ -679,6 +691,10 @@ struct devlink_port_region_ops {
> 			const struct devlink_port_region_ops *ops,
> 			struct netlink_ext_ack *extack,
> 			u8 **data);
>+	int (*read)(struct devlink_port *port,
>+		    const struct devlink_port_region_ops *ops,
>+		    struct netlink_ext_ack *extack,
>+		    u64 offset, u32 size, u8 *data);
> 	void *priv;
> };
> 
>diff --git a/net/core/devlink.c b/net/core/devlink.c
>index 729e2162a4db..f249b5b57677 100644
>--- a/net/core/devlink.c
>+++ b/net/core/devlink.c
>@@ -6515,6 +6515,26 @@ devlink_region_snapshot_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
> 	return 0;
> }
> 
>+static int
>+devlink_region_port_direct_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
>+				u64 curr_offset, struct netlink_ext_ack *extack)
>+{
>+	struct devlink_region *region = cb_priv;
>+
>+	return region->port_ops->read(region->port, region->port_ops, extack,
>+				      curr_offset, chunk_size, chunk);
>+}
>+
>+static int
>+devlink_region_direct_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
>+			   u64 curr_offset, struct netlink_ext_ack *extack)
>+{
>+	struct devlink_region *region = cb_priv;
>+
>+	return region->ops->read(region->devlink, region->ops, extack,
>+				 curr_offset, chunk_size, chunk);
>+}
>+
> static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
> 					     struct netlink_callback *cb)
> {
>@@ -6523,12 +6543,12 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
> 	u64 ret_offset, start_offset, end_offset = U64_MAX;
> 	struct nlattr **attrs = info->attrs;
> 	struct devlink_port *port = NULL;
>-	struct devlink_snapshot *snapshot;
>+	devlink_chunk_fill_t *region_cb;
> 	struct devlink_region *region;
> 	const char *region_name;
> 	struct devlink *devlink;
> 	unsigned int index;
>-	u32 snapshot_id;
>+	void *region_cb_priv;
> 	void *hdr;
> 	int err;
> 
>@@ -6546,12 +6566,6 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
> 		goto out_unlock;
> 	}
> 
>-	if (!attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID]) {
>-		NL_SET_ERR_MSG(cb->extack, "No snapshot id provided");
>-		err = -EINVAL;
>-		goto out_unlock;
>-	}
>-
> 	if (info->attrs[DEVLINK_ATTR_PORT_INDEX]) {
> 		index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]);
> 
>@@ -6577,12 +6591,30 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
> 	}
> 
> 	snapshot_attr = attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID];
>-	snapshot_id = nla_get_u32(snapshot_attr);
>-	snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
>-	if (!snapshot) {
>-		NL_SET_ERR_MSG_ATTR(cb->extack, snapshot_attr, "requested snapshot does not exist");
>-		err = -EINVAL;
>-		goto out_unlock;
>+	if (!snapshot_attr) {

Hmm. Being pedantic here, it changes userspace expectations:
current - message without snapshotid is errored out
new - message without snapshotid is happily processed

I can imagine some obscure userspace app depending on this behaviour,
in theory.

Safe would be to add new NLA_FLAG attr DEVLINK_ATTR_REGION_DIRECT or
something that would indicate userspace is interested in direct read.
Also, the userspace would know right away it this new functionality
is supported or not by the kernel.



>+		if (!region->ops->read) {
>+			NL_SET_ERR_MSG(cb->extack, "requested region does not support direct read");
>+			err = -EOPNOTSUPP;
>+			goto out_unlock;
>+		}
>+		if (port)
>+			region_cb = &devlink_region_port_direct_fill;
>+		else
>+			region_cb = &devlink_region_direct_fill;
>+		region_cb_priv = region;
>+	} else {
>+		struct devlink_snapshot *snapshot;
>+		u32 snapshot_id;
>+
>+		snapshot_id = nla_get_u32(snapshot_attr);
>+		snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
>+		if (!snapshot) {
>+			NL_SET_ERR_MSG_ATTR(cb->extack, snapshot_attr, "requested snapshot does not exist");
>+			err = -EINVAL;
>+			goto out_unlock;
>+		}
>+		region_cb = &devlink_region_snapshot_fill;
>+		region_cb_priv = snapshot;
> 	}
> 
> 	if (attrs[DEVLINK_ATTR_REGION_CHUNK_ADDR] &&
>@@ -6633,9 +6665,9 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
> 		goto nla_put_failure;
> 	}
> 
>-	err = devlink_nl_region_read_fill(skb, &devlink_region_snapshot_fill,
>-					  snapshot, start_offset, end_offset,
>-					  &ret_offset, cb->extack);
>+	err = devlink_nl_region_read_fill(skb, region_cb, region_cb_priv,
>+					  start_offset, end_offset, &ret_offset,
>+					  cb->extack);
> 
> 	if (err && err != -EMSGSIZE)
> 		goto nla_put_failure;
>-- 
>2.38.1.420.g319605f8f00e
>
Jacob Keller Nov. 28, 2022, 6:34 p.m. UTC | #2
On 11/24/2022 1:05 AM, Jiri Pirko wrote:
> Wed, Nov 23, 2022 at 09:38:31PM CET, jacob.e.keller@intel.com wrote:
> 
> Hmm. Being pedantic here, it changes userspace expectations:
> current - message without snapshotid is errored out
> new - message without snapshotid is happily processed
> 

Yea. I guess I'm thinking more from an interactive application 
standpoint this wouldn't be a problem but from a scripted setup it might 
do something weird.

> I can imagine some obscure userspace app depending on this behaviour,
> in theory.

Hmm.

> 
> Safe would be to add new NLA_FLAG attr DEVLINK_ATTR_REGION_DIRECT or
> something that would indicate userspace is interested in direct read.
> Also, the userspace would know right away it this new functionality
> is supported or not by the kernel.
> 

The advantage of being able to see that such a feature exists is good.

I can rework this to add an attribute.

> 
> 
>> +		if (!region->ops->read) {
>> +			NL_SET_ERR_MSG(cb->extack, "requested region does not support direct read");
>> +			err = -EOPNOTSUPP;
>> +			goto out_unlock;
>> +		}
>> +		if (port)
>> +			region_cb = &devlink_region_port_direct_fill;
>> +		else
>> +			region_cb = &devlink_region_direct_fill;
>> +		region_cb_priv = region;
>> +	} else {
>> +		struct devlink_snapshot *snapshot;
>> +		u32 snapshot_id;
>> +
>> +		snapshot_id = nla_get_u32(snapshot_attr);
>> +		snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
>> +		if (!snapshot) {
>> +			NL_SET_ERR_MSG_ATTR(cb->extack, snapshot_attr, "requested snapshot does not exist");
>> +			err = -EINVAL;
>> +			goto out_unlock;
>> +		}
>> +		region_cb = &devlink_region_snapshot_fill;
>> +		region_cb_priv = snapshot;
>> 	}
>>
>> 	if (attrs[DEVLINK_ATTR_REGION_CHUNK_ADDR] &&
>> @@ -6633,9 +6665,9 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
>> 		goto nla_put_failure;
>> 	}
>>
>> -	err = devlink_nl_region_read_fill(skb, &devlink_region_snapshot_fill,
>> -					  snapshot, start_offset, end_offset,
>> -					  &ret_offset, cb->extack);
>> +	err = devlink_nl_region_read_fill(skb, region_cb, region_cb_priv,
>> +					  start_offset, end_offset, &ret_offset,
>> +					  cb->extack);
>>
>> 	if (err && err != -EMSGSIZE)
>> 		goto nla_put_failure;
>> -- 
>> 2.38.1.420.g319605f8f00e
>>
diff mbox series

Patch

diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
index f06dca9a1eb6..6525a9120a4e 100644
--- a/Documentation/networking/devlink/devlink-region.rst
+++ b/Documentation/networking/devlink/devlink-region.rst
@@ -31,6 +31,13 @@  in its ``devlink_region_ops`` structure. If snapshot id is not set in
 the ``DEVLINK_CMD_REGION_NEW`` request kernel will allocate one and send
 the snapshot information to user space.
 
+Regions may optionally allow directly reading from their contents without a
+snapshot. Direct read requests are not atomic. In particular a read request
+of size 256 bytes or larger will be split into multiple chunks. If atomic
+access is required, use a snapshot. A driver wishing to enable this for a
+region should implement the ``.read`` callback in the ``devlink_region_ops``
+structure.
+
 example usage
 -------------
 
@@ -65,6 +72,10 @@  example usage
     $ devlink region read pci/0000:00:05.0/fw-health snapshot 1 address 0 length 16
     0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
 
+    # Read from the region without a snapshot
+    $ devlink region read pci/0000:00:05.0/fw-health address 16 length 16
+    0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
+
 As regions are likely very device or driver specific, no generic regions are
 defined. See the driver-specific documentation files for information on the
 specific regions a driver supports.
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 074a79b8933f..02528f736f65 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -650,6 +650,10 @@  struct devlink_info_req;
  *            the data variable must be updated to point to the snapshot data.
  *            The function will be called while the devlink instance lock is
  *            held.
+ * @read: callback to directly read a portion of the region. On success,
+ *        the data pointer will be updated with the contents of the
+ *        requested portion of the region. The function will be called
+ *        while the devlink instance lock is held.
  * @priv: Pointer to driver private data for the region operation
  */
 struct devlink_region_ops {
@@ -659,6 +663,10 @@  struct devlink_region_ops {
 			const struct devlink_region_ops *ops,
 			struct netlink_ext_ack *extack,
 			u8 **data);
+	int (*read)(struct devlink *devlink,
+		    const struct devlink_region_ops *ops,
+		    struct netlink_ext_ack *extack,
+		    u64 offset, u32 size, u8 *data);
 	void *priv;
 };
 
@@ -670,6 +678,10 @@  struct devlink_region_ops {
  *            the data variable must be updated to point to the snapshot data.
  *            The function will be called while the devlink instance lock is
  *            held.
+ * @read: callback to directly read a portion of the region. On success,
+ *        the data pointer will be updated with the contents of the
+ *        requested portion of the region. The function will be called
+ *        while the devlink instance lock is held.
  * @priv: Pointer to driver private data for the region operation
  */
 struct devlink_port_region_ops {
@@ -679,6 +691,10 @@  struct devlink_port_region_ops {
 			const struct devlink_port_region_ops *ops,
 			struct netlink_ext_ack *extack,
 			u8 **data);
+	int (*read)(struct devlink_port *port,
+		    const struct devlink_port_region_ops *ops,
+		    struct netlink_ext_ack *extack,
+		    u64 offset, u32 size, u8 *data);
 	void *priv;
 };
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 729e2162a4db..f249b5b57677 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -6515,6 +6515,26 @@  devlink_region_snapshot_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
 	return 0;
 }
 
+static int
+devlink_region_port_direct_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
+				u64 curr_offset, struct netlink_ext_ack *extack)
+{
+	struct devlink_region *region = cb_priv;
+
+	return region->port_ops->read(region->port, region->port_ops, extack,
+				      curr_offset, chunk_size, chunk);
+}
+
+static int
+devlink_region_direct_fill(void *cb_priv, u8 *chunk, u32 chunk_size,
+			   u64 curr_offset, struct netlink_ext_ack *extack)
+{
+	struct devlink_region *region = cb_priv;
+
+	return region->ops->read(region->devlink, region->ops, extack,
+				 curr_offset, chunk_size, chunk);
+}
+
 static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 					     struct netlink_callback *cb)
 {
@@ -6523,12 +6543,12 @@  static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 	u64 ret_offset, start_offset, end_offset = U64_MAX;
 	struct nlattr **attrs = info->attrs;
 	struct devlink_port *port = NULL;
-	struct devlink_snapshot *snapshot;
+	devlink_chunk_fill_t *region_cb;
 	struct devlink_region *region;
 	const char *region_name;
 	struct devlink *devlink;
 	unsigned int index;
-	u32 snapshot_id;
+	void *region_cb_priv;
 	void *hdr;
 	int err;
 
@@ -6546,12 +6566,6 @@  static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 		goto out_unlock;
 	}
 
-	if (!attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID]) {
-		NL_SET_ERR_MSG(cb->extack, "No snapshot id provided");
-		err = -EINVAL;
-		goto out_unlock;
-	}
-
 	if (info->attrs[DEVLINK_ATTR_PORT_INDEX]) {
 		index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]);
 
@@ -6577,12 +6591,30 @@  static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 	}
 
 	snapshot_attr = attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID];
-	snapshot_id = nla_get_u32(snapshot_attr);
-	snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
-	if (!snapshot) {
-		NL_SET_ERR_MSG_ATTR(cb->extack, snapshot_attr, "requested snapshot does not exist");
-		err = -EINVAL;
-		goto out_unlock;
+	if (!snapshot_attr) {
+		if (!region->ops->read) {
+			NL_SET_ERR_MSG(cb->extack, "requested region does not support direct read");
+			err = -EOPNOTSUPP;
+			goto out_unlock;
+		}
+		if (port)
+			region_cb = &devlink_region_port_direct_fill;
+		else
+			region_cb = &devlink_region_direct_fill;
+		region_cb_priv = region;
+	} else {
+		struct devlink_snapshot *snapshot;
+		u32 snapshot_id;
+
+		snapshot_id = nla_get_u32(snapshot_attr);
+		snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
+		if (!snapshot) {
+			NL_SET_ERR_MSG_ATTR(cb->extack, snapshot_attr, "requested snapshot does not exist");
+			err = -EINVAL;
+			goto out_unlock;
+		}
+		region_cb = &devlink_region_snapshot_fill;
+		region_cb_priv = snapshot;
 	}
 
 	if (attrs[DEVLINK_ATTR_REGION_CHUNK_ADDR] &&
@@ -6633,9 +6665,9 @@  static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 		goto nla_put_failure;
 	}
 
-	err = devlink_nl_region_read_fill(skb, &devlink_region_snapshot_fill,
-					  snapshot, start_offset, end_offset,
-					  &ret_offset, cb->extack);
+	err = devlink_nl_region_read_fill(skb, region_cb, region_cb_priv,
+					  start_offset, end_offset, &ret_offset,
+					  cb->extack);
 
 	if (err && err != -EMSGSIZE)
 		goto nla_put_failure;