diff mbox series

[net-next,v4] net: qualcomm: rmnet: Add side band flow control support

Message ID 20231006001614.1678782-1-quic_subashab@quicinc.com (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers
Series [net-next,v4] net: qualcomm: rmnet: Add side band flow control support

Checks

Context                        | Check   | Description
netdev/series_format           | success | Single patches do not need cover letters
netdev/tree_selection          | success | Clearly marked for net-next
netdev/fixes_present           | success | Fixes tag not required for -next series
netdev/header_inline           | success | No static functions without inline keyword in header files
netdev/build_32bit             | success | Errors and warnings before: 5508 this patch: 5508
netdev/cc_maintainers          | success | CCed 7 of 7 maintainers
netdev/build_clang             | fail    | Errors and warnings before: 312 this patch: 312
netdev/verify_signedoff        | success | Signed-off-by tag matches author and committer
netdev/deprecated_api          | success | None detected
netdev/check_selftest          | success | No net selftest shell script
netdev/verify_fixes            | success | No Fixes tag
netdev/build_allmodconfig_warn | fail    | Errors and warnings before: 4517 this patch: 4517
netdev/checkpatch              | success | total: 0 errors, 0 warnings, 0 checks, 224 lines checked
netdev/kdoc                    | success | Errors and warnings before: 0 this patch: 0
netdev/source_inline           | success | Was 0 now: 0

Commit Message

Subash Abhinov Kasiviswanathan (KS) Oct. 6, 2023, 12:16 a.m. UTC
Individual rmnet devices map to specific network types such as internet,
multimedia messaging services, IP multimedia subsystem etc. Each of
these network types may support varying quality of service for different
bearers or traffic types.

The physical device interconnect to radio hardware may support a
higher data rate than what is actually supported by the radio network.
Any packets transmitted to the radio hardware which exceed the radio
network data rate limit may be dropped. This patch tries to minimize the
loss of packets by adding support for bearer level flow control within a
rmnet device by ensuring that the packets transmitted do not exceed the
limit allowed by the radio network.

In order to support multiple bearers, rmnet must be created as a
multiqueue TX netdevice. Radio hardware communicates the supported
bearer information for a given network via side band signalling.
Consider the following mapping -

IPv4 UDP port 1234 - Mark 0x1001 - Queue 1
IPv6 TCP port 2345 - Mark 0x2001 - Queue 2

iptables can be used to install filters which mark packets matching these
specific traffic patterns and the RMNET_QUEUE_MAPPING_ADD operation can
then be used to install the mapping of the mark to the specific txqueue.

If the traffic limit is exceeded for a particular bearer, radio hardware
would notify that the bearer cannot accept more packets and the
corresponding txqueue traffic can be stopped using RMNET_QUEUE_DISABLE.

Conversely, if radio hardware can send more traffic for a particular
bearer, RMNET_QUEUE_ENABLE can be used to allow traffic on that
particular txqueue. RMNET_QUEUE_MAPPING_REMOVE can be used to remove the
mark to queue mapping in case the radio network doesn't support that
particular bearer any longer.

Signed-off-by: Sean Tranchetti <quic_stranche@quicinc.com>
Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
---
v3->v4
  Update and propagate the queue operation errors to the newlink and
  changelink handlers and also unlink the upper dev in case of failure as
  mentioned by Simon. Additionally, reword the extack error message for
  unsupported operation and return the xarray error instead of a -EINVAL
  in case of failing xarray operations.

v2->v3
 Change the variable declaration ordering to reverse x-mas tree as
 mentioned by Vadim.

v1 -> v2
 Fix incorrect xarray API usage in rmnet_update_queue_map() and remove some
 unnecessary checks in rmnet_vnd_select_queue() as mentioned by Vadim.
 Fix UAPI types as reported by kernel test robot.

 .../ethernet/qualcomm/rmnet/rmnet_config.c    | 100 +++++++++++++++++-
 .../ethernet/qualcomm/rmnet/rmnet_config.h    |   2 +
 .../net/ethernet/qualcomm/rmnet/rmnet_vnd.c   |  22 ++++
 include/uapi/linux/if_link.h                  |  16 +++
 4 files changed, 139 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Oct. 10, 2023, 2:42 a.m. UTC | #1
On Thu,  5 Oct 2023 17:16:14 -0700 Subash Abhinov Kasiviswanathan wrote:
> Individual rmnet devices map to specific network types such as internet,
> multimedia messaging services, IP multimedia subsystem etc. Each of
> these network types may support varying quality of service for different
> bearers or traffic types.
> 
> The physical device interconnect to radio hardware may support a
> higher data rate than what is actually supported by the radio network.
> Any packets transmitted to the radio hardware which exceed the radio
> network data rate limit may be dropped. This patch tries to minimize the
> loss of packets by adding support for bearer level flow control within a
> rmnet device by ensuring that the packets transmitted do not exceed the
> limit allowed by the radio network.
> 
> In order to support multiple bearers, rmnet must be created as a
> multiqueue TX netdevice. Radio hardware communicates the supported
> bearer information for a given network via side band signalling.
> Consider the following mapping -
> 
> IPv4 UDP port 1234 - Mark 0x1001 - Queue 1
> IPv6 TCP port 2345 - Mark 0x2001 - Queue 2
> 
> iptables can be used to install filters which mark packets matching these
> specific traffic patterns and the RMNET_QUEUE_MAPPING_ADD operation can
> then be used to install the mapping of the mark to the specific txqueue.

I don't understand why you need driver specific commands to do this.
It should be easily achievable using existing TC qdisc infra.
What's the gap?

Subash Abhinov Kasiviswanathan (KS) Oct. 10, 2023, 4 a.m. UTC | #2
On 10/9/2023 8:42 PM, Jakub Kicinski wrote:
> On Thu,  5 Oct 2023 17:16:14 -0700 Subash Abhinov Kasiviswanathan wrote:
>> Individual rmnet devices map to specific network types such as internet,
>> multimedia messaging services, IP multimedia subsystem etc. Each of
>> these network types may support varying quality of service for different
>> bearers or traffic types.
>>
>> The physical device interconnect to radio hardware may support a
>> higher data rate than what is actually supported by the radio network.
>> Any packets transmitted to the radio hardware which exceed the radio
>> network data rate limit may be dropped. This patch tries to minimize the
>> loss of packets by adding support for bearer level flow control within a
>> rmnet device by ensuring that the packets transmitted do not exceed the
>> limit allowed by the radio network.
>>
>> In order to support multiple bearers, rmnet must be created as a
>> multiqueue TX netdevice. Radio hardware communicates the supported
>> bearer information for a given network via side band signalling.
>> Consider the following mapping -
>>
>> IPv4 UDP port 1234 - Mark 0x1001 - Queue 1
>> IPv6 TCP port 2345 - Mark 0x2001 - Queue 2
>>
>> iptables can be used to install filters which mark packets matching these
>> specific traffic patterns and the RMNET_QUEUE_MAPPING_ADD operation can
>> then be used to install the mapping of the mark to the specific txqueue.
> 
> I don't understand why you need driver specific commands to do this.
> It should be easily achievable using existing TC qdisc infra.
> What's the gap?

tc doesn't allow userspace to manipulate the flow state (allow / 
disallow traffic) on a specific queue. As I understand, the traffic 
dequeued / queued / dropped on a specific queue of existing qdiscs is 
controlled by the implementation of the qdisc itself.

Jakub Kicinski Oct. 10, 2023, 2:56 p.m. UTC | #3
On Mon, 9 Oct 2023 22:00:40 -0600 Subash Abhinov Kasiviswanathan (KS)
wrote:
> > I don't understand why you need driver specific commands to do this.
> > It should be easily achievable using existing TC qdisc infra.
> > What's the gap?  
> 
> tc doesn't allow userspace to manipulate the flow state (allow / 
> disallow traffic) on a specific queue. As I understand, the traffic 
> dequeued / queued / dropped on a specific queue of existing qdiscs is 
> controlled by the implementation of the qdisc itself.

I'm not sure what you mean. Qdiscs can form hierarchies.
You put mq first and then whatever child qdisc you want for individual
queues.

Subash Abhinov Kasiviswanathan (KS) Oct. 10, 2023, 3:23 p.m. UTC | #4
On 10/10/2023 8:56 AM, Jakub Kicinski wrote:
> On Mon, 9 Oct 2023 22:00:40 -0600 Subash Abhinov Kasiviswanathan (KS)
> wrote:
>>> I don't understand why you need driver specific commands to do this.
>>> It should be easily achievable using existing TC qdisc infra.
>>> What's the gap?
>>
>> tc doesn't allow userspace to manipulate the flow state (allow /
>> disallow traffic) on a specific queue. As I understand, the traffic
>> dequeued / queued / dropped on a specific queue of existing qdiscs is
>> controlled by the implementation of the qdisc itself.
> 
> I'm not sure what you mean. Qdiscs can form hierarchies.
> You put mq first and then whatever child qdisc you want for individual
> queues.

There is no userspace interface exposed today to invoke 
netif_tx_stop_queue(dev, queue) / netif_tx_wake_queue(dev, queue). The 
API itself can only be invoked within kernel.

I was wondering if it would be acceptable to add a user accessible 
interface in core networking to stop_queue / wake_queue instead of the 
driver.

Jakub Kicinski Oct. 10, 2023, 6:21 p.m. UTC | #5
On Tue, 10 Oct 2023 09:23:12 -0600 Subash Abhinov Kasiviswanathan (KS)
wrote:
> > I'm not sure what you mean. Qdiscs can form hierarchies.
> > You put mq first and then whatever child qdisc you want for individual
> > queues.  
> 
> There is no userspace interface exposed today to invoke 
> netif_tx_stop_queue(dev, queue) / netif_tx_wake_queue(dev, queue). The 
> API itself can only be invoked within kernel.
> 
> I was wondering if it would be acceptable to add a user accessible 
> interface in core networking to stop_queue / wake_queue instead of the 
> driver.

Maybe not driver queue control but if there's no qdisc which allows
users to pause from user space, I think that would be a much easier
sale.

That said the flow of the whole thing seems a bit complex.
Can't the driver somehow be notified by the device directly?
User space will suffer from all sorts of wake up / scheduling
latencies, it'd be better if the whole sleep / wake thing was 
handled in the kernel.

Subash Abhinov Kasiviswanathan (KS) Oct. 10, 2023, 9:32 p.m. UTC | #6
On 10/10/2023 12:21 PM, Jakub Kicinski wrote:
> On Tue, 10 Oct 2023 09:23:12 -0600 Subash Abhinov Kasiviswanathan (KS)
> wrote:
>>
>> I was wondering if it would be acceptable to add a user accessible
>> interface in core networking to stop_queue / wake_queue instead of the
>> driver.
> 
> Maybe not driver queue control but if there's no qdisc which allows
> users to pause from user space, I think that would be a much easier
> sale.
> 
> That said the flow of the whole thing seems a bit complex.
> Can't the driver somehow be notified by the device directly?
> User space will suffer from all sorts of wake up / scheduling
> latencies, it'd be better if the whole sleep / wake thing was
> handled in the kernel.

Our userspace module relies on various inputs from radio hardware and 
has proprietary logic to determine when to transmit / stop sending 
packets corresponding to a specific bearer. I agree that an in kernel 
scheme might be faster than a userspace-kernel solution. However, I 
believe that this latency impact could be reduced through schemes like 
setting process priority, pinning applications in isolated cores etc.

Subash Abhinov Kasiviswanathan (KS) Oct. 11, 2023, 12:35 a.m. UTC | #7
On 10/10/2023 3:32 PM, Subash Abhinov Kasiviswanathan (KS) wrote:
> 
> 
> On 10/10/2023 12:21 PM, Jakub Kicinski wrote:
>> On Tue, 10 Oct 2023 09:23:12 -0600 Subash Abhinov Kasiviswanathan (KS)
>> wrote:
>>>
>>> I was wondering if it would be acceptable to add a user accessible
>>> interface in core networking to stop_queue / wake_queue instead of the
>>> driver.
>>
>> Maybe not driver queue control but if there's no qdisc which allows
>> users to pause from user space, I think that would be a much easier
>> sale.
>>
>> That said the flow of the whole thing seems a bit complex.
>> Can't the driver somehow be notified by the device directly?
>> User space will suffer from all sorts of wake up / scheduling
>> latencies, it'd be better if the whole sleep / wake thing was
>> handled in the kernel.
> 
> Our userspace module relies on various inputs from radio hardware and 
> has proprietary logic to determine when to transmit / stop sending 
> packets corresponding to a specific bearer. I agree that an in kernel 
> scheme might be faster than a userspace-kernel solution. However, I 
> believe that this latency impact could be reduced through schemes like 
> setting process priority, pinning applications in isolated cores etc.

After reviewing the set of qdiscs more closely, it appears that my understanding 
was incorrect as tc-plug provides userspace controllable queuing. Thanks 
for the help and discussion!
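
For readers unfamiliar with the idea of a user-controllable pause, the
block / release behaviour discussed in this thread can be modelled roughly
as below. This is a simplified user-space sketch and not the plug qdisc
implementation: the gate_* names and GATE_LIMIT are hypothetical. The real
qdisc is driven over netlink through the TCQ_PLUG_* actions in
<linux/pkt_sched.h> and also supports releasing a single buffered batch,
which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GATE_LIMIT 8	/* hypothetical buffer size, in packets */

/* A FIFO whose dequeue side can be blocked and released on demand. */
struct gate {
	int buf[GATE_LIMIT];	/* packet ids stand in for real packets */
	size_t head;
	size_t len;
	bool blocked;
};

/* Queue a packet; returns false (packet dropped) when the buffer is full. */
static bool gate_enqueue(struct gate *g, int pkt)
{
	if (g->len == GATE_LIMIT)
		return false;
	g->buf[(g->head + g->len) % GATE_LIMIT] = pkt;
	g->len++;
	return true;
}

/* Hand the next packet to the device unless the gate is blocked or empty. */
static bool gate_dequeue(struct gate *g, int *pkt)
{
	if (g->blocked || g->len == 0)
		return false;
	*pkt = g->buf[g->head];
	g->head = (g->head + 1) % GATE_LIMIT;
	g->len--;
	return true;
}

/* User space asks to hold traffic: packets keep queueing but do not leave. */
static void gate_block(struct gate *g)
{
	g->blocked = true;
}

/* User space allows traffic again: buffered packets drain in FIFO order. */
static void gate_release(struct gate *g)
{
	g->blocked = false;
}
```

While the gate is blocked, enqueued packets accumulate up to GATE_LIMIT and
further ones are dropped; after gate_release() they are dequeued in the
order in which they arrived.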

Patch

diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
index 39d24e07f306..038d32ab84d6 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -1,5 +1,6 @@ 
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
  *
  * RMNET configuration engine
  */
@@ -19,6 +20,7 @@ 
 static const struct nla_policy rmnet_policy[IFLA_RMNET_MAX + 1] = {
 	[IFLA_RMNET_MUX_ID]	= { .type = NLA_U16 },
 	[IFLA_RMNET_FLAGS]	= { .len = sizeof(struct ifla_rmnet_flags) },
+	[IFLA_RMNET_QUEUE]	= { .len = sizeof(struct rmnet_queue_mapping) },
 };
 
 static int rmnet_is_real_dev_registered(const struct net_device *real_dev)
@@ -88,6 +90,66 @@  static int rmnet_register_real_device(struct net_device *real_dev,
 	return 0;
 }
 
+static int rmnet_update_queue_map(struct net_device *dev, u8 operation,
+				  u8 txqueue, u32 mark,
+				  struct netlink_ext_ack *extack)
+{
+	struct rmnet_priv *priv = netdev_priv(dev);
+	struct netdev_queue *q;
+	void *p;
+	u8 txq;
+
+	if (unlikely(txqueue >= dev->num_tx_queues)) {
+		NL_SET_ERR_MSG_MOD(extack, "invalid txqueue");
+		return -EINVAL;
+	}
+
+	switch (operation) {
+	case RMNET_QUEUE_MAPPING_ADD:
+		p = xa_store(&priv->queue_map, mark, xa_mk_value(txqueue),
+			     GFP_ATOMIC);
+		if (xa_is_err(p)) {
+			NL_SET_ERR_MSG_MOD(extack, "unable to add mapping");
+			return xa_err(p);
+		}
+		break;
+	case RMNET_QUEUE_MAPPING_REMOVE:
+		p = xa_erase(&priv->queue_map, mark);
+		if (xa_is_err(p)) {
+			NL_SET_ERR_MSG_MOD(extack, "unable to remove mapping");
+			return xa_err(p);
+		}
+		break;
+	case RMNET_QUEUE_ENABLE:
+	case RMNET_QUEUE_DISABLE:
+		p = xa_load(&priv->queue_map, mark);
+		if (p && xa_is_value(p)) {
+			txq = xa_to_value(p);
+
+			q = netdev_get_tx_queue(dev, txq);
+			if (unlikely(!q)) {
+				NL_SET_ERR_MSG_MOD(extack,
+						   "unsupported queue mapping");
+				return -EINVAL;
+			}
+
+			if (operation == RMNET_QUEUE_ENABLE)
+				netif_tx_wake_queue(q);
+			else
+				netif_tx_stop_queue(q);
+		} else {
+			NL_SET_ERR_MSG_MOD(extack, "invalid queue mapping");
+			return -EINVAL;
+		}
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "unsupported queue operation");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
 static void rmnet_unregister_bridge(struct rmnet_port *port)
 {
 	struct net_device *bridge_dev, *real_dev, *rmnet_dev;
@@ -175,8 +237,26 @@  static int rmnet_newlink(struct net *src_net, struct net_device *dev,
 	netdev_dbg(dev, "data format [0x%08X]\n", data_format);
 	port->data_format = data_format;
 
+	if (data[IFLA_RMNET_QUEUE]) {
+		struct rmnet_queue_mapping *queue_map;
+
+		queue_map = nla_data(data[IFLA_RMNET_QUEUE]);
+		err = rmnet_update_queue_map(dev, queue_map->operation,
+					     queue_map->txqueue,
+					     queue_map->mark, extack);
+		if (err < 0)
+			goto err3;
+
+		netdev_dbg(dev, "op %02x txq %02x mark %08x\n",
+			   queue_map->operation, queue_map->txqueue,
+			   queue_map->mark);
+	}
+
 	return 0;
 
+err3:
+	hlist_del_init_rcu(&ep->hlnode);
+	netdev_upper_dev_unlink(real_dev, dev);
 err2:
 	unregister_netdevice(dev);
 	rmnet_vnd_dellink(mux_id, port, ep);
@@ -352,6 +432,22 @@  static int rmnet_changelink(struct net_device *dev, struct nlattr *tb[],
 		}
 	}
 
+	if (data[IFLA_RMNET_QUEUE]) {
+		struct rmnet_queue_mapping *queue_map;
+		int err;
+
+		queue_map = nla_data(data[IFLA_RMNET_QUEUE]);
+		err = rmnet_update_queue_map(dev, queue_map->operation,
+					     queue_map->txqueue,
+					     queue_map->mark, extack);
+		if (err < 0)
+			return err;
+
+		netdev_dbg(dev, "op %02x txq %02x mark %08x\n",
+			   queue_map->operation, queue_map->txqueue,
+			   queue_map->mark);
+	}
+
 	return 0;
 }
 
@@ -361,7 +457,9 @@  static size_t rmnet_get_size(const struct net_device *dev)
 		/* IFLA_RMNET_MUX_ID */
 		nla_total_size(2) +
 		/* IFLA_RMNET_FLAGS */
-		nla_total_size(sizeof(struct ifla_rmnet_flags));
+		nla_total_size(sizeof(struct ifla_rmnet_flags)) +
+		/* IFLA_RMNET_QUEUE */
+		nla_total_size(sizeof(struct rmnet_queue_mapping));
 }
 
 static int rmnet_fill_info(struct sk_buff *skb, const struct net_device *dev)
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
index ed112d51ac5a..ae8300fc5ed7 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
@@ -1,6 +1,7 @@ 
 /* SPDX-License-Identifier: GPL-2.0-only */
 /* Copyright (c) 2013-2014, 2016-2018, 2021 The Linux Foundation.
  * All rights reserved.
+ * Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
  *
  * RMNET Data configuration engine
  */
@@ -87,6 +88,7 @@  struct rmnet_priv {
 	struct rmnet_pcpu_stats __percpu *pcpu_stats;
 	struct gro_cells gro_cells;
 	struct rmnet_priv_stats stats;
+	struct xarray queue_map;
 };
 
 struct rmnet_port *rmnet_get_port_rcu(struct net_device *real_dev);
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
index 046b5f7d8e7c..de2792231293 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -1,5 +1,6 @@ 
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
  *
  * RMNET Data virtual network driver
  */
@@ -158,6 +159,24 @@  static void rmnet_get_stats64(struct net_device *dev,
 	s->tx_dropped = total_stats.tx_drops;
 }
 
+static u16 rmnet_vnd_select_queue(struct net_device *dev,
+				  struct sk_buff *skb,
+				  struct net_device *sb_dev)
+{
+	struct rmnet_priv *priv = netdev_priv(dev);
+	void *p;
+	u8 txq;
+
+	p = xa_load(&priv->queue_map, skb->mark);
+	if (!p || !xa_is_value(p))
+		return 0;
+
+	txq = xa_to_value(p);
+
+	netdev_dbg(dev, "mark %08x -> txq %02x\n", skb->mark, txq);
+	return txq;
+}
+
 static const struct net_device_ops rmnet_vnd_ops = {
 	.ndo_start_xmit = rmnet_vnd_start_xmit,
 	.ndo_change_mtu = rmnet_vnd_change_mtu,
@@ -167,6 +186,7 @@  static const struct net_device_ops rmnet_vnd_ops = {
 	.ndo_init       = rmnet_vnd_init,
 	.ndo_uninit     = rmnet_vnd_uninit,
 	.ndo_get_stats64 = rmnet_get_stats64,
+	.ndo_select_queue = rmnet_vnd_select_queue,
 };
 
 static const char rmnet_gstrings_stats[][ETH_GSTRING_LEN] = {
@@ -334,6 +354,8 @@  int rmnet_vnd_newlink(u8 id, struct net_device *rmnet_dev,
 
 		priv->mux_id = id;
 
+		xa_init(&priv->queue_map);
+
 		netdev_dbg(rmnet_dev, "rmnet dev created\n");
 	}
 
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index fac351a93aed..452867d5246a 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1368,6 +1368,7 @@  enum {
 	IFLA_RMNET_UNSPEC,
 	IFLA_RMNET_MUX_ID,
 	IFLA_RMNET_FLAGS,
+	IFLA_RMNET_QUEUE,
 	__IFLA_RMNET_MAX,
 };
 
@@ -1378,6 +1379,21 @@  struct ifla_rmnet_flags {
 	__u32	mask;
 };
 
+enum {
+	RMNET_QUEUE_OPERATION_UNSPEC,
+	RMNET_QUEUE_MAPPING_ADD,	/* Add new queue <-> mark mapping */
+	RMNET_QUEUE_MAPPING_REMOVE,	/* Remove queue <-> mark mapping */
+	RMNET_QUEUE_ENABLE,		/* Allow traffic on an existing queue */
+	RMNET_QUEUE_DISABLE,		/* Stop traffic on an existing queue */
+};
+
+struct rmnet_queue_mapping {
+	__u8	operation;
+	__u8	txqueue;
+	__u16	padding;
+	__u32	mark;
+};
+
 /* MCTP section */
 
 enum {