diff mbox series

[v5,net-next,08/17] net/sched: mqprio: allow reverse TC:TXQ mappings

Message ID 20230202003621.2679603-9-vladimir.oltean@nxp.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ENETC mqprio/taprio cleanup | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 25 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vladimir Oltean Feb. 2, 2023, 12:36 a.m. UTC
By imposing that the last TXQ of TC i is smaller than the first TXQ of
any TC j (j := i+1 .. n), mqprio imposes a strict ordering condition for
the TXQ indices (they must increase as TCs increase).

Claudiu points out that the complexity of the TXQ count validation is
too high for this logic, i.e. instead of iterating over j, it is
sufficient that the TXQ indices of TC i and i + 1 are ordered, and that
will eventually ensure global ordering.

This is true, however it doesn't appear to me that is what the code
really intended to do. Instead, based on the comments, it just wanted to
check for overlaps (and this isn't how one does that).

So the following mqprio configuration, which I had recommended to
Vinicius more than once for igb/igc (to account for the fact that on
this hardware, lower numbered TXQs have higher dequeue priority than
higher ones):

num_tc 4 map 0 1 2 3 queues 1@3 1@2 1@1 1@0

is in fact denied today by mqprio.

The full story is that in fact, it's only denied with "hw 0"; if
hardware offloading is requested, mqprio defers TXQ range overlap
validation to the device driver (a strange decision in itself).

This is most certainly a bug, but it's not one that has any merit for
being fixed on "stable" as far as I can tell. This is because mqprio
always rejected a configuration which was in fact valid, and this has
shaped the way in which mqprio configuration scripts got built for
various hardware (see igb/igc in the link below). Therefore, one could
consider it to be merely an improvement for mqprio to allow reverse
TC:TXQ mappings.

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-9-vladimir.oltean@nxp.com/#25188310
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230128010719.2182346-6-vladimir.oltean@nxp.com/#25186442
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v4->v5: patch is new

 net/sched/sch_mqprio.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Simon Horman Feb. 3, 2023, 4:18 p.m. UTC | #1
On Thu, Feb 02, 2023 at 02:36:12AM +0200, Vladimir Oltean wrote:
> By imposing that the last TXQ of TC i is smaller than the first TXQ of
> any TC j (j := i+1 .. n), mqprio imposes a strict ordering condition for
> the TXQ indices (they must increase as TCs increase).
> 
> Claudiu points out that the complexity of the TXQ count validation is
> too high for this logic, i.e. instead of iterating over j, it is
> sufficient that the TXQ indices of TC i and i + 1 are ordered, and that
> will eventually ensure global ordering.
> 
> This is true, however it doesn't appear to me that is what the code
> really intended to do. Instead, based on the comments, it just wanted to
> check for overlaps (and this isn't how one does that).
> 
> So the following mqprio configuration, which I had recommended to
> Vinicius more than once for igb/igc (to account for the fact that on
> this hardware, lower numbered TXQs have higher dequeue priority than
> higher ones):
> 
> num_tc 4 map 0 1 2 3 queues 1@3 1@2 1@1 1@0
> 
> is in fact denied today by mqprio.
> 
> The full story is that in fact, it's only denied with "hw 0"; if
> hardware offloading is requested, mqprio defers TXQ range overlap
> validation to the device driver (a strange decision in itself).
> 
> This is most certainly a bug, but it's not one that has any merit for
> being fixed on "stable" as far as I can tell. This is because mqprio
> always rejected a configuration which was in fact valid, and this has
> shaped the way in which mqprio configuration scripts got built for
> various hardware (see igb/igc in the link below). Therefore, one could
> consider it to be merely an improvement for mqprio to allow reverse
> TC:TXQ mappings.
> 
> Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-9-vladimir.oltean@nxp.com/#25188310
> Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230128010719.2182346-6-vladimir.oltean@nxp.com/#25186442
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Gerhard Engleder Feb. 5, 2023, 11:55 a.m. UTC | #2
On 02.02.23 01:36, Vladimir Oltean wrote:
> By imposing that the last TXQ of TC i is smaller than the first TXQ of
> any TC j (j := i+1 .. n), mqprio imposes a strict ordering condition for
> the TXQ indices (they must increase as TCs increase).
> 
> Claudiu points out that the complexity of the TXQ count validation is
> too high for this logic, i.e. instead of iterating over j, it is
> sufficient that the TXQ indices of TC i and i + 1 are ordered, and that
> will eventually ensure global ordering.
> 
> This is true, however it doesn't appear to me that is what the code
> really intended to do. Instead, based on the comments, it just wanted to
> check for overlaps (and this isn't how one does that).
> 
> So the following mqprio configuration, which I had recommended to
> Vinicius more than once for igb/igc (to account for the fact that on
> this hardware, lower numbered TXQs have higher dequeue priority than
> higher ones):
> 
> num_tc 4 map 0 1 2 3 queues 1@3 1@2 1@1 1@0
> 
> is in fact denied today by mqprio.
> 
> The full story is that in fact, it's only denied with "hw 0"; if
> hardware offloading is requested, mqprio defers TXQ range overlap
> validation to the device driver (a strange decision in itself).
> 
> This is most certainly a bug, but it's not one that has any merit for
> being fixed on "stable" as far as I can tell. This is because mqprio
> always rejected a configuration which was in fact valid, and this has
> shaped the way in which mqprio configuration scripts got built for
> various hardware (see igb/igc in the link below). Therefore, one could
> consider it to be merely an improvement for mqprio to allow reverse
> TC:TXQ mappings.
> 
> Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-9-vladimir.oltean@nxp.com/#25188310
> Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230128010719.2182346-6-vladimir.oltean@nxp.com/#25186442
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Vladimir Oltean Feb. 5, 2023, 12:22 p.m. UTC | #3
On Sun, Feb 05, 2023 at 12:55:13PM +0100, Gerhard Engleder wrote:
> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>

Thanks. Could you please put this comment on the v6?
https://patchwork.kernel.org/project/netdevbpf/patch/20230204135307.1036988-5-vladimir.oltean@nxp.com/
diff mbox series

Patch

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 3579a64da06e..25ab215641a2 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -27,6 +27,14 @@  struct mqprio_sched {
 	u64 max_rate[TC_QOPT_MAX_QUEUE];
 };
 
+/* Returns true if the intervals [a, b) and [c, d) overlap. */
+static bool intervals_overlap(int a, int b, int c, int d)
+{
+	int left = max(a, c), right = min(b, d);
+
+	return left < right;
+}
+
 static int mqprio_enable_offload(struct Qdisc *sch,
 				 const struct tc_mqprio_qopt *qopt)
 {
@@ -144,7 +152,10 @@  static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
 
 		/* Verify that the offset and counts do not overlap */
 		for (j = i + 1; j < qopt->num_tc; j++) {
-			if (last > qopt->offset[j])
+			if (intervals_overlap(qopt->offset[i], last,
+					      qopt->offset[j],
+					      qopt->offset[j] +
+					      qopt->count[j]))
 				return -EINVAL;
 		}
 	}