
[v4,net-next,04/15] net: enetc: ensure we always have a minimum number of TXQs for stack

Message ID 20230130173145.475943-5-vladimir.oltean@nxp.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series ENETC mqprio/taprio cleanup

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/cc_maintainers warning 6 maintainers not CCed: john.fastabend@gmail.com daniel@iogearbox.net bpf@vger.kernel.org hawk@kernel.org ast@kernel.org linux@armlinux.org.uk
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 3 this patch: 3
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 50 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vladimir Oltean Jan. 30, 2023, 5:31 p.m. UTC
Currently it can happen that an mqprio qdisc is installed with num_tc 8,
and this will reserve 8 (out of 8) TXQs for the network stack. Then we
can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for
mqprio. That's not what the user requested, and we should fail it.

On the other hand, if mqprio isn't requested, we still give the 8 TXQs
to the network stack (with hashing among a single traffic class), but
then, cropping 2 TXQs for XDP is fine, because the user didn't
explicitly ask for any number of TXQs, so no expectations are violated.

Simply put, the logic that mqprio should impose a minimum number of TXQs
for the network stack never existed. Let's say (more or less arbitrarily)
that without mqprio, the driver expects a minimum number of TXQs equal to
the number of CPUs (on NXP LS1028A, that is either 1 or 2), and that with
mqprio, the requested number of traffic classes gives the minimum required
number of TXQs.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v3->v4: none
v2->v3: move min_num_stack_tx_queues definition so it doesn't conflict
        with the ethtool mm patches I haven't submitted yet for enetc
        (and also to make use of a 4 byte hole)
v1->v2: patch is new

 drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++++++
 drivers/net/ethernet/freescale/enetc/enetc.h |  3 +++
 2 files changed, 17 insertions(+)
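
For clarity, the accounting this patch enforces can be sketched as a small
stand-alone C program. This is only an illustration with assumed LS1028A
values (8 TX rings, 2 CPUs, one XDP TXQ per CPU); it is not the driver code
itself.

#include <stdio.h>

#define NUM_TX_RINGS	8
#define NUM_CPUS	2

/* Minimum TXQs the stack needs: num_tc with mqprio, else one per CPU */
static unsigned int min_stack_txqs(unsigned int mqprio_num_tc)
{
	return mqprio_num_tc ? mqprio_num_tc : NUM_CPUS;
}

/* Attaching XDP reserves one TXQ per CPU; the attach is only allowed
 * if that still leaves the stack its minimum.
 */
static int can_attach_xdp(unsigned int mqprio_num_tc)
{
	return min_stack_txqs(mqprio_num_tc) + NUM_CPUS <= NUM_TX_RINGS;
}

int main(void)
{
	/* no mqprio: 2 + 2 <= 8, XDP attach is allowed */
	printf("no mqprio: %s\n", can_attach_xdp(0) ? "ok" : "rejected");
	/* mqprio num_tc 8: 8 + 2 > 8, XDP attach is rejected */
	printf("num_tc 8:  %s\n", can_attach_xdp(8) ? "ok" : "rejected");
	return 0;
}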

Comments

Simon Horman Feb. 1, 2023, 1:43 p.m. UTC | #1
On Mon, Jan 30, 2023 at 07:31:34PM +0200, Vladimir Oltean wrote:
> Currently it can happen that an mqprio qdisc is installed with num_tc 8,
> and this will reserve 8 (out of 8) TXQs for the network stack. Then we
> can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for
> mqprio. That's not what the user requested, and we should fail it.
> 
> On the other hand, if mqprio isn't requested, we still give the 8 TXQs
> to the network stack (with hashing among a single traffic class), but
> then, cropping 2 TXQs for XDP is fine, because the user didn't
> explicitly ask for any number of TXQs, so no expectations are violated.
> 
> Simply put, the logic that mqprio should impose a minimum number of TXQs
> for the network stack never existed. Let's say (more or less arbitrarily)
> that without mqprio, the driver expects a minimum number of TXQs equal to
> the number of CPUs (on NXP LS1028A, that is either 1 or 2), and that with
> mqprio, the requested number of traffic classes gives the minimum required
> number of TXQs.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

The nit below notwithstanding,

Reviewed-by: Simon Horman <simon.horman@corigine.com>

...

> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> index 1fe8dfd6b6d4..e21d096c5a90 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
>  
>  	struct psfp_cap psfp_cap;
>  
> +	/* Minimum number of TX queues required by the network stack */
> +	unsigned int min_num_stack_tx_queues;
> +

It is probably not important.
But I do notice there are several holes in struct enetc_ndev_priv
that would fit this field.

>  	struct phylink *phylink;
>  	int ic_mode;
>  	u32 tx_ictt;
> -- 
> 2.34.1
>
Vladimir Oltean Feb. 1, 2023, 6:46 p.m. UTC | #2
Hi Simon,

On Wed, Feb 01, 2023 at 02:43:44PM +0100, Simon Horman wrote:
> The nit below notwithstanding,
> 
> Reviewed-by: Simon Horman <simon.horman@corigine.com>

I appreciate you taking the time to review this patch set.

> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> > index 1fe8dfd6b6d4..e21d096c5a90 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> > @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
> >  
> >  	struct psfp_cap psfp_cap;
> >  
> > +	/* Minimum number of TX queues required by the network stack */
> > +	unsigned int min_num_stack_tx_queues;
> > +
> 
> It is probably not important.
> But I do notice there are several holes in struct enetc_ndev_priv
> that would fit this field.

This is true. However, this patch was written taking pahole into
consideration, and one new field can only fill a single hole :)

Before:

pahole -C enetc_ndev_priv $KBUILD_OUTPUT/drivers/net/ethernet/freescale/enetc/enetc.o
struct enetc_ndev_priv {
        struct net_device *        ndev;                 /*     0     8 */
        struct device *            dev;                  /*     8     8 */
        struct enetc_si *          si;                   /*    16     8 */
        int                        bdr_int_num;          /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_int_vector *  int_vector[2];        /*    32    16 */
        u16                        num_rx_rings;         /*    48     2 */
        u16                        num_tx_rings;         /*    50     2 */
        u16                        rx_bd_count;          /*    52     2 */
        u16                        tx_bd_count;          /*    54     2 */
        u16                        msg_enable;           /*    56     2 */

        /* XXX 2 bytes hole, try to pack */

        enum enetc_active_offloads active_offloads;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        speed;                /*    64     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_bdr * *       xdp_tx_ring;          /*    72     8 */
        struct enetc_bdr *         tx_ring[16];          /*    80   128 */
        /* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
        struct enetc_bdr *         rx_ring[16];          /*   208   128 */
        /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
        const struct enetc_bdr_resource  * tx_res;       /*   336     8 */
        const struct enetc_bdr_resource  * rx_res;       /*   344     8 */
        struct enetc_cls_rule *    cls_rules;            /*   352     8 */
        struct psfp_cap            psfp_cap;             /*   360    20 */

        /* XXX 4 bytes hole, try to pack */

        /* --- cacheline 6 boundary (384 bytes) --- */
        struct phylink *           phylink;              /*   384     8 */
        int                        ic_mode;              /*   392     4 */
        u32                        tx_ictt;              /*   396     4 */
        struct bpf_prog *          xdp_prog;             /*   400     8 */
        long unsigned int          flags;                /*   408     8 */
        struct work_struct         tx_onestep_tstamp;    /*   416     0 */

        /* XXX 32 bytes hole, try to pack */

        /* --- cacheline 7 boundary (448 bytes) --- */
        struct sk_buff_head        tx_skbs;              /*   448     0 */

        /* size: 472, cachelines: 8, members: 26 */
        /* sum members: 402, holes: 5, sum holes: 46 */
        /* padding: 24 */
        /* last cacheline: 24 bytes */
};

After:

struct enetc_ndev_priv {
        struct net_device *        ndev;                 /*     0     8 */
        struct device *            dev;                  /*     8     8 */
        struct enetc_si *          si;                   /*    16     8 */
        int                        bdr_int_num;          /*    24     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_int_vector *  int_vector[2];        /*    32    16 */
        u16                        num_rx_rings;         /*    48     2 */
        u16                        num_tx_rings;         /*    50     2 */
        u16                        rx_bd_count;          /*    52     2 */
        u16                        tx_bd_count;          /*    54     2 */
        u16                        msg_enable;           /*    56     2 */

        /* XXX 2 bytes hole, try to pack */

        enum enetc_active_offloads active_offloads;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        speed;                /*    64     4 */

        /* XXX 4 bytes hole, try to pack */

        struct enetc_bdr * *       xdp_tx_ring;          /*    72     8 */
        struct enetc_bdr *         tx_ring[16];          /*    80   128 */
        /* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
        struct enetc_bdr *         rx_ring[16];          /*   208   128 */
        /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
        const struct enetc_bdr_resource  * tx_res;       /*   336     8 */
        const struct enetc_bdr_resource  * rx_res;       /*   344     8 */
        struct enetc_cls_rule *    cls_rules;            /*   352     8 */
        struct psfp_cap            psfp_cap;             /*   360    20 */
        unsigned int               min_num_stack_tx_queues; /*   380     4 */
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct phylink *           phylink;              /*   384     8 */
        int                        ic_mode;              /*   392     4 */
        u32                        tx_ictt;              /*   396     4 */
        struct bpf_prog *          xdp_prog;             /*   400     8 */
        long unsigned int          flags;                /*   408     8 */
        struct work_struct         tx_onestep_tstamp;    /*   416     0 */

        /* XXX 32 bytes hole, try to pack */

        /* --- cacheline 7 boundary (448 bytes) --- */
        struct sk_buff_head        tx_skbs;              /*   448     0 */

        /* size: 472, cachelines: 8, members: 27 */
        /* sum members: 406, holes: 4, sum holes: 42 */
        /* padding: 24 */
        /* last cacheline: 24 bytes */
};
Simon Horman Feb. 2, 2023, 9:29 a.m. UTC | #3
On Wed, Feb 01, 2023 at 08:46:52PM +0200, Vladimir Oltean wrote:
> Hi Simon,
> 
> On Wed, Feb 01, 2023 at 02:43:44PM +0100, Simon Horman wrote:
> > The nit below notwithstanding,
> > 
> > Reviewed-by: Simon Horman <simon.horman@corigine.com>
> 
> I appreciate you taking the time to review this patch set.
> 
> > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
> > > index 1fe8dfd6b6d4..e21d096c5a90 100644
> > > --- a/drivers/net/ethernet/freescale/enetc/enetc.h
> > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.h
> > > @@ -369,6 +369,9 @@ struct enetc_ndev_priv {
> > >  
> > >  	struct psfp_cap psfp_cap;
> > >  
> > > +	/* Minimum number of TX queues required by the network stack */
> > > +	unsigned int min_num_stack_tx_queues;
> > > +
> > 
> > It is probably not important.
> > But I do notice there are several holes in struct enetc_ndev_priv
> > that would fit this field.
> 
> This is true. However, this patch was written taking pahole into
> consideration, and one new field can only fill a single hole :)

Yes, indeed. Silly me.
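
The point about one new field only being able to fill a single hole is
ordinary struct packing. A generic sketch (hypothetical struct, not enetc;
assuming 64-bit pointers):

struct example_unpacked {
	void *p1;	/* offset  0, size 8 */
	int a;		/* offset  8, size 4 */
	/* 4-byte hole: p2 needs 8-byte alignment */
	void *p2;	/* offset 16, size 8 */
	int b;		/* offset 24, size 4, then 4 bytes tail padding */
};			/* total size 32 */

struct example_packed {
	void *p1;	/* offset  0, size 8 */
	int a;		/* offset  8, size 4 */
	int b;		/* offset 12, size 4: fills the hole after a */
	void *p2;	/* offset 16, size 8 */
};			/* total size 24, no holes */

In the same way, min_num_stack_tx_queues lands in the 4-byte hole after
psfp_cap (offset 380 in the "After" layout above), so the overall size of
struct enetc_ndev_priv stays at 472 bytes.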

Patch

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e18a6c834eb4..1c0aeaa13cde 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2626,6 +2626,7 @@  int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 	if (!num_tc) {
 		netdev_reset_tc(ndev);
 		netif_set_real_num_tx_queues(ndev, num_stack_tx_queues);
+		priv->min_num_stack_tx_queues = num_possible_cpus();
 
 		/* Reset all ring priorities to 0 */
 		for (i = 0; i < priv->num_tx_rings; i++) {
@@ -2656,6 +2657,7 @@  int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 
 	/* Reset the number of netdev queues based on the TC count */
 	netif_set_real_num_tx_queues(ndev, num_tc);
+	priv->min_num_stack_tx_queues = num_tc;
 
 	netdev_set_num_tc(ndev, num_tc);
 
@@ -2702,9 +2704,20 @@  static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 static int enetc_setup_xdp_prog(struct net_device *ndev, struct bpf_prog *prog,
 				struct netlink_ext_ack *extack)
 {
+	int num_xdp_tx_queues = prog ? num_possible_cpus() : 0;
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	bool extended;
 
+	if (priv->min_num_stack_tx_queues + num_xdp_tx_queues >
+	    priv->num_tx_rings) {
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "Reserving %d XDP TXQs does not leave a minimum of %d TXQs for network stack (total %d available)",
+				       num_xdp_tx_queues,
+				       priv->min_num_stack_tx_queues,
+				       priv->num_tx_rings);
+		return -EBUSY;
+	}
+
 	extended = !!(priv->active_offloads & ENETC_F_RX_TSTAMP);
 
 	/* The buffer layout is changing, so we need to drain the old
@@ -2989,6 +3002,7 @@  int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 	if (err)
 		goto fail;
 
+	priv->min_num_stack_tx_queues = num_possible_cpus();
 	first_xdp_tx_ring = priv->num_tx_rings - num_possible_cpus();
 	priv->xdp_tx_ring = &priv->tx_ring[first_xdp_tx_ring];
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 1fe8dfd6b6d4..e21d096c5a90 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -369,6 +369,9 @@  struct enetc_ndev_priv {
 
 	struct psfp_cap psfp_cap;
 
+	/* Minimum number of TX queues required by the network stack */
+	unsigned int min_num_stack_tx_queues;
+
 	struct phylink *phylink;
 	int ic_mode;
 	u32 tx_ictt;