diff mbox series

ixgbe: let the xdpdrv work with more than 64 cpus

Message ID 20210824104918.7930-1-kerneljasonxing@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ixgbe: let the xdpdrv work with more than 64 cpus | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cc_maintainers success CCed 16 of 16 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes fail Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 17 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link

Commit Message

Jason Xing Aug. 24, 2021, 10:49 a.m. UTC
From: Jason Xing <xingwanli@kuaishou.com>

Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
server is equipped with more than 64 cpus online. So it turns out that
the loading of xdpdrv causes the "NOMEM" failure.

Actually, we can adjust the algorithm and then make it work, which has
no harm at all, only if we set the maxmium number of xdp queues.

Fixes: 33fdc82f08 ("ixgbe: add support for XDP_TX action")
Co-developed-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Shujin Li <lishujin@kuaishou.com>
Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

Comments

Jesper Dangaard Brouer Aug. 24, 2021, 1:32 p.m. UTC | #1
On 24/08/2021 12.49, kerneljasonxing@gmail.com wrote:
> From: Jason Xing <xingwanli@kuaishou.com>
> 
> Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
> server is equipped with more than 64 cpus online. So it turns out that
> the loading of xdpdrv causes the "NOMEM" failure.
> 
> Actually, we can adjust the algorithm and then make it work, which has
> no harm at all, only if we set the maxmium number of xdp queues.

This is not true, it can cause harm, because XDP transmission queues are 
used without locking. See drivers ndo_xdp_xmit function ixgbe_xdp_xmit().
As driver assumption is that each CPU have its own XDP TX-queue.

This patch is not a proper fix.

I do think we need a proper fix for this issue on ixgbe.


> Fixes: 33fdc82f08 ("ixgbe: add support for XDP_TX action")
> Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
> ---
>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 2 +-
>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ---
>   2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index 0218f6c..5953996 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -299,7 +299,7 @@ static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
>   
>   static int ixgbe_xdp_queues(struct ixgbe_adapter *adapter)
>   {
> -	return adapter->xdp_prog ? nr_cpu_ids : 0;
> +	return adapter->xdp_prog ? min_t(int, MAX_XDP_QUEUES, nr_cpu_ids) : 0;
>   }
>   
>   #define IXGBE_RSS_64Q_MASK	0x3F
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 14aea40..b36d16b 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10130,9 +10130,6 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>   			return -EINVAL;
>   	}
>   
> -	if (nr_cpu_ids > MAX_XDP_QUEUES)
> -		return -ENOMEM;
> -
>   	old_prog = xchg(&adapter->xdp_prog, prog);
>   	need_reset = (!!prog != !!old_prog);
>   
>
Jason Xing Aug. 24, 2021, 3:23 p.m. UTC | #2
On Tue, Aug 24, 2021 at 9:32 PM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
>
> On 24/08/2021 12.49, kerneljasonxing@gmail.com wrote:
> > From: Jason Xing <xingwanli@kuaishou.com>
> >
> > Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
> > server is equipped with more than 64 cpus online. So it turns out that
> > the loading of xdpdrv causes the "NOMEM" failure.
> >
> > Actually, we can adjust the algorithm and then make it work, which has
> > no harm at all, only if we set the maxmium number of xdp queues.
>
> This is not true, it can cause harm, because XDP transmission queues are
> used without locking. See drivers ndo_xdp_xmit function ixgbe_xdp_xmit().
> As driver assumption is that each CPU have its own XDP TX-queue.
>

Point taken. I indeed miss that part which would cause bad behavior if it
happens.

At this point, I think I should find all the allocation and use of XDP
related, say,
queues and rings, then adjust them all?

Let's say if the server is shipped with 128 cpus, we could map 128 cpus to 64
rings in the function ixgbe_xdp_xmit(). However, it sounds a little bit odd.

Do you think that it makes any sense?

Thanks,
Jason

> This patch is not a proper fix.
>
> I do think we need a proper fix for this issue on ixgbe.
>
>
> > Fixes: 33fdc82f08 ("ixgbe: add support for XDP_TX action")
> > Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> > Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> > Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
> > ---
> >   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 2 +-
> >   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ---
> >   2 files changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > index 0218f6c..5953996 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > @@ -299,7 +299,7 @@ static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
> >
> >   static int ixgbe_xdp_queues(struct ixgbe_adapter *adapter)
> >   {
> > -     return adapter->xdp_prog ? nr_cpu_ids : 0;
> > +     return adapter->xdp_prog ? min_t(int, MAX_XDP_QUEUES, nr_cpu_ids) : 0;
> >   }
> >
> >   #define IXGBE_RSS_64Q_MASK  0x3F
> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > index 14aea40..b36d16b 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > @@ -10130,9 +10130,6 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
> >                       return -EINVAL;
> >       }
> >
> > -     if (nr_cpu_ids > MAX_XDP_QUEUES)
> > -             return -ENOMEM;
> > -
> >       old_prog = xchg(&adapter->xdp_prog, prog);
> >       need_reset = (!!prog != !!old_prog);
> >
> >
>
Fijalkowski, Maciej Aug. 24, 2021, 3:32 p.m. UTC | #3
On Tue, Aug 24, 2021 at 11:23:29PM +0800, Jason Xing wrote:
> On Tue, Aug 24, 2021 at 9:32 PM Jesper Dangaard Brouer
> <jbrouer@redhat.com> wrote:
> >
> >
> >
> > On 24/08/2021 12.49, kerneljasonxing@gmail.com wrote:
> > > From: Jason Xing <xingwanli@kuaishou.com>
> > >
> > > Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
> > > server is equipped with more than 64 cpus online. So it turns out that
> > > the loading of xdpdrv causes the "NOMEM" failure.
> > >
> > > Actually, we can adjust the algorithm and then make it work, which has
> > > no harm at all, only if we set the maxmium number of xdp queues.
> >
> > This is not true, it can cause harm, because XDP transmission queues are
> > used without locking. See drivers ndo_xdp_xmit function ixgbe_xdp_xmit().
> > As driver assumption is that each CPU have its own XDP TX-queue.

Thanks Jesper for chiming in.

> >
> 
> Point taken. I indeed miss that part which would cause bad behavior if it
> happens.
> 
> At this point, I think I should find all the allocation and use of XDP
> related, say,
> queues and rings, then adjust them all?
> 
> Let's say if the server is shipped with 128 cpus, we could map 128 cpus to 64
> rings in the function ixgbe_xdp_xmit(). However, it sounds a little bit odd.
> 
> Do you think that it makes any sense?

We need a fallback path for ixgbe. I did the following for ice:
https://x-lore.kernel.org/bpf/20210819120004.34392-9-maciej.fijalkowski@intel.com/T/#u

> 
> Thanks,
> Jason
> 
> > This patch is not a proper fix.
> >
> > I do think we need a proper fix for this issue on ixgbe.
> >
> >
> > > Fixes: 33fdc82f08 ("ixgbe: add support for XDP_TX action")
> > > Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> > > Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> > > Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
> > > ---
> > >   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 2 +-
> > >   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ---
> > >   2 files changed, 1 insertion(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > index 0218f6c..5953996 100644
> > > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > @@ -299,7 +299,7 @@ static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
> > >
> > >   static int ixgbe_xdp_queues(struct ixgbe_adapter *adapter)
> > >   {
> > > -     return adapter->xdp_prog ? nr_cpu_ids : 0;
> > > +     return adapter->xdp_prog ? min_t(int, MAX_XDP_QUEUES, nr_cpu_ids) : 0;
> > >   }
> > >
> > >   #define IXGBE_RSS_64Q_MASK  0x3F
> > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > index 14aea40..b36d16b 100644
> > > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > @@ -10130,9 +10130,6 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
> > >                       return -EINVAL;
> > >       }
> > >
> > > -     if (nr_cpu_ids > MAX_XDP_QUEUES)
> > > -             return -ENOMEM;
> > > -
> > >       old_prog = xchg(&adapter->xdp_prog, prog);
> > >       need_reset = (!!prog != !!old_prog);
> > >
> > >
> >
Jason Xing Aug. 25, 2021, 11:59 a.m. UTC | #4
On Tue, Aug 24, 2021 at 11:48 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Tue, Aug 24, 2021 at 11:23:29PM +0800, Jason Xing wrote:
> > On Tue, Aug 24, 2021 at 9:32 PM Jesper Dangaard Brouer
> > <jbrouer@redhat.com> wrote:
> > >
> > >
> > >
> > > On 24/08/2021 12.49, kerneljasonxing@gmail.com wrote:
> > > > From: Jason Xing <xingwanli@kuaishou.com>
> > > >
> > > > Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
> > > > server is equipped with more than 64 cpus online. So it turns out that
> > > > the loading of xdpdrv causes the "NOMEM" failure.
> > > >
> > > > Actually, we can adjust the algorithm and then make it work, which has
> > > > no harm at all, only if we set the maxmium number of xdp queues.
> > >
> > > This is not true, it can cause harm, because XDP transmission queues are
> > > used without locking. See drivers ndo_xdp_xmit function ixgbe_xdp_xmit().
> > > As driver assumption is that each CPU have its own XDP TX-queue.
>
> Thanks Jesper for chiming in.
>
> > >
> >
> > Point taken. I indeed miss that part which would cause bad behavior if it
> > happens.
> >
> > At this point, I think I should find all the allocation and use of XDP
> > related, say,
> > queues and rings, then adjust them all?
> >
> > Let's say if the server is shipped with 128 cpus, we could map 128 cpus to 64
> > rings in the function ixgbe_xdp_xmit(). However, it sounds a little bit odd.
> >
> > Do you think that it makes any sense?
>
> We need a fallback path for ixgbe. I did the following for ice:
> https://x-lore.kernel.org/bpf/20210819120004.34392-9-maciej.fijalkowski@intel.com/T/#u
>

Thanks. I'm ready to send the v2 patch. Please help me review the next
submission.

Jason

> >
> > Thanks,
> > Jason
> >
> > > This patch is not a proper fix.
> > >
> > > I do think we need a proper fix for this issue on ixgbe.
> > >
> > >
> > > > Fixes: 33fdc82f08 ("ixgbe: add support for XDP_TX action")
> > > > Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> > > > Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> > > > Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
> > > > ---
> > > >   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 2 +-
> > > >   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ---
> > > >   2 files changed, 1 insertion(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > > index 0218f6c..5953996 100644
> > > > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> > > > @@ -299,7 +299,7 @@ static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
> > > >
> > > >   static int ixgbe_xdp_queues(struct ixgbe_adapter *adapter)
> > > >   {
> > > > -     return adapter->xdp_prog ? nr_cpu_ids : 0;
> > > > +     return adapter->xdp_prog ? min_t(int, MAX_XDP_QUEUES, nr_cpu_ids) : 0;
> > > >   }
> > > >
> > > >   #define IXGBE_RSS_64Q_MASK  0x3F
> > > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > index 14aea40..b36d16b 100644
> > > > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > > > @@ -10130,9 +10130,6 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
> > > >                       return -EINVAL;
> > > >       }
> > > >
> > > > -     if (nr_cpu_ids > MAX_XDP_QUEUES)
> > > > -             return -ENOMEM;
> > > > -
> > > >       old_prog = xchg(&adapter->xdp_prog, prog);
> > > >       need_reset = (!!prog != !!old_prog);
> > > >
> > > >
> > >
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 0218f6c..5953996 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -299,7 +299,7 @@  static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
 
 static int ixgbe_xdp_queues(struct ixgbe_adapter *adapter)
 {
-	return adapter->xdp_prog ? nr_cpu_ids : 0;
+	return adapter->xdp_prog ? min_t(int, MAX_XDP_QUEUES, nr_cpu_ids) : 0;
 }
 
 #define IXGBE_RSS_64Q_MASK	0x3F
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 14aea40..b36d16b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10130,9 +10130,6 @@  static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 			return -EINVAL;
 	}
 
-	if (nr_cpu_ids > MAX_XDP_QUEUES)
-		return -ENOMEM;
-
 	old_prog = xchg(&adapter->xdp_prog, prog);
 	need_reset = (!!prog != !!old_prog);