Message ID | 20210414023428.10121-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4e39a072a6a0fc422ba7da5e4336bdc295d70211 |
Delegated to: | Netdev Maintainers |
Series | [net,v3] i40e: fix the panic when running bpf in xdpdrv mode |

Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 1 blamed authors not CCed: shannon.nelson@intel.com; 1 maintainers not CCed: shannon.nelson@intel.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 3 this patch: 3 |
netdev/kdoc | success | Errors and warnings before: 3 this patch: 3 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 18 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 3 this patch: 3 |
netdev/header_inline | success | Link |
kerneljasonxing@gmail.com wrote:

> From: Jason Xing <xingwanli@kuaishou.com>
>
> Fix this panic by adding more rules to calculate the value of @rss_size_max
> which could be used in allocating the queues when bpf is loaded, which,
> however, could cause the failure and then trigger the NULL pointer of
> vsi->rx_rings. Prior to this fix, the machine doesn't care about how many
> cpus are online and then allocates 256 queues on the machine with 32 cpus
> online actually.
>
> Once the load of bpf begins, the log will go like this "failed to get
> tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> failed".
>
> Thus, I attach the key information of the crash-log here.
>
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000000
> RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> Call Trace:
> [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> [2160294.717666] dev_xdp_install+0x4f/0x70
> [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> [2160294.718705] do_setlink+0xac7/0xe70
> [2160294.719035] ? __nla_parse+0xed/0x120
> [2160294.719365] rtnl_newlink+0x73b/0x860
>
> Fixes: 41c445ff0f48 ("i40e: main driver core")
> Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> Signed-off-by: Jason Xing <xingwanli@kuaishou.com>

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

@Jakub/@DaveM - feel free to apply this directly.

On Wed, 14 Apr 2021 19:06:52 -0700 Jesse Brandeburg <jesse.brandeburg@intel.com> wrote:

> kerneljasonxing@gmail.com wrote:
>
> > From: Jason Xing <xingwanli@kuaishou.com>
> >
> > Fix this panic by adding more rules to calculate the value of @rss_size_max
> > which could be used in allocating the queues when bpf is loaded, which,
> > however, could cause the failure and then trigger the NULL pointer of
> > vsi->rx_rings. Prior to this fix, the machine doesn't care about how many
> > cpus are online and then allocates 256 queues on the machine with 32 cpus
> > online actually.
> >
> > Once the load of bpf begins, the log will go like this "failed to get
> > tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> > failed".
> >
> > Thus, I attach the key information of the crash-log here.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000000
> > RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> > Call Trace:
> > [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> > [2160294.717666] dev_xdp_install+0x4f/0x70
> > [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> > [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> > [2160294.718705] do_setlink+0xac7/0xe70
> > [2160294.719035] ? __nla_parse+0xed/0x120
> > [2160294.719365] rtnl_newlink+0x73b/0x860
> >
> > Fixes: 41c445ff0f48 ("i40e: main driver core")
> > Co-developed-by: Shujin Li <lishujin@kuaishou.com>
> > Signed-off-by: Shujin Li <lishujin@kuaishou.com>
> > Signed-off-by: Jason Xing <xingwanli@kuaishou.com>
>
> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>
> @Jakub/@DaveM - feel free to apply this directly.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

The crash/bug happens in this code:

    static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
                              struct netlink_ext_ack *extack)
    {
    [...]
        for (i = 0; i < vsi->num_queue_pairs; i++)
            WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog);

And this is a side effect of i40e_setup_pf_switch() failing with
"setup of MAIN VSI failed".

LGTM

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Wed, 14 Apr 2021 10:34:28 +0800 you wrote:

> From: Jason Xing <xingwanli@kuaishou.com>
>
> Fix this panic by adding more rules to calculate the value of @rss_size_max
> which could be used in allocating the queues when bpf is loaded, which,
> however, could cause the failure and then trigger the NULL pointer of
> vsi->rx_rings. Prior to this fix, the machine doesn't care about how many
> cpus are online and then allocates 256 queues on the machine with 32 cpus
> online actually.
>
> [...]

Here is the summary with links:
  - [net,v3] i40e: fix the panic when running bpf in xdpdrv mode
    https://git.kernel.org/netdev/net/c/4e39a072a6a0

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9d..4e9a247 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
 {
 	int err = 0;
 	int size;
+	u16 pow;
 
 	/* Set default capability flags */
 	pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
@@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
 	pf->rss_table_size = pf->hw.func_caps.rss_table_size;
 	pf->rss_size_max = min_t(int, pf->rss_size_max,
 				 pf->hw.func_caps.num_tx_qp);
+
+	/* find the next higher power-of-2 of num cpus */
+	pow = roundup_pow_of_two(num_online_cpus());
+	pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
+
 	if (pf->hw.func_caps.rss) {
 		pf->flags |= I40E_FLAG_RSS_ENABLED;
 		pf->alloc_rss_size = min_t(int, pf->rss_size_max,
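
For readers who want to check the arithmetic of the new clamp outside the kernel, here is a minimal userspace sketch. It is not driver code: `roundup_pow2()` is a stand-in for the kernel's `roundup_pow_of_two()`, `clamp_rss_size_max()` is a hypothetical helper that mirrors the two `min_t()` steps in the hunk above, and the CPU count is passed in as a parameter rather than read from `num_online_cpus()`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in (assumption) for the kernel's roundup_pow_of_two():
 * returns the smallest power of two that is >= n.
 * Valid for 1 <= n <= 2^31 so the shift cannot overflow.
 */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n >= 1 && n <= (UINT32_C(1) << 31));
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the patched logic in i40e_sw_init():
 * first clamp to the number of TX queue pairs the hardware reports,
 * then clamp to the next power of two at or above the online CPU count.
 */
static int clamp_rss_size_max(int rss_size_max, int num_tx_qp,
			      unsigned int online_cpus)
{
	int pow;

	if (rss_size_max > num_tx_qp)
		rss_size_max = num_tx_qp;

	pow = (int)roundup_pow2(online_cpus);
	if (rss_size_max > pow)
		rss_size_max = pow;

	return rss_size_max;
}
```

With the numbers from the commit message (a starting limit of 256 and 32 online CPUs), the helper returns 32 as long as the hardware queue count is at least 32. A CPU count that is not a power of two, such as 24, also rounds up to 32.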