diff mbox series

[iwl-net,v2] i40e: Fix XDP program unloading while removing the driver

Message ID 20240624151213.1755714-1-michal.kubiak@intel.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [iwl-net,v2] i40e: Fix XDP program unloading while removing the driver | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 859 this patch: 859
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 3 blamed authors not CCed: sylwesterx.dziedziuch@intel.com slawomirx.laba@intel.com anthony.l.nguyen@intel.com; 7 maintainers not CCed: bpf@vger.kernel.org pabeni@redhat.com edumazet@google.com sylwesterx.dziedziuch@intel.com jesse.brandeburg@intel.com slawomirx.laba@intel.com anthony.l.nguyen@intel.com
netdev/build_clang success Errors and warnings before: 863 this patch: 863
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 863 this patch: 863
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 211 this patch: 211
netdev/source_inline success Was 0 now: 0

Commit Message

Michal Kubiak June 24, 2024, 3:12 p.m. UTC
The commit 6533e558c650 ("i40e: Fix reset path while removing
the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
modifying the XDP program while the driver is being removed.
Unfortunately, such a change is useful only if the ".ndo_bpf()"
callback was called out of the rmmod context because unloading the
existing XDP program is also a part of driver removing procedure.
In other words, from the rmmod context the driver is expected to
unload the XDP program without reporting any errors. Otherwise,
the kernel warning with callstack is printed out to dmesg.

Example failing scenario:
 1. Load the i40e driver.
 2. Load the XDP program.
 3. Unload the i40e driver (using "rmmod" command).

The example kernel warning log:

[  +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870
[...]
[  +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
[...]
[  +0.002726] Call Trace:
[  +0.002457]  <TASK>
[  +0.002119]  ? __warn+0x80/0x120
[  +0.003245]  ? unregister_netdevice_many_notify+0x7a9/0x870
[  +0.005586]  ? report_bug+0x164/0x190
[  +0.003678]  ? handle_bug+0x3c/0x80
[  +0.003503]  ? exc_invalid_op+0x17/0x70
[  +0.003846]  ? asm_exc_invalid_op+0x1a/0x20
[  +0.004200]  ? unregister_netdevice_many_notify+0x7a9/0x870
[  +0.005579]  ? unregister_netdevice_many_notify+0x3cc/0x870
[  +0.005586]  unregister_netdevice_queue+0xf7/0x140
[  +0.004806]  unregister_netdev+0x1c/0x30
[  +0.003933]  i40e_vsi_release+0x87/0x2f0 [i40e]
[  +0.004604]  i40e_remove+0x1a1/0x420 [i40e]
[  +0.004220]  pci_device_remove+0x3f/0xb0
[  +0.003943]  device_release_driver_internal+0x19f/0x200
[  +0.005243]  driver_detach+0x48/0x90
[  +0.003586]  bus_remove_driver+0x6d/0xf0
[  +0.003939]  pci_unregister_driver+0x2e/0xb0
[  +0.004278]  i40e_exit_module+0x10/0x5f0 [i40e]
[  +0.004570]  __do_sys_delete_module.isra.0+0x197/0x310
[  +0.005153]  do_syscall_64+0x85/0x170
[  +0.003684]  ? syscall_exit_to_user_mode+0x69/0x220
[  +0.004886]  ? do_syscall_64+0x95/0x170
[  +0.003851]  ? exc_page_fault+0x7e/0x180
[  +0.003932]  entry_SYSCALL_64_after_hwframe+0x71/0x79
[  +0.005064] RIP: 0033:0x7f59dc9347cb
[  +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48
[  +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb
[  +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748
[  +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000
[  +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000
[  +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000
[  +0.007151]  </TASK>
[  +0.002204] ---[ end trace 0000000000000000 ]---

Fix this by checking if the XDP program is being loaded or unloaded.
Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
callback to avoid unnecessary operations and checks.

Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver")
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>

---

v1 -> v2 changes:
 - simplify the fix according to Kuba's suggestions, i.e. remove
   checking 'NETREG_UNREGISTERING' flag directly from ndo_bpf
   and remove a separate handling for 'unregister' context.
 - update the commit message accordingly
 - add an example of the kernel warning for the issue being fixed.

See:
v1: <https://lore.kernel.org/netdev/20240528-net-2024-05-28-intel-net-fixes-v1-4-dc8593d2bbc6@intel.com/>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Maciej Fijalkowski June 26, 2024, 12:06 p.m. UTC | #1
On Mon, Jun 24, 2024 at 05:12:13PM +0200, Michal Kubiak wrote:
> The commit 6533e558c650 ("i40e: Fix reset path while removing
> the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
> modifying the XDP program while the driver is being removed.
> Unfortunately, such a change is useful only if the ".ndo_bpf()"
> callback was called out of the rmmod context because unloading the
> existing XDP program is also a part of driver removing procedure.
> In other words, from the rmmod context the driver is expected to
> unload the XDP program without reporting any errors. Otherwise,
> the kernel warning with callstack is printed out to dmesg.
> 
> Example failing scenario:
>  1. Load the i40e driver.
>  2. Load the XDP program.
>  3. Unload the i40e driver (using "rmmod" command).
> 
> The example kernel warning log:
> 
> [  +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870
> [...]
> [  +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
> [...]
> [  +0.002726] Call Trace:
> [  +0.002457]  <TASK>
> [  +0.002119]  ? __warn+0x80/0x120
> [  +0.003245]  ? unregister_netdevice_many_notify+0x7a9/0x870
> [  +0.005586]  ? report_bug+0x164/0x190
> [  +0.003678]  ? handle_bug+0x3c/0x80
> [  +0.003503]  ? exc_invalid_op+0x17/0x70
> [  +0.003846]  ? asm_exc_invalid_op+0x1a/0x20
> [  +0.004200]  ? unregister_netdevice_many_notify+0x7a9/0x870
> [  +0.005579]  ? unregister_netdevice_many_notify+0x3cc/0x870
> [  +0.005586]  unregister_netdevice_queue+0xf7/0x140
> [  +0.004806]  unregister_netdev+0x1c/0x30
> [  +0.003933]  i40e_vsi_release+0x87/0x2f0 [i40e]
> [  +0.004604]  i40e_remove+0x1a1/0x420 [i40e]
> [  +0.004220]  pci_device_remove+0x3f/0xb0
> [  +0.003943]  device_release_driver_internal+0x19f/0x200
> [  +0.005243]  driver_detach+0x48/0x90
> [  +0.003586]  bus_remove_driver+0x6d/0xf0
> [  +0.003939]  pci_unregister_driver+0x2e/0xb0
> [  +0.004278]  i40e_exit_module+0x10/0x5f0 [i40e]
> [  +0.004570]  __do_sys_delete_module.isra.0+0x197/0x310
> [  +0.005153]  do_syscall_64+0x85/0x170
> [  +0.003684]  ? syscall_exit_to_user_mode+0x69/0x220
> [  +0.004886]  ? do_syscall_64+0x95/0x170
> [  +0.003851]  ? exc_page_fault+0x7e/0x180
> [  +0.003932]  entry_SYSCALL_64_after_hwframe+0x71/0x79
> [  +0.005064] RIP: 0033:0x7f59dc9347cb
> [  +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
> c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48
> [  +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> [  +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb
> [  +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748
> [  +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000
> [  +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000
> [  +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000
> [  +0.007151]  </TASK>
> [  +0.002204] ---[ end trace 0000000000000000 ]---
> 
> Fix this by checking if the XDP program is being loaded or unloaded.
> Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
> Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
> callback to avoid unnecessary operations and checks.
> 
> Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver")
> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
> 
> ---
> 
> v1 -> v2 changes:
>  - simplify the fix according to Kuba's suggestions, i.e. remove
>    checking 'NETREG_UNREGISTERING' flag directly from ndo_bpf
>    and remove a separate handling for 'unregister' context.
>  - update the commit message accordingly
>  - add an example of the kernel warning for the issue being fixed.
> 
> See:
> v1: <https://lore.kernel.org/netdev/20240528-net-2024-05-28-intel-net-fixes-v1-4-dc8593d2bbc6@intel.com/>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 284c3fad5a6e..310513d9321b 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -13293,6 +13293,10 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
>  	bool need_reset;
>  	int i;
>  
> +	/* VSI shall be deleted in a moment, block loading new programs */
> +	if (prog && test_bit(__I40E_IN_REMOVE, pf->state))
> +		return -EINVAL;
> +
>  	/* Don't allow frames that span over multiple buffers */
>  	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
>  		NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags");
> @@ -13301,14 +13305,9 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
>  
>  	/* When turning XDP on->off/off->on we reset and rebuild the rings. */
>  	need_reset = (i40e_enabled_xdp_vsi(vsi) != !!prog);
> -
>  	if (need_reset)
>  		i40e_prep_for_reset(pf);

i40e_reset_and_rebuild() is decorated with this __I40E_IN_REMOVE flag
check and bails out early but not i40e_prep_for_reset() from what I see.
__I40E_RESET_RECOVERY_PENDING is set in i40e_remove() and that's what
i40e_prep_for_reset() checks. So after all, looks like flow you are
addressing will take care of BPF resources only, which is what we wanted.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

one side note is that these flags are a nightmare, but let's not touch it
in this change.

>  
> -	/* VSI shall be deleted in a moment, just return EINVAL */
> -	if (test_bit(__I40E_IN_REMOVE, pf->state))
> -		return -EINVAL;
> -
>  	old_prog = xchg(&vsi->xdp_prog, prog);
>  
>  	if (need_reset) {
> -- 
> 2.33.1
>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 284c3fad5a6e..310513d9321b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -13293,6 +13293,10 @@  static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 	bool need_reset;
 	int i;
 
+	/* VSI shall be deleted in a moment, block loading new programs */
+	if (prog && test_bit(__I40E_IN_REMOVE, pf->state))
+		return -EINVAL;
+
 	/* Don't allow frames that span over multiple buffers */
 	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
 		NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags");
@@ -13301,14 +13305,9 @@  static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 
 	/* When turning XDP on->off/off->on we reset and rebuild the rings. */
 	need_reset = (i40e_enabled_xdp_vsi(vsi) != !!prog);
-
 	if (need_reset)
 		i40e_prep_for_reset(pf);
 
-	/* VSI shall be deleted in a moment, just return EINVAL */
-	if (test_bit(__I40E_IN_REMOVE, pf->state))
-		return -EINVAL;
-
 	old_prog = xchg(&vsi->xdp_prog, prog);
 
 	if (need_reset) {