Message ID | 90238577e00a7a996767b84769b5e03ef840b13a.1707414045.git.thinhtr@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | bnx2x: Fix error recovering in switch configuration |
Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")
On Thu, 8 Feb 2024 13:18:14 -0600 Thinh Tran wrote:
> Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer
> element")

The Fixes tag should be on one line, without wrapping.
Please post a v9 with the tag included, as a new thread.
Don't use --in-reply-to on netdev (sorry for so many rules..)
On 2/8/2024 7:29 PM, Jakub Kicinski wrote:
> On Thu, 8 Feb 2024 13:18:14 -0600 Thinh Tran wrote:
>> Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer
>> element")
>
> The Fixes tag should be on one line, without wrapping.
> Please post a v9 with the tag included, as a new thread.
> Don't use --in-reply-to on netdev (sorry for so many rules..)

Will do. Thank you.
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index d8b1824c334d..0bc1367fd649 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -1002,9 +1002,6 @@ static inline void bnx2x_set_fw_mac_addr(__le16 *fw_hi, __le16 *fw_mid,
 static inline void bnx2x_free_rx_mem_pool(struct bnx2x *bp,
 					  struct bnx2x_alloc_pool *pool)
 {
-	if (!pool->page)
-		return;
-
 	put_page(pool->page);
 
 	pool->page = NULL;
@@ -1015,6 +1012,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp,
 {
 	int i;
 
+	if (!fp->page_pool.page)
+		return;
+
 	if (fp->mode == TPA_MODE_DISABLED)
 		return;
 
Fix race condition leading to system crash during EEH error handling

During EEH error recovery, the bnx2x driver's transmit timeout logic
could cause a race condition when handling reset tasks. The
bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
overlap with the EEH driver's attempt to reset the device using
bnx2x_io_slot_reset(), which also frees SGEs. This race condition can
result in system crashes due to accessing freed memory locations.

[ 793.003930] EEH: Beginning: 'slot_reset'
[ 793.003937] PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
[ 793.003939] bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing...
[ 793.004037] bnx2x 0011:01:00.0: enabling device (0140 -> 0142)
[ 793.008839] bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload
[ 793.122134] Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
[ 793.122143] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 793.122147] Faulting instruction address: 0xc0080000025065fc
[ 793.122152] Oops: Kernel access of bad area, sig: 11 [#1]
.....
[ 793.122315] Call Trace:
[ 793.122318] [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable)
[ 793.122331] [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
[ 793.122338] [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550
[ 793.122342] [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60
[ 793.122347] [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170
[ 793.122352] [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0
[ 793.122356] [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64

To solve this issue, we need to verify page pool allocations before
freeing.

Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
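
The idea of checking the page pool before freeing can be illustrated with a small userspace model. This is only a sketch, not the driver code: every model_* name is hypothetical, malloc()/free() stand in for page allocation and put_page(), and the real bnx2x_free_rx_sge_range() also walks the SGE ring, which is reduced here to a counter. The model shows the patched ordering, where the check sits at the top of the range-free routine, so a second teardown of the same ring returns early instead of touching an already released pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bnx2x_alloc_pool: only the page pointer. */
struct model_pool {
	void *page;
};

/* Hypothetical stand-in for a fastpath ring that owns a page pool. */
struct model_ring {
	struct model_pool page_pool;
	int sges_freed;	/* how many times the SGE range was actually walked */
};

/* Models put_page(): the caller must pass a valid page. */
static void model_put_page(void *page)
{
	assert(page != NULL);
	free(page);
}

/* Patched shape: no NULL check here, the caller is expected to guard. */
static void model_free_rx_mem_pool(struct model_pool *pool)
{
	model_put_page(pool->page);
	pool->page = NULL;
}

/* Patched shape: bail out before touching the ring if the pool is gone. */
static void model_free_rx_sge_range(struct model_ring *ring)
{
	if (!ring->page_pool.page)
		return;

	ring->sges_freed++;	/* stands in for the per-SGE free loop */
	model_free_rx_mem_pool(&ring->page_pool);
}

/* Allocates the pool page; returns 0 on success, -1 on failure. */
static int model_alloc_ring(struct model_ring *ring)
{
	ring->page_pool.page = malloc(4096);
	ring->sges_freed = 0;
	return ring->page_pool.page ? 0 : -1;
}
```

Calling model_free_rx_sge_range() twice on the same ring walks the ring once and makes the second call a no-op. The model covers only back-to-back teardown on one thread; it has no locking and says nothing about two paths running truly concurrently.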