Message ID | 20240315205535.1321-1-thinhtr@linux.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Commit | d27e2da94a42655861ca4baea30c8cd65546f25d |
Delegated to: | Netdev Maintainers |
Series | [v11] net/bnx2x: Prevent access to a freed page in page_pool |
Fri, Mar 15, 2024 at 09:55:35PM CET, thinhtr@linux.ibm.com wrote:
>Fix race condition leading to system crash during EEH error handling
>
>During EEH error recovery, the bnx2x driver's transmit timeout logic
>could cause a race condition when handling reset tasks. The
>bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
>which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
>SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
>overlap with the EEH driver's attempt to reset the device using
>bnx2x_io_slot_reset(), which also tries to free SGEs. This race
>condition can result in system crashes due to accessing freed memory
>locations in bnx2x_free_rx_sge()
>
>static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
>				struct bnx2x_fastpath *fp, u16 index)
>{
>	struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
>	struct page *page = sw_buf->page;
>....
>where sw_buf was set to NULL after the call to dma_unmap_page()
>by the preceding thread.
>
>
>[ 793.003930] EEH: Beginning: 'slot_reset'
>[ 793.003937] PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
>[ 793.003939] bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing...
>[ 793.004037] bnx2x 0011:01:00.0: enabling device (0140 -> 0142)
>[ 793.008839] bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload
>[ 793.122134] Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
>[ 793.122143] BUG: Kernel NULL pointer dereference on read at 0x00000000
>[ 793.122147] Faulting instruction address: 0xc0080000025065fc
>[ 793.122152] Oops: Kernel access of bad area, sig: 11 [#1]
>.....
>[ 793.122315] Call Trace:
>[ 793.122318] [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable)
>[ 793.122331] [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
>[ 793.122338] [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550
>[ 793.122342] [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60
>[ 793.122347] [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170
>[ 793.122352] [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0
>[ 793.122356] [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
>
>To solve this issue, we need to verify page pool allocations before
>freeing.
>
>Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")
>
>Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 15 Mar 2024 15:55:35 -0500 you wrote:
> Fix race condition leading to system crash during EEH error handling
>
> During EEH error recovery, the bnx2x driver's transmit timeout logic
> could cause a race condition when handling reset tasks. The
> bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
> which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
> SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
> overlap with the EEH driver's attempt to reset the device using
> bnx2x_io_slot_reset(), which also tries to free SGEs. This race
> condition can result in system crashes due to accessing freed memory
> locations in bnx2x_free_rx_sge()
>
> [...]

Here is the summary with links:
  - [v11] net/bnx2x: Prevent access to a freed page in page_pool
    https://git.kernel.org/netdev/net/c/d27e2da94a42

You are awesome, thank you!
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index d8b1824c334d..0bc1367fd649 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -1002,9 +1002,6 @@ static inline void bnx2x_set_fw_mac_addr(__le16 *fw_hi, __le16 *fw_mid,
 static inline void bnx2x_free_rx_mem_pool(struct bnx2x *bp,
 					  struct bnx2x_alloc_pool *pool)
 {
-	if (!pool->page)
-		return;
-
 	put_page(pool->page);
 
 	pool->page = NULL;
@@ -1015,6 +1012,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp,
 {
 	int i;
 
+	if (!fp->page_pool.page)
+		return;
+
 	if (fp->mode == TPA_MODE_DISABLED)
 		return;
 
Fix race condition leading to system crash during EEH error handling

During EEH error recovery, the bnx2x driver's transmit timeout logic
could cause a race condition when handling reset tasks. The
bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
overlap with the EEH driver's attempt to reset the device using
bnx2x_io_slot_reset(), which also tries to free SGEs. This race
condition can result in system crashes due to accessing freed memory
locations in bnx2x_free_rx_sge()

static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
				struct bnx2x_fastpath *fp, u16 index)
{
	struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
	struct page *page = sw_buf->page;
....
where sw_buf was set to NULL after the call to dma_unmap_page()
by the preceding thread.


[ 793.003930] EEH: Beginning: 'slot_reset'
[ 793.003937] PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
[ 793.003939] bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing...
[ 793.004037] bnx2x 0011:01:00.0: enabling device (0140 -> 0142)
[ 793.008839] bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload
[ 793.122134] Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
[ 793.122143] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 793.122147] Faulting instruction address: 0xc0080000025065fc
[ 793.122152] Oops: Kernel access of bad area, sig: 11 [#1]
.....
[ 793.122315] Call Trace:
[ 793.122318] [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable)
[ 793.122331] [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
[ 793.122338] [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550
[ 793.122342] [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60
[ 793.122347] [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170
[ 793.122352] [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0
[ 793.122356] [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64

To solve this issue, we need to verify page pool allocations before
freeing.

Fixes: 4cace675d687 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")

Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
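
Illustrative note (not part of the patch): the idea of checking the page pool before walking the SGE ring can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions. The names model_pool, model_fp, model_alloc, model_free_rx_sge_range, NUM_SGE and teardown_passes are hypothetical stand-ins, malloc/free replace the kernel page and DMA calls, and the model shows a second teardown pass running after the first one has finished. It does not model concurrency or locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SGE 4 /* hypothetical ring size, chosen only for the model */

/* Hypothetical stand-in for the driver's per-queue page pool. */
struct model_pool {
	void *page;
};

/* Hypothetical stand-in for the fastpath state touched by the patch. */
struct model_fp {
	void *rx_page_ring[NUM_SGE]; /* models fp->rx_page_ring[i].page */
	struct model_pool page_pool; /* models fp->page_pool */
	int teardown_passes;         /* counts passes that did real work */
};

/* Populate the ring and the pool, as a successful load would. */
static void model_alloc(struct model_fp *fp)
{
	int i;

	for (i = 0; i < NUM_SGE; i++) {
		fp->rx_page_ring[i] = malloc(64);
		assert(fp->rx_page_ring[i] != NULL);
	}
	fp->page_pool.page = malloc(64);
	assert(fp->page_pool.page != NULL);
	fp->teardown_passes = 0;
}

/*
 * Teardown with the guard placed first, in the spirit of the patch:
 * if the pool page is already gone, a previous pass has released
 * everything, so the ring must not be walked again.
 */
static void model_free_rx_sge_range(struct model_fp *fp)
{
	int i;

	if (!fp->page_pool.page)
		return;

	for (i = 0; i < NUM_SGE; i++) {
		free(fp->rx_page_ring[i]);
		fp->rx_page_ring[i] = NULL;
	}

	free(fp->page_pool.page);
	fp->page_pool.page = NULL;
	fp->teardown_passes++;
}
```

Calling model_alloc() once and then model_free_rx_sge_range() twice leaves teardown_passes at 1: the second call returns at the guard instead of touching ring entries that the first call already released.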