Message ID | 20231205152620.568183-1-pawel.chmielewski@intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [iwl-next] ice: Do not get coalesce settings while in reset |
On 12/5/23 16:26, Pawel Chmielewski wrote:
> From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
>
> Getting coalesce settings while reset is in progress can cause NULL
> pointer deference bug.
> If under reset, abort get coalesce for ethtool.
>
> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
> Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> index bde9bc74f928..2d565cc484a0 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> @@ -3747,6 +3747,9 @@ __ice_get_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
>  	struct ice_netdev_priv *np = netdev_priv(netdev);
>  	struct ice_vsi *vsi = np->vsi;
>
> +	if (ice_is_reset_in_progress(vsi->back->state))
> +		return -EBUSY;
> +
>  	if (q_num < 0)
>  		q_num = 0;
>

Sorry for a late review,
This asks for a Fixes: tag, and targeting at iwl-net instead :)
On Tue, Dec 05, 2023 at 05:09:48PM +0100, Przemek Kitszel wrote:
> On 12/5/23 16:26, Pawel Chmielewski wrote:
> > From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
> >
> > Getting coalesce settings while reset is in progress can cause NULL
> > pointer deference bug.
> > If under reset, abort get coalesce for ethtool.
> >
> > Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
> > Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
> > Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > index bde9bc74f928..2d565cc484a0 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > @@ -3747,6 +3747,9 @@ __ice_get_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
> >  	struct ice_netdev_priv *np = netdev_priv(netdev);
> >  	struct ice_vsi *vsi = np->vsi;
> > +	if (ice_is_reset_in_progress(vsi->back->state))
> > +		return -EBUSY;
> > +
> >  	if (q_num < 0)
> >  		q_num = 0;
>
> Sorry for a late review,
> This asks for a Fixes: tag, and targeting at iwl-net instead :)

Will fix the target, add the tag and send v2 right away :)
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Pawel Chmielewski
> Sent: Tuesday, December 5, 2023 8:56 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Kwan, Ngai-mint <ngai-mint.kwan@intel.com>; netdev@vger.kernel.org; Chmielewski, Pawel <pawel.chmielewski@intel.com>; Polchlopek, Mateusz <mateusz.polchlopek@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-next] ice: Do not get coalesce settings while in reset
>
> From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
>
> Getting coalesce settings while reset is in progress can cause NULL
> pointer deference bug.
> If under reset, abort get coalesce for ethtool.
>
> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
> Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
>  1 file changed, 3 insertions(+)

After applying the patch, a new crash is observed.
Reproduction steps:
#while true; do ethtool -c eth0; done
#echo 1 > /sys/bus/pci/devices/0000\:18\:00.0/reset

[Dec12 00:12] ice 0000:18:00.0: PTP reset successful
[ +0.859959] ------------[ cut here ]------------
[ +0.000002] RTNL: assertion failed at net/core/dev.c (6422)
[ +0.000017] WARNING: CPU: 88 PID: 539037 at net/core/dev.c:6422 netif_queue_set_napi+0xba/0xd0
[ +0.000008] Modules linked in: irdma ice snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore qrtr rfkill vfat fat xfs libcrc32c rpcrdma sunrpc rdma_ucm ib_srpt intel_rapl_msr intel_rapl_common ib_isert intel_uncore_frequency intel_uncore_frequency_common iscsi_target_mod target_core_mod isst_if_common skx_edac nfit ib_iser libnvdimm libiscsi scsi_transport_iscsi x86_pkg_temp_thermal intel_powerclamp rdma_cm coretemp iw_cm ib_cm kvm_intel ipmi_ssif kvm irqbypass rapl intel_cstate iTCO_wdt iTCO_vendor_support ib_uverbs intel_uncore acpi_ipmi mei_me i2c_i801 ipmi_si pcspkr ib_core mei i2c_smbus lpc_ich ipmi_devintf intel_pch_thermal joydev ioatdma ipmi_msghandler acpi_power_meter acpi_pad ext4 mbcache jbd2 ast drm_shmem_helper drm_kms_helper sd_mod t10_pi sg ixgbe drm crct10dif_pclmul i40e crc32_pclmul ahci crc32c_intel libahci igb ghash_clmulni_intel libata i2c_algo_bit mdio dca gnss wmi fuse [last unloaded: ice]
[ +0.000054] CPU: 88 PID: 539037 Comm: bash Kdump: loaded Not tainted 6.7.0-rc4_next-queue_11th_Dec-2023-00891-g9615a96563f0 #1
[ +0.000003] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ +0.000001] RIP: 0010:netif_queue_set_napi+0xba/0xd0
[ +0.000003] Code: 75 9e 80 3d d3 ba 2c 01 00 75 95 ba 16 19 00 00 48 c7 c6 fc 85 27 85 48 c7 c7 10 25 1c 85 c6 05 b7 ba 2c 01 01 e8 c6 cf 6a ff <0f> 0b e9 6f ff ff ff 0f 0b 5b 5d 41 5c 41 5d c3 cc cc cc cc 66 90
[ +0.000001] RSP: 0018:ffffc9002d827b30 EFLAGS: 00010282
[ +0.000002] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
[ +0.000001] RDX: ffff88980fc1f8c8 RSI: 0000000000000001 RDI: ffff88980fc1f8c0
[ +0.000001] RBP: ffff888c984dd010 R08: 0000000000000000 R09: 00000000ffff7fff
[ +0.000001] R10: ffffc9002d8279d0 R11: ffffffff857e6648 R12: 0000000000000000
[ +0.000001] R13: ffff8881362e8000 R14: ffff888c984dd010 R15: 0000000000000000
[ +0.000001] FS: 00007fdbde01d740(0000) GS:ffff88980fc00000(0000) knlGS:0000000000000000
[ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007f7358c89000 CR3: 0000000107fcc006 CR4: 00000000007706f0
[ +0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ +0.000001] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ +0.000001] PKRU: 55555554
[ +0.000001] Call Trace:
[ +0.000001] <TASK>
[ +0.000002] ? __warn+0x80/0x130
[ +0.000004] ? netif_queue_set_napi+0xba/0xd0
[ +0.000002] ? report_bug+0x195/0x1a0
[ +0.000004] ? prb_read_valid+0x17/0x20
[ +0.000004] ? handle_bug+0x3c/0x70
[ +0.000005] ? exc_invalid_op+0x14/0x70
[ +0.000001] ? asm_exc_invalid_op+0x16/0x20
[ +0.000005] ? netif_queue_set_napi+0xba/0xd0
[ +0.000003] ice_q_vector_set_napi_queues+0x37/0xf0 [ice]
[ +0.000072] ice_vsi_cfg_def+0x423/0x830 [ice]
[ +0.000043] ice_vsi_rebuild+0x238/0x3c0 [ice]
[ +0.000042] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000033] ice_rebuild+0x191/0x510 [ice]
[ +0.000041] ice_do_reset+0xa3/0x190 [ice]
[ +0.000056] ice_pci_err_resume+0x3b/0xb0 [ice]
[ +0.000035] pci_reset_function+0x48/0x70
[ +0.000005] reset_store+0x57/0xa0
[ +0.000004] kernfs_fop_write_iter+0x128/0x1c0
[ +0.000004] vfs_write+0x2ac/0x3c0
[ +0.000003] ksys_write+0x5f/0xe0
[ +0.000002] do_syscall_64+0x5c/0xe0
[ +0.000003] ? do_user_addr_fault+0x336/0x680
[ +0.000006] ? exc_page_fault+0x65/0x150
[ +0.000003] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ +0.000003] RIP: 0033:0x7fdbddf3eb97
[ +0.000002] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ +0.000001] RSP: 002b:00007ffdfc92bda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ +0.000002] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fdbddf3eb97
[ +0.000001] RDX: 0000000000000002 RSI: 000055af480778a0 RDI: 0000000000000001
[ +0.000001] RBP: 000055af480778a0 R08: 0000000000000000 R09: 00007fdbddfb14e0
[ +0.000001] R10: 00007fdbddfb13e0 R11: 0000000000000246 R12: 0000000000000002
[ +0.000002] R13: 00007fdbddffb780 R14: 0000000000000002 R15: 00007fdbddff69e0
[ +0.000002] </TASK>
[ +0.000001] ---[ end trace 0000000000000000 ]---
[ +0.104086] ice 0000:18:00.0: VSI rebuilt. VSI index 0, type ICE_VSI_PF
[ +0.003689] ice 0000:18:00.0: VSI rebuilt. VSI index 1, type ICE_VSI_CTRL

Crash Without patch:

[ 251.069061] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ 251.069065] #PF: supervisor read access in kernel mode
[ 251.069067] #PF: error_code(0x0000) - not-present page
[ 251.069069] PGD 0 P4D 0
[ 251.069072] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 251.069075] CPU: 3 PID: 20728 Comm: ethtool Kdump: loaded Not tainted 6.7.0-rc3_next-queue_4th-Dec-2023-00732-gda7b4d5ccb44 #1
[ 251.069078] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ 251.069080] RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
[ 251.069158] Code: 00 55 53 48 89 fb 48 89 f7 48 83 ec 08 0f b7 8b 96 04 00 00 0f b7 83 92 04 00 00 39 d1 7e 30 48 8b 4b 20 48 63 ea 48 8b 0c e9 <48> 8b 71 28 48 81 c6 98 01 00 00 39 c2 7c 32 e8 fe fe ff ff 85 c0
[ 251.069160] RSP: 0018:ffffc900343af980 EFLAGS: 00010206
[ 251.069162] RAX: 0000000000000060 RBX: ffff888121c39028 RCX: 0000000000000000
[ 251.069164] RDX: 0000000000000000 RSI: ffff888106062d88 RDI: ffff888106062d88
[ 251.069165] RBP: 0000000000000000 R08: 0000000038687465 R09: 0000000000000000
[ 251.069167] R10: ffff888106062d80 R11: 0000000000000002 R12: 0000000000000000
[ 251.069168] R13: ffffc900343afa30 R14: 0000000000000013 R15: ffff888106062d80
[ 251.069169] FS: 00007f3901af2740(0000) GS:ffff888c106c0000(0000) knlGS:0000000000000000
[ 251.069171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 251.069173] CR2: 0000000000000028 CR3: 000000029e2e2006 CR4: 00000000007706f0
[ 251.069174] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 251.069175] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 251.069177] PKRU: 55555554
[ 251.069178] Call Trace:
[ 251.069180] <TASK>
[ 251.069181] ? __die+0x20/0x70
[ 251.069187] ? page_fault_oops+0x76/0x170
[ 251.069191] ? exc_page_fault+0x65/0x150
[ 251.069195] ? asm_exc_page_fault+0x22/0x30
[ 251.069199] ? ice_get_q_coalesce+0x2e/0xa0 [ice]
[ 251.069258] ice_get_coalesce+0x13/0x30 [ice]
[ 251.069313] coalesce_prepare_data+0x59/0x80
[ 251.069318] ethnl_default_doit+0xf6/0x340
[ 251.069322] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x8f/0xf0
[ 251.069326] genl_family_rcv_msg_doit+0xd9/0x130
[ 251.069329] genl_family_rcv_msg+0x14d/0x220
[ 251.069332] ? __pfx_ethnl_default_doit+0x10/0x10
[ 251.069336] genl_rcv_msg+0x47/0xa0
[ 251.069338] ? __pfx_genl_rcv_msg+0x10/0x10
[ 251.069341] netlink_rcv_skb+0x54/0x100
[ 251.069344] genl_rcv+0x24/0x40
[ 251.069346] netlink_unicast+0x243/0x360
[ 251.069349] netlink_sendmsg+0x206/0x450
[ 251.069352] __sys_sendto+0x1fe/0x210
[ 251.069355] ? ___sys_recvmsg+0x88/0xd0
[ 251.069359] ? __sys_recvmsg+0x56/0xa0
[ 251.069363] __x64_sys_sendto+0x20/0x30
[ 251.069365] do_syscall_64+0x5c/0xe0
[ 251.069369] ? syscall_exit_work+0x103/0x130
[ 251.069374] ? syscall_exit_to_user_mode+0x22/0x40
[ 251.069376] ? do_syscall_64+0x6b/0xe0
[ 251.069379] ? __count_memcg_events+0x3e/0x90
[ 251.069383] ? mm_account_fault+0x6c/0x100
[ 251.069387] ? handle_mm_fault+0xd8/0x210
[ 251.069389] ? do_user_addr_fault+0x336/0x680
[ 251.069392] ? exc_page_fault+0x65/0x150
[ 251.069394] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 251.069396] RIP: 0033:0x7f390194fa9a
[ 251.069398] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 251.069400] RSP: 002b:00007ffd67aab4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 251.069403] RAX: ffffffffffffffda RBX: 000055be8b68b340 RCX: 00007f390194fa9a
[ 251.069404] RDX: 0000000000000024 RSI: 000055be8b68b3b0 RDI: 0000000000000003
[ 251.069405] RBP: 000055be8b68b3b0 R08: 00007f3901af9200 R09: 000000000000000c
[ 251.069407] R10: 0000000000000000 R11: 0000000000000246 R12: 000055be898b4e10
[ 251.069408] R13: 0000000000000000 R14: 000055be8b68b2a0 R15: 0000000000000000
[ 251.069410] </TASK>
[ 251.069411] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore qrtr rfkill vfat fat xfs libcrc32c rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common target_core_mod isst_if_common ib_iser skx_edac nfit libiscsi libnvdimm scsi_transport_iscsi rdma_cm x86_pkg_temp_thermal intel_powerclamp coretemp iw_cm ib_cm kvm_intel kvm ipmi_ssif irqbypass irdma rapl intel_cstate ib_uverbs iTCO_wdt iTCO_vendor_support intel_uncore mei_me acpi_ipmi ipmi_si i2c_i801 pcspkr ib_core mei i2c_smbus lpc_ich ipmi_devintf intel_pch_thermal ioatdma joydev ipmi_msghandler acpi_power_meter acpi_pad ext4 mbcache jbd2 ast drm_shmem_helper drm_kms_helper sd_mod t10_pi sg ice ixgbe drm i40e ahci crct10dif_pclmul libahci igb crc32_pclmul crc32c_intel ghash_clmulni_intel libata i2c_algo_bit mdio dca gnss wmi fuse
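
The first trace in this report is an "RTNL: assertion failed" warning raised in netif_queue_set_napi(), reached from ice_rebuild() through the sysfs reset path. The following is only a userspace model of that kind of check, written to illustrate what a "caller must hold the lock" assertion does. Every name prefixed with `mock_` is hypothetical, the lock is reduced to a boolean, and the model counts every violation whereas the kernel log above shows a single warning. It is not a proposed change to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
static bool mock_rtnl_held;
static int mock_warn_count;

static void mock_rtnl_lock(void)   { mock_rtnl_held = true; }
static void mock_rtnl_unlock(void) { mock_rtnl_held = false; }

/* Models a callee that expects its caller to already hold the lock. */
static void mock_queue_set_napi(int queue)
{
	if (!mock_rtnl_held) {
		mock_warn_count++;
		fprintf(stderr, "mock: lock assertion failed (queue %d)\n", queue);
	}
	/* A real implementation would update per-queue state here. */
}

/* Models a rebuild path that may or may not take the lock first. */
static void mock_rebuild(int num_queues, bool take_lock)
{
	assert(num_queues >= 0);

	if (take_lock)
		mock_rtnl_lock();
	for (int q = 0; q < num_queues; q++)
		mock_queue_set_napi(q);
	if (take_lock)
		mock_rtnl_unlock();
}
```

Calling `mock_rebuild(2, false)` records one violation per queue, while `mock_rebuild(2, true)` records none. Whether the real reset path can safely take the lock at that point is a separate question that this model does not answer.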
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index bde9bc74f928..2d565cc484a0 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -3747,6 +3747,9 @@ __ice_get_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 	struct ice_vsi *vsi = np->vsi;
 
+	if (ice_is_reset_in_progress(vsi->back->state))
+		return -EBUSY;
+
 	if (q_num < 0)
 		q_num = 0;
 
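
For readers who want to experiment with the idea outside the kernel, here is a minimal userspace sketch of the guard this patch adds: return -EBUSY before touching per-queue pointers that may be NULL while a reset is running. The `mock_` types and the `reset_in_progress` flag are assumptions made for illustration; they stand in for the driver's VSI state and for ice_is_reset_in_progress(), and the extra bounds and NULL checks are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for driver state; not the real ice structures. */
struct mock_ring {
	int itr_usecs;
};

struct mock_vsi {
	bool reset_in_progress;      /* models ice_is_reset_in_progress() */
	struct mock_ring **rx_rings; /* entries may be NULL during a rebuild */
	int num_rxq;
};

/* Read the coalesce value for one queue, refusing while a reset is running. */
static int mock_get_coalesce(const struct mock_vsi *vsi, int q_num, int *usecs)
{
	assert(vsi != NULL && usecs != NULL);

	/* Bail out before dereferencing any per-queue pointer. */
	if (vsi->reset_in_progress)
		return -EBUSY;

	if (q_num < 0)
		q_num = 0;

	/* Extra defensive checks, added for the sketch only. */
	if (q_num >= vsi->num_rxq || vsi->rx_rings[q_num] == NULL)
		return -EINVAL;

	*usecs = vsi->rx_rings[q_num]->itr_usecs;
	return 0;
}
```

With the flag clear, the call returns 0 and fills in `usecs`; with the flag set, it returns -EBUSY without looking at `rx_rings` at all. A flag test like this narrows the window but does not by itself serialize against a reset that starts right after the check.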