From patchwork Fri Aug 12 17:23:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tony Nguyen X-Patchwork-Id: 12942408 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64F1EC00140 for ; Fri, 12 Aug 2022 17:23:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238198AbiHLRXW (ORCPT ); Fri, 12 Aug 2022 13:23:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235487AbiHLRXS (ORCPT ); Fri, 12 Aug 2022 13:23:18 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4FB2B2490 for ; Fri, 12 Aug 2022 10:23:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660324997; x=1691860997; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=i7ea0/g03lMmA+jWUowchLTAQsvS4kjN8rQdckYHts8=; b=W+CggumEdMfFp2OlQoI+bfZ4p1zkKLiJuG7Vrmh+4I4ibcvQYx4g3gwI gcTN8tJFIHSDlPV1DojvtvRaOL0d2dCcVv32tRziurPoplutdLQMFdzgc x5zaZi8sQyM88MNT6rxb+TnShpNiLEDPSG2mJi/JdcRAqONXp7i2eG5an udyv6aSHD3wwBEXMd7dDFnO4ZnyizCu3muJIXwy0Cpd2kSHqnmzFm7/I0 l0HnPCp4Sj9S+1Ck9zMIXqUxAJy3EhhuxIwiZK1HlUkPFtSuhZTK7I0Z3 CNkjtyZFpYWXPlIgTwiX4lH7V0XlscQMo/N7RQdmaDGoIXTV4hgpUN8jG w==; X-IronPort-AV: E=McAfee;i="6400,9594,10437"; a="290396032" X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="290396032" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Aug 2022 10:23:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="634716851" Received: from anguy11-desk2.jf.intel.com ([10.166.244.147]) by orsmga008.jf.intel.com with ESMTP; 12 Aug 2022 10:23:16 -0700 From: Tony Nguyen To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com Cc: Przemyslaw Patynowski , netdev@vger.kernel.org, anthony.l.nguyen@intel.com, Jedrzej Jagielski , Marek Szlosek Subject: [PATCH net 1/4] iavf: Fix adminq error handling Date: Fri, 12 Aug 2022 10:23:06 -0700 Message-Id: <20220812172309.853230-2-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220812172309.853230-1-anthony.l.nguyen@intel.com> References: <20220812172309.853230-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Przemyslaw Patynowski iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocates with dma_alloc_coherent memory for VF mailbox. Free DMA regions for both ASQ and ARQ in case error happens during configuration of ASQ/ARQ registers. Without this change it is possible to see when unloading interface: 74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32] One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent] Fixes: d358aa9a7a2d ("i40evf: init code and hardware support") Signed-off-by: Przemyslaw Patynowski Signed-off-by: Jedrzej Jagielski Tested-by: Marek Szlosek Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_adminq.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_adminq.c b/drivers/net/ethernet/intel/iavf/iavf_adminq.c index cd4e6a22d0f9..9ffbd24d83cb 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_adminq.c +++ b/drivers/net/ethernet/intel/iavf/iavf_adminq.c @@ -324,6 +324,7 @@ static enum iavf_status iavf_config_arq_regs(struct iavf_hw *hw) static enum iavf_status iavf_init_asq(struct iavf_hw *hw) { enum iavf_status ret_code = 0; + int i; if (hw->aq.asq.count > 0) { /* queue already initialized */ @@ -354,12 +355,17 @@ static enum iavf_status iavf_init_asq(struct iavf_hw *hw) /* initialize base registers */ ret_code = iavf_config_asq_regs(hw); if (ret_code) - goto init_adminq_free_rings; + goto init_free_asq_bufs; /* success! */ hw->aq.asq.count = hw->aq.num_asq_entries; goto init_adminq_exit; +init_free_asq_bufs: + for (i = 0; i < hw->aq.num_asq_entries; i++) + iavf_free_dma_mem(hw, &hw->aq.asq.r.asq_bi[i]); + iavf_free_virt_mem(hw, &hw->aq.asq.dma_head); + init_adminq_free_rings: iavf_free_adminq_asq(hw); @@ -383,6 +389,7 @@ static enum iavf_status iavf_init_asq(struct iavf_hw *hw) static enum iavf_status iavf_init_arq(struct iavf_hw *hw) { enum iavf_status ret_code = 0; + int i; if (hw->aq.arq.count > 0) { /* queue already initialized */ @@ -413,12 +420,16 @@ static enum iavf_status iavf_init_arq(struct iavf_hw *hw) /* initialize base registers */ ret_code = iavf_config_arq_regs(hw); if (ret_code) - goto init_adminq_free_rings; + goto init_free_arq_bufs; /* success! */ hw->aq.arq.count = hw->aq.num_arq_entries; goto init_adminq_exit; +init_free_arq_bufs: + for (i = 0; i < hw->aq.num_arq_entries; i++) + iavf_free_dma_mem(hw, &hw->aq.arq.r.arq_bi[i]); + iavf_free_virt_mem(hw, &hw->aq.arq.dma_head); init_adminq_free_rings: iavf_free_adminq_arq(hw); From patchwork Fri Aug 12 17:23:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tony Nguyen X-Patchwork-Id: 12942410 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF3E7C00140 for ; Fri, 12 Aug 2022 17:23:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238608AbiHLRX0 (ORCPT ); Fri, 12 Aug 2022 13:23:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237635AbiHLRXU (ORCPT ); Fri, 12 Aug 2022 13:23:20 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEA2FB248F for ; Fri, 12 Aug 2022 10:23:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660324998; x=1691860998; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EnnyAAL+hQShI73CgRD45ZbG+aSwmAj5DUC9dvEbflc=; b=NuKCgaoH5mchCE8cKxKk5ZIrc2ZBshyNzUoOcrIeIrFFWPo4JUxKtDjL K1/9joO8M2OQdSOJb/+j1hgCAVkR7eHXewWaOOzDtwYGQlnek7mWqsZAT ivfA4EtX5kDXxE/qaBFnhxFbWMQYRfRM9YxOtnu1t8sEWffkITnv1tfrw gzJFLwqtGPtKVulHQLnCNyvzSj3LewvxWQJRl6RMhTcAguQqPkB50Y56n j37anrcESSGGujJkLRuev3G9DoFZZ4zU12FFqlI4PSN8hWUnUvybTFmpE Kk95abLXGYgS76Q7Gtg1VjYrzeI5pMNzfdrT6aLkHSinFtJmbF4eL/Zfv g==; X-IronPort-AV: E=McAfee;i="6400,9594,10437"; a="290396034" X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="290396034" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Aug 2022 10:23:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="634716854" Received: from anguy11-desk2.jf.intel.com ([10.166.244.147]) by orsmga008.jf.intel.com with ESMTP; 12 Aug 2022 10:23:16 -0700 From: Tony Nguyen To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com Cc: Przemyslaw Patynowski , netdev@vger.kernel.org, anthony.l.nguyen@intel.com, Jedrzej Jagielski , Marek Szlosek Subject: [PATCH net 2/4] iavf: Fix NULL pointer dereference in iavf_get_link_ksettings Date: Fri, 12 Aug 2022 10:23:07 -0700 Message-Id: <20220812172309.853230-3-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220812172309.853230-1-anthony.l.nguyen@intel.com> References: <20220812172309.853230-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Przemyslaw Patynowski Fix possible NULL pointer dereference, due to freeing of adapter->vf_res in iavf_init_get_resources. Previous commit introduced a regression, where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config would free adapter->vf_res. However, netdev is still registered, so ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res, will result with: [ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 9385.242683] #PF: supervisor read access in kernel mode [ 9385.242686] #PF: error_code(0x0000) - not-present page [ 9385.242690] PGD 0 P4D 0 [ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019 [ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf] [ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20 [ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246 [ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000 [ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000 [ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00 [ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000 [ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1 [ 9385.242771] FS: 00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000 [ 9385.242775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0 [ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9385.242787] Call Trace: [ 9385.242791] [ 9385.242793] ethtool_get_settings+0x71/0x1a0 [ 9385.242814] __dev_ethtool+0x426/0x2f40 [ 9385.242823] ? slab_post_alloc_hook+0x4f/0x280 [ 9385.242836] ? kmem_cache_alloc_trace+0x15d/0x2f0 [ 9385.242841] ? dev_ethtool+0x59/0x170 [ 9385.242848] dev_ethtool+0xa7/0x170 [ 9385.242856] dev_ioctl+0xc3/0x520 [ 9385.242866] sock_do_ioctl+0xa0/0xe0 [ 9385.242877] sock_ioctl+0x22f/0x320 [ 9385.242885] __x64_sys_ioctl+0x84/0xc0 [ 9385.242896] do_syscall_64+0x3a/0x80 [ 9385.242904] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 9385.242918] RIP: 0033:0x7f93702396db [ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48 [ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db [ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007 [ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330 [ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80 [ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0 [ 9385.242948] [ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse [ 9385.243065] [last unloaded: iavf] Dereference happens in if (ADV_LINK_SUPPORT(adapter)) statement Fixes: 209f2f9c7181 ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation") Signed-off-by: Przemyslaw Patynowski Signed-off-by: Jedrzej Jagielski Tested-by: Marek Szlosek Signed-off-by: Tony Nguyen --- This was initially sent [1] and has had the -EAGAIN removed. It was an extra safety check, where the adjusted goto (alone) has been tested to resolve the issue. [1] https://lore.kernel.org/netdev/20220712182636.53a957cc@kernel.org/ drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 45d097a164ad..6aa3eff0da2c 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -2367,7 +2367,7 @@ static void iavf_init_get_resources(struct iavf_adapter *adapter) err = iavf_get_vf_config(adapter); if (err == -EALREADY) { err = iavf_send_vf_config_msg(adapter); - goto err_alloc; + goto err; } else if (err == -EINVAL) { /* We only get -EINVAL if the device is in a very bad * state or if we've been disabled for previous bad From patchwork Fri Aug 12 17:23:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tony Nguyen X-Patchwork-Id: 12942409 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E40A8C25B0E for ; Fri, 12 Aug 2022 17:23:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238466AbiHLRXY (ORCPT ); Fri, 12 Aug 2022 13:23:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237987AbiHLRXU (ORCPT ); Fri, 12 Aug 2022 13:23:20 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 03707B2491 for ; Fri, 12 Aug 2022 10:23:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660324999; x=1691860999; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yXk+ZQsOUtBczz47M8jmvnsdvr7cRNXIlTxiphZfOR8=; b=GgsqqjRzvnU2IAeUDL2LHI0NuY81w2K0e1uD21LgFHP2Id7eL7A7mXFV 2HdOwNUqz+DsEYxLbWCdsYo0bGOyfHOwk7F46QoOgoCWuP6ni4MPVcB7/ afu/xgUrBPMq+BVxSqTN94po+G83K7lhMJ5a4OI3cYdiHe2AEp4+ywk+u +JBK1N+9mHwnWkDvAzC3D8Fz17EqsC5jtBj5RLc5gShIkIplMW89jDU+S M9rfYh+CTTalIDTVXJlxVYRtNqqNX5ET6mo9Q3+dJEeApfb0Tb74Kzha4 3hH5y0zMSDv7YjHL9HhUQL0bp6eh4EQa1CxCAB0wFGe+x7LuYGnSumMC9 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10437"; a="290396035" X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="290396035" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Aug 2022 10:23:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="634716857" Received: from anguy11-desk2.jf.intel.com ([10.166.244.147]) by orsmga008.jf.intel.com with ESMTP; 12 Aug 2022 10:23:16 -0700 From: Tony Nguyen To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com Cc: Przemyslaw Patynowski , netdev@vger.kernel.org, anthony.l.nguyen@intel.com, Jedrzej Jagielski , Marek Szlosek Subject: [PATCH net 3/4] iavf: Fix reset error handling Date: Fri, 12 Aug 2022 10:23:08 -0700 Message-Id: <20220812172309.853230-4-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220812172309.853230-1-anthony.l.nguyen@intel.com> References: <20220812172309.853230-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Przemyslaw Patynowski Do not call iavf_close in iavf_reset_task error handling. Doing so can lead to double call of napi_disable, which can lead to deadlock there. Removing VF would lead to iavf_remove task being stuck, because it requires crit_lock, which is held by iavf_close. Call iavf_disable_vf if reset fail, so that driver will clean up remaining invalid resources. During rapid VF resets, HW can fail to setup VF mailbox. Wrong error handling can lead to iavf_remove being stuck with: [ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53 ... [ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds. [ 5267.189520] Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 5267.190062] task:repro.sh state:D stack: 0 pid:11219 ppid: 8162 flags:0x00000000 [ 5267.190347] Call Trace: [ 5267.190647] [ 5267.190927] __schedule+0x460/0x9f0 [ 5267.191264] schedule+0x44/0xb0 [ 5267.191563] schedule_preempt_disabled+0x14/0x20 [ 5267.191890] __mutex_lock.isra.12+0x6e3/0xac0 [ 5267.192237] ? iavf_remove+0xf9/0x6c0 [iavf] [ 5267.192565] iavf_remove+0x12a/0x6c0 [iavf] [ 5267.192911] ? _raw_spin_unlock_irqrestore+0x1e/0x40 [ 5267.193285] pci_device_remove+0x36/0xb0 [ 5267.193619] device_release_driver_internal+0xc1/0x150 [ 5267.193974] pci_stop_bus_device+0x69/0x90 [ 5267.194361] pci_stop_and_remove_bus_device+0xe/0x20 [ 5267.194735] pci_iov_remove_virtfn+0xba/0x120 [ 5267.195130] sriov_disable+0x2f/0xe0 [ 5267.195506] ice_free_vfs+0x7d/0x2f0 [ice] [ 5267.196056] ? pci_get_device+0x4f/0x70 [ 5267.196496] ice_sriov_configure+0x78/0x1a0 [ice] [ 5267.196995] sriov_numvfs_store+0xfe/0x140 [ 5267.197466] kernfs_fop_write_iter+0x12e/0x1c0 [ 5267.197918] new_sync_write+0x10c/0x190 [ 5267.198404] vfs_write+0x24e/0x2d0 [ 5267.198886] ksys_write+0x5c/0xd0 [ 5267.199367] do_syscall_64+0x3a/0x80 [ 5267.199827] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 5267.200317] RIP: 0033:0x7f5b381205c8 [ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8 [ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001 [ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820 [ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0 [ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002 [ 5267.206041] [ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks [ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S E 5.18.0-04958-ga54ce3703613-dirty #1 [ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019 [ 5267.209623] Call Trace: [ 5267.210569] [ 5267.211480] dump_stack_lvl+0x33/0x42 [ 5267.212472] panic+0x107/0x294 [ 5267.213467] watchdog.cold.8+0xc/0xbb [ 5267.214413] ? proc_dohung_task_timeout_secs+0x30/0x30 [ 5267.215511] kthread+0xf4/0x120 [ 5267.216459] ? kthread_complete_and_exit+0x20/0x20 [ 5267.217505] ret_from_fork+0x22/0x30 [ 5267.218459] Fixes: f0db78928783 ("i40evf: use netdev variable in reset task") Signed-off-by: Przemyslaw Patynowski Signed-off-by: Jedrzej Jagielski Tested-by: Marek Szlosek Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 6aa3eff0da2c..95d4348e7579 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -3086,12 +3086,15 @@ static void iavf_reset_task(struct work_struct *work) return; reset_err: + if (running) { + set_bit(__IAVF_VSI_DOWN, adapter->vsi.state); + iavf_free_traffic_irqs(adapter); + } + iavf_disable_vf(adapter); + mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); - if (running) - iavf_change_state(adapter, __IAVF_RUNNING); dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n"); - iavf_close(netdev); } /** From patchwork Fri Aug 12 17:23:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tony Nguyen X-Patchwork-Id: 12942411 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8563C282E7 for ; Fri, 12 Aug 2022 17:23:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238666AbiHLRX1 (ORCPT ); Fri, 12 Aug 2022 13:23:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237992AbiHLRXU (ORCPT ); Fri, 12 Aug 2022 13:23:20 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D42AB2490 for ; Fri, 12 Aug 2022 10:23:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660325000; x=1691861000; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/1VBxrCN2fjIF05RGteZ5bU8n8lorWLvY/y9d87GZ4o=; b=XMkRu7qpoNHCnOL4GHLaJHBSLulIK2QAgnGEYWBEhUvc9HbUQVvYh4xy +gTfKF7rQj8SZJ66ArzdMRZ5dUOvQr6XyaduhTsBMtrLZeeGBrhJThyRT uqFrg/LOQXSxfgRViFYwF2/hk88OBErzdg5vgdj1fHVV/LiXkbgvx1Elq Mo3bVVh4ypSWMvCLt4H32fswfHuNDeCV8jNHaK9CsNT2iTRi3ZRS9lixt FfX4U+spF0sAjB9dv/Tc+B1/2FIFBPNxbujADDh3bk5RotjUHXC86ehFg EUWDSAWPY1t741ouvW8waIdB3U0rm7zhvK09zcv5e0Xq3HuMb46hGg1Go w==; X-IronPort-AV: E=McAfee;i="6400,9594,10437"; a="290396036" X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="290396036" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Aug 2022 10:23:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,233,1654585200"; d="scan'208";a="634716860" Received: from anguy11-desk2.jf.intel.com ([10.166.244.147]) by orsmga008.jf.intel.com with ESMTP; 12 Aug 2022 10:23:16 -0700 From: Tony Nguyen To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com Cc: Ivan Vecera , netdev@vger.kernel.org, anthony.l.nguyen@intel.com, sassmann@kpanic.de, Konrad Jankowski Subject: [PATCH net 4/4] iavf: Fix deadlock in initialization Date: Fri, 12 Aug 2022 10:23:09 -0700 Message-Id: <20220812172309.853230-5-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220812172309.853230-1-anthony.l.nguyen@intel.com> References: <20220812172309.853230-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Ivan Vecera Fix deadlock that occurs when iavf interface is a part of failover configuration. 1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task() 2. Function iavf_init_config_adapter() is called when adapter state is __IAVF_INIT_CONFIG_ADAPTER 3. iavf_init_config_adapter() calls register_netdevice() that emits NETDEV_REGISTER event 4. Notifier function failover_event() then calls net_failover_slave_register() that calls dev_open() 5. dev_open() calls iavf_open() that tries to take crit_lock in end-less loop Stack trace: ... [ 790.251876] usleep_range_state+0x5b/0x80 [ 790.252547] iavf_open+0x37/0x1d0 [iavf] [ 790.253139] __dev_open+0xcd/0x160 [ 790.253699] dev_open+0x47/0x90 [ 790.254323] net_failover_slave_register+0x122/0x220 [net_failover] [ 790.255213] failover_slave_register.part.7+0xd2/0x180 [failover] [ 790.256050] failover_event+0x122/0x1ab [failover] [ 790.256821] notifier_call_chain+0x47/0x70 [ 790.257510] register_netdevice+0x20f/0x550 [ 790.258263] iavf_watchdog_task+0x7c8/0xea0 [iavf] [ 790.259009] process_one_work+0x1a7/0x360 [ 790.259705] worker_thread+0x30/0x390 To fix the situation we should check the current adapter state after first unsuccessful mutex_trylock() and return with -EBUSY if it is __IAVF_INIT_CONFIG_ADAPTER. Fixes: 226d528512cf ("iavf: fix locking of critical sections") Signed-off-by: Ivan Vecera Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 95d4348e7579..f39440ad5c50 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -4088,8 +4088,17 @@ static int iavf_open(struct net_device *netdev) return -EIO; } - while (!mutex_trylock(&adapter->crit_lock)) + while (!mutex_trylock(&adapter->crit_lock)) { + /* If we are in __IAVF_INIT_CONFIG_ADAPTER state the crit_lock + * is already taken and iavf_open is called from an upper + * device's notifier reacting on NETDEV_REGISTER event. + * We have to leave here to avoid dead lock. + */ + if (adapter->state == __IAVF_INIT_CONFIG_ADAPTER) + return -EBUSY; + usleep_range(500, 1000); + } if (adapter->state != __IAVF_DOWN) { err = -EBUSY;