[iwl-net,v5] ice: Do not get coalesce settings while in reset

Message ID 20240607121552.15127-1-dawid.osuchowski@linux.intel.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers

Checks

Context | Check | Description
netdev/series_format | success | Single patches do not need cover letters
netdev/tree_selection | success | Clearly marked for net
netdev/ynl | success | Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present | success | Fixes tag present in non-next series
netdev/header_inline | success | No static functions without inline keyword in header files
netdev/build_32bit | success | Errors and warnings before: 864 this patch: 864
netdev/build_tools | success | No tools touched, skip
netdev/cc_maintainers | fail | 2 blamed authors not CCed: brett.creeley@intel.com anirudh.venkataramanan@intel.com; 7 maintainers not CCed: anirudh.venkataramanan@intel.com pabeni@redhat.com kuba@kernel.org edumazet@google.com anthony.l.nguyen@intel.com brett.creeley@intel.com jesse.brandeburg@intel.com
netdev/build_clang | success | Errors and warnings before: 868 this patch: 868
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer
netdev/deprecated_api | success | None detected
netdev/check_selftest | success | No net selftest shell script
netdev/verify_fixes | success | Fixes tag looks correct
netdev/build_allmodconfig_warn | success | Errors and warnings before: 868 this patch: 868
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_clang_rust | success | No Rust files in patch. Skipping build
netdev/kdoc | success | Errors and warnings before: 55 this patch: 55
netdev/source_inline | success | Was 0 now: 0

Commit Message

Dawid Osuchowski June 7, 2024, 12:15 p.m. UTC
From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>

Getting coalesce settings while a reset is in progress can cause a NULL
pointer dereference.
If a reset is in progress, abort the ethtool get-coalesce operation.

We cannot use ice_wait_for_reset() since both the ethtool handler and the
adapter reset flow call rtnl_lock() during operation. If we wait for
reset completion inside an ethtool handling function such as
ice_get_coalesce(), the wait will always time out because the reset is
blocked by rtnl_lock() inside ice_queue_set_napi() (which is called
during the reset process). In turn we would always return -EBUSY anyway,
with the added hang time of the timeout value.

Fixes: 67fe64d78c43 ("ice: Implement getting and setting ethtool coalesce")
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
---
Changes since v1:
* Added "Fixes:" tag
Changes since v2:
* Rebased over current IWL net branch
* Confirmed that the issue previously reported for this patch [1] by
Himasekhar Reddy Pucha was caused by another, internally tracked issue
Changes since v3:
* Using ice_wait_for_reset() instead of returning -EBUSY
Changes since v4:
* Rebased over current IWL net branch
* Roll back the use of ice_wait_for_reset() due to the rtnl_lock() deadlock
issue described in [2] and the commit message

[1] https://lore.kernel.org/netdev/BL0PR11MB3122D70ABDE6C2ACEE376073BD90A@BL0PR11MB3122.namprd11.prod.outlook.com/
[2] https://lore.kernel.org/netdev/20240501195641.1e606747@kernel.org/T/#m1629ecfe88d26551852c5c97982cd10314991422
---
 drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Pucha, HimasekharX Reddy June 10, 2024, 9:53 a.m. UTC | #1
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Dawid Osuchowski
> Sent: Friday, June 7, 2024 5:46 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Kwan, Ngai-mint <ngai-mint.kwan@intel.com>; netdev@vger.kernel.org; Chmielewski, Pawel <pawel.chmielewski@intel.com>; Simon Horman <horms@kernel.org>; Polchlopek, Mateusz <mateusz.polchlopek@intel.com>; Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v5] ice: Do not get coalesce settings while in reset
>
> From: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
>
> Getting coalesce settings while a reset is in progress can cause a NULL pointer dereference.
> If a reset is in progress, abort the ethtool get-coalesce operation.
>
> We cannot use ice_wait_for_reset() since both the ethtool handler and the adapter reset flow call rtnl_lock() during operation. If we wait for reset completion inside an ethtool handling function such as ice_get_coalesce(), the wait will always time out because the reset is blocked by rtnl_lock() inside ice_queue_set_napi() (which is called during the reset process). In turn we would always return -EBUSY anyway, with the added hang time of the timeout value.
>
> Fixes: 67fe64d78c43 ("ice: Implement getting and setting ethtool coalesce")
> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
> Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
> Reviewed-by: Simon Horman <horms@kernel.org>
> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> ---
> Changes since v1:
> * Added "Fixes:" tag
> Changes since v2:
> * Rebased over current IWL net branch
> * Confirmed that the issue previously reported for this patch [1] by Himasekhar Reddy Pucha was caused by other, internally tracked issue
> Changes since v3:
> * Using ice_wait_for_reset() instead of returning -EBUSY
> Changes since v4:
> * Rebased over current IWL net branch
> * Rollback the use of ice_wait_for_reset() due to rtnl_lock() deadlock issue described in [2] and commit msg
>
> [1] https://lore.kernel.org/netdev/BL0PR11MB3122D70ABDE6C2ACEE376073BD90A@BL0PR11MB3122.namprd11.prod.outlook.com/
> [2] https://lore.kernel.org/netdev/20240501195641.1e606747@kernel.org/T/#m1629ecfe88d26551852c5c97982cd10314991422
> ---
>  drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
>  1 file changed, 3 insertions(+)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)

Jakub Kicinski June 11, 2024, 2:47 a.m. UTC | #2
On Fri,  7 Jun 2024 14:15:52 +0200 Dawid Osuchowski wrote:
> We cannot use ice_wait_for_reset() since both the ethtool handler and the
> adapter reset flow call rtnl_lock() during operation. If we wait for
> reset completion inside of an ethtool handling function such as
> ice_get_coalesce(), the wait will always timeout due to reset being
> blocked by rtnl_lock() inside of ice_queue_set_napi() (which is called
> during reset process), and in turn we will always return -EBUSY anyways,
> with the added hang time of the timeout value.

Why does the reset not call netif_device_detach()?
Then core will know not to call the driver.

> Fixes: 67fe64d78c43 ("ice: Implement getting and setting ethtool coalesce")

Isn't ice_queue_set_napi() much more recent than this commit?

Dawid Osuchowski June 14, 2024, 11:50 a.m. UTC | #3
On 11.06.2024 04:47, Jakub Kicinski wrote:

> Why does the reset not call netif_device_detach()?
> Then core will know not to call the driver.

Will use this approach in new patch, thanks.

--Dawid
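
The approach suggested in comment #2 can be pictured with a small user-space model. This is a sketch under stated assumptions, not kernel code: every type and function below (model_netdev, model_core_get_coalesce(), model_reset_begin() and so on) is hypothetical. It assumes that once a device is marked not present, as netif_device_detach() does, the core returns an error such as -ENODEV instead of calling into the driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
struct model_coalesce {
	unsigned int rx_usecs;
};

struct model_netdev {
	bool present;				/* models the "device present" state */
	const struct model_coalesce *rings;	/* NULL while a reset rebuilds them */
	int (*get_coalesce)(struct model_netdev *dev, struct model_coalesce *out);
};

static void model_device_detach(struct model_netdev *dev)
{
	dev->present = false;
}

static void model_device_attach(struct model_netdev *dev)
{
	dev->present = true;
}

/* Driver callback: would dereference NULL if called in the middle of a reset. */
static int model_drv_get_coalesce(struct model_netdev *dev,
				  struct model_coalesce *out)
{
	*out = *dev->rings;
	return 0;
}

/* Core entry point: never calls the driver while the device is detached. */
static int model_core_get_coalesce(struct model_netdev *dev,
				   struct model_coalesce *out)
{
	if (!dev->present)
		return -ENODEV;
	return dev->get_coalesce(dev, out);
}

/* Reset start: detach first, then tear down what the callbacks rely on. */
static void model_reset_begin(struct model_netdev *dev)
{
	model_device_detach(dev);
	dev->rings = NULL;
}

/* Reset end: restore the data first, then make the device visible again. */
static void model_reset_end(struct model_netdev *dev,
			    const struct model_coalesce *rings)
{
	dev->rings = rings;
	model_device_attach(dev);
}
```

With this ordering the driver callback needs no reset check of its own: between model_reset_begin() and model_reset_end() the core-level gate rejects the request before the driver is reached.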
Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 62c8205fceba..2ffe864a364c 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -3810,6 +3810,9 @@  __ice_get_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 	struct ice_vsi *vsi = np->vsi;
 
+	if (ice_is_reset_in_progress(vsi->back->state))
+		return -EBUSY;
+
 	if (q_num < 0)
 		q_num = 0;
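
The effect of the three added lines can be sketched outside the kernel as follows. This is a simplified model, not the driver's code: the state bits, structures and function names are hypothetical, and the real ice_is_reset_in_progress() may check different flags. The point is only that the check runs before any per-ring data is touched.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state bits; the driver's real flag names are not used here. */
enum model_state_bit {
	MODEL_RESET_RECEIVED  = 1 << 0,
	MODEL_RESET_REQUESTED = 1 << 1,
};

struct model_ring {
	unsigned int itr_usecs;
};

struct model_pf {
	unsigned long state;
};

struct model_vsi {
	struct model_pf *back;
	struct model_ring *rx_ring;	/* NULL while a reset tears rings down */
};

/* True if any reset-related bit is set in the state word. */
static bool model_reset_in_progress(unsigned long state)
{
	return (state & (MODEL_RESET_RECEIVED | MODEL_RESET_REQUESTED)) != 0;
}

/* Bail out with -EBUSY before dereferencing ring pointers during a reset. */
static int model_get_coalesce(const struct model_vsi *vsi, unsigned int *usecs)
{
	if (model_reset_in_progress(vsi->back->state))
		return -EBUSY;

	*usecs = vsi->rx_ring->itr_usecs;
	return 0;
}
```

A caller that sees -EBUSY can retry once the reset has finished, whereas without the early return the ring dereference would fault.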