| Message ID | 20240805182159.3547482-1-manojvishy@google.com |
|---|---|
| State | Awaiting Upstream |
| Delegated to: | Netdev Maintainers |
| Series | [iwl-net] idpf: Acquire the lock before accessing the xn->salt |
On 8/5/24 20:21, Manoj Vishwanathan wrote:
> The transaction salt was being accessed before acquiring the
> idpf_vc_xn_lock when idpf has to forward the virtchnl reply.
>
> Fixes: 34c21fa894a1a ("idpf: implement virtchnl transaction manager")
> Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

Thank you!

(next time please add "v2" (and so on) to subsequent versions of the
patch/series; please also send as a standalone thread, instead of
in-reply-to v1; no need to repost this one for any of those though)

> ---
>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> index 70986e12da28..30eec674d594 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> @@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter,
>  		return -EINVAL;
>  	}
>  	xn = &adapter->vcxn_mngr->ring[xn_idx];
> +	idpf_vc_xn_lock(xn);
>  	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);
>  	if (xn->salt != salt) {
>  		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
>  				    xn->salt, salt);
> +		idpf_vc_xn_unlock(xn);
>  		return -EINVAL;
>  	}
>
> -	idpf_vc_xn_lock(xn);
>  	switch (xn->state) {
>  	case IDPF_VC_XN_WAITING:
>  		/* success */
From: Manoj Vishwanathan <manojvishy@google.com>
Date: Mon, 5 Aug 2024 18:21:59 +0000

> The transaction salt was being accessed before acquiring the
> idpf_vc_xn_lock when idpf has to forward the virtchnl reply.

You need to explain in details here what issue you have faced due to
that, how to reproduce it and how this fix does help.
Otherwise, it's impossible to suggest what is happening and how to test
whether the fix is correct.

>
> Fixes: 34c21fa894a1a ("idpf: implement virtchnl transaction manager")
> Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> index 70986e12da28..30eec674d594 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> @@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter,
>  		return -EINVAL;
>  	}
>  	xn = &adapter->vcxn_mngr->ring[xn_idx];
> +	idpf_vc_xn_lock(xn);
>  	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);

The lock can be taken here after the FIELD_GET(), not before, to reduce
the critical/locked section execution time.

>  	if (xn->salt != salt) {
>  		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
>  				    xn->salt, salt);
> +		idpf_vc_xn_unlock(xn);
>  		return -EINVAL;
>  	}
>
> -	idpf_vc_xn_lock(xn);
>  	switch (xn->state) {
>  	case IDPF_VC_XN_WAITING:
>  		/* success */

Thanks,
Olek
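A minimal sketch of the reordering suggested above (not the actual patch):
msg_info is already a local value, so FIELD_GET() can run before the lock,
and the per-transaction lock is then held only around the xn->salt check and
the state handling that follows, keeping the locked section short.

```c
	xn = &adapter->vcxn_mngr->ring[xn_idx];
	/* msg_info is local here, so FIELD_GET() itself does not touch xn */
	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);

	idpf_vc_xn_lock(xn);
	if (xn->salt != salt) {
		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
				    xn->salt, salt);
		idpf_vc_xn_unlock(xn);
		return -EINVAL;
	}

	switch (xn->state) {
	/* ... unchanged from the patch above ... */
```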
Thanks Przemek & Olek for your quick feedback and responses.

Hi Olek,
I can add more details about the issue we faced in the commit message.
The bug we had here was a virtchnl delay leading to the xn->salt
mismatch. This could be due to several factors including default CPU
bounded kworker workqueue for virtchnl message processing being
starved by aggressive userspace load causing the virtchnl to be
delayed. While debugging this issue, this locking order appeared like
a potential issue, hence the change was made.
But, this change is more a clean up we felt based on concurrent access
to the virtchnl transaction struct and does not fix the issue. This is
more of the patch to do the right thing before we access the "xn".
I wanted to start with a first patch to the community for acceptance
followed by a series of other patches that are general clean up or
improvements to IDPF in general. Will follow up with [PATCH v3].

Thanks,
Manoj

On Wed, Aug 7, 2024 at 4:05 AM Alexander Lobakin
<aleksander.lobakin@intel.com> wrote:
>
> From: Manoj Vishwanathan <manojvishy@google.com>
> Date: Mon, 5 Aug 2024 18:21:59 +0000
>
> > The transaction salt was being accessed before acquiring the
> > idpf_vc_xn_lock when idpf has to forward the virtchnl reply.
>
> You need to explain in details here what issue you have faced due to
> that, how to reproduce it and how this fix does help.
> Otherwise, it's impossible to suggest what is happening and how to test
> whether the fix is correct.
>
> >
> > Fixes: 34c21fa894a1a ("idpf: implement virtchnl transaction manager")
> > Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
> > ---
> >  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> > index 70986e12da28..30eec674d594 100644
> > --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> > +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> > @@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter,
> >  		return -EINVAL;
> >  	}
> >  	xn = &adapter->vcxn_mngr->ring[xn_idx];
> > +	idpf_vc_xn_lock(xn);
> >  	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);
>
> The lock can be taken here after the FIELD_GET(), not before, to reduce
> the critical/locked section execution time.
>
> >  	if (xn->salt != salt) {
> >  		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
> >  				    xn->salt, salt);
> > +		idpf_vc_xn_unlock(xn);
> >  		return -EINVAL;
> >  	}
> >
> > -	idpf_vc_xn_lock(xn);
> >  	switch (xn->state) {
> >  	case IDPF_VC_XN_WAITING:
> >  		/* success */
>
> Thanks,
> Olek
From: Manoj Vishwanathan <manojvishy@google.com>
Date: Wed, 7 Aug 2024 06:58:59 -0700

> Thanks Przemek & Olek for your quick feedback and responses.
> Hi Olek,
> I can add more details about the issue we faced in the commit message.
> The bug we had here was a virtchnl delay leading to the xn->salt
> mismatch. This could be due to several factors including default CPU
> bounded kworker workqueue for virtchnl message processing being

That's why I always tell people to stop creating more and more private
workqueues and just use the system ones, there's a whole collection for
every need...

> starved by aggressive userspace load causing the virtchnl to be
> delayed. While debugging this issue, this locking order appeared like
> a potential issue, hence the change was made.
> But, this change is more a clean up we felt based on concurrent access
> to the virtchnl transaction struct and does not fix the issue. This is
> more of the patch to do the right thing before we access the "xn".
> I wanted to start with a first patch to the community for acceptance
> followed by a series of other patches that are general clean up or
> improvements to IDPF in general. Will follow up with [PATCH v3].

Maybe you'd prepare a full series then right away? I hope it won't
conflict much with my tree (but you always can double-check[0])
(Chapter II is already posted here on IWL and netdev@)

>
> Thanks,
> Manoj
>
> On Wed, Aug 7, 2024 at 4:05 AM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> From: Manoj Vishwanathan <manojvishy@google.com>
>> Date: Mon, 5 Aug 2024 18:21:59 +0000
>>
>>> The transaction salt was being accessed before acquiring the
>>> idpf_vc_xn_lock when idpf has to forward the virtchnl reply.
>>
>> You need to explain in details here what issue you have faced due to
>> that, how to reproduce it and how this fix does help.
>> Otherwise, it's impossible to suggest what is happening and how to test
>> whether the fix is correct.
>>
>>>
>>> Fixes: 34c21fa894a1a ("idpf: implement virtchnl transaction manager")
>>> Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
>>> ---
>>>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>>> index 70986e12da28..30eec674d594 100644
>>> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>>> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>>> @@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter,
>>>  		return -EINVAL;
>>>  	}
>>>  	xn = &adapter->vcxn_mngr->ring[xn_idx];
>>> +	idpf_vc_xn_lock(xn);
>>>  	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);
>>
>> The lock can be taken here after the FIELD_GET(), not before, to reduce
>> the critical/locked section execution time.
>>
>>>  	if (xn->salt != salt) {
>>>  		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
>>>  				    xn->salt, salt);
>>> +		idpf_vc_xn_unlock(xn);
>>>  		return -EINVAL;
>>>  	}
>>>
>>> -	idpf_vc_xn_lock(xn);
>>>  	switch (xn->state) {
>>>  	case IDPF_VC_XN_WAITING:
>>>  		/* success */
>>
>> Thanks,
>> Olek

[0] https://github.com/alobakin/linux/commits/idpf-libie-new

Thanks,
Olek
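For reference, a hedged sketch of what "just use the system ones" can look
like; the work item and handler names below are illustrative and not taken
from the idpf driver.

```c
#include <linux/workqueue.h>

/* Hypothetical work item for deferring virtchnl reply processing. */
static struct work_struct vc_reply_work;

static void vc_reply_work_fn(struct work_struct *work)
{
	/* process the reply in process context */
}

static void example_setup(void)
{
	INIT_WORK(&vc_reply_work, vc_reply_work_fn);
}

static void example_queue(void)
{
	/*
	 * system_unbound_wq is not pinned to the submitting CPU, so the
	 * work is less likely to be starved by one busy CPU than work
	 * queued on a per-CPU (bound) workqueue.
	 */
	queue_work(system_unbound_wq, &vc_reply_work);
}
```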
On Wed, Aug 07, 2024 at 06:58:59AM -0700, Manoj Vishwanathan wrote:
> Thanks Przemek & Olek for your quick feedback and responses.
> Hi Olek,
> I can add more details about the issue we faced in the commit message.
> The bug we had here was a virtchnl delay leading to the xn->salt
> mismatch. This could be due to several factors including default CPU
> bounded kworker workqueue for virtchnl message processing being
> starved by aggressive userspace load causing the virtchnl to be
> delayed. While debugging this issue, this locking order appeared like
> a potential issue, hence the change was made.
> But, this change is more a clean up we felt based on concurrent access
> to the virtchnl transaction struct and does not fix the issue. This is
> more of the patch to do the right thing before we access the "xn".
> I wanted to start with a first patch to the community for acceptance
> followed by a series of other patches that are general clean up or
> improvements to IDPF in general. Will follow up with [PATCH v3].

Still, I am a little confused about the protection offered to xn->salt.
My analysis is as follows, where guarded is used loosely to mean the
lock is held.

* In idpf_vc_xn_pop_free() it is guarded by vcxn_mngr->xn_bm_lock.

* In idpf_vc_xn_exec() it is guarded by:
  1. vcxn_mngr->xn_bm_lock when idpf_vc_xn_pop_free is called
  2. idpf_vc_xn_lock, otherwise

* And with this patch, in idpf_vc_xn_forward_reply it is guarded by
  idpf_vc_xn_lock().

This doesn't seem entirely consistent.

Also, please don't top-post on Kernel mailing lists.

...
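Restated as a schematic (a simplified summary of the analysis above, not the
real function bodies):

```c
/*
 * Which lock guards xn->salt, per the analysis above:
 *
 * idpf_vc_xn_pop_free()        xn->salt written under vcxn_mngr->xn_bm_lock
 * idpf_vc_xn_exec()            xn->salt accessed under xn_bm_lock (via
 *                              idpf_vc_xn_pop_free()) or under
 *                              idpf_vc_xn_lock() otherwise
 * idpf_vc_xn_forward_reply()   xn->salt read under idpf_vc_xn_lock()
 *                              (with this patch)
 *
 * The writer and the readers do not always take the same lock for the
 * same field, which is the inconsistency being pointed out.
 */
```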
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 70986e12da28..30eec674d594 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter,
 		return -EINVAL;
 	}
 	xn = &adapter->vcxn_mngr->ring[xn_idx];
+	idpf_vc_xn_lock(xn);
 	salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info);
 	if (xn->salt != salt) {
 		dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt does not match (%02x != %02x)\n",
 				    xn->salt, salt);
+		idpf_vc_xn_unlock(xn);
 		return -EINVAL;
 	}
 
-	idpf_vc_xn_lock(xn);
 	switch (xn->state) {
 	case IDPF_VC_XN_WAITING:
 		/* success */
The transaction salt was being accessed before acquiring the
idpf_vc_xn_lock when idpf has to forward the virtchnl reply.

Fixes: 34c21fa894a1a ("idpf: implement virtchnl transaction manager")
Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
---
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)