Message ID | 20210112181441.206545-1-sukadev@linux.ibm.com (mailing list archive) |
---|---|
Headers | show |
Series | ibmvnic: Use more consistent locking | expand |
On Tue, 2021-01-12 at 10:14 -0800, Sukadev Bhattiprolu wrote: > Use more consistent locking when reading/writing the adapter->state > field. This patch set fixes a race condition during ibmvnic_open() > where the adapter could be left in the PROBED state if a reset occurs > at the wrong time. This can cause networking to not come up during > boot and potentially require manual intervention in bringing up > applications that depend on the network. > > Changelog[v2] [Address comments from Jakub Kicinski] > - Fix up commit log for patch 5/7 and drop unnecessary variable > - Format Fixes line properly (no wrapping, no blank lines) > > Sukadev Bhattiprolu (7): > ibmvnic: restore state in change-param reset > ibmvnic: update reset function prototypes > ibmvnic: avoid allocating rwi entries > ibmvnic: switch order of checks in ibmvnic_reset > ibmvnic: serialize access to work queue > ibmvnic: check adapter->state under state_lock > ibmvnic: add comments about state_lock > > Other than the two minor comments, Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
On Tue, 12 Jan 2021 10:14:34 -0800 Sukadev Bhattiprolu wrote: > Use more consistent locking when reading/writing the adapter->state > field. This patch set fixes a race condition during ibmvnic_open() > where the adapter could be left in the PROBED state if a reset occurs > at the wrong time. This can cause networking to not come up during > boot and potentially require manual intervention in bringing up > applications that depend on the network. Apologies for not having enough time to suggest details, but let me state this again - the patches which fix bugs need to go into net with Fixes tags, the refactoring needs to go to net-next without Fixes tags. If there are dependencies, patches go to net first, then within a week or so the reset can be posted for net-next, after net -> net-next merge.
Jakub Kicinski [kuba@kernel.org] wrote: > On Tue, 12 Jan 2021 10:14:34 -0800 Sukadev Bhattiprolu wrote: > > Use more consistent locking when reading/writing the adapter->state > > field. This patch set fixes a race condition during ibmvnic_open() > > where the adapter could be left in the PROBED state if a reset occurs > > at the wrong time. This can cause networking to not come up during > > boot and potentially require manual intervention in bringing up > > applications that depend on the network. > > Apologies for not having enough time to suggest details, but let me > state this again - the patches which fix bugs need to go into net with > Fixes tags, the refactoring needs to go to net-next without Fixes tags. > If there are dependencies, patches go to net first, then within a week > or so the reset can be posted for net-next, after net -> net-next merge. Well, the patch set fixes a set of bugs - main one is a locking bug fixed in patch 6. Other bugs are more minor or corner cases. Fixing the locking bug requires some refactoring/simplifying/reordering checks so lock can be properly acquired. Because of the size/cleanup, should we treat it as "next" material? i.e should I just drop the Fixes tag and resend to net-next? Or can we ignore the size of patchset and treat it all as bug fixes? Appreciate your input. Thanks, Sukadev
On Tue, 12 Jan 2021 20:42:47 -0800 Sukadev Bhattiprolu wrote: > Jakub Kicinski [kuba@kernel.org] wrote: > > On Tue, 12 Jan 2021 10:14:34 -0800 Sukadev Bhattiprolu wrote: > > > Use more consistent locking when reading/writing the adapter->state > > > field. This patch set fixes a race condition during ibmvnic_open() > > > where the adapter could be left in the PROBED state if a reset occurs > > > at the wrong time. This can cause networking to not come up during > > > boot and potentially require manual intervention in bringing up > > > applications that depend on the network. > > > > Apologies for not having enough time to suggest details, but let me > > state this again - the patches which fix bugs need to go into net with > > Fixes tags, the refactoring needs to go to net-next without Fixes tags. > > If there are dependencies, patches go to net first, then within a week > > or so the reset can be posted for net-next, after net -> net-next merge. > > Well, the patch set fixes a set of bugs - main one is a locking bug fixed > in patch 6. Other bugs are more minor or corner cases. Fixing the locking > bug requires some refactoring/simplifying/reordering checks so lock can be > properly acquired. > > Because of the size/cleanup, should we treat it as "next" material? i.e > should I just drop the Fixes tag and resend to net-next? > > Or can we ignore the size of patchset and treat it all as bug fixes? No, focus on doing this right rather than trying to justify why your patches deserve special treatment. Throw this entire series out and start over with the goal of fixing the bugs with minimal patches.
Jakub Kicinski [kuba@kernel.org] wrote: > On Tue, 12 Jan 2021 20:42:47 -0800 Sukadev Bhattiprolu wrote: > > Jakub Kicinski [kuba@kernel.org] wrote: > > > On Tue, 12 Jan 2021 10:14:34 -0800 Sukadev Bhattiprolu wrote: > > > > Use more consistent locking when reading/writing the adapter->state > > > > field. This patch set fixes a race condition during ibmvnic_open() > > > > where the adapter could be left in the PROBED state if a reset occurs > > > > at the wrong time. This can cause networking to not come up during > > > > boot and potentially require manual intervention in bringing up > > > > applications that depend on the network. > > > > > > Apologies for not having enough time to suggest details, but let me > > > state this again - the patches which fix bugs need to go into net with > > > Fixes tags, the refactoring needs to go to net-next without Fixes tags. > > > If there are dependencies, patches go to net first, then within a week > > > or so the reset can be posted for net-next, after net -> net-next merge. > > > > Well, the patch set fixes a set of bugs - main one is a locking bug fixed > > in patch 6. Other bugs are more minor or corner cases. Fixing the locking > > bug requires some refactoring/simplifying/reordering checks so lock can be > > properly acquired. > > > > Because of the size/cleanup, should we treat it as "next" material? i.e > > should I just drop the Fixes tag and resend to net-next? > > > > Or can we ignore the size of patchset and treat it all as bug fixes? > > No, focus on doing this right rather than trying to justify why your > patches deserve special treatment. I am not asking for special treatment. I just don't get the rationale for saying that a bug fix cannot have some amount of refactoring. Specially a bug fix like this locking one which clearly touches various parts of the code. To take the lock properly we do have to move code around. > > Throw this entire series out and start over with the goal of fixing > the bugs with minimal patches. Fixing corner case bugs that have been around for a while in code that we are going to throw away feels like spinning wheels just to comply with the "process". Other than the optimization for the spin lock, there have been no reported technical issues with the patch set. Throwing away the patch set and starting over - I would be focusing not on the bugs or making the driver better but somehow complying with the process. The tiny memory leak issues I mention for completeness are just rare corner cases and a distraction. The main issue that people actually hit is the locking one. Fixing the locking issue touches lot of code. I to throw away the list implementation and add a couple of simple interfaces because if the allocation fails we call ibmvnic_close() - that messes with the locking I am trying to fix. Sukadev
Jakub Kicinski [kuba@kernel.org] wrote: > On Tue, 12 Jan 2021 10:14:34 -0800 Sukadev Bhattiprolu wrote: > > Use more consistent locking when reading/writing the adapter->state > > field. This patch set fixes a race condition during ibmvnic_open() > > where the adapter could be left in the PROBED state if a reset occurs > > at the wrong time. This can cause networking to not come up during > > boot and potentially require manual intervention in bringing up > > applications that depend on the network. > > Apologies for not having enough time to suggest details, but let me > state this again - the patches which fix bugs need to go into net with > Fixes tags, the refactoring needs to go to net-next without Fixes tags. > If there are dependencies, patches go to net first, then within a week > or so the reset can be posted for net-next, after net -> net-next merge. I think the locking bug fixes need the refactoring. So would it be ok to send the refactoring (patches 2 through 4) first to net-next and when they are merged send the the bug fixes (1, 5 and 6)? Patch 7 can be sent to net-next later after that. Thanks, Sukadev