Message ID | 20210420213517.24171-1-drt@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [V2,net] ibmvnic: Continue with reset if set link down failed |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 1 blamed authors not CCed: nfont@linux.vnet.ibm.com; 1 maintainers not CCed: nfont@linux.vnet.ibm.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 3 this patch: 3 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 100 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
> On Apr 20, 2021, at 4:35 PM, Dany Madden <drt@linux.ibm.com> wrote:
>
> When ibmvnic gets a FATAL error message from the vnicserver, it marks
> the Command Response Queue (CRQ) inactive and resets the adapter. If
> this FATAL reset fails and a transmission timeout reset follows, the
> CRQ is still inactive, and ibmvnic's attempt to set the link down will
> also fail. If ibmvnic abandons the reset because of this failed set
> link down and this is the last reset in the workqueue, then this
> adapter will be left in an inoperable state.
>
> Instead, make the driver ignore this link down failure and continue to
> free and re-register the CRQ so that the adapter has an opportunity to
> recover.

This v2 does not address the concerns mentioned in v1. And I think it is
better to exit with an error from do_reset and schedule a thorough
do_hard_reset if the adapter is already in an unstable state.

> Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> Signed-off-by: Dany Madden <drt@linux.ibm.com>
> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> ---
> Changes in V2:
> - Update description to clarify background for the patch
> - Include Reviewed-by tags
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index ffb2a91750c7..4bd8c5d1a275 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1970,8 +1970,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>  		rtnl_unlock();
>  		rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
>  		rtnl_lock();
> -		if (rc)
> -			goto out;
> +		if (rc) {
> +			netdev_dbg(netdev,
> +				   "Setting link down failed rc=%d. Continue anyway\n", rc);
> +		}
>
>  		if (adapter->state == VNIC_OPEN) {
>  			/* When we dropped rtnl, ibmvnic_open() got
> --
> 2.26.2
Lijun Pan [ljp@linux.vnet.ibm.com] wrote:
>
> > On Apr 20, 2021, at 4:35 PM, Dany Madden <drt@linux.ibm.com> wrote:
> >
> > When ibmvnic gets a FATAL error message from the vnicserver, it marks
> > the Command Response Queue (CRQ) inactive and resets the adapter. If
> > this FATAL reset fails and a transmission timeout reset follows, the
> > CRQ is still inactive, and ibmvnic's attempt to set the link down
> > will also fail. If ibmvnic abandons the reset because of this failed
> > set link down and this is the last reset in the workqueue, then this
> > adapter will be left in an inoperable state.
> >
> > Instead, make the driver ignore this link down failure and continue
> > to free and re-register the CRQ so that the adapter has an
> > opportunity to recover.
>
> This v2 does not address the concerns mentioned in v1. And I think it
> is better to exit with an error from do_reset and schedule a thorough
> do_hard_reset if the adapter is already in an unstable state.

We had a FATAL error, and when handling it, we failed to send a
link-down message to the VIOS. So what we need to try next is to reset
the connection with the VIOS. For this we must talk to the firmware
using the H_FREE_CRQ and H_REG_CRQ hcalls. do_reset() does just that in
ibmvnic_reset_crq().

Now, sure, we can attempt a "thorough hard reset", which also does the
same hcalls to reestablish the connection. Is there any other magic in
do_hard_reset()? In addition, it also frees a lot more Linux kernel
buffers and reallocates them, for instance.

If we are having a communication problem with the VIOS, what is the
point of freeing and reallocating Linux kernel buffers? Besides being
inefficient, it would expose us to even more errors during reset under
heavy workloads.

From what I understand so far, do_reset() is complicated because it is
attempting some optimizations. If we are going to fall back to hard
reset for every error, we might as well drop do_reset() and just do the
"thorough hard reset" every time, right?

The protocol spec is ambiguous, and so far I have not gotten a clear
answer on whether the link-down is even needed. If it is needed, should
we add it to do_hard_reset() also? If not, we should remove it
completely (like you mentioned earlier), but I am waiting for
confirmation on that. git history has not been helpful.

While there are other rough edges around do_reset() that we are working
on fixing separately (e.g., ignoring the error return from
__ibmvnic_close() right above this change), I see a benefit to the
customer with this patch. I am not convinced we should perform a hard
reset just because the link down failed.

Sukadev
On 4/20/21 2:42 PM, Lijun Pan wrote:
>
> This v2 does not address the concerns mentioned in v1. And I think it
> is better to exit with an error from do_reset and schedule a thorough
> do_hard_reset if the adapter is already in an unstable state.

But the point is that the testing and analysis have indicated that doing
a full hard reset is not necessary. We are about to take the very action
which will fix this situation, but currently do not.

Please describe the advantage in deferring it further by routing it
through do_hard_reset(). I don't see one.

Rick
On Wed, Apr 21, 2021 at 2:25 AM Sukadev Bhattiprolu
<sukadev@linux.ibm.com> wrote:
>
> Lijun Pan [ljp@linux.vnet.ibm.com] wrote:
> >
> > > On Apr 20, 2021, at 4:35 PM, Dany Madden <drt@linux.ibm.com> wrote:
> > >
> > > When ibmvnic gets a FATAL error message from the vnicserver, it
> > > marks the Command Response Queue (CRQ) inactive and resets the
> > > adapter. If this FATAL reset fails and a transmission timeout reset
> > > follows, the CRQ is still inactive, and ibmvnic's attempt to set
> > > the link down will also fail. If ibmvnic abandons the reset because
> > > of this failed set link down and this is the last reset in the
> > > workqueue, then this adapter will be left in an inoperable state.
> > >
> > > Instead, make the driver ignore this link down failure and continue
> > > to free and re-register the CRQ so that the adapter has an
> > > opportunity to recover.
> >
> > This v2 does not address the concerns mentioned in v1. And I think it
> > is better to exit with an error from do_reset and schedule a thorough
> > do_hard_reset if the adapter is already in an unstable state.
>
> We had a FATAL error, and when handling it, we failed to send a
> link-down message to the VIOS. So what we need to try next is to reset
> the connection with the VIOS. For this we must talk to the firmware
> using the H_FREE_CRQ and H_REG_CRQ hcalls. do_reset() does just that
> in ibmvnic_reset_crq().
>
> Now, sure, we can attempt a "thorough hard reset", which also does the
> same hcalls to reestablish the connection. Is there any other magic in
> do_hard_reset()? In addition, it also frees a lot more Linux kernel
> buffers and reallocates them, for instance.

Working around everything in do_reset will make the code very difficult
to manage. Ultimately do_reset can do anything, I am afraid, and
do_hard_reset can be removed completely or merged into do_reset.

> If we are having a communication problem with the VIOS, what is the
> point of freeing and reallocating Linux kernel buffers? Besides being
> inefficient, it would expose us to even more errors during reset under
> heavy workloads.

No real customer runs the system under that heavy load created by the
HTX stress test, which can tear down any working system.

> From what I understand so far, do_reset() is complicated because it is
> attempting some optimizations. If we are going to fall back to hard
> reset for every error, we might as well drop do_reset() and just do
> the "thorough hard reset" every time, right?

I think such optimizations are catered to passing HTX tests. Whether the
optimization benefits the adapter, say by making the adapter more
stable, I doubt. I think there should be a trade-off between
optimization and stability.
On Wed, Apr 21, 2021 at 3:06 AM Rick Lindsley
<ricklind@linux.vnet.ibm.com> wrote:
>
> On 4/20/21 2:42 PM, Lijun Pan wrote:
> >
> > This v2 does not address the concerns mentioned in v1. And I think it
> > is better to exit with an error from do_reset and schedule a thorough
> > do_hard_reset if the adapter is already in an unstable state.
>
> But the point is that the testing and analysis have indicated that
> doing a full hard reset is not necessary. We are about to take the very
> action which will fix this situation, but currently do not.

The testing was done on this patch. It was not performed on a full hard
reset. So I don't think you can even compare the two results.

> Please describe the advantage in deferring it further by routing it
> through do_hard_reset(). I don't see one.

It is not deferred. It exits with an error and calls do_hard_reset.
See my reply to Suka's email.
On Tue, Apr 20, 2021 at 4:37 PM Dany Madden <drt@linux.ibm.com> wrote:
>
> When ibmvnic gets a FATAL error message from the vnicserver, it marks
> the Command Response Queue (CRQ) inactive and resets the adapter. If
> this FATAL reset fails and a transmission timeout reset follows, the
> CRQ is still inactive, and ibmvnic's attempt to set the link down will
> also fail. If ibmvnic abandons the reset because of this failed set
> link down and this is the last reset in the workqueue, then this
> adapter will be left in an inoperable state.
>
> Instead, make the driver ignore this link down failure and continue to
> free and re-register the CRQ so that the adapter has an opportunity to
> recover.
>
> Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> Signed-off-by: Dany Madden <drt@linux.ibm.com>
> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>

One thing I would like to point out, as already pointed out by Nathan
Lynch, is that Reviewed-by tags given by the same group of people from
the same company lose credibility over time if you never critique or ask
questions on the list.
Lijun Pan [lijunp213@gmail.com] wrote:
> > Now, sure, we can attempt a "thorough hard reset", which also does
> > the same hcalls to reestablish the connection. Is there any other
> > magic in do_hard_reset()? In addition, it also frees a lot more
> > Linux kernel buffers and reallocates them, for instance.
>
> Working around everything in do_reset will make the code very difficult

We are not working around everything. We are doing in do_reset() exactly
what we would do in a hard reset for this error (ignore the set link
down error and try to reestablish the connection with the VIOS). What we
are avoiding is unnecessary work on the Linux side for a communication
problem on the VIOS side.

> to manage. Ultimately do_reset can do anything, I am afraid, and
> do_hard_reset can be removed completely or merged into do_reset.
>
> > If we are having a communication problem with the VIOS, what is the
> > point of freeing and reallocating Linux kernel buffers? Besides
> > being inefficient, it would expose us to even more errors during
> > reset under heavy workloads?
>
> No real customer runs the system under that heavy load created by the
> HTX stress test, which can tear down any working system.

We need to talk to capacity planning and test architects about that, but
all I want to know is what hard reset would do differently to fix this
communication error with the VIOS.

Sukadev
On Wed, Apr 21, 2021 at 3:06 AM Rick Lindsley
<ricklind@linux.vnet.ibm.com> wrote:
> > Please describe the advantage in deferring it further by routing it
> > through do_hard_reset(). I don't see one.

On 4/21/21 10:12 PM, Lijun Pan replied:
> It is not deferred. It exits with error and calls do_hard_reset.
> See my reply to Suka's.

I saw your reply, but it does not answer the question I asked. The patch
would have us reinitialize and restart the communication queues. Your
suggestion would do more work than that. Please describe the advantage
in deferring the reinitialization - and yes, defer is the right word -
by routing it through do_hard_reset() and executing that extra code. I
see that route as doing more work than necessary, and so introducing
additional risk, for no clear advantage. So I would find it helpful if
you would describe the advantage.

> The testing was done on this patch. It was not performed on a full
> hard reset. So I don't think you could even compare the two results.

A problem has been noted, a patch has been proposed, and the reasoning
that the patch is correct has been given. Testing with this patch has
demonstrated that the problem has not returned. So far, that sounds like
a pretty reasonable solution.

Your comment is curious - why would testing for this patch be done on a
full hard reset when this patch does not invoke a full hard reset? If
you have data to consider, then let's have it. I'm willing to be
convinced, but so far this just sounds like "I wouldn't do it that way
myself, and I have a bad feeling about doing it any other way."

Rick
On 4/21/21 10:06 PM, Lijun Pan wrote:
> No real customer runs the system under that heavy load created by the
> HTX stress test, which can tear down any working system.

So, are you saying the bugs that HTX uncovers are not worth fixing?
There's a fair number of individuals and teams out there that utilize
HTX in the absence of such a workload, and not just for vnic. What
workload would you suggest to better serve "real customers"?

> I think such optimizations are catered to passing HTX tests.

Well, yes. If the bugs are found with HTX, the fixes are verified
against HTX. If they were found with a "real customer workload", they'd
be verified against a "real customer workload."
On 4/21/21 10:30 PM, Lijun Pan wrote:
> > Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> > Signed-off-by: Dany Madden <drt@linux.ibm.com>
> > Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
> > Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
>
> One thing I would like to point out, as already pointed out by Nathan
> Lynch, is that Reviewed-by tags given by the same group of people from
> the same company lose credibility over time if you never critique or
> ask questions on the list.

Well, so far you aren't addressing either my critiques or my questions.

I have been asking questions, but all I have from you are the above
attempts to discredit the reputation of myself and other people, and
non-technical statements like

    will make the code very difficult to manage
    I think there should be a trade off between optimization and stability.
    So I don't think you could even compare the two results

On the other hand, from the original submission I see some very specific
details:

    If ibmvnic abandons the reset because of this failed set link
    down and this is the last reset in the workqueue, then this
    adapter will be left in an inoperable state.

and from a followup discussion:

    We had a FATAL error and when handling it, we failed to
    send a link-down message to the VIOS. So what we need
    to try next is to reset the connection with the VIOS. For
    this we must ...

These are great technical points that could be argued or discussed. The
problem is, I agree with them.

I will ask again: can you please supply some technical reasons for your
objections? Otherwise, your objections are meritless and at worst simply
an ad hominem attack.

Rick
On Thu, Apr 22, 2021 at 2:07 AM Rick Lindsley
<ricklind@linux.vnet.ibm.com> wrote:
>
> On 4/21/21 10:30 PM, Lijun Pan wrote:
> > > Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> > > Signed-off-by: Dany Madden <drt@linux.ibm.com>
> > > Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com>
> > > Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> >
> > One thing I would like to point out, as already pointed out by Nathan
> > Lynch, is that Reviewed-by tags given by the same group of people
> > from the same company lose credibility over time if you never
> > critique or ask questions on the list.
>
> Well, so far you aren't addressing either my critiques or my questions.
>
> I have been asking questions, but all I have from you are the above
> attempts to discredit the reputation of myself and other people, and
> non-technical statements like
>
>     will make the code very difficult to manage
>     I think there should be a trade off between optimization and stability.
>     So I don't think you could even compare the two results
>
> On the other hand, from the original submission I see some very
> specific details:
>
>     If ibmvnic abandons the reset because of this failed set link
>     down and this is the last reset in the workqueue, then this
>     adapter will be left in an inoperable state.
>
> and from a followup discussion:
>
>     We had a FATAL error and when handling it, we failed to
>     send a link-down message to the VIOS. So what we need
>     to try next is to reset the connection with the VIOS. For
>     this we must ...
>
> These are great technical points that could be argued or discussed. The
> problem is, I agree with them.
>
> I will ask again: can you please supply some technical reasons for your
> objections? Otherwise, your objections are meritless and at worst
> simply an ad hominem attack.

Well, from the beginning of v1, I started to provide technical input.
Then I was not allowed to post anything in the community about this
patch and VNIC via ljp@linux.ibm.com except giving an
Acked-by/Reviewed-by.
Hello,

On Thu, Apr 22, 2021 at 12:06:45AM -0500, Lijun Pan wrote:
> On Wed, Apr 21, 2021 at 2:25 AM Sukadev Bhattiprolu
> <sukadev@linux.ibm.com> wrote:
> >
> > Lijun Pan [ljp@linux.vnet.ibm.com] wrote:
> > >
> > > > On Apr 20, 2021, at 4:35 PM, Dany Madden <drt@linux.ibm.com> wrote:
> > > >
> > > > When ibmvnic gets a FATAL error message from the vnicserver, it
> > > > marks the Command Response Queue (CRQ) inactive and resets the
> > > > adapter. If this FATAL reset fails and a transmission timeout
> > > > reset follows, the CRQ is still inactive, and ibmvnic's attempt
> > > > to set the link down will also fail. If ibmvnic abandons the
> > > > reset because of this failed set link down and this is the last
> > > > reset in the workqueue, then this adapter will be left in an
> > > > inoperable state.
> > > >
> > > > Instead, make the driver ignore this link down failure and
> > > > continue to free and re-register the CRQ so that the adapter has
> > > > an opportunity to recover.
> > >
> > > This v2 does not address the concerns mentioned in v1. And I think
> > > it is better to exit with an error from do_reset and schedule a
> > > thorough do_hard_reset if the adapter is already in an unstable
> > > state.
> >
> > We had a FATAL error, and when handling it, we failed to send a
> > link-down message to the VIOS. So what we need to try next is to
> > reset the connection with the VIOS. For this we must talk to the
> > firmware using the H_FREE_CRQ and H_REG_CRQ hcalls. do_reset() does
> > just that in ibmvnic_reset_crq().
> >
> > Now, sure, we can attempt a "thorough hard reset", which also does
> > the same hcalls to reestablish the connection. Is there any other
> > magic in do_hard_reset()? In addition, it also frees a lot more
> > Linux kernel buffers and reallocates them, for instance.
>
> Working around everything in do_reset will make the code very
> difficult to manage. Ultimately do_reset can do anything, I am afraid,
> and do_hard_reset can be removed completely or merged into do_reset.

This debate is not very constructive.

In the context of a driver that has separate do_reset and do_hard_reset,
this fix picks the correct one, unless you can refute the arguments
provided.

Merging do_reset and do_hard_reset might be a good code cleanup, which
is out of the scope of this fix.

Given that the vast majority of fixes to the vnic driver are related to
reset handling, it would improve stability and testability if every
reset took the same code path.

In the context of merging do_hard_reset and do_reset, the question is
what the intended distinction and performance gain are in having a
'lightweight' reset.

I don't have a vnic protocol manual at hand, and I suspect I would not
get one even if I searched for one.

From reading through the fixes in the past, my understanding is that a
full reset is required when the backend changes, which then potentially
requires a different size/number of buffers.

What is the expected situation when a reset is required without changing
the backend?

Is this so common that it warrants a separate 'lightweight', optimized
function?

Thanks

Michal
On Thu, Apr 22, 2021 at 12:21 PM Michal Suchánek <msuchanek@suse.de> wrote:
>
> Hello,
>
> On Thu, Apr 22, 2021 at 12:06:45AM -0500, Lijun Pan wrote:
> > On Wed, Apr 21, 2021 at 2:25 AM Sukadev Bhattiprolu
> > <sukadev@linux.ibm.com> wrote:
> > >
> > > Lijun Pan [ljp@linux.vnet.ibm.com] wrote:
> > > >
> > > > > On Apr 20, 2021, at 4:35 PM, Dany Madden <drt@linux.ibm.com> wrote:
> > > > >
> > > > > When ibmvnic gets a FATAL error message from the vnicserver, it
> > > > > marks the Command Response Queue (CRQ) inactive and resets the
> > > > > adapter. If this FATAL reset fails and a transmission timeout
> > > > > reset follows, the CRQ is still inactive, and ibmvnic's attempt
> > > > > to set the link down will also fail. If ibmvnic abandons the
> > > > > reset because of this failed set link down and this is the last
> > > > > reset in the workqueue, then this adapter will be left in an
> > > > > inoperable state.
> > > > >
> > > > > Instead, make the driver ignore this link down failure and
> > > > > continue to free and re-register the CRQ so that the adapter
> > > > > has an opportunity to recover.
> > > >
> > > > This v2 does not address the concerns mentioned in v1. And I
> > > > think it is better to exit with an error from do_reset and
> > > > schedule a thorough do_hard_reset if the adapter is already in
> > > > an unstable state.
> > >
> > > We had a FATAL error, and when handling it, we failed to send a
> > > link-down message to the VIOS. So what we need to try next is to
> > > reset the connection with the VIOS. For this we must talk to the
> > > firmware using the H_FREE_CRQ and H_REG_CRQ hcalls. do_reset()
> > > does just that in ibmvnic_reset_crq().
> > >
> > > Now, sure, we can attempt a "thorough hard reset", which also does
> > > the same hcalls to reestablish the connection. Is there any other
> > > magic in do_hard_reset()? In addition, it also frees a lot more
> > > Linux kernel buffers and reallocates them, for instance.
> >
> > Working around everything in do_reset will make the code very
> > difficult to manage. Ultimately do_reset can do anything, I am
> > afraid, and do_hard_reset can be removed completely or merged into
> > do_reset.

Michal, I should have given more details about the above statement.
Thanks for your detailed info below.

> This debate is not very constructive.
>
> In the context of a driver that has separate do_reset and
> do_hard_reset, this fix picks the correct one, unless you can refute
> the arguments provided.
>
> Merging do_reset and do_hard_reset might be a good code cleanup, which
> is out of the scope of this fix.

Right.

> Given that the vast majority of fixes to the vnic driver are related
> to reset handling, it would improve stability and testability if every
> reset took the same code path.

I agree.

> In the context of merging do_hard_reset and do_reset, the question is
> what the intended distinction and performance gain are in having a
> 'lightweight' reset.

Right.

> I don't have a vnic protocol manual at hand, and I suspect I would not
> get one even if I searched for one.
>
> From reading through the fixes in the past, my understanding is that a
> full reset is required when the backend changes, which then
> potentially requires a different size/number of buffers.

Yes, a full reset is better when the backend changes.

> What is the expected situation when a reset is required without
> changing the backend?
>
> Is this so common that it warrants a separate 'lightweight', optimized
> function?
On 4/22/21 10:21 AM, Michal Suchánek wrote:
> Merging do_reset and do_hard_reset might be a good code cleanup, which
> is out of the scope of this fix.

I agree, on both points. Accepting the patch and improving the reset
path are not in conflict with each other.

My position is that improving the reset path is certainly on the table,
but not with this bug or this fix. Within the context of this discovered
problem, the issue is well understood, the fix is small and addresses
the immediate problem, and it has not generated new bugs under
subsequent testing. For those reasons, the submitted patch has my
support.

Rick
On 2021-04-22 19:26, Rick Lindsley wrote:
> On 4/22/21 10:21 AM, Michal Suchánek wrote:
>
> > Merging do_reset and do_hard_reset might be a good code cleanup,
> > which is out of the scope of this fix.
>
> I agree, on both points. Accepting the patch and improving the reset
> path are not in conflict with each other.
>
> My position is that improving the reset path is certainly on the
> table, but not with this bug or this fix. Within the context of this
> discovered problem, the issue is well understood, the fix is small and
> addresses the immediate problem, and it has not generated new bugs
> under subsequent testing. For those reasons, the submitted patch has
> my support.
>
> Rick

Thanks for all the feedback on the patch. Refactoring the ibmvnic reset
functions is something we plan to evaluate, as this would potentially
simplify the reset code. In the meantime, the proposed patch is simple
and has been validated in our test environment to solve an issue that
resulted in vnic interfaces getting stuck in an offline state following
a vnic timeout reset. Given that, I suggest we merge this patch while we
look at further optimizations in the future. I will submit a V3 patch
shortly.

Dany
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index ffb2a91750c7..4bd8c5d1a275 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1970,8 +1970,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 		rtnl_unlock();
 		rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
 		rtnl_lock();
-		if (rc)
-			goto out;
+		if (rc) {
+			netdev_dbg(netdev,
+				   "Setting link down failed rc=%d. Continue anyway\n", rc);
+		}
 
 		if (adapter->state == VNIC_OPEN) {
 			/* When we dropped rtnl, ibmvnic_open() got