Message ID | 20210226163301.419727-3-danielhb413@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | send QAPI_EVENT_MEM_UNPLUG_ERROR for ppc64 unplugs | expand |
On Fri, 26 Feb 2021 13:32:58 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote: > Now that we're asserting the first DRC LMB earlier, use it to query if > the DRC is already pending unplug and, in this case, issue the same > error we already do. > > The previous check was introduced in commit 2a129767ebb1 and it works, > but it's easier to check the unplug_requested flag instead of looking > for the existence of the sPAPRDIMMState. It's also compliant with what > is already done in other unplug_request functions for other devices. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > --- Reviewed-by: Greg Kurz <groug@kaod.org> > hw/ppc/spapr.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 74e046b522..149dc2113f 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -3681,13 +3681,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, > addr_start / SPAPR_MEMORY_BLOCK_SIZE); > g_assert(drc_start); > > - /* > - * An existing pending dimm state for this DIMM means that there is an > - * unplug operation in progress, waiting for the spapr_lmb_release > - * callback to complete the job (BQL can't cover that far). In this case, > - * bail out to avoid detaching DRCs that were already released. > - */ > - if (spapr_pending_dimm_unplugs_find(spapr, dimm)) { > + if (spapr_drc_unplug_requested(drc_start)) { > error_setg(errp, "Memory unplug already in progress for device %s", > dev->id); > return;
On Fri, Feb 26, 2021 at 01:32:58PM -0300, Daniel Henrique Barboza wrote: > Now that we're asserting the first DRC LMB earlier, use it to query if > the DRC is already pending unplug and, in this case, issue the same > error we already do. > > The previous check was introduced in commit 2a129767ebb1 and it works, > but it's easier to check the unplug_requested flag instead of looking > for the existence of the sPAPRDIMMState. It's also compliant with what > is already done in other unplug_request functions for other devices. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> I'm having some trouble completely convincing myself this is right. What about this situation: 1. We initiate a DIMM unplug - unplug_request is set on all the LMBs - all the LMBs go on the pending_unplug list 2. The guest encounters no problems, and starts issuing set indicator calls to mark the LMBs unusable, starting from the lowest address 3. On drc_set_unusable() for the first LMB, we see that unplug is requested and call spapr_drc_release() 4. spapr_drc_release() on the first LMB clears unplug_requested 5. At this point, but before this is done on *all* of the DIMM's LMBs, the user attempts another unplug triggering the code below AFAICT this will now skip the error, since the first LMB is no longer in unplug_requested state, but there *are* still pending unplugs for some of the remaining LMBs, so the old code would have tripped the error. > --- > hw/ppc/spapr.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 74e046b522..149dc2113f 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -3681,13 +3681,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, > addr_start / SPAPR_MEMORY_BLOCK_SIZE); > g_assert(drc_start); > > - /* > - * An existing pending dimm state for this DIMM means that there is an > - * unplug operation in progress, waiting for the spapr_lmb_release > - * callback to complete the job (BQL can't cover that far). In this case, > - * bail out to avoid detaching DRCs that were already released. > - */ > - if (spapr_pending_dimm_unplugs_find(spapr, dimm)) { > + if (spapr_drc_unplug_requested(drc_start)) { > error_setg(errp, "Memory unplug already in progress for device %s", > dev->id); > return;
On 3/1/21 11:03 PM, David Gibson wrote: > On Fri, Feb 26, 2021 at 01:32:58PM -0300, Daniel Henrique Barboza wrote: >> Now that we're asserting the first DRC LMB earlier, use it to query if >> the DRC is already pending unplug and, in this case, issue the same >> error we already do. >> >> The previous check was introduced in commit 2a129767ebb1 and it works, >> but it's easier to check the unplug_requested flag instead of looking >> for the existence of the sPAPRDIMMState. It's also compliant with what >> is already done in other unplug_request functions for other devices. >> >> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > I'm having some trouble completely convincing myself this is right. > > What about this situation: > 1. We initiate a DIMM unplug > - unplug_request is set on all the LMBs > - all the LMBs go on the pending_unplug list > 2. The guest encounters no problems, and starts issuing set > indicator calls to mark the LMBs unusable, starting from the > lowest address > 3. On drc_set_unusable() for the first LMB, we see that unplug is > requested and call spapr_drc_release() > 4. spapr_drc_release() on the first LMB clears unplug_requested > 5. At this point, but before this is done on *all* of the DIMM's > LMBs, the user attempts another unplug triggering the code > below > > AFAICT this will now skip the error, since the first LMB is no longer > in unplug_requested state, but there *are* still pending unplugs for > some of the remaining LMBs, so the old code would have tripped the > error. Good point. Checking the existence of the sPAPRDIMMState struct at this point is the same as checking for drc->unplug_requested for all the DRCs of the DIMM. I could check for drc->unplug_requested inside the loop where we instantiate each DRC, but there is no gain in doing that instead of what we already have. I'll drop this patch and change patch 1 to just remove the duplicated assert. Thanks, DHB > >> --- >> hw/ppc/spapr.c | 8 +------- >> 1 file changed, 1 insertion(+), 7 deletions(-) >> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >> index 74e046b522..149dc2113f 100644 >> --- a/hw/ppc/spapr.c >> +++ b/hw/ppc/spapr.c >> @@ -3681,13 +3681,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, >> addr_start / SPAPR_MEMORY_BLOCK_SIZE); >> g_assert(drc_start); >> >> - /* >> - * An existing pending dimm state for this DIMM means that there is an >> - * unplug operation in progress, waiting for the spapr_lmb_release >> - * callback to complete the job (BQL can't cover that far). In this case, >> - * bail out to avoid detaching DRCs that were already released. >> - */ >> - if (spapr_pending_dimm_unplugs_find(spapr, dimm)) { >> + if (spapr_drc_unplug_requested(drc_start)) { >> error_setg(errp, "Memory unplug already in progress for device %s", >> dev->id); >> return; >
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 74e046b522..149dc2113f 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3681,13 +3681,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, addr_start / SPAPR_MEMORY_BLOCK_SIZE); g_assert(drc_start); - /* - * An existing pending dimm state for this DIMM means that there is an - * unplug operation in progress, waiting for the spapr_lmb_release - * callback to complete the job (BQL can't cover that far). In this case, - * bail out to avoid detaching DRCs that were already released. - */ - if (spapr_pending_dimm_unplugs_find(spapr, dimm)) { + if (spapr_drc_unplug_requested(drc_start)) { error_setg(errp, "Memory unplug already in progress for device %s", dev->id); return;
Now that we're asserting the first DRC LMB earlier, use it to query if the DRC is already pending unplug and, in this case, issue the same error we already do. The previous check was introduced in commit 2a129767ebb1 and it works, but it's easier to check the unplug_requested flag instead of looking for the existence of the sPAPRDIMMState. It's also compliant with what is already done in other unplug_request functions for other devices. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hw/ppc/spapr.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)