Message ID | 20170316143039.375-1-james.morse@arm.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Rafael Wysocki |
Headers | show |
Hi, James, James Morse <james.morse@arm.com> writes: > When removing a GHES device notified by SCI, list_del_rcu() is used, > ghes_remove() should call synchronize_rcu() before it goes on to call > kfree(ghes), otherwise concurrent RCU readers may still hold this list > entry after it has been freed. > > Signed-off-by: James Morse <james.morse@arm.com> > Cc: Huang Ying <ying.huang@intel.com> > > --- > It looks like 81e88fdc432a lifted this into ACPI_HEST_NOTIFY_NMI, missing > that ACPI_HEST_NOTIFY_SCI needed it too. > > If there is only ever one SCI GHES entry this is safe today as > unregister_acpi_hed_notifier() takes a write lock on its semaphore, meaning > any RCU readers will have finished. > If there can be more than one SCI GHES entry... > > Fixes: 81e88fdc432a ("ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support") > > drivers/acpi/apei/ghes.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index b192b42a8351..79b3c9c5a3bc 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -1073,6 +1073,7 @@ static int ghes_remove(struct platform_device *ghes_dev) > if (list_empty(&ghes_sci)) > unregister_acpi_hed_notifier(&ghes_notifier_sci); In remove path unregister_acpi_hed_notifier() blocking_notifier_chain_unregister() down_write(&nh->rwsem) While in notifier call path acpi_hed_notify() blocking_notifier_call_chain() __blocking_notifier_call_chain() down_read(&nh->rwsem) So when unregister succeeds, the notifier call should have finished. Best Regards, Huang, Ying > mutex_unlock(&ghes_list_mutex); > + synchronize_rcu(); > break; > case ACPI_HEST_NOTIFY_NMI: > ghes_nmi_remove(ghes); -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Huang, Ying On 17/03/17 01:23, Huang, Ying wrote: > James Morse <james.morse@arm.com> writes: > >> When removing a GHES device notified by SCI, list_del_rcu() is used, >> ghes_remove() should call synchronize_rcu() before it goes on to call >> kfree(ghes), otherwise concurrent RCU readers may still hold this list >> entry after it has been freed. >> --- >> It looks like 81e88fdc432a lifted this into ACPI_HEST_NOTIFY_NMI, missing >> that ACPI_HEST_NOTIFY_SCI needed it too. >> >> If there is only ever one SCI GHES entry this is safe today as >> unregister_acpi_hed_notifier() takes a write lock on its semaphore, meaning >> any RCU readers will have finished. > In remove path > > unregister_acpi_hed_notifier() > blocking_notifier_chain_unregister() > down_write(&nh->rwsem) > > While in notifier call path > > acpi_hed_notify() > blocking_notifier_call_chain() > __blocking_notifier_call_chain() > down_read(&nh->rwsem) > > So when unregister succeeds, the notifier call should have > finished. You are only protected like this if the unregister call is made for every list_del_rcu(), which would only be the case if there is only ever one SCI GHES entry. Is this how NOTIFY_SCI is expected to be used? If so I agree its safe, (but confusing!) today, and we need to take account of this behaviour in Shiju Jose's patch. If there can be multiple SCI entries, you only get this unregister protection for the last one to be freed, as the list_empty() check skips the unregister for all but the last entry. > case ACPI_HEST_NOTIFY_SCI: > mutex_lock(&ghes_list_mutex); > list_del_rcu(&ghes->list); > if (list_empty(&ghes_sci)) > unregister_acpi_hed_notifier(&ghes_notifier_sci); > mutex_unlock(&ghes_list_mutex); > break; Thanks, James -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
James Morse <james.morse@arm.com> writes: > Hi Huang, Ying > > On 17/03/17 01:23, Huang, Ying wrote: >> James Morse <james.morse@arm.com> writes: >> >>> When removing a GHES device notified by SCI, list_del_rcu() is used, >>> ghes_remove() should call synchronize_rcu() before it goes on to call >>> kfree(ghes), otherwise concurrent RCU readers may still hold this list >>> entry after it has been freed. > >>> --- >>> It looks like 81e88fdc432a lifted this into ACPI_HEST_NOTIFY_NMI, missing >>> that ACPI_HEST_NOTIFY_SCI needed it too. >>> >>> If there is only ever one SCI GHES entry this is safe today as >>> unregister_acpi_hed_notifier() takes a write lock on its semaphore, meaning >>> any RCU readers will have finished. > > >> In remove path >> >> unregister_acpi_hed_notifier() >> blocking_notifier_chain_unregister() >> down_write(&nh->rwsem) >> >> While in notifier call path >> >> acpi_hed_notify() >> blocking_notifier_call_chain() >> __blocking_notifier_call_chain() >> down_read(&nh->rwsem) >> >> So when unregister succeeds, the notifier call should have >> finished. > > You are only protected like this if the unregister call is made for every > list_del_rcu(), which would only be the case if there is only ever one SCI GHES > entry. > > Is this how NOTIFY_SCI is expected to be used? > > If so I agree its safe, (but confusing!) today, and we need to take account of > this behaviour in Shiju Jose's patch. > > > If there can be multiple SCI entries, you only get this unregister protection > for the last one to be freed, as the list_empty() check skips the unregister for > all but the last entry. Yes. You are right. Feel free to add Reviewed-by: "Huang, Ying" <ying.huang@intel.com> In your patch. Best Regards, Huang, Ying >> case ACPI_HEST_NOTIFY_SCI: >> mutex_lock(&ghes_list_mutex); >> list_del_rcu(&ghes->list); >> if (list_empty(&ghes_sci)) >> unregister_acpi_hed_notifier(&ghes_notifier_sci); >> mutex_unlock(&ghes_list_mutex); >> break; > > > > Thanks, > > James -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index b192b42a8351..79b3c9c5a3bc 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -1073,6 +1073,7 @@ static int ghes_remove(struct platform_device *ghes_dev) if (list_empty(&ghes_sci)) unregister_acpi_hed_notifier(&ghes_notifier_sci); mutex_unlock(&ghes_list_mutex); + synchronize_rcu(); break; case ACPI_HEST_NOTIFY_NMI: ghes_nmi_remove(ghes);
When removing a GHES device notified by SCI, list_del_rcu() is used, ghes_remove() should call synchronize_rcu() before it goes on to call kfree(ghes), otherwise concurrent RCU readers may still hold this list entry after it has been freed. Signed-off-by: James Morse <james.morse@arm.com> Cc: Huang Ying <ying.huang@intel.com> --- It looks like 81e88fdc432a lifted this into ACPI_HEST_NOTIFY_NMI, missing that ACPI_HEST_NOTIFY_SCI needed it too. If there is only ever one SCI GHES entry this is safe today as unregister_acpi_hed_notifier() takes a write lock on its semaphore, meaning any RCU readers will have finished. If there can be more than one SCI GHES entry... Fixes: 81e88fdc432a ("ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support") drivers/acpi/apei/ghes.c | 1 + 1 file changed, 1 insertion(+)