diff mbox series

mm/slab: protect cache_reap() against CPU and memory hot plug operations

Message ID 20190311191701.24325-1-ldufour@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series mm/slab: protect cache_reap() against CPU and memory hot plug operations | expand

Commit Message

Laurent Dufour March 11, 2019, 7:17 p.m. UTC
The commit 95402b382901 ("cpu-hotplug: replace per-subsystem mutexes with
get_online_cpus()") remove the CPU_LOCK_ACQUIRE operation which was use to
grap the cache_chain_mutex lock which was protecting cache_reap() against
CPU hot plug operations.

Later the commit 18004c5d4084 ("mm, sl[aou]b: Use a common mutex
definition") changed cache_chain_mutex to slab_mutex but this didn't help
fixing the missing the cache_reap() protection against CPU hot plug
operations.

Here we are stopping the per cpu worker while holding the slab_mutex to
ensure that cache_reap() is not running in our back and will not be
triggered anymore for this cpu.

This patch fixes that race leading to SLAB's data corruption when CPU
hotplug are triggered. We hit it while doing partition migration on PowerVM
leading to CPU reconfiguration through the CPU hotplug mechanism.

This fix is covering kernel containing to the commit 6731d4f12315 ("slab:
Convert to hotplug state machine"), ie 4.9.1, earlier kernel needs a
slightly different patch.

Cc: stable@vger.kernel.org
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 mm/slab.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Michal Hocko March 12, 2019, 2:58 p.m. UTC | #1
On Mon 11-03-19 20:17:01, Laurent Dufour wrote:
> The commit 95402b382901 ("cpu-hotplug: replace per-subsystem mutexes with
> get_online_cpus()") remove the CPU_LOCK_ACQUIRE operation which was use to
> grap the cache_chain_mutex lock which was protecting cache_reap() against
> CPU hot plug operations.
> 
> Later the commit 18004c5d4084 ("mm, sl[aou]b: Use a common mutex
> definition") changed cache_chain_mutex to slab_mutex but this didn't help
> fixing the missing the cache_reap() protection against CPU hot plug
> operations.
> 
> Here we are stopping the per cpu worker while holding the slab_mutex to
> ensure that cache_reap() is not running in our back and will not be
> triggered anymore for this cpu.
> 
> This patch fixes that race leading to SLAB's data corruption when CPU
> hotplug are triggered. We hit it while doing partition migration on PowerVM
> leading to CPU reconfiguration through the CPU hotplug mechanism.

What is the actual race? slab_offline_cpu calls cancel_delayed_work_sync
so it removes a pending item and waits for the item to finish if they run
concurently. So why do we need an additional lock?

> This fix is covering kernel containing to the commit 6731d4f12315 ("slab:
> Convert to hotplug state machine"), ie 4.9.1, earlier kernel needs a
> slightly different patch.
> 
> Cc: stable@vger.kernel.org
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  mm/slab.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 28652e4218e0..ba499d90f27f 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1103,6 +1103,7 @@ static int slab_online_cpu(unsigned int cpu)
>  
>  static int slab_offline_cpu(unsigned int cpu)
>  {
> +	mutex_lock(&slab_mutex);
>  	/*
>  	 * Shutdown cache reaper. Note that the slab_mutex is held so
>  	 * that if cache_reap() is invoked it cannot do anything
> @@ -1112,6 +1113,7 @@ static int slab_offline_cpu(unsigned int cpu)
>  	cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu));
>  	/* Now the cache_reaper is guaranteed to be not running. */
>  	per_cpu(slab_reap_work, cpu).work.func = NULL;
> +	mutex_unlock(&slab_mutex);
>  	return 0;
>  }
>  
> -- 
> 2.21.0
Laurent Dufour March 12, 2019, 4:28 p.m. UTC | #2
Le 12/03/2019 à 15:58, Michal Hocko a écrit :
> On Mon 11-03-19 20:17:01, Laurent Dufour wrote:
>> The commit 95402b382901 ("cpu-hotplug: replace per-subsystem mutexes with
>> get_online_cpus()") remove the CPU_LOCK_ACQUIRE operation which was use to
>> grap the cache_chain_mutex lock which was protecting cache_reap() against
>> CPU hot plug operations.
>>
>> Later the commit 18004c5d4084 ("mm, sl[aou]b: Use a common mutex
>> definition") changed cache_chain_mutex to slab_mutex but this didn't help
>> fixing the missing the cache_reap() protection against CPU hot plug
>> operations.
>>
>> Here we are stopping the per cpu worker while holding the slab_mutex to
>> ensure that cache_reap() is not running in our back and will not be
>> triggered anymore for this cpu.
>>
>> This patch fixes that race leading to SLAB's data corruption when CPU
>> hotplug are triggered. We hit it while doing partition migration on PowerVM
>> leading to CPU reconfiguration through the CPU hotplug mechanism.
> 
> What is the actual race? slab_offline_cpu calls cancel_delayed_work_sync
> so it removes a pending item and waits for the item to finish if they run
> concurently. So why do we need an additional lock?

You're right.
Reading cancel_delayed_work_sync() again I can't see how this could help.

The tests done with that patch were successful, while we were seeing a 
SLAB data corruption without it, but this needs to be investigated 
further since this one should not help. This was perhaps a lucky side 
effect.

Please forgot about this one.

> 
>> This fix is covering kernel containing to the commit 6731d4f12315 ("slab:
>> Convert to hotplug state machine"), ie 4.9.1, earlier kernel needs a
>> slightly different patch.
>>
>> Cc: stable@vger.kernel.org
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Pekka Enberg <penberg@kernel.org>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>   mm/slab.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 28652e4218e0..ba499d90f27f 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -1103,6 +1103,7 @@ static int slab_online_cpu(unsigned int cpu)
>>   
>>   static int slab_offline_cpu(unsigned int cpu)
>>   {
>> +	mutex_lock(&slab_mutex);
>>   	/*
>>   	 * Shutdown cache reaper. Note that the slab_mutex is held so
>>   	 * that if cache_reap() is invoked it cannot do anything
>> @@ -1112,6 +1113,7 @@ static int slab_offline_cpu(unsigned int cpu)
>>   	cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu));
>>   	/* Now the cache_reaper is guaranteed to be not running. */
>>   	per_cpu(slab_reap_work, cpu).work.func = NULL;
>> +	mutex_unlock(&slab_mutex);
>>   	return 0;
>>   }
>>   
>> -- 
>> 2.21.0
>
Sasha Levin March 25, 2019, 12:38 a.m. UTC | #3
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.0.3, v4.19.30, v4.14.107, v4.9.164, v4.4.176, v3.18.136.

v5.0.3: Build OK!
v4.19.30: Build OK!
v4.14.107: Build OK!
v4.9.164: Build OK!
v4.4.176: Failed to apply! Possible dependencies:
    27590dc17b34 ("hrtimer: Convert to hotplug state machine")
    31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
    512089d98457 ("perf/x86/intel/rapl: Clean up the printk output")
    55f2890f0726 ("perf/x86/intel/rapl: Add proper error handling")
    57ecde42cc74 ("powerpc/perf: Convert book3s notifier to state machine callbacks")
    6731d4f12315 ("slab: Convert to hotplug state machine")
    6b2c28471de5 ("x86/x2apic: Convert to CPU hotplug state machine")
    7162b8fea630 ("perf/x86/intel/rapl: Refactor the code some more")
    75c7003fbf41 ("perf/x86/intel/rapl: Calculate timing once")
    7ee681b25284 ("workqueue: Convert to state machine callbacks")
    8a6d2f8f73ca ("perf/x86/intel/rapl: Utilize event->pmu_private")
    8b5b773d6245 ("perf/x86/intel/rapl: Convert to hotplug state machine")
    9de8d686955b ("perf/x86/intel/rapl: Convert it to a per package facility")
    a208749c6426 ("perf/x86/intel/rapl: Make PMU lock raw")
    a409f5ee2937 ("blackfin/perf: Convert hotplug notifier to state machine")
    b8b3319a471b ("perf/x86/intel/rapl: Sanitize the quirk handling")
    e3cfce17d309 ("sh/perf: Convert the hotplug notifiers to state machine callbacks")
    e6d4989a9ad1 ("relayfs: Convert to hotplug state machine")
    e722d8daafb9 ("profile: Convert to hotplug state machine")

v3.18.136: Failed to apply! Possible dependencies:
    13ca62b243f6 ("ACPI: Fix minor syntax issues in processor_core.c")
    27590dc17b34 ("hrtimer: Convert to hotplug state machine")
    31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
    4daa832d9987 ("x86: Drop bogus __ref / __refdata annotations")
    57ecde42cc74 ("powerpc/perf: Convert book3s notifier to state machine callbacks")
    645523960102 ("perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units")
    6731d4f12315 ("slab: Convert to hotplug state machine")
    6b2c28471de5 ("x86/x2apic: Convert to CPU hotplug state machine")
    7162b8fea630 ("perf/x86/intel/rapl: Refactor the code some more")
    7ee681b25284 ("workqueue: Convert to state machine callbacks")
    828aef376d7a ("ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID")
    8b5b773d6245 ("perf/x86/intel/rapl: Convert to hotplug state machine")
    9de8d686955b ("perf/x86/intel/rapl: Convert it to a per package facility")
    a409f5ee2937 ("blackfin/perf: Convert hotplug notifier to state machine")
    af8f3f514d19 ("ACPI / processor: Convert apic_id to phys_id to make it arch agnostic")
    d02dc27db0dc ("ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()")
    d089f8e97d37 ("x86: fix up obsolete cpu function usage.")
    e3cfce17d309 ("sh/perf: Convert the hotplug notifiers to state machine callbacks")
    e6d4989a9ad1 ("relayfs: Convert to hotplug state machine")
    e722d8daafb9 ("profile: Convert to hotplug state machine")
    ecf5636dcd59 ("ACPI: Add interfaces to parse IOAPIC ID for IOAPIC hotplug")
    fdaf3a6539d6 ("x86: fix more deprecated cpu function usage.")


How should we proceed with this patch?

--
Thanks,
Sasha
Laurent Dufour March 25, 2019, 10:28 a.m. UTC | #4
Le 25/03/2019 à 01:38, Sasha Levin a écrit :
> Hi,
> 
> [This is an automated email]
> 
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
> 
> The bot has tested the following trees: v5.0.3, v4.19.30, v4.14.107, v4.9.164, v4.4.176, v3.18.136.
> 
> v5.0.3: Build OK!
> v4.19.30: Build OK!
> v4.14.107: Build OK!
> v4.9.164: Build OK!
> v4.4.176: Failed to apply! Possible dependencies:
>      27590dc17b34 ("hrtimer: Convert to hotplug state machine")
>      31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
>      512089d98457 ("perf/x86/intel/rapl: Clean up the printk output")
>      55f2890f0726 ("perf/x86/intel/rapl: Add proper error handling")
>      57ecde42cc74 ("powerpc/perf: Convert book3s notifier to state machine callbacks")
>      6731d4f12315 ("slab: Convert to hotplug state machine")
>      6b2c28471de5 ("x86/x2apic: Convert to CPU hotplug state machine")
>      7162b8fea630 ("perf/x86/intel/rapl: Refactor the code some more")
>      75c7003fbf41 ("perf/x86/intel/rapl: Calculate timing once")
>      7ee681b25284 ("workqueue: Convert to state machine callbacks")
>      8a6d2f8f73ca ("perf/x86/intel/rapl: Utilize event->pmu_private")
>      8b5b773d6245 ("perf/x86/intel/rapl: Convert to hotplug state machine")
>      9de8d686955b ("perf/x86/intel/rapl: Convert it to a per package facility")
>      a208749c6426 ("perf/x86/intel/rapl: Make PMU lock raw")
>      a409f5ee2937 ("blackfin/perf: Convert hotplug notifier to state machine")
>      b8b3319a471b ("perf/x86/intel/rapl: Sanitize the quirk handling")
>      e3cfce17d309 ("sh/perf: Convert the hotplug notifiers to state machine callbacks")
>      e6d4989a9ad1 ("relayfs: Convert to hotplug state machine")
>      e722d8daafb9 ("profile: Convert to hotplug state machine")
> 
> v3.18.136: Failed to apply! Possible dependencies:
>      13ca62b243f6 ("ACPI: Fix minor syntax issues in processor_core.c")
>      27590dc17b34 ("hrtimer: Convert to hotplug state machine")
>      31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
>      4daa832d9987 ("x86: Drop bogus __ref / __refdata annotations")
>      57ecde42cc74 ("powerpc/perf: Convert book3s notifier to state machine callbacks")
>      645523960102 ("perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units")
>      6731d4f12315 ("slab: Convert to hotplug state machine")
>      6b2c28471de5 ("x86/x2apic: Convert to CPU hotplug state machine")
>      7162b8fea630 ("perf/x86/intel/rapl: Refactor the code some more")
>      7ee681b25284 ("workqueue: Convert to state machine callbacks")
>      828aef376d7a ("ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID")
>      8b5b773d6245 ("perf/x86/intel/rapl: Convert to hotplug state machine")
>      9de8d686955b ("perf/x86/intel/rapl: Convert it to a per package facility")
>      a409f5ee2937 ("blackfin/perf: Convert hotplug notifier to state machine")
>      af8f3f514d19 ("ACPI / processor: Convert apic_id to phys_id to make it arch agnostic")
>      d02dc27db0dc ("ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()")
>      d089f8e97d37 ("x86: fix up obsolete cpu function usage.")
>      e3cfce17d309 ("sh/perf: Convert the hotplug notifiers to state machine callbacks")
>      e6d4989a9ad1 ("relayfs: Convert to hotplug state machine")
>      e722d8daafb9 ("profile: Convert to hotplug state machine")
>      ecf5636dcd59 ("ACPI: Add interfaces to parse IOAPIC ID for IOAPIC hotplug")
>      fdaf3a6539d6 ("x86: fix more deprecated cpu function usage.")
> 
> 
> How should we proceed with this patch?

Please forget about it.

As reported by Michal, this patch is useless.
Sorry for the noise.

Thanks,
Laurent.
diff mbox series

Patch

diff --git a/mm/slab.c b/mm/slab.c
index 28652e4218e0..ba499d90f27f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1103,6 +1103,7 @@  static int slab_online_cpu(unsigned int cpu)
 
 static int slab_offline_cpu(unsigned int cpu)
 {
+	mutex_lock(&slab_mutex);
 	/*
 	 * Shutdown cache reaper. Note that the slab_mutex is held so
 	 * that if cache_reap() is invoked it cannot do anything
@@ -1112,6 +1113,7 @@  static int slab_offline_cpu(unsigned int cpu)
 	cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu));
 	/* Now the cache_reaper is guaranteed to be not running. */
 	per_cpu(slab_reap_work, cpu).work.func = NULL;
+	mutex_unlock(&slab_mutex);
 	return 0;
 }