[v2] padata: validate cpumask without removed CPU during offline

Message ID 20190809210603.20900-1-daniel.m.jordan@oracle.com
State Changes Requested
Delegated to: Herbert Xu

Commit Message

Daniel Jordan Aug. 9, 2019, 9:06 p.m. UTC
Configuring an instance's parallel mask without any online CPUs...

  echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
  echo 0 > /sys/devices/system/cpu/cpu1/online

...crashes like this:

  divide error: 0000 [#1] SMP PTI
  CPU: 4 PID: 281 Comm: modprobe Not tainted 5.2.0-padata-base+ #25
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-<snip>
  RIP: 0010:padata_do_parallel+0xf1/0x270
  ...
  Call Trace:
   pcrypt_do_parallel+0xed/0x1e0 [pcrypt]
   pcrypt_aead_encrypt+0xbf/0xd0 [pcrypt]
   do_mult_aead_op+0x68/0x112 [tcrypt]
   test_mb_aead_speed.constprop.0.cold+0x21a/0x55a [tcrypt]
   do_test+0x2280/0x4ca2 [tcrypt]
   tcrypt_mod_init+0x55/0x1000 [tcrypt]
   ...

The cpumask_weight call in padata_cpu_hash returns 0 because the mask has
no online CPUs, which is expected in this situation, and dividing by that
weight causes the divide error.  The problem is that __padata_remove_cpu
checks for valid masks too early, before the offlined CPU has been
cleared from them, so it doesn't mark the instance PADATA_INVALID as
expected.  Had the flag been set, padata_do_parallel would have returned
an error before doing the division.
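
For illustration only, here is a minimal userspace sketch of the failure
mode; it is not the kernel code.  It assumes a 64-bit integer stands in
for a cpumask, and mask_weight() and pick_cpu() are hypothetical names
that model cpumask_weight and the index selection in padata_cpu_hash.
In the reproducer above, the value 2 written to parallel_cpumask is a
hex mask selecting CPU 1, so offlining cpu1 leaves no usable CPU.

```c
#include <stdint.h>

/* Hypothetical stand-in for cpumask_weight(): count the set bits. */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Pick the (seq_nr % weight)-th set bit of @mask as the target CPU.
 * Returns -1 for an empty mask instead of dividing by zero.
 */
static int pick_cpu(uint64_t mask, unsigned int seq_nr)
{
	unsigned int weight = mask_weight(mask);
	unsigned int index, cpu;

	if (weight == 0)
		return -1;	/* without this guard, seq_nr % 0 faults */

	index = seq_nr % weight;
	for (cpu = 0; cpu < 64; cpu++) {
		if (mask & (UINT64_C(1) << cpu)) {
			if (index == 0)
				return (int)cpu;
			index--;
		}
	}
	return -1;	/* not reached when weight > 0 */
}
```

With the mask 0x2 and CPU 1 cleared from it, pick_cpu(0x2 & ~0x2, seq_nr)
returns -1 rather than reaching the division.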

Fix by moving the checks after the masks have been adjusted for the
offlined CPU.  Only do the second check if the first succeeded to avoid
inadvertently clearing PADATA_INVALID.

Stop the instance unconditionally and start again if the masks are
valid.  Stopping the instance only after an invalid mask is found risks
this div-by-0 crash since a padata_do_parallel call in another task
could happen between cpumask_clear_cpu and padata_validate_cpumask.
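
The ordering described above can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the patch
itself: struct inst, validate(), remove_cpu() and the INST_* flag bits
are hypothetical names, the masks are plain 64-bit integers, and
validate() is assumed to set the invalid flag on an unusable mask and
clear it on a usable one, as the wording about inadvertently clearing
PADATA_INVALID suggests.  The model edits the instance's own masks for
brevity, whereas the patch adjusts the masks of the new parallel_data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for the model. */
#define INST_STARTED 0x1u
#define INST_INVALID 0x4u

struct inst {
	unsigned int flags;
	uint64_t pcpu;		/* parallel mask */
	uint64_t cbcpu;		/* callback mask */
};

/* Assumed behaviour: flag an unusable mask, unflag a usable one. */
static bool validate(struct inst *in, uint64_t mask, uint64_t online)
{
	if ((mask & online) == 0) {
		in->flags |= INST_INVALID;
		return false;
	}
	in->flags &= ~INST_INVALID;
	return true;
}

static void remove_cpu(struct inst *in, uint64_t online, unsigned int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;

	in->flags &= ~INST_STARTED;	/* stop unconditionally first */

	in->cbcpu &= ~bit;		/* adjust masks for the offlined CPU */
	in->pcpu &= ~bit;

	/*
	 * && short-circuits, so a failed first check is never undone by
	 * a successful second check clearing INST_INVALID.
	 */
	if (validate(in, in->pcpu, online) &&
	    validate(in, in->cbcpu, online))
		in->flags |= INST_STARTED;	/* restart only if both are valid */
}
```

With pcpu = 0x2 and cbcpu = 0x1, removing CPU 1 leaves the model stopped
and flagged invalid; with both masks at 0x3 it is restarted.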

Fixes: 33e54450683c ("padata: Handle empty padata cpumasks")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

v2: Don't leave the instance stopped if the masks are valid.

 kernel/padata.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Herbert Xu Aug. 22, 2019, 3:50 a.m. UTC | #1
On Fri, Aug 09, 2019 at 05:06:03PM -0400, Daniel Jordan wrote:
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index d056276a96ce..01460ea1d160 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -702,10 +702,7 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>  	struct parallel_data *pd = NULL;
>  
>  	if (cpumask_test_cpu(cpu, cpu_online_mask)) {
> -
> -		if (!padata_validate_cpumask(pinst, pinst->cpumask.pcpu) ||
> -		    !padata_validate_cpumask(pinst, pinst->cpumask.cbcpu))
> -			__padata_stop(pinst);
> +		__padata_stop(pinst);
>  
>  		pd = padata_alloc_pd(pinst, pinst->cpumask.pcpu,
>  				     pinst->cpumask.cbcpu);
> @@ -716,6 +713,9 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>  
>  		cpumask_clear_cpu(cpu, pd->cpumask.cbcpu);
>  		cpumask_clear_cpu(cpu, pd->cpumask.pcpu);
> +		if (padata_validate_cpumask(pinst, pd->cpumask.pcpu) &&
> +		    padata_validate_cpumask(pinst, pd->cpumask.cbcpu))
> +			__padata_start(pinst);
>  	}

I looked back at the original code and in fact the original
assumption is to call this after cpu_online_mask has been modified.

So I suspect we need to change the state at which this is called
by CPU hotplug.  IOW the commit that broke this is 30e92153b4e6.

This would also allow us to get rid of the two cpumask_clear_cpu
calls on pd->cpumask which is just bogus as you should only ever
modify the pd->cpumask prior to the padata_replace call (because
the readers are not serialised with respect to this).
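
The rule of finishing all edits before publishing can be sketched in
userspace with C11 atomics.  This only illustrates the general
copy-then-publish pattern, not how padata implements it: struct
pd_model, current_pd, pd_read() and pd_replace_without_cpu() are
hypothetical names, and waiting for readers before freeing the old
object is left out.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct pd_model {
	uint64_t pcpu;
	uint64_t cbcpu;
};

/* Pointer that unserialised readers load; models the instance's pd. */
static _Atomic(struct pd_model *) current_pd;

static struct pd_model *pd_read(void)
{
	return atomic_load_explicit(&current_pd, memory_order_acquire);
}

/*
 * Build a replacement with @cpu removed, finish every edit, and only
 * then publish it.  The published object is never modified afterwards.
 * On success returns 0 and hands the old object back through @oldp;
 * the caller must not free it while a reader may still hold it.
 */
static int pd_replace_without_cpu(unsigned int cpu, struct pd_model **oldp)
{
	struct pd_model *old = pd_read();
	struct pd_model *new;
	uint64_t bit = UINT64_C(1) << cpu;

	if (!old)
		return -1;
	new = malloc(sizeof(*new));
	if (!new)
		return -1;

	new->pcpu = old->pcpu & ~bit;	/* all edits happen on the copy */
	new->cbcpu = old->cbcpu & ~bit;

	atomic_store_explicit(&current_pd, new, memory_order_release);
	*oldp = old;
	return 0;
}
```

A reader that loaded the old pointer keeps seeing the old, unmodified
masks, and a reader that loads after the store sees the complete new
ones.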

Cheers,
Daniel Jordan Aug. 22, 2019, 10:10 p.m. UTC | #2
On 8/21/19 11:50 PM, Herbert Xu wrote:
> On Fri, Aug 09, 2019 at 05:06:03PM -0400, Daniel Jordan wrote:
>> diff --git a/kernel/padata.c b/kernel/padata.c
>> index d056276a96ce..01460ea1d160 100644
>> --- a/kernel/padata.c
>> +++ b/kernel/padata.c
>> @@ -702,10 +702,7 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>   	struct parallel_data *pd = NULL;
>>   
>>   	if (cpumask_test_cpu(cpu, cpu_online_mask)) {
>> -
>> -		if (!padata_validate_cpumask(pinst, pinst->cpumask.pcpu) ||
>> -		    !padata_validate_cpumask(pinst, pinst->cpumask.cbcpu))
>> -			__padata_stop(pinst);
>> +		__padata_stop(pinst);
>>   
>>   		pd = padata_alloc_pd(pinst, pinst->cpumask.pcpu,
>>   				     pinst->cpumask.cbcpu);
>> @@ -716,6 +713,9 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>   
>>   		cpumask_clear_cpu(cpu, pd->cpumask.cbcpu);
>>   		cpumask_clear_cpu(cpu, pd->cpumask.pcpu);
>> +		if (padata_validate_cpumask(pinst, pd->cpumask.pcpu) &&
>> +		    padata_validate_cpumask(pinst, pd->cpumask.cbcpu))
>> +			__padata_start(pinst);
>>   	}
> 
> I looked back at the original code and in fact the original
> assumption is to call this after cpu_online_mask has been modified.
> 
> So I suspect we need to change the state at which this is called
> by CPU hotplug.

Yes, the state idea is good; it's cleaner to have the CPU out of the online mask ahead of time.

I think we'll need two states.  We want a CPU being offlined to already be removed from the online cpumask, so that ANDing the user-supplied and online masks reflects conditions after the hotplug operation has finished.  For the same reason, we want a CPU being onlined to already be in the online mask.  We can use the existing hotplug state for the online case, but we'd need a new padata-specific state for the offline case.
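
A small userspace sketch of why the callback's position relative to the
online-mask update matters.  The helper names are hypothetical and the
masks are plain 64-bit integers.

```c
#include <stdint.h>

/* CPUs an instance can use: user-supplied mask AND online mask. */
static uint64_t usable_cpus(uint64_t user_mask, uint64_t online_mask)
{
	return user_mask & online_mask;
}

/*
 * Callback runs while the outgoing CPU is still marked online, so the
 * caller has to mask it out by hand afterwards.
 */
static uint64_t usable_before_offline(uint64_t user_mask,
				      uint64_t online_mask,
				      unsigned int cpu)
{
	return usable_cpus(user_mask, online_mask) & ~(UINT64_C(1) << cpu);
}

/*
 * Callback runs after the CPU has left the online mask, so a plain AND
 * already reflects the state once the hotplug operation is done.
 */
static uint64_t usable_after_offline(uint64_t user_mask,
				     uint64_t online_mask_after)
{
	return usable_cpus(user_mask, online_mask_after);
}
```

Both give the same result for the same hotplug operation, but only the
second needs no manual fixup of the computed mask.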

> IOW the commit that broke this is 30e92153b4e6.

I don't think 30e92153b4e6 is the one, since the commit before it only allows __padata_remove_cpu to do its work if @cpu is in the online mask, so the call already happened before cpu_online_mask had been modified.  The same is true of the very first padata commit, so it seems like that one should actually be the Fixes target.

> This would also allow us to get rid of the two cpumask_clear_cpu
> calls on pd->cpumask which is just bogus as you should only ever
> modify the pd->cpumask prior to the padata_replace call (because
> the readers are not serialised with respect to this).

Yeah, makes sense.

Daniel
Daniel Jordan Aug. 22, 2019, 10:53 p.m. UTC | #3
On 8/22/19 6:10 PM, Daniel Jordan wrote:
> On 8/21/19 11:50 PM, Herbert Xu wrote:
>> On Fri, Aug 09, 2019 at 05:06:03PM -0400, Daniel Jordan wrote:
>>> diff --git a/kernel/padata.c b/kernel/padata.c
>>> index d056276a96ce..01460ea1d160 100644
>>> --- a/kernel/padata.c
>>> +++ b/kernel/padata.c
>>> @@ -702,10 +702,7 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>>       struct parallel_data *pd = NULL;
>>>       if (cpumask_test_cpu(cpu, cpu_online_mask)) {
>>> -
>>> -        if (!padata_validate_cpumask(pinst, pinst->cpumask.pcpu) ||
>>> -            !padata_validate_cpumask(pinst, pinst->cpumask.cbcpu))
>>> -            __padata_stop(pinst);
>>> +        __padata_stop(pinst);
>>>           pd = padata_alloc_pd(pinst, pinst->cpumask.pcpu,
>>>                        pinst->cpumask.cbcpu);
>>> @@ -716,6 +713,9 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>>           cpumask_clear_cpu(cpu, pd->cpumask.cbcpu);
>>>           cpumask_clear_cpu(cpu, pd->cpumask.pcpu);
>>> +        if (padata_validate_cpumask(pinst, pd->cpumask.pcpu) &&
>>> +            padata_validate_cpumask(pinst, pd->cpumask.cbcpu))
>>> +            __padata_start(pinst);
>>>       }
>>
>> I looked back at the original code and in fact the original
>> assumption is to call this after cpu_online_mask has been modified.
>>
>> So I suspect we need to change the state at which this is called
>> by CPU hotplug.
> 
> Yes, the state idea is good; it's cleaner to have the CPU out of the online mask ahead of time.
> 
> I think we'll need two states.  We want a CPU being offlined to already be removed from the online cpumask, so that ANDing the user-supplied and online masks reflects conditions after the hotplug operation has finished.  For the same reason, we want a CPU being onlined to already be in the online mask.  We can use the existing hotplug state for the online case, but we'd need a new padata-specific state for the offline case.

The new state would be something before CPUHP_BRINGUP_CPU, so that the CPU is already out of the online mask when the offline callback runs.

> 
>> IOW the commit that broke this is 30e92153b4e6.
> 
> I don't think 30e92153b4e6 is the one, since the commit before it only allows __padata_remove_cpu to do its work if @cpu is in the online mask, so the call already happened before cpu_online_mask had been modified.  The same is true of the very first padata commit, so it seems like that one should actually be the Fixes target.
> 
>> This would also allow us to get rid of the two cpumask_clear_cpu
>> calls on pd->cpumask which is just bogus as you should only ever
>> modify the pd->cpumask prior to the padata_replace call (because
>> the readers are not serialised with respect to this).
> 
> Yeah, makes sense.
> 
> Daniel

Patch

diff --git a/kernel/padata.c b/kernel/padata.c
index d056276a96ce..01460ea1d160 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -702,10 +702,7 @@  static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
 	struct parallel_data *pd = NULL;
 
 	if (cpumask_test_cpu(cpu, cpu_online_mask)) {
-
-		if (!padata_validate_cpumask(pinst, pinst->cpumask.pcpu) ||
-		    !padata_validate_cpumask(pinst, pinst->cpumask.cbcpu))
-			__padata_stop(pinst);
+		__padata_stop(pinst);
 
 		pd = padata_alloc_pd(pinst, pinst->cpumask.pcpu,
 				     pinst->cpumask.cbcpu);
@@ -716,6 +713,9 @@  static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
 
 		cpumask_clear_cpu(cpu, pd->cpumask.cbcpu);
 		cpumask_clear_cpu(cpu, pd->cpumask.pcpu);
+		if (padata_validate_cpumask(pinst, pd->cpumask.pcpu) &&
+		    padata_validate_cpumask(pinst, pd->cpumask.cbcpu))
+			__padata_start(pinst);
 	}
 
 	return 0;