Message ID | 1453737235-16522-22-git-send-email-marc.zyngier@arm.com (mailing list archive)
---|---
State | New, archived
On 25/01/16 15:53, Marc Zyngier wrote:
> Having both VHE and non-VHE capable CPUs in the same system
> is likely to be a recipe for disaster.

> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index b1adc51..bc7650a 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -113,6 +113,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> 		pr_crit("CPU%u: failed to come online\n", cpu);
> 		ret = -EIO;
> 	}
> +
> +	if (is_kernel_mode_mismatched())
> +		panic("CPU%u: incompatible execution level", cpu);

fyi,

I have a series which tries to perform some checks for early CPU features,
like this at [1], and adds support for early CPU boot failures, passing the
error status back to the master. Maybe we could move this check there (once
it settles), and fail the CPU boot with CPU_PANIC_KERNEL status.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401727.html

Thanks
Suzuki

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 26/01/16 14:25, Suzuki K. Poulose wrote:
> On 25/01/16 15:53, Marc Zyngier wrote:
>> Having both VHE and non-VHE capable CPUs in the same system
>> is likely to be a recipe for disaster.
>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index b1adc51..bc7650a 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -113,6 +113,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>> 		pr_crit("CPU%u: failed to come online\n", cpu);
>> 		ret = -EIO;
>> 	}
>> +
>> +	if (is_kernel_mode_mismatched())
>> +		panic("CPU%u: incompatible execution level", cpu);
>
> fyi,
>
> I have a series which tries to perform some checks for early CPU features,
> like this at [1] and adds support for early CPU boot failures, passing the
> error status back to the master. May be we could move this check there
> (once it settles), and fail the CPU boot with CPU_PANIC_KERNEL status.
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401727.html

Definitely, there is room for consolidation in this area...

Thanks,

	M.
On Mon, Jan 25, 2016 at 03:53:55PM +0000, Marc Zyngier wrote:
> Having both VHE and non-VHE capable CPUs in the same system
> is likely to be a recipe for disaster.
>
> If the boot CPU has VHE, but a secondary is not, we won't be
> able to downgrade and run the kernel at EL1. Add CPU hotplug
> to the mix, and this produces a terrifying mess.
>
> Let's solve the problem once and for all. If you mix VHE and
> non-VHE CPUs in the same system, you deserve to loose, and this
> patch makes sure you don't get a chance.
>
> This is implemented by storing the kernel execution level in
> a global variable. Secondaries will park themselves in a
> WFI loop if they observe a mismatch. Also, the primary CPU
> will detect that the secondary CPU has died on a mismatched
> execution level. Panic will follow.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/virt.h | 17 +++++++++++++++++
>  arch/arm64/kernel/head.S      | 19 +++++++++++++++++++
>  arch/arm64/kernel/smp.c       |  3 +++
>  3 files changed, 39 insertions(+)
>
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 9f22dd6..f81a345 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -36,6 +36,11 @@
>   */
>  extern u32 __boot_cpu_mode[2];
>
> +/*
> + * __run_cpu_mode records the mode the boot CPU uses for the kernel.
> + */
> +extern u32 __run_cpu_mode[2];
> +
>  void __hyp_set_vectors(phys_addr_t phys_vector_base);
>  phys_addr_t __hyp_get_vectors(void);
>
> @@ -60,6 +65,18 @@ static inline bool is_kernel_in_hyp_mode(void)
>  	return el == CurrentEL_EL2;
>  }
>
> +static inline bool is_kernel_mode_mismatched(void)
> +{
> +	/*
> +	 * A mismatched CPU will have written its own CurrentEL in
> +	 * __run_cpu_mode[1] (initially set to zero) after failing to
> +	 * match the value in __run_cpu_mode[0]. Thus, a non-zero
> +	 * value in __run_cpu_mode[1] is enough to detect the
> +	 * pathological case.
> +	 */
> +	return !!ACCESS_ONCE(__run_cpu_mode[1]);
> +}
> +
>  /* The section containing the hypervisor text */
>  extern char __hyp_text_start[];
>  extern char __hyp_text_end[];
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 2a7134c..bc44cf8 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -577,7 +577,23 @@ ENTRY(set_cpu_boot_mode_flag)
>  1:	str	w20, [x1]			// This CPU has booted in EL1
>  	dmb	sy
>  	dc	ivac, x1			// Invalidate potentially stale cache line
> +	adr_l	x1, __run_cpu_mode
> +	ldr	w0, [x1]
> +	mrs	x20, CurrentEL
> +	cbz	x0, skip_el_check
> +	cmp	x0, x20
> +	bne	mismatched_el

can't you do a ret here instead of writing the same value and flushing
caches etc.?

> +skip_el_check:			// Only the first CPU gets to set the rule
> +	str	w20, [x1]
> +	dmb	sy
> +	dc	ivac, x1			// Invalidate potentially stale cache line
>  	ret
> +mismatched_el:
> +	str	w20, [x1, #4]
> +	dmb	sy
> +	dc	ivac, x1			// Invalidate potentially stale cache line
> +1:	wfi

I'm no expert on SMP bringup, but doesn't this prevent the CPU from
signaling completion and thus you'll never actually reach the checking
code in __cpu_up?

Thanks,
-Christoffer

> +	b	1b
>  ENDPROC(set_cpu_boot_mode_flag)
>
>  /*
> @@ -592,6 +608,9 @@ ENDPROC(set_cpu_boot_mode_flag)
>  ENTRY(__boot_cpu_mode)
>  	.long	BOOT_CPU_MODE_EL2
>  	.long	BOOT_CPU_MODE_EL1
> +ENTRY(__run_cpu_mode)
> +	.long	0
> +	.long	0
>  	.popsection
>
>  /*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index b1adc51..bc7650a 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -113,6 +113,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  		pr_crit("CPU%u: failed to come online\n", cpu);
>  		ret = -EIO;
>  	}
> +
> +	if (is_kernel_mode_mismatched())
> +		panic("CPU%u: incompatible execution level", cpu);
>  	} else {
>  		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>  	}
> --
> 2.1.4
On 01/02/16 15:36, Christoffer Dall wrote:
> On Mon, Jan 25, 2016 at 03:53:55PM +0000, Marc Zyngier wrote:
>> Having both VHE and non-VHE capable CPUs in the same system
>> is likely to be a recipe for disaster.
>>
>> [...]
>>
>> @@ -577,7 +577,23 @@ ENTRY(set_cpu_boot_mode_flag)
>>  1:	str	w20, [x1]			// This CPU has booted in EL1
>>  	dmb	sy
>>  	dc	ivac, x1			// Invalidate potentially stale cache line
>> +	adr_l	x1, __run_cpu_mode
>> +	ldr	w0, [x1]
>> +	mrs	x20, CurrentEL
>> +	cbz	x0, skip_el_check
>> +	cmp	x0, x20
>> +	bne	mismatched_el
>
> can't you do a ret here instead of writing the same value and flushing
> caches etc.?

Yes, good point.

>
>> +skip_el_check:			// Only the first CPU gets to set the rule
>> +	str	w20, [x1]
>> +	dmb	sy
>> +	dc	ivac, x1			// Invalidate potentially stale cache line
>>  	ret
>> +mismatched_el:
>> +	str	w20, [x1, #4]
>> +	dmb	sy
>> +	dc	ivac, x1			// Invalidate potentially stale cache line
>> +1:	wfi
>
> I'm no expert on SMP bringup, but doesn't this prevent the CPU from
> signaling completion and thus you'll never actually reach the checking
> code in __cpu_up?

Indeed, and that's the whole point. The primary CPU will notice that the
secondary CPU has failed to boot (timeout), and will find the reason in
__run_cpu_mode.

Thanks,

	M.
On Tue, Feb 02, 2016 at 03:32:04PM +0000, Marc Zyngier wrote:
> On 01/02/16 15:36, Christoffer Dall wrote:
> > On Mon, Jan 25, 2016 at 03:53:55PM +0000, Marc Zyngier wrote:
> >> Having both VHE and non-VHE capable CPUs in the same system
> >> is likely to be a recipe for disaster.
> >>
> >> [...]
> >>
> >> +mismatched_el:
> >> +	str	w20, [x1, #4]
> >> +	dmb	sy
> >> +	dc	ivac, x1			// Invalidate potentially stale cache line
> >> +1:	wfi
> >
> > I'm no expert on SMP bringup, but doesn't this prevent the CPU from
> > signaling completion and thus you'll never actually reach the checking
> > code in __cpu_up?
>
> Indeed, and that's the whole point. The primary CPU will notice that the
> secondary CPU has failed to boot (timeout), and will find the reason in
> __run_cpu_mode.
>
That wasn't exactly my point. If I understand correctly and __cpu_up is
the primary CPU executing a function to bring up a secondary core, then
it will wait for the cpu_running completion which should be signalled by
the secondary core, but because the secondary core never makes any
progress it will timeout the wait for completion and you will see that
error "..failed to come online" instead of the "incompatible execution
level".

(This is based on my reading of the code, as the completion is signalled
in secondary_start_kernel, which happens after this stuff above in head.S.)

-Christoffer
On 03/02/16 08:49, Christoffer Dall wrote:
> On Tue, Feb 02, 2016 at 03:32:04PM +0000, Marc Zyngier wrote:
>> On 01/02/16 15:36, Christoffer Dall wrote:
>>> On Mon, Jan 25, 2016 at 03:53:55PM +0000, Marc Zyngier wrote:
>>>> Having both VHE and non-VHE capable CPUs in the same system
>>>> is likely to be a recipe for disaster.
>>>>
>>>> [...]
>>>
>>> I'm no expert on SMP bringup, but doesn't this prevent the CPU from
>>> signaling completion and thus you'll never actually reach the checking
>>> code in __cpu_up?
>>
>> Indeed, and that's the whole point. The primary CPU will notice that the
>> secondary CPU has failed to boot (timeout), and will find the reason in
>> __run_cpu_mode.
>>
> That wasn't exactly my point. If I understand correctly and __cpu_up is
> the primary CPU executing a function to bring up a secondary core, then
> it will wait for the cpu_running completion which should be signalled by
> the secondary core, but because the secondary core never makes any
> progress it will timeout the wait for completion and you will see that
> error "..failed to come online" instead of the "incompatible execution
> level".

It will actually do both. Here's an example on the model configured for
such a braindead case:

CPU4: failed to come online
Kernel panic - not syncing: CPU4: incompatible execution level
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc2+ #5459
Hardware name: FVP Base (DT)
Call trace:
[<ffffffc0000899e0>] dump_backtrace+0x0/0x180
[<ffffffc000089b74>] show_stack+0x14/0x20
[<ffffffc000333b08>] dump_stack+0x90/0xc8
[<ffffffc00014d424>] panic+0x10c/0x250
[<ffffffc00008ef24>] __cpu_up+0xfc/0x100
[<ffffffc0000b7a9c>] _cpu_up+0x154/0x188
[<ffffffc0000b7b54>] cpu_up+0x84/0xa8
[<ffffffc0009e9d00>] smp_init+0xbc/0xc0
[<ffffffc0009dca10>] kernel_init_freeable+0x94/0x1ec
[<ffffffc000712f90>] kernel_init+0x10/0xe0
[<ffffffc000085cd0>] ret_from_fork+0x10/0x40

Am I missing something *really* obvious?

Thanks,

	M.
On Wed, Feb 03, 2016 at 05:45:47PM +0000, Marc Zyngier wrote:
> On 03/02/16 08:49, Christoffer Dall wrote:
> > On Tue, Feb 02, 2016 at 03:32:04PM +0000, Marc Zyngier wrote:
> >> On 01/02/16 15:36, Christoffer Dall wrote:
> >>> On Mon, Jan 25, 2016 at 03:53:55PM +0000, Marc Zyngier wrote:
> >>>> Having both VHE and non-VHE capable CPUs in the same system
> >>>> is likely to be a recipe for disaster.
> >>>>
> >>>> [...]
> >>
> >> Indeed, and that's the whole point. The primary CPU will notice that the
> >> secondary CPU has failed to boot (timeout), and will find the reason in
> >> __run_cpu_mode.
> >>
> > That wasn't exactly my point. If I understand correctly and __cpu_up is
> > the primary CPU executing a function to bring up a secondary core, then
> > it will wait for the cpu_running completion which should be signalled by
> > the secondary core, but because the secondary core never makes any
> > progress it will timeout the wait for completion and you will see that
> > error "..failed to come online" instead of the "incompatible execution
> > level".
>
> It will actually do both. Here's an example on the model configured for
> such a braindead case:
>
> CPU4: failed to come online
> Kernel panic - not syncing: CPU4: incompatible execution level
>
> [...]
>
> Am I missing something *really* obvious?
>
No, I was, it says "ret = -EIO;" not "return -EIO"... sorry.

-Christoffer
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 9f22dd6..f81a345 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -36,6 +36,11 @@
  */
 extern u32 __boot_cpu_mode[2];
 
+/*
+ * __run_cpu_mode records the mode the boot CPU uses for the kernel.
+ */
+extern u32 __run_cpu_mode[2];
+
 void __hyp_set_vectors(phys_addr_t phys_vector_base);
 phys_addr_t __hyp_get_vectors(void);
 
@@ -60,6 +65,18 @@ static inline bool is_kernel_in_hyp_mode(void)
 	return el == CurrentEL_EL2;
 }
 
+static inline bool is_kernel_mode_mismatched(void)
+{
+	/*
+	 * A mismatched CPU will have written its own CurrentEL in
+	 * __run_cpu_mode[1] (initially set to zero) after failing to
+	 * match the value in __run_cpu_mode[0]. Thus, a non-zero
+	 * value in __run_cpu_mode[1] is enough to detect the
+	 * pathological case.
+	 */
+	return !!ACCESS_ONCE(__run_cpu_mode[1]);
+}
+
 /* The section containing the hypervisor text */
 extern char __hyp_text_start[];
 extern char __hyp_text_end[];
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2a7134c..bc44cf8 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -577,7 +577,23 @@ ENTRY(set_cpu_boot_mode_flag)
 1:	str	w20, [x1]			// This CPU has booted in EL1
 	dmb	sy
 	dc	ivac, x1			// Invalidate potentially stale cache line
+	adr_l	x1, __run_cpu_mode
+	ldr	w0, [x1]
+	mrs	x20, CurrentEL
+	cbz	x0, skip_el_check
+	cmp	x0, x20
+	bne	mismatched_el
+skip_el_check:			// Only the first CPU gets to set the rule
+	str	w20, [x1]
+	dmb	sy
+	dc	ivac, x1			// Invalidate potentially stale cache line
 	ret
+mismatched_el:
+	str	w20, [x1, #4]
+	dmb	sy
+	dc	ivac, x1			// Invalidate potentially stale cache line
+1:	wfi
+	b	1b
 ENDPROC(set_cpu_boot_mode_flag)
 
 /*
@@ -592,6 +608,9 @@ ENDPROC(set_cpu_boot_mode_flag)
 ENTRY(__boot_cpu_mode)
 	.long	BOOT_CPU_MODE_EL2
 	.long	BOOT_CPU_MODE_EL1
+ENTRY(__run_cpu_mode)
+	.long	0
+	.long	0
 	.popsection
 
 /*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b1adc51..bc7650a 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -113,6 +113,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		pr_crit("CPU%u: failed to come online\n", cpu);
 		ret = -EIO;
 	}
+
+	if (is_kernel_mode_mismatched())
+		panic("CPU%u: incompatible execution level", cpu);
 	} else {
 		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 	}
Having both VHE and non-VHE capable CPUs in the same system
is likely to be a recipe for disaster.

If the boot CPU has VHE, but a secondary is not, we won't be
able to downgrade and run the kernel at EL1. Add CPU hotplug
to the mix, and this produces a terrifying mess.

Let's solve the problem once and for all. If you mix VHE and
non-VHE CPUs in the same system, you deserve to lose, and this
patch makes sure you don't get a chance.

This is implemented by storing the kernel execution level in
a global variable. Secondaries will park themselves in a
WFI loop if they observe a mismatch. Also, the primary CPU
will detect that the secondary CPU has died on a mismatched
execution level. Panic will follow.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/virt.h | 17 +++++++++++++++++
 arch/arm64/kernel/head.S      | 19 +++++++++++++++++++
 arch/arm64/kernel/smp.c       |  3 +++
 3 files changed, 39 insertions(+)