diff mbox

ARM: KVM: iterate over all CPUs for CPU compatibility check

Message ID	1365771875-5788-1-git-send-email-andre.przywara@linaro.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Andre Przywara <andre.przywara@linaro.org> To: cdall@cs.columbia.edu Subject: [PATCH] ARM: KVM: iterate over all CPUs for CPU compatibility check Date: Fri, 12 Apr 2013 15:04:35 +0200 Message-Id: <1365771875-5788-1-git-send-email-andre.przywara@linaro.org> summary: Content analysis details: (-2.6 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.128.43 listed in list.dnswl.org] -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: marc.zyngier@arm.com, peter.maydell@linaro.org, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Message ID

1365771875-5788-1-git-send-email-andre.przywara@linaro.org (mailing list archive)

State

New, archived

Headers

From: Andre Przywara <andre.przywara@linaro.org>
To: cdall@cs.columbia.edu
Subject: [PATCH] ARM: KVM: iterate over all CPUs for CPU compatibility check
Date: Fri, 12 Apr 2013 15:04:35 +0200
Message-Id: <1365771875-5788-1-git-send-email-andre.przywara@linaro.org>
Cc: marc.zyngier@arm.com, peter.maydell@linaro.org,
	kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Commit Message

Andre Przywara April 12, 2013, 1:04 p.m. UTC

kvm_target_cpus() checks the compatibility of the used CPU with
KVM, which is currently limited to ARM Cortex-A15 cores.
However by calling it only once on any random CPU it assumes that
all cores are the same, which is not true for big.LITTLE parts.

After doing about 40 boots on a TC-2 core tile, I found it running
in all but one case on one of the A7 cores (which correctly denied
KVM initialization). On the 39th boot however the code ran on
an A15, leading to a hang after returning success:

...
TCP: reno registered
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
kvm_target_cpu() on CPU #1, part is c0f0
 ... (pause for a while) ...
INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPU
s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999, c=4294966998, q=15)
Task dump for CPU 1:
swapper/0       R running      0     1      0 0x00000002

So iterate over every CPU to correctly determine the capability of
the system to run the current KVM implementation.
In case a big.LITTLE configuration is the reason for denial, give
the user a hint how to get it running anyway (maxcpus= on the kernel
command line).

Please push this still into 3.9.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
---
 arch/arm/kvm/arm.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Comments

Marc Zyngier April 12, 2013, 1:24 p.m. UTC | #1

On 12/04/13 14:04, Andre Przywara wrote:
> kvm_target_cpus() checks the compatibility of the used CPU with
> KVM, which is currently limited to ARM Cortex-A15 cores.
> However by calling it only once on any random CPU it assumes that
> all cores are the same, which is not true for big.LITTLE parts.
> 
> After doing about 40 boots on a TC-2 core tile, I found it running
> in all but one case on one of the A7 cores (which correctly denied
> KVM initialization). On the 39th boot however the code ran on
> an A15, leading to a hang after returning success:
> 
> ...
> TCP: reno registered
> UDP hash table entries: 512 (order: 2, 16384 bytes)
> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
> NET: Registered protocol family 1
> kvm_target_cpu() on CPU #1, part is c0f0
>  ... (pause for a while) ...
> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPU
> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999, c=4294966998, q=15)
> Task dump for CPU 1:
> swapper/0       R running      0     1      0 0x00000002
> 
> So iterate over every CPU to correctly determine the capability of
> the system to run the current KVM implementation.
> In case a big.LITTLE configuration is the reason for denial, give
> the user a hint how to get it running anyway (maxcpus= on the kernel
> command line).
> 
> Please push this still into 3.9.
> 
> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>

[...]

Nak. The fact that one of the CPUs seem to hang is a sure sign that
something is severely broken, and you definitely want to fix that issue,
instead of blindly ignoring it.

Additionally, it seems you're just papering over the issue. You should
be able to exclude the A7 processors, but not completely deny KVM from
running on the hardware.

	M.

Peter Maydell April 12, 2013, 1:40 p.m. UTC | #2

On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Nak. The fact that one of the CPUs seem to hang is a sure sign that
> something is severely broken, and you definitely want to fix that issue,
> instead of blindly ignoring it.
>
> Additionally, it seems you're just papering over the issue. You should
> be able to exclude the A7 processors, but not completely deny KVM from
> running on the hardware.

Well that might be nice, as would fully supporting big.LITTLE
systems. But until somebody actually does that work it seems
like a better idea to fail gracefully rather than having a 50%
chance of failing gracefully and a 50% chance of going weird.

-- PMM

Marc Zyngier April 12, 2013, 1:49 p.m. UTC | #3

On 12/04/13 14:40, Peter Maydell wrote:
> On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>> something is severely broken, and you definitely want to fix that issue,
>> instead of blindly ignoring it.
>>
>> Additionally, it seems you're just papering over the issue. You should
>> be able to exclude the A7 processors, but not completely deny KVM from
>> running on the hardware.
> 
> Well that might be nice, as would fully supporting big.LITTLE
> systems. But until somebody actually does that work it seems
> like a better idea to fail gracefully rather than having a 50%
> chance of failing gracefully and a 50% chance of going weird.

Nothing prevents the kernel (or even the user) from forcing the affinity
of the CPU threads to the A15s. I'm not saying we should ignore the
problem either. Just that the proposed approach is wrong.

	M.

Andre Przywara April 12, 2013, 1:58 p.m. UTC | #4

On 04/12/2013 03:24 PM, Marc Zyngier wrote:
> On 12/04/13 14:04, Andre Przywara wrote:
>> kvm_target_cpus() checks the compatibility of the used CPU with
>> KVM, which is currently limited to ARM Cortex-A15 cores.
>> However by calling it only once on any random CPU it assumes that
>> all cores are the same, which is not true for big.LITTLE parts.
>>
>> After doing about 40 boots on a TC-2 core tile, I found it running
>> in all but one case on one of the A7 cores (which correctly denied
>> KVM initialization). On the 39th boot however the code ran on
>> an A15, leading to a hang after returning success:
>>
>> ...
>> TCP: reno registered
>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>> NET: Registered protocol family 1
>> kvm_target_cpu() on CPU #1, part is c0f0
>>   ... (pause for a while) ...
>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPU
>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999, c=4294966998, q=15)
>> Task dump for CPU 1:
>> swapper/0       R running      0     1      0 0x00000002
>>
>> So iterate over every CPU to correctly determine the capability of
>> the system to run the current KVM implementation.
>> In case a big.LITTLE configuration is the reason for denial, give
>> the user a hint how to get it running anyway (maxcpus= on the kernel
>> command line).
>>
>> Please push this still into 3.9.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>
> [...]
>
> Nak. The fact that one of the CPUs seem to hang is a sure sign that
> something is severely broken, and you definitely want to fix that issue,
> instead of blindly ignoring it.

In general I agree, but we should fix it quickly for 3.9, as it can 
cause a kernel hang.

> Additionally, it seems you're just papering over the issue. You should
> be able to exclude the A7 processors, but not completely deny KVM from
> running on the hardware.

Is there an easy way to prevent VCPUs from running on not-supported 
cores? Something like an kvm_cpu_mask?
What would be the wisest hook to utilize for this?
It looks like much of the CPU selection code is generic, so this would 
be non-trivial, right?

Regards,
Andre.

Marc Zyngier April 12, 2013, 2:14 p.m. UTC | #5

On 12/04/13 14:58, Andre Przywara wrote:
> On 04/12/2013 03:24 PM, Marc Zyngier wrote:
>> On 12/04/13 14:04, Andre Przywara wrote:
>>> kvm_target_cpus() checks the compatibility of the used CPU with
>>> KVM, which is currently limited to ARM Cortex-A15 cores.
>>> However by calling it only once on any random CPU it assumes that
>>> all cores are the same, which is not true for big.LITTLE parts.
>>>
>>> After doing about 40 boots on a TC-2 core tile, I found it running
>>> in all but one case on one of the A7 cores (which correctly denied
>>> KVM initialization). On the 39th boot however the code ran on
>>> an A15, leading to a hang after returning success:
>>>
>>> ...
>>> TCP: reno registered
>>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>>> NET: Registered protocol family 1
>>> kvm_target_cpu() on CPU #1, part is c0f0
>>>   ... (pause for a while) ...
>>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPU
>>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999, c=4294966998, q=15)
>>> Task dump for CPU 1:
>>> swapper/0       R running      0     1      0 0x00000002
>>>
>>> So iterate over every CPU to correctly determine the capability of
>>> the system to run the current KVM implementation.
>>> In case a big.LITTLE configuration is the reason for denial, give
>>> the user a hint how to get it running anyway (maxcpus= on the kernel
>>> command line).
>>>
>>> Please push this still into 3.9.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>>
>> [...]
>>
>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>> something is severely broken, and you definitely want to fix that issue,
>> instead of blindly ignoring it.
> 
> In general I agree, but we should fix it quickly for 3.9, as it can 
> cause a kernel hang.

Fixing it, certainly. As you have a BL system up and running, maybe you
could start by finding out what is this CPU doing, and how it ended up
there?

>> Additionally, it seems you're just papering over the issue. You should
>> be able to exclude the A7 processors, but not completely deny KVM from
>> running on the hardware.
> 
> Is there an easy way to prevent VCPUs from running on not-supported 
> cores? Something like an kvm_cpu_mask?

Probably.

> What would be the wisest hook to utilize for this?

The scheduler has sched_[gs]etaffinity(). if you're allowed to call
these though...

	M.

Christoffer Dall April 15, 2013, 4:57 a.m. UTC | #6

On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 12/04/13 14:04, Andre Przywara wrote:
>> kvm_target_cpus() checks the compatibility of the used CPU with
>> KVM, which is currently limited to ARM Cortex-A15 cores.
>> However by calling it only once on any random CPU it assumes that
>> all cores are the same, which is not true for big.LITTLE parts.
>>
>> After doing about 40 boots on a TC-2 core tile, I found it running
>> in all but one case on one of the A7 cores (which correctly denied
>> KVM initialization). On the 39th boot however the code ran on
>> an A15, leading to a hang after returning success:
>>
>> ...
>> TCP: reno registered
>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>> NET: Registered protocol family 1
>> kvm_target_cpu() on CPU #1, part is c0f0
>>  ... (pause for a while) ...
>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPU
>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999, c=4294966998, q=15)
>> Task dump for CPU 1:
>> swapper/0       R running      0     1      0 0x00000002
>>
>> So iterate over every CPU to correctly determine the capability of
>> the system to run the current KVM implementation.
>> In case a big.LITTLE configuration is the reason for denial, give
>> the user a hint how to get it running anyway (maxcpus= on the kernel
>> command line).
>>
>> Please push this still into 3.9.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>
> [...]
>
> Nak. The fact that one of the CPUs seem to hang is a sure sign that
> something is severely broken, and you definitely want to fix that issue,
> instead of blindly ignoring it.
>
> Additionally, it seems you're just papering over the issue. You should
> be able to exclude the A7 processors, but not completely deny KVM from
> running on the hardware.
>
Marc, I disagree with this nak. If the current kernel breaks boot on a
Big.Little system, we need to take care of that first, and the
proposed patch is a quick way to do so, and it does not stand in the
way of introducing proper Big.Little support in any way, which I'm
sure is going to open up a lot of other interesting questions.

I'm going to take this one.

-Christoffer

Marc Zyngier April 15, 2013, 7:50 a.m. UTC | #7

On Sun, 14 Apr 2013 21:57:36 -0700, Christoffer Dall
<cdall@cs.columbia.edu> wrote:
> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com>
wrote:
>> On 12/04/13 14:04, Andre Przywara wrote:
>>> kvm_target_cpus() checks the compatibility of the used CPU with
>>> KVM, which is currently limited to ARM Cortex-A15 cores.
>>> However by calling it only once on any random CPU it assumes that
>>> all cores are the same, which is not true for big.LITTLE parts.
>>>
>>> After doing about 40 boots on a TC-2 core tile, I found it running
>>> in all but one case on one of the A7 cores (which correctly denied
>>> KVM initialization). On the 39th boot however the code ran on
>>> an A15, leading to a hang after returning success:
>>>
>>> ...
>>> TCP: reno registered
>>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>>> NET: Registered protocol family 1
>>> kvm_target_cpu() on CPU #1, part is c0f0
>>>  ... (pause for a while) ...
>>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected
>>> stalls on CPU
>>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999,
>>> c=4294966998, q=15)
>>> Task dump for CPU 1:
>>> swapper/0       R running      0     1      0 0x00000002
>>>
>>> So iterate over every CPU to correctly determine the capability of
>>> the system to run the current KVM implementation.
>>> In case a big.LITTLE configuration is the reason for denial, give
>>> the user a hint how to get it running anyway (maxcpus= on the kernel
>>> command line).
>>>
>>> Please push this still into 3.9.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>>
>> [...]
>>
>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>> something is severely broken, and you definitely want to fix that
issue,
>> instead of blindly ignoring it.
>>
>> Additionally, it seems you're just papering over the issue. You should
>> be able to exclude the A7 processors, but not completely deny KVM from
>> running on the hardware.
>>
> Marc, I disagree with this nak. If the current kernel breaks boot on a
> Big.Little system, we need to take care of that first, and the
> proposed patch is a quick way to do so, and it does not stand in the
> way of introducing proper Big.Little support in any way, which I'm
> sure is going to open up a lot of other interesting questions.
> 
> I'm going to take this one.

That's your privilege.

Now, my objections about this patch still stand:
- It papers over what looks like a serious bug (CPU hanging? bah, let's
not bother with that). It is an A15 hanging here by the way, not an A7.
- It forces the user to choose between 5 CPUs and KVM (while simply
setting thread affinity would solve the problem this patch tries to solve).
- It reports potentially wrong information (setting maxcpus= will probably
do the wrong thing if cluster 0 is A7 based).

For these reasons, I'm still strongly opposed to this patch being merged.
Yes, this is a quick way to hide a (much) bigger problem, but in no way a
fix.

        M.

Christoffer Dall April 15, 2013, 8:28 a.m. UTC | #8

On Mon, Apr 15, 2013 at 12:50 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On Sun, 14 Apr 2013 21:57:36 -0700, Christoffer Dall
> <cdall@cs.columbia.edu> wrote:
>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com>
> wrote:
>>> On 12/04/13 14:04, Andre Przywara wrote:
>>>> kvm_target_cpus() checks the compatibility of the used CPU with
>>>> KVM, which is currently limited to ARM Cortex-A15 cores.
>>>> However by calling it only once on any random CPU it assumes that
>>>> all cores are the same, which is not true for big.LITTLE parts.
>>>>
>>>> After doing about 40 boots on a TC-2 core tile, I found it running
>>>> in all but one case on one of the A7 cores (which correctly denied
>>>> KVM initialization). On the 39th boot however the code ran on
>>>> an A15, leading to a hang after returning success:
>>>>
>>>> ...
>>>> TCP: reno registered
>>>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>>>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>>>> NET: Registered protocol family 1
>>>> kvm_target_cpu() on CPU #1, part is c0f0
>>>>  ... (pause for a while) ...
>>>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected
>>>> stalls on CPU
>>>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999,
>>>> c=4294966998, q=15)
>>>> Task dump for CPU 1:
>>>> swapper/0       R running      0     1      0 0x00000002
>>>>
>>>> So iterate over every CPU to correctly determine the capability of
>>>> the system to run the current KVM implementation.
>>>> In case a big.LITTLE configuration is the reason for denial, give
>>>> the user a hint how to get it running anyway (maxcpus= on the kernel
>>>> command line).
>>>>
>>>> Please push this still into 3.9.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>>>
>>> [...]
>>>
>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>> something is severely broken, and you definitely want to fix that
> issue,
>>> instead of blindly ignoring it.
>>>
>>> Additionally, it seems you're just papering over the issue. You should
>>> be able to exclude the A7 processors, but not completely deny KVM from
>>> running on the hardware.
>>>
>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>> Big.Little system, we need to take care of that first, and the
>> proposed patch is a quick way to do so, and it does not stand in the
>> way of introducing proper Big.Little support in any way, which I'm
>> sure is going to open up a lot of other interesting questions.
>>
>> I'm going to take this one.
>
> That's your privilege.
>
> Now, my objections about this patch still stand:
> - It papers over what looks like a serious bug (CPU hanging? bah, let's
> not bother with that). It is an A15 hanging here by the way, not an A7.

I missed that part. Are we even sure then that this is related to
running on Big.Little?

> - It forces the user to choose between 5 CPUs and KVM (while simply
> setting thread affinity would solve the problem this patch tries to solve).

But this happens during boot, so thread affinity won't really be a
valid work around for this...

> - It reports potentially wrong information (setting maxcpus= will probably
> do the wrong thing if cluster 0 is A7 based).

ok, fair enough.

>
> For these reasons, I'm still strongly opposed to this patch being merged.
> Yes, this is a quick way to hide a (much) bigger problem, but in no way a
> fix.
>

I'm not claiming this to be a fix for the underlying problem, and I
would much rather see a proper fix. But, I also don't want kernels
configured with KVM to cause systems not to boot and if we can prevent
that from happening for now, in whichever rough possible way, I think
that takes priority.

Right now all the code is written with the inherent assumption that
we're running on an A15, and while I did not scrutinize if some of our
code can break the host if run on an A7, we should probably make sure
that doesn't happen until we actively support Big.Little and A7.  I
think relying on users not to crash their kernels by setting CPU
affinities is also a big mistake.

Andre, when I look back over the log, I'm actually not seeing definite
proof that it's inside the KVM init that the core hangs, how do you
determine this?

Also, if you only experienced the problem in one boot out of 40, how
can you be sure that this patch actually fixes the issue you're
seeing?

Marc Zyngier April 15, 2013, 8:43 a.m. UTC | #9

On Mon, 15 Apr 2013 01:28:25 -0700, Christoffer Dall
<cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 12:50 AM, Marc Zyngier <marc.zyngier@arm.com>
> wrote:
>> On Sun, 14 Apr 2013 21:57:36 -0700, Christoffer Dall
>> <cdall@cs.columbia.edu> wrote:
>>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com>
>> wrote:
>>>> On 12/04/13 14:04, Andre Przywara wrote:
>>>>> kvm_target_cpus() checks the compatibility of the used CPU with
>>>>> KVM, which is currently limited to ARM Cortex-A15 cores.
>>>>> However by calling it only once on any random CPU it assumes that
>>>>> all cores are the same, which is not true for big.LITTLE parts.
>>>>>
>>>>> After doing about 40 boots on a TC-2 core tile, I found it running
>>>>> in all but one case on one of the A7 cores (which correctly denied
>>>>> KVM initialization). On the 39th boot however the code ran on
>>>>> an A15, leading to a hang after returning success:
>>>>>
>>>>> ...
>>>>> TCP: reno registered
>>>>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>>>>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>>>>> NET: Registered protocol family 1
>>>>> kvm_target_cpu() on CPU #1, part is c0f0
>>>>>  ... (pause for a while) ...
>>>>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected
>>>>> stalls on CPU
>>>>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999,
>>>>> c=4294966998, q=15)
>>>>> Task dump for CPU 1:
>>>>> swapper/0       R running      0     1      0 0x00000002
>>>>>
>>>>> So iterate over every CPU to correctly determine the capability of
>>>>> the system to run the current KVM implementation.
>>>>> In case a big.LITTLE configuration is the reason for denial, give
>>>>> the user a hint how to get it running anyway (maxcpus= on the kernel
>>>>> command line).
>>>>>
>>>>> Please push this still into 3.9.
>>>>>
>>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>>>>
>>>> [...]
>>>>
>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>> something is severely broken, and you definitely want to fix that
>> issue,
>>>> instead of blindly ignoring it.
>>>>
>>>> Additionally, it seems you're just papering over the issue. You
should
>>>> be able to exclude the A7 processors, but not completely deny KVM
from
>>>> running on the hardware.
>>>>
>>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>>> Big.Little system, we need to take care of that first, and the
>>> proposed patch is a quick way to do so, and it does not stand in the
>>> way of introducing proper Big.Little support in any way, which I'm
>>> sure is going to open up a lot of other interesting questions.
>>>
>>> I'm going to take this one.
>>
>> That's your privilege.
>>
>> Now, my objections about this patch still stand:
>> - It papers over what looks like a serious bug (CPU hanging? bah, let's
>> not bother with that). It is an A15 hanging here by the way, not an A7.
> 
> I missed that part. Are we even sure then that this is related to
> running on Big.Little?

The log is way to terse to tell. I asked Andre to investigate this in a
separate email (I'm away from my TC2 for a while).

>> - It forces the user to choose between 5 CPUs and KVM (while simply
>> setting thread affinity would solve the problem this patch tries to
>> solve).
> 
> But this happens during boot, so thread affinity won't really be a
> valid work around for this...

Just refuse to initialize KVM on the A7s, display a warning indicating the
*valid* CPUs, and let the user use taskset to run their KVM guest. And/or
enforce this in the kernel when hitting vcpu initialisation (using
sched_setaffinity).

>> - It reports potentially wrong information (setting maxcpus= will
>> probably
>> do the wrong thing if cluster 0 is A7 based).
> 
> ok, fair enough.
> 
>>
>> For these reasons, I'm still strongly opposed to this patch being
merged.
>> Yes, this is a quick way to hide a (much) bigger problem, but in no way
a
>> fix.
>>
> 
> I'm not claiming this to be a fix for the underlying problem, and I
> would much rather see a proper fix. But, I also don't want kernels
> configured with KVM to cause systems not to boot and if we can prevent
> that from happening for now, in whichever rough possible way, I think
> that takes priority.

I agree with this. I disagree with the method.

> Right now all the code is written with the inherent assumption that
> we're running on an A15, and while I did not scrutinize if some of our
> code can break the host if run on an A7, we should probably make sure
> that doesn't happen until we actively support Big.Little and A7.  I
> think relying on users not to crash their kernels by setting CPU
> affinities is also a big mistake.

KVM initialization on A7 should really already work as it is. If it
doesn't, then it is a bug that is waiting to occur on A15. And the above
hang may just be a sign that the problem is actually occurring already.

As for the affinity, we can enforce this very easily.

        M.

Christoffer Dall April 15, 2013, 8:54 a.m. UTC | #10

On Mon, Apr 15, 2013 at 1:43 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On Mon, 15 Apr 2013 01:28:25 -0700, Christoffer Dall
> <cdall@cs.columbia.edu> wrote:
>> On Mon, Apr 15, 2013 at 12:50 AM, Marc Zyngier <marc.zyngier@arm.com>
>> wrote:
>>> On Sun, 14 Apr 2013 21:57:36 -0700, Christoffer Dall
>>> <cdall@cs.columbia.edu> wrote:
>>>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com>
>>> wrote:
>>>>> On 12/04/13 14:04, Andre Przywara wrote:
>>>>>> kvm_target_cpus() checks the compatibility of the used CPU with
>>>>>> KVM, which is currently limited to ARM Cortex-A15 cores.
>>>>>> However by calling it only once on any random CPU it assumes that
>>>>>> all cores are the same, which is not true for big.LITTLE parts.
>>>>>>
>>>>>> After doing about 40 boots on a TC-2 core tile, I found it running
>>>>>> in all but one case on one of the A7 cores (which correctly denied
>>>>>> KVM initialization). On the 39th boot however the code ran on
>>>>>> an A15, leading to a hang after returning success:
>>>>>>
>>>>>> ...
>>>>>> TCP: reno registered
>>>>>> UDP hash table entries: 512 (order: 2, 16384 bytes)
>>>>>> UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
>>>>>> NET: Registered protocol family 1
>>>>>> kvm_target_cpu() on CPU #1, part is c0f0
>>>>>>  ... (pause for a while) ...
>>>>>> INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected
>>>>>> stalls on CPU
>>>>>> s/tasks: { 1} (detected by 0, t=6002 jiffies, g=4294966999,
>>>>>> c=4294966998, q=15)
>>>>>> Task dump for CPU 1:
>>>>>> swapper/0       R running      0     1      0 0x00000002
>>>>>>
>>>>>> So iterate over every CPU to correctly determine the capability of
>>>>>> the system to run the current KVM implementation.
>>>>>> In case a big.LITTLE configuration is the reason for denial, give
>>>>>> the user a hint how to get it running anyway (maxcpus= on the kernel
>>>>>> command line).
>>>>>>
>>>>>> Please push this still into 3.9.
>>>>>>
>>>>>> Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
>>>>>
>>>>> [...]
>>>>>
>>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>>> something is severely broken, and you definitely want to fix that
>>> issue,
>>>>> instead of blindly ignoring it.
>>>>>
>>>>> Additionally, it seems you're just papering over the issue. You
> should
>>>>> be able to exclude the A7 processors, but not completely deny KVM
> from
>>>>> running on the hardware.
>>>>>
>>>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>>>> Big.Little system, we need to take care of that first, and the
>>>> proposed patch is a quick way to do so, and it does not stand in the
>>>> way of introducing proper Big.Little support in any way, which I'm
>>>> sure is going to open up a lot of other interesting questions.
>>>>
>>>> I'm going to take this one.
>>>
>>> That's your privilege.
>>>
>>> Now, my objections about this patch still stand:
>>> - It papers over what looks like a serious bug (CPU hanging? bah, let's
>>> not bother with that). It is an A15 hanging here by the way, not an A7.
>>
>> I missed that part. Are we even sure then that this is related to
>> running on Big.Little?
>
> The log is way to terse to tell. I asked Andre to investigate this in a
> separate email (I'm away from my TC2 for a while).
>
>>> - It forces the user to choose between 5 CPUs and KVM (while simply
>>> setting thread affinity would solve the problem this patch tries to
>>> solve).
>>
>> But this happens during boot, so thread affinity won't really be a
>> valid work around for this...
>
> Just refuse to initialize KVM on the A7s, display a warning indicating the
> *valid* CPUs, and let the user use taskset to run their KVM guest. And/or
> enforce this in the kernel when hitting vcpu initialisation (using
> sched_setaffinity).
>
>>> - It reports potentially wrong information (setting maxcpus= will
>>> probably
>>> do the wrong thing if cluster 0 is A7 based).
>>
>> ok, fair enough.
>>
>>>
>>> For these reasons, I'm still strongly opposed to this patch being
> merged.
>>> Yes, this is a quick way to hide a (much) bigger problem, but in no way
> a
>>> fix.
>>>
>>
>> I'm not claiming this to be a fix for the underlying problem, and I
>> would much rather see a proper fix. But, I also don't want kernels
>> configured with KVM to cause systems not to boot and if we can prevent
>> that from happening for now, in whichever rough possible way, I think
>> that takes priority.
>
> I agree with this. I disagree with the method.
>
>> Right now all the code is written with the inherent assumption that
>> we're running on an A15, and while I did not scrutinize if some of our
>> code can break the host if run on an A7, we should probably make sure
>> that doesn't happen until we actively support Big.Little and A7.  I
>> think relying on users not to crash their kernels by setting CPU
>> affinities is also a big mistake.
>
> KVM initialization on A7 should really already work as it is. If it
> doesn't, then it is a bug that is waiting to occur on A15. And the above
> hang may just be a sign that the problem is actually occurring already.
>
> As for the affinity, we can enforce this very easily.
>
OK, but let's not go down this road until we've verified that this is
really an issue that needs to be handled by excluding A7s.  Andre, can
you give us some more details?

Peter Maydell April 15, 2013, 9:14 a.m. UTC | #11

On 15 April 2013 09:54, Christoffer Dall <cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 1:43 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> As for the affinity, we can enforce this very easily.
>>
> OK, but let's not go down this road until we've verified that this is
> really an issue that needs to be handled by excluding A7s.  Andre, can
> you give us some more details?

We *must* exclude KVM vcpu threads from ever running on A7s,
because the cp15 handling code as it stands does not virtualize
various registers (most notably the ID/feature registers) and
so the vcpu will bounce between "looks like an A15" and "looks
like a weird hybrid between A7 and A15".

Assuming we don't want to tackle full big.LITTLE host support
right now, I can see two options:
 * forbid KVM completely on a system with any non-A15s
 * force KVM vcpus to only ever run on the A15s [Q: does
   all the hyp mode code have to also run on the A15s or
   is it A7-safe?]

-- PMM

Andre Przywara April 15, 2013, 9:39 a.m. UTC | #12

On 04/15/2013 11:14 AM, Peter Maydell wrote:
> On 15 April 2013 09:54, Christoffer Dall <cdall@cs.columbia.edu> wrote:
>> On Mon, Apr 15, 2013 at 1:43 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> As for the affinity, we can enforce this very easily.
>>>
>> OK, but let's not go down this road until we've verified that this is
>> really an issue that needs to be handled by excluding A7s.  Andre, can
>> you give us some more details?

Sure, I am about to collect some data.

>
> We *must* exclude KVM vcpu threads from ever running on A7s,
> because the cp15 handling code as it stands does not virtualize
> various registers (most notably the ID/feature registers) and
> so the vcpu will bounce between "looks like an A15" and "looks
> like a weird hybrid between A7 and A15".

Can't we propagate some least common denominator or safe values for 
those registers instead of propagating the host ones? Or ignore writes 
to them? Or are there any real showstoppers which I overlooked? Things 
like cache architecture properties could just be set to some basic 
values, right?
On x86 propagating too much of host CPU information has proven to be 
problematic, so for instance cache info is very generic. Things are more 
complicated here, because CPUID is (guest) userland accessible.

Eventually the user/qemu should be able to override this, but as a 
default and for now we should inject one fixed handcrafted CPU 
regardless of the actual host CPU. If A7 is easier in this regards, this 
should be considered, too.

> Assuming we don't want to tackle full big.LITTLE host support
> right now, I can see two options:
>   * forbid KVM completely on a system with any non-A15s
>   * force KVM vcpus to only ever run on the A15s

I thought a bit about this. It is not so easy to accomplish in a sane 
way. The CPU is chosen by the scheduler, actually for the QEMU thread. 
taskset could be of use here, but the affinity could be changed at any 
time by userland. I thought about returning -EINVAL for the KVM_RUN 
ioctl if the CPU is not appropriate, but a) QEMU does not know how to 
handle this properly and b) kvm_arch_vcpu_load currently returns void, 
so we would need to change it for all KVM architectures.
Not sure if its worth to establish this until we fix bL support properly.

> [Q: does
>     all the hyp mode code have to also run on the A15s or
>     is it A7-safe?]

Good point. It seems like the code hangs in init_hyp_mode().
Will do more debugging in there now.

Regards,
Andre.

Peter Maydell April 15, 2013, 9:45 a.m. UTC | #13

On 15 April 2013 10:39, Andre Przywara <andre.przywara@linaro.org> wrote:
> On 04/15/2013 11:14 AM, Peter Maydell wrote:
>> We *must* exclude KVM vcpu threads from ever running on A7s,
>> because the cp15 handling code as it stands does not virtualize
>> various registers (most notably the ID/feature registers) and
>> so the vcpu will bounce between "looks like an A15" and "looks
>> like a weird hybrid between A7 and A15".
>
> Can't we propagate some least common denominator or safe values for those
> registers instead of propagating the host ones? Or ignore writes to them? Or
> are there any real showstoppers which I overlooked?

Yes, we could do all these things, and will have to; and
we need to analyse all the cp15 registers to determine which
we need to virtualize and in which of these ways. This is
"tackling full big.LITTLE host support" and it is a big job.
We need to do it at some point but for the moment we should
just make sure we don't do weird and wrong things on an
unsupported host config.

-- PMM

Andre Przywara April 15, 2013, 1:13 p.m. UTC | #14

On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
> On 15 April 2013 11:39, Andre Przywara <andre.przywara@linaro.org
> <mailto:andre.przywara@linaro.org>> wrote:
>
>      > [Q: does
>      >     all the hyp mode code have to also run on the A15s or
>      >     is it A7-safe?]
>
>     Good point. It seems like the code hangs in init_hyp_mode().
>     Will do more debugging in there now.
>
>
> I've run on this problem before, while trying to run KVM guests on A7 cores.
>
> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
> instruction that updates HSCTLR between the two isbs on __do_hyp_init
> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4 then
> init_hyp_mode() will not hang on the A7 cluster. Other than that from my
> limited testing KVM on A7 works on a usual linux guest. I also tried to
> only boot the 3rd A7 core to rule out any racing issues, but still the
> same behaviour applies.

Could well be the same issue here. I chased it down till CPU 2 goes into 
HYP mode to do the initialization.
I am running with maxcpus=3 (this increases the likelyhood that 
kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
As the HYP mode exception table is empty except for the HVC trap, it may 
be looping here. I am trying now to get the PC of the faulty instruction.


(on a TC2: 0-1: A15, 2-4: A7)
kvm_target_cpu() on CPU #1, part is c0f0
kvm [1]: entering init_hyp_mode on CPU 1
kvm [1]: kvm_mmu_init() returned 0 on CPU 1
kvm [1]: hyp_get_vectors() finished on CPU 1
kvm [1]: allocated stack for HV mode on 5 CPUs, now on CPU 1
kvm [0]: entering cpu_init_hyp_mode(8050fde0) on CPU 0
kvm [0]: c_i_h_m: set vectors finished
kvm [0]: c_i_h_m: initialization of local variables finished
kvm [0]: c_i_h_m: kvm_call_hyp() finished, returning...
kvm [1]: entering cpu_init_hyp_mode(8050fde0) on CPU 1
kvm [1]: c_i_h_m: set vectors finished
kvm [1]: c_i_h_m: initialization of local variables finished
kvm [1]: c_i_h_m: kvm_call_hyp() finished, returning...
kvm [0]: entering cpu_init_hyp_mode(8050fde0) on CPU 2
kvm [0]: c_i_h_m: set vectors finished
kvm [0]: c_i_h_m: initialization of local variables finished


Regards,
Andre.

Will Deacon April 15, 2013, 1:48 p.m. UTC | #15

On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
> > I've run on this problem before, while trying to run KVM guests on A7 cores.
> >
> > For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
> > instruction that updates HSCTLR between the two isbs on __do_hyp_init
> > (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4 then
> > init_hyp_mode() will not hang on the A7 cluster. Other than that from my
> > limited testing KVM on A7 works on a usual linux guest. I also tried to
> > only boot the 3rd A7 core to rule out any racing issues, but still the
> > same behaviour applies.
> 
> Could well be the same issue here. I chased it down till CPU 2 goes into 
> HYP mode to do the initialization.
> I am running with maxcpus=3 (this increases the likelyhood that 
> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
> As the HYP mode exception table is empty except for the HVC trap, it may 
> be looping here. I am trying now to get the PC of the faulty instruction.

Yes, it sounds like you're taking a recursive fault because the vectors
aren't installed yet. Is there any chance you can find out what value you end
up writing (or trying to write) to the HSCTLR please?

Will

Andre Przywara April 15, 2013, 2:26 p.m. UTC | #16

On 04/15/2013 03:48 PM, Will Deacon wrote:
> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>>> I've run on this problem before, while trying to run KVM guests on A7 cores.
>>>
>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>>> instruction that updates HSCTLR between the two isbs on __do_hyp_init
>>> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4 then
>>> init_hyp_mode() will not hang on the A7 cluster. Other than that from my
>>> limited testing KVM on A7 works on a usual linux guest. I also tried to
>>> only boot the 3rd A7 core to rule out any racing issues, but still the
>>> same behaviour applies.
>>
>> Could well be the same issue here. I chased it down till CPU 2 goes into
>> HYP mode to do the initialization.
>> I am running with maxcpus=3 (this increases the likelyhood that
>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>> As the HYP mode exception table is empty except for the HVC trap, it may
>> be looping here. I am trying now to get the PC of the faulty instruction.
>
> Yes, it sounds like you're taking a recursive fault because the vectors
> aren't installed yet. Is there any chance you can find out what value you end
> up writing (or trying to write) to the HSCTLR please?

Mmh, still thinking about a nice way to get some information out of HYP 
mode. schedule_delayed_work() to monitor a memory address didn't work 
(too early?).

But meanwhile I compared the A7 and A15 TRM 4.3.32 HSCTLR:
Bit 21 is fixed 0 on A15, but fixed 1 on A7.
Can someone confirm that this is not a typo?

Regards,
Andre.

Peter Maydell April 15, 2013, 2:39 p.m. UTC | #17

On 15 April 2013 15:26, Andre Przywara <andre.przywara@linaro.org> wrote:
> But meanwhile I compared the A7 and A15 TRM 4.3.32 HSCTLR:
> Bit 21 is fixed 0 on A15, but fixed 1 on A7.
> Can someone confirm that this is not a typo?

The A7 TRM diagram says '1' but the text says RAZ/WI.
The bit should match SCTLR.FI (bit 21), which the A7 TRM
definitely says is RAZ/WI. If you have the hardware you
can confirm the behaviour yourself by reading the register
and seeing what the bit value is :-)

-- PMM

Alexander Spyridakis April 15, 2013, 2:53 p.m. UTC | #18

On 15 April 2013 16:26, Andre Przywara <andre.przywara@linaro.org> wrote:
> Mmh, still thinking about a nice way to get some information out of HYP mode. schedule_delayed_work() to monitor a memory address didn't work (too early?).
>
> But meanwhile I compared the A7 and A15 TRM 4.3.32 HSCTLR:
> Bit 21 is fixed 0 on A15, but fixed 1 on A7.
> Can someone confirm that this is not a typo?

The HSCTLR value, seemed sane for all cores and it was fixed (read
HSCTLR and SCTLR, ORs some bits and updates HSCTLR). If I remember
correctly, at the time I was testing it, all cores reported the same
value for SCTLR but in the case of the 3rd A7 it would crash as soon
as it would try to write on HSCTLR. If you want to continue execution
remove one of the isbs, of course the 3rd A7 will immediately hang
when you try to run a guest (the rest of the cores will operate
normally even when running a guest on them).

Regards.

Geoff Levand April 16, 2013, 12:26 a.m. UTC | #19

On Sun, 2013-04-14 at 21:57 -0700, Christoffer Dall wrote:
> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Nak. The fact that one of the CPUs seem to hang is a sure sign that
> > something is severely broken, and you definitely want to fix that issue,
> > instead of blindly ignoring it.
> >
> > Additionally, it seems you're just papering over the issue. You should
> > be able to exclude the A7 processors, but not completely deny KVM from
> > running on the hardware.
> >
> Marc, I disagree with this nak. If the current kernel breaks boot on a
> Big.Little system, we need to take care of that first, and the
> proposed patch is a quick way to do so, and it does not stand in the
> way of introducing proper Big.Little support in any way, which I'm
> sure is going to open up a lot of other interesting questions.
> 
> I'm going to take this one.

Since this problem will cause the 3.9 kernel to hang then a workaround
like this should go in.  There isn't enough time to do a proper fix for
3.9, and even if it could be done I think it would be too intrusive to
get merged this late.

-Geoff

Christoffer Dall April 16, 2013, 3:59 p.m. UTC | #20

On Mon, Apr 15, 2013 at 2:52 AM, Alexander Spyridakis
<a.spyridakis@virtualopensystems.com> wrote:
> On 15 April 2013 11:39, Andre Przywara <andre.przywara@linaro.org> wrote:
>>
>> > [Q: does
>> >     all the hyp mode code have to also run on the A15s or
>> >     is it A7-safe?]
>>
>> Good point. It seems like the code hangs in init_hyp_mode().
>> Will do more debugging in there now.
>
>
> I've run on this problem before, while trying to run KVM guests on A7 cores.
>
> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the instruction
> that updates HSCTLR between the two isbs on __do_hyp_init (mcr p15, 4, r0,
> c1, c0, 0). If you boot the system with maxcpus=4 then init_hyp_mode() will
> not hang on the A7 cluster. Other than that from my limited testing KVM on
> A7 works on a usual linux guest. I also tried to only boot the 3rd A7 core
> to rule out any racing issues, but still the same behaviour applies.
>
You lost me on the maxcpus=4 and racing point here. If it works with
maxcpus=4 then it works with two A7s, right? Why would running with
maxcpus=3 (ie. one A7) rule out any race condition?

-Christoffer

Christoffer Dall April 16, 2013, 4:03 p.m. UTC | #21

On Tue, Apr 16, 2013 at 8:59 AM, Christoffer Dall <cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 2:52 AM, Alexander Spyridakis
> <a.spyridakis@virtualopensystems.com> wrote:
>> On 15 April 2013 11:39, Andre Przywara <andre.przywara@linaro.org> wrote:
>>>
>>> > [Q: does
>>> >     all the hyp mode code have to also run on the A15s or
>>> >     is it A7-safe?]
>>>
>>> Good point. It seems like the code hangs in init_hyp_mode().
>>> Will do more debugging in there now.
>>
>>
>> I've run on this problem before, while trying to run KVM guests on A7 cores.
>>
>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the instruction
>> that updates HSCTLR between the two isbs on __do_hyp_init (mcr p15, 4, r0,
>> c1, c0, 0). If you boot the system with maxcpus=4 then init_hyp_mode() will
>> not hang on the A7 cluster. Other than that from my limited testing KVM on
>> A7 works on a usual linux guest. I also tried to only boot the 3rd A7 core
>> to rule out any racing issues, but still the same behaviour applies.
>>
> You lost me on the maxcpus=4 and racing point here. If it works with
> maxcpus=4 then it works with two A7s, right? Why would running with
> maxcpus=3 (ie. one A7) rule out any race condition?
>
[resending this as someone misspelled Marc's e-mail]

Christoffer Dall April 16, 2013, 4:24 p.m. UTC | #22

On Mon, Apr 15, 2013 at 5:26 PM, Geoff Levand <geoff@infradead.org> wrote:
> On Sun, 2013-04-14 at 21:57 -0700, Christoffer Dall wrote:
>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>> > something is severely broken, and you definitely want to fix that issue,
>> > instead of blindly ignoring it.
>> >
>> > Additionally, it seems you're just papering over the issue. You should
>> > be able to exclude the A7 processors, but not completely deny KVM from
>> > running on the hardware.
>> >
>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>> Big.Little system, we need to take care of that first, and the
>> proposed patch is a quick way to do so, and it does not stand in the
>> way of introducing proper Big.Little support in any way, which I'm
>> sure is going to open up a lot of other interesting questions.
>>
>> I'm going to take this one.
>
> Since this problem will cause the 3.9 kernel to hang then a workaround
> like this should go in.  There isn't enough time to do a proper fix for
> 3.9, and even if it could be done I think it would be too intrusive to
> get merged this late.
>
That's why I was inclined to take the patch, but as Marc pointed out
the error message is incorrect, so that should be fixed at the very
least. Also I don't think we need the counting logic, just bail out if
we have any CPUs that are not supported.

Marc, since you're the strongest opponent of this patch, are you still
opposed to making sure we don't try to run KVM on Big.Little until
support is properly introduced?

I also cannot see how we can fix the affinity issue easily from within
the kernel, do you have a concrete approach in mind you can share?

Note that this is irrespective of the boot delay issue that you guys
are observing.

-Christoffer

Christoffer Dall April 16, 2013, 4:26 p.m. UTC | #23

On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>> > I've run on this problem before, while trying to run KVM guests on A7 cores.
>> >
>> > For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>> > instruction that updates HSCTLR between the two isbs on __do_hyp_init
>> > (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4 then
>> > init_hyp_mode() will not hang on the A7 cluster. Other than that from my
>> > limited testing KVM on A7 works on a usual linux guest. I also tried to
>> > only boot the 3rd A7 core to rule out any racing issues, but still the
>> > same behaviour applies.
>>
>> Could well be the same issue here. I chased it down till CPU 2 goes into
>> HYP mode to do the initialization.
>> I am running with maxcpus=3 (this increases the likelyhood that
>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>> As the HYP mode exception table is empty except for the HVC trap, it may
>> be looping here. I am trying now to get the PC of the faulty instruction.
>
> Yes, it sounds like you're taking a recursive fault because the vectors
> aren't installed yet. Is there any chance you can find out what value you end
> up writing (or trying to write) to the HSCTLR please?
>
Actually I'm a little confused, wasn't Andre seeing a halt on an A15
cpu, not an A7? Or is the theory that an A7 locks up and the calling
A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the A7 to
complete?

-Christoffer

Marc Zyngier April 16, 2013, 4:33 p.m. UTC | #24

On Tue, 16 Apr 2013 09:26:26 -0700, Christoffer Dall
<cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon <will.deacon@arm.com>
wrote:
>> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>>> > I've run on this problem before, while trying to run KVM guests on
A7
>>> > cores.
>>> >
>>> > For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>>> > instruction that updates HSCTLR between the two isbs on
__do_hyp_init
>>> > (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4
>>> > then
>>> > init_hyp_mode() will not hang on the A7 cluster. Other than that
from
>>> > my
>>> > limited testing KVM on A7 works on a usual linux guest. I also tried
>>> > to
>>> > only boot the 3rd A7 core to rule out any racing issues, but still
the
>>> > same behaviour applies.
>>>
>>> Could well be the same issue here. I chased it down till CPU 2 goes
into
>>> HYP mode to do the initialization.
>>> I am running with maxcpus=3 (this increases the likelyhood that
>>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>>> As the HYP mode exception table is empty except for the HVC trap, it
may
>>> be looping here. I am trying now to get the PC of the faulty
>>> instruction.
>>
>> Yes, it sounds like you're taking a recursive fault because the vectors
>> aren't installed yet. Is there any chance you can find out what value
>> you end
>> up writing (or trying to write) to the HSCTLR please?
>>
> Actually I'm a little confused, wasn't Andre seeing a halt on an A15
> cpu, not an A7? Or is the theory that an A7 locks up and the calling
> A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the A7 to
> complete?

Yes, A15 hanging, not A7. That's why I'm strongly opposed to this patch.
I'm pretty sure the A7s only have a side effect that triggers a kernel bug
on the A15 side. Before taking *any* patch around this, we should
understand the issue fully, and not start patching random stuff just
because Linus is going to tag 3.9.

        M.

Marc Zyngier April 16, 2013, 4:40 p.m. UTC | #25

On Tue, 16 Apr 2013 09:24:03 -0700, Christoffer Dall
<cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 5:26 PM, Geoff Levand <geoff@infradead.org>
wrote:
>> On Sun, 2013-04-14 at 21:57 -0700, Christoffer Dall wrote:
>>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com>
>>> wrote:
>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>> > something is severely broken, and you definitely want to fix that
>>> > issue,
>>> > instead of blindly ignoring it.
>>> >
>>> > Additionally, it seems you're just papering over the issue. You
should
>>> > be able to exclude the A7 processors, but not completely deny KVM
from
>>> > running on the hardware.
>>> >
>>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>>> Big.Little system, we need to take care of that first, and the
>>> proposed patch is a quick way to do so, and it does not stand in the
>>> way of introducing proper Big.Little support in any way, which I'm
>>> sure is going to open up a lot of other interesting questions.
>>>
>>> I'm going to take this one.
>>
>> Since this problem will cause the 3.9 kernel to hang then a workaround
>> like this should go in.  There isn't enough time to do a proper fix for
>> 3.9, and even if it could be done I think it would be too intrusive to
>> get merged this late.
>>
> That's why I was inclined to take the patch, but as Marc pointed out
> the error message is incorrect, so that should be fixed at the very
> least. Also I don't think we need the counting logic, just bail out if
> we have any CPUs that are not supported.
> 
> Marc, since you're the strongest opponent of this patch, are you still
> opposed to making sure we don't try to run KVM on Big.Little until
> support is properly introduced?

Nothing I've seen so far proves that BL isn't working. Yes, we know the
guest side is going to break. But the host should be solid, and if it
isn't, let's fix it. What we've seen so far is an A15 hanging (Andre's
trace), and Alexander's weird problem with the third A7 hanging. So far,
I'd be inclined to say that BL is working well enough for that bug to be
reproduced on both sides.

> I also cannot see how we can fix the affinity issue easily from within
> the kernel, do you have a concrete approach in mind you can share?

My idea was to check the affinity of the vcpu thread, compare it to the
"KVM affinity", and force it to the intersection of these two sets
(rescheduling if not on an A15). If the intersection is null, just give up.

Not pretty, but ensures nothing gets scheduled on an A7.

        M.

Alexander Spyridakis April 16, 2013, 6:37 p.m. UTC | #26

On 16 April 2013 17:59, Christoffer Dall <cdall@cs.columbia.edu> wrote:
> On Mon, Apr 15, 2013 at 2:52 AM, Alexander Spyridakis
> <a.spyridakis@virtualopensystems.com> wrote:
>> I've run on this problem before, while trying to run KVM guests on A7 cores.
>>
>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the instruction
>> that updates HSCTLR between the two isbs on __do_hyp_init (mcr p15, 4, r0,
>> c1, c0, 0). If you boot the system with maxcpus=4 then init_hyp_mode() will
>> not hang on the A7 cluster. Other than that from my limited testing KVM on
>> A7 works on a usual linux guest. I also tried to only boot the 3rd A7 core
>> to rule out any racing issues, but still the same behaviour applies.
>>
> You lost me on the maxcpus=4 and racing point here. If it works with
> maxcpus=4 then it works with two A7s, right? Why would running with
> maxcpus=3 (ie. one A7) rule out any race condition?

maxcpus=3 would still work (only one A7), but I didn't test it that
way. To make sure and avoid any race conditions from other cpus, I
changed the bootwrapper to explicitly boot only the 3rd A7 (rest of
the cpus being in an infinite loop) and it would still hang. Booting
only the 1st or 2nd A7 core by the same method would work as expected
(no hang in __do_hyp_init_).

Regards.

Alexander Spyridakis April 16, 2013, 6:43 p.m. UTC | #27

[sorry, resending this as my client still insists on misspelling Marc's mail]

On 16 April 2013 20:37, Alexander Spyridakis
<a.spyridakis@virtualopensystems.com> wrote:
> On 16 April 2013 17:59, Christoffer Dall <cdall@cs.columbia.edu> wrote:
>> On Mon, Apr 15, 2013 at 2:52 AM, Alexander Spyridakis
>> <a.spyridakis@virtualopensystems.com> wrote:
>>> I've run on this problem before, while trying to run KVM guests on A7 cores.
>>>
>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the instruction
>>> that updates HSCTLR between the two isbs on __do_hyp_init (mcr p15, 4, r0,
>>> c1, c0, 0). If you boot the system with maxcpus=4 then init_hyp_mode() will
>>> not hang on the A7 cluster. Other than that from my limited testing KVM on
>>> A7 works on a usual linux guest. I also tried to only boot the 3rd A7 core
>>> to rule out any racing issues, but still the same behaviour applies.
>>>
>> You lost me on the maxcpus=4 and racing point here. If it works with
>> maxcpus=4 then it works with two A7s, right? Why would running with
>> maxcpus=3 (ie. one A7) rule out any race condition?
>
> maxcpus=3 would still work (only one A7), but I didn't test it that
> way. To make sure and avoid any race conditions from other cpus, I
> changed the bootwrapper to explicitly boot only the 3rd A7 (rest of
> the cpus being in an infinite loop) and it would still hang. Booting
> only the 1st or 2nd A7 core by the same method would work as expected
> (no hang in __do_hyp_init_).
>
> Regards.

Christoffer Dall April 16, 2013, 11:13 p.m. UTC | #28

On Tue, Apr 16, 2013 at 11:37 AM, Alexander Spyridakis
<a.spyridakis@virtualopensystems.com> wrote:
> On 16 April 2013 17:59, Christoffer Dall <cdall@cs.columbia.edu> wrote:
>> On Mon, Apr 15, 2013 at 2:52 AM, Alexander Spyridakis
>> <a.spyridakis@virtualopensystems.com> wrote:
>>> I've run on this problem before, while trying to run KVM guests on A7 cores.
>>>
>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the instruction
>>> that updates HSCTLR between the two isbs on __do_hyp_init (mcr p15, 4, r0,
>>> c1, c0, 0). If you boot the system with maxcpus=4 then init_hyp_mode() will
>>> not hang on the A7 cluster. Other than that from my limited testing KVM on
>>> A7 works on a usual linux guest. I also tried to only boot the 3rd A7 core
>>> to rule out any racing issues, but still the same behaviour applies.
>>>
>> You lost me on the maxcpus=4 and racing point here. If it works with
>> maxcpus=4 then it works with two A7s, right? Why would running with
>> maxcpus=3 (ie. one A7) rule out any race condition?
>
> maxcpus=3 would still work (only one A7), but I didn't test it that
> way. To make sure and avoid any race conditions from other cpus, I
> changed the bootwrapper to explicitly boot only the 3rd A7 (rest of
> the cpus being in an infinite loop) and it would still hang. Booting
> only the 1st or 2nd A7 core by the same method would work as expected
> (no hang in __do_hyp_init_).
>
so work as expected, as in it doesn't work... Hence the confusion.

I don't think this is racy in any way, but for the record, you still
have a couple of other A15s running here, so you can't completely rule
out races based on your experimental approach, but I really don't
think that's the issue anyhow.

Andre Przywara April 17, 2013, 8:01 a.m. UTC | #29

On 04/16/2013 06:24 PM, Christoffer Dall wrote:
> On Mon, Apr 15, 2013 at 5:26 PM, Geoff Levand <geoff@infradead.org> wrote:
>> On Sun, 2013-04-14 at 21:57 -0700, Christoffer Dall wrote:
>>> On Fri, Apr 12, 2013 at 6:24 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>> something is severely broken, and you definitely want to fix that issue,
>>>> instead of blindly ignoring it.
>>>>
>>>> Additionally, it seems you're just papering over the issue. You should
>>>> be able to exclude the A7 processors, but not completely deny KVM from
>>>> running on the hardware.
>>>>
>>> Marc, I disagree with this nak. If the current kernel breaks boot on a
>>> Big.Little system, we need to take care of that first, and the
>>> proposed patch is a quick way to do so, and it does not stand in the
>>> way of introducing proper Big.Little support in any way, which I'm
>>> sure is going to open up a lot of other interesting questions.
>>>
>>> I'm going to take this one.
>>
>> Since this problem will cause the 3.9 kernel to hang then a workaround
>> like this should go in.  There isn't enough time to do a proper fix for
>> 3.9, and even if it could be done I think it would be too intrusive to
>> get merged this late.
>>
> That's why I was inclined to take the patch, but as Marc pointed out
> the error message is incorrect, so that should be fixed at the very
> least. Also I don't think we need the counting logic, just bail out if
> we have any CPUs that are not supported.

The counting logic is just to prevent an erroneous hint message. If 
Cluster 0 is A7, the first CPU checked will fail, the counter is 0 and 
KVM outputs the "Target CPU not supported!" message.
But if at least one CPU already passed the test, we have a) a b.L system 
and b) restricting the number of CPUs with maxcpus= would help.
Thus the hint at this point.

But I am OK with removing both the hint message and the counting 
altogether, if you want to have an easier patch.

Regards,
Andre.

> Marc, since you're the strongest opponent of this patch, are you still
> opposed to making sure we don't try to run KVM on Big.Little until
> support is properly introduced?
>
> I also cannot see how we can fix the affinity issue easily from within
> the kernel, do you have a concrete approach in mind you can share?
>
> Note that this is irrespective of the boot delay issue that you guys
> are observing.
>
> -Christoffer
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
>

Andre Przywara April 17, 2013, 8:08 a.m. UTC | #30

On 04/16/2013 06:33 PM, Marc Zyngier wrote:
> On Tue, 16 Apr 2013 09:26:26 -0700, Christoffer Dall
> <cdall@cs.columbia.edu> wrote:
>> On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon <will.deacon@arm.com>
> wrote:
>>> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>>>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>>>>> I've run on this problem before, while trying to run KVM guests on
> A7
>>>>> cores.
>>>>>
>>>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>>>>> instruction that updates HSCTLR between the two isbs on
> __do_hyp_init
>>>>> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4
>>>>> then
>>>>> init_hyp_mode() will not hang on the A7 cluster. Other than that
> from
>>>>> my
>>>>> limited testing KVM on A7 works on a usual linux guest. I also tried
>>>>> to
>>>>> only boot the 3rd A7 core to rule out any racing issues, but still
> the
>>>>> same behaviour applies.
>>>>
>>>> Could well be the same issue here. I chased it down till CPU 2 goes
> into
>>>> HYP mode to do the initialization.
>>>> I am running with maxcpus=3 (this increases the likelyhood that
>>>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>>>> As the HYP mode exception table is empty except for the HVC trap, it
> may
>>>> be looping here. I am trying now to get the PC of the faulty
>>>> instruction.
>>>
>>> Yes, it sounds like you're taking a recursive fault because the vectors
>>> aren't installed yet. Is there any chance you can find out what value
>>> you end
>>> up writing (or trying to write) to the HSCTLR please?
>>>
>> Actually I'm a little confused, wasn't Andre seeing a halt on an A15
>> cpu, not an A7? Or is the theory that an A7 locks up and the calling
>> A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the A7 to
>> complete?
>
> Yes, A15 hanging, not A7. That's why I'm strongly opposed to this patch.
> I'm pretty sure the A7s only have a side effect that triggers a kernel bug
> on the A15 side. Before taking *any* patch around this, we should
> understand the issue fully, and not start patching random stuff just
> because Linus is going to tag 3.9.

I think there is a misunderstanding. The RCU watchdog was complaining 
because the A15 wasn't making any progress. As Christoffer said, this is 
because it was waiting for CPU 2 to return from the SMP call. It is 
actually the A7 hanging inside HYP mode.
I tried some ways to get information out of there, but had no luck so 
far. The different mapping between HYP and SVC doesn't make it easy to 
dump some variables, but I am still working on it (but only half steam 
because I am home looking after my sick daughter). So for now I assume 
that it is the HSCTLR setting Alexander observed already.

Regards,
Andre.

Marc Zyngier April 17, 2013, 8:16 a.m. UTC | #31

On Wed, 17 Apr 2013 10:08:12 +0200, Andre Przywara
<andre.przywara@linaro.org> wrote:
> On 04/16/2013 06:33 PM, Marc Zyngier wrote:
>> On Tue, 16 Apr 2013 09:26:26 -0700, Christoffer Dall
>> <cdall@cs.columbia.edu> wrote:
>>> On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon <will.deacon@arm.com>
>> wrote:
>>>> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>>>>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>>>>>> I've run on this problem before, while trying to run KVM guests on
>> A7
>>>>>> cores.
>>>>>>
>>>>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>>>>>> instruction that updates HSCTLR between the two isbs on
>> __do_hyp_init
>>>>>> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with maxcpus=4
>>>>>> then
>>>>>> init_hyp_mode() will not hang on the A7 cluster. Other than that
>> from
>>>>>> my
>>>>>> limited testing KVM on A7 works on a usual linux guest. I also
tried
>>>>>> to
>>>>>> only boot the 3rd A7 core to rule out any racing issues, but still
>> the
>>>>>> same behaviour applies.
>>>>>
>>>>> Could well be the same issue here. I chased it down till CPU 2 goes
>> into
>>>>> HYP mode to do the initialization.
>>>>> I am running with maxcpus=3 (this increases the likelyhood that
>>>>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>>>>> As the HYP mode exception table is empty except for the HVC trap, it
>> may
>>>>> be looping here. I am trying now to get the PC of the faulty
>>>>> instruction.
>>>>
>>>> Yes, it sounds like you're taking a recursive fault because the
vectors
>>>> aren't installed yet. Is there any chance you can find out what value
>>>> you end
>>>> up writing (or trying to write) to the HSCTLR please?
>>>>
>>> Actually I'm a little confused, wasn't Andre seeing a halt on an A15
>>> cpu, not an A7? Or is the theory that an A7 locks up and the calling
>>> A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the A7 to
>>> complete?
>>
>> Yes, A15 hanging, not A7. That's why I'm strongly opposed to this
patch.
>> I'm pretty sure the A7s only have a side effect that triggers a kernel
>> bug
>> on the A15 side. Before taking *any* patch around this, we should
>> understand the issue fully, and not start patching random stuff just
>> because Linus is going to tag 3.9.
> 
> I think there is a misunderstanding. The RCU watchdog was complaining 
> because the A15 wasn't making any progress. As Christoffer said, this is

> because it was waiting for CPU 2 to return from the SMP call. It is 
> actually the A7 hanging inside HYP mode.
> I tried some ways to get information out of there, but had no luck so 
> far. The different mapping between HYP and SVC doesn't make it easy to 
> dump some variables, but I am still working on it (but only half steam 

You could force a full mapping of the kernel text in HYP. Ugly, but should
work.

> because I am home looking after my sick daughter). So for now I assume 
> that it is the HSCTLR setting Alexander observed already.

I'll give it a go today or tomorrow, depending how quickly I can get rid
of my backlog after a couple of days off work.

Assuming this is an A7 handing on HSCTLR access, it should be pretty easy
to narrow down by booting only on the A7s, leaving the A15s held in reset.

        M.

Russell King - ARM Linux April 17, 2013, 10:19 a.m. UTC | #32

On Fri, Apr 12, 2013 at 02:49:43PM +0100, Marc Zyngier wrote:
> On 12/04/13 14:40, Peter Maydell wrote:
> > On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> Nak. The fact that one of the CPUs seem to hang is a sure sign that
> >> something is severely broken, and you definitely want to fix that issue,
> >> instead of blindly ignoring it.
> >>
> >> Additionally, it seems you're just papering over the issue. You should
> >> be able to exclude the A7 processors, but not completely deny KVM from
> >> running on the hardware.
> > 
> > Well that might be nice, as would fully supporting big.LITTLE
> > systems. But until somebody actually does that work it seems
> > like a better idea to fail gracefully rather than having a 50%
> > chance of failing gracefully and a 50% chance of going weird.
> 
> Nothing prevents the kernel (or even the user) from forcing the affinity
> of the CPU threads to the A15s. I'm not saying we should ignore the
> problem either. Just that the proposed approach is wrong.

But nothing guarantees that you get that affinity.  If you offline all
A15 CPUs, then you will find those threads running on whatever is left.
Affinity is just a hint, nothing more.

Marc Zyngier April 17, 2013, 10:35 a.m. UTC | #33

On 17/04/13 11:19, Russell King - ARM Linux wrote:
> On Fri, Apr 12, 2013 at 02:49:43PM +0100, Marc Zyngier wrote:
>> On 12/04/13 14:40, Peter Maydell wrote:
>>> On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>> something is severely broken, and you definitely want to fix that issue,
>>>> instead of blindly ignoring it.
>>>>
>>>> Additionally, it seems you're just papering over the issue. You should
>>>> be able to exclude the A7 processors, but not completely deny KVM from
>>>> running on the hardware.
>>>
>>> Well that might be nice, as would fully supporting big.LITTLE
>>> systems. But until somebody actually does that work it seems
>>> like a better idea to fail gracefully rather than having a 50%
>>> chance of failing gracefully and a 50% chance of going weird.
>>
>> Nothing prevents the kernel (or even the user) from forcing the affinity
>> of the CPU threads to the A15s. I'm not saying we should ignore the
>> problem either. Just that the proposed approach is wrong.
> 
> But nothing guarantees that you get that affinity.  If you offline all
> A15 CPUs, then you will find those threads running on whatever is left.
> Affinity is just a hint, nothing more.

I completely agree with you. But if we're left with only CPUs we can't
run on, we're screwed and must abort.

It's the same story as the RealView PB-X, where only one of the two A9
has NEON. If the NEON-capable core is down, any process using NEON is
virtually dead.

Should that be a reason to completely disable the HW (in this case the 3
A7s)? I'm not sure...

	M.

Christoffer Dall April 17, 2013, 11:07 a.m. UTC | #34

On Wed, Apr 17, 2013 at 3:35 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/04/13 11:19, Russell King - ARM Linux wrote:
>> On Fri, Apr 12, 2013 at 02:49:43PM +0100, Marc Zyngier wrote:
>>> On 12/04/13 14:40, Peter Maydell wrote:
>>>> On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>>> something is severely broken, and you definitely want to fix that issue,
>>>>> instead of blindly ignoring it.
>>>>>
>>>>> Additionally, it seems you're just papering over the issue. You should
>>>>> be able to exclude the A7 processors, but not completely deny KVM from
>>>>> running on the hardware.
>>>>
>>>> Well that might be nice, as would fully supporting big.LITTLE
>>>> systems. But until somebody actually does that work it seems
>>>> like a better idea to fail gracefully rather than having a 50%
>>>> chance of failing gracefully and a 50% chance of going weird.
>>>
>>> Nothing prevents the kernel (or even the user) from forcing the affinity
>>> of the CPU threads to the A15s. I'm not saying we should ignore the
>>> problem either. Just that the proposed approach is wrong.
>>
>> But nothing guarantees that you get that affinity.  If you offline all
>> A15 CPUs, then you will find those threads running on whatever is left.
>> Affinity is just a hint, nothing more.
>
> I completely agree with you. But if we're left with only CPUs we can't
> run on, we're screwed and must abort.
>
> It's the same story as the RealView PB-X, where only one of the two A9
> has NEON. If the NEON-capable core is down, any process using NEON is
> virtually dead.
>
> Should that be a reason to completely disable the HW (in this case the 3
> A7s)? I'm not sure...
>
But we're not talking about disabling the A7's, we're talking about
disabling KVM/ARM, a quite new feature, on a system where it's not
well-tested and may cause boot problems or other issues that we
haven't investigated in depth yet.

Marc Zyngier April 17, 2013, 11:30 a.m. UTC | #35

On 17/04/13 12:07, Christoffer Dall wrote:
> On Wed, Apr 17, 2013 at 3:35 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/04/13 11:19, Russell King - ARM Linux wrote:
>>> On Fri, Apr 12, 2013 at 02:49:43PM +0100, Marc Zyngier wrote:
>>>> On 12/04/13 14:40, Peter Maydell wrote:
>>>>> On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>>>> something is severely broken, and you definitely want to fix that issue,
>>>>>> instead of blindly ignoring it.
>>>>>>
>>>>>> Additionally, it seems you're just papering over the issue. You should
>>>>>> be able to exclude the A7 processors, but not completely deny KVM from
>>>>>> running on the hardware.
>>>>>
>>>>> Well that might be nice, as would fully supporting big.LITTLE
>>>>> systems. But until somebody actually does that work it seems
>>>>> like a better idea to fail gracefully rather than having a 50%
>>>>> chance of failing gracefully and a 50% chance of going weird.
>>>>
>>>> Nothing prevents the kernel (or even the user) from forcing the affinity
>>>> of the CPU threads to the A15s. I'm not saying we should ignore the
>>>> problem either. Just that the proposed approach is wrong.
>>>
>>> But nothing guarantees that you get that affinity.  If you offline all
>>> A15 CPUs, then you will find those threads running on whatever is left.
>>> Affinity is just a hint, nothing more.
>>
>> I completely agree with you. But if we're left with only CPUs we can't
>> run on, we're screwed and must abort.
>>
>> It's the same story as the RealView PB-X, where only one of the two A9
>> has NEON. If the NEON-capable core is down, any process using NEON is
>> virtually dead.
>>
>> Should that be a reason to completely disable the HW (in this case the 3
>> A7s)? I'm not sure...
>>
> But we're not talking about disabling the A7's, we're talking about
> disabling KVM/ARM, a quite new feature, on a system where it's not
> well-tested and may cause boot problems or other issues that we
> haven't investigated in depth yet.

Well, it's a choice between disabling KVM or the A7s. Cheese or dessert?
Black death or cholera?

And I go back to my earlier argument: we don't know what's going wrong.
It could be a bug in our init code, it could be the bootloader that
fails to correctly initialize the A7s in HYP mode, it could be something
else.

By disabling the problematic configuration, we just bury our head in the
sand and pretend everything is fine. Yes, this is a new feature. But by
saying "not a supported configuration", while the code *should* support
it, we're only fooling ourselves.

	M.

Peter Maydell April 17, 2013, 11:38 a.m. UTC | #36

On 17 April 2013 12:30, Marc Zyngier <marc.zyngier@arm.com> wrote:
> By disabling the problematic configuration, we just bury our head in the
> sand and pretend everything is fine. Yes, this is a new feature. But by
> saying "not a supported configuration", while the code *should* support
> it, we're only fooling ourselves.

Who says the code should support running on big.LITTLE? I think
it's very clear that the only thing it should support at the
moment is A15 guest CPUs on an all-A15 host CPU system.

-- PMM

Christoffer Dall April 17, 2013, 11:38 a.m. UTC | #37

On Wed, Apr 17, 2013 at 4:30 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/04/13 12:07, Christoffer Dall wrote:
>> On Wed, Apr 17, 2013 at 3:35 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 17/04/13 11:19, Russell King - ARM Linux wrote:
>>>> On Fri, Apr 12, 2013 at 02:49:43PM +0100, Marc Zyngier wrote:
>>>>> On 12/04/13 14:40, Peter Maydell wrote:
>>>>>> On 12 April 2013 14:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>>> Nak. The fact that one of the CPUs seem to hang is a sure sign that
>>>>>>> something is severely broken, and you definitely want to fix that issue,
>>>>>>> instead of blindly ignoring it.
>>>>>>>
>>>>>>> Additionally, it seems you're just papering over the issue. You should
>>>>>>> be able to exclude the A7 processors, but not completely deny KVM from
>>>>>>> running on the hardware.
>>>>>>
>>>>>> Well that might be nice, as would fully supporting big.LITTLE
>>>>>> systems. But until somebody actually does that work it seems
>>>>>> like a better idea to fail gracefully rather than having a 50%
>>>>>> chance of failing gracefully and a 50% chance of going weird.
>>>>>
>>>>> Nothing prevents the kernel (or even the user) from forcing the affinity
>>>>> of the CPU threads to the A15s. I'm not saying we should ignore the
>>>>> problem either. Just that the proposed approach is wrong.
>>>>
>>>> But nothing guarantees that you get that affinity.  If you offline all
>>>> A15 CPUs, then you will find those threads running on whatever is left.
>>>> Affinity is just a hint, nothing more.
>>>
>>> I completely agree with you. But if we're left with only CPUs we can't
>>> run on, we're screwed and must abort.
>>>
>>> It's the same story as the RealView PB-X, where only one of the two A9
>>> has NEON. If the NEON-capable core is down, any process using NEON is
>>> virtually dead.
>>>
>>> Should that be a reason to completely disable the HW (in this case the 3
>>> A7s)? I'm not sure...
>>>
>> But we're not talking about disabling the A7's, we're talking about
>> disabling KVM/ARM, a quite new feature, on a system where it's not
>> well-tested and may cause boot problems or other issues that we
>> haven't investigated in depth yet.
>
> Well, it's a choice between disabling KVM or the A7s. Cheese or dessert?
> Black death or cholera?
>
> And I go back to my earlier argument: we don't know what's going wrong.
> It could be a bug in our init code, it could be the bootloader that
> fails to correctly initialize the A7s in HYP mode, it could be something
> else.
>
> By disabling the problematic configuration, we just bury our head in the
> sand and pretend everything is fine. Yes, this is a new feature. But by
> saying "not a supported configuration", while the code *should* support
> it, we're only fooling ourselves.
>
I agree completely that we shouldn't hide some unknown bug by taking a
wild guess, I'm obviously not arguing for this.

However, we didn't verify that things should work on Big.Little, and
we don't have any code to support A7's. We've written the code
specifically with the idea that we emulate the underlying hardware and
we expect that the target_cpu function will return something valid,
which it does not for an A7, so no, it there is no expectation that
things should work on Big.Little or an A7. Again, I stress that this
is an orthogonal point to the specific issue that Andre/Alex are
seeing on Big.Little.

Alexander Graf April 17, 2013, 11:42 a.m. UTC | #38

On 17.04.2013, at 13:38, Peter Maydell wrote:

> On 17 April 2013 12:30, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> By disabling the problematic configuration, we just bury our head in the
>> sand and pretend everything is fine. Yes, this is a new feature. But by
>> saying "not a supported configuration", while the code *should* support
>> it, we're only fooling ourselves.
> 
> Who says the code should support running on big.LITTLE? I think
> it's very clear that the only thing it should support at the
> moment is A15 guest CPUs on an all-A15 host CPU system.

Is anyone going to run this code on big.LITTLE systems before 3.10 is out?

Just fix it for real with 3.10 and everyone's happy. If need be, you can always CC the fixes to stable to have them in 3.9.

Alex

Christoffer Dall April 17, 2013, 11:45 a.m. UTC | #39

On Wed, Apr 17, 2013 at 4:42 AM, Alexander Graf <agraf@suse.de> wrote:
>
> On 17.04.2013, at 13:38, Peter Maydell wrote:
>
>> On 17 April 2013 12:30, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> By disabling the problematic configuration, we just bury our head in the
>>> sand and pretend everything is fine. Yes, this is a new feature. But by
>>> saying "not a supported configuration", while the code *should* support
>>> it, we're only fooling ourselves.
>>
>> Who says the code should support running on big.LITTLE? I think
>> it's very clear that the only thing it should support at the
>> moment is A15 guest CPUs on an all-A15 host CPU system.
>
> Is anyone going to run this code on big.LITTLE systems before 3.10 is out?
>

Apparently yes, otherwise we would not be having this discussion in
the first place.

> Just fix it for real with 3.10 and everyone's happy. If need be, you can always CC the fixes to stable to have them in 3.9.
>
>
That's really an added feature set, which I don't think is appropriate
for stable. Currently we *do not support big.little* but we pretend
that we do, and even worse, we pretend that we can emulate an A15 on
an A7 for example, which we obviously cannot, and nobody looked at
what kind of damage that can cause. Therefore, the "fix" for the
current supported features is to bail out on something as funky as a
big.little, and I honestly don't see what the big problem with that
is. And, such a fix is what should go in stable.

Marc Zyngier April 17, 2013, 12:24 p.m. UTC | #40

On 17/04/13 12:38, Peter Maydell wrote:
> On 17 April 2013 12:30, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> By disabling the problematic configuration, we just bury our head in the
>> sand and pretend everything is fine. Yes, this is a new feature. But by
>> saying "not a supported configuration", while the code *should* support
>> it, we're only fooling ourselves.
> 
> Who says the code should support running on big.LITTLE?

I do. The host code definitely should. If it doesn't, then lets fix that
bug, as it probably means the support is also broken on A15.

As I said in another email, guest support is a separate issue.

	M.

Peter Maydell April 17, 2013, 12:25 p.m. UTC | #41

On 17 April 2013 13:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/04/13 12:38, Peter Maydell wrote
>> Who says the code should support running on big.LITTLE?
>
> I do. The host code definitely should. If it doesn't, then lets fix that
> bug, as it probably means the support is also broken on A15.

Last time I looked we weren't intercepting ID and feature
registers. That's broken if the host isn't A15.

-- PMM

Marc Zyngier April 17, 2013, 12:28 p.m. UTC | #42

On 17/04/13 13:25, Peter Maydell wrote:
> On 17 April 2013 13:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/04/13 12:38, Peter Maydell wrote
>>> Who says the code should support running on big.LITTLE?
>>
>> I do. The host code definitely should. If it doesn't, then lets fix that
>> bug, as it probably means the support is also broken on A15.
> 
> Last time I looked we weren't intercepting ID and feature
> registers. That's broken if the host isn't A15.

Would you please look at the sentence that follows the one you've
quoted? It says "guest support is a separate issue". As in "orthogonal
problem".

	M.

Peter Maydell April 17, 2013, 12:38 p.m. UTC | #43

On 17 April 2013 13:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/04/13 13:25, Peter Maydell wrote:
>> On 17 April 2013 13:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 17/04/13 12:38, Peter Maydell wrote
>>>> Who says the code should support running on big.LITTLE?
>>>
>>> I do. The host code definitely should. If it doesn't, then lets fix that
>>> bug, as it probably means the support is also broken on A15.
>>
>> Last time I looked we weren't intercepting ID and feature
>> registers. That's broken if the host isn't A15.
>
> Would you please look at the sentence that follows the one you've
> quoted? It says "guest support is a separate issue". As in "orthogonal
> problem".

I'm confused now, because I thought the whole point of this
patch was that we don't support guests on big.LITTLE but
we're not locking them out (the way we would for an all-A7
system). That is, we're fixing a bug in the check code.

Obviously it seems worth investigating exactly what is
happening on the A7s when we try to run the init code,
but it seems to me like that's the orthogonal issue,
because we shouldn't be running that code in the first
place.

-- PMM

Marc Zyngier April 17, 2013, 1 p.m. UTC | #44

On 17/04/13 13:38, Peter Maydell wrote:
> On 17 April 2013 13:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/04/13 13:25, Peter Maydell wrote:
>>> On 17 April 2013 13:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> On 17/04/13 12:38, Peter Maydell wrote
>>>>> Who says the code should support running on big.LITTLE?
>>>>
>>>> I do. The host code definitely should. If it doesn't, then lets fix that
>>>> bug, as it probably means the support is also broken on A15.
>>>
>>> Last time I looked we weren't intercepting ID and feature
>>> registers. That's broken if the host isn't A15.
>>
>> Would you please look at the sentence that follows the one you've
>> quoted? It says "guest support is a separate issue". As in "orthogonal
>> problem".
> 
> I'm confused now, because I thought the whole point of this
> patch was that we don't support guests on big.LITTLE but
> we're not locking them out (the way we would for an all-A7
> system). That is, we're fixing a bug in the check code.
> 
> Obviously it seems worth investigating exactly what is
> happening on the A7s when we try to run the init code,
> but it seems to me like that's the orthogonal issue,
> because we shouldn't be running that code in the first
> place.

Turns out that init code was designed to run unchanged on both A15 and
A7. If that code isn't properly working on a HYP-aware CPU, then we're
in trouble.

I don't give a damn about full BL support in KVM at that stage. All I
want is a clear description of *why* that code is failing, and possibly
a way to run guests on the A15 side of a BL platform, without having to
give away half of the cores.

I think I've already spent too much time arguing about this. Hopefully
someone will be interested enough in BL to revert this patch and fix the
code (whether it is in the kernel or bootloader).

	M.

Christoffer Dall April 17, 2013, 7:28 p.m. UTC | #45

On Wed, Apr 17, 2013 at 6:00 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/04/13 13:38, Peter Maydell wrote:
>> On 17 April 2013 13:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 17/04/13 13:25, Peter Maydell wrote:
>>>> On 17 April 2013 13:24, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>> On 17/04/13 12:38, Peter Maydell wrote
>>>>>> Who says the code should support running on big.LITTLE?
>>>>>
>>>>> I do. The host code definitely should. If it doesn't, then lets fix that
>>>>> bug, as it probably means the support is also broken on A15.
>>>>
>>>> Last time I looked we weren't intercepting ID and feature
>>>> registers. That's broken if the host isn't A15.
>>>
>>> Would you please look at the sentence that follows the one you've
>>> quoted? It says "guest support is a separate issue". As in "orthogonal
>>> problem".
>>
>> I'm confused now, because I thought the whole point of this
>> patch was that we don't support guests on big.LITTLE but
>> we're not locking them out (the way we would for an all-A7
>> system). That is, we're fixing a bug in the check code.
>>
>> Obviously it seems worth investigating exactly what is
>> happening on the A7s when we try to run the init code,
>> but it seems to me like that's the orthogonal issue,
>> because we shouldn't be running that code in the first
>> place.
>
> Turns out that init code was designed to run unchanged on both A15 and
> A7. If that code isn't properly working on a HYP-aware CPU, then we're
> in trouble.
>

yes yes, we agree here, but it's a separate issue.

> I don't give a damn about full BL support in KVM at that stage. All I
> want is a clear description of *why* that code is failing, and possibly
> a way to run guests on the A15 side of a BL platform, without having to
> give away half of the cores.
>
> I think I've already spent too much time arguing about this. Hopefully
> someone will be interested enough in BL to revert this patch and fix the
> code (whether it is in the kernel or bootloader).
>

Currently we only support A15s, and that's explicit.

You're arguing that we should pretend like we support other systems as
well, should we then also just remove the check all together and let
KVM run on a krait, on an A42 [1], and on foo, and hope for the best?
Or should we be nice and tell the rest of the system, sorry this is
not supported yet? I think the latter.

Seriously, I don't think it sounds like a quick fix to make the code
run only on A15s on BL, but if you do, please send the patch, and I'll
be happy to review and merge that code.

-Christoffer

[1]: Made this up.

Andre Przywara April 19, 2013, 12:58 p.m. UTC | #46

On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>
>
>
> On Wed, Apr 17, 2013 at 1:16 AM, Marc Zyngier <marc.zyngier@arm.com
> <mailto:marc.zyngier@arm.com>> wrote:
>
>     On Wed, 17 Apr 2013 10:08:12 +0200, Andre Przywara
>     <andre.przywara@linaro.org <mailto:andre.przywara@linaro.org>> wrote:
>      > On 04/16/2013 06:33 PM, Marc Zyngier wrote:
>      >> On Tue, 16 Apr 2013 09:26:26 -0700, Christoffer Dall
>      >> <cdall@cs.columbia.edu <mailto:cdall@cs.columbia.edu>> wrote:
>      >>> On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon
>     <will.deacon@arm.com <mailto:will.deacon@arm.com>>
>      >> wrote:
>      >>>> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>      >>>>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>      >>>>>> I've run on this problem before, while trying to run KVM
>     guests on
>      >> A7
>      >>>>>> cores.
>      >>>>>>
>      >>>>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on the
>      >>>>>> instruction that updates HSCTLR between the two isbs on
>      >> __do_hyp_init
>      >>>>>> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with
>     maxcpus=4
>      >>>>>> then
>      >>>>>> init_hyp_mode() will not hang on the A7 cluster. Other than that
>      >> from
>      >>>>>> my
>      >>>>>> limited testing KVM on A7 works on a usual linux guest. I also
>     tried
>      >>>>>> to
>      >>>>>> only boot the 3rd A7 core to rule out any racing issues, but
>     still
>      >> the
>      >>>>>> same behaviour applies.
>      >>>>>
>      >>>>> Could well be the same issue here. I chased it down till CPU
>     2 goes
>      >> into
>      >>>>> HYP mode to do the initialization.
>      >>>>> I am running with maxcpus=3 (this increases the likelyhood that
>      >>>>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>      >>>>> As the HYP mode exception table is empty except for the HVC
>     trap, it
>      >> may
>      >>>>> be looping here. I am trying now to get the PC of the faulty
>      >>>>> instruction.
>      >>>>
>      >>>> Yes, it sounds like you're taking a recursive fault because the
>     vectors
>      >>>> aren't installed yet. Is there any chance you can find out
>     what value
>      >>>> you end
>      >>>> up writing (or trying to write) to the HSCTLR please?
>      >>>>
>      >>> Actually I'm a little confused, wasn't Andre seeing a halt on
>     an A15
>      >>> cpu, not an A7? Or is the theory that an A7 locks up and the
>     calling
>      >>> A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the
>     A7 to
>      >>> complete?
>      >>
>      >> Yes, A15 hanging, not A7. That's why I'm strongly opposed to this
>     patch.
>      >> I'm pretty sure the A7s only have a side effect that triggers a
>     kernel
>      >> bug
>      >> on the A15 side. Before taking *any* patch around this, we should
>      >> understand the issue fully, and not start patching random stuff just
>      >> because Linus is going to tag 3.9.
>      >
>      > I think there is a misunderstanding. The RCU watchdog was complaining
>      > because the A15 wasn't making any progress. As Christoffer said,
>     this is
>
>      > because it was waiting for CPU 2 to return from the SMP call. It is
>      > actually the A7 hanging inside HYP mode.
>      > I tried some ways to get information out of there, but had no luck so
>      > far. The different mapping between HYP and SVC doesn't make it
>     easy to
>      > dump some variables, but I am still working on it (but only half
>     steam
>
>     You could force a full mapping of the kernel text in HYP. Ugly, but
>     should
>     work.
>
>      > because I am home looking after my sick daughter). So for now I
>     assume
>      > that it is the HSCTLR setting Alexander observed already.
>
>     I'll give it a go today or tomorrow, depending how quickly I can get rid
>     of my backlog after a couple of days off work.
>
>     Assuming this is an A7 handing on HSCTLR access, it should be pretty
>     easy
>     to narrow down by booting only on the A7s, leaving the A15s held in
>     reset.
>
>
> You could also try installing a vector handler early and detect faults,
> and add an alternative return path from the init function with some
> error reporting value in r0 or something like that, just for debugging,
> naturally, but that could be a way to detect if we really are taking
> recursive faults here.

OK, I added code to return earlier on CPUs not from cluster 0.
Indeed it hangs in the HSCR write. The two A15s pass this instruction, 
writing 0x30c5187F into the register.
This means all the fixed bits for A15 correctly, C,A,M and I set and 
WXN, EE, TE cleared. FI was also cleared
The A7 wanted to write the very same value. I tried to set bit 21, which 
kind of the A7 TRM hints to do: but no change.
Before the HSCLTR write, the register reads 0x30c50878, with SCTLR being 
0x30c5387d.
So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR 
has the V bits set, could that be an issue?

Regards,
Andre.

Christoffer Dall April 19, 2013, 4:13 p.m. UTC | #47

On Fri, Apr 19, 2013 at 5:58 AM, Andre Przywara
<andre.przywara@linaro.org> wrote:
> On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>>
>>
>>
>>
>> On Wed, Apr 17, 2013 at 1:16 AM, Marc Zyngier <marc.zyngier@arm.com
>> <mailto:marc.zyngier@arm.com>> wrote:
>>
>>     On Wed, 17 Apr 2013 10:08:12 +0200, Andre Przywara
>>     <andre.przywara@linaro.org <mailto:andre.przywara@linaro.org>> wrote:
>>      > On 04/16/2013 06:33 PM, Marc Zyngier wrote:
>>      >> On Tue, 16 Apr 2013 09:26:26 -0700, Christoffer Dall
>>      >> <cdall@cs.columbia.edu <mailto:cdall@cs.columbia.edu>> wrote:
>>      >>> On Mon, Apr 15, 2013 at 6:48 AM, Will Deacon
>>     <will.deacon@arm.com <mailto:will.deacon@arm.com>>
>>
>>      >> wrote:
>>      >>>> On Mon, Apr 15, 2013 at 02:13:55PM +0100, Andre Przywara wrote:
>>      >>>>> On 04/15/2013 11:52 AM, Alexander Spyridakis wrote:
>>      >>>>>> I've run on this problem before, while trying to run KVM
>>     guests on
>>      >> A7
>>      >>>>>> cores.
>>      >>>>>>
>>      >>>>>> For some reason the 3rd A7 hangs in arch/arm/kvm/init.S, on
>> the
>>      >>>>>> instruction that updates HSCTLR between the two isbs on
>>      >> __do_hyp_init
>>      >>>>>> (mcr p15, 4, r0, c1, c0, 0). If you boot the system with
>>     maxcpus=4
>>      >>>>>> then
>>      >>>>>> init_hyp_mode() will not hang on the A7 cluster. Other than
>> that
>>      >> from
>>      >>>>>> my
>>      >>>>>> limited testing KVM on A7 works on a usual linux guest. I also
>>     tried
>>      >>>>>> to
>>      >>>>>> only boot the 3rd A7 core to rule out any racing issues, but
>>     still
>>      >> the
>>      >>>>>> same behaviour applies.
>>      >>>>>
>>      >>>>> Could well be the same issue here. I chased it down till CPU
>>     2 goes
>>      >> into
>>      >>>>> HYP mode to do the initialization.
>>      >>>>> I am running with maxcpus=3 (this increases the likelyhood that
>>      >>>>> kvm_target_cpu() runs on an A15), so CPU #2 is the only one A7.
>>      >>>>> As the HYP mode exception table is empty except for the HVC
>>     trap, it
>>      >> may
>>      >>>>> be looping here. I am trying now to get the PC of the faulty
>>      >>>>> instruction.
>>      >>>>
>>      >>>> Yes, it sounds like you're taking a recursive fault because the
>>     vectors
>>      >>>> aren't installed yet. Is there any chance you can find out
>>     what value
>>      >>>> you end
>>      >>>> up writing (or trying to write) to the HSCTLR please?
>>      >>>>
>>      >>> Actually I'm a little confused, wasn't Andre seeing a halt on
>>     an A15
>>      >>> cpu, not an A7? Or is the theory that an A7 locks up and the
>>     calling
>>      >>> A15 hangs on the SMP call to cpu_init_hyp_mode, waiting for the
>>     A7 to
>>      >>> complete?
>>      >>
>>      >> Yes, A15 hanging, not A7. That's why I'm strongly opposed to this
>>     patch.
>>      >> I'm pretty sure the A7s only have a side effect that triggers a
>>     kernel
>>      >> bug
>>      >> on the A15 side. Before taking *any* patch around this, we should
>>      >> understand the issue fully, and not start patching random stuff
>> just
>>      >> because Linus is going to tag 3.9.
>>      >
>>      > I think there is a misunderstanding. The RCU watchdog was
>> complaining
>>      > because the A15 wasn't making any progress. As Christoffer said,
>>     this is
>>
>>      > because it was waiting for CPU 2 to return from the SMP call. It is
>>      > actually the A7 hanging inside HYP mode.
>>      > I tried some ways to get information out of there, but had no luck
>> so
>>      > far. The different mapping between HYP and SVC doesn't make it
>>     easy to
>>      > dump some variables, but I am still working on it (but only half
>>     steam
>>
>>     You could force a full mapping of the kernel text in HYP. Ugly, but
>>     should
>>     work.
>>
>>      > because I am home looking after my sick daughter). So for now I
>>     assume
>>      > that it is the HSCTLR setting Alexander observed already.
>>
>>     I'll give it a go today or tomorrow, depending how quickly I can get
>> rid
>>     of my backlog after a couple of days off work.
>>
>>     Assuming this is an A7 handing on HSCTLR access, it should be pretty
>>     easy
>>     to narrow down by booting only on the A7s, leaving the A15s held in
>>     reset.
>>
>>
>> You could also try installing a vector handler early and detect faults,
>> and add an alternative return path from the init function with some
>> error reporting value in r0 or something like that, just for debugging,
>> naturally, but that could be a way to detect if we really are taking
>> recursive faults here.
>
>
> OK, I added code to return earlier on CPUs not from cluster 0.
> Indeed it hangs in the HSCR write. The two A15s pass this instruction,
> writing 0x30c5187F into the register.
> This means all the fixed bits for A15 correctly, C,A,M and I set and WXN,
> EE, TE cleared. FI was also cleared
> The A7 wanted to write the very same value. I tried to set bit 21, which
> kind of the A7 TRM hints to do: but no change.
> Before the HSCLTR write, the register reads 0x30c50878, with SCTLR being
> 0x30c5387d.
> So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR has
> the V bits set, could that be an issue?
>
Can you try writing 0x30c50879 into the register instead? Basically
check to see if enabling caches or alignment checks causes the issue,
or if it is indeed enabling the MMU that's the issue... If that works,
start a bisect on the remaining bits. Also, just for fun, could you
try flushing the entire I-cache before writing into the HSCLTR?

Why wouldn't the V bit be set in the SCTLR, Linus uses high vectors
(at 0xffff0000) for exception handling on ARM.

-Christoffer

Andre Przywara April 22, 2013, 10:36 a.m. UTC | #48

On 04/19/2013 06:13 PM, Christoffer Dall wrote:
> On Fri, Apr 19, 2013 at 5:58 AM, Andre Przywara
> <andre.przywara@linaro.org> wrote:
>> On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>>>
>>>  ...
>>>
>>> You could also try installing a vector handler early and detect faults,
>>> and add an alternative return path from the init function with some
>>> error reporting value in r0 or something like that, just for debugging,
>>> naturally, but that could be a way to detect if we really are taking
>>> recursive faults here.
>>
>>
>> OK, I added code to return earlier on CPUs not from cluster 0.
>> Indeed it hangs in the HSCR write. The two A15s pass this instruction,
>> writing 0x30c5187F into the register.
>> This means all the fixed bits for A15 correctly, C,A,M and I set and WXN,
>> EE, TE cleared. FI was also cleared
>> The A7 wanted to write the very same value. I tried to set bit 21, which
>> kind of the A7 TRM hints to do: but no change.
>> Before the HSCLTR write, the register reads 0x30c50878, with SCTLR being
>> 0x30c5387d.
>> So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR has
>> the V bits set, could that be an issue?
>>
> Can you try writing 0x30c50879 into the register instead? Basically
> check to see if enabling caches or alignment checks causes the issue,
> or if it is indeed enabling the MMU that's the issue... If that works,
> start a bisect on the remaining bits. Also, just for fun, could you
> try flushing the entire I-cache before writing into the HSCLTR?

OK, both clearing the I-bit and doing an "isb; ICIALLU" before the "isb; 
write HSCLTR; isb" worked, the kernel boots on and KVM is enabled.

I could easily make a patch, but I am not sure how to proceed from here:

1.) At least I don't have an understanding why this is only a problem on 
A7 and not on A15. I would feel better if we have an explanation for 
this. Mark, Will, Peter: any ideas?

2.) We still do not claim to support A7s. ARM_CPU_PART_CORTEX_A15 is the 
only MIDR content not returning -EINVAL. And there is no coproc_a7.c 
yet. From browsing through coproc_a15.c I sense it looks compatible to 
the A7, only handling L2CTLR[25:24] and ACTLR[SMP] explicitly. On pure 
A7 systems this should work, but in mixed (aka. b.L) systems guests 
would see inconsistent input depending on the current scheduling. This 
looks like we need a real solution, which needs some time and work 
(read: proper b.L support).

3.) Even if we would understand this fix and take it, the kernel 
currently still randomly enables or disables KVM - with only A15 
supported and the check routine running on one of the cores only.
And even if we add A7 to the list, I think we still need to iterate 
through all cores to catch unsupported cores in a different b.L 
configuration.

I suppose it is to late for 3.9 in any case right? Or should we try to 
push at least the ICIALLU fix?

> Why wouldn't the V bit be set in the SCTLR, Linus uses high vectors
> (at 0xffff0000) for exception handling on ARM.

Shame on me, didn't know this. I assumed that those high vectors are 
there for legacy reasons only. Good to know and thanks for telling me.

Regards,
Andre.

Marc Zyngier April 22, 2013, 11:02 a.m. UTC | #49

On 22/04/13 11:36, Andre Przywara wrote:
> On 04/19/2013 06:13 PM, Christoffer Dall wrote:
>> On Fri, Apr 19, 2013 at 5:58 AM, Andre Przywara
>> <andre.przywara@linaro.org> wrote:
>>> On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>>>>
>>>>  ...
>>>>
>>>> You could also try installing a vector handler early and detect faults,
>>>> and add an alternative return path from the init function with some
>>>> error reporting value in r0 or something like that, just for debugging,
>>>> naturally, but that could be a way to detect if we really are taking
>>>> recursive faults here.
>>>
>>>
>>> OK, I added code to return earlier on CPUs not from cluster 0.
>>> Indeed it hangs in the HSCR write. The two A15s pass this instruction,
>>> writing 0x30c5187F into the register.
>>> This means all the fixed bits for A15 correctly, C,A,M and I set and WXN,
>>> EE, TE cleared. FI was also cleared
>>> The A7 wanted to write the very same value. I tried to set bit 21, which
>>> kind of the A7 TRM hints to do: but no change.
>>> Before the HSCLTR write, the register reads 0x30c50878, with SCTLR being
>>> 0x30c5387d.
>>> So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR has
>>> the V bits set, could that be an issue?
>>>
>> Can you try writing 0x30c50879 into the register instead? Basically
>> check to see if enabling caches or alignment checks causes the issue,
>> or if it is indeed enabling the MMU that's the issue... If that works,
>> start a bisect on the remaining bits. Also, just for fun, could you
>> try flushing the entire I-cache before writing into the HSCLTR?
> 
> OK, both clearing the I-bit and doing an "isb; ICIALLU" before the "isb; 
> write HSCLTR; isb" worked, the kernel boots on and KVM is enabled.
> 
> I could easily make a patch, but I am not sure how to proceed from here:
> 
> 1.) At least I don't have an understanding why this is only a problem on 
> A7 and not on A15. I would feel better if we have an explanation for 
> this. Mark, Will, Peter: any ideas?

Blink!

Can you check your board.txt config file? It should have a line that
starts with "SCC: 0x400 ..."

If not, can you add a line that reads:
SCC: 0x400 0x00000000

Thanks,

	M.

Marc Zyngier April 22, 2013, 11:14 a.m. UTC | #50

On Mon, 22 Apr 2013 12:02:59 +0100, Marc Zyngier <marc.zyngier@arm.com>
wrote:
> On 22/04/13 11:36, Andre Przywara wrote:
>> On 04/19/2013 06:13 PM, Christoffer Dall wrote:
>>> On Fri, Apr 19, 2013 at 5:58 AM, Andre Przywara
>>> <andre.przywara@linaro.org> wrote:
>>>> On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>>>>>
>>>>>  ...
>>>>>
>>>>> You could also try installing a vector handler early and detect
>>>>> faults,
>>>>> and add an alternative return path from the init function with some
>>>>> error reporting value in r0 or something like that, just for
>>>>> debugging,
>>>>> naturally, but that could be a way to detect if we really are taking
>>>>> recursive faults here.
>>>>
>>>>
>>>> OK, I added code to return earlier on CPUs not from cluster 0.
>>>> Indeed it hangs in the HSCR write. The two A15s pass this
instruction,
>>>> writing 0x30c5187F into the register.
>>>> This means all the fixed bits for A15 correctly, C,A,M and I set and
>>>> WXN,
>>>> EE, TE cleared. FI was also cleared
>>>> The A7 wanted to write the very same value. I tried to set bit 21,
>>>> which
>>>> kind of the A7 TRM hints to do: but no change.
>>>> Before the HSCLTR write, the register reads 0x30c50878, with SCTLR
>>>> being
>>>> 0x30c5387d.
>>>> So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR
>>>> has
>>>> the V bits set, could that be an issue?
>>>>
>>> Can you try writing 0x30c50879 into the register instead? Basically
>>> check to see if enabling caches or alignment checks causes the issue,
>>> or if it is indeed enabling the MMU that's the issue... If that works,
>>> start a bisect on the remaining bits. Also, just for fun, could you
>>> try flushing the entire I-cache before writing into the HSCLTR?
>> 
>> OK, both clearing the I-bit and doing an "isb; ICIALLU" before the
"isb; 
>> write HSCLTR; isb" worked, the kernel boots on and KVM is enabled.
>> 
>> I could easily make a patch, but I am not sure how to proceed from
here:
>> 
>> 1.) At least I don't have an understanding why this is only a problem
on 
>> A7 and not on A15. I would feel better if we have an explanation for 
>> this. Mark, Will, Peter: any ideas?
> 
> Blink!
> 
> Can you check your board.txt config file? It should have a line that
> starts with "SCC: 0x400 ..."
> 
> If not, can you add a line that reads:
> SCC: 0x400 0x00000000

Actually you, may want to try:
SCC: 0x400 0x3330c00

instead, as Peter pointed out that some bits are flagged as reserved in
the TRM...

        M.

Andre Przywara April 22, 2013, 2:35 p.m. UTC | #51

On 04/22/2013 01:14 PM, Marc Zyngier wrote:
> On Mon, 22 Apr 2013 12:02:59 +0100, Marc Zyngier <marc.zyngier@arm.com>
> wrote:
>> On 22/04/13 11:36, Andre Przywara wrote:
>>> On 04/19/2013 06:13 PM, Christoffer Dall wrote:
>>>> On Fri, Apr 19, 2013 at 5:58 AM, Andre Przywara
>>>> <andre.przywara@linaro.org> wrote:
>>>>> On 04/17/2013 11:12 AM, Christoffer Dall wrote:
>>>>>>
>>>>>>   ...
>>>>>>
>>>>>> You could also try installing a vector handler early and detect
>>>>>> faults,
>>>>>> and add an alternative return path from the init function with some
>>>>>> error reporting value in r0 or something like that, just for
>>>>>> debugging,
>>>>>> naturally, but that could be a way to detect if we really are taking
>>>>>> recursive faults here.
>>>>>
>>>>>
>>>>> OK, I added code to return earlier on CPUs not from cluster 0.
>>>>> Indeed it hangs in the HSCR write. The two A15s pass this
> instruction,
>>>>> writing 0x30c5187F into the register.
>>>>> This means all the fixed bits for A15 correctly, C,A,M and I set and
>>>>> WXN,
>>>>> EE, TE cleared. FI was also cleared
>>>>> The A7 wanted to write the very same value. I tried to set bit 21,
>>>>> which
>>>>> kind of the A7 TRM hints to do: but no change.
>>>>> Before the HSCLTR write, the register reads 0x30c50878, with SCTLR
>>>>> being
>>>>> 0x30c5387d.
>>>>> So the code wants to set M, A, C and I in HSCLTR. Interestingly SCTLR
>>>>> has
>>>>> the V bits set, could that be an issue?
>>>>>
>>>> Can you try writing 0x30c50879 into the register instead? Basically
>>>> check to see if enabling caches or alignment checks causes the issue,
>>>> or if it is indeed enabling the MMU that's the issue... If that works,
>>>> start a bisect on the remaining bits. Also, just for fun, could you
>>>> try flushing the entire I-cache before writing into the HSCLTR?
>>>
>>> OK, both clearing the I-bit and doing an "isb; ICIALLU" before the
> "isb;
>>> write HSCLTR; isb" worked, the kernel boots on and KVM is enabled.
>>>

Arrgh! While cleaning up the code I saw that I had left a debug exit 
point early in init.S, so the HSCTLR write was not executed. Fixing this 
I can now say that cleaning the I-cache and even disabling it does not 
help. Sorry for the premature hooray!
In fact the only thing that makes the kernel boot is cleaning the M bit 
(disabling the MMU), which is a debug hint, but by no means a useful 
workaround :-(

So the problem still persists.

>>> I could easily make a patch, but I am not sure how to proceed from
> here:
>>>
>>> 1.) At least I don't have an understanding why this is only a problem
> on
>>> A7 and not on A15. I would feel better if we have an explanation for
>>> this. Mark, Will, Peter: any ideas?
>>
>> Blink!
>>
>> Can you check your board.txt config file? It should have a line that
>> starts with "SCC: 0x400 ..."
>>
>> If not, can you add a line that reads:
>> SCC: 0x400 0x00000000
>
> Actually you, may want to try:
> SCC: 0x400 0x3330c00
>
> instead, as Peter pointed out that some bits are flagged as reserved in
> the TRM...

I tried to play with SCC 0x400 for two boots, but it didn't change 
anything. Didn't do the full scale of possible HSCLTR bits with and 
without this, though. Does that make sense to do this, or was SCC 0x400 
only for cache problems?

Regards,
Andre.

diff mbox

Patch

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 5a93698..9b53cdd 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1128,20 +1128,35 @@  out_err:
 	return err;
 }
 
+static void check_kvm_target_cpu(void *ret)
+{
+	*(int*)ret = kvm_target_cpu();
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
 int kvm_arch_init(void *opaque)
 {
 	int err;
+	int count = 0, ret, cpu;
 
 	if (!is_hyp_mode_available()) {
 		kvm_err("HYP mode not available\n");
 		return -ENODEV;
 	}
 
-	if (kvm_target_cpu() < 0) {
-		kvm_err("Target CPU not supported!\n");
+	for_each_online_cpu(cpu) {
+		smp_call_function_single(cpu, check_kvm_target_cpu, &ret, 1);
+		if (ret >= 0) {
+			count++;
+			continue;
+		}
+		if (count)
+			kvm_err("big.LITTLE not supported, limit to A15 CPUs with maxcpus= to use KVM\n");
+		else
+			kvm_err("Target CPU not supported!\n");
+
 		return -ENODEV;
 	}