KVM: ARM: ignore guest L2 cache control SMCs on Highbank and OMAP

Message ID 1376472125-23350-1-git-send-email-andre.przywara@calxeda.com (mailing list archive)
State New, archived

Commit Message

Andre Przywara Aug. 14, 2013, 9:22 a.m. UTC
Guest kernels with CONFIG_CACHE_L2X0 set (for instance Highbank or OMAP4)
will trigger SMCs to handle the L2 cache controller (PL310).
KVM currently answers each of these SMCs by injecting an #UNDEF, which
eventually stops the guest.

We don't need explicit L2 cache controller handling on A15s anymore,
so it is safe to simply ignore these calls and proceed with the next
instruction.

Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>
---
 arch/arm/kvm/handle_exit.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Marc Zyngier Aug. 14, 2013, 9:32 a.m. UTC | #1
On 2013-08-14 10:22, Andre Przywara wrote:
> Guest kernels with CONFIG_L2X0 set (for instance Highbank or OMAP4)
> will trigger SMCs to handle the L2 cache controller (PL310).
> This will currently inject #UNDEFs and eventually stop the guest.
>
> We don't need explicit L2 cache controller handling on A15s anymore,
> so it is safe to simply ignore these calls and proceed with the next
> instruction.
>
> Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>

Hold on.

Are you trying to run A9 guests on KVM? Sorry, but that's not a 
supported mode of operation just yet.

So, until we have a proper framework to deal with multiple CPUs, the 
only valid configuration is A15-on-A15.

> ---
>  arch/arm/kvm/handle_exit.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index df4c82d..2cbe6a0 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -50,8 +50,28 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	return 1;
>  }
>
> +/*
> + * OMAP4 and Highbank machines do a SMC call to handle the L2 cache
> + * controller. They put 0x102 in r12 to request this functionality.
> + * This is not needed on A15s, so we can safely ignore it in KVM guests.
> + */
> +static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
> +{
> +	unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
> +
> +	if (fn_nr == 0x102) {
> +		kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +		return 1;
> +	}
> +
> +	return 0;
> +}

And what if I run mach-foo which uses r12 to request bar services from 
secure mode? Is it safe to ignore it? We need something much better than 
just testing random registers to guess what the guest wants.

>  static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> +	if (kvm_ignore_l2x0_call(vcpu))
> +		return 1;
> +
>  	kvm_inject_undefined(vcpu);
>  	return 1;
>  }

         M.
Andre Przywara Aug. 14, 2013, 9:39 a.m. UTC | #2
On 08/14/2013 11:32 AM, Marc Zyngier wrote:
> On 2013-08-14 10:22, Andre Przywara wrote:
>> Guest kernels with CONFIG_L2X0 set (for instance Highbank or OMAP4)
>> will trigger SMCs to handle the L2 cache controller (PL310).
>> This will currently inject #UNDEFs and eventually stop the guest.
>>
>> We don't need explicit L2 cache controller handling on A15s anymore,
>> so it is safe to simply ignore these calls and proceed with the next
>> instruction.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>
>
> Hold on.
>
> Are you trying to run A9 guests on KVM? Sorry, but that's not a
> supported mode of operation just yet.

No, I'm not. I just run guests with kernels that also support the A9.
If you select Highbank in your .config, you will get CACHE_L2X0 and the
kernel will issue SMCs regardless of the CPU you are running on. Those
SMCs are ignored by the firmware on Midway.

I agree that the proper solution would be to detect at run-time in the 
(guest) kernel whether you actually need the PL310 handling, but for the 
time being and to support older kernels we will need this fix.

For me this fixes "qemu -machine midway --enable-kvm".

Regards,
Andre.

>
> So, until we have a proper framework to deal with multiple CPUs, the
> only valid configuration is A15-on-A15.
>
>> ---
>>   arch/arm/kvm/handle_exit.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
>> index df4c82d..2cbe6a0 100644
>> --- a/arch/arm/kvm/handle_exit.c
>> +++ b/arch/arm/kvm/handle_exit.c
>> @@ -50,8 +50,28 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>   	return 1;
>>   }
>>
>> +/*
>> + * OMAP4 and Highbank machines do a SMC call to handle the L2 cache
>> + * controller. They put 0x102 in r12 to request this functionality.
>> + * This is not needed on A15s, so we can safely ignore it in KVM guests.
>> + */
>> +static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
>> +{
>> +	unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
>> +
>> +	if (fn_nr == 0x102) {
>> +		kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> +		return 1;
>> +	}
>> +
>> +	return 0;
>> +}
>
> And what if I run mach-foo which uses r12 to request bar services from
> secure mode? Is it safe to ignore it? We need something much better than
> just testing random registers to guess what the guest wants.
>
>>   static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>   {
>> +	if (kvm_ignore_l2x0_call(vcpu))
>> +		return 1;
>> +
>>   	kvm_inject_undefined(vcpu);
>>   	return 1;
>>   }
>
>           M.
>
Peter Maydell Aug. 14, 2013, 10:22 a.m. UTC | #3
On 14 August 2013 10:32, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 2013-08-14 10:22, Andre Przywara wrote:

>> +static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
>> +{
>> +     unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
>> +
>> +     if (fn_nr == 0x102) {
>> +             kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>> +             return 1;
>> +     }
>> +
>> +     return 0;
>> +}
>
> And what if I run mach-foo which uses r12 to request bar services from
> secure mode? Is it safe to ignore it? We need something much better than
> just testing random registers to guess what the guest wants.

Definitely. This needs to be addressed via the kernel providing
some mechanism so that userspace and/or a KVM-specific bit
of 'firmware' running in the guest VM can handle the SMC
calls the guest tries to make, because it's totally board
specific.

-- PMM
Marc Zyngier Aug. 14, 2013, 10:22 a.m. UTC | #4
On 2013-08-14 10:39, Andre Przywara wrote:
> On 08/14/2013 11:32 AM, Marc Zyngier wrote:
>> On 2013-08-14 10:22, Andre Przywara wrote:
>>> Guest kernels with CONFIG_L2X0 set (for instance Highbank or OMAP4)
>>> will trigger SMCs to handle the L2 cache controller (PL310).
>>> This will currently inject #UNDEFs and eventually stop the guest.
>>>
>>> We don't need explicit L2 cache controller handling on A15s anymore,
>>> so it is safe to simply ignore these calls and proceed with the next
>>> instruction.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>
>>
>> Hold on.
>>
>> Are you trying to run A9 guests on KVM? Sorry, but that's not a
>> supported mode of operation just yet.
>
> No, I don't. I just run guests with kernels that would support A9
> also. If you select Highbank in your .config, you will get CACHE_L2X0
> and the kernel will do SMCs - regardless of the CPU you are running
> on. Those SMCs are ignored by the firmware on Midway.
>
> I agree that the proper solution would be to detect at run-time in
> the (guest) kernel whether you actually need the PL310 handling, but
> for the time being and to support older kernels we will need this fix.
>
> For me this fixes "qemu -machine midway --enable-kvm".

I understand that this fixes an issue, but I'd rather see either the 
guest kernel being fixed, or some decent framework for SMC handling in 
KVM (potentially leaving it to platform emulation to handle it).

Testing random registers won't cut it, I'm afraid.

         M.
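
The objection to "testing random registers" can be made concrete with a
small sketch. This is not KVM code: struct guest_regs, foo_smc() and the
"mach-foo" ABI (function number in r0, r12 unused) are hypothetical and
exist only to show how the r12 == 0x102 check from the patch can match a
call that has nothing to do with the PL310.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest register file, r0..r15. */
struct guest_regs {
	uint32_t r[16];
};

/* The heuristic from the patch: r12 == 0x102 means "L2 cache control". */
static inline bool looks_like_l2x0_call(const struct guest_regs *regs)
{
	assert(regs != NULL);
	return regs->r[12] == 0x102;
}

/*
 * Hypothetical second board ("mach-foo"): its SMC ABI passes the
 * function number in r0. r12 is not part of that ABI, so it holds
 * whatever the guest left there before the SMC.
 */
static inline struct guest_regs foo_smc(uint32_t fn, uint32_t stale_r12)
{
	struct guest_regs regs = { .r = { 0 } };

	regs.r[0] = fn;
	regs.r[12] = stale_r12;
	return regs;
}
```

If the stale value in r12 happens to be 0x102, looks_like_l2x0_call()
returns true for a mach-foo request and the call would be skipped
silently, whatever service it was asking for.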
Marc Zyngier Aug. 14, 2013, 10:30 a.m. UTC | #5
On 2013-08-14 11:22, Peter Maydell wrote:
> On 14 August 2013 10:32, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 2013-08-14 10:22, Andre Przywara wrote:
>
>>> +static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
>>> +{
>>> +     unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
>>> +
>>> +     if (fn_nr == 0x102) {
>>> +             kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>>> +             return 1;
>>> +     }
>>> +
>>> +     return 0;
>>> +}
>>
>> And what if I run mach-foo which uses r12 to request bar services from
>> secure mode? Is it safe to ignore it? We need something much better than
>> just testing random registers to guess what the guest wants.
>
> Definitely. This needs to be addressed via the kernel providing
> some mechanism so that userspace and/or a KVM-specific bit
> of 'firmware' running in the guest VM can handle the SMC
> calls the guest tries to make, because it's totally board
> specific.

Right. We're in violent agreement here.

What I can imagine is some kind of feature bit that would cause an exit 
all the way to userspace, letting QEMU handle the call.

That would be simple enough to implement, I believe. At least on the 
kernel side.

         M.
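
A rough sketch of the feature-bit idea, written as standalone C rather
than against the real KVM structures. Everything named here is an
assumption made for illustration: struct smc_vcpu,
VCPU_FEATURE_SMC_TO_USER, the exit reason values and
sketch_handle_smc() are not existing kernel interfaces. The sketch
assumes that a handler returning 1 resumes the guest (as handle_smc
does in the patch) and that returning 0 means "go back to userspace".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VCPU feature bit, set by userspace at VCPU setup. */
#define VCPU_FEATURE_SMC_TO_USER (1u << 0)

/* Hypothetical exit reasons reported to userspace. */
enum exit_reason { EXIT_REASON_NONE = 0, EXIT_REASON_SMC = 1 };

/* Hypothetical, heavily reduced VCPU state. */
struct smc_vcpu {
	uint32_t features;
	enum exit_reason exit_reason;
	bool undef_injected;
};

/*
 * Returns 1 to resume the guest, 0 to exit to userspace.
 * Without the feature bit the behaviour stays as it is today:
 * the SMC is answered with an undefined instruction exception.
 */
static inline int sketch_handle_smc(struct smc_vcpu *vcpu)
{
	assert(vcpu != NULL);

	if (vcpu->features & VCPU_FEATURE_SMC_TO_USER) {
		vcpu->exit_reason = EXIT_REASON_SMC;
		return 0;	/* let QEMU (or other userspace) decide */
	}

	vcpu->undef_injected = true;
	return 1;
}
```

The point of the design is that KVM itself never interprets guest
registers: it either keeps the current behaviour or hands the whole
call to userspace, which knows which board it is emulating.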
Dave Martin Aug. 14, 2013, 10:41 a.m. UTC | #6
On Wed, Aug 14, 2013 at 11:39:21AM +0200, Andre Przywara wrote:
> On 08/14/2013 11:32 AM, Marc Zyngier wrote:
> >On 2013-08-14 10:22, Andre Przywara wrote:
> >>Guest kernels with CONFIG_L2X0 set (for instance Highbank or OMAP4)
> >>will trigger SMCs to handle the L2 cache controller (PL310).
> >>This will currently inject #UNDEFs and eventually stop the guest.
> >>
> >>We don't need explicit L2 cache controller handling on A15s anymore,
> >>so it is safe to simply ignore these calls and proceed with the next
> >>instruction.
> >>
> >>Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>
> >
> >Hold on.
> >
> >Are you trying to run A9 guests on KVM? Sorry, but that's not a
> >supported mode of operation just yet.
> 
> No, I don't. I just run guests with kernels that would support A9
> also. If you select Highbank in your .config, you will get
> CACHE_L2X0 and the kernel will do SMCs - regardless of the CPU you
> are running on. Those SMCs are ignored by the firmware on Midway.
> 
> I agree that the proper solution would be to detect at run-time in
> the (guest) kernel whether you actually need the PL310 handling, but
> for the time being and to support older kernels we will need this
> fix.
> 
> For me this fixes "qemu -machine midway --enable-kvm".
> 
> Regards,
> Andre.
> 
> >
> >So, until we have a proper framework to deal with multiple CPUs, the
> >only valid configuration is A15-on-A15.
> >
> >>---
> >>  arch/arm/kvm/handle_exit.c | 20 ++++++++++++++++++++
> >>  1 file changed, 20 insertions(+)
> >>
> >>diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> >>index df4c82d..2cbe6a0 100644
> >>--- a/arch/arm/kvm/handle_exit.c
> >>+++ b/arch/arm/kvm/handle_exit.c
> >>@@ -50,8 +50,28 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>  	return 1;
> >>  }
> >>
> >>+/*
> >>+ * OMAP4 and Highbank machines do a SMC call to handle the L2 cache
> >>+ * controller. They put 0x102 in r12 to request this functionality.
> >>+ * This is not needed on A15s, so we can safely ignore it in KVM guests.
> >>+ */
> >>+static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
> >>+{
> >>+	unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
> >>+
> >>+	if (fn_nr == 0x102) {
> >>+		kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >>+		return 1;
> >>+	}
> >>+
> >>+	return 0;
> >>+}
> >
> >And what if I run mach-foo which uses r12 to request bar services from
> >secure mode? Is it safe to ignore it? We need something much better than
> >just testing random registers to guess what the guest wants.
> >
> >>  static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>  {
> >>+	if (kvm_ignore_l2x0_call(vcpu))
> >>+		return 1;
> >>+
> >>  	kvm_inject_undefined(vcpu);
> >>  	return 1;
> >>  }
> >
> >          M.
> >

Now, suppose a second board family has a similar problem, with a
different SMC ABI, which doesn't use r12.

If the kernel might believe it is running on that board, it
may make SMC calls, but r12 may contain garbage.  Sometimes it
will contain 0x102 and hit your code.


Because this is a board-specific emulation issue, not virtualisation, it
seems wrong to put knowledge about every platform's random firmware
into KVM.


I see two solutions:

 1) Describe the presence/absence of the firmware in the DT.

 2) Provide a framework which allows qemu to emulate the needed
    parts of the firmware (i.e., allowing SMCs to be trapped back to
    qemu)
   

(1) is the best option for any situation where we don't have legacy
    to support (i.e., we're not trying to run old kernels which don't
    know about the DT binding)

(2) allows for the most authentic simulation in KVM.   It's also the
    only way to be backwards compatible with older kernels that don't
    understand the added DT bindings.


We would also want a way to turn the PSCI implementation in the
kernel off: that's valid for the kvmtool case, because it is part of
the canonical paravirtualised environment.  But it's not valid for the
qemu full emulation case where we probably want to punt HVC/SMC calls
back to qemu for emulation, or otherwise fault/ignore them (depending on
whether the board in question has the Security/Virtualisation
extensions, and on what firmware is supposed to be present).

Cheers
---Dave
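
Option (1) could look roughly like this from the guest's side. This is
a standalone sketch, not kernel code: struct board_dt stands in for a
real device tree lookup, and the compatible string
"vendor,secure-l2x0-firmware" is made up for illustration, since no
such binding is named in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened view of the DT: a list of compatible strings. */
struct board_dt {
	const char *const *compatible;
	size_t count;
};

/* True if any node in the (simplified) DT carries the given string. */
static inline bool dt_has_compatible(const struct board_dt *dt,
				     const char *name)
{
	assert(dt != NULL && name != NULL);

	for (size_t i = 0; i < dt->count; i++) {
		if (strcmp(dt->compatible[i], name) == 0)
			return true;
	}
	return false;
}

/*
 * The guest issues the L2 cache SMC only when the DT says the
 * firmware is there. The binding name is invented for this sketch.
 */
static inline bool guest_should_issue_l2_smc(const struct board_dt *dt)
{
	return dt_has_compatible(dt, "vendor,secure-l2x0-firmware");
}
```

A DT handed to a KVM guest would simply omit the firmware node, so a
kernel that honours the binding never executes the SMC. Kernels that
predate the binding ignore it, which is why option (2) is still needed
for them.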
Christoffer Dall Aug. 14, 2013, 5:18 p.m. UTC | #7
On Wed, Aug 14, 2013 at 11:30:03AM +0100, Marc Zyngier wrote:
> On 2013-08-14 11:22, Peter Maydell wrote:
> > On 14 August 2013 10:32, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> On 2013-08-14 10:22, Andre Przywara wrote:
> >
> >>> +static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
> >>> +{
> >>> +     unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
> >>> +
> >>> +     if (fn_nr == 0x102) {
> >>> +             kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >>> +             return 1;
> >>> +     }
> >>> +
> >>> +     return 0;
> >>> +}
> >>
> >> And what if I run mach-foo which uses r12 to request bar services from
> >> secure mode? Is it safe to ignore it? We need something much better than
> >> just testing random registers to guess what the guest wants.
> >
> > Definitely. This needs to be addressed via the kernel providing
> > some mechanism so that userspace and/or a KVM-specific bit
> > of 'firmware' running in the guest VM can handle the SMC
> > calls the guest tries to make, because it's totally board
> > specific.
> 
> Right. We're in violent agreement here.
> 
> What I can imagine is some kind of feature bit that would cause an exit 
> all the way to userspace, letting QEMU handle the call.
> 
> That would be simple enough to implement, I believe. At least on the 
> kernel side.
> 
How would we distinguish between a PSCI call that the kernel should
support and a call to secure firmware that needs to be forwarded to
QEMU?  Is this simply a binary config at VM creation time?

-Christoffer
Christoffer Dall Aug. 14, 2013, 5:21 p.m. UTC | #8
On Wed, Aug 14, 2013 at 11:41:04AM +0100, Dave Martin wrote:
> On Wed, Aug 14, 2013 at 11:39:21AM +0200, Andre Przywara wrote:
> > On 08/14/2013 11:32 AM, Marc Zyngier wrote:
> > >On 2013-08-14 10:22, Andre Przywara wrote:
> > >>Guest kernels with CONFIG_L2X0 set (for instance Highbank or OMAP4)
> > >>will trigger SMCs to handle the L2 cache controller (PL310).
> > >>This will currently inject #UNDEFs and eventually stop the guest.
> > >>
> > >>We don't need explicit L2 cache controller handling on A15s anymore,
> > >>so it is safe to simply ignore these calls and proceed with the next
> > >>instruction.
> > >>
> > >>Signed-off-by: Andre Przywara <andre.przywara@calxeda.com>
> > >
> > >Hold on.
> > >
> > >Are you trying to run A9 guests on KVM? Sorry, but that's not a
> > >supported mode of operation just yet.
> > 
> > No, I don't. I just run guests with kernels that would support A9
> > also. If you select Highbank in your .config, you will get
> > CACHE_L2X0 and the kernel will do SMCs - regardless of the CPU you
> > are running on. Those SMCs are ignored by the firmware on Midway.
> > 
> > I agree that the proper solution would be to detect at run-time in
> > the (guest) kernel whether you actually need the PL310 handling, but
> > for the time being and to support older kernels we will need this
> > fix.
> > 
> > For me this fixes "qemu -machine midway --enable-kvm".
> > 
> > Regards,
> > Andre.
> > 
> > >
> > >So, until we have a proper framework to deal with multiple CPUs, the
> > >only valid configuration is A15-on-A15.
> > >
> > >>---
> > >>  arch/arm/kvm/handle_exit.c | 20 ++++++++++++++++++++
> > >>  1 file changed, 20 insertions(+)
> > >>
> > >>diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> > >>index df4c82d..2cbe6a0 100644
> > >>--- a/arch/arm/kvm/handle_exit.c
> > >>+++ b/arch/arm/kvm/handle_exit.c
> > >>@@ -50,8 +50,28 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > >>  	return 1;
> > >>  }
> > >>
> > >>+/*
> > >>+ * OMAP4 and Highbank machines do a SMC call to handle the L2 cache
> > >>+ * controller. They put 0x102 in r12 to request this functionality.
> > >>+ * This is not needed on A15s, so we can safely ignore it in KVM guests.
> > >>+ */
> > >>+static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
> > >>+{
> > >>+	unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
> > >>+
> > >>+	if (fn_nr == 0x102) {
> > >>+		kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> > >>+		return 1;
> > >>+	}
> > >>+
> > >>+	return 0;
> > >>+}
> > >
> > >And what if I run mach-foo which uses r12 to request bar services from
> > >secure mode? Is it safe to ignore it? We need something much better than
> > >just testing random registers to guess what the guest wants.
> > >
> > >>  static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > >>  {
> > >>+	if (kvm_ignore_l2x0_call(vcpu))
> > >>+		return 1;
> > >>+
> > >>  	kvm_inject_undefined(vcpu);
> > >>  	return 1;
> > >>  }
> > >
> > >          M.
> > >
> 
> Now, suppose a second board family has a similar problem, with a
> different SMC ABI, which doesn't use r12.
> 
> If the kernel might believe it is running on that board, it
> may make SMC calls, but r12 may contain garbage.  Sometimes it
> will contain 0x102 and hit your code.
> 
> 
> Because this is a board-specific emulation issue, not virtualisation, it
> seems wrong to put knowledge about every platform's random firmware
> into KVM.
> 
> 
> I see two solutions:
> 
>  1) Describe the presence/absence of the firmware in the DT.
> 
>  2) Provide a framework which allows qemu to emulate the needed
>     parts of the firmware (i.e., allowing SMCs to be trapped back to
>     qemu)
>    
> 
> (1) is the best option for any situation where we don't have legacy
>     to support (i.e., we're not trying to run old kernels which don't
>     know about the DT binding)
> 
> (2) allows for the most authentic simulation in KVM.   It's also the
>     only way to be backwards compatible with older kernels that don't
>     understand the added DT bindings.

The question really is if we need legacy support for kernels.  I suspect
that this use case will arise (we are already hearing some chatter about
this from the networking space) and therefore we will most likely end up
with some combination of (1) and (2).

> 
> 
> We would also want a way to turn the PSCI implementation in the
> kernel off: that's valid for the kvmtool case, because it is part of
> the canonical paravirtualised environment.  But it's not valid for the
> qemu full emulation case where we probably want to punt HVC/SMC calls
> back to qemu for emulation, or otherwise fault/ignore them (depending on
> whether the board in question has the Security/Virtualisation
> extensions, and on what firmware is supposed to be present).
> 

I don't understand this paragraph, sorry.

-Christoffer
Peter Maydell Aug. 14, 2013, 6:01 p.m. UTC | #9
On 14 August 2013 18:18, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> How would we distinguish between a PSCI call that the kernel should
> support and a call to secure firmware that needs to be forwarded to
> QEMU?  Is this simply a binary config at VM creation time?

Kernel PSCI is always HVC (right?) so you could just say
that HVC is the kernel's business and SMC is the guest
firmware's.

If we make the kernel just restart the guest inside its
firmware blob without reflecting the SMC out to userspace,
are we going to regret it later?

-- PMM
Christoffer Dall Aug. 14, 2013, 6:13 p.m. UTC | #10
On Wed, Aug 14, 2013 at 07:01:05PM +0100, Peter Maydell wrote:
> On 14 August 2013 18:18, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > How would we distinguish between a PSCI call that the kernel should
> > support and a call to secure firmware that needs to be forwarded to
> > QEMU?  Is this simply a binary config at VM creation time?
> 
> Kernel PSCI is always HVC (right?) so you could just say
> that HVC is the kernel's business and SMC is the guest
> firmware's.
> 

As I understand it, current implementations rely on info from the DT and
guests are therefore told only to use HVCs, but the spec allows both an
HVC and an SMC as the conduit (unless I read this wrong), so I think
it's quite possible that we'll end up supporting something that needs
to make PSCI calls via SMC.  On the other hand, if QEMU can make the
distinction and do everything that the kernel would otherwise be able to
do to handle the PSCI call, then we can still just let QEMU handle the
whole thing.

(feel free to replace QEMU with "user space" in the above)

> If we make the kernel just restart the guest inside its
> firmware blob without reflecting the SMC out to userspace
> are we going to regret it later?
> 

Are you suggesting that we'd load the secure firmware inside the guest
in a separate address space somehow and just let it execute the binary?
That won't work without considerable emulation effort in the kernel to
support the privileged operations, right?  And if the secure firmware
does something SoC-specific that KVM will never know about but QEMU
would, there's still the need for some 'backdoor' out to QEMU.

Did I misunderstand your point here?

I would imagine that at most QEMU can tell KVM to handle SMCs in
exactly one of these modes:
 1) Handle SMCs as undefined
 2) Handle SMCs as PSCI
 3) Forward all SMCs to me

And that would more or less be the end of it as far as KVM is
involved...

-Christoffer
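
The three modes listed above map naturally onto a per-VM setting and a
single switch. The following is only a sketch under that reading: enum
smc_mode, enum smc_action, struct vm_config and smc_dispatch() are
hypothetical names, not an existing KVM interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-VM policy, chosen by userspace at VM creation. */
enum smc_mode {
	SMC_MODE_UNDEF,		/* 1) handle SMCs as undefined */
	SMC_MODE_PSCI,		/* 2) handle SMCs as PSCI */
	SMC_MODE_FORWARD,	/* 3) forward all SMCs to userspace */
};

/* What the trap handler should do for one trapped SMC. */
enum smc_action {
	SMC_INJECT_UNDEF,
	SMC_HANDLE_PSCI,
	SMC_EXIT_TO_USER,
};

/* Hypothetical, reduced VM configuration. */
struct vm_config {
	enum smc_mode smc_mode;
};

/* Pure policy lookup: no guest registers are inspected here. */
static inline enum smc_action smc_dispatch(const struct vm_config *cfg)
{
	assert(cfg != NULL);

	switch (cfg->smc_mode) {
	case SMC_MODE_PSCI:
		return SMC_HANDLE_PSCI;
	case SMC_MODE_FORWARD:
		return SMC_EXIT_TO_USER;
	case SMC_MODE_UNDEF:
	default:
		return SMC_INJECT_UNDEF;
	}
}
```

Unknown mode values fall back to injecting an undefined instruction,
which matches what handle_smc does today.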
Peter Maydell Aug. 14, 2013, 6:22 p.m. UTC | #11
On 14 August 2013 19:13, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Wed, Aug 14, 2013 at 07:01:05PM +0100, Peter Maydell wrote:
>> If we make the kernel just restart the guest inside its
>> firmware blob without reflecting the SMC out to userspace
>> are we going to regret it later?
>
> Are you suggesting that we'd load the secure firmware inside the guest
> in a separate address space somehow and just let it execute the binary?
> That won't work without considerable emulation efforts in the kernel to
> support the privileged operations right?  What if the secure firmware
> does something SoC-specific that KVM will never know about, but QEMU
> would, then there's still the need for some 'backdoor' out to QEMU.

No, the suggestion is that the 'firmware' blob is a specifically
written thing to work with QEMU/KVM, so it supports the right
entry points but is written to work within a slightly odd
environment where:
 * the kernel supports an emulated MVBAR
 * SMC causes us to redirect the guest so it enters as per
   the MVBAR, but in EL1, not EL3
 * the "firmware" blob does whatever is necessary before
   returning from the SMC

There is obviously no insulation between the guest kernel and the
firmware blob in this scenario, but if we're not actually trying
to emulate secure mode, just deal somehow with a handful of API
calls, that should be fine.

(This is an idea I've floated before, based partly on what the
current QEMU OMAP3 emulation does. The advantage from my point of
view is that it keeps the details of what the SMC entrypoints
are supposed to do out of QEMU and in the board-specific blob.)

-- PMM
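
The redirect described in the list above could be sketched as follows.
This is standalone illustrative C, not KVM code: struct blob_vcpu and
redirect_smc_to_blob() are hypothetical, and the return address is
kept in a plain field instead of modelling a banked monitor-mode LR.
The sketch assumes the SMC entry sits at offset 0x08 from MVBAR, as in
the ARMv7 monitor vector table, and takes the instruction width as a
parameter in the same way the patch passes kvm_vcpu_trap_il_is32bit()
to kvm_skip_instr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the SMC entry in the monitor vector table (assumed). */
#define MON_VECTOR_SMC_OFFSET 0x08u

/* Hypothetical, reduced VCPU state for this sketch. */
struct blob_vcpu {
	uint32_t pc;			/* address of the trapped SMC */
	uint32_t smc_return_addr;	/* where the blob returns to */
	uint32_t emulated_mvbar;	/* guest-visible, emulated MVBAR */
};

/*
 * Instead of injecting #UNDEF or exiting to userspace, restart the
 * guest at the SMC vector of the firmware blob. The blob runs at the
 * same privilege level as the guest kernel; no secure mode exists.
 */
static inline void redirect_smc_to_blob(struct blob_vcpu *vcpu,
					bool is_32bit_insn)
{
	assert(vcpu != NULL);

	/* Remember the instruction following the SMC. */
	vcpu->smc_return_addr = vcpu->pc + (is_32bit_insn ? 4u : 2u);
	/* Enter the blob through its emulated vector table. */
	vcpu->pc = vcpu->emulated_mvbar + MON_VECTOR_SMC_OFFSET;
}
```

What the blob does once entered, and how it returns to
smc_return_addr, is board specific and stays outside both KVM and
QEMU in this scheme.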
Christoffer Dall Aug. 14, 2013, 6:36 p.m. UTC | #12
On Wed, Aug 14, 2013 at 07:22:07PM +0100, Peter Maydell wrote:
> On 14 August 2013 19:13, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Wed, Aug 14, 2013 at 07:01:05PM +0100, Peter Maydell wrote:
> >> If we make the kernel just restart the guest inside its
> >> firmware blob without reflecting the SMC out to userspace
> >> are we going to regret it later?
> >
> > Are you suggesting that we'd load the secure firmware inside the guest
> > in a separate address space somehow and just let it execute the binary?
> > That won't work without considerable emulation efforts in the kernel to
> > support the privileged operations right?  What if the secure firmware
> > does something SoC-specific that KVM will never know about, but QEMU
> > would, then there's still the need for some 'backdoor' out to QEMU.
> 
> No, the suggestion is that the 'firmware' blob is a specifically
> written thing to work with QEMU/KVM, so it supports the right
> entry points but is written to work within a slightly odd
> environment where:
>  * the kernel supports an emulated MVBAR
>  * SMC causes us to redirect the guest so it enters as per
>    the MVBAR, but in EL1, not EL3
>  * the "firmware" blob does whatever is necessary before
>    returning from the SMC

ok, I see.

> 
> There is obviously no insulation between the guest kernel and the
> firmware blob in this scenario, but if we're not actually trying
> to emulate secure mode, just deal somehow with a handful of API
> calls, that should be fine.

Well, we could load the firmware in memory that we only ever map in
Stage-2 mappings when we execute the special firmware blob, which would
at least prevent the guest kernel from messing with the firmware code.
Sort of like an emulated secure physical memory region.

> 
> (This is an idea I've floated before, based partly on what the
> current QEMU OMAP3 emulation does. The advantage from my point of
> view is that it keeps the details of what the SMC entrypoints
> are supposed to do out of QEMU and in the board-specific blob.)
> 

The clear advantage is that it keeps the code out of QEMU.  The downside
is a potentially more complicated development environment (you're sort
of writing a small OS here, and you can't really reuse existing secure
firmware because the environment is special, right?), whereas having the
emulation simply integrated in QEMU makes it a nice debuggable piece of
user space code.

Hmmm....

-Christoffer

Patch

diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index df4c82d..2cbe6a0 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -50,8 +50,28 @@  static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return 1;
 }
 
+/*
+ * OMAP4 and Highbank machines make an SMC call to handle the L2 cache
+ * controller. They put 0x102 in r12 to request this functionality.
+ * This is not needed on A15s, so we can safely ignore it in KVM guests.
+ */
+static int kvm_ignore_l2x0_call(struct kvm_vcpu *vcpu)
+{
+	unsigned long fn_nr = *vcpu_reg(vcpu, 12) & ~((u32) 0);
+
+	if (fn_nr == 0x102) {
+		kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+		return 1;
+	}
+
+	return 0;
+}
+
 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
+	if (kvm_ignore_l2x0_call(vcpu))
+		return 1;
+
 	kvm_inject_undefined(vcpu);
 	return 1;
 }