diff mbox

arm/monitor: flush the icache during SMC traps

Message ID 20170125200255.15467-1-tamas.lengyel@zentific.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tamas Lengyel Jan. 25, 2017, 8:02 p.m. UTC
During an SMC trap it is possible that the user may change the memory
from where the SMC was fetched. However, without flushing the icache
the SMC may still trigger if the pCPU was idle during the trap. Flush
the icache to ensure consistency.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
---
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/monitor.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Razvan Cojocaru Jan. 25, 2017, 8:07 p.m. UTC | #1
On 01/25/2017 10:02 PM, Tamas K Lengyel wrote:
> During an SMC trap it is possible that the user may change the memory
> from where the SMC was fetched. However, without flushing the icache
> the SMC may still trigger if the pCPU was idle during the trap. Flush
> the icache to ensure consistency.
> 
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>

Fair enough, assuming the ARM maintainers have no objections:

Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan
Julien Grall Jan. 25, 2017, 10:41 p.m. UTC | #2
Hi Tamas,

On 25/01/2017 20:02, Tamas K Lengyel wrote:
> During an SMC trap it is possible that the user may change the memory

By user, do you mean the monitor application?

> from where the SMC was fetched. However, without flushing the icache
> the SMC may still trigger if the pCPU was idle during the trap. Flush
> the icache to ensure consistency.

invalidate_icache will invalidate the cache to PoU on all the pCPUs. But 
here you only mention a problem on the current pCPU. So which behavior 
do you want to achieve?

>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
> ---
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/monitor.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
> index 59ce8f635f..ae1b97993f 100644
> --- a/xen/arch/arm/monitor.c
> +++ b/xen/arch/arm/monitor.c
> @@ -63,6 +63,9 @@ int monitor_smc(void)
>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
>      };
>
> +    /* Nuke the icache as the memory may get changed underneath us. */
> +    invalidate_icache();
> +

Can you explain why you put this call before the monitor trap and not 
after? From my understanding the monitoring application may change the 
memory. But by the time you modify the instruction, the instruction 
cache may have prefetched the previous instruction. So the problem is 
still there.

Furthermore, the instruction cache may not snoop the data cache. So you 
have to ensure the data written reached the memory. Otherwise you may 
read the previous instruction. Where is it done? If you expect the 
monitor app to flush the data cache, then please comment it.

Lastly, this looks like to me that all the traps will have this issue. 
So maybe this should go in monitor_traps instead?

>      return monitor_traps(current, 1, &req);
>  }
>
>

Cheers,
Julien Grall Jan. 26, 2017, 11:30 a.m. UTC | #3
(CC xen-devel, Ravzan, and Stefao)

Hi Tamas,

Not sure why you only CC me on the answer. I have CCed again xen-devel 
as I don't see any sensible discussion in it.

On 26/01/2017 00:11, Tamas K Lengyel wrote:
> On Wed, Jan 25, 2017 at 3:41 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Tamas,
>>
>> On 25/01/2017 20:02, Tamas K Lengyel wrote:
>>>
>>> During an SMC trap it is possible that the user may change the memory
>>
>>
>> By user, do you mean the monitor application?
>
> Yes.

It would be worth clarifying in the commit message.

>
>>
>>> from where the SMC was fetched. However, without flushing the icache
>>> the SMC may still trigger if the pCPU was idle during the trap. Flush
>>> the icache to ensure consistency.
>>
>>
>> invalidate_icache will invalidate the cache to PoU on all the pCPUs. But
>> here you only mention a problem on the current pCPU. So which behavior do
>> you want to achieve?
>
> It would be sufficient to flush the icache on the specific pCPU that
> trapped with the SMC. Didn't see anything defined in Xen to achieve
> that.
>
>>
>>>
>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>> ---
>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>> Cc: Julien Grall <julien.grall@arm.com>
>>> ---
>>>  xen/arch/arm/monitor.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
>>> index 59ce8f635f..ae1b97993f 100644
>>> --- a/xen/arch/arm/monitor.c
>>> +++ b/xen/arch/arm/monitor.c
>>> @@ -63,6 +63,9 @@ int monitor_smc(void)
>>>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
>>>      };
>>>
>>> +    /* Nuke the icache as the memory may get changed underneath us. */
>>> +    invalidate_icache();
>>> +
>>
>>
>> Can you explain why you put this call before the monitor trap and not after?
>> From my understanding the monitoring application may change the memory. But
>> by the time you modify the instruction, the instruction cache may have
>> prefetched the previous instruction. So the problem is still there.
>
> Not sure how would that happen exactly? The vCPU gets paused until the
> user responds to the vm_event request, so where would it perform the
> prefetch again? Also, with this change I've verified that the repeated
> SMC issue no longer presents itself (it has been triggered quite
> regularly on the hikey before).

The ARM ARM provides a set of expected behavior and where to call the 
cache maintenance instruction. If it is not done correctly, it may not 
affect your platform but at some point in the future (or even already 
today) it will break a platform. I wish you good luck to debug that when 
it happens. It is a really nightmare :).

So what I care the most is what is the correct behavior based on the ARM 
ARM. A good overview can be found in a talk made by Mark Rutland [1].

What I was concerned about is the instruction cache been shared between 
multiple processors (for instance L2 cache or higher). A vCPU could also 
get rescheduled to another processor afterwards. Leading to accessing an 
out of date instruction cache.

>
> Also, for multi-vCPU guest another vCPU fetching the SMC before the
> memory modification happen (ie. just after this flush) is not a
> problem and is expected behavior. Providing a non-racy setup for
> trapping on multi-vCPU guests requires other work (like altp2m).

I am not that concerned about the SMC itself, but the fact that you may 
modify the guest memory whilst it is running. So another vCPU may end up 
to execute a mix between new and old instructions depending of the state 
of its instruction cache. That would result to a potential undefined 
behavior.

Also, you want to make sure that if you write another SMC in memory, it 
is effectively affecting all the vCPUs now and not a moment after.

So I still think the invalidate_icache should be done afterwards. IHMO 
modifying guest instructions while other vCPU are running is very 
fragile as other thread may execute the instructions your are running.

>
>>
>> Furthermore, the instruction cache may not snoop the data cache. So you have
>> to ensure the data written reached the memory. Otherwise you may read the
>> previous instruction. Where is it done? If you expect the monitor app to
>> flush the data cache, then please comment it.
>
> AFAICT there is no public API for the user to call to request flushing
> the caches like that. Memory could get changed by the user any time
> under the VM, not just during event callbacks. So I'm not sure where
> that comment should be added.

If the monitor app can modify the guest memory at any time. Then you 
need to provide an interface to clean the data cache + invalidate the 
instruction cache. Otherwise the guest may execute stale instructions.

An hypercall might not be necessary because I think (this need to be 
check) that you could do execute both cache flush from the monitor app.
So I would document this in the vm_event header, or in a document 
explaining the good practices to implement a monitor app.

OOI, when you are modifying the guest memory, do you pause all the vCPUs?

>
>>
>> Lastly, this looks like to me that all the traps will have this issue. So
>> maybe this should go in monitor_traps instead?
>
> Memory traps do not have this issue AFAICT as they prohibit the CPU
> from fetching the memory to begin with, so the caches wouldn't have
> stale contents.

I was not speaking about memory traps but potential other system 
register traps (TTBR,...) that would be added in the future.

Regarding the memory trap. I think you can get into the same problem if 
you decide to forbid read-write but allow instruction fetch. You would 
need some cache maintenance instruction here if you decide to modify the 
memory.

>
> Tamas
>

[1] 
https://events.linuxfoundation.org/sites/events/files/slides/slides_17.pdf
Julien Grall Jan. 26, 2017, 11:31 a.m. UTC | #4
CC correct e-mail address for Stefano.

On 26/01/2017 11:30, Julien Grall wrote:
> (CC xen-devel, Ravzan, and Stefao)
>
> Hi Tamas,
>
> Not sure why you only CC me on the answer. I have CCed again xen-devel
> as I don't see any sensible discussion in it.
>
> On 26/01/2017 00:11, Tamas K Lengyel wrote:
>> On Wed, Jan 25, 2017 at 3:41 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>> Hi Tamas,
>>>
>>> On 25/01/2017 20:02, Tamas K Lengyel wrote:
>>>>
>>>> During an SMC trap it is possible that the user may change the memory
>>>
>>>
>>> By user, do you mean the monitor application?
>>
>> Yes.
>
> It would be worth clarifying in the commit message.
>
>>
>>>
>>>> from where the SMC was fetched. However, without flushing the icache
>>>> the SMC may still trigger if the pCPU was idle during the trap. Flush
>>>> the icache to ensure consistency.
>>>
>>>
>>> invalidate_icache will invalidate the cache to PoU on all the pCPUs. But
>>> here you only mention a problem on the current pCPU. So which
>>> behavior do
>>> you want to achieve?
>>
>> It would be sufficient to flush the icache on the specific pCPU that
>> trapped with the SMC. Didn't see anything defined in Xen to achieve
>> that.
>>
>>>
>>>>
>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>>> ---
>>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>>> Cc: Julien Grall <julien.grall@arm.com>
>>>> ---
>>>>  xen/arch/arm/monitor.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
>>>> index 59ce8f635f..ae1b97993f 100644
>>>> --- a/xen/arch/arm/monitor.c
>>>> +++ b/xen/arch/arm/monitor.c
>>>> @@ -63,6 +63,9 @@ int monitor_smc(void)
>>>>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
>>>>      };
>>>>
>>>> +    /* Nuke the icache as the memory may get changed underneath us. */
>>>> +    invalidate_icache();
>>>> +
>>>
>>>
>>> Can you explain why you put this call before the monitor trap and not
>>> after?
>>> From my understanding the monitoring application may change the
>>> memory. But
>>> by the time you modify the instruction, the instruction cache may have
>>> prefetched the previous instruction. So the problem is still there.
>>
>> Not sure how would that happen exactly? The vCPU gets paused until the
>> user responds to the vm_event request, so where would it perform the
>> prefetch again? Also, with this change I've verified that the repeated
>> SMC issue no longer presents itself (it has been triggered quite
>> regularly on the hikey before).
>
> The ARM ARM provides a set of expected behavior and where to call the
> cache maintenance instruction. If it is not done correctly, it may not
> affect your platform but at some point in the future (or even already
> today) it will break a platform. I wish you good luck to debug that when
> it happens. It is a really nightmare :).
>
> So what I care the most is what is the correct behavior based on the ARM
> ARM. A good overview can be found in a talk made by Mark Rutland [1].
>
> What I was concerned about is the instruction cache been shared between
> multiple processors (for instance L2 cache or higher). A vCPU could also
> get rescheduled to another processor afterwards. Leading to accessing an
> out of date instruction cache.
>
>>
>> Also, for multi-vCPU guest another vCPU fetching the SMC before the
>> memory modification happen (ie. just after this flush) is not a
>> problem and is expected behavior. Providing a non-racy setup for
>> trapping on multi-vCPU guests requires other work (like altp2m).
>
> I am not that concerned about the SMC itself, but the fact that you may
> modify the guest memory whilst it is running. So another vCPU may end up
> to execute a mix between new and old instructions depending of the state
> of its instruction cache. That would result to a potential undefined
> behavior.
>
> Also, you want to make sure that if you write another SMC in memory, it
> is effectively affecting all the vCPUs now and not a moment after.
>
> So I still think the invalidate_icache should be done afterwards. IHMO
> modifying guest instructions while other vCPU are running is very
> fragile as other thread may execute the instructions your are running.
>
>>
>>>
>>> Furthermore, the instruction cache may not snoop the data cache. So
>>> you have
>>> to ensure the data written reached the memory. Otherwise you may read
>>> the
>>> previous instruction. Where is it done? If you expect the monitor app to
>>> flush the data cache, then please comment it.
>>
>> AFAICT there is no public API for the user to call to request flushing
>> the caches like that. Memory could get changed by the user any time
>> under the VM, not just during event callbacks. So I'm not sure where
>> that comment should be added.
>
> If the monitor app can modify the guest memory at any time. Then you
> need to provide an interface to clean the data cache + invalidate the
> instruction cache. Otherwise the guest may execute stale instructions.
>
> An hypercall might not be necessary because I think (this need to be
> check) that you could do execute both cache flush from the monitor app.
> So I would document this in the vm_event header, or in a document
> explaining the good practices to implement a monitor app.
>
> OOI, when you are modifying the guest memory, do you pause all the vCPUs?
>
>>
>>>
>>> Lastly, this looks like to me that all the traps will have this
>>> issue. So
>>> maybe this should go in monitor_traps instead?
>>
>> Memory traps do not have this issue AFAICT as they prohibit the CPU
>> from fetching the memory to begin with, so the caches wouldn't have
>> stale contents.
>
> I was not speaking about memory traps but potential other system
> register traps (TTBR,...) that would be added in the future.
>
> Regarding the memory trap. I think you can get into the same problem if
> you decide to forbid read-write but allow instruction fetch. You would
> need some cache maintenance instruction here if you decide to modify the
> memory.
>
>>
>> Tamas
>>
>
> [1]
> https://events.linuxfoundation.org/sites/events/files/slides/slides_17.pdf
>
Tamas Lengyel Jan. 26, 2017, 5:54 p.m. UTC | #5
On Thu, Jan 26, 2017 at 4:30 AM, Julien Grall <julien.grall@arm.com> wrote:
> (CC xen-devel, Ravzan, and Stefao)
>
> Hi Tamas,
>
> Not sure why you only CC me on the answer. I have CCed again xen-devel as I
> don't see any sensible discussion in it.
>
> On 26/01/2017 00:11, Tamas K Lengyel wrote:
>>
>> On Wed, Jan 25, 2017 at 3:41 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hi Tamas,
>>>
>>> On 25/01/2017 20:02, Tamas K Lengyel wrote:
>>>>
>>>>
>>>> During an SMC trap it is possible that the user may change the memory
>>>
>>>
>>>
>>> By user, do you mean the monitor application?
>>
>>
>> Yes.
>
>
> It would be worth clarifying in the commit message.
>
>
>>
>>>
>>>> from where the SMC was fetched. However, without flushing the icache
>>>> the SMC may still trigger if the pCPU was idle during the trap. Flush
>>>> the icache to ensure consistency.
>>>
>>>
>>>
>>> invalidate_icache will invalidate the cache to PoU on all the pCPUs. But
>>> here you only mention a problem on the current pCPU. So which behavior do
>>> you want to achieve?
>>
>>
>> It would be sufficient to flush the icache on the specific pCPU that
>> trapped with the SMC. Didn't see anything defined in Xen to achieve
>> that.
>>
>>>
>>>>
>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>>> ---
>>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>>> Cc: Julien Grall <julien.grall@arm.com>
>>>> ---
>>>>  xen/arch/arm/monitor.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
>>>> index 59ce8f635f..ae1b97993f 100644
>>>> --- a/xen/arch/arm/monitor.c
>>>> +++ b/xen/arch/arm/monitor.c
>>>> @@ -63,6 +63,9 @@ int monitor_smc(void)
>>>>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
>>>>      };
>>>>
>>>> +    /* Nuke the icache as the memory may get changed underneath us. */
>>>> +    invalidate_icache();
>>>> +
>>>
>>>
>>>
>>> Can you explain why you put this call before the monitor trap and not
>>> after?
>>> From my understanding the monitoring application may change the memory.
>>> But
>>> by the time you modify the instruction, the instruction cache may have
>>> prefetched the previous instruction. So the problem is still there.
>>
>>
>> Not sure how would that happen exactly? The vCPU gets paused until the
>> user responds to the vm_event request, so where would it perform the
>> prefetch again? Also, with this change I've verified that the repeated
>> SMC issue no longer presents itself (it has been triggered quite
>> regularly on the hikey before).
>
>
> The ARM ARM provides a set of expected behavior and where to call the cache
> maintenance instruction. If it is not done correctly, it may not affect your
> platform but at some point in the future (or even already today) it will
> break a platform. I wish you good luck to debug that when it happens. It is
> a really nightmare :).
>
> So what I care the most is what is the correct behavior based on the ARM
> ARM. A good overview can be found in a talk made by Mark Rutland [1].
>
> What I was concerned about is the instruction cache been shared between
> multiple processors (for instance L2 cache or higher). A vCPU could also get
> rescheduled to another processor afterwards. Leading to accessing an out of
> date instruction cache.

 I see, yea, in that case the instruction may still trigger regardless. Sigh.

>
>>
>> Also, for multi-vCPU guest another vCPU fetching the SMC before the
>> memory modification happen (ie. just after this flush) is not a
>> problem and is expected behavior. Providing a non-racy setup for
>> trapping on multi-vCPU guests requires other work (like altp2m).
>
>
> I am not that concerned about the SMC itself, but the fact that you may
> modify the guest memory whilst it is running. So another vCPU may end up to
> execute a mix between new and old instructions depending of the state of its
> instruction cache. That would result to a potential undefined behavior.
>
> Also, you want to make sure that if you write another SMC in memory, it is
> effectively affecting all the vCPUs now and not a moment after.

Yeap.

>
> So I still think the invalidate_icache should be done afterwards. IHMO
> modifying guest instructions while other vCPU are running is very fragile as
> other thread may execute the instructions your are running.

I see your point, just got confused because the return from
monitor_traps is not the return from the user. That just sends off the
notification to the user. The actual return happens elsewhere once the
user replies. That would be the point where the flush should happen.

>
>>
>>>
>>> Furthermore, the instruction cache may not snoop the data cache. So you
>>> have
>>> to ensure the data written reached the memory. Otherwise you may read the
>>> previous instruction. Where is it done? If you expect the monitor app to
>>> flush the data cache, then please comment it.
>>
>>
>> AFAICT there is no public API for the user to call to request flushing
>> the caches like that. Memory could get changed by the user any time
>> under the VM, not just during event callbacks. So I'm not sure where
>> that comment should be added.
>
>
> If the monitor app can modify the guest memory at any time. Then you need to
> provide an interface to clean the data cache + invalidate the instruction
> cache. Otherwise the guest may execute stale instructions.
>
> An hypercall might not be necessary because I think (this need to be check)
> that you could do execute both cache flush from the monitor app.

True! I'll check what happens if I execute "ic ialluis" in the monitor
app instead of Xen. If the monitor application can do its own cache
maintenance than that would be the best option.

> So I would document this in the vm_event header, or in a document explaining
> the good practices to implement a monitor app.
>
> OOI, when you are modifying the guest memory, do you pause all the vCPUs?

Yes, the initial SMC injection happens with all vCPUs paused. This was
also just a test-case for single-vCPU systems as removing/adding SMCs
into memory with a shared p2m view across vCPUs is known to be racy
without pausing all vCPUs.

>
>>
>>>
>>> Lastly, this looks like to me that all the traps will have this issue. So
>>> maybe this should go in monitor_traps instead?
>>
>>
>> Memory traps do not have this issue AFAICT as they prohibit the CPU
>> from fetching the memory to begin with, so the caches wouldn't have
>> stale contents.
>
>
> I was not speaking about memory traps but potential other system register
> traps (TTBR,...) that would be added in the future.

Gotcha.

>
> Regarding the memory trap. I think you can get into the same problem if you
> decide to forbid read-write but allow instruction fetch. You would need some
> cache maintenance instruction here if you decide to modify the memory.

I guess cache maintenance should happen every time the memory is
modified. The best place for that is where the user actually does the
modification so I'll check whether that's possible. If so, then that
simplifies things on the Xen side.

Thanks!
Tamas
Tamas Lengyel Jan. 26, 2017, 7:02 p.m. UTC | #6
On Thu, Jan 26, 2017 at 10:54 AM, Tamas K Lengyel
<tamas.lengyel@zentific.com> wrote:
> On Thu, Jan 26, 2017 at 4:30 AM, Julien Grall <julien.grall@arm.com> wrote:
>> (CC xen-devel, Ravzan, and Stefao)
>>
>> Hi Tamas,
>>
>> Not sure why you only CC me on the answer. I have CCed again xen-devel as I
>> don't see any sensible discussion in it.
>>
>> On 26/01/2017 00:11, Tamas K Lengyel wrote:
>>>
>>> On Wed, Jan 25, 2017 at 3:41 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> Hi Tamas,
>>>>
>>>> On 25/01/2017 20:02, Tamas K Lengyel wrote:
>>>>>
>>>>>
>>>>> During an SMC trap it is possible that the user may change the memory
>>>>
>>>>
>>>>
>>>> By user, do you mean the monitor application?
>>>
>>>
>>> Yes.
>>
>>
>> It would be worth clarifying in the commit message.
>>
>>
>>>
>>>>
>>>>> from where the SMC was fetched. However, without flushing the icache
>>>>> the SMC may still trigger if the pCPU was idle during the trap. Flush
>>>>> the icache to ensure consistency.
>>>>
>>>>
>>>>
>>>> invalidate_icache will invalidate the cache to PoU on all the pCPUs. But
>>>> here you only mention a problem on the current pCPU. So which behavior do
>>>> you want to achieve?
>>>
>>>
>>> It would be sufficient to flush the icache on the specific pCPU that
>>> trapped with the SMC. Didn't see anything defined in Xen to achieve
>>> that.
>>>
>>>>
>>>>>
>>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>>>> ---
>>>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>>>> Cc: Julien Grall <julien.grall@arm.com>
>>>>> ---
>>>>>  xen/arch/arm/monitor.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
>>>>> index 59ce8f635f..ae1b97993f 100644
>>>>> --- a/xen/arch/arm/monitor.c
>>>>> +++ b/xen/arch/arm/monitor.c
>>>>> @@ -63,6 +63,9 @@ int monitor_smc(void)
>>>>>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
>>>>>      };
>>>>>
>>>>> +    /* Nuke the icache as the memory may get changed underneath us. */
>>>>> +    invalidate_icache();
>>>>> +
>>>>
>>>>
>>>>
>>>> Can you explain why you put this call before the monitor trap and not
>>>> after?
>>>> From my understanding the monitoring application may change the memory.
>>>> But
>>>> by the time you modify the instruction, the instruction cache may have
>>>> prefetched the previous instruction. So the problem is still there.
>>>
>>>
>>> Not sure how would that happen exactly? The vCPU gets paused until the
>>> user responds to the vm_event request, so where would it perform the
>>> prefetch again? Also, with this change I've verified that the repeated
>>> SMC issue no longer presents itself (it has been triggered quite
>>> regularly on the hikey before).
>>
>>
>> The ARM ARM provides a set of expected behavior and where to call the cache
>> maintenance instruction. If it is not done correctly, it may not affect your
>> platform but at some point in the future (or even already today) it will
>> break a platform. I wish you good luck to debug that when it happens. It is
>> a really nightmare :).
>>
>> So what I care the most is what is the correct behavior based on the ARM
>> ARM. A good overview can be found in a talk made by Mark Rutland [1].
>>
>> What I was concerned about is the instruction cache been shared between
>> multiple processors (for instance L2 cache or higher). A vCPU could also get
>> rescheduled to another processor afterwards. Leading to accessing an out of
>> date instruction cache.
>
>  I see, yea, in that case the instruction may still trigger regardless. Sigh.
>
>>
>>>
>>> Also, for multi-vCPU guest another vCPU fetching the SMC before the
>>> memory modification happen (ie. just after this flush) is not a
>>> problem and is expected behavior. Providing a non-racy setup for
>>> trapping on multi-vCPU guests requires other work (like altp2m).
>>
>>
>> I am not that concerned about the SMC itself, but the fact that you may
>> modify the guest memory whilst it is running. So another vCPU may end up to
>> execute a mix between new and old instructions depending of the state of its
>> instruction cache. That would result to a potential undefined behavior.
>>
>> Also, you want to make sure that if you write another SMC in memory, it is
>> effectively affecting all the vCPUs now and not a moment after.
>
> Yeap.
>
>>
>> So I still think the invalidate_icache should be done afterwards. IHMO
>> modifying guest instructions while other vCPU are running is very fragile as
>> other thread may execute the instructions your are running.
>
> I see your point, just got confused because the return from
> monitor_traps is not the return from the user. That just sends off the
> notification to the user. The actual return happens elsewhere once the
> user replies. That would be the point where the flush should happen.
>
>>
>>>
>>>>
>>>> Furthermore, the instruction cache may not snoop the data cache. So you
>>>> have
>>>> to ensure the data written reached the memory. Otherwise you may read the
>>>> previous instruction. Where is it done? If you expect the monitor app to
>>>> flush the data cache, then please comment it.
>>>
>>>
>>> AFAICT there is no public API for the user to call to request flushing
>>> the caches like that. Memory could get changed by the user any time
>>> under the VM, not just during event callbacks. So I'm not sure where
>>> that comment should be added.
>>
>>
>> If the monitor app can modify the guest memory at any time. Then you need to
>> provide an interface to clean the data cache + invalidate the instruction
>> cache. Otherwise the guest may execute stale instructions.
>>
>> An hypercall might not be necessary because I think (this need to be check)
>> that you could do execute both cache flush from the monitor app.
>
> True! I'll check what happens if I execute "ic ialluis" in the monitor
> app instead of Xen. If the monitor application can do its own cache
> maintenance than that would be the best option.
>

So according to the manual (C5.3.9-10) IALLU and IALLUIS are not
available from EL0. IVAU can be enabled to execute in EL0; however on
my 4.1 kernel it generates a permission fault. I'm also not sure if
IVAU would actually be a solution. So IMHO a hypercall would be a good
way to enable safe flushing of the caches while all vCPUs of the
target domain are paused. Doing so only in a vm_event reply may be
insufficient as the memory could be changed outside of vm_event
callbacks. Thoughts?

Tamas
Stefano Stabellini Jan. 27, 2017, 7:27 p.m. UTC | #7
On Thu, 26 Jan 2017, Tamas K Lengyel wrote:
> On Thu, Jan 26, 2017 at 10:54 AM, Tamas K Lengyel
> <tamas.lengyel@zentific.com> wrote:
> > On Thu, Jan 26, 2017 at 4:30 AM, Julien Grall <julien.grall@arm.com> wrote:
> >> (CC xen-devel, Ravzan, and Stefao)
> >>
> >> Hi Tamas,
> >>
> >> Not sure why you only CC me on the answer. I have CCed again xen-devel as I
> >> don't see any sensible discussion in it.
> >>
> >> On 26/01/2017 00:11, Tamas K Lengyel wrote:
> >>>
> >>> On Wed, Jan 25, 2017 at 3:41 PM, Julien Grall <julien.grall@arm.com>
> >>> wrote:
> >>>>
> >>>> Hi Tamas,
> >>>>
> >>>> On 25/01/2017 20:02, Tamas K Lengyel wrote:
> >>>>>
> >>>>>
> >>>>> During an SMC trap it is possible that the user may change the memory
> >>>>
> >>>>
> >>>>
> >>>> By user, do you mean the monitor application?
> >>>
> >>>
> >>> Yes.
> >>
> >>
> >> It would be worth clarifying in the commit message.
> >>
> >>
> >>>
> >>>>
> >>>>> from where the SMC was fetched. However, without flushing the icache
> >>>>> the SMC may still trigger if the pCPU was idle during the trap. Flush
> >>>>> the icache to ensure consistency.
> >>>>
> >>>>
> >>>>
> >>>> invalidate_icache will invalidate the cache to PoU on all the pCPUs. But
> >>>> here you only mention a problem on the current pCPU. So which behavior do
> >>>> you want to achieve?
> >>>
> >>>
> >>> It would be sufficient to flush the icache on the specific pCPU that
> >>> trapped with the SMC. Didn't see anything defined in Xen to achieve
> >>> that.
> >>>
> >>>>
> >>>>>
> >>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
> >>>>> ---
> >>>>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> >>>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
> >>>>> Cc: Julien Grall <julien.grall@arm.com>
> >>>>> ---
> >>>>>  xen/arch/arm/monitor.c | 3 +++
> >>>>>  1 file changed, 3 insertions(+)
> >>>>>
> >>>>> diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
> >>>>> index 59ce8f635f..ae1b97993f 100644
> >>>>> --- a/xen/arch/arm/monitor.c
> >>>>> +++ b/xen/arch/arm/monitor.c
> >>>>> @@ -63,6 +63,9 @@ int monitor_smc(void)
> >>>>>          .reason = VM_EVENT_REASON_PRIVILEGED_CALL
> >>>>>      };
> >>>>>
> >>>>> +    /* Nuke the icache as the memory may get changed underneath us. */
> >>>>> +    invalidate_icache();
> >>>>> +
> >>>>
> >>>>
> >>>>
> >>>> Can you explain why you put this call before the monitor trap and not
> >>>> after?
> >>>> From my understanding the monitoring application may change the memory.
> >>>> But
> >>>> by the time you modify the instruction, the instruction cache may have
> >>>> prefetched the previous instruction. So the problem is still there.
> >>>
> >>>
> >>> Not sure how would that happen exactly? The vCPU gets paused until the
> >>> user responds to the vm_event request, so where would it perform the
> >>> prefetch again? Also, with this change I've verified that the repeated
> >>> SMC issue no longer presents itself (it has been triggered quite
> >>> regularly on the hikey before).
> >>
> >>
> >> The ARM ARM provides a set of expected behavior and where to call the cache
> >> maintenance instruction. If it is not done correctly, it may not affect your
> >> platform but at some point in the future (or even already today) it will
> >> break a platform. I wish you good luck to debug that when it happens. It is
> >> a really nightmare :).
> >>
> >> So what I care the most is what is the correct behavior based on the ARM
> >> ARM. A good overview can be found in a talk made by Mark Rutland [1].
> >>
> >> What I was concerned about is the instruction cache been shared between
> >> multiple processors (for instance L2 cache or higher). A vCPU could also get
> >> rescheduled to another processor afterwards. Leading to accessing an out of
> >> date instruction cache.
> >
> >  I see, yea, in that case the instruction may still trigger regardless. Sigh.
> >
> >>
> >>>
> >>> Also, for multi-vCPU guest another vCPU fetching the SMC before the
> >>> memory modification happen (ie. just after this flush) is not a
> >>> problem and is expected behavior. Providing a non-racy setup for
> >>> trapping on multi-vCPU guests requires other work (like altp2m).
> >>
> >>
> >> I am not that concerned about the SMC itself, but the fact that you may
> >> modify the guest memory whilst it is running. So another vCPU may end up to
> >> execute a mix between new and old instructions depending of the state of its
> >> instruction cache. That would result to a potential undefined behavior.
> >>
> >> Also, you want to make sure that if you write another SMC in memory, it is
> >> effectively affecting all the vCPUs now and not a moment after.
> >
> > Yeap.
> >
> >>
> >> So I still think the invalidate_icache should be done afterwards. IHMO
> >> modifying guest instructions while other vCPU are running is very fragile as
> >> other thread may execute the instructions your are running.
> >
> > I see your point, just got confused because the return from
> > monitor_traps is not the return from the user. That just sends off the
> > notification to the user. The actual return happens elsewhere once the
> > user replies. That would be the point where the flush should happen.
> >
> >>
> >>>
> >>>>
> >>>> Furthermore, the instruction cache may not snoop the data cache. So you
> >>>> have
> >>>> to ensure the data written reached the memory. Otherwise you may read the
> >>>> previous instruction. Where is it done? If you expect the monitor app to
> >>>> flush the data cache, then please comment it.
> >>>
> >>>
> >>> AFAICT there is no public API for the user to call to request flushing
> >>> the caches like that. Memory could get changed by the user any time
> >>> under the VM, not just during event callbacks. So I'm not sure where
> >>> that comment should be added.
> >>
> >>
> >> If the monitor app can modify the guest memory at any time. Then you need to
> >> provide an interface to clean the data cache + invalidate the instruction
> >> cache. Otherwise the guest may execute stale instructions.
> >>
> >> An hypercall might not be necessary because I think (this need to be check)
> >> that you could do execute both cache flush from the monitor app.
> >
> > True! I'll check what happens if I execute "ic ialluis" in the monitor
> > app instead of Xen. If the monitor application can do its own cache
> > maintenance than that would be the best option.
> >
> 
> So according to the manual (C5.3.9-10) IALLU and IALLUIS are not
> available from EL0. IVAU can be enabled to execute in EL0; however on
> my 4.1 kernel it generates a permission fault. I'm also not sure if
> IVAU would actually be a solution.

I think there is a Linux syscall available for this kind of things,
called cacheflush. See arch/arm/kernel/traps.c:arm_syscall.
Julien Grall Jan. 27, 2017, 8:43 p.m. UTC | #8
Hi Stefano,

On 27/01/2017 19:27, Stefano Stabellini wrote:
> On Thu, 26 Jan 2017, Tamas K Lengyel wrote:
>> On Thu, Jan 26, 2017 at 10:54 AM, Tamas K Lengyel
>> <tamas.lengyel@zentific.com> wrote:
>>> On Thu, Jan 26, 2017 at 4:30 AM, Julien Grall <julien.grall@arm.com> wrote:
>> So according to the manual (C5.3.9-10) IALLU and IALLUIS are not
>> available from EL0. IVAU can be enabled to execute in EL0; however on
>> my 4.1 kernel it generates a permission fault. I'm also not sure if
>> IVAU would actually be a solution.
>
> I think there is a Linux syscall available for this kind of things,
> called cacheflush. See arch/arm/kernel/traps.c:arm_syscall.

The cacheflush system is only implement on ARM32. And even thought, we 
have a domctl cacheflush for this purpose.

Cheers,
diff mbox

Patch

diff --git a/xen/arch/arm/monitor.c b/xen/arch/arm/monitor.c
index 59ce8f635f..ae1b97993f 100644
--- a/xen/arch/arm/monitor.c
+++ b/xen/arch/arm/monitor.c
@@ -63,6 +63,9 @@  int monitor_smc(void)
         .reason = VM_EVENT_REASON_PRIVILEGED_CALL
     };
 
+    /* Nuke the icache as the memory may get changed underneath us. */
+    invalidate_icache();
+
     return monitor_traps(current, 1, &req);
 }