diff mbox

x86/hvm: simplify emulation triggered by vm_event response

Message ID 1454588852-5389-1-git-send-email-rcojocaru@bitdefender.com
State New, archived
Headers show

Commit Message

Razvan Cojocaru Feb. 4, 2016, 12:27 p.m. UTC
Currently, after receiving a vm_event reply requesting emulation,
the actual emulation is triggered in p2m_mem_access_check(),
which means that we're waiting for the page fault to occur again
before emulating. Aside from the performance impact, this
complicates the code since between hvm_do_resume() and the second
page fault it is possible that the latter becomes a completely
new page fault - hence checking that EIP and the GPA match with
the ones in the original page fault. If they don't, duplicate
EPT fault vm_events will occur, of which a monitoring application
needs to be aware.
This patch makes struct arch_vm_event smaller (since we no longer
need to track eip and gpa), removes the checking code from
p2m_mem_access_check(), and moves the emulation in hvm_do_resume().

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
---
 xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
 xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
 xen/include/asm-x86/vm_event.h |  2 --
 3 files changed, 17 insertions(+), 36 deletions(-)

Comments

Andrew Cooper Feb. 4, 2016, 12:36 p.m. UTC | #1
On 04/02/16 12:27, Razvan Cojocaru wrote:
> Currently, after receiving a vm_event reply requesting emulation,
> the actual emulation is triggered in p2m_mem_access_check(),
> which means that we're waiting for the page fault to occur again
> before emulating.

Presumably this means that we re-enter the guest and exit immediately
for (hopefully) the same violation?

>  Aside from the performance impact, this
> complicates the code since between hvm_do_resume() and the second
> page fault it is possible that the latter becomes a completely
> new page fault - hence checking that EIP and the GPA match with
> the ones in the original page fault.

Presumably this occurs when we injected an event on the vmentry?

>  If they don't, duplicate
> EPT fault vm_events will occur, of which a monitoring application
> needs to be aware.
> This patch makes struct arch_vm_event smaller (since we no longer
> need to track eip and gpa), removes the checking code from
> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>
> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
> ---
>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>  xen/include/asm-x86/vm_event.h |  2 --
>  3 files changed, 17 insertions(+), 36 deletions(-)

Gotta love that diffstat!

The logic makes sense, so Acked-by: Andrew Cooper
<andrew.cooper3@citrix.com> for the x86-related nature, but it would be
nice to have a review from Tamas for the vm_event side of things.
Razvan Cojocaru Feb. 4, 2016, 12:52 p.m. UTC | #2
On 02/04/2016 02:36 PM, Andrew Cooper wrote:
> On 04/02/16 12:27, Razvan Cojocaru wrote:
>> Currently, after receiving a vm_event reply requesting emulation,
>> the actual emulation is triggered in p2m_mem_access_check(),
>> which means that we're waiting for the page fault to occur again
>> before emulating.
> 
> Presumably this means that we re-enter the guest and exit immediately
> for (hopefully) the same violation?

Yes, something along those lines: the original page fault occurs, a
vm_event is being sent to the application, which replies with the
EMULATE flag set (but does not lift the page restrictions). Now we're
hoping that the same instruction that has caused the first page fault
runs, which triggers a new page fault (thus a new call of
p2m_mem_access_check()). But in p2m_mem_access_check() we check to see
if emulate flags are set for the current VCPU, and if they are, we check
to see that the instruction is really the one that has caused the
original page fault, and if it is (i.e. both EIP and GPA match), we emulate.

The ideal case is the one where the second page fault is being caused by
the same instruction hitting the same page, and that happens most of the
time, but it unfortunately does not happen all of the time.

So when the second page fault is _not_ caused by the same instruction,
we just reset the emulate flags and carry on with regular processing,
which means that a new vm_event will be sent out about this new page
fault. But even though the application has reuquested that the page
fault that has triggered the last page fault be emulated, it wasn't (as
a design limitation). So now, when / if the old instruction hits the
page again, it will be received by the monitoring application as a new
hit, not still the old, unemulated one.

There are safeguards possible for this in the monitoring application,
but they too have limitations, and it is ultimately less efficient and
more error-prone that the alternative hopefully is.

>>  Aside from the performance impact, this
>> complicates the code since between hvm_do_resume() and the second
>> page fault it is possible that the latter becomes a completely
>> new page fault - hence checking that EIP and the GPA match with
>> the ones in the original page fault.
> 
> Presumably this occurs when we injected an event on the vmentry?

Any asynchronous-type cause of a page fault will do, hopefully I've been
able to explain somewhat above.

>>  If they don't, duplicate
>> EPT fault vm_events will occur, of which a monitoring application
>> needs to be aware.
>> This patch makes struct arch_vm_event smaller (since we no longer
>> need to track eip and gpa), removes the checking code from
>> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>>
>> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> ---
>>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>>  xen/include/asm-x86/vm_event.h |  2 --
>>  3 files changed, 17 insertions(+), 36 deletions(-)
> 
> Gotta love that diffstat!
> 
> The logic makes sense, so Acked-by: Andrew Cooper
> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
> nice to have a review from Tamas for the vm_event side of things.

Thanks! Of course. Hopefully Tamas will like this, based on a
conversation we've had a few days ago here.


Thanks,
Razvan
Razvan Cojocaru Feb. 8, 2016, 9:12 a.m. UTC | #3
On 02/04/2016 02:52 PM, Razvan Cojocaru wrote:
> On 02/04/2016 02:36 PM, Andrew Cooper wrote:
>> On 04/02/16 12:27, Razvan Cojocaru wrote:
>>> Currently, after receiving a vm_event reply requesting emulation,
>>> the actual emulation is triggered in p2m_mem_access_check(),
>>> which means that we're waiting for the page fault to occur again
>>> before emulating.
>>
>> Presumably this means that we re-enter the guest and exit immediately
>> for (hopefully) the same violation?
> 
> Yes, something along those lines: the original page fault occurs, a
> vm_event is being sent to the application, which replies with the
> EMULATE flag set (but does not lift the page restrictions). Now we're
> hoping that the same instruction that has caused the first page fault
> runs, which triggers a new page fault (thus a new call of
> p2m_mem_access_check()). But in p2m_mem_access_check() we check to see
> if emulate flags are set for the current VCPU, and if they are, we check
> to see that the instruction is really the one that has caused the
> original page fault, and if it is (i.e. both EIP and GPA match), we emulate.
> 
> The ideal case is the one where the second page fault is being caused by
> the same instruction hitting the same page, and that happens most of the
> time, but it unfortunately does not happen all of the time.
> 
> So when the second page fault is _not_ caused by the same instruction,
> we just reset the emulate flags and carry on with regular processing,
> which means that a new vm_event will be sent out about this new page
> fault. But even though the application has reuquested that the page
> fault that has triggered the last page fault be emulated, it wasn't (as
> a design limitation). So now, when / if the old instruction hits the
> page again, it will be received by the monitoring application as a new
> hit, not still the old, unemulated one.
> 
> There are safeguards possible for this in the monitoring application,
> but they too have limitations, and it is ultimately less efficient and
> more error-prone that the alternative hopefully is.
> 
>>>  Aside from the performance impact, this
>>> complicates the code since between hvm_do_resume() and the second
>>> page fault it is possible that the latter becomes a completely
>>> new page fault - hence checking that EIP and the GPA match with
>>> the ones in the original page fault.
>>
>> Presumably this occurs when we injected an event on the vmentry?
> 
> Any asynchronous-type cause of a page fault will do, hopefully I've been
> able to explain somewhat above.
> 
>>>  If they don't, duplicate
>>> EPT fault vm_events will occur, of which a monitoring application
>>> needs to be aware.
>>> This patch makes struct arch_vm_event smaller (since we no longer
>>> need to track eip and gpa), removes the checking code from
>>> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>>>
>>> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> ---
>>>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>>>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>>>  xen/include/asm-x86/vm_event.h |  2 --
>>>  3 files changed, 17 insertions(+), 36 deletions(-)
>>
>> Gotta love that diffstat!
>>
>> The logic makes sense, so Acked-by: Andrew Cooper
>> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
>> nice to have a review from Tamas for the vm_event side of things.
> 
> Thanks! Of course. Hopefully Tamas will like this, based on a
> conversation we've had a few days ago here.

I've changed Tamas' address to the @novetta one, I think he might not be
using the older @zentific one.
Tamas K Lengyel Feb. 8, 2016, 5:03 p.m. UTC | #4
> >> The logic makes sense, so Acked-by: Andrew Cooper
> >> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
> >> nice to have a review from Tamas for the vm_event side of things.
> >
> > Thanks! Of course. Hopefully Tamas will like this, based on a
> > conversation we've had a few days ago here.
>

Yes, it's a step in the right direction!


>
> I've changed Tamas' address to the @novetta one, I think he might not be
> using the older @zentific one.
>

Thanks, either of these emails is actually fine but for Xen related things
tamas@tklengyel.com (as in the Maintainers file) works best as that's where
I have the proper filtering rules setup as to avoid getting mails like this
lost in the inbox.

Tamas
Tamas K Lengyel Feb. 8, 2016, 5:23 p.m. UTC | #5
On Thu, Feb 4, 2016 at 5:27 AM, Razvan Cojocaru <rcojocaru@bitdefender.com>
wrote:

> Currently, after receiving a vm_event reply requesting emulation,
> the actual emulation is triggered in p2m_mem_access_check(),
> which means that we're waiting for the page fault to occur again
> before emulating. Aside from the performance impact, this
> complicates the code since between hvm_do_resume() and the second
> page fault it is possible that the latter becomes a completely
> new page fault - hence checking that EIP and the GPA match with
> the ones in the original page fault. If they don't, duplicate
> EPT fault vm_events will occur, of which a monitoring application
> needs to be aware.
> This patch makes struct arch_vm_event smaller (since we no longer
> need to track eip and gpa), removes the checking code from
> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>
> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>

Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
diff mbox

Patch

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 35ec6c9..930d0e3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -552,6 +552,23 @@  void hvm_do_resume(struct vcpu *v)
     {
         struct monitor_write_data *w = &v->arch.vm_event->write_data;
 
+        if ( v->arch.vm_event->emulate_flags )
+        {
+            enum emul_kind kind = EMUL_KIND_NORMAL;
+
+            if ( v->arch.vm_event->emulate_flags &
+                 VM_EVENT_FLAG_SET_EMUL_READ_DATA )
+                kind = EMUL_KIND_SET_CONTEXT;
+            else if ( v->arch.vm_event->emulate_flags &
+                      VM_EVENT_FLAG_EMULATE_NOWRITE )
+                kind = EMUL_KIND_NOWRITE;
+
+            hvm_mem_access_emulate_one(kind, TRAP_invalid_op,
+                                       HVM_DELIVER_NO_ERROR_CODE);
+
+            v->arch.vm_event->emulate_flags = 0;
+        }
+
         if ( w->do_write.msr )
         {
             hvm_msr_write_intercept(w->msr, w->value, 0);
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index a45ee35..47e7fad 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1639,7 +1639,6 @@  bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     p2m_access_t p2ma;
     vm_event_request_t *req;
     int rc;
-    unsigned long eip = guest_cpu_user_regs()->eip;
 
     if ( altp2m_active(d) )
         p2m = p2m_get_altp2m(v);
@@ -1698,39 +1697,6 @@  bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         }
     }
 
-    /* The previous vm_event reply does not match the current state. */
-    if ( unlikely(v->arch.vm_event) &&
-         (v->arch.vm_event->gpa != gpa || v->arch.vm_event->eip != eip) )
-    {
-        /* Don't emulate the current instruction, send a new vm_event. */
-        v->arch.vm_event->emulate_flags = 0;
-
-        /*
-         * Make sure to mark the current state to match it again against
-         * the new vm_event about to be sent.
-         */
-        v->arch.vm_event->gpa = gpa;
-        v->arch.vm_event->eip = eip;
-    }
-
-    if ( unlikely(v->arch.vm_event) && v->arch.vm_event->emulate_flags )
-    {
-        enum emul_kind kind = EMUL_KIND_NORMAL;
-
-        if ( v->arch.vm_event->emulate_flags &
-             VM_EVENT_FLAG_SET_EMUL_READ_DATA )
-            kind = EMUL_KIND_SET_CONTEXT;
-        else if ( v->arch.vm_event->emulate_flags &
-                  VM_EVENT_FLAG_EMULATE_NOWRITE )
-            kind = EMUL_KIND_NOWRITE;
-
-        hvm_mem_access_emulate_one(kind, TRAP_invalid_op,
-                                   HVM_DELIVER_NO_ERROR_CODE);
-
-        v->arch.vm_event->emulate_flags = 0;
-        return 1;
-    }
-
     *req_ptr = NULL;
     req = xzalloc(vm_event_request_t);
     if ( req )
diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
index 5aff834..fff8326 100644
--- a/xen/include/asm-x86/vm_event.h
+++ b/xen/include/asm-x86/vm_event.h
@@ -28,8 +28,6 @@ 
  */
 struct arch_vm_event {
     uint32_t emulate_flags;
-    unsigned long gpa;
-    unsigned long eip;
     struct vm_event_emul_read_data emul_read_data;
     struct monitor_write_data write_data;
 };