diff mbox

[1/2] KVM: X86: remove read buffer for mmio read

Message ID 4FFA9E16.10001@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Xiao Guangrong July 9, 2012, 9:02 a.m. UTC
After commit f78146b0f9230765c6315b2e14f56112513389ad:

 KVM: Fix page-crossing MMIO

    MMIO that are split across a page boundary are currently broken - the
    code does not expect to be aborted by the exit to userspace for the
    first MMIO fragment.

    This patch fixes the problem by generalizing the current code for handling
    16-byte MMIOs to handle a number of "fragments", and changes the MMIO
    code to create those fragments.

    Signed-off-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Multiple MMIO reads can be merged into mmio_fragments, the read buffer is not
needed anymore

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_emulate.h |    1 -
 arch/x86/kvm/emulate.c             |   43 ++++-------------------------------
 arch/x86/kvm/x86.c                 |    2 -
 3 files changed, 5 insertions(+), 41 deletions(-)

Comments

Avi Kivity July 9, 2012, 11:15 a.m. UTC | #1
On 07/09/2012 12:02 PM, Xiao Guangrong wrote:
> After commit f78146b0f9230765c6315b2e14f56112513389ad:
> 
>  KVM: Fix page-crossing MMIO
> 
>     MMIO that are split across a page boundary are currently broken - the
>     code does not expect to be aborted by the exit to userspace for the
>     first MMIO fragment.
> 
>     This patch fixes the problem by generalizing the current code for handling
>     16-byte MMIOs to handle a number of "fragments", and changes the MMIO
>     code to create those fragments.
> 
>     Signed-off-by: Avi Kivity <avi@redhat.com>
>     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> Multiple MMIO reads can be merged into mmio_fragments, the read buffer is not
> needed anymore
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  arch/x86/include/asm/kvm_emulate.h |    1 -
>  arch/x86/kvm/emulate.c             |   43 ++++-------------------------------
>  arch/x86/kvm/x86.c                 |    2 -
>  3 files changed, 5 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index 1ac46c22..339d7c6 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -286,7 +286,6 @@ struct x86_emulate_ctxt {
>  	struct operand *memopp;
>  	struct fetch_cache fetch;
>  	struct read_cache io_read;
> -	struct read_cache mem_read;
>  };
> 

Suppose we have a RMW instruction.  On the first entry to
x86_emulate_insn() we'll drop to userspace and perform the read, and the
seconds we'll read from the cache and complete the write.

Without the read cache this cannot work.

kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?

I agree this code has to go, but it needs to be replaced by something.
Maybe a .valid flag in struct operand.
Gleb Natapov July 9, 2012, 11:23 a.m. UTC | #2
On Mon, Jul 09, 2012 at 02:15:37PM +0300, Avi Kivity wrote:
> On 07/09/2012 12:02 PM, Xiao Guangrong wrote:
> > After commit f78146b0f9230765c6315b2e14f56112513389ad:
> > 
> >  KVM: Fix page-crossing MMIO
> > 
> >     MMIO that are split across a page boundary are currently broken - the
> >     code does not expect to be aborted by the exit to userspace for the
> >     first MMIO fragment.
> > 
> >     This patch fixes the problem by generalizing the current code for handling
> >     16-byte MMIOs to handle a number of "fragments", and changes the MMIO
> >     code to create those fragments.
> > 
> >     Signed-off-by: Avi Kivity <avi@redhat.com>
> >     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > Multiple MMIO reads can be merged into mmio_fragments, the read buffer is not
> > needed anymore
> > 
> > Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> > ---
> >  arch/x86/include/asm/kvm_emulate.h |    1 -
> >  arch/x86/kvm/emulate.c             |   43 ++++-------------------------------
> >  arch/x86/kvm/x86.c                 |    2 -
> >  3 files changed, 5 insertions(+), 41 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> > index 1ac46c22..339d7c6 100644
> > --- a/arch/x86/include/asm/kvm_emulate.h
> > +++ b/arch/x86/include/asm/kvm_emulate.h
> > @@ -286,7 +286,6 @@ struct x86_emulate_ctxt {
> >  	struct operand *memopp;
> >  	struct fetch_cache fetch;
> >  	struct read_cache io_read;
> > -	struct read_cache mem_read;
> >  };
> > 
> 
> Suppose we have a RMW instruction.  On the first entry to
> x86_emulate_insn() we'll drop to userspace and perform the read, and the
> seconds we'll read from the cache and complete the write.
> 
> Without the read cache this cannot work.
> 
Cache is needed to emulate instructions that need more than one read
that can go to MMIO.

> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> 
> I agree this code has to go, but it needs to be replaced by something.
> Maybe a .valid flag in struct operand.
> 
Valid will not enough for that.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 9, 2012, 12:48 p.m. UTC | #3
On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> 
>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
>> 
>> I agree this code has to go, but it needs to be replaced by something.
>> Maybe a .valid flag in struct operand.
>> 
> Valid will not enough for that.

If me make everything go through operands, why not?
Avi Kivity July 9, 2012, 12:49 p.m. UTC | #4
On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> 
>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
>> 
>> I agree this code has to go, but it needs to be replaced by something.
>> Maybe a .valid flag in struct operand.
>> 
> Valid will not enough for that.

If we make everything go through operands, any reason why not?
Xiao Guangrong July 9, 2012, 1:23 p.m. UTC | #5
On 07/09/2012 08:49 PM, Avi Kivity wrote:
> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
>>
>>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
>>>
>>> I agree this code has to go, but it needs to be replaced by something.
>>> Maybe a .valid flag in struct operand.
>>>
>> Valid will not enough for that.
> 
> If we make everything go through operands, any reason why not?
> 

I noticed some instructions need to read ESP for many times (e.g, iret_real),
maybe .valid flag is not enough for this case if the stack is in MMIO, yes?

IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
If the stack located in mmio region, this kind of instruct will be broken, i know no
guest will use mmio as stack but SDM does not limit it, is it valid?




--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 9, 2012, 1:26 p.m. UTC | #6
On Mon, Jul 09, 2012 at 03:49:05PM +0300, Avi Kivity wrote:
> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> > 
> >> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> >> 
> >> I agree this code has to go, but it needs to be replaced by something.
> >> Maybe a .valid flag in struct operand.
> >> 
> > Valid will not enough for that.
> 
> If we make everything go through operands, any reason why not?
> 
Ah I missed operand part. Thought about adding valid to x86.c mmio read
buffer. What about doing more complex things from MMIO, like task switch?

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 9, 2012, 1:26 p.m. UTC | #7
On Mon, Jul 09, 2012 at 09:23:03PM +0800, Xiao Guangrong wrote:
> On 07/09/2012 08:49 PM, Avi Kivity wrote:
> > On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> >>
> >>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> >>>
> >>> I agree this code has to go, but it needs to be replaced by something.
> >>> Maybe a .valid flag in struct operand.
> >>>
> >> Valid will not enough for that.
> > 
> > If we make everything go through operands, any reason why not?
> > 
> 
> I noticed some instructions need to read ESP for many times (e.g, iret_real),
> maybe .valid flag is not enough for this case if the stack is in MMIO, yes?
> 
> IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
> If the stack located in mmio region, this kind of instruct will be broken, i know no
> guest will use mmio as stack but SDM does not limit it, is it valid?
> 
Good point about MMIO stack too.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 9, 2012, 1:34 p.m. UTC | #8
On 07/09/2012 04:23 PM, Xiao Guangrong wrote:
> On 07/09/2012 08:49 PM, Avi Kivity wrote:
>> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
>>>
>>>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
>>>>
>>>> I agree this code has to go, but it needs to be replaced by something.
>>>> Maybe a .valid flag in struct operand.
>>>>
>>> Valid will not enough for that.
>> 
>> If we make everything go through operands, any reason why not?
>> 
> 
> I noticed some instructions need to read ESP for many times (e.g, iret_real),
> maybe .valid flag is not enough for this case if the stack is in MMIO, yes?

Good catch.  We either have to fix it or to restrict stack operations to
regular memory (->read_std).

> IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
> If the stack located in mmio region, this kind of instruct will be broken, i know no
> guest will use mmio as stack but SDM does not limit it, is it valid?

Stack in mmio (or task switch in mmio) is architecturally valid.  We
don't have to support it if no guests do it.
Gleb Natapov July 10, 2012, 10:36 a.m. UTC | #9
On Mon, Jul 09, 2012 at 04:34:50PM +0300, Avi Kivity wrote:
> On 07/09/2012 04:23 PM, Xiao Guangrong wrote:
> > On 07/09/2012 08:49 PM, Avi Kivity wrote:
> >> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> >>>
> >>>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> >>>>
> >>>> I agree this code has to go, but it needs to be replaced by something.
> >>>> Maybe a .valid flag in struct operand.
> >>>>
> >>> Valid will not enough for that.
> >> 
> >> If we make everything go through operands, any reason why not?
> >> 
> > 
> > I noticed some instructions need to read ESP for many times (e.g, iret_real),
> > maybe .valid flag is not enough for this case if the stack is in MMIO, yes?
> 
> Good catch.  We either have to fix it or to restrict stack operations to
> regular memory (->read_std).
> 
> > IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
> > If the stack located in mmio region, this kind of instruct will be broken, i know no
> > guest will use mmio as stack but SDM does not limit it, is it valid?
> 
> Stack in mmio (or task switch in mmio) is architecturally valid.  We
> don't have to support it if no guests do it.
> 
But the code is already here, why drop it?

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 10, 2012, 10:45 a.m. UTC | #10
On 07/10/2012 01:36 PM, Gleb Natapov wrote:
> On Mon, Jul 09, 2012 at 04:34:50PM +0300, Avi Kivity wrote:
> > On 07/09/2012 04:23 PM, Xiao Guangrong wrote:
> > > On 07/09/2012 08:49 PM, Avi Kivity wrote:
> > >> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> > >>>
> > >>>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> > >>>>
> > >>>> I agree this code has to go, but it needs to be replaced by something.
> > >>>> Maybe a .valid flag in struct operand.
> > >>>>
> > >>> Valid will not enough for that.
> > >> 
> > >> If we make everything go through operands, any reason why not?
> > >> 
> > > 
> > > I noticed some instructions need to read ESP for many times (e.g, iret_real),
> > > maybe .valid flag is not enough for this case if the stack is in MMIO, yes?
> > 
> > Good catch.  We either have to fix it or to restrict stack operations to
> > regular memory (->read_std).
> > 
> > > IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
> > > If the stack located in mmio region, this kind of instruct will be broken, i know no
> > > guest will use mmio as stack but SDM does not limit it, is it valid?
> > 
> > Stack in mmio (or task switch in mmio) is architecturally valid.  We
> > don't have to support it if no guests do it.
> > 
> But the code is already here, why drop it?

The read cache is not effective for multiple disjunct reads.  The
splitting into 8-byte groups is unneeded.
Gleb Natapov July 10, 2012, 10:48 a.m. UTC | #11
On Tue, Jul 10, 2012 at 01:45:15PM +0300, Avi Kivity wrote:
> On 07/10/2012 01:36 PM, Gleb Natapov wrote:
> > On Mon, Jul 09, 2012 at 04:34:50PM +0300, Avi Kivity wrote:
> > > On 07/09/2012 04:23 PM, Xiao Guangrong wrote:
> > > > On 07/09/2012 08:49 PM, Avi Kivity wrote:
> > > >> On 07/09/2012 02:23 PM, Gleb Natapov wrote:
> > > >>>
> > > >>>> kvm-unit-tests.git has a test for xchg to mmio.  Does it still work?
> > > >>>>
> > > >>>> I agree this code has to go, but it needs to be replaced by something.
> > > >>>> Maybe a .valid flag in struct operand.
> > > >>>>
> > > >>> Valid will not enough for that.
> > > >> 
> > > >> If we make everything go through operands, any reason why not?
> > > >> 
> > > > 
> > > > I noticed some instructions need to read ESP for many times (e.g, iret_real),
> > > > maybe .valid flag is not enough for this case if the stack is in MMIO, yes?
> > > 
> > > Good catch.  We either have to fix it or to restrict stack operations to
> > > regular memory (->read_std).
> > > 
> > > > IIUC, I also noticed ESP is not reset back if it is emulated fail (mmio is needed).
> > > > If the stack located in mmio region, this kind of instruct will be broken, i know no
> > > > guest will use mmio as stack but SDM does not limit it, is it valid?
> > > 
> > > Stack in mmio (or task switch in mmio) is architecturally valid.  We
> > > don't have to support it if no guests do it.
> > > 
> > But the code is already here, why drop it?
> 
> The read cache is not effective for multiple disjunct reads.
What do you mean?

>                                                                The
> splitting into 8-byte groups is unneeded.
> 
Agree. Easy to fix.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 10, 2012, 12:50 p.m. UTC | #12
On 07/10/2012 01:48 PM, Gleb Natapov wrote:
> > > > 
> > > But the code is already here, why drop it?
> > 
> > The read cache is not effective for multiple disjunct reads.
> What do you mean?

If an instruction reads from several sources in mmio, then the first
read will be flushed from the cache by the second read.  So if we need a
third read, we'll have to exit for the first again.
Gleb Natapov July 10, 2012, 1:01 p.m. UTC | #13
On Tue, Jul 10, 2012 at 03:50:39PM +0300, Avi Kivity wrote:
> On 07/10/2012 01:48 PM, Gleb Natapov wrote:
> > > > > 
> > > > But the code is already here, why drop it?
> > > 
> > > The read cache is not effective for multiple disjunct reads.
> > What do you mean?
> 
> If an instruction reads from several sources in mmio, then the first
> read will be flushed from the cache by the second read.  So if we need a
> third read, we'll have to exit for the first again.
> 
The cache is flashed only before instruction decode, never during
emulation.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 10, 2012, 4:04 p.m. UTC | #14
On 07/10/2012 04:01 PM, Gleb Natapov wrote:
> On Tue, Jul 10, 2012 at 03:50:39PM +0300, Avi Kivity wrote:
> > On 07/10/2012 01:48 PM, Gleb Natapov wrote:
> > > > > > 
> > > > > But the code is already here, why drop it?
> > > > 
> > > > The read cache is not effective for multiple disjunct reads.
> > > What do you mean?
> > 
> > If an instruction reads from several sources in mmio, then the first
> > read will be flushed from the cache by the second read.  So if we need a
> > third read, we'll have to exit for the first again.
> > 
> The cache is flashed only before instruction decode, never during
> emulation.

Oh, I misread the code.
diff mbox

Patch

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 1ac46c22..339d7c6 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -286,7 +286,6 @@  struct x86_emulate_ctxt {
 	struct operand *memopp;
 	struct fetch_cache fetch;
 	struct read_cache io_read;
-	struct read_cache mem_read;
 };

 /* Repeat String Operation Prefix */
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index f95d242..aa455da 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1128,33 +1128,6 @@  static void fetch_bit_operand(struct x86_emulate_ctxt *ctxt)
 	ctxt->src.val &= (ctxt->dst.bytes << 3) - 1;
 }

-static int read_emulated(struct x86_emulate_ctxt *ctxt,
-			 unsigned long addr, void *dest, unsigned size)
-{
-	int rc;
-	struct read_cache *mc = &ctxt->mem_read;
-
-	while (size) {
-		int n = min(size, 8u);
-		size -= n;
-		if (mc->pos < mc->end)
-			goto read_cached;
-
-		rc = ctxt->ops->read_emulated(ctxt, addr, mc->data + mc->end, n,
-					      &ctxt->exception);
-		if (rc != X86EMUL_CONTINUE)
-			return rc;
-		mc->end += n;
-
-	read_cached:
-		memcpy(dest, mc->data + mc->pos, n);
-		mc->pos += n;
-		dest += n;
-		addr += n;
-	}
-	return X86EMUL_CONTINUE;
-}
-
 static int segmented_read(struct x86_emulate_ctxt *ctxt,
 			  struct segmented_address addr,
 			  void *data,
@@ -1166,7 +1139,9 @@  static int segmented_read(struct x86_emulate_ctxt *ctxt,
 	rc = linearize(ctxt, addr, size, false, &linear);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
-	return read_emulated(ctxt, linear, data, size);
+
+	return ctxt->ops->read_emulated(ctxt, linear, data, size,
+					&ctxt->exception);
 }

 static int segmented_write(struct x86_emulate_ctxt *ctxt,
@@ -4122,8 +4097,6 @@  int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 	int rc = X86EMUL_CONTINUE;
 	int saved_dst_type = ctxt->dst.type;

-	ctxt->mem_read.pos = 0;
-
 	if (ctxt->mode == X86EMUL_MODE_PROT64 && (ctxt->d & No64)) {
 		rc = emulate_ud(ctxt);
 		goto done;
@@ -4364,15 +4337,9 @@  writeback:
 			 * or, if it is not used, after each 1024 iteration.
 			 */
 			if ((r->end != 0 || ctxt->regs[VCPU_REGS_RCX] & 0x3ff) &&
-			    (r->end == 0 || r->end != r->pos)) {
-				/*
-				 * Reset read cache. Usually happens before
-				 * decode, but since instruction is restarted
-				 * we have to do it here.
-				 */
-				ctxt->mem_read.end = 0;
+			    (r->end == 0 || r->end != r->pos))
 				return EMULATION_RESTART;
-			}
+
 			goto done; /* skip rip writeback */
 		}
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a01a424..7445545 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4399,8 +4399,6 @@  static void init_decode_cache(struct x86_emulate_ctxt *ctxt,
 	ctxt->fetch.end = 0;
 	ctxt->io_read.pos = 0;
 	ctxt->io_read.end = 0;
-	ctxt->mem_read.pos = 0;
-	ctxt->mem_read.end = 0;
 }

 static void init_emulate_ctxt(struct kvm_vcpu *vcpu)