
arm64/kvm: correct the error report in kvm_handle_guest_abort

Message ID 20210115093028.6504-1-jianyong.wu@arm.com (mailing list archive)
State New, archived

Commit Message

Jianyong Wu Jan. 15, 2021, 9:30 a.m. UTC
Currently, the error reported when a guest performs cache maintenance on
a read-only memory range, such as ROM, is unclear and even incorrect.
Since KVM knows the specific cause, it should report it.

For example, in a QEMU/KVM VM, if the guest performs a DC operation on
the pflash range from 0 to 128M, KVM reports "Data abort outside memslots
with no valid syndrome info", which is not quite correct.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
---
 arch/arm64/kvm/mmu.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Marc Zyngier Jan. 15, 2021, 11:20 a.m. UTC | #1
On 2021-01-15 09:30, Jianyong Wu wrote:
> Currently, error report when cache maintenance at read-only memory range,
> like rom, is not clear enough and even not correct. As the specific error
> is definitely known by kvm, it is obliged to give it out.
>
> For example, in a qemu/kvm VM, if the guest do dc at the pflash range from
> 0 to 128M, error is reported by kvm as "Data abort outside memslots with
> no valid syndrome info" which is not quite correct.
> 
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
>  arch/arm64/kvm/mmu.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 7d2257cc5438..de66b7e38a5b 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		 * So let's assume that the guest is just being
>  		 * cautious, and skip the instruction.
>  		 */
> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
> -			kvm_incr_pc(vcpu);
> -			ret = 1;
> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> +			if (kvm_is_error_hva(hva)) {
> +				kvm_incr_pc(vcpu);
> +				ret = 1;
> +				goto out_unlock;
> +			}
> +
> +			kvm_err("Do cache maintenance in the read-only memory range\n");

We don't scream on the console for guest bugs.

> +			ret = -EFAULT;
>  			goto out_unlock;
>  		}

And what is userspace going to do with that? To be honest, I'd rather
not report it in any case:

- either it isn't mapped, and there is no cache to clean/invalidate
- or it is mapped read-only:
   - if it is a "DC IVAC", the guest should get the fault as per
     the ARM ARM. But I don't think we can identify the particular CMO
     at this stage, so actually performing an invalidation is the least
     bad thing to do.

How about this (untested)?

         M.

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7d2257cc5438..0f497faad131 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
  		}

  		/*
-		 * Check for a cache maintenance operation. Since we
-		 * ended-up here, we know it is outside of any memory
-		 * slot. But we can't find out if that is for a device,
-		 * or if the guest is just being stupid. The only thing
-		 * we know for sure is that this range cannot be cached.
+		 * Check for a cache maintenance operation. Two cases:
  		 *
-		 * So let's assume that the guest is just being
-		 * cautious, and skip the instruction.
+		 * - It is outside of any memory slot. But we can't
+		 *   find out if that is for a device, or if the guest
+		 *   is just being stupid. The only thing we know for
+		 *   sure is that this range cannot be cached.  So
+		 *   let's assume that the guest is just being
+		 *   cautious, and skip the instruction.
+		 *
+		 * - Otherwise, clean/invalidate the whole memslot. We
+		 *   should special-case DC IVAC and inject a
+		 *   permission fault, but we can't really identify it
+		 *   in this context.
  		 */
-		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
+		if (kvm_vcpu_dabt_is_cm(vcpu)) {
+			if (!kvm_is_error_hva(hva)) {
+				spin_lock(&vcpu->kvm->mmu_lock);
+				stage2_flush_memslot(vcpu->kvm, memslot);
+				spin_unlock(&vcpu->kvm->mmu_lock);
+			}
+
  			kvm_incr_pc(vcpu);
  			ret = 1;
  			goto out_unlock;
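
The control flow in the diff above can be modelled in user space roughly as follows. This is only a sketch for following the discussion, not kernel code: the enum `cmo_action`, the function `decide_cmo_action()` and its two boolean parameters are hypothetical stand-ins for the results of `kvm_vcpu_dabt_is_cm(vcpu)` and `kvm_is_error_hva(hva)`.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; the kernel has no such enum. */
enum cmo_action {
	CMO_NOT_A_CMO,       /* not cache maintenance: other handling applies */
	CMO_SKIP_ONLY,       /* no memslot behind the address: skip the insn */
	CMO_FLUSH_THEN_SKIP, /* memslot exists: flush all of it, then skip */
};

/*
 * is_cm        models the result of kvm_vcpu_dabt_is_cm(vcpu)
 * hva_is_error models the result of kvm_is_error_hva(hva)
 */
enum cmo_action decide_cmo_action(bool is_cm, bool hva_is_error)
{
	if (!is_cm)
		return CMO_NOT_A_CMO;

	/* Outside any memslot: nothing cacheable there, so only skip. */
	if (hva_is_error)
		return CMO_SKIP_ONLY;

	/* Inside a memslot: clean/invalidate the whole slot, then skip. */
	return CMO_FLUSH_THEN_SKIP;
}
```

In both cache-maintenance cases the guest instruction is skipped and the guest resumes; the only difference is whether the memslot is flushed first, which is why nothing is reported to userspace.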
Jianyong Wu Jan. 16, 2021, 8:46 a.m. UTC | #2
Hi Marc,

> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Friday, January 15, 2021 7:21 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
> Poulose <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
> kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>;
> Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH] arm64/kvm: correct the error report in
> kvm_handle_guest_abort
> 
> On 2021-01-15 09:30, Jianyong Wu wrote:
> > Currently, error report when cache maintenance at read-only memory
> > range, like rom, is not clear enough and even not correct. As the
> > specific error is definitely known by kvm, it is obliged to give it
> > out.
> >
> > For example, in a qemu/kvm VM, if the guest do dc at the pflash range
> > from
> > 0 to 128M, error is reported by kvm as "Data abort outside memslots
> > with no valid syndrome info" which is not quite correct.
> >
> > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> > ---
> >  arch/arm64/kvm/mmu.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> > 7d2257cc5438..de66b7e38a5b 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> > *vcpu)
> >  		 * So let's assume that the guest is just being
> >  		 * cautious, and skip the instruction.
> >  		 */
> > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> {
> > -			kvm_incr_pc(vcpu);
> > -			ret = 1;
> > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> > +			if (kvm_is_error_hva(hva)) {
> > +				kvm_incr_pc(vcpu);
> > +				ret = 1;
> > +				goto out_unlock;
> > +			}
> > +
> > +			kvm_err("Do cache maintenance in the read-only
> memory range\n");
> 
> We don't scream on the console for guest bugs.
Ok

> 
> > +			ret = -EFAULT;
> >  			goto out_unlock;
> >  		}
> 
> And what is userspace going to do with that? To be honest, I'd rather not
> report it in any case:
> 
> - either it isn't mapped, and there is no cache to clean/invalidate
> - or it is mapped read-only:
>    - if it is a "DC IVAC", the guest should get the fault as per
>      the ARM ARM. But I don't think we can identify the particular CMO
>      at this stage, so actually performing an invalidation is the least
>      bad thing to do.
> 
> How about this (untested)?
> 
>          M.
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> 7d2257cc5438..0f497faad131 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> *vcpu)
>   		}
> 
>   		/*
> -		 * Check for a cache maintenance operation. Since we
> -		 * ended-up here, we know it is outside of any memory
> -		 * slot. But we can't find out if that is for a device,
> -		 * or if the guest is just being stupid. The only thing
> -		 * we know for sure is that this range cannot be cached.
> +		 * Check for a cache maintenance operation. Two cases:
>   		 *
> -		 * So let's assume that the guest is just being
> -		 * cautious, and skip the instruction.
> +		 * - It is outside of any memory slot. But we can't
> +		 *   find out if that is for a device, or if the guest
> +		 *   is just being stupid. The only thing we know for
> +		 *   sure is that this range cannot be cached.  So
> +		 *   let's assume that the guest is just being
> +		 *   cautious, and skip the instruction.
> +		 *
> +		 * - Otherwise, clean/invalidate the whole memslot. We
> +		 *   should special-case DC IVAC and inject a
> +		 *   permission fault, but we can't really identify it
> +		 *   in this context.
>   		 */
> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> {
> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> +			if (!kvm_is_error_hva(hva)) {
> +				spin_lock(&vcpu->kvm->mmu_lock);
> +				stage2_flush_memslot(vcpu->kvm,
> memslot);
> +				spin_unlock(&vcpu->kvm->mmu_lock);
> +			}
> +
>   			kvm_incr_pc(vcpu);
>   			ret = 1;
>   			goto out_unlock;
> 
Thanks Marc, that's fine with me, and I will test it.

Thanks
Jianyong

> --
> Jazz is not dead. It just smells funny...
Jianyong Wu Jan. 18, 2021, 1:01 p.m. UTC | #3
Hi Marc,

> -----Original Message-----
> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm-
> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu
> Sent: Saturday, January 16, 2021 4:47 PM
> To: Marc Zyngier <maz@kernel.org>
> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
> Subject: RE: [PATCH] arm64/kvm: correct the error report in
> kvm_handle_guest_abort
> 
> Hi Marc,
> 
> > -----Original Message-----
> > From: Marc Zyngier <maz@kernel.org>
> > Sent: Friday, January 15, 2021 7:21 PM
> > To: Jianyong Wu <Jianyong.Wu@arm.com>
> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
> Poulose
> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
> > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>;
> > Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
> > kvm_handle_guest_abort
> >
> > On 2021-01-15 09:30, Jianyong Wu wrote:
> > > Currently, error report when cache maintenance at read-only memory
> > > range, like rom, is not clear enough and even not correct. As the
> > > specific error is definitely known by kvm, it is obliged to give it
> > > out.
> > >
> > > > For example, in a qemu/kvm VM, if the guest do dc at the pflash
> > > range from
> > > 0 to 128M, error is reported by kvm as "Data abort outside memslots
> > > with no valid syndrome info" which is not quite correct.
> > >
> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> > > ---
> > >  arch/arm64/kvm/mmu.c | 12 +++++++++---
> > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> > > 7d2257cc5438..de66b7e38a5b 100644
> > > --- a/arch/arm64/kvm/mmu.c
> > > +++ b/arch/arm64/kvm/mmu.c
> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> > > *vcpu)
> > >  		 * So let's assume that the guest is just being
> > >  		 * cautious, and skip the instruction.
> > >  		 */
> > > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> > {
> > > -			kvm_incr_pc(vcpu);
> > > -			ret = 1;
> > > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> > > +			if (kvm_is_error_hva(hva)) {
> > > +				kvm_incr_pc(vcpu);
> > > +				ret = 1;
> > > +				goto out_unlock;
> > > +			}
> > > +
> > > +			kvm_err("Do cache maintenance in the read-only
> > memory range\n");
> >
> > We don't scream on the console for guest bugs.
> Ok
> 
> >
> > > +			ret = -EFAULT;
> > >  			goto out_unlock;
> > >  		}
> >
> > And what is userspace going to do with that? To be honest, I'd rather
> > not report it in any case:
> >
> > - either it isn't mapped, and there is no cache to clean/invalidate
> > - or it is mapped read-only:
> >    - if it is a "DC IVAC", the guest should get the fault as per
> >      the ARM ARM. But I don't think we can identify the particular CMO
> >      at this stage, so actually performing an invalidation is the least
> >      bad thing to do.
> >
> > How about this (untested)?

I have tested this. It works: DC operations now succeed on the ROM memory
range. But there is a performance issue: DC on the ROM range takes far
longer than on a normal memory range. Here is some data:

Ops        Memory type              Size    Time
dc civac   ROM memory               128M    6700ms
dc civac   writable normal memory   128M     300ms

This is a single-threaded test, and it may be worse with multiple threads.
I'm not sure we can bear it. WDYT?

Thanks
Jianyong 

> >
> >          M.
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> > 7d2257cc5438..0f497faad131 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> > *vcpu)
> >   		}
> >
> >   		/*
> > -		 * Check for a cache maintenance operation. Since we
> > -		 * ended-up here, we know it is outside of any memory
> > -		 * slot. But we can't find out if that is for a device,
> > -		 * or if the guest is just being stupid. The only thing
> > -		 * we know for sure is that this range cannot be cached.
> > +		 * Check for a cache maintenance operation. Two cases:
> >   		 *
> > -		 * So let's assume that the guest is just being
> > -		 * cautious, and skip the instruction.
> > +		 * - It is outside of any memory slot. But we can't
> > +		 *   find out if that is for a device, or if the guest
> > +		 *   is just being stupid. The only thing we know for
> > +		 *   sure is that this range cannot be cached.  So
> > +		 *   let's assume that the guest is just being
> > +		 *   cautious, and skip the instruction.
> > +		 *
> > +		 * - Otherwise, clean/invalidate the whole memslot. We
> > +		 *   should special-case DC IVAC and inject a
> > +		 *   permission fault, but we can't really identify it
> > +		 *   in this context.
> >   		 */
> > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> > {
> > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> > +			if (!kvm_is_error_hva(hva)) {
> > +				spin_lock(&vcpu->kvm->mmu_lock);
> > +				stage2_flush_memslot(vcpu->kvm,
> > memslot);
> > +				spin_unlock(&vcpu->kvm->mmu_lock);
> > +			}
> > +
> >   			kvm_incr_pc(vcpu);
> >   			ret = 1;
> >   			goto out_unlock;
> >
> Thanks Marc, it's OK for me and I will do the test for it.
> 
> Thanks
> Jianyong
> 
> > --
> > Jazz is not dead. It just smells funny...
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Marc Zyngier Jan. 18, 2021, 1:26 p.m. UTC | #5
On 2021-01-18 13:04, Jianyong Wu wrote:
> Hi Marc,
> 
>> -----Original Message-----
>> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm-
>> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu
>> Sent: Saturday, January 16, 2021 4:47 PM
>> To: Marc Zyngier <maz@kernel.org>
>> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
>> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
>> Subject: RE: [PATCH] arm64/kvm: correct the error report in
>> kvm_handle_guest_abort
>> 
>> Hi Marc,
>> 
>> > -----Original Message-----
>> > From: Marc Zyngier <maz@kernel.org>
>> > Sent: Friday, January 15, 2021 7:21 PM
>> > To: Jianyong Wu <Jianyong.Wu@arm.com>
>> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
>> Poulose
>> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
>> > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>;
>> > Justin He <Justin.He@arm.com>; nd <nd@arm.com>
>> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
>> > kvm_handle_guest_abort
>> >
>> > On 2021-01-15 09:30, Jianyong Wu wrote:
>> > > Currently, error report when cache maintenance at read-only memory
>> > > range, like rom, is not clear enough and even not correct. As the
>> > > specific error is definitely known by kvm, it is obliged to give it
>> > > out.
>> > >
>> > > For example, in a qemu/kvm VM, if the guest do dc at the pflash
>> > > range from
>> > > 0 to 128M, error is reported by kvm as "Data abort outside memslots
>> > > with no valid syndrome info" which is not quite correct.
>> > >
>> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
>> > > ---
>> > >  arch/arm64/kvm/mmu.c | 12 +++++++++---
>> > >  1 file changed, 9 insertions(+), 3 deletions(-)
>> > >
>> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
>> > > 7d2257cc5438..de66b7e38a5b 100644
>> > > --- a/arch/arm64/kvm/mmu.c
>> > > +++ b/arch/arm64/kvm/mmu.c
>> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu
>> > > *vcpu)
>> > >  		 * So let's assume that the guest is just being
>> > >  		 * cautious, and skip the instruction.
>> > >  		 */
>> > > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
>> > {
>> > > -			kvm_incr_pc(vcpu);
>> > > -			ret = 1;
>> > > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
>> > > +			if (kvm_is_error_hva(hva)) {
>> > > +				kvm_incr_pc(vcpu);
>> > > +				ret = 1;
>> > > +				goto out_unlock;
>> > > +			}
>> > > +
>> > > +			kvm_err("Do cache maintenance in the read-only
>> > memory range\n");
>> >
>> > We don't scream on the console for guest bugs.
>> Ok
>> 
>> >
>> > > +			ret = -EFAULT;
>> > >  			goto out_unlock;
>> > >  		}
>> >
>> > And what is userspace going to do with that? To be honest, I'd rather
>> > not report it in any case:
>> >
>> > - either it isn't mapped, and there is no cache to clean/invalidate
>> > - or it is mapped read-only:
>> >    - if it is a "DC IVAC", the guest should get the fault as per
>> >      the ARM ARM. But I don't think we can identify the particular CMO
>> >      at this stage, so actually performing an invalidation is the least
>> >      bad thing to do.
>> >
>> > How about this (untested)?
> 
> I have tested for this. It works that DC ops can pass on memory range
> for rom. But there is performance issue. It takes too long a time that
> do DC on rom range compared with on  normal memory range. Here is some
> data:
> Ops        memory type              size    time
> dc civac   rom memory               128M    6700ms
> dc civac   writable normal memory   128M     300ms
> 
> It's a single thread test and may be worse on multi thread. I'm not
> sure we can bear it. WDYT?

The problem is that the guest is invalidating one cache-line at
a time, but we invalidate 128M at a time in your example.

I would say that I really don't care how slow it is. We cannot know
which address the guest is trying to invalidate (as your commit
message shows, there is no syndrome information available).

So it seems our only choices are:
- don't do any invalidation, which is likely to break the guest
- invalidate everything, always

Given that, I'd rather have a slow guest. Also, it very much looks
like no existing SW does this, so I cannot say I care much.

Thanks,

         M.
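
To put rough numbers on the mismatch described above, the sketch below counts how many per-line DC operations a guest needs to cover a range, and computes the slowdown from the two timings reported earlier in the thread. The 64-byte line size is an assumption (real hardware advertises its line size through CTR_EL0), "128M" is taken to mean 128 MiB, and the names `ASSUMED_LINE_BYTES`, `dc_ops_for_range()` and `slowdown_x10()` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Assumed data cache line size in bytes; not taken from the thread. */
#define ASSUMED_LINE_BYTES 64u

/* Number of per-line DC operations needed to cover `bytes` (rounded up). */
uint64_t dc_ops_for_range(uint64_t bytes, uint64_t line_bytes)
{
	return (bytes + line_bytes - 1) / line_bytes;
}

/*
 * Slowdown of the slow case relative to the fast one, scaled by 10 so
 * that one decimal digit survives integer division (223 means 22.3x).
 */
uint64_t slowdown_x10(uint64_t slow_ms, uint64_t fast_ms)
{
	return (slow_ms * 10) / fast_ms;
}
```

Under these assumptions, `dc_ops_for_range(128ull << 20, ASSUMED_LINE_BYTES)` gives 2097152 guest operations for the 128M range, while each operation that traps makes the host flush the entire memslot rather than one line. `slowdown_x10(6700, 300)` gives 223, so the ROM case in the reported test is roughly 22 times slower than the writable case.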
Jianyong Wu Jan. 18, 2021, 1:38 p.m. UTC | #6
> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Monday, January 18, 2021 9:26 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH] arm64/kvm: correct the error report in
> kvm_handle_guest_abort
> 
> On 2021-01-18 13:04, Jianyong Wu wrote:
> > Hi Marc,
> >
> >> -----Original Message-----
> >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm-
> >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu
> >> Sent: Saturday, January 16, 2021 4:47 PM
> >> To: Marc Zyngier <maz@kernel.org>
> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
> >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
> >> Subject: RE: [PATCH] arm64/kvm: correct the error report in
> >> kvm_handle_guest_abort
> >>
> >> Hi Marc,
> >>
> >> > -----Original Message-----
> >> > From: Marc Zyngier <maz@kernel.org>
> >> > Sent: Friday, January 15, 2021 7:21 PM
> >> > To: Jianyong Wu <Jianyong.Wu@arm.com>
> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
> >> Poulose
> >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
> >> > kvmarm@lists.cs.columbia.edu; Steve Capper
> <Steve.Capper@arm.com>;
> >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
> >> > kvm_handle_guest_abort
> >> >
> >> > On 2021-01-15 09:30, Jianyong Wu wrote:
> >> > > Currently, error report when cache maintenance at read-only
> >> > > memory range, like rom, is not clear enough and even not correct.
> >> > > As the specific error is definitely known by kvm, it is obliged
> >> > > to give it out.
> >> > >
> >> > > For example, in a qemu/kvm VM, if the guest do dc at the pflash
> >> > > range from
> >> > > 0 to 128M, error is reported by kvm as "Data abort outside
> >> > > memslots with no valid syndrome info" which is not quite correct.
> >> > >
> >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> >> > > ---
> >> > >  arch/arm64/kvm/mmu.c | 12 +++++++++---
> >> > >  1 file changed, 9 insertions(+), 3 deletions(-)
> >> > >
> >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> >> > > 7d2257cc5438..de66b7e38a5b 100644
> >> > > --- a/arch/arm64/kvm/mmu.c
> >> > > +++ b/arch/arm64/kvm/mmu.c
> >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct
> kvm_vcpu
> >> > > *vcpu)
> >> > >  		 * So let's assume that the guest is just being
> >> > >  		 * cautious, and skip the instruction.
> >> > >  		 */
> >> > > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> >> > {
> >> > > -			kvm_incr_pc(vcpu);
> >> > > -			ret = 1;
> >> > > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> >> > > +			if (kvm_is_error_hva(hva)) {
> >> > > +				kvm_incr_pc(vcpu);
> >> > > +				ret = 1;
> >> > > +				goto out_unlock;
> >> > > +			}
> >> > > +
> >> > > +			kvm_err("Do cache maintenance in the read-
> only
> >> > memory range\n");
> >> >
> >> > We don't scream on the console for guest bugs.
> >> Ok
> >>
> >> >
> >> > > +			ret = -EFAULT;
> >> > >  			goto out_unlock;
> >> > >  		}
> >> >
> >> > And what is userspace going to do with that? To be honest, I'd
> >> > rather not report it in any case:
> >> >
> >> > - either it isn't mapped, and there is no cache to clean/invalidate
> >> > - or it is mapped read-only:
> >> >    - if it is a "DC IVAC", the guest should get the fault as per
> >> >      the ARM ARM. But I don't think we can identify the particular CMO
> >> >      at this stage, so actually performing an invalidation is the least
> >> >      bad thing to do.
> >> >
> >> > How about this (untested)?
> >
> > I have tested for this. It works that DC ops can pass on memory range
> > for rom. But there is performance issue. It takes too long a time that
> > do DC on rom range compared with on  normal memory range. Here is
> some
> > data:
> > Ops        memory type              size    time
> > dc civac   rom memory               128M    6700ms
> > dc civac   writable normal memory   128M     300ms
> >
> > It's a single thread test and may be worse on multi thread. I'm not
> > sure we can bear it. WDYT?
> 
> The problem is that the guest is invalidating one cache-line at a time, but we
> invalidate 128M at a time in your example.
> 
> I would say that I really don't care how slow it is. We cannot know which
> address the guest is trying to invalidate (as your commit message shows,
> there is no syndrome information available).
> 
> So it seems our only choices are:
> - don't do any invalidation, which is likely to break the guest
> - invalidate everything, always
> 
> Given that, I'd rather have a slow guest. Also, it very much looks like no
> existing SW does this, so I cannot say I care much.

OK, got it.

Thanks
Jianyong

> 
> Thanks,
> 
>          M.
> --
> Jazz is not dead. It just smells funny...
Marc Zyngier Jan. 18, 2021, 1:44 p.m. UTC | #7
On 2021-01-18 13:38, Jianyong Wu wrote:
>> -----Original Message-----
>> From: Marc Zyngier <maz@kernel.org>
>> Sent: Monday, January 18, 2021 9:26 PM
>> To: Jianyong Wu <Jianyong.Wu@arm.com>
>> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
>> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
>> Subject: Re: [PATCH] arm64/kvm: correct the error report in
>> kvm_handle_guest_abort
>> 
>> On 2021-01-18 13:04, Jianyong Wu wrote:
>> > Hi Marc,
>> >
>> >> -----Original Message-----
>> >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm-
>> >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu
>> >> Sent: Saturday, January 16, 2021 4:47 PM
>> >> To: Marc Zyngier <maz@kernel.org>
>> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
>> >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
>> >> Subject: RE: [PATCH] arm64/kvm: correct the error report in
>> >> kvm_handle_guest_abort
>> >>
>> >> Hi Marc,
>> >>
>> >> > -----Original Message-----
>> >> > From: Marc Zyngier <maz@kernel.org>
>> >> > Sent: Friday, January 15, 2021 7:21 PM
>> >> > To: Jianyong Wu <Jianyong.Wu@arm.com>
>> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
>> >> Poulose
>> >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
>> >> > kvmarm@lists.cs.columbia.edu; Steve Capper
>> <Steve.Capper@arm.com>;
>> >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com>
>> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
>> >> > kvm_handle_guest_abort
>> >> >
>> >> > On 2021-01-15 09:30, Jianyong Wu wrote:
>> >> > > Currently, error report when cache maintenance at read-only
>> >> > > memory range, like rom, is not clear enough and even not correct.
>> >> > > As the specific error is definitely known by kvm, it is obliged
>> >> > > to give it out.
>> >> > >
>> >> > > For example, in a qemu/kvm VM, if the guest do dc at the pflash
>> >> > > range from
>> >> > > 0 to 128M, error is reported by kvm as "Data abort outside
>> >> > > memslots with no valid syndrome info" which is not quite correct.
>> >> > >
>> >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
>> >> > > ---
>> >> > >  arch/arm64/kvm/mmu.c | 12 +++++++++---
>> >> > >  1 file changed, 9 insertions(+), 3 deletions(-)
>> >> > >
>> >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
>> >> > > 7d2257cc5438..de66b7e38a5b 100644
>> >> > > --- a/arch/arm64/kvm/mmu.c
>> >> > > +++ b/arch/arm64/kvm/mmu.c
>> >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct
>> kvm_vcpu
>> >> > > *vcpu)
>> >> > >  		 * So let's assume that the guest is just being
>> >> > >  		 * cautious, and skip the instruction.
>> >> > >  		 */
>> >> > > -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
>> >> > {
>> >> > > -			kvm_incr_pc(vcpu);
>> >> > > -			ret = 1;
>> >> > > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
>> >> > > +			if (kvm_is_error_hva(hva)) {
>> >> > > +				kvm_incr_pc(vcpu);
>> >> > > +				ret = 1;
>> >> > > +				goto out_unlock;
>> >> > > +			}
>> >> > > +
>> >> > > +			kvm_err("Do cache maintenance in the read-
>> only
>> >> > memory range\n");
>> >> >
>> >> > We don't scream on the console for guests bugs.
>> >> Ok
>> >>
>> >> >
>> >> > > +			ret = -EFAULT;
>> >> > >  			goto out_unlock;
>> >> > >  		}
>> >> >
>> >> > And what is userspace going to do with that? To be honest, I'd
>> >> > rather not report it in any case:
>> >> >
>> >> > - either it isn't mapped, and there is no cache to clean/invalidate
>> >> > - or it is mapped read-only:
>> >> >    - if it is a "DC IVAC", the guest should get the fault as per
>> >> >      the ARM ARM. But I don't think we can identify the particular CMO
>> >> >      at this stage, so actually performing an invalidation is the least
>> >> >      bad thing to do.
>> >> >
>> >> > How about this (untested)?
>> >
>> > I have tested for this. It works that DC ops can pass on memory range
>> > for rom. But there is performance issue. It takes too long a time that
>> > do DC on rom range compared with on  normal memory range. Here is some
>> > data:
>> > Ops        memory type              size   time
>> > dc civac   rom memory               128M   6700ms
>> > dc civac   writable normal memory   128M   300ms
>> >
>> > It's a single thread test and may be worse on multi thread. I'm not
>> > sure we can bear it. WDYT?
>> 
>> The problem is that the guest is invalidating one cache-line at a 
>> time, but we
>> invalidate 128M at a time in your example.
>> 
>> I would say that I really don't care how slow it is. We cannot know 
>> which
>> address the guest is trying to invalidate (as your commit message 
>> shows,
>> there is no syndrome information available).
>> 
>> So it seems our only choices are:
>> - don't do any invalidation, which is likely to break the guest
>> - invalidate everything, always
>> 
>> Given that, I'd rather have a slow guest. Also, it very much looks 
>> like no
>> existing SW does this, so I cannot say I care much.
> 
> OK, get it.

Actually, there could be a way to make this a bit faster. Once we have
cleaned+invalidated the memslot, we could unmap it, speeding up the
following cache invalidations (nothing will be mapped).

Could you please share your test case?

Thanks,

         M.
Jianyong Wu Jan. 18, 2021, 2:24 p.m. UTC | #8
> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Monday, January 18, 2021 9:44 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH] arm64/kvm: correct the error report in
> kvm_handle_guest_abort
> 
> On 2021-01-18 13:38, Jianyong Wu wrote:
> >> -----Original Message-----
> >> From: Marc Zyngier <maz@kernel.org>
> >> Sent: Monday, January 18, 2021 9:26 PM
> >> To: Jianyong Wu <Jianyong.Wu@arm.com>
> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org;
> >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org
> >> Subject: Re: [PATCH] arm64/kvm: correct the error report in
> >> kvm_handle_guest_abort
> >>
> >> On 2021-01-18 13:04, Jianyong Wu wrote:
> >> > Hi Marc,
> >> >
> >> >> -----Original Message-----
> >> >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm-
> >> >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu
> >> >> Sent: Saturday, January 16, 2021 4:47 PM
> >> >> To: Marc Zyngier <maz@kernel.org>
> >> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>;
> >> >> will@kernel.org; kvmarm@lists.cs.columbia.edu;
> >> >> linux-arm-kernel@lists.infradead.org
> >> >> Subject: RE: [PATCH] arm64/kvm: correct the error report in
> >> >> kvm_handle_guest_abort
> >> >>
> >> >> Hi Marc,
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Marc Zyngier <maz@kernel.org>
> >> >> > Sent: Friday, January 15, 2021 7:21 PM
> >> >> > To: Jianyong Wu <Jianyong.Wu@arm.com>
> >> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
> >> >> Poulose
> >> >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
> >> >> > kvmarm@lists.cs.columbia.edu; Steve Capper
> >> <Steve.Capper@arm.com>;
> >> >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> >> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in
> >> >> > kvm_handle_guest_abort
> >> >> >
> >> >> > On 2021-01-15 09:30, Jianyong Wu wrote:
> >> >> > > Currently, error report when cache maintenance at read-only
> >> >> > > memory range, like rom, is not clear enough and even not correct.
> >> >> > > As the specific error is definitely known by kvm, it is
> >> >> > > obliged to give it out.
> >> >> > >
> >> >> > > Fox example, in a qemu/kvm VM, if the guest do dc at the
> >> >> > > pflash range from
> >> >> > > 0 to 128M, error is reported by kvm as "Data abort outside
> >> >> > > memslots with no valid syndrome info" which is not quite correct.
> >> >> > >
> >> >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> >> >> > > ---
> >> >> > >  arch/arm64/kvm/mmu.c | 12 +++++++++---
> >> >> > >  1 file changed, 9 insertions(+), 3 deletions(-)
> >> >> > >
> >> >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index
> >> >> > > 7d2257cc5438..de66b7e38a5b 100644
> >> >> > > --- a/arch/arm64/kvm/mmu.c
> >> >> > > +++ b/arch/arm64/kvm/mmu.c
> >> >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct
> >> kvm_vcpu
> >> >> > > *vcpu)
> >> >> > >  		 * So let's assume that the guest is just being
> >> >> > >  		 * cautious, and skip the instruction.
> >> >> > >  		 */
> >> >> > > -		if (kvm_is_error_hva(hva) &&
> kvm_vcpu_dabt_is_cm(vcpu))
> >> >> > {
> >> >> > > -			kvm_incr_pc(vcpu);
> >> >> > > -			ret = 1;
> >> >> > > +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> >> >> > > +			if (kvm_is_error_hva(hva)) {
> >> >> > > +				kvm_incr_pc(vcpu);
> >> >> > > +				ret = 1;
> >> >> > > +				goto out_unlock;
> >> >> > > +			}
> >> >> > > +
> >> >> > > +			kvm_err("Do cache maintenance in the read-
> >> only
> >> >> > memory range\n");
> >> >> >
> >> >> > We don't scream on the console for guests bugs.
> >> >> Ok
> >> >>
> >> >> >
> >> >> > > +			ret = -EFAULT;
> >> >> > >  			goto out_unlock;
> >> >> > >  		}
> >> >> >
> >> >> > And what is userspace going to do with that? To be honest, I'd
> >> >> > rather not report it in any case:
> >> >> >
> >> >> > - either it isn't mapped, and there is no cache to
> >> >> > clean/invalidate
> >> >> > - or it is mapped read-only:
> >> >> >    - if it is a "DC IVAC", the guest should get the fault as per
> >> >> >      the ARM ARM. But I don't think we can identify the particular CMO
> >> >> >      at this stage, so actually performing an invalidation is the least
> >> >> >      bad thing to do.
> >> >> >
> >> >> > How about this (untested)?
> >> >
> >> > I have tested for this. It works that DC ops can pass on memory
> >> > range for rom. But there is performance issue. It takes too long a
> >> > time that do DC on rom range compared with on  normal memory range.
> >> > Here is some data:
> >> > Ops        memory type              size   time
> >> > dc civac   rom memory               128M   6700ms
> >> > dc civac   writable normal memory   128M   300ms
> >> >
> >> > It's a single thread test and may be worse on multi thread. I'm not
> >> > sure we can bear it. WDYT?
> >>
> >> The problem is that the guest is invalidating one cache-line at a
> >> time, but we invalidate 128M at a time in your example.
> >>
> >> I would say that I really don't care how slow it is. We cannot know
> >> which address the guest is trying to invalidate (as your commit
> >> message shows, there is no syndrome information available).
> >>
> >> So it seems our only choices are:
> >> - don't do any invalidation, which is likely to break the guest
> >> - invalidate everything, always
> >>
> >> Given that, I'd rather have a slow guest. Also, it very much looks
> >> like no existing SW does this, so I cannot say I care much.
> >
> > OK, get it.
> 
> Actually, there could be a way to make this a bit faster. Once we have
> cleaned+invalidated the memslot, we could unmap it, speeding up the
> following cache invalidations (nothing will be mapped).
> 
> Could you please share your test case?

Yeah, here is my test method:
*Make sure your arm64 server supports gic-v2.
git clone https://github.com/unikraft/unikraft  #unikraft is a unikernel
cd unikraft
vim plat/kvm/arm/entry.S
Before jumping to clean_and_invalidate_dcache_range, set x0 to the base address and x1 to the memory size.
For qemu/virt, the rom address range is 0~128M and normal memory starts at 1G. Like:
...
/*
         * set x0 to the location stores dtb as the base address of the
         * memory range to be cache maintained
         */
        mov x0, 0
        mov x1, 0x08000000
        bl clean_and_invalidate_dcache_range
...
Before building, use "make menuconfig" to choose armv8 as the architecture and kvm as the platform, then build unikraft using "make".
Run it using qemu:
qemu-system-aarch64 \
        -machine virt,gic-version=2,accel=kvm \
        -m 1024 \
         -display none  \
        -nographic -nodefaults \
        -serial stdio -kernel build/unikraft_kvm-arm64.dbg \
        -cpu host -smp 1

I also added debug code to qemu to measure the time consumed by "ioctl(cpu->kvm_fd, type, arg)" in "kvm_vcpu_ioctl". Like:
//accel/kvm/kvm-all.c
int kvm_vcpu_ioctl(CPUState *cpu, int type, ...)
{
...
	gettimeofday(&tv1, NULL);
	ret = ioctl(cpu->kvm_fd, type, arg);
	gettimeofday(&tv2, NULL);
	us = tv2.tv_usec - tv1.tv_usec;
	sec = tv2.tv_sec - tv1.tv_sec;
	if (us < 0) {
		/* borrow one second when the microsecond field wrapped */
		sec--;
		us += 1000000;
	}
	/* only report calls that took longer than 50ms */
	if (sec > 0 || us > 50000) {
		printf("++++++ kvm_vcpu_ioctl, time is %lds, %ldus ++++\n", (long)sec, (long)us);
	}
...
}
You can have a try.

Thanks
Jianyong 
> 
> Thanks,
> 
>          M.
> --
> Jazz is not dead. It just smells funny...
Jianyong Wu Jan. 26, 2021, 8:10 a.m. UTC | #9
Hi Marc,

>   		/*
> -		 * Check for a cache maintenance operation. Since we
> -		 * ended-up here, we know it is outside of any memory
> -		 * slot. But we can't find out if that is for a device,
> -		 * or if the guest is just being stupid. The only thing
> -		 * we know for sure is that this range cannot be cached.
> +		 * Check for a cache maintenance operation. Two cases:
>   		 *
> -		 * So let's assume that the guest is just being
> -		 * cautious, and skip the instruction.
> +		 * - It is outside of any memory slot. But we can't
> +		 *   find out if that is for a device, or if the guest
> +		 *   is just being stupid. The only thing we know for
> +		 *   sure is that this range cannot be cached.  So
> +		 *   let's assume that the guest is just being
> +		 *   cautious, and skip the instruction.
> +		 *
> +		 * - Otherwise, clean/invalidate the whole memslot. We
> +		 *   should special-case DC IVAC and inject a
> +		 *   permission fault, but we can't really identify it
> +		 *   in this context.
>   		 */
> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> {
> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> +			if (!kvm_is_error_hva(hva)) {
> +				spin_lock(&vcpu->kvm->mmu_lock);
> +				stage2_flush_memslot(vcpu->kvm,
> memslot);

Maybe we should not flush the whole memslot here, as every "dc ivac" only works on a cache-line-sized range of memory. So what about using:
stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush) instead? It will be a bit faster than flushing the whole memslot.
I also tested your idea of "unmap after flush" using:
	stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);
	stage2_apply_range(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_unmap, true);
Then I did "dc ivac" on the 128M rom twice and it took double the time, around 11s. This means there is no optimization for the second "dc ivac".
I'm not sure if there is something wrong in my test.

So what about just using "stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);" instead of
"stage2_flush_memslot(vcpu->kvm, memslot);" and letting the guest bear the disadvantage of low performance?

Thanks
Jianyong

> +				spin_unlock(&vcpu->kvm->mmu_lock);
> +			}
> +
>   			kvm_incr_pc(vcpu);
>   			ret = 1;
>   			goto out_unlock;
> 
> --
> Jazz is not dead. It just smells funny...
Marc Zyngier Jan. 26, 2021, 9:18 a.m. UTC | #10
On 2021-01-26 08:10, Jianyong Wu wrote:
> Hi Marc,
> 
>>   		/*
>> -		 * Check for a cache maintenance operation. Since we
>> -		 * ended-up here, we know it is outside of any memory
>> -		 * slot. But we can't find out if that is for a device,
>> -		 * or if the guest is just being stupid. The only thing
>> -		 * we know for sure is that this range cannot be cached.
>> +		 * Check for a cache maintenance operation. Two cases:
>>   		 *
>> -		 * So let's assume that the guest is just being
>> -		 * cautious, and skip the instruction.
>> +		 * - It is outside of any memory slot. But we can't
>> +		 *   find out if that is for a device, or if the guest
>> +		 *   is just being stupid. The only thing we know for
>> +		 *   sure is that this range cannot be cached.  So
>> +		 *   let's assume that the guest is just being
>> +		 *   cautious, and skip the instruction.
>> +		 *
>> +		 * - Otherwise, clean/invalidate the whole memslot. We
>> +		 *   should special-case DC IVAC and inject a
>> +		 *   permission fault, but we can't really identify it
>> +		 *   in this context.
>>   		 */
>> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
>> {
>> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
>> +			if (!kvm_is_error_hva(hva)) {
>> +				spin_lock(&vcpu->kvm->mmu_lock);
>> +				stage2_flush_memslot(vcpu->kvm,
>> memslot);
> 
> Maybe we should not flush the whole memslot here as every "dc ivac"
> only work on a range of memory with cache line size. So what about
> using:
> stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa +
> cache_line_size(), kvm_pgtable_stage2_flush) instead. It will a bit
> faster than flush the whole memslot.
> Also I test your idea of "unmap after flush" using:
> 	stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa +
> cache_line_size(), kvm_pgtable_stage2_flush);
> 	stage2_apply_range(vcpu->kvm, fault_ipa, fault_ipa +
> cache_line_size(), kvm_pgtable_stage2_unmap, true);
> then I do "dc ivac" on the rom of 128M twice and got the double time
> of around 11s. it means that there is no optimization when do the
> second "dc ivac".
> I'm not sure if there is something wrong in my test.
> 
> So what about just using " stage2_apply_range_resched(vcpu->kvm,
> fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);"
> instead of
> " stage2_flush_memslot(vcpu->kvm, memslot);" and let the guest bears
> the disadvantage of low performance.

No, both solutions are wrong. I had to write my own test case because
the idea of hacking some unknown guest wasn't very appealing.

At the end of the day, what we need to implement is as follows:

- if a CMO hits normal memory, it's all already handled
- if a CMO hits non-memory, we skip it, as we do today
- if a CMO hits R/O memory, that's where things become fun:
   - if it is a DC IVAC, the architecture says this should result
     in a permission fault
   - if it is a DC CIVAC, it works as expected

So we need to distinguish between IVAC and CIVAC. One way to do it
is to treat CMOs generating a translation fault as a *read*, even
when they are on a RO memslot.

If they come back with a permission fault:
- inside a RW memslot: no problem, treat it as a write
- inside a RO memslot: only IVAC will fault here, inject an abort
   in the guest

This translates into the following patch, which does the trick
for me.

         M.

From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Sat, 16 Jan 2021 10:53:21 +0000
Subject: [PATCH] CMO on RO memslot

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
  arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++-----------
  1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7d2257cc5438..3c176b5b0a28 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  	struct kvm_pgtable *pgt;

  	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
-	write_fault = kvm_is_write_fault(vcpu);
+	/*
+	 * Treat translation faults on CMOs as read faults. Should
+	 * this further generate a permission fault, it will be caught
+	 * in kvm_handle_guest_abort(), with prejudice...
+	 */
+	if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu))
+		write_fault = false;
+	else
+		write_fault = kvm_is_write_fault(vcpu);
  	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
  	VM_BUG_ON(write_fault && exec_fault);

@@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
  		}

  		/*
-		 * Check for a cache maintenance operation. Since we
-		 * ended-up here, we know it is outside of any memory
-		 * slot. But we can't find out if that is for a device,
-		 * or if the guest is just being stupid. The only thing
-		 * we know for sure is that this range cannot be cached.
+		 * Check for a cache maintenance operation. Two cases:
+		 *
+		 * - It is outside of any memory slot. But we can't find out
+		 *   if that is for a device, or if the guest is just being
+		 *   stupid. The only thing we know for sure is that this
+		 *   range cannot be cached.  So let's assume that the guest
+		 *   is just being cautious, and skip the instruction.
+		 *
+		 * - Otherwise, check whether this is a permission fault.
+		 *   If so, that's a DC IVAC on a R/O memslot, which is a
+		 *   pretty bad idea, and we tell the guest so.
  		 *
-		 * So let's assume that the guest is just being
-		 * cautious, and skip the instruction.
+		 * - If this wasn't a permission fault, pass it along for
+                 *   further handling (including faulting the page in if it
+                 *   was a translation fault).
  		 */
-		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
-			kvm_incr_pc(vcpu);
-			ret = 1;
-			goto out_unlock;
+		if (kvm_vcpu_dabt_is_cm(vcpu)) {
+			if (kvm_is_error_hva(hva)) {
+				kvm_incr_pc(vcpu);
+				ret = 1;
+				goto out_unlock;
+			}
+
+			if (fault_status == FSC_PERM) {
+				/* DC IVAC on a R/O memslot */
+				kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+				ret = 1;
+				goto out_unlock;
+			}
+
+			goto handle_access;
  		}

  		/*
@@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
  		goto out_unlock;
  	}

+handle_access:
  	/* Userspace should not be able to register out-of-bounds IPAs */
  	VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm));
Jianyong Wu Jan. 28, 2021, 3:01 a.m. UTC | #11
Hi Marc,

> 
>  From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00
> 2001
>  From: Marc Zyngier <maz@kernel.org>
> Date: Sat, 16 Jan 2021 10:53:21 +0000
> Subject: [PATCH] CMO on RO memslot
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>   arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++----
> -------
>   1 file changed, 39 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> 7d2257cc5438..3c176b5b0a28 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
> phys_addr_t fault_ipa,
>   	struct kvm_pgtable *pgt;
> 
>   	fault_granule = 1UL <<
> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
> -	write_fault = kvm_is_write_fault(vcpu);
> +	/*
> +	 * Treat translation faults on CMOs as read faults. Should
> +	 * this further generate a permission fault, it will be caught
> +	 * in kvm_handle_guest_abort(), with prejudice...
> +	 */
> +	if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu))
> +		write_fault = false;
> +	else
> +		write_fault = kvm_is_write_fault(vcpu);
>   	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>   	VM_BUG_ON(write_fault && exec_fault);
> 
> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> *vcpu)
>   		}
> 
>   		/*
> -		 * Check for a cache maintenance operation. Since we
> -		 * ended-up here, we know it is outside of any memory
> -		 * slot. But we can't find out if that is for a device,
> -		 * or if the guest is just being stupid. The only thing
> -		 * we know for sure is that this range cannot be cached.
> +		 * Check for a cache maintenance operation. Two cases:
> +		 *
> +		 * - It is outside of any memory slot. But we can't find out
> +		 *   if that is for a device, or if the guest is just being
> +		 *   stupid. The only thing we know for sure is that this
> +		 *   range cannot be cached.  So let's assume that the guest
> +		 *   is just being cautious, and skip the instruction.
> +		 *
> +		 * - Otherwise, check whether this is a permission fault.
> +		 *   If so, that's a DC IVAC on a R/O memslot, which is a
> +		 *   pretty bad idea, and we tell the guest so.
>   		 *
> -		 * So let's assume that the guest is just being
> -		 * cautious, and skip the instruction.
> +		 * - If this wasn't a permission fault, pass it along for
> +                 *   further handling (including faulting the page in
> if it
> +                 *   was a translation fault).
>   		 */
> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> {
> -			kvm_incr_pc(vcpu);
> -			ret = 1;
> -			goto out_unlock;
> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> +			if (kvm_is_error_hva(hva)) {
> +				kvm_incr_pc(vcpu);
> +				ret = 1;
> +				goto out_unlock;
> +			}
> +
> +			if (fault_status == FSC_PERM) {
> +				/* DC IVAC on a R/O memslot */
> +				kvm_inject_dabt(vcpu,
> kvm_vcpu_get_hfar(vcpu));

One question:
In general, "DC" ops show up very early in the guest. So what if the guest does this before interrupt initialization? If so, the guest may get stuck here.

Thanks
Jianyong

> +				ret = 1;
> +				goto out_unlock;
> +			}
> +
> +			goto handle_access;
>   		}
> 
>   		/*
> @@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> *vcpu)
>   		goto out_unlock;
>   	}
> 
> +handle_access:
>   	/* Userspace should not be able to register out-of-bounds IPAs */
>   	VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm));
> 
> --
> 2.29.2
> 
> 
> --
> Jazz is not dead. It just smells funny...
Marc Zyngier Jan. 28, 2021, 9:07 a.m. UTC | #12
On 2021-01-28 03:01, Jianyong Wu wrote:
> Hi Marc,
> 
>> 
>>  From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00
>> 2001
>>  From: Marc Zyngier <maz@kernel.org>
>> Date: Sat, 16 Jan 2021 10:53:21 +0000
>> Subject: [PATCH] CMO on RO memslot
>> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>   arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++----
>> -------
>>   1 file changed, 39 insertions(+), 12 deletions(-)
>> 
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
>> 7d2257cc5438..3c176b5b0a28 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
>> phys_addr_t fault_ipa,
>>   	struct kvm_pgtable *pgt;
>> 
>>   	fault_granule = 1UL <<
>> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
>> -	write_fault = kvm_is_write_fault(vcpu);
>> +	/*
>> +	 * Treat translation faults on CMOs as read faults. Should
>> +	 * this further generate a permission fault, it will be caught
>> +	 * in kvm_handle_guest_abort(), with prejudice...
>> +	 */
>> +	if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu))
>> +		write_fault = false;
>> +	else
>> +		write_fault = kvm_is_write_fault(vcpu);
>>   	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>>   	VM_BUG_ON(write_fault && exec_fault);
>> 
>> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu
>> *vcpu)
>>   		}
>> 
>>   		/*
>> -		 * Check for a cache maintenance operation. Since we
>> -		 * ended-up here, we know it is outside of any memory
>> -		 * slot. But we can't find out if that is for a device,
>> -		 * or if the guest is just being stupid. The only thing
>> -		 * we know for sure is that this range cannot be cached.
>> +		 * Check for a cache maintenance operation. Two cases:
>> +		 *
>> +		 * - It is outside of any memory slot. But we can't find out
>> +		 *   if that is for a device, or if the guest is just being
>> +		 *   stupid. The only thing we know for sure is that this
>> +		 *   range cannot be cached.  So let's assume that the guest
>> +		 *   is just being cautious, and skip the instruction.
>> +		 *
>> +		 * - Otherwise, check whether this is a permission fault.
>> +		 *   If so, that's a DC IVAC on a R/O memslot, which is a
>> +		 *   pretty bad idea, and we tell the guest so.
>>   		 *
>> -		 * So let's assume that the guest is just being
>> -		 * cautious, and skip the instruction.
>> +		 * - If this wasn't a permission fault, pass it along for
>> +                 *   further handling (including faulting the page in
>> if it
>> +                 *   was a translation fault).
>>   		 */
>> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
>> {
>> -			kvm_incr_pc(vcpu);
>> -			ret = 1;
>> -			goto out_unlock;
>> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
>> +			if (kvm_is_error_hva(hva)) {
>> +				kvm_incr_pc(vcpu);
>> +				ret = 1;
>> +				goto out_unlock;
>> +			}
>> +
>> +			if (fault_status == FSC_PERM) {
>> +				/* DC IVAC on a R/O memslot */
>> +				kvm_inject_dabt(vcpu,
>> kvm_vcpu_get_hfar(vcpu));
> 
> One question:
> In general, the "DC" ops show up very early in guest. So what if the
> guest do this before interrupt initialization? If so, the guest may
> stuck here.

I don't understand your question. Do you mean "what if the guest
does this without being able to handle an exception"?

If that's your question, then the answer is "don't do that".
The architecture is clear that DC IVAC needs write permission, and
will result in an abort being delivered if there is no writable
mapping (and there can't be, the memslot is R/O).

DC CIVAC doesn't have that requirement, and will not generate an
exception.

Thanks,

         M.
Jianyong Wu Jan. 28, 2021, 9:55 a.m. UTC | #13
> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Thursday, January 28, 2021 5:07 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki
> Poulose <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org;
> kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>;
> Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH] arm64/kvm: correct the error report in
> kvm_handle_guest_abort
> 
> On 2021-01-28 03:01, Jianyong Wu wrote:
> > Hi Marc,
> >
> >>
> >>  From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00
> >> 2001
> >>  From: Marc Zyngier <maz@kernel.org>
> >> Date: Sat, 16 Jan 2021 10:53:21 +0000
> >> Subject: [PATCH] CMO on RO memslot
> >>
> >> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >> ---
> >>   arch/arm64/kvm/mmu.c | 51
> +++++++++++++++++++++++++++++++++----
> >> -------
> >>   1 file changed, 39 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> >> 7d2257cc5438..3c176b5b0a28 100644
> >> --- a/arch/arm64/kvm/mmu.c
> >> +++ b/arch/arm64/kvm/mmu.c
> >> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu
> *vcpu,
> >> phys_addr_t fault_ipa,
> >>   	struct kvm_pgtable *pgt;
> >>
> >>   	fault_granule = 1UL <<
> >> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
> >> -	write_fault = kvm_is_write_fault(vcpu);
> >> +	/*
> >> +	 * Treat translation faults on CMOs as read faults. Should
> >> +	 * this further generate a permission fault, it will be caught
> >> +	 * in kvm_handle_guest_abort(), with prejudice...
> >> +	 */
> >> +	if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu))
> >> +		write_fault = false;
> >> +	else
> >> +		write_fault = kvm_is_write_fault(vcpu);
> >>   	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> >>   	VM_BUG_ON(write_fault && exec_fault);
> >>
> >> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct
> kvm_vcpu
> >> *vcpu)
> >>   		}
> >>
> >>   		/*
> >> -		 * Check for a cache maintenance operation. Since we
> >> -		 * ended-up here, we know it is outside of any memory
> >> -		 * slot. But we can't find out if that is for a device,
> >> -		 * or if the guest is just being stupid. The only thing
> >> -		 * we know for sure is that this range cannot be cached.
> >> +		 * Check for a cache maintenance operation. Two cases:
> >> +		 *
> >> +		 * - It is outside of any memory slot. But we can't find out
> >> +		 *   if that is for a device, or if the guest is just being
> >> +		 *   stupid. The only thing we know for sure is that this
> >> +		 *   range cannot be cached.  So let's assume that the guest
> >> +		 *   is just being cautious, and skip the instruction.
> >> +		 *
> >> +		 * - Otherwise, check whether this is a permission fault.
> >> +		 *   If so, that's a DC IVAC on a R/O memslot, which is a
> >> +		 *   pretty bad idea, and we tell the guest so.
> >>   		 *
> >> -		 * So let's assume that the guest is just being
> >> -		 * cautious, and skip the instruction.
> >> +		 * - If this wasn't a permission fault, pass it along for
> >> +                 *   further handling (including faulting the page in
> >> if it
> >> +                 *   was a translation fault).
> >>   		 */
> >> -		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu))
> >> {
> >> -			kvm_incr_pc(vcpu);
> >> -			ret = 1;
> >> -			goto out_unlock;
> >> +		if (kvm_vcpu_dabt_is_cm(vcpu)) {
> >> +			if (kvm_is_error_hva(hva)) {
> >> +				kvm_incr_pc(vcpu);
> >> +				ret = 1;
> >> +				goto out_unlock;
> >> +			}
> >> +
> >> +			if (fault_status == FSC_PERM) {
> >> +				/* DC IVAC on a R/O memslot */
> >> +				kvm_inject_dabt(vcpu,
> >> kvm_vcpu_get_hfar(vcpu));
> >
> > One question:
> > In general, the "DC" ops show up very early in guest. So what if the
> > guest do this before interrupt initialization? If so, the guest may
> > stuck here.
> 
> I don't understand your question. Do you mean "what if the guest does this
> without being able to handle an exception"?
> 
> If that's your question, then the answer is "don't do that".
> The architecture is clear that DC IVAC needs write permission, and will result
> in an abort being delivered if there is no writable mapping (and there can't be,
> the memslot is R/O).
> 
> DC CIVAC doesn't have that requirement, and will not generate an exception.
> 

OK, got it.
I have tested the patch above using my test case. It works well for "dc civac", and for "dc ivac" a "Synchronous External Abort" occurs in the guest as expected.

Thanks
Jianyong

> Thanks,
> 
>          M.
> --
> Jazz is not dead. It just smells funny...

Patch

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7d2257cc5438..de66b7e38a5b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1022,9 +1022,15 @@  int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		 * So let's assume that the guest is just being
 		 * cautious, and skip the instruction.
 		 */
-		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
-			kvm_incr_pc(vcpu);
-			ret = 1;
+		if (kvm_vcpu_dabt_is_cm(vcpu)) {
+			if (kvm_is_error_hva(hva)) {
+				kvm_incr_pc(vcpu);
+				ret = 1;
+				goto out_unlock;
+			}
+
+			kvm_err("Do cache maintenance in the read-only memory range\n");
+			ret = -EFAULT;
 			goto out_unlock;
 		}