Message ID | 20210115093028.6504-1-jianyong.wu@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64/kvm: correct the error report in kvm_handle_guest_abort | expand |
On 2021-01-15 09:30, Jianyong Wu wrote: > Currently, error report when cache maintenance at read-only memory > range, > like rom, is not clear enough and even not correct. As the specific > error > is definitely known by kvm, it is obliged to give it out. > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash range > from > 0 to 128M, error is reported by kvm as "Data abort outside memslots > with > no valid syndrome info" which is not quite correct. > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > --- > arch/arm64/kvm/mmu.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 7d2257cc5438..de66b7e38a5b 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu > *vcpu) > * So let's assume that the guest is just being > * cautious, and skip the instruction. > */ > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) { > - kvm_incr_pc(vcpu); > - ret = 1; > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > + if (kvm_is_error_hva(hva)) { > + kvm_incr_pc(vcpu); > + ret = 1; > + goto out_unlock; > + } > + > + kvm_err("Do cache maintenance in the read-only memory range\n"); We don't scream on the console for guests bugs. > + ret = -EFAULT; > goto out_unlock; > } And what is userspace going to do with that? To be honest, I'd rather not report it in any case: - either it isn't mapped, and there is no cache to clean/invalidate - or it is mapped read-only: - if it is a "DC IVAC", the guest should get the fault as per the ARM ARM. But I don't think we can identify the particular CMO at this stage, so actually performing an invalidation is the least bad thing to do. How about this (untested)? M. diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..0f497faad131 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) } /* - * Check for a cache maintenance operation. Since we - * ended-up here, we know it is outside of any memory - * slot. But we can't find out if that is for a device, - * or if the guest is just being stupid. The only thing - * we know for sure is that this range cannot be cached. + * Check for a cache maintenance operation. Two cases: * - * So let's assume that the guest is just being - * cautious, and skip the instruction. + * - It is outside of any memory slot. But we can't + * find out if that is for a device, or if the guest + * is just being stupid. The only thing we know for + * sure is that this range cannot be cached. So + * let's assume that the guest is just being + * cautious, and skip the instruction. + * + * - Otherwise, clean/invalidate the whole memslot. We + * should special-case DC IVAC and inject a + * permission fault, but we can't really identify it + * in this context. */ - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) { + if (kvm_vcpu_dabt_is_cm(vcpu)) { + if (!kvm_is_error_hva(hva)) { + spin_lock(&vcpu->kvm->mmu_lock); + stage2_flush_memslot(vcpu->kvm, memslot); + spin_unlock(&vcpu->kvm->mmu_lock); + } + kvm_incr_pc(vcpu); ret = 1; goto out_unlock;
Hi Marc, > -----Original Message----- > From: Marc Zyngier <maz@kernel.org> > Sent: Friday, January 15, 2021 7:21 PM > To: Jianyong Wu <Jianyong.Wu@arm.com> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > Poulose <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>; > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > Subject: Re: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > On 2021-01-15 09:30, Jianyong Wu wrote: > > Currently, error report when cache maintenance at read-only memory > > range, like rom, is not clear enough and even not correct. As the > > specific error is definitely known by kvm, it is obliged to give it > > out. > > > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash range > > from > > 0 to 128M, error is reported by kvm as "Data abort outside memslots > > with no valid syndrome info" which is not quite correct. > > > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > > --- > > arch/arm64/kvm/mmu.c | 12 +++++++++--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > > 7d2257cc5438..de66b7e38a5b 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu > > *vcpu) > > * So let's assume that the guest is just being > > * cautious, and skip the instruction. > > */ > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > { > > - kvm_incr_pc(vcpu); > > - ret = 1; > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > > + if (kvm_is_error_hva(hva)) { > > + kvm_incr_pc(vcpu); > > + ret = 1; > > + goto out_unlock; > > + } > > + > > + kvm_err("Do cache maintenance in the read-only > memory range\n"); > > We don't scream on the console for guests bugs. Ok > > > + ret = -EFAULT; > > goto out_unlock; > > } > > And what is userspace going to do with that? To be honest, I'd rather not > report it in any case: > > - either it isn't mapped, and there is no cache to clean/invalidate > - or it is mapped read-only: > - if it is a "DC IVAC", the guest should get the fault as per > the ARM ARM. But I don't think we can identify the particular CMO > at this stage, so actually performing an invalidation is the least > bad thing to do. > > How about this (untested)? > > M. > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > 7d2257cc5438..0f497faad131 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu > *vcpu) > } > > /* > - * Check for a cache maintenance operation. Since we > - * ended-up here, we know it is outside of any memory > - * slot. But we can't find out if that is for a device, > - * or if the guest is just being stupid. The only thing > - * we know for sure is that this range cannot be cached. > + * Check for a cache maintenance operation. Two cases: > * > - * So let's assume that the guest is just being > - * cautious, and skip the instruction. > + * - It is outside of any memory slot. But we can't > + * find out if that is for a device, or if the guest > + * is just being stupid. The only thing we know for > + * sure is that this range cannot be cached. So > + * let's assume that the guest is just being > + * cautious, and skip the instruction. > + * > + * - Otherwise, clean/invalidate the whole memslot. We > + * should special-case DC IVAC and inject a > + * permission fault, but we can't really identify it > + * in this context. > */ > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > { > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > + if (!kvm_is_error_hva(hva)) { > + spin_lock(&vcpu->kvm->mmu_lock); > + stage2_flush_memslot(vcpu->kvm, > memslot); > + spin_unlock(&vcpu->kvm->mmu_lock); > + } > + > kvm_incr_pc(vcpu); > ret = 1; > goto out_unlock; > Thanks Marc, it's OK for me and I will do the test for it. Thanks Jianyong > -- > Jazz is not dead. It just smells funny...
Hi Marc, > -----Original Message----- > From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- > bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu > Sent: Saturday, January 16, 2021 4:47 PM > To: Marc Zyngier <maz@kernel.org> > Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > Subject: RE: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > Hi Marc, > > > -----Original Message----- > > From: Marc Zyngier <maz@kernel.org> > > Sent: Friday, January 15, 2021 7:21 PM > > To: Jianyong Wu <Jianyong.Wu@arm.com> > > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > Poulose > > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>; > > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > > Subject: Re: [PATCH] arm64/kvm: correct the error report in > > kvm_handle_guest_abort > > > > On 2021-01-15 09:30, Jianyong Wu wrote: > > > Currently, error report when cache maintenance at read-only memory > > > range, like rom, is not clear enough and even not correct. As the > > > specific error is definitely known by kvm, it is obliged to give it > > > out. > > > > > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash > > > range from > > > 0 to 128M, error is reported by kvm as "Data abort outside memslots > > > with no valid syndrome info" which is not quite correct. > > > > > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > > > --- > > > arch/arm64/kvm/mmu.c | 12 +++++++++--- > > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > > > 7d2257cc5438..de66b7e38a5b 100644 > > > --- a/arch/arm64/kvm/mmu.c > > > +++ b/arch/arm64/kvm/mmu.c > > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu > > > *vcpu) > > > * So let's assume that the guest is just being > > > * cautious, and skip the instruction. > > > */ > > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > > { > > > - kvm_incr_pc(vcpu); > > > - ret = 1; > > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > > > + if (kvm_is_error_hva(hva)) { > > > + kvm_incr_pc(vcpu); > > > + ret = 1; > > > + goto out_unlock; > > > + } > > > + > > > + kvm_err("Do cache maintenance in the read-only > > memory range\n"); > > > > We don't scream on the console for guests bugs. > Ok > > > > > > + ret = -EFAULT; > > > goto out_unlock; > > > } > > > > And what is userspace going to do with that? To be honest, I'd rather > > not report it in any case: > > > > - either it isn't mapped, and there is no cache to clean/invalidate > > - or it is mapped read-only: > > - if it is a "DC IVAC", the guest should get the fault as per > > the ARM ARM. But I don't think we can identify the particular CMO > > at this stage, so actually performing an invalidation is the least > > bad thing to do. > > > > How about this (untested)? I have tested for this. It works that DC ops can pass on memory range for rom. But there is performance issue. It takes too long a time that do DC on rom range compared with on normal memory range. Here is some data: Ops memory type size time dc civac rom memory 128M 6700ms; dc civac writable normal memory 128M 300ms; It's a single thread test and may be worse on multi thread. I'm not sure we can bear it. WDYT? Thanks Jianyong > > > > M. > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > > 7d2257cc5438..0f497faad131 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu > > *vcpu) > > } > > > > /* > > - * Check for a cache maintenance operation. Since we > > - * ended-up here, we know it is outside of any memory > > - * slot. But we can't find out if that is for a device, > > - * or if the guest is just being stupid. The only thing > > - * we know for sure is that this range cannot be cached. > > + * Check for a cache maintenance operation. Two cases: > > * > > - * So let's assume that the guest is just being > > - * cautious, and skip the instruction. > > + * - It is outside of any memory slot. But we can't > > + * find out if that is for a device, or if the guest > > + * is just being stupid. The only thing we know for > > + * sure is that this range cannot be cached. So > > + * let's assume that the guest is just being > > + * cautious, and skip the instruction. > > + * > > + * - Otherwise, clean/invalidate the whole memslot. We > > + * should special-case DC IVAC and inject a > > + * permission fault, but we can't really identify it > > + * in this context. > > */ > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > > { > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > > + if (!kvm_is_error_hva(hva)) { > > + spin_lock(&vcpu->kvm->mmu_lock); > > + stage2_flush_memslot(vcpu->kvm, > > memslot); > > + spin_unlock(&vcpu->kvm->mmu_lock); > > + } > > + > > kvm_incr_pc(vcpu); > > ret = 1; > > goto out_unlock; > > > Thanks Marc, it's OK for me and I will do the test for it. > > Thanks > Jianyong > > > -- > > Jazz is not dead. It just smells funny... > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Hi Marc, > -----Original Message----- > From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- > bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu > Sent: Saturday, January 16, 2021 4:47 PM > To: Marc Zyngier <maz@kernel.org> > Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > Subject: RE: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > Hi Marc, > > > -----Original Message----- > > From: Marc Zyngier <maz@kernel.org> > > Sent: Friday, January 15, 2021 7:21 PM > > To: Jianyong Wu <Jianyong.Wu@arm.com> > > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > Poulose > > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>; > > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > > Subject: Re: [PATCH] arm64/kvm: correct the error report in > > kvm_handle_guest_abort > > > > On 2021-01-15 09:30, Jianyong Wu wrote: > > > Currently, error report when cache maintenance at read-only memory > > > range, like rom, is not clear enough and even not correct. As the > > > specific error is definitely known by kvm, it is obliged to give it > > > out. > > > > > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash > > > range from > > > 0 to 128M, error is reported by kvm as "Data abort outside memslots > > > with no valid syndrome info" which is not quite correct. > > > > > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > > > --- > > > arch/arm64/kvm/mmu.c | 12 +++++++++--- > > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > > > 7d2257cc5438..de66b7e38a5b 100644 > > > --- a/arch/arm64/kvm/mmu.c > > > +++ b/arch/arm64/kvm/mmu.c > > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu > > > *vcpu) > > > * So let's assume that the guest is just being > > > * cautious, and skip the instruction. > > > */ > > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > > { > > > - kvm_incr_pc(vcpu); > > > - ret = 1; > > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > > > + if (kvm_is_error_hva(hva)) { > > > + kvm_incr_pc(vcpu); > > > + ret = 1; > > > + goto out_unlock; > > > + } > > > + > > > + kvm_err("Do cache maintenance in the read-only > > memory range\n"); > > > > We don't scream on the console for guests bugs. > Ok > > > > > > + ret = -EFAULT; > > > goto out_unlock; > > > } > > > > And what is userspace going to do with that? To be honest, I'd rather > > not report it in any case: > > > > - either it isn't mapped, and there is no cache to clean/invalidate > > - or it is mapped read-only: > > - if it is a "DC IVAC", the guest should get the fault as per > > the ARM ARM. But I don't think we can identify the particular CMO > > at this stage, so actually performing an invalidation is the least > > bad thing to do. > > > > How about this (untested)? I have tested for this. It works that DC ops can pass on memory range for rom. But there is performance issue. It takes too long a time that do DC on rom range compared with on normal memory range. Here is some data: Ops memory type size time dc civac rom memory 128M 6700ms; dc civac writable normal memory 128M 300ms; It's a single thread test and may be worse on multi thread. I'm not sure we can bear it. WDYT? Thanks Jianyong > > > > M. > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > > 7d2257cc5438..0f497faad131 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -1013,16 +1013,27 @@ int kvm_handle_guest_abort(struct kvm_vcpu > > *vcpu) > > } > > > > /* > > - * Check for a cache maintenance operation. Since we > > - * ended-up here, we know it is outside of any memory > > - * slot. But we can't find out if that is for a device, > > - * or if the guest is just being stupid. The only thing > > - * we know for sure is that this range cannot be cached. > > + * Check for a cache maintenance operation. Two cases: > > * > > - * So let's assume that the guest is just being > > - * cautious, and skip the instruction. > > + * - It is outside of any memory slot. But we can't > > + * find out if that is for a device, or if the guest > > + * is just being stupid. The only thing we know for > > + * sure is that this range cannot be cached. So > > + * let's assume that the guest is just being > > + * cautious, and skip the instruction. > > + * > > + * - Otherwise, clean/invalidate the whole memslot. We > > + * should special-case DC IVAC and inject a > > + * permission fault, but we can't really identify it > > + * in this context. > > */ > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > > { > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > > + if (!kvm_is_error_hva(hva)) { > > + spin_lock(&vcpu->kvm->mmu_lock); > > + stage2_flush_memslot(vcpu->kvm, > > memslot); > > + spin_unlock(&vcpu->kvm->mmu_lock); > > + } > > + > > kvm_incr_pc(vcpu); > > ret = 1; > > goto out_unlock; > > > Thanks Marc, it's OK for me and I will do the test for it. > > Thanks > Jianyong > > > -- > > Jazz is not dead. It just smells funny... > _______________________________________________ > kvmarm mailing list > kvmarm@lists.cs.columbia.edu > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
On 2021-01-18 13:04, Jianyong Wu wrote: > Hi Marc, > >> -----Original Message----- >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu >> Sent: Saturday, January 16, 2021 4:47 PM >> To: Marc Zyngier <maz@kernel.org> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org >> Subject: RE: [PATCH] arm64/kvm: correct the error report in >> kvm_handle_guest_abort >> >> Hi Marc, >> >> > -----Original Message----- >> > From: Marc Zyngier <maz@kernel.org> >> > Sent: Friday, January 15, 2021 7:21 PM >> > To: Jianyong Wu <Jianyong.Wu@arm.com> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki >> Poulose >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; >> > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>; >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in >> > kvm_handle_guest_abort >> > >> > On 2021-01-15 09:30, Jianyong Wu wrote: >> > > Currently, error report when cache maintenance at read-only memory >> > > range, like rom, is not clear enough and even not correct. As the >> > > specific error is definitely known by kvm, it is obliged to give it >> > > out. >> > > >> > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash >> > > range from >> > > 0 to 128M, error is reported by kvm as "Data abort outside memslots >> > > with no valid syndrome info" which is not quite correct. >> > > >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> >> > > --- >> > > arch/arm64/kvm/mmu.c | 12 +++++++++--- >> > > 1 file changed, 9 insertions(+), 3 deletions(-) >> > > >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index >> > > 7d2257cc5438..de66b7e38a5b 100644 >> > > --- a/arch/arm64/kvm/mmu.c >> > > +++ b/arch/arm64/kvm/mmu.c >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu >> > > *vcpu) >> > > * So let's assume that the guest is just being >> > > * cautious, and skip the instruction. >> > > */ >> > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) >> > { >> > > - kvm_incr_pc(vcpu); >> > > - ret = 1; >> > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { >> > > + if (kvm_is_error_hva(hva)) { >> > > + kvm_incr_pc(vcpu); >> > > + ret = 1; >> > > + goto out_unlock; >> > > + } >> > > + >> > > + kvm_err("Do cache maintenance in the read-only >> > memory range\n"); >> > >> > We don't scream on the console for guests bugs. >> Ok >> >> > >> > > + ret = -EFAULT; >> > > goto out_unlock; >> > > } >> > >> > And what is userspace going to do with that? To be honest, I'd rather >> > not report it in any case: >> > >> > - either it isn't mapped, and there is no cache to clean/invalidate >> > - or it is mapped read-only: >> > - if it is a "DC IVAC", the guest should get the fault as per >> > the ARM ARM. But I don't think we can identify the particular CMO >> > at this stage, so actually performing an invalidation is the least >> > bad thing to do. >> > >> > How about this (untested)? > > I have tested for this. It works that DC ops can pass on memory range > for rom. But there is performance issue. It takes too long a time that > do DC on rom range compared with on normal memory range. Here is some > data: > Ops memory type size > time > dc civac rom memory 128M > 6700ms; > dc civac writable normal memory 128M > 300ms; > > It's a single thread test and may be worse on multi thread. I'm not > sure we can bear it. WDYT? The problem is that the guest is invalidating one cache-line at a time, but we invalidate 128M at a time in your example. I would say that I really don't care how slow it is. We cannot know which address the guest is trying to invalidate (as your commit message shows, there is no syndrome information available). So it seems our only choices are: - don't do any invalidation, which is likely to break the guest - invalidate everything, always Given that, I'd rather have a slow guest. Also, it very much looks like no existing SW does this, so I cannot say I care much. Thanks, M.
> -----Original Message----- > From: Marc Zyngier <maz@kernel.org> > Sent: Monday, January 18, 2021 9:26 PM > To: Jianyong Wu <Jianyong.Wu@arm.com> > Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > On 2021-01-18 13:04, Jianyong Wu wrote: > > Hi Marc, > > > >> -----Original Message----- > >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- > >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu > >> Sent: Saturday, January 16, 2021 4:47 PM > >> To: Marc Zyngier <maz@kernel.org> > >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > >> Subject: RE: [PATCH] arm64/kvm: correct the error report in > >> kvm_handle_guest_abort > >> > >> Hi Marc, > >> > >> > -----Original Message----- > >> > From: Marc Zyngier <maz@kernel.org> > >> > Sent: Friday, January 15, 2021 7:21 PM > >> > To: Jianyong Wu <Jianyong.Wu@arm.com> > >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > >> Poulose > >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > >> > kvmarm@lists.cs.columbia.edu; Steve Capper > <Steve.Capper@arm.com>; > >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in > >> > kvm_handle_guest_abort > >> > > >> > On 2021-01-15 09:30, Jianyong Wu wrote: > >> > > Currently, error report when cache maintenance at read-only > >> > > memory range, like rom, is not clear enough and even not correct. > >> > > As the specific error is definitely known by kvm, it is obliged > >> > > to give it out. > >> > > > >> > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash > >> > > range from > >> > > 0 to 128M, error is reported by kvm as "Data abort outside > >> > > memslots with no valid syndrome info" which is not quite correct. > >> > > > >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > >> > > --- > >> > > arch/arm64/kvm/mmu.c | 12 +++++++++--- > >> > > 1 file changed, 9 insertions(+), 3 deletions(-) > >> > > > >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > >> > > 7d2257cc5438..de66b7e38a5b 100644 > >> > > --- a/arch/arm64/kvm/mmu.c > >> > > +++ b/arch/arm64/kvm/mmu.c > >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct > kvm_vcpu > >> > > *vcpu) > >> > > * So let's assume that the guest is just being > >> > > * cautious, and skip the instruction. > >> > > */ > >> > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > >> > { > >> > > - kvm_incr_pc(vcpu); > >> > > - ret = 1; > >> > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > >> > > + if (kvm_is_error_hva(hva)) { > >> > > + kvm_incr_pc(vcpu); > >> > > + ret = 1; > >> > > + goto out_unlock; > >> > > + } > >> > > + > >> > > + kvm_err("Do cache maintenance in the read- > only > >> > memory range\n"); > >> > > >> > We don't scream on the console for guests bugs. > >> Ok > >> > >> > > >> > > + ret = -EFAULT; > >> > > goto out_unlock; > >> > > } > >> > > >> > And what is userspace going to do with that? To be honest, I'd > >> > rather not report it in any case: > >> > > >> > - either it isn't mapped, and there is no cache to clean/invalidate > >> > - or it is mapped read-only: > >> > - if it is a "DC IVAC", the guest should get the fault as per > >> > the ARM ARM. But I don't think we can identify the particular CMO > >> > at this stage, so actually performing an invalidation is the least > >> > bad thing to do. > >> > > >> > How about this (untested)? > > > > I have tested for this. It works that DC ops can pass on memory range > > for rom. But there is performance issue. It takes too long a time that > > do DC on rom range compared with on normal memory range. Here is > some > > data: > > Ops memory type size > > time > > dc civac rom memory 128M > > 6700ms; > > dc civac writable normal memory 128M > > 300ms; > > > > It's a single thread test and may be worse on multi thread. I'm not > > sure we can bear it. WDYT? > > The problem is that the guest is invalidating one cache-line at a time, but we > invalidate 128M at a time in your example. > > I would say that I really don't care how slow it is. We cannot know which > address the guest is trying to invalidate (as your commit message shows, > there is no syndrome information available). > > So it seems our only choices are: > - don't do any invalidation, which is likely to break the guest > - invalidate everything, always > > Given that, I'd rather have a slow guest. Also, it very much looks like no > existing SW does this, so I cannot say I care much. OK, get it. Thanks Jianyong > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny...
On 2021-01-18 13:38, Jianyong Wu wrote: >> -----Original Message----- >> From: Marc Zyngier <maz@kernel.org> >> Sent: Monday, January 18, 2021 9:26 PM >> To: Jianyong Wu <Jianyong.Wu@arm.com> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org >> Subject: Re: [PATCH] arm64/kvm: correct the error report in >> kvm_handle_guest_abort >> >> On 2021-01-18 13:04, Jianyong Wu wrote: >> > Hi Marc, >> > >> >> -----Original Message----- >> >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- >> >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu >> >> Sent: Saturday, January 16, 2021 4:47 PM >> >> To: Marc Zyngier <maz@kernel.org> >> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; >> >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org >> >> Subject: RE: [PATCH] arm64/kvm: correct the error report in >> >> kvm_handle_guest_abort >> >> >> >> Hi Marc, >> >> >> >> > -----Original Message----- >> >> > From: Marc Zyngier <maz@kernel.org> >> >> > Sent: Friday, January 15, 2021 7:21 PM >> >> > To: Jianyong Wu <Jianyong.Wu@arm.com> >> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki >> >> Poulose >> >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; >> >> > kvmarm@lists.cs.columbia.edu; Steve Capper >> <Steve.Capper@arm.com>; >> >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com> >> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in >> >> > kvm_handle_guest_abort >> >> > >> >> > On 2021-01-15 09:30, Jianyong Wu wrote: >> >> > > Currently, error report when cache maintenance at read-only >> >> > > memory range, like rom, is not clear enough and even not correct. >> >> > > As the specific error is definitely known by kvm, it is obliged >> >> > > to give it out. >> >> > > >> >> > > Fox example, in a qemu/kvm VM, if the guest do dc at the pflash >> >> > > range from >> >> > > 0 to 128M, error is reported by kvm as "Data abort outside >> >> > > memslots with no valid syndrome info" which is not quite correct. >> >> > > >> >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> >> >> > > --- >> >> > > arch/arm64/kvm/mmu.c | 12 +++++++++--- >> >> > > 1 file changed, 9 insertions(+), 3 deletions(-) >> >> > > >> >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index >> >> > > 7d2257cc5438..de66b7e38a5b 100644 >> >> > > --- a/arch/arm64/kvm/mmu.c >> >> > > +++ b/arch/arm64/kvm/mmu.c >> >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct >> kvm_vcpu >> >> > > *vcpu) >> >> > > * So let's assume that the guest is just being >> >> > > * cautious, and skip the instruction. >> >> > > */ >> >> > > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) >> >> > { >> >> > > - kvm_incr_pc(vcpu); >> >> > > - ret = 1; >> >> > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { >> >> > > + if (kvm_is_error_hva(hva)) { >> >> > > + kvm_incr_pc(vcpu); >> >> > > + ret = 1; >> >> > > + goto out_unlock; >> >> > > + } >> >> > > + >> >> > > + kvm_err("Do cache maintenance in the read- >> only >> >> > memory range\n"); >> >> > >> >> > We don't scream on the console for guests bugs. >> >> Ok >> >> >> >> > >> >> > > + ret = -EFAULT; >> >> > > goto out_unlock; >> >> > > } >> >> > >> >> > And what is userspace going to do with that? To be honest, I'd >> >> > rather not report it in any case: >> >> > >> >> > - either it isn't mapped, and there is no cache to clean/invalidate >> >> > - or it is mapped read-only: >> >> > - if it is a "DC IVAC", the guest should get the fault as per >> >> > the ARM ARM. But I don't think we can identify the particular CMO >> >> > at this stage, so actually performing an invalidation is the least >> >> > bad thing to do. >> >> > >> >> > How about this (untested)? >> > >> > I have tested for this. It works that DC ops can pass on memory range >> > for rom. But there is performance issue. It takes too long a time that >> > do DC on rom range compared with on normal memory range. Here is >> some >> > data: >> > Ops memory type size >> > time >> > dc civac rom memory 128M >> > 6700ms; >> > dc civac writable normal memory 128M >> > 300ms; >> > >> > It's a single thread test and may be worse on multi thread. I'm not >> > sure we can bear it. WDYT? >> >> The problem is that the guest is invalidating one cache-line at a >> time, but we >> invalidate 128M at a time in your example. >> >> I would say that I really don't care how slow it is. We cannot know >> which >> address the guest is trying to invalidate (as your commit message >> shows, >> there is no syndrome information available). >> >> So it seems our only choices are: >> - don't do any invalidation, which is likely to break the guest >> - invalidate everything, always >> >> Given that, I'd rather have a slow guest. Also, it very much looks >> like no >> existing SW does this, so I cannot say I care much. > > OK, get it. Actually, there could be a way to make this a bit faster. Once we have cleaned+invalidated the memslot, we could unmap it, speeding up the following cache invalidations (nothing will be mapped). Could you please share your test case? Thanks, M.
> -----Original Message----- > From: Marc Zyngier <maz@kernel.org> > Sent: Monday, January 18, 2021 9:44 PM > To: Jianyong Wu <Jianyong.Wu@arm.com> > Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > On 2021-01-18 13:38, Jianyong Wu wrote: > >> -----Original Message----- > >> From: Marc Zyngier <maz@kernel.org> > >> Sent: Monday, January 18, 2021 9:26 PM > >> To: Jianyong Wu <Jianyong.Wu@arm.com> > >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; will@kernel.org; > >> kvmarm@lists.cs.columbia.edu; linux-arm-kernel@lists.infradead.org > >> Subject: Re: [PATCH] arm64/kvm: correct the error report in > >> kvm_handle_guest_abort > >> > >> On 2021-01-18 13:04, Jianyong Wu wrote: > >> > Hi Marc, > >> > > >> >> -----Original Message----- > >> >> From: kvmarm-bounces@lists.cs.columbia.edu <kvmarm- > >> >> bounces@lists.cs.columbia.edu> On Behalf Of Jianyong Wu > >> >> Sent: Saturday, January 16, 2021 4:47 PM > >> >> To: Marc Zyngier <maz@kernel.org> > >> >> Cc: Justin He <Justin.He@arm.com>; nd <nd@arm.com>; > >> >> will@kernel.org; kvmarm@lists.cs.columbia.edu; > >> >> linux-arm-kernel@lists.infradead.org > >> >> Subject: RE: [PATCH] arm64/kvm: correct the error report in > >> >> kvm_handle_guest_abort > >> >> > >> >> Hi Marc, > >> >> > >> >> > -----Original Message----- > >> >> > From: Marc Zyngier <maz@kernel.org> > >> >> > Sent: Friday, January 15, 2021 7:21 PM > >> >> > To: Jianyong Wu <Jianyong.Wu@arm.com> > >> >> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > >> >> Poulose > >> >> > <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > >> >> > kvmarm@lists.cs.columbia.edu; Steve Capper > >> <Steve.Capper@arm.com>; > >> >> > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > >> >> > Subject: Re: [PATCH] arm64/kvm: correct the error report in > >> >> > kvm_handle_guest_abort > >> >> > > >> >> > On 2021-01-15 09:30, Jianyong Wu wrote: > >> >> > > Currently, error report when cache maintenance at read-only > >> >> > > memory range, like rom, is not clear enough and even not correct. > >> >> > > As the specific error is definitely known by kvm, it is > >> >> > > obliged to give it out. > >> >> > > > >> >> > > Fox example, in a qemu/kvm VM, if the guest do dc at the > >> >> > > pflash range from > >> >> > > 0 to 128M, error is reported by kvm as "Data abort outside > >> >> > > memslots with no valid syndrome info" which is not quite correct. > >> >> > > > >> >> > > Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> > >> >> > > --- > >> >> > > arch/arm64/kvm/mmu.c | 12 +++++++++--- > >> >> > > 1 file changed, 9 insertions(+), 3 deletions(-) > >> >> > > > >> >> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index > >> >> > > 7d2257cc5438..de66b7e38a5b 100644 > >> >> > > --- a/arch/arm64/kvm/mmu.c > >> >> > > +++ b/arch/arm64/kvm/mmu.c > >> >> > > @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct > >> kvm_vcpu > >> >> > > *vcpu) > >> >> > > * So let's assume that the guest is just being > >> >> > > * cautious, and skip the instruction. > >> >> > > */ > >> >> > > - if (kvm_is_error_hva(hva) && > kvm_vcpu_dabt_is_cm(vcpu)) > >> >> > { > >> >> > > - kvm_incr_pc(vcpu); > >> >> > > - ret = 1; > >> >> > > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > >> >> > > + if (kvm_is_error_hva(hva)) { > >> >> > > + kvm_incr_pc(vcpu); > >> >> > > + ret = 1; > >> >> > > + goto out_unlock; > >> >> > > + } > >> >> > > + > >> >> > > + kvm_err("Do cache maintenance in the read- > >> only > >> >> > memory range\n"); > >> >> > > >> >> > We don't scream on the console for guests bugs. > >> >> Ok > >> >> > >> >> > > >> >> > > + ret = -EFAULT; > >> >> > > goto out_unlock; > >> >> > > } > >> >> > > >> >> > And what is userspace going to do with that? To be honest, I'd > >> >> > rather not report it in any case: > >> >> > > >> >> > - either it isn't mapped, and there is no cache to > >> >> > clean/invalidate > >> >> > - or it is mapped read-only: > >> >> > - if it is a "DC IVAC", the guest should get the fault as per > >> >> > the ARM ARM. But I don't think we can identify the particular CMO > >> >> > at this stage, so actually performing an invalidation is the least > >> >> > bad thing to do. > >> >> > > >> >> > How about this (untested)? > >> > > >> > I have tested for this. It works that DC ops can pass on memory > >> > range for rom. But there is performance issue. It takes too long a > >> > time that do DC on rom range compared with on normal memory range. > >> > Here is > >> some > >> > data: > >> > Ops memory type size > >> > time > >> > dc civac rom memory 128M > >> > 6700ms; > >> > dc civac writable normal memory 128M > >> > 300ms; > >> > > >> > It's a single thread test and may be worse on multi thread. I'm not > >> > sure we can bear it. WDYT? > >> > >> The problem is that the guest is invalidating one cache-line at a > >> time, but we invalidate 128M at a time in your example. > >> > >> I would say that I really don't care how slow it is. We cannot know > >> which address the guest is trying to invalidate (as your commit > >> message shows, there is no syndrome information available). > >> > >> So it seems our only choices are: > >> - don't do any invalidation, which is likely to break the guest > >> - invalidate everything, always > >> > >> Given that, I'd rather have a slow guest. Also, it very much looks > >> like no existing SW does this, so I cannot say I care much. > > > > OK, get it. > > Actually, there could be a way to make this a bit faster. Once we have > cleaned+invalidated the memslot, we could unmap it, speeding up the > following cache invalidations (nothing will be mapped). > > Could you please share your test case? Yeah, here is my test method: *Make sure your arm64 server supports gic-v2. git clone https://github.com/unikraft/unikraft #unikraft is a unikernel cd unikraft vim plat/kvm/arm/entry.S before jumping to clean_and_invalidate_dcache_range set x0 as base address and x1 as the memory size. For the qemu/virt, rom address is 0~128M, normal memory starts 1G. like: ... /* * set x0 to the location stores dtb as the base address of the * memory range to be cache maintained */ mov x0, 0 mov x1, 0x08000000 bl clean_and_invalidate_dcache_range ... Build unikraft using "make", before build, using "make menuconfig" to choose the armv8 as architecture and kvm as platform. Run it using qemu: qemu-system-aarch64 \ -machine virt,gic-version=2,accel=kvm \ -m 1024 \ -display none \ -nographic -nodefaults \ -serial stdio -kernel build/unikraft_kvm-arm64.dbg \ -cpu host -smp 1 Also I inject debug info into qemu to measure the time consumed by "ioctl(cpu->kvm_fd, type, arg)" in "kvm_vcpu_ioctl". Like: //accel/kvm/kvm-all.c int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) { ... gettimeofday(&tv1,NULL); ret = ioctl(cpu->kvm_fd, type, arg); gettimeofday(&tv2,NULL); us = tv2.tv_usec - tv1.tv_usec; sec = tv2.tv_sec - tv1.tv_sec; if (sec > 0 || us > 50000) { printf("++++++ kvm_vcpu_ioctl, time is %ds, %dus ++++\n", count, sec, us); } .,.. } You can have a try. Thanks Jianyong > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny...
Hi Marc, > /* > - * Check for a cache maintenance operation. Since we > - * ended-up here, we know it is outside of any memory > - * slot. But we can't find out if that is for a device, > - * or if the guest is just being stupid. The only thing > - * we know for sure is that this range cannot be cached. > + * Check for a cache maintenance operation. Two cases: > * > - * So let's assume that the guest is just being > - * cautious, and skip the instruction. > + * - It is outside of any memory slot. But we can't > + * find out if that is for a device, or if the guest > + * is just being stupid. The only thing we know for > + * sure is that this range cannot be cached. So > + * let's assume that the guest is just being > + * cautious, and skip the instruction. > + * > + * - Otherwise, clean/invalidate the whole memslot. We > + * should special-case DC IVAC and inject a > + * permission fault, but we can't really identify it > + * in this context. > */ > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > { > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > + if (!kvm_is_error_hva(hva)) { > + spin_lock(&vcpu->kvm->mmu_lock); > + stage2_flush_memslot(vcpu->kvm, > memslot); Maybe we should not flush the whole memslot here as every "dc ivac" only work on a range of memory with cache line size. So what about using: stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush) instead. It will a bit faster than flush the whole memslot. Also I test your idea of "unmap after flush" using: stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush); stage2_apply_range(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_unmap, true); then I do "dc ivac" on the rom of 128M twice and got the double time of around 11s. it means that there is no optimization when do the second "dc ivac". I'm not sure if there is something wrong in my test. So what about just using " stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);" instead of " stage2_flush_memslot(vcpu->kvm, memslot);" and let the guest bears the disadvantage of low performance. Thanks Jianyong > + spin_unlock(&vcpu->kvm->mmu_lock); > + } > + > kvm_incr_pc(vcpu); > ret = 1; > goto out_unlock; > > -- > Jazz is not dead. It just smells funny...
On 2021-01-26 08:10, Jianyong Wu wrote: > Hi Marc, > >> /* >> - * Check for a cache maintenance operation. Since we >> - * ended-up here, we know it is outside of any memory >> - * slot. But we can't find out if that is for a device, >> - * or if the guest is just being stupid. The only thing >> - * we know for sure is that this range cannot be cached. >> + * Check for a cache maintenance operation. Two cases: >> * >> - * So let's assume that the guest is just being >> - * cautious, and skip the instruction. >> + * - It is outside of any memory slot. But we can't >> + * find out if that is for a device, or if the guest >> + * is just being stupid. The only thing we know for >> + * sure is that this range cannot be cached. So >> + * let's assume that the guest is just being >> + * cautious, and skip the instruction. >> + * >> + * - Otherwise, clean/invalidate the whole memslot. We >> + * should special-case DC IVAC and inject a >> + * permission fault, but we can't really identify it >> + * in this context. >> */ >> - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) >> { >> + if (kvm_vcpu_dabt_is_cm(vcpu)) { >> + if (!kvm_is_error_hva(hva)) { >> + spin_lock(&vcpu->kvm->mmu_lock); >> + stage2_flush_memslot(vcpu->kvm, >> memslot); > > Maybe we should not flush the whole memslot here as every "dc ivac" > only work on a range of memory with cache line size. So what about > using: > stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_flush) instead. It will a bit > faster than flush the whole memslot. > Also I test your idea of "unmap after flush" using: > stage2_apply_range_resched(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_flush); > stage2_apply_range(vcpu->kvm, fault_ipa, fault_ipa + > cache_line_size(), kvm_pgtable_stage2_unmap, true); > then I do "dc ivac" on the rom of 128M twice and got the double time > of around 11s. it means that there is no optimization when do the > second "dc ivac". > I'm not sure if there is something wrong in my test. > > So what about just using " stage2_apply_range_resched(vcpu->kvm, > fault_ipa, fault_ipa + cache_line_size(), kvm_pgtable_stage2_flush);" > instead of > " stage2_flush_memslot(vcpu->kvm, memslot);" and let the guest bears > the disadvantage of low performance. No, both solutions are wrong. I had to write my own test case because the idea of hacking some unknown guest wasn't very appealing. At the end of the day, what we need to implement is as follow: - if a CMO hits normal memory, it's all already handled - if a CMO hits non-memory, we skip it, as we do today - if a CMO hits R/O memory, that's where things become fun: - if it is a DC IVAC, the architecture says this should result in a permission fault - if it is a DC CIVAC, it works as expected So we need to distinguish between IVAC and CIVAC. One way to do it is to treat CMOs generating a translation fault as a *read*, even when they are on a RO memslot. If they come back with a permission fault: - inside a RW memslot: no problem, treat it as a write - inside a RO memslot: only IVAC will fault here, inject an abort in the guest This translates into the following patch, which does the trick for me. M. From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 2001 From: Marc Zyngier <maz@kernel.org> Date: Sat, 16 Jan 2021 10:53:21 +0000 Subject: [PATCH] CMO on RO memslot Signed-off-by: Marc Zyngier <maz@kernel.org> --- arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++----------- 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..3c176b5b0a28 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, struct kvm_pgtable *pgt; fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); - write_fault = kvm_is_write_fault(vcpu); + /* + * Treat translation faults on CMOs as read faults. Should + * this further generate a permission fault, it will be caught + * in kvm_handle_guest_abort(), with prejudice... + */ + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu)) + write_fault = false; + else + write_fault = kvm_is_write_fault(vcpu); exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); VM_BUG_ON(write_fault && exec_fault); @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) } /* - * Check for a cache maintenance operation. Since we - * ended-up here, we know it is outside of any memory - * slot. But we can't find out if that is for a device, - * or if the guest is just being stupid. The only thing - * we know for sure is that this range cannot be cached. + * Check for a cache maintenance operation. Two cases: + * + * - It is outside of any memory slot. But we can't find out + * if that is for a device, or if the guest is just being + * stupid. The only thing we know for sure is that this + * range cannot be cached. So let's assume that the guest + * is just being cautious, and skip the instruction. + * + * - Otherwise, check whether this is a permission fault. + * If so, that's a DC IVAC on a R/O memslot, which is a + * pretty bad idea, and we tell the guest so. * - * So let's assume that the guest is just being - * cautious, and skip the instruction. + * - If this wasn't a permission fault, pass it along for + * further handling (including faulting the page in if it + * was a translation fault). */ - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) { - kvm_incr_pc(vcpu); - ret = 1; - goto out_unlock; + if (kvm_vcpu_dabt_is_cm(vcpu)) { + if (kvm_is_error_hva(hva)) { + kvm_incr_pc(vcpu); + ret = 1; + goto out_unlock; + } + + if (fault_status == FSC_PERM) { + /* DC IVAC on a R/O memslot */ + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu)); + ret = 1; + goto out_unlock; + } + + goto handle_access; } /* @@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; } +handle_access: /* Userspace should not be able to register out-of-bounds IPAs */ VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm));
Hi Marc, > > From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 > 2001 > From: Marc Zyngier <maz@kernel.org> > Date: Sat, 16 Jan 2021 10:53:21 +0000 > Subject: [PATCH] CMO on RO memslot > > Signed-off-by: Marc Zyngier <maz@kernel.org> > --- > arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++---- > ------- > 1 file changed, 39 insertions(+), 12 deletions(-) > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > 7d2257cc5438..3c176b5b0a28 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, > phys_addr_t fault_ipa, > struct kvm_pgtable *pgt; > > fault_granule = 1UL << > ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); > - write_fault = kvm_is_write_fault(vcpu); > + /* > + * Treat translation faults on CMOs as read faults. Should > + * this further generate a permission fault, it will be caught > + * in kvm_handle_guest_abort(), with prejudice... > + */ > + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu)) > + write_fault = false; > + else > + write_fault = kvm_is_write_fault(vcpu); > exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); > VM_BUG_ON(write_fault && exec_fault); > > @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu > *vcpu) > } > > /* > - * Check for a cache maintenance operation. Since we > - * ended-up here, we know it is outside of any memory > - * slot. But we can't find out if that is for a device, > - * or if the guest is just being stupid. The only thing > - * we know for sure is that this range cannot be cached. > + * Check for a cache maintenance operation. Two cases: > + * > + * - It is outside of any memory slot. But we can't find out > + * if that is for a device, or if the guest is just being > + * stupid. The only thing we know for sure is that this > + * range cannot be cached. So let's assume that the guest > + * is just being cautious, and skip the instruction. > + * > + * - Otherwise, check whether this is a permission fault. > + * If so, that's a DC IVAC on a R/O memslot, which is a > + * pretty bad idea, and we tell the guest so. > * > - * So let's assume that the guest is just being > - * cautious, and skip the instruction. > + * - If this wasn't a permission fault, pass it along for > + * further handling (including faulting the page in > if it > + * was a translation fault). > */ > - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > { > - kvm_incr_pc(vcpu); > - ret = 1; > - goto out_unlock; > + if (kvm_vcpu_dabt_is_cm(vcpu)) { > + if (kvm_is_error_hva(hva)) { > + kvm_incr_pc(vcpu); > + ret = 1; > + goto out_unlock; > + } > + > + if (fault_status == FSC_PERM) { > + /* DC IVAC on a R/O memslot */ > + kvm_inject_dabt(vcpu, > kvm_vcpu_get_hfar(vcpu)); One question: In general, the "DC" ops show up very early in guest. So what if the guest do this before interrupt initialization? If so, the guest may stuck here. Thanks Jianyong > + ret = 1; > + goto out_unlock; > + } > + > + goto handle_access; > } > > /* > @@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu > *vcpu) > goto out_unlock; > } > > +handle_access: > /* Userspace should not be able to register out-of-bounds IPAs */ > VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm)); > > -- > 2.29.2 > > > -- > Jazz is not dead. It just smells funny...
On 2021-01-28 03:01, Jianyong Wu wrote: > Hi Marc, > >> >> From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 >> 2001 >> From: Marc Zyngier <maz@kernel.org> >> Date: Sat, 16 Jan 2021 10:53:21 +0000 >> Subject: [PATCH] CMO on RO memslot >> >> Signed-off-by: Marc Zyngier <maz@kernel.org> >> --- >> arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++---- >> ------- >> 1 file changed, 39 insertions(+), 12 deletions(-) >> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index >> 7d2257cc5438..3c176b5b0a28 100644 >> --- a/arch/arm64/kvm/mmu.c >> +++ b/arch/arm64/kvm/mmu.c >> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, >> phys_addr_t fault_ipa, >> struct kvm_pgtable *pgt; >> >> fault_granule = 1UL << >> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); >> - write_fault = kvm_is_write_fault(vcpu); >> + /* >> + * Treat translation faults on CMOs as read faults. Should >> + * this further generate a permission fault, it will be caught >> + * in kvm_handle_guest_abort(), with prejudice... >> + */ >> + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu)) >> + write_fault = false; >> + else >> + write_fault = kvm_is_write_fault(vcpu); >> exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); >> VM_BUG_ON(write_fault && exec_fault); >> >> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu >> *vcpu) >> } >> >> /* >> - * Check for a cache maintenance operation. Since we >> - * ended-up here, we know it is outside of any memory >> - * slot. But we can't find out if that is for a device, >> - * or if the guest is just being stupid. The only thing >> - * we know for sure is that this range cannot be cached. >> + * Check for a cache maintenance operation. Two cases: >> + * >> + * - It is outside of any memory slot. But we can't find out >> + * if that is for a device, or if the guest is just being >> + * stupid. The only thing we know for sure is that this >> + * range cannot be cached. So let's assume that the guest >> + * is just being cautious, and skip the instruction. >> + * >> + * - Otherwise, check whether this is a permission fault. >> + * If so, that's a DC IVAC on a R/O memslot, which is a >> + * pretty bad idea, and we tell the guest so. >> * >> - * So let's assume that the guest is just being >> - * cautious, and skip the instruction. >> + * - If this wasn't a permission fault, pass it along for >> + * further handling (including faulting the page in >> if it >> + * was a translation fault). >> */ >> - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) >> { >> - kvm_incr_pc(vcpu); >> - ret = 1; >> - goto out_unlock; >> + if (kvm_vcpu_dabt_is_cm(vcpu)) { >> + if (kvm_is_error_hva(hva)) { >> + kvm_incr_pc(vcpu); >> + ret = 1; >> + goto out_unlock; >> + } >> + >> + if (fault_status == FSC_PERM) { >> + /* DC IVAC on a R/O memslot */ >> + kvm_inject_dabt(vcpu, >> kvm_vcpu_get_hfar(vcpu)); > > One question: > In general, the "DC" ops show up very early in guest. So what if the > guest do this before interrupt initialization? If so, the guest may > stuck here. I don't understand your question. Do you mean "what if the guest does this without being able to handle an exception"? If that's your question, then the answer is "don't do that". The architecture is clear that DC IVAC needs write permission, and will result in an abort being delivered if there is no writable mapping (and there can't be, the memslot is R/O). DC CIVAC doesn't have that requirement, and will not generate an exception. Thanks, M.
> -----Original Message----- > From: Marc Zyngier <maz@kernel.org> > Sent: Thursday, January 28, 2021 5:07 PM > To: Jianyong Wu <Jianyong.Wu@arm.com> > Cc: James Morse <James.Morse@arm.com>; will@kernel.org; Suzuki > Poulose <Suzuki.Poulose@arm.com>; linux-arm-kernel@lists.infradead.org; > kvmarm@lists.cs.columbia.edu; Steve Capper <Steve.Capper@arm.com>; > Justin He <Justin.He@arm.com>; nd <nd@arm.com> > Subject: Re: [PATCH] arm64/kvm: correct the error report in > kvm_handle_guest_abort > > On 2021-01-28 03:01, Jianyong Wu wrote: > > Hi Marc, > > > >> > >> From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 > >> 2001 > >> From: Marc Zyngier <maz@kernel.org> > >> Date: Sat, 16 Jan 2021 10:53:21 +0000 > >> Subject: [PATCH] CMO on RO memslot > >> > >> Signed-off-by: Marc Zyngier <maz@kernel.org> > >> --- > >> arch/arm64/kvm/mmu.c | 51 > +++++++++++++++++++++++++++++++++---- > >> ------- > >> 1 file changed, 39 insertions(+), 12 deletions(-) > >> > >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index > >> 7d2257cc5438..3c176b5b0a28 100644 > >> --- a/arch/arm64/kvm/mmu.c > >> +++ b/arch/arm64/kvm/mmu.c > >> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu > *vcpu, > >> phys_addr_t fault_ipa, > >> struct kvm_pgtable *pgt; > >> > >> fault_granule = 1UL << > >> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); > >> - write_fault = kvm_is_write_fault(vcpu); > >> + /* > >> + * Treat translation faults on CMOs as read faults. Should > >> + * this further generate a permission fault, it will be caught > >> + * in kvm_handle_guest_abort(), with prejudice... > >> + */ > >> + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu)) > >> + write_fault = false; > >> + else > >> + write_fault = kvm_is_write_fault(vcpu); > >> exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); > >> VM_BUG_ON(write_fault && exec_fault); > >> > >> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct > kvm_vcpu > >> *vcpu) > >> } > >> > >> /* > >> - * Check for a cache maintenance operation. Since we > >> - * ended-up here, we know it is outside of any memory > >> - * slot. But we can't find out if that is for a device, > >> - * or if the guest is just being stupid. The only thing > >> - * we know for sure is that this range cannot be cached. > >> + * Check for a cache maintenance operation. Two cases: > >> + * > >> + * - It is outside of any memory slot. But we can't find out > >> + * if that is for a device, or if the guest is just being > >> + * stupid. The only thing we know for sure is that this > >> + * range cannot be cached. So let's assume that the guest > >> + * is just being cautious, and skip the instruction. > >> + * > >> + * - Otherwise, check whether this is a permission fault. > >> + * If so, that's a DC IVAC on a R/O memslot, which is a > >> + * pretty bad idea, and we tell the guest so. > >> * > >> - * So let's assume that the guest is just being > >> - * cautious, and skip the instruction. > >> + * - If this wasn't a permission fault, pass it along for > >> + * further handling (including faulting the page in > >> if it > >> + * was a translation fault). > >> */ > >> - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) > >> { > >> - kvm_incr_pc(vcpu); > >> - ret = 1; > >> - goto out_unlock; > >> + if (kvm_vcpu_dabt_is_cm(vcpu)) { > >> + if (kvm_is_error_hva(hva)) { > >> + kvm_incr_pc(vcpu); > >> + ret = 1; > >> + goto out_unlock; > >> + } > >> + > >> + if (fault_status == FSC_PERM) { > >> + /* DC IVAC on a R/O memslot */ > >> + kvm_inject_dabt(vcpu, > >> kvm_vcpu_get_hfar(vcpu)); > > > > One question: > > In general, the "DC" ops show up very early in guest. So what if the > > guest do this before interrupt initialization? If so, the guest may > > stuck here. > > I don't understand your question. Do you mean "what if the guest does this > without being able to handle an exception"? > > If that's your question, then the answer is "don't do that". > The architecture is clear that DC IVAC needs write permission, and will result > in an abort being delivered if there is no writable mapping (and there can't be, > the memslot is R/O). > > DC CIVAC doesn't have that requirement, and will not generate an exception. > OK, get it. I have tested the patch above using my test case. It works well for "dc civac" and for "dc ivac" , a "Synchronous External Abort" occurs in guest as expected. Thanks Jianyong > Thanks, > > M. > -- > Jazz is not dead. It just smells funny...
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7d2257cc5438..de66b7e38a5b 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1022,9 +1022,15 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) * So let's assume that the guest is just being * cautious, and skip the instruction. */ - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) { - kvm_incr_pc(vcpu); - ret = 1; + if (kvm_vcpu_dabt_is_cm(vcpu)) { + if (kvm_is_error_hva(hva)) { + kvm_incr_pc(vcpu); + ret = 1; + goto out_unlock; + } + + kvm_err("Do cache maintenance in the read-only memory range\n"); + ret = -EFAULT; goto out_unlock; }
Currently, error report when cache maintenance at read-only memory range, like rom, is not clear enough and even not correct. As the specific error is definitely known by kvm, it is obliged to give it out. Fox example, in a qemu/kvm VM, if the guest do dc at the pflash range from 0 to 128M, error is reported by kvm as "Data abort outside memslots with no valid syndrome info" which is not quite correct. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> --- arch/arm64/kvm/mmu.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)