Message ID | 5034D4E3.2080801@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 08/22/2012 03:47 PM, Xiao Guangrong wrote: > On 08/22/2012 08:06 PM, Avi Kivity wrote: >> On 08/21/2012 06:03 AM, Xiao Guangrong wrote: >>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >>> caused by write access on readonly memslot >> >> Please document this in chapter 5 of apic.txt. >> > > Okay, please review this one. > > Subject: [PATCH v6 12/12] KVM: indicate readonly access fault > > Introduce write_readonly_mem in mmio-exit-info to indicate this exit is > caused by write access on readonly memslot > I'm not sure whether this indication can be trusted by userspace. By the time userspace gets to process this, the slot may no longer exist, or it may be writable. (in the same way an mmio exit might actually hit RAM)
On 09/06/2012 10:09 PM, Avi Kivity wrote: > On 08/22/2012 03:47 PM, Xiao Guangrong wrote: >> On 08/22/2012 08:06 PM, Avi Kivity wrote: >>> On 08/21/2012 06:03 AM, Xiao Guangrong wrote: >>>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >>>> caused by write access on readonly memslot >>> >>> Please document this in chapter 5 of apic.txt. >>> >> >> Okay, please review this one. >> >> Subject: [PATCH v6 12/12] KVM: indicate readonly access fault >> >> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >> caused by write access on readonly memslot >> > > I'm not sure whether this indication can be trusted by userspace. By > the time userspace gets to process this, the slot may no longer exist, > or it may be writable. The case of deleting memslot is ok, because userspace just skips this fault if no readonly mem or no fault handler can be found. Switching memslot from readonly to writable sounds strange, i agree with you that this flag is untrusty under this case. Marcelo, any comments? > > (in the same way an mmio exit might actually hit RAM) So, in the userspace, for the safe reason, we should walk all memslots not just walking mmio handlers? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/07/2012 12:56 PM, Xiao Guangrong wrote: > On 09/06/2012 10:09 PM, Avi Kivity wrote: >> On 08/22/2012 03:47 PM, Xiao Guangrong wrote: >>> On 08/22/2012 08:06 PM, Avi Kivity wrote: >>>> On 08/21/2012 06:03 AM, Xiao Guangrong wrote: >>>>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >>>>> caused by write access on readonly memslot >>>> >>>> Please document this in chapter 5 of apic.txt. >>>> >>> >>> Okay, please review this one. >>> >>> Subject: [PATCH v6 12/12] KVM: indicate readonly access fault >>> >>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >>> caused by write access on readonly memslot >>> >> >> I'm not sure whether this indication can be trusted by userspace. By >> the time userspace gets to process this, the slot may no longer exist, >> or it may be writable. > > The case of deleting memslot is ok, because userspace just skips this fault > if no readonly mem or no fault handler can be found. > > Switching memslot from readonly to writable sounds strange, i agree with you > that this flag is untrusty under this case. It's not so strange, see the PAM registers in the 440fx chipset. The strategy for shadowing the BIOS is to set PAM for (read: pci, write: RAM) (=a readonly slot in kvm), memcpy the BIOS onto itself, then switch the PAM to (read: RAM, write: RAM) (=rw slot in kvm) or (read: RAM, write: PCI) (=different readonly slot in kvm). Of course that usually happens with just one cpu running so there would be no confusion. But in general the indication is racy.
On Fri, Sep 07, 2012 at 05:56:39PM +0800, Xiao Guangrong wrote: > On 09/06/2012 10:09 PM, Avi Kivity wrote: > > On 08/22/2012 03:47 PM, Xiao Guangrong wrote: > >> On 08/22/2012 08:06 PM, Avi Kivity wrote: > >>> On 08/21/2012 06:03 AM, Xiao Guangrong wrote: > >>>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is > >>>> caused by write access on readonly memslot > >>> > >>> Please document this in chapter 5 of apic.txt. > >>> > >> > >> Okay, please review this one. > >> > >> Subject: [PATCH v6 12/12] KVM: indicate readonly access fault > >> > >> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is > >> caused by write access on readonly memslot > >> > > > > I'm not sure whether this indication can be trusted by userspace. By > > the time userspace gets to process this, the slot may no longer exist, > > or it may be writable. > > The case of deleting memslot is ok, because userspace just skips this fault > if no readonly mem or no fault handler can be found. > > Switching memslot from readonly to writable sounds strange, i agree with you > that this flag is untrusty under this case. > > Marcelo, any comments? The same can happen with slot deletion, for example. Userspace (which performed the modification which can result in faults to non-existant/read-only/.../new-tag memslot), must handle the faults properly or avoid the possibility for reference to memslot information from the past. I think its worthwhile to add a note about this in the API documentation: "The user of this interface is responsible for handling references to stale memslot information, either by handling exit notifications which reference stale memslot information or not allowing these notifications to exist by stopping all vcpus in userspace before performing modifications to the memslots map". > > (in the same way an mmio exit might actually hit RAM) > > So, in the userspace, for the safe reason, we should walk all memslots not > just walking mmio handlers? > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/11/2012 01:31 AM, Marcelo Tosatti wrote: > On Fri, Sep 07, 2012 at 05:56:39PM +0800, Xiao Guangrong wrote: >> On 09/06/2012 10:09 PM, Avi Kivity wrote: >> > On 08/22/2012 03:47 PM, Xiao Guangrong wrote: >> >> On 08/22/2012 08:06 PM, Avi Kivity wrote: >> >>> On 08/21/2012 06:03 AM, Xiao Guangrong wrote: >> >>>> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >> >>>> caused by write access on readonly memslot >> >>> >> >>> Please document this in chapter 5 of apic.txt. >> >>> >> >> >> >> Okay, please review this one. >> >> >> >> Subject: [PATCH v6 12/12] KVM: indicate readonly access fault >> >> >> >> Introduce write_readonly_mem in mmio-exit-info to indicate this exit is >> >> caused by write access on readonly memslot >> >> >> > >> > I'm not sure whether this indication can be trusted by userspace. By >> > the time userspace gets to process this, the slot may no longer exist, >> > or it may be writable. >> >> The case of deleting memslot is ok, because userspace just skips this fault >> if no readonly mem or no fault handler can be found. >> >> Switching memslot from readonly to writable sounds strange, i agree with you >> that this flag is untrusty under this case. >> >> Marcelo, any comments? > > The same can happen with slot deletion, for example. > > Userspace (which performed the modification which can result in faults > to non-existant/read-only/.../new-tag memslot), must handle the faults > properly or avoid the possibility for reference to memslot information > from the past. > > I think its worthwhile to add a note about this in the API > documentation: "The user of this interface is responsible for handling > references to stale memslot information, either by handling > exit notifications which reference stale memslot information or not > allowing these notifications to exist by stopping all vcpus in userspace > before performing modifications to the memslots map". Or we can drop the new interface and rely on userspace to perform the lookup under its own locking rules. It's slow, but writes to ROM or ROM/device are rare anyway.
On Tue, Sep 11, 2012 at 12:18:22PM +0300, Avi Kivity wrote: > > The same can happen with slot deletion, for example. > > > > Userspace (which performed the modification which can result in faults > > to non-existant/read-only/.../new-tag memslot), must handle the faults > > properly or avoid the possibility for reference to memslot information > > from the past. > > > > I think its worthwhile to add a note about this in the API > > documentation: "The user of this interface is responsible for handling > > references to stale memslot information, either by handling > > exit notifications which reference stale memslot information or not > > allowing these notifications to exist by stopping all vcpus in userspace > > before performing modifications to the memslots map". > > Or we can drop the new interface and rely on userspace to perform the > lookup under its own locking rules. > > It's slow, but writes to ROM or ROM/device are rare anyway. Lookup what information? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 11, 2012 at 11:39:01AM -0300, Marcelo Tosatti wrote: > On Tue, Sep 11, 2012 at 12:18:22PM +0300, Avi Kivity wrote: > > > The same can happen with slot deletion, for example. > > > > > > Userspace (which performed the modification which can result in faults > > > to non-existant/read-only/.../new-tag memslot), must handle the faults > > > properly or avoid the possibility for reference to memslot information > > > from the past. > > > > > > I think its worthwhile to add a note about this in the API > > > documentation: "The user of this interface is responsible for handling > > > references to stale memslot information, either by handling > > > exit notifications which reference stale memslot information or not > > > allowing these notifications to exist by stopping all vcpus in userspace > > > before performing modifications to the memslots map". > > > > Or we can drop the new interface and rely on userspace to perform the > > lookup under its own locking rules. > > > > It's slow, but writes to ROM or ROM/device are rare anyway. > > Lookup what information? Ping? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/11/2012 05:39 PM, Marcelo Tosatti wrote: > On Tue, Sep 11, 2012 at 12:18:22PM +0300, Avi Kivity wrote: >> > The same can happen with slot deletion, for example. >> > >> > Userspace (which performed the modification which can result in faults >> > to non-existant/read-only/.../new-tag memslot), must handle the faults >> > properly or avoid the possibility for reference to memslot information >> > from the past. >> > >> > I think its worthwhile to add a note about this in the API >> > documentation: "The user of this interface is responsible for handling >> > references to stale memslot information, either by handling >> > exit notifications which reference stale memslot information or not >> > allowing these notifications to exist by stopping all vcpus in userspace >> > before performing modifications to the memslots map". >> >> Or we can drop the new interface and rely on userspace to perform the >> lookup under its own locking rules. >> >> It's slow, but writes to ROM or ROM/device are rare anyway. > > Lookup what information? Where to dispatch the write. In fact userspace has to do that anyway if it's a ROM/device. There's no way userspace can guess that unless we pass in the slot number (which isn't synchronized with anything).
On Wed, Sep 12, 2012 at 06:34:33PM +0300, Avi Kivity wrote: > On 09/11/2012 05:39 PM, Marcelo Tosatti wrote: > > On Tue, Sep 11, 2012 at 12:18:22PM +0300, Avi Kivity wrote: > >> > The same can happen with slot deletion, for example. > >> > > >> > Userspace (which performed the modification which can result in faults > >> > to non-existant/read-only/.../new-tag memslot), must handle the faults > >> > properly or avoid the possibility for reference to memslot information > >> > from the past. > >> > > >> > I think its worthwhile to add a note about this in the API > >> > documentation: "The user of this interface is responsible for handling > >> > references to stale memslot information, either by handling > >> > exit notifications which reference stale memslot information or not > >> > allowing these notifications to exist by stopping all vcpus in userspace > >> > before performing modifications to the memslots map". > >> > >> Or we can drop the new interface and rely on userspace to perform the > >> lookup under its own locking rules. > >> > >> It's slow, but writes to ROM or ROM/device are rare anyway. > > > > Lookup what information? > > Where to dispatch the write. > > In fact userspace has to do that anyway if it's a ROM/device. There's > no way userspace can guess that unless we pass in the slot number (which > isn't synchronized with anything). Alright, do you prefer the details of this exit to be worked out later, when necessary, then? That is, not merge this particular patch of the series? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/12/2012 06:44 PM, Marcelo Tosatti wrote: > On Wed, Sep 12, 2012 at 06:34:33PM +0300, Avi Kivity wrote: >> On 09/11/2012 05:39 PM, Marcelo Tosatti wrote: >> > On Tue, Sep 11, 2012 at 12:18:22PM +0300, Avi Kivity wrote: >> >> > The same can happen with slot deletion, for example. >> >> > >> >> > Userspace (which performed the modification which can result in faults >> >> > to non-existant/read-only/.../new-tag memslot), must handle the faults >> >> > properly or avoid the possibility for reference to memslot information >> >> > from the past. >> >> > >> >> > I think its worthwhile to add a note about this in the API >> >> > documentation: "The user of this interface is responsible for handling >> >> > references to stale memslot information, either by handling >> >> > exit notifications which reference stale memslot information or not >> >> > allowing these notifications to exist by stopping all vcpus in userspace >> >> > before performing modifications to the memslots map". >> >> >> >> Or we can drop the new interface and rely on userspace to perform the >> >> lookup under its own locking rules. >> >> >> >> It's slow, but writes to ROM or ROM/device are rare anyway. >> > >> > Lookup what information? >> >> Where to dispatch the write. >> >> In fact userspace has to do that anyway if it's a ROM/device. There's >> no way userspace can guess that unless we pass in the slot number (which >> isn't synchronized with anything). > > Alright, do you prefer the details of this exit to be worked out later, > when necessary, then? > > That is, not merge this particular patch of the series? > Right. I think it is unneeded.
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index b91bfd4..baf477e 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2089,12 +2089,18 @@ Unused. __u8 data[8]; __u32 len; __u8 is_write; +#ifdef __KVM_HAVE_READONLY_MEM + __u8 write_readonly_mem; +#endif } mmio; If exit_reason is KVM_EXIT_MMIO, then the vcpu has executed a memory-mapped I/O instruction which could not be satisfied by kvm. The 'data' member contains the written data if 'is_write' is -true, and should be filled by application code otherwise. +true, and should be filled by application code otherwise. When the +KVM_CAP_READONLY_MEM capability, the 'write_readonly_mem' member indicates +whether the exit is caused by write access on the readonly memslot +(see KVM_MEM_READONLY in 4.35 KVM_SET_USER_MEMORY_REGION). NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding operations are complete (and guest state is consistent) only after userspace diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 42bbf41..6d7fe4a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3710,9 +3710,10 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa, ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes); if (ret < 0) - return 0; + return ret; + kvm_mmu_pte_write(vcpu, gpa, val, bytes); - return 1; + return 0; } struct read_write_emulator_ops { @@ -3742,7 +3743,7 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes) static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, void *val, int bytes) { - return !kvm_read_guest(vcpu->kvm, gpa, val, bytes); + return kvm_read_guest(vcpu->kvm, gpa, val, bytes); } static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa, @@ -3807,7 +3808,8 @@ static int emulator_read_write_onepage(unsigned long addr, void *val, if (ret) goto mmio; - if (ops->read_write_emulate(vcpu, gpa, val, bytes)) + ret = ops->read_write_emulate(vcpu, gpa, val, bytes); + if (!ret) return X86EMUL_CONTINUE; mmio: @@ -3829,6 +3831,7 @@ mmio: frag->gpa = gpa; frag->data = val; frag->len = now; + frag->write_readonly_mem = (ret == -EPERM); gpa += now; val += now; @@ -3842,8 +3845,8 @@ int emulator_read_write(struct x86_emulate_ctxt *ctxt, unsigned long addr, struct x86_exception *exception, struct read_write_emulator_ops *ops) { + struct kvm_mmio_fragment *frag; struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); - gpa_t gpa; int rc; if (ops->read_write_prepare && @@ -3875,17 +3878,18 @@ int emulator_read_write(struct x86_emulate_ctxt *ctxt, unsigned long addr, if (!vcpu->mmio_nr_fragments) return rc; - gpa = vcpu->mmio_fragments[0].gpa; + frag = &vcpu->mmio_fragments[0]; vcpu->mmio_needed = 1; vcpu->mmio_cur_fragment = 0; - vcpu->run->mmio.len = vcpu->mmio_fragments[0].len; + vcpu->run->mmio.len = frag->len; vcpu->run->mmio.is_write = vcpu->mmio_is_write = ops->write; vcpu->run->exit_reason = KVM_EXIT_MMIO; - vcpu->run->mmio.phys_addr = gpa; + vcpu->run->mmio.phys_addr = frag->gpa; + vcpu->run->mmio.write_readonly_mem = frag->write_readonly_mem; - return ops->read_write_exit_mmio(vcpu, gpa, val, bytes); + return ops->read_write_exit_mmio(vcpu, frag->gpa, val, bytes); } static int emulator_read_emulated(struct x86_emulate_ctxt *ctxt, @@ -5525,6 +5529,7 @@ static int complete_mmio(struct kvm_vcpu *vcpu) ++frag; run->exit_reason = KVM_EXIT_MMIO; run->mmio.phys_addr = frag->gpa; + vcpu->run->mmio.write_readonly_mem = frag->write_readonly_mem; if (vcpu->mmio_is_write) memcpy(run->mmio.data, frag->data, frag->len); run->mmio.len = frag->len; diff --git a/include/linux/kvm.h b/include/linux/kvm.h index d808694..9d8002f 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -226,6 +226,9 @@ struct kvm_run { __u8 data[8]; __u32 len; __u8 is_write; +#ifdef __KVM_HAVE_READONLY_MEM + __u8 write_readonly_mem; +#endif } mmio; /* KVM_EXIT_HYPERCALL */ struct { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5972c98..c37b225 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -189,6 +189,7 @@ struct kvm_mmio_fragment { gpa_t gpa; void *data; unsigned len; + bool write_readonly_mem; }; struct kvm_vcpu { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3416f8a..0c7def7 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1456,6 +1456,9 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data, unsigned long addr; addr = gfn_to_hva(kvm, gfn); + if (addr == KVM_HVA_ERR_RO_BAD) + return -EPERM; + if (kvm_is_error_hva(addr)) return -EFAULT; r = __copy_to_user((void __user *)addr + offset, data, len);