Message ID | 20221205232341.4131240-1-vannapurve@google.com (mailing list archive) |
---|---|
Headers | show |
Series | KVM: selftests: selftests for fd-based private memory | expand |
On Mon, Dec 05, 2022, Vishal Annapurve wrote: > This series implements selftests targeting the feature floated by Chao via: > https://lore.kernel.org/lkml/20221202061347.1070246-10-chao.p.peng@linux.intel.com/T/ > > Below changes aim to test the fd based approach for guest private memory > in context of normal (non-confidential) VMs executing on non-confidential > platforms. > > private_mem_test.c file adds selftest to access private memory from the > guest via private/shared accesses and checking if the contents can be > leaked to/accessed by vmm via shared memory view before/after conversions. > > Updates in V2: > 1) Simplified vcpu run loop implementation API > 2) Removed VM creation logic from private mem library I pushed a rework version of this series to: git@github.com:sean-jc/linux.git x86/upm_base_support Can you take a look and make sure that I didn't completely botch anything, and preserved the spirit of what you are testing? Going forward, no need to send a v3 at this time. Whoever sends v11 of the series will be responsible for including tests. No need to respond to comments either, unless of course there's something you object to, want to clarify, etc., in which case definitely pipe up. Beyond the SEV series, do you have additional UPM testcases written? If so, can you post them, even if they're in a less-than-perfect state? If they're in a "too embarassing to post" state, feel from to send them off list :-) Last question, do you have a list of testcases that you consider "required" for UPM? My off-the-cuff list of selftests I want to have before merging UPM is pretty short at this point: - Negative testing of the memslot changes, e.g. bad alignment, bad fd, illegal memslot updates, etc. - Negative testing of restrictedmem, e.g. various combinations of overlapping bindings of a single restrictedmem instance. - Access vs. conversion stress, e.g. accessing a region in the guest while it's concurrently converted by the host, maybe with fancy guest code to try and detect TLB or ordering bugs?
On Tue, Jan 17, 2023 at 5:09 PM Sean Christopherson <seanjc@google.com> wrote: > > On Mon, Dec 05, 2022, Vishal Annapurve wrote: > > This series implements selftests targeting the feature floated by Chao via: > > https://lore.kernel.org/lkml/20221202061347.1070246-10-chao.p.peng@linux.intel.com/T/ > > > > Below changes aim to test the fd based approach for guest private memory > > in context of normal (non-confidential) VMs executing on non-confidential > > platforms. > > > > private_mem_test.c file adds selftest to access private memory from the > > guest via private/shared accesses and checking if the contents can be > > leaked to/accessed by vmm via shared memory view before/after conversions. > > > > Updates in V2: > > 1) Simplified vcpu run loop implementation API > > 2) Removed VM creation logic from private mem library > > I pushed a rework version of this series to: > > git@github.com:sean-jc/linux.git x86/upm_base_support > Thanks for the review and spending time to rework this series. The revised version [1] looks cleaner and lighter. > Can you take a look and make sure that I didn't completely botch anything, and > preserved the spirit of what you are testing? > Yeah, the reworked selftest structure [1] looks good to me in general. Few test cases that are missing in the reworked version: * Checking if contents of GPA ranges corresponding to private memory are not leaked to host userspace when accessing guest memory using HVA ranges * Checking if private to shared conversion of memory affects nearby private pages. > Going forward, no need to send a v3 at this time. Whoever sends v11 of the series > will be responsible for including tests. > Sounds good to me. > No need to respond to comments either, unless of course there's something you > object to, want to clarify, etc., in which case definitely pipe up. > > Beyond the SEV series, do you have additional UPM testcases written? If so, can > you post them, even if they're in a less-than-perfect state? If they're in a > "too embarassing to post" state, feel from to send them off list :-) > Ackerley (ackerleytng@google.com) is working on publishing the rfc v3 version of TDX selftests that include UPM specific selftests. He plans to publish them this week. > Last question, do you have a list of testcases that you consider "required" for > UPM? My off-the-cuff list of selftests I want to have before merging UPM is pretty > short at this point: > > - Negative testing of the memslot changes, e.g. bad alignment, bad fd, > illegal memslot updates, etc. > - Negative testing of restrictedmem, e.g. various combinations of overlapping > bindings of a single restrictedmem instance. > - Access vs. conversion stress, e.g. accessing a region in the guest while it's > concurrently converted by the host, maybe with fancy guest code to try and > detect TLB or ordering bugs? List of testcases that I was tracking (covered by the current selftests) as required: 1) Ensure private memory contents are not accessible to host userspace using the HVA 2) Ensure shared memory contents are visible/accessible from both host userspace and the guest 3) Ensure 1 and 2 holds across explicit memory conversions 4) Exercise memory conversions with mixed shared/private memory pages in a huge page to catch issues like [2] 5) Ensure that explicit memory conversions don't affect nearby GPA ranges Test Cases that will be covered by TDX/SNP selftests (in addition to above scenarios): 6) Ensure 1 and 2 holds across implicit memory conversions 7) Ensure that implicit memory conversions don't affect nearby GPA ranges Additional testcases possible: 8) Running conversion tests for non-overlapping GPA ranges of same/different memslots from multiple vcpus [1] - https://github.com/sean-jc/linux/commit/7e536bf3c45c623425bc84e8a96634efc3a619ed [2] - https://lore.kernel.org/linux-mm/CAGtprH82H_fjtRbL0KUxOkgOk4pgbaEbAydDYfZ0qxz41JCnAQ@mail.gmail.com/
On Wed, Jan 18, 2023 at 01:09:49AM +0000, Sean Christopherson wrote: > On Mon, Dec 05, 2022, Vishal Annapurve wrote: > > This series implements selftests targeting the feature floated by Chao via: > > https://lore.kernel.org/lkml/20221202061347.1070246-10-chao.p.peng@linux.intel.com/T/ > > > > Below changes aim to test the fd based approach for guest private memory > > in context of normal (non-confidential) VMs executing on non-confidential > > platforms. > > > > private_mem_test.c file adds selftest to access private memory from the > > guest via private/shared accesses and checking if the contents can be > > leaked to/accessed by vmm via shared memory view before/after conversions. > > > > Updates in V2: > > 1) Simplified vcpu run loop implementation API > > 2) Removed VM creation logic from private mem library > > I pushed a rework version of this series to: > > git@github.com:sean-jc/linux.git x86/upm_base_support It still has build issue with lockdep enabled that I mentioned before: https://lore.kernel.org/all/20230116134845.vboraky2nd56szos@box.shutemov.name/ And when compiled with lockdep, it triggers a lot of warnings about missed locks, like this: [ 59.632024] kvm: FIXME: Walk the memory attributes of the slot and set the mixed status appropriately [ 59.684888] ------------[ cut here ]------------ [ 59.690677] WARNING: CPU: 2 PID: 138 at include/linux/kvm_host.h:2307 kvm_mmu_do_page_fault+0x19a/0x1b0 [ 59.693531] CPU: 2 PID: 138 Comm: private_mem_con Not tainted 6.1.0-rc4-00624-g7e536bf3c45c-dirty #1 [ 59.696265] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 59.699586] RIP: 0010:kvm_mmu_do_page_fault+0x19a/0x1b0 [ 59.700720] Code: d8 1c 00 00 eb e3 65 48 8b 0c 25 28 00 00 00 48 3b 4c 24 50 75 1b 48 83 c4 58 5b 41 5e 41 5f 5d c3 48 81 c0 [ 59.704711] RSP: 0018:ffffc90000323c80 EFLAGS: 00010246 [ 59.705830] RAX: 0000000000000000 RBX: ffff888103bc8000 RCX: ffffffff8107dff0 [ 59.707353] RDX: 0000000000000001 RSI: ffffffff82549d49 RDI: ffffffff825abe77 [ 59.708865] RBP: ffffc90000e59000 R08: 0000000000000000 R09: 0000000000000000 [ 59.710369] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 59.711859] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000180 [ 59.713338] FS: 00007f2e556de740(0000) GS:ffff8881f9d00000(0000) knlGS:0000000000000000 [ 59.714978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.716168] CR2: 0000000000000000 CR3: 0000000100e90005 CR4: 0000000000372ee0 [ 59.717631] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 59.719086] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 59.721148] Call Trace: [ 59.722661] <TASK> [ 59.723986] ? lock_is_held_type+0xdb/0x150 [ 59.726501] kvm_mmu_page_fault+0x41/0x170 [ 59.728946] vmx_handle_exit+0x343/0x750 [ 59.731007] kvm_arch_vcpu_ioctl_run+0x1d12/0x2790 [ 59.733319] kvm_vcpu_ioctl+0x4a6/0x590 [ 59.735195] __se_sys_ioctl+0x6a/0xb0 [ 59.736976] do_syscall_64+0x3d/0x80 [ 59.738698] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 59.740743] RIP: 0033:0x7f2e557d8f6b [ 59.741907] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 c7 04 24 10 00 00 00 b0 [ 59.747836] RSP: 002b:00007ffe8b84eb50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 59.750147] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2e557d8f6b [ 59.751754] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007 [ 59.753361] RBP: 000000000042f880 R08: 0000000000000007 R09: 0000000000430210 [ 59.754952] R10: ca7f9f3d969d5d5c R11: 0000000000000246 R12: 000000000042d2a0 [ 59.756596] R13: 0000000100000000 R14: 0000000000422e00 R15: 00007f2e558f7000 [ 59.758231] </TASK> [ 59.758752] irq event stamp: 8637 [ 59.759540] hardirqs last enabled at (8647): [<ffffffff8119ae18>] __up_console_sem+0x68/0x90 [ 59.761309] hardirqs last disabled at (8654): [<ffffffff8119adfd>] __up_console_sem+0x4d/0x90 [ 59.763022] softirqs last enabled at (8550): [<ffffffff81123c7a>] __irq_exit_rcu+0xaa/0x130 [ 59.764731] softirqs last disabled at (8539): [<ffffffff81123c7a>] __irq_exit_rcu+0xaa/0x130 [ 59.766409] ---[ end trace 0000000000000000 ]---
On Wed, Jan 18, 2023, Kirill A. Shutemov wrote: > On Wed, Jan 18, 2023 at 01:09:49AM +0000, Sean Christopherson wrote: > > On Mon, Dec 05, 2022, Vishal Annapurve wrote: > > > This series implements selftests targeting the feature floated by Chao via: > > > https://lore.kernel.org/lkml/20221202061347.1070246-10-chao.p.peng@linux.intel.com/T/ > > > > > > Below changes aim to test the fd based approach for guest private memory > > > in context of normal (non-confidential) VMs executing on non-confidential > > > platforms. > > > > > > private_mem_test.c file adds selftest to access private memory from the > > > guest via private/shared accesses and checking if the contents can be > > > leaked to/accessed by vmm via shared memory view before/after conversions. > > > > > > Updates in V2: > > > 1) Simplified vcpu run loop implementation API > > > 2) Removed VM creation logic from private mem library > > > > I pushed a rework version of this series to: > > > > git@github.com:sean-jc/linux.git x86/upm_base_support > > It still has build issue with lockdep enabled that I mentioned before: Yeah, I haven't updated the branch since last Friday, i.e. I haven't fixed the known bugs yet. I pushed the selftests changes at the same as everything else, just didn't get to typing up the emails until yesterday. > https://lore.kernel.org/all/20230116134845.vboraky2nd56szos@box.shutemov.name/ > > And when compiled with lockdep, it triggers a lot of warnings about missed > locks, like this: Ah crud, I knew I was forgetting something. The lockdep assertion can simply be removed, mmu_lock doesn't need to be held to read attributes. I knew the assertion was wrong when I added it, but I wanted to remind myself to take a closer look at the usage of kvm_mem_is_private() and forgot to go back and delete the assertion. The use of kvm_mem_is_private() in kvm_mmu_do_page_fault() is technically unsafe. Similar to getting the pfn, if mmu_lock isn't held, consuming the attributes (private vs. shared) needs MMU notifier protection, i.e. the attributes are safe to read only after mmu_invalidate_seq is snapshot. However, is_private gets rechecked by __kvm_faultin_pfn(), which is protected by the sequence counter, and so the technically unsafe read is verified in the end. The obvious alternative is to make is_private an output of kvm_faultin_pfn(), but that's incorrect for SNP and TDX guests, in which case "is_private" is a property of the fault itself. TL;DR: I'm going to delete the assertion and add a comment in kvm_mmu_do_page_fault() explaining why it's safe to consume kvm_mem_is_private() for "legacy" guests. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 35a339891aed..da0afe81cf10 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2310,8 +2310,6 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr) #ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES static inline unsigned long kvm_get_memory_attributes(struct kvm *kvm, gfn_t gfn) { - lockdep_assert_held(kvm->mmu_lock); - return xa_to_value(xa_load(&kvm->mem_attr_array, gfn)); }
On Tue, Jan 17, 2023 at 7:11 PM Vishal Annapurve <vannapurve@google.com> wrote: > > ... > > Last question, do you have a list of testcases that you consider "required" for > > UPM? My off-the-cuff list of selftests I want to have before merging UPM is pretty > > short at this point: > > > > - Negative testing of the memslot changes, e.g. bad alignment, bad fd, > > illegal memslot updates, etc. > > - Negative testing of restrictedmem, e.g. various combinations of overlapping > > bindings of a single restrictedmem instance. > > - Access vs. conversion stress, e.g. accessing a region in the guest while it's > > concurrently converted by the host, maybe with fancy guest code to try and > > detect TLB or ordering bugs? > > List of testcases that I was tracking (covered by the current > selftests) as required: > 1) Ensure private memory contents are not accessible to host userspace > using the HVA > 2) Ensure shared memory contents are visible/accessible from both host > userspace and the guest > 3) Ensure 1 and 2 holds across explicit memory conversions > 4) Exercise memory conversions with mixed shared/private memory pages > in a huge page to catch issues like [2] > 5) Ensure that explicit memory conversions don't affect nearby GPA ranges > > Test Cases that will be covered by TDX/SNP selftests (in addition to > above scenarios): > 6) Ensure 1 and 2 holds across implicit memory conversions > 7) Ensure that implicit memory conversions don't affect nearby GPA ranges > > Additional testcases possible: > 8) Running conversion tests for non-overlapping GPA ranges of > same/different memslots from multiple vcpus > > [1] - https://github.com/sean-jc/linux/commit/7e536bf3c45c623425bc84e8a96634efc3a619ed > [2] - https://lore.kernel.org/linux-mm/CAGtprH82H_fjtRbL0KUxOkgOk4pgbaEbAydDYfZ0qxz41JCnAQ@mail.gmail.com/ List of additional testcases that could help increase basic coverage (including what sean mentioned earlier): 1) restrictedmem functionality testing - read/write/mmap should not work - fstat/fallocate should work as expected 2) restrictedmem registration/modification testing with: - bad alignment, bad fd, modifying properties of existing memslot - Installing multiple memslots with ranges within the same restricted mem files - deleting memslots with restricted memfd while guests are being executed 3) Runtime restricted mem testing: - Access vs conversion testing from multiple vcpus - conversion and access to non-overlapping ranges from multiple vcpus Regards, Vishal
On Fri, Feb 10, 2023 at 11:59:23AM -0800, Vishal Annapurve wrote: > On Tue, Jan 17, 2023 at 7:11 PM Vishal Annapurve <vannapurve@google.com> wrote: > > > > ... > > > > Last question, do you have a list of testcases that you consider "required" for > > > UPM? My off-the-cuff list of selftests I want to have before merging UPM is pretty > > > short at this point: > > > > > > - Negative testing of the memslot changes, e.g. bad alignment, bad fd, > > > illegal memslot updates, etc. > > > - Negative testing of restrictedmem, e.g. various combinations of overlapping > > > bindings of a single restrictedmem instance. > > > - Access vs. conversion stress, e.g. accessing a region in the guest while it's > > > concurrently converted by the host, maybe with fancy guest code to try and > > > detect TLB or ordering bugs? > > > > List of testcases that I was tracking (covered by the current > > selftests) as required: > > 1) Ensure private memory contents are not accessible to host userspace > > using the HVA > > 2) Ensure shared memory contents are visible/accessible from both host > > userspace and the guest > > 3) Ensure 1 and 2 holds across explicit memory conversions > > 4) Exercise memory conversions with mixed shared/private memory pages > > in a huge page to catch issues like [2] > > 5) Ensure that explicit memory conversions don't affect nearby GPA ranges > > > > Test Cases that will be covered by TDX/SNP selftests (in addition to > > above scenarios): > > 6) Ensure 1 and 2 holds across implicit memory conversions > > 7) Ensure that implicit memory conversions don't affect nearby GPA ranges > > > > Additional testcases possible: > > 8) Running conversion tests for non-overlapping GPA ranges of > > same/different memslots from multiple vcpus > > > > [1] - https://github.com/sean-jc/linux/commit/7e536bf3c45c623425bc84e8a96634efc3a619ed > > [2] - https://lore.kernel.org/linux-mm/CAGtprH82H_fjtRbL0KUxOkgOk4pgbaEbAydDYfZ0qxz41JCnAQ@mail.gmail.com/ > > List of additional testcases that could help increase basic coverage > (including what sean mentioned earlier): > 1) restrictedmem functionality testing > - read/write/mmap should not work > - fstat/fallocate should work as expected > 2) restrictedmem registration/modification testing with: > - bad alignment, bad fd, modifying properties of existing memslot > - Installing multiple memslots with ranges within the same > restricted mem files > - deleting memslots with restricted memfd while guests are being executed In case you havn't started, I will work on 1) and 2) for the following days. As a start, I will first add restrictedmem tests (without KVM) then move to new memslots related tests. Chao > 3) Runtime restricted mem testing: > - Access vs conversion testing from multiple vcpus > - conversion and access to non-overlapping ranges from multiple vcpus > > Regards, > Vishal
On Mon, Mar 06, 2023 at 06:21:24PM +0000, Ackerley Tng wrote: > Chao Peng <chao.p.peng@linux.intel.com> writes: > > > On Fri, Feb 10, 2023 at 11:59:23AM -0800, Vishal Annapurve wrote: > > > On Tue, Jan 17, 2023 at 7:11 PM Vishal Annapurve > > > <vannapurve@google.com> wrote: > > > > > > > > ... > > > > > > Last question, do you have a list of testcases that you consider > > > "required" for > > > > > UPM? My off-the-cuff list of selftests I want to have before > > > merging UPM is pretty > > > > > short at this point: > > > > > > > > > > - Negative testing of the memslot changes, e.g. bad alignment, > > > bad fd, > > > > > illegal memslot updates, etc. > > > > > - Negative testing of restrictedmem, e.g. various combinations > > > of overlapping > > > > > bindings of a single restrictedmem instance. > > > > > - Access vs. conversion stress, e.g. accessing a region in the > > > guest while it's > > > > > concurrently converted by the host, maybe with fancy guest > > > code to try and > > > > > detect TLB or ordering bugs? > > > > > > > > List of testcases that I was tracking (covered by the current > > > > selftests) as required: > > > > 1) Ensure private memory contents are not accessible to host userspace > > > > using the HVA > > > > 2) Ensure shared memory contents are visible/accessible from both host > > > > userspace and the guest > > > > 3) Ensure 1 and 2 holds across explicit memory conversions > > > > 4) Exercise memory conversions with mixed shared/private memory pages > > > > in a huge page to catch issues like [2] > > > > 5) Ensure that explicit memory conversions don't affect nearby GPA > > > ranges > > > > > > > > Test Cases that will be covered by TDX/SNP selftests (in addition to > > > > above scenarios): > > > > 6) Ensure 1 and 2 holds across implicit memory conversions > > > > 7) Ensure that implicit memory conversions don't affect nearby GPA > > > ranges > > > > > > > > Additional testcases possible: > > > > 8) Running conversion tests for non-overlapping GPA ranges of > > > > same/different memslots from multiple vcpus > > > > > > > > [1] - https://github.com/sean-jc/linux/commit/7e536bf3c45c623425bc84e8a96634efc3a619ed > > > > [2] - https://lore.kernel.org/linux-mm/CAGtprH82H_fjtRbL0KUxOkgOk4pgbaEbAydDYfZ0qxz41JCnAQ@mail.gmail.com/ > > > > List of additional testcases that could help increase basic coverage > > > (including what sean mentioned earlier): > > > 1) restrictedmem functionality testing > > > - read/write/mmap should not work > > > - fstat/fallocate should work as expected > > > 2) restrictedmem registration/modification testing with: > > > - bad alignment, bad fd, modifying properties of existing memslot > > > - Installing multiple memslots with ranges within the same > > > restricted mem files > > > - deleting memslots with restricted memfd while guests are > > > being executed > > > In case you havn't started, I will work on 1) and 2) for the following > > days. As a start, I will first add restrictedmem tests (without KVM) then > > move to new memslots related tests. > > > Chao > > > > 3) Runtime restricted mem testing: > > > - Access vs conversion testing from multiple vcpus > > > - conversion and access to non-overlapping ranges from multiple vcpus > > > > Regards, > > > Vishal > > Chao, I'll work on > > + Running conversion tests for non-overlapping GPA ranges of > same/different memslots from multiple vcpus > + Deleting memslots with restricted memfd while guests are being > executed > + Installing multiple memslots with ranges within the same restricted > mem files > > this week. Thanks Ackerley. Looks good to me. BTW, for whom may have interest, below are the testcases I added: https://github.com/chao-p/linux/commit/24dd1257d5c93acb8c8cc6c76c51cf6869970f8a https://github.com/chao-p/linux/commit/39a872ef09d539ce0c953451152eb05276b87018 https://github.com/chao-p/linux/commit/ddd2c92b268a2fdc6158f82a6169ad1a57f2a01d Chao
On Wed, Mar 08, 2023, Ackerley Tng wrote: > While I was working on the selftests I noticed that this could perhaps > be improved: > > https://github.com/chao-p/linux/blob/ddd2c92b268a2fdc6158f82a6169ad1a57f2a01d/virt/kvm/kvm_main.c#L1035 > > We should use a temporary variable to hold the result of fget(fd). > > As it is now, if the user provides any invalide fd, like -1, > slot->restrictedmem.file would be overwritten and lost. Meh, that can happen if and only if KVM has a bug elsehwere. If slot->restrictedmem.file is anything but NULL, KVM is hosed. E.g. waiting to set slot->restrictedmem.file until the very end wouldn't magically prevent a file descriptor leak if slot->restrictedmem.file is non-NULL. > We cannot update slot->restrictedmem.file until after the > file_is_restrictedmem() check. > > For now there isn't a big problem because kvm_restrictedmem_bind() is > only called on a new struct kvm_memory_slot, but I think this should be > changed in case the function is used elsewhere in future. Nah, if anything we could add if (WARN_ON_ONCE(slot->restrictedmem.file)) return -EIO; but I don't see the point. There's exactly one caller and the entire scheme depends on binding the memslot to restricted memory when the memslot is created, i.e. this would be but one of many changes if KVM were to allowed re-binding a memslot.