Message ID | 1547026933-31226-1-git-send-email-karahmed@amazon.de (mailing list archive) |
---|---|
Headers | show |
Series | KVM/X86: Introduce a new guest mapping interface | expand |
On Wed, Jan 09, 2019 at 10:42:00AM +0100, KarimAllah Ahmed wrote: > Guest memory can either be directly managed by the kernel (i.e. have a "struct > page") or they can simply live outside kernel control (i.e. do not have a > "struct page"). KVM mostly support these two modes, except in a few places > where the code seems to assume that guest memory must have a "struct page". > > This patchset introduces a new mapping interface to map guest memory into host > kernel memory which also supports PFN-based memory (i.e. memory without 'struct > page'). It also converts all offending code to this interface or simply > read/write directly from guest memory. Patch 2 is additionally fixing an > incorrect page release and marking the page as dirty (i.e. as a side-effect of > using the helper function to write). > > As far as I can see all offending code is now fixed except the APIC-access page > which I will handle in a seperate series along with dropping > kvm_vcpu_gfn_to_page and kvm_vcpu_gpa_to_page from the internal KVM API. > > The current implementation of the new API uses memremap to map memory that does > not have a "struct page". This proves to be very slow for high frequency > mappings. Since this does not affect the normal use-case where a "struct page" > is available, the performance of this API will be handled by a seperate patch > series. Where could one find this patchset? Also is there an simple test-case (or a writeup) you have for testing this code? Specifically I am thinking about the use-case of "memory without the 'struct page'" And thank you for posting this patchset. It was a pleasure reviewing the code!
On Wed, 2019-01-23 at 13:16 -0500, Konrad Rzeszutek Wilk wrote: > On Wed, Jan 09, 2019 at 10:42:00AM +0100, KarimAllah Ahmed wrote: > > > > Guest memory can either be directly managed by the kernel (i.e. have a "struct > > page") or they can simply live outside kernel control (i.e. do not have a > > "struct page"). KVM mostly support these two modes, except in a few places > > where the code seems to assume that guest memory must have a "struct page". > > > > This patchset introduces a new mapping interface to map guest memory into host > > kernel memory which also supports PFN-based memory (i.e. memory without 'struct > > page'). It also converts all offending code to this interface or simply > > read/write directly from guest memory. Patch 2 is additionally fixing an > > incorrect page release and marking the page as dirty (i.e. as a side-effect of > > using the helper function to write). > > > > As far as I can see all offending code is now fixed except the APIC-access page > > which I will handle in a seperate series along with dropping > > kvm_vcpu_gfn_to_page and kvm_vcpu_gpa_to_page from the internal KVM API. > > > > The current implementation of the new API uses memremap to map memory that does > > not have a "struct page". This proves to be very slow for high frequency > > mappings. Since this does not affect the normal use-case where a "struct page" > > is available, the performance of this API will be handled by a seperate patch > > series. > > Where could one find this patchset? Let me clean it and send it out as well :) > > Also is there an simple test-case (or a writeup) you have for testing > this code? Specifically I am thinking about the use-case of "memory > without the 'struct page'" So the simple way to do it is: 1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed by the kernel. 2- Map this physical memory you want to give to the guest with mmap("/dev/mem", physical_address_offset, ..) 3- Use the user-space virtual address as the "userspace_addr" field in KVM_SET_USER_MEMORY_REGION ioctl. You will also need this patch (hopefully I will repost next week as well): https://patchwork.kernel.org/patch/9191755/ I will make sure to expand on this in the cover letter in v6. > > And thank you for posting this patchset. It was a pleasure reviewing the > code! Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
On 25/01/19 19:28, Raslan, KarimAllah wrote: > So the simple way to do it is: > > 1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed > by the kernel. > 2- Map this physical memory you want to give to the guest with > mmap("/dev/mem", physical_address_offset, ..) > 3- Use the user-space virtual address as the "userspace_addr" field > in KVM_SET_USER_MEMORY_REGION ioctl. > > You will also need this patch (hopefully I will repost next week as well): > https://patchwork.kernel.org/patch/9191755/ I took a look again at that patch and I guess I've changed my mind now that the kernel provides e820__mapped_any and e820__mapped_all. However, please do use e820__mapped_any instead of adding a new function e820_is_ram. Thanks, Paolo > I will make sure to expand on this in the cover letter in v6.
On Wed, 2019-01-30 at 18:14 +0100, Paolo Bonzini wrote: > On 25/01/19 19:28, Raslan, KarimAllah wrote: > > > > So the simple way to do it is: > > > > 1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed > > by the kernel. > > 2- Map this physical memory you want to give to the guest with > > mmap("/dev/mem", physical_address_offset, ..) > > 3- Use the user-space virtual address as the "userspace_addr" field > > in KVM_SET_USER_MEMORY_REGION ioctl. > > > > You will also need this patch (hopefully I will repost next week as well): > > https://patchwork.kernel.org/patch/9191755/ > > I took a look again at that patch and I guess I've changed my mind now > that the kernel provides e820__mapped_any and e820__mapped_all. > However, please do use e820__mapped_any instead of adding a new function > e820_is_ram. The problem with e820__mapped_* is that they are iterating over 'e820_table' which is already truncated from the 'mem=' and 'memmap=' parameters: """ * - 'e820_table': this is the main E820 table that is massaged by the * low level x86 platform code, or modified by boot parameters, before * passed on to higher level MM layers. """ .. so I really still can not use it for this purpose. The structure that I want to look at is actually 'e820_table_firmware' which is: """ * - 'e820_table_firmware': the original firmware version passed to us by the * bootloader - not modified by the kernel. It is composed of two parts: * the first 128 E820 memory entries in boot_params.e820_table and the remaining * (if any) entries of the SETUP_E820_EXT nodes. We use this to: * * - inform the user about the firmware's notion of memory layout * via /sys/firmware/memmap * * - the hibernation code uses it to generate a kernel-independent MD5 * fingerprint of the physical memory layout of a system. """ The users of e820__mapped_any expect these semantics, so even changing the implementation of these functions to use 'e820_table_firmware' to handle this will not be an option! One option here would be to add 'e820__mapped_raw_any' (or whatever other name) and make it identical to the current implementation of e820__mapped_any at. Would that be slightly more acceptable? :) > > Thanks, > > Paolo > > > > > I will make sure to expand on this in the cover letter in v6. > Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
On 31/01/19 14:03, Raslan, KarimAllah wrote: > > One option here would be to add 'e820__mapped_raw_any' (or whatever > other name) and make it identical to the current implementation of > e820__mapped_any at. Would that be slightly more acceptable? :) Yes, of course it would (for me at least :)). Paolo