diff mbox series

[v5,3/3] xen: add helpers to allocate unpopulated memory

Message ID 20200901083326.21264-4-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Roger Pau Monné Sept. 1, 2020, 8:33 a.m. UTC
To be used in order to create foreign mappings. This is based on the
ZONE_DEVICE facility which is used by persistent memory devices in
order to create struct pages and kernel virtual mappings for the IOMEM
areas of such devices. Note that on kernels without support for
ZONE_DEVICE Xen will fallback to use ballooned pages in order to
create foreign mappings.

The newly added helpers use the same parameters as the existing
{alloc/free}_xenballooned_pages functions, which allows for in-place
replacement of the callers. Once a memory region has been added to be
used as scratch mapping space it will no longer be released, and pages
returned are kept in a linked list. This allows to have a buffer of
pages and prevents resorting to frequent additions and removals of
regions.

If enabled (because ZONE_DEVICE is supported) the usage of the new
functionality untangles Xen balloon and RAM hotplug from the usage of
unpopulated physical memory ranges to map foreign pages, which is the
correct thing to do in order to avoid mappings of foreign pages depend
on memory hotplug.

Note the driver is currently not enabled on Arm platforms because it
would interfere with the identity mapping required on some platforms.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Yan Yankovskyi <yyankovskyi@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: xen-devel@lists.xenproject.org
Cc: linux-mm@kvack.org
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
 - Introduce a description for the option.
 - Force selection of ZONE_DEVICE on X86 and select
   XEN_UNPOPULATED_ALLOC if running on dom0 mode or having any
   backends.

Changes since v3:
 - Introduce a Kconfig option that gates the addition of the
   unpopulated alloc driver. This allows to easily disable it on Arm
   platforms.
 - Dropped Juergen RB due to the addition of the Kconfig option.
 - Switched from MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC.

Changes since v2:
 - Drop BUILD_BUG_ON regarding PVMMU page sizes.
 - Use a SPDX license identifier.
 - Call fill with only the minimum required number of pages.
 - Include xen.h header in xen_drm_front_gem.c.
 - Use less generic function names.
 - Exit early from the init function if not a PV guest.
 - Don't use all caps for region name.
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c |   9 +-
 drivers/xen/Kconfig                     |  11 ++
 drivers/xen/Makefile                    |   1 +
 drivers/xen/balloon.c                   |   4 +-
 drivers/xen/grant-table.c               |   4 +-
 drivers/xen/privcmd.c                   |   4 +-
 drivers/xen/unpopulated-alloc.c         | 185 ++++++++++++++++++++++++
 drivers/xen/xenbus/xenbus_client.c      |   6 +-
 drivers/xen/xlate_mmu.c                 |   4 +-
 include/xen/xen.h                       |   9 ++
 10 files changed, 222 insertions(+), 15 deletions(-)
 create mode 100644 drivers/xen/unpopulated-alloc.c

Comments

Roger Pau Monné Sept. 1, 2020, 2:45 p.m. UTC | #1
On Tue, Sep 01, 2020 at 10:33:26AM +0200, Roger Pau Monne wrote:
> +static int fill_list(unsigned int nr_pages)
> +{
> +	struct dev_pagemap *pgmap;
> +	void *vaddr;
> +	unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
> +	int nid, ret;
> +
> +	pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
> +	if (!pgmap)
> +		return -ENOMEM;
> +
> +	pgmap->type = MEMORY_DEVICE_GENERIC;
> +	pgmap->res.name = "Xen scratch";
> +	pgmap->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +
> +	ret = allocate_resource(&iomem_resource, &pgmap->res,
> +				alloc_pages * PAGE_SIZE, 0, -1,
> +				PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
> +	if (ret < 0) {
> +		pr_err("Cannot allocate new IOMEM resource\n");
> +		kfree(pgmap);
> +		return ret;
> +	}
> +
> +	nid = memory_add_physaddr_to_nid(pgmap->res.start);

I think this is not needed ...

> +
> +#ifdef CONFIG_XEN_HAVE_PVMMU
> +        /*
> +         * memremap will build page tables for the new memory so
> +         * the p2m must contain invalid entries so the correct
> +         * non-present PTEs will be written.
> +         *
> +         * If a failure occurs, the original (identity) p2m entries
> +         * are not restored since this region is now known not to
> +         * conflict with any devices.
> +         */
> +	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> +		xen_pfn_t pfn = PFN_DOWN(pgmap->res.start);
> +
> +		for (i = 0; i < alloc_pages; i++) {
> +			if (!set_phys_to_machine(pfn + i, INVALID_P2M_ENTRY)) {
> +				pr_warn("set_phys_to_machine() failed, no memory added\n");
> +				release_resource(&pgmap->res);
> +				kfree(pgmap);
> +				return -ENOMEM;
> +			}
> +                }
> +	}
> +#endif
> +
> +	vaddr = memremap_pages(pgmap, nid);

... and NUMA_NO_NODE should be used here instead, as this memory is just
fictitious space to map foreign memory, and shouldn't be related to
any NUMA node.

The following chunk should be folded in, or I can resend.

Thanks, Roger.
---8<---
diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
index 1b5d157c6977..3b98dc921426 100644
--- a/drivers/xen/unpopulated-alloc.c
+++ b/drivers/xen/unpopulated-alloc.c
@@ -20,7 +20,7 @@ static int fill_list(unsigned int nr_pages)
 	struct dev_pagemap *pgmap;
 	void *vaddr;
 	unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
-	int nid, ret;
+	int ret;
 
 	pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
 	if (!pgmap)
@@ -39,8 +39,6 @@ static int fill_list(unsigned int nr_pages)
 		return ret;
 	}
 
-	nid = memory_add_physaddr_to_nid(pgmap->res.start);
-
 #ifdef CONFIG_XEN_HAVE_PVMMU
         /*
          * memremap will build page tables for the new memory so
@@ -65,7 +63,7 @@ static int fill_list(unsigned int nr_pages)
 	}
 #endif
 
-	vaddr = memremap_pages(pgmap, nid);
+	vaddr = memremap_pages(pgmap, NUMA_NO_NODE);
 	if (IS_ERR(vaddr)) {
 		pr_err("Cannot remap memory range\n");
 		release_resource(&pgmap->res);
Jürgen Groß Sept. 1, 2020, 2:54 p.m. UTC | #2
On 01.09.20 16:45, Roger Pau Monné wrote:
> On Tue, Sep 01, 2020 at 10:33:26AM +0200, Roger Pau Monne wrote:
>> +static int fill_list(unsigned int nr_pages)
>> +{
>> +	struct dev_pagemap *pgmap;
>> +	void *vaddr;
>> +	unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
>> +	int nid, ret;
>> +
>> +	pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
>> +	if (!pgmap)
>> +		return -ENOMEM;
>> +
>> +	pgmap->type = MEMORY_DEVICE_GENERIC;
>> +	pgmap->res.name = "Xen scratch";
>> +	pgmap->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>> +
>> +	ret = allocate_resource(&iomem_resource, &pgmap->res,
>> +				alloc_pages * PAGE_SIZE, 0, -1,
>> +				PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
>> +	if (ret < 0) {
>> +		pr_err("Cannot allocate new IOMEM resource\n");
>> +		kfree(pgmap);
>> +		return ret;
>> +	}
>> +
>> +	nid = memory_add_physaddr_to_nid(pgmap->res.start);
> 
> I think this is not needed ...
> 
>> +
>> +#ifdef CONFIG_XEN_HAVE_PVMMU
>> +        /*
>> +         * memremap will build page tables for the new memory so
>> +         * the p2m must contain invalid entries so the correct
>> +         * non-present PTEs will be written.
>> +         *
>> +         * If a failure occurs, the original (identity) p2m entries
>> +         * are not restored since this region is now known not to
>> +         * conflict with any devices.
>> +         */
>> +	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
>> +		xen_pfn_t pfn = PFN_DOWN(pgmap->res.start);
>> +
>> +		for (i = 0; i < alloc_pages; i++) {
>> +			if (!set_phys_to_machine(pfn + i, INVALID_P2M_ENTRY)) {
>> +				pr_warn("set_phys_to_machine() failed, no memory added\n");
>> +				release_resource(&pgmap->res);
>> +				kfree(pgmap);
>> +				return -ENOMEM;
>> +			}
>> +                }
>> +	}
>> +#endif
>> +
>> +	vaddr = memremap_pages(pgmap, nid);
> 
> ... and NUMA_NO_NODE should be used here instead, as this memory is just
> fictitious space to map foreign memory, and shouldn't be related to
> any NUMA node.
> 
> The following chunk should be folded in, or I can resend.

I can fold it in.


Juergen
Jürgen Groß Sept. 3, 2020, 3:30 p.m. UTC | #3
On 01.09.20 10:33, Roger Pau Monne wrote:
> To be used in order to create foreign mappings. This is based on the
> ZONE_DEVICE facility which is used by persistent memory devices in
> order to create struct pages and kernel virtual mappings for the IOMEM
> areas of such devices. Note that on kernels without support for
> ZONE_DEVICE Xen will fallback to use ballooned pages in order to
> create foreign mappings.
> 
> The newly added helpers use the same parameters as the existing
> {alloc/free}_xenballooned_pages functions, which allows for in-place
> replacement of the callers. Once a memory region has been added to be
> used as scratch mapping space it will no longer be released, and pages
> returned are kept in a linked list. This allows to have a buffer of
> pages and prevents resorting to frequent additions and removals of
> regions.
> 
> If enabled (because ZONE_DEVICE is supported) the usage of the new
> functionality untangles Xen balloon and RAM hotplug from the usage of
> unpopulated physical memory ranges to map foreign pages, which is the
> correct thing to do in order to avoid mappings of foreign pages depend
> on memory hotplug.
> 
> Note the driver is currently not enabled on Arm platforms because it
> would interfere with the identity mapping required on some platforms.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Sorry, I just got a build error for x86 32-bit build:

WARNING: unmet direct dependencies detected for ZONE_DEVICE
   Depends on [n]: MEMORY_HOTPLUG [=n] && MEMORY_HOTREMOVE [=n] && 
SPARSEMEM_VMEMMAP [=n] && ARCH_HAS_PTE_DEVMAP [=n]
   Selected by [y]:
   - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86 [=y]
   GEN     Makefile
   CC      kernel/bounds.s
   CALL    /home/gross/korg/src/scripts/atomic/check-atomics.sh
   UPD     include/generated/bounds.h
   CC      arch/x86/kernel/asm-offsets.s
In file included from /home/gross/korg/src/include/linux/mmzone.h:19:0,
                  from /home/gross/korg/src/include/linux/gfp.h:6,
                  from /home/gross/korg/src/include/linux/slab.h:15,
                  from /home/gross/korg/src/include/linux/crypto.h:19,
                  from /home/gross/korg/src/arch/x86/kernel/asm-offsets.c:9:
/home/gross/korg/src/include/linux/page-flags-layout.h:95:2: error: 
#error "Not enough bits in page flags"
  #error "Not enough bits in page flags"
   ^~~~~
make[2]: *** [/home/gross/korg/src/scripts/Makefile.build:114: 
arch/x86/kernel/asm-offsets.s] Error 1
make[1]: *** [/home/gross/korg/src/Makefile:1175: prepare0] Error 2
make[1]: Leaving directory '/home/gross/korg/x8632'
make: *** [Makefile:185: __sub-make] Error 2


Juergen
Roger Pau Monné Sept. 3, 2020, 4:38 p.m. UTC | #4
On Thu, Sep 03, 2020 at 05:30:07PM +0200, Jürgen Groß wrote:
> On 01.09.20 10:33, Roger Pau Monne wrote:
> > To be used in order to create foreign mappings. This is based on the
> > ZONE_DEVICE facility which is used by persistent memory devices in
> > order to create struct pages and kernel virtual mappings for the IOMEM
> > areas of such devices. Note that on kernels without support for
> > ZONE_DEVICE Xen will fallback to use ballooned pages in order to
> > create foreign mappings.
> > 
> > The newly added helpers use the same parameters as the existing
> > {alloc/free}_xenballooned_pages functions, which allows for in-place
> > replacement of the callers. Once a memory region has been added to be
> > used as scratch mapping space it will no longer be released, and pages
> > returned are kept in a linked list. This allows to have a buffer of
> > pages and prevents resorting to frequent additions and removals of
> > regions.
> > 
> > If enabled (because ZONE_DEVICE is supported) the usage of the new
> > functionality untangles Xen balloon and RAM hotplug from the usage of
> > unpopulated physical memory ranges to map foreign pages, which is the
> > correct thing to do in order to avoid mappings of foreign pages depend
> > on memory hotplug.
> > 
> > Note the driver is currently not enabled on Arm platforms because it
> > would interfere with the identity mapping required on some platforms.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Sorry, I just got a build error for x86 32-bit build:
> 
> WARNING: unmet direct dependencies detected for ZONE_DEVICE
>   Depends on [n]: MEMORY_HOTPLUG [=n] && MEMORY_HOTREMOVE [=n] &&
> SPARSEMEM_VMEMMAP [=n] && ARCH_HAS_PTE_DEVMAP [=n]
>   Selected by [y]:
>   - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86 [=y]
>   GEN     Makefile
>   CC      kernel/bounds.s
>   CALL    /home/gross/korg/src/scripts/atomic/check-atomics.sh
>   UPD     include/generated/bounds.h
>   CC      arch/x86/kernel/asm-offsets.s
> In file included from /home/gross/korg/src/include/linux/mmzone.h:19:0,
>                  from /home/gross/korg/src/include/linux/gfp.h:6,
>                  from /home/gross/korg/src/include/linux/slab.h:15,
>                  from /home/gross/korg/src/include/linux/crypto.h:19,
>                  from /home/gross/korg/src/arch/x86/kernel/asm-offsets.c:9:
> /home/gross/korg/src/include/linux/page-flags-layout.h:95:2: error: #error
> "Not enough bits in page flags"
>  #error "Not enough bits in page flags"
>   ^~~~~
> make[2]: *** [/home/gross/korg/src/scripts/Makefile.build:114:
> arch/x86/kernel/asm-offsets.s] Error 1
> make[1]: *** [/home/gross/korg/src/Makefile:1175: prepare0] Error 2
> make[1]: Leaving directory '/home/gross/korg/x8632'
> make: *** [Makefile:185: __sub-make] Error 2

Sorry for this. I've tested a 32bit build but I think it was before
the last Kconfig changes. I'm a little unsure how to solve this, as
ZONE_DEVICE doesn't select the required options for it to run, but
rather depends on them to be available.

You can trigger something similar on x86-64 by doing:

$ make ARCH=x86_64 xen.config
Using .config as base
Merging ./kernel/configs/xen.config
Merging ./arch/x86/configs/xen.config
#
# merged configuration written to .config (needs make)
#
scripts/kconfig/conf  --olddefconfig Kconfig

WARNING: unmet direct dependencies detected for ZONE_DEVICE
  Depends on [n]: MEMORY_HOTPLUG [=y] && MEMORY_HOTREMOVE [=n] && SPARSEMEM_VMEMMAP [=y] && ARCH_HAS_PTE_DEVMAP [=y]
  Selected by [y]:
  - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86_64 [=y]
#
# configuration written to .config
#

I think the only solution is to have XEN_UNPOPULATED_ALLOC depend on
ZONE_DEVICE rather than select it?

Thanks, Roger.
Jürgen Groß Sept. 4, 2020, 7 a.m. UTC | #5
On 03.09.20 18:38, Roger Pau Monné wrote:
> On Thu, Sep 03, 2020 at 05:30:07PM +0200, Jürgen Groß wrote:
>> On 01.09.20 10:33, Roger Pau Monne wrote:
>>> To be used in order to create foreign mappings. This is based on the
>>> ZONE_DEVICE facility which is used by persistent memory devices in
>>> order to create struct pages and kernel virtual mappings for the IOMEM
>>> areas of such devices. Note that on kernels without support for
>>> ZONE_DEVICE Xen will fallback to use ballooned pages in order to
>>> create foreign mappings.
>>>
>>> The newly added helpers use the same parameters as the existing
>>> {alloc/free}_xenballooned_pages functions, which allows for in-place
>>> replacement of the callers. Once a memory region has been added to be
>>> used as scratch mapping space it will no longer be released, and pages
>>> returned are kept in a linked list. This allows to have a buffer of
>>> pages and prevents resorting to frequent additions and removals of
>>> regions.
>>>
>>> If enabled (because ZONE_DEVICE is supported) the usage of the new
>>> functionality untangles Xen balloon and RAM hotplug from the usage of
>>> unpopulated physical memory ranges to map foreign pages, which is the
>>> correct thing to do in order to avoid mappings of foreign pages depend
>>> on memory hotplug.
>>>
>>> Note the driver is currently not enabled on Arm platforms because it
>>> would interfere with the identity mapping required on some platforms.
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Sorry, I just got a build error for x86 32-bit build:
>>
>> WARNING: unmet direct dependencies detected for ZONE_DEVICE
>>    Depends on [n]: MEMORY_HOTPLUG [=n] && MEMORY_HOTREMOVE [=n] &&
>> SPARSEMEM_VMEMMAP [=n] && ARCH_HAS_PTE_DEVMAP [=n]
>>    Selected by [y]:
>>    - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86 [=y]
>>    GEN     Makefile
>>    CC      kernel/bounds.s
>>    CALL    /home/gross/korg/src/scripts/atomic/check-atomics.sh
>>    UPD     include/generated/bounds.h
>>    CC      arch/x86/kernel/asm-offsets.s
>> In file included from /home/gross/korg/src/include/linux/mmzone.h:19:0,
>>                   from /home/gross/korg/src/include/linux/gfp.h:6,
>>                   from /home/gross/korg/src/include/linux/slab.h:15,
>>                   from /home/gross/korg/src/include/linux/crypto.h:19,
>>                   from /home/gross/korg/src/arch/x86/kernel/asm-offsets.c:9:
>> /home/gross/korg/src/include/linux/page-flags-layout.h:95:2: error: #error
>> "Not enough bits in page flags"
>>   #error "Not enough bits in page flags"
>>    ^~~~~
>> make[2]: *** [/home/gross/korg/src/scripts/Makefile.build:114:
>> arch/x86/kernel/asm-offsets.s] Error 1
>> make[1]: *** [/home/gross/korg/src/Makefile:1175: prepare0] Error 2
>> make[1]: Leaving directory '/home/gross/korg/x8632'
>> make: *** [Makefile:185: __sub-make] Error 2
> 
> Sorry for this. I've tested a 32bit build but I think it was before
> the last Kconfig changes. I'm a little unsure how to solve this, as
> ZONE_DEVICE doesn't select the required options for it to run, but
> rather depends on them to be available.
> 
> You can trigger something similar on x86-64 by doing:
> 
> $ make ARCH=x86_64 xen.config
> Using .config as base
> Merging ./kernel/configs/xen.config
> Merging ./arch/x86/configs/xen.config
> #
> # merged configuration written to .config (needs make)
> #
> scripts/kconfig/conf  --olddefconfig Kconfig
> 
> WARNING: unmet direct dependencies detected for ZONE_DEVICE
>    Depends on [n]: MEMORY_HOTPLUG [=y] && MEMORY_HOTREMOVE [=n] && SPARSEMEM_VMEMMAP [=y] && ARCH_HAS_PTE_DEVMAP [=y]
>    Selected by [y]:
>    - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86_64 [=y]
> #
> # configuration written to .config
> #
> 
> I think the only solution is to have XEN_UNPOPULATED_ALLOC depend on
> ZONE_DEVICE rather than select it?

Yes, I think so.

I've folded that in and now build is fine.


Juergen
Roger Pau Monné Sept. 4, 2020, 8:42 a.m. UTC | #6
On Fri, Sep 04, 2020 at 09:00:18AM +0200, Jürgen Groß wrote:
> On 03.09.20 18:38, Roger Pau Monné wrote:
> > On Thu, Sep 03, 2020 at 05:30:07PM +0200, Jürgen Groß wrote:
> > > On 01.09.20 10:33, Roger Pau Monne wrote:
> > > > To be used in order to create foreign mappings. This is based on the
> > > > ZONE_DEVICE facility which is used by persistent memory devices in
> > > > order to create struct pages and kernel virtual mappings for the IOMEM
> > > > areas of such devices. Note that on kernels without support for
> > > > ZONE_DEVICE Xen will fallback to use ballooned pages in order to
> > > > create foreign mappings.
> > > > 
> > > > The newly added helpers use the same parameters as the existing
> > > > {alloc/free}_xenballooned_pages functions, which allows for in-place
> > > > replacement of the callers. Once a memory region has been added to be
> > > > used as scratch mapping space it will no longer be released, and pages
> > > > returned are kept in a linked list. This allows to have a buffer of
> > > > pages and prevents resorting to frequent additions and removals of
> > > > regions.
> > > > 
> > > > If enabled (because ZONE_DEVICE is supported) the usage of the new
> > > > functionality untangles Xen balloon and RAM hotplug from the usage of
> > > > unpopulated physical memory ranges to map foreign pages, which is the
> > > > correct thing to do in order to avoid mappings of foreign pages depend
> > > > on memory hotplug.
> > > > 
> > > > Note the driver is currently not enabled on Arm platforms because it
> > > > would interfere with the identity mapping required on some platforms.
> > > > 
> > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > 
> > > Sorry, I just got a build error for x86 32-bit build:
> > > 
> > > WARNING: unmet direct dependencies detected for ZONE_DEVICE
> > >    Depends on [n]: MEMORY_HOTPLUG [=n] && MEMORY_HOTREMOVE [=n] &&
> > > SPARSEMEM_VMEMMAP [=n] && ARCH_HAS_PTE_DEVMAP [=n]
> > >    Selected by [y]:
> > >    - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86 [=y]
> > >    GEN     Makefile
> > >    CC      kernel/bounds.s
> > >    CALL    /home/gross/korg/src/scripts/atomic/check-atomics.sh
> > >    UPD     include/generated/bounds.h
> > >    CC      arch/x86/kernel/asm-offsets.s
> > > In file included from /home/gross/korg/src/include/linux/mmzone.h:19:0,
> > >                   from /home/gross/korg/src/include/linux/gfp.h:6,
> > >                   from /home/gross/korg/src/include/linux/slab.h:15,
> > >                   from /home/gross/korg/src/include/linux/crypto.h:19,
> > >                   from /home/gross/korg/src/arch/x86/kernel/asm-offsets.c:9:
> > > /home/gross/korg/src/include/linux/page-flags-layout.h:95:2: error: #error
> > > "Not enough bits in page flags"
> > >   #error "Not enough bits in page flags"
> > >    ^~~~~
> > > make[2]: *** [/home/gross/korg/src/scripts/Makefile.build:114:
> > > arch/x86/kernel/asm-offsets.s] Error 1
> > > make[1]: *** [/home/gross/korg/src/Makefile:1175: prepare0] Error 2
> > > make[1]: Leaving directory '/home/gross/korg/x8632'
> > > make: *** [Makefile:185: __sub-make] Error 2
> > 
> > Sorry for this. I've tested a 32bit build but I think it was before
> > the last Kconfig changes. I'm a little unsure how to solve this, as
> > ZONE_DEVICE doesn't select the required options for it to run, but
> > rather depends on them to be available.
> > 
> > You can trigger something similar on x86-64 by doing:
> > 
> > $ make ARCH=x86_64 xen.config
> > Using .config as base
> > Merging ./kernel/configs/xen.config
> > Merging ./arch/x86/configs/xen.config
> > #
> > # merged configuration written to .config (needs make)
> > #
> > scripts/kconfig/conf  --olddefconfig Kconfig
> > 
> > WARNING: unmet direct dependencies detected for ZONE_DEVICE
> >    Depends on [n]: MEMORY_HOTPLUG [=y] && MEMORY_HOTREMOVE [=n] && SPARSEMEM_VMEMMAP [=y] && ARCH_HAS_PTE_DEVMAP [=y]
> >    Selected by [y]:
> >    - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86_64 [=y]
> > #
> > # configuration written to .config
> > #
> > 
> > I think the only solution is to have XEN_UNPOPULATED_ALLOC depend on
> > ZONE_DEVICE rather than select it?
> 
> Yes, I think so.
> 
> I've folded that in and now build is fine.

Thanks, I assume no further action is needed on my side.

Roger.
Jürgen Groß Sept. 4, 2020, 12:40 p.m. UTC | #7
On 04.09.20 10:42, Roger Pau Monné wrote:
> On Fri, Sep 04, 2020 at 09:00:18AM +0200, Jürgen Groß wrote:
>> On 03.09.20 18:38, Roger Pau Monné wrote:
>>> On Thu, Sep 03, 2020 at 05:30:07PM +0200, Jürgen Groß wrote:
>>>> On 01.09.20 10:33, Roger Pau Monne wrote:
>>>>> To be used in order to create foreign mappings. This is based on the
>>>>> ZONE_DEVICE facility which is used by persistent memory devices in
>>>>> order to create struct pages and kernel virtual mappings for the IOMEM
>>>>> areas of such devices. Note that on kernels without support for
>>>>> ZONE_DEVICE Xen will fallback to use ballooned pages in order to
>>>>> create foreign mappings.
>>>>>
>>>>> The newly added helpers use the same parameters as the existing
>>>>> {alloc/free}_xenballooned_pages functions, which allows for in-place
>>>>> replacement of the callers. Once a memory region has been added to be
>>>>> used as scratch mapping space it will no longer be released, and pages
>>>>> returned are kept in a linked list. This allows to have a buffer of
>>>>> pages and prevents resorting to frequent additions and removals of
>>>>> regions.
>>>>>
>>>>> If enabled (because ZONE_DEVICE is supported) the usage of the new
>>>>> functionality untangles Xen balloon and RAM hotplug from the usage of
>>>>> unpopulated physical memory ranges to map foreign pages, which is the
>>>>> correct thing to do in order to avoid mappings of foreign pages depend
>>>>> on memory hotplug.
>>>>>
>>>>> Note the driver is currently not enabled on Arm platforms because it
>>>>> would interfere with the identity mapping required on some platforms.
>>>>>
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> Sorry, I just got a build error for x86 32-bit build:
>>>>
>>>> WARNING: unmet direct dependencies detected for ZONE_DEVICE
>>>>     Depends on [n]: MEMORY_HOTPLUG [=n] && MEMORY_HOTREMOVE [=n] &&
>>>> SPARSEMEM_VMEMMAP [=n] && ARCH_HAS_PTE_DEVMAP [=n]
>>>>     Selected by [y]:
>>>>     - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86 [=y]
>>>>     GEN     Makefile
>>>>     CC      kernel/bounds.s
>>>>     CALL    /home/gross/korg/src/scripts/atomic/check-atomics.sh
>>>>     UPD     include/generated/bounds.h
>>>>     CC      arch/x86/kernel/asm-offsets.s
>>>> In file included from /home/gross/korg/src/include/linux/mmzone.h:19:0,
>>>>                    from /home/gross/korg/src/include/linux/gfp.h:6,
>>>>                    from /home/gross/korg/src/include/linux/slab.h:15,
>>>>                    from /home/gross/korg/src/include/linux/crypto.h:19,
>>>>                    from /home/gross/korg/src/arch/x86/kernel/asm-offsets.c:9:
>>>> /home/gross/korg/src/include/linux/page-flags-layout.h:95:2: error: #error
>>>> "Not enough bits in page flags"
>>>>    #error "Not enough bits in page flags"
>>>>     ^~~~~
>>>> make[2]: *** [/home/gross/korg/src/scripts/Makefile.build:114:
>>>> arch/x86/kernel/asm-offsets.s] Error 1
>>>> make[1]: *** [/home/gross/korg/src/Makefile:1175: prepare0] Error 2
>>>> make[1]: Leaving directory '/home/gross/korg/x8632'
>>>> make: *** [Makefile:185: __sub-make] Error 2
>>>
>>> Sorry for this. I've tested a 32bit build but I think it was before
>>> the last Kconfig changes. I'm a little unsure how to solve this, as
>>> ZONE_DEVICE doesn't select the required options for it to run, but
>>> rather depends on them to be available.
>>>
>>> You can trigger something similar on x86-64 by doing:
>>>
>>> $ make ARCH=x86_64 xen.config
>>> Using .config as base
>>> Merging ./kernel/configs/xen.config
>>> Merging ./arch/x86/configs/xen.config
>>> #
>>> # merged configuration written to .config (needs make)
>>> #
>>> scripts/kconfig/conf  --olddefconfig Kconfig
>>>
>>> WARNING: unmet direct dependencies detected for ZONE_DEVICE
>>>     Depends on [n]: MEMORY_HOTPLUG [=y] && MEMORY_HOTREMOVE [=n] && SPARSEMEM_VMEMMAP [=y] && ARCH_HAS_PTE_DEVMAP [=y]
>>>     Selected by [y]:
>>>     - XEN_UNPOPULATED_ALLOC [=y] && XEN [=y] && X86_64 [=y]
>>> #
>>> # configuration written to .config
>>> #
>>>
>>> I think the only solution is to have XEN_UNPOPULATED_ALLOC depend on
>>> ZONE_DEVICE rather than select it?
>>
>> Yes, I think so.
>>
>> I've folded that in and now build is fine.
> 
> Thanks, I assume no further action is needed on my side.

Right.


Juergen
diff mbox series

Patch

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 39ff95b75357..534daf37c97e 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -18,6 +18,7 @@ 
 #include <drm/drm_probe_helper.h>
 
 #include <xen/balloon.h>
+#include <xen/xen.h>
 
 #include "xen_drm_front.h"
 #include "xen_drm_front_gem.h"
@@ -99,8 +100,8 @@  static struct xen_gem_object *gem_create(struct drm_device *dev, size_t size)
 		 * allocate ballooned pages which will be used to map
 		 * grant references provided by the backend
 		 */
-		ret = alloc_xenballooned_pages(xen_obj->num_pages,
-					       xen_obj->pages);
+		ret = xen_alloc_unpopulated_pages(xen_obj->num_pages,
+					          xen_obj->pages);
 		if (ret < 0) {
 			DRM_ERROR("Cannot allocate %zu ballooned pages: %d\n",
 				  xen_obj->num_pages, ret);
@@ -152,8 +153,8 @@  void xen_drm_front_gem_free_object_unlocked(struct drm_gem_object *gem_obj)
 	} else {
 		if (xen_obj->pages) {
 			if (xen_obj->be_alloc) {
-				free_xenballooned_pages(xen_obj->num_pages,
-							xen_obj->pages);
+				xen_free_unpopulated_pages(xen_obj->num_pages,
+							   xen_obj->pages);
 				gem_free_pages_array(xen_obj);
 			} else {
 				drm_gem_put_pages(&xen_obj->base,
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index ea6c1e7e3e42..e38c33558d0d 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -325,4 +325,15 @@  config XEN_HAVE_VPMU
 config XEN_FRONT_PGDIR_SHBUF
 	tristate
 
+config XEN_UNPOPULATED_ALLOC
+	bool "Use unpopulated memory ranges for guest mappings"
+	depends on X86
+	select ZONE_DEVICE
+	default XEN_BACKEND || XEN_GNTDEV || XEN_DOM0
+	help
+	  Use unpopulated memory ranges in order to create mappings for guest
+	  memory regions, including grant maps and foreign pages. This avoids
+	  having to balloon out RAM regions in order to obtain physical memory
+	  space to create such mappings.
+
 endmenu
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index c25c9a699b48..babdca808861 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -41,3 +41,4 @@  xen-gntdev-$(CONFIG_XEN_GNTDEV_DMABUF)	+= gntdev-dmabuf.o
 xen-gntalloc-y				:= gntalloc.o
 xen-privcmd-y				:= privcmd.o privcmd-buf.o
 obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF)	+= xen-front-pgdir-shbuf.o
+obj-$(CONFIG_XEN_UNPOPULATED_ALLOC)	+= unpopulated-alloc.o
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 37ffccda8bb8..51427c752b37 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -653,7 +653,7 @@  void free_xenballooned_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL(free_xenballooned_pages);
 
-#ifdef CONFIG_XEN_PV
+#if defined(CONFIG_XEN_PV) && !defined(CONFIG_XEN_UNPOPULATED_ALLOC)
 static void __init balloon_add_region(unsigned long start_pfn,
 				      unsigned long pages)
 {
@@ -707,7 +707,7 @@  static int __init balloon_init(void)
 	register_sysctl_table(xen_root);
 #endif
 
-#ifdef CONFIG_XEN_PV
+#if defined(CONFIG_XEN_PV) && !defined(CONFIG_XEN_UNPOPULATED_ALLOC)
 	{
 		int i;
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 8d06bf1cc347..523dcdf39cc9 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -801,7 +801,7 @@  int gnttab_alloc_pages(int nr_pages, struct page **pages)
 {
 	int ret;
 
-	ret = alloc_xenballooned_pages(nr_pages, pages);
+	ret = xen_alloc_unpopulated_pages(nr_pages, pages);
 	if (ret < 0)
 		return ret;
 
@@ -836,7 +836,7 @@  EXPORT_SYMBOL_GPL(gnttab_pages_clear_private);
 void gnttab_free_pages(int nr_pages, struct page **pages)
 {
 	gnttab_pages_clear_private(nr_pages, pages);
-	free_xenballooned_pages(nr_pages, pages);
+	xen_free_unpopulated_pages(nr_pages, pages);
 }
 EXPORT_SYMBOL_GPL(gnttab_free_pages);
 
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 63abe6c3642b..b0c73c58f987 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -424,7 +424,7 @@  static int alloc_empty_pages(struct vm_area_struct *vma, int numpgs)
 	if (pages == NULL)
 		return -ENOMEM;
 
-	rc = alloc_xenballooned_pages(numpgs, pages);
+	rc = xen_alloc_unpopulated_pages(numpgs, pages);
 	if (rc != 0) {
 		pr_warn("%s Could not alloc %d pfns rc:%d\n", __func__,
 			numpgs, rc);
@@ -895,7 +895,7 @@  static void privcmd_close(struct vm_area_struct *vma)
 
 	rc = xen_unmap_domain_gfn_range(vma, numgfns, pages);
 	if (rc == 0)
-		free_xenballooned_pages(numpgs, pages);
+		xen_free_unpopulated_pages(numpgs, pages);
 	else
 		pr_crit("unable to unmap MFN range: leaking %d pages. rc=%d\n",
 			numpgs, rc);
diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
new file mode 100644
index 000000000000..1b5d157c6977
--- /dev/null
+++ b/drivers/xen/unpopulated-alloc.c
@@ -0,0 +1,185 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/errno.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/memremap.h>
+#include <linux/slab.h>
+
+#include <asm/page.h>
+
+#include <xen/page.h>
+#include <xen/xen.h>
+
+static DEFINE_MUTEX(list_lock);
+static LIST_HEAD(page_list);
+static unsigned int list_count;
+
+static int fill_list(unsigned int nr_pages)
+{
+	struct dev_pagemap *pgmap;
+	void *vaddr;
+	unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
+	int nid, ret;
+
+	pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
+	if (!pgmap)
+		return -ENOMEM;
+
+	pgmap->type = MEMORY_DEVICE_GENERIC;
+	pgmap->res.name = "Xen scratch";
+	pgmap->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+
+	ret = allocate_resource(&iomem_resource, &pgmap->res,
+				alloc_pages * PAGE_SIZE, 0, -1,
+				PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
+	if (ret < 0) {
+		pr_err("Cannot allocate new IOMEM resource\n");
+		kfree(pgmap);
+		return ret;
+	}
+
+	nid = memory_add_physaddr_to_nid(pgmap->res.start);
+
+#ifdef CONFIG_XEN_HAVE_PVMMU
+        /*
+         * memremap will build page tables for the new memory so
+         * the p2m must contain invalid entries so the correct
+         * non-present PTEs will be written.
+         *
+         * If a failure occurs, the original (identity) p2m entries
+         * are not restored since this region is now known not to
+         * conflict with any devices.
+         */
+	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+		xen_pfn_t pfn = PFN_DOWN(pgmap->res.start);
+
+		for (i = 0; i < alloc_pages; i++) {
+			if (!set_phys_to_machine(pfn + i, INVALID_P2M_ENTRY)) {
+				pr_warn("set_phys_to_machine() failed, no memory added\n");
+				release_resource(&pgmap->res);
+				kfree(pgmap);
+				return -ENOMEM;
+			}
+                }
+	}
+#endif
+
+	vaddr = memremap_pages(pgmap, nid);
+	if (IS_ERR(vaddr)) {
+		pr_err("Cannot remap memory range\n");
+		release_resource(&pgmap->res);
+		kfree(pgmap);
+		return PTR_ERR(vaddr);
+	}
+
+	for (i = 0; i < alloc_pages; i++) {
+		struct page *pg = virt_to_page(vaddr + PAGE_SIZE * i);
+
+		BUG_ON(!virt_addr_valid(vaddr + PAGE_SIZE * i));
+		list_add(&pg->lru, &page_list);
+		list_count++;
+	}
+
+	return 0;
+}
+
+/**
+ * xen_alloc_unpopulated_pages - alloc unpopulated pages
+ * @nr_pages: Number of pages
+ * @pages: pages returned
+ * @return 0 on success, error otherwise
+ */
+int xen_alloc_unpopulated_pages(unsigned int nr_pages, struct page **pages)
+{
+	unsigned int i;
+	int ret = 0;
+
+	mutex_lock(&list_lock);
+	if (list_count < nr_pages) {
+		ret = fill_list(nr_pages - list_count);
+		if (ret)
+			goto out;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		struct page *pg = list_first_entry_or_null(&page_list,
+							   struct page,
+							   lru);
+
+		BUG_ON(!pg);
+		list_del(&pg->lru);
+		list_count--;
+		pages[i] = pg;
+
+#ifdef CONFIG_XEN_HAVE_PVMMU
+		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+			ret = xen_alloc_p2m_entry(page_to_pfn(pg));
+			if (ret < 0) {
+				unsigned int j;
+
+				for (j = 0; j <= i; j++) {
+					list_add(&pages[j]->lru, &page_list);
+					list_count++;
+				}
+				goto out;
+			}
+		}
+#endif
+	}
+
+out:
+	mutex_unlock(&list_lock);
+	return ret;
+}
+EXPORT_SYMBOL(xen_alloc_unpopulated_pages);
+
+/**
+ * xen_free_unpopulated_pages - return unpopulated pages
+ * @nr_pages: Number of pages
+ * @pages: pages to return
+ */
+void xen_free_unpopulated_pages(unsigned int nr_pages, struct page **pages)
+{
+	unsigned int i;
+
+	mutex_lock(&list_lock);
+	for (i = 0; i < nr_pages; i++) {
+		list_add(&pages[i]->lru, &page_list);
+		list_count++;
+	}
+	mutex_unlock(&list_lock);
+}
+EXPORT_SYMBOL(xen_free_unpopulated_pages);
+
+#ifdef CONFIG_XEN_PV
+static int __init init(void)
+{
+	unsigned int i;
+
+	if (!xen_domain())
+		return -ENODEV;
+
+	if (!xen_pv_domain())
+		return 0;
+
+	/*
+	 * Initialize with pages from the extra memory regions (see
+	 * arch/x86/xen/setup.c).
+	 */
+	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) {
+		unsigned int j;
+
+		for (j = 0; j < xen_extra_mem[i].n_pfns; j++) {
+			struct page *pg =
+				pfn_to_page(xen_extra_mem[i].start_pfn + j);
+
+			list_add(&pg->lru, &page_list);
+			list_count++;
+		}
+	}
+
+	return 0;
+}
+subsys_initcall(init);
+#endif
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 907bcbb93afb..2690318ad50f 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -621,7 +621,7 @@  static int xenbus_map_ring_hvm(struct xenbus_device *dev,
 	bool leaked = false;
 	unsigned int nr_pages = XENBUS_PAGES(nr_grefs);
 
-	err = alloc_xenballooned_pages(nr_pages, node->hvm.pages);
+	err = xen_alloc_unpopulated_pages(nr_pages, node->hvm.pages);
 	if (err)
 		goto out_err;
 
@@ -662,7 +662,7 @@  static int xenbus_map_ring_hvm(struct xenbus_device *dev,
 			 addr, nr_pages);
  out_free_ballooned_pages:
 	if (!leaked)
-		free_xenballooned_pages(nr_pages, node->hvm.pages);
+		xen_free_unpopulated_pages(nr_pages, node->hvm.pages);
  out_err:
 	return err;
 }
@@ -858,7 +858,7 @@  static int xenbus_unmap_ring_hvm(struct xenbus_device *dev, void *vaddr)
 			       info.addrs);
 	if (!rv) {
 		vunmap(vaddr);
-		free_xenballooned_pages(nr_pages, node->hvm.pages);
+		xen_free_unpopulated_pages(nr_pages, node->hvm.pages);
 	}
 	else
 		WARN(1, "Leaking %p, size %u page(s)\n", vaddr, nr_pages);
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index 7b1077f0abcb..34742c6e189e 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -232,7 +232,7 @@  int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
 		kfree(pages);
 		return -ENOMEM;
 	}
-	rc = alloc_xenballooned_pages(nr_pages, pages);
+	rc = xen_alloc_unpopulated_pages(nr_pages, pages);
 	if (rc) {
 		pr_warn("%s Couldn't balloon alloc %ld pages rc:%d\n", __func__,
 			nr_pages, rc);
@@ -249,7 +249,7 @@  int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
 	if (!vaddr) {
 		pr_warn("%s Couldn't map %ld pages rc:%d\n", __func__,
 			nr_pages, rc);
-		free_xenballooned_pages(nr_pages, pages);
+		xen_free_unpopulated_pages(nr_pages, pages);
 		kfree(pages);
 		kfree(pfns);
 		return -ENOMEM;
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 19a72f591e2b..43efba045acc 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -52,4 +52,13 @@  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
 extern u64 xen_saved_max_mem_size;
 #endif
 
+#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
+int xen_alloc_unpopulated_pages(unsigned int nr_pages, struct page **pages);
+void xen_free_unpopulated_pages(unsigned int nr_pages, struct page **pages);
+#else
+#define xen_alloc_unpopulated_pages alloc_xenballooned_pages
+#define xen_free_unpopulated_pages free_xenballooned_pages
+#include <xen/balloon.h>
+#endif
+
 #endif	/* _XEN_XEN_H */