diff mbox series

[v4,10/13] lib: test_hmm add module param for zone device type

Message ID 20210717192135.9030-11-alex.sierra@amd.com (mailing list archive)
State New
Headers show
Series Support DEVICE_GENERIC memory in migrate_vma_* | expand

Commit Message

Sierra Guiza, Alejandro (Alex) July 17, 2021, 7:21 p.m. UTC
In order to configure device generic in test_hmm, two
module parameters should be passed, which correspon to the
SP start address of each device (2) spm_addr_dev0 &
spm_addr_dev1. If no parameters are passed, private device
type is configured.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 lib/test_hmm.c      | 45 +++++++++++++++++++++++++++++++++------------
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 34 insertions(+), 12 deletions(-)

Comments

Jason Gunthorpe July 22, 2021, 12:23 p.m. UTC | #1
On Sat, Jul 17, 2021 at 02:21:32PM -0500, Alex Sierra wrote:
> In order to configure device generic in test_hmm, two
> module parameters should be passed, which correspon to the
> SP start address of each device (2) spm_addr_dev0 &
> spm_addr_dev1. If no parameters are passed, private device
> type is configured.

I don't think tests should need configuration like this, is it really
necessary? How can people with normal HW run this test?

Jason
Sierra Guiza, Alejandro (Alex) July 22, 2021, 4:59 p.m. UTC | #2
On 7/22/2021 7:23 AM, Jason Gunthorpe wrote:
> On Sat, Jul 17, 2021 at 02:21:32PM -0500, Alex Sierra wrote:
>> In order to configure device generic in test_hmm, two
>> module parameters should be passed, which correspon to the
>> SP start address of each device (2) spm_addr_dev0 &
>> spm_addr_dev1. If no parameters are passed, private device
>> type is configured.
> I don't think tests should need configuration like this, is it really
> necessary? How can people with normal HW run this test?
Hi Jason,
The idea was to add an easy way to validate the codepaths touched by this
patch series, which make modifications to the migration helpers for device
generic type pages. We're using CONFIG_EFI_FAKE_MEMMAP to create fake SPM
devices inside system memory. No special HW needed. And passing the kernel
parameter efi_fake_mem. Ex. efi_fake_mem=1G@0x100000000:0x40000. I should
probably need to include a small example of how to set this in the 
test_hmm.sh
usage().

Alex S.

> Jason
Jason Gunthorpe July 22, 2021, 5:26 p.m. UTC | #3
On Thu, Jul 22, 2021 at 11:59:17AM -0500, Sierra Guiza, Alejandro (Alex) wrote:
> 
> On 7/22/2021 7:23 AM, Jason Gunthorpe wrote:
> > On Sat, Jul 17, 2021 at 02:21:32PM -0500, Alex Sierra wrote:
> > > In order to configure device generic in test_hmm, two
> > > module parameters should be passed, which correspon to the
> > > SP start address of each device (2) spm_addr_dev0 &
> > > spm_addr_dev1. If no parameters are passed, private device
> > > type is configured.
> > I don't think tests should need configuration like this, is it really
> > necessary? How can people with normal HW run this test?
> Hi Jason,
> The idea was to add an easy way to validate the codepaths touched by this
> patch series, which make modifications to the migration helpers for device
> generic type pages. We're using CONFIG_EFI_FAKE_MEMMAP to create fake SPM
> devices inside system memory. No special HW needed. And passing the kernel
> parameter efi_fake_mem. Ex. efi_fake_mem=1G@0x100000000:0x40000. I should
> probably need to include a small example of how to set this in the
> test_hmm.sh
> usage().

I don't think anything about hmm is sensitive to how the pages are
acquired - you can't create device generic pages without relying on
FAKE_MEMMAP?

Jason
Sierra Guiza, Alejandro (Alex) July 28, 2021, 11:45 p.m. UTC | #4
On 7/22/2021 12:26 PM, Jason Gunthorpe wrote:
> On Thu, Jul 22, 2021 at 11:59:17AM -0500, Sierra Guiza, Alejandro (Alex) wrote:
>> On 7/22/2021 7:23 AM, Jason Gunthorpe wrote:
>>> On Sat, Jul 17, 2021 at 02:21:32PM -0500, Alex Sierra wrote:
>>>> In order to configure device generic in test_hmm, two
>>>> module parameters should be passed, which correspon to the
>>>> SP start address of each device (2) spm_addr_dev0 &
>>>> spm_addr_dev1. If no parameters are passed, private device
>>>> type is configured.
>>> I don't think tests should need configuration like this, is it really
>>> necessary? How can people with normal HW run this test?
>> Hi Jason,
>> The idea was to add an easy way to validate the codepaths touched by this
>> patch series, which make modifications to the migration helpers for device
>> generic type pages. We're using CONFIG_EFI_FAKE_MEMMAP to create fake SPM
>> devices inside system memory. No special HW needed. And passing the kernel
>> parameter efi_fake_mem. Ex. efi_fake_mem=1G@0x100000000:0x40000. I should
>> probably need to include a small example of how to set this in the
>> test_hmm.sh
>> usage().
> I don't think anything about hmm is sensitive to how the pages are
> acquired - you can't create device generic pages without relying on
> FAKE_MEMMAP?
The reason we used fake SPM approach was to have a "special memory"
not managed by Linux (NOT registered as normal system memory). But
also accessible by the CPU.

For device_generic we cannot allocate new physical addresses.
We need the physical address to match the actual system memory
physical address, so that CPU mappings work as expected.

Would you recommend to use a different approach?

Regards,
Alex Sierra

>
> Jason
Felix Kuehling July 30, 2021, 7:11 p.m. UTC | #5
Am 2021-07-28 um 7:45 p.m. schrieb Sierra Guiza, Alejandro (Alex):
>
> On 7/22/2021 12:26 PM, Jason Gunthorpe wrote:
>> On Thu, Jul 22, 2021 at 11:59:17AM -0500, Sierra Guiza, Alejandro
>> (Alex) wrote:
>>> On 7/22/2021 7:23 AM, Jason Gunthorpe wrote:
>>>> On Sat, Jul 17, 2021 at 02:21:32PM -0500, Alex Sierra wrote:
>>>>> In order to configure device generic in test_hmm, two
>>>>> module parameters should be passed, which correspon to the
>>>>> SP start address of each device (2) spm_addr_dev0 &
>>>>> spm_addr_dev1. If no parameters are passed, private device
>>>>> type is configured.
>>>> I don't think tests should need configuration like this, is it really
>>>> necessary? How can people with normal HW run this test?
>>> Hi Jason,
>>> The idea was to add an easy way to validate the codepaths touched by
>>> this
>>> patch series, which make modifications to the migration helpers for
>>> device
>>> generic type pages. We're using CONFIG_EFI_FAKE_MEMMAP to create
>>> fake SPM
>>> devices inside system memory. No special HW needed. And passing the
>>> kernel
>>> parameter efi_fake_mem. Ex. efi_fake_mem=1G@0x100000000:0x40000. I
>>> should
>>> probably need to include a small example of how to set this in the
>>> test_hmm.sh
>>> usage().
>> I don't think anything about hmm is sensitive to how the pages are
>> acquired - you can't create device generic pages without relying on
>> FAKE_MEMMAP?
> The reason we used fake SPM approach was to have a "special memory"
> not managed by Linux (NOT registered as normal system memory). But
> also accessible by the CPU.
>
> For device_generic we cannot allocate new physical addresses.
> We need the physical address to match the actual system memory
> physical address, so that CPU mappings work as expected.
>
> Would you recommend to use a different approach?

Hi Jason,

Sorry it took us so long to respond to your comment. I was on vacation
for a week. Then I tried to brain-storm some ways with Alex to simplify
hmm_test for device_generic memory, but couldn't really come up with
anything simpler.

The problem as I see it is, that DEVICE_GENERIC pages for hmm_test
should be pages in physical system memory, but they should not be
managed by the Linux page allocator. Unlike DEVICE_PRIVATE, we cannot
allocate arbitrary physical addresses for these pages
(request_free_mem_region). Otherwise we'd break all the assumptions that
make those pages directly accessible in user mode virtual address spaces.

We could maybe allocate contiguous memory from the page allocator and
then register those as device generic pages. But that means, you'd now
have two struct pages for the same physical page. I didn't think that
would be a good idea.

IMHO, reserving some memory for this test with efi_fake_mem is the best
way to keep things sane. It does not require any special hardware or
firmware.

Regards,
  Felix


>
> Regards,
> Alex Sierra
>
>>
>> Jason
diff mbox series

Patch

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 3cd91ca31dd7..3c2e1fbedbd4 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -33,6 +33,16 @@ 
 #define DEVMEM_CHUNK_SIZE		(256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE		16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+		"Specify start address for SPM (special purpose memory) used for device 0. By setting this Generic device type will be used. Make sure spm_addr_dev1 is set too");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+		"Specify start address for SPM (special purpose memory) used for device 1. By setting this Generic device type will be used. Make sure spm_addr_dev0 is set too");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -450,11 +460,11 @@  static int dmirror_write(struct dmirror *dmirror, struct hmm_dmirror_cmd *cmd)
 	return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 				   struct page **ppage)
 {
 	struct dmirror_chunk *devmem;
-	struct resource *res;
+	struct resource *res = NULL;
 	unsigned long pfn;
 	unsigned long pfn_first;
 	unsigned long pfn_last;
@@ -462,14 +472,26 @@  static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 	devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
 	if (!devmem)
-		return false;
+		return -ENOMEM;
+
+	if (!spm_addr_dev0 && !spm_addr_dev1) {
+		res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
+					      "hmm_dmirror");
+		devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+		mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+	} else if (spm_addr_dev0 && spm_addr_dev1) {
+		res = lookup_resource(&iomem_resource, MINOR(mdevice->cdevice.dev) ?
+							spm_addr_dev0 :
+							spm_addr_dev1);
+		devmem->pagemap.type = MEMORY_DEVICE_GENERIC;
+		mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_GENERIC;
+	} else {
+		pr_err("Both spm_addr_dev parameters should be set\n");
+	}
 
-	res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
-				      "hmm_dmirror");
 	if (IS_ERR(res))
 		goto err_devmem;
 
-	mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
 	devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
 	devmem->pagemap.range.start = res->start;
 	devmem->pagemap.range.end = res->end;
@@ -1097,10 +1119,8 @@  static int dmirror_device_init(struct dmirror_device *mdevice, int id)
 	if (ret)
 		return ret;
 
-	/* Build a list of free ZONE_DEVICE private struct pages */
-	dmirror_allocate_chunk(mdevice, NULL);
-
-	return 0;
+	/* Build a list of free ZONE_DEVICE struct pages */
+	return dmirror_allocate_chunk(mdevice, NULL);
 }
 
 static void dmirror_device_remove(struct dmirror_device *mdevice)
@@ -1113,8 +1133,9 @@  static void dmirror_device_remove(struct dmirror_device *mdevice)
 				mdevice->devmem_chunks[i];
 
 			memunmap_pages(&devmem->pagemap);
-			release_mem_region(devmem->pagemap.range.start,
-					   range_len(&devmem->pagemap.range));
+			if (devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
+				release_mem_region(devmem->pagemap.range.start,
+						   range_len(&devmem->pagemap.range));
 			kfree(devmem);
 		}
 		kfree(mdevice->devmem_chunks);
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index ee88701793d5..17a6b5059871 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -65,6 +65,7 @@  enum {
 enum {
 	/* 0 is reserved to catch uninitialized type fields */
 	HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+	HMM_DMIRROR_MEMORY_DEVICE_GENERIC,
 };
 
 #endif /* _LIB_TEST_HMM_UAPI_H */