Message ID | 20180817154127.28602-1-osalvador@techadventures.net (mailing list archive) |
---|---|
Series | Do not touch pages in remove_memory path |
On Fri, Aug 17, 2018 at 05:41:25PM +0200, Oscar Salvador wrote:
> From: Oscar Salvador <osalvador@suse.de>
[...]
>
> The main difficulty I faced here was with regard to HMM/devm, as it handles
> hot-add/remove of memory in its own particular way and, more importantly,
> also the resources.
>
> I really scratched my head for ideas about how to handle this case, and
> after some failed attempts I came up with the idea that we could check
> res->flags.
>
> Memory resources that go through the "official" memory-hotplug channels
> have the IORESOURCE_SYSTEM_RAM flag.
> This flag is made of (IORESOURCE_MEM|IORESOURCE_SYSRAM).
>
> HMM/devm, on the other hand, requests and releases the resources
> through devm_request_mem_region/devm_release_mem_region, and
> these resources do not contain the IORESOURCE_SYSRAM flag.
>
> So what I ended up doing is to check for IORESOURCE_SYSRAM
> in release_mem_region_adjustable.
> If we see that a resource does not have such a flag, we know that
> we are dealing with a resource coming from HMM/devm, and so
> we do not need to do anything, as HMM/devm will take care of that part.
>

Jerome/Dan, now that the merge window is closed, and before I send RFCv3, could you please check
this and see whether anything is flagrantly wrong (regarding devm/HMM)?

If you prefer, I can split v3 up even more.
Maybe this will ease the review.

Thanks

On Tue, Aug 28, 2018 at 01:47:09PM +0200, Oscar Salvador wrote:
> On Fri, Aug 17, 2018 at 05:41:25PM +0200, Oscar Salvador wrote:
> > From: Oscar Salvador <osalvador@suse.de>
> [...]
> >
> > The main difficulty I faced here was with regard to HMM/devm, as it handles
> > hot-add/remove of memory in its own particular way and, more importantly,
> > also the resources.
> >
> > I really scratched my head for ideas about how to handle this case, and
> > after some failed attempts I came up with the idea that we could check
> > res->flags.
> >
> > Memory resources that go through the "official" memory-hotplug channels
> > have the IORESOURCE_SYSTEM_RAM flag.
> > This flag is made of (IORESOURCE_MEM|IORESOURCE_SYSRAM).
> >
> > HMM/devm, on the other hand, requests and releases the resources
> > through devm_request_mem_region/devm_release_mem_region, and
> > these resources do not contain the IORESOURCE_SYSRAM flag.
> >
> > So what I ended up doing is to check for IORESOURCE_SYSRAM
> > in release_mem_region_adjustable.
> > If we see that a resource does not have such a flag, we know that
> > we are dealing with a resource coming from HMM/devm, and so
> > we do not need to do anything, as HMM/devm will take care of that part.
> >
>
> Jerome/Dan, now that the merge window is closed, and before I send RFCv3, could you please check
> this and see whether anything is flagrantly wrong (regarding devm/HMM)?
>
> If you prefer, I can split v3 up even more.
> Maybe this will ease the review.
>

This looks good to me, you can add:

Reviewed-by: Jérôme Glisse <jglisse@redhat.com>

From: Oscar Salvador <osalvador@suse.de>

This patchset moves all zone/page handling from the remove_memory
path back to the offline_pages stage.
This has been done for two reasons:

1) We can access stale pages if we remove memory that was never online [1]
2) Code consistency

Currently, when we online memory, online_pages() takes care of initializing
the pages and putting the memory range into its corresponding zone.
So, the zone's/pgdat's spanned/present pages get resized.

But the opposite does not happen when we offline memory.
Only the present pages count is decremented, and we wait until the memory
is removed to shrink the zone's/node's spanned_pages.
But as explained above, this is wrong.

So this patchset tries to cover this by moving this handling to the place
where it should be.

The main difficulty I faced here was with regard to HMM/devm, as it handles
hot-add/remove of memory in its own particular way and, more importantly,
also the resources.

I really scratched my head for ideas about how to handle this case, and
after some failed attempts I came up with the idea that we could check
res->flags.

Memory resources that go through the "official" memory-hotplug channels
have the IORESOURCE_SYSTEM_RAM flag.
This flag is made of (IORESOURCE_MEM|IORESOURCE_SYSRAM).

HMM/devm, on the other hand, requests and releases the resources
through devm_request_mem_region/devm_release_mem_region, and
these resources do not contain the IORESOURCE_SYSRAM flag.

So what I ended up doing is to check for IORESOURCE_SYSRAM
in release_mem_region_adjustable.
If we see that a resource does not have such a flag, we know that
we are dealing with a resource coming from HMM/devm, and so
we do not need to do anything, as HMM/devm will take care of that part.

I have only compiled the code and have not tested it yet (I will do so next week).
I am sending this RFCv2 mainly because I would like to get feedback
and see whether the direction I took is the right one.

This time I left out [2] because I am working on it in a separate patch,
and it does not really belong to this patchset.

[1] https://patchwork.kernel.org/patch/10547445/ (Reported by David)
[2] https://patchwork.kernel.org/patch/10558723/

Oscar Salvador (2):
  mm/memory_hotplug: Add nid parameter to arch_remove_memory
  mm/memory_hotplug: Shrink spanned pages when offlining memory

 arch/ia64/mm/init.c            |   6 +-
 arch/powerpc/mm/mem.c          |  12 +---
 arch/s390/mm/init.c            |   2 +-
 arch/sh/mm/init.c              |   6 +-
 arch/x86/mm/init_32.c          |   6 +-
 arch/x86/mm/init_64.c          |  10 +--
 include/linux/memory_hotplug.h |  11 +++-
 kernel/memremap.c              |  16 ++---
 kernel/resource.c              |  16 +++++
 mm/hmm.c                       |  34 +++++-----
 mm/memory_hotplug.c            | 145 ++++++++++++++++++++++++++---------------
 mm/sparse.c                    |   4 +-
 12 files changed, 157 insertions(+), 111 deletions(-)
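
To illustrate the res->flags check described in the cover letter, here is a minimal userspace sketch. It is not the code from the patch: struct resource_sketch and res_is_hotplug_sysram() are hypothetical names used only for illustration, and the flag values are assumed to match include/linux/ioport.h, so they should be checked against the tree in use.

```c
#include <stdbool.h>

/*
 * Assumption: these values mirror include/linux/ioport.h.
 * Verify them against your kernel tree before relying on them.
 */
#define IORESOURCE_MEM        0x00000200UL
#define IORESOURCE_SYSRAM     0x01000000UL
#define IORESOURCE_SYSTEM_RAM (IORESOURCE_MEM | IORESOURCE_SYSRAM)

/* Hypothetical stand-in for struct resource; only the fields used here. */
struct resource_sketch {
	const char *name;
	unsigned long flags;
};

/*
 * Hypothetical helper: returns true when the resource carries
 * IORESOURCE_SYSRAM, i.e. it came through the regular memory-hotplug
 * path. A resource without the flag is assumed to be owned by
 * HMM/devm, so the caller would leave it alone.
 */
static bool res_is_hotplug_sysram(const struct resource_sketch *res)
{
	return (res->flags & IORESOURCE_SYSRAM) != 0;
}
```

With these definitions, a resource whose flags are IORESOURCE_SYSTEM_RAM passes the check, while one that only has IORESOURCE_MEM (as a region requested through devm_request_mem_region would, per the description above) does not, so the release path could return early for it.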