Message ID: 1606706992-26656-4-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State: New, archived
Series: mm/hotplug: Pre-validate the address range with platform
On Mon, Nov 30, 2020 at 08:59:52AM +0530, Anshuman Khandual wrote:
> This overrides arch_get_mappable_range() on s390 platform and drops now
> redundant similar check in vmem_add_mapping(). This compensates by adding
> a new check in __segment_load() to preserve the existing functionality.
>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: linux-s390@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/s390/mm/extmem.c |  5 +++++
>  arch/s390/mm/vmem.c   | 13 +++++++++----
>  2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
> index 5060956b8e7d..cc055a78f7b6 100644
> --- a/arch/s390/mm/extmem.c
> +++ b/arch/s390/mm/extmem.c
> @@ -337,6 +337,11 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
>  		goto out_free_resource;
>  	}
>
> +	if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
> +		rc = -ERANGE;
> +		goto out_resource;
> +	}
> +
>  	rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
>  	if (rc)
>  		goto out_resource;
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index b239f2ba93b0..06dddcc0ce06 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -532,14 +532,19 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
>  	mutex_unlock(&vmem_mutex);
>  }
>
> +struct range arch_get_mappable_range(void)
> +{
> +	struct range memhp_range;
> +
> +	memhp_range.start = 0;
> +	memhp_range.end = VMEM_MAX_PHYS;
> +	return memhp_range;
> +}
> +
>  int vmem_add_mapping(unsigned long start, unsigned long size)
>  {
>  	int ret;
>
> -	if (start + size > VMEM_MAX_PHYS ||
> -	    start + size < start)
> -		return -ERANGE;
> -

I really fail to see how this could be considered an improvement for
s390. Especially I do not like that the (central) range check is now
moved to the caller (__segment_load). Which would mean potential
additional future callers would have to duplicate that code as well.
On 12/3/20 2:02 AM, Heiko Carstens wrote:
> On Mon, Nov 30, 2020 at 08:59:52AM +0530, Anshuman Khandual wrote:
>> This overrides arch_get_mappable_range() on s390 platform and drops now
>> redundant similar check in vmem_add_mapping(). This compensates by adding
>> a new check in __segment_load() to preserve the existing functionality.
>>
>> [...]
>
> I really fail to see how this could be considered an improvement for
> s390. Especially I do not like that the (central) range check is now
> moved to the caller (__segment_load). Which would mean potential
> additional future callers would have to duplicate that code as well.

The physical range check is being moved to the generic hotplug code
via arch_get_mappable_range() instead, making the existing check in
vmem_add_mapping() redundant. Dropping the check there necessitates
adding back a similar check in __segment_load(). Otherwise there
will be a loss of functionality in terms of range checking.

Maybe we could just keep the existing check in vmem_add_mapping()
as well in order to avoid this movement, but then it would be a
redundant check in every hotplug path.

So I guess the choice is to either have redundant range checks in
all hotplug paths, or have future internal callers of vmem_add_mapping()
take care of the range check.
On Thu, Dec 03, 2020 at 06:03:00AM +0530, Anshuman Khandual wrote:
>>> [...]
>>
>> I really fail to see how this could be considered an improvement for
>> s390. Especially I do not like that the (central) range check is now
>> moved to the caller (__segment_load). Which would mean potential
>> additional future callers would have to duplicate that code as well.
>
> The physical range check is being moved to the generic hotplug code
> via arch_get_mappable_range() instead, making the existing check in
> vmem_add_mapping() redundant. Dropping the check there necessitates
> adding back a similar check in __segment_load(). Otherwise there
> will be a loss of functionality in terms of range checking.
>
> Maybe we could just keep the existing check in vmem_add_mapping()
> as well in order to avoid this movement, but then it would be a
> redundant check in every hotplug path.
>
> So I guess the choice is to either have redundant range checks in
> all hotplug paths, or have future internal callers of vmem_add_mapping()
> take care of the range check.

The problem I have with this current approach from an architecture
perspective: we end up having two completely different methods which
are doing the same and must be kept in sync. This might be obvious
looking at this patch, but I'm sure this will go out-of-sync (aka
broken) sooner or later.

Therefore I would really like to see a single method to do the range
checking. Maybe you could add a callback into architecture code, so
that such an architecture specific function could also be used
elsewhere. Dunno.
On 03.12.20 12:51, Heiko Carstens wrote:
> On Thu, Dec 03, 2020 at 06:03:00AM +0530, Anshuman Khandual wrote:
>> [...]
>
> The problem I have with this current approach from an architecture
> perspective: we end up having two completely different methods which
> are doing the same and must be kept in sync. This might be obvious
> looking at this patch, but I'm sure this will go out-of-sync (aka
> broken) sooner or later.

Exactly, there should be one function only; that was the whole idea of
arch_get_mappable_range().

> Therefore I would really like to see a single method to do the range
> checking. Maybe you could add a callback into architecture code, so
> that such an architecture specific function could also be used
> elsewhere. Dunno.

I think we can just switch to using "memhp_range_allowed()" here then
after implementing arch_get_mappable_range().

Doesn't hurt to double check in vmem_add_mapping() - especially to keep
extmem working without changes. At least for callers of memory hotplug
it's then clear which values actually won't fail deep down in arch code.
On 12/3/20 5:31 PM, David Hildenbrand wrote:
> On 03.12.20 12:51, Heiko Carstens wrote:
>> [...]
>>
>> The problem I have with this current approach from an architecture
>> perspective: we end up having two completely different methods which
>> are doing the same and must be kept in sync. This might be obvious
>> looking at this patch, but I'm sure this will go out-of-sync (aka
>> broken) sooner or later.
>
> Exactly, there should be one function only; that was the whole idea of
> arch_get_mappable_range().
>
>> Therefore I would really like to see a single method to do the range
>> checking. Maybe you could add a callback into architecture code, so
>> that such an architecture specific function could also be used
>> elsewhere. Dunno.
>
> I think we can just switch to using "memhp_range_allowed()" here then
> after implementing arch_get_mappable_range().
>
> Doesn't hurt to double check in vmem_add_mapping() - especially to keep
> extmem working without changes. At least for callers of memory hotplug
> it's then clear which values actually won't fail deep down in arch code.

But there is a small problem here. memhp_range_allowed() is now defined
and available with CONFIG_MEMORY_HOTPLUG, whereas vmem_add_mapping() and
__segment_load() are generally available without any config dependency.
So if CONFIG_MEMORY_HOTPLUG is not enabled there will be a build failure
in vmem_add_mapping() for the memhp_range_allowed() symbol.

We could just move the VM_BUG_ON(!memhp_range_allowed(start, size, 1)) check
from vmem_add_mapping() to arch_add_memory() like on the arm64 platform. But
then __segment_load() would need that additional new check to compensate,
as proposed earlier.

Also, leaving vmem_add_mapping() and __segment_load() unchanged will cause
the address range check to be called three times on the hotplug path, i.e.

1. register_memory_resource()
2. arch_add_memory()
3. vmem_add_mapping()

Moving the memhp_range_allowed() check inside arch_add_memory() seems better
and consistent with arm64. Also, in the future, any platform which chooses
to override arch_get_mappable_range() will have this additional VM_BUG_ON()
in its arch_add_memory().
On 07.12.20 05:38, Anshuman Khandual wrote:
> On 12/3/20 5:31 PM, David Hildenbrand wrote:
>> [...]
>>
>> I think we can just switch to using "memhp_range_allowed()" here then
>> after implementing arch_get_mappable_range().
>>
>> Doesn't hurt to double check in vmem_add_mapping() - especially to keep
>> extmem working without changes. At least for callers of memory hotplug
>> it's then clear which values actually won't fail deep down in arch code.
>
> But there is a small problem here. memhp_range_allowed() is now defined
> and available with CONFIG_MEMORY_HOTPLUG, whereas vmem_add_mapping() and
> __segment_load() are generally available without any config dependency.
> So if CONFIG_MEMORY_HOTPLUG is not enabled there will be a build failure
> in vmem_add_mapping() for the memhp_range_allowed() symbol.
>
> We could just move the VM_BUG_ON(!memhp_range_allowed(start, size, 1)) check
> from vmem_add_mapping() to arch_add_memory() like on the arm64 platform. But
> then __segment_load() would need that additional new check to compensate,
> as proposed earlier.
>
> Also, leaving vmem_add_mapping() and __segment_load() unchanged will cause
> the address range check to be called three times on the hotplug path, i.e.
>
> 1. register_memory_resource()
> 2. arch_add_memory()
> 3. vmem_add_mapping()
>
> Moving the memhp_range_allowed() check inside arch_add_memory() seems better
> and consistent with arm64. Also, in the future, any platform which chooses
> to override arch_get_mappable_range() will have this additional VM_BUG_ON()
> in its arch_add_memory().

Yeah, it might not make sense to add these checks all over the place.
The important part is that

1. There is a check somewhere (even if it's deep down in arch code)
2. There is an obvious way for callers to find out what valid values are.

I guess it would be good enough to

a) Factor out getting arch ranges into arch_get_mappable_range()
b) Provide memhp_get_pluggable_range()

Both changes only make sense with an in-tree user. I'm planning on using
this functionality in virtio-mem code. I can pick up your patches, drop
the superfluous checks, and use it from virtio-mem code. Makes sense
(BTW, looks like we'll see aarch64 support for virtio-mem soon)?
On 12/7/20 2:33 PM, David Hildenbrand wrote:
> On 07.12.20 05:38, Anshuman Khandual wrote:
>> [...]
>>
>> Moving the memhp_range_allowed() check inside arch_add_memory() seems better
>> and consistent with arm64. Also, in the future, any platform which chooses
>> to override arch_get_mappable_range() will have this additional VM_BUG_ON()
>> in its arch_add_memory().
>
> Yeah, it might not make sense to add these checks all over the place.
> The important part is that
>
> 1. There is a check somewhere (even if it's deep down in arch code)
> 2. There is an obvious way for callers to find out what valid values are.
>
> I guess it would be good enough to
>
> a) Factor out getting arch ranges into arch_get_mappable_range()
> b) Provide memhp_get_pluggable_range()

Have posted V1 earlier in the day, which hopefully accommodates all
previous suggestions, but otherwise do let me know if anything else still
needs to be improved upon.

https://lore.kernel.org/linux-mm/1607400978-31595-1-git-send-email-anshuman.khandual@arm.com/

> Both changes only make sense with an in-tree user. I'm planning on using
> this functionality in virtio-mem code. I can pick up your patches, drop
> the superfluous checks, and use it from virtio-mem code. Makes sense
> (BTW, looks like we'll see aarch64 support for virtio-mem soon)?

I have not been following virtio-mem closely. But is there something pending
on the arm64 platform which prevents virtio-mem enablement?
>> Both changes only make sense with an in-tree user. I'm planning on using
>> this functionality in virtio-mem code. I can pick up your patches, drop
>> the superfluous checks, and use it from virtio-mem code. Makes sense
>> (BTW, looks like we'll see aarch64 support for virtio-mem soon)?
>
> I have not been following virtio-mem closely. But is there something pending
> on the arm64 platform which prevents virtio-mem enablement?

Regarding enablement, I expect things to be working out of the box mostly.
Jonathan is currently doing some testing and wants to send a simple unlock
patch once done. [1]

Now, there are some things to improve in the future. virtio-mem adds/removes
individual Linux memory blocks and logically plugs/unplugs
MAX_ORDER - 1/pageblock_order pages inside Linux memory blocks.

1. memblock

On arm64 and powerpc, we create/delete memblocks when adding/removing memory,
which is suboptimal (and the code is quite fragile as we don't handle
errors ...). Hotplugged memory never has holes, so we can tweak relevant
code to not check via the memblock api. For example, pfn_valid() only has
to check for memblock_is_map_memory() in case of !early_section() -
otherwise it can just fall back to our generic pfn_valid() function.

2. MAX_ORDER - 1 / pageblock_order

With 64k base pages, virtio-mem can only logically plug/unplug in 512MB
granularity, which is sub-optimal and inflexible. 4/2MB would be much
better - however this would require always using 2MB THP on arm64 (IIRC
via "cont" bits). Luckily, only some distributions use 64k base pages as
default nowadays ... :)

3. Section size

virtio-mem benefits from small section sizes. Currently, we have 1G. With
4k base pages we could easily reduce it to something like what x86 has
(128 MB) - and I remember discussions regarding that already in another
(IIRC NVDIMM / DIMM) context. Again, with 64k base pages we cannot go
below 512 MB right now.

[1] https://lkml.kernel.org/r/20201125145659.00004b3e@Huawei.com
diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
index 5060956b8e7d..cc055a78f7b6 100644
--- a/arch/s390/mm/extmem.c
+++ b/arch/s390/mm/extmem.c
@@ -337,6 +337,11 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
 		goto out_free_resource;
 	}
 
+	if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
+		rc = -ERANGE;
+		goto out_resource;
+	}
+
 	rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
 	if (rc)
 		goto out_resource;
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index b239f2ba93b0..06dddcc0ce06 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -532,14 +532,19 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
 	mutex_unlock(&vmem_mutex);
 }
 
+struct range arch_get_mappable_range(void)
+{
+	struct range memhp_range;
+
+	memhp_range.start = 0;
+	memhp_range.end = VMEM_MAX_PHYS;
+	return memhp_range;
+}
+
 int vmem_add_mapping(unsigned long start, unsigned long size)
 {
 	int ret;
 
-	if (start + size > VMEM_MAX_PHYS ||
-	    start + size < start)
-		return -ERANGE;
-
 	mutex_lock(&vmem_mutex);
 	ret = vmem_add_range(start, size);
 	if (ret)
This overrides arch_get_mappable_range() on the s390 platform and drops the
now redundant similar check in vmem_add_mapping(). It compensates by adding
a new check in __segment_load() to preserve the existing functionality.

Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/s390/mm/extmem.c |  5 +++++
 arch/s390/mm/vmem.c   | 13 +++++++++----
 2 files changed, 14 insertions(+), 4 deletions(-)