Message ID | 1579143668-27941-1-git-send-email-kernelfans@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/sparse: reset section's mem_map when fully deactivated | expand |
> On Jan 15, 2020, at 10:01 PM, Pingfan Liu <kernelfans@gmail.com> wrote: > > When fully deactivated, it is meaningless to keep the value of a section's > mem_map. And its mem_map will be reassigned during re-added. > > Beside this, it breaks the user space tool "makedumpfile", which makes > assumption that a hot-removed section having mem_map as NULL. Well, tools like makedumpfile and the crash utility always has to copy with low kernel implementation details changes like this. Why is it different this time? > > The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > trigger a crash, and save vmcore by makedumpfile > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > To: linux-mm@kvack.org > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Dan Williams <dan.j.williams@intel.com> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@kernel.org> > Cc: kexec@lists.infradead.org > Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> > --- > mm/sparse.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/sparse.c b/mm/sparse.c > index 3822ecb..fddac80 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -789,7 +789,7 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, > ms->usage = NULL; > } > memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); > - ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr); > + ms->section_mem_map = NULL; > } > > if (section_is_early && memmap) > -- > 2.7.5 > >
On Thu, Jan 16, 2020 at 11:18 AM Qian Cai <cai@lca.pw> wrote: > > > > > On Jan 15, 2020, at 10:01 PM, Pingfan Liu <kernelfans@gmail.com> wrote: > > > > When fully deactivated, it is meaningless to keep the value of a section's > > mem_map. And its mem_map will be reassigned during re-added. > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > assumption that a hot-removed section having mem_map as NULL. > > Well, tools like makedumpfile and the crash utility always has to copy with ^^^ cope ? > low kernel implementation details changes like this. Why is it different this time? Yeah, there are two direction. But as the first argument, it is meaningless to keep the value. Thanks, Pingfan
On Thu 16-01-20 11:01:08, Pingfan Liu wrote: > When fully deactivated, it is meaningless to keep the value of a section's > mem_map. And its mem_map will be reassigned during re-added. > > Beside this, it breaks the user space tool "makedumpfile", which makes > assumption that a hot-removed section having mem_map as NULL. We used to do that before ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug"). Dan was this an intentional change? > The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > trigger a crash, and save vmcore by makedumpfile > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > To: linux-mm@kvack.org > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Dan Williams <dan.j.williams@intel.com> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@kernel.org> > Cc: kexec@lists.infradead.org > Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> > --- > mm/sparse.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/sparse.c b/mm/sparse.c > index 3822ecb..fddac80 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -789,7 +789,7 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, > ms->usage = NULL; > } > memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); > - ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr); > + ms->section_mem_map = NULL; > } > > if (section_is_early && memmap) > -- > 2.7.5
On 16.01.20 04:01, Pingfan Liu wrote: > When fully deactivated, it is meaningless to keep the value of a section's > mem_map. And its mem_map will be reassigned during re-added. > > Beside this, it breaks the user space tool "makedumpfile", which makes > assumption that a hot-removed section having mem_map as NULL. > > The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > trigger a crash, and save vmcore by makedumpfile Are you using an up-to-date makedumfile and did kdump.service properly get reloaded on the udev events? I remember that this works. makedumpfile will not dump memory sections that a) are not marked offline (SECTION_IS_ONLINE) - after offlining b) are not part of an iomem resource - after memory unplug. The current code makes sure that sparse_decode_mem_map() will return NULL.
On 16.01.20 09:06, David Hildenbrand wrote: > On 16.01.20 04:01, Pingfan Liu wrote: >> When fully deactivated, it is meaningless to keep the value of a section's >> mem_map. And its mem_map will be reassigned during re-added. >> >> Beside this, it breaks the user space tool "makedumpfile", which makes >> assumption that a hot-removed section having mem_map as NULL. >> >> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , >> trigger a crash, and save vmcore by makedumpfile > > Are you using an up-to-date makedumfile and did kdump.service properly > get reloaded on the udev events? I remember that this works. > > makedumpfile will not dump memory sections that a) are not marked > offline (SECTION_IS_ONLINE) - after offlining b) are not part of an > iomem resource - after memory unplug. > > > The current code makes sure that sparse_decode_mem_map() will return NULL. > ... but it's only used at this very place. I think we should add a Fixes: tag, although this might be fixed as well in makedumpfile (so people are aware that patch broke something) Reviewed-by: David Hildenbrand <david@redhat.com>
On 01/16/20 at 09:14am, David Hildenbrand wrote: > On 16.01.20 09:06, David Hildenbrand wrote: > > On 16.01.20 04:01, Pingfan Liu wrote: > >> When fully deactivated, it is meaningless to keep the value of a section's > >> mem_map. And its mem_map will be reassigned during re-added. > >> > >> Beside this, it breaks the user space tool "makedumpfile", which makes > >> assumption that a hot-removed section having mem_map as NULL. > >> > >> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > >> trigger a crash, and save vmcore by makedumpfile > > > > Are you using an up-to-date makedumfile and did kdump.service properly > > get reloaded on the udev events? I remember that this works. > > > > makedumpfile will not dump memory sections that a) are not marked > > offline (SECTION_IS_ONLINE) - after offlining b) are not part of an > > iomem resource - after memory unplug. Makedumpfile seems to only check SECTION_MARKED_PRESENT. Then the NULL memmap will fail vmcore dumping, I guess. > > > > > > The current code makes sure that sparse_decode_mem_map() will return NULL. > > > > ... but it's only used at this very place. I think we should add a > Fixes: tag, although this might be fixed as well in makedumpfile (so > people are aware that patch broke something) Agree, it's worth fixing it too in makedumpfile side to enhance.
On 16.01.20 09:24, Baoquan He wrote: > On 01/16/20 at 09:14am, David Hildenbrand wrote: >> On 16.01.20 09:06, David Hildenbrand wrote: >>> On 16.01.20 04:01, Pingfan Liu wrote: >>>> When fully deactivated, it is meaningless to keep the value of a section's >>>> mem_map. And its mem_map will be reassigned during re-added. >>>> >>>> Beside this, it breaks the user space tool "makedumpfile", which makes >>>> assumption that a hot-removed section having mem_map as NULL. >>>> >>>> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , >>>> trigger a crash, and save vmcore by makedumpfile >>> >>> Are you using an up-to-date makedumfile and did kdump.service properly >>> get reloaded on the udev events? I remember that this works. >>> >>> makedumpfile will not dump memory sections that a) are not marked >>> offline (SECTION_IS_ONLINE) - after offlining b) are not part of an >>> iomem resource - after memory unplug. > > Makedumpfile seems to only check SECTION_MARKED_PRESENT. Then the NULL > memmap will fail vmcore dumping, I guess. But why should the section be marked SECTION_MARKED_PRESENT? After unplugging a section, the flag will be cleared. validate_mem_section() seems to iterate over all sections 0..num_section - 1 and validates them. section = nr_to_section(section_nr, mem_sec); if (section == NOT_KV_ADDR) { mem_map = NOT_MEMMAP_ADDR: } else { mem_map = section_mem_map_addr(section, &map_mask) if (!(map_mask & SECTION_MARKED_PRESENT)) { return FALSE; } if (mem_map == 0) { mem_map = NOT_MEMMAP_ADDR; } else { ... But here it is: section_mem_map_addr(): map = ULONG(mem_section + OFFSET(mem_section.section_mem_map)); mask = SECTION_MAP_MASK; *map_mask = map & ~mask; if (map == 0x0) *map_mask |= SECTION_MARKED_PRESENT; If the map is zero, the section is assumed to be present. The way the map is calculated, it will be zero due to the changes in the kernel.
On Thu, Jan 16, 2020 at 4:06 PM David Hildenbrand <david@redhat.com> wrote: > > On 16.01.20 04:01, Pingfan Liu wrote: > > When fully deactivated, it is meaningless to keep the value of a section's > > mem_map. And its mem_map will be reassigned during re-added. > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > assumption that a hot-removed section having mem_map as NULL. > > > > The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > > trigger a crash, and save vmcore by makedumpfile > > Are you using an up-to-date makedumfile and did kdump.service properly > get reloaded on the udev events? I remember that this works. I tried to get a machine and double-check it. The latest devel branch of makedumpfile can not work. > > makedumpfile will not dump memory sections that a) are not marked > offline (SECTION_IS_ONLINE) - after offlining b) are not part of an > iomem resource - after memory unplug. I think the current implementation of makedumpfile should improve the handling process. And my primary argument for this patch is a pattern like alloc/free. And when fully deactivated, it is meaningless to keep the section start pfn info > > > The current code makes sure that sparse_decode_mem_map() will return NULL. Do you mean the code in makedumpfile? If yes, it can. But makedumpfile related code has some bug, and should be improved to utilize this function. Thanks, Pingfan > > -- > Thanks, > > David / dhildenb >
On Thu, Jan 16, 2020 at 4:25 PM Baoquan He <bhe@redhat.com> wrote: > > On 01/16/20 at 09:14am, David Hildenbrand wrote: > > On 16.01.20 09:06, David Hildenbrand wrote: > > > On 16.01.20 04:01, Pingfan Liu wrote: > > >> When fully deactivated, it is meaningless to keep the value of a section's > > >> mem_map. And its mem_map will be reassigned during re-added. > > >> > > >> Beside this, it breaks the user space tool "makedumpfile", which makes > > >> assumption that a hot-removed section having mem_map as NULL. > > >> > > >> The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , > > >> trigger a crash, and save vmcore by makedumpfile > > > > > > Are you using an up-to-date makedumfile and did kdump.service properly > > > get reloaded on the udev events? I remember that this works. > > > > > > makedumpfile will not dump memory sections that a) are not marked > > > offline (SECTION_IS_ONLINE) - after offlining b) are not part of an > > > iomem resource - after memory unplug. > > Makedumpfile seems to only check SECTION_MARKED_PRESENT. Then the NULL > memmap will fail vmcore dumping, I guess. makedumpfile.c / validate_mem_section() has some trick, so it will not fail. > > > > > > > > > > The current code makes sure that sparse_decode_mem_map() will return NULL. > > > > > > > ... but it's only used at this very place. I think we should add a > > Fixes: tag, although this might be fixed as well in makedumpfile (so > > people are aware that patch broke something) > > Agree, it's worth fixing it too in makedumpfile side to enhance. > Yeah, I also agree :)
On Thu, Jan 16, 2020 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote: > > On Thu 16-01-20 11:01:08, Pingfan Liu wrote: > > When fully deactivated, it is meaningless to keep the value of a section's > > mem_map. And its mem_map will be reassigned during re-added. > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > assumption that a hot-removed section having mem_map as NULL. > > We used to do that before ba72b4c8cf60 ("mm/sparsemem: support > sub-section hotplug"). Dan was this an intentional change? I do not know the purpose of this. But the change just leave section start pfn in fully deactivated section_mem_map, and not used any more. Thanks, Pingfan
On Thu, Jan 16, 2020 at 10:23 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > On Thu, Jan 16, 2020 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote: > > > > On Thu 16-01-20 11:01:08, Pingfan Liu wrote: > > > When fully deactivated, it is meaningless to keep the value of a section's > > > mem_map. And its mem_map will be reassigned during re-added. > > > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > > assumption that a hot-removed section having mem_map as NULL. > > > > We used to do that before ba72b4c8cf60 ("mm/sparsemem: support > > sub-section hotplug"). Dan was this an intentional change? > I do not know the purpose of this. But the change just leave section > start pfn in fully deactivated section_mem_map, and not used any more. Not intentional, IIRC at the time I had convinced myself that the value would always be translated by sparse_decode_mem_map(), so I thought it could be hiding NULL de-references. I don't see any harm in the patch.
On Thu 16-01-20 23:14:02, Dan Williams wrote: > On Thu, Jan 16, 2020 at 10:23 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > > > On Thu, Jan 16, 2020 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote: > > > > > > On Thu 16-01-20 11:01:08, Pingfan Liu wrote: > > > > When fully deactivated, it is meaningless to keep the value of a section's > > > > mem_map. And its mem_map will be reassigned during re-added. > > > > > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > > > assumption that a hot-removed section having mem_map as NULL. > > > > > > We used to do that before ba72b4c8cf60 ("mm/sparsemem: support > > > sub-section hotplug"). Dan was this an intentional change? > > I do not know the purpose of this. But the change just leave section > > start pfn in fully deactivated section_mem_map, and not used any more. > > Not intentional, IIRC at the time I had convinced myself that the > value would always be translated by sparse_decode_mem_map(), so I > thought it could be hiding NULL de-references. I don't see any harm > in the patch. Thanks for the confirmation. It would be great to have this in the changelog.
On Fri, Jan 17, 2020 at 3:47 PM Michal Hocko <mhocko@suse.com> wrote: > > On Thu 16-01-20 23:14:02, Dan Williams wrote: > > On Thu, Jan 16, 2020 at 10:23 PM Pingfan Liu <kernelfans@gmail.com> wrote: > > > > > > On Thu, Jan 16, 2020 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote: > > > > > > > > On Thu 16-01-20 11:01:08, Pingfan Liu wrote: > > > > > When fully deactivated, it is meaningless to keep the value of a section's > > > > > mem_map. And its mem_map will be reassigned during re-added. > > > > > > > > > > Beside this, it breaks the user space tool "makedumpfile", which makes > > > > > assumption that a hot-removed section having mem_map as NULL. > > > > > > > > We used to do that before ba72b4c8cf60 ("mm/sparsemem: support > > > > sub-section hotplug"). Dan was this an intentional change? > > > I do not know the purpose of this. But the change just leave section > > > start pfn in fully deactivated section_mem_map, and not used any more. > > > > Not intentional, IIRC at the time I had convinced myself that the > > value would always be translated by sparse_decode_mem_map(), so I > > thought it could be hiding NULL de-references. I don't see any harm > > in the patch. > > Thanks for the confirmation. It would be great to have this in the > changelog. Should I post V2 with this commit log? Thanks, Pingfan
On 17.01.20 10:49, Pingfan Liu wrote: > On Fri, Jan 17, 2020 at 3:47 PM Michal Hocko <mhocko@suse.com> wrote: >> >> On Thu 16-01-20 23:14:02, Dan Williams wrote: >>> On Thu, Jan 16, 2020 at 10:23 PM Pingfan Liu <kernelfans@gmail.com> wrote: >>>> >>>> On Thu, Jan 16, 2020 at 3:50 PM Michal Hocko <mhocko@suse.com> wrote: >>>>> >>>>> On Thu 16-01-20 11:01:08, Pingfan Liu wrote: >>>>>> When fully deactivated, it is meaningless to keep the value of a section's >>>>>> mem_map. And its mem_map will be reassigned during re-added. >>>>>> >>>>>> Beside this, it breaks the user space tool "makedumpfile", which makes >>>>>> assumption that a hot-removed section having mem_map as NULL. >>>>> >>>>> We used to do that before ba72b4c8cf60 ("mm/sparsemem: support >>>>> sub-section hotplug"). Dan was this an intentional change? >>>> I do not know the purpose of this. But the change just leave section >>>> start pfn in fully deactivated section_mem_map, and not used any more. >>> >>> Not intentional, IIRC at the time I had convinced myself that the >>> value would always be translated by sparse_decode_mem_map(), so I >>> thought it could be hiding NULL de-references. I don't see any harm >>> in the patch. >> >> Thanks for the confirmation. It would be great to have this in the >> changelog. > Should I post V2 with this commit log? > > Thanks, > Pingfan > Id' say yes, and maybe also add some details why this makes makedumpfile bail out now, so people looking at this commit only know what's happening.
On Fri, Jan 17, 2020 at 5:53 PM David Hildenbrand <david@redhat.com> wrote: > [...] > >> Thanks for the confirmation. It would be great to have this in the > >> changelog. > > Should I post V2 with this commit log? > > > > Thanks, > > Pingfan > > > > Id' say yes, and maybe also add some details why this makes makedumpfile > bail out now, so people looking at this commit only know what's happening. OK, thanks for the suggestion. I have sent another patch for makedumpfile and cc you. Thanks, Pingfan
diff --git a/mm/sparse.c b/mm/sparse.c index 3822ecb..fddac80 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -789,7 +789,7 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, ms->usage = NULL; } memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); - ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr); + ms->section_mem_map = NULL; } if (section_is_early && memmap)
When fully deactivated, it is meaningless to keep the value of a section's mem_map. And its mem_map will be reassigned during re-added. Beside this, it breaks the user space tool "makedumpfile", which makes assumption that a hot-removed section having mem_map as NULL. The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , trigger a crash, and save vmcore by makedumpfile Signed-off-by: Pingfan Liu <kernelfans@gmail.com> To: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: kexec@lists.infradead.org Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> --- mm/sparse.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)