Message ID | 20230612210712.683175-5-eric.devolder@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | crash: Kernel handling of CPU and memory hot un/plug | expand |
On Mon, Jun 12, 2023 at 05:07:08PM -0400, Eric DeVolder wrote: > Introduce the crash_hotplug attribute for memory and CPUs for > use by userspace. These attributes directly facilitate the udev > rule for managing userspace re-loading of the crash kernel upon > hot un/plug changes. > > For memory, expose the crash_hotplug attribute to the > /sys/devices/system/memory directory. For example: > > # udevadm info --attribute-walk /sys/devices/system/memory/memory81 > looking at device '/devices/system/memory/memory81': > KERNEL=="memory81" > SUBSYSTEM=="memory" > DRIVER=="" > ATTR{online}=="1" > ATTR{phys_device}=="0" > ATTR{phys_index}=="00000051" > ATTR{removable}=="1" > ATTR{state}=="online" > ATTR{valid_zones}=="Movable" > > looking at parent device '/devices/system/memory': > KERNELS=="memory" > SUBSYSTEMS=="" > DRIVERS=="" > ATTRS{auto_online_blocks}=="offline" > ATTRS{block_size_bytes}=="8000000" > ATTRS{crash_hotplug}=="1" > > For CPUs, expose the crash_hotplug attribute to the > /sys/devices/system/cpu directory. For example: > > # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0 > looking at device '/devices/system/cpu/cpu0': > KERNEL=="cpu0" > SUBSYSTEM=="cpu" > DRIVER=="processor" > ATTR{crash_notes}=="277c38600" > ATTR{crash_notes_size}=="368" > ATTR{online}=="1" > > looking at parent device '/devices/system/cpu': > KERNELS=="cpu" > SUBSYSTEMS=="" > DRIVERS=="" > ATTRS{crash_hotplug}=="1" > ATTRS{isolated}=="" > ATTRS{kernel_max}=="8191" > ATTRS{nohz_full}==" (null)" > ATTRS{offline}=="4-7" > ATTRS{online}=="0-3" > ATTRS{possible}=="0-7" > ATTRS{present}=="0-3" > > With these sysfs attributes in place, it is possible to efficiently > instruct the udev rule to skip crash kernel reloading for kernels > configured with crash hotplug support. > > For example, the following is the proposed udev rule change for RHEL > system 98-kexec.rules (as the first lines of the rule file): > > # The kernel updates the crash elfcorehdr for CPU and memory changes > SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" > SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" > > When examined in the context of 98-kexec.rules, the above rules > test if crash_hotplug is set, and if so, the userspace initiated > unload-then-reload of the crash kernel is skipped. > > CPU and memory checks are separated in accordance with > CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. > If an architecture supports, for example, memory hotplug but not > CPU hotplug, then the /sys/devices/system/memory/crash_hotplug > attribute file is present, but the /sys/devices/system/cpu/crash_hotplug > attribute file will NOT be present. Thus the udev rule skips > userspace processing of memory hot un/plug events, but the udev > rule will evaluate false for CPU events, thus allowing userspace to > process CPU hot un/plug events (ie the unload-then-reload of the kdump > capture kernel). > > Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> > Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> > Acked-by: Hari Bathini <hbathini@linux.ibm.com> > Acked-by: Baoquan He <bhe@redhat.com> > --- > .../admin-guide/mm/memory-hotplug.rst | 8 ++++++++ > Documentation/core-api/cpu_hotplug.rst | 18 ++++++++++++++++++ > drivers/base/cpu.c | 14 ++++++++++++++ > drivers/base/memory.c | 13 +++++++++++++ > include/linux/kexec.h | 8 ++++++++ > 5 files changed, 61 insertions(+) > > diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst > index 1b02fe5807cc..eb99d79223a3 100644 > --- a/Documentation/admin-guide/mm/memory-hotplug.rst > +++ b/Documentation/admin-guide/mm/memory-hotplug.rst > @@ -291,6 +291,14 @@ The following files are currently defined: > Availability depends on the CONFIG_ARCH_MEMORY_PROBE > kernel configuration option. > ``uevent`` read-write: generic udev file for device subsystems. > +``crash_hotplug`` read-only: when changes to the system memory map > + occur due to hot un/plug of memory, this file contains > + '1' if the kernel updates the kdump capture kernel memory > + map itself (via elfcorehdr), or '0' if userspace must update > + the kdump capture kernel memory map. > + > + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel > + configuration option. > ====================== ========================================================= > > .. note:: > diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst > index f75778d37488..0c8dc3fe5f94 100644 > --- a/Documentation/core-api/cpu_hotplug.rst > +++ b/Documentation/core-api/cpu_hotplug.rst > @@ -750,6 +750,24 @@ will receive all events. A script like:: > > can process the event further. > > +When changes to the CPUs in the system occur, the sysfs file > +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel > +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), > +or '0' if userspace must update the kdump capture kernel list of CPUs. > + > +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration > +option. > + > +To skip userspace processing of CPU hot un/plug events for kdump > +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs > +file can be used in a udev rule as follows: > + > + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" > + > +For a cpu hot un/plug event, if the architecture supports kernel updates > +of the elfcorehdr (which contains the list of CPUs), then the rule skips > +the unload-then-reload of the kdump capture kernel. > + > Kernel Inline Documentations Reference > ====================================== > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index c1815b9dae68..06a0c22b37b8 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -282,6 +282,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev, > static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); > #endif > > +#ifdef CONFIG_HOTPLUG_CPU > +#include <linux/kexec.h> > +static ssize_t crash_hotplug_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + return sprintf(buf, "%d\n", crash_hotplug_cpu_support()); > +} > +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); > +#endif > + > static void cpu_device_release(struct device *dev) > { > /* > @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = { > #ifdef CONFIG_NO_HZ_FULL > &dev_attr_nohz_full.attr, > #endif > +#ifdef CONFIG_HOTPLUG_CPU > + &dev_attr_crash_hotplug.attr, > +#endif > #ifdef CONFIG_GENERIC_CPU_AUTOPROBE > &dev_attr_modalias.attr, > #endif > diff --git a/drivers/base/memory.c b/drivers/base/memory.c > index b456ac213610..24b8ef4c830c 100644 > --- a/drivers/base/memory.c > +++ b/drivers/base/memory.c > @@ -490,6 +490,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, > > static DEVICE_ATTR_RW(auto_online_blocks); > > +#ifdef CONFIG_MEMORY_HOTPLUG > +#include <linux/kexec.h> > +static ssize_t crash_hotplug_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return sprintf(buf, "%d\n", crash_hotplug_memory_support()); > +} This sysfs file has to be documented in Documentation/ABI/ right? And did you use checkpatch? It should have told you to use sysfs_emit() instead... > +static DEVICE_ATTR_RO(crash_hotplug); > +#endif All of these #ifdefs should all be removed and instead use the is_visible() callback to determine if the attribute is shown or not, using the IS_ENABLED() test in the function. thanks, greg k-h
On 6/13/23 03:03, Greg KH wrote: > On Mon, Jun 12, 2023 at 05:07:08PM -0400, Eric DeVolder wrote: >> Introduce the crash_hotplug attribute for memory and CPUs for >> use by userspace. These attributes directly facilitate the udev >> rule for managing userspace re-loading of the crash kernel upon >> hot un/plug changes. >> >> For memory, expose the crash_hotplug attribute to the >> /sys/devices/system/memory directory. For example: >> >> # udevadm info --attribute-walk /sys/devices/system/memory/memory81 >> looking at device '/devices/system/memory/memory81': >> KERNEL=="memory81" >> SUBSYSTEM=="memory" >> DRIVER=="" >> ATTR{online}=="1" >> ATTR{phys_device}=="0" >> ATTR{phys_index}=="00000051" >> ATTR{removable}=="1" >> ATTR{state}=="online" >> ATTR{valid_zones}=="Movable" >> >> looking at parent device '/devices/system/memory': >> KERNELS=="memory" >> SUBSYSTEMS=="" >> DRIVERS=="" >> ATTRS{auto_online_blocks}=="offline" >> ATTRS{block_size_bytes}=="8000000" >> ATTRS{crash_hotplug}=="1" >> >> For CPUs, expose the crash_hotplug attribute to the >> /sys/devices/system/cpu directory. For example: >> >> # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0 >> looking at device '/devices/system/cpu/cpu0': >> KERNEL=="cpu0" >> SUBSYSTEM=="cpu" >> DRIVER=="processor" >> ATTR{crash_notes}=="277c38600" >> ATTR{crash_notes_size}=="368" >> ATTR{online}=="1" >> >> looking at parent device '/devices/system/cpu': >> KERNELS=="cpu" >> SUBSYSTEMS=="" >> DRIVERS=="" >> ATTRS{crash_hotplug}=="1" >> ATTRS{isolated}=="" >> ATTRS{kernel_max}=="8191" >> ATTRS{nohz_full}==" (null)" >> ATTRS{offline}=="4-7" >> ATTRS{online}=="0-3" >> ATTRS{possible}=="0-7" >> ATTRS{present}=="0-3" >> >> With these sysfs attributes in place, it is possible to efficiently >> instruct the udev rule to skip crash kernel reloading for kernels >> configured with crash hotplug support. >> >> For example, the following is the proposed udev rule change for RHEL >> system 98-kexec.rules (as the first lines of the rule file): >> >> # The kernel updates the crash elfcorehdr for CPU and memory changes >> SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> >> When examined in the context of 98-kexec.rules, the above rules >> test if crash_hotplug is set, and if so, the userspace initiated >> unload-then-reload of the crash kernel is skipped. >> >> CPU and memory checks are separated in accordance with >> CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. >> If an architecture supports, for example, memory hotplug but not >> CPU hotplug, then the /sys/devices/system/memory/crash_hotplug >> attribute file is present, but the /sys/devices/system/cpu/crash_hotplug >> attribute file will NOT be present. Thus the udev rule skips >> userspace processing of memory hot un/plug events, but the udev >> rule will evaluate false for CPU events, thus allowing userspace to >> process CPU hot un/plug events (ie the unload-then-reload of the kdump >> capture kernel). >> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> >> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> >> Acked-by: Hari Bathini <hbathini@linux.ibm.com> >> Acked-by: Baoquan He <bhe@redhat.com> >> --- >> .../admin-guide/mm/memory-hotplug.rst | 8 ++++++++ >> Documentation/core-api/cpu_hotplug.rst | 18 ++++++++++++++++++ >> drivers/base/cpu.c | 14 ++++++++++++++ >> drivers/base/memory.c | 13 +++++++++++++ >> include/linux/kexec.h | 8 ++++++++ >> 5 files changed, 61 insertions(+) >> >> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst >> index 1b02fe5807cc..eb99d79223a3 100644 >> --- a/Documentation/admin-guide/mm/memory-hotplug.rst >> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst >> @@ -291,6 +291,14 @@ The following files are currently defined: >> Availability depends on the CONFIG_ARCH_MEMORY_PROBE >> kernel configuration option. >> ``uevent`` read-write: generic udev file for device subsystems. >> +``crash_hotplug`` read-only: when changes to the system memory map >> + occur due to hot un/plug of memory, this file contains >> + '1' if the kernel updates the kdump capture kernel memory >> + map itself (via elfcorehdr), or '0' if userspace must update >> + the kdump capture kernel memory map. >> + >> + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel >> + configuration option. >> ====================== ========================================================= >> >> .. note:: >> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst >> index f75778d37488..0c8dc3fe5f94 100644 >> --- a/Documentation/core-api/cpu_hotplug.rst >> +++ b/Documentation/core-api/cpu_hotplug.rst >> @@ -750,6 +750,24 @@ will receive all events. A script like:: >> >> can process the event further. >> >> +When changes to the CPUs in the system occur, the sysfs file >> +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel >> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), >> +or '0' if userspace must update the kdump capture kernel list of CPUs. >> + >> +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration >> +option. >> + >> +To skip userspace processing of CPU hot un/plug events for kdump >> +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs >> +file can be used in a udev rule as follows: >> + >> + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> + >> +For a cpu hot un/plug event, if the architecture supports kernel updates >> +of the elfcorehdr (which contains the list of CPUs), then the rule skips >> +the unload-then-reload of the kdump capture kernel. >> + >> Kernel Inline Documentations Reference >> ====================================== >> >> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c >> index c1815b9dae68..06a0c22b37b8 100644 >> --- a/drivers/base/cpu.c >> +++ b/drivers/base/cpu.c >> @@ -282,6 +282,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev, >> static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); >> #endif >> >> +#ifdef CONFIG_HOTPLUG_CPU >> +#include <linux/kexec.h> >> +static ssize_t crash_hotplug_show(struct device *dev, >> + struct device_attribute *attr, >> + char *buf) >> +{ >> + return sprintf(buf, "%d\n", crash_hotplug_cpu_support()); >> +} >> +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); >> +#endif >> + >> static void cpu_device_release(struct device *dev) >> { >> /* >> @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = { >> #ifdef CONFIG_NO_HZ_FULL >> &dev_attr_nohz_full.attr, >> #endif >> +#ifdef CONFIG_HOTPLUG_CPU >> + &dev_attr_crash_hotplug.attr, >> +#endif >> #ifdef CONFIG_GENERIC_CPU_AUTOPROBE >> &dev_attr_modalias.attr, >> #endif >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c >> index b456ac213610..24b8ef4c830c 100644 >> --- a/drivers/base/memory.c >> +++ b/drivers/base/memory.c >> @@ -490,6 +490,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, >> >> static DEVICE_ATTR_RW(auto_online_blocks); >> >> +#ifdef CONFIG_MEMORY_HOTPLUG >> +#include <linux/kexec.h> >> +static ssize_t crash_hotplug_show(struct device *dev, >> + struct device_attribute *attr, char *buf) >> +{ >> + return sprintf(buf, "%d\n", crash_hotplug_memory_support()); >> +} > > This sysfs file has to be documented in Documentation/ABI/ right? I will add these entries. > > And did you use checkpatch? It should have told you to use sysfs_emit() > instead... I did use checkpatch, and it did not reveal that. [root@localhost linux]# ./scripts/checkpatch.pl --patch ../v23/0004-crash-memory-and-CPU-hotplug-sysfs-attributes.patch total: 0 errors, 0 warnings, 103 lines checked ../v23/0004-crash-memory-and-CPU-hotplug-sysfs-attributes.patch has no obvious style problems and is ready for submission. This is not the first time it's been suggested to me that checkpatch should have caught a problem; is there a different invocation than what I'm doing? > >> +static DEVICE_ATTR_RO(crash_hotplug); >> +#endif > > All of these #ifdefs should all be removed and instead use the > is_visible() callback to determine if the attribute is shown or not, > using the IS_ENABLED() test in the function. ok, I'll correct this. Thank you for looking at this! eric > > thanks, > > greg k-h
On 6/13/23 10:24, Eric DeVolder wrote: > > > On 6/13/23 03:03, Greg KH wrote: >> On Mon, Jun 12, 2023 at 05:07:08PM -0400, Eric DeVolder wrote: >>> Introduce the crash_hotplug attribute for memory and CPUs for >>> use by userspace. These attributes directly facilitate the udev >>> rule for managing userspace re-loading of the crash kernel upon >>> hot un/plug changes. >>> >>> For memory, expose the crash_hotplug attribute to the >>> /sys/devices/system/memory directory. For example: >>> >>> # udevadm info --attribute-walk /sys/devices/system/memory/memory81 >>> looking at device '/devices/system/memory/memory81': >>> KERNEL=="memory81" >>> SUBSYSTEM=="memory" >>> DRIVER=="" >>> ATTR{online}=="1" >>> ATTR{phys_device}=="0" >>> ATTR{phys_index}=="00000051" >>> ATTR{removable}=="1" >>> ATTR{state}=="online" >>> ATTR{valid_zones}=="Movable" >>> >>> looking at parent device '/devices/system/memory': >>> KERNELS=="memory" >>> SUBSYSTEMS=="" >>> DRIVERS=="" >>> ATTRS{auto_online_blocks}=="offline" >>> ATTRS{block_size_bytes}=="8000000" >>> ATTRS{crash_hotplug}=="1" >>> >>> For CPUs, expose the crash_hotplug attribute to the >>> /sys/devices/system/cpu directory. For example: >>> >>> # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0 >>> looking at device '/devices/system/cpu/cpu0': >>> KERNEL=="cpu0" >>> SUBSYSTEM=="cpu" >>> DRIVER=="processor" >>> ATTR{crash_notes}=="277c38600" >>> ATTR{crash_notes_size}=="368" >>> ATTR{online}=="1" >>> >>> looking at parent device '/devices/system/cpu': >>> KERNELS=="cpu" >>> SUBSYSTEMS=="" >>> DRIVERS=="" >>> ATTRS{crash_hotplug}=="1" >>> ATTRS{isolated}=="" >>> ATTRS{kernel_max}=="8191" >>> ATTRS{nohz_full}==" (null)" >>> ATTRS{offline}=="4-7" >>> ATTRS{online}=="0-3" >>> ATTRS{possible}=="0-7" >>> ATTRS{present}=="0-3" >>> >>> With these sysfs attributes in place, it is possible to efficiently >>> instruct the udev rule to skip crash kernel reloading for kernels >>> configured with crash hotplug support. >>> >>> For example, the following is the proposed udev rule change for RHEL >>> system 98-kexec.rules (as the first lines of the rule file): >>> >>> # The kernel updates the crash elfcorehdr for CPU and memory changes >>> SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >>> SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >>> >>> When examined in the context of 98-kexec.rules, the above rules >>> test if crash_hotplug is set, and if so, the userspace initiated >>> unload-then-reload of the crash kernel is skipped. >>> >>> CPU and memory checks are separated in accordance with >>> CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. >>> If an architecture supports, for example, memory hotplug but not >>> CPU hotplug, then the /sys/devices/system/memory/crash_hotplug >>> attribute file is present, but the /sys/devices/system/cpu/crash_hotplug >>> attribute file will NOT be present. Thus the udev rule skips >>> userspace processing of memory hot un/plug events, but the udev >>> rule will evaluate false for CPU events, thus allowing userspace to >>> process CPU hot un/plug events (ie the unload-then-reload of the kdump >>> capture kernel). >>> >>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> >>> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> >>> Acked-by: Hari Bathini <hbathini@linux.ibm.com> >>> Acked-by: Baoquan He <bhe@redhat.com> >>> --- >>> .../admin-guide/mm/memory-hotplug.rst | 8 ++++++++ >>> Documentation/core-api/cpu_hotplug.rst | 18 ++++++++++++++++++ >>> drivers/base/cpu.c | 14 ++++++++++++++ >>> drivers/base/memory.c | 13 +++++++++++++ >>> include/linux/kexec.h | 8 ++++++++ >>> 5 files changed, 61 insertions(+) >>> >>> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst >>> b/Documentation/admin-guide/mm/memory-hotplug.rst >>> index 1b02fe5807cc..eb99d79223a3 100644 >>> --- a/Documentation/admin-guide/mm/memory-hotplug.rst >>> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst >>> @@ -291,6 +291,14 @@ The following files are currently defined: >>> Availability depends on the CONFIG_ARCH_MEMORY_PROBE >>> kernel configuration option. >>> ``uevent`` read-write: generic udev file for device subsystems. >>> +``crash_hotplug`` read-only: when changes to the system memory map >>> + occur due to hot un/plug of memory, this file contains >>> + '1' if the kernel updates the kdump capture kernel memory >>> + map itself (via elfcorehdr), or '0' if userspace must update >>> + the kdump capture kernel memory map. >>> + >>> + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel >>> + configuration option. >>> ====================== ========================================================= >>> .. note:: >>> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst >>> index f75778d37488..0c8dc3fe5f94 100644 >>> --- a/Documentation/core-api/cpu_hotplug.rst >>> +++ b/Documentation/core-api/cpu_hotplug.rst >>> @@ -750,6 +750,24 @@ will receive all events. A script like:: >>> can process the event further. >>> +When changes to the CPUs in the system occur, the sysfs file >>> +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel >>> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), >>> +or '0' if userspace must update the kdump capture kernel list of CPUs. >>> + >>> +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration >>> +option. >>> + >>> +To skip userspace processing of CPU hot un/plug events for kdump >>> +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs >>> +file can be used in a udev rule as follows: >>> + >>> + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >>> + >>> +For a cpu hot un/plug event, if the architecture supports kernel updates >>> +of the elfcorehdr (which contains the list of CPUs), then the rule skips >>> +the unload-then-reload of the kdump capture kernel. >>> + >>> Kernel Inline Documentations Reference >>> ====================================== >>> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c >>> index c1815b9dae68..06a0c22b37b8 100644 >>> --- a/drivers/base/cpu.c >>> +++ b/drivers/base/cpu.c >>> @@ -282,6 +282,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev, >>> static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); >>> #endif >>> +#ifdef CONFIG_HOTPLUG_CPU >>> +#include <linux/kexec.h> >>> +static ssize_t crash_hotplug_show(struct device *dev, >>> + struct device_attribute *attr, >>> + char *buf) >>> +{ >>> + return sprintf(buf, "%d\n", crash_hotplug_cpu_support()); >>> +} >>> +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); >>> +#endif >>> + >>> static void cpu_device_release(struct device *dev) >>> { >>> /* >>> @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = { >>> #ifdef CONFIG_NO_HZ_FULL >>> &dev_attr_nohz_full.attr, >>> #endif >>> +#ifdef CONFIG_HOTPLUG_CPU >>> + &dev_attr_crash_hotplug.attr, >>> +#endif >>> #ifdef CONFIG_GENERIC_CPU_AUTOPROBE >>> &dev_attr_modalias.attr, >>> #endif >>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c >>> index b456ac213610..24b8ef4c830c 100644 >>> --- a/drivers/base/memory.c >>> +++ b/drivers/base/memory.c >>> @@ -490,6 +490,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, >>> static DEVICE_ATTR_RW(auto_online_blocks); >>> +#ifdef CONFIG_MEMORY_HOTPLUG >>> +#include <linux/kexec.h> >>> +static ssize_t crash_hotplug_show(struct device *dev, >>> + struct device_attribute *attr, char *buf) >>> +{ >>> + return sprintf(buf, "%d\n", crash_hotplug_memory_support()); >>> +} >> >> This sysfs file has to be documented in Documentation/ABI/ right? > > I will add these entries. > >> >> And did you use checkpatch? It should have told you to use sysfs_emit() >> instead... > I did use checkpatch, and it did not reveal that. > > [root@localhost linux]# ./scripts/checkpatch.pl --patch > ../v23/0004-crash-memory-and-CPU-hotplug-sysfs-attributes.patch > total: 0 errors, 0 warnings, 103 lines checked > > ../v23/0004-crash-memory-and-CPU-hotplug-sysfs-attributes.patch has no obvious style problems and is > ready for submission. > > This is not the first time it's been suggested to me that checkpatch should have caught a problem; > is there a different invocation than what I'm doing? > >> >>> +static DEVICE_ATTR_RO(crash_hotplug); >>> +#endif >> >> All of these #ifdefs should all be removed and instead use the >> is_visible() callback to determine if the attribute is shown or not, >> using the IS_ENABLED() test in the function. > > ok, I'll correct this. I've been examining drivers/base/cacheinfo.c as a template for how to remove the #ifdefs and use the is_visible() callback for the drivers/base/cpu|memory.c files. I'm attempting to apply this technique to drivers/base/cpu.c. In this file, there are features that are compiled in/out based on the CONFIG settings, for example CONFIG_ARCH_CPU_PROBE_RELEASE. My attempts at applying the technique thus far have resulted in link-time errors for missing symbols, ie. arch_cpu_probe() and arch_cpu_release(). As I understand it, to use IS_ENABLED(XYZ) to guard-band conditional code, the contents of that code still needs to be compile-able (eg. no references to struct members with surrounding #ifdef CONFIG_XYZ) and link-able (eg. any called functions must also be compiled). Given my understanding of the request and IS_ENABLED, I'm not seeing a path forward. I'm thinking the approach works well in cacheinfo.c for example because cache info is always compiled-in. I suspect I am misunderstanding, and would appreciate a pointer/guidance. Thanks! eric > > Thank you for looking at this! > eric > >> >> thanks, >> >> greg k-h
On Tue, Jun 13 2023 at 14:58, Eric DeVolder wrote: > On 6/13/23 10:24, Eric DeVolder wrote: >> On 6/13/23 03:03, Greg KH wrote: >>> All of these #ifdefs should all be removed and instead use the >>> is_visible() callback to determine if the attribute is shown or not, >>> using the IS_ENABLED() test in the function. >> >> ok, I'll correct this. > > I've been examining drivers/base/cacheinfo.c as a template for how to remove the > #ifdefs and use the is_visible() callback for the drivers/base/cpu|memory.c files. > > I'm attempting to apply this technique to drivers/base/cpu.c. In this file, there > are features that are compiled in/out based on the CONFIG settings, for example > CONFIG_ARCH_CPU_PROBE_RELEASE. My attempts at applying the technique thus far have > resulted in link-time errors for missing symbols, ie. arch_cpu_probe() and > arch_cpu_release(). > > As I understand it, to use IS_ENABLED(XYZ) to guard-band conditional code, the contents > of that code still needs to be compile-able (eg. no references to struct members with > surrounding #ifdef CONFIG_XYZ) and link-able (eg. any called functions must also be > compiled). You can't obviously reference anything which is #ifdeffed out in a data structure. But functions is a different story. All you needs is a declaration. void foo(void); if (IS_ENABLED(FOO)) foo(); Builds correctly if FOO=n and foo() is not built in. The wonders of dead code elimination. Thanks, tglx
On 6/13/23 03:03, Greg KH wrote: > On Mon, Jun 12, 2023 at 05:07:08PM -0400, Eric DeVolder wrote: >> Introduce the crash_hotplug attribute for memory and CPUs for >> use by userspace. These attributes directly facilitate the udev >> rule for managing userspace re-loading of the crash kernel upon >> hot un/plug changes. >> >> For memory, expose the crash_hotplug attribute to the >> /sys/devices/system/memory directory. For example: >> >> # udevadm info --attribute-walk /sys/devices/system/memory/memory81 >> looking at device '/devices/system/memory/memory81': >> KERNEL=="memory81" >> SUBSYSTEM=="memory" >> DRIVER=="" >> ATTR{online}=="1" >> ATTR{phys_device}=="0" >> ATTR{phys_index}=="00000051" >> ATTR{removable}=="1" >> ATTR{state}=="online" >> ATTR{valid_zones}=="Movable" >> >> looking at parent device '/devices/system/memory': >> KERNELS=="memory" >> SUBSYSTEMS=="" >> DRIVERS=="" >> ATTRS{auto_online_blocks}=="offline" >> ATTRS{block_size_bytes}=="8000000" >> ATTRS{crash_hotplug}=="1" >> >> For CPUs, expose the crash_hotplug attribute to the >> /sys/devices/system/cpu directory. For example: >> >> # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0 >> looking at device '/devices/system/cpu/cpu0': >> KERNEL=="cpu0" >> SUBSYSTEM=="cpu" >> DRIVER=="processor" >> ATTR{crash_notes}=="277c38600" >> ATTR{crash_notes_size}=="368" >> ATTR{online}=="1" >> >> looking at parent device '/devices/system/cpu': >> KERNELS=="cpu" >> SUBSYSTEMS=="" >> DRIVERS=="" >> ATTRS{crash_hotplug}=="1" >> ATTRS{isolated}=="" >> ATTRS{kernel_max}=="8191" >> ATTRS{nohz_full}==" (null)" >> ATTRS{offline}=="4-7" >> ATTRS{online}=="0-3" >> ATTRS{possible}=="0-7" >> ATTRS{present}=="0-3" >> >> With these sysfs attributes in place, it is possible to efficiently >> instruct the udev rule to skip crash kernel reloading for kernels >> configured with crash hotplug support. >> >> For example, the following is the proposed udev rule change for RHEL >> system 98-kexec.rules (as the first lines of the rule file): >> >> # The kernel updates the crash elfcorehdr for CPU and memory changes >> SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> >> When examined in the context of 98-kexec.rules, the above rules >> test if crash_hotplug is set, and if so, the userspace initiated >> unload-then-reload of the crash kernel is skipped. >> >> CPU and memory checks are separated in accordance with >> CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. >> If an architecture supports, for example, memory hotplug but not >> CPU hotplug, then the /sys/devices/system/memory/crash_hotplug >> attribute file is present, but the /sys/devices/system/cpu/crash_hotplug >> attribute file will NOT be present. Thus the udev rule skips >> userspace processing of memory hot un/plug events, but the udev >> rule will evaluate false for CPU events, thus allowing userspace to >> process CPU hot un/plug events (ie the unload-then-reload of the kdump >> capture kernel). >> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> >> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> >> Acked-by: Hari Bathini <hbathini@linux.ibm.com> >> Acked-by: Baoquan He <bhe@redhat.com> >> --- >> .../admin-guide/mm/memory-hotplug.rst | 8 ++++++++ >> Documentation/core-api/cpu_hotplug.rst | 18 ++++++++++++++++++ >> drivers/base/cpu.c | 14 ++++++++++++++ >> drivers/base/memory.c | 13 +++++++++++++ >> include/linux/kexec.h | 8 ++++++++ >> 5 files changed, 61 insertions(+) >> >> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst >> index 1b02fe5807cc..eb99d79223a3 100644 >> --- a/Documentation/admin-guide/mm/memory-hotplug.rst >> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst >> @@ -291,6 +291,14 @@ The following files are currently defined: >> Availability depends on the CONFIG_ARCH_MEMORY_PROBE >> kernel configuration option. >> ``uevent`` read-write: generic udev file for device subsystems. >> +``crash_hotplug`` read-only: when changes to the system memory map >> + occur due to hot un/plug of memory, this file contains >> + '1' if the kernel updates the kdump capture kernel memory >> + map itself (via elfcorehdr), or '0' if userspace must update >> + the kdump capture kernel memory map. >> + >> + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel >> + configuration option. >> ====================== ========================================================= >> >> .. note:: >> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst >> index f75778d37488..0c8dc3fe5f94 100644 >> --- a/Documentation/core-api/cpu_hotplug.rst >> +++ b/Documentation/core-api/cpu_hotplug.rst >> @@ -750,6 +750,24 @@ will receive all events. A script like:: >> >> can process the event further. >> >> +When changes to the CPUs in the system occur, the sysfs file >> +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel >> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), >> +or '0' if userspace must update the kdump capture kernel list of CPUs. >> + >> +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration >> +option. >> + >> +To skip userspace processing of CPU hot un/plug events for kdump >> +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs >> +file can be used in a udev rule as follows: >> + >> + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" >> + >> +For a cpu hot un/plug event, if the architecture supports kernel updates >> +of the elfcorehdr (which contains the list of CPUs), then the rule skips >> +the unload-then-reload of the kdump capture kernel. >> + >> Kernel Inline Documentations Reference >> ====================================== >> >> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c >> index c1815b9dae68..06a0c22b37b8 100644 >> --- a/drivers/base/cpu.c >> +++ b/drivers/base/cpu.c >> @@ -282,6 +282,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev, >> static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); >> #endif >> >> +#ifdef CONFIG_HOTPLUG_CPU >> +#include <linux/kexec.h> >> +static ssize_t crash_hotplug_show(struct device *dev, >> + struct device_attribute *attr, >> + char *buf) >> +{ >> + return sprintf(buf, "%d\n", crash_hotplug_cpu_support()); >> +} >> +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); >> +#endif >> + >> static void cpu_device_release(struct device *dev) >> { >> /* >> @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = { >> #ifdef CONFIG_NO_HZ_FULL >> &dev_attr_nohz_full.attr, >> #endif >> +#ifdef CONFIG_HOTPLUG_CPU >> + &dev_attr_crash_hotplug.attr, >> +#endif >> #ifdef CONFIG_GENERIC_CPU_AUTOPROBE >> &dev_attr_modalias.attr, >> #endif >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c >> index b456ac213610..24b8ef4c830c 100644 >> --- a/drivers/base/memory.c >> +++ b/drivers/base/memory.c >> @@ -490,6 +490,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, >> >> static DEVICE_ATTR_RW(auto_online_blocks); >> >> +#ifdef CONFIG_MEMORY_HOTPLUG >> +#include <linux/kexec.h> >> +static ssize_t crash_hotplug_show(struct device *dev, >> + struct device_attribute *attr, char *buf) >> +{ >> + return sprintf(buf, "%d\n", crash_hotplug_memory_support()); >> +} > > This sysfs file has to be documented in Documentation/ABI/ right? > > And did you use checkpatch? It should have told you to use sysfs_emit() > instead... > >> +static DEVICE_ATTR_RO(crash_hotplug); >> +#endif > > All of these #ifdefs should all be removed and instead use the > is_visible() callback to determine if the attribute is shown or not, > using the IS_ENABLED() test in the function. > > thanks, > > greg k-h Greg, I've been examining this request, and could use a bit more guidance. Taking a look at just drivers/base/cpu.c for the purposes of this discussion, I need to add the .is_visible method to the following: static const struct attribute_group cpu_root_attr_group = { .attrs = cpu_root_attrs, }; ok, that makes sense. The request is to remove the #ifdefs from the following in cpu_root_attrs[]: static struct attribute *cpu_root_attrs[] = { #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE &dev_attr_probe.attr, &dev_attr_release.attr, #endif &cpu_attrs[0].attr.attr, &cpu_attrs[1].attr.attr, &cpu_attrs[2].attr.attr, &dev_attr_kernel_max.attr, &dev_attr_offline.attr, &dev_attr_isolated.attr, #ifdef CONFIG_NO_HZ_FULL &dev_attr_nohz_full.attr, #endif #ifdef CONFIG_GENERIC_CPU_AUTOPROBE &dev_attr_modalias.attr, #endif NULL }; In order to remove the #ifdefs in this struct, as requested, that means these attributes need to be always compiled-in. So I can move the declarations of these outside of the #ifdef which contain them, but these all contain a callback function for show()/store(), which is inside the #ifdef. The struct device-attribute .show/.store pointer to the callback function must point to something linkable. Is an acceptable solution to place nop callback functions for show()/store() on an #else part of the code blocks in the file? Ie: #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE static ssize_t cpu_probe_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { ... real function contents ... } #else static ssize_t cpu_probe_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { return 0; } #endif I've been looking around the source tree for examples, but I find most are either groups that are all strongly related and thus compiled-in/out together (ie drivers/base/cacheinfo.c, drivers/acpi/nfit/core.c), or utilize #ifdefs (ie this file, drivers/base/topology.c) . I may not understand the goal of the request either, as it appears it would result in a kernel slightly larger (struct attribute containing entries for items that are not configured in the kernel) than a kernel utilizing the #ifdefs as-is currently. Thanks, eric
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst index 1b02fe5807cc..eb99d79223a3 100644 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@ -291,6 +291,14 @@ The following files are currently defined: Availability depends on the CONFIG_ARCH_MEMORY_PROBE kernel configuration option. ``uevent`` read-write: generic udev file for device subsystems. +``crash_hotplug`` read-only: when changes to the system memory map + occur due to hot un/plug of memory, this file contains + '1' if the kernel updates the kdump capture kernel memory + map itself (via elfcorehdr), or '0' if userspace must update + the kdump capture kernel memory map. + + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel + configuration option. ====================== ========================================================= .. note:: diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst index f75778d37488..0c8dc3fe5f94 100644 --- a/Documentation/core-api/cpu_hotplug.rst +++ b/Documentation/core-api/cpu_hotplug.rst @@ -750,6 +750,24 @@ will receive all events. A script like:: can process the event further. +When changes to the CPUs in the system occur, the sysfs file +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), +or '0' if userspace must update the kdump capture kernel list of CPUs. + +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration +option. + +To skip userspace processing of CPU hot un/plug events for kdump +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs +file can be used in a udev rule as follows: + + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" + +For a cpu hot un/plug event, if the architecture supports kernel updates +of the elfcorehdr (which contains the list of CPUs), then the rule skips +the unload-then-reload of the kdump capture kernel. + Kernel Inline Documentations Reference ====================================== diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index c1815b9dae68..06a0c22b37b8 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -282,6 +282,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev, static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); #endif +#ifdef CONFIG_HOTPLUG_CPU +#include <linux/kexec.h> +static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + return sprintf(buf, "%d\n", crash_hotplug_cpu_support()); +} +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); +#endif + static void cpu_device_release(struct device *dev) { /* @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = { #ifdef CONFIG_NO_HZ_FULL &dev_attr_nohz_full.attr, #endif +#ifdef CONFIG_HOTPLUG_CPU + &dev_attr_crash_hotplug.attr, +#endif #ifdef CONFIG_GENERIC_CPU_AUTOPROBE &dev_attr_modalias.attr, #endif diff --git a/drivers/base/memory.c b/drivers/base/memory.c index b456ac213610..24b8ef4c830c 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -490,6 +490,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, static DEVICE_ATTR_RW(auto_online_blocks); +#ifdef CONFIG_MEMORY_HOTPLUG +#include <linux/kexec.h> +static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "%d\n", crash_hotplug_memory_support()); +} +static DEVICE_ATTR_RO(crash_hotplug); +#endif + /* * Some architectures will have custom drivers to do this, and * will not need to do it from userspace. The fake hot-add code @@ -889,6 +899,9 @@ static struct attribute *memory_root_attrs[] = { &dev_attr_block_size_bytes.attr, &dev_attr_auto_online_blocks.attr, +#ifdef CONFIG_MEMORY_HOTPLUG + &dev_attr_crash_hotplug.attr, +#endif NULL }; diff --git a/include/linux/kexec.h b/include/linux/kexec.h index b9903dd48e24..6a8a724ac638 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -501,6 +501,14 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) { static inline void arch_crash_handle_hotplug_event(struct kimage *image) { } #endif +#ifndef crash_hotplug_cpu_support +static inline int crash_hotplug_cpu_support(void) { return 0; } +#endif + +#ifndef crash_hotplug_memory_support +static inline int crash_hotplug_memory_support(void) { return 0; } +#endif + #else /* !CONFIG_KEXEC_CORE */ struct pt_regs; struct task_struct;