Message ID | CAAGQ2nQNQ-aFkcrQHNA6H5TZ1tTovtfO_0Ohfndn9jXy13Hc6A@mail.gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Andy Gross |
Hi Andy,

On 2018-06-16 06:37, Andy Strohman wrote:
> Hi,
>
> I'm trying to get kexec to work consistently for ipq4019. I load
> the crash kernel like this:
>
> kexec --type zImage -p zImage-initramfs
> --dtb=image-qcom-ipq4019-eap1300.dtb --append="maxcpus=1
> reset_devices" --image-size=34419456
>
> I have reserved 64MB of memory for the crash kernel with parameter:
> crashkernel=64M
>
> This seems to work ~70% of the time. When it doesn't work, I see the
> "bye!" message followed by a 5-10 second hang without output. Then
> the machine resets.
>
> I've been testing with:
> echo c > /proc/sysrq-trigger
>
> Does anyone have an idea of what may be causing the failures or how
> to troubleshoot this?
>

I will try to reproduce this and get back to you shortly.

Regards,
Sricharan

> I'm using OpenWRT with kernel 4.14.37. I added the following patch
> in order to load the crash kernel:
>
> --- a/arch/arm/mach-qcom/platsmp.c
> +++ b/arch/arm/mach-qcom/platsmp.c
> @@ -332,6 +332,12 @@ static void __init qcom_smp_prepare_cpus
>  	}
>  }
>
> +/* Needed by kexec and platform_can_cpu_hotplug() */
> +int qcom_cpu_kill(unsigned int cpu)
> +{
> +	return 1;
> +}
> +
>  static const struct smp_operations smp_msm8660_ops __initconst = {
>  	.smp_prepare_cpus	= qcom_smp_prepare_cpus,
>  	.smp_secondary_init	= qcom_secondary_init,
> @@ -358,6 +364,7 @@ static const struct smp_operations qcom_
>  	.smp_boot_secondary	= kpssv2_boot_secondary,
>  #ifdef CONFIG_HOTPLUG_CPU
>  	.cpu_die		= qcom_cpu_die,
> +	.cpu_kill		= qcom_cpu_kill,
>  #endif
>  };
>  CPU_METHOD_OF_DECLARE(qcom_smp_kpssv2, "qcom,kpss-acc-v2",
> &qcom_smp_kpssv2_ops);
>
>
> Thanks,
>
> Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jun 17, 2018 at 11:31 PM, <sricharan@codeaurora.org> wrote:
> I will try to reproduce this and get back to you shortly.
>
> Regards,
> Sricharan

Hi Sricharan,

Thanks for your response. Did you get a chance to try this out? If
so, were you able to reproduce?

Thanks,

Andy
Hi Andy,

On 6/27/2018 2:47 AM, Andy Strohman wrote:
> Hi Sricharan,
>
> Thanks for your response. Did you get a chance to try this out? If
> so, were you able to reproduce?
>

I have been trying to kexec while chroot'ing for a different reason.
I have not observed an issue so far, but that is with a 4.4.60 OpenWRT kernel.
Can you point me to a link to the kernel that you are trying with?

Regards,
Sricharan
On Wed, Jun 27, 2018 at 9:33 PM, Sricharan R <sricharan@codeaurora.org> wrote:
> I have been trying to kexec while chroot'ing for a different reason.
> I have not observed an issue so far, but that is with a 4.4.60 OpenWRT kernel.
> Can you point me to a link to the kernel that you are trying with?
>
> Regards,
> Sricharan
>
> --
> "QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

Sricharan,

I am using https://git.openwrt.org/openwrt/openwrt.git, commit:
ac70ac3532fefa78c944d8a26c8df0ca5d88d04e

Can you provide me a link to the source that you are using?

I think my problem is that the cpu_kill callback in struct
smp_operations is not properly implemented in
arch/arm/mach-qcom/platsmp.c. I added a dummy function that just
returns 1 to allow loading the crash kernel. That is the patch in my
original email in this thread. I gave this approach a try because I
saw another SUBARCH doing the same, but I think it's inadequate.

After reading the surrounding code,
https://patchwork.ozlabs.org/patch/207562/ and
https://patchwork.kernel.org/patch/1925071/ , I now believe that I
need to power down CPUs in cpu_kill. Since I don't have the datasheet
for ipq4019, I'm not sure how to do that. If you have any advice
regarding this, that would be great.

When I boot the machine with nr_cpus=1, kexec always works. It
also seems to work reliably if I taskset the process that triggers the
crash to cpu 0.

Thanks,

Andy
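The taskset workaround described above can also be reproduced from a small C helper, which makes it easier to choose the CPU on which the crash starts. The following is a minimal sketch, assuming Linux with glibc. The function names (`current_cpu`, `pin_to_cpu`, `pinned_only_to`, `crash_on_cpu`) are hypothetical and do not come from the thread. It pins the calling process to one CPU with `sched_setaffinity()` and then writes `c` to /proc/sysrq-trigger, so the panic is raised on that CPU.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* CPU the caller is currently running on, or -1 on error. */
int current_cpu(void)
{
	return sched_getcpu();
}

/* Restrict the calling process to a single CPU. Returns 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Returns 1 if the calling process may run on `cpu` and nowhere else, 0 otherwise. */
int pinned_only_to(int cpu)
{
	cpu_set_t set;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return 0;
	return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}

/*
 * Pin to `cpu`, then write 'c' to the sysrq trigger.
 * WARNING: when this succeeds it crashes the running kernel. Requires root.
 * Returns -1 if pinning or the write fails.
 */
int crash_on_cpu(int cpu)
{
	int fd;
	ssize_t n;

	if (pin_to_cpu(cpu) != 0)
		return -1;
	fd = open("/proc/sysrq-trigger", O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, "c", 1);
	close(fd);
	return n == 1 ? 0 : -1;
}
```

Calling `crash_on_cpu(0)` as root roughly corresponds to running the sysrq trigger under taskset restricted to cpu 0. Calling it with a secondary CPU number may help confirm whether the failures occur only when the crash starts away from cpu 0.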
Hi Andy,

> I am using https://git.openwrt.org/openwrt/openwrt.git, commit:
> ac70ac3532fefa78c944d8a26c8df0ca5d88d04e
>
> Can you provide me a link to the source that you are using?
>
> I think my problem is that the cpu_kill callback in struct
> smp_operations is not properly implemented in
> arch/arm/mach-qcom/platsmp.c. I added a dummy function that just
> returns 1 to allow loading the crash kernel. That is the patch in my
> original email in this thread. I gave this approach a try because I
> saw another SUBARCH doing the same, but I think it's inadequate.
>
> After reading the surrounding code,
> https://patchwork.ozlabs.org/patch/207562/ and
> https://patchwork.kernel.org/patch/1925071/ , I now believe that I
> need to power down CPUs in cpu_kill. Since I don't have the datasheet
> for ipq4019, I'm not sure how to do that. If you have any advice
> regarding this, that would be great.
>
> When I boot the machine with nr_cpus=1, kexec always works. It
> also seems to work reliably if I taskset the process that triggers the
> crash to cpu 0.
>

Sorry for the delayed response. The source I am using is:

https://source.codeaurora.org/quic/qsdk/oss/kernel/linux-msm/tree/?h=eggplant

I just noticed that we always use nr_cpus=1 for the kexec kernel, and
that's why it was OK.

That said, the current code in mach-qcom/platsmp.c does not implement the
cpu_kill callback. Even for hotplug, it's just a wfi(). While a wfi()
works for hotplug, it would not work for kexec, since the CPUs were never
put in reset state and waking them would simply fail.

That means we need the complement of kpssv2_release_secondary
implemented for the cpu_kill callback. I will try to write down the exact
sequence from the programming guide and post it here.

Regards,
Sricharan
Hi Andy,

> That said, the current code in mach-qcom/platsmp.c does not implement the
> cpu_kill callback. Even for hotplug, it's just a wfi(). While a wfi()
> works for hotplug, it would not work for kexec, since the CPUs were never
> put in reset state and waking them would simply fail.
>
> That means we need the complement of kpssv2_release_secondary
> implemented for the cpu_kill callback. I will try to write down the exact
> sequence from the programming guide and post it here.
>

Please look at the code in drivers/soc/qcom/spm.c, which controls the
CPU C-state sequence during cpuidle. The SPM block is the one that takes
care of the power-down/power-up sequence of the CPU after 'wfi'. Something
similar needs to be done for cpu_kill if we expect a CPU to be
power-collapsed and brought back during the kexec kernel reboot.

Also, please have a look at the old non-DT code that has the cpu_kill
callback:

https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/arch/arm/mach-msm/platsmp.c?h=LA.HB.1.1.5.c1

When there is no PM, it is simply a WFI.

Regards,
Sricharan
On Tue, Jul 3, 2018 at 11:36 PM, Sricharan R <sricharan@codeaurora.org> wrote:
> Please look at the code in drivers/soc/qcom/spm.c, which controls the
> CPU C-state sequence during cpuidle. The SPM block is the one that takes
> care of the power-down/power-up sequence of the CPU after 'wfi'. Something
> similar needs to be done for cpu_kill if we expect a CPU to be
> power-collapsed and brought back during the kexec kernel reboot.
>
> Also, please have a look at the old non-DT code that has the cpu_kill
> callback:
>
> https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/arch/arm/mach-msm/platsmp.c?h=LA.HB.1.1.5.c1
>
> When there is no PM, it is simply a WFI.
>
> Regards,
> Sricharan

Hi Sricharan,

I just want to thank you for your help and suggestions. I wasn't
able to get it working quickly, so I have to move on to other things.
If/when I get this working, I'll be sure to let you know what I did.

Thanks,

Andy
Hi Andy,

On 7/11/2018 5:10 AM, Andy Strohman wrote:
> Hi Sricharan,
>
> I just want to thank you for your help and suggestions. I wasn't
> able to get it working quickly, so I have to move on to other things.
> If/when I get this working, I'll be sure to let you know what I did.
>

Sure. I will also try this (kexec without nr_cpus=1) and let you know.
At the moment I am a little busy with a few other things, but I will
put this next on my to-do list and come back to it.

Regards,
Sricharan
--- a/arch/arm/mach-qcom/platsmp.c
+++ b/arch/arm/mach-qcom/platsmp.c
@@ -332,6 +332,12 @@ static void __init qcom_smp_prepare_cpus
 	}
 }
 
+/* Needed by kexec and platform_can_cpu_hotplug() */
+int qcom_cpu_kill(unsigned int cpu)
+{
+	return 1;
+}
+
 static const struct smp_operations smp_msm8660_ops __initconst = {
 	.smp_prepare_cpus	= qcom_smp_prepare_cpus,
 	.smp_secondary_init	= qcom_secondary_init,
@@ -358,6 +364,7 @@ static const struct smp_operations qcom_
 	.smp_boot_secondary	= kpssv2_boot_secondary,
 #ifdef CONFIG_HOTPLUG_CPU
 	.cpu_die		= qcom_cpu_die,
+	.cpu_kill		= qcom_cpu_kill,
 #endif
 };
 CPU_METHOD_OF_DECLARE(qcom_smp_kpssv2, "qcom,kpss-acc-v2",
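For context on why a stub is enough to let the crash kernel load: as best understood from the 32-bit ARM code of that era, the kexec prepare step rejects the image on an SMP system unless `smp_ops.cpu_kill` is set, and it only checks that the pointer is non-NULL. The sketch below is a user-space model of that check, not kernel code. `struct smp_operations_model` and the `model_*` and `stub_cpu_kill` names are hypothetical stand-ins introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the hotplug hooks of the kernel's struct smp_operations. */
struct smp_operations_model {
	void (*cpu_die)(unsigned int cpu);
	int (*cpu_kill)(unsigned int cpu);
};

/* Models platform_can_cpu_hotplug(): only the presence of cpu_kill is tested. */
int model_can_cpu_hotplug(const struct smp_operations_model *ops)
{
	return ops->cpu_kill != NULL;
}

/*
 * Models the SMP check made when a kexec image is loaded: on a multi-CPU
 * system that can boot secondaries, refuse unless hotplug is supported.
 */
int model_kexec_prepare(const struct smp_operations_model *ops,
			unsigned int possible_cpus, int can_secondary_boot)
{
	if (possible_cpus > 1 && can_secondary_boot &&
	    !model_can_cpu_hotplug(ops))
		return -EINVAL;
	return 0;
}

/* Equivalent of the stub in the patch: reports success, powers nothing down. */
int stub_cpu_kill(unsigned int cpu)
{
	(void)cpu;
	return 1;
}
```

In this model, with four possible CPUs, `model_kexec_prepare()` returns `-EINVAL` while `cpu_kill` is NULL and 0 once the stub is installed, even though the stub never powers a CPU down. That matches the behaviour reported in the thread: the image loads, but the crash boot is unreliable unless only one CPU is in use.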