Message ID | 13207937.r2GEYrEf4f@kreacher (mailing list archive) |
---|---|
Series | cpufreq: intel_pstate: Implement passive mode with HWP enabled |
On Tue, 2020-07-28 at 17:09 +0200, Rafael J. Wysocki wrote:
> Hi All,
>
> On Monday, July 27, 2020 5:13:40 PM CEST Rafael J. Wysocki wrote:
> > On Thursday, July 16, 2020 7:37:04 PM CEST Rafael J. Wysocki wrote:
> > > This really is a v2 of this patch:
> > >
> > > https://patchwork.kernel.org/patch/11663271/
> > >
> > > with an extra preceding cleanup patch to avoid making unrelated
> > > changes in the
> > > [2/2].
> >

I applied this series along with
[PATCH] cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
on 5.8 latest master (On top of raw epp patchset).

When intel_pstate=passive from kernel command line then it is fine, no
crash. But switch dynamically, crashed:
Attached crash.txt. I may need to try your linux-pm tree.

Then after some playing I reached a state when I monitor MSR 0x774:
while true; do rdmsr 0x774; sleep 1; done
80002704
...
...
ff000101
ff000101
ff000101
ff000101
ff000101
ff000101
ff000101
ff000101

Don't have a recipe to reproduce this.

Thanks,
Srinivas

> > Almost the same as before, but the first patch has been reworked to
> > handle
> > errors in store_energy_performance_preference() correctly and
> > rebased on top
> > of the current linux-pm.git branch.
> >
> > No functional changes otherwise.
>
> One more update of the second patch.
>
> Namely, I realized that the hwp_dynamic_boost sysfs switch was
> present in the
> passive mode after the v3 (and the previous versions) of that patch
> which isn't
> correct, so this modifies it to avoid exposing hwp_dynamic_boost in
> the passive
> mode.
>
> The first patch is the same as in the v2.
>
> Thanks!
>
>

[ 232.483420] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 232.483435] #PF: supervisor read access in kernel mode
[ 232.483441] #PF: error_code(0x0000) - not-present page
[ 232.483446] PGD 0 P4D 0
[ 232.483457] Oops: 0000 [#1] SMP NOPTI
[ 232.483469] CPU: 7 PID: 2064 Comm: bash Tainted: G W 5.8.0-rc6+ #6
[ 232.483474] Hardware name: Dell Inc. XPS 13 7390 2-in-1/06CDVY, BIOS 1.3.1 03/02/2020
[ 232.483491] RIP: 0010:sysfs_remove_file_ns+0x6/0x20
[ 232.483500] Code: ff 4c 89 e7 e8 bb ce ff ff 4c 89 ef e8 43 f9 1d 00 41 5c 41 5d 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 <48> 8b 7f 30 48 8b 36 48 89 e5 e8 cb e2 ff ff 5d c3 66 0f 1f 84 00
[ 232.483507] RSP: 0018:ffffaa37c1c93df8 EFLAGS: 00010246
[ 232.483514] RAX: 0000000000000001 RBX: 0000000000000007 RCX: 0000000000000008
[ 232.483519] RDX: 0000000000000000 RSI: ffffffffa87f1f60 RDI: 0000000000000000
[ 232.483524] RBP: ffffaa37c1c93e18 R08: 0000000000000000 R09: ffffffffa791ff00
[ 232.483529] R10: ffff8a15bc0f3600 R11: 0000000000000001 R12: 0000000000000008
[ 232.483533] R13: ffff8a15b3059197 R14: fffffffffffffff2 R15: ffff8a159f945020
[ 232.483541] FS: 00007f068a3d4740(0000) GS:ffff8a15bf7c0000(0000) knlGS:0000000000000000
[ 232.483547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 232.483552] CR2: 0000000000000030 CR3: 000000046ec78004 CR4: 0000000000760ee0
[ 232.483557] PKRU: 55555554
[ 232.483561] Call Trace:
[ 232.483581] ? intel_pstate_driver_cleanup+0xbd/0xd0
[ 232.483590] store_status+0x9b/0x180
[ 232.483603] kobj_attr_store+0x12/0x20
[ 232.483610] sysfs_kf_write+0x3e/0x50
[ 232.483623] kernfs_fop_write+0xda/0x1b0
[ 232.483636] vfs_write+0xc9/0x200
[ 232.483647] ksys_write+0x67/0xe0
[ 232.483657] __x64_sys_write+0x1a/0x20
[ 232.483668] do_syscall_64+0x52/0xc0
[ 232.483680] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 232.483689] RIP: 0033:0x7f068a4e8057
[ 232.483693] Code: Bad RIP value.
[ 232.483699] RSP: 002b:00007ffe0d4faec8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 232.483705] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f068a4e8057
[ 232.483710] RDX: 0000000000000008 RSI: 000055b52e3a6700 RDI: 0000000000000001
[ 232.483715] RBP: 000055b52e3a6700 R08: 000000000000000a R09: 0000000000000007
[ 232.483719] R10: 000055b52cca5017 R11: 0000000000000246 R12: 0000000000000008
[ 232.483724] R13: 00007f068a5c36a0 R14: 00007f068a5c44a0 R15: 00007f068a5c38a0
[ 232.483735] Modules linked in: msr rfcomm ccm cmac algif_hash algif_skcipher af_alg wacom usbhid hid_multitouch bnep hid_sensor_als hid_sensor_incl_3d hid_sensor_accel_3d hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_rotation hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_custom x86_pkg_temp_thermal hid_sensor_hub intel_powerclamp hid_generic intel_ishtp_loader snd_sof_pci snd_sof_intel_byt intel_ishtp_hid snd_sof_intel_ipc dell_laptop dell_wmi cros_ec_ishtp snd_sof_intel_hda_common mei_hdcp rtsx_pci_sdmmc intel_rapl_msr intel_wmi_thunderbolt coretemp wmi_bmof dell_smbios snd_soc_hdac_hda cros_ec snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof kvm_intel dell_wmi_descriptor snd_hda_codec_hdmi dcdbas snd_hda_ext_core dell_smm_hwmon nls_iso8859_1 snd_soc_acpi_intel_match kvm snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic iwlmvm ledtrig_audio snd_soc_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
[ 232.483821] crypto_simd cryptd snd_compress glue_helper ac97_bus snd_pcm_dmaengine rapl mac80211 intel_cstate snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core joydev snd_hwdep snd_pcm libarc4 efi_pstore iwlwifi btusb snd_seq_midi snd_seq_midi_event btrtl snd_rawmidi btbcm btintel snd_seq bluetooth cfg80211 snd_seq_device snd_timer snd i2c_i801 i2c_smbus ucsi_acpi processor_thermal_device typec_ucsi intel_rapl_common intel_lpss_pci intel_lpss idma64 rtsx_pci soundcore mei_me ecdh_generic mei intel_ish_ipc ecc i2c_hid intel_ishtp intel_soc_dts_iosf typec virt_dma wmi hid int3403_thermal soc_button_array int340x_thermal_zone int3400_thermal intel_hid acpi_thermal_rel sparse_keymap acpi_pad acpi_tad sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm psmouse nvme nvme_core input_leds serio_raw mac_hid video pinctrl_icelake pinctrl_intel
[ 232.483940] CR2: 0000000000000030
[ 232.483950] ---[ end trace 31db41bab6fdff6f ]---
[ 233.812260] RIP: 0010:sysfs_remove_file_ns+0x6/0x20
[ 233.812281] Code: ff 4c 89 e7 e8 bb ce ff ff 4c 89 ef e8 43 f9 1d 00 41 5c 41 5d 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 <48> 8b 7f 30 48 8b 36 48 89 e5 e8 cb e2 ff ff 5d c3 66 0f 1f 84 00
[ 233.812292] RSP: 0018:ffffaa37c1c93df8 EFLAGS: 00010246
[ 233.812302] RAX: 0000000000000001 RBX: 0000000000000007 RCX: 0000000000000008
[ 233.812308] RDX: 0000000000000000 RSI: ffffffffa87f1f60 RDI: 0000000000000000
[ 233.812313] RBP: ffffaa37c1c93e18 R08: 0000000000000000 R09: ffffffffa791ff00
[ 233.812318] R10: ffff8a15bc0f3600 R11: 0000000000000001 R12: 0000000000000008
[ 233.812323] R13: ffff8a15b3059197 R14: fffffffffffffff2 R15: ffff8a159f945020
[ 233.812331] FS: 00007f068a3d4740(0000) GS:ffff8a15bf7c0000(0000) knlGS:0000000000000000
[ 233.812337] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 233.812343] CR2: 0000000000000030 CR3: 000000046ec78004 CR4: 0000000000760ee0
[ 233.812349] PKRU: 55555554

On 2020.08.01 09:40 Srinivas Pandruvada wrote:
>> On Monday, July 27, 2020 5:13:40 PM CEST Rafael J. Wysocki wrote:
>>> On Thursday, July 16, 2020 7:37:04 PM CEST Rafael J. Wysocki wrote:
>>>> This really is a v2 of this patch:
>>>>
>>>> https://patchwork.kernel.org/patch/11663271/
>>>>
>>>> with an extra preceding cleanup patch to avoid making unrelated
>>>> changes in the
>>>> [2/2].
>>>
> I applied this series along with
> [PATCH] cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
> on 5.8 latest master (On top of raw epp patchset).

Hi Srinivas,

Would you be kind enough to provide a "git log --oneline" output
of what you did.

I have been trying unsuccessfully to apply the patches,
so somewhere I obviously missed something.

> When intel_pstate=passive from kernel command line then it is fine, no
> crash. But switch dynamically, crashed:

I'll try to repeat, if I can get an actual kernel.

> Attached crash.txt. I may need to try your linux-pm tree.

I also tried the linux-pm tree, same.

... Doug

On Sun, 2020-08-02 at 07:00 -0700, Doug Smythies wrote:
> On 2020.08.01 09:40 Srinivas Pandruvada wrote:
> > > On Monday, July 27, 2020 5:13:40 PM CEST Rafael J. Wysocki wrote:
> > > > On Thursday, July 16, 2020 7:37:04 PM CEST Rafael J. Wysocki
> > > > wrote:
> > > > > This really is a v2 of this patch:
> > > > >
> > > > > https://patchwork.kernel.org/patch/11663271/
> > > > >
> > > > > with an extra preceding cleanup patch to avoid making
> > > > > unrelated
> > > > > changes in the
> > > > > [2/2].
> > I applied this series along with
> > [PATCH] cpufreq: intel_pstate: Fix EPP setting via sysfs in active
> > mode
> > on 5.8 latest master (On top of raw epp patchset).
>
> Hi Srinivas,

Hi Doug,

>
> Would you be kind enough to provide a "git log --oneline" output
> of what you did.

69dd9b2b11cd (HEAD -> 5-9-devel) cpufreq: intel_pstate: Implement passive mode with HWP enabled
63efaa01b06a cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
e11e0a2edf83 cpufreq: intel_pstate: Rearrange the storing of new EPP values
93c3fd6a315c cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported
7cef1dd371c3 cpufreq: intel_pstate: Clean up aperf_mperf_shift description
a3248d8d3a11 cpufreq: intel_pstate: Supply struct attribute description for get_aperf_mperf_shift()
f52b6b075b07 cpufreq: intel_pstate: Fix static checker warning for epp variable
4a59d6be0774 cpufreq: intel_pstate: Allow raw energy performance preference value
7b34b5acdcc6 cpufreq: intel_pstate: Allow enable/disable energy efficiency
ac3a0c847296 (origin/master, origin/HEAD, master) Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Thanks,
Srinivas

>
> I have been trying unsuccessfully to apply the patches,
> so somewhere I obviously missed something.
>
> > When intel_pstate=passive from kernel command line then it is fine,
> > no
> > crash. But switch dynamically, crashed:
>
> I'll try to repeat, if I can get an actual kernel.
>
> > Attached crash.txt. I may need to try your linux-pm tree.
>
> I also tried the linux-pm tree, same.
>
> ... Doug
>
>

Hi Srinivas,

Thanks for your help. I was missing several needed patches.

On 2020.08.02 11:39 Srinivas Pandruvada wrote:
> On Sun, 2020-08-02 at 07:00 -0700, Doug Smythies wrote:
> > On 2020.08.01 09:40 Srinivas Pandruvada wrote:
> > > > On Monday, July 27, 2020 5:13:40 PM CEST Rafael J. Wysocki wrote:
> > > > > On Thursday, July 16, 2020 7:37:04 PM CEST Rafael J. Wysocki
> > > > > wrote:
> > > > > > This really is a v2 of this patch:
> > > > > >
> > > > > > https://patchwork.kernel.org/patch/11663271/
> > > > > >
> > > > > > with an extra preceding cleanup patch to avoid making
> > > > > > unrelated
> > > > > > changes in the
> > > > > > [2/2].
> > > I applied this series along with
> > > [PATCH] cpufreq: intel_pstate: Fix EPP setting via sysfs in active
> > > mode
> > > on 5.8 latest master (On top of raw epp patchset).
> >
> > Would you be kind enough to provide a "git log --oneline" output
> > of what you did.
>
> 69dd9b2b11cd (HEAD -> 5-9-devel) cpufreq: intel_pstate: Implement
> passive mode with HWP enabled
> 63efaa01b06a cpufreq: intel_pstate: Fix EPP setting via sysfs in active
> mode
> e11e0a2edf83 cpufreq: intel_pstate: Rearrange the storing of new EPP
> values
> 93c3fd6a315c cpufreq: intel_pstate: Avoid enabling HWP if EPP is not
> supported
> 7cef1dd371c3 cpufreq: intel_pstate: Clean up aperf_mperf_shift
> description
> a3248d8d3a11 cpufreq: intel_pstate: Supply struct attribute description
> for get_aperf_mperf_shift()
> f52b6b075b07 cpufreq: intel_pstate: Fix static checker warning for epp
> variable
> 4a59d6be0774 cpufreq: intel_pstate: Allow raw energy performance
> preference value
> 7b34b5acdcc6 cpufreq: intel_pstate: Allow enable/disable energy
> efficiency
> ac3a0c847296 (origin/master, origin/HEAD, master) Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
> >
> > I have been trying unsuccessfully to apply the patches,
> > so somewhere I obviously missed something.
> >
> > > When intel_pstate=passive from kernel command line then it is fine,
> > > no
> > > crash. But switch dynamically, crashed:
> >
> > I'll try to repeat, if I can get an actual kernel.

I could not repeat your crash.
I tried booting with and without intel_pstate=passive on the kernel command line
and then switching back and forth thereafter.

However, I do confirm EPP is messed up.
But not min and max from MSR 0x774, they behave as expected, based on quick testing only.

Since you mentioned:

>>> Don't have a recipe to reproduce this.

Maybe I simply didn't hit whatever.

... Doug

Useless additional stuff:

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.8.0-rc7-dssp root=UUID=0ac356c1-caa9-4c2e-8229-4408bd998dbd ro ipv6.disable=1 consoleblank=450 intel_pstate=passive cpuidle_sysfs_switch cpuidle.governor=teo

Went "active" then "passive" and set ondemand governor.
2 X 100% CPU loads:

# /home/doug/c/msr-decoder
How many CPUs?: 6
8.) 0x198: IA32_PERF_STATUS : CPU 0-5 : 46 : 46 : 46 : 46 : 46 : 46 :
B.) 0x770: IA32_PM_ENABLE: 1 : HWP enable
1.) 0x19C: IA32_THERM_STATUS: 883C0000
2.) 0x1AA: MSR_MISC_PWR_MGMT: 401CC0 EIST enabled Coordination enabled OOB Bit 8 reset OOB Bit 18 reset
3.) 0x1B1: IA32_PACKAGE_THERM_STATUS: 882D0000
4.) 0x64F: MSR_CORE_PERF_LIMIT_REASONS: 0
A.) 0x1FC: MSR_POWER_CTL: 3C005D : C1E disable : EEO disable : RHO disable
5.) 0x771: IA32_HWP_CAPABILITIES (performance): 10B252E : high 46 : guaranteed 37 : efficient 11 : lowest 1
6.) 0x774: IA32_HWP_REQUEST: CPU 0-5 :
    raw: FF002E0A : FF002E2E : FF002E2E : FF002E08 : FF002E18 : FF002E08 :
    min: 10 : 46 : 46 : 8 : 24 : 8 :
    max: 46 : 46 : 46 : 46 : 46 : 46 :
    des: 0 : 0 : 0 : 0 : 0 : 0 :
    epp: 255 : 255 : 255 : 255 : 255 : 255 :
    act: 0 : 0 : 0 : 0 : 0 : 0 :
7.) 0x777: IA32_HWP_STATUS: 4 : high 4 : guaranteed 0 : efficient 0 : lowest 0

Kernel:

d72c8472dbd5 (HEAD -> k58rc7-d3) cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
c2f4869fbc27 cpufreq: intel_pstate: Implement passive mode with HWP enabled
85219968fab9 cpufreq: intel_pstate: Rearrange the storing of new EPP values
5c09a1a38106 cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported
9f29c81fe0b3 cpufreq: intel_pstate: Clean up aperf_mperf_shift description
2a863c241495 cpufreq: intel_pstate: Supply struct attribute description for get_aperf_mperf_shift()
4180d8413037 cpufreq: intel_pstate: Fix static checker warning for epp variable
7cd50e86a9e6 cpufreq: intel_pstate: Allow raw energy performance preference value
56dce9a1081e cpufreq: intel_pstate: Allow enable/disable energy efficiency
92ed30191993 (tag: v5.8-rc7) Linux 5.8-rc7

On Saturday, August 1, 2020 6:39:30 PM CEST Srinivas Pandruvada wrote:
> On Tue, 2020-07-28 at 17:09 +0200, Rafael J. Wysocki wrote:
> > Hi All,
> >
> > On Monday, July 27, 2020 5:13:40 PM CEST Rafael J. Wysocki wrote:
> > > On Thursday, July 16, 2020 7:37:04 PM CEST Rafael J. Wysocki wrote:
> > > > This really is a v2 of this patch:
> > > >
> > > > https://patchwork.kernel.org/patch/11663271/
> > > >
> > > > with an extra preceding cleanup patch to avoid making unrelated
> > > > changes in the
> > > > [2/2].
> > >
>
> I applied this series along with
> [PATCH] cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
> on 5.8 latest master (On top of raw epp patchset).
>
> When intel_pstate=passive from kernel command line then it is fine, no
> crash. But switch dynamically, crashed:
> Attached crash.txt. I may need to try your linux-pm tree.

Please try the v5 on top of my linux-next branch:

https://patchwork.kernel.org/patch/11698495/

FWIW, I cannot reproduce the crash with it.

> Then after some playing I reached a state when I monitor MSR 0x774:
> while true; do rdmsr 0x774; sleep 1; done
> 80002704
> ...
> ...
> ff000101
> ff000101
> ff000101
> ff000101
> ff000101
> ff000101
> ff000101
> ff000101
>
> Don't have a recipe to reproduce this.

Well, maybe it locked up due to the deadlock in the v4 of the patch.

Please see if you get this with the v5 above applied.

Cheers!
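
For readers following the MSR 0x774 (IA32_HWP_REQUEST) values quoted in this thread, here is a minimal sketch of how a raw value splits into the min/max/des/epp/act fields that the msr-decoder output above prints. The function name decode_hwp_request is hypothetical and not taken from any tool mentioned here. The bit positions are the ones documented for IA32_HWP_REQUEST in the Intel SDM (min in bits 7:0, max in 15:8, desired in 23:16, EPP in 31:24, activity window in 41:32); rdmsr prints the value in hex without a 0x prefix, which is what the sample strings assume.

```python
def decode_hwp_request(raw: int) -> dict:
    """Split a raw IA32_HWP_REQUEST (MSR 0x774) value into its fields.

    Hypothetical helper for illustration only; field positions follow
    the Intel SDM layout of IA32_HWP_REQUEST.
    """
    return {
        "min": raw & 0xFF,            # bits 7:0, minimum performance
        "max": (raw >> 8) & 0xFF,     # bits 15:8, maximum performance
        "des": (raw >> 16) & 0xFF,    # bits 23:16, desired performance
        "epp": (raw >> 24) & 0xFF,    # bits 31:24, energy performance preference
        "act": (raw >> 32) & 0x3FF,   # bits 41:32, activity window
    }


# Raw values copied from the thread: the two rdmsr readings and one
# of the per-CPU values from the msr-decoder output.
for text in ("80002704", "ff000101", "FF002E0A"):
    print(text, decode_hwp_request(int(text, 16)))
```

Read this way, 80002704 is min 4, max 39, EPP 128, while ff000101 has min and max both at 1 with EPP 255. FF002E0A decodes to min 10, max 46, EPP 255, which agrees with the first CPU column of the decoder output.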