Message ID | 20250306021031.5538-2-xueshuai@linux.alibaba.com (mailing list archive) |
---|---|
State | New |
Series | mm/hwpoison: Fix regressions in memory failure handling |
> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
> index dac4d64dfb2a..cb021058165f 100644
> --- a/arch/x86/kernel/cpu/mce/severity.c
> +++ b/arch/x86/kernel/cpu/mce/severity.c
> @@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
>  	copy_user = is_copy_from_user(regs);
>  	instrumentation_end();
>
> -	switch (fixup_type) {
> -	case EX_TYPE_UACCESS:
> -		if (!copy_user)
> -			return IN_KERNEL;
> -		m->kflags |= MCE_IN_KERNEL_COPYIN;
> -		fallthrough;
> +	if (copy_user) {
> +		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;

You have " MCE_IN_KERNEL_COPYIN" twice here.

> +		return IN_KERNEL_RECOV
> +	}
>
> +	switch (fixup_type) {
>  	case EX_TYPE_FAULT_MCE_SAFE:
>  	case EX_TYPE_DEFAULT_MCE_SAFE:
>  		m->kflags |= MCE_IN_KERNEL_RECOV;
> --

-Tony
On 2025/3/7 02:15, Luck, Tony wrote:
>> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
>> index dac4d64dfb2a..cb021058165f 100644
>> --- a/arch/x86/kernel/cpu/mce/severity.c
>> +++ b/arch/x86/kernel/cpu/mce/severity.c
>> @@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
>>  	copy_user = is_copy_from_user(regs);
>>  	instrumentation_end();
>>
>> -	switch (fixup_type) {
>> -	case EX_TYPE_UACCESS:
>> -		if (!copy_user)
>> -			return IN_KERNEL;
>> -		m->kflags |= MCE_IN_KERNEL_COPYIN;
>> -		fallthrough;
>> +	if (copy_user) {
>> +		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
>
> You have " MCE_IN_KERNEL_COPYIN" twice here.

Sorry, I forgot to format a new patch and sent an old version.

The corrected one:

---
 arch/x86/kernel/cpu/mce/severity.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..2235a7477436 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
 	copy_user = is_copy_from_user(regs);
 	instrumentation_end();

-	switch (fixup_type) {
-	case EX_TYPE_UACCESS:
-		if (!copy_user)
-			return IN_KERNEL;
-		m->kflags |= MCE_IN_KERNEL_COPYIN;
-		fallthrough;
+	if (copy_user) {
+		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+		return IN_KERNEL_RECOV;
+	}

+	switch (fixup_type) {
 	case EX_TYPE_FAULT_MCE_SAFE:
 	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_RECOV;

Will fix it in the next version.

Thanks.
Shuai
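For readers following the review comment, here is a minimal userspace sketch of why the duplicated flag matters. ORing a flag with itself is a no-op, so the old version never sets MCE_IN_KERNEL_RECOV on this path. The bit positions and the two helper functions are assumptions made for this illustration only; the authoritative definitions live in the kernel's asm/mce.h.

```c
#include <stdint.h>

/* Bit positions are assumed for this sketch; see asm/mce.h for the real ones. */
#define MCE_IN_KERNEL_RECOV  (1ULL << 6)
#define MCE_IN_KERNEL_COPYIN (1ULL << 7)

/* Hypothetical helper: flags as set by the old version (COPYIN ORed twice). */
static uint64_t kflags_old_version(uint64_t kflags)
{
	return kflags | MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
}

/* Hypothetical helper: flags as set by the corrected version. */
static uint64_t kflags_corrected(uint64_t kflags)
{
	return kflags | MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
}
```

Starting from kflags == 0, the first helper yields only the COPYIN bit, while the second yields both COPYIN and RECOV, matching the corrected hunk above.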
Hi Shuai,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Shuai-Xue/x86-mce-Use-is_copy_from_user-to-determine-copy-from-user-context/20250306-101505
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250306021031.5538-2-xueshuai%40linux.alibaba.com
patch subject: [PATCH v3 1/3] x86/mce: Use is_copy_from_user() to determine copy-from-user context
config: i386-buildonly-randconfig-002-20250307 (https://download.01.org/0day-ci/archive/20250307/202503071154.xQpKARjN-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250307/202503071154.xQpKARjN-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202503071154.xQpKARjN-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/cpu/mce/severity.c:16:
   In file included from arch/x86/include/asm/traps.h:6:
   In file included from include/linux/kprobes.h:28:
   In file included from include/linux/ftrace.h:13:
   In file included from include/linux/kallsyms.h:13:
   In file included from include/linux/mm.h:2321:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
>> arch/x86/kernel/cpu/mce/severity.c:305:25: error: expected ';' after return statement
     305 |                 return IN_KERNEL_RECOV
         |                                       ^
         |                                       ;
   1 warning and 1 error generated.


vim +305 arch/x86/kernel/cpu/mce/severity.c

   274
   275	/*
   276	 * If mcgstatus indicated that ip/cs on the stack were
   277	 * no good, then "m->cs" will be zero and we will have
   278	 * to assume the worst case (IN_KERNEL) as we actually
   279	 * have no idea what we were executing when the machine
   280	 * check hit.
   281	 * If we do have a good "m->cs" (or a faked one in the
   282	 * case we were executing in VM86 mode) we can use it to
   283	 * distinguish an exception taken in user from from one
   284	 * taken in the kernel.
   285	 */
   286	static noinstr int error_context(struct mce *m, struct pt_regs *regs)
   287	{
   288		int fixup_type;
   289		bool copy_user;
   290
   291		if ((m->cs & 3) == 3)
   292			return IN_USER;
   293
   294		if (!mc_recoverable(m->mcgstatus))
   295			return IN_KERNEL;
   296
   297		/* Allow instrumentation around external facilities usage. */
   298		instrumentation_begin();
   299		fixup_type = ex_get_fixup_type(m->ip);
   300		copy_user = is_copy_from_user(regs);
   301		instrumentation_end();
   302
   303		if (copy_user) {
   304			m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
 > 305			return IN_KERNEL_RECOV
   306		}
   307
   308		switch (fixup_type) {
   309		case EX_TYPE_FAULT_MCE_SAFE:
   310		case EX_TYPE_DEFAULT_MCE_SAFE:
   311			m->kflags |= MCE_IN_KERNEL_RECOV;
   312			return IN_KERNEL_RECOV;
   313
   314		default:
   315			return IN_KERNEL;
   316		}
   317	}
   318
Hi Shuai,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Shuai-Xue/x86-mce-Use-is_copy_from_user-to-determine-copy-from-user-context/20250306-101505
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250306021031.5538-2-xueshuai%40linux.alibaba.com
patch subject: [PATCH v3 1/3] x86/mce: Use is_copy_from_user() to determine copy-from-user context
config: i386-buildonly-randconfig-005-20250307 (https://download.01.org/0day-ci/archive/20250307/202503071115.uNkoVksh-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250307/202503071115.uNkoVksh-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202503071115.uNkoVksh-lkp@intel.com/

All errors (new ones prefixed by >>):

   arch/x86/kernel/cpu/mce/severity.c: In function 'error_context':
>> arch/x86/kernel/cpu/mce/severity.c:305:39: error: expected ';' before '}' token
     305 |                 return IN_KERNEL_RECOV
         |                                       ^
         |                                       ;
     306 |         }
         |         ~


vim +305 arch/x86/kernel/cpu/mce/severity.c

   274
   275	/*
   276	 * If mcgstatus indicated that ip/cs on the stack were
   277	 * no good, then "m->cs" will be zero and we will have
   278	 * to assume the worst case (IN_KERNEL) as we actually
   279	 * have no idea what we were executing when the machine
   280	 * check hit.
   281	 * If we do have a good "m->cs" (or a faked one in the
   282	 * case we were executing in VM86 mode) we can use it to
   283	 * distinguish an exception taken in user from from one
   284	 * taken in the kernel.
   285	 */
   286	static noinstr int error_context(struct mce *m, struct pt_regs *regs)
   287	{
   288		int fixup_type;
   289		bool copy_user;
   290
   291		if ((m->cs & 3) == 3)
   292			return IN_USER;
   293
   294		if (!mc_recoverable(m->mcgstatus))
   295			return IN_KERNEL;
   296
   297		/* Allow instrumentation around external facilities usage. */
   298		instrumentation_begin();
   299		fixup_type = ex_get_fixup_type(m->ip);
   300		copy_user = is_copy_from_user(regs);
   301		instrumentation_end();
   302
   303		if (copy_user) {
   304			m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
 > 305			return IN_KERNEL_RECOV
   306		}
   307
   308		switch (fixup_type) {
   309		case EX_TYPE_FAULT_MCE_SAFE:
   310		case EX_TYPE_DEFAULT_MCE_SAFE:
   311			m->kflags |= MCE_IN_KERNEL_RECOV;
   312			return IN_KERNEL_RECOV;
   313
   314		default:
   315			return IN_KERNEL;
   316		}
   317	}
   318
On 2025/3/7 02:15, Luck, Tony wrote:
>> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
>> index dac4d64dfb2a..cb021058165f 100644
>> --- a/arch/x86/kernel/cpu/mce/severity.c
>> +++ b/arch/x86/kernel/cpu/mce/severity.c
>> @@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
>>  	copy_user = is_copy_from_user(regs);
>>  	instrumentation_end();
>>
>> -	switch (fixup_type) {
>> -	case EX_TYPE_UACCESS:
>> -		if (!copy_user)
>> -			return IN_KERNEL;
>> -		m->kflags |= MCE_IN_KERNEL_COPYIN;
>> -		fallthrough;
>> +	if (copy_user) {
>> +		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
>
> You have " MCE_IN_KERNEL_COPYIN" twice here.

Sorry for the noise. Please ignore this version; I have resent a corrected
version, please see:

https://lore.kernel.org/linux-mm/20250307054404.73877-1-xueshuai@linux.alibaba.com/

Thanks
Shuai
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..cb021058165f 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
 	copy_user = is_copy_from_user(regs);
 	instrumentation_end();

-	switch (fixup_type) {
-	case EX_TYPE_UACCESS:
-		if (!copy_user)
-			return IN_KERNEL;
-		m->kflags |= MCE_IN_KERNEL_COPYIN;
-		fallthrough;
+	if (copy_user) {
+		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_COPYIN;
+		return IN_KERNEL_RECOV
+	}

+	switch (fixup_type) {
 	case EX_TYPE_FAULT_MCE_SAFE:
 	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_RECOV;
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and updated the extable fixup type
for copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. Consequently, the error context for copy-from-user
operations no longer functions as an in-kernel recovery context, resulting
in kernel panics with the message: "Machine check: Data load in
unrecoverable area of kernel."

The critical aspect is identifying whether the error context involves a
read from user memory. We do not care about the ex-type if we know it's a
MOV reading from userspace.

is_copy_from_user() returns true when both of the following conditions are
met:

    - the current instruction is a copy
    - the source address is user memory

So, use is_copy_from_user() directly to determine whether a context is a
copy from user.

Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 arch/x86/kernel/cpu/mce/severity.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
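The behavior change described above can be modeled in plain userspace C. This is only a sketch under stated assumptions: the struct, the fixup enum, and both function names are hypothetical stand-ins, the flag bit positions are assumed, and the earlier m->cs and mc_recoverable() checks are left out. It contrasts the old switch, which reaches the COPYIN path only through the UACCESS fixup type, with the new logic, which relies on the copy_user result alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumed for this sketch; see asm/mce.h for the real ones. */
#define MCE_IN_KERNEL_RECOV  (1ULL << 6)
#define MCE_IN_KERNEL_COPYIN (1ULL << 7)

enum context { IN_KERNEL = 1, IN_USER = 2, IN_KERNEL_RECOV = 3 };

/* Hypothetical subset of extable fixup types; numeric values are arbitrary. */
enum fixup_model {
	FIXUP_OTHER,
	FIXUP_UACCESS,
	FIXUP_EFAULT_REG,
	FIXUP_FAULT_MCE_SAFE,
	FIXUP_DEFAULT_MCE_SAFE,
};

/* Stand-in for struct mce, carrying only the field this sketch touches. */
struct mce_model {
	uint64_t kflags;
};

/* Old logic: copy-from-user is recognized only under the UACCESS fixup type. */
static enum context context_before(struct mce_model *m, enum fixup_model type,
				   bool copy_user)
{
	switch (type) {
	case FIXUP_UACCESS:
		if (!copy_user)
			return IN_KERNEL;
		m->kflags |= MCE_IN_KERNEL_COPYIN;
		/* fall through */
	case FIXUP_FAULT_MCE_SAFE:
	case FIXUP_DEFAULT_MCE_SAFE:
		m->kflags |= MCE_IN_KERNEL_RECOV;
		return IN_KERNEL_RECOV;
	default:
		return IN_KERNEL;
	}
}

/* New logic: a positive copy_user result decides, whatever the fixup type. */
static enum context context_after(struct mce_model *m, enum fixup_model type,
				  bool copy_user)
{
	if (copy_user) {
		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
		return IN_KERNEL_RECOV;
	}

	switch (type) {
	case FIXUP_FAULT_MCE_SAFE:
	case FIXUP_DEFAULT_MCE_SAFE:
		m->kflags |= MCE_IN_KERNEL_RECOV;
		return IN_KERNEL_RECOV;
	default:
		return IN_KERNEL;
	}
}
```

In this model, a copy from user tagged with FIXUP_EFAULT_REG falls into the default case of context_before() and is classified IN_KERNEL, which corresponds to the panic path described in the commit message. context_after() classifies the same input as IN_KERNEL_RECOV and sets both flags.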