[v4,1/3] x86/mce: Use is_copy_from_user() to determine copy-from-user context

Message ID 20250307054404.73877-2-xueshuai@linux.alibaba.com (mailing list archive)
State New
Series mm/hwpoison: Fix regressions in memory failure handling

Commit Message

Shuai Xue March 7, 2025, 5:44 a.m. UTC
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
("x86/futex: Remove .fixup usage") updated the extable fixup type for
copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
longer functions as an in-kernel recovery context. Consequently, the error
context for copy-from-user operations no longer functions as an in-kernel
recovery context, resulting in kernel panics with the message: "Machine
check: Data load in unrecoverable area of kernel."

The critical aspect is identifying whether the error context involves a
read from user memory. We do not care about the ex-type if we know its a
MOV reading from userspace. is_copy_from_user() return true when both of
the following conditions are met:

    - the current instruction is copy
    - source address is user memory

So, use is_copy_from_user() to determin if a context is copy user directly.

Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 arch/x86/kernel/cpu/mce/severity.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
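
The behavioural change described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions: every `model_` name and `MODEL_` constant is hypothetical, the flag values are arbitrary, and the tail of the switch statement (returning the recoverable context for the MCE_SAFE types and the plain in-kernel context otherwise) is assumed because the hunk does not show it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the extable fixup types named in the patch. */
enum model_fixup {
	MODEL_EX_TYPE_NONE,
	MODEL_EX_TYPE_UACCESS,
	MODEL_EX_TYPE_EFAULT_REG,
	MODEL_EX_TYPE_FAULT_MCE_SAFE,
	MODEL_EX_TYPE_DEFAULT_MCE_SAFE,
};

enum model_context { MODEL_IN_KERNEL, MODEL_IN_KERNEL_RECOV };

/* Arbitrary flag bits standing in for MCE_IN_KERNEL_COPYIN/_RECOV. */
#define MODEL_COPYIN 0x1u
#define MODEL_RECOV  0x2u

/* Before the patch: recovery hinges on the fixup type being UACCESS. */
static enum model_context
model_context_old(enum model_fixup type, bool copy_user, unsigned int *kflags)
{
	switch (type) {
	case MODEL_EX_TYPE_UACCESS:
		if (!copy_user)
			return MODEL_IN_KERNEL;
		*kflags |= MODEL_COPYIN;
		/* fall through */
	case MODEL_EX_TYPE_FAULT_MCE_SAFE:
	case MODEL_EX_TYPE_DEFAULT_MCE_SAFE:
		*kflags |= MODEL_RECOV;
		return MODEL_IN_KERNEL_RECOV;	/* assumed tail of the switch */
	default:
		return MODEL_IN_KERNEL;		/* assumed tail of the switch */
	}
}

/* After the patch: the copy-from-user check decides first. */
static enum model_context
model_context_new(enum model_fixup type, bool copy_user, unsigned int *kflags)
{
	if (copy_user) {
		*kflags |= MODEL_COPYIN | MODEL_RECOV;
		return MODEL_IN_KERNEL_RECOV;
	}

	switch (type) {
	case MODEL_EX_TYPE_FAULT_MCE_SAFE:
	case MODEL_EX_TYPE_DEFAULT_MCE_SAFE:
		*kflags |= MODEL_RECOV;
		return MODEL_IN_KERNEL_RECOV;	/* assumed tail of the switch */
	default:
		return MODEL_IN_KERNEL;		/* assumed tail of the switch */
	}
}
```

In this model, a copy from user memory whose fixup type is `MODEL_EX_TYPE_EFAULT_REG` is classified as plain in-kernel by the old function and as recoverable by the new one, which mirrors the regression and the fix.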

Comments

Borislav Petkov March 7, 2025, 8:40 p.m. UTC | #1
On Fri, Mar 07, 2025 at 01:44:02PM +0800, Shuai Xue wrote:
> Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
> extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
> ("x86/futex: Remove .fixup usage") updated the extable fixup type for
> copy-from-user operations, changing it from EX_TYPE_UACCESS to
> EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
> longer functions as an in-kernel recovery context. Consequently, the error
> context for copy-from-user operations no longer functions as an in-kernel
> recovery context, resulting in kernel panics with the message: "Machine
> check: Data load in unrecoverable area of kernel."
> 
> The critical aspect is identifying whether the error context involves a
> read from user memory. We do not care about the ex-type if we know its a

Please use passive voice in your commit message: no "we" or "I", etc,
and describe your changes in imperative mood.

Also, pls read section "2) Describe your changes" in
Documentation/process/submitting-patches.rst for more details.

Also, see section "Changelog" in
Documentation/process/maintainer-tip.rst

Bottom line is: personal pronouns are ambiguous in text, especially with
so many parties/companies/etc developing the kernel so let's avoid them
please.

"ex-type"?

Please write in plain English - not in a programming language.

> MOV reading from userspace. is_copy_from_user() return true when both of
> the following conditions are met:
> 
>     - the current instruction is copy

There is no "copy instruction". You mean the "current operation".

>     - source address is user memory

So you can simply say "when reading user memory". Simple.
> 
> So, use is_copy_from_user() to determin if a context is copy user directly.

Unknown word [determin] in commit message.
Suggestions: ['determine',

Please introduce a spellchecker into your patch creation workflow.

Also, run your commit messages through AI to correct the grammar and
formulations in them.

The more important part which I asked for already is, is is_copy_from_user()
exhaustive in determining that the operation really is a copy from user?

The EX_TYPE_UACCESS things *explicitly* marked such places in the code. Does
is_copy_from_user() guarantee the same, without false positives?

Luck, Tony March 7, 2025, 10:05 p.m. UTC | #2
> The more important part which I asked for already is, is is_copy_from_user()
> exhaustive in determining that the operation really is a copy from user?
>
> The EX_TYPE_UACCESS things *explicitly* marked such places in the code. Does
> is_copy_from_user() guarantee the same, without false positives?

is_copy_from_user() decodes the instruction that took the trap. It looks for
MOV, MOVZ and MOVS instructions to find the source address, and then
checks whether that's user (< TASK_SIZE_MAX) or kernel.

So no false positives.

There could be some false negatives if some other instruction is doing
the "load" operation.
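
The two-part check described above can be modelled in a few lines of user-space C. This is a minimal sketch, not the kernel function: the `model_` names are hypothetical, the instruction is assumed to be already decoded into one of four classes, and `MODEL_TASK_SIZE_MAX` is an illustrative value rather than the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for TASK_SIZE_MAX (assumed value, not the kernel's). */
#define MODEL_TASK_SIZE_MAX UINT64_C(0x00007ffffffff000)

/* Hypothetical instruction classes; the kernel decodes real opcodes instead. */
enum model_insn {
	MODEL_INSN_MOV,
	MODEL_INSN_MOVZ,
	MODEL_INSN_MOVS,
	MODEL_INSN_OTHER,
};

/* The faulting instruction plus the source address it was loading from. */
struct model_fault {
	enum model_insn insn;
	uint64_t src_addr;
};

/*
 * Return true only when the instruction is one of the recognised load
 * forms AND its source address lies below the user/kernel boundary.
 */
static bool model_is_copy_from_user(const struct model_fault *f)
{
	switch (f->insn) {
	case MODEL_INSN_MOV:
	case MODEL_INSN_MOVZ:
	case MODEL_INSN_MOVS:
		break;
	default:
		/* Any other load instruction is a potential false negative. */
		return false;
	}

	return f->src_addr < MODEL_TASK_SIZE_MAX;
}
```

A MOV from a low (user) address passes the check, a MOV from a kernel address does not, and a load performed by any instruction outside the three classes is rejected regardless of its address.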

-Tony

Borislav Petkov March 7, 2025, 10:46 p.m. UTC | #3
On Fri, Mar 07, 2025 at 10:05:12PM +0000, Luck, Tony wrote:
> is_copy_from_user() decodes the instruction that took the trap. It looks for
> MOV, MOVZ and MOVS instructions to find the source address, and then
> checks whether that's user (< TASK_SIZE_MAX) or kernel.

You mean there's absolutely nothing else like, say, some eBPF or some other
hackery we tend to do in the kernel (or we will do in the future) which won't
create the exact same two conditions:

- one of the three insns
- user mem read

and it would cause a recovery action.

Perhaps it still might be the proper thing to do even then but it does sound
fishy and unclean to me.

Nothing beats the explicit markup we had until recently...

Luck, Tony March 7, 2025, 11:11 p.m. UTC | #4
> > is_copy_from_user() decodes the instruction that took the trap. It looks for
> > MOV, MOVZ and MOVS instructions to find the source address, and then
> > checks whether that's user (< TASK_SIZE_MAX) or kernel.
>
> You mean there's absolutely nothing else like, say, some eBPF or some other
> hackery we tend to do in the kernel (or we will do in the future) which won't
> create the exact same two conditions:
>
> - one of the three insns
> - user mem read
>
> and it would cause a recovery action.
>
> Perhaps it still might be the proper thing to do even then but it does sound
> fishy and unclean to me.
>
> Nothing beats the explicit markup we had until recently...

Every "user mem read" needs to have an extable[] recovery entry
attached to the IP of the instruction (to handle the much more common
#PF for page-not-present). All those places already have to deal with
the possibility that the #PF can't be recovered. The #MC handling is
really just a small extension.
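
The idea of a recovery entry keyed by instruction address can be sketched as a sorted table with a binary search. This is a simplified model with hypothetical names: it stores absolute addresses and a plain integer type tag, which is an assumption made for readability and not a description of the kernel's actual entry layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical exception-table entry. */
struct model_extable_entry {
	uint64_t insn;	/* address of an instruction that may fault */
	int type;	/* hypothetical fixup type tag */
};

/*
 * Look up the entry for a faulting instruction pointer.
 * The table must be sorted by ascending insn address.
 * Returns NULL when no recovery entry covers this address.
 */
static const struct model_extable_entry *
model_search_extable(const struct model_extable_entry *tbl, size_t n,
		     uint64_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == ip)
			return &tbl[mid];
		if (tbl[mid].insn < ip)
			lo = mid + 1;
		else
			hi = mid;
	}

	return NULL;
}
```

In this model a fault handler would first look up the faulting address; a NULL result means there is no recovery path at all, while a hit still leaves the handler to decide whether that particular fault can be recovered.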

As for "explicit markup" I don't think it would be better to decorate
every get_user() and copy_from_user() with some "this one can
recover from #MC".

Note also that "what we had recently" was fragile, broke, and resulted
in this regression.

-Tony

Borislav Petkov March 7, 2025, 11:22 p.m. UTC | #5
On Fri, Mar 07, 2025 at 11:11:26PM +0000, Luck, Tony wrote:
> As for "explicit markup" I don't think it would be better to decorate
> every get_user() and copy_from_user() with some "this one can
> recover from #MC" 

I don't mean every function - I mean what we had there with EX_TYPE_UACCESS.
That is explicit and unambiguous. Proving that is_copy_from_user() is always
correct is a lot harder.

> Note also that "what we had recently" was fragile, broke, and resulted
> in this regression.

Because those exception types got renamed? Oh well, that should've been
reverted actually but no one involved realized that MCE is using those.

And I'm not saying this is the only way to solve this. We could do something
like collecting all addresses on which an MCE can be recoverable, for example.
We haven't considered it that important... yet.

Looks like we're going to try this new is_copy_from_user() thing now and then
see where it gets us.

So, after the commit message has been fixed:

Acked-by: Borislav Petkov (AMD) <bp@alien8.de>

I'm presuming, this is going through akpm...

Shuai Xue March 8, 2025, 11:25 a.m. UTC | #6
On 2025/3/8 04:40, Borislav Petkov wrote:
> On Fri, Mar 07, 2025 at 01:44:02PM +0800, Shuai Xue wrote:
>> Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
>> extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
>> ("x86/futex: Remove .fixup usage") updated the extable fixup type for
>> copy-from-user operations, changing it from EX_TYPE_UACCESS to
>> EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
>> longer functions as an in-kernel recovery context. Consequently, the error
>> context for copy-from-user operations no longer functions as an in-kernel
>> recovery context, resulting in kernel panics with the message: "Machine
>> check: Data load in unrecoverable area of kernel."
>>
>> The critical aspect is identifying whether the error context involves a
>> read from user memory. We do not care about the ex-type if we know its a
> 
> Please use passive voice in your commit message: no "we" or "I", etc,
> and describe your changes in imperative mood.
> 
> Also, pls read section "2) Describe your changes" in
> Documentation/process/submitting-patches.rst for more details.
> 
> Also, see section "Changelog" in
> Documentation/process/maintainer-tip.rst
> 
> Bottom line is: personal pronouns are ambiguous in text, especially with
> so many parties/companies/etc developing the kernel so let's avoid them
> please.
> 
> "ex-type"?
> 
> Please write in plain English - not in a programming language.
> 
>> MOV reading from userspace. is_copy_from_user() return true when both of
>> the following conditions are met:
>>
>>      - the current instruction is copy
> 
> There is no "copy instruction". You mean the "current operation".
> 
>>      - source address is user memory
> 
> So you can simply say "when reading user memory". Simple.
>>
>> So, use is_copy_from_user() to determin if a context is copy user directly.
> 
> Unknown word [determin] in commit message.
> Suggestions: ['determine',
> 
> Please introduce a spellchecker into your patch creation workflow.
> 
> Also, run your commit messages through AI to correct the grammar and
> formulations in them.

Certainly, thank you for bringing that to my attention.
I will refine the commit log accordingly.

> 
> The more important part which I asked for already is, is is_copy_from_user()
> exhaustive in determining that the operation really is a copy from user?
> 
> The EX_TYPE_UACCESS things *explicitly* marked such places in the code. Does
> is_copy_from_user() guarantee the same, without false positives?
> 

Following your discussion with Tony, it seems that we have reached a conclusion.

Thanks.
Best Regards,
Shuai

Shuai Xue March 8, 2025, 11:27 a.m. UTC | #7
On 2025/3/8 07:22, Borislav Petkov wrote:
> On Fri, Mar 07, 2025 at 11:11:26PM +0000, Luck, Tony wrote:
>> As for "explicit markup" I don't think it would be better to decorate
>> every get_user() and copy_from_user() with some "this one can
>> recover from #MC"
> 
> I don't mean every function - I mean what we had there with EX_TYPE_UACCESS.
> That is explicit and unambiguous. Proving that is_copy_from_user() is always
> correct is a lot harder.
> 
>> Note also that "what we had recently" was fragile, broke, and resulted
>> in this regression.
> 
> Because those exception types got renamed? Oh well, that should've been
> reverted actually but no one involved realized that MCE is using those.
> 
> And I'm not saying this is the only way to solve this. We could do something
> like collecting all addresses on which an MCE can be recoverable, for example.
> We haven't considered it that important... yet.
> 
> Looks like we're going to try this new is_copy_from_user() thing now and then
> see where it gets us.
> 
> So, after the commit message has been fixed:
> 
> Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
> 
> I'm presuming, this is going through akpm...
> 

Thanks.
Shuai

Patch

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..2235a7477436 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@  static noinstr int error_context(struct mce *m, struct pt_regs *regs)
 	copy_user  = is_copy_from_user(regs);
 	instrumentation_end();
 
-	switch (fixup_type) {
-	case EX_TYPE_UACCESS:
-		if (!copy_user)
-			return IN_KERNEL;
-		m->kflags |= MCE_IN_KERNEL_COPYIN;
-		fallthrough;
+	if (copy_user) {
+		m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+		return IN_KERNEL_RECOV;
+	}
 
+	switch (fixup_type) {
 	case EX_TYPE_FAULT_MCE_SAFE:
 	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_RECOV;