
[RFCv2,07/10] x86/mm: Handle tagged memory accesses from kernel threads

Message ID 20220511022751.65540-9-kirill.shutemov@linux.intel.com (mailing list archive)
State: New
Series: Linear Address Masking enabling

Commit Message

kirill.shutemov@linux.intel.com May 11, 2022, 2:27 a.m. UTC
When a kernel thread performs memory access on behalf of a process (like
in async I/O, io_uring, etc.) it has to respect the tagging setup of the
process, as user addresses can include tags.

Normally, LAM setup is per-thread and recorded in thread features, but
for this use case the kernel also tracks LAM setup per-mm: mm->context.lam
records the LAM mode that allows the most tag bits among the threads of
the mm.

This info is used by switch_mm_irqs_off() to construct CR3 if the task is
a kernel thread. Thread features of the kernel thread get updated
according to mm->context.lam, which allows untagged_addr() to work
correctly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/mmu.h |  1 +
 arch/x86/mm/tlb.c          | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)
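
As a quick illustration of the per-mm tracking described above (not part
of the patch: mm_enable_lam() is a hypothetical helper and the numeric
ordering of the LAM_* constants is an assumption), the per-mm value would
only ever be raised to the most permissive mode any thread enables:

     /*
      * Hypothetical helper: keep mm->context.lam at the most permissive
      * LAM mode enabled by any thread of the mm. Assumes LAM_NONE <
      * LAM_U57 < LAM_U48 orders the modes by number of allowed tag bits.
      */
     static void mm_enable_lam(struct mm_struct *mm, u8 lam)
     {
     	mutex_lock(&mm->context.lock);
     	if (lam > mm->context.lam)
     		mm->context.lam = lam;
     	mutex_unlock(&mm->context.lock);
     }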

Comments

Peter Zijlstra May 11, 2022, 7:23 a.m. UTC | #1
On Wed, May 11, 2022 at 05:27:48AM +0300, Kirill A. Shutemov wrote:
> When a kernel thread performs memory access on behalf of a process (like
> in async I/O, io_uring, etc.) it has to respect the tagging setup of the
> process, as user addresses can include tags.
> 
> Normally, LAM setup is per-thread and recorded in thread features, but
> for this use case the kernel also tracks LAM setup per-mm: mm->context.lam
> records the LAM mode that allows the most tag bits among the threads of
> the mm.

Then why does it *ever* make sense to track it per thread? It's not like
it makes heaps of sense to allow one thread in a process to use LAM but
not the others.

Thomas Gleixner May 12, 2022, 1:30 p.m. UTC | #2
On Wed, May 11 2022 at 05:27, Kirill A. Shutemov wrote:
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index f9fe71d1f42c..b320556e1c22 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -185,6 +185,34 @@ static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm)
>  	if (!tsk)
>  		return LAM_NONE;
>  
> +	if (tsk->flags & PF_KTHREAD) {
> +		/*
> +		 * For a kernel thread, use the most permissive LAM
> +		 * used by the mm. This is required to handle kernel thread
> +		 * memory accesses on behalf of a process.
> +		 *
> +		 * Adjust thread flags accordingly, so untagged_addr() would
> +		 * work correctly.
> +		 */
> +
> +		tsk->thread.features &= ~(X86_THREAD_LAM_U48 |
> +					  X86_THREAD_LAM_U57);
> +
> +		switch (mm->context.lam) {
> +		case LAM_NONE:
> +			return LAM_NONE;
> +		case LAM_U57:
> +			tsk->thread.features |= X86_THREAD_LAM_U57;
> +			return LAM_U57;
> +		case LAM_U48:
> +			tsk->thread.features |= X86_THREAD_LAM_U48;
> +			return LAM_U48;

Pretending that LAM is configurable per thread and then having a magic
override in the per process mm when accessing that process' memory from
a kernel thread is inconsistent, a horrible hack and a recipe for
hard to diagnose problems.

LAM has to be enabled by the process _before_ creating threads and then
stay enabled until the whole thing dies. That's the only sensible use
case.

I understand that tsk->thread.features is convenient for the untagging
mechanism, but the whole setup should be:

prctl(ENABLE, which)
     if (can_enable_lam(which)) {
        mm->lam.cr3_mask = CR3_LAM(which);
        mm->lam.untag_mask = UNTAG_LAM(which);
        current->thread.lam_untag_mask = mm->lam.untag_mask;
     }

and

can_enable_lam(which)
    if (current_is_multithreaded())
    	return -ETOOLATE;
    if (current->mm->lam.cr3_mask)
    	return -EBUSY;
    ....
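
Fleshed out slightly (a sketch only: CR3_LAM()/UNTAG_LAM() are
placeholder macros from the mail, and can_enable_lam() is taken to
return 0 on success so its error codes propagate):

     /* Sketch of the prctl(ENABLE) path outlined above. */
     static int lam_enable(unsigned long which)
     {
     	struct mm_struct *mm = current->mm;
     	int ret = can_enable_lam(which);

     	if (ret)
     		return ret;

     	mm->lam.cr3_mask = CR3_LAM(which);
     	mm->lam.untag_mask = UNTAG_LAM(which);
     	current->thread.lam_untag_mask = mm->lam.untag_mask;
     	return 0;
     }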
    	

Now vs. kernel threads. Doing this like the above is just the wrong
place. If a kernel thread accesses user space memory of a process then
it has to invoke kthread_use_mm(), right? So the obvious point to cache
that setting is in kthread_use_mm() and kthread_unuse_mm() clears it:

kthread_use_mm()
     current->thread.lam_untag_mask = mm->lam.untag_mask;

kthread_unuse_mm()
     current->thread.lam_untag_mask = 0;
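
An async worker then picks the mask up implicitly (illustrative only:
io_worker() and its arguments are made up, while kthread_use_mm() and
kthread_unuse_mm() are the real interfaces):

     /* Illustrative kthread pattern; the untag mask follows the mm. */
     static void io_worker(struct mm_struct *mm, u64 __user *uaddr)
     {
     	u64 val;

     	kthread_use_mm(mm);	/* would also cache mm->lam.untag_mask */
     	if (get_user(val, uaddr))
     		pr_debug("uaccess failed\n");
     	kthread_unuse_mm(mm);	/* and reset the mask to 0 */
     }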

This makes all of the mechanics trivial because CR3 switch then simply
does:

     new_cr3 |= mm->lam.cr3_mask;

No conditionals and evaluations, nothing. Just straightforward and
comprehensible code.
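
The untagging side stays just as small (again a sketch, assuming an
empty mask means LAM is off, matching the kthread_unuse_mm() reset
above):

     static inline unsigned long untagged_addr(unsigned long addr)
     {
     	unsigned long mask = current->thread.lam_untag_mask;

     	/* A zero mask means no LAM: nothing to strip. */
     	return mask ? addr & mask : addr;
     }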

Thanks,

        tglx

Patch

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5d7494631ea9..52f3749f14e8 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -40,6 +40,7 @@  typedef struct {
 
 #ifdef CONFIG_X86_64
 	unsigned short flags;
+	u8 lam;
 #endif
 
 	struct mutex lock;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f9fe71d1f42c..b320556e1c22 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -185,6 +185,34 @@  static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm)
 	if (!tsk)
 		return LAM_NONE;
 
+	if (tsk->flags & PF_KTHREAD) {
+		/*
+		 * For a kernel thread, use the most permissive LAM
+		 * used by the mm. This is required to handle kernel thread
+		 * memory accesses on behalf of a process.
+		 *
+		 * Adjust thread flags accordingly, so untagged_addr() would
+		 * work correctly.
+		 */
+
+		tsk->thread.features &= ~(X86_THREAD_LAM_U48 |
+					  X86_THREAD_LAM_U57);
+
+		switch (mm->context.lam) {
+		case LAM_NONE:
+			return LAM_NONE;
+		case LAM_U57:
+			tsk->thread.features |= X86_THREAD_LAM_U57;
+			return LAM_U57;
+		case LAM_U48:
+			tsk->thread.features |= X86_THREAD_LAM_U48;
+			return LAM_U48;
+		default:
+			WARN_ON_ONCE(1);
+			return LAM_NONE;
+		}
+	}
+
 	if (tsk->thread.features & X86_THREAD_LAM_U57)
 		return LAM_U57;
 	if (tsk->thread.features & X86_THREAD_LAM_U48)