diff mbox series

[RFC,1/3] x86/mm: fix LAM cr3 mask inconsistency during context switch

Message ID 20240307133916.3782068-2-yosryahmed@google.com (mailing list archive)
State New
Headers show
Series x86/mm: LAM fixups and cleanups | expand

Commit Message

Yosry Ahmed March 7, 2024, 1:39 p.m. UTC
In switch_mm_irqs_off(), we read the 'mm->context.lam_cr3_mask' into
'new_lam', which is later passed to load_new_mm_cr3(). However, there is
a call to set_tlbstate_lam_mode() in between which will read
'mm->context.lam_cr3_mask' again and set 'cpu_tlbstate.lam' accordingly.
If we race with another thread updating 'mm->context.lam_cr3_mask', the
value in 'cpu_tlbstate.lam' could end up being different from CR3.

Fix the problem by updating set_tlbstate_lam_mode() to return the LAM
mask that was set to 'cpu_tlbstate.lam', and use that mask in
switch_mm_irqs_off() when writing CR3. Use READ_ONCE to make sure we
read the mask once and use it consistenly.

No practical problems have been observed from this, but it's a recipe
for future problems (e.g. debug warnings in switch_mm_irqs_off() or
__get_current_cr3_fast() could fire). It is unlikely that this can cause
any real issues since only a single-threaded process can update its own
LAM mask, so the race here could happen when context switching between
kthreads using a borrowed MM. In this case, it's unlikely that LAM is
important. If it is, then we would need to ensure all CPUs using the mm
are updated before returning to userspace when LAM is enabled -- but we
don't do that.

While we are at it, remove the misguiding comment that states that
'new_lam' may not match tlbstate_lam_cr3_mask() if a race occurs. This
can happen without a race, a different thread could have just enabled
LAM since the last context switch on the current CPU. Replace it with a
hopefully clearer comment above set_tlbstate_lam_mode().

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 arch/x86/include/asm/tlbflush.h | 11 +++++++----
 arch/x86/mm/tlb.c               | 17 ++++++++---------
 2 files changed, 15 insertions(+), 13 deletions(-)

Comments

Kirill A . Shutemov March 7, 2024, 5:22 p.m. UTC | #1
On Thu, Mar 07, 2024 at 01:39:14PM +0000, Yosry Ahmed wrote:
> In switch_mm_irqs_off(), we read the 'mm->context.lam_cr3_mask' into
> 'new_lam', which is later passed to load_new_mm_cr3(). However, there is
> a call to set_tlbstate_lam_mode() in between which will read
> 'mm->context.lam_cr3_mask' again and set 'cpu_tlbstate.lam' accordingly.
> If we race with another thread updating 'mm->context.lam_cr3_mask', the
> value in 'cpu_tlbstate.lam' could end up being different from CR3.

What other thread? LAM can only be enabled when the process has single
thread. And cannot be disabled. See MM_CONTEXT_LOCK_LAM.

> While we are at it, remove the misguiding comment that states that
> 'new_lam' may not match tlbstate_lam_cr3_mask() if a race occurs.

The comment is indeed misguiding, but for different reason. It is leftover
from the earlier version of LAM patchset.
Dave Hansen March 7, 2024, 5:36 p.m. UTC | #2
I know we all have different rules, but any time you could spend absorbing:

	https://www.kernel.org/doc/html/next/process/maintainer-tip.html

would be appreciated, especially:

> The condensed patch description in the subject line should start with
> a uppercase letter and should be written in imperative tone.


On 3/7/24 05:39, Yosry Ahmed wrote:
> In switch_mm_irqs_off(), we read the 'mm->context.lam_cr3_mask' into
> 'new_lam', which is later passed to load_new_mm_cr3(). However, there is
> a call to set_tlbstate_lam_mode() in between which will read
> 'mm->context.lam_cr3_mask' again and set 'cpu_tlbstate.lam' accordingly.
> If we race with another thread updating 'mm->context.lam_cr3_mask', the
> value in 'cpu_tlbstate.lam' could end up being different from CR3.

Your description is fine (modulo the we's).  But I slightly reworded it
to make it more plainly readable:

LAM can only be enabled when a process is single-threaded.  But _kernel_
threads can temporarily use a single-threaded process's mm.  That means
that a context-switching kernel thread can race and observe the mm's LAM
metadata (mm->context.lam_cr3_mask) change.

The context switch code does two logical things with that metadata:
populate CR3 and populate 'cpu_tlbstate.lam'.  If it hits this race,
'cpu_tlbstate.lam' and CR3 can end up out of sync.

This de-synchronization is currently harmless.  But it is confusing and
might lead to warnings or real bugs.

--

> Fix the problem by updating set_tlbstate_lam_mode() to return the LAM
> mask that was set to 'cpu_tlbstate.lam', and use that mask in
> switch_mm_irqs_off() when writing CR3. Use READ_ONCE to make sure we
> read the mask once and use it consistenly.

Spell checking is also appreciated.

...
> -static inline void set_tlbstate_lam_mode(struct mm_struct *mm)
> +static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
>  {
> -	this_cpu_write(cpu_tlbstate.lam,
> -		       mm->context.lam_cr3_mask >> X86_CR3_LAM_U57_BIT);
> +	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);
> +
> +	this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
>  	this_cpu_write(tlbstate_untag_mask, mm->context.untag_mask);
> +	return lam;
>  }

The comments about races need to be _here_ so that the purpose of the
READ_ONCE() is clear.

It would also be nice to call out the rule that this can only
meaningfully be called once per context switch.

> @@ -633,7 +628,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
>  		barrier();
>  	}
>  
> -	set_tlbstate_lam_mode(next);
> +	/*
> +	 * Even if we are not actually switching mm's, another thread could have
> +	 * updated mm->context.lam_cr3_mask. Make sure tlbstate_lam_cr3_mask()
> +	 * and the loaded CR3 use the up-to-date mask.
> +	 */

I kinda dislike how the comment talks about the details of what
set_tlbstate_lam_mode() does.  It would be much better to put the meat
of this comment at the set_tlbstate_lam_mode() definition.

> +	new_lam = set_tlbstate_lam_mode(next);
>  	if (need_flush) {
>  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
>  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);

This is less a complaint about your change and more of the existing
code, but I wish it was more obvious that set_tlbstate_lam_mode() is
logically shuffling data (once) from 'next' into the tlbstate.

The naming makes it sound like it is modifying the tlbstate of 'next'.

But I don't have any particularly brilliant ideas to fix it either.
Maybe just:

	/* new_lam is effectively cpu_tlbstate.lam */

> @@ -705,7 +705,6 @@ void initialize_tlbstate_and_flush(void)
>  
>  	/* LAM expected to be disabled */
>  	WARN_ON(cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57));
> -	WARN_ON(mm_lam_cr3_mask(mm));
>  
>  	/*
>  	 * Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
> @@ -724,7 +723,7 @@ void initialize_tlbstate_and_flush(void)
>  	this_cpu_write(cpu_tlbstate.next_asid, 1);
>  	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
>  	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
> -	set_tlbstate_lam_mode(mm);
> +	WARN_ON(set_tlbstate_lam_mode(mm));

The "set_" naming bugs me in both of the sites that get modified here.
I'd be with a new name that fits better, if we can think of one.
Sean Christopherson March 7, 2024, 6:49 p.m. UTC | #3
On Thu, Mar 07, 2024, Dave Hansen wrote:
> I know we all have different rules, but any time you could spend absorbing:
> 
> 	https://www.kernel.org/doc/html/next/process/maintainer-tip.html
> 
> would be appreciated, especially:
> 
> > The condensed patch description in the subject line should start with
> > a uppercase letter and should be written in imperative tone.

And while you're at it,

  https://www.kernel.org/doc/html/next/process/maintainer-kvm-x86.html

which my crystal ball says will be helpful in your future. ;-)
Yosry Ahmed March 7, 2024, 8:31 p.m. UTC | #4
On Thu, Mar 07, 2024 at 07:22:36PM +0200, Kirill A. Shutemov wrote:
> On Thu, Mar 07, 2024 at 01:39:14PM +0000, Yosry Ahmed wrote:
> > In switch_mm_irqs_off(), we read the 'mm->context.lam_cr3_mask' into
> > 'new_lam', which is later passed to load_new_mm_cr3(). However, there is
> > a call to set_tlbstate_lam_mode() in between which will read
> > 'mm->context.lam_cr3_mask' again and set 'cpu_tlbstate.lam' accordingly.
> > If we race with another thread updating 'mm->context.lam_cr3_mask', the
> > value in 'cpu_tlbstate.lam' could end up being different from CR3.
> 
> What other thread? LAM can only be enabled when the process has single
> thread. And cannot be disabled. See MM_CONTEXT_LOCK_LAM.

Right, but a kthread may run with that single-threaded process's mm
IIUC. I think this can happen via kthread_use_mm() or if we context
switch directly from the user process to the kthread (context_switch()
doesn't seem to update the mm in this case).

> 
> > While we are at it, remove the misguiding comment that states that
> > 'new_lam' may not match tlbstate_lam_cr3_mask() if a race occurs.
> 
> The comment is indeed misguiding, but for different reason. It is leftover
> from the earlier version of LAM patchset.
Yosry Ahmed March 7, 2024, 8:42 p.m. UTC | #5
On Thu, Mar 07, 2024 at 09:36:58AM -0800, Dave Hansen wrote:
> I know we all have different rules, but any time you could spend absorbing:
> 
> 	https://www.kernel.org/doc/html/next/process/maintainer-tip.html

Thanks for the quick review and tips.

I didn't know this existed, I will take a look before respinning.

> 
> would be appreciated, especially:
> 
> > The condensed patch description in the subject line should start with
> > a uppercase letter and should be written in imperative tone.
> 
> 
> On 3/7/24 05:39, Yosry Ahmed wrote:
> > In switch_mm_irqs_off(), we read the 'mm->context.lam_cr3_mask' into
> > 'new_lam', which is later passed to load_new_mm_cr3(). However, there is
> > a call to set_tlbstate_lam_mode() in between which will read
> > 'mm->context.lam_cr3_mask' again and set 'cpu_tlbstate.lam' accordingly.
> > If we race with another thread updating 'mm->context.lam_cr3_mask', the
> > value in 'cpu_tlbstate.lam' could end up being different from CR3.
> 
> Your description is fine (modulo the we's).  But I slightly reworded it
> to make it more plainly readable:
> 
> LAM can only be enabled when a process is single-threaded.  But _kernel_
> threads can temporarily use a single-threaded process's mm.  That means
> that a context-switching kernel thread can race and observe the mm's LAM
> metadata (mm->context.lam_cr3_mask) change.
> 
> The context switch code does two logical things with that metadata:
> populate CR3 and populate 'cpu_tlbstate.lam'.  If it hits this race,
> 'cpu_tlbstate.lam' and CR3 can end up out of sync.
> 
> This de-synchronization is currently harmless.  But it is confusing and
> might lead to warnings or real bugs.

Thanks a lot! I will adopt your version moving forward :)

> 
> --
> 
> > Fix the problem by updating set_tlbstate_lam_mode() to return the LAM
> > mask that was set to 'cpu_tlbstate.lam', and use that mask in
> > switch_mm_irqs_off() when writing CR3. Use READ_ONCE to make sure we
> > read the mask once and use it consistenly.
> 
> Spell checking is also appreciated.

I did run checkpatch. Did it miss something?

> 
> ...
> > -static inline void set_tlbstate_lam_mode(struct mm_struct *mm)
> > +static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
> >  {
> > -	this_cpu_write(cpu_tlbstate.lam,
> > -		       mm->context.lam_cr3_mask >> X86_CR3_LAM_U57_BIT);
> > +	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);
> > +
> > +	this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
> >  	this_cpu_write(tlbstate_untag_mask, mm->context.untag_mask);
> > +	return lam;
> >  }
> 
> The comments about races need to be _here_ so that the purpose of the
> READ_ONCE() is clear.
> 
> It would also be nice to call out the rule that this can only
> meaningfully be called once per context switch.

I wanted the comments in switch_mm_irqs_off() where the races actually
matter, but I guess I can make the comment more generic and specify that
the return value is used to write CR3 so we READ_ONCE keeps CR3 and
tlbstate.lam consistent.

> 
> > @@ -633,7 +628,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> >  		barrier();
> >  	}
> >  
> > -	set_tlbstate_lam_mode(next);
> > +	/*
> > +	 * Even if we are not actually switching mm's, another thread could have
> > +	 * updated mm->context.lam_cr3_mask. Make sure tlbstate_lam_cr3_mask()
> > +	 * and the loaded CR3 use the up-to-date mask.
> > +	 */
> 
> I kinda dislike how the comment talks about the details of what
> set_tlbstate_lam_mode() does.  It would be much better to put the meat
> of this comment at the set_tlbstate_lam_mode() definition.

Agreed. I will move most comments to set_tlbstate_lam_mode().

> 
> > +	new_lam = set_tlbstate_lam_mode(next);
> >  	if (need_flush) {
> >  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
> >  		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
> 
> This is less a complaint about your change and more of the existing
> code, but I wish it was more obvious that set_tlbstate_lam_mode() is
> logically shuffling data (once) from 'next' into the tlbstate.
> 
> The naming makes it sound like it is modifying the tlbstate of 'next'.

We can update the function name to make it more verbose, maybe something
like update_cpu_tlbstate_lam()? We can also try to put "return"
somewhere in the name to imply that it returns the LAM mask it sets, but
I can't make that look pretty.

> 
> But I don't have any particularly brilliant ideas to fix it either.
> Maybe just:
> 
> 	/* new_lam is effectively cpu_tlbstate.lam */
> 
> > @@ -705,7 +705,6 @@ void initialize_tlbstate_and_flush(void)
> >  
> >  	/* LAM expected to be disabled */
> >  	WARN_ON(cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57));
> > -	WARN_ON(mm_lam_cr3_mask(mm));
> >  
> >  	/*
> >  	 * Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
> > @@ -724,7 +723,7 @@ void initialize_tlbstate_and_flush(void)
> >  	this_cpu_write(cpu_tlbstate.next_asid, 1);
> >  	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
> >  	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
> > -	set_tlbstate_lam_mode(mm);
> > +	WARN_ON(set_tlbstate_lam_mode(mm));
> 
> The "set_" naming bugs me in both of the sites that get modified here.
> I'd be with a new name that fits better, if we can think of one.

Is it because it's not clear we are updating cpu_tlbstate (in which case
I think update_cpu_tlbstate_lam() is an improvement), or is it because
the function returns a value now? If the latter we can put "return" in
the name somewhere, or keep the function void and pass in an output
parameter.
Yosry Ahmed March 7, 2024, 8:44 p.m. UTC | #6
On Thu, Mar 07, 2024 at 10:49:51AM -0800, Sean Christopherson wrote:
> On Thu, Mar 07, 2024, Dave Hansen wrote:
> > I know we all have different rules, but any time you could spend absorbing:
> > 
> > 	https://www.kernel.org/doc/html/next/process/maintainer-tip.html
> > 
> > would be appreciated, especially:
> > 
> > > The condensed patch description in the subject line should start with
> > > a uppercase letter and should be written in imperative tone.
> 
> And while you're at it,
> 
>   https://www.kernel.org/doc/html/next/process/maintainer-kvm-x86.html
> 
> which my crystal ball says will be helpful in your future. ;-)

I am sincerely hoping that there are no contradictions between different
maintainer docs, otherwise my brain will break really fast.
Sean Christopherson March 7, 2024, 10:12 p.m. UTC | #7
On Thu, Mar 07, 2024, Yosry Ahmed wrote:
> On Thu, Mar 07, 2024 at 10:49:51AM -0800, Sean Christopherson wrote:
> > On Thu, Mar 07, 2024, Dave Hansen wrote:
> > > I know we all have different rules, but any time you could spend absorbing:
> > > 
> > > 	https://www.kernel.org/doc/html/next/process/maintainer-tip.html
> > > 
> > > would be appreciated, especially:
> > > 
> > > > The condensed patch description in the subject line should start with
> > > > a uppercase letter and should be written in imperative tone.
> > 
> > And while you're at it,
> > 
> >   https://www.kernel.org/doc/html/next/process/maintainer-kvm-x86.html
> > 
> > which my crystal ball says will be helpful in your future. ;-)
> 
> I am sincerely hoping that there are no contradictions between different
> maintainer docs, otherwise my brain will break really fast.

Heh, as long as you don't contribute to net/, you'll be fine.  kvm-x86 mostly just
follows tip, with a few minor exceptions.  But the exceptions are explicitly
documented in kvm-x86, and I am intentionally forgiving/lax on them, because yeah,
getting pulled in multiple directions is no fun.
Yosry Ahmed March 7, 2024, 11:21 p.m. UTC | #8
> > The "set_" naming bugs me in both of the sites that get modified here.
> > I'd be with a new name that fits better, if we can think of one.
>
> Is it because it's not clear we are updating cpu_tlbstate (in which case
> I think update_cpu_tlbstate_lam() is an improvement), or is it because
> the function returns a value now? If the latter we can put "return" in
> the name somewhere, or keep the function void and pass in an output
> parameter.

Or we can avoid returning a value from the helper and avoid passing an
mm. The callers would be more verbose, they'll have to call
mm_lam_cr3_mask() and mm_untag_mask() and pass the results into the
helper (set_tlbstate_lam_mode() or update_cpu_tlbstate_lam()). Another
advantage of this is that we can move the READ_ONCE to
switch_mm_irqs_off() and keep the comment here.

WDYT?
Dave Hansen March 7, 2024, 11:32 p.m. UTC | #9
On 3/7/24 15:21, Yosry Ahmed wrote:
>>> The "set_" naming bugs me in both of the sites that get modified here.
>>> I'd be with a new name that fits better, if we can think of one.
>> Is it because it's not clear we are updating cpu_tlbstate (in which case
>> I think update_cpu_tlbstate_lam() is an improvement), or is it because
>> the function returns a value now? 

That's part of it.

>> If the latter we can put "return" in the name somewhere, or keep
>> the function void and pass in an output parameter.
No, adding a "_return" won't make it better.

> Or we can avoid returning a value from the helper and avoid passing an
> mm. The callers would be more verbose, they'll have to call
> mm_lam_cr3_mask() and mm_untag_mask() and pass the results into the
> helper (set_tlbstate_lam_mode() or update_cpu_tlbstate_lam()). Another
> advantage of this is that we can move the READ_ONCE to
> switch_mm_irqs_off() and keep the comment here.


One thing I don't like about set_tlbstate_lam_mode() is that it's not
obvious that it's writing to "cpu_tlbstate" and its right smack in the
middle of a bunch of other writers to the same thing.

But I'm also not sure that open-coding it at its three call sites makes
things better overall.

I honestly don't have any brilliant suggestions.
Yosry Ahmed March 7, 2024, 11:37 p.m. UTC | #10
> > Or we can avoid returning a value from the helper and avoid passing an
> > mm. The callers would be more verbose, they'll have to call
> > mm_lam_cr3_mask() and mm_untag_mask() and pass the results into the
> > helper (set_tlbstate_lam_mode() or update_cpu_tlbstate_lam()). Another
> > advantage of this is that we can move the READ_ONCE to
> > switch_mm_irqs_off() and keep the comment here.
>
>
> One thing I don't like about set_tlbstate_lam_mode() is that it's not
> obvious that it's writing to "cpu_tlbstate" and its right smack in the
> middle of a bunch of other writers to the same thing.
>
> But I'm also not sure that open-coding it at its three call sites makes
> things better overall.
>
> I honestly don't have any brilliant suggestions.

Let me ponder this a little bit and try to come up with something, I
think a max of renaming and open-coding could make an improvement.

Thanks!
diff mbox series

Patch

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 25726893c6f4d..a4ddb20f84fe7 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -399,11 +399,13 @@  static inline  u64 tlbstate_lam_cr3_mask(void)
 	return lam << X86_CR3_LAM_U57_BIT;
 }
 
-static inline void set_tlbstate_lam_mode(struct mm_struct *mm)
+static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
 {
-	this_cpu_write(cpu_tlbstate.lam,
-		       mm->context.lam_cr3_mask >> X86_CR3_LAM_U57_BIT);
+	unsigned long lam = READ_ONCE(mm->context.lam_cr3_mask);
+
+	this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
 	this_cpu_write(tlbstate_untag_mask, mm->context.untag_mask);
+	return lam;
 }
 
 #else
@@ -413,8 +415,9 @@  static inline u64 tlbstate_lam_cr3_mask(void)
 	return 0;
 }
 
-static inline void set_tlbstate_lam_mode(struct mm_struct *mm)
+static inline unsigned long set_tlbstate_lam_mode(struct mm_struct *mm)
 {
+	return 0;
 }
 #endif
 #endif /* !MODULE */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 51f9f56941058..2975d3f89a5de 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -503,9 +503,9 @@  void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 {
 	struct mm_struct *prev = this_cpu_read(cpu_tlbstate.loaded_mm);
 	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
-	unsigned long new_lam = mm_lam_cr3_mask(next);
 	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
 	unsigned cpu = smp_processor_id();
+	unsigned long new_lam;
 	u64 next_tlb_gen;
 	bool need_flush;
 	u16 new_asid;
@@ -561,11 +561,6 @@  void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
 			   next->context.ctx_id);
 
-		/*
-		 * If this races with another thread that enables lam, 'new_lam'
-		 * might not match tlbstate_lam_cr3_mask().
-		 */
-
 		/*
 		 * Even in lazy TLB mode, the CPU should stay set in the
 		 * mm_cpumask. The TLB shootdown code can figure out from
@@ -633,7 +628,12 @@  void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		barrier();
 	}
 
-	set_tlbstate_lam_mode(next);
+	/*
+	 * Even if we are not actually switching mm's, another thread could have
+	 * updated mm->context.lam_cr3_mask. Make sure tlbstate_lam_cr3_mask()
+	 * and the loaded CR3 use the up-to-date mask.
+	 */
+	new_lam = set_tlbstate_lam_mode(next);
 	if (need_flush) {
 		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
 		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
@@ -705,7 +705,6 @@  void initialize_tlbstate_and_flush(void)
 
 	/* LAM expected to be disabled */
 	WARN_ON(cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57));
-	WARN_ON(mm_lam_cr3_mask(mm));
 
 	/*
 	 * Assert that CR4.PCIDE is set if needed.  (CR4.PCIDE initialization
@@ -724,7 +723,7 @@  void initialize_tlbstate_and_flush(void)
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
 	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen);
-	set_tlbstate_lam_mode(mm);
+	WARN_ON(set_tlbstate_lam_mode(mm));
 
 	for (i = 1; i < TLB_NR_DYN_ASIDS; i++)
 		this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);