diff mbox series

x86/traps: Re-enable IRQs after reading cr2 in the #PF handler

Message ID 20240911145823.12066-1-alejandro.vallejo@cloud.com (mailing list archive)
State New

Commit Message

Alejandro Vallejo Sept. 11, 2024, 2:58 p.m. UTC
Move sti so that IRQs are re-enabled directly after the cr2 read in the #PF
handler, and immediately after the #PF dispatch for all other exceptions.

While in the area, remove the redundant q suffix from a movq in entry.S.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
I don't think this is a bug so much as an accident waiting to happen. Even if
there are currently no cases in which the IRQ handler may page fault, that
might change in the future.

Note: I haven't tested it extensively beyond running it on GitLab.

pipeline:
    https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525

---
 xen/arch/x86/traps.c        |  2 ++
 xen/arch/x86/x86_64/entry.S | 11 +++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

Comments

Frediano Ziglio Sept. 11, 2024, 7:58 p.m. UTC | #1
On Wed, Sep 11, 2024 at 3:58 PM Alejandro Vallejo
<alejandro.vallejo@cloud.com> wrote:
>
> Moves sti directly after the cr2 read and immediately after the #PF
> handler.
>
> While in the area, remove redundant q suffix to a movq in entry.S
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> I don't think this is a bug as much as an accident about to happen. Even if
> there's no cases at the moment in which the IRQ handler may page fault, that
> might change in the future.
>
> Note: I haven't tested it extensively beyond running it on GitLab.
>
> pipeline:
>     https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525
>
> ---
>  xen/arch/x86/traps.c        |  2 ++
>  xen/arch/x86/x86_64/entry.S | 11 +++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 708136f625..1c04c03d9f 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1600,6 +1600,8 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs)
>
>      addr = read_cr2();
>
> +    local_irq_enable();
> +
>      /* fixup_page_fault() might change regs->error_code, so cache it here. */
>      error_code = regs->error_code;
>
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
>  #elif !defined(CONFIG_PV)
>          ASSERT_CONTEXT_IS_XEN
>  #endif /* CONFIG_PV */
> -        sti
> -1:      movq  %rsp,%rdi
> +1:      mov   %rsp,%rdi
>          movzbl UREGS_entry_vector(%rsp),%eax
>  #ifdef CONFIG_PERF_COUNTERS
>          lea   per_cpu__perfcounters(%rip), %rcx
> @@ -866,7 +865,15 @@ handle_exception_saved:
>          jmp   .L_exn_dispatch_done;    \
>  .L_ ## vec ## _done:
>
> +        /*
> +         * IRQs kept off to derisk being hit by a nested interrupt before
> +         * reading %cr2. Otherwise a page fault in the nested interrupt hadnler


Minor, typo: hadnler -> handler

>
> +         * would corrupt %cr2.
> +         */
>          DISPATCH(X86_EXC_PF, do_page_fault)
> +
> +        sti
> +
>          DISPATCH(X86_EXC_GP, do_general_protection)
>          DISPATCH(X86_EXC_UD, do_invalid_op)
>          DISPATCH(X86_EXC_NM, do_device_not_available)
>

Frediano
Roger Pau Monné Sept. 12, 2024, 9:41 a.m. UTC | #2
On Wed, Sep 11, 2024 at 03:58:23PM +0100, Alejandro Vallejo wrote:
> Moves sti directly after the cr2 read and immediately after the #PF
> handler.
> 
> While in the area, remove redundant q suffix to a movq in entry.S
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> I don't think this is a bug as much as an accident about to happen. Even if
> there's no cases at the moment in which the IRQ handler may page fault, that
> might change in the future.
> 
> Note: I haven't tested it extensively beyond running it on GitLab.
> 
> pipeline:
>     https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525
> 
> ---
>  xen/arch/x86/traps.c        |  2 ++
>  xen/arch/x86/x86_64/entry.S | 11 +++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 708136f625..1c04c03d9f 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1600,6 +1600,8 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs)
>  
>      addr = read_cr2();
>  
> +    local_irq_enable();

I would maybe add an ASSERT(!local_irq_is_enabled()); at the top of the
function, just to make sure the context is as expected.
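
A minimal sketch of how that assertion might sit relative to the cr2 read,
assuming the handler is always entered with IRQs off as in the patch as posted.
Every sketch_* name below is a hypothetical stand-in so the ordering can be
exercised outside Xen; none of them is Xen's real read_cr2(),
local_irq_enable() or ASSERT().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for CPU state; not Xen's real primitives. */
static bool sketch_irqs_enabled;                      /* models EFLAGS.IF */
static uint64_t sketch_cr2 = UINT64_C(0xdeadb000);    /* models %cr2 */

static bool sketch_local_irq_is_enabled(void) { return sketch_irqs_enabled; }
static void sketch_local_irq_enable(void)     { sketch_irqs_enabled = true; }
static uint64_t sketch_read_cr2(void)         { return sketch_cr2; }

/* Returns the faulting address, captured while IRQs were still off. */
static uint64_t sketch_do_page_fault(void)
{
    uint64_t addr;

    /* The entry stub is expected to have left IRQs disabled. */
    assert(!sketch_local_irq_is_enabled());

    /* Cache %cr2 before anything can nest and overwrite it. */
    addr = sketch_read_cr2();

    /* Unconditional enable, mirroring the patch as posted. */
    sketch_local_irq_enable();

    return addr;
}
```

The point of the ordering is that the assertion fires before the cr2 read, so
a caller that arrives with IRQs already enabled is caught at the earliest
opportunity rather than after the window has been open.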

> +
>      /* fixup_page_fault() might change regs->error_code, so cache it here. */
>      error_code = regs->error_code;
>  
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
>  #elif !defined(CONFIG_PV)
>          ASSERT_CONTEXT_IS_XEN
>  #endif /* CONFIG_PV */
> -        sti
> -1:      movq  %rsp,%rdi
> +1:      mov   %rsp,%rdi

Since you are modifying this already - we usually add a space between
the comma and the next operand.

Thanks, Roger.
Andrew Cooper Sept. 12, 2024, 9:49 a.m. UTC | #3
On 11/09/2024 3:58 pm, Alejandro Vallejo wrote:
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
>  #elif !defined(CONFIG_PV)
>          ASSERT_CONTEXT_IS_XEN
>  #endif /* CONFIG_PV */
> -        sti
> -1:      movq  %rsp,%rdi
> +1:      mov   %rsp,%rdi
>          movzbl UREGS_entry_vector(%rsp),%eax
>  #ifdef CONFIG_PERF_COUNTERS
>          lea   per_cpu__perfcounters(%rip), %rcx

I'm afraid this isn't correct.  The STI is only on one of two paths to
the dispatch logic.

Right now, you're re-enabling interrupts even if #PF hits an irqs-off
region in Xen.

You must not enable IRQs if going via the exception_with_ints_disabled
path, which is the user of that 1: label immediately after STI.

~Andrew
Alejandro Vallejo Sept. 12, 2024, 10:07 a.m. UTC | #4
On Thu Sep 12, 2024 at 10:49 AM BST, Andrew Cooper wrote:
> On 11/09/2024 3:58 pm, Alejandro Vallejo wrote:
> > diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> > index b8482de8ee..ef803f6288 100644
> > --- a/xen/arch/x86/x86_64/entry.S
> > +++ b/xen/arch/x86/x86_64/entry.S
> > @@ -844,8 +844,7 @@ handle_exception_saved:
> >  #elif !defined(CONFIG_PV)
> >          ASSERT_CONTEXT_IS_XEN
> >  #endif /* CONFIG_PV */
> > -        sti
> > -1:      movq  %rsp,%rdi
> > +1:      mov   %rsp,%rdi
> >          movzbl UREGS_entry_vector(%rsp),%eax
> >  #ifdef CONFIG_PERF_COUNTERS
> >          lea   per_cpu__perfcounters(%rip), %rcx
>
> I'm afraid this isn't correct.  The STI is only on one of two paths to
> the dispatch logic.
>
> Right now, you're re-enabling interrupts even if #PF hits an irqs-off
> region in Xen.
>
> You must not enable IRQs if going via the exception_with_ints_disabled
> path, which is the user of that 1: label immediately after STI.
>
> ~Andrew

Well, darn. That's a well-hidden Waldo.

I'll send a v2 with conditional enables on C and assembly, and a change of that
label from "1" to ".Lfoo" to clearly imply the control flow might take a
backflip from several miles down the file.

Cheers,
Alejandro
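
For illustration only, one way the C side of a conditional enable could look
is to consult the interrupt flag saved in the exception frame, so that IRQs
are re-enabled only when the interrupted context had them on. This is a sketch
under that assumption and not the actual v2: struct sketch_regs, its rflags
field and the sketch_* helpers are hypothetical names. The only hard fact
relied on is that IF is bit 9 of RFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Interrupt-enable flag: bit 9 of RFLAGS. */
#define SKETCH_EFLAGS_IF (UINT64_C(1) << 9)

/* Hypothetical saved exception frame; only the field used here is modelled. */
struct sketch_regs {
    uint64_t rflags;
};

static bool sketch_irqs_enabled;    /* models the live EFLAGS.IF */

/*
 * Re-enable IRQs only if the context that took the fault was running with
 * IRQs enabled. A fault raised from an irqs-off region leaves them off.
 */
static void sketch_irq_enable_if_ctx_had_irqs(const struct sketch_regs *regs)
{
    if ( regs->rflags & SKETCH_EFLAGS_IF )
        sketch_irqs_enabled = true;
}
```

In the handler this call would go where the unconditional local_irq_enable()
sits in the posted patch, that is, after the cr2 read.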

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 708136f625..1c04c03d9f 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1600,6 +1600,8 @@  void asmlinkage do_page_fault(struct cpu_user_regs *regs)
 
     addr = read_cr2();
 
+    local_irq_enable();
+
     /* fixup_page_fault() might change regs->error_code, so cache it here. */
     error_code = regs->error_code;
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index b8482de8ee..ef803f6288 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -844,8 +844,7 @@  handle_exception_saved:
 #elif !defined(CONFIG_PV)
         ASSERT_CONTEXT_IS_XEN
 #endif /* CONFIG_PV */
-        sti
-1:      movq  %rsp,%rdi
+1:      mov   %rsp,%rdi
         movzbl UREGS_entry_vector(%rsp),%eax
 #ifdef CONFIG_PERF_COUNTERS
         lea   per_cpu__perfcounters(%rip), %rcx
@@ -866,7 +865,15 @@  handle_exception_saved:
         jmp   .L_exn_dispatch_done;    \
 .L_ ## vec ## _done:
 
+        /*
+         * IRQs kept off to derisk being hit by a nested interrupt before
+         * reading %cr2. Otherwise a page fault in the nested interrupt hadnler
+         * would corrupt %cr2.
+         */
         DISPATCH(X86_EXC_PF, do_page_fault)
+
+        sti
+
         DISPATCH(X86_EXC_GP, do_general_protection)
         DISPATCH(X86_EXC_UD, do_invalid_op)
         DISPATCH(X86_EXC_NM, do_device_not_available)