mbox series

[PATCHv2,0/5] arm64/irqentry: remove duplicate housekeeping of

Message ID 20210924132837.45994-1-kernelfans@gmail.com (mailing list archive)
Headers show
Series arm64/irqentry: remove duplicate housekeeping of | expand

Message

Pingfan Liu Sept. 24, 2021, 1:28 p.m. UTC
After introducing arm64/kernel/entry_common.c which is akin to
kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
the following:
    enter_from_kernel_mode()->rcu_irq_enter().
And
    gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()

Besides redundance, based on code analysis, the redundance also raise
some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
rcu_is_cpu_rrupt_from_idle() unexpected.

Nmi also faces duplicate accounts. This series aims to address these
duplicate issues.
[1-2/5]: address nmi account duplicate
[3-4/5]: address rcu housekeeping duplicate in irq
[5/5]: as a natural result of [3-4/5], address a history issue. [1]


History:
v1 -> v2:
    change the subject as the motivation varies.
    add the fix for nmi account duplicate

The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
makes irq_enter/exit arch optional". [2] It is brought up to fix [1].

There have been some tries to enable crash-stop-NMI on arm64, one by me,
the other by Yuichi's [4].  I hope after this series, they can advance,
as Marc said in [3] "No additional NMI patches will make it until we
have resolved the issues"

[1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
[2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
[3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
[4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Julien Thierry <julien.thierry@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
Cc: linux-kernel@vger.kernel.org
To: linux-arm-kernel@lists.infradead.org


Pingfan Liu (5):
  arm64/entry-common: push the judgement of nmi ahead
  irqchip/GICv3: expose handle_nmi() directly
  kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
    optional
  irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
  irqchip/GICv3: make reschedule-ipi light weight

 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/irq.h     |  7 ++++
 arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
 arch/arm64/kernel/irq.c          | 29 ++++++++++++++
 drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
 kernel/irq/Kconfig               |  3 ++
 kernel/irq/irqdesc.c             |  4 ++
 7 files changed, 116 insertions(+), 39 deletions(-)

Comments

Mark Rutland Sept. 24, 2021, 5:36 p.m. UTC | #1
[Adding Paul for RCU, s390 folk for entry code RCU semantics]

On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> After introducing arm64/kernel/entry_common.c which is akin to
> kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> the following:
>     enter_from_kernel_mode()->rcu_irq_enter().
> And
>     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
>
> Besides redundance, based on code analysis, the redundance also raise
> some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> rcu_is_cpu_rrupt_from_idle() unexpected.

Hmmm...

The fundamental questionss are:

1) Who is supposed to be responsible for doing the rcu entry/exit?

2) Is it supposed to matter if this happens multiple times?

For (1), I'd generally expect that this is supposed to happen in the
arch/common entry code, since that itself (or the irqchip driver) could
depend on RCU, and if that's the case thatn handle_domain_irq()
shouldn't need to call rcu_irq_enter(). That would be consistent with
the way we handle all other exceptions.

For (2) I don't know whether the level of nesting is suppoosed to
matter. I was under the impression it wasn't meant to matter in general,
so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
specific level of nesting.

From a glance it looks like this would cause rcu_sched_clock_irq() to
skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
doesn't sound right, at least...

Thomas, Paul, thoughts?

AFAICT, s390 will have a similar flow on its IRQ handling path, so if
this is a real issue they'll be affected too.

Thanks,
Mark.

> Nmi also faces duplicate accounts. This series aims to address these
> duplicate issues.
> [1-2/5]: address nmi account duplicate
> [3-4/5]: address rcu housekeeping duplicate in irq
> [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> 
> 
> History:
> v1 -> v2:
>     change the subject as the motivation varies.
>     add the fix for nmi account duplicate
> 
> The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> 
> There have been some tries to enable crash-stop-NMI on arm64, one by me,
> the other by Yuichi's [4].  I hope after this series, they can advance,
> as Marc said in [3] "No additional NMI patches will make it until we
> have resolved the issues"
> 
> [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Joey Gouly <joey.gouly@arm.com>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Julien Thierry <julien.thierry@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
> Cc: linux-kernel@vger.kernel.org
> To: linux-arm-kernel@lists.infradead.org
> 
> 
> Pingfan Liu (5):
>   arm64/entry-common: push the judgement of nmi ahead
>   irqchip/GICv3: expose handle_nmi() directly
>   kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
>     optional
>   irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
>   irqchip/GICv3: make reschedule-ipi light weight
> 
>  arch/arm64/Kconfig               |  1 +
>  arch/arm64/include/asm/irq.h     |  7 ++++
>  arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
>  arch/arm64/kernel/irq.c          | 29 ++++++++++++++
>  drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
>  kernel/irq/Kconfig               |  3 ++
>  kernel/irq/irqdesc.c             |  4 ++
>  7 files changed, 116 insertions(+), 39 deletions(-)
> 
> -- 
> 2.31.1
>
Paul E. McKenney Sept. 24, 2021, 10:59 p.m. UTC | #2
On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> 
> On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > After introducing arm64/kernel/entry_common.c which is akin to
> > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > the following:
> >     enter_from_kernel_mode()->rcu_irq_enter().
> > And
> >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> >
> > Besides redundance, based on code analysis, the redundance also raise
> > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > rcu_is_cpu_rrupt_from_idle() unexpected.
> 
> Hmmm...
> 
> The fundamental questionss are:
> 
> 1) Who is supposed to be responsible for doing the rcu entry/exit?
> 
> 2) Is it supposed to matter if this happens multiple times?
> 
> For (1), I'd generally expect that this is supposed to happen in the
> arch/common entry code, since that itself (or the irqchip driver) could
> depend on RCU, and if that's the case thatn handle_domain_irq()
> shouldn't need to call rcu_irq_enter(). That would be consistent with
> the way we handle all other exceptions.
> 
> For (2) I don't know whether the level of nesting is suppoosed to
> matter. I was under the impression it wasn't meant to matter in general,
> so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> specific level of nesting.
> 
> >From a glance it looks like this would cause rcu_sched_clock_irq() to
> skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> doesn't sound right, at least...
> 
> Thomas, Paul, thoughts?

It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
be balanced.  Normally, this is taken care of by the fact that irq_enter()
invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().

But if you are doing some special-case exception where the handler needs
to use RCU readers, but where the rest of the work is not needed, then
the resulting calls to rcu_irq_enter() and rcu_irq_exit() must be in
the architecture-specific code and must be properly balanced.

So if exception entry invokes rcu_irq_enter() twice, then exception
exit also needs to invoke rcu_irq_exit() twice.

There are some constraints on where calls to these functions are place.
For example, any exception-entry code prior to the call to rcu_irq_enter()
must consist solely of functions marked noinstr, but Thomas can tell
you more.

Or am I missing the point of your question?

							Thanx, Paul

> AFAICT, s390 will have a similar flow on its IRQ handling path, so if
> this is a real issue they'll be affected too.
> 
> Thanks,
> Mark.
> 
> > Nmi also faces duplicate accounts. This series aims to address these
> > duplicate issues.
> > [1-2/5]: address nmi account duplicate
> > [3-4/5]: address rcu housekeeping duplicate in irq
> > [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> > 
> > 
> > History:
> > v1 -> v2:
> >     change the subject as the motivation varies.
> >     add the fix for nmi account duplicate
> > 
> > The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> > makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> > 
> > There have been some tries to enable crash-stop-NMI on arm64, one by me,
> > the other by Yuichi's [4].  I hope after this series, they can advance,
> > as Marc said in [3] "No additional NMI patches will make it until we
> > have resolved the issues"
> > 
> > [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> > [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> > [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> > [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> > 
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Joey Gouly <joey.gouly@arm.com>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Julien Thierry <julien.thierry@arm.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
> > Cc: linux-kernel@vger.kernel.org
> > To: linux-arm-kernel@lists.infradead.org
> > 
> > 
> > Pingfan Liu (5):
> >   arm64/entry-common: push the judgement of nmi ahead
> >   irqchip/GICv3: expose handle_nmi() directly
> >   kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> >     optional
> >   irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> >   irqchip/GICv3: make reschedule-ipi light weight
> > 
> >  arch/arm64/Kconfig               |  1 +
> >  arch/arm64/include/asm/irq.h     |  7 ++++
> >  arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> >  arch/arm64/kernel/irq.c          | 29 ++++++++++++++
> >  drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
> >  kernel/irq/Kconfig               |  3 ++
> >  kernel/irq/irqdesc.c             |  4 ++
> >  7 files changed, 116 insertions(+), 39 deletions(-)
> > 
> > -- 
> > 2.31.1
> >
Pingfan Liu Sept. 25, 2021, 3:12 p.m. UTC | #3
On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> 
> On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > After introducing arm64/kernel/entry_common.c which is akin to
> > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > the following:
> >     enter_from_kernel_mode()->rcu_irq_enter().
> > And
> >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> >
> > Besides redundance, based on code analysis, the redundance also raise
> > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > rcu_is_cpu_rrupt_from_idle() unexpected.
> 
> Hmmm...
> 
> The fundamental questionss are:
> 
> 1) Who is supposed to be responsible for doing the rcu entry/exit?
> 
> 2) Is it supposed to matter if this happens multiple times?
> 
> For (1), I'd generally expect that this is supposed to happen in the
> arch/common entry code, since that itself (or the irqchip driver) could
> depend on RCU, and if that's the case thatn handle_domain_irq()
> shouldn't need to call rcu_irq_enter(). That would be consistent with
> the way we handle all other exceptions.
> 
In my humble opinion, it had better happen in arch/common entry code as
you said.  But for many arches which assures that before
handle_domain_irq(), no data is involved in rcu updater, it can be done
in handle_domain_irq().  And that is a cheap way to integrate with rcu
system (at least for the time being).

For the (2), it goes deeply into RCU core, hope guides from Paul and
Thomas.
But at least for the condition
  if ((user || rcu_is_cpu_rrupt_from_idle()) && rcu_nohz_full_cpu())
in rcu_pending(), it makes sense to tell the nested interrupt from a
first level interrupt.

Thanks,

	Pingfan
> For (2) I don't know whether the level of nesting is suppoosed to
> matter. I was under the impression it wasn't meant to matter in general,
> so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> specific level of nesting.
> 
> From a glance it looks like this would cause rcu_sched_clock_irq() to
> skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> doesn't sound right, at least...
> 
> Thomas, Paul, thoughts?
> 
> AFAICT, s390 will have a similar flow on its IRQ handling path, so if
> this is a real issue they'll be affected too.
> 
> Thanks,
> Mark.
> 
> > Nmi also faces duplicate accounts. This series aims to address these
> > duplicate issues.
> > [1-2/5]: address nmi account duplicate
> > [3-4/5]: address rcu housekeeping duplicate in irq
> > [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> > 
> > 
> > History:
> > v1 -> v2:
> >     change the subject as the motivation varies.
> >     add the fix for nmi account duplicate
> > 
> > The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> > makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> > 
> > There have been some tries to enable crash-stop-NMI on arm64, one by me,
> > the other by Yuichi's [4].  I hope after this series, they can advance,
> > as Marc said in [3] "No additional NMI patches will make it until we
> > have resolved the issues"
> > 
> > [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> > [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> > [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> > [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> > 
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Joey Gouly <joey.gouly@arm.com>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Julien Thierry <julien.thierry@arm.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
> > Cc: linux-kernel@vger.kernel.org
> > To: linux-arm-kernel@lists.infradead.org
> > 
> > 
> > Pingfan Liu (5):
> >   arm64/entry-common: push the judgement of nmi ahead
> >   irqchip/GICv3: expose handle_nmi() directly
> >   kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> >     optional
> >   irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> >   irqchip/GICv3: make reschedule-ipi light weight
> > 
> >  arch/arm64/Kconfig               |  1 +
> >  arch/arm64/include/asm/irq.h     |  7 ++++
> >  arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> >  arch/arm64/kernel/irq.c          | 29 ++++++++++++++
> >  drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
> >  kernel/irq/Kconfig               |  3 ++
> >  kernel/irq/irqdesc.c             |  4 ++
> >  7 files changed, 116 insertions(+), 39 deletions(-)
> > 
> > -- 
> > 2.31.1
> > 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Mark Rutland Sept. 27, 2021, 9:23 a.m. UTC | #4
On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
> On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> > 
> > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > > After introducing arm64/kernel/entry_common.c which is akin to
> > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > > the following:
> > >     enter_from_kernel_mode()->rcu_irq_enter().
> > > And
> > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> > >
> > > Besides redundance, based on code analysis, the redundance also raise
> > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > > rcu_is_cpu_rrupt_from_idle() unexpected.
> > 
> > Hmmm...
> > 
> > The fundamental questionss are:
> > 
> > 1) Who is supposed to be responsible for doing the rcu entry/exit?
> > 
> > 2) Is it supposed to matter if this happens multiple times?
> > 
> > For (1), I'd generally expect that this is supposed to happen in the
> > arch/common entry code, since that itself (or the irqchip driver) could
> > depend on RCU, and if that's the case thatn handle_domain_irq()
> > shouldn't need to call rcu_irq_enter(). That would be consistent with
> > the way we handle all other exceptions.
> > 
> > For (2) I don't know whether the level of nesting is suppoosed to
> > matter. I was under the impression it wasn't meant to matter in general,
> > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> > specific level of nesting.
> > 
> > >From a glance it looks like this would cause rcu_sched_clock_irq() to
> > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> > doesn't sound right, at least...
> > 
> > Thomas, Paul, thoughts?
> 
> It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
> be balanced.  Normally, this is taken care of by the fact that irq_enter()
> invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
> nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().

Sure; I didn't mean to suggest those weren't balanced! The problem here
is *nesting*. Due to the structure of our entry code and the core IRQ
code, when handling an IRQ we have a sequence:

	irq_enter() // arch code
	irq_enter() // irq code

	< irq handler here >

	irq_exit() // irq code
	irq_exit() // arch code

... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
expected result because of the additional nesting, since
rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
is only incremented once per exception entry, when it does:

	/* Are we at first interrupt nesting level? */
	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
	if (nesting > 1)
		return false;

What I'm trying to figure out is whether that expectation is legitimate,
and assuming so, where the entry/exit should happen.

Thanks,
Mark.

> But if you are doing some special-case exception where the handler needs
> to use RCU readers, but where the rest of the work is not needed, then
> the resulting calls to rcu_irq_enter() and rcu_irq_exit() must be in
> the architecture-specific code and must be properly balanced.
> 
> So if exception entry invokes rcu_irq_enter() twice, then exception
> exit also needs to invoke rcu_irq_exit() twice.
> 
> There are some constraints on where calls to these functions are place.
> For example, any exception-entry code prior to the call to rcu_irq_enter()
> must consist solely of functions marked noinstr, but Thomas can tell
> you more.
> 
> Or am I missing the point of your question?
> 
> 							Thanx, Paul
> 
> > AFAICT, s390 will have a similar flow on its IRQ handling path, so if
> > this is a real issue they'll be affected too.
> > 
> > Thanks,
> > Mark.
> > 
> > > Nmi also faces duplicate accounts. This series aims to address these
> > > duplicate issues.
> > > [1-2/5]: address nmi account duplicate
> > > [3-4/5]: address rcu housekeeping duplicate in irq
> > > [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> > > 
> > > 
> > > History:
> > > v1 -> v2:
> > >     change the subject as the motivation varies.
> > >     add the fix for nmi account duplicate
> > > 
> > > The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> > > makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> > > 
> > > There have been some tries to enable crash-stop-NMI on arm64, one by me,
> > > the other by Yuichi's [4].  I hope after this series, they can advance,
> > > as Marc said in [3] "No additional NMI patches will make it until we
> > > have resolved the issues"
> > > 
> > > [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> > > [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> > > [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> > > [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> > > 
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Marc Zyngier <maz@kernel.org>
> > > Cc: Joey Gouly <joey.gouly@arm.com>
> > > Cc: Sami Tolvanen <samitolvanen@google.com>
> > > Cc: Julien Thierry <julien.thierry@arm.com>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
> > > Cc: linux-kernel@vger.kernel.org
> > > To: linux-arm-kernel@lists.infradead.org
> > > 
> > > 
> > > Pingfan Liu (5):
> > >   arm64/entry-common: push the judgement of nmi ahead
> > >   irqchip/GICv3: expose handle_nmi() directly
> > >   kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> > >     optional
> > >   irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> > >   irqchip/GICv3: make reschedule-ipi light weight
> > > 
> > >  arch/arm64/Kconfig               |  1 +
> > >  arch/arm64/include/asm/irq.h     |  7 ++++
> > >  arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> > >  arch/arm64/kernel/irq.c          | 29 ++++++++++++++
> > >  drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
> > >  kernel/irq/Kconfig               |  3 ++
> > >  kernel/irq/irqdesc.c             |  4 ++
> > >  7 files changed, 116 insertions(+), 39 deletions(-)
> > > 
> > > -- 
> > > 2.31.1
> > >
Paul E. McKenney Sept. 28, 2021, 12:09 a.m. UTC | #5
On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
> > On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> > > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> > > 
> > > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > > > After introducing arm64/kernel/entry_common.c which is akin to
> > > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > > > the following:
> > > >     enter_from_kernel_mode()->rcu_irq_enter().
> > > > And
> > > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> > > >
> > > > Besides redundance, based on code analysis, the redundance also raise
> > > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > > > rcu_is_cpu_rrupt_from_idle() unexpected.
> > > 
> > > Hmmm...
> > > 
> > > The fundamental questionss are:
> > > 
> > > 1) Who is supposed to be responsible for doing the rcu entry/exit?
> > > 
> > > 2) Is it supposed to matter if this happens multiple times?
> > > 
> > > For (1), I'd generally expect that this is supposed to happen in the
> > > arch/common entry code, since that itself (or the irqchip driver) could
> > > depend on RCU, and if that's the case thatn handle_domain_irq()
> > > shouldn't need to call rcu_irq_enter(). That would be consistent with
> > > the way we handle all other exceptions.
> > > 
> > > For (2) I don't know whether the level of nesting is suppoosed to
> > > matter. I was under the impression it wasn't meant to matter in general,
> > > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> > > specific level of nesting.
> > > 
> > > >From a glance it looks like this would cause rcu_sched_clock_irq() to
> > > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> > > doesn't sound right, at least...
> > > 
> > > Thomas, Paul, thoughts?
> > 
> > It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
> > be balanced.  Normally, this is taken care of by the fact that irq_enter()
> > invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
> > nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().
> 
> Sure; I didn't mean to suggest those weren't balanced! The problem here
> is *nesting*. Due to the structure of our entry code and the core IRQ
> code, when handling an IRQ we have a sequence:
> 
> 	irq_enter() // arch code
> 	irq_enter() // irq code
> 
> 	< irq handler here >
> 
> 	irq_exit() // irq code
> 	irq_exit() // arch code
> 
> ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> expected result because of the additional nesting, since
> rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> is only incremented once per exception entry, when it does:
> 
> 	/* Are we at first interrupt nesting level? */
> 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> 	if (nesting > 1)
> 		return false;
> 
> What I'm trying to figure out is whether that expectation is legitimate,
> and assuming so, where the entry/exit should happen.

Oooh...

The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
be unable to detect a userspace quiescent state for a non-nohz_full
CPU.  That could result in RCU CPU stall warnings if a user task runs
continuously on a given CPU for more than 21 seconds (60 seconds in
some distros).  And this can easily happen if the user has a CPU-bound
thread that is the only runnable task on that CPU.

So, yes, this does need some sort of resolution.

The traditional approach is (as you surmise) to have only a single call
to irq_enter() on exception entry and only a single call to irq_exit()
on exception exit.  If this is feasible, it is highly recommended.

In theory, we could have that "1" in "nesting > 1" be a constant supplied
by the architecture (you would want "3" if I remember correctly) but
in practice could we please avoid this?  For one thing, if there is
some other path into the kernel for your architecture that does only a
single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
a chance.  It would need to compare against a different value depending
on what exception showed up.  Even if that cannot happen, it would be
better if your architecture could remain in blissful ignorance of the
colorful details of ->dynticks_nmi_nesting manipulations.

Another approach would be for the arch code to supply RCU a function that
it calls.  If there is such a function (or perhaps better, if some new
Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
"1" as it does now.  But you break it, you buy it!  ;-)

Thoughts?  Other approaches?

							Thanx, Paul

> Thanks,
> Mark.
> 
> > But if you are doing some special-case exception where the handler needs
> > to use RCU readers, but where the rest of the work is not needed, then
> > the resulting calls to rcu_irq_enter() and rcu_irq_exit() must be in
> > the architecture-specific code and must be properly balanced.
> > 
> > So if exception entry invokes rcu_irq_enter() twice, then exception
> > exit also needs to invoke rcu_irq_exit() twice.
> > 
> > There are some constraints on where calls to these functions are place.
> > For example, any exception-entry code prior to the call to rcu_irq_enter()
> > must consist solely of functions marked noinstr, but Thomas can tell
> > you more.
> > 
> > Or am I missing the point of your question?
> > 
> > 							Thanx, Paul
> > 
> > > AFAICT, s390 will have a similar flow on its IRQ handling path, so if
> > > this is a real issue they'll be affected too.
> > > 
> > > Thanks,
> > > Mark.
> > > 
> > > > Nmi also faces duplicate accounts. This series aims to address these
> > > > duplicate issues.
> > > > [1-2/5]: address nmi account duplicate
> > > > [3-4/5]: address rcu housekeeping duplicate in irq
> > > > [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> > > > 
> > > > 
> > > > History:
> > > > v1 -> v2:
> > > >     change the subject as the motivation varies.
> > > >     add the fix for nmi account duplicate
> > > > 
> > > > The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> > > > makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> > > > 
> > > > There have been some tries to enable crash-stop-NMI on arm64, one by me,
> > > > the other by Yuichi's [4].  I hope after this series, they can advance,
> > > > as Marc said in [3] "No additional NMI patches will make it until we
> > > > have resolved the issues"
> > > > 
> > > > [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> > > > [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> > > > [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> > > > [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> > > > 
> > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: Marc Zyngier <maz@kernel.org>
> > > > Cc: Joey Gouly <joey.gouly@arm.com>
> > > > Cc: Sami Tolvanen <samitolvanen@google.com>
> > > > Cc: Julien Thierry <julien.thierry@arm.com>
> > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > Cc: Yuichi Ito <ito-yuichi@fujitsu.com>
> > > > Cc: linux-kernel@vger.kernel.org
> > > > To: linux-arm-kernel@lists.infradead.org
> > > > 
> > > > 
> > > > Pingfan Liu (5):
> > > >   arm64/entry-common: push the judgement of nmi ahead
> > > >   irqchip/GICv3: expose handle_nmi() directly
> > > >   kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> > > >     optional
> > > >   irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> > > >   irqchip/GICv3: make reschedule-ipi light weight
> > > > 
> > > >  arch/arm64/Kconfig               |  1 +
> > > >  arch/arm64/include/asm/irq.h     |  7 ++++
> > > >  arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> > > >  arch/arm64/kernel/irq.c          | 29 ++++++++++++++
> > > >  drivers/irqchip/irq-gic-v3.c     | 66 ++++++++++++++++++++------------
> > > >  kernel/irq/Kconfig               |  3 ++
> > > >  kernel/irq/irqdesc.c             |  4 ++
> > > >  7 files changed, 116 insertions(+), 39 deletions(-)
> > > > 
> > > > -- 
> > > > 2.31.1
> > > >
Mark Rutland Sept. 28, 2021, 8:32 a.m. UTC | #6
On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> > On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
> > > On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> > > > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> > > > 
> > > > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > > > > After introducing arm64/kernel/entry_common.c which is akin to
> > > > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > > > > the following:
> > > > >     enter_from_kernel_mode()->rcu_irq_enter().
> > > > > And
> > > > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> > > > >
> > > > > Besides redundance, based on code analysis, the redundance also raise
> > > > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > > > > rcu_is_cpu_rrupt_from_idle() unexpected.
> > > > 
> > > > Hmmm...
> > > > 
> > > > The fundamental questionss are:
> > > > 
> > > > 1) Who is supposed to be responsible for doing the rcu entry/exit?
> > > > 
> > > > 2) Is it supposed to matter if this happens multiple times?
> > > > 
> > > > For (1), I'd generally expect that this is supposed to happen in the
> > > > arch/common entry code, since that itself (or the irqchip driver) could
> > > > depend on RCU, and if that's the case thatn handle_domain_irq()
> > > > shouldn't need to call rcu_irq_enter(). That would be consistent with
> > > > the way we handle all other exceptions.
> > > > 
> > > > For (2) I don't know whether the level of nesting is suppoosed to
> > > > matter. I was under the impression it wasn't meant to matter in general,
> > > > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> > > > specific level of nesting.
> > > > 
> > > > >From a glance it looks like this would cause rcu_sched_clock_irq() to
> > > > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> > > > doesn't sound right, at least...
> > > > 
> > > > Thomas, Paul, thoughts?
> > > 
> > > It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
> > > be balanced.  Normally, this is taken care of by the fact that irq_enter()
> > > invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
> > > nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().
> > 
> > Sure; I didn't mean to suggest those weren't balanced! The problem here
> > is *nesting*. Due to the structure of our entry code and the core IRQ
> > code, when handling an IRQ we have a sequence:
> > 
> > 	irq_enter() // arch code
> > 	irq_enter() // irq code
> > 
> > 	< irq handler here >
> > 
> > 	irq_exit() // irq code
> > 	irq_exit() // arch code
> > 
> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> > expected result because of the additional nesting, since
> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> > is only incremented once per exception entry, when it does:
> > 
> > 	/* Are we at first interrupt nesting level? */
> > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> > 	if (nesting > 1)
> > 		return false;
> > 
> > What I'm trying to figure out is whether that expectation is legitimate,
> > and assuming so, where the entry/exit should happen.
> 
> Oooh...
> 
> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> be unable to detect a userspace quiescent state for a non-nohz_full
> CPU.  That could result in RCU CPU stall warnings if a user task runs
> continuously on a given CPU for more than 21 seconds (60 seconds in
> some distros).  And this can easily happen if the user has a CPU-bound
> thread that is the only runnable task on that CPU.
> 
> So, yes, this does need some sort of resolution.
> 
> The traditional approach is (as you surmise) to have only a single call
> to irq_enter() on exception entry and only a single call to irq_exit()
> on exception exit.  If this is feasible, it is highly recommended.

Cool; that's roughly what I was expecting / hoping to hear!

> In theory, we could have that "1" in "nesting > 1" be a constant supplied
> by the architecture (you would want "3" if I remember correctly) but
> in practice could we please avoid this?  For one thing, if there is
> some other path into the kernel for your architecture that does only a
> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> a chance.  It would need to compare against a different value depending
> on what exception showed up.  Even if that cannot happen, it would be
> better if your architecture could remain in blissful ignorance of the
> colorful details of ->dynticks_nmi_nesting manipulations.

I completely agree. I think it's much harder to keep that in check than
to enforce a "once per architectural exception" policy in the arch code.

> Another approach would be for the arch code to supply RCU a function that
> it calls.  If there is such a function (or perhaps better, if some new
> Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> "1" as it does now.  But you break it, you buy it!  ;-)

I guess we could look at the exception regs and inspect the original
context, but it sounds overkill...

I think the cleanest thing is to leave this to arch code, and have the
common IRQ code stay well clear. Unfortunately most architectures
(including arch/arm) still need the common IRQ code to handle this, so
we'll have to make that conditional on Kconfig, something like the below
(build+boot tested only).

If there are no objections, I'll go check who else needs the same
treatment (IIUC at least s390 will), and spin that as a real
patch/series.

Thanks,
Mark.

---->8----
diff --git a/arch/Kconfig b/arch/Kconfig
index 8df1c7102643..c59475e50e4c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -225,6 +225,12 @@ config GENERIC_SMP_IDLE_THREAD
 config GENERIC_IDLE_POLL_SETUP
        bool
 
+config ARCH_ENTERS_IRQ
+       bool
+       help
+         An architecture should select this when it performs irq entry
+         management itself (e.g. calling irq_enter() and irq_exit()).
+
 config ARCH_HAS_FORTIFY_SOURCE
        bool
        help
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5c7ae4c3954b..fa6476bf2b4d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -16,6 +16,7 @@ config ARM64
        select ARCH_ENABLE_MEMORY_HOTREMOVE
        select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
        select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
+       select ARCH_ENTERS_IRQ
        select ARCH_HAS_CACHE_LINE_SIZE
        select ARCH_HAS_DEBUG_VIRTUAL
        select ARCH_HAS_DEBUG_VM_PGTABLE
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 4e3c29bb603c..6affa12222e0 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -677,6 +677,15 @@ int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq)
 EXPORT_SYMBOL_GPL(generic_handle_domain_irq);
 
 #ifdef CONFIG_HANDLE_DOMAIN_IRQ
+
+#ifdef ARCH_ENTERS_IRQ
+#define handle_irq_enter()
+#define handle_irq_exit()
+#else
+#define handle_irq_enter()     irq_enter()
+#define handle_irq_exit()      irq_exit()
+#endif
+
 /**
  * handle_domain_irq - Invoke the handler for a HW irq belonging to a domain,
  *                     usually for a root interrupt controller
@@ -693,7 +702,7 @@ int handle_domain_irq(struct irq_domain *domain,
        struct irq_desc *desc;
        int ret = 0;
 
-       irq_enter();
+       handle_irq_enter();
 
        /* The irqdomain code provides boundary checks */
        desc = irq_resolve_mapping(domain, hwirq);
@@ -702,7 +711,7 @@ int handle_domain_irq(struct irq_domain *domain,
        else
                ret = -EINVAL;
 
-       irq_exit();
+       handle_irq_exit();
        set_irq_regs(old_regs);
        return ret;
 }
Mark Rutland Sept. 28, 2021, 8:35 a.m. UTC | #7
On Tue, Sep 28, 2021 at 09:32:22AM +0100, Mark Rutland wrote:
> On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> > > On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> > > > > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> > > > > 
> > > > > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > > > > > After introducing arm64/kernel/entry_common.c which is akin to
> > > > > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > > > > > the following:
> > > > > >     enter_from_kernel_mode()->rcu_irq_enter().
> > > > > > And
> > > > > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> > > > > >
> > > > > > Besides redundance, based on code analysis, the redundance also raise
> > > > > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > > > > > rcu_is_cpu_rrupt_from_idle() unexpected.
> > > > > 
> > > > > Hmmm...
> > > > > 
> > > > > The fundamental questionss are:
> > > > > 
> > > > > 1) Who is supposed to be responsible for doing the rcu entry/exit?
> > > > > 
> > > > > 2) Is it supposed to matter if this happens multiple times?
> > > > > 
> > > > > For (1), I'd generally expect that this is supposed to happen in the
> > > > > arch/common entry code, since that itself (or the irqchip driver) could
> > > > > depend on RCU, and if that's the case thatn handle_domain_irq()
> > > > > shouldn't need to call rcu_irq_enter(). That would be consistent with
> > > > > the way we handle all other exceptions.
> > > > > 
> > > > > For (2) I don't know whether the level of nesting is suppoosed to
> > > > > matter. I was under the impression it wasn't meant to matter in general,
> > > > > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> > > > > specific level of nesting.
> > > > > 
> > > > > >From a glance it looks like this would cause rcu_sched_clock_irq() to
> > > > > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> > > > > doesn't sound right, at least...
> > > > > 
> > > > > Thomas, Paul, thoughts?
> > > > 
> > > > It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
> > > > be balanced.  Normally, this is taken care of by the fact that irq_enter()
> > > > invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
> > > > nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().
> > > 
> > > Sure; I didn't mean to suggest those weren't balanced! The problem here
> > > is *nesting*. Due to the structure of our entry code and the core IRQ
> > > code, when handling an IRQ we have a sequence:
> > > 
> > > 	irq_enter() // arch code
> > > 	irq_enter() // irq code
> > > 
> > > 	< irq handler here >
> > > 
> > > 	irq_exit() // irq code
> > > 	irq_exit() // arch code
> > > 
> > > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> > > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> > > expected result because of the additional nesting, since
> > > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> > > is only incremented once per exception entry, when it does:
> > > 
> > > 	/* Are we at first interrupt nesting level? */
> > > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> > > 	if (nesting > 1)
> > > 		return false;
> > > 
> > > What I'm trying to figure out is whether that expectation is legitimate,
> > > and assuming so, where the entry/exit should happen.
> > 
> > Oooh...
> > 
> > The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> > be unable to detect a userspace quiescent state for a non-nohz_full
> > CPU.  That could result in RCU CPU stall warnings if a user task runs
> > continuously on a given CPU for more than 21 seconds (60 seconds in
> > some distros).  And this can easily happen if the user has a CPU-bound
> > thread that is the only runnable task on that CPU.
> > 
> > So, yes, this does need some sort of resolution.
> > 
> > The traditional approach is (as you surmise) to have only a single call
> > to irq_enter() on exception entry and only a single call to irq_exit()
> > on exception exit.  If this is feasible, it is highly recommended.
> 
> Cool; that's roughly what I was expecting / hoping to hear!
> 
> > In theory, we could have that "1" in "nesting > 1" be a constant supplied
> > by the architecture (you would want "3" if I remember correctly) but
> > in practice could we please avoid this?  For one thing, if there is
> > some other path into the kernel for your architecture that does only a
> > single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> > a chance.  It would need to compare against a different value depending
> > on what exception showed up.  Even if that cannot happen, it would be
> > better if your architecture could remain in blissful ignorance of the
> > colorful details of ->dynticks_nmi_nesting manipulations.
> 
> I completely agree. I think it's much harder to keep that in check than
> to enforce a "once per architectural exception" policy in the arch code.
> 
> > Another approach would be for the arch code to supply RCU a function that
> > it calls.  If there is such a function (or perhaps better, if some new
> > Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> > "1" as it does now.  But you break it, you buy it!  ;-)
> 
> I guess we could look at the exception regs and inspect the original
> context, but it sounds overkill...
> 
> I think the cleanest thing is to leave this to arch code, and have the
> common IRQ code stay well clear. Unfortunately most architectures
> (including arch/arm) still need the common IRQ code to handle this, so
> we'll have to make that conditional on Kconfig, something like the below
> (build+boot tested only).
> 
> If there are no objections, I'll go check who else needs the same
> treatment (IIUC at least s390 will), and spin that as a real
> patch/series.

Ah, looking again this is basically Pinfan's patch 2, so ignore the
below, and I'll review Pingfan's patch instead.

Thanks,
Mark.
Sven Schnelle Sept. 28, 2021, 9:52 a.m. UTC | #8
Mark Rutland <mark.rutland@arm.com> writes:

> On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
>> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
>> > On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
>> > > On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
>> > > > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
>> > > > 
>> > > > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
>> > > > > After introducing arm64/kernel/entry_common.c which is akin to
>> > > > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
>> > > > > the following:
>> > > > >     enter_from_kernel_mode()->rcu_irq_enter().
>> > > > > And
>> > > > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
>> > > > >
>> > > > > Besides redundance, based on code analysis, the redundance also raise
>> > > > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
>> > > > > rcu_is_cpu_rrupt_from_idle() unexpected.
>> > > > 
>> > > > Hmmm...
>> > > > 
>> > > > The fundamental questionss are:
>> > > > 
>> > > > 1) Who is supposed to be responsible for doing the rcu entry/exit?
>> > > > 
>> > > > 2) Is it supposed to matter if this happens multiple times?
>> > > > 
>> > > > For (1), I'd generally expect that this is supposed to happen in the
>> > > > arch/common entry code, since that itself (or the irqchip driver) could
>> > > > depend on RCU, and if that's the case thatn handle_domain_irq()
>> > > > shouldn't need to call rcu_irq_enter(). That would be consistent with
>> > > > the way we handle all other exceptions.
>> > > > 
>> > > > For (2) I don't know whether the level of nesting is suppoosed to
>> > > > matter. I was under the impression it wasn't meant to matter in general,
>> > > > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
>> > > > specific level of nesting.
>> > > > 
>> > > > >From a glance it looks like this would cause rcu_sched_clock_irq() to
>> > > > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
>> > > > doesn't sound right, at least...
>> > > > 
>> > > > Thomas, Paul, thoughts?
>> > > 
>> > > It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
>> > > be balanced.  Normally, this is taken care of by the fact that irq_enter()
>> > > invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
>> > > nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().
>> > 
>> > Sure; I didn't mean to suggest those weren't balanced! The problem here
>> > is *nesting*. Due to the structure of our entry code and the core IRQ
>> > code, when handling an IRQ we have a sequence:
>> > 
>> > 	irq_enter() // arch code
>> > 	irq_enter() // irq code
>> > 
>> > 	< irq handler here >
>> > 
>> > 	irq_exit() // irq code
>> > 	irq_exit() // arch code
>> > 
>> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
>> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
>> > expected result because of the additional nesting, since
>> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
>> > is only incremented once per exception entry, when it does:
>> > 
>> > 	/* Are we at first interrupt nesting level? */
>> > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
>> > 	if (nesting > 1)
>> > 		return false;
>> > 
>> > What I'm trying to figure out is whether that expectation is legitimate,
>> > and assuming so, where the entry/exit should happen.
>> 
>> Oooh...
>> 
>> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
>> be unable to detect a userspace quiescent state for a non-nohz_full
>> CPU.  That could result in RCU CPU stall warnings if a user task runs
>> continuously on a given CPU for more than 21 seconds (60 seconds in
>> some distros).  And this can easily happen if the user has a CPU-bound
>> thread that is the only runnable task on that CPU.
>> 
>> So, yes, this does need some sort of resolution.
>> 
>> The traditional approach is (as you surmise) to have only a single call
>> to irq_enter() on exception entry and only a single call to irq_exit()
>> on exception exit.  If this is feasible, it is highly recommended.
>
> Cool; that's roughly what I was expecting / hoping to hear!
>
>> In theory, we could have that "1" in "nesting > 1" be a constant supplied
>> by the architecture (you would want "3" if I remember correctly) but
>> in practice could we please avoid this?  For one thing, if there is
>> some other path into the kernel for your architecture that does only a
>> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
>> a chance.  It would need to compare against a different value depending
>> on what exception showed up.  Even if that cannot happen, it would be
>> better if your architecture could remain in blissful ignorance of the
>> colorful details of ->dynticks_nmi_nesting manipulations.
>
> I completely agree. I think it's much harder to keep that in check than
> to enforce a "once per architectural exception" policy in the arch code.
>
>> Another approach would be for the arch code to supply RCU a function that
>> it calls.  If there is such a function (or perhaps better, if some new
>> Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
>> "1" as it does now.  But you break it, you buy it!  ;-)
>
> I guess we could look at the exception regs and inspect the original
> context, but it sounds overkill...
>
> I think the cleanest thing is to leave this to arch code, and have the
> common IRQ code stay well clear. Unfortunately most architectures
> (including arch/arm) still need the common IRQ code to handle this, so
> we'll have to make that conditional on Kconfig, something like the below
> (build+boot tested only).
>
> If there are no objections, I'll go check who else needs the same
> treatment (IIUC at least s390 will), and spin that as a real
> patch/series.

Hmm, s390 doesn't use handle_domain_irq() and doesn't have
HANDLE_DOMAIN_IRQ set. So i don't think the patch below applies to s390.
However, i'll follow the code to make sure we're not calling
irq_enter/irq_exit twice.

> Thanks,
> Mark.
>
> ---->8----
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 8df1c7102643..c59475e50e4c 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -225,6 +225,12 @@ config GENERIC_SMP_IDLE_THREAD
>  config GENERIC_IDLE_POLL_SETUP
>         bool
>  
> +config ARCH_ENTERS_IRQ
> +       bool
> +       help
> +         An architecture should select this when it performs irq entry
> +         management itself (e.g. calling irq_enter() and irq_exit()).
> +
>  config ARCH_HAS_FORTIFY_SOURCE
>         bool
>         help
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5c7ae4c3954b..fa6476bf2b4d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -16,6 +16,7 @@ config ARM64
>         select ARCH_ENABLE_MEMORY_HOTREMOVE
>         select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>         select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> +       select ARCH_ENTERS_IRQ
>         select ARCH_HAS_CACHE_LINE_SIZE
>         select ARCH_HAS_DEBUG_VIRTUAL
>         select ARCH_HAS_DEBUG_VM_PGTABLE
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 4e3c29bb603c..6affa12222e0 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -677,6 +677,15 @@ int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq)
>  EXPORT_SYMBOL_GPL(generic_handle_domain_irq);
>  
>  #ifdef CONFIG_HANDLE_DOMAIN_IRQ
> +
> +#ifdef ARCH_ENTERS_IRQ
> +#define handle_irq_enter()
> +#define handle_irq_exit()
> +#else
> +#define handle_irq_enter()     irq_enter()
> +#define handle_irq_exit()      irq_exit()
> +#endif
> +
>  /**
>   * handle_domain_irq - Invoke the handler for a HW irq belonging to a domain,
>   *                     usually for a root interrupt controller
> @@ -693,7 +702,7 @@ int handle_domain_irq(struct irq_domain *domain,
>         struct irq_desc *desc;
>         int ret = 0;
>  
> -       irq_enter();
> +       handle_irq_enter();
>  
>         /* The irqdomain code provides boundary checks */
>         desc = irq_resolve_mapping(domain, hwirq);
> @@ -702,7 +711,7 @@ int handle_domain_irq(struct irq_domain *domain,
>         else
>                 ret = -EINVAL;
>  
> -       irq_exit();
> +       handle_irq_exit();
>         set_irq_regs(old_regs);
>         return ret;
>  }
Mark Rutland Sept. 28, 2021, 10:26 a.m. UTC | #9
On Tue, Sep 28, 2021 at 11:52:51AM +0200, Sven Schnelle wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> >> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> >> > Sure; I didn't mean to suggest those weren't balanced! The problem here
> >> > is *nesting*. Due to the structure of our entry code and the core IRQ
> >> > code, when handling an IRQ we have a sequence:
> >> > 
> >> > 	irq_enter() // arch code
> >> > 	irq_enter() // irq code
> >> > 
> >> > 	< irq handler here >
> >> > 
> >> > 	irq_exit() // irq code
> >> > 	irq_exit() // arch code
> >> > 
> >> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> >> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> >> > expected result because of the additional nesting, since
> >> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> >> > is only incremented once per exception entry, when it does:
> >> > 
> >> > 	/* Are we at first interrupt nesting level? */
> >> > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> >> > 	if (nesting > 1)
> >> > 		return false;
> >> > 
> >> > What I'm trying to figure out is whether that expectation is legitimate,
> >> > and assuming so, where the entry/exit should happen.
> >> 
> >> Oooh...
> >> 
> >> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> >> be unable to detect a userspace quiescent state for a non-nohz_full
> >> CPU.  That could result in RCU CPU stall warnings if a user task runs
> >> continuously on a given CPU for more than 21 seconds (60 seconds in
> >> some distros).  And this can easily happen if the user has a CPU-bound
> >> thread that is the only runnable task on that CPU.
> >> 
> >> So, yes, this does need some sort of resolution.
> >> 
> >> The traditional approach is (as you surmise) to have only a single call
> >> to irq_enter() on exception entry and only a single call to irq_exit()
> >> on exception exit.  If this is feasible, it is highly recommended.
> >
> > Cool; that's roughly what I was expecting / hoping to hear!
> >
> >> In theory, we could have that "1" in "nesting > 1" be a constant supplied
> >> by the architecture (you would want "3" if I remember correctly) but
> >> in practice could we please avoid this?  For one thing, if there is
> >> some other path into the kernel for your architecture that does only a
> >> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> >> a chance.  It would need to compare against a different value depending
> >> on what exception showed up.  Even if that cannot happen, it would be
> >> better if your architecture could remain in blissful ignorance of the
> >> colorful details of ->dynticks_nmi_nesting manipulations.
> >
> > I completely agree. I think it's much harder to keep that in check than
> > to enforce a "once per architectural exception" policy in the arch code.
> >
> >> Another approach would be for the arch code to supply RCU a function that
> >> it calls.  If there is such a function (or perhaps better, if some new
> >> Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> >> "1" as it does now.  But you break it, you buy it!  ;-)
> >
> > I guess we could look at the exception regs and inspect the original
> > context, but it sounds overkill...
> >
> > I think the cleanest thing is to leave this to arch code, and have the
> > common IRQ code stay well clear. Unfortunately most architectures
> > (including arch/arm) still need the common IRQ code to handle this, so
> > we'll have to make that conditional on Kconfig, something like the below
> > (build+boot tested only).
> >
> > If there are no objections, I'll go check who else needs the same
> > treatment (IIUC at least s390 will), and spin that as a real
> > patch/series.
> 
> Hmm, s390 doesn't use handle_domain_irq() and doesn't have
> HANDLE_DOMAIN_IRQ set. So i don't think the patch below applies to s390.
> However, i'll follow the code to make sure we're not calling
> irq_enter/irq_exit twice.

I wasn't clear, but for s390, my concern was that in do_io_irq() and
do_ext_irq() you have the sequence:

	irqentry_enter()	// calls rcu_irq_enter()
	irq_enter();		// calls rcu_irq_enter() then irq_enter_rcu();

	< handler>

	irq_exit();		// calls __irq_exit_rcu then rcu_irq_exit();
	irqentry_exit();	// calls rcu_irq_exit()

... and so IIUC you call rcu_irq_enter() and rcu_irq_exit() twice,
getting the same double-increment of `dynticks_nmi_nesting` per
interrupt, and the same potential problem with
rcu_is_cpu_rrupt_from_idle().

Thanks,
Mark.
Paul E. McKenney Sept. 28, 2021, 1:55 p.m. UTC | #10
On Tue, Sep 28, 2021 at 09:32:22AM +0100, Mark Rutland wrote:
> On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> > > On Fri, Sep 24, 2021 at 03:59:54PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> > > > > [Adding Paul for RCU, s390 folk for entry code RCU semantics]
> > > > > 
> > > > > On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > > > > > After introducing arm64/kernel/entry_common.c which is akin to
> > > > > > kernel/entry/common.c , the housekeeping of rcu/trace are done twice as
> > > > > > the following:
> > > > > >     enter_from_kernel_mode()->rcu_irq_enter().
> > > > > > And
> > > > > >     gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> > > > > >
> > > > > > Besides redundance, based on code analysis, the redundance also raise
> > > > > > some mistake, e.g.  rcu_data->dynticks_nmi_nesting inc 2, which causes
> > > > > > rcu_is_cpu_rrupt_from_idle() unexpected.
> > > > > 
> > > > > Hmmm...
> > > > > 
> > > > > The fundamental questionss are:
> > > > > 
> > > > > 1) Who is supposed to be responsible for doing the rcu entry/exit?
> > > > > 
> > > > > 2) Is it supposed to matter if this happens multiple times?
> > > > > 
> > > > > For (1), I'd generally expect that this is supposed to happen in the
> > > > > arch/common entry code, since that itself (or the irqchip driver) could
> > > > > depend on RCU, and if that's the case thatn handle_domain_irq()
> > > > > shouldn't need to call rcu_irq_enter(). That would be consistent with
> > > > > the way we handle all other exceptions.
> > > > > 
> > > > > For (2) I don't know whether the level of nesting is suppoosed to
> > > > > matter. I was under the impression it wasn't meant to matter in general,
> > > > > so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> > > > > specific level of nesting.
> > > > > 
> > > > > >From a glance it looks like this would cause rcu_sched_clock_irq() to
> > > > > skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> > > > > doesn't sound right, at least...
> > > > > 
> > > > > Thomas, Paul, thoughts?
> > > > 
> > > > It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
> > > > be balanced.  Normally, this is taken care of by the fact that irq_enter()
> > > > invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit().  Similarly,
> > > > nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().
> > > 
> > > Sure; I didn't mean to suggest those weren't balanced! The problem here
> > > is *nesting*. Due to the structure of our entry code and the core IRQ
> > > code, when handling an IRQ we have a sequence:
> > > 
> > > 	irq_enter() // arch code
> > > 	irq_enter() // irq code
> > > 
> > > 	< irq handler here >
> > > 
> > > 	irq_exit() // irq code
> > > 	irq_exit() // arch code
> > > 
> > > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> > > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> > > expected result because of the additional nesting, since
> > > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> > > is only incremented once per exception entry, when it does:
> > > 
> > > 	/* Are we at first interrupt nesting level? */
> > > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> > > 	if (nesting > 1)
> > > 		return false;
> > > 
> > > What I'm trying to figure out is whether that expectation is legitimate,
> > > and assuming so, where the entry/exit should happen.
> > 
> > Oooh...
> > 
> > The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> > be unable to detect a userspace quiescent state for a non-nohz_full
> > CPU.  That could result in RCU CPU stall warnings if a user task runs
> > continuously on a given CPU for more than 21 seconds (60 seconds in
> > some distros).  And this can easily happen if the user has a CPU-bound
> > thread that is the only runnable task on that CPU.
> > 
> > So, yes, this does need some sort of resolution.
> > 
> > The traditional approach is (as you surmise) to have only a single call
> > to irq_enter() on exception entry and only a single call to irq_exit()
> > on exception exit.  If this is feasible, it is highly recommended.
> 
> Cool; that's roughly what I was expecting / hoping to hear!
> 
> > In theory, we could have that "1" in "nesting > 1" be a constant supplied
> > by the architecture (you would want "3" if I remember correctly) but
> > in practice could we please avoid this?  For one thing, if there is
> > some other path into the kernel for your architecture that does only a
> > single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> > a chance.  It would need to compare against a different value depending
> > on what exception showed up.  Even if that cannot happen, it would be
> > better if your architecture could remain in blissful ignorance of the
> > colorful details of ->dynticks_nmi_nesting manipulations.
> 
> I completely agree. I think it's much harder to keep that in check than
> to enforce a "once per architectural exception" policy in the arch code.
> 
> > Another approach would be for the arch code to supply RCU a function that
> > it calls.  If there is such a function (or perhaps better, if some new
> > Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> > "1" as it does now.  But you break it, you buy it!  ;-)
> 
> I guess we could look at the exception regs and inspect the original
> context, but it sounds overkill...
> 
> I think the cleanest thing is to leave this to arch code, and have the
> common IRQ code stay well clear. Unfortunately most architectures
> (including arch/arm) still need the common IRQ code to handle this, so
> we'll have to make that conditional on Kconfig, something like the below
> (build+boot tested only).
> 
> If there are no objections, I'll go check who else needs the same
> treatment (IIUC at least s390 will), and spin that as a real
> patch/series.

This approach (whether from you or Pinfan) looks good to me!

							Thanx, Paul

> Thanks,
> Mark.
> 
> ---->8----
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 8df1c7102643..c59475e50e4c 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -225,6 +225,12 @@ config GENERIC_SMP_IDLE_THREAD
>  config GENERIC_IDLE_POLL_SETUP
>         bool
>  
> +config ARCH_ENTERS_IRQ
> +       bool
> +       help
> +         An architecture should select this when it performs irq entry
> +         management itself (e.g. calling irq_enter() and irq_exit()).
> +
>  config ARCH_HAS_FORTIFY_SOURCE
>         bool
>         help
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5c7ae4c3954b..fa6476bf2b4d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -16,6 +16,7 @@ config ARM64
>         select ARCH_ENABLE_MEMORY_HOTREMOVE
>         select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>         select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> +       select ARCH_ENTERS_IRQ
>         select ARCH_HAS_CACHE_LINE_SIZE
>         select ARCH_HAS_DEBUG_VIRTUAL
>         select ARCH_HAS_DEBUG_VM_PGTABLE
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 4e3c29bb603c..6affa12222e0 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -677,6 +677,15 @@ int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq)
>  EXPORT_SYMBOL_GPL(generic_handle_domain_irq);
>  
>  #ifdef CONFIG_HANDLE_DOMAIN_IRQ
> +
> +#ifdef ARCH_ENTERS_IRQ
> +#define handle_irq_enter()
> +#define handle_irq_exit()
> +#else
> +#define handle_irq_enter()     irq_enter()
> +#define handle_irq_exit()      irq_exit()
> +#endif
> +
>  /**
>   * handle_domain_irq - Invoke the handler for a HW irq belonging to a domain,
>   *                     usually for a root interrupt controller
> @@ -693,7 +702,7 @@ int handle_domain_irq(struct irq_domain *domain,
>         struct irq_desc *desc;
>         int ret = 0;
>  
> -       irq_enter();
> +       handle_irq_enter();
>  
>         /* The irqdomain code provides boundary checks */
>         desc = irq_resolve_mapping(domain, hwirq);
> @@ -702,7 +711,7 @@ int handle_domain_irq(struct irq_domain *domain,
>         else
>                 ret = -EINVAL;
>  
> -       irq_exit();
> +       handle_irq_exit();
>         set_irq_regs(old_regs);
>         return ret;
>  }
>