mmc: meson-gx: remove IRQF_ONESHOT

Message ID	20201002164915.938217-1-jbrunet@baylibre.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=t563=DJ=lists.infradead.org=linux-amlogic-bounces+patchwork-linux-amlogic=patchwork.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BFCEF20719 From: Jerome Brunet <jbrunet@baylibre.com> To: Ulf Hansson <ulf.hansson@linaro.org> Subject: [PATCH] mmc: meson-gx: remove IRQF_ONESHOT Date: Fri, 2 Oct 2020 18:49:15 +0200 Message-Id: <20201002164915.938217-1-jbrunet@baylibre.com> MIME-Version: 1.0 summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [2a00:1450:4864:20:0:0:0:643 listed in] [list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature Precedence: list Cc: Kevin Hilman <khilman@baylibre.com>, Sebastian Andrzej Siewior <bigeasy@linutronix.de>, linux-mmc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-amlogic@lists.infradead.org, Brad Harper <bjharper@gmail.com>, Jerome Brunet <jbrunet@baylibre.com> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-amlogic" <linux-amlogic-bounces@lists.infradead.org> Errors-To: linux-amlogic-bounces+patchwork-linux-amlogic=patchwork.kernel.org@lists.infradead.org
Series	mmc: meson-gx: remove IRQF_ONESHOT \| expand mmc: meson-gx: remove IRQF_ONESHOT

Jerome Brunet Oct. 2, 2020, 4:49 p.m. UTC

IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
again until the thread part of the irq had finished doing its job.

Doing so upsets RT because, under RT, the hardirq part of the irq handler
is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
In this case, it has been reported to eventually trigger a deadlock with
the led subsystem.

Preventing RT from doing this migration was certainly not the intent, the
description of IRQF_ONESHOT does not really reflect this constraint:

 > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
 >              Used by threaded interrupts which need to keep the
 >              irq line disabled until the threaded handler has been run.

This is exactly what this driver was trying to acheive so I'm still a bit
confused whether this is a driver or an RT issue.

Anyway, this can be solved driver side by manually disabling the IRQs
instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
while still making sure the irq won't trigger until the threaded part of
the handler is done.

Fixes: eb4d81127746 ("mmc: meson-gx: correct irq flag")
Reported-by: Brad Harper <bjharper@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
---
 drivers/mmc/host/meson-gx-mmc.c | 47 ++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 18 deletions(-)

Ulf Hansson Oct. 5, 2020, 8:22 a.m. UTC | #1

+ Tglx

On Fri, 2 Oct 2020 at 18:49, Jerome Brunet <jbrunet@baylibre.com> wrote:
>
> IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
> again until the thread part of the irq had finished doing its job.
>
> Doing so upsets RT because, under RT, the hardirq part of the irq handler
> is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
> In this case, it has been reported to eventually trigger a deadlock with
> the led subsystem.
>
> Preventing RT from doing this migration was certainly not the intent, the
> description of IRQF_ONESHOT does not really reflect this constraint:
>
>  > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
>  >              Used by threaded interrupts which need to keep the
>  >              irq line disabled until the threaded handler has been run.
>
> This is exactly what this driver was trying to acheive so I'm still a bit
> confused whether this is a driver or an RT issue.
>
> Anyway, this can be solved driver side by manually disabling the IRQs
> instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
> while still making sure the irq won't trigger until the threaded part of
> the handler is done.

Thomas, may I have your opinion on this one.

I have no problem to apply $subject patch, but as Jerome also
highlights above - this kind of makes me wonder if this is an RT
issue, that perhaps deserves to be solved in a generic way.

What do you think?

Kind regards
Uffe

>
> Fixes: eb4d81127746 ("mmc: meson-gx: correct irq flag")
> Reported-by: Brad Harper <bjharper@gmail.com>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
> ---
>  drivers/mmc/host/meson-gx-mmc.c | 47 ++++++++++++++++++++-------------
>  1 file changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> index 08a3b1c05acb..effc356db904 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -101,8 +101,7 @@
>  #define   IRQ_RESP_STATUS BIT(14)
>  #define   IRQ_SDIO BIT(15)
>  #define   IRQ_EN_MASK \
> -       (IRQ_CRC_ERR | IRQ_TIMEOUTS | IRQ_END_OF_CHAIN | IRQ_RESP_STATUS |\
> -        IRQ_SDIO)
> +       (IRQ_CRC_ERR | IRQ_TIMEOUTS | IRQ_END_OF_CHAIN)
>
>  #define SD_EMMC_CMD_CFG 0x50
>  #define SD_EMMC_CMD_ARG 0x54
> @@ -170,6 +169,7 @@ struct meson_host {
>         dma_addr_t descs_dma_addr;
>
>         int irq;
> +       u32 irq_en;
>
>         bool vqmmc_enabled;
>  };
> @@ -842,22 +842,24 @@ static irqreturn_t meson_mmc_irq(int irq, void *dev_id)
>         struct meson_host *host = dev_id;
>         struct mmc_command *cmd;
>         struct mmc_data *data;
> -       u32 irq_en, status, raw_status;
> +       u32  status, raw_status;
>         irqreturn_t ret = IRQ_NONE;
>
> -       irq_en = readl(host->regs + SD_EMMC_IRQ_EN);
> +       /* Disable irqs */
> +       writel(0, host->regs + SD_EMMC_IRQ_EN);
> +
>         raw_status = readl(host->regs + SD_EMMC_STATUS);
> -       status = raw_status & irq_en;
> +       status = raw_status & host->irq_en;
>
>         if (!status) {
>                 dev_dbg(host->dev,
>                         "Unexpected IRQ! irq_en 0x%08x - status 0x%08x\n",
> -                        irq_en, raw_status);
> -               return IRQ_NONE;
> +                        host->irq_en, raw_status);
> +               goto none;
>         }
>
>         if (WARN_ON(!host) || WARN_ON(!host->cmd))
> -               return IRQ_NONE;
> +               goto none;
>
>         /* ack all raised interrupts */
>         writel(status, host->regs + SD_EMMC_STATUS);
> @@ -908,6 +910,11 @@ static irqreturn_t meson_mmc_irq(int irq, void *dev_id)
>         if (ret == IRQ_HANDLED)
>                 meson_mmc_request_done(host->mmc, cmd->mrq);
>
> +none:
> +       /* Enable the irq again if the thread will not run */
> +       if (ret != IRQ_WAKE_THREAD)
> +               writel(host->irq_en, host->regs + SD_EMMC_IRQ_EN);
> +
>         return ret;
>  }
>
> @@ -934,15 +941,17 @@ static irqreturn_t meson_mmc_irq_thread(int irq, void *dev_id)
>         struct mmc_command *next_cmd, *cmd = host->cmd;
>         struct mmc_data *data;
>         unsigned int xfer_bytes;
> +       int ret = IRQ_HANDLED;
>
> -       if (WARN_ON(!cmd))
> -               return IRQ_NONE;
> +       if (WARN_ON(!cmd)) {
> +               ret = IRQ_NONE;
> +               goto out;
> +       }
>
>         if (cmd->error) {
>                 meson_mmc_wait_desc_stop(host);
>                 meson_mmc_request_done(host->mmc, cmd->mrq);
> -
> -               return IRQ_HANDLED;
> +               goto out;
>         }
>
>         data = cmd->data;
> @@ -959,7 +968,10 @@ static irqreturn_t meson_mmc_irq_thread(int irq, void *dev_id)
>         else
>                 meson_mmc_request_done(host->mmc, cmd->mrq);
>
> -       return IRQ_HANDLED;
> +out:
> +       /* Re-enable the irqs */
> +       writel(host->irq_en, host->regs + SD_EMMC_IRQ_EN);
> +       return ret;
>  }
>
>  /*
> @@ -1133,13 +1145,12 @@ static int meson_mmc_probe(struct platform_device *pdev)
>
>         /* clear, ack and enable interrupts */
>         writel(0, host->regs + SD_EMMC_IRQ_EN);
> -       writel(IRQ_CRC_ERR | IRQ_TIMEOUTS | IRQ_END_OF_CHAIN,
> -              host->regs + SD_EMMC_STATUS);
> -       writel(IRQ_CRC_ERR | IRQ_TIMEOUTS | IRQ_END_OF_CHAIN,
> -              host->regs + SD_EMMC_IRQ_EN);
> +       host->irq_en = IRQ_EN_MASK;
> +       writel(host->irq_en, host->regs + SD_EMMC_STATUS);
> +       writel(host->irq_en, host->regs + SD_EMMC_IRQ_EN);
>
>         ret = request_threaded_irq(host->irq, meson_mmc_irq,
> -                                  meson_mmc_irq_thread, IRQF_ONESHOT,
> +                                  meson_mmc_irq_thread, 0,
>                                    dev_name(&pdev->dev), host);
>         if (ret)
>                 goto err_init_clk;
> --
> 2.25.4
>

Thomas Gleixner Oct. 5, 2020, 8:55 a.m. UTC | #2

On Mon, Oct 05 2020 at 10:22, Ulf Hansson wrote:
> On Fri, 2 Oct 2020 at 18:49, Jerome Brunet <jbrunet@baylibre.com> wrote:
>>
>> IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
>> again until the thread part of the irq had finished doing its job.
>>
>> Doing so upsets RT because, under RT, the hardirq part of the irq handler
>> is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
>> In this case, it has been reported to eventually trigger a deadlock with
>> the led subsystem.
>>
>> Preventing RT from doing this migration was certainly not the intent, the
>> description of IRQF_ONESHOT does not really reflect this constraint:
>>
>>  > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
>>  >              Used by threaded interrupts which need to keep the
>>  >              irq line disabled until the threaded handler has been run.
>>
>> This is exactly what this driver was trying to acheive so I'm still a bit
>> confused whether this is a driver or an RT issue.
>>
>> Anyway, this can be solved driver side by manually disabling the IRQs
>> instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
>> while still making sure the irq won't trigger until the threaded part of
>> the handler is done.
>
> Thomas, may I have your opinion on this one.
>
> I have no problem to apply $subject patch, but as Jerome also
> highlights above - this kind of makes me wonder if this is an RT
> issue, that perhaps deserves to be solved in a generic way.
>
> What do you think?

Let me stare at the core code. Something smells fishy.

Jerome Brunet Oct. 5, 2020, 12:31 p.m. UTC | #3

On Mon 05 Oct 2020 at 10:55, Thomas Gleixner <tglx@linutronix.de> wrote:

> On Mon, Oct 05 2020 at 10:22, Ulf Hansson wrote:
>> On Fri, 2 Oct 2020 at 18:49, Jerome Brunet <jbrunet@baylibre.com> wrote:
>>>
>>> IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
>>> again until the thread part of the irq had finished doing its job.
>>>
>>> Doing so upsets RT because, under RT, the hardirq part of the irq handler
>>> is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
>>> In this case, it has been reported to eventually trigger a deadlock with
>>> the led subsystem.
>>>
>>> Preventing RT from doing this migration was certainly not the intent, the
>>> description of IRQF_ONESHOT does not really reflect this constraint:
>>>
>>>  > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
>>>  >              Used by threaded interrupts which need to keep the
>>>  >              irq line disabled until the threaded handler has been run.
>>>
>>> This is exactly what this driver was trying to acheive so I'm still a bit
>>> confused whether this is a driver or an RT issue.
>>>
>>> Anyway, this can be solved driver side by manually disabling the IRQs
>>> instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
>>> while still making sure the irq won't trigger until the threaded part of
>>> the handler is done.
>>
>> Thomas, may I have your opinion on this one.
>>
>> I have no problem to apply $subject patch, but as Jerome also
>> highlights above - this kind of makes me wonder if this is an RT
>> issue, that perhaps deserves to be solved in a generic way.
>>
>> What do you think?
>
> Let me stare at the core code. Something smells fishy.

FYI: initial discussion can be found here
https://lore.kernel.org/linux-amlogic/24a844c3-c2e0-c735-ccb7-83736218b548@gmail.com/

Thomas Gleixner Oct. 6, 2020, 12:43 p.m. UTC | #4

On Mon, Oct 05 2020 at 10:55, Thomas Gleixner wrote:
> On Mon, Oct 05 2020 at 10:22, Ulf Hansson wrote:
>> On Fri, 2 Oct 2020 at 18:49, Jerome Brunet <jbrunet@baylibre.com> wrote:
>>>
>>> IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
>>> again until the thread part of the irq had finished doing its job.
>>>
>>> Doing so upsets RT because, under RT, the hardirq part of the irq handler
>>> is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
>>> In this case, it has been reported to eventually trigger a deadlock with
>>> the led subsystem.
>>>
>>> Preventing RT from doing this migration was certainly not the intent, the
>>> description of IRQF_ONESHOT does not really reflect this constraint:
>>>
>>>  > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
>>>  >              Used by threaded interrupts which need to keep the
>>>  >              irq line disabled until the threaded handler has been run.
>>>
>>> This is exactly what this driver was trying to acheive so I'm still a bit
>>> confused whether this is a driver or an RT issue.
>>>
>>> Anyway, this can be solved driver side by manually disabling the IRQs
>>> instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
>>> while still making sure the irq won't trigger until the threaded part of
>>> the handler is done.
>>
>> Thomas, may I have your opinion on this one.
>>
>> I have no problem to apply $subject patch, but as Jerome also
>> highlights above - this kind of makes me wonder if this is an RT
>> issue, that perhaps deserves to be solved in a generic way.
>>
>> What do you think?
>
> Let me stare at the core code. Something smells fishy.

The point is that for threaded interrupts (without a primary handler)
the core needs to be told that the interrupt line should be masked until
the threaded handler finished. That's what IRQF_ONESHOT is for.

For interrupts which have both a primary and a threaded handler that's a
different story. The primary handler decides whether the thread should
be woken and it decides whether to block further interrupt delivery in
the device or keep it enabled.

When forced interrupt threading is enabled (even independent of RT) then
we have the following cases:

  1) Regular device interrupt (primary handler only)

     The primary handler is replaced with the default 'wake up thread'
     handler and the original primary handler becomes the threaded
     handler. This enforces IRQF_ONESHOT so that the interupt line (for
     level interrupts) stays masked until the thread completed handling.

  2) Threaded interrupts

     Interrupts which have been requested as threaded handler (no
     primary handler) are not changed obvioulsy

  3) Interrupts which have both a primary and a thread handler

     Here IRQF_ONESHOT decides whether the primary handler will be
     forced threaded or not.

     That's a bit unfortunate and ill defined and was not intended to be
     used that way.

     We rather should make interrupts which need to have their primary
     handler in hard interrupt context to set IRQF_NO_THREAD. That
     should at the same time confirm that the primary handler is RT
     safe.

     Let me stare at the core code and the actual usage sites some more.

Thanks,

        tglx

Brad Harper Oct. 6, 2020, 1:45 p.m. UTC | #5

I'm happy to test anything on a range of amlogic hardware with standard 
/ rt and  multiple mmc devices.  Ill test Jerome's patch in next 24 
hours to report the results.

On 6/10/2020 11:43 pm, Thomas Gleixner wrote:
> On Mon, Oct 05 2020 at 10:55, Thomas Gleixner wrote:
>> On Mon, Oct 05 2020 at 10:22, Ulf Hansson wrote:
>>> On Fri, 2 Oct 2020 at 18:49, Jerome Brunet <jbrunet@baylibre.com> wrote:
>>>> IRQF_ONESHOT was added to this driver to make sure the irq was not enabled
>>>> again until the thread part of the irq had finished doing its job.
>>>>
>>>> Doing so upsets RT because, under RT, the hardirq part of the irq handler
>>>> is not migrated to a thread if the irq is claimed with IRQF_ONESHOT.
>>>> In this case, it has been reported to eventually trigger a deadlock with
>>>> the led subsystem.
>>>>
>>>> Preventing RT from doing this migration was certainly not the intent, the
>>>> description of IRQF_ONESHOT does not really reflect this constraint:
>>>>
>>>>   > IRQF_ONESHOT - Interrupt is not reenabled after the hardirq handler finished.
>>>>   >              Used by threaded interrupts which need to keep the
>>>>   >              irq line disabled until the threaded handler has been run.
>>>>
>>>> This is exactly what this driver was trying to acheive so I'm still a bit
>>>> confused whether this is a driver or an RT issue.
>>>>
>>>> Anyway, this can be solved driver side by manually disabling the IRQs
>>>> instead of the relying on the IRQF_ONESHOT. IRQF_ONESHOT may then be removed
>>>> while still making sure the irq won't trigger until the threaded part of
>>>> the handler is done.
>>> Thomas, may I have your opinion on this one.
>>>
>>> I have no problem to apply $subject patch, but as Jerome also
>>> highlights above - this kind of makes me wonder if this is an RT
>>> issue, that perhaps deserves to be solved in a generic way.
>>>
>>> What do you think?
>> Let me stare at the core code. Something smells fishy.
> The point is that for threaded interrupts (without a primary handler)
> the core needs to be told that the interrupt line should be masked until
> the threaded handler finished. That's what IRQF_ONESHOT is for.
>
> For interrupts which have both a primary and a threaded handler that's a
> different story. The primary handler decides whether the thread should
> be woken and it decides whether to block further interrupt delivery in
> the device or keep it enabled.
>
> When forced interrupt threading is enabled (even independent of RT) then
> we have the following cases:
>
>    1) Regular device interrupt (primary handler only)
>
>       The primary handler is replaced with the default 'wake up thread'
>       handler and the original primary handler becomes the threaded
>       handler. This enforces IRQF_ONESHOT so that the interupt line (for
>       level interrupts) stays masked until the thread completed handling.
>
>    2) Threaded interrupts
>
>       Interrupts which have been requested as threaded handler (no
>       primary handler) are not changed obvioulsy
>
>    3) Interrupts which have both a primary and a thread handler
>
>       Here IRQF_ONESHOT decides whether the primary handler will be
>       forced threaded or not.
>
>       That's a bit unfortunate and ill defined and was not intended to be
>       used that way.
>
>       We rather should make interrupts which need to have their primary
>       handler in hard interrupt context to set IRQF_NO_THREAD. That
>       should at the same time confirm that the primary handler is RT
>       safe.
>
>       Let me stare at the core code and the actual usage sites some more.
>
> Thanks,
>
>          tglx
>
>
>
>
>
>

Thomas Gleixner Oct. 6, 2020, 3:33 p.m. UTC | #6

Brad,

On Wed, Oct 07 2020 at 00:45, Brad Harper wrote:
> I'm happy to test anything on a range of amlogic hardware with standard 
> / rt and  multiple mmc devices.  Ill test Jerome's patch in next 24 
> hours to report the results.

please do not top-post and trim your replies.

> On 6/10/2020 11:43 pm, Thomas Gleixner wrote:
>>       We rather should make interrupts which need to have their primary
>>       handler in hard interrupt context to set IRQF_NO_THREAD. That
>>       should at the same time confirm that the primary handler is RT
>>       safe.
>>
>>       Let me stare at the core code and the actual usage sites some more.

So there are a few nasties in there and I faintly remember that there
was an assumption that interrupts which are requested with both a
primary and a secondary handler should quiesce the device interrupt in
the primary handler if needed. OTOH, this also enforces that the primary
handler is RT safe, which is after a quick scan of all the usage sites
not a given and quite some of the users rely on IRQF_ONESHOT.

The below untested patch should cure the problem and keep the interrupt
line masked if requested with IRQF_ONESHOT.

Thanks,

        tglx
---
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -967,8 +967,7 @@ static int irq_wait_for_interrupt(struct
 static void irq_finalize_oneshot(struct irq_desc *desc,
 				 struct irqaction *action)
 {
-	if (!(desc->istate & IRQS_ONESHOT) ||
-	    action->handler == irq_forced_secondary_handler)
+	if (!(action->flags & IRQF_ONESHOT))
 		return;
 again:
 	chip_bus_lock(desc);
@@ -1073,10 +1072,6 @@ irq_forced_thread_fn(struct irq_desc *de
 
 	local_bh_disable();
 	ret = action->thread_fn(action->irq, action->dev_id);
-	if (ret == IRQ_HANDLED)
-		atomic_inc(&desc->threads_handled);
-
-	irq_finalize_oneshot(desc, action);
 	local_bh_enable();
 	return ret;
 }
@@ -1089,14 +1084,7 @@ irq_forced_thread_fn(struct irq_desc *de
 static irqreturn_t irq_thread_fn(struct irq_desc *desc,
 		struct irqaction *action)
 {
-	irqreturn_t ret;
-
-	ret = action->thread_fn(action->irq, action->dev_id);
-	if (ret == IRQ_HANDLED)
-		atomic_inc(&desc->threads_handled);
-
-	irq_finalize_oneshot(desc, action);
-	return ret;
+	return action->thread_fn(action->irq, action->dev_id);
 }
 
 static void wake_threads_waitq(struct irq_desc *desc)
@@ -1172,9 +1160,14 @@ static int irq_thread(void *data)
 		irq_thread_check_affinity(desc, action);
 
 		action_ret = handler_fn(desc, action);
+		if (action_ret == IRQ_HANDLED)
+			atomic_inc(&desc->threads_handled);
+
 		if (action_ret == IRQ_WAKE_THREAD)
 			irq_wake_secondary(desc, action);
 
+		irq_finalize_oneshot(desc, action);
+
 		wake_threads_waitq(desc);
 	}
 
@@ -1219,7 +1212,7 @@ static int irq_setup_forced_threading(st
 {
 	if (!force_irqthreads)
 		return 0;
-	if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
+	if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU))
 		return 0;
 
 	/*
@@ -1229,8 +1222,6 @@ static int irq_setup_forced_threading(st
 	if (new->handler == irq_default_primary_handler)
 		return 0;
 
-	new->flags |= IRQF_ONESHOT;
-
 	/*
 	 * Handle the case where we have a real primary handler and a
 	 * thread handler. We force thread them as well by creating a
@@ -1246,8 +1237,11 @@ static int irq_setup_forced_threading(st
 		new->secondary->dev_id = new->dev_id;
 		new->secondary->irq = new->irq;
 		new->secondary->name = new->name;
+		/* Preserve the requested IRQF_ONESHOT */
+		new->secondary->flags = new->flags & IRQF_ONESHOT;
 	}
 	/* Deal with the primary handler */
+	new->flags |= IRQF_ONESHOT;
 	set_bit(IRQTF_FORCED_THREAD, &new->thread_flags);
 	new->thread_fn = new->handler;
 	new->handler = irq_default_primary_handler;
@@ -1341,6 +1335,38 @@ setup_irq_thread(struct irqaction *new,
 	return 0;
 }
 
+static unsigned long thread_mask_alloc(unsigned long thread_mask)
+{
+	/*
+	 * Unlikely to have 32 resp 64 irqs sharing one line,
+	 * but who knows.
+	 */
+	if (thread_mask == ~0UL)
+		return 0;
+
+	/*
+	 * The thread_mask for the action is or'ed to
+	 * desc->thread_active to indicate that the
+	 * IRQF_ONESHOT thread handler has been woken, but not
+	 * yet finished. The bit is cleared when a thread
+	 * completes. When all threads of a shared interrupt
+	 * line have completed desc->threads_active becomes
+	 * zero and the interrupt line is unmasked. See
+	 * handle.c:irq_wake_thread() for further information.
+	 *
+	 * If no thread is woken by primary (hard irq context)
+	 * interrupt handlers, then desc->threads_active is
+	 * also checked for zero to unmask the irq line in the
+	 * affected hard irq flow handlers
+	 * (handle_[fasteoi|level]_irq).
+	 *
+	 * The new action gets the first zero bit of
+	 * thread_mask assigned. See the loop above which or's
+	 * all existing action->thread_mask bits.
+	 */
+	return 1UL << ffz(thread_mask);
+}
+
 /*
  * Internal function to register an irqaction - typically used to
  * allocate special interrupts that are part of the architecture.
@@ -1525,35 +1551,18 @@ static int
 	 * conditional in irq_wake_thread().
 	 */
 	if (new->flags & IRQF_ONESHOT) {
-		/*
-		 * Unlikely to have 32 resp 64 irqs sharing one line,
-		 * but who knows.
-		 */
-		if (thread_mask == ~0UL) {
-			ret = -EBUSY;
+		ret = -EBUSY;
+		new->thread_mask = thread_mask_alloc(thread_mask);
+		if (!new->thread_mask)
 			goto out_unlock;
+
+		if (new->secondary && (new->secondary->flags & IRQF_ONESHOT)) {
+			thread_mask |= new->thread_mask;
+			new->secondary->thread_mask =
+				thread_mask_alloc(thread_mask);
+			if (!new->secondary->thread_mask)
+				goto out_unlock;
 		}
-		/*
-		 * The thread_mask for the action is or'ed to
-		 * desc->thread_active to indicate that the
-		 * IRQF_ONESHOT thread handler has been woken, but not
-		 * yet finished. The bit is cleared when a thread
-		 * completes. When all threads of a shared interrupt
-		 * line have completed desc->threads_active becomes
-		 * zero and the interrupt line is unmasked. See
-		 * handle.c:irq_wake_thread() for further information.
-		 *
-		 * If no thread is woken by primary (hard irq context)
-		 * interrupt handlers, then desc->threads_active is
-		 * also checked for zero to unmask the irq line in the
-		 * affected hard irq flow handlers
-		 * (handle_[fasteoi|level]_irq).
-		 *
-		 * The new action gets the first zero bit of
-		 * thread_mask assigned. See the loop above which or's
-		 * all existing action->thread_mask bits.
-		 */
-		new->thread_mask = 1UL << ffz(thread_mask);
 
 	} else if (new->handler == irq_default_primary_handler &&
 		   !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {

Jerome Brunet Oct. 7, 2020, 11:32 a.m. UTC | #7

On Tue 06 Oct 2020 at 17:33, Thomas Gleixner <tglx@linutronix.de> wrote:

> Brad,
>
> On Wed, Oct 07 2020 at 00:45, Brad Harper wrote:
>> I'm happy to test anything on a range of amlogic hardware with standard 
>> / rt and  multiple mmc devices.  Ill test Jerome's patch in next 24 
>> hours to report the results.
>
> please do not top-post and trim your replies.
>
>> On 6/10/2020 11:43 pm, Thomas Gleixner wrote:
>>>       We rather should make interrupts which need to have their primary
>>>       handler in hard interrupt context to set IRQF_NO_THREAD. That
>>>       should at the same time confirm that the primary handler is RT
>>>       safe.
>>>
>>>       Let me stare at the core code and the actual usage sites some more.
>
> So there are a few nasties in there and I faintly remember that there
> was an assumption that interrupts which are requested with both a
> primary and a secondary handler should quiesce the device interrupt in
> the primary handler if needed. OTOH, this also enforces that the primary
> handler is RT safe, which is after a quick scan of all the usage sites
> not a given and quite some of the users rely on IRQF_ONESHOT.
>
> The below untested patch should cure the problem and keep the interrupt
> line masked if requested with IRQF_ONESHOT.
>

With arm64 defconfig on Khadas vim3, no obvious regression. Looks good.

Tested-by: Jerome Brunet <jbrunet@baylibre.com>

I did not test with RT. Brad, Could you let us know is Thomas's patch
works for you ? Thx

Brad Harper Oct. 8, 2020, 5:11 a.m. UTC | #8

On 7/10/2020 10:32 pm, Jerome Brunet wrote:
> With arm64 defconfig on Khadas vim3, no obvious regression. Looks good.
>
> Tested-by: Jerome Brunet <jbrunet@baylibre.com>
>
> I did not test with RT. Brad, Could you let us know is Thomas's patch
> works for you ? Thx
There was a merge conflict in applying against v5.9-rc8-rt12 with
particular this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/kernel/irq/manage.c?h=linux-5.9.y-rt&id=18df00ef0b2b1513dc8f1a9ed26b11fff2261c30

I did manage to add the patch after attempting to resolve the conflict
which solves the deadlock issue I am seeing with mmc and works fine during
testing (a kernel compilation on preempt_rt configured kernel).

irq_thread in /kernel/irq/manage.c Looks like this (not 100% sure I
should have placed the irq_finalize_oneshot before
add_interrupt_randomness).

Based on this I can provide

Tested-by: Brad Harper <bjharper@gmail.com>

---
static int irq_thread(void *data)
{
         struct callback_head on_exit_work;
         struct irqaction *action = data;
         struct irq_desc *desc = irq_to_desc(action->irq);
         irqreturn_t (*handler_fn)(struct irq_desc *desc,
                         struct irqaction *action);

         if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
&action->thread_flags))
                 handler_fn = irq_forced_thread_fn;
         else
                 handler_fn = irq_thread_fn;

         init_task_work(&on_exit_work, irq_thread_dtor);
         task_work_add(current, &on_exit_work, false);

         irq_thread_check_affinity(desc, action);

         while (!irq_wait_for_interrupt(action)) {
                 irqreturn_t action_ret;

                 irq_thread_check_affinity(desc, action);

                 action_ret = handler_fn(desc, action);
                 if (action_ret == IRQ_HANDLED)
                         atomic_inc(&desc->threads_handled);

                 if (action_ret == IRQ_WAKE_THREAD)
                         irq_wake_secondary(desc, action);

                 irq_finalize_oneshot(desc, action);

                 if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
                        migrate_disable();
                        add_interrupt_randomness(action->irq, 0,
                                 desc->random_ip ^ (unsigned long) action);
                        migrate_enable();
                 }
                 wake_threads_waitq(desc);
         }

         /*
          * This is the regular exit path. __free_irq() is stopping the
          * thread via kthread_stop() after calling
          * synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
          * oneshot mask bit can be set.
          */
         task_work_cancel(current, irq_thread_dtor);
         return 0;
}

---

Ulf Hansson Oct. 8, 2020, 9:08 a.m. UTC | #9

On Tue, 6 Oct 2020 at 17:33, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Brad,
>
> On Wed, Oct 07 2020 at 00:45, Brad Harper wrote:
> > I'm happy to test anything on a range of amlogic hardware with standard
> > / rt and  multiple mmc devices.  Ill test Jerome's patch in next 24
> > hours to report the results.
>
> please do not top-post and trim your replies.
>
> > On 6/10/2020 11:43 pm, Thomas Gleixner wrote:
> >>       We rather should make interrupts which need to have their primary
> >>       handler in hard interrupt context to set IRQF_NO_THREAD. That
> >>       should at the same time confirm that the primary handler is RT
> >>       safe.
> >>
> >>       Let me stare at the core code and the actual usage sites some more.
>
> So there are a few nasties in there and I faintly remember that there
> was an assumption that interrupts which are requested with both a
> primary and a secondary handler should quiesce the device interrupt in
> the primary handler if needed. OTOH, this also enforces that the primary
> handler is RT safe, which is after a quick scan of all the usage sites
> not a given and quite some of the users rely on IRQF_ONESHOT.
>
> The below untested patch should cure the problem and keep the interrupt
> line masked if requested with IRQF_ONESHOT.
>
> Thanks,
>
>         tglx

Thomas, thanks a lot for helping out and looking at this!

It looks like the testing of the patch below went well. Are you
intending to queue up the patch via your tip tree?

If you need any help, just tell us!

Kind regards
Uffe

> ---
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -967,8 +967,7 @@ static int irq_wait_for_interrupt(struct
>  static void irq_finalize_oneshot(struct irq_desc *desc,
>                                  struct irqaction *action)
>  {
> -       if (!(desc->istate & IRQS_ONESHOT) ||
> -           action->handler == irq_forced_secondary_handler)
> +       if (!(action->flags & IRQF_ONESHOT))
>                 return;
>  again:
>         chip_bus_lock(desc);
> @@ -1073,10 +1072,6 @@ irq_forced_thread_fn(struct irq_desc *de
>
>         local_bh_disable();
>         ret = action->thread_fn(action->irq, action->dev_id);
> -       if (ret == IRQ_HANDLED)
> -               atomic_inc(&desc->threads_handled);
> -
> -       irq_finalize_oneshot(desc, action);
>         local_bh_enable();
>         return ret;
>  }
> @@ -1089,14 +1084,7 @@ irq_forced_thread_fn(struct irq_desc *de
>  static irqreturn_t irq_thread_fn(struct irq_desc *desc,
>                 struct irqaction *action)
>  {
> -       irqreturn_t ret;
> -
> -       ret = action->thread_fn(action->irq, action->dev_id);
> -       if (ret == IRQ_HANDLED)
> -               atomic_inc(&desc->threads_handled);
> -
> -       irq_finalize_oneshot(desc, action);
> -       return ret;
> +       return action->thread_fn(action->irq, action->dev_id);
>  }
>
>  static void wake_threads_waitq(struct irq_desc *desc)
> @@ -1172,9 +1160,14 @@ static int irq_thread(void *data)
>                 irq_thread_check_affinity(desc, action);
>
>                 action_ret = handler_fn(desc, action);
> +               if (action_ret == IRQ_HANDLED)
> +                       atomic_inc(&desc->threads_handled);
> +
>                 if (action_ret == IRQ_WAKE_THREAD)
>                         irq_wake_secondary(desc, action);
>
> +               irq_finalize_oneshot(desc, action);
> +
>                 wake_threads_waitq(desc);
>         }
>
> @@ -1219,7 +1212,7 @@ static int irq_setup_forced_threading(st
>  {
>         if (!force_irqthreads)
>                 return 0;
> -       if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
> +       if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU))
>                 return 0;
>
>         /*
> @@ -1229,8 +1222,6 @@ static int irq_setup_forced_threading(st
>         if (new->handler == irq_default_primary_handler)
>                 return 0;
>
> -       new->flags |= IRQF_ONESHOT;
> -
>         /*
>          * Handle the case where we have a real primary handler and a
>          * thread handler. We force thread them as well by creating a
> @@ -1246,8 +1237,11 @@ static int irq_setup_forced_threading(st
>                 new->secondary->dev_id = new->dev_id;
>                 new->secondary->irq = new->irq;
>                 new->secondary->name = new->name;
> +               /* Preserve the requested IRQF_ONESHOT */
> +               new->secondary->flags = new->flags & IRQF_ONESHOT;
>         }
>         /* Deal with the primary handler */
> +       new->flags |= IRQF_ONESHOT;
>         set_bit(IRQTF_FORCED_THREAD, &new->thread_flags);
>         new->thread_fn = new->handler;
>         new->handler = irq_default_primary_handler;
> @@ -1341,6 +1335,38 @@ setup_irq_thread(struct irqaction *new,
>         return 0;
>  }
>
> +static unsigned long thread_mask_alloc(unsigned long thread_mask)
> +{
> +       /*
> +        * Unlikely to have 32 resp 64 irqs sharing one line,
> +        * but who knows.
> +        */
> +       if (thread_mask == ~0UL)
> +               return 0;
> +
> +       /*
> +        * The thread_mask for the action is or'ed to
> +        * desc->thread_active to indicate that the
> +        * IRQF_ONESHOT thread handler has been woken, but not
> +        * yet finished. The bit is cleared when a thread
> +        * completes. When all threads of a shared interrupt
> +        * line have completed desc->threads_active becomes
> +        * zero and the interrupt line is unmasked. See
> +        * handle.c:irq_wake_thread() for further information.
> +        *
> +        * If no thread is woken by primary (hard irq context)
> +        * interrupt handlers, then desc->threads_active is
> +        * also checked for zero to unmask the irq line in the
> +        * affected hard irq flow handlers
> +        * (handle_[fasteoi|level]_irq).
> +        *
> +        * The new action gets the first zero bit of
> +        * thread_mask assigned. See the loop above which or's
> +        * all existing action->thread_mask bits.
> +        */
> +       return 1UL << ffz(thread_mask);
> +}
> +
>  /*
>   * Internal function to register an irqaction - typically used to
>   * allocate special interrupts that are part of the architecture.
> @@ -1525,35 +1551,18 @@ static int
>          * conditional in irq_wake_thread().
>          */
>         if (new->flags & IRQF_ONESHOT) {
> -               /*
> -                * Unlikely to have 32 resp 64 irqs sharing one line,
> -                * but who knows.
> -                */
> -               if (thread_mask == ~0UL) {
> -                       ret = -EBUSY;
> +               ret = -EBUSY;
> +               new->thread_mask = thread_mask_alloc(thread_mask);
> +               if (!new->thread_mask)
>                         goto out_unlock;
> +
> +               if (new->secondary && (new->secondary->flags & IRQF_ONESHOT)) {
> +                       thread_mask |= new->thread_mask;
> +                       new->secondary->thread_mask =
> +                               thread_mask_alloc(thread_mask);
> +                       if (!new->secondary->thread_mask)
> +                               goto out_unlock;
>                 }
> -               /*
> -                * The thread_mask for the action is or'ed to
> -                * desc->thread_active to indicate that the
> -                * IRQF_ONESHOT thread handler has been woken, but not
> -                * yet finished. The bit is cleared when a thread
> -                * completes. When all threads of a shared interrupt
> -                * line have completed desc->threads_active becomes
> -                * zero and the interrupt line is unmasked. See
> -                * handle.c:irq_wake_thread() for further information.
> -                *
> -                * If no thread is woken by primary (hard irq context)
> -                * interrupt handlers, then desc->threads_active is
> -                * also checked for zero to unmask the irq line in the
> -                * affected hard irq flow handlers
> -                * (handle_[fasteoi|level]_irq).
> -                *
> -                * The new action gets the first zero bit of
> -                * thread_mask assigned. See the loop above which or's
> -                * all existing action->thread_mask bits.
> -                */
> -               new->thread_mask = 1UL << ffz(thread_mask);
>
>         } else if (new->handler == irq_default_primary_handler &&
>                    !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {
>
>

Jerome Brunet Nov. 10, 2020, 3:04 p.m. UTC | #10

On Thu 08 Oct 2020 at 11:08, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> Thomas, thanks a lot for helping out and looking at this!
>
> It looks like the testing of the patch below went well. Are you
> intending to queue up the patch via your tip tree?
>
> If you need any help, just tell us!
>
> Kind regards
> Uffe
>

Hi everyone,

Do we have a plan for this issue ?
I've had Thomas's change in my tree for a month, so far, so good.

Cheers
Jerome

Ulf Hansson Nov. 11, 2020, 10:47 a.m. UTC | #11

On Tue, 10 Nov 2020 at 16:05, Jerome Brunet <jbrunet@baylibre.com> wrote:
>
>
> On Thu 08 Oct 2020 at 11:08, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > Thomas, thanks a lot for helping out and looking at this!
> >
> > It looks like the testing of the patch below went well. Are you
> > intending to queue up the patch via your tip tree?
> >
> > If you need any help, just tell us!
> >
> > Kind regards
> > Uffe
> >
>
> Hi everyone,
>
> Do we have a plan for this issue ?
> I've had Thomas's change in my tree for a month, so far, so good.

Instead of waiting for Thomas, perhaps you can pick up his patch and
re-submit it?

From my side, I can of course apply your original fix to the mmc
driver, as an intermediate step. Is there a hurry?

Kind regards
Uffe

Jerome Brunet Nov. 13, 2020, 3:25 p.m. UTC | #12

On Wed 11 Nov 2020 at 11:47, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> On Tue, 10 Nov 2020 at 16:05, Jerome Brunet <jbrunet@baylibre.com> wrote:
>>
>>
>> On Thu 08 Oct 2020 at 11:08, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> >
>> > Thomas, thanks a lot for helping out and looking at this!
>> >
>> > It looks like the testing of the patch below went well. Are you
>> > intending to queue up the patch via your tip tree?
>> >
>> > If you need any help, just tell us!
>> >
>> > Kind regards
>> > Uffe
>> >
>>
>> Hi everyone,
>>
>> Do we have a plan for this issue ?
>> I've had Thomas's change in my tree for a month, so far, so good.
>
> Instead of waiting for Thomas, perhaps you can pick up his patch and
> re-submit it?

TBH, I'm not confortable signing off something when I have no idea about
the implication, which is the case here.

>
> From my side, I can of course apply your original fix to the mmc
> driver, as an intermediate step.

In Thomas first reply, I did not really understand if it was bad from
the driver to use IRQF_ONESHOT. If it is OK, i'd prefer if things stayed
as they are. Otherwise, feel free to apply it.

> Is there a hurry?
>

Absolutely no hurry, at least not for me.

I noticed I still had Thomas's patch on top of the last rc which means
it had not been applied yet. Fishing for news, that's all.

> Kind regards
> Uffe

mmc: meson-gx: remove IRQF_ONESHOT

Commit Message

Comments

Patch