[v2] spi: spi-geni-qcom: Speculative fix of "nobody cared" about interrupt

Message ID 20200317133653.v2.1.I752ebdcfd5e8bf0de06d66e767b8974932b3620e@changeid (mailing list archive)
State New, archived

Commit Message

Doug Anderson March 17, 2020, 8:37 p.m. UTC
Every once in a while (like once in a few months on a device) people
have seen warnings on devices using spi-geni-qcom like:

  irq ...: nobody cared (try booting with the "irqpoll" option)

...where the interrupt number listed matches up with "spi_geni" in
/proc/interrupts.

You can get "nobody cared" if the interrupt handler returns IRQ_NONE.
Once you get into this state the driver will just stop working.
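
For reference, the genirq core's accounting works roughly like this (a
simplified sketch of the logic in kernel/irq/spurious.c; the real code
handles more cases, but the thresholds are real):

	static void note_interrupt_sketch(struct irq_desc *desc,
					  irqreturn_t ret)
	{
		if (ret == IRQ_NONE)
			desc->irqs_unhandled++;

		if (++desc->irq_count < 100000)
			return;

		/* ~99.9% of the last 100k interrupts unhandled?  Give up. */
		if (desc->irqs_unhandled > 99900) {
			pr_emerg("irq %u: nobody cared ...\n",
				 irq_desc_get_irq(desc));
			irq_disable(desc);  /* driver stops working here */
		}
		desc->irq_count = 0;
		desc->irqs_unhandled = 0;
	}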

Auditing the code makes me believe that it's probably not a good idea
to check "cur_mcmd" in the interrupt handler without holding the
spinlock.  Let's move the check under the spinlock.  Looking more
closely at the code, there are some other places that should be under
the spinlock too, so let's add them.  It looks as if the original code
was assuming that by the time the interrupt handler got called the
write to "cur_mcmd" (to make it not CMD_NONE anymore) would have made
it to the processor handling the interrupt.  Perhaps with weakly
ordered memory this sometimes (though very rarely) failed to happen.

Let's also add a warning (with the IRQ status) in the case that we
ever end up getting an interrupt when we think we shouldn't.  This
would give us a better chance to debug if this patch doesn't help the
issue.  We'll also try our best to clear the interrupt to hopefully
get us out of this state.

Patch is marked "speculative" because I have no way to reproduce this
problem, so I'm just hoping this fixes it.  Weakly ordered memory
makes my brain hurt.

Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v2:
- Detect true spurious interrupt.
- Still return IRQ_NONE for state machine mismatch, but print warn.

 drivers/spi/spi-geni-qcom.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

Comments

Stephen Boyd March 17, 2020, 9:36 p.m. UTC | #1
Quoting Douglas Anderson (2020-03-17 13:37:06)
> Every once in a while (like once in a few months on a device) people
> have seen warnings on devices using spi-geni-qcom like:
> 
>   irq ...: nobody cared (try booting with the "irqpoll" option)
> 
> ...where the interrupt number listed matches up with "spi_geni" in
> /proc/interrupts.
> 
> You can get "nobody cared" if the interrupt handler returns IRQ_NONE.
> Once you get into this state the driver will just stop working.
> 
> Auditing the code makes me believe that it's probably not a good idea
> to check "cur_mcmd" in the interrupt handler without holding the
> spinlock.  Let's move the check under the spinlock.  Looking more
> closely at the code, there are some other places that should be under
> the spinlock too, so let's add them.  It looks as if the original code
> was assuming that by the time the interrupt handler got called the
> write to "cur_mcmd" (to make it not CMD_NONE anymore) would have made
> it to the processor handling the interrupt.  Perhaps with weakly
> ordered memory this sometimes (though very rarely) failed to happen.
> 
> Let's also add a warning (with the IRQ status) in the case that we
> ever end up getting an interrupt when we think we shouldn't.  This
> would give us a better chance to debug if this patch doesn't help the
> issue.  We'll also try our best to clear the interrupt to hopefully
> get us out of this state.
> 
> Patch is marked "speculative" because I have no way to reproduce this
> problem, so I'm just hoping this fixes it.  Weakly ordered memory
> makes my brain hurt.

It could be that. It could also be the poor design of geni_se_init() and
how it enables many interrupts that this driver doesn't look to handle?
Why do we allow the wrapper to enable all those bits in
M_COMMON_GENI_M_IRQ_EN and S_COMMON_GENI_S_IRQ_EN if nobody is going to
handle them?

> 
> Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
> Changes in v2:
> - Detect true spurious interrupt.
> - Still return IRQ_NONE for state machine mismatch, but print warn.
> 
>  drivers/spi/spi-geni-qcom.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index c3972424af71..92e51ccb427f 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -151,16 +151,19 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
>         struct geni_se *se = &mas->se;
>         unsigned long time_left;
>  
> -       reinit_completion(&mas->xfer_done);
>         pm_runtime_get_sync(mas->dev);
>         if (!(slv->mode & SPI_CS_HIGH))
>                 set_flag = !set_flag;
>  
> +       spin_lock_irq(&mas->lock);

Why is this spin_lock_irq() vs. spin_lock_irqsave()? This isn't possible
to be called from somewhere that hasn't changed irq flags?

> +       reinit_completion(&mas->xfer_done);
> +
>         mas->cur_mcmd = CMD_CS;
>         if (set_flag)
>                 geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
>         else
>                 geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
> +       spin_unlock_irq(&mas->lock);

This will force on interrupts if they were masked.

>  
>         time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
>         if (!time_left)
> @@ -307,6 +310,8 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>         u32 spi_tx_cfg, len;
>         struct geni_se *se = &mas->se;
>  
> +       spin_lock_irq(&mas->lock);
> +
>         spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
>         if (xfer->bits_per_word != mas->cur_bits_per_word) {
>                 spi_setup_word_len(mas, mode, xfer->bits_per_word);
> @@ -376,6 +381,8 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>          */
>         if (m_cmd & SPI_TX_ONLY)
>                 writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
> +
> +       spin_unlock_irq(&mas->lock);
>  }
>  
>  static int spi_geni_transfer_one(struct spi_master *spi,
> @@ -478,13 +485,29 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
>         struct geni_se *se = &mas->se;
>         u32 m_irq;
>         unsigned long flags;
> -
> -       if (mas->cur_mcmd == CMD_NONE)
> -               return IRQ_NONE;
> +       irqreturn_t ret = IRQ_HANDLED;
>  
>         spin_lock_irqsave(&mas->lock, flags);
>         m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);

Does this read need to be inside the lock?

>  
> +       /* Check for spurious interrupt */
> +       if (!m_irq) {
> +               ret = IRQ_NONE;
> +               goto exit;

I ask because it could be simplified by reading the status and then
immediately returning IRQ_NONE if no bits are set without having to do
the lock/unlock dance. And does writing 0 to the irq clear register do
anything?

> +       }
> +
> +       /*
> +        * If we got a real interrupt but software state machine thinks
> +        * we were idle then give a warning.  We'll do our best to clear
> +        * the interrupt but we'll still return IRQ_NONE.  If this keeps
> +        * happening the system will eventually disable us.
> +        */
> +       if (mas->cur_mcmd == CMD_NONE) {
> +               pr_warn("Unexpected SPI interrupt: %#010x\n", m_irq);
> +               ret = IRQ_NONE;
> +               goto exit;
> +       }
> +
>         if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
>                 geni_spi_handle_rx(mas);
>  
> @@ -523,9 +546,11 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
>                 complete(&mas->xfer_done);
>         }
>  
> +exit:
>         writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
>         spin_unlock_irqrestore(&mas->lock, flags);
> -       return IRQ_HANDLED;
> +
> +       return ret;
>  }
>  
>  static int spi_geni_probe(struct platform_device *pdev)
Doug Anderson March 17, 2020, 10:08 p.m. UTC | #2
Hi,

On Tue, Mar 17, 2020 at 2:36 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> > Patch is marked "speculative" because I have no way to reproduce this
> > problem, so I'm just hoping this fixes it.  Weakly ordered memory
> > makes my brain hurt.
>
> It could be that. It could also be the poor design of geni_se_init() and
> how it enables many interrupts that this driver doesn't look to handle?
> Why do we allow the wrapper to enable all those bits in
> M_COMMON_GENI_M_IRQ_EN and S_COMMON_GENI_S_IRQ_EN if nobody is going to
> handle them?

It is possible it's related to an interrupt that we don't handle.  However:

* IMO having the locking in place is safer anyway.  At some point I
read advice that trying to reason about weakly ordered memory was
simply too hard for the average person (even the average kernel
developer).  In 99% of the cases you could just use a lock so it's
super clear and the performance difference is near zero.

* Most of the cases I saw of the "nobody cared" for geni-spi were on a
mostly idle system (presumably still doing periodic SPI transactions
to the EC, though).  It seems weird that one of these other interrupts
would suddenly fire.  It seems more likely that we just happened to
win a race of some sort.

If nothing else it will suddenly become very obvious after my patch
lands because I'll print out the status.


That all being said if someone wants to submit a patch to clean up
which interrupts are enabled I'd support it.


> > @@ -151,16 +151,19 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
> >         struct geni_se *se = &mas->se;
> >         unsigned long time_left;
> >
> > -       reinit_completion(&mas->xfer_done);
> >         pm_runtime_get_sync(mas->dev);
> >         if (!(slv->mode & SPI_CS_HIGH))
> >                 set_flag = !set_flag;
> >
> > +       spin_lock_irq(&mas->lock);
>
> Why is this spin_lock_irq() vs. spin_lock_irqsave()? This isn't possible
> to be called from somewhere that hasn't changed irq flags?

See below.


> > +       reinit_completion(&mas->xfer_done);
> > +
> >         mas->cur_mcmd = CMD_CS;
> >         if (set_flag)
> >                 geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
> >         else
> >                 geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
> > +       spin_unlock_irq(&mas->lock);
>
> This will force on interrupts if they were masked.

I'll change it if you want, but in this function there is already a
call to "wait_for_completion_timeout".  That's not gonna be too happy
if this function is ever called with interrupts already masked.  Also
in this function is pm_runtime_get_sync() which in many cases will
sleep (I think we'll end up in geni_se_clks_on() which calls
clk_bulk_prepare_enable()).

In cases where you know for sure that interrupts aren't masked,
spin_lock_irq() and spin_unlock_irq() are fine and that's what they're
for, no?
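
For anyone following along, the difference in a nutshell (standard
kernel API, shown only as an illustration, not code from this driver):

	unsigned long flags;

	/*
	 * Safe from any context: remembers whether local interrupts
	 * were already disabled and restores exactly that state.
	 */
	spin_lock_irqsave(&mas->lock, flags);
	/* ... critical section ... */
	spin_unlock_irqrestore(&mas->lock, flags);

	/*
	 * Only correct when the caller is known to run with interrupts
	 * enabled (like any context that may sleep): unconditionally
	 * disables local interrupts, then unconditionally re-enables
	 * them on unlock.
	 */
	spin_lock_irq(&mas->lock);
	/* ... critical section ... */
	spin_unlock_irq(&mas->lock);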


> >         time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
> >         if (!time_left)
> > @@ -307,6 +310,8 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
> >         u32 spi_tx_cfg, len;
> >         struct geni_se *se = &mas->se;
> >
> > +       spin_lock_irq(&mas->lock);

...and just to answer the same question for here: setup_fifo_xfer() is
called from spi_geni_transfer_one() which is our "transfer_one"
function.  We don't happen to block anywhere in these functions, but
I'm nearly certain you are allowed to block in them.  We actually
return a positive number to indicate to the SPI core that we're not
doing the blocking ourselves but since the SPI core can't know we were
going to do that it has to assume we might block.
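
A sketch of that contract (simplified; not our exact code):

	static int spi_geni_transfer_one(struct spi_master *spi,
					 struct spi_device *slv,
					 struct spi_transfer *xfer)
	{
		struct spi_geni_master *mas = spi_master_get_devdata(spi);

		setup_fifo_xfer(xfer, mas, slv->mode, spi);

		/*
		 * A positive return tells the SPI core the transfer is
		 * in flight; the core then waits until someone calls
		 * spi_finalize_current_transfer(), which we do from the
		 * interrupt handler.
		 */
		return 1;
	}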


> > @@ -478,13 +485,29 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
> >         struct geni_se *se = &mas->se;
> >         u32 m_irq;
> >         unsigned long flags;
> > -
> > -       if (mas->cur_mcmd == CMD_NONE)
> > -               return IRQ_NONE;
> > +       irqreturn_t ret = IRQ_HANDLED;
> >
> >         spin_lock_irqsave(&mas->lock, flags);

Ironically the above could probably just be "spin_lock" since this is
our interrupt handler.  ;-)  I'll just leave it alone though since
what's there now doesn't hurt.


> >         m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
>
> Does this read need to be inside the lock?

Probably not.  Discussion below.


> > +       /* Check for spurious interrupt */
> > +       if (!m_irq) {
> > +               ret = IRQ_NONE;
> > +               goto exit;
>
> I ask because it could be simplified by reading the status and then
> immediately returning IRQ_NONE if no bits are set without having to do
> the lock/unlock dance. And does writing 0 to the irq clear register do
> anything?

Sure.  I'll move it if you want.  It felt nicer to just keep the whole
thing under lock so I didn't have to think about whether it mattered.
...and anyone else looking at it didn't need to think about if it
mattered, too.  It is very easy to say that it doesn't _hurt_ to
have it under lock other than having one extra memory read under lock.
...and presumably the case where the optimization matters is
incredibly rare (a spurious interrupt) and we just spent a whole lot
of cycles calling the interrupt handler to begin with for this
spurious interrupt.

I would have a hard time believing that a write of 0 to a "write 1 to
clear" register would have any impact.  It would be a pretty bad
hardware design...
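
(Write-1-to-clear in a nutshell, for reference:)

	/*
	 * W1C: only the bits set in the written value are cleared; zero
	 * bits are ignored.  So writing 0 to SE_GENI_M_IRQ_CLEAR should
	 * be a no-op on any sane design.
	 */
	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);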


So I guess in summary, I'm not planning to spin for any of these
things unless you really insist or you say I'm wrong about something
above or someone else says my opinions are the wrong opinions.

-Doug
Stephen Boyd March 18, 2020, 8:04 a.m. UTC | #3
Quoting Doug Anderson (2020-03-17 15:08:45)
> Hi,
> 
> On Tue, Mar 17, 2020 at 2:36 PM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > > Patch is marked "speculative" because I have no way to reproduce this
> > > problem, so I'm just hoping this fixes it.  Weakly ordered memory
> > > makes my brain hurt.
> >
> > It could be that. It could also be the poor design of geni_se_init() and
> > how it enables many interrupts that this driver doesn't look to handle?
> > Why do we allow the wrapper to enable all those bits in
> > M_COMMON_GENI_M_IRQ_EN and S_COMMON_GENI_S_IRQ_EN if nobody is going to
> > handle them?
> 
> It is possible it's related to an interrupt that we don't handle.  However:
> 
> * IMO having the locking in place is safer anyway.  At some point I
> read advice that trying to reason about weakly ordered memory was
> simply too hard for the average person (even the average kernel
> developer).  In 99% of the cases you could just use a lock so it's
> super clear and the performance difference is near zero.
> 
> * Most of the cases I saw of the "nobody cared" for geni-spi were on a
> mostly idle system (presumably still doing periodic SPI transactions
> to the EC, though).  It seems weird that one of these other interrupts
> would suddenly fire.  It seems more likely that we just happened to
> win a race of some sort.

Sure, I'm mostly interested in understanding what that race is. I think
it is something like cur_mcmd being stale when it's tested in the irq
handler because the IO access triggers the irq before mas->cur_mcmd can
be updated due to out of order execution. Nothing stops
spi_geni_set_cs() from being reordered like this, especially because
there isn't any sort of barrier besides the compiler barrier of asm
volatile inside geni_se_setup_m_cmd():

  CPU0 (device write overtakes CPU)  CPU1
  ----                               ----
  mas->cur_mcmd = CMD_NONE;
  geni_se_setup_m_cmd(...)
                                     geni_spi_isr()
				      <tests cur_mcmd and it's still NONE>
  mas->cur_mcmd = CMD_CS;
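
With the lock taken on both sides, that interleaving is closed off.  A
sketch (the unlock/lock pair acts as a release/acquire, so the
cur_mcmd store is guaranteed visible before the ISR tests it):

	/* CPU0 (spi_geni_set_cs) */
	spin_lock_irq(&mas->lock);
	mas->cur_mcmd = CMD_CS;
	geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);  /* may raise the IRQ */
	spin_unlock_irq(&mas->lock);

	/* CPU1 (geni_spi_isr) -- spins here until CPU0 drops the lock */
	spin_lock_irqsave(&mas->lock, flags);
	/* cur_mcmd is now guaranteed to read as CMD_CS, not CMD_NONE */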

> 
> If nothing else it will suddenly become very obvious after my patch
> lands because I'll print out the status.

Fair enough.

> 
> 
> That all being said if someone wants to submit a patch to clean up
> which interrupts are enabled I'd support it.

Great!

> 
> 
> > > @@ -151,16 +151,19 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
> > >         struct geni_se *se = &mas->se;
> > >         unsigned long time_left;
> > >
> > > -       reinit_completion(&mas->xfer_done);
> > >         pm_runtime_get_sync(mas->dev);
> > >         if (!(slv->mode & SPI_CS_HIGH))
> > >                 set_flag = !set_flag;
> > >
> > > +       spin_lock_irq(&mas->lock);
> >
> > Why is this spin_lock_irq() vs. spin_lock_irqsave()? This isn't possible
> > to be called from somewhere that hasn't changed irq flags?
> 
> See below.
> 
> 
> > > +       reinit_completion(&mas->xfer_done);
> > > +
> > >         mas->cur_mcmd = CMD_CS;
> > >         if (set_flag)
> > >                 geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
> > >         else
> > >                 geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
> > > +       spin_unlock_irq(&mas->lock);
> >
> > This will force on interrupts if they were masked.
> 
> I'll change it if you want, but in this function there is already a
> call to "wait_for_completion_timeout".  That's not gonna be too happy
> if this function is ever called with interrupts already masked.  Also
> in this function is pm_runtime_get_sync() which in many cases will
> sleep (I think we'll end up in geni_se_clks_on() which calls
> clk_bulk_prepare_enable()).
> 
> In cases where you know for sure that interrupts aren't masked,
> spin_lock_irq() and spin_unlock_irq() are fine and that's what they're
> for, no?
> 
> 
> > >         time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
> > >         if (!time_left)
> > > @@ -307,6 +310,8 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
> > >         u32 spi_tx_cfg, len;
> > >         struct geni_se *se = &mas->se;
> > >
> > > +       spin_lock_irq(&mas->lock);
> 
> ...and just to answer the same question for here: setup_fifo_xfer() is
> called from spi_geni_transfer_one() which is our "transfer_one"
> function.  We don't happen to block anywhere in these functions, but
> I'm nearly certain you are allowed to block in them.  We actually
> return a positive number to indicate to the SPI core that we're not
> doing the blocking ourselves but since the SPI core can't know we were
> going to do that it has to assume we might block.

Sounds good to not use irq save/restore here. Thanks for confirming.

> 
> 
> > > @@ -478,13 +485,29 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
> > >         struct geni_se *se = &mas->se;
> > >         u32 m_irq;
> > >         unsigned long flags;
> > > -
> > > -       if (mas->cur_mcmd == CMD_NONE)
> > > -               return IRQ_NONE;
> > > +       irqreturn_t ret = IRQ_HANDLED;
> > >
> > >         spin_lock_irqsave(&mas->lock, flags);
> 
> Ironically the above could probably just be "spin_lock" since this is
> our interrupt handler.  ;-)  I'll just leave it alone though since
> what's there now doesn't hurt.

Yes that's another thing that can be done.

> 
> 
> > >         m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
> >
> > Does this read need to be inside the lock?
> 
> Probably not.  Discussion below.
> 
> 
> > > +       /* Check for spurious interrupt */
> > > +       if (!m_irq) {
> > > +               ret = IRQ_NONE;
> > > +               goto exit;
> >
> > I ask because it could be simplified by reading the status and then
> > immediately returning IRQ_NONE if no bits are set without having to do
> > the lock/unlock dance. And does writing 0 to the irq clear register do
> > anything?
> 
> Sure.  I'll move it if you want.  It felt nicer to just keep the whole
> thing under lock so I didn't have to think about whether it mattered.
> ...and anyone else looking at it didn't need to think about if it
> mattered, too.  It is very easy to say that it doesn't _hurt_ to
> have it under lock other than having one extra memory read under lock.
> ...and presumably the case where the optimization matters is
> incredibly rare (a spurious interrupt) and we just spent a whole lot
> of cycles calling the interrupt handler to begin with for this
> spurious interrupt.
> 
> I would have a hard time believing that a write of 0 to a "write 1 to
> clear" register would have any impact.  It would be a pretty bad
> hardware design...

This is a writel to a device, so with the full memory barrier in there
it may take some time to drain any other outstanding writes.
Not sure if we care about getting super high performance here but that's
a concern.
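
If that cost ever mattered, the usual tool is writel_relaxed()
(standard API; shown only as an illustration, not a request to change
the patch):

	/*
	 * writel() orders all prior normal memory writes before the MMIO
	 * store; writel_relaxed() skips that barrier and is cheaper when
	 * the device doesn't need to observe any earlier writes first.
	 */
	writel_relaxed(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);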

> 
> 
> So I guess in summary, I'm not planning to spin for any of these
> things unless you really insist or you say I'm wrong about something
> above or someone else says my opinions are the wrong opinions.
> 

Why do we need such large locking areas? Why can't we remove the enum
software tracking stuff and look at mas->cur_xfer to figure out if a cs
change is happening or not? I suppose mas->cur_xfer could be stale, so
we need to make sure that is modified and checked under a lock, and then
mas->rx_rem_bytes/tx_rem_bytes also need to be under a lock because
they're modified inside and outside the irq handler, but otherwise I
don't see any need to hold the lock over the register reads/writes. Most
other things are synchronized with the completions or in higher layers
of the SPI core.

I do wonder if we need to hold the lock again when we update the rx/tx
remaining bytes counter but I can't convince myself that the irq can run
on another CPU in parallel with the CPU that ran the irq first. That
seems impossible given that the interrupt would have to come again and
the irq controller would need to send it to a different CPU at the same
time. We should only need to grab the lock to make sure that cur_xfer
and remaining bytes are published by the CPU triggering the rx/tx
transfers and then assume the irq handler code is synchronous with
itself.

------8<-----
diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index f0ca7f5ae714..ac7da801a63d 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -64,13 +64,6 @@
 #define TIMESTAMP_AFTER		BIT(3)
 #define POST_CMD_DELAY		BIT(4)
 
-enum spi_m_cmd_opcode {
-	CMD_NONE,
-	CMD_XFER,
-	CMD_CS,
-	CMD_CANCEL,
-};
-
 struct spi_geni_master {
 	struct geni_se se;
 	struct device *dev;
@@ -84,8 +77,7 @@ struct spi_geni_master {
 	const struct spi_transfer *cur_xfer;
 	struct completion xfer_done;
 	unsigned int oversampling;
-	spinlock_t lock;
-	enum spi_m_cmd_opcode cur_mcmd;
+	spinlock_t lock; /* Protects cur_xfer, {tx,rx}_rem_bytes */
 	int irq;
 };
 
@@ -123,23 +115,18 @@ static void handle_fifo_timeout(struct spi_master *spi,
 				struct spi_message *msg)
 {
 	struct spi_geni_master *mas = spi_master_get_devdata(spi);
-	unsigned long time_left, flags;
+	unsigned long time_left;
 	struct geni_se *se = &mas->se;
 
-	spin_lock_irqsave(&mas->lock, flags);
 	reinit_completion(&mas->xfer_done);
-	mas->cur_mcmd = CMD_CANCEL;
 	geni_se_cancel_m_cmd(se);
 	writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
-	spin_unlock_irqrestore(&mas->lock, flags);
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (time_left)
 		return;
 
-	spin_lock_irqsave(&mas->lock, flags);
 	reinit_completion(&mas->xfer_done);
 	geni_se_abort_m_cmd(se);
-	spin_unlock_irqrestore(&mas->lock, flags);
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (!time_left)
 		dev_err(mas->dev, "Failed to cancel/abort m_cmd\n");
@@ -157,7 +144,6 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
 	if (!(slv->mode & SPI_CS_HIGH))
 		set_flag = !set_flag;
 
-	mas->cur_mcmd = CMD_CS;
 	if (set_flag)
 		geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
 	else
@@ -307,6 +293,7 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	u32 m_cmd = 0;
 	u32 spi_tx_cfg, len;
 	struct geni_se *se = &mas->se;
+	unsigned int rx_len = 0, tx_len = 0;
 
 	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
 	if (xfer->bits_per_word != mas->cur_bits_per_word) {
@@ -339,14 +326,14 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 		writel(m_clk_cfg, se->base + GENI_SER_M_CLK_CFG);
 	}
 
-	mas->tx_rem_bytes = 0;
-	mas->rx_rem_bytes = 0;
-	if (xfer->tx_buf && xfer->rx_buf)
-		m_cmd = SPI_FULL_DUPLEX;
-	else if (xfer->tx_buf)
-		m_cmd = SPI_TX_ONLY;
-	else if (xfer->rx_buf)
-		m_cmd = SPI_RX_ONLY;
+	if (xfer->tx_buf) {
+		m_cmd |= SPI_TX_ONLY;
+		tx_len = xfer->len;
+	}
+	if (xfer->rx_buf) {
+		m_cmd |= SPI_RX_ONLY;
+		rx_len = xfer->len;
+	}
 
 	spi_tx_cfg &= ~CS_TOGGLE;
 
@@ -356,18 +343,17 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 		len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
 	len &= TRANS_LEN_MSK;
 
+	spin_lock_irq(&mas->lock);
 	mas->cur_xfer = xfer;
-	if (m_cmd & SPI_TX_ONLY) {
-		mas->tx_rem_bytes = xfer->len;
-		writel(len, se->base + SE_SPI_TX_TRANS_LEN);
-	}
+	mas->tx_rem_bytes = tx_len;
+	mas->rx_rem_bytes = rx_len;
+	spin_unlock_irq(&mas->lock);
 
-	if (m_cmd & SPI_RX_ONLY) {
+	if (m_cmd & SPI_TX_ONLY)
+		writel(len, se->base + SE_SPI_TX_TRANS_LEN);
+	if (m_cmd & SPI_RX_ONLY)
 		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
-		mas->rx_rem_bytes = xfer->len;
-	}
 	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
-	mas->cur_mcmd = CMD_XFER;
 	geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
 
 	/*
@@ -416,10 +402,12 @@ static void geni_spi_handle_tx(struct spi_geni_master *mas)
 	unsigned int i = 0;
 
 	max_bytes = (mas->tx_fifo_depth - mas->tx_wm) * bytes_per_fifo_word;
+	spin_lock(&mas->lock);
 	if (mas->tx_rem_bytes < max_bytes)
 		max_bytes = mas->tx_rem_bytes;
 
 	tx_buf = mas->cur_xfer->tx_buf + mas->cur_xfer->len - mas->tx_rem_bytes;
+	spin_unlock(&mas->lock);
 	while (i < max_bytes) {
 		unsigned int j;
 		unsigned int bytes_to_write;
@@ -454,10 +442,13 @@ static void geni_spi_handle_rx(struct spi_geni_master *mas)
 		if (rx_last_byte_valid && rx_last_byte_valid < 4)
 			rx_bytes -= bytes_per_fifo_word - rx_last_byte_valid;
 	}
+	spin_lock(&mas->lock);
 	if (mas->rx_rem_bytes < rx_bytes)
 		rx_bytes = mas->rx_rem_bytes;
 
 	rx_buf = mas->cur_xfer->rx_buf + mas->cur_xfer->len - mas->rx_rem_bytes;
+	spin_unlock(&mas->lock);
+
 	while (i < rx_bytes) {
 		u32 fifo_word = 0;
 		u8 *fifo_byte = (u8 *)&fifo_word;
@@ -478,13 +469,11 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 	struct spi_geni_master *mas = spi_master_get_devdata(spi);
 	struct geni_se *se = &mas->se;
 	u32 m_irq;
-	unsigned long flags;
+	bool cs_change = false;
 
-	if (mas->cur_mcmd == CMD_NONE)
-		return IRQ_NONE;
-
-	spin_lock_irqsave(&mas->lock, flags);
 	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
+	if (!m_irq)
+		return IRQ_NONE;
 
 	if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
 		geni_spi_handle_rx(mas);
@@ -493,39 +482,48 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 		geni_spi_handle_tx(mas);
 
 	if (m_irq & M_CMD_DONE_EN) {
-		if (mas->cur_mcmd == CMD_XFER)
+		spin_lock(&mas->lock);
+		if (mas->cur_xfer) {
 			spi_finalize_current_transfer(spi);
-		else if (mas->cur_mcmd == CMD_CS)
-			complete(&mas->xfer_done);
-		mas->cur_mcmd = CMD_NONE;
-		/*
-		 * If this happens, then a CMD_DONE came before all the Tx
-		 * buffer bytes were sent out. This is unusual, log this
-		 * condition and disable the WM interrupt to prevent the
-		 * system from stalling due an interrupt storm.
-		 * If this happens when all Rx bytes haven't been received, log
-		 * the condition.
-		 * The only known time this can happen is if bits_per_word != 8
-		 * and some registers that expect xfer lengths in num spi_words
-		 * weren't written correctly.
-		 */
-		if (mas->tx_rem_bytes) {
-			writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
-			dev_err(mas->dev, "Premature done. tx_rem = %d bpw%d\n",
-				mas->tx_rem_bytes, mas->cur_bits_per_word);
+			mas->cur_xfer = NULL;
+			/*
+			 * If this happens, then a CMD_DONE came before all the
+			 * Tx buffer bytes were sent out. This is unusual, log
+			 * this condition and disable the WM interrupt to
+			 * prevent the system from stalling due an interrupt
+			 * storm.
+			 *
+			 * If this happens when all Rx bytes haven't been
+			 * received, log the condition.
+			 *
+			 * The only known time this can happen is if
+			 * bits_per_word != 8 and some registers that expect
+			 * xfer lengths in num spi_words weren't written
+			 * correctly.
+			 */
+			if (mas->tx_rem_bytes) {
+				writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
+				dev_err(mas->dev, "Premature done. tx_rem = %d bpw%d\n",
+					mas->tx_rem_bytes,
+					mas->cur_bits_per_word);
+			}
+			if (mas->rx_rem_bytes) {
+				dev_err(mas->dev, "Premature done. rx_rem = %d bpw%d\n",
+					mas->rx_rem_bytes,
+					mas->cur_bits_per_word);
+			}
+		} else {
+			cs_change = true;
 		}
-		if (mas->rx_rem_bytes)
-			dev_err(mas->dev, "Premature done. rx_rem = %d bpw%d\n",
-				mas->rx_rem_bytes, mas->cur_bits_per_word);
+		spin_unlock(&mas->lock);
 	}
 
-	if ((m_irq & M_CMD_CANCEL_EN) || (m_irq & M_CMD_ABORT_EN)) {
-		mas->cur_mcmd = CMD_NONE;
+	/* Wake up spi_geni_set_cs() or handle_fifo_timeout() */
+	if (cs_change || (m_irq & M_CMD_CANCEL_EN) || (m_irq & M_CMD_ABORT_EN))
 		complete(&mas->xfer_done);
-	}
 
 	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
-	spin_unlock_irqrestore(&mas->lock, flags);
+
 	return IRQ_HANDLED;
 }
Doug Anderson March 18, 2020, 8:10 p.m. UTC | #4
Hi,

On Wed, Mar 18, 2020 at 1:04 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> > * Most of the cases I saw of the "nobody cared" for geni-spi were on a
> > mostly idle system (presumably still doing periodic SPI transactions
> > to the EC, though).  It seems weird that one of these other interrupts
> > would suddenly fire.  It seems more likely that we just happened to
> > win a race of some sort.
>
> Sure, I'm mostly interested in understanding what that race is. I think
> it is something like cur_mcmd being stale when it's tested in the irq
> handler because the IO access triggers the irq before mas->cur_mcmd can
> be updated due to out of order execution. Nothing stops
> spi_geni_set_cs() from being reordered like this, especially because
> there isn't any sort of barrier besides the compiler barrier of asm
> volatile inside geni_se_setup_m_cmd():
>
>   CPU0 (device write overtakes CPU)  CPU1
>   ----                               ----
>   mas->cur_mcmd = CMD_NONE;
>   geni_se_setup_m_cmd(...)
>                                      geni_spi_isr()
>                                       <tests cur_mcmd and it's still NONE>
>   mas->cur_mcmd = CMD_CS;

This is my assumption about what must have happened, yes.  I have no
proof, though.


> > > > +       /* Check for spurious interrupt */
> > > > +       if (!m_irq) {
> > > > +               ret = IRQ_NONE;
> > > > +               goto exit;
> > >
> > > I ask because it could be simplified by reading the status and then
> > > immediately returning IRQ_NONE if no bits are set without having to do
> > > the lock/unlock dance. And does writing 0 to the irq clear register do
> > > anything?
> >
> > Sure.  I'll move it if you want.  It felt nicer to just keep the whole
> > thing under lock so I didn't have to think about whether it mattered.
> > ...and anyone else looking at it didn't need to think about if it
> > mattered, too.  It is very easy to say that it doesn't _hurt_ to
> > have it under lock other than having one extra memory read under lock.
> > ...and presumably the case where the optimization matters is
> > incredibly rare (a spurious interrupt) and we just spent a whole lot
> > of cycles calling the interrupt handler to begin with for this
> > spurious interrupt.
> >
> > I would have a hard time believing that a write of 0 to a "write 1 to
> > clear" register would have any impact.  It would be a pretty bad
> > hardware design...
>
> This is a writel to a device, so with the full memory barrier in there
> it may take some time to drain any other outstanding writes.
> Not sure if we care about getting super high performance here but that's
> a concern.

Right, but I guess the spurious interrupt case wasn't one I'm
optimizing for.  ...and if it's a real problem we should figure out
how to avoid so many spurious interrupts.  Presumably all the overhead
that the Linux kernel did to call our interrupt handler in the first
place is much more serious.


> > So I guess in summary, I'm not planning to spin for any of these
> > things unless you really insist or you say I'm wrong about something
> > above or someone else says my opinions are the wrong opinions.
> >
>
> Why do we need such large locking areas?

We probably don't, but it's unlikely to matter at all and the large
locking areas mean we don't have to spend time thinking about it.
It's super clear that the locking is safe.


In general there shouldn't be tons of downside to have large locking
areas because there shouldn't be contention for this lock.  As you
said, much of this is synchronized at higher levels, so mostly we're
just worried about synchronizing with our interrupt handler and one
would hope the interrupt handler wasn't firing at times we don't
expect.

...but I guess there is the downside that local interrupts are fully
disabled for this whole large locking area and that does feel like
something that might be worth optimizing for...


In theory we _could_ just have a lock around updating 'cur_mcmd' and
kicking off the transfer.  That would make sure that 'cur_mcmd' was
definitely updated by the time the interrupt fired.  It feels at least
a little awkward to me, though, because there is more shared state
between the interrupt handler and the functions kicking off the
transfers and we're really just using the lock as a poor man's memory
barrier, but maybe it's OK.  I can give a shot at this approach if it
feels viable to you.
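
Roughly this, hypothetically (untested):

	/*
	 * Publish the state machine update together with the kickoff so
	 * the ISR can never observe one without the other.
	 */
	spin_lock_irq(&mas->lock);
	mas->cur_mcmd = CMD_XFER;
	geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
	spin_unlock_irq(&mas->lock);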


> Why can't we remove the enum
> software tracking stuff and look at mas->cur_xfer to figure out if a cs
> change is happening or not? I suppose mas->cur_xfer could be stale, so
> we need to make sure that is modified and checked under a lock, and then
> mas->rx_rem_bytes/tx_rem_bytes also need to be under a lock because
> they're modified inside and outside the irq handler, but otherwise I
> don't see any need to hold the lock over the register reads/writes. Most
> other things are synchronized with the completions or in higher layers
> of the SPI core.

In general it's a big-ish change to code that's largely already tested
and one that (IMO) makes readability slightly worse.  It can be worth
it to make big changes for big gain, but I don't see it here.  It
might have a slight performance boost, but seems like premature
optimization since there's no evidence that the existing performance
(even after my big locking patch) is a problem.

If we want to go forward with your suggestion, I'd have to spend more
time thinking about the cancel/abort cases, especially, which I'm not
sure how to test super well.  Specifically, does cancel/abort cause the
"done" bit to be set?  If so your new code will finalize the current
transfer, but the old code didn't.  I have no idea if this matters.
You also removed the spinlock from handle_fifo_timeout() and I don't
know if (for instance) it was important to clear the TX watermark
register after firing off the abort command but before the interrupt
handler could run.  It also doesn't necessarily appear that
reinit_completion() has any safety, so is there some chance that the
interrupt could fire and we'd still have the re-init pending on our
side, then clear "done" back to zero after the interrupt handler
completed us?

...one extra thing to think about with all this is that (I think)
handle_fifo_timeout() will be called if someone kicks off a SPI
transfer and then hits "Ctrl-C" partway.  In this case
spi_transfer_wait() will return -EINTR and I think that will change
the chip select and then call "handle_err".  I have no idea if that
works well today, but I have a feeling it will be harder to get right
without an extra state variable.


> I do wonder if we need to hold the lock again when we update the rx/tx
> remaining bytes counter but I can't convince myself that the irq can run
> on another CPU in parallel with the CPU that ran the irq first. That
> seems impossible given that the interrupt would have to come again and
> the irq controller would need to send it to a different CPU at the same
> time. We should only need to grab the lock to make sure that cur_xfer
> and remaining bytes are published by the CPU triggering the rx/tx
> transfers and then assume the irq handler code is synchronous with
> itself.

At least needs a comment?

...but even with a comment, what exactly are you gaining here by
making me and all future readers of the code think about this?  One of
your main goals here is to eliminate the need to have the spin lock
locked for the whole of the IRQ handler, right?  If we're in the
interrupt handler, local interrupts are already off, right?  That
means that the only thing you can gain (in terms of speed) by
decreasing the number of lines "under lock" is that you could allow
someone else trying to get the same lock to run sooner.  As you've
pointed out, there's already a lot of synchronization at higher
layers, and many of the arguments you make are assuming that the other
code isn't running anyway.

So this overall seems like trying to shorten the amount of time that
the lock is held just for the sheer principle of having spin locks
held for the fewest lines of code, but not for any real gain?

To get into micro-optimizations, too, I will also say that having fine
grained locking in the interrupt handler also will lead to more than
one call to spin_lock() per IRQ.  I put traces in and I often saw "rx"
and "cmd_done" fall together and both separately lock in your code.
In general isn't reducing the number of calls to spin_lock() (which I
think has some overhead) more important?


> -       if ((m_irq & M_CMD_CANCEL_EN) || (m_irq & M_CMD_ABORT_EN)) {
> -               mas->cur_mcmd = CMD_NONE;
> +       /* Wake up spi_geni_set_cs() or handle_fifo_timeout() */
> +       if (cs_change || (m_irq & M_CMD_CANCEL_EN) || (m_irq & M_CMD_ABORT_EN))
>                 complete(&mas->xfer_done);
> -       }
>
>         writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
> -       spin_unlock_irqrestore(&mas->lock, flags);
> +

It doesn't feel legit to release the spinlock before this writel().
Once you complete(), can't (in theory) another transfer kick off on
another CPU, and then cause an interrupt to fire?  You'd be clearing
someone else's interrupt, perhaps?  Maybe this actually happens in
handle_fifo_timeout() which (after your patch) waits for the
completion and then immediately (with no locking) fires off an abort.

As an aside and unrelated to your change, I never like it when
interrupt handlers clear their IRQ bits at the end anyway because it
always feels like a race.  The only time you clear at the end is if
you have "latched level" interrupts (interrupts that are latched but
will continually re-assert unless the thing that caused them goes
away).  I have no idea if these bits are "latched level", but it feels
unlikely.
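
i.e. the pattern I'd normally expect is ack-first (a sketch, not a
claim about what this hardware requires):

	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);  /* ack first */
	/*
	 * ...then handle the bits captured in m_irq.  An event arriving
	 * after the read re-asserts the status and re-enters the handler
	 * instead of being wiped out by a late clear.
	 */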


NOTE: I didn't do a complete review of your patch because (I think) I
already found a few issues and am still not convinced it's a good use
of time to fully re-architect the control flow of this driver.  If you
still think this is something we need to do, I will try to make time
to work through all the corner cases in my head and try my best to
make sure there are no weird memory-ordering issues, but I can't
promise I won't miss anything.


-Doug
Dan Carpenter March 23, 2020, 11:08 a.m. UTC | #5
Hi Douglas,

url:    https://github.com/0day-ci/linux/commits/Douglas-Anderson/spi-spi-geni-qcom-Speculative-fix-of-nobody-cared-about-interrupt/20200318-043933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/spi/spi-geni-qcom.c:385 setup_fifo_xfer() warn: inconsistent returns 'irq'.
drivers/spi/spi-geni-qcom.c:385 setup_fifo_xfer() warn: inconsistent returns 'mas->lock'.

# https://github.com/0day-ci/linux/commit/365ef891fdac5e58b1f621b0b0d57608ffafeb2b
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 365ef891fdac5e58b1f621b0b0d57608ffafeb2b
vim +/irq +385 drivers/spi/spi-geni-qcom.c

561de45f72bd5f Girish Mahadevan 2018-10-03  305  static void setup_fifo_xfer(struct spi_transfer *xfer,
561de45f72bd5f Girish Mahadevan 2018-10-03  306  				struct spi_geni_master *mas,
561de45f72bd5f Girish Mahadevan 2018-10-03  307  				u16 mode, struct spi_master *spi)
561de45f72bd5f Girish Mahadevan 2018-10-03  308  {
561de45f72bd5f Girish Mahadevan 2018-10-03  309  	u32 m_cmd = 0;
561de45f72bd5f Girish Mahadevan 2018-10-03  310  	u32 spi_tx_cfg, len;
561de45f72bd5f Girish Mahadevan 2018-10-03  311  	struct geni_se *se = &mas->se;
561de45f72bd5f Girish Mahadevan 2018-10-03  312  
365ef891fdac5e Douglas Anderson 2020-03-17  313  	spin_lock_irq(&mas->lock);
365ef891fdac5e Douglas Anderson 2020-03-17  314  
561de45f72bd5f Girish Mahadevan 2018-10-03  315  	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
561de45f72bd5f Girish Mahadevan 2018-10-03  316  	if (xfer->bits_per_word != mas->cur_bits_per_word) {
561de45f72bd5f Girish Mahadevan 2018-10-03  317  		spi_setup_word_len(mas, mode, xfer->bits_per_word);
561de45f72bd5f Girish Mahadevan 2018-10-03  318  		mas->cur_bits_per_word = xfer->bits_per_word;
561de45f72bd5f Girish Mahadevan 2018-10-03  319  	}
561de45f72bd5f Girish Mahadevan 2018-10-03  320  
561de45f72bd5f Girish Mahadevan 2018-10-03  321  	/* Speed and bits per word can be overridden per transfer */
561de45f72bd5f Girish Mahadevan 2018-10-03  322  	if (xfer->speed_hz != mas->cur_speed_hz) {
561de45f72bd5f Girish Mahadevan 2018-10-03  323  		int ret;
561de45f72bd5f Girish Mahadevan 2018-10-03  324  		u32 clk_sel, m_clk_cfg;
561de45f72bd5f Girish Mahadevan 2018-10-03  325  		unsigned int idx, div;
561de45f72bd5f Girish Mahadevan 2018-10-03  326  
561de45f72bd5f Girish Mahadevan 2018-10-03  327  		ret = get_spi_clk_cfg(xfer->speed_hz, mas, &idx, &div);
561de45f72bd5f Girish Mahadevan 2018-10-03  328  		if (ret) {
561de45f72bd5f Girish Mahadevan 2018-10-03  329  			dev_err(mas->dev, "Err setting clks:%d\n", ret);
561de45f72bd5f Girish Mahadevan 2018-10-03  330  			return;

Needs to drop the lock before returning.

561de45f72bd5f Girish Mahadevan 2018-10-03  331  		}
561de45f72bd5f Girish Mahadevan 2018-10-03  332  		/*
561de45f72bd5f Girish Mahadevan 2018-10-03  333  		 * SPI core clock gets configured with the requested frequency
561de45f72bd5f Girish Mahadevan 2018-10-03  334  		 * or the frequency closer to the requested frequency.
561de45f72bd5f Girish Mahadevan 2018-10-03  335  		 * For that reason requested frequency is stored in the
561de45f72bd5f Girish Mahadevan 2018-10-03  336  		 * cur_speed_hz and referred in the consecutive transfer instead
561de45f72bd5f Girish Mahadevan 2018-10-03  337  		 * of calling clk_get_rate() API.
561de45f72bd5f Girish Mahadevan 2018-10-03  338  		 */
561de45f72bd5f Girish Mahadevan 2018-10-03  339  		mas->cur_speed_hz = xfer->speed_hz;
561de45f72bd5f Girish Mahadevan 2018-10-03  340  		clk_sel = idx & CLK_SEL_MSK;
561de45f72bd5f Girish Mahadevan 2018-10-03  341  		m_clk_cfg = (div << CLK_DIV_SHFT) | SER_CLK_EN;
561de45f72bd5f Girish Mahadevan 2018-10-03  342  		writel(clk_sel, se->base + SE_GENI_CLK_SEL);
561de45f72bd5f Girish Mahadevan 2018-10-03  343  		writel(m_clk_cfg, se->base + GENI_SER_M_CLK_CFG);
561de45f72bd5f Girish Mahadevan 2018-10-03  344  	}
561de45f72bd5f Girish Mahadevan 2018-10-03  345  
561de45f72bd5f Girish Mahadevan 2018-10-03  346  	mas->tx_rem_bytes = 0;
561de45f72bd5f Girish Mahadevan 2018-10-03  347  	mas->rx_rem_bytes = 0;
561de45f72bd5f Girish Mahadevan 2018-10-03  348  	if (xfer->tx_buf && xfer->rx_buf)
561de45f72bd5f Girish Mahadevan 2018-10-03  349  		m_cmd = SPI_FULL_DUPLEX;
561de45f72bd5f Girish Mahadevan 2018-10-03  350  	else if (xfer->tx_buf)
561de45f72bd5f Girish Mahadevan 2018-10-03  351  		m_cmd = SPI_TX_ONLY;
561de45f72bd5f Girish Mahadevan 2018-10-03  352  	else if (xfer->rx_buf)
561de45f72bd5f Girish Mahadevan 2018-10-03  353  		m_cmd = SPI_RX_ONLY;
561de45f72bd5f Girish Mahadevan 2018-10-03  354  
561de45f72bd5f Girish Mahadevan 2018-10-03  355  	spi_tx_cfg &= ~CS_TOGGLE;
561de45f72bd5f Girish Mahadevan 2018-10-03  356  
561de45f72bd5f Girish Mahadevan 2018-10-03  357  	if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
561de45f72bd5f Girish Mahadevan 2018-10-03  358  		len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
561de45f72bd5f Girish Mahadevan 2018-10-03  359  	else
561de45f72bd5f Girish Mahadevan 2018-10-03  360  		len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
561de45f72bd5f Girish Mahadevan 2018-10-03  361  	len &= TRANS_LEN_MSK;
561de45f72bd5f Girish Mahadevan 2018-10-03  362  
561de45f72bd5f Girish Mahadevan 2018-10-03  363  	mas->cur_xfer = xfer;
561de45f72bd5f Girish Mahadevan 2018-10-03  364  	if (m_cmd & SPI_TX_ONLY) {
561de45f72bd5f Girish Mahadevan 2018-10-03  365  		mas->tx_rem_bytes = xfer->len;
561de45f72bd5f Girish Mahadevan 2018-10-03  366  		writel(len, se->base + SE_SPI_TX_TRANS_LEN);
561de45f72bd5f Girish Mahadevan 2018-10-03  367  	}
561de45f72bd5f Girish Mahadevan 2018-10-03  368  
561de45f72bd5f Girish Mahadevan 2018-10-03  369  	if (m_cmd & SPI_RX_ONLY) {
561de45f72bd5f Girish Mahadevan 2018-10-03  370  		writel(len, se->base + SE_SPI_RX_TRANS_LEN);
561de45f72bd5f Girish Mahadevan 2018-10-03  371  		mas->rx_rem_bytes = xfer->len;
561de45f72bd5f Girish Mahadevan 2018-10-03  372  	}
561de45f72bd5f Girish Mahadevan 2018-10-03  373  	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
561de45f72bd5f Girish Mahadevan 2018-10-03  374  	mas->cur_mcmd = CMD_XFER;
561de45f72bd5f Girish Mahadevan 2018-10-03  375  	geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
561de45f72bd5f Girish Mahadevan 2018-10-03  376  
561de45f72bd5f Girish Mahadevan 2018-10-03  377  	/*
561de45f72bd5f Girish Mahadevan 2018-10-03  378  	 * TX_WATERMARK_REG should be set after SPI configuration and
561de45f72bd5f Girish Mahadevan 2018-10-03  379  	 * setting up GENI SE engine, as driver starts data transfer
561de45f72bd5f Girish Mahadevan 2018-10-03  380  	 * for the watermark interrupt.
561de45f72bd5f Girish Mahadevan 2018-10-03  381  	 */
561de45f72bd5f Girish Mahadevan 2018-10-03  382  	if (m_cmd & SPI_TX_ONLY)
561de45f72bd5f Girish Mahadevan 2018-10-03  383  		writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
365ef891fdac5e Douglas Anderson 2020-03-17  384  
365ef891fdac5e Douglas Anderson 2020-03-17 @385  	spin_unlock_irq(&mas->lock);
561de45f72bd5f Girish Mahadevan 2018-10-03  386  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Doug Anderson March 23, 2020, 4:32 p.m. UTC | #6
Hi,

On Mon, Mar 23, 2020 at 4:08 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> 561de45f72bd5f Girish Mahadevan 2018-10-03  327                 ret = get_spi_clk_cfg(xfer->speed_hz, mas, &idx, &div);
> 561de45f72bd5f Girish Mahadevan 2018-10-03  328                 if (ret) {
> 561de45f72bd5f Girish Mahadevan 2018-10-03  329                         dev_err(mas->dev, "Err setting clks:%d\n", ret);
> 561de45f72bd5f Girish Mahadevan 2018-10-03  330                         return;
>
> Needs to drop the lock before returning.

Oops, thanks for catching.  I will wait before spinning a v3 until
there is some clarity about whether folks want to do something more
like Stephen suggested or whether I should continue with my strategy.
At the moment I'm still in favor of keeping w/ my strategy and seeing
if I can reduce the amount of time with interrupts disabled in
setup_fifo_xfer(), maybe just grabbing the lock around the start of
the transfer to keep the state machine in sync with the kickoff...

-Doug

Patch

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index c3972424af71..92e51ccb427f 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -151,16 +151,19 @@  static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
 	struct geni_se *se = &mas->se;
 	unsigned long time_left;
 
-	reinit_completion(&mas->xfer_done);
 	pm_runtime_get_sync(mas->dev);
 	if (!(slv->mode & SPI_CS_HIGH))
 		set_flag = !set_flag;
 
+	spin_lock_irq(&mas->lock);
+	reinit_completion(&mas->xfer_done);
+
 	mas->cur_mcmd = CMD_CS;
 	if (set_flag)
 		geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
 	else
 		geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
+	spin_unlock_irq(&mas->lock);
 
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (!time_left)
@@ -307,6 +310,8 @@  static void setup_fifo_xfer(struct spi_transfer *xfer,
 	u32 spi_tx_cfg, len;
 	struct geni_se *se = &mas->se;
 
+	spin_lock_irq(&mas->lock);
+
 	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
 	if (xfer->bits_per_word != mas->cur_bits_per_word) {
 		spi_setup_word_len(mas, mode, xfer->bits_per_word);
@@ -376,6 +381,8 @@  static void setup_fifo_xfer(struct spi_transfer *xfer,
 	 */
 	if (m_cmd & SPI_TX_ONLY)
 		writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
+
+	spin_unlock_irq(&mas->lock);
 }
 
 static int spi_geni_transfer_one(struct spi_master *spi,
@@ -478,13 +485,29 @@  static irqreturn_t geni_spi_isr(int irq, void *data)
 	struct geni_se *se = &mas->se;
 	u32 m_irq;
 	unsigned long flags;
-
-	if (mas->cur_mcmd == CMD_NONE)
-		return IRQ_NONE;
+	irqreturn_t ret = IRQ_HANDLED;
 
 	spin_lock_irqsave(&mas->lock, flags);
 	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
 
+	/* Check for spurious interrupt */
+	if (!m_irq) {
+		ret = IRQ_NONE;
+		goto exit;
+	}
+
+	/*
+	 * If we got a real interrupt but software state machine thinks
+	 * we were idle then give a warning.  We'll do our best to clear
+	 * the interrupt but we'll still return IRQ_NONE.  If this keeps
+	 * happening the system will eventually disable us.
+	 */
+	if (mas->cur_mcmd == CMD_NONE) {
+		pr_warn("Unexpected SPI interrupt: %#010x\n", m_irq);
+		ret = IRQ_NONE;
+		goto exit;
+	}
+
 	if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
 		geni_spi_handle_rx(mas);
 
@@ -523,9 +546,11 @@  static irqreturn_t geni_spi_isr(int irq, void *data)
 		complete(&mas->xfer_done);
 	}
 
+exit:
 	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
 	spin_unlock_irqrestore(&mas->lock, flags);
-	return IRQ_HANDLED;
+
+	return ret;
 }
 
 static int spi_geni_probe(struct platform_device *pdev)