diff mbox series

i2c: i2c-qcom-geni: Fix DMA transfer race

Message ID 20200720172448.1.I7efdf6efaa6edadbb690196cd4fbe3392a582c89@changeid (mailing list archive)
State Superseded
Headers show
Series i2c: i2c-qcom-geni: Fix DMA transfer race | expand

Commit Message

Doug Anderson July 21, 2020, 12:24 a.m. UTC
When I have KASAN enabled on my kernel and I start stressing the
touchscreen my system tends to hang.  The touchscreen is one of the
only things that does a lot of big i2c transfers and ends up hitting
the DMA paths in the geni i2c driver.  It appears that KASAN adds
enough delay in my system to tickle a race condition in the DMA setup
code.

When the system hangs, I found that it was running the geni_i2c_irq()
over and over again.  It had these:

m_stat   = 0x04000080
rx_st    = 0x30000011
dm_tx_st = 0x00000000
dm_rx_st = 0x00000000
dma      = 0x00000001

Notably we're in DMA mode but are getting M_RX_IRQ_EN and
M_RX_FIFO_WATERMARK_EN over and over again.

Putting some traces in geni_i2c_rx_one_msg() showed that when we
failed we were getting to the start of geni_i2c_rx_one_msg() but were
never executing geni_se_rx_dma_prep().

I believe that the problem here is that we are writing the transfer
length and setting up the geni command before we run
geni_se_rx_dma_prep().  If a transfer makes it far enough before we do
that then we get into the state I have observed.  Let's change the
order, which seems to work fine.

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/i2c/busses/i2c-qcom-geni.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Sai Prakash Ranjan July 21, 2020, 5:37 a.m. UTC | #1
On 2020-07-21 05:54, Douglas Anderson wrote:
> When I have KASAN enabled on my kernel and I start stressing the
> touchscreen my system tends to hang.  The touchscreen is one of the
> only things that does a lot of big i2c transfers and ends up hitting
> the DMA paths in the geni i2c driver.  It appears that KASAN adds
> enough delay in my system to tickle a race condition in the DMA setup
> code.
> 
> When the system hangs, I found that it was running the geni_i2c_irq()
> over and over again.  It had these:
> 
> m_stat   = 0x04000080
> rx_st    = 0x30000011
> dm_tx_st = 0x00000000
> dm_rx_st = 0x00000000
> dma      = 0x00000001
> 
> Notably we're in DMA mode but are getting M_RX_IRQ_EN and
> M_RX_FIFO_WATERMARK_EN over and over again.
> 
> Putting some traces in geni_i2c_rx_one_msg() showed that when we
> failed we were getting to the start of geni_i2c_rx_one_msg() but were
> never executing geni_se_rx_dma_prep().
> 
> I believe that the problem here is that we are writing the transfer
> length and setting up the geni command before we run
> geni_se_rx_dma_prep().  If a transfer makes it far enough before we do
> that then we get into the state I have observed.  Let's change the
> order, which seems to work fine.
> 
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the
> Qualcomm GENI I2C controller")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
>  drivers/i2c/busses/i2c-qcom-geni.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c
> b/drivers/i2c/busses/i2c-qcom-geni.c
> index 18d1e4fd4cf3..21e27f10510a 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -366,15 +366,15 @@ static int geni_i2c_rx_one_msg(struct
> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  	else
>  		geni_se_select_mode(se, GENI_SE_FIFO);
> 
> -	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> -	geni_se_setup_m_cmd(se, I2C_READ, m_param);
> -
>  	if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
>  		geni_se_select_mode(se, GENI_SE_FIFO);
>  		i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>  		dma_buf = NULL;
>  	}
> 
> +	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> +	geni_se_setup_m_cmd(se, I2C_READ, m_param);
> +
>  	time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>  	if (!time_left)
>  		geni_i2c_abort_xfer(gi2c);

Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Stephen Boyd July 21, 2020, 5:59 a.m. UTC | #2
Quoting Douglas Anderson (2020-07-20 17:24:53)
> When I have KASAN enabled on my kernel and I start stressing the
> touchscreen my system tends to hang.  The touchscreen is one of the
> only things that does a lot of big i2c transfers and ends up hitting
> the DMA paths in the geni i2c driver.  It appears that KASAN adds
> enough delay in my system to tickle a race condition in the DMA setup
> code.
> 
> When the system hangs, I found that it was running the geni_i2c_irq()
> over and over again.  It had these:
> 
> m_stat   = 0x04000080
> rx_st    = 0x30000011
> dm_tx_st = 0x00000000
> dm_rx_st = 0x00000000
> dma      = 0x00000001
> 
> Notably we're in DMA mode but are getting M_RX_IRQ_EN and
> M_RX_FIFO_WATERMARK_EN over and over again.
> 
> Putting some traces in geni_i2c_rx_one_msg() showed that when we
> failed we were getting to the start of geni_i2c_rx_one_msg() but were
> never executing geni_se_rx_dma_prep().
> 
> I believe that the problem here is that we are writing the transfer
> length and setting up the geni command before we run
> geni_se_rx_dma_prep().  If a transfer makes it far enough before we do
> that then we get into the state I have observed.  Let's change the
> order, which seems to work fine.

Does the length matter or the I2C_READ m_cmd matter? Or somehow both?
Otherwise it sounds correct to me that we're configuring it to start the
read before mapping the buffer.

> 
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
>  drivers/i2c/busses/i2c-qcom-geni.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> index 18d1e4fd4cf3..21e27f10510a 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -366,15 +366,15 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>         else
>                 geni_se_select_mode(se, GENI_SE_FIFO);
>  
> -       writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> -       geni_se_setup_m_cmd(se, I2C_READ, m_param);
> -
>         if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
>                 geni_se_select_mode(se, GENI_SE_FIFO);
>                 i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>                 dma_buf = NULL;
>         }
>  

I worry that we also need a dmb() here to make sure the dma buffer is
properly mapped before this write to the device is attempted. But it may
only matter to be before the I2C_READ.

> +       writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> +       geni_se_setup_m_cmd(se, I2C_READ, m_param);
> +
>         time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>         if (!time_left)
>                 geni_i2c_abort_xfer(gi2c);
Stephen Boyd July 21, 2020, 7:07 a.m. UTC | #3
Quoting Stephen Boyd (2020-07-20 22:59:14)
> 
> I worry that we also need a dmb() here to make sure the dma buffer is
> properly mapped before this write to the device is attempted. But it may
> only matter to be before the I2C_READ.
> 

I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
use a writel() so that it has the proper barrier semantics to wait for
the other memory writes that happened in program order before this point
to complete before the device is kicked to do a read or a write.

----8<----
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 18d1e4fd4cf3..7f130829bf01 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 		geni_se_select_mode(se, GENI_SE_FIFO);
 
 	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
-	geni_se_setup_m_cmd(se, I2C_READ, m_param);
 
 	if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
 		geni_se_select_mode(se, GENI_SE_FIFO);
@@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 		dma_buf = NULL;
 	}
 
+	geni_se_setup_m_cmd(se, I2C_READ, m_param);
+
 	time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
 	if (!time_left)
 		geni_i2c_abort_xfer(gi2c);
@@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 		geni_se_select_mode(se, GENI_SE_FIFO);
 
 	writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
-	geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
 
 	if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
 		geni_se_select_mode(se, GENI_SE_FIFO);
@@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 		dma_buf = NULL;
 	}
 
+	geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
+
 	if (!dma_buf) /* Get FIFO IRQ */
 		writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);
 
diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
index dd464943f717..1dc134e9eb36 100644
--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -262,7 +262,7 @@ static inline void geni_se_setup_m_cmd(struct geni_se *se, u32 cmd, u32 params)
 	u32 m_cmd;
 
 	m_cmd = (cmd << M_OPCODE_SHFT) | (params & M_PARAMS_MSK);
-	writel_relaxed(m_cmd, se->base + SE_GENI_M_CMD0);
+	writel(m_cmd, se->base + SE_GENI_M_CMD0);
 }
 
 /**
Akash Asthana July 21, 2020, 10:48 a.m. UTC | #4
On 7/21/2020 5:54 AM, Douglas Anderson wrote:
> When I have KASAN enabled on my kernel and I start stressing the
> touchscreen my system tends to hang.  The touchscreen is one of the
> only things that does a lot of big i2c transfers and ends up hitting
> the DMA paths in the geni i2c driver.  It appears that KASAN adds
> enough delay in my system to tickle a race condition in the DMA setup
> code.
>
> When the system hangs, I found that it was running the geni_i2c_irq()
> over and over again.  It had these:
>
> m_stat   = 0x04000080
> rx_st    = 0x30000011
> dm_tx_st = 0x00000000
> dm_rx_st = 0x00000000
> dma      = 0x00000001
>
> Notably we're in DMA mode but are getting M_RX_IRQ_EN and
> M_RX_FIFO_WATERMARK_EN over and over again.
>
> Putting some traces in geni_i2c_rx_one_msg() showed that when we
> failed we were getting to the start of geni_i2c_rx_one_msg() but were
> never executing geni_se_rx_dma_prep().
>
> I believe that the problem here is that we are writing the transfer
> length and setting up the geni command before we run
> geni_se_rx_dma_prep().  If a transfer makes it far enough before we do
> that then we get into the state I have observed.  Let's change the
> order, which seems to work fine.
>
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Mukesh, Savaliya July 21, 2020, 10:58 a.m. UTC | #5
On 7/21/2020 12:37 PM, Stephen Boyd wrote:
> Quoting Stephen Boyd (2020-07-20 22:59:14)
>> I worry that we also need a dmb() here to make sure the dma buffer is
>> properly mapped before this write to the device is attempted. But it may
>> only matter to be before the I2C_READ.
>>
> I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> use a writel() so that it has the proper barrier semantics to wait for
> the other memory writes that happened in program order before this point
> to complete before the device is kicked to do a read or a write.

Not sure if the issue was because of the barrier, but fundamentally for 
read operation, before FIFO data gets written by the DMA to memory,

buffer should be present. Hence the previous change from Doug seem to be 
fine as well.

> ----8<----
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> index 18d1e4fd4cf3..7f130829bf01 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>   		geni_se_select_mode(se, GENI_SE_FIFO);
>   
>   	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> -	geni_se_setup_m_cmd(se, I2C_READ, m_param);
>   
>   	if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
>   		geni_se_select_mode(se, GENI_SE_FIFO);
> @@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>   		dma_buf = NULL;
>   	}
>   
> +	geni_se_setup_m_cmd(se, I2C_READ, m_param);
> +
>   	time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>   	if (!time_left)
>   		geni_i2c_abort_xfer(gi2c);
> @@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>   		geni_se_select_mode(se, GENI_SE_FIFO);
>   
>   	writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
> -	geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
>   
>   	if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
>   		geni_se_select_mode(se, GENI_SE_FIFO);
> @@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>   		dma_buf = NULL;
>   	}
>   
> +	geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> +
>   	if (!dma_buf) /* Get FIFO IRQ */
>   		writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);
>   
> diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
> index dd464943f717..1dc134e9eb36 100644
> --- a/include/linux/qcom-geni-se.h
> +++ b/include/linux/qcom-geni-se.h
> @@ -262,7 +262,7 @@ static inline void geni_se_setup_m_cmd(struct geni_se *se, u32 cmd, u32 params)
>   	u32 m_cmd;
>   
>   	m_cmd = (cmd << M_OPCODE_SHFT) | (params & M_PARAMS_MSK);
> -	writel_relaxed(m_cmd, se->base + SE_GENI_M_CMD0);
> +	writel(m_cmd, se->base + SE_GENI_M_CMD0);
>   }
>   
>   /**
Doug Anderson July 21, 2020, 4:18 p.m. UTC | #6
Hi,

On Tue, Jul 21, 2020 at 12:08 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Stephen Boyd (2020-07-20 22:59:14)
> >
> > I worry that we also need a dmb() here to make sure the dma buffer is
> > properly mapped before this write to the device is attempted. But it may
> > only matter to be before the I2C_READ.
> >
>
> I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> use a writel() so that it has the proper barrier semantics to wait for
> the other memory writes that happened in program order before this point
> to complete before the device is kicked to do a read or a write.

Are you saying that dma_map_single() isn't guaranteed to have a
barrier or something?  I tried to do some searching and found a thread
[1] where someone tried to add a barrierless variant of them.  To me
that means that the current APIs have barriers.

...or is there something else you're worried about?


> ----8<----
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> index 18d1e4fd4cf3..7f130829bf01 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>                 geni_se_select_mode(se, GENI_SE_FIFO);
>
>         writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> -       geni_se_setup_m_cmd(se, I2C_READ, m_param);
>
>         if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
>                 geni_se_select_mode(se, GENI_SE_FIFO);
> @@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>                 dma_buf = NULL;
>         }
>
> +       geni_se_setup_m_cmd(se, I2C_READ, m_param);

I guess it's true that we only need the setup_m_cmd moved.


> +
>         time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>         if (!time_left)
>                 geni_i2c_abort_xfer(gi2c);
> @@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>                 geni_se_select_mode(se, GENI_SE_FIFO);
>
>         writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
> -       geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
>
>         if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
>                 geni_se_select_mode(se, GENI_SE_FIFO);
> @@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>                 dma_buf = NULL;
>         }
>
> +       geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> +

True, it's probably safer to do the TX too even if I'm not seeing
problems there.  Of course, I don't think I'm doing any large writes
so probably never triggering this path anyway.


>         if (!dma_buf) /* Get FIFO IRQ */
>                 writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);
>
> diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
> index dd464943f717..1dc134e9eb36 100644
> --- a/include/linux/qcom-geni-se.h
> +++ b/include/linux/qcom-geni-se.h
> @@ -262,7 +262,7 @@ static inline void geni_se_setup_m_cmd(struct geni_se *se, u32 cmd, u32 params)
>         u32 m_cmd;
>
>         m_cmd = (cmd << M_OPCODE_SHFT) | (params & M_PARAMS_MSK);
> -       writel_relaxed(m_cmd, se->base + SE_GENI_M_CMD0);
> +       writel(m_cmd, se->base + SE_GENI_M_CMD0);

I'll wait a little bit to see if you agree that the implicit barrier
that's part of dma_map_single() gets rid of the need to change
geni_se_setup_m_cmd().  If you agree then I'll send a v2 that moves
just the setup_m_cmd and does TX in addition to RX.  I'll plan to keep
accumulated tags unless someone says this is a bad idea.


[1] https://lore.kernel.org/r/1264473346-32721-1-git-send-email-adharmap@codeaurora.org/

-Doug
Stephen Boyd July 21, 2020, 6:55 p.m. UTC | #7
Quoting Doug Anderson (2020-07-21 09:18:35)
> On Tue, Jul 21, 2020 at 12:08 AM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > Quoting Stephen Boyd (2020-07-20 22:59:14)
> > >
> > > I worry that we also need a dmb() here to make sure the dma buffer is
> > > properly mapped before this write to the device is attempted. But it may
> > > only matter to be before the I2C_READ.
> > >
> >
> > I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> > use a writel() so that it has the proper barrier semantics to wait for
> > the other memory writes that happened in program order before this point
> > to complete before the device is kicked to do a read or a write.
> 
> Are you saying that dma_map_single() isn't guaranteed to have a
> barrier or something?  I tried to do some searching and found a thread
> [1] where someone tried to add a barrierless variant of them.  To me
> that means that the current APIs have barriers.
> 
> ...or is there something else you're worried about?

I'm not really thinking about dma_map_single() having a barrier or not.
The patch you mention is from 2010. Many things have changed in the last
decade. Does it have barrier semantics? The presence of a patch on the
mailing list doesn't mean much.

Specifically I'm looking at "KERNEL I/O BARRIER EFFECTS" of
Documentation/memory-barriers.txt and noticing that this driver is using
relaxed IO accessors meaning that the reads and writes aren't ordered
with respect to other memory accesses. They're only ordered to
themselves within the same device. I'm concerned that the CPU will issue
the IO access to start the write DMA operation before the buffer is
copied over due to out of order execution.

I'm not an expert in this area, but this is why we ask driver authors to
use the non-relaxed accessors because they have the appropriate
semantics built in to make them easy to reason about. They do what they
say when they say to do it.

> 
> 
> > ----8<----
> > diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> > index 18d1e4fd4cf3..7f130829bf01 100644
> > --- a/drivers/i2c/busses/i2c-qcom-geni.c
> > +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> > @@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> >                 geni_se_select_mode(se, GENI_SE_FIFO);
> >
> >         writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
> > -       geni_se_setup_m_cmd(se, I2C_READ, m_param);
> >
> >         if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
> >                 geni_se_select_mode(se, GENI_SE_FIFO);
> > @@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> >                 dma_buf = NULL;
> >         }
> >
> > +       geni_se_setup_m_cmd(se, I2C_READ, m_param);
> 
> I guess it's true that we only need the setup_m_cmd moved.

Alright cool. That makes more sense.

> 
> 
> > +
> >         time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> >         if (!time_left)
> >                 geni_i2c_abort_xfer(gi2c);
> > @@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> >                 geni_se_select_mode(se, GENI_SE_FIFO);
> >
> >         writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
> > -       geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> >
> >         if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
> >                 geni_se_select_mode(se, GENI_SE_FIFO);
> > @@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> >                 dma_buf = NULL;
> >         }
> >
> > +       geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
> > +
> 
> True, it's probably safer to do the TX too even if I'm not seeing
> problems there.  Of course, I don't think I'm doing any large writes
> so probably never triggering this path anyway.

Right, this is just by inspection of the code to see that it's the same
scenario, kicking off the DMA operation at the device before mapping the
buffer.

> 
> 
> >         if (!dma_buf) /* Get FIFO IRQ */
> >                 writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);
> >
Doug Anderson July 21, 2020, 8:26 p.m. UTC | #8
Hi,

On Tue, Jul 21, 2020 at 11:55 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Doug Anderson (2020-07-21 09:18:35)
> > On Tue, Jul 21, 2020 at 12:08 AM Stephen Boyd <swboyd@chromium.org> wrote:
> > >
> > > Quoting Stephen Boyd (2020-07-20 22:59:14)
> > > >
> > > > I worry that we also need a dmb() here to make sure the dma buffer is
> > > > properly mapped before this write to the device is attempted. But it may
> > > > only matter to be before the I2C_READ.
> > > >
> > >
> > > I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> > > use a writel() so that it has the proper barrier semantics to wait for
> > > the other memory writes that happened in program order before this point
> > > to complete before the device is kicked to do a read or a write.
> >
> > Are you saying that dma_map_single() isn't guaranteed to have a
> > barrier or something?  I tried to do some searching and found a thread
> > [1] where someone tried to add a barrierless variant of them.  To me
> > that means that the current APIs have barriers.
> >
> > ...or is there something else you're worried about?
>
> I'm not really thinking about dma_map_single() having a barrier or not.
> The patch you mention is from 2010. Many things have changed in the last
> decade. Does it have barrier semantics? The presence of a patch on the
> mailing list doesn't mean much.

Yes, it's pretty old, but if you follow the thread and look at the
patch I'm fairly certain it's still relevant.  Specifically, following
one thread of dma_map_single() on arm64:

dma_map_single()
-> dma_map_single_attrs()
--> dma_map_page_attrs()
---> dma_direct_map_page()
----> arch_sync_dma_for_device()
-----> __dma_map_area()
------> __dma_inv_area() which has a "dsb"

I'm sure there are lots of other possible paths, but one thing pointed
out by following that path is 'DMA_ATTR_SKIP_CPU_SYNC'.  The
documentation of that option talks about the normal flow.  It says
that in the normal flow that dma_map_{single,page,sg} will
synchronize.  We are in the normal flow here.

As far as I understand, the whole point of dma_map_single() is to take
a given buffer and get it all ready so that if a device does DMA on it
right after the function exits that it's all set.


> Specifically I'm looking at "KERNEL I/O BARRIER EFFECTS" of
> Documentation/memory-barriers.txt and noticing that this driver is using
> relaxed IO accessors meaning that the reads and writes aren't ordered
> with respect to other memory accesses. They're only ordered to
> themselves within the same device. I'm concerned that the CPU will issue
> the IO access to start the write DMA operation before the buffer is
> copied over due to out of order execution.

I'm not an expert either, but it really looks like dma_map_single()
does all that we need it to.


> I'm not an expert in this area, but this is why we ask driver authors to
> use the non-relaxed accessors because they have the appropriate
> semantics built in to make them easy to reason about. They do what they
> say when they say to do it.

I'm all for avoiding using the relaxed variants too except if it's
been shown to be a performance problem.  The one hesitation I have,
though, is that I've spent time poking a bunch at the geni SPI driver.
We do _a lot_ of very small SPI transfers on our system.  For each of
these it's gotta setup a lot of commands.  When I was poking I
definitely noticed the difference between writel() and
writel_relaxed().  If we can save a few microseconds on each one of
these transfers it's probably worth it since it's effectively in the
inner loop of some transfers.

One option I thought of was to track the mode (DMA vs. FIFO) and only
do writel() for DMA mode.  If you're not convinced by my arguments
about dma_map_single(), would you be good with just doing the
non-relaxed version if we're in DMA mode?

-Doug
Doug Anderson July 22, 2020, 10:08 p.m. UTC | #9
Hi,

On Tue, Jul 21, 2020 at 1:26 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Jul 21, 2020 at 11:55 AM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > Quoting Doug Anderson (2020-07-21 09:18:35)
> > > On Tue, Jul 21, 2020 at 12:08 AM Stephen Boyd <swboyd@chromium.org> wrote:
> > > >
> > > > Quoting Stephen Boyd (2020-07-20 22:59:14)
> > > > >
> > > > > I worry that we also need a dmb() here to make sure the dma buffer is
> > > > > properly mapped before this write to the device is attempted. But it may
> > > > > only matter to be before the I2C_READ.
> > > > >
> > > >
> > > > I'm suggesting this patch instead where we make geni_se_setup_m_cmd()
> > > > use a writel() so that it has the proper barrier semantics to wait for
> > > > the other memory writes that happened in program order before this point
> > > > to complete before the device is kicked to do a read or a write.
> > >
> > > Are you saying that dma_map_single() isn't guaranteed to have a
> > > barrier or something?  I tried to do some searching and found a thread
> > > [1] where someone tried to add a barrierless variant of them.  To me
> > > that means that the current APIs have barriers.
> > >
> > > ...or is there something else you're worried about?
> >
> > I'm not really thinking about dma_map_single() having a barrier or not.
> > The patch you mention is from 2010. Many things have changed in the last
> > decade. Does it have barrier semantics? The presence of a patch on the
> > mailing list doesn't mean much.
>
> Yes, it's pretty old, but if you follow the thread and look at the
> patch I'm fairly certain it's still relevant.  Specifically, following
> one thread of dma_map_single() on arm64:
>
> dma_map_single()
> -> dma_map_single_attrs()
> --> dma_map_page_attrs()
> ---> dma_direct_map_page()
> ----> arch_sync_dma_for_device()
> -----> __dma_map_area()
> ------> __dma_inv_area() which has a "dsb"
>
> I'm sure there are lots of other possible paths, but one thing pointed
> out by following that path is 'DMA_ATTR_SKIP_CPU_SYNC'.  The
> documentation of that option talks about the normal flow.  It says
> that in the normal flow that dma_map_{single,page,sg} will
> synchronize.  We are in the normal flow here.
>
> As far as I understand, the whole point of dma_map_single() is to take
> a given buffer and get it all ready so that if a device does DMA on it
> right after the function exits that it's all set.
>
>
> > Specifically I'm looking at "KERNEL I/O BARRIER EFFECTS" of
> > Documentation/memory-barriers.txt and noticing that this driver is using
> > relaxed IO accessors meaning that the reads and writes aren't ordered
> > with respect to other memory accesses. They're only ordered to
> > themselves within the same device. I'm concerned that the CPU will issue
> > the IO access to start the write DMA operation before the buffer is
> > copied over due to out of order execution.
>
> I'm not an expert either, but it really looks like dma_map_single()
> does all that we need it to.
>
>
> > I'm not an expert in this area, but this is why we ask driver authors to
> > use the non-relaxed accessors because they have the appropriate
> > semantics built in to make them easy to reason about. They do what they
> > say when they say to do it.
>
> I'm all for avoiding using the relaxed variants too except if it's
> been shown to be a performance problem.  The one hesitation I have,
> though, is that I've spent time poking a bunch at the geni SPI driver.
> We do _a lot_ of very small SPI transfers on our system.  For each of
> these it's gotta setup a lot of commands.  When I was poking I
> definitely noticed the difference between writel() and
> writel_relaxed().  If we can save a few microseconds on each one of
> these transfers it's probably worth it since it's effectively in the
> inner loop of some transfers.
>
> One option I thought of was to track the mode (DMA vs. FIFO) and only
> do writel() for DMA mode.  If you're not convinced by my arguments
> about dma_map_single(), would you be good with just doing the
> non-relaxed version if we're in DMA mode?

OK, so I did some quick benchmarking and I couldn't find any
performance regression with just always using writel() here.  Even if
dma_map_single() does guarantee that things are synced:

* There's no guarantee that all geni users will use dma_map_{xxx}.

* As Stephen says, the writel() is easier to reason about.

The change to a writel() is a bit orthogonal to the issue being
discussed here, though and it wouldn't make sense to have one patch
touch both the geni headers and also the i2c code.  Thus, I have sent
v2 without it (just with the other fixes that Stephen requested) and
also sent out a separate patch to change from writel_relaxed() to
writel().

Breadcrumbs:

[PATCH v2] i2c: i2c-qcom-geni: Fix DMA transfer race
https://lore.kernel.org/r/20200722145948.v2.1.I7efdf6efaa6edadbb690196cd4fbe3392a582c89@changeid/

[PATCH] soc: qcom-geni-se: Don't use relaxed writes when writing commands
https://lore.kernel.org/r/20200722150113.1.Ia50ab5cb8a6d3a73d302e6bdc25542d48ffd27f4@changeid/

As mentioned after the cut in the i2c change, I have kept people's
tested/reviewed tags for v2.

-Doug
diff mbox series

Patch

diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 18d1e4fd4cf3..21e27f10510a 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -366,15 +366,15 @@  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 	else
 		geni_se_select_mode(se, GENI_SE_FIFO);
 
-	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
-	geni_se_setup_m_cmd(se, I2C_READ, m_param);
-
 	if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
 		geni_se_select_mode(se, GENI_SE_FIFO);
 		i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
 		dma_buf = NULL;
 	}
 
+	writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
+	geni_se_setup_m_cmd(se, I2C_READ, m_param);
+
 	time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
 	if (!time_left)
 		geni_i2c_abort_xfer(gi2c);