Message ID | 20200722145948.v2.1.I7efdf6efaa6edadbb690196cd4fbe3392a582c89@changeid (mailing list archive) |
---|---|
State | Accepted |
Commit | 02b9aec59243c6240fc42884acc958602146ddf6 |
Headers | show |
Series | [v2] i2c: i2c-qcom-geni: Fix DMA transfer race | expand |
Quoting Douglas Anderson (2020-07-22 15:00:21) > When I have KASAN enabled on my kernel and I start stressing the > touchscreen my system tends to hang. The touchscreen is one of the > only things that does a lot of big i2c transfers and ends up hitting > the DMA paths in the geni i2c driver. It appears that KASAN adds > enough delay in my system to tickle a race condition in the DMA setup > code. > > When the system hangs, I found that it was running the geni_i2c_irq() > over and over again. It had these: > > m_stat = 0x04000080 > rx_st = 0x30000011 > dm_tx_st = 0x00000000 > dm_rx_st = 0x00000000 > dma = 0x00000001 > > Notably we're in DMA mode but are getting M_RX_IRQ_EN and > M_RX_FIFO_WATERMARK_EN over and over again. > > Putting some traces in geni_i2c_rx_one_msg() showed that when we > failed we were getting to the start of geni_i2c_rx_one_msg() but were > never executing geni_se_rx_dma_prep(). > > I believe that the problem here is that we are starting the geni > command before we run geni_se_rx_dma_prep(). If a transfer makes it > far enough before we do that then we get into the state I have > observed. Let's change the order, which seems to work fine. > > Although problems were seen on the RX path, code inspection suggests > that the TX should be changed too. Change it as well. > > Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") > Signed-off-by: Douglas Anderson <dianders@chromium.org> > Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> > Reviewed-by: Akash Asthana <akashast@codeaurora.org> > --- Reviewed-by: Stephen Boyd <swboyd@chromium.org>
On 7/23/2020 6:20 AM, Stephen Boyd wrote: > Quoting Douglas Anderson (2020-07-22 15:00:21) >> When I have KASAN enabled on my kernel and I start stressing the >> touchscreen my system tends to hang. The touchscreen is one of the >> only things that does a lot of big i2c transfers and ends up hitting >> the DMA paths in the geni i2c driver. It appears that KASAN adds >> enough delay in my system to tickle a race condition in the DMA setup >> code. >> >> When the system hangs, I found that it was running the geni_i2c_irq() >> over and over again. It had these: >> >> m_stat = 0x04000080 >> rx_st = 0x30000011 >> dm_tx_st = 0x00000000 >> dm_rx_st = 0x00000000 >> dma = 0x00000001 >> >> Notably we're in DMA mode but are getting M_RX_IRQ_EN and >> M_RX_FIFO_WATERMARK_EN over and over again. >> >> Putting some traces in geni_i2c_rx_one_msg() showed that when we >> failed we were getting to the start of geni_i2c_rx_one_msg() but were >> never executing geni_se_rx_dma_prep(). >> >> I believe that the problem here is that we are starting the geni >> command before we run geni_se_rx_dma_prep(). If a transfer makes it >> far enough before we do that then we get into the state I have >> observed. Let's change the order, which seems to work fine. >> >> Although problems were seen on the RX path, code inspection suggests >> that the TX should be changed too. Change it as well. >> >> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") >> Signed-off-by: Douglas Anderson <dianders@chromium.org> >> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> >> Reviewed-by: Akash Asthana <akashast@codeaurora.org> Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org> >> --- > Reviewed-by: Stephen Boyd <swboyd@chromium.org>
On Wed, Jul 22, 2020 at 03:00:21PM -0700, Douglas Anderson wrote: > When I have KASAN enabled on my kernel and I start stressing the > touchscreen my system tends to hang. The touchscreen is one of the > only things that does a lot of big i2c transfers and ends up hitting > the DMA paths in the geni i2c driver. It appears that KASAN adds > enough delay in my system to tickle a race condition in the DMA setup > code. > > When the system hangs, I found that it was running the geni_i2c_irq() > over and over again. It had these: > > m_stat = 0x04000080 > rx_st = 0x30000011 > dm_tx_st = 0x00000000 > dm_rx_st = 0x00000000 > dma = 0x00000001 > > Notably we're in DMA mode but are getting M_RX_IRQ_EN and > M_RX_FIFO_WATERMARK_EN over and over again. > > Putting some traces in geni_i2c_rx_one_msg() showed that when we > failed we were getting to the start of geni_i2c_rx_one_msg() but were > never executing geni_se_rx_dma_prep(). > > I believe that the problem here is that we are starting the geni > command before we run geni_se_rx_dma_prep(). If a transfer makes it > far enough before we do that then we get into the state I have > observed. Let's change the order, which seems to work fine. > > Although problems were seen on the RX path, code inspection suggests > that the TX should be changed too. Change it as well. > > Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") > Signed-off-by: Douglas Anderson <dianders@chromium.org> > Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> > Reviewed-by: Akash Asthana <akashast@codeaurora.org> Applied to for-current, thanks!
On Thu, Jul 23, 2020 at 09:56:34PM +0200, Wolfram Sang wrote: > On Wed, Jul 22, 2020 at 03:00:21PM -0700, Douglas Anderson wrote: > > When I have KASAN enabled on my kernel and I start stressing the > > touchscreen my system tends to hang. The touchscreen is one of the > > only things that does a lot of big i2c transfers and ends up hitting > > the DMA paths in the geni i2c driver. It appears that KASAN adds > > enough delay in my system to tickle a race condition in the DMA setup > > code. > > > > When the system hangs, I found that it was running the geni_i2c_irq() > > over and over again. It had these: > > > > m_stat = 0x04000080 > > rx_st = 0x30000011 > > dm_tx_st = 0x00000000 > > dm_rx_st = 0x00000000 > > dma = 0x00000001 > > > > Notably we're in DMA mode but are getting M_RX_IRQ_EN and > > M_RX_FIFO_WATERMARK_EN over and over again. > > > > Putting some traces in geni_i2c_rx_one_msg() showed that when we > > failed we were getting to the start of geni_i2c_rx_one_msg() but were > > never executing geni_se_rx_dma_prep(). > > > > I believe that the problem here is that we are starting the geni > > command before we run geni_se_rx_dma_prep(). If a transfer makes it > > far enough before we do that then we get into the state I have > > observed. Let's change the order, which seems to work fine. > > > > Although problems were seen on the RX path, code inspection suggests > > that the TX should be changed too. Change it as well. > > > > Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> > > Reviewed-by: Akash Asthana <akashast@codeaurora.org> > > Applied to for-current, thanks! Glad we got this sorted. I just wondered that Alok wasn't part of the discussion. Is he still interested in maintaining the driver? Also because there is an unprocessed patch left for this driver: http://patchwork.ozlabs.org/project/linux-i2c/patch/20191103212204.13606-1-colin.king@canonical.com/
On 7/26/2020 6:17 PM, Wolfram Sang wrote: > On Thu, Jul 23, 2020 at 09:56:34PM +0200, Wolfram Sang wrote: >> On Wed, Jul 22, 2020 at 03:00:21PM -0700, Douglas Anderson wrote: >>> When I have KASAN enabled on my kernel and I start stressing the >>> touchscreen my system tends to hang. The touchscreen is one of the >>> only things that does a lot of big i2c transfers and ends up hitting >>> the DMA paths in the geni i2c driver. It appears that KASAN adds >>> enough delay in my system to tickle a race condition in the DMA setup >>> code. >>> >>> When the system hangs, I found that it was running the geni_i2c_irq() >>> over and over again. It had these: >>> >>> m_stat = 0x04000080 >>> rx_st = 0x30000011 >>> dm_tx_st = 0x00000000 >>> dm_rx_st = 0x00000000 >>> dma = 0x00000001 >>> >>> Notably we're in DMA mode but are getting M_RX_IRQ_EN and >>> M_RX_FIFO_WATERMARK_EN over and over again. >>> >>> Putting some traces in geni_i2c_rx_one_msg() showed that when we >>> failed we were getting to the start of geni_i2c_rx_one_msg() but were >>> never executing geni_se_rx_dma_prep(). >>> >>> I believe that the problem here is that we are starting the geni >>> command before we run geni_se_rx_dma_prep(). If a transfer makes it >>> far enough before we do that then we get into the state I have >>> observed. Let's change the order, which seems to work fine. >>> >>> Although problems were seen on the RX path, code inspection suggests >>> that the TX should be changed too. Change it as well. >>> >>> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") >>> Signed-off-by: Douglas Anderson <dianders@chromium.org> >>> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> >>> Reviewed-by: Akash Asthana <akashast@codeaurora.org> >> Applied to for-current, thanks! > Glad we got this sorted. I just wondered that Alok wasn't part of the > discussion. Is he still interested in maintaining the driver? Also > because there is an unprocessed patch left for this driver: > > http://patchwork.ozlabs.org/project/linux-i2c/patch/20191103212204.13606-1-colin.king@canonical.com/ Alok has moved out of GENI team, he no longer supports GENI I2C drivers. I have posted a patch to update maintainers list. Patch: https://patchwork.kernel.org/patch/11686541/ [MAINTAINERS: Update Geni I2C maintainers list] Also, Girish Mahadevan, Sagar Dharia and Karthikeyan Ramasubramanian no longer supports GENI drivers. Regards, Akash >
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c index 18d1e4fd4cf3..7f130829bf01 100644 --- a/drivers/i2c/busses/i2c-qcom-geni.c +++ b/drivers/i2c/busses/i2c-qcom-geni.c @@ -367,7 +367,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, geni_se_select_mode(se, GENI_SE_FIFO); writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN); - geni_se_setup_m_cmd(se, I2C_READ, m_param); if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) { geni_se_select_mode(se, GENI_SE_FIFO); @@ -375,6 +374,8 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, dma_buf = NULL; } + geni_se_setup_m_cmd(se, I2C_READ, m_param); + time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT); if (!time_left) geni_i2c_abort_xfer(gi2c); @@ -408,7 +409,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, geni_se_select_mode(se, GENI_SE_FIFO); writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN); - geni_se_setup_m_cmd(se, I2C_WRITE, m_param); if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) { geni_se_select_mode(se, GENI_SE_FIFO); @@ -416,6 +416,8 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg, dma_buf = NULL; } + geni_se_setup_m_cmd(se, I2C_WRITE, m_param); + if (!dma_buf) /* Get FIFO IRQ */ writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);