
i2c: tegra: Check for overflow errors with BUG_ON.

Message ID 1313434172-18319-1-git-send-email-dianders@chromium.org (mailing list archive)
State New, archived

Commit Message

Doug Anderson Aug. 15, 2011, 6:49 p.m. UTC
This change doesn't fix any known problems but turns
on the overflow detection feature of the i2c controller
in the hopes of flushing out any current (or future)
bugs in the i2c driver.

Inspired by a change on nvidia's git server:
  http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commit;h=266d1b7397284505e55d06254b497cb32be07b69

Signed-off-by: Doug Anderson <dianders@chromium.org>
---
 drivers/i2c/busses/i2c-tegra.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

Comments

Felipe Balbi Aug. 15, 2011, 7:17 p.m. UTC | #1
Hi,

On Mon, Aug 15, 2011 at 11:49:32AM -0700, Doug Anderson wrote:
> This change doesn't fix any known problems but turns
> on the overflow detection feature of the i2c controller
> in the hopes of flushing out any current (or future)
> bugs in the i2c driver.
> 
> Inspired by a change on nvidia's git server:
>   http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commit;h=266d1b7397284505e55d06254b497cb32be07b69
> 
> Signed-off-by: Doug Anderson <dianders@chromium.org>
> ---
>  drivers/i2c/busses/i2c-tegra.c |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> index 2440b74..4dbba23 100644
> --- a/drivers/i2c/busses/i2c-tegra.c
> +++ b/drivers/i2c/busses/i2c-tegra.c
> @@ -367,7 +367,8 @@ static int tegra_i2c_init(struct tegra_i2c_dev *i2c_dev)
>  static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
>  {
>  	u32 status;
> -	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
> +	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
> +		I2C_INT_TX_FIFO_OVERFLOW;
>  	struct tegra_i2c_dev *i2c_dev = dev_id;
>  
>  	status = i2c_readl(i2c_dev, I2C_INT_STATUS);
> @@ -389,6 +390,9 @@ static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
>  	}
>  
>  	if (unlikely(status & status_err)) {
> +		/* Don't pass this back--it can only happen due to a bug. */
> +		BUG_ON(status & I2C_INT_TX_FIFO_OVERFLOW);

So due to a FIFO overflow you lock up the whole system? Can't you, e.g.,
reset the controller and reconfigure it rather than locking up the
system?
Doug Anderson Aug. 15, 2011, 7:52 p.m. UTC | #2
Felipe,

On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> reset the controller and reconfigure it rather than locking up the
> system ?

Certainly we could try to be more proactive and reset / retry / return
the error to the client.  However, since the only expected situation
where this BUG_ON should hit is due to a bug in this driver itself
(AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
to hit), that seems like a lot of added complexity.

Also: if there is an arbitrary software bug that is causing an overflow
condition to occur, I'm not sure how stable the system will be.
Specifically, the i2c controller is used (among other things) to talk
to the PMU and adjust voltages in the system.  If we just sent it a
random command, I'd rather report the bug right away so we don't get
hard to find/reproduce failures in other parts of the system.

What do others think?

-Doug
Felipe Balbi Aug. 15, 2011, 8:03 p.m. UTC | #3
Hi,

On Mon, Aug 15, 2011 at 12:52:36PM -0700, Doug Anderson wrote:
> Felipe,
> 
> On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> > so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> > reset the controller and reconfigure it rather than locking up the
> > system ?
> 
> Certainly we could try to be more proactive and reset / retry / return
> the error to the client.  However, since the only expected situation
> where this BUG_ON should hit is due to a bug in this driver itself
> (AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
> to hit), that seems like a lot of added complexity.

So at least just pass an error to the client; hanging the entire
system seems a bit too much, don't you think?

> Also: if there is an arbitrary software bug that is causing an overflow
> condition to occur, I'm not sure how stable the system will be.
> Specifically, the i2c controller is used (among other things) to talk
> to the PMU and adjust voltages in the system.  If we just sent it a
> random command, I'd rather report the bug right away so we don't get
> hard to find/reproduce failures in other parts of the system.

That's a good point, but I still think that, e.g., making a cellphone
unresponsive until a watchdog reset triggers just because you got a FIFO
overflow on the I2C controller is too much.
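
For illustration, the "pass an error to the client" alternative could look roughly like the userspace sketch below. It is not taken from the driver: the bit positions are placeholders, I2C_ERR_TX_FIFO_OVERFLOW is a hypothetical flag that the posted patch does not define, and the errno choices are assumptions made for this example.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit positions for illustration; the real ones live in i2c-tegra.c. */
#define I2C_INT_ARBITRATION_LOST  (1u << 2)
#define I2C_INT_NO_ACK            (1u << 3)
#define I2C_INT_TX_FIFO_OVERFLOW  (1u << 5)

#define I2C_ERR_NO_ACK            0x01u
#define I2C_ERR_ARBITRATION_LOST  0x02u
#define I2C_ERR_TX_FIFO_OVERFLOW  0x04u	/* hypothetical, not in the posted patch */

/* ISR side: record every error bit seen instead of calling BUG_ON(). */
uint32_t status_to_msg_err(uint32_t status)
{
	uint32_t msg_err = 0;

	if (status & I2C_INT_NO_ACK)
		msg_err |= I2C_ERR_NO_ACK;
	if (status & I2C_INT_ARBITRATION_LOST)
		msg_err |= I2C_ERR_ARBITRATION_LOST;
	if (status & I2C_INT_TX_FIFO_OVERFLOW)
		msg_err |= I2C_ERR_TX_FIFO_OVERFLOW;

	return msg_err;
}

/* Transfer side: turn the recorded flags into an errno for the client. */
int msg_err_to_errno(uint32_t msg_err)
{
	if (msg_err == 0)
		return 0;
	if (msg_err & I2C_ERR_TX_FIFO_OVERFLOW)
		return -EIO;		/* driver bug: report it, do not halt */
	if (msg_err & I2C_ERR_ARBITRATION_LOST)
		return -EAGAIN;		/* assumed mapping: caller may retry */
	return -ENXIO;			/* assumed mapping for a missing ACK */
}
```

With this shape, an overflow reaches the client as a failed transfer, and a one-time warning in the ISR could still make the underlying driver bug visible in the log.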
Ben Dooks Aug. 23, 2011, 6:39 p.m. UTC | #4
On Mon, Aug 15, 2011 at 11:03:50PM +0300, Felipe Balbi wrote:
> Hi,
> 
> On Mon, Aug 15, 2011 at 12:52:36PM -0700, Doug Anderson wrote:
> > Felipe,
> > 
> > On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> > > so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> > > reset the controller and reconfigure it rather than locking up the
> > > system ?
> > 
> > Certainly we could try to be more proactive and reset / retry / return
> > the error to the client.  However, since the only expected situation
> > where this BUG_ON should hit is due to a bug in this driver itself
> > (AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
> > to hit), that seems like a lot of added complexity.
> 
> so at least just pass an error to the client, but hanging the entire
> system seems a bit too much, don't you think ?
> 
> > Also: if there is an arbitrary software bug that is causing an overflow
> > condition to occur, I'm not sure how stable the system will be.
> > Specifically, the i2c controller is used (among other things) to talk
> > to the PMU and adjust voltages in the system.  If we just sent it a
> > random command, I'd rather report the bug right away so we don't get
> > hard to find/reproduce failures in other parts of the system.
> 
> that's a good point, I still think that e.g. making a cellphone
> unresponsive until a watchdog reset triggers just because you got a FIFO
> overflow on the I2C controller is too much.

Yes, I would agree on that. BUG() really should only be used
in cases where there's little possibility that the entire system
can continue to work.

In this case, it seems far more sensible to report this as an
error and see what can be done to recover the bus and controller
for the next transaction.
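
As a rough model of that flow (fail the current transfer, then reset and reprogram the controller so the next one starts clean), here is a small userspace sketch. Everything in it is hypothetical: struct demo_i2c_ctrl and the demo_i2c_* functions are toy stand-ins invented for this example and do not correspond to anything in i2c-tegra.c.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the controller; none of these fields exist in the driver. */
struct demo_i2c_ctrl {
	bool configured;	/* set once demo_i2c_init() has run */
	bool fifo_overflow;	/* latched error, as an overflow interrupt would leave it */
	unsigned int resets;	/* number of recoveries performed so far */
};

/* Program the (pretend) controller into a known-good state. */
void demo_i2c_init(struct demo_i2c_ctrl *c)
{
	c->fifo_overflow = false;
	c->configured = true;
}

/* Reset and reconfigure so the next transfer starts from a clean state. */
void demo_i2c_recover(struct demo_i2c_ctrl *c)
{
	c->configured = false;	/* models asserting the block reset */
	demo_i2c_init(c);
	c->resets++;
}

/*
 * One transfer. inject_overflow simulates the driver bug that overfills
 * the TX FIFO. The failed transfer is reported to the caller and the
 * controller is recovered instead of halting the system.
 */
int demo_i2c_xfer(struct demo_i2c_ctrl *c, bool inject_overflow)
{
	if (!c->configured)
		return -ENODEV;

	if (inject_overflow)
		c->fifo_overflow = true;

	if (c->fifo_overflow) {
		demo_i2c_recover(c);
		return -EIO;
	}
	return 0;
}
```

A caller would see -EIO for the transfer that overflowed and success on the following one, which is the behaviour argued for above. Whether a real reset is safe when the bus also carries PMU traffic is the open question raised earlier in the thread, and this sketch does not settle it.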

Patch

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 2440b74..4dbba23 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -367,7 +367,8 @@  static int tegra_i2c_init(struct tegra_i2c_dev *i2c_dev)
 static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
 {
 	u32 status;
-	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
+	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
+		I2C_INT_TX_FIFO_OVERFLOW;
 	struct tegra_i2c_dev *i2c_dev = dev_id;
 
 	status = i2c_readl(i2c_dev, I2C_INT_STATUS);
@@ -389,6 +390,9 @@  static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
 	}
 
 	if (unlikely(status & status_err)) {
+		/* Don't pass this back--it can only happen due to a bug. */
+		BUG_ON(status & I2C_INT_TX_FIFO_OVERFLOW);
+
 		if (status & I2C_INT_NO_ACK)
 			i2c_dev->msg_err |= I2C_ERR_NO_ACK;
 		if (status & I2C_INT_ARBITRATION_LOST)
@@ -423,7 +427,7 @@  err:
 	/* An error occurred, mask all interrupts */
 	tegra_i2c_mask_irq(i2c_dev, I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
 		I2C_INT_PACKET_XFER_COMPLETE | I2C_INT_TX_FIFO_DATA_REQ |
-		I2C_INT_RX_FIFO_DATA_REQ);
+		I2C_INT_RX_FIFO_DATA_REQ | I2C_INT_TX_FIFO_OVERFLOW);
 	i2c_writel(i2c_dev, status, I2C_INT_STATUS);
 	if (i2c_dev->is_dvc)
 		dvc_writel(i2c_dev, DVC_STATUS_I2C_DONE_INTR, DVC_STATUS);
@@ -473,7 +477,8 @@  static int tegra_i2c_xfer_msg(struct tegra_i2c_dev *i2c_dev,
 	if (!(msg->flags & I2C_M_RD))
 		tegra_i2c_fill_tx_fifo(i2c_dev);
 
-	int_mask = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
+	int_mask = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
+		I2C_INT_TX_FIFO_OVERFLOW;
 	if (msg->flags & I2C_M_RD)
 		int_mask |= I2C_INT_RX_FIFO_DATA_REQ;
 	else if (i2c_dev->msg_buf_remaining)