diff mbox series

[v8,2/2] iio: pressure: dps310: Reset chip after timeout

Message ID 20220915195719.136812-3-eajames@linux.ibm.com (mailing list archive)
State Accepted
Headers show
Series iio: pressure: dps310: Reset chip if MEAS_CFG is corrupt | expand

Commit Message

Eddie James Sept. 15, 2022, 7:57 p.m. UTC
The DPS310 chip has been observed to get "stuck" such that pressure
and temperature measurements are never indicated as "ready" in the
MEAS_CFG register. The only solution is to reset the device and try
again. In order to avoid continual failures, use a boolean flag to
only try the reset after timeout once if errors persist.

Fixes: ba6ec48e76bc ("iio: Add driver for Infineon DPS310")
Cc: <stable@vger.kernel.org>
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/iio/pressure/dps310.c | 74 ++++++++++++++++++++++++++++++-----
 1 file changed, 64 insertions(+), 10 deletions(-)

Comments

Andy Shevchenko Sept. 16, 2022, 5:51 a.m. UTC | #1
On Thu, Sep 15, 2022 at 10:57 PM Eddie James <eajames@linux.ibm.com> wrote:
>
> The DPS310 chip has been observed to get "stuck" such that pressure
> and temperature measurements are never indicated as "ready" in the
> MEAS_CFG register. The only solution is to reset the device and try
> again. In order to avoid continual failures, use a boolean flag to
> only try the reset after timeout once if errors persist.

...

> +static int dps310_ready(struct dps310_data *data, int ready_bit, int timeout)
> +{
> +       int rc;
> +
> +       rc = dps310_ready_status(data, ready_bit, timeout);
> +       if (rc) {

> +               if (rc == -ETIMEDOUT && !data->timeout_recovery_failed) {

Here you compare rc with a certain error code...

> +                       /* Reset and reinitialize the chip. */
> +                       if (dps310_reset_reinit(data)) {
> +                               data->timeout_recovery_failed = true;
> +                       } else {
> +                               /* Try again to get sensor ready status. */

> +                               if (dps310_ready_status(data, ready_bit, timeout))

...but here you assume that the only error code is -ETIMEDOUT. It's
kinda inconsistent (if you rely on internals of ready_status, then why
to check the certain error code, or otherwise why not to return a real
second error code). That's why I asked about re-using rc here.

In any case I don't think this justifies a v9, let's wait for your
answer and Jonathan's opinion.

> +                                       data->timeout_recovery_failed = true;
> +                               else
> +                                       return 0;
> +                       }
> +               }
> +
> +               return rc;
> +       }
> +
> +       data->timeout_recovery_failed = false;
> +       return 0;
> +}
Jonathan Cameron Sept. 16, 2022, 3:25 p.m. UTC | #2
On Fri, 16 Sep 2022 08:51:13 +0300
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> On Thu, Sep 15, 2022 at 10:57 PM Eddie James <eajames@linux.ibm.com> wrote:
> >
> > The DPS310 chip has been observed to get "stuck" such that pressure
> > and temperature measurements are never indicated as "ready" in the
> > MEAS_CFG register. The only solution is to reset the device and try
> > again. In order to avoid continual failures, use a boolean flag to
> > only try the reset after timeout once if errors persist.  
> 
> ...
> 
> > +static int dps310_ready(struct dps310_data *data, int ready_bit, int timeout)
> > +{
> > +       int rc;
> > +
> > +       rc = dps310_ready_status(data, ready_bit, timeout);
> > +       if (rc) {  
> 
> > +               if (rc == -ETIMEDOUT && !data->timeout_recovery_failed) {  
> 
> Here you compare rc with a certain error code...
> 
> > +                       /* Reset and reinitialize the chip. */
> > +                       if (dps310_reset_reinit(data)) {
> > +                               data->timeout_recovery_failed = true;
> > +                       } else {
> > +                               /* Try again to get sensor ready status. */  
> 
> > +                               if (dps310_ready_status(data, ready_bit, timeout))  
> 
> ...but here you assume that the only error code is -ETIMEDOUT. It's
> kinda inconsistent (if you rely on internals of ready_status, then why
> to check the certain error code, or otherwise why not to return a real
> second error code). That's why I asked about re-using rc here.

Hmm.

1st time around, any other error code would result in us just returning directly
which is fine.
2nd time around I'm not sure we care about what the error code is.  Even if
something else went wrong we failed to recover and the first error
that lead to that was -ETIMEDOUT.

So I think this is correct as is, be it a little unusual!

Jonathan

> 
> In any case I don't think this justifies a v9, let's wait for your
> answer and Jonathan's opinion.
> 
> > +                                       data->timeout_recovery_failed = true;
> > +                               else
> > +                                       return 0;
> > +                       }
> > +               }
> > +
> > +               return rc;
> > +       }
> > +
> > +       data->timeout_recovery_failed = false;
> > +       return 0;
> > +}  
>
Jonathan Cameron Sept. 18, 2022, 3:52 p.m. UTC | #3
On Fri, 16 Sep 2022 16:25:35 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Fri, 16 Sep 2022 08:51:13 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> 
> > On Thu, Sep 15, 2022 at 10:57 PM Eddie James <eajames@linux.ibm.com> wrote:  
> > >
> > > The DPS310 chip has been observed to get "stuck" such that pressure
> > > and temperature measurements are never indicated as "ready" in the
> > > MEAS_CFG register. The only solution is to reset the device and try
> > > again. In order to avoid continual failures, use a boolean flag to
> > > only try the reset after timeout once if errors persist.    
> > 
> > ...
> >   
> > > +static int dps310_ready(struct dps310_data *data, int ready_bit, int timeout)
> > > +{
> > > +       int rc;
> > > +
> > > +       rc = dps310_ready_status(data, ready_bit, timeout);
> > > +       if (rc) {    
> >   
> > > +               if (rc == -ETIMEDOUT && !data->timeout_recovery_failed) {    
> > 
> > Here you compare rc with a certain error code...
> >   
> > > +                       /* Reset and reinitialize the chip. */
> > > +                       if (dps310_reset_reinit(data)) {
> > > +                               data->timeout_recovery_failed = true;
> > > +                       } else {
> > > +                               /* Try again to get sensor ready status. */    
> >   
> > > +                               if (dps310_ready_status(data, ready_bit, timeout))    
> > 
> > ...but here you assume that the only error code is -ETIMEDOUT. It's
> > kinda inconsistent (if you rely on internals of ready_status, then why
> > to check the certain error code, or otherwise why not to return a real
> > second error code). That's why I asked about re-using rc here.  
> 
> Hmm.
> 
> 1st time around, any other error code would result in us just returning directly
> which is fine.
> 2nd time around I'm not sure we care about what the error code is.  Even if
> something else went wrong we failed to recover and the first error
> that lead to that was -ETIMEDOUT.
> 
> So I think this is correct as is, be it a little unusual!

Given timing late in the cycle, I've queued this up for the next merge
window rather than rushing it in.

Applied to the togreg branch of iio.git and pushed out as testing.
Note I plan to rebase that branch shortly as I need some stuff that
is in upstream for other series.  Hence still time for this discussion to
continue if anyone wants to!

Thanks,

Jonathan

> 
> Jonathan
> 
> > 
> > In any case I don't think this justifies a v9, let's wait for your
> > answer and Jonathan's opinion.
> >   
> > > +                                       data->timeout_recovery_failed = true;
> > > +                               else
> > > +                                       return 0;
> > > +                       }
> > > +               }
> > > +
> > > +               return rc;
> > > +       }
> > > +
> > > +       data->timeout_recovery_failed = false;
> > > +       return 0;
> > > +}    
> >   
>
diff mbox series

Patch

diff --git a/drivers/iio/pressure/dps310.c b/drivers/iio/pressure/dps310.c
index c706a8b423b5..984a3f511a1a 100644
--- a/drivers/iio/pressure/dps310.c
+++ b/drivers/iio/pressure/dps310.c
@@ -89,6 +89,7 @@  struct dps310_data {
 	s32 c00, c10, c20, c30, c01, c11, c21;
 	s32 pressure_raw;
 	s32 temp_raw;
+	bool timeout_recovery_failed;
 };
 
 static const struct iio_chan_spec dps310_channels[] = {
@@ -393,11 +394,69 @@  static int dps310_get_temp_k(struct dps310_data *data)
 	return scale_factors[ilog2(rc)];
 }
 
+static int dps310_reset_wait(struct dps310_data *data)
+{
+	int rc;
+
+	rc = regmap_write(data->regmap, DPS310_RESET, DPS310_RESET_MAGIC);
+	if (rc)
+		return rc;
+
+	/* Wait for device chip access: 2.5ms in specification */
+	usleep_range(2500, 12000);
+	return 0;
+}
+
+static int dps310_reset_reinit(struct dps310_data *data)
+{
+	int rc;
+
+	rc = dps310_reset_wait(data);
+	if (rc)
+		return rc;
+
+	return dps310_startup(data);
+}
+
+static int dps310_ready_status(struct dps310_data *data, int ready_bit, int timeout)
+{
+	int sleep = DPS310_POLL_SLEEP_US(timeout);
+	int ready;
+
+	return regmap_read_poll_timeout(data->regmap, DPS310_MEAS_CFG, ready, ready & ready_bit,
+					sleep, timeout);
+}
+
+static int dps310_ready(struct dps310_data *data, int ready_bit, int timeout)
+{
+	int rc;
+
+	rc = dps310_ready_status(data, ready_bit, timeout);
+	if (rc) {
+		if (rc == -ETIMEDOUT && !data->timeout_recovery_failed) {
+			/* Reset and reinitialize the chip. */
+			if (dps310_reset_reinit(data)) {
+				data->timeout_recovery_failed = true;
+			} else {
+				/* Try again to get sensor ready status. */
+				if (dps310_ready_status(data, ready_bit, timeout))
+					data->timeout_recovery_failed = true;
+				else
+					return 0;
+			}
+		}
+
+		return rc;
+	}
+
+	data->timeout_recovery_failed = false;
+	return 0;
+}
+
 static int dps310_read_pres_raw(struct dps310_data *data)
 {
 	int rc;
 	int rate;
-	int ready;
 	int timeout;
 	s32 raw;
 	u8 val[3];
@@ -409,9 +468,7 @@  static int dps310_read_pres_raw(struct dps310_data *data)
 	timeout = DPS310_POLL_TIMEOUT_US(rate);
 
 	/* Poll for sensor readiness; base the timeout upon the sample rate. */
-	rc = regmap_read_poll_timeout(data->regmap, DPS310_MEAS_CFG, ready,
-				      ready & DPS310_PRS_RDY,
-				      DPS310_POLL_SLEEP_US(timeout), timeout);
+	rc = dps310_ready(data, DPS310_PRS_RDY, timeout);
 	if (rc)
 		goto done;
 
@@ -448,7 +505,6 @@  static int dps310_read_temp_raw(struct dps310_data *data)
 {
 	int rc;
 	int rate;
-	int ready;
 	int timeout;
 
 	if (mutex_lock_interruptible(&data->lock))
@@ -458,10 +514,8 @@  static int dps310_read_temp_raw(struct dps310_data *data)
 	timeout = DPS310_POLL_TIMEOUT_US(rate);
 
 	/* Poll for sensor readiness; base the timeout upon the sample rate. */
-	rc = regmap_read_poll_timeout(data->regmap, DPS310_MEAS_CFG, ready,
-				      ready & DPS310_TMP_RDY,
-				      DPS310_POLL_SLEEP_US(timeout), timeout);
-	if (rc < 0)
+	rc = dps310_ready(data, DPS310_TMP_RDY, timeout);
+	if (rc)
 		goto done;
 
 	rc = dps310_read_temp_ready(data);
@@ -756,7 +810,7 @@  static void dps310_reset(void *action_data)
 {
 	struct dps310_data *data = action_data;
 
-	regmap_write(data->regmap, DPS310_RESET, DPS310_RESET_MAGIC);
+	dps310_reset_wait(data);
 }
 
 static const struct regmap_config dps310_regmap_config = {