Message ID | 20220615032048.465486-3-peng.fan@oss.nxp.com (mailing list archive) |
---|---|
State | Superseded |
Series | remoteproc: support self recovery |
On Wed, Jun 15, 2022 at 11:20:48AM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> Current logic only support main processor to stop/start the remote
> processor after rproc crash. However to SoC, such as i.MX8QM/QXP, the
> remote processor could do attach recovery after crash and trigger watchdog
> reboot. It does not need main processor to load image, or stop/start M4
> core.
>
> Introduce two functions: rproc_attach_recovery, rproc_firmware_recovery
> for the two cases. Firmware recovery is as before, let main processor to
> help recovery, while attach recovery is recover itself withou help.
> To attach recovery, we only do detach and attach.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++---------
>  1 file changed, 45 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 02a04ab34a23..1c1c90176aff 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1883,6 +1883,47 @@ static int __rproc_detach(struct rproc *rproc)
>  	return 0;
>  }
>
> +static int rproc_attach_recovery(struct rproc *rproc)
> +{
> +	int ret;
> +
> +	mutex_unlock(&rproc->lock);
> +	ret = __rproc_detach(rproc);
> +	mutex_lock(&rproc->lock);
> +	if (ret)
> +		return ret;
> +
> +	return __rproc_attach(rproc);
> +}
> +
> +static int rproc_firmware_recovery(struct rproc *rproc)

s/rproc_firmware_recovery/rproc_boot_recovery

> +{
> +	const struct firmware *firmware_p;
> +	struct device *dev = &rproc->dev;
> +	int ret;
> +
> +	ret = rproc_stop(rproc, true);
> +	if (ret)
> +		return ret;
> +
> +	/* generate coredump */
> +	rproc->ops->coredump(rproc);
> +
> +	/* load firmware */
> +	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> +	if (ret < 0) {
> +		dev_err(dev, "request_firmware failed: %d\n", ret);
> +		return ret;
> +	}
> +
> +	/* boot the remote processor up again */
> +	ret = rproc_start(rproc, firmware_p);
> +
> +	release_firmware(firmware_p);
> +
> +	return ret;
> +}
> +
>  /**
>   * rproc_trigger_recovery() - recover a remoteproc
>   * @rproc: the remote processor
> @@ -1897,7 +1938,6 @@ static int __rproc_detach(struct rproc *rproc)
>   */
>  int rproc_trigger_recovery(struct rproc *rproc)
>  {
> -	const struct firmware *firmware_p;
>  	struct device *dev = &rproc->dev;
>  	int ret;
>
> @@ -1911,24 +1951,10 @@ int rproc_trigger_recovery(struct rproc *rproc)
>
>  	dev_err(dev, "recovering %s\n", rproc->name);
>
> -	ret = rproc_stop(rproc, true);
> -	if (ret)
> -		goto unlock_mutex;
> -
> -	/* generate coredump */
> -	rproc->ops->coredump(rproc);
> -
> -	/* load firmware */
> -	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> -	if (ret < 0) {
> -		dev_err(dev, "request_firmware failed: %d\n", ret);
> -		goto unlock_mutex;
> -	}
> -
> -	/* boot the remote processor up again */
> -	ret = rproc_start(rproc, firmware_p);
> -
> -	release_firmware(firmware_p);
> +	if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_ON_RECOVERY))
> +		ret = rproc_attach_recovery(rproc);
> +	else
> +		ret = rproc_firmware_recovery(rproc);

This patch contains a serious flaw related to locking that should have been
obvious when it was put together. Please go back and carefully review the
code you are submitting.

I will not consider another revision of this set until July 15th.

Thanks,
Mathieu

>
>  unlock_mutex:
>  	mutex_unlock(&rproc->lock);
> --
> 2.25.1
>
Hi Mathieu,

> Subject: Re: [PATCH V5 2/2] remoteproc: support attach recovery after rproc
> crash
>
> On Wed, Jun 15, 2022 at 11:20:48AM +0800, Peng Fan (OSS) wrote:
> > From: Peng Fan <peng.fan@nxp.com>
> >
> > Current logic only support main processor to stop/start the remote
> > processor after rproc crash. However to SoC, such as i.MX8QM/QXP, the
> > remote processor could do attach recovery after crash and trigger
> > watchdog reboot. It does not need main processor to load image, or
> > stop/start M4 core.
> >
> > Introduce two functions: rproc_attach_recovery,
> > rproc_firmware_recovery for the two cases. Firmware recovery is as
> > before, let main processor to help recovery, while attach recovery is
> > recover itself withou help.
> > To attach recovery, we only do detach and attach.
> >
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 64 +++++++++++++++++++---------
> >  1 file changed, 45 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 02a04ab34a23..1c1c90176aff 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1883,6 +1883,47 @@ static int __rproc_detach(struct rproc *rproc)
> >  	return 0;
> >  }
> >
> > +static int rproc_attach_recovery(struct rproc *rproc)
> > +{
> > +	int ret;
> > +
> > +	mutex_unlock(&rproc->lock);
> > +	ret = __rproc_detach(rproc);
> > +	mutex_lock(&rproc->lock);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return __rproc_attach(rproc);
> > +}
> > +
> > +static int rproc_firmware_recovery(struct rproc *rproc)
>
> s/rproc_firmware_recovery/rproc_boot_recovery
>
> > +{
> > +	const struct firmware *firmware_p;
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = rproc_stop(rproc, true);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* generate coredump */
> > +	rproc->ops->coredump(rproc);
> > +
> > +	/* load firmware */
> > +	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > +	if (ret < 0) {
> > +		dev_err(dev, "request_firmware failed: %d\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	/* boot the remote processor up again */
> > +	ret = rproc_start(rproc, firmware_p);
> > +
> > +	release_firmware(firmware_p);
> > +
> > +	return ret;
> > +}
> > +
> >  /**
> >   * rproc_trigger_recovery() - recover a remoteproc
> >   * @rproc: the remote processor
> > @@ -1897,7 +1938,6 @@ static int __rproc_detach(struct rproc *rproc)
> >   */
> >  int rproc_trigger_recovery(struct rproc *rproc)
> >  {
> > -	const struct firmware *firmware_p;
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> >
> > @@ -1911,24 +1951,10 @@ int rproc_trigger_recovery(struct rproc *rproc)
> >
> >  	dev_err(dev, "recovering %s\n", rproc->name);
> >
> > -	ret = rproc_stop(rproc, true);
> > -	if (ret)
> > -		goto unlock_mutex;
> > -
> > -	/* generate coredump */
> > -	rproc->ops->coredump(rproc);
> > -
> > -	/* load firmware */
> > -	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > -	if (ret < 0) {
> > -		dev_err(dev, "request_firmware failed: %d\n", ret);
> > -		goto unlock_mutex;
> > -	}
> > -
> > -	/* boot the remote processor up again */
> > -	ret = rproc_start(rproc, firmware_p);
> > -
> > -	release_firmware(firmware_p);
> > +	if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_ON_RECOVERY))
> > +		ret = rproc_attach_recovery(rproc);
> > +	else
> > +		ret = rproc_firmware_recovery(rproc);
>
> This patch contains a serious flaw related to locking that should have been
> obvious when it was put together. Please go back and carefully review the
> code you are submitting.

I think you mean the following change? In v4 I used rproc_detach(). When I
switched to __rproc_detach() based on your comments on V4, I forgot to drop
the unlock/lock pair around the call. My bad.

+static int rproc_attach_recovery(struct rproc *rproc)
+{
+	int ret;
+
+	mutex_unlock(&rproc->lock);
+	ret = __rproc_detach(rproc);
+	mutex_lock(&rproc->lock);
+	if (ret)
+		return ret;
+
+	return __rproc_attach(rproc);
+}

I will drop the unlock and lock, as below.

static int rproc_attach_recovery(struct rproc *rproc)
{
	int ret;

	ret = __rproc_detach(rproc);
	if (ret)
		return ret;

	return __rproc_attach(rproc);
}

>
> I will not consider another revision of this set until July 15th.

No problem. I hope that by then my v6 patch will not simply land at the end
of your queue :)

Thanks,
Peng.

>
> Thanks,
> Mathieu
>
> >
> >  unlock_mutex:
> >  	mutex_unlock(&rproc->lock);
> > --
> > 2.25.1
> >
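The locking point raised in this thread can be illustrated outside the kernel. The following is a minimal userspace sketch, not the remoteproc code: `struct toy_rproc`, the `toy_*` functions and the `lock_held` bookkeeping flag are all hypothetical names invented for illustration, and a pthread mutex stands in for the kernel mutex. It assumes the convention described in the reply, namely that the inner helpers (mirroring `__rproc_detach` and `__rproc_attach`) expect the caller to already hold the lock, so the recovery path must not unlock and relock around them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for struct rproc; not the kernel structure. */
enum toy_state { TOY_CRASHED, TOY_DETACHED, TOY_ATTACHED };

struct toy_rproc {
	pthread_mutex_t lock;
	bool lock_held;		/* bookkeeping so helpers can check the contract */
	enum toy_state state;
};

static void toy_lock(struct toy_rproc *r)
{
	pthread_mutex_lock(&r->lock);
	r->lock_held = true;
}

static void toy_unlock(struct toy_rproc *r)
{
	assert(r->lock_held);	/* unlocking without holding the lock is a bug */
	r->lock_held = false;
	pthread_mutex_unlock(&r->lock);
}

/* "Locked" helpers: the caller must already hold r->lock. */
static int toy_detach_locked(struct toy_rproc *r)
{
	if (!r->lock_held)
		return -1;	/* contract violated: called without the lock */
	r->state = TOY_DETACHED;
	return 0;
}

static int toy_attach_locked(struct toy_rproc *r)
{
	if (!r->lock_held || r->state != TOY_DETACHED)
		return -1;
	r->state = TOY_ATTACHED;
	return 0;
}

/* Runs with the lock held by the caller; no unlock/lock around the helper. */
static int toy_attach_recovery(struct toy_rproc *r)
{
	int ret = toy_detach_locked(r);

	if (ret)
		return ret;

	return toy_attach_locked(r);
}

/* Entry point: takes the lock once and releases it once. */
static int toy_trigger_recovery(struct toy_rproc *r)
{
	int ret;

	toy_lock(r);
	ret = toy_attach_recovery(r);
	toy_unlock(r);

	return ret;
}
```

In this sketch, dropping the lock inside `toy_attach_recovery` before calling `toy_detach_locked` would make the helper fail its own precondition, which is the toy analogue of the leftover unlock/lock pair acknowledged above.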
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 02a04ab34a23..1c1c90176aff 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1883,6 +1883,47 @@ static int __rproc_detach(struct rproc *rproc)
 	return 0;
 }
 
+static int rproc_attach_recovery(struct rproc *rproc)
+{
+	int ret;
+
+	mutex_unlock(&rproc->lock);
+	ret = __rproc_detach(rproc);
+	mutex_lock(&rproc->lock);
+	if (ret)
+		return ret;
+
+	return __rproc_attach(rproc);
+}
+
+static int rproc_firmware_recovery(struct rproc *rproc)
+{
+	const struct firmware *firmware_p;
+	struct device *dev = &rproc->dev;
+	int ret;
+
+	ret = rproc_stop(rproc, true);
+	if (ret)
+		return ret;
+
+	/* generate coredump */
+	rproc->ops->coredump(rproc);
+
+	/* load firmware */
+	ret = request_firmware(&firmware_p, rproc->firmware, dev);
+	if (ret < 0) {
+		dev_err(dev, "request_firmware failed: %d\n", ret);
+		return ret;
+	}
+
+	/* boot the remote processor up again */
+	ret = rproc_start(rproc, firmware_p);
+
+	release_firmware(firmware_p);
+
+	return ret;
+}
+
 /**
  * rproc_trigger_recovery() - recover a remoteproc
  * @rproc: the remote processor
@@ -1897,7 +1938,6 @@ static int __rproc_detach(struct rproc *rproc)
  */
 int rproc_trigger_recovery(struct rproc *rproc)
 {
-	const struct firmware *firmware_p;
 	struct device *dev = &rproc->dev;
 	int ret;
 
@@ -1911,24 +1951,10 @@ int rproc_trigger_recovery(struct rproc *rproc)
 
 	dev_err(dev, "recovering %s\n", rproc->name);
 
-	ret = rproc_stop(rproc, true);
-	if (ret)
-		goto unlock_mutex;
-
-	/* generate coredump */
-	rproc->ops->coredump(rproc);
-
-	/* load firmware */
-	ret = request_firmware(&firmware_p, rproc->firmware, dev);
-	if (ret < 0) {
-		dev_err(dev, "request_firmware failed: %d\n", ret);
-		goto unlock_mutex;
-	}
-
-	/* boot the remote processor up again */
-	ret = rproc_start(rproc, firmware_p);
-
-	release_firmware(firmware_p);
+	if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_ON_RECOVERY))
+		ret = rproc_attach_recovery(rproc);
+	else
+		ret = rproc_firmware_recovery(rproc);
 
 unlock_mutex:
 	mutex_unlock(&rproc->lock);
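The last hunk of the diff selects a recovery path from a per-device feature flag. As a rough illustration of that dispatch, here is a small userspace sketch. Everything in it is an assumption made for the example: `struct toy_dev`, `toy_set_feature`, `toy_has_feature` and `toy_pick_recovery` are hypothetical names, and the single-word bitmask is a simplification that says nothing about how `rproc_has_feature()` is implemented in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical feature bits; only the idea of an attach-on-recovery flag comes from the patch. */
enum toy_feature { TOY_FEAT_ATTACH_ON_RECOVERY = 0 };

/* The two recovery paths the patch distinguishes. */
enum toy_path { TOY_PATH_ATTACH, TOY_PATH_BOOT };

struct toy_dev {
	unsigned long features;	/* one bit per feature */
};

static void toy_set_feature(struct toy_dev *d, enum toy_feature f)
{
	/* The bit index must fit inside the bitmask word. */
	assert((unsigned int)f < sizeof(d->features) * CHAR_BIT);
	d->features |= 1UL << f;
}

static bool toy_has_feature(const struct toy_dev *d, enum toy_feature f)
{
	return (d->features >> f) & 1UL;
}

/* Attach recovery if the device advertises it, otherwise stop/reload/start. */
static enum toy_path toy_pick_recovery(const struct toy_dev *d)
{
	if (toy_has_feature(d, TOY_FEAT_ATTACH_ON_RECOVERY))
		return TOY_PATH_ATTACH;

	return TOY_PATH_BOOT;
}
```

A device that never sets the flag falls through to the boot path, which matches the commit message's statement that firmware recovery behaves as before.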