diff mbox series

[v3,09/15] remoteproc: Introduce function rproc_detach()

Message ID 20201126210642.897302-10-mathieu.poirier@linaro.org (mailing list archive)
State New, archived
Headers show
Series remoteproc: Add support for detaching from rproc | expand

Commit Message

Mathieu Poirier Nov. 26, 2020, 9:06 p.m. UTC
Introduce function rproc_detach() to enable the remoteproc
core to release the resources associated with a remote processor
without stopping its operation.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
---
 drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
 include/linux/remoteproc.h           |  1 +
 2 files changed, 65 insertions(+), 1 deletion(-)

Comments

Arnaud POULIQUEN Dec. 8, 2020, 6:35 p.m. UTC | #1
Hi Mathieu,


On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> Introduce function rproc_detach() to enable the remoteproc
> core to release the resources associated with a remote processor
> without stopping its operation.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
>  include/linux/remoteproc.h           |  1 +
>  2 files changed, 65 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 928b3f975798..f5adf05762e9 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>  /*
>   * __rproc_detach(): Does the opposite of rproc_attach()
>   */
> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> +static int __rproc_detach(struct rproc *rproc)
>  {
>  	struct device *dev = &rproc->dev;
>  	int ret;
> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
>  }
>  EXPORT_SYMBOL(rproc_shutdown);
>  
> +/**
> + * rproc_detach() - Detach the remote processor from the
> + * remoteproc core
> + *
> + * @rproc: the remote processor
> + *
> + * Detach a remote processor (previously attached to with rproc_actuate()).
> + *
> + * In case @rproc is still being used by an additional user(s), then
> + * this function will just decrement the power refcount and exit,
> + * without disconnecting the device.
> + *
> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> + * processor know that services provided by the application processor are
> + * no longer available.  From there it should be possible to remove the
> + * platform driver and even power cycle the application processor (if the HW
> + * supports it) without needing to switch off the remote processor.
> + */
> +int rproc_detach(struct rproc *rproc)
> +{
> +	struct device *dev = &rproc->dev;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&rproc->lock);
> +	if (ret) {
> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> +		return ret;
> +	}
> +
> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> +		ret = -EPERM;
> +		goto out;
> +	}
> +
> +	/* if the remote proc is still needed, bail out */
> +	if (!atomic_dec_and_test(&rproc->power)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	ret = __rproc_detach(rproc);
> +	if (ret) {
> +		atomic_inc(&rproc->power);
> +		goto out;
> +	}
> +
> +	/* clean up all acquired resources */
> +	rproc_resource_cleanup(rproc);

I started to test the series, I found 2 problems testing in STM32P1 board.

1) the resource_table pointer is unmapped if the firmware has been booted by the
Linux, generating a crash in rproc_free_vring.
I attached a fix at the end of the mail.

2) After the detach, the rproc state is "detached"
but it is no longer possible to re-attach to it correctly.
Neither if the firmware is standalone, nor if it has been booted
by the Linux.

I did not investigate, but the issue is probably linked to the resource
table address which is set to NULL.

So we either have to fix the problem in order to attach or forbid the transition.


Regards,
Arnaud

> +
> +	rproc_disable_iommu(rproc);
> +
> +	/*
> +	 * Set the remote processor's table pointer to NULL.  Since mapping
> +	 * of the resource table to a virtual address is done in the platform
> +	 * driver, unmapping should also be done there.
> +	 */
> +	rproc->table_ptr = NULL;
> +out:
> +	mutex_unlock(&rproc->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL(rproc_detach);
> +
>  /**
>   * rproc_get_by_phandle() - find a remote processor by phandle
>   * @phandle: phandle to the rproc
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index da15b77583d3..329c1c071dcf 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>  
>  int rproc_boot(struct rproc *rproc);
>  void rproc_shutdown(struct rproc *rproc);
> +int rproc_detach(struct rproc *rproc);
>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> 

From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
Date: Tue, 8 Dec 2020 18:54:51 +0100
Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr

If the firmware has been loaded and started by the kernel, the
resource table has probably been mapped by the carveout allocation
(see rproc_elf_find_loaded_rsc_table).
In this case the memory can have been unmapped before the vrings are free.
The result is a crash that occurs in rproc_free_vring while try to use the
unmapped pointer.

Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
---
 drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c
b/drivers/remoteproc/remoteproc_core.c
index 2b0a52fb3398..3508ffba4a2a 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
 		goto out;
 	}

+	/*
+	 * Prevent case that the installed resource table is no longer
+	 * accessible (e.g. memory unmapped), use the cache if available
+	 */
+	if (rproc->cached_table)
+		rproc->table_ptr = rproc->cached_table;
+
 	ret = __rproc_detach(rproc);
 	if (ret) {
 		atomic_inc(&rproc->power);
@@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)

 	rproc_disable_iommu(rproc);

+	/* Free the chached table memory that can has been allocated*/
+	kfree(rproc->cached_table);
+	rproc->cached_table = NULL;
 	/*
-	 * Set the remote processor's table pointer to NULL.  Since mapping
-	 * of the resource table to a virtual address is done in the platform
-	 * driver, unmapping should also be done there.
+	 * Set the remote processor's table pointer to NULL. If mapping
+	 * of the resource table to a virtual address has been done in the
+	 * platform driver(attachment to an existing firmware),
+	 * unmapping should also be done there.
 	 */
 	rproc->table_ptr = NULL;
 out:
Mathieu Poirier Dec. 8, 2020, 8:25 p.m. UTC | #2
On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> 
> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> > Introduce function rproc_detach() to enable the remoteproc
> > core to release the resources associated with a remote processor
> > without stopping its operation.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >  include/linux/remoteproc.h           |  1 +
> >  2 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 928b3f975798..f5adf05762e9 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  /*
> >   * __rproc_detach(): Does the opposite of rproc_attach()
> >   */
> > -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +static int __rproc_detach(struct rproc *rproc)
> >  {
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> > @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >  }
> >  EXPORT_SYMBOL(rproc_shutdown);
> >  
> > +/**
> > + * rproc_detach() - Detach the remote processor from the
> > + * remoteproc core
> > + *
> > + * @rproc: the remote processor
> > + *
> > + * Detach a remote processor (previously attached to with rproc_actuate()).
> > + *
> > + * In case @rproc is still being used by an additional user(s), then
> > + * this function will just decrement the power refcount and exit,
> > + * without disconnecting the device.
> > + *
> > + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> > + * processor know that services provided by the application processor are
> > + * no longer available.  From there it should be possible to remove the
> > + * platform driver and even power cycle the application processor (if the HW
> > + * supports it) without needing to switch off the remote processor.
> > + */
> > +int rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = mutex_lock_interruptible(&rproc->lock);
> > +	if (ret) {
> > +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> > +		return ret;
> > +	}
> > +
> > +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/* if the remote proc is still needed, bail out */
> > +	if (!atomic_dec_and_test(&rproc->power)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	ret = __rproc_detach(rproc);
> > +	if (ret) {
> > +		atomic_inc(&rproc->power);
> > +		goto out;
> > +	}
> > +
> > +	/* clean up all acquired resources */
> > +	rproc_resource_cleanup(rproc);
> 
> I started to test the series, I found 2 problems testing in STM32P1 board.
> 
> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> Linux, generating a crash in rproc_free_vring.
> I attached a fix at the end of the mail.
> 
> 2) After the detach, the rproc state is "detached"
> but it is no longer possible to re-attach to it correctly.
> Neither if the firmware is standalone, nor if it has been booted
> by the Linux.

Thanks for the report - I thought both problems had been fixed...

> 
> I did not investigate, but the issue is probably linked to the resource
> table address which is set to NULL.
> 
> So we either have to fix the problem in order to attach or forbid the transition.
> 

Perfect timing on your side as I was contemplating sending another revision.
Let me look at things and I will get back to you.

> 
> Regards,
> Arnaud
> 
> > +
> > +	rproc_disable_iommu(rproc);
> > +
> > +	/*
> > +	 * Set the remote processor's table pointer to NULL.  Since mapping
> > +	 * of the resource table to a virtual address is done in the platform
> > +	 * driver, unmapping should also be done there.
> > +	 */
> > +	rproc->table_ptr = NULL;
> > +out:
> > +	mutex_unlock(&rproc->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(rproc_detach);
> > +
> >  /**
> >   * rproc_get_by_phandle() - find a remote processor by phandle
> >   * @phandle: phandle to the rproc
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index da15b77583d3..329c1c071dcf 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >  
> >  int rproc_boot(struct rproc *rproc);
> >  void rproc_shutdown(struct rproc *rproc);
> > +int rproc_detach(struct rproc *rproc);
> >  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> > 
> 
> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> Date: Tue, 8 Dec 2020 18:54:51 +0100
> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> 
> If the firmware has been loaded and started by the kernel, the
> resource table has probably been mapped by the carveout allocation
> (see rproc_elf_find_loaded_rsc_table).
> In this case the memory can have been unmapped before the vrings are free.
> The result is a crash that occurs in rproc_free_vring while try to use the
> unmapped pointer.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index 2b0a52fb3398..3508ffba4a2a 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>  		goto out;
>  	}
> 
> +	/*
> +	 * Prevent case that the installed resource table is no longer
> +	 * accessible (e.g. memory unmapped), use the cache if available
> +	 */
> +	if (rproc->cached_table)
> +		rproc->table_ptr = rproc->cached_table;
> +
>  	ret = __rproc_detach(rproc);
>  	if (ret) {
>  		atomic_inc(&rproc->power);
> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> 
>  	rproc_disable_iommu(rproc);
> 
> +	/* Free the chached table memory that can has been allocated*/
> +	kfree(rproc->cached_table);
> +	rproc->cached_table = NULL;
>  	/*
> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> -	 * of the resource table to a virtual address is done in the platform
> -	 * driver, unmapping should also be done there.
> +	 * Set the remote processor's table pointer to NULL. If mapping
> +	 * of the resource table to a virtual address has been done in the
> +	 * platform driver(attachment to an existing firmware),
> +	 * unmapping should also be done there.
>  	 */
>  	rproc->table_ptr = NULL;
>  out:
> -- 
> 2.17.1
> 
> 
>
Mathieu Poirier Dec. 9, 2020, 12:53 a.m. UTC | #3
On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> 
> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> > Introduce function rproc_detach() to enable the remoteproc
> > core to release the resources associated with a remote processor
> > without stopping its operation.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >  include/linux/remoteproc.h           |  1 +
> >  2 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 928b3f975798..f5adf05762e9 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  /*
> >   * __rproc_detach(): Does the opposite of rproc_attach()
> >   */
> > -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +static int __rproc_detach(struct rproc *rproc)
> >  {
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> > @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >  }
> >  EXPORT_SYMBOL(rproc_shutdown);
> >  
> > +/**
> > + * rproc_detach() - Detach the remote processor from the
> > + * remoteproc core
> > + *
> > + * @rproc: the remote processor
> > + *
> > + * Detach a remote processor (previously attached to with rproc_actuate()).
> > + *
> > + * In case @rproc is still being used by an additional user(s), then
> > + * this function will just decrement the power refcount and exit,
> > + * without disconnecting the device.
> > + *
> > + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> > + * processor know that services provided by the application processor are
> > + * no longer available.  From there it should be possible to remove the
> > + * platform driver and even power cycle the application processor (if the HW
> > + * supports it) without needing to switch off the remote processor.
> > + */
> > +int rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = mutex_lock_interruptible(&rproc->lock);
> > +	if (ret) {
> > +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> > +		return ret;
> > +	}
> > +
> > +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/* if the remote proc is still needed, bail out */
> > +	if (!atomic_dec_and_test(&rproc->power)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	ret = __rproc_detach(rproc);
> > +	if (ret) {
> > +		atomic_inc(&rproc->power);
> > +		goto out;
> > +	}
> > +
> > +	/* clean up all acquired resources */
> > +	rproc_resource_cleanup(rproc);
> 
> I started to test the series, I found 2 problems testing in STM32P1 board.
> 
> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> Linux, generating a crash in rproc_free_vring.
> I attached a fix at the end of the mail.
> 

I have reproduced the condition on my side and confirm that your solution is
correct.  See below for a minor comment. 

> 2) After the detach, the rproc state is "detached"
> but it is no longer possible to re-attach to it correctly.
> Neither if the firmware is standalone, nor if it has been booted
> by the Linux.
> 

Did you update your FW image?  If so, I need to run the same one.

> I did not investigate, but the issue is probably linked to the resource
> table address which is set to NULL.
> 
> So we either have to fix the problem in order to attach or forbid the transition.
> 
> 
> Regards,
> Arnaud
> 
> > +
> > +	rproc_disable_iommu(rproc);
> > +
> > +	/*
> > +	 * Set the remote processor's table pointer to NULL.  Since mapping
> > +	 * of the resource table to a virtual address is done in the platform
> > +	 * driver, unmapping should also be done there.
> > +	 */
> > +	rproc->table_ptr = NULL;
> > +out:
> > +	mutex_unlock(&rproc->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(rproc_detach);
> > +
> >  /**
> >   * rproc_get_by_phandle() - find a remote processor by phandle
> >   * @phandle: phandle to the rproc
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index da15b77583d3..329c1c071dcf 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >  
> >  int rproc_boot(struct rproc *rproc);
> >  void rproc_shutdown(struct rproc *rproc);
> > +int rproc_detach(struct rproc *rproc);
> >  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> > 
> 
> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> Date: Tue, 8 Dec 2020 18:54:51 +0100
> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> 
> If the firmware has been loaded and started by the kernel, the
> resource table has probably been mapped by the carveout allocation
> (see rproc_elf_find_loaded_rsc_table).
> In this case the memory can have been unmapped before the vrings are free.
> The result is a crash that occurs in rproc_free_vring while try to use the
> unmapped pointer.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index 2b0a52fb3398..3508ffba4a2a 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>  		goto out;
>  	}
> 
> +	/*
> +	 * Prevent case that the installed resource table is no longer
> +	 * accessible (e.g. memory unmapped), use the cache if available
> +	 */
> +	if (rproc->cached_table)
> +		rproc->table_ptr = rproc->cached_table;

I don't think there is an explicit need to check ->cached_table.  If the remote
processor has been started by the remoteproc core it is valid anyway.  And below
kfree() is called invariably. 

So that problem is fixed.  Let me know about your FW image and we'll pick it up
from there.

Mathieu

> +
>  	ret = __rproc_detach(rproc);
>  	if (ret) {
>  		atomic_inc(&rproc->power);
> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> 
>  	rproc_disable_iommu(rproc);
> 
> +	/* Free the chached table memory that can has been allocated*/
> +	kfree(rproc->cached_table);
> +	rproc->cached_table = NULL;
>  	/*
> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> -	 * of the resource table to a virtual address is done in the platform
> -	 * driver, unmapping should also be done there.
> +	 * Set the remote processor's table pointer to NULL. If mapping
> +	 * of the resource table to a virtual address has been done in the
> +	 * platform driver(attachment to an existing firmware),
> +	 * unmapping should also be done there.
>  	 */
>  	rproc->table_ptr = NULL;
>  out:
> -- 
> 2.17.1
> 
> 
>
Arnaud POULIQUEN Dec. 9, 2020, 8:45 a.m. UTC | #4
On 12/9/20 1:53 AM, Mathieu Poirier wrote:
> On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
>> Hi Mathieu,
>>
>>
>> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
>>> Introduce function rproc_detach() to enable the remoteproc
>>> core to release the resources associated with a remote processor
>>> without stopping its operation.
>>>
>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
>>> ---
>>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
>>>  include/linux/remoteproc.h           |  1 +
>>>  2 files changed, 65 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>> index 928b3f975798..f5adf05762e9 100644
>>> --- a/drivers/remoteproc/remoteproc_core.c
>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>>>  /*
>>>   * __rproc_detach(): Does the opposite of rproc_attach()
>>>   */
>>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
>>> +static int __rproc_detach(struct rproc *rproc)
>>>  {
>>>  	struct device *dev = &rproc->dev;
>>>  	int ret;
>>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
>>>  }
>>>  EXPORT_SYMBOL(rproc_shutdown);
>>>  
>>> +/**
>>> + * rproc_detach() - Detach the remote processor from the
>>> + * remoteproc core
>>> + *
>>> + * @rproc: the remote processor
>>> + *
>>> + * Detach a remote processor (previously attached to with rproc_actuate()).
>>> + *
>>> + * In case @rproc is still being used by an additional user(s), then
>>> + * this function will just decrement the power refcount and exit,
>>> + * without disconnecting the device.
>>> + *
>>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
>>> + * processor know that services provided by the application processor are
>>> + * no longer available.  From there it should be possible to remove the
>>> + * platform driver and even power cycle the application processor (if the HW
>>> + * supports it) without needing to switch off the remote processor.
>>> + */
>>> +int rproc_detach(struct rproc *rproc)
>>> +{
>>> +	struct device *dev = &rproc->dev;
>>> +	int ret;
>>> +
>>> +	ret = mutex_lock_interruptible(&rproc->lock);
>>> +	if (ret) {
>>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
>>> +		return ret;
>>> +	}
>>> +
>>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
>>> +		ret = -EPERM;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* if the remote proc is still needed, bail out */
>>> +	if (!atomic_dec_and_test(&rproc->power)) {
>>> +		ret = -EBUSY;
>>> +		goto out;
>>> +	}
>>> +
>>> +	ret = __rproc_detach(rproc);
>>> +	if (ret) {
>>> +		atomic_inc(&rproc->power);
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* clean up all acquired resources */
>>> +	rproc_resource_cleanup(rproc);
>>
>> I started to test the series, I found 2 problems testing in STM32P1 board.
>>
>> 1) the resource_table pointer is unmapped if the firmware has been booted by the
>> Linux, generating a crash in rproc_free_vring.
>> I attached a fix at the end of the mail.
>>
> 
> I have reproduced the condition on my side and confirm that your solution is
> correct.  See below for a minor comment. 
> 
>> 2) After the detach, the rproc state is "detached"
>> but it is no longer possible to re-attach to it correctly.
>> Neither if the firmware is standalone, nor if it has been booted
>> by the Linux.
>>
> 
> Did you update your FW image?  If so, I need to run the same one.
> 
>> I did not investigate, but the issue is probably linked to the resource
>> table address which is set to NULL.
>>
>> So we either have to fix the problem in order to attach or forbid the transition.
>>
>>
>> Regards,
>> Arnaud
>>
>>> +
>>> +	rproc_disable_iommu(rproc);
>>> +
>>> +	/*
>>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
>>> +	 * of the resource table to a virtual address is done in the platform
>>> +	 * driver, unmapping should also be done there.
>>> +	 */
>>> +	rproc->table_ptr = NULL;
>>> +out:
>>> +	mutex_unlock(&rproc->lock);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(rproc_detach);
>>> +
>>>  /**
>>>   * rproc_get_by_phandle() - find a remote processor by phandle
>>>   * @phandle: phandle to the rproc
>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>>> index da15b77583d3..329c1c071dcf 100644
>>> --- a/include/linux/remoteproc.h
>>> +++ b/include/linux/remoteproc.h
>>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>>>  
>>>  int rproc_boot(struct rproc *rproc);
>>>  void rproc_shutdown(struct rproc *rproc);
>>> +int rproc_detach(struct rproc *rproc);
>>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
>>>
>>
>> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>> Date: Tue, 8 Dec 2020 18:54:51 +0100
>> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
>>
>> If the firmware has been loaded and started by the kernel, the
>> resource table has probably been mapped by the carveout allocation
>> (see rproc_elf_find_loaded_rsc_table).
>> In this case the memory can have been unmapped before the vrings are free.
>> The result is a crash that occurs in rproc_free_vring while try to use the
>> unmapped pointer.
>>
>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>> ---
>>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/remoteproc/remoteproc_core.c
>> b/drivers/remoteproc/remoteproc_core.c
>> index 2b0a52fb3398..3508ffba4a2a 100644
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>>  		goto out;
>>  	}
>>
>> +	/*
>> +	 * Prevent case that the installed resource table is no longer
>> +	 * accessible (e.g. memory unmapped), use the cache if available
>> +	 */
>> +	if (rproc->cached_table)
>> +		rproc->table_ptr = rproc->cached_table;
> 
> I don't think there is an explicit need to check ->cached_table.  If the remote
> processor has been started by the remoteproc core it is valid anyway.  And below
> kfree() is called invariably. 

The condition is needed, the  rproc->cached_table is null if the firmware as
been preloaded and the Linux remote proc just attaches to it.
The cached is used only when Linux loads the firmware, as the resource table is
extracted from the elf file to parse resource before the load of the firmware.

> 
> So that problem is fixed.  Let me know about your FW image and we'll pick it up
> from there.

I use the following example available on the stm32mp1 image:
/usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
This exemple use the RPMsg and also blink a LED when while running.

Don't hesitate if you need me to send it to you by mail.

Thank,
Arnaud

> 
> Mathieu
> 
>> +
>>  	ret = __rproc_detach(rproc);
>>  	if (ret) {
>>  		atomic_inc(&rproc->power);
>> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
>>
>>  	rproc_disable_iommu(rproc);
>>
>> +	/* Free the chached table memory that can has been allocated*/
>> +	kfree(rproc->cached_table);
>> +	rproc->cached_table = NULL;
>>  	/*
>> -	 * Set the remote processor's table pointer to NULL.  Since mapping
>> -	 * of the resource table to a virtual address is done in the platform
>> -	 * driver, unmapping should also be done there.
>> +	 * Set the remote processor's table pointer to NULL. If mapping
>> +	 * of the resource table to a virtual address has been done in the
>> +	 * platform driver(attachment to an existing firmware),
>> +	 * unmapping should also be done there.
>>  	 */
>>  	rproc->table_ptr = NULL;
>>  out:
>> -- 
>> 2.17.1
>>
>>
>>
Mathieu Poirier Dec. 9, 2020, 9:18 p.m. UTC | #5
On Wed, Dec 09, 2020 at 09:45:32AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 12/9/20 1:53 AM, Mathieu Poirier wrote:
> > On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> >> Hi Mathieu,
> >>
> >>
> >> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> >>> Introduce function rproc_detach() to enable the remoteproc
> >>> core to release the resources associated with a remote processor
> >>> without stopping its operation.
> >>>
> >>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
> >>> ---
> >>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >>>  include/linux/remoteproc.h           |  1 +
> >>>  2 files changed, 65 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> >>> index 928b3f975798..f5adf05762e9 100644
> >>> --- a/drivers/remoteproc/remoteproc_core.c
> >>> +++ b/drivers/remoteproc/remoteproc_core.c
> >>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >>>  /*
> >>>   * __rproc_detach(): Does the opposite of rproc_attach()
> >>>   */
> >>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> >>> +static int __rproc_detach(struct rproc *rproc)
> >>>  {
> >>>  	struct device *dev = &rproc->dev;
> >>>  	int ret;
> >>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >>>  }
> >>>  EXPORT_SYMBOL(rproc_shutdown);
> >>>  
> >>> +/**
> >>> + * rproc_detach() - Detach the remote processor from the
> >>> + * remoteproc core
> >>> + *
> >>> + * @rproc: the remote processor
> >>> + *
> >>> + * Detach a remote processor (previously attached to with rproc_actuate()).
> >>> + *
> >>> + * In case @rproc is still being used by an additional user(s), then
> >>> + * this function will just decrement the power refcount and exit,
> >>> + * without disconnecting the device.
> >>> + *
> >>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> >>> + * processor know that services provided by the application processor are
> >>> + * no longer available.  From there it should be possible to remove the
> >>> + * platform driver and even power cycle the application processor (if the HW
> >>> + * supports it) without needing to switch off the remote processor.
> >>> + */
> >>> +int rproc_detach(struct rproc *rproc)
> >>> +{
> >>> +	struct device *dev = &rproc->dev;
> >>> +	int ret;
> >>> +
> >>> +	ret = mutex_lock_interruptible(&rproc->lock);
> >>> +	if (ret) {
> >>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> >>> +		return ret;
> >>> +	}
> >>> +
> >>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> >>> +		ret = -EPERM;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/* if the remote proc is still needed, bail out */
> >>> +	if (!atomic_dec_and_test(&rproc->power)) {
> >>> +		ret = -EBUSY;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	ret = __rproc_detach(rproc);
> >>> +	if (ret) {
> >>> +		atomic_inc(&rproc->power);
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/* clean up all acquired resources */
> >>> +	rproc_resource_cleanup(rproc);
> >>
> >> I started to test the series, I found 2 problems testing in STM32P1 board.
> >>
> >> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> >> Linux, generating a crash in rproc_free_vring.
> >> I attached a fix at the end of the mail.
> >>
> > 
> > I have reproduced the condition on my side and confirm that your solution is
> > correct.  See below for a minor comment. 
> > 
> >> 2) After the detach, the rproc state is "detached"
> >> but it is no longer possible to re-attach to it correctly.
> >> Neither if the firmware is standalone, nor if it has been booted
> >> by the Linux.
> >>
> > 
> > Did you update your FW image?  If so, I need to run the same one.
> > 
> >> I did not investigate, but the issue is probably linked to the resource
> >> table address which is set to NULL.
> >>
> >> So we either have to fix the problem in order to attach or forbid the transition.
> >>
> >>
> >> Regards,
> >> Arnaud
> >>
> >>> +
> >>> +	rproc_disable_iommu(rproc);
> >>> +
> >>> +	/*
> >>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
> >>> +	 * of the resource table to a virtual address is done in the platform
> >>> +	 * driver, unmapping should also be done there.
> >>> +	 */
> >>> +	rproc->table_ptr = NULL;
> >>> +out:
> >>> +	mutex_unlock(&rproc->lock);
> >>> +	return ret;
> >>> +}
> >>> +EXPORT_SYMBOL(rproc_detach);
> >>> +
> >>>  /**
> >>>   * rproc_get_by_phandle() - find a remote processor by phandle
> >>>   * @phandle: phandle to the rproc
> >>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> >>> index da15b77583d3..329c1c071dcf 100644
> >>> --- a/include/linux/remoteproc.h
> >>> +++ b/include/linux/remoteproc.h
> >>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >>>  
> >>>  int rproc_boot(struct rproc *rproc);
> >>>  void rproc_shutdown(struct rproc *rproc);
> >>> +int rproc_detach(struct rproc *rproc);
> >>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> >>>
> >>
> >> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> >> Date: Tue, 8 Dec 2020 18:54:51 +0100
> >> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> >>
> >> If the firmware has been loaded and started by the kernel, the
> >> resource table has probably been mapped by the carveout allocation
> >> (see rproc_elf_find_loaded_rsc_table).
> >> In this case the memory can have been unmapped before the vrings are free.
> >> The result is a crash that occurs in rproc_free_vring while try to use the
> >> unmapped pointer.
> >>
> >> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> >> ---
> >>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
> >>  1 file changed, 14 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/remoteproc/remoteproc_core.c
> >> b/drivers/remoteproc/remoteproc_core.c
> >> index 2b0a52fb3398..3508ffba4a2a 100644
> >> --- a/drivers/remoteproc/remoteproc_core.c
> >> +++ b/drivers/remoteproc/remoteproc_core.c
> >> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
> >>  		goto out;
> >>  	}
> >>
> >> +	/*
> >> +	 * Prevent case that the installed resource table is no longer
> >> +	 * accessible (e.g. memory unmapped), use the cache if available
> >> +	 */
> >> +	if (rproc->cached_table)
> >> +		rproc->table_ptr = rproc->cached_table;
> > 
> > I don't think there is an explicit need to check ->cached_table.  If the remote
> > processor has been started by the remoteproc core it is valid anyway.  And below
> > kfree() is called invariably. 
> 
> The condition is needed, the  rproc->cached_table is null if the firmware as
> been preloaded and the Linux remote proc just attaches to it.
> The cached is used only when Linux loads the firmware, as the resource table is
> extracted from the elf file to parse resource before the load of the firmware.

I have taken another look at this and you are correct. The if() condition is
needed because ->table_ptr is set only once when the platform driver is
probed.  See further down...

> 
> > 
> > So that problem is fixed.  Let me know about your FW image and we'll pick it up
> > from there.
> 
> I use the following example available on the stm32mp1 image:
> /usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
> This exemple use the RPMsg and also blink a LED when while running.
> 
> Don't hesitate if you need me to send it to you by mail.
> 
> Thank,
> Arnaud
> 
> > 
> > Mathieu
> > 
> >> +
> >>  	ret = __rproc_detach(rproc);
> >>  	if (ret) {
> >>  		atomic_inc(&rproc->power);
> >> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> >>
> >>  	rproc_disable_iommu(rproc);
> >>
> >> +	/* Free the chached table memory that can has been allocated*/
> >> +	kfree(rproc->cached_table);
> >> +	rproc->cached_table = NULL;
> >>  	/*
> >> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> >> -	 * of the resource table to a virtual address is done in the platform
> >> -	 * driver, unmapping should also be done there.
> >> +	 * Set the remote processor's table pointer to NULL. If mapping
> >> +	 * of the resource table to a virtual address has been done in the
> >> +	 * platform driver(attachment to an existing firmware),
> >> +	 * unmapping should also be done there.
> >>  	 */
> >>  	rproc->table_ptr = NULL;

With the above in mind we can't to that, otherwise trying to re-attach with
rproc_attach() won't work because ->table_ptr will be NULL.

I wasn't able to test that code path because I didn't have the FW that supported
detaching.  Now that the feature is maturing it needs to be done.  

> >>  out:
> >> -- 
> >> 2.17.1
> >>
> >>
> >>
Arnaud POULIQUEN Dec. 10, 2020, 8:51 a.m. UTC | #6
On 12/9/20 10:18 PM, Mathieu Poirier wrote:
> On Wed, Dec 09, 2020 at 09:45:32AM +0100, Arnaud POULIQUEN wrote:
>>
>>
>> On 12/9/20 1:53 AM, Mathieu Poirier wrote:
>>> On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
>>>> Hi Mathieu,
>>>>
>>>>
>>>> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
>>>>> Introduce function rproc_detach() to enable the remoteproc
>>>>> core to release the resources associated with a remote processor
>>>>> without stopping its operation.
>>>>>
>>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
>>>>> ---
>>>>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
>>>>>  include/linux/remoteproc.h           |  1 +
>>>>>  2 files changed, 65 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>>>> index 928b3f975798..f5adf05762e9 100644
>>>>> --- a/drivers/remoteproc/remoteproc_core.c
>>>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>>>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>>>>>  /*
>>>>>   * __rproc_detach(): Does the opposite of rproc_attach()
>>>>>   */
>>>>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
>>>>> +static int __rproc_detach(struct rproc *rproc)
>>>>>  {
>>>>>  	struct device *dev = &rproc->dev;
>>>>>  	int ret;
>>>>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
>>>>>  }
>>>>>  EXPORT_SYMBOL(rproc_shutdown);
>>>>>  
>>>>> +/**
>>>>> + * rproc_detach() - Detach the remote processor from the
>>>>> + * remoteproc core
>>>>> + *
>>>>> + * @rproc: the remote processor
>>>>> + *
>>>>> + * Detach a remote processor (previously attached to with rproc_actuate()).
>>>>> + *
>>>>> + * In case @rproc is still being used by an additional user(s), then
>>>>> + * this function will just decrement the power refcount and exit,
>>>>> + * without disconnecting the device.
>>>>> + *
>>>>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
>>>>> + * processor know that services provided by the application processor are
>>>>> + * no longer available.  From there it should be possible to remove the
>>>>> + * platform driver and even power cycle the application processor (if the HW
>>>>> + * supports it) without needing to switch off the remote processor.
>>>>> + */
>>>>> +int rproc_detach(struct rproc *rproc)
>>>>> +{
>>>>> +	struct device *dev = &rproc->dev;
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = mutex_lock_interruptible(&rproc->lock);
>>>>> +	if (ret) {
>>>>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
>>>>> +		return ret;
>>>>> +	}
>>>>> +
>>>>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
>>>>> +		ret = -EPERM;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/* if the remote proc is still needed, bail out */
>>>>> +	if (!atomic_dec_and_test(&rproc->power)) {
>>>>> +		ret = -EBUSY;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	ret = __rproc_detach(rproc);
>>>>> +	if (ret) {
>>>>> +		atomic_inc(&rproc->power);
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/* clean up all acquired resources */
>>>>> +	rproc_resource_cleanup(rproc);
>>>>
>>>> I started to test the series, I found 2 problems testing in STM32P1 board.
>>>>
>>>> 1) the resource_table pointer is unmapped if the firmware has been booted by the
>>>> Linux, generating a crash in rproc_free_vring.
>>>> I attached a fix at the end of the mail.
>>>>
>>>
>>> I have reproduced the condition on my side and confirm that your solution is
>>> correct.  See below for a minor comment. 
>>>
>>>> 2) After the detach, the rproc state is "detached"
>>>> but it is no longer possible to re-attach to it correctly.
>>>> Neither if the firmware is standalone, nor if it has been booted
>>>> by the Linux.
>>>>
>>>
>>> Did you update your FW image?  If so, I need to run the same one.
>>>
>>>> I did not investigate, but the issue is probably linked to the resource
>>>> table address which is set to NULL.
>>>>
>>>> So we either have to fix the problem in order to attach or forbid the transition.
>>>>
>>>>
>>>> Regards,
>>>> Arnaud
>>>>
>>>>> +
>>>>> +	rproc_disable_iommu(rproc);
>>>>> +
>>>>> +	/*
>>>>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
>>>>> +	 * of the resource table to a virtual address is done in the platform
>>>>> +	 * driver, unmapping should also be done there.
>>>>> +	 */
>>>>> +	rproc->table_ptr = NULL;
>>>>> +out:
>>>>> +	mutex_unlock(&rproc->lock);
>>>>> +	return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL(rproc_detach);
>>>>> +
>>>>>  /**
>>>>>   * rproc_get_by_phandle() - find a remote processor by phandle
>>>>>   * @phandle: phandle to the rproc
>>>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>>>>> index da15b77583d3..329c1c071dcf 100644
>>>>> --- a/include/linux/remoteproc.h
>>>>> +++ b/include/linux/remoteproc.h
>>>>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>>>>>  
>>>>>  int rproc_boot(struct rproc *rproc);
>>>>>  void rproc_shutdown(struct rproc *rproc);
>>>>> +int rproc_detach(struct rproc *rproc);
>>>>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>>>>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>>>>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
>>>>>
>>>>
>>>> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>>>> Date: Tue, 8 Dec 2020 18:54:51 +0100
>>>> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
>>>>
>>>> If the firmware has been loaded and started by the kernel, the
>>>> resource table has probably been mapped by the carveout allocation
>>>> (see rproc_elf_find_loaded_rsc_table).
>>>> In this case the memory can have been unmapped before the vrings are free.
>>>> The result is a crash that occurs in rproc_free_vring while try to use the
>>>> unmapped pointer.
>>>>
>>>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>>>> ---
>>>>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>>>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/remoteproc/remoteproc_core.c
>>>> b/drivers/remoteproc/remoteproc_core.c
>>>> index 2b0a52fb3398..3508ffba4a2a 100644
>>>> --- a/drivers/remoteproc/remoteproc_core.c
>>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>>> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>>>>  		goto out;
>>>>  	}
>>>>
>>>> +	/*
>>>> +	 * Prevent case that the installed resource table is no longer
>>>> +	 * accessible (e.g. memory unmapped), use the cache if available
>>>> +	 */
>>>> +	if (rproc->cached_table)
>>>> +		rproc->table_ptr = rproc->cached_table;
>>>
>>> I don't think there is an explicit need to check ->cached_table.  If the remote
>>> processor has been started by the remoteproc core it is valid anyway.  And below
>>> kfree() is called invariably. 
>>
>> The condition is needed, the  rproc->cached_table is null if the firmware as
>> been preloaded and the Linux remote proc just attaches to it.
>> The cached is used only when Linux loads the firmware, as the resource table is
>> extracted from the elf file to parse resource before the load of the firmware.
> 
> I have taken another look at this and you are correct. The if() condition is
> needed because ->table_ptr is set only once when the platform driver is
> probed.  See further down...
> 
>>
>>>
>>> So that problem is fixed.  Let me know about your FW image and we'll pick it up
>>> from there.
>>
>> I use the following example available on the stm32mp1 image:
>> /usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
>> This exemple use the RPMsg and also blink a LED when while running.
>>
>> Don't hesitate if you need me to send it to you by mail.
>>
>> Thank,
>> Arnaud
>>
>>>
>>> Mathieu
>>>
>>>> +
>>>>  	ret = __rproc_detach(rproc);
>>>>  	if (ret) {
>>>>  		atomic_inc(&rproc->power);
>>>> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
>>>>
>>>>  	rproc_disable_iommu(rproc);
>>>>
>>>> +	/* Free the chached table memory that can has been allocated*/
>>>> +	kfree(rproc->cached_table);
>>>> +	rproc->cached_table = NULL;
>>>>  	/*
>>>> -	 * Set the remote processor's table pointer to NULL.  Since mapping
>>>> -	 * of the resource table to a virtual address is done in the platform
>>>> -	 * driver, unmapping should also be done there.
>>>> +	 * Set the remote processor's table pointer to NULL. If mapping
>>>> +	 * of the resource table to a virtual address has been done in the
>>>> +	 * platform driver(attachment to an existing firmware),
>>>> +	 * unmapping should also be done there.
>>>>  	 */
>>>>  	rproc->table_ptr = NULL;
> 
> With the above in mind we can't to that, otherwise trying to re-attach with
> rproc_attach() won't work because ->table_ptr will be NULL.

Yes, or an alternative would be to call find_loaded_rsc_table on attach. I did
not test it but could make sense to call the ops instead of expecting that the
platform has set table_ptr.

> 
> I wasn't able to test that code path because I didn't have the FW that supported
> detaching.  Now that the feature is maturing it needs to be done.  
> 
>>>>  out:
>>>> -- 
>>>> 2.17.1
>>>>
>>>>
>>>>
diff mbox series

Patch

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 928b3f975798..f5adf05762e9 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1667,7 +1667,7 @@  static int rproc_stop(struct rproc *rproc, bool crashed)
 /*
  * __rproc_detach(): Does the opposite of rproc_attach()
  */
-static int __maybe_unused __rproc_detach(struct rproc *rproc)
+static int __rproc_detach(struct rproc *rproc)
 {
 	struct device *dev = &rproc->dev;
 	int ret;
@@ -1910,6 +1910,69 @@  void rproc_shutdown(struct rproc *rproc)
 }
 EXPORT_SYMBOL(rproc_shutdown);
 
+/**
+ * rproc_detach() - Detach the remote processor from the
+ * remoteproc core
+ *
+ * @rproc: the remote processor
+ *
+ * Detach a remote processor (previously attached to with rproc_actuate()).
+ *
+ * In case @rproc is still being used by an additional user(s), then
+ * this function will just decrement the power refcount and exit,
+ * without disconnecting the device.
+ *
+ * Function rproc_detach() calls __rproc_detach() in order to let a remote
+ * processor know that services provided by the application processor are
+ * no longer available.  From there it should be possible to remove the
+ * platform driver and even power cycle the application processor (if the HW
+ * supports it) without needing to switch off the remote processor.
+ */
+int rproc_detach(struct rproc *rproc)
+{
+	struct device *dev = &rproc->dev;
+	int ret;
+
+	ret = mutex_lock_interruptible(&rproc->lock);
+	if (ret) {
+		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
+		return ret;
+	}
+
+	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	/* if the remote proc is still needed, bail out */
+	if (!atomic_dec_and_test(&rproc->power)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	ret = __rproc_detach(rproc);
+	if (ret) {
+		atomic_inc(&rproc->power);
+		goto out;
+	}
+
+	/* clean up all acquired resources */
+	rproc_resource_cleanup(rproc);
+
+	rproc_disable_iommu(rproc);
+
+	/*
+	 * Set the remote processor's table pointer to NULL.  Since mapping
+	 * of the resource table to a virtual address is done in the platform
+	 * driver, unmapping should also be done there.
+	 */
+	rproc->table_ptr = NULL;
+out:
+	mutex_unlock(&rproc->lock);
+	return ret;
+}
+EXPORT_SYMBOL(rproc_detach);
+
 /**
  * rproc_get_by_phandle() - find a remote processor by phandle
  * @phandle: phandle to the rproc
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index da15b77583d3..329c1c071dcf 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -656,6 +656,7 @@  rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
 
 int rproc_boot(struct rproc *rproc);
 void rproc_shutdown(struct rproc *rproc);
+int rproc_detach(struct rproc *rproc);
 int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
 void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
 int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);