[2/2] remoteproc: add support to handle internal memories

Message ID 1404836521-59637-3-git-send-email-s-anna@ti.com (mailing list archive)
State New, archived

Commit Message

Suman Anna July 8, 2014, 4:22 p.m. UTC
A remote processor may need to load certain firmware sections into
internal memories (eg: RAM at L1 or L2 levels) for performance or
other reasons. Introduce a new resource type (RSC_INTMEM) and add
an associated handler function to handle such memories. The handler
creates a kernel mapping for the resource's 'pa' (physical address).

Note that no iommu mapping is performed for this resource, as the
resource is primarily used to represent physical internal memories.
If the internal memory region can only be accessed through an iommu,
a devmem resource entry should be used instead.

Signed-off-by: Robert Tivy <rtivy@ti.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
---
 drivers/remoteproc/remoteproc_core.c | 85 +++++++++++++++++++++++++++++++++++-
 include/linux/remoteproc.h           | 43 +++++++++++++++++-
 2 files changed, 126 insertions(+), 2 deletions(-)
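
As an illustration of the commit message above, the following is a minimal userspace sketch of what an intmem entry and its basic sanity checks could look like. The struct layout mirrors fw_rsc_intmem as proposed in this patch; the example addresses, the region name, and the helper intmem_check_entry() are hypothetical and are not part of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the fw_rsc_intmem layout proposed in this patch. */
struct fw_rsc_intmem {
	uint32_t version;
	uint32_t da;
	uint32_t pa;
	uint32_t len;
	uint32_t reserved;
	uint8_t name[32];
} __attribute__((packed));

/* Hypothetical entry: da, pa, len and name are placeholder values. */
static const struct fw_rsc_intmem example_intmem = {
	.version = 1,
	.da = 0x00000000,
	.pa = 0x10000000,
	.len = 0x4000,
	.reserved = 0,
	.name = "example_intmem",
};

/*
 * Hypothetical helper repeating the checks made at the top of
 * rproc_handle_intmem(): the entry must fit in the remaining table
 * space, carry version 1, and have a zero reserved field.
 * Returns 0 on success, -EINVAL otherwise.
 */
static int intmem_check_entry(const struct fw_rsc_intmem *rsc, size_t avail)
{
	/* Check the size first, before touching any field of the entry. */
	if (sizeof(*rsc) > avail)
		return -EINVAL;
	if (rsc->version != 1)
		return -EINVAL;
	if (rsc->reserved)
		return -EINVAL;
	return 0;
}
```

In the patch itself, an entry that passes these checks is then mapped with ioremap_nocache() and added to the rproc->carveouts list.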

Comments

Ohad Ben Cohen July 29, 2014, 11 a.m. UTC | #1
Hi Suman,

On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <s-anna@ti.com> wrote:
> A remote processor may need to load certain firmware sections into
> internal memories (eg: RAM at L1 or L2 levels) for performance or
> other reasons.

Can you please provide as many details as you can about the scenario
you need this for? What hardware, which sections, which specific
memory, what's the use case, numbers, sizes, everything.

I'd like to better understand the use case please.

Thanks,
Ohad.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna July 29, 2014, 7:33 p.m. UTC | #2
Hi Ohad,

On 07/29/2014 06:00 AM, Ohad Ben-Cohen wrote:
> Hi Suman,
> 
> On Tue, Jul 8, 2014 at 7:22 PM, Suman Anna <s-anna@ti.com> wrote:
>> A remote processor may need to load certain firmware sections into
>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>> other reasons.
> 
> Can you please provide as much details as you can about the scenario
> you need this for? what hardware, what sections, which specific
> memory, what's the use case, numbers, sizes, everything.
> 
> I'd like to better understand the use case please.

We currently have two usecases. The primary usecase is the WkupM3
processor on TI Sitara AM335x/AM437x SoCs used for suspend/resume
management. This series is a dependency for the WkupM3 remoteproc driver
that Dave posted [1]. More details are in section 8.1.4.6 of the AM335x
TRM [2]. The program/data sections for this processor all _need_ to be
in the two internal memory RAMs (16K Unified RAM and 8K Data RAM), and
there is no MMU for this processor. The current RSC_CARVEOUT and
RSC_DEVMEM do not fit this type of memory (we neither allocate memory
through the DMA API nor need to map it into an MMU).

The second usecase is loading some code directly into the internal
memories of the DSP on existing OMAPs during the remoteproc loading
stage. These memories are again accessible to the processor without
going through the L2MMU, which is used to access the external RAM and
peripherals.

regards
Suman

[1] https://patchwork.kernel.org/patch/4529651/
[2] www.ti.com/lit.ug/spruh73k/spruh73k.pdf
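
For readers following the use case above, here is a rough userspace sketch of the device-address to host-pointer lookup that loading relies on once such regions sit on a memory-entry list. The 16K and 8K sizes come from the message above; the device addresses, the backing arrays, and the names mem_entry, example_regions and da_to_va() are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct rproc_mem_entry (only the fields used here). */
struct mem_entry {
	void *va;	/* host-side mapping of the region */
	uint32_t da;	/* address as seen by the remote processor */
	uint32_t len;	/* region size in bytes */
};

/* Plain arrays standing in for the two ioremap'ed internal RAMs. */
static unsigned char umem[16 * 1024];	/* 16K Unified RAM */
static unsigned char dmem[8 * 1024];	/* 8K Data RAM */

/* Hypothetical device addresses, chosen only for this example. */
static const struct mem_entry example_regions[] = {
	{ umem, 0x00000000, sizeof(umem) },
	{ dmem, 0x00080000, sizeof(dmem) },
};

/*
 * Translate the device address range [da, da + len) into a host pointer
 * by finding the entry that fully contains it. Returns NULL if no entry
 * covers the whole range.
 */
static void *da_to_va(const struct mem_entry *tbl, size_t n,
		      uint32_t da, uint32_t len)
{
	for (size_t i = 0; i < n; i++) {
		const struct mem_entry *e = &tbl[i];

		if (da < e->da)
			continue;

		uint32_t off = da - e->da;

		/* Compare against the remaining space to avoid overflow. */
		if (off > e->len || len > e->len - off)
			continue;

		return (unsigned char *)e->va + off;
	}
	return NULL;
}
```

A firmware segment whose device address falls in neither region gets NULL here and would have to be rejected by the loader.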
Ohad Ben Cohen Aug. 19, 2014, 9:10 a.m. UTC | #3
Hi Suman,

On Tue, Jul 29, 2014 at 10:33 PM, Suman Anna <s-anna@ti.com> wrote:
> We currently have two usecases. The primary usecase is the WkupM3
> processor on TI Sitara AM335x/AM437x SoCs used for suspend/resume
> management. This series is a dependency for the WkupM3 remoteproc driver
> that Dave posted [1]. More details are in section 8.1.4.6 of the AM335x
> TRM [2]. The program/data sections for this processor all _needs_ to be
> in the two internal memory RAMS (16K Unified RAM and 8K Data RAM), and
> there is no MMU for this processor. The current RSC_CARVEOUT and
> RSC_DEVMEM do not fit to describe this type of memory (we neither
> allocate memory through dma api nor do we need to map these into an MMU).

Thanks for the details.

Can we define a CMA block for these regions, and then just use
carveout resource entries instead of the ioremap approach?

This may require some changes in remoteproc which we'll need to think
about, but it sounds like it may fit the problem better instead of
forcing ioremap to provide a regular pointer (we're supposed to use
ioremaped memory only with memory primitives such as readl/writel/..).

Thanks,
Ohad.
Suman Anna Sept. 15, 2014, 7:39 p.m. UTC | #4
Hi Ohad,

> On Tue, Jul 29, 2014 at 10:33 PM, Suman Anna <s-anna@ti.com> wrote:
>> We currently have two usecases. The primary usecase is the WkupM3
>> processor on TI Sitara AM335x/AM437x SoCs used for suspend/resume
>> management. This series is a dependency for the WkupM3 remoteproc driver
>> that Dave posted [1]. More details are in section 8.1.4.6 of the AM335x
>> TRM [2]. The program/data sections for this processor all _needs_ to be
>> in the two internal memory RAMS (16K Unified RAM and 8K Data RAM), and
>> there is no MMU for this processor. The current RSC_CARVEOUT and
>> RSC_DEVMEM do not fit to describe this type of memory (we neither
>> allocate memory through dma api nor do we need to map these into an MMU).
> 
> Thanks for the details.
> 
> Can we define a CMA block for these regions, and then just use
> carveout resource entries instead of the ioremap approach?

I am looking at refreshing these patches, and found that I missed
responding to this message.

These processors need to use their internal RAM for loading, which is
not for generic usage by the kernel, so defining a CMA block for this
memory doesn't make sense.

> This may require some changes in remoteproc which we'll need to think
> about, but it sounds like it may fit the problem better instead of
> forcing ioremap to provide a regular pointer (we're supposed to use
> ioremaped memory only with memory primitives such as readl/writel/..).

Will it suffice to replace the memcpy() with memcpy_toio()?

regards
Suman
Ohad Ben Cohen Sept. 23, 2014, 2:16 p.m. UTC | #5
Hi Suman,

On Mon, Sep 15, 2014 at 10:39 PM, Suman Anna <s-anna@ti.com> wrote:
> These processors need to use their internal RAM for loading, which is
> not for generic usage by the kernel, so defining a CMA block for this
> memory doesn't make sense.

Ok - so just to make sure I understand, this is physical memory you
want to use, which belongs to the remote processor, and which isn't
mapped normally by the kernel?

> Will it suffice to replace the memcpy() with memcpy_toio()?

Yes, memcpy_toio should be fine (and then you don't need to cast the
cookie returned by ioremap).

Thanks,
Ohad.
Suman Anna Sept. 23, 2014, 4:42 p.m. UTC | #6
Hi Ohad,

On 09/23/2014 09:16 AM, Ohad Ben-Cohen wrote:
> Hi Suman,
> 
> On Mon, Sep 15, 2014 at 10:39 PM, Suman Anna <s-anna@ti.com> wrote:
>> These processors need to use their internal RAM for loading, which is
>> not for generic usage by the kernel, so defining a CMA block for this
>> memory doesn't make sense.
> 
> Ok - so just to make sure I understand, this is physical memory you
> want to use, which belongs to the remote processor, and which isn't
> mapped normally by the kernel?

Yes, this is not the regular DDR that is mapped into kernel normally,
but is a RAM internal to the remote processor subsystem. The MPU can
access it through a bus address.

> 
>> Will it suffice to replace the memcpy() with memcpy_toio()?
> 
> Yes, memcpy_toio should be fine (and then you don't need to cast the
> cookie returned by ioremap).

I have posted v2, but have not modified it for this. The memcpy portion is
actually in remoteproc_elf_loader.c, and it looks like I need to export
some flags from rproc_da_to_va if I were to differentiate this. Is that
ok with you?

regards
Suman
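
The memcpy() versus memcpy_toio() question discussed above could be sketched roughly as follows. This is a userspace illustration only: fake_memcpy_toio() and fake_memset_io() are stand-ins for the kernel's memcpy_toio() and memset_io(), and the is_iomem flag, struct mapped_region and load_segment() are hypothetical names, not something the posted patch or the ELF loader defines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for memcpy_toio(): byte-wise volatile stores. */
static void fake_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	while (n--)
		*d++ = *s++;
}

/* Userspace stand-in for memset_io(): byte-wise volatile stores. */
static void fake_memset_io(volatile void *dst, int c, size_t n)
{
	volatile uint8_t *d = dst;

	while (n--)
		*d++ = (uint8_t)c;
}

struct mapped_region {
	void *va;	/* host mapping of the region */
	size_t len;	/* size of the mapping in bytes */
	int is_iomem;	/* hypothetical flag: nonzero if va came from ioremap */
};

/*
 * Copy filesz bytes of a segment to offset off inside the region and
 * zero-fill the rest up to memsz, choosing the copy primitive from the
 * region type. Returns 0 on success, -1 if the segment does not fit.
 */
static int load_segment(const struct mapped_region *r, size_t off,
			const void *src, size_t filesz, size_t memsz)
{
	if (filesz > memsz || off > r->len || memsz > r->len - off)
		return -1;

	uint8_t *dst = (uint8_t *)r->va + off;

	if (r->is_iomem) {
		fake_memcpy_toio(dst, src, filesz);
		fake_memset_io(dst + filesz, 0, memsz - filesz);
	} else {
		memcpy(dst, src, filesz);
		memset(dst + filesz, 0, memsz - filesz);
	}
	return 0;
}
```

The point of the flag is that the loader only sees a pointer returned by the address lookup, so some extra information has to travel with it if the two kinds of memory are to be treated differently.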

Patch

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 11cdb11..e2bd869 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -664,6 +664,84 @@  free_carv:
 	return ret;
 }
 
+/**
+ * rproc_handle_intmem() - handle internal memory resource entry
+ * @rproc: rproc handle
+ * @rsc: the intmem resource entry
+ * @offset: offset of the resource data in resource table
+ * @avail: size of available data (for image validation)
+ *
+ * This function will handle firmware requests for mapping a memory region
+ * internal to a remote processor into kernel. It neither allocates any
+ * physical pages, nor performs any iommu mapping, as this resource entry
+ * is primarily used for representing physical internal memories. If the
+ * internal memory region can only be accessed through an iommu, please
+ * use a devmem resource entry.
+ *
+ * These resource entries should be grouped near the carveout entries in
+ * the firmware's resource table, as other firmware entries might request
+ * placing other data objects inside these memory regions (e.g. data/code
+ * segments, trace resource entries, ...).
+ */
+static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
+			       int offset, int avail)
+{
+	struct rproc_mem_entry *intmem;
+	struct device *dev = &rproc->dev;
+	void *va;
+	int ret;
+
+	if (sizeof(*rsc) > avail) {
+		dev_err(dev, "intmem rsc is truncated\n");
+		return -EINVAL;
+	}
+
+	if (rsc->version != 1) {
+		dev_err(dev, "intmem rsc version %d is not supported\n",
+			rsc->version);
+		return -EINVAL;
+	}
+
+	if (rsc->reserved) {
+		dev_err(dev, "intmem rsc has non zero reserved bytes\n");
+		return -EINVAL;
+	}
+
+	dev_dbg(dev, "intmem rsc: da 0x%x, pa 0x%x, len 0x%x\n",
+		rsc->da, rsc->pa, rsc->len);
+
+	intmem = kzalloc(sizeof(*intmem), GFP_KERNEL);
+	if (!intmem) {
+		dev_err(dev, "kzalloc intmem failed\n");
+		return -ENOMEM;
+	}
+
+	va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
+	if (!va) {
+		dev_err(dev, "ioremap_nocache err: %d\n", rsc->len);
+		ret = -ENOMEM;
+		goto free_intmem;
+	}
+
+	dev_dbg(dev, "intmem mapped pa 0x%x of len 0x%x into kernel va %p\n",
+		rsc->pa, rsc->len, va);
+
+	intmem->va = va;
+	intmem->len = rsc->len;
+	intmem->dma = rsc->pa;
+	intmem->da = rsc->da;
+	intmem->priv = (void *)1;    /* prevents freeing */
+
+	/* reuse the rproc->carveouts list, so that loading is automatic */
+	list_add_tail(&intmem->node, &rproc->carveouts);
+
+	return 0;
+
+free_intmem:
+	kfree(intmem);
+	return ret;
+}
+
 static int rproc_count_vrings(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 			      int offset, int avail)
 {
@@ -681,6 +759,7 @@  static rproc_handle_resource_t rproc_loading_handlers[RSC_LAST] = {
 	[RSC_CARVEOUT] = (rproc_handle_resource_t)rproc_handle_carveout,
 	[RSC_DEVMEM] = (rproc_handle_resource_t)rproc_handle_devmem,
 	[RSC_TRACE] = (rproc_handle_resource_t)rproc_handle_trace,
+	[RSC_INTMEM] = (rproc_handle_resource_t)rproc_handle_intmem,
 	[RSC_VDEV] = NULL, /* VDEVs were handled upon registrarion */
 };
 
@@ -768,7 +847,11 @@  static void rproc_resource_cleanup(struct rproc *rproc)
 
 	/* clean up carveout allocations */
 	list_for_each_entry_safe(entry, tmp, &rproc->carveouts, node) {
-		dma_free_coherent(dev->parent, entry->len, entry->va, entry->dma);
+		if (!entry->priv)
+			dma_free_coherent(dev->parent, entry->len, entry->va,
+					  entry->dma);
+		else
+			iounmap((__force void __iomem *)entry->va);
 		list_del(&entry->node);
 		kfree(entry);
 	}
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 78b8a9b..2a25ee8 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -100,6 +100,7 @@  struct fw_rsc_hdr {
  *		    the remote processor will be writing logs.
  * @RSC_VDEV:       declare support for a virtio device, and serve as its
  *		    virtio header.
+ * @RSC_INTMEM:     request to map into kernel an internal memory region.
  * @RSC_LAST:       just keep this one at the end
  *
  * For more details regarding a specific resource type, please see its
@@ -115,7 +116,8 @@  enum fw_resource_type {
 	RSC_DEVMEM	= 1,
 	RSC_TRACE	= 2,
 	RSC_VDEV	= 3,
-	RSC_LAST	= 4,
+	RSC_INTMEM	= 4,
+	RSC_LAST	= 5,
 };
 
 #define FW_RSC_ADDR_ANY (0xFFFFFFFFFFFFFFFF)
@@ -306,6 +308,45 @@  struct fw_rsc_vdev {
 } __packed;
 
 /**
+ * struct fw_rsc_intmem - internal memory publishing request
+ * @version: version for this resource type (must be one)
+ * @da: device address
+ * @pa: physical address
+ * @len: length (in bytes)
+ * @reserved: reserved (must be zero)
+ * @name: human-readable name of the region being published
+ *
+ * This resource entry allows a remote processor to publish an internal
+ * memory region to the host. This resource type allows a remote processor
+ * to publish the whole or just a portion of certain internal memories,
+ * while it owns and manages any unpublished portion (e.g. a shared L1
+ * memory that can be split and configured as RAM and/or cache). This is
+ * primarily provided to allow a host to load code/data into internal
+ * memories, the memory for which is neither allocated nor required to
+ * be mapped into an iommu.
+ *
+ * @da should specify the required address as accessible by the device
+ * without going through an iommu, @pa should specify the physical address
+ * for the region as seen on the bus, @len should specify the size of the
+ * memory region. As always, @name may (optionally) contain a human readable
+ * name of this mapping (mainly for debugging purposes). The @version field
+ * is added for future scalability, and should be 1 for now.
+ *
+ * Note: at this point we just "trust" these intmem entries to contain valid
+ * physical bus addresses. These are not currently intended to be managed
+ * as host-controlled heaps, as it is much better to do that from the remote
+ * processor side.
+ */
+struct fw_rsc_intmem {
+	u32 version;
+	u32 da;
+	u32 pa;
+	u32 len;
+	u32 reserved;
+	u8 name[32];
+} __packed;
+
+/**
  * struct rproc_mem_entry - memory entry descriptor
  * @va:	virtual address
  * @dma: dma address