diff mbox

[v3,2/2] remoteproc: add support to handle internal memories

Message ID 1420838519-15669-3-git-send-email-s-anna@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Suman Anna Jan. 9, 2015, 9:21 p.m. UTC
A remote processor may need to load certain firmware sections into
internal memories (eg: RAM at L1 or L2 levels) for performance or
other reasons. Introduce a new resource type (RSC_INTMEM) and add
an associated handler function to handle such memories. The handler
creates a kernel mapping for the resource's 'pa' (physical address).

Note that no iommu mapping is performed for this resource, as the
resource is primarily used to represent physical internal memories.
If the internal memory region can only be accessed through an iommu,
a devmem resource entry should be used instead.

Signed-off-by: Robert Tivy <rtivy@ti.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
---
v3:
- leverage memcpy_toio and memset_io for loading into internal memory
- rproc_da_to_va takes an additional argument to allow this distinction

 drivers/remoteproc/remoteproc_core.c       | 89 +++++++++++++++++++++++++++++-
 drivers/remoteproc/remoteproc_elf_loader.c | 23 ++++++--
 drivers/remoteproc/remoteproc_internal.h   |  6 +-
 include/linux/remoteproc.h                 | 43 ++++++++++++++-
 4 files changed, 150 insertions(+), 11 deletions(-)

Comments

Ohad Ben Cohen Feb. 10, 2015, 10:10 a.m. UTC | #1
Hi Suman,

On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
> A remote processor may need to load certain firmware sections into
> internal memories (eg: RAM at L1 or L2 levels) for performance or
> other reasons. Introduce a new resource type (RSC_INTMEM) and add
> an associated handler function to handle such memories. The handler
> creates a kernel mapping for the resource's 'pa' (physical address).
...
> + * rproc_handle_intmem() - handle internal memory resource entry
> + * @rproc: rproc handle
> + * @rsc: the intmem resource entry
> + * @offset: offset of the resource data in resource table
> + * @avail: size of available data (for image validation)
> + *
> + * This function will handle firmware requests for mapping a memory region
> + * internal to a remote processor into kernel. It neither allocates any
> + * physical pages, nor performs any iommu mapping, as this resource entry
> + * is primarily used for representing physical internal memories. If the
> + * internal memory region can only be accessed through an iommu, please
> + * use a devmem resource entry.
> + *
> + * These resource entries should be grouped near the carveout entries in
> + * the firmware's resource table, as other firmware entries might request
> + * placing other data objects inside these memory regions (e.g. data/code
> + * segments, trace resource entries, ...).
> + */
> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
> +                              int offset, int avail)
> +{
...
> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);

Back in the days when we developed remoteproc there was a tremendous
effort to move away from ioremap when not strictly needed.

I'd be happy if someone intimate with the related hardware could ack
that in this specific case ioremap is indeed needed. No need to review
the entire patch, or anything remoteproc, just make sure that
generally ioremap is how we want to access this internal memory.

Tony or Kevin any chance you could take a look and ack?

If ioremap is indeed the way to go, I'd also expect that we wouldn't
have to use __force here, but that's probably a minor patch cleanup.

Thanks,
Ohad.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Feb. 11, 2015, 8:57 p.m. UTC | #2
* Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
> Hi Suman,
> 
> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
> > A remote processor may need to load certain firmware sections into
> > internal memories (eg: RAM at L1 or L2 levels) for performance or
> > other reasons. Introduce a new resource type (RSC_INTMEM) and add
> > an associated handler function to handle such memories. The handler
> > creates a kernel mapping for the resource's 'pa' (physical address).
> ...
> > + * rproc_handle_intmem() - handle internal memory resource entry
> > + * @rproc: rproc handle
> > + * @rsc: the intmem resource entry
> > + * @offset: offset of the resource data in resource table
> > + * @avail: size of available data (for image validation)
> > + *
> > + * This function will handle firmware requests for mapping a memory region
> > + * internal to a remote processor into kernel. It neither allocates any
> > + * physical pages, nor performs any iommu mapping, as this resource entry
> > + * is primarily used for representing physical internal memories. If the
> > + * internal memory region can only be accessed through an iommu, please
> > + * use a devmem resource entry.
> > + *
> > + * These resource entries should be grouped near the carveout entries in
> > + * the firmware's resource table, as other firmware entries might request
> > + * placing other data objects inside these memory regions (e.g. data/code
> > + * segments, trace resource entries, ...).
> > + */
> > +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
> > +                              int offset, int avail)
> > +{
> ...
> > +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
> 
> Back in the days when we developed remoteproc there was a tremendous
> effort to move away from ioremap when not strictly needed.

The use of ioremap in general is just fine for drivers as long
as they access a dedicated area to the specific device. Accessing
random registers and memory in the SoC is what I'm worried about.
 
> I'd be happy if someone intimate with the related hardware could ack
> that in this specific case ioremap is indeed needed. No need to review
> the entire patch, or anything remoteproc, just make sure that
> generally ioremap is how we want to access this internal memory.
> 
> Tony or Kevin any chance you could take a look and ack?
> 
> If ioremap is indeed the way to go, I'd also expect that we wouldn't
> have to use __force here, but that's probably a minor patch cleanup.

Hmm sounds like this memory should be dedicated to the accelerator?

In that case it should use memblock to reserve that area early so
the kernel won't be accessing it at all.

If it needs to be shared between the kernel and the accelerator,
then is the remoteproc or mailbox somehow needs to coordinating the
shared access to this memory.. I think those cases should be handled
separately and not with a single interface.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna Feb. 11, 2015, 10:28 p.m. UTC | #3
On 02/11/2015 02:57 PM, Tony Lindgren wrote:
> * Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
>> Hi Suman,
>>
>> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
>>> A remote processor may need to load certain firmware sections into
>>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
>>> an associated handler function to handle such memories. The handler
>>> creates a kernel mapping for the resource's 'pa' (physical address).
>> ...
>>> + * rproc_handle_intmem() - handle internal memory resource entry
>>> + * @rproc: rproc handle
>>> + * @rsc: the intmem resource entry
>>> + * @offset: offset of the resource data in resource table
>>> + * @avail: size of available data (for image validation)
>>> + *
>>> + * This function will handle firmware requests for mapping a memory region
>>> + * internal to a remote processor into kernel. It neither allocates any
>>> + * physical pages, nor performs any iommu mapping, as this resource entry
>>> + * is primarily used for representing physical internal memories. If the
>>> + * internal memory region can only be accessed through an iommu, please
>>> + * use a devmem resource entry.
>>> + *
>>> + * These resource entries should be grouped near the carveout entries in
>>> + * the firmware's resource table, as other firmware entries might request
>>> + * placing other data objects inside these memory regions (e.g. data/code
>>> + * segments, trace resource entries, ...).
>>> + */
>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>>> +                              int offset, int avail)
>>> +{
>> ...
>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>
>> Back in the days when we developed remoteproc there was a tremendous
>> effort to move away from ioremap when not strictly needed.
> 
> The use of ioremap in general is just fine for drivers as long
> as they access a dedicated area to the specific device. Accessing
> random registers and memory in the SoC is what I'm worried about.
>  
>> I'd be happy if someone intimate with the related hardware could ack
>> that in this specific case ioremap is indeed needed. No need to review
>> the entire patch, or anything remoteproc, just make sure that
>> generally ioremap is how we want to access this internal memory.
>>
>> Tony or Kevin any chance you could take a look and ack?
>>
>> If ioremap is indeed the way to go, I'd also expect that we wouldn't
>> have to use __force here, but that's probably a minor patch cleanup.
> 
> Hmm sounds like this memory should be dedicated to the accelerator?
> 
> In that case it should use memblock to reserve that area early so
> the kernel won't be accessing it at all.

The usage here is not really on regular memory, but on internal device
memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
For the regular shared memory for vrings and vring buffers, the
remoteproc core does rely on CMA pools.

regards
Suman
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Feb. 11, 2015, 10:48 p.m. UTC | #4
* Suman Anna <s-anna@ti.com> [150211 14:32]:
> On 02/11/2015 02:57 PM, Tony Lindgren wrote:
> > * Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
> >> Hi Suman,
> >>
> >> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
> >>> A remote processor may need to load certain firmware sections into
> >>> internal memories (eg: RAM at L1 or L2 levels) for performance or
> >>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
> >>> an associated handler function to handle such memories. The handler
> >>> creates a kernel mapping for the resource's 'pa' (physical address).
> >> ...
> >>> + * rproc_handle_intmem() - handle internal memory resource entry
> >>> + * @rproc: rproc handle
> >>> + * @rsc: the intmem resource entry
> >>> + * @offset: offset of the resource data in resource table
> >>> + * @avail: size of available data (for image validation)
> >>> + *
> >>> + * This function will handle firmware requests for mapping a memory region
> >>> + * internal to a remote processor into kernel. It neither allocates any
> >>> + * physical pages, nor performs any iommu mapping, as this resource entry
> >>> + * is primarily used for representing physical internal memories. If the
> >>> + * internal memory region can only be accessed through an iommu, please
> >>> + * use a devmem resource entry.
> >>> + *
> >>> + * These resource entries should be grouped near the carveout entries in
> >>> + * the firmware's resource table, as other firmware entries might request
> >>> + * placing other data objects inside these memory regions (e.g. data/code
> >>> + * segments, trace resource entries, ...).
> >>> + */
> >>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
> >>> +                              int offset, int avail)
> >>> +{
> >> ...
> >>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
> >>
> >> Back in the days when we developed remoteproc there was a tremendous
> >> effort to move away from ioremap when not strictly needed.
> > 
> > The use of ioremap in general is just fine for drivers as long
> > as they access a dedicated area to the specific device. Accessing
> > random registers and memory in the SoC is what I'm worried about.
> >  
> >> I'd be happy if someone intimate with the related hardware could ack
> >> that in this specific case ioremap is indeed needed. No need to review
> >> the entire patch, or anything remoteproc, just make sure that
> >> generally ioremap is how we want to access this internal memory.
> >>
> >> Tony or Kevin any chance you could take a look and ack?
> >>
> >> If ioremap is indeed the way to go, I'd also expect that we wouldn't
> >> have to use __force here, but that's probably a minor patch cleanup.
> > 
> > Hmm sounds like this memory should be dedicated to the accelerator?
> > 
> > In that case it should use memblock to reserve that area early so
> > the kernel won't be accessing it at all.
> 
> The usage here is not really on regular memory, but on internal device
> memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
> For the regular shared memory for vrings and vring buffers, the
> remoteproc core does rely on CMA pools.

OK sounds like Linux needs to access it initially to load the DSP boot
code to L2RAM to get the DSP booted.

Maybe it can be done with the API provided by drivers/misc/sram.c?

You could set up the L2RAM as compatible = "mmio-sram" and then
parse the optional phandle for that in the remoteproc code, then
allocate some memory from it to load the DSP boot code and free
it.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna Feb. 12, 2015, 12:01 a.m. UTC | #5
On 02/11/2015 04:48 PM, Tony Lindgren wrote:
> * Suman Anna <s-anna@ti.com> [150211 14:32]:
>> On 02/11/2015 02:57 PM, Tony Lindgren wrote:
>>> * Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
>>>> Hi Suman,
>>>>
>>>> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
>>>>> A remote processor may need to load certain firmware sections into
>>>>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>>>>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
>>>>> an associated handler function to handle such memories. The handler
>>>>> creates a kernel mapping for the resource's 'pa' (physical address).
>>>> ...
>>>>> + * rproc_handle_intmem() - handle internal memory resource entry
>>>>> + * @rproc: rproc handle
>>>>> + * @rsc: the intmem resource entry
>>>>> + * @offset: offset of the resource data in resource table
>>>>> + * @avail: size of available data (for image validation)
>>>>> + *
>>>>> + * This function will handle firmware requests for mapping a memory region
>>>>> + * internal to a remote processor into kernel. It neither allocates any
>>>>> + * physical pages, nor performs any iommu mapping, as this resource entry
>>>>> + * is primarily used for representing physical internal memories. If the
>>>>> + * internal memory region can only be accessed through an iommu, please
>>>>> + * use a devmem resource entry.
>>>>> + *
>>>>> + * These resource entries should be grouped near the carveout entries in
>>>>> + * the firmware's resource table, as other firmware entries might request
>>>>> + * placing other data objects inside these memory regions (e.g. data/code
>>>>> + * segments, trace resource entries, ...).
>>>>> + */
>>>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>>>>> +                              int offset, int avail)
>>>>> +{
>>>> ...
>>>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>>>
>>>> Back in the days when we developed remoteproc there was a tremendous
>>>> effort to move away from ioremap when not strictly needed.
>>>
>>> The use of ioremap in general is just fine for drivers as long
>>> as they access a dedicated area to the specific device. Accessing
>>> random registers and memory in the SoC is what I'm worried about.
>>>  
>>>> I'd be happy if someone intimate with the related hardware could ack
>>>> that in this specific case ioremap is indeed needed. No need to review
>>>> the entire patch, or anything remoteproc, just make sure that
>>>> generally ioremap is how we want to access this internal memory.
>>>>
>>>> Tony or Kevin any chance you could take a look and ack?
>>>>
>>>> If ioremap is indeed the way to go, I'd also expect that we wouldn't
>>>> have to use __force here, but that's probably a minor patch cleanup.
>>>
>>> Hmm sounds like this memory should be dedicated to the accelerator?
>>>
>>> In that case it should use memblock to reserve that area early so
>>> the kernel won't be accessing it at all.
>>
>> The usage here is not really on regular memory, but on internal device
>> memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
>> For the regular shared memory for vrings and vring buffers, the
>> remoteproc core does rely on CMA pools.
> 
> OK sounds like Linux needs to access it initially to load the DSP boot
> code to L2RAM to get the DSP booted.
> 
> Maybe it can be done with the API provided by drivers/misc/sram.c?
> 
> You could set up the L2RAM as compatible = "mmio-sram" and then
> parse the optional phandle for that in the remoteproc code, then
> allocate some memory from it to load the DSP boot code and free
> it.

Not quite the same usage, there are no implicit assumptions on managing
this memory. Isn't the SRAM driver better suited for allocating memory
using the gen_pool API. It is just regular code that is being placed
into RAM, and the linker file on the remoteproc side dictates which
portion we are using. So, the section can be anywhere based on the ELF
parsing. Further, the same RAM space can be partitioned into Cache
and/or RAM, which is usually controlled from internal processor
subsystem register programming.

regards
Suman
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Feb. 12, 2015, 12:18 a.m. UTC | #6
* Suman Anna <s-anna@ti.com> [150211 16:05]:
> On 02/11/2015 04:48 PM, Tony Lindgren wrote:
> > * Suman Anna <s-anna@ti.com> [150211 14:32]:
> >> On 02/11/2015 02:57 PM, Tony Lindgren wrote:
> >>> * Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
> >>>> Hi Suman,
> >>>>
> >>>> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
> >>>>> A remote processor may need to load certain firmware sections into
> >>>>> internal memories (eg: RAM at L1 or L2 levels) for performance or
> >>>>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
> >>>>> an associated handler function to handle such memories. The handler
> >>>>> creates a kernel mapping for the resource's 'pa' (physical address).
> >>>> ...
> >>>>> + * rproc_handle_intmem() - handle internal memory resource entry
> >>>>> + * @rproc: rproc handle
> >>>>> + * @rsc: the intmem resource entry
> >>>>> + * @offset: offset of the resource data in resource table
> >>>>> + * @avail: size of available data (for image validation)
> >>>>> + *
> >>>>> + * This function will handle firmware requests for mapping a memory region
> >>>>> + * internal to a remote processor into kernel. It neither allocates any
> >>>>> + * physical pages, nor performs any iommu mapping, as this resource entry
> >>>>> + * is primarily used for representing physical internal memories. If the
> >>>>> + * internal memory region can only be accessed through an iommu, please
> >>>>> + * use a devmem resource entry.
> >>>>> + *
> >>>>> + * These resource entries should be grouped near the carveout entries in
> >>>>> + * the firmware's resource table, as other firmware entries might request
> >>>>> + * placing other data objects inside these memory regions (e.g. data/code
> >>>>> + * segments, trace resource entries, ...).
> >>>>> + */
> >>>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
> >>>>> +                              int offset, int avail)
> >>>>> +{
> >>>> ...
> >>>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
> >>>>
> >>>> Back in the days when we developed remoteproc there was a tremendous
> >>>> effort to move away from ioremap when not strictly needed.
> >>>
> >>> The use of ioremap in general is just fine for drivers as long
> >>> as they access a dedicated area to the specific device. Accessing
> >>> random registers and memory in the SoC is what I'm worried about.
> >>>  
> >>>> I'd be happy if someone intimate with the related hardware could ack
> >>>> that in this specific case ioremap is indeed needed. No need to review
> >>>> the entire patch, or anything remoteproc, just make sure that
> >>>> generally ioremap is how we want to access this internal memory.
> >>>>
> >>>> Tony or Kevin any chance you could take a look and ack?
> >>>>
> >>>> If ioremap is indeed the way to go, I'd also expect that we wouldn't
> >>>> have to use __force here, but that's probably a minor patch cleanup.
> >>>
> >>> Hmm sounds like this memory should be dedicated to the accelerator?
> >>>
> >>> In that case it should use memblock to reserve that area early so
> >>> the kernel won't be accessing it at all.
> >>
> >> The usage here is not really on regular memory, but on internal device
> >> memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
> >> For the regular shared memory for vrings and vring buffers, the
> >> remoteproc core does rely on CMA pools.
> > 
> > OK sounds like Linux needs to access it initially to load the DSP boot
> > code to L2RAM to get the DSP booted.
> > 
> > Maybe it can be done with the API provided by drivers/misc/sram.c?
> > 
> > You could set up the L2RAM as compatible = "mmio-sram" and then
> > parse the optional phandle for that in the remoteproc code, then
> > allocate some memory from it to load the DSP boot code and free
> > it.
> 
> Not quite the same usage, there are no implicit assumptions on managing
> this memory. Isn't the SRAM driver better suited for allocating memory
> using the gen_pool API. It is just regular code that is being placed
> into RAM, and the linker file on the remoteproc side dictates which
> portion we are using. So, the section can be anywhere based on the ELF
> parsing. Further, the same RAM space can be partitioned into Cache
> and/or RAM, which is usually controlled from internal processor
> subsystem register programming.

It still sounds like you need an API like gen_pool to allocate and
load the DSP code though? So from that point of view it's best to
use some Linux generic API.

Just guessing, but the process here is probably something like
request_firmware, configure hardware, allocate memory area,
copy firmware to memory, unallocate memory, boot m3 :)

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna Feb. 12, 2015, 1:07 a.m. UTC | #7
On 02/11/2015 06:18 PM, Tony Lindgren wrote:
> * Suman Anna <s-anna@ti.com> [150211 16:05]:
>> On 02/11/2015 04:48 PM, Tony Lindgren wrote:
>>> * Suman Anna <s-anna@ti.com> [150211 14:32]:
>>>> On 02/11/2015 02:57 PM, Tony Lindgren wrote:
>>>>> * Ohad Ben-Cohen <ohad@wizery.com> [150210 02:14]:
>>>>>> Hi Suman,
>>>>>>
>>>>>> On Fri, Jan 9, 2015 at 11:21 PM, Suman Anna <s-anna@ti.com> wrote:
>>>>>>> A remote processor may need to load certain firmware sections into
>>>>>>> internal memories (eg: RAM at L1 or L2 levels) for performance or
>>>>>>> other reasons. Introduce a new resource type (RSC_INTMEM) and add
>>>>>>> an associated handler function to handle such memories. The handler
>>>>>>> creates a kernel mapping for the resource's 'pa' (physical address).
>>>>>> ...
>>>>>>> + * rproc_handle_intmem() - handle internal memory resource entry
>>>>>>> + * @rproc: rproc handle
>>>>>>> + * @rsc: the intmem resource entry
>>>>>>> + * @offset: offset of the resource data in resource table
>>>>>>> + * @avail: size of available data (for image validation)
>>>>>>> + *
>>>>>>> + * This function will handle firmware requests for mapping a memory region
>>>>>>> + * internal to a remote processor into kernel. It neither allocates any
>>>>>>> + * physical pages, nor performs any iommu mapping, as this resource entry
>>>>>>> + * is primarily used for representing physical internal memories. If the
>>>>>>> + * internal memory region can only be accessed through an iommu, please
>>>>>>> + * use a devmem resource entry.
>>>>>>> + *
>>>>>>> + * These resource entries should be grouped near the carveout entries in
>>>>>>> + * the firmware's resource table, as other firmware entries might request
>>>>>>> + * placing other data objects inside these memory regions (e.g. data/code
>>>>>>> + * segments, trace resource entries, ...).
>>>>>>> + */
>>>>>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>>>>>>> +                              int offset, int avail)
>>>>>>> +{
>>>>>> ...
>>>>>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>>>>>
>>>>>> Back in the days when we developed remoteproc there was a tremendous
>>>>>> effort to move away from ioremap when not strictly needed.
>>>>>
>>>>> The use of ioremap in general is just fine for drivers as long
>>>>> as they access a dedicated area to the specific device. Accessing
>>>>> random registers and memory in the SoC is what I'm worried about.
>>>>>  
>>>>>> I'd be happy if someone intimate with the related hardware could ack
>>>>>> that in this specific case ioremap is indeed needed. No need to review
>>>>>> the entire patch, or anything remoteproc, just make sure that
>>>>>> generally ioremap is how we want to access this internal memory.
>>>>>>
>>>>>> Tony or Kevin any chance you could take a look and ack?
>>>>>>
>>>>>> If ioremap is indeed the way to go, I'd also expect that we wouldn't
>>>>>> have to use __force here, but that's probably a minor patch cleanup.
>>>>>
>>>>> Hmm sounds like this memory should be dedicated to the accelerator?
>>>>>
>>>>> In that case it should use memblock to reserve that area early so
>>>>> the kernel won't be accessing it at all.
>>>>
>>>> The usage here is not really on regular memory, but on internal device
>>>> memory (eg: L2RAM within DSP which is accessible by MPU through L3 bus).
>>>> For the regular shared memory for vrings and vring buffers, the
>>>> remoteproc core does rely on CMA pools.
>>>
>>> OK sounds like Linux needs to access it initially to load the DSP boot
>>> code to L2RAM to get the DSP booted.
>>>
>>> Maybe it can be done with the API provided by drivers/misc/sram.c?
>>>
>>> You could set up the L2RAM as compatible = "mmio-sram" and then
>>> parse the optional phandle for that in the remoteproc code, then
>>> allocate some memory from it to load the DSP boot code and free
>>> it.
>>
>> Not quite the same usage, there are no implicit assumptions on managing
>> this memory. Isn't the SRAM driver better suited for allocating memory
>> using the gen_pool API. It is just regular code that is being placed
>> into RAM, and the linker file on the remoteproc side dictates which
>> portion we are using. So, the section can be anywhere based on the ELF
>> parsing. Further, the same RAM space can be partitioned into Cache
>> and/or RAM, which is usually controlled from internal processor
>> subsystem register programming.
> 
> It still sounds like you need an API like gen_pool to allocate and
> load the DSP code though? So from that point of view it's best to
> use some Linux generic API.
> 
> Just guessing, but the process here is probably something like
> request_firmware, configure hardware, allocate memory area,
> copy firmware to memory, unallocate memory, boot m3 :)

Yeah, atleast for the processors with MMUs, it's usually allocate
memory, program IOMMU, copy firmware, boot rproc. Memory is freed when
unloading the processor and loading a different firmware. For the cases
with internal memory, either I need an ioremap of the region for copying
the firmware sections, or as you said, allocate, copy and unallocate.
That almost always means, I have to allocate the entire region, since I
would need to usually copy the data to a specific location based on the
ELF pheader data. The sram driver also does an ioremap internally, so I
guess it can be done, and probably a bit more code for management within
the rproc core.

regards
Suman


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ohad Ben Cohen Feb. 12, 2015, 9:09 a.m. UTC | #8
On Wed, Feb 11, 2015 at 10:57 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>> > +                              int offset, int avail)
>> > +{
>> ...
>> > +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>
>> Back in the days when we developed remoteproc there was a tremendous
>> effort to move away from ioremap when not strictly needed.
>
> The use of ioremap in general is just fine for drivers as long
> as they access a dedicated area to the specific device. Accessing
> random registers and memory in the SoC is what I'm worried about.

Yes, the proposed interface essentially allows exactly this random
access, since the parameters to ioremap would be provided from the
user space (specifically from the resource table contained within the
firmware of the remote processor).

Thanks,
Ohad.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna Feb. 12, 2015, 8:54 p.m. UTC | #9
Hi Ohad,

On 02/12/2015 03:09 AM, Ohad Ben-Cohen wrote:
> On Wed, Feb 11, 2015 at 10:57 PM, Tony Lindgren <tony@atomide.com> wrote:
>>>> +static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
>>>> +                              int offset, int avail)
>>>> +{
>>> ...
>>>> +       va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
>>>
>>> Back in the days when we developed remoteproc there was a tremendous
>>> effort to move away from ioremap when not strictly needed.
>>
>> The use of ioremap in general is just fine for drivers as long
>> as they access a dedicated area to the specific device. Accessing
>> random registers and memory in the SoC is what I'm worried about.
> 
> Yes, the proposed interface essentially allows exactly this random
> access, since the parameters to ioremap would be provided from the
> user space (specifically from the resource table contained within the
> firmware of the remote processor).

My original motivation was that it would only need to be added on
firmwares requiring support for loading into internal memories,
otherwise, these are something left to be managed by the software
running on the remote processor completely, and MPU will not even touch
them.

So, let me know if this is a NAK. If so, we have two options - one to go
the sram node model where each of them have to be defined separately,
and have a specific property in the rproc nodes to be able to get the
gen_pool handles. The other one is simply to define these as <reg> and
use devm_ioremap_resource() (so use DT for defining the regions instead
of a resource table entry).

regards
Suman
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ohad Ben Cohen Feb. 13, 2015, 5:20 a.m. UTC | #10
On Thu, Feb 12, 2015 at 10:54 PM, Suman Anna <s-anna@ti.com> wrote:
> My original motivation was that it would only need to be added on
> firmwares requiring support for loading into internal memories,
> otherwise, these are something left to be managed by the software
> running on the remote processor completely, and MPU will not even touch
> them.

Sure. But even if you guys will use this interface correctly, this
patch essentially exposes ioremap to user space, which is something we
generally want to avoid.

> So, let me know if this is a NAK. If so, we have two options - one to go
> the sram node model where each of them have to be defined separately,
> and have a specific property in the rproc nodes to be able to get the
> gen_pool handles. The other one is simply to define these as <reg> and
> use devm_ioremap_resource() (so use DT for defining the regions instead
> of a resource table entry).

Any approach where these regions are defined explicitly really sounds
better. If you could look into these two alternatives that would be
great.

Thanks,
Ohad.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suman Anna Feb. 13, 2015, 4:13 p.m. UTC | #11
Ohad,

On 02/12/2015 11:20 PM, Ohad Ben-Cohen wrote:
> On Thu, Feb 12, 2015 at 10:54 PM, Suman Anna <s-anna@ti.com> wrote:
>> My original motivation was that it would only need to be added on
>> firmwares requiring support for loading into internal memories,
>> otherwise, these are something left to be managed by the software
>> running on the remote processor completely, and MPU will not even touch
>> them.
> 
> Sure. But even if you guys will use this interface correctly, this
> patch essentially exposes ioremap to user space, which is something we
> generally want to avoid.
> 
>> So, let me know if this is a NAK. If so, we have two options - one to go
>> the sram node model where each of them have to be defined separately,
>> and have a specific property in the rproc nodes to be able to get the
>> gen_pool handles. The other one is simply to define these as <reg> and
>> use devm_ioremap_resource() (so use DT for defining the regions instead
>> of a resource table entry).
> 
> Any approach where these regions are defined explicitly really sounds
> better. If you could look into these two alternatives that would be
> great.

OK, will do. Meanwhile, can you pick up Patch 1, that is independent of
this patch.

regards
Suman

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Feb. 13, 2015, 6:35 p.m. UTC | #12
* Suman Anna <s-anna@ti.com> [150213 08:17]:
> Ohad,
> 
> On 02/12/2015 11:20 PM, Ohad Ben-Cohen wrote:
> > On Thu, Feb 12, 2015 at 10:54 PM, Suman Anna <s-anna@ti.com> wrote:
> >> My original motivation was that it would only need to be added on
> >> firmwares requiring support for loading into internal memories,
> >> otherwise, these are something left to be managed by the software
> >> running on the remote processor completely, and MPU will not even touch
> >> them.
> > 
> > Sure. But even if you guys will use this interface correctly, this
> > patch essentially exposes ioremap to user space, which is something we
> > generally want to avoid.
> > 
> >> So, let me know if this is a NAK. If so, we have two options - one to go
> >> the sram node model where each of them have to be defined separately,
> >> and have a specific property in the rproc nodes to be able to get the
> >> gen_pool handles. The other one is simply to define these as <reg> and
> >> use devm_ioremap_resource() (so use DT for defining the regions instead
> >> of a resource table entry).
> > 
> > Any approach where these regions are defined explicitly really sounds
> > better. If you could look into these two alternatives that would be
> > great.
> 
> OK, will do. Meanwhile, can you pick up Patch 1, that is independent of
> this patch.

If the memory are is hardware specific, then it should be specified in
the dts file. If some further configuration depending on the firmware
version is needed, then you can parse that from the firmware and make
sure it's contained within the hardware specific memory area defined
in the dts file. I guess in some cases module options may be also
needed.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 11cdb119e4f3..e0ecc0f802c1 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -153,7 +153,7 @@  static void rproc_disable_iommu(struct rproc *rproc)
  * but only on kernel direct mapped RAM memory. Instead, we're just using
  * here the output of the DMA API, which should be more correct.
  */
-void *rproc_da_to_va(struct rproc *rproc, u64 da, int len)
+void *rproc_da_to_va(struct rproc *rproc, u64 da, int len, u32 *flags)
 {
 	struct rproc_mem_entry *carveout;
 	void *ptr = NULL;
@@ -170,6 +170,8 @@  void *rproc_da_to_va(struct rproc *rproc, u64 da, int len)
 			continue;
 
 		ptr = carveout->va + offset;
+		if (flags && carveout->priv)
+			*flags = RPROC_INTMEM;
 
 		break;
 	}
@@ -404,7 +406,7 @@  static int rproc_handle_trace(struct rproc *rproc, struct fw_rsc_trace *rsc,
 	}
 
 	/* what's the kernel address of this resource ? */
-	ptr = rproc_da_to_va(rproc, rsc->da, rsc->len);
+	ptr = rproc_da_to_va(rproc, rsc->da, rsc->len, NULL);
 	if (!ptr) {
 		dev_err(dev, "erroneous trace resource entry\n");
 		return -EINVAL;
@@ -664,6 +666,82 @@  free_carv:
 	return ret;
 }
 
+/**
+ * rproc_handle_intmem() - handle internal memory resource entry
+ * @rproc: rproc handle
+ * @rsc: the intmem resource entry
+ * @offset: offset of the resource data in resource table
+ * @avail: size of available data (for image validation)
+ *
+ * This function will handle firmware requests for mapping a memory region
+ * internal to a remote processor into kernel. It neither allocates any
+ * physical pages, nor performs any iommu mapping, as this resource entry
+ * is primarily used for representing physical internal memories. If the
+ * internal memory region can only be accessed through an iommu, please
+ * use a devmem resource entry.
+ *
+ * These resource entries should be grouped near the carveout entries in
+ * the firmware's resource table, as other firmware entries might request
+ * placing other data objects inside these memory regions (e.g. data/code
+ * segments, trace resource entries, ...).
+ */
+static int rproc_handle_intmem(struct rproc *rproc, struct fw_rsc_intmem *rsc,
+			       int offset, int avail)
+{
+	struct rproc_mem_entry *intmem;
+	struct device *dev = &rproc->dev;
+	void *va;
+	int ret;
+
+	if (sizeof(*rsc) > avail) {
+		dev_err(dev, "intmem rsc is truncated\n");
+		return -EINVAL;
+	}
+
+	if (rsc->version != 1) {
+		dev_err(dev, "intmem rsc version %d is not supported\n",
+			rsc->version);
+		return -EINVAL;
+	}
+
+	if (rsc->reserved) {
+		dev_err(dev, "intmem rsc has non zero reserved bytes\n");
+		return -EINVAL;
+	}
+
+	dev_dbg(dev, "intmem rsc: da 0x%x, pa 0x%x, len 0x%x\n",
+		rsc->da, rsc->pa, rsc->len);
+
+	intmem = kzalloc(sizeof(*intmem), GFP_KERNEL);
+	if (!intmem)
+		return -ENOMEM;
+
+	va = (__force void *)ioremap_nocache(rsc->pa, rsc->len);
+	if (!va) {
+		dev_err(dev, "ioremap_nocache err: %d\n", rsc->len);
+		ret = -ENOMEM;
+		goto free_intmem;
+	}
+
+	dev_dbg(dev, "intmem mapped pa 0x%x of len 0x%x into kernel va %p\n",
+		rsc->pa, rsc->len, va);
+
+	intmem->va = va;
+	intmem->len = rsc->len;
+	intmem->dma = rsc->pa;
+	intmem->da = rsc->da;
+	intmem->priv = (void *)RPROC_INTMEM;    /* prevents freeing */
+
+	/* reuse the rproc->carveouts list, so that loading is automatic */
+	list_add_tail(&intmem->node, &rproc->carveouts);
+
+	return 0;
+
+free_intmem:
+	kfree(intmem);
+	return ret;
+}
+
 static int rproc_count_vrings(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 			      int offset, int avail)
 {
@@ -681,6 +759,7 @@  static rproc_handle_resource_t rproc_loading_handlers[RSC_LAST] = {
 	[RSC_CARVEOUT] = (rproc_handle_resource_t)rproc_handle_carveout,
 	[RSC_DEVMEM] = (rproc_handle_resource_t)rproc_handle_devmem,
 	[RSC_TRACE] = (rproc_handle_resource_t)rproc_handle_trace,
+	[RSC_INTMEM] = (rproc_handle_resource_t)rproc_handle_intmem,
 	[RSC_VDEV] = NULL, /* VDEVs were handled upon registrarion */
 };
 
@@ -768,7 +847,11 @@  static void rproc_resource_cleanup(struct rproc *rproc)
 
 	/* clean up carveout allocations */
 	list_for_each_entry_safe(entry, tmp, &rproc->carveouts, node) {
-		dma_free_coherent(dev->parent, entry->len, entry->va, entry->dma);
+		if (!entry->priv)
+			dma_free_coherent(dev->parent, entry->len, entry->va,
+					  entry->dma);
+		else
+			iounmap((__force void __iomem *)entry->va);
 		list_del(&entry->node);
 		kfree(entry);
 	}
diff --git a/drivers/remoteproc/remoteproc_elf_loader.c b/drivers/remoteproc/remoteproc_elf_loader.c
index ce283a5b42a1..cdd7b622cee3 100644
--- a/drivers/remoteproc/remoteproc_elf_loader.c
+++ b/drivers/remoteproc/remoteproc_elf_loader.c
@@ -150,6 +150,7 @@  rproc_elf_load_segments(struct rproc *rproc, const struct firmware *fw)
 	struct elf32_phdr *phdr;
 	int i, ret = 0;
 	const u8 *elf_data = fw->data;
+	u32 flags = 0;
 
 	ehdr = (struct elf32_hdr *)elf_data;
 	phdr = (struct elf32_phdr *)(elf_data + ehdr->e_phoff);
@@ -183,7 +184,7 @@  rproc_elf_load_segments(struct rproc *rproc, const struct firmware *fw)
 		}
 
 		/* grab the kernel address for this device address */
-		ptr = rproc_da_to_va(rproc, da, memsz);
+		ptr = rproc_da_to_va(rproc, da, memsz, &flags);
 		if (!ptr) {
 			dev_err(dev, "bad phdr da 0x%x mem 0x%x\n", da, memsz);
 			ret = -EINVAL;
@@ -191,8 +192,13 @@  rproc_elf_load_segments(struct rproc *rproc, const struct firmware *fw)
 		}
 
 		/* put the segment where the remote processor expects it */
-		if (phdr->p_filesz)
-			memcpy(ptr, elf_data + phdr->p_offset, filesz);
+		if (phdr->p_filesz) {
+			if (flags & RPROC_INTMEM)
+				memcpy_toio((void __iomem *)ptr,
+					    elf_data + phdr->p_offset, filesz);
+			else
+				memcpy(ptr, elf_data + phdr->p_offset, filesz);
+		}
 
 		/*
 		 * Zero out remaining memory for this segment.
@@ -201,8 +207,13 @@  rproc_elf_load_segments(struct rproc *rproc, const struct firmware *fw)
 		 * did this for us. albeit harmless, we may consider removing
 		 * this.
 		 */
-		if (memsz > filesz)
-			memset(ptr + filesz, 0, memsz - filesz);
+		if (memsz > filesz) {
+			if (flags & RPROC_INTMEM)
+				memset_io((void __iomem *)ptr + filesz,
+					  0, memsz - filesz);
+			else
+				memset(ptr + filesz, 0, memsz - filesz);
+		}
 	}
 
 	return ret;
@@ -325,7 +336,7 @@  rproc_elf_find_loaded_rsc_table(struct rproc *rproc, const struct firmware *fw)
 	if (!shdr)
 		return NULL;
 
-	return rproc_da_to_va(rproc, shdr->sh_addr, shdr->sh_size);
+	return rproc_da_to_va(rproc, shdr->sh_addr, shdr->sh_size, NULL);
 }
 
 const struct rproc_fw_ops rproc_elf_fw_ops = {
diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h
index 70701a50ddfa..8af4a8188488 100644
--- a/drivers/remoteproc/remoteproc_internal.h
+++ b/drivers/remoteproc/remoteproc_internal.h
@@ -23,6 +23,10 @@ 
 #include <linux/irqreturn.h>
 #include <linux/firmware.h>
 
+enum rproc_mem_type {
+	RPROC_INTMEM = 1,
+};
+
 struct rproc;
 
 /**
@@ -65,7 +69,7 @@  void rproc_exit_debugfs(void);
 void rproc_free_vring(struct rproc_vring *rvring);
 int rproc_alloc_vring(struct rproc_vdev *rvdev, int i);
 
-void *rproc_da_to_va(struct rproc *rproc, u64 da, int len);
+void *rproc_da_to_va(struct rproc *rproc, u64 da, int len, u32 *flags);
 int rproc_trigger_recovery(struct rproc *rproc);
 
 static inline
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 78b8a9b9d40a..2a25ee8a34dd 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -100,6 +100,7 @@  struct fw_rsc_hdr {
  *		    the remote processor will be writing logs.
  * @RSC_VDEV:       declare support for a virtio device, and serve as its
  *		    virtio header.
+ * @RSC_INTMEM:     request to map into kernel an internal memory region.
  * @RSC_LAST:       just keep this one at the end
  *
  * For more details regarding a specific resource type, please see its
@@ -115,7 +116,8 @@  enum fw_resource_type {
 	RSC_DEVMEM	= 1,
 	RSC_TRACE	= 2,
 	RSC_VDEV	= 3,
-	RSC_LAST	= 4,
+	RSC_INTMEM	= 4,
+	RSC_LAST	= 5,
 };
 
 #define FW_RSC_ADDR_ANY (0xFFFFFFFFFFFFFFFF)
@@ -306,6 +308,45 @@  struct fw_rsc_vdev {
 } __packed;
 
 /**
+ * struct fw_rsc_intmem - internal memory publishing request
+ * @version: version for this resource type (must be one)
+ * @da: device address
+ * @pa: physical address
+ * @len: length (in bytes)
+ * @reserved: reserved (must be zero)
+ * @name: human-readable name of the region being published
+ *
+ * This resource entry allows a remote processor to publish an internal
+ * memory region to the host. This resource type allows a remote processor
+ * to publish the whole or just a portion of certain internal memories,
+ * while it owns and manages any unpublished portion (eg: a shared L1
+ * memory that can be split configured as RAM and/or cache). This is
+ * primarily provided to allow a host to load code/data into internal
+ * memories, the memory for which is neither allocated nor required to
+ * be mapped into an iommu.
+ *
+ * @da should specify the required address as accessible by the device
+ * without going through an iommu, @pa should specify the physical address
+ * for the region as seen on the bus, @len should specify the size of the
+ * memory region. As always, @name may (optionally) contain a human readable
+ * name of this mapping (mainly for debugging purposes). The @version field
+ * is added for future scalability, and should be 1 for now.
+ *
+ * Note: at this point we just "trust" these intmem entries to contain valid
+ * physical bus addresses. these are not currently intended to be managed
+ * as host-controlled heaps, as it is much better to do that from the remote
+ * processor side.
+ */
+struct fw_rsc_intmem {
+	u32 version;
+	u32 da;
+	u32 pa;
+	u32 len;
+	u32 reserved;
+	u8 name[32];
+} __packed;
+
+/**
  * struct rproc_mem_entry - memory entry descriptor
  * @va:	virtual address
  * @dma: dma address