diff mbox series

[RFC,v3,2/6] swiotlb: Add restricted DMA pool

Message ID 20210106034124.30560-3-tientzu@chromium.org (mailing list archive)
State Superseded
Headers show
Series Restricted DMA | expand

Commit Message

Claire Chang Jan. 6, 2021, 3:41 a.m. UTC
Add the initialization function to create restricted DMA pools from
matching reserved-memory nodes in the device tree.

Signed-off-by: Claire Chang <tientzu@chromium.org>
---
 include/linux/device.h  |   4 ++
 include/linux/swiotlb.h |   7 +-
 kernel/dma/Kconfig      |   1 +
 kernel/dma/swiotlb.c    | 144 ++++++++++++++++++++++++++++++++++------
 4 files changed, 131 insertions(+), 25 deletions(-)

Comments

Greg Kroah-Hartman Jan. 6, 2021, 7:50 a.m. UTC | #1
On Wed, Jan 06, 2021 at 11:41:20AM +0800, Claire Chang wrote:
> Add the initialization function to create restricted DMA pools from
> matching reserved-memory nodes in the device tree.
> 
> Signed-off-by: Claire Chang <tientzu@chromium.org>
> ---
>  include/linux/device.h  |   4 ++
>  include/linux/swiotlb.h |   7 +-
>  kernel/dma/Kconfig      |   1 +
>  kernel/dma/swiotlb.c    | 144 ++++++++++++++++++++++++++++++++++------
>  4 files changed, 131 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 89bb8b84173e..ca6f71ec8871 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -413,6 +413,7 @@ struct dev_links_info {
>   * @dma_pools:	Dma pools (if dma'ble device).
>   * @dma_mem:	Internal for coherent mem override.
>   * @cma_area:	Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.

Why does this have to be added here?  Shouldn't the platform-specific
code handle it instead?

thanks,

greg k-h
Konrad Rzeszutek Wilk Jan. 6, 2021, 6:52 p.m. UTC | #2
Hello!

In this file:

> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index e4368159f88a..7fb2ac087d23 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
..

> +static const struct reserved_mem_ops rmem_swiotlb_ops = {
> +	.device_init	= rmem_swiotlb_device_init,
> +	.device_release	= rmem_swiotlb_device_release,
> +};
> +
> +static int __init rmem_swiotlb_setup(struct reserved_mem *rmem)
> +{
> +	unsigned long node = rmem->fdt_node;
> +
> +	if (of_get_flat_dt_prop(node, "reusable", NULL) ||
> +	    of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
> +	    of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
> +	    of_get_flat_dt_prop(node, "no-map", NULL))
> +		return -EINVAL;
> +
> +	rmem->ops = &rmem_swiotlb_ops;
> +	pr_info("Reserved memory: created device swiotlb memory pool at %pa, size %ld MiB\n",
> +		&rmem->base, (unsigned long)rmem->size / SZ_1M);
> +	return 0;
> +}
> +
> +RESERVEDMEM_OF_DECLARE(dma, "restricted-dma-pool", rmem_swiotlb_setup);

The code should be as much as possible arch-agnostic. That is why there
are multiple -swiotlb files scattered in arch directories that own the
architecture specific code.

Would it be possible to move the code there and perhaps have a ARM
specific front-end for this DMA restricted pool there? See for example
the xen-swiotlb code.

Cheers!

Konrad
Claire Chang Jan. 7, 2021, 5:39 p.m. UTC | #3
Hi Greg and Konrad,

This change is intended to be non-arch specific. Any arch that lacks DMA access
control and has devices not behind an IOMMU can make use of it. Could you share
why you think this should be arch specific?

Thanks!
Konrad Rzeszutek Wilk Jan. 7, 2021, 5:57 p.m. UTC | #4
On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
> Hi Greg and Konrad,
> 
> This change is intended to be non-arch specific. Any arch that lacks DMA access
> control and has devices not behind an IOMMU can make use of it. Could you share
> why you think this should be arch specific?

The idea behind non-arch specific code is it to be generic. The devicetree
is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
be in arch specific code.

> 
> Thanks!
Florian Fainelli Jan. 7, 2021, 6:09 p.m. UTC | #5
On 1/7/21 9:57 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
>> Hi Greg and Konrad,
>>
>> This change is intended to be non-arch specific. Any arch that lacks DMA access
>> control and has devices not behind an IOMMU can make use of it. Could you share
>> why you think this should be arch specific?
> 
> The idea behind non-arch specific code is it to be generic. The devicetree
> is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
> be in arch specific code.

In premise the same code could be used with an ACPI enabled system with
an appropriate service to identify the restricted DMA regions and unlock
them.

More than 1 architecture requiring this function (ARM and ARM64 are the
two I can think of needing this immediately) sort of calls for making
the code architecture agnostic since past 2, you need something that scales.

There is already code today under kernel/dma/contiguous.c that is only
activated on a CONFIG_OF=y && CONFIG_OF_RESERVED_MEM=y system, this is
no different.
Konrad Rzeszutek Wilk Jan. 7, 2021, 9:19 p.m. UTC | #6
On Thu, Jan 07, 2021 at 10:09:14AM -0800, Florian Fainelli wrote:
> On 1/7/21 9:57 AM, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
> >> Hi Greg and Konrad,
> >>
> >> This change is intended to be non-arch specific. Any arch that lacks DMA access
> >> control and has devices not behind an IOMMU can make use of it. Could you share
> >> why you think this should be arch specific?
> > 
> > The idea behind non-arch specific code is it to be generic. The devicetree
> > is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
> > be in arch specific code.
> 
> In premise the same code could be used with an ACPI enabled system with
> an appropriate service to identify the restricted DMA regions and unlock
> them.

Which this patchset is not.

> 
> More than 1 architecture requiring this function (ARM and ARM64 are the
> two I can think of needing this immediately) sort of calls for making
> the code architecture agnostic since past 2, you need something that scales.

I believe the use-case is for ARM64 at this moment.

> 
> There is already code today under kernel/dma/contiguous.c that is only
> activated on a CONFIG_OF=y && CONFIG_OF_RESERVED_MEM=y system, this is
> no different.
> -- 
> Florian
Florian Fainelli Jan. 12, 2021, 11:52 p.m. UTC | #7
On 1/7/21 1:19 PM, Konrad Rzeszutek Wilk wrote:
> On Thu, Jan 07, 2021 at 10:09:14AM -0800, Florian Fainelli wrote:
>> On 1/7/21 9:57 AM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
>>>> Hi Greg and Konrad,
>>>>
>>>> This change is intended to be non-arch specific. Any arch that lacks DMA access
>>>> control and has devices not behind an IOMMU can make use of it. Could you share
>>>> why you think this should be arch specific?
>>>
>>> The idea behind non-arch specific code is it to be generic. The devicetree
>>> is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
>>> be in arch specific code.
>>
>> In premise the same code could be used with an ACPI enabled system with
>> an appropriate service to identify the restricted DMA regions and unlock
>> them.
> 
> Which this patchset is not.

ACPI is not included, but the comment about Device Tree being specific
to PowerPC, SPARC and ARM is x86 is not quite correct. There is an
architecture specific part to obtaining where the Device Tree lives in
memory, but the implementation itself is architecture agnostic (with
some early SPARC/OpenFirmware shenanigans), and x86 does, or rather did
support Device Tree to a very small extent with the CE4100 platform.

Would you prefer that an swiotlb_of.c file be created instead or
something along those lines to better encapsulate where the OF specific
code lives?

> 
>>
>> More than 1 architecture requiring this function (ARM and ARM64 are the
>> two I can think of needing this immediately) sort of calls for making
>> the code architecture agnostic since past 2, you need something that scales.
> 
> I believe the use-case is for ARM64 at this moment.

For the platforms that Claire uses, certainly for the ones we use, ARM
and ARM64 are in scope.
Florian Fainelli Jan. 13, 2021, 12:03 a.m. UTC | #8
On 1/5/21 7:41 PM, Claire Chang wrote:
> Add the initialization function to create restricted DMA pools from
> matching reserved-memory nodes in the device tree.
> 
> Signed-off-by: Claire Chang <tientzu@chromium.org>
> ---
>  include/linux/device.h  |   4 ++
>  include/linux/swiotlb.h |   7 +-
>  kernel/dma/Kconfig      |   1 +
>  kernel/dma/swiotlb.c    | 144 ++++++++++++++++++++++++++++++++++------
>  4 files changed, 131 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 89bb8b84173e..ca6f71ec8871 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -413,6 +413,7 @@ struct dev_links_info {
>   * @dma_pools:	Dma pools (if dma'ble device).
>   * @dma_mem:	Internal for coherent mem override.
>   * @cma_area:	Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
>   * @archdata:	For arch-specific additions.
>   * @of_node:	Associated device tree node.
>   * @fwnode:	Associated device node supplied by platform firmware.
> @@ -515,6 +516,9 @@ struct device {
>  #ifdef CONFIG_DMA_CMA
>  	struct cma *cma_area;		/* contiguous memory area for dma
>  					   allocations */
> +#endif
> +#ifdef CONFIG_SWIOTLB
> +	struct io_tlb_mem	*dma_io_tlb_mem;
>  #endif
>  	/* arch specific additions */
>  	struct dev_archdata	archdata;
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index dd8eb57cbb8f..a1bbd7788885 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -76,12 +76,13 @@ extern enum swiotlb_force swiotlb_force;
>   *
>   * @start:	The start address of the swiotlb memory pool. Used to do a quick
>   *		range check to see if the memory was in fact allocated by this
> - *		API.
> + *		API. For restricted DMA pool, this is device tree adjustable.

Maybe write it as this is "firmware adjustable" such that when/if ACPI
needs something like this, the description does not need updating.

[snip]

> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> +				    struct device *dev)
> +{
> +	struct io_tlb_mem *mem = rmem->priv;
> +	int ret;
> +
> +	if (dev->dma_io_tlb_mem)
> +		return -EBUSY;
> +
> +	if (!mem) {
> +		mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> +		if (!mem)
> +			return -ENOMEM;
> +
> +		if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {

MEMREMAP_WB sounds appropriate as a default.
Documentation/devicetree/bindings/reserved-memory/ramoops.txt does
define an "unbuffered" property which in premise could be applied to the
generic reserved memory binding as well and that we may have to be
honoring here, if we were to make it more generic. Oh well, this does
not need to be addressed right now I guess.
Robin Murphy Jan. 13, 2021, 1:53 a.m. UTC | #9
On 2021-01-07 17:57, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
>> Hi Greg and Konrad,
>>
>> This change is intended to be non-arch specific. Any arch that lacks DMA access
>> control and has devices not behind an IOMMU can make use of it. Could you share
>> why you think this should be arch specific?
> 
> The idea behind non-arch specific code is it to be generic. The devicetree
> is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
> be in arch specific code.

Sorry, but that's an absurd argument. By the same token you'd equally 
have to claim that bits of, say, the Broadcom WiFi driver (not to 
mention dozens of others) should be split out into arch code, since not 
all platforms use the devicetree parts, nor the ACPI parts, nor the PCI 
parts...

There is nothing architecture-specific about using devicetree as a 
system description - AFAIK there *are* a handful of x86 platforms that 
use it, besides even more architectures than you've listed above. It has 
long been the policy that devicetree-related code for a particular 
subsystem should just live within that subsystem. Sometimes if there's 
enough of it it gets collected together into its own file - e.g. 
drivers/pci/of.c - otherwise it tends to just get #ifdef'ed - e.g. 
of_spi_parse_dt(), or the other DMA reserved-memory consumers that 
already exist as Florian points out.

Besides, there are far more platforms that enable CONFIG_OF than enable 
CONFIG_SWIOTLB, so by that metric the whole of the SWIOTLB code itself 
is even less "generic" than any DT parsing :P

Robin.
Christoph Hellwig Jan. 13, 2021, 11:51 a.m. UTC | #10
On Wed, Jan 06, 2021 at 08:50:03AM +0100, Greg KH wrote:
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -413,6 +413,7 @@ struct dev_links_info {
> >   * @dma_pools:	Dma pools (if dma'ble device).
> >   * @dma_mem:	Internal for coherent mem override.
> >   * @cma_area:	Contiguous memory area for dma allocations
> > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> 
> Why does this have to be added here?  Shouldn't the platform-specific
> code handle it instead?

The whole code added here is pretty generic.  What we need to eventually
do, though is to add a separate dma_device instead of adding more and more
bloat to struct device.
Greg Kroah-Hartman Jan. 13, 2021, 12:29 p.m. UTC | #11
On Wed, Jan 13, 2021 at 12:51:26PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 06, 2021 at 08:50:03AM +0100, Greg KH wrote:
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -413,6 +413,7 @@ struct dev_links_info {
> > >   * @dma_pools:	Dma pools (if dma'ble device).
> > >   * @dma_mem:	Internal for coherent mem override.
> > >   * @cma_area:	Contiguous memory area for dma allocations
> > > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> > 
> > Why does this have to be added here?  Shouldn't the platform-specific
> > code handle it instead?
> 
> The whole code added here is pretty generic.  What we need to eventually
> do, though is to add a separate dma_device instead of adding more and more
> bloat to struct device.

I have no objections for that happening!
Christoph Hellwig Jan. 13, 2021, 12:37 p.m. UTC | #12
On Wed, Jan 13, 2021 at 01:29:05PM +0100, Greg KH wrote:
> > > Why does this have to be added here?  Shouldn't the platform-specific
> > > code handle it instead?
> > 
> > The whole code added here is pretty generic.  What we need to eventually
> > do, though is to add a separate dma_device instead of adding more and more
> > bloat to struct device.
> 
> I have no objections for that happening!

I'm pretty sure you agreed to it before in fact.  Now someone just needs
to find the time to do this heavy lifting, where "someone" probably means
me.
Christoph Hellwig Jan. 13, 2021, 12:42 p.m. UTC | #13
> +#ifdef CONFIG_SWIOTLB
> +	struct io_tlb_mem	*dma_io_tlb_mem;
>  #endif

Please add a new config option for this code instead of always building
it when swiotlb is enabled.

> +static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
> +				   size_t size)

Can you split the refactoring in swiotlb.c into one or more prep
patches?

> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> +				    struct device *dev)
> +{
> +	struct io_tlb_mem *mem = rmem->priv;
> +	int ret;
> +
> +	if (dev->dma_io_tlb_mem)
> +		return -EBUSY;
> +
> +	if (!mem) {
> +		mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> +		if (!mem)
> +			return -ENOMEM;

What is the calling convention here that allows for a NULL and non-NULL
private data?
Nicolas Saenz Julienne Jan. 13, 2021, 1:59 p.m. UTC | #14
Hi All,

On Tue, 2021-01-12 at 16:03 -0800, Florian Fainelli wrote:
> On 1/5/21 7:41 PM, Claire Chang wrote:
> > Add the initialization function to create restricted DMA pools from
> > matching reserved-memory nodes in the device tree.
> > 
> > Signed-off-by: Claire Chang <tientzu@chromium.org>
> > ---
> >  include/linux/device.h  |   4 ++
> >  include/linux/swiotlb.h |   7 +-
> >  kernel/dma/Kconfig      |   1 +
> >  kernel/dma/swiotlb.c    | 144 ++++++++++++++++++++++++++++++++++------
> >  4 files changed, 131 insertions(+), 25 deletions(-)
> > 
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 89bb8b84173e..ca6f71ec8871 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -413,6 +413,7 @@ struct dev_links_info {
> >   * @dma_pools:	Dma pools (if dma'ble device).
> >   * @dma_mem:	Internal for coherent mem override.
> >   * @cma_area:	Contiguous memory area for dma allocations
> > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> >   * @archdata:	For arch-specific additions.
> >   * @of_node:	Associated device tree node.
> >   * @fwnode:	Associated device node supplied by platform firmware.
> > @@ -515,6 +516,9 @@ struct device {
> >  #ifdef CONFIG_DMA_CMA
> >  	struct cma *cma_area;		/* contiguous memory area for dma
> >  					   allocations */
> > +#endif
> > +#ifdef CONFIG_SWIOTLB
> > +	struct io_tlb_mem	*dma_io_tlb_mem;
> >  #endif
> >  	/* arch specific additions */
> >  	struct dev_archdata	archdata;
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index dd8eb57cbb8f..a1bbd7788885 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -76,12 +76,13 @@ extern enum swiotlb_force swiotlb_force;
> >   *
> >   * @start:	The start address of the swiotlb memory pool. Used to do a quick
> >   *		range check to see if the memory was in fact allocated by this
> > - *		API.
> > + *		API. For restricted DMA pool, this is device tree adjustable.
> 
> Maybe write it as this is "firmware adjustable" such that when/if ACPI
> needs something like this, the description does not need updating.
> 
> [snip]
> 
> > +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> > +				    struct device *dev)
> > +{
> > +	struct io_tlb_mem *mem = rmem->priv;
> > +	int ret;
> > +
> > +	if (dev->dma_io_tlb_mem)
> > +		return -EBUSY;
> > +
> > +	if (!mem) {
> > +		mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> > +		if (!mem)
> > +			return -ENOMEM;
> > +
> > +		if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {
> 
> MEMREMAP_WB sounds appropriate as a default.

As per the binding 'no-map' has to be disabled here. So AFAIU, this memory will
be part of the linear mapping. Is this really needed then?

> Documentation/devicetree/bindings/reserved-memory/ramoops.txt does
> define an "unbuffered" property which in premise could be applied to the
> generic reserved memory binding as well and that we may have to be
> honoring here, if we were to make it more generic. Oh well, this does
> not need to be addressed right now I guess.
Robin Murphy Jan. 13, 2021, 3:27 p.m. UTC | #15
On 2021-01-13 13:59, Nicolas Saenz Julienne wrote:
> Hi All,
> 
> On Tue, 2021-01-12 at 16:03 -0800, Florian Fainelli wrote:
>> On 1/5/21 7:41 PM, Claire Chang wrote:
>>> Add the initialization function to create restricted DMA pools from
>>> matching reserved-memory nodes in the device tree.
>>>
>>> Signed-off-by: Claire Chang <tientzu@chromium.org>
>>> ---
>>>   include/linux/device.h  |   4 ++
>>>   include/linux/swiotlb.h |   7 +-
>>>   kernel/dma/Kconfig      |   1 +
>>>   kernel/dma/swiotlb.c    | 144 ++++++++++++++++++++++++++++++++++------
>>>   4 files changed, 131 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/include/linux/device.h b/include/linux/device.h
>>> index 89bb8b84173e..ca6f71ec8871 100644
>>> --- a/include/linux/device.h
>>> +++ b/include/linux/device.h
>>> @@ -413,6 +413,7 @@ struct dev_links_info {
>>>    * @dma_pools:	Dma pools (if dma'ble device).
>>>    * @dma_mem:	Internal for coherent mem override.
>>>    * @cma_area:	Contiguous memory area for dma allocations
>>> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
>>>    * @archdata:	For arch-specific additions.
>>>    * @of_node:	Associated device tree node.
>>>    * @fwnode:	Associated device node supplied by platform firmware.
>>> @@ -515,6 +516,9 @@ struct device {
>>>   #ifdef CONFIG_DMA_CMA
>>>   	struct cma *cma_area;		/* contiguous memory area for dma
>>>   					   allocations */
>>> +#endif
>>> +#ifdef CONFIG_SWIOTLB
>>> +	struct io_tlb_mem	*dma_io_tlb_mem;
>>>   #endif
>>>   	/* arch specific additions */
>>>   	struct dev_archdata	archdata;
>>> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
>>> index dd8eb57cbb8f..a1bbd7788885 100644
>>> --- a/include/linux/swiotlb.h
>>> +++ b/include/linux/swiotlb.h
>>> @@ -76,12 +76,13 @@ extern enum swiotlb_force swiotlb_force;
>>>    *
>>>    * @start:	The start address of the swiotlb memory pool. Used to do a quick
>>>    *		range check to see if the memory was in fact allocated by this
>>> - *		API.
>>> + *		API. For restricted DMA pool, this is device tree adjustable.
>>
>> Maybe write it as this is "firmware adjustable" such that when/if ACPI
>> needs something like this, the description does not need updating.

TBH I really don't think this needs calling out at all. Even in the 
regular case, the details of exactly how and where the pool is allocated 
are beyond the scope of this code - architectures already have several 
ways to control that and make their own decisions.

>>
>> [snip]
>>
>>> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
>>> +				    struct device *dev)
>>> +{
>>> +	struct io_tlb_mem *mem = rmem->priv;
>>> +	int ret;
>>> +
>>> +	if (dev->dma_io_tlb_mem)
>>> +		return -EBUSY;
>>> +
>>> +	if (!mem) {
>>> +		mem = kzalloc(sizeof(*mem), GFP_KERNEL);
>>> +		if (!mem)
>>> +			return -ENOMEM;
>>> +
>>> +		if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {
>>
>> MEMREMAP_WB sounds appropriate as a default.
> 
> As per the binding 'no-map' has to be disabled here. So AFAIU, this memory will
> be part of the linear mapping. Is this really needed then?

More than that, I'd assume that we *have* to use the linear/direct map 
address rather than anything that has any possibility of being a vmalloc 
remap, otherwise we can no longer safely rely on 
phys_to_dma/dma_to_phys, no?

That said, given that we're not actually using the returned address, I'm 
not entirely sure what the point of this call is anyway. If we can 
assume it's always going to return the linear map address via 
try_ram_remap() then we can equally just go ahead and use the linear map 
address straight away. I don't really see how we could ever hit the 
"is_ram == REGION_MIXED" case in memremap() that would return NULL, if 
we passed the memblock check earlier in __reserved_mem_alloc_size() such 
that this rmem node ever got to be initialised at all.

Robin.

>> Documentation/devicetree/bindings/reserved-memory/ramoops.txt does
>> define an "unbuffered" property which in premise could be applied to the
>> generic reserved memory binding as well and that we may have to be
>> honoring here, if we were to make it more generic. Oh well, this does
>> not need to be addressed right now I guess.
> 
> 
>
Florian Fainelli Jan. 13, 2021, 5:43 p.m. UTC | #16
On 1/13/21 7:27 AM, Robin Murphy wrote:
> On 2021-01-13 13:59, Nicolas Saenz Julienne wrote:
>> Hi All,
>>
>> On Tue, 2021-01-12 at 16:03 -0800, Florian Fainelli wrote:
>>> On 1/5/21 7:41 PM, Claire Chang wrote:
>>>> Add the initialization function to create restricted DMA pools from
>>>> matching reserved-memory nodes in the device tree.
>>>>
>>>> Signed-off-by: Claire Chang <tientzu@chromium.org>
>>>> ---
>>>>   include/linux/device.h  |   4 ++
>>>>   include/linux/swiotlb.h |   7 +-
>>>>   kernel/dma/Kconfig      |   1 +
>>>>   kernel/dma/swiotlb.c    | 144
>>>> ++++++++++++++++++++++++++++++++++------
>>>>   4 files changed, 131 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/include/linux/device.h b/include/linux/device.h
>>>> index 89bb8b84173e..ca6f71ec8871 100644
>>>> --- a/include/linux/device.h
>>>> +++ b/include/linux/device.h
>>>> @@ -413,6 +413,7 @@ struct dev_links_info {
>>>>    * @dma_pools:    Dma pools (if dma'ble device).
>>>>    * @dma_mem:    Internal for coherent mem override.
>>>>    * @cma_area:    Contiguous memory area for dma allocations
>>>> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
>>>>    * @archdata:    For arch-specific additions.
>>>>    * @of_node:    Associated device tree node.
>>>>    * @fwnode:    Associated device node supplied by platform firmware.
>>>> @@ -515,6 +516,9 @@ struct device {
>>>>   #ifdef CONFIG_DMA_CMA
>>>>       struct cma *cma_area;        /* contiguous memory area for dma
>>>>                          allocations */
>>>> +#endif
>>>> +#ifdef CONFIG_SWIOTLB
>>>> +    struct io_tlb_mem    *dma_io_tlb_mem;
>>>>   #endif
>>>>       /* arch specific additions */
>>>>       struct dev_archdata    archdata;
>>>> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
>>>> index dd8eb57cbb8f..a1bbd7788885 100644
>>>> --- a/include/linux/swiotlb.h
>>>> +++ b/include/linux/swiotlb.h
>>>> @@ -76,12 +76,13 @@ extern enum swiotlb_force swiotlb_force;
>>>>    *
>>>>    * @start:    The start address of the swiotlb memory pool. Used
>>>> to do a quick
>>>>    *        range check to see if the memory was in fact allocated
>>>> by this
>>>> - *        API.
>>>> + *        API. For restricted DMA pool, this is device tree
>>>> adjustable.
>>>
>>> Maybe write it as this is "firmware adjustable" such that when/if ACPI
>>> needs something like this, the description does not need updating.
> 
> TBH I really don't think this needs calling out at all. Even in the
> regular case, the details of exactly how and where the pool is allocated
> are beyond the scope of this code - architectures already have several
> ways to control that and make their own decisions.
> 
>>>
>>> [snip]
>>>
>>>> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
>>>> +                    struct device *dev)
>>>> +{
>>>> +    struct io_tlb_mem *mem = rmem->priv;
>>>> +    int ret;
>>>> +
>>>> +    if (dev->dma_io_tlb_mem)
>>>> +        return -EBUSY;
>>>> +
>>>> +    if (!mem) {
>>>> +        mem = kzalloc(sizeof(*mem), GFP_KERNEL);
>>>> +        if (!mem)
>>>> +            return -ENOMEM;
>>>> +
>>>> +        if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {
>>>
>>> MEMREMAP_WB sounds appropriate as a default.
>>
>> As per the binding 'no-map' has to be disabled here. So AFAIU, this
>> memory will
>> be part of the linear mapping. Is this really needed then?
> 
> More than that, I'd assume that we *have* to use the linear/direct map
> address rather than anything that has any possibility of being a vmalloc
> remap, otherwise we can no longer safely rely on
> phys_to_dma/dma_to_phys, no?

I believe you are right, which means that if we want to make use of the
restricted DMA pool on a 32-bit architecture (and we do, at least, I do)
we should probably add some error checking/warning to ensure the
restricted DMA pool falls within the linear map.
Robin Murphy Jan. 13, 2021, 6:03 p.m. UTC | #17
On 2021-01-13 17:43, Florian Fainelli wrote:
> On 1/13/21 7:27 AM, Robin Murphy wrote:
>> On 2021-01-13 13:59, Nicolas Saenz Julienne wrote:
>>> Hi All,
>>>
>>> On Tue, 2021-01-12 at 16:03 -0800, Florian Fainelli wrote:
>>>> On 1/5/21 7:41 PM, Claire Chang wrote:
>>>>> Add the initialization function to create restricted DMA pools from
>>>>> matching reserved-memory nodes in the device tree.
>>>>>
>>>>> Signed-off-by: Claire Chang <tientzu@chromium.org>
>>>>> ---
>>>>>    include/linux/device.h  |   4 ++
>>>>>    include/linux/swiotlb.h |   7 +-
>>>>>    kernel/dma/Kconfig      |   1 +
>>>>>    kernel/dma/swiotlb.c    | 144
>>>>> ++++++++++++++++++++++++++++++++++------
>>>>>    4 files changed, 131 insertions(+), 25 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/device.h b/include/linux/device.h
>>>>> index 89bb8b84173e..ca6f71ec8871 100644
>>>>> --- a/include/linux/device.h
>>>>> +++ b/include/linux/device.h
>>>>> @@ -413,6 +413,7 @@ struct dev_links_info {
>>>>>     * @dma_pools:    Dma pools (if dma'ble device).
>>>>>     * @dma_mem:    Internal for coherent mem override.
>>>>>     * @cma_area:    Contiguous memory area for dma allocations
>>>>> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
>>>>>     * @archdata:    For arch-specific additions.
>>>>>     * @of_node:    Associated device tree node.
>>>>>     * @fwnode:    Associated device node supplied by platform firmware.
>>>>> @@ -515,6 +516,9 @@ struct device {
>>>>>    #ifdef CONFIG_DMA_CMA
>>>>>        struct cma *cma_area;        /* contiguous memory area for dma
>>>>>                           allocations */
>>>>> +#endif
>>>>> +#ifdef CONFIG_SWIOTLB
>>>>> +    struct io_tlb_mem    *dma_io_tlb_mem;
>>>>>    #endif
>>>>>        /* arch specific additions */
>>>>>        struct dev_archdata    archdata;
>>>>> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
>>>>> index dd8eb57cbb8f..a1bbd7788885 100644
>>>>> --- a/include/linux/swiotlb.h
>>>>> +++ b/include/linux/swiotlb.h
>>>>> @@ -76,12 +76,13 @@ extern enum swiotlb_force swiotlb_force;
>>>>>     *
>>>>>     * @start:    The start address of the swiotlb memory pool. Used
>>>>> to do a quick
>>>>>     *        range check to see if the memory was in fact allocated
>>>>> by this
>>>>> - *        API.
>>>>> + *        API. For restricted DMA pool, this is device tree
>>>>> adjustable.
>>>>
>>>> Maybe write it as this is "firmware adjustable" such that when/if ACPI
>>>> needs something like this, the description does not need updating.
>>
>> TBH I really don't think this needs calling out at all. Even in the
>> regular case, the details of exactly how and where the pool is allocated
>> are beyond the scope of this code - architectures already have several
>> ways to control that and make their own decisions.
>>
>>>>
>>>> [snip]
>>>>
>>>>> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
>>>>> +                    struct device *dev)
>>>>> +{
>>>>> +    struct io_tlb_mem *mem = rmem->priv;
>>>>> +    int ret;
>>>>> +
>>>>> +    if (dev->dma_io_tlb_mem)
>>>>> +        return -EBUSY;
>>>>> +
>>>>> +    if (!mem) {
>>>>> +        mem = kzalloc(sizeof(*mem), GFP_KERNEL);
>>>>> +        if (!mem)
>>>>> +            return -ENOMEM;
>>>>> +
>>>>> +        if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {
>>>>
>>>> MEMREMAP_WB sounds appropriate as a default.
>>>
>>> As per the binding 'no-map' has to be disabled here. So AFAIU, this
>>> memory will
>>> be part of the linear mapping. Is this really needed then?
>>
>> More than that, I'd assume that we *have* to use the linear/direct map
>> address rather than anything that has any possibility of being a vmalloc
>> remap, otherwise we can no longer safely rely on
>> phys_to_dma/dma_to_phys, no?
> 
> I believe you are right, which means that if we want to make use of the
> restricted DMA pool on a 32-bit architecture (and we do, at least, I do)
> we should probably add some error checking/warning to ensure the
> restricted DMA pool falls within the linear map.

Oh, good point - I'm so used to 64-bit that I instinctively just blanked 
out the !PageHighMem() condition in try_ram_remap(). So maybe the 
original intent here *was* to effectively just implement that check, but 
if so it could still do with being a lot more explicit.

Cheers,
Robin.
Claire Chang Jan. 14, 2021, 9:06 a.m. UTC | #18
On Wed, Jan 13, 2021 at 8:42 PM Christoph Hellwig <hch@lst.de> wrote:
>
> > +#ifdef CONFIG_SWIOTLB
> > +     struct io_tlb_mem       *dma_io_tlb_mem;
> >  #endif
>
> Please add a new config option for this code instead of always building
> it when swiotlb is enabled.
>
> > +static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
> > +                                size_t size)
>
> Can you split the refactoring in swiotlb.c into one or more prep
> patches?
>
> > +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> > +                                 struct device *dev)
> > +{
> > +     struct io_tlb_mem *mem = rmem->priv;
> > +     int ret;
> > +
> > +     if (dev->dma_io_tlb_mem)
> > +             return -EBUSY;
> > +
> > +     if (!mem) {
> > +             mem = kzalloc(sizeof(*mem), GFP_KERNEL);
> > +             if (!mem)
> > +                     return -ENOMEM;
>
> What is the calling convention here that allows for a NULL and non-NULL
> private data?

Since multiple devices can share the same pool, the private data,
io_tlb_mem struct, will be initialized by the first device attached to
it.
This is similar to rmem_dma_device_init() in kernel/dma/coherent.c.
I'll add a comment for it in next version.
Jon Masters Jan. 25, 2021, 5:26 a.m. UTC | #19
On 1/7/21 1:09 PM, Florian Fainelli wrote:
> On 1/7/21 9:57 AM, Konrad Rzeszutek Wilk wrote:
>> On Fri, Jan 08, 2021 at 01:39:18AM +0800, Claire Chang wrote:
>>> Hi Greg and Konrad,
>>>
>>> This change is intended to be non-arch specific. Any arch that lacks DMA access
>>> control and has devices not behind an IOMMU can make use of it. Could you share
>>> why you think this should be arch specific?
>>
>> The idea behind non-arch specific code is it to be generic. The devicetree
>> is specific to PowerPC, Sparc, and ARM, and not to x86 - hence it should
>> be in arch specific code.
> 
> In premise the same code could be used with an ACPI enabled system with
> an appropriate service to identify the restricted DMA regions and unlock
> them.
> 
> More than 1 architecture requiring this function (ARM and ARM64 are the
> two I can think of needing this immediately) sort of calls for making
> the code architecture agnostic since past 2, you need something that scales.
> 
> There is already code today under kernel/dma/contiguous.c that is only
> activated on a CONFIG_OF=y && CONFIG_OF_RESERVED_MEM=y system, this is
> no different.

<unrelated to these patches, which are useful for the case cited>

Just a note for history/archives that this approach would not be 
appropriate on general purpose Arm systems, such as SystemReady-ES 
edge/non-server platforms seeking to run general purpose distros. I want 
to have that in the record before someone at Arm (or NVidia, or a bunch 
of others that come to mind who have memory firewalls) gets an idea.

If you're working at an Arm vendor and come looking at this later 
thinking "wow, what a great idea!", please fix your hardware to have a 
real IOMMU/SMMU and real PCIe. You'll be pointed at this reply.

Jon.
diff mbox series

Patch

diff --git a/include/linux/device.h b/include/linux/device.h
index 89bb8b84173e..ca6f71ec8871 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -413,6 +413,7 @@  struct dev_links_info {
  * @dma_pools:	Dma pools (if dma'ble device).
  * @dma_mem:	Internal for coherent mem override.
  * @cma_area:	Contiguous memory area for dma allocations
+ * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
  * @archdata:	For arch-specific additions.
  * @of_node:	Associated device tree node.
  * @fwnode:	Associated device node supplied by platform firmware.
@@ -515,6 +516,9 @@  struct device {
 #ifdef CONFIG_DMA_CMA
 	struct cma *cma_area;		/* contiguous memory area for dma
 					   allocations */
+#endif
+#ifdef CONFIG_SWIOTLB
+	struct io_tlb_mem	*dma_io_tlb_mem;
 #endif
 	/* arch specific additions */
 	struct dev_archdata	archdata;
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index dd8eb57cbb8f..a1bbd7788885 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -76,12 +76,13 @@  extern enum swiotlb_force swiotlb_force;
  *
  * @start:	The start address of the swiotlb memory pool. Used to do a quick
  *		range check to see if the memory was in fact allocated by this
- *		API.
+ *		API. For restricted DMA pool, this is device tree adjustable.
  * @end:	The end address of the swiotlb memory pool. Used to do a quick
  *		range check to see if the memory was in fact allocated by this
- *		API.
+ *		API. For restricted DMA pool, this is device tree adjustable.
  * @nslabs:	The number of IO TLB blocks (in groups of 64) between @start and
- *		@end. This is command line adjustable via setup_io_tlb_npages.
+ *		@end. For default swiotlb, this is command line adjustable via
+ *		setup_io_tlb_npages.
  * @used:	The number of used IO TLB block.
  * @list:	The free list describing the number of free entries available
  *		from each index.
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 479fc145acfc..131a0a66781b 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -82,6 +82,7 @@  config ARCH_HAS_FORCE_DMA_UNENCRYPTED
 config SWIOTLB
 	bool
 	select NEED_DMA_MAP_STATE
+	select OF_EARLY_FLATTREE
 
 #
 # Should be selected if we can mmap non-coherent mappings to userspace.
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index e4368159f88a..7fb2ac087d23 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -36,6 +36,11 @@ 
 #include <linux/scatterlist.h>
 #include <linux/mem_encrypt.h>
 #include <linux/set_memory.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/slab.h>
 #ifdef CONFIG_DEBUG_FS
 #include <linux/debugfs.h>
 #endif
@@ -319,20 +324,21 @@  static void swiotlb_cleanup(void)
 	max_segment = 0;
 }
 
-int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
+				   size_t size)
 {
-	struct io_tlb_mem *mem = &io_tlb_default_mem;
-	unsigned long i, bytes;
+	unsigned long i;
+	void *vaddr = phys_to_virt(start);
 
-	bytes = nslabs << IO_TLB_SHIFT;
+	size = ALIGN(size, 1 << IO_TLB_SHIFT);
+	mem->nslabs = size >> IO_TLB_SHIFT;
+	mem->nslabs = ALIGN(mem->nslabs, IO_TLB_SEGSIZE);
 
-	mem->nslabs = nslabs;
-	mem->start = virt_to_phys(tlb);
-	mem->end = mem->start + bytes;
+	mem->start = start;
+	mem->end = mem->start + size;
 
-	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
-	memset(tlb, 0, bytes);
+	set_memory_decrypted((unsigned long)vaddr, size >> PAGE_SHIFT);
+	memset(vaddr, 0, size);
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
@@ -356,13 +362,6 @@  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 		mem->orig_addr[i] = INVALID_PHYS_ADDR;
 	}
 	mem->index = 0;
-	no_iotlb_memory = false;
-
-	swiotlb_print_info();
-
-	late_alloc = 1;
-
-	swiotlb_set_max_segment(mem->nslabs << IO_TLB_SHIFT);
 
 	return 0;
 
@@ -375,6 +374,27 @@  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	return -ENOMEM;
 }
 
+int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+{
+	struct io_tlb_mem *mem = &io_tlb_default_mem;
+	unsigned long bytes = nslabs << IO_TLB_SHIFT;
+	int ret;
+
+	ret = swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), bytes);
+	if (ret)
+		return ret;
+
+	no_iotlb_memory = false;
+
+	swiotlb_print_info();
+
+	late_alloc = 1;
+
+	swiotlb_set_max_segment(bytes);
+
+	return 0;
+}
+
 void __init swiotlb_exit(void)
 {
 	struct io_tlb_mem *mem = &io_tlb_default_mem;
@@ -703,16 +723,96 @@  bool is_swiotlb_active(void)
 
 #ifdef CONFIG_DEBUG_FS
 
-static int __init swiotlb_create_debugfs(void)
+static void swiotlb_create_debugfs(struct io_tlb_mem *mem, const char *name,
+				   struct dentry *node)
 {
-	struct io_tlb_mem *mem = &io_tlb_default_mem;
-
-	mem->debugfs = debugfs_create_dir("swiotlb", NULL);
+	mem->debugfs = debugfs_create_dir(name, node);
 	debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs);
 	debugfs_create_ulong("io_tlb_used", 0400, mem->debugfs, &mem->used);
+}
+
+static int __init swiotlb_create_default_debugfs(void)
+{
+	swiotlb_create_debugfs(&io_tlb_default_mem, "swiotlb", NULL);
+
 	return 0;
 }
 
-late_initcall(swiotlb_create_debugfs);
+late_initcall(swiotlb_create_default_debugfs);
 
 #endif
+
+static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
+				    struct device *dev)
+{
+	struct io_tlb_mem *mem = rmem->priv;
+	int ret;
+
+	if (dev->dma_io_tlb_mem)
+		return -EBUSY;
+
+	if (!mem) {
+		mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+		if (!mem)
+			return -ENOMEM;
+
+		if (!memremap(rmem->base, rmem->size, MEMREMAP_WB)) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+
+		ret = swiotlb_init_io_tlb_mem(mem, rmem->base, rmem->size);
+		if (ret)
+			goto cleanup;
+
+		rmem->priv = mem;
+	}
+
+#ifdef CONFIG_DEBUG_FS
+	swiotlb_create_debugfs(mem, dev_name(dev), io_tlb_default_mem.debugfs);
+#endif
+
+	dev->dma_io_tlb_mem = mem;
+
+	return 0;
+
+cleanup:
+	kfree(mem);
+
+	return ret;
+}
+
+static void rmem_swiotlb_device_release(struct reserved_mem *rmem,
+					struct device *dev)
+{
+	if (!dev)
+		return;
+
+#ifdef CONFIG_DEBUG_FS
+	debugfs_remove_recursive(dev->dma_io_tlb_mem->debugfs);
+#endif
+	dev->dma_io_tlb_mem = NULL;
+}
+
+static const struct reserved_mem_ops rmem_swiotlb_ops = {
+	.device_init	= rmem_swiotlb_device_init,
+	.device_release	= rmem_swiotlb_device_release,
+};
+
+static int __init rmem_swiotlb_setup(struct reserved_mem *rmem)
+{
+	unsigned long node = rmem->fdt_node;
+
+	if (of_get_flat_dt_prop(node, "reusable", NULL) ||
+	    of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
+	    of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
+	    of_get_flat_dt_prop(node, "no-map", NULL))
+		return -EINVAL;
+
+	rmem->ops = &rmem_swiotlb_ops;
+	pr_info("Reserved memory: created device swiotlb memory pool at %pa, size %ld MiB\n",
+		&rmem->base, (unsigned long)rmem->size / SZ_1M);
+	return 0;
+}
+
+RESERVEDMEM_OF_DECLARE(dma, "restricted-dma-pool", rmem_swiotlb_setup);