diff mbox series

RISC-V: Remove IORESOURCE_BUSY flag for no-map reserved memory

Message ID 20220511111851.559684-1-xianting.tian@linux.alibaba.com (mailing list archive)
State New, archived
Headers show
Series RISC-V: Remove IORESOURCE_BUSY flag for no-map reserved memory | expand

Commit Message

Xianting Tian May 11, 2022, 11:18 a.m. UTC
Commit 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree")
added IORESOURCE_BUSY flag for no-map reserved memory, this casued
devm_ioremap_resource() failed for the no-map reserved memory in subsequent
operations of related driver, so remove the IORESOURCE_BUSY flag.

The code to reproduce the issue,
dts:
	mem0: memory@a0000000 {
                reg = <0x0 0xa0000000 0 0x1000000>;
                no-map;
        };

	&test {
		status = "okay";
		memory-region = <&mem0>;
	};

code:
	np = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
	ret = of_address_to_resource(np, 0, &r);
	base = devm_ioremap_resource(&pdev->dev, &r);
	// base = -EBUSY

Fixes: 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree")
Reported-by: Huaming Jiang <jianghuaming.jhm@alibaba-inc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
CC: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
---
 arch/riscv/kernel/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nick Kossifidis May 12, 2022, 2:32 a.m. UTC | #1
Hello Xianting,

> ---
>  arch/riscv/kernel/setup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 834eb652a7b9..71f2966b1474 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -214,7 +214,7 @@ static void __init init_resources(void)
> 
>  		if (unlikely(memblock_is_nomap(region))) {
>  			res->name = "Reserved";
> -			res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +			res->flags = IORESOURCE_MEM;
>  		} else {
>  			res->name = "System RAM";
>  			res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;

The short story:

This makes sense but we should at least mark them as 
IORESOURCE_EXCLUSIVE, and also remove IORESOURCE_BUSY from line 192 
above 
(https://elixir.bootlin.com/linux/v5.18-rc6/source/arch/riscv/kernel/setup.c#L192).

The long story:

The spec says about no-map:

"
Indicates the operating system must not create a virtual mapping
of the region as part of its standard mapping of system memory,
nor permit speculative access to it under any circumstances other
than under the control of the device driver using the region.
"

It basically says that only the driver that uses the region should be 
able to create mappings for it and access it, and even that is not 
enough to prevent speculative access to the region by someone other than 
the driver. The thing is we can't implement this in a simple way because 
-to begin with- we don't have a way to pin those regions to specific 
devices/drivers, the memory-region binding doesn't say anything about 
that. When using devm_ioremap_resource() the first driver to claim the 
resource will mark it as busy so other drivers using the same api won't 
be able to use it, however the region can still be mapped in other ways 
(e.g. through ioremap directly) so using the resource tree to 
track/protect the regions marked with no-map isn't enough. They can even 
be accessed from userspace through /dev/mem unless we add them to the 
resource tree as IORESOURCE_MEM and enable/set 
CONFIG_IO_STRICT_DEVMEM/iomem=strict, but even then in case the 
corresponding ioresource isn't busy (e.g. hasn't been claimed by a 
driver yet through devm_ioremap_resource) we still have to mark them as 
IORESOURCE_EXCLUSIVE for iomem_is_exclusive() to do the trick 
(https://elixir.bootlin.com/linux/v5.18-rc6/source/kernel/resource.c#L1739) 
and prevent access through /dev/mem.

Finally the definition of no-map and the definition of MEMBLOCK_NOMAP 
are not the same, all MEMBLOCK_NOMAP says is "don't add to kernel direct 
mapping" so ioremap that uses vmalloc can still be used by everyone, in 
general it's a mess. It becomes worse, if you mark a reserved-memory 
region with no-map and that region overlaps with /memory, 
early_init_dt_reserve_memory_arch() will isolate it, mark it as 
MEMBLOCK_NOMAP and won't add it to memblock.reserved, if it doesn't 
overlap it will just ignore it and still not add it to 
memblock.reserved. So if we want to add a reserved-memory region that 
doesn't overlap with /memory (a valid scenario allowed by the binding), 
there is no way to mark it with no-map.

When I wrote that code I was looking for a way to prevent access to 
reserved regions through /dev/mem and by other drivers, regardless of 
being part of /memory or not, and since I couldn't mark them with no-map 
to track them because of early_init_dt_reserve_memory_arch(), I marked 
them as busy and then used them from the driver with ioremap directly. 
It was a temporary measure until I had a better approach (e.g. override 
ioremap / devmem_is_allowed like other archs do) but I never got the 
time to finish it, sorry for the mess !

Regards,
Nick
Xianting Tian May 12, 2022, 2:50 a.m. UTC | #2
在 2022/5/12 上午10:32, Nick Kossifidis 写道:
> Hello Xianting,
>
>> ---
>>  arch/riscv/kernel/setup.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>> index 834eb652a7b9..71f2966b1474 100644
>> --- a/arch/riscv/kernel/setup.c
>> +++ b/arch/riscv/kernel/setup.c
>> @@ -214,7 +214,7 @@ static void __init init_resources(void)
>>
>>          if (unlikely(memblock_is_nomap(region))) {
>>              res->name = "Reserved";
>> -            res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>> +            res->flags = IORESOURCE_MEM;
>>          } else {
>>              res->name = "System RAM";
>>              res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>
> The short story:
>
> This makes sense but we should at least mark them as 
> IORESOURCE_EXCLUSIVE, and also remove IORESOURCE_BUSY from line 192 
> above 
> (https://elixir.bootlin.com/linux/v5.18-rc6/source/arch/riscv/kernel/setup.c#L192).
Thanks Nick for the detailed reply, I will sent V2 patch soon according 
to your suggestion.
>
> The long story:
>
> The spec says about no-map:
>
> "
> Indicates the operating system must not create a virtual mapping
> of the region as part of its standard mapping of system memory,
> nor permit speculative access to it under any circumstances other
> than under the control of the device driver using the region.
> "
>
> It basically says that only the driver that uses the region should be 
> able to create mappings for it and access it, and even that is not 
> enough to prevent speculative access to the region by someone other 
> than the driver. The thing is we can't implement this in a simple way 
> because -to begin with- we don't have a way to pin those regions to 
> specific devices/drivers, the memory-region binding doesn't say 
> anything about that. When using devm_ioremap_resource() the first 
> driver to claim the resource will mark it as busy so other drivers 
> using the same api won't be able to use it, however the region can 
> still be mapped in other ways (e.g. through ioremap directly) so using 
> the resource tree to track/protect the regions marked with no-map 
> isn't enough. They can even be accessed from userspace through 
> /dev/mem unless we add them to the resource tree as IORESOURCE_MEM and 
> enable/set CONFIG_IO_STRICT_DEVMEM/iomem=strict, but even then in case 
> the corresponding ioresource isn't busy (e.g. hasn't been claimed by a 
> driver yet through devm_ioremap_resource) we still have to mark them 
> as IORESOURCE_EXCLUSIVE for iomem_is_exclusive() to do the trick 
> (https://elixir.bootlin.com/linux/v5.18-rc6/source/kernel/resource.c#L1739) 
> and prevent access through /dev/mem.
>
> Finally the definition of no-map and the definition of MEMBLOCK_NOMAP 
> are not the same, all MEMBLOCK_NOMAP says is "don't add to kernel 
> direct mapping" so ioremap that uses vmalloc can still be used by 
> everyone, in general it's a mess. It becomes worse, if you mark a 
> reserved-memory region with no-map and that region overlaps with 
> /memory, early_init_dt_reserve_memory_arch() will isolate it, mark it 
> as MEMBLOCK_NOMAP and won't add it to memblock.reserved, if it doesn't 
> overlap it will just ignore it and still not add it to 
> memblock.reserved. So if we want to add a reserved-memory region that 
> doesn't overlap with /memory (a valid scenario allowed by the 
> binding), there is no way to mark it with no-map.
>
> When I wrote that code I was looking for a way to prevent access to 
> reserved regions through /dev/mem and by other drivers, regardless of 
> being part of /memory or not, and since I couldn't mark them with 
> no-map to track them because of early_init_dt_reserve_memory_arch(), I 
> marked them as busy and then used them from the driver with ioremap 
> directly. It was a temporary measure until I had a better approach 
> (e.g. override ioremap / devmem_is_allowed like other archs do) but I 
> never got the time to finish it, sorry for the mess !
>
> Regards,
> Nick
diff mbox series

Patch

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 834eb652a7b9..71f2966b1474 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -214,7 +214,7 @@  static void __init init_resources(void)
 
 		if (unlikely(memblock_is_nomap(region))) {
 			res->name = "Reserved";
-			res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+			res->flags = IORESOURCE_MEM;
 		} else {
 			res->name = "System RAM";
 			res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;