
[00/46] Dynamic allocation of reserved_mem array.

Message ID 20240126235425.12233-1-quic_obabatun@quicinc.com
Series Dynamic allocation of reserved_mem array.

Message

Oreoluwa Babatunde Jan. 26, 2024, 11:53 p.m. UTC
The reserved_mem array is used to store data for the different
reserved memory regions defined in the DT of a device.  The array
stores information such as the region name, node, start address, and size
of the reserved memory regions.
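
(For reference, each entry in the array is a struct reserved_mem, which
at the time of this series looks roughly like the following, from
include/linux/of_reserved_mem.h:)

	struct reserved_mem {
		const char			*name;
		unsigned long			fdt_node;
		unsigned long			phandle;
		const struct reserved_mem_ops	*ops;
		phys_addr_t			base;
		phys_addr_t			size;
		void				*priv;
	};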

The array is currently statically allocated with a size of
MAX_RESERVED_REGIONS (64). This means that any system that specifies
more reserved memory regions than MAX_RESERVED_REGIONS will not have
enough space to store the information for all of its regions.

Therefore, this series retains the static reserved_mem array for early
boot, and introduces a dynamically allocated array, created with
memblock_alloc() and sized according to the number of reserved memory
regions specified in the DT.

Some architectures, such as arm64, require the page tables to be set up
before memblock-allocated memory is writable.  Therefore, on these
architectures the dynamic allocation of the reserved_mem array needs to
be done after the page tables have been set up. In most cases that will
be after paging_init().

Reserved memory regions can be divided into 2 groups (an example node
for each is sketched below):
i) Statically-placed reserved memory regions,
   i.e. regions defined in the DT using the "reg" property.
ii) Dynamically-placed reserved memory regions,
    i.e. regions specified in the DT using the "alloc-ranges"
    and "size" properties.
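
For illustration, a minimal /reserved-memory node with one region of
each kind (node names, addresses, and sizes below are made up):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* i) statically placed: fixed base address via "reg" */
		firmware@9d000000 {
			reg = <0x0 0x9d000000 0x0 0x800000>;
			no-map;
		};

		/* ii) dynamically placed: only a size and an allowed
		 * window; the kernel picks the base address at boot */
		dyn-pool {
			compatible = "shared-dma-pool";
			size = <0x0 0x400000>;
			alloc-ranges = <0x0 0x80000000 0x0 0x20000000>;
			reusable;
		};
	};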

It is possible to call memblock_reserve() and memblock_mark_nomap() on
the statically-placed reserved memory regions without saving them to
the reserved_mem array right away; that can wait until memory has been
allocated for the array using memblock, which will be after the page
tables have been set up.
For the dynamically-placed reserved memory regions, it is not possible
to wait to store their information because their starting addresses are
only assigned at run time, and hence they need to be stored somewhere
as soon as they are allocated.
Waiting until after the page tables have been set up to allocate memory
for the dynamically-placed regions is also not an option, because the
allocations would come from memory that has already been added to the
page tables, which is not acceptable for memory that is supposed to be
reserved and/or marked as nomap.

Therefore, this series splits up the processing of the reserved memory
regions into two stages, of which the first stage is carried out by
early_init_fdt_scan_reserved_mem() and the second is carried out by
fdt_init_reserved_mem().

early_init_fdt_scan_reserved_mem(), which is called before the page
tables are set up, is used to do the following (a rough sketch follows
the list):
1. Call memblock_reserve() and memblock_mark_nomap() on all the
   statically-placed reserved memory regions as needed.
2. Allocate memory from memblock for the dynamically-placed reserved
   memory regions and store them in the static array for reserved_mem.
   memblock_reserve() and memblock_mark_nomap() are also called as
   needed on all the memory allocated for the dynamically-placed
   regions.
3. Count the total number of reserved memory regions found in the DT.
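
A minimal sketch of this first stage, with hypothetical helpers
(for_each_reserved_node(), node_has_prop(), save_to_static_array())
standing in for the series' actual FDT-walking code:

	/* stage one: runs before the page tables are set up */
	void __init early_init_fdt_scan_reserved_mem(void)
	{
		for_each_reserved_node(node) {	/* /reserved-memory children */
			if (node_has_prop(node, "reg")) {
				/* 1. statically placed: reserve now,
				 *    record in the array later */
				memblock_reserve(base, size);
				if (node_has_prop(node, "no-map"))
					memblock_mark_nomap(base, size);
			} else if (node_has_prop(node, "size")) {
				/* 2. dynamically placed: the base only
				 *    exists once allocated, so allocate
				 *    and record it right away */
				base = memblock_phys_alloc_range(size, align,
						range_start, range_end);
				save_to_static_array(node, base, size);
			}
			reserved_mem_count++;	/* 3. total for later sizing */
		}
	}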

fdt_init_reserved_mem(), which should be called after the page tables
have been set up, is used to carry out the following (again, a sketch
follows the list):
1. Allocate memory for the reserved_mem array based on the number of
   reserved memory regions counted as mentioned above.
2. Copy all the information for the dynamically-placed reserved memory
   regions from the static array into the newly allocated memory for
   the reserved_mem array.
3. Add the information for the statically-placed reserved memory into
   the reserved_mem array.
4. Run the region-specific init functions for each of the reserved
   memory regions saved in the reserved_mem array.
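
A corresponding sketch of the second stage (again with illustrative
names; dyn_count stands for the number of dynamically-placed entries
already sitting in the static array):

	/* stage two: runs once memblock-allocated memory is writable */
	void __init fdt_init_reserved_mem(void)
	{
		/* 1. size the final array from the stage-one count */
		struct reserved_mem *rmem =
			memblock_alloc(reserved_mem_count * sizeof(*rmem),
				       sizeof(*rmem));

		/* 2. carry over the dynamically-placed entries already
		 *    recorded in the static array */
		memcpy(rmem, static_rmem, dyn_count * sizeof(*rmem));

		/* 3. re-scan the DT and append the statically-placed
		 *    regions behind them ... */
		/* 4. ... then run each region's region-specific init
		 *    function (e.g. the RESERVEDMEM_OF_DECLARE hooks) */
	}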

Once the above steps have been completed and the init process is done
running, the original statically allocated reserved_mem array of size
MAX_RESERVED_REGIONS (64) is automatically freed back to the buddy
allocator because it is no longer needed. This is done by marking the
array as an "__initdata" object in Patch 0018.

Note:

- Per architecture, this series is effectively only 10 patches. The
  code for each architecture is split up into separate patches to
  allow each architecture to be tested independently of changes from
  other architectures. Should this series be accepted, this should
  allow for each architecture change to be picked up independently as
  well.

  Patch 0001: Splits up the processing of the reserved memory regions
  between early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem().

  Patch 0002: Introduces a copy of early_init_fdt_scan_reserved_mem()
  which is used to separate it from fdt_init_reserved_mem() so that the
  two functions can be called independently of each other.

  Patch 0003 - Patch 0016: Duplicated change for each architecture to
  call early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem()
  at their appropriate locations. Here fdt_init_reserved_mem() is called
  either before or after the page tables have been set up, depending on
  the architecture's requirements.

  Patch 0017: Deletes the early_init_fdt_scan_reserved_mem() function
  since all architectures are now using the copy introduced in
  Patch 0002.

  Patch 0018: Dynamically allocate memory for the reserved_mem array
  based on the total number of reserved memory regions specified in the
  DT.

  Patch 0019 - Patch 0029: Duplicated change for each architecture to
  move the fdt_init_reserved_mem() function call below the
  unflatten_device_tree() function call. This is so that the unflattened
  devicetree APIs can be used to process the reserved memory regions.

  Patch 0030: Make code changes to start using the unflattened devicetree
  APIs to access the reserved memory regions defined in the DT.

  Patch 0031: Rename fdt_* functions to dt_* to reflect that the
  flattened devicetree (fdt) APIs have been replaced with the
  unflattened devicetree APIs.

  Patch 0032 - Patch 0045: Duplicated change for each architecture to
  switch from the use of fdt_init_reserved_mem() to
  dt_init_reserved_mem(), which is the same function except that the
  latter uses the unflattened devicetree APIs.

  Patch 0046: Delete the fdt_init_reserved_mem() function, as all
  architectures have switched to using dt_init_reserved_mem(), which was
  introduced in Patch 0031.

- The limitation of this approach is that there is still a limit of
  64 for the dynamically-placed reserved memory regions. But from my
  current analysis, these types of reserved memory regions are generally
  far fewer in number than the statically-placed reserved memory
  regions.

- I have looked through all the architectures and placed the call to
  memblock_alloc() for the reserved_mem array at points where I
  believe memblock-allocated memory is available to be written to.
  I currently only have access to an arm64 device, and this is where I
  am testing the functionality of this series. Hence, I will need help
  from architecture maintainers to test this series on the other
  architectures to ensure that the code is functioning properly there.

Previous patch revisions:
1. [RFC V1 Patchset]:
https://lore.kernel.org/all/20231019184825.9712-1-quic_obabatun@quicinc.com/

2. [RFC V2 Patchset]:
https://lore.kernel.org/all/20231204041339.9902-1-quic_obabatun@quicinc.com/
- Extend changes to all other relevant architectures.
- Add code to use the unflattened devicetree APIs to process the
  reserved memory regions.

Oreoluwa Babatunde (46):
  of: reserved_mem: Change the order that reserved_mem regions are
    stored
  of: reserved_mem: Introduce new early reserved memory scan function
  ARC: reserved_mem: Implement the new processing order for reserved
    memory
  ARM: reserved_mem: Implement the new processing order for reserved
    memory
  arm64: reserved_mem: Implement the new processing order for reserved
    memory
  csky: reserved_mem: Implement the new processing order for reserved
    memory
  Loongarch: reserved_mem: Implement the new processing order for
    reserved memory
  microblaze: reserved_mem: Implement the new processing order for
    reserved memory
  mips: reserved_mem: Implement the new processing order for reserved
    memory
  nios2: reserved_mem: Implement the new processing order for reserved
    memory
  openrisc: reserved_mem: Implement the new processing order for
    reserved memory
  powerpc: reserved_mem: Implement the new processing order for reserved
    memory
  riscv: reserved_mem: Implement the new processing order for reserved
    memory
  sh: reserved_mem: Implement the new processing order for reserved
    memory
  um: reserved_mem: Implement the new processing order for reserved
    memory
  xtensa: reserved_mem: Implement the new processing order for reserved
    memory
  of: reserved_mem: Delete the early_init_fdt_scan_reserved_mem()
    function
  of: reserved_mem: Add code to dynamically allocate reserved_mem array
  ARC: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  ARM: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  arm64: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  csky: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  microblaze: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  mips: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  nios2: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  powerpc: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  riscv: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  um: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  xtensa: reserved_mem: Move fdt_init_reserved_mem() below
    unflatten_device_tree()
  of: reserved_mem: Add code to use unflattened DT for reserved_mem
    nodes
  of: reserved_mem: Rename fdt_* functions to reflect use of unflattened
    devicetree APIs
  ARC: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  ARM: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  arm64: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  csky: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  loongarch: reserved_mem: Switch fdt_init_reserved_mem to
    dt_init_reserved_mem
  microblaze: reserved_mem: Switch fdt_init_reserved_mem to
    dt_init_reserved_mem
  mips: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  nios2: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  openrisc: reserved_mem: Switch fdt_init_reserved_mem to
    dt_init_reserved_mem
  powerpc: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  riscv: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  sh: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  um: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  xtensa: reserved_mem: Switch fdt_init_reserved_mem() to
    dt_init_reserved_mem()
  of: reserved_mem: Delete the fdt_init_reserved_mem() function

 arch/arc/kernel/setup.c            |   2 +
 arch/arc/mm/init.c                 |   2 +-
 arch/arm/kernel/setup.c            |   4 +
 arch/arm/mm/init.c                 |   2 +-
 arch/arm64/kernel/setup.c          |   3 +
 arch/arm64/mm/init.c               |   2 +-
 arch/csky/kernel/setup.c           |   4 +-
 arch/loongarch/kernel/setup.c      |   4 +-
 arch/microblaze/kernel/setup.c     |   3 +
 arch/microblaze/mm/init.c          |   2 +-
 arch/mips/kernel/setup.c           |   4 +-
 arch/nios2/kernel/setup.c          |   5 +-
 arch/openrisc/kernel/setup.c       |   4 +-
 arch/powerpc/kernel/prom.c         |   2 +-
 arch/powerpc/kernel/setup-common.c |   3 +
 arch/riscv/kernel/setup.c          |   3 +
 arch/riscv/mm/init.c               |   2 +-
 arch/sh/boards/of-generic.c        |   4 +-
 arch/um/kernel/dtb.c               |   4 +-
 arch/xtensa/kernel/setup.c         |   2 +
 arch/xtensa/mm/init.c              |   2 +-
 drivers/of/fdt.c                   |  42 +++++--
 drivers/of/of_private.h            |   5 +-
 drivers/of/of_reserved_mem.c       | 178 +++++++++++++++++++++--------
 include/linux/of_fdt.h             |   4 +-
 include/linux/of_reserved_mem.h    |  11 +-
 kernel/dma/coherent.c              |   4 +-
 kernel/dma/contiguous.c            |   8 +-
 kernel/dma/swiotlb.c               |  10 +-
 29 files changed, 234 insertions(+), 91 deletions(-)

Comments

Rob Herring (Arm) Jan. 31, 2024, 12:07 a.m. UTC | #1
On Fri, Jan 26, 2024 at 03:53:39PM -0800, Oreoluwa Babatunde wrote:
> [...]
> 
> fdt_init_reserved_mem(), which should be called after the page tables
> have been set up, is used to carry out the following:
> 1. Allocate memory for the reserved_mem array based on the number of
>    reserved memory regions counted as mentioned above.
> 2. Copy all the information for the dynamically-placed reserved memory
>    regions from the static array into the newly allocated memory for
>    the reserved_mem array.
> 3. Add the information for the statically-placed reserved memory into
>    the reserved_mem array.
> 4. Run the region-specific init functions for each of the reserved
>    memory regions saved in the reserved_mem array.

I don't see the need for fdt_init_reserved_mem() to be explicitly called 
by arch code. I said this already, but that can be done at the same time 
as unflattening the DT. The same conditions are needed for both: we need 
to be able to allocate memory from memblock.

To put it another way, if fdt_init_reserved_mem() can be called "early", 
then unflattening could be moved earlier as well. Though I don't think 
we should optimize that. I'd rather see all arches call the DT functions 
at the same stages.

> [...]
> 
> - Per architecture, this series is effectively only 10 patches. The
>   code for each architecture is split up into separate patches to
>   allow each architecture to be tested independently of changes from
>   other architectures. Should this series be accepted, this should
>   allow for each architecture change to be picked up independently as
>   well.

Only if patches 1 and 2 are accepted in one cycle and the arch ones in 
the next cycle. No need for that though, I can take the whole thing 
(when it's ready).


> [...]
> 
> Previous patch revisions:
> 1. [RFC V1 Patchset]:
> https://lore.kernel.org/all/20231019184825.9712-1-quic_obabatun@quicinc.com/
> 
> 2. [RFC V2 Patchset]:
> https://lore.kernel.org/all/20231204041339.9902-1-quic_obabatun@quicinc.com/
> - Extend changes to all other relevant architectures.
> - Add code to use the unflattened devicetree APIs to process the
>   reserved memory regions.

Dropping RFC does not make this v1. RFC is a state of the patches, not
a version.

Rob
Oreoluwa Babatunde Feb. 1, 2024, 5:08 p.m. UTC | #2
On 1/30/2024 4:07 PM, Rob Herring wrote:
> On Fri, Jan 26, 2024 at 03:53:39PM -0800, Oreoluwa Babatunde wrote:
>> [...]
> I don't see the need for fdt_init_reserved_mem() to be explicitly called 
> by arch code. I said this already, but that can be done at the same time 
> as unflattening the DT. The same conditions are needed for both: we need 
> to be able to allocate memory from memblock.
>
> To put it another way, if fdt_init_reserved_mem() can be called "early", 
> then unflattening could be moved earlier as well. Though I don't think 
> we should optimize that. I'd rather see all arches call the DT functions 
> at the same stages.
Hi Rob,

The reason we moved fdt_init_reserved_mem() back into the arch-specific
code was because we realized that there was no obvious way to call
early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem() in the
correct order that will work for all archs if we placed
fdt_init_reserved_mem() inside the unflatten_device_tree() function.

early_init_fdt_scan_reserved_mem() needs to be called before
fdt_init_reserved_mem(). But on some archs, unflatten_device_tree() is
called before early_init_fdt_scan_reserved_mem(), which means that if
we have fdt_init_reserved_mem() inside the unflatten_device_tree()
function, it will be called before early_init_fdt_scan_reserved_mem().

This is connected to your other comments on Patch 7 & Patch 14.
I agree, unflatten_device_tree() should NOT be getting called before we
reserve memory for the reserved memory regions because that could cause
memory to be allocated from regions that should be reserved.

Hence, resolving this issue should allow us to call
fdt_init_reserved_mem() from the unflatten_device_tree() function
without changing the ordering we are trying to establish.

I will work on implementing this and send another revision.
>
>> [...]
>> - Per architecture, this series is effectively only 10 patches. The
>>   code for each architecture is split up into separate patches to
>>   allow each architecture to be tested independently of changes from
>>   other architectures. Should this series be accepted, this should
>>   allow for each architecture change to be picked up independently as
>>   well.
> Only if patches 1 and 2 are accepted in one cycle and the arch ones in 
> the next cycle. No need for that though, I can take the whole thing 
> (when it's ready).
ack.
>
>> [...]
>>
>> Previous patch revisions:
>> 1. [RFC V1 Patchset]:
>> https://lore.kernel.org/all/20231019184825.9712-1-quic_obabatun@quicinc.com/
>>
>> 2. [RFC V2 Patchset]:
>> https://lore.kernel.org/all/20231204041339.9902-1-quic_obabatun@quicinc.com/
>> - Extend changes to all other relevant architectures.
>> - Add code to use the unflattened devicetree APIs to process the
>>   reserved memory regions.
> Dropping RFC does not make this v1. RFC is a state of the patches, not
> a version.
ack.


Thank you for your comments!

Oreoluwa
Rob Herring (Arm) Feb. 1, 2024, 7:46 p.m. UTC | #3
On Thu, Feb 01, 2024 at 09:08:06AM -0800, Oreoluwa Babatunde wrote:
> 
> On 1/30/2024 4:07 PM, Rob Herring wrote:
> > On Fri, Jan 26, 2024 at 03:53:39PM -0800, Oreoluwa Babatunde wrote:
> >> [...]
> > I don't see the need for fdt_init_reserved_mem() to be explicitly called 
> > by arch code. I said this already, but that can be done at the same time 
> > as unflattening the DT. The same conditions are needed for both: we need 
> > to be able to allocate memory from memblock.
> >
> > To put it another way, if fdt_init_reserved_mem() can be called "early", 
> > then unflattening could be moved earlier as well. Though I don't think 
> > we should optimize that. I'd rather see all arches call the DT functions 
> > at the same stages.
> Hi Rob,
> 
> The reason we moved fdt_init_reserved_mem() back into the arch-specific
> code was because we realized that there was no obvious way to call
> early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem() in the
> correct order that will work for all archs if we placed
> fdt_init_reserved_mem() inside the unflatten_device_tree() function.
> 
> early_init_fdt_scan_reserved_mem() needs to be called before
> fdt_init_reserved_mem(). But on some archs, unflatten_device_tree() is
> called before early_init_fdt_scan_reserved_mem(), which means that if
> we have fdt_init_reserved_mem() inside the unflatten_device_tree()
> function, it will be called before early_init_fdt_scan_reserved_mem().
> 
> This is connected to your other comments on Patch 7 & Patch 14.
> I agree, unflatten_device_tree() should NOT be getting called before we
> reserve memory for the reserved memory regions because that could cause
> memory to be allocated from regions that should be reserved.
> 
> Hence, resolving this issue should allow us to call
> fdt_init_reserved_mem() from the unflatten_device_tree() function
> without changing the ordering we are trying to establish.

There's one issue I've found, which is that unflatten_device_tree()
isn't called for the ACPI case on arm64. Turns out we need
/reserved-memory handled in that case too. However, I think we're going
to change to calling unflatten_device_tree() unconditionally for
another reason[1].

[1] https://lore.kernel.org/all/efe6a7886c3491cc9c225a903efa2b1e.sboyd@kernel.org/

> 
> I will work on implementing this and send another revision.

I think we should go with a simpler route that just copies the
initial array in initdata to a properly sized, allocated array, like
the patch below. Of course it will need some arch fixes and a
follow-on patch to increase the initial array size.

8<--------------------------------------------------------------------
From: Rob Herring <robh@kernel.org>
Date: Wed, 31 Jan 2024 16:26:23 -0600
Subject: [PATCH] of: reserved-mem: Re-allocate reserved_mem array to actual
 size

In preparation to increase the static reserved_mem array size yet again,
copy the initial array to an allocated array sized based on the actual
size needed. Now increasing the size of the static reserved_mem
array only eats up the initdata space. For platforms with a reasonable
number of reserved regions, we have a net gain in free memory.

In order to do memblock allocations, fdt_init_reserved_mem() is moved a
bit later to unflatten_device_tree(). On some arches this is effectively
a nop.

Signed-off-by: Rob Herring <robh@kernel.org>
---
RFC as this is compile-tested only. This is an alternative to this
series[1].

[1] https://lore.kernel.org/all/20240126235425.12233-1-quic_obabatun@quicinc.com/
---
 drivers/of/fdt.c             |  4 ++--
 drivers/of/of_reserved_mem.c | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index bf502ba8da95..14360f5191ae 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -645,8 +645,6 @@ void __init early_init_fdt_scan_reserved_mem(void)
 			break;
 		memblock_reserve(base, size);
 	}
-
-	fdt_init_reserved_mem();
 }
 
 /**
@@ -1328,6 +1326,8 @@ bool __init early_init_dt_scan(void *params)
  */
 void __init unflatten_device_tree(void)
 {
+	fdt_init_reserved_mem();
+
 	__unflatten_device_tree(initial_boot_params, NULL, &of_root,
 				early_init_dt_alloc_memory_arch, false);
 
diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 7ec94cfcbddb..ae323d6b25ad 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -27,7 +27,8 @@
 #include "of_private.h"
 
 #define MAX_RESERVED_REGIONS	64
-static struct reserved_mem reserved_mem[MAX_RESERVED_REGIONS];
+static struct reserved_mem reserved_mem[MAX_RESERVED_REGIONS] __initdata;
+static struct reserved_mem *reserved_mem_p;
 static int reserved_mem_count;
 
 static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
@@ -354,6 +355,13 @@ void __init fdt_init_reserved_mem(void)
 			}
 		}
 	}
+
+	reserved_mem_p = memblock_alloc(sizeof(struct reserved_mem) * reserved_mem_count,
+					sizeof(struct reserved_mem));
+	if (WARN(!reserved_mem_p, "of: reserved-memory allocation failed, continuing with __initdata array!\n"))
+		reserved_mem_p = reserved_mem;
+	else
+		memcpy(reserved_mem_p, reserved_mem, sizeof(struct reserved_mem) * reserved_mem_count);
 }
 
 static inline struct reserved_mem *__find_rmem(struct device_node *node)
@@ -364,8 +372,8 @@ static inline struct reserved_mem *__find_rmem(struct device_node *node)
 		return NULL;
 
 	for (i = 0; i < reserved_mem_count; i++)
-		if (reserved_mem[i].phandle == node->phandle)
-			return &reserved_mem[i];
+		if (reserved_mem_p[i].phandle == node->phandle)
+			return &reserved_mem_p[i];
 	return NULL;
 }
 
@@ -507,8 +515,8 @@ struct reserved_mem *of_reserved_mem_lookup(struct device_node *np)
 
 	name = kbasename(np->full_name);
 	for (i = 0; i < reserved_mem_count; i++)
-		if (!strcmp(reserved_mem[i].name, name))
-			return &reserved_mem[i];
+		if (!strcmp(reserved_mem_p[i].name, name))
+			return &reserved_mem_p[i];
 
 	return NULL;
 }
Oreoluwa Babatunde Feb. 1, 2024, 9:10 p.m. UTC | #4
On 2/1/2024 11:46 AM, Rob Herring wrote:
> [...]
> There's one issue I've found, which is that unflatten_device_tree()
> isn't called for the ACPI case on arm64. Turns out we need
> /reserved-memory handled in that case too. However, I think we're
> going to change to calling unflatten_device_tree() unconditionally
> for another reason[1].
>
> [1] https://lore.kernel.org/all/efe6a7886c3491cc9c225a903efa2b1e.sboyd@kernel.org/
>
>> I will work on implementing this and send another revision.
> I think we should go with a simpler route that just copies the
> initial array in initdata to a properly sized, allocated array, like
> the patch below. Of course it will need some arch fixes and a
> follow-on patch to increase the initial array size.
>
> 8<--------------------------------------------------------------------
> From: Rob Herring <robh@kernel.org>
> Date: Wed, 31 Jan 2024 16:26:23 -0600
> Subject: [PATCH] of: reserved-mem: Re-allocate reserved_mem array to actual
>  size
>
> In preparation to increase the static reserved_mem array size yet again,
> copy the initial array to an allocated array sized based on the actual
> size needed. Now increasing the size of the static reserved_mem
> array only eats up the initdata space. For platforms with a reasonable
> number of reserved regions, we have a net gain in free memory.
>
> In order to do memblock allocations, fdt_init_reserved_mem() is moved a
> bit later to unflatten_device_tree(). On some arches this is effectively
> a nop.
>
> Signed-off-by: Rob Herring <robh@kernel.org>
> [...]

Hi Rob,

One thing that could come up with this is that memory
for the dynamically-placed reserved memory regions
won't be allocated until we call fdt_init_reserved_mem()
(i.e. reserved memory regions defined using the "alloc-ranges"
and "size" properties).

Since fdt_init_reserved_mem() is now being called from
unflatten_device_tree(), the page tables would have been
set up on most architectures, which means we will be
allocating from memory that has already been mapped.

Could this be an issue for memory that is supposed to be
reserved? Especially for the regions that are specified as
no-map?
Rob Herring (Arm) Feb. 2, 2024, 3:29 p.m. UTC | #5
On Thu, Feb 01, 2024 at 01:10:18PM -0800, Oreoluwa Babatunde wrote:
> 
> On 2/1/2024 11:46 AM, Rob Herring wrote:
> > On Thu, Feb 01, 2024 at 09:08:06AM -0800, Oreoluwa Babatunde wrote:
> >> On 1/30/2024 4:07 PM, Rob Herring wrote:
> >>> On Fri, Jan 26, 2024 at 03:53:39PM -0800, Oreoluwa Babatunde wrote:
> >>>> The reserved_mem array is used to store data for the different
> >>>> reserved memory regions defined in the DT of a device.  The array
> >>>> stores information such as region name, node, start-address, and size
> >>>> of the reserved memory regions.
> >>>>
> >>>> The array is currently statically allocated with a size of
> >>>> MAX_RESERVED_REGIONS(64). This means that any system that specifies a
> >>>> number of reserved memory regions greater than MAX_RESERVED_REGIONS(64)
> >>>> will not have enough space to store the information for all the regions.
> >>>>
> >>>> Therefore, this series extends the use of the static array for
> >>>> reserved_mem, and introduces a dynamically allocated array using
> >>>> memblock_alloc() based on the number of reserved memory regions
> >>>> specified in the DT.
> >>>>
> >>>> Some architectures such as arm64 require the page tables to be setup
> >>>> before memblock allocated memory is writable.  Therefore, the dynamic
> >>>> allocation of the reserved_mem array will need to be done after the
> >>>> page tables have been setup on these architectures. In most cases that
> >>>> will be after paging_init().
> >>>>
> >>>> Reserved memory regions can be divided into 2 groups.
> >>>> i) Statically-placed reserved memory regions
> >>>> i.e. regions defined in the DT using the @reg property.
> >>>> ii) Dynamically-placed reserved memory regions.
> >>>> i.e. regions specified in the DT using the @alloc_ranges
> >>>>     and @size properties.
> >>>>
> >>>> It is possible to call memblock_reserve() and memblock_mark_nomap() on
> >>>> the statically-placed reserved memory regions and not need to save them
> >>>> to the reserved_mem array until memory is allocated for it using
> >>>> memblock, which will be after the page tables have been setup.
> >>>> For the dynamically-placed reserved memory regions, it is not possible
> >>>> to wait to store its information because the starting address is
> >>>> allocated only at run time, and hence they need to be stored somewhere
> >>>> after they are allocated.
> >>>> Waiting until after the page tables have been setup to allocate memory
> >>>> for the dynamically-placed regions is also not an option because the
> >>>> allocations will come from memory that have already been added to the
> >>>> page tables, which is not good for memory that is supposed to be
> >>>> reserved and/or marked as nomap.
> >>>>
> >>>> Therefore, this series splits up the processing of the reserved memory
> >>>> regions into two stages, of which the first stage is carried out by
> >>>> early_init_fdt_scan_reserved_mem() and the second is carried out by
> >>>> fdt_init_reserved_mem().
> >>>>
> >>>> The early_init_fdt_scan_reserved_mem(), which is called before the page
> >>>> tables are setup is used to:
> >>>> 1. Call memblock_reserve() and memblock_mark_nomap() on all the
> >>>>    statically-placed reserved memory regions as needed.
> >>>> 2. Allocate memory from memblock for the dynamically-placed reserved
> >>>>    memory regions and store them in the static array for reserved_mem.
> >>>>    memblock_reserve() and memblock_mark_nomap() are also called as
> >>>>    needed on all the memory allocated for the dynamically-placed
> >>>>    regions.
> >>>> 3. Count the total number of reserved memory regions found in the DT.
> >>>>
> >>>> fdt_init_reserved_mem(), which should be called after the page tables
> >>>> have been setup, is used to carry out the following:
> >>>> 1. Allocate memory for the reserved_mem array based on the number of
> >>>>    reserved memory regions counted as mentioned above.
> >>>> 2. Copy all the information for the dynamically-placed reserved memory
> >>>>    regions from the static array into the new allocated memory for the
> >>>>    reserved_mem array.
> >>>> 3. Add the information for the statically-placed reserved memory into
> >>>>    reserved_mem array.
> >>>> 4. Run the region specific init functions for each of the reserve memory
> >>>>    regions saved in the reserved_mem array.
> >>> I don't see the need for fdt_init_reserved_mem() to be explicitly called 
> >>> by arch code. I said this already, but that can be done at the same time 
> >>> as unflattening the DT. The same conditions are needed for both: we need 
> >>> to be able to allocate memory from memblock.
> >>>
> >>> To put it another way, if fdt_init_reserved_mem() can be called "early", 
> >>> then unflattening could be moved earlier as well. Though I don't think 
> >>> we should optimize that. I'd rather see all arches call the DT functions 
> >>> at the same stages.
> >> Hi Rob,
> >>
> >> The reason we moved fdt_init_reserved_mem() back into the arch-specific code
> >> was because we realized there was no obvious way to call
> >> early_init_fdt_scan_reserved_mem() and fdt_init_reserved_mem() in the correct
> >> order on all archs if we placed fdt_init_reserved_mem() inside the
> >> unflatten_device_tree() function.
> >>
> >> early_init_fdt_scan_reserved_mem() needs to be
> >> called before fdt_init_reserved_mem(). But on some archs,
> >> unflatten_device_tree() is called before early_init_fdt_scan_reserved_mem(), which
> >> means that if we place fdt_init_reserved_mem() inside the unflatten_device_tree()
> >> function, it will be called before early_init_fdt_scan_reserved_mem().
> >>
> >> This is connected to your other comments on Patch 7 & Patch 14.
> >> I agree, unflatten_device_tree() should NOT be called before we reserve
> >> memory for the reserved memory regions, because that could cause memory to be
> >> allocated from regions that should be reserved.
> >>
> >> Hence, resolving this issue should allow us to call fdt_init_reserved_mem() from
> >> the unflatten_device_tree() function without changing the ordering we are
> >> trying to achieve.
> > There's one issue I've found, which is that unflatten_device_tree() isn't
> > called for the ACPI case on arm64. Turns out we need /reserved-memory
> > handled in that case too. However, I think we're going to change to
> > calling unflatten_device_tree() unconditionally for another reason[1].
> >
> > [1] https://lore.kernel.org/all/efe6a7886c3491cc9c225a903efa2b1e.sboyd@kernel.org/
> >
> >> I will work on implementing this and send another revision.
> > I think we should go with a simpler route that just copies the
> > initial array in initdata to a properly sized, allocated array, like the
> > patch below. Of course it will need some arch fixes and a follow-on
> > patch to increase the initial array size.
> >
> > 8<--------------------------------------------------------------------
> > From: Rob Herring <robh@kernel.org>
> > Date: Wed, 31 Jan 2024 16:26:23 -0600
> > Subject: [PATCH] of: reserved-mem: Re-allocate reserved_mem array to actual
> >  size
> >
> > In preparation to increase the static reserved_mem array size yet again,
> > copy the initial array to an allocated array sized based on the actual
> > size needed. Now increasing the size of the static reserved_mem
> > array only eats up initdata space. For platforms with a reasonable
> > number of reserved regions, we have a net gain in free memory.
> >
> > In order to do memblock allocations, fdt_init_reserved_mem() is moved a
> > bit later to unflatten_device_tree(). On some arches this is effectively
> > a nop.
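> >
> > In outline, the change looks something like this (a sketch; the array
> > and count names here are assumed for illustration, not taken verbatim
> > from the diff):
> >
> > 	/* staging array in initdata, discarded after boot */
> > 	static struct reserved_mem reserved_mem_array[MAX_RESERVED_REGIONS] __initdata;
> > 	static struct reserved_mem *reserved_mem = reserved_mem_array;
> >
> > 	void __init fdt_init_reserved_mem(void)
> > 	{
> > 		size_t size = reserved_mem_count * sizeof(*reserved_mem);
> > 		struct reserved_mem *new_array;
> >
> > 		/* copy the staging array into a properly sized allocation */
> > 		new_array = memblock_alloc(size, sizeof(void *));
> > 		if (new_array) {
> > 			memcpy(new_array, reserved_mem_array, size);
> > 			reserved_mem = new_array;
> > 		}
> >
> > 		/* ...then run the per-region init callbacks as before */
> > 	}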

[...]

> Hi Rob,
> 
> One thing that could come up with this is that memory
> for the dynamically-placed reserved memory regions
> won't be allocated until we call fdt_init_reserved_mem()
> (i.e. reserved memory regions defined using the @alloc-ranges
> and @size properties).
> 
> Since fdt_init_reserved_mem() is now being called from
> unflatten_device_tree(), the page tables would already have been
> set up on most architectures, which means we would be
> allocating from memory that has already been mapped.
> 
> Could this be an issue for memory that is supposed to be
> reserved? 

I suppose if the alloc-ranges region is not much bigger than the size 
and the kernel already made some allocation that landed in the region, 
then the allocation could fail. Not much we can do other than alloc the 
reserved regions as soon as possible. Are there cases where that's not 
happening?

I suppose the kernel could try and avoid all alloc-ranges until they've 
been allocated, but that would have to be best effort. I've seen 
optimizations where it's desired to spread buffers across DRAM banks, so 
you could have N alloc-ranges for N banks that cover all of memory.
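
For reference, the dynamic placement is basically a first-fit walk over
the alloc-ranges windows, something like this (a simplified sketch of
what drivers/of/of_reserved_mem.c does, not the literal code):

	phys_addr_t base = 0;
	int i;

	for (i = 0; i < nr_ranges; i++) {
		/* allocates *and* reserves the memory on success */
		base = memblock_phys_alloc_range(size, align,
						 range[i].start,
						 range[i].end);
		if (base)
			break;	/* this window (e.g. one DRAM bank) fit */
	}
	if (!base)
		return -ENOMEM;	/* every window too small or fragmented */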

There's also the issue that if you have more fixed regions than memblock 
can handle (128) before it can reallocate its arrays, then the 
page tables themselves could be allocated in reserved regions.
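
To make that concrete, memblock bootstraps from fixed __initdata arrays,
roughly like this (shape of mm/memblock.c from my reading; exact
definitions vary by version and config):

	#define INIT_MEMBLOCK_REGIONS 128

	static struct memblock_region
		memblock_reserved_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;

	/*
	 * Each memblock_reserve() of a fixed region takes one slot.
	 * Until memblock_double_array() is able to allocate a larger
	 * array, reserves past the limit are dropped, so a later
	 * allocation (page tables included) can land inside a region
	 * the DT asked to have reserved.
	 */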

> Especially for the regions that are specified as
> no-map?

'no-map' is a hint, not a guarantee. Arm32 ignores it for regions 
within the kernel's linear map (at least it used to). I don't think 
anything changes here with it.

Rob
Oreoluwa Babatunde Feb. 7, 2024, 9:13 p.m. UTC | #6
On 2/2/2024 7:29 AM, Rob Herring wrote:
> On Thu, Feb 01, 2024 at 01:10:18PM -0800, Oreoluwa Babatunde wrote:
> [...]
>
>> Hi Rob,
>>
>> One thing that could come up with this is that memory
>> for the dynamically-placed reserved memory regions
>> won't be allocated until we call fdt_init_reserved_mem()
>> (i.e. reserved memory regions defined using the @alloc-ranges
>> and @size properties).
>>
>> Since fdt_init_reserved_mem() is now being called from
>> unflatten_device_tree(), the page tables would already have been
>> set up on most architectures, which means we would be
>> allocating from memory that has already been mapped.
>>
>> Could this be an issue for memory that is supposed to be
>> reserved? 
> I suppose if the alloc-ranges region is not much bigger than the size 
> and the kernel already made some allocation that landed in the region, 
> then the allocation could fail. Not much we can do other than alloc the 
> reserved regions as soon as possible. Are there cases where that's not 
> happening?
Correct, the best thing we can do here is to make sure we allocate the
reserved memory regions as soon as possible, to keep other users from
allocating from these regions.
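
Concretely, the ordering the series keeps is roughly this (a simplified
sketch of the arch boot path, not any one arch's actual setup_arch()):

	/* phase 1: before the page tables exist */
	early_init_fdt_scan_reserved_mem();	/* reserve static regions,
						 * mark no-map, and allocate
						 * the dynamically-placed
						 * regions right away */
	paging_init();

	/* phase 2: memblock memory is now writable */
	fdt_init_reserved_mem();		/* size the reserved_mem array
						 * and run the per-region
						 * init callbacks */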
>
> I suppose the kernel could try and avoid all alloc-ranges until they've 
> been allocated, but that would have to be best effort. I've seen 
> optimizations where it's desired to spread buffers across DRAM banks, so 
> you could have N alloc-ranges for N banks that cover all of memory.
ack
>
> There's also the issue that if you have more fixed regions than memblock 
> can handle (128) before it can reallocate its arrays, then the 
> page tables themselves could be allocated in reserved regions.
True, this is also a limitation on the memblock side, and there is
not much else we can do on that front either.
> [...]
> 'no-map' is a hint, not a guarantee. Arm32 ignores it for regions 
> within the kernel's linear map (at least it used to). I don't think 
> anything changes here with it.
Anyone adding the "no-map" property to a reserved region is specifying
that the region must not be mapped into the kernel page tables, so that
there is no access from other users, not even speculative accesses.

This can be seen in the description of no-map here:
https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/reserved-memory.yaml#L79

I'm not sure about the arm32 architecture taking no-map as a hint, but I
ran some tests on the arm64 architecture, which I am currently testing
on, and when a region is marked as no-map it is excluded from the kernel
page mappings (the correct behavior per the description of no-map
above).

But if we wait until after the page tables are set up to initialize
reserved memory regions that are marked as no-map, those regions would
already be part of the page tables, which defeats the purpose of
specifying no-map in the first place.

Hence, I think it is important to mark all no-map reserved memory
regions as nomap before the page tables are set up so that this
functionality is preserved. We are able to achieve this with the code
re-ordering done in [PATCH 01/46] of this series.
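
As a sketch of why the ordering matters for no-map (simplified; on arm64
the linear map is built by map_mem(), which skips MEMBLOCK_NOMAP
ranges):

	/* must run before paging_init() for no-map to be honored */
	if (nomap)
		memblock_mark_nomap(rmem->base, rmem->size);

	/*
	 * paging_init()/map_mem() then walk memblock and leave NOMAP
	 * ranges out of the linear map -- but only if the flag was
	 * already set by the time they run.
	 */

Regards,

Oreoluwa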

This can be seen in the description of no-map here: https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/reserved-memory.yaml#L79

I'm not sure about the arm32 architecture taking no-map as a hint, but I ran some tests on the arm64 architecture which I am currently testing on, and when a region is marked as no-map, it is excluded from the kernel page mappings (which is the correct behavior from the description of no-map above). But if we wait until after the page tables are setup to initialize reserved memory regions that are marked as no-map, then those regions would already be part of the page tables. And this defeats the purpose of users specifying no-map for their reserved memory. Hence, I think it is important to prioritize marking all nomap reserved memory regions as nomap before the page tables are setup so that this functionality is preserved. And we are able to achieve this with the code re-ordering that is being done in [PATCH 01/46] of this series. Regards, Oreoluwa