diff mbox series

[v4,3/3] hw/riscv: change riscv_compute_fdt_addr() semantics

Message ID 20230126135219.1054658-4-dbarboza@ventanamicro.com (mailing list archive)
State New, archived
Headers show
Series riscv_load_fdt() semantics change | expand

Commit Message

Daniel Henrique Barboza Jan. 26, 2023, 1:52 p.m. UTC
As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
mem_size (which is defaulted to MachineState::ram_size in all boards)
and the FDT pointer. And it makes a very important assumption: the DRAM
interval dram_base + mem_size is contiguous. This is indeed the case for
most boards that uses a FDT.

The Icicle Kit board works with 2 distinct RAM banks that are separated
by a gap. We have a lower bank with 1GiB size, a gap follows, then at
64GiB the high memory starts. MachineClass::default_ram_size for this
board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
size, meaning that there we'll always have at least 512 MiB in the Hi
RAM area.

Using riscv_compute_fdt_addr() in this board is weird because not only
the board has sparse RAM, and it's calling it using the base address of
the Lo RAM area, but it's also using a mem_size that we have guarantees
that it will go up to the Hi RAM. All the function assumptions doesn't
work for this board.

In fact, what makes the function works at all in this case is a
coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
(2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
the FDT under a 3Gb address, which happens to be exactly at the end of
DRAM_LO. If the base address of the Lo area started later than 3Gb this
function would be unusable by the board. Changing any assumptions inside
riscv_compute_fdt_addr() can also break it by accident as well.

Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
Icicle Kit board and for future boards that might have sparse RAM
topologies to worry about:

- relieve the condition that the dram_base + mem_size area is contiguous,
since this is already not the case today;

- receive an extra 'dram_size' size attribute that refers to a contiguous
RAM block that the board wants the FDT to reside on.

Together with 'mem_size' and 'fdt', which are now now being consumed by a
MachineState pointer, we're able to make clear assumptions based on the
DRAM block and total mem_size available to ensure that the FDT will be put
in a valid RAM address.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
 hw/riscv/microchip_pfsoc.c |  3 ++-
 hw/riscv/sifive_u.c        |  3 ++-
 hw/riscv/spike.c           |  3 ++-
 hw/riscv/virt.c            |  3 ++-
 include/hw/riscv/boot.h    |  4 ++--
 6 files changed, 36 insertions(+), 18 deletions(-)

Comments

Bin Meng Jan. 29, 2023, 5:45 a.m. UTC | #1
On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
> mem_size (which is defaulted to MachineState::ram_size in all boards)
> and the FDT pointer. And it makes a very important assumption: the DRAM
> interval dram_base + mem_size is contiguous. This is indeed the case for
> most boards that uses a FDT.

s/uses/use

>
> The Icicle Kit board works with 2 distinct RAM banks that are separated
> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
> 64GiB the high memory starts. MachineClass::default_ram_size for this
> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
> size, meaning that there we'll always have at least 512 MiB in the Hi
> RAM area.
>
> Using riscv_compute_fdt_addr() in this board is weird because not only
> the board has sparse RAM, and it's calling it using the base address of
> the Lo RAM area, but it's also using a mem_size that we have guarantees
> that it will go up to the Hi RAM. All the function assumptions doesn't
> work for this board.
>
> In fact, what makes the function works at all in this case is a
> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
> the FDT under a 3Gb address, which happens to be exactly at the end of
> DRAM_LO. If the base address of the Lo area started later than 3Gb this
> function would be unusable by the board. Changing any assumptions inside
> riscv_compute_fdt_addr() can also break it by accident as well.
>
> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
> Icicle Kit board and for future boards that might have sparse RAM
> topologies to worry about:
>
> - relieve the condition that the dram_base + mem_size area is contiguous,
> since this is already not the case today;
>
> - receive an extra 'dram_size' size attribute that refers to a contiguous
> RAM block that the board wants the FDT to reside on.
>
> Together with 'mem_size' and 'fdt', which are now now being consumed by a
> MachineState pointer, we're able to make clear assumptions based on the
> DRAM block and total mem_size available to ensure that the FDT will be put
> in a valid RAM address.
>

Well written commit message. Thanks!

> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>  hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
>  hw/riscv/microchip_pfsoc.c |  3 ++-
>  hw/riscv/sifive_u.c        |  3 ++-
>  hw/riscv/spike.c           |  3 ++-
>  hw/riscv/virt.c            |  3 ++-
>  include/hw/riscv/boot.h    |  4 ++--
>  6 files changed, 36 insertions(+), 18 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index a6f7b8ae8e..8f4991480b 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -284,33 +284,47 @@ out:
>  }
>
>  /*
> - * The FDT should be put at the farthest point possible to
> - * avoid overwriting it with the kernel/initrd.
> + * This function makes an assumption that the DRAM interval
> + * 'dram_base' + 'dram_size' is contiguous.
>   *
> - * This function makes an assumption that the DRAM is
> - * contiguous. It also cares about 32-bit systems and
> - * will limit fdt_addr to be addressable by them even for
> - * 64-bit CPUs.
> + * Considering that 'dram_end' is the lowest value between
> + * the end of the DRAM block and MachineState->ram_size, the
> + * FDT location will vary according to 'dram_base':
> + *
> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
> + * put at the lowest value between 3072 MiB and 'dram_end';
> + *
> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
> + * put at 'dram_end'.
>   *
>   * The FDT is fdt_packed() during the calculation.
>   */
> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
> -                                void *fdt)
> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,

Using hwaddr to represent a size looks weird. Although technically
they are the same ... I would leave this as it is.

> +                              MachineState *ms)
>  {
> -    uint64_t temp;
> -    hwaddr dram_end = dram_base + mem_size;
> -    int ret = fdt_pack(fdt);
> +    int ret = fdt_pack(ms->fdt);
> +    hwaddr dram_end, temp;
>      int fdtsize;
>
>      /* Should only fail if we've built a corrupted tree */
>      g_assert(ret == 0);
>
> -    fdtsize = fdt_totalsize(fdt);
> +    fdtsize = fdt_totalsize(ms->fdt);
>      if (fdtsize <= 0) {
>          error_report("invalid device-tree");
>          exit(1);
>      }
>
> +    /*
> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
> +     * means that the DRAM block goes all the way to ms->ram_size.
> +     */
> +    if (dram_size == 0x0) {
> +        dram_end = dram_base + ms->ram_size;
> +    } else {
> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
> +    }

How about:

g_assert(dram_size < ms->ram_size);
dram_end = dram_base + (dram_size ? dram_size : ms->ram_size);

> +
>      /*
>       * We should put fdt as far as possible to avoid kernel/initrd overwriting
>       * its content. But it should be addressable by 32 bit system as well.
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index a30203db85..e81bbd12df 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -634,7 +634,8 @@ static void microchip_icicle_kit_machine_init(MachineState *machine)
>
>          /* Compute the fdt load address in dram */
>          fdt_load_addr = riscv_compute_fdt_addr(memmap[MICROCHIP_PFSOC_DRAM_LO].base,
> -                                               machine->ram_size, machine->fdt);
> +                                               memmap[MICROCHIP_PFSOC_DRAM_LO].size,
> +                                               machine);
>          riscv_load_fdt(fdt_load_addr, machine->fdt);
>
>          /* Load the reset vector */
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 6bbdbe5fb7..ad3bb35b34 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -609,7 +609,8 @@ static void sifive_u_machine_init(MachineState *machine)
>      }
>
>      fdt_load_addr = riscv_compute_fdt_addr(memmap[SIFIVE_U_DEV_DRAM].base,
> -                                           machine->ram_size, machine->fdt);
> +                                           memmap[SIFIVE_U_DEV_DRAM].size,
> +                                           machine);
>      riscv_load_fdt(fdt_load_addr, machine->fdt);
>
>      if (!riscv_is_32bit(&s->soc.u_cpus)) {
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index ceebe34c5f..b5979eddd6 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -317,7 +317,8 @@ static void spike_board_init(MachineState *machine)
>      }
>
>      fdt_load_addr = riscv_compute_fdt_addr(memmap[SPIKE_DRAM].base,
> -                                           machine->ram_size, machine->fdt);
> +                                           memmap[SPIKE_DRAM].size,
> +                                           machine);
>      riscv_load_fdt(fdt_load_addr, machine->fdt);
>
>      /* load the reset vector */
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 43fca597f0..f079a30b60 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1293,7 +1293,8 @@ static void virt_machine_done(Notifier *notifier, void *data)
>      }
>
>      fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> -                                           machine->ram_size, machine->fdt);
> +                                           memmap[VIRT_DRAM].size,
> +                                           machine);
>      riscv_load_fdt(fdt_load_addr, machine->fdt);
>
>      /* load the reset vector */
> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
> index 7babd669c7..a6099c2dc6 100644
> --- a/include/hw/riscv/boot.h
> +++ b/include/hw/riscv/boot.h
> @@ -48,8 +48,8 @@ target_ulong riscv_load_kernel(MachineState *machine,
>                                 target_ulong firmware_end_addr,
>                                 bool load_initrd,
>                                 symbol_fn_t sym_cb);
> -uint32_t riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
> -                                void *fdt);
> +hwaddr riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
> +                              MachineState *ms);
>  void riscv_load_fdt(hwaddr fdt_addr, void *fdt);
>  void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts,
>                                 hwaddr saddr,
> --

Regards,
Bin
Daniel Henrique Barboza Jan. 30, 2023, 5:16 p.m. UTC | #2
On 1/29/23 02:45, Bin Meng wrote:
> On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
>>
>> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
>> mem_size (which is defaulted to MachineState::ram_size in all boards)
>> and the FDT pointer. And it makes a very important assumption: the DRAM
>> interval dram_base + mem_size is contiguous. This is indeed the case for
>> most boards that uses a FDT.
> 
> s/uses/use
> 
>>
>> The Icicle Kit board works with 2 distinct RAM banks that are separated
>> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
>> 64GiB the high memory starts. MachineClass::default_ram_size for this
>> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
>> size, meaning that there we'll always have at least 512 MiB in the Hi
>> RAM area.
>>
>> Using riscv_compute_fdt_addr() in this board is weird because not only
>> the board has sparse RAM, and it's calling it using the base address of
>> the Lo RAM area, but it's also using a mem_size that we have guarantees
>> that it will go up to the Hi RAM. All the function assumptions doesn't
>> work for this board.
>>
>> In fact, what makes the function works at all in this case is a
>> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
>> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
>> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
>> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
>> the FDT under a 3Gb address, which happens to be exactly at the end of
>> DRAM_LO. If the base address of the Lo area started later than 3Gb this
>> function would be unusable by the board. Changing any assumptions inside
>> riscv_compute_fdt_addr() can also break it by accident as well.
>>
>> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
>> Icicle Kit board and for future boards that might have sparse RAM
>> topologies to worry about:
>>
>> - relieve the condition that the dram_base + mem_size area is contiguous,
>> since this is already not the case today;
>>
>> - receive an extra 'dram_size' size attribute that refers to a contiguous
>> RAM block that the board wants the FDT to reside on.
>>
>> Together with 'mem_size' and 'fdt', which are now now being consumed by a
>> MachineState pointer, we're able to make clear assumptions based on the
>> DRAM block and total mem_size available to ensure that the FDT will be put
>> in a valid RAM address.
>>
> 
> Well written commit message. Thanks!
> 
>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
>> ---
>>   hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
>>   hw/riscv/microchip_pfsoc.c |  3 ++-
>>   hw/riscv/sifive_u.c        |  3 ++-
>>   hw/riscv/spike.c           |  3 ++-
>>   hw/riscv/virt.c            |  3 ++-
>>   include/hw/riscv/boot.h    |  4 ++--
>>   6 files changed, 36 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
>> index a6f7b8ae8e..8f4991480b 100644
>> --- a/hw/riscv/boot.c
>> +++ b/hw/riscv/boot.c
>> @@ -284,33 +284,47 @@ out:
>>   }
>>
>>   /*
>> - * The FDT should be put at the farthest point possible to
>> - * avoid overwriting it with the kernel/initrd.
>> + * This function makes an assumption that the DRAM interval
>> + * 'dram_base' + 'dram_size' is contiguous.
>>    *
>> - * This function makes an assumption that the DRAM is
>> - * contiguous. It also cares about 32-bit systems and
>> - * will limit fdt_addr to be addressable by them even for
>> - * 64-bit CPUs.
>> + * Considering that 'dram_end' is the lowest value between
>> + * the end of the DRAM block and MachineState->ram_size, the
>> + * FDT location will vary according to 'dram_base':
>> + *
>> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
>> + * put at the lowest value between 3072 MiB and 'dram_end';
>> + *
>> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
>> + * put at 'dram_end'.
>>    *
>>    * The FDT is fdt_packed() during the calculation.
>>    */
>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
>> -                                void *fdt)
>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
> 
> Using hwaddr to represent a size looks weird. Although technically
> they are the same ... I would leave this as it is.

I'll leave it as it was back in patch 2 (uint64_t).

> 
>> +                              MachineState *ms)
>>   {
>> -    uint64_t temp;
>> -    hwaddr dram_end = dram_base + mem_size;
>> -    int ret = fdt_pack(fdt);
>> +    int ret = fdt_pack(ms->fdt);
>> +    hwaddr dram_end, temp;
>>       int fdtsize;
>>
>>       /* Should only fail if we've built a corrupted tree */
>>       g_assert(ret == 0);
>>
>> -    fdtsize = fdt_totalsize(fdt);
>> +    fdtsize = fdt_totalsize(ms->fdt);
>>       if (fdtsize <= 0) {
>>           error_report("invalid device-tree");
>>           exit(1);
>>       }
>>
>> +    /*
>> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
>> +     * means that the DRAM block goes all the way to ms->ram_size.
>> +     */
>> +    if (dram_size == 0x0) {
>> +        dram_end = dram_base + ms->ram_size;
>> +    } else {
>> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
>> +    }
> 
> How about:
> 
> g_assert(dram_size < ms->ram_size);

I don't believe that dram_size > ms->ram_size should be an error. A board can
have a declared MemMapEntry.size that is larger than its current setting of
ms->ram_size.
  

> dram_end = dram_base + (dram_size ? dram_size : ms->ram_size);

I can change the if/else statement up there for a ternary:

dram_end = dram_base + (dram_size ? ms->ram_size : MIN(ms->ram_size, dram_size))


Thanks,

Daniel

> 
>> +
>>       /*
>>        * We should put fdt as far as possible to avoid kernel/initrd overwriting
>>        * its content. But it should be addressable by 32 bit system as well.
>> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
>> index a30203db85..e81bbd12df 100644
>> --- a/hw/riscv/microchip_pfsoc.c
>> +++ b/hw/riscv/microchip_pfsoc.c
>> @@ -634,7 +634,8 @@ static void microchip_icicle_kit_machine_init(MachineState *machine)
>>
>>           /* Compute the fdt load address in dram */
>>           fdt_load_addr = riscv_compute_fdt_addr(memmap[MICROCHIP_PFSOC_DRAM_LO].base,
>> -                                               machine->ram_size, machine->fdt);
>> +                                               memmap[MICROCHIP_PFSOC_DRAM_LO].size,
>> +                                               machine);
>>           riscv_load_fdt(fdt_load_addr, machine->fdt);
>>
>>           /* Load the reset vector */
>> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
>> index 6bbdbe5fb7..ad3bb35b34 100644
>> --- a/hw/riscv/sifive_u.c
>> +++ b/hw/riscv/sifive_u.c
>> @@ -609,7 +609,8 @@ static void sifive_u_machine_init(MachineState *machine)
>>       }
>>
>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[SIFIVE_U_DEV_DRAM].base,
>> -                                           machine->ram_size, machine->fdt);
>> +                                           memmap[SIFIVE_U_DEV_DRAM].size,
>> +                                           machine);
>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>
>>       if (!riscv_is_32bit(&s->soc.u_cpus)) {
>> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
>> index ceebe34c5f..b5979eddd6 100644
>> --- a/hw/riscv/spike.c
>> +++ b/hw/riscv/spike.c
>> @@ -317,7 +317,8 @@ static void spike_board_init(MachineState *machine)
>>       }
>>
>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[SPIKE_DRAM].base,
>> -                                           machine->ram_size, machine->fdt);
>> +                                           memmap[SPIKE_DRAM].size,
>> +                                           machine);
>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>
>>       /* load the reset vector */
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 43fca597f0..f079a30b60 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -1293,7 +1293,8 @@ static void virt_machine_done(Notifier *notifier, void *data)
>>       }
>>
>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>> -                                           machine->ram_size, machine->fdt);
>> +                                           memmap[VIRT_DRAM].size,
>> +                                           machine);
>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>
>>       /* load the reset vector */
>> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
>> index 7babd669c7..a6099c2dc6 100644
>> --- a/include/hw/riscv/boot.h
>> +++ b/include/hw/riscv/boot.h
>> @@ -48,8 +48,8 @@ target_ulong riscv_load_kernel(MachineState *machine,
>>                                  target_ulong firmware_end_addr,
>>                                  bool load_initrd,
>>                                  symbol_fn_t sym_cb);
>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
>> -                                void *fdt);
>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
>> +                              MachineState *ms);
>>   void riscv_load_fdt(hwaddr fdt_addr, void *fdt);
>>   void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts,
>>                                  hwaddr saddr,
>> --
> 
> Regards,
> Bin
Daniel Henrique Barboza Jan. 30, 2023, 6:02 p.m. UTC | #3
On 1/30/23 14:16, Daniel Henrique Barboza wrote:
> 
> 
> On 1/29/23 02:45, Bin Meng wrote:
>> On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
>> <dbarboza@ventanamicro.com> wrote:
>>>
>>> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
>>> mem_size (which is defaulted to MachineState::ram_size in all boards)
>>> and the FDT pointer. And it makes a very important assumption: the DRAM
>>> interval dram_base + mem_size is contiguous. This is indeed the case for
>>> most boards that uses a FDT.
>>
>> s/uses/use
>>
>>>
>>> The Icicle Kit board works with 2 distinct RAM banks that are separated
>>> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
>>> 64GiB the high memory starts. MachineClass::default_ram_size for this
>>> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
>>> size, meaning that there we'll always have at least 512 MiB in the Hi
>>> RAM area.
>>>
>>> Using riscv_compute_fdt_addr() in this board is weird because not only
>>> the board has sparse RAM, and it's calling it using the base address of
>>> the Lo RAM area, but it's also using a mem_size that we have guarantees
>>> that it will go up to the Hi RAM. All the function assumptions doesn't
>>> work for this board.
>>>
>>> In fact, what makes the function works at all in this case is a
>>> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
>>> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
>>> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
>>> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
>>> the FDT under a 3Gb address, which happens to be exactly at the end of
>>> DRAM_LO. If the base address of the Lo area started later than 3Gb this
>>> function would be unusable by the board. Changing any assumptions inside
>>> riscv_compute_fdt_addr() can also break it by accident as well.
>>>
>>> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
>>> Icicle Kit board and for future boards that might have sparse RAM
>>> topologies to worry about:
>>>
>>> - relieve the condition that the dram_base + mem_size area is contiguous,
>>> since this is already not the case today;
>>>
>>> - receive an extra 'dram_size' size attribute that refers to a contiguous
>>> RAM block that the board wants the FDT to reside on.
>>>
>>> Together with 'mem_size' and 'fdt', which are now now being consumed by a
>>> MachineState pointer, we're able to make clear assumptions based on the
>>> DRAM block and total mem_size available to ensure that the FDT will be put
>>> in a valid RAM address.
>>>
>>
>> Well written commit message. Thanks!
>>
>>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
>>> ---
>>>   hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
>>>   hw/riscv/microchip_pfsoc.c |  3 ++-
>>>   hw/riscv/sifive_u.c        |  3 ++-
>>>   hw/riscv/spike.c           |  3 ++-
>>>   hw/riscv/virt.c            |  3 ++-
>>>   include/hw/riscv/boot.h    |  4 ++--
>>>   6 files changed, 36 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
>>> index a6f7b8ae8e..8f4991480b 100644
>>> --- a/hw/riscv/boot.c
>>> +++ b/hw/riscv/boot.c
>>> @@ -284,33 +284,47 @@ out:
>>>   }
>>>
>>>   /*
>>> - * The FDT should be put at the farthest point possible to
>>> - * avoid overwriting it with the kernel/initrd.
>>> + * This function makes an assumption that the DRAM interval
>>> + * 'dram_base' + 'dram_size' is contiguous.
>>>    *
>>> - * This function makes an assumption that the DRAM is
>>> - * contiguous. It also cares about 32-bit systems and
>>> - * will limit fdt_addr to be addressable by them even for
>>> - * 64-bit CPUs.
>>> + * Considering that 'dram_end' is the lowest value between
>>> + * the end of the DRAM block and MachineState->ram_size, the
>>> + * FDT location will vary according to 'dram_base':
>>> + *
>>> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
>>> + * put at the lowest value between 3072 MiB and 'dram_end';
>>> + *
>>> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
>>> + * put at 'dram_end'.
>>>    *
>>>    * The FDT is fdt_packed() during the calculation.
>>>    */
>>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
>>> -                                void *fdt)
>>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
>>
>> Using hwaddr to represent a size looks weird. Although technically
>> they are the same ... I would leave this as it is.
> 
> I'll leave it as it was back in patch 2 (uint64_t).
> 
>>
>>> +                              MachineState *ms)
>>>   {
>>> -    uint64_t temp;
>>> -    hwaddr dram_end = dram_base + mem_size;
>>> -    int ret = fdt_pack(fdt);
>>> +    int ret = fdt_pack(ms->fdt);
>>> +    hwaddr dram_end, temp;
>>>       int fdtsize;
>>>
>>>       /* Should only fail if we've built a corrupted tree */
>>>       g_assert(ret == 0);
>>>
>>> -    fdtsize = fdt_totalsize(fdt);
>>> +    fdtsize = fdt_totalsize(ms->fdt);
>>>       if (fdtsize <= 0) {
>>>           error_report("invalid device-tree");
>>>           exit(1);
>>>       }
>>>
>>> +    /*
>>> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
>>> +     * means that the DRAM block goes all the way to ms->ram_size.
>>> +     */
>>> +    if (dram_size == 0x0) {
>>> +        dram_end = dram_base + ms->ram_size;
>>> +    } else {
>>> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
>>> +    }
>>
>> How about:
>>
>> g_assert(dram_size < ms->ram_size);
> 
> I don't believe that dram_size > ms->ram_size should be an error. A board can
> have a declared MemMapEntry.size that is larger than its current setting of
> ms->ram_size.
> 
> 
>> dram_end = dram_base + (dram_size ? dram_size : ms->ram_size);
> 
> I can change the if/else statement up there for a ternary:
> 
> dram_end = dram_base + (dram_size ? ms->ram_size : MIN(ms->ram_size, dram_size))

This is an ooopsie. This ternary is doing the opposite of what it should do :/

I would do something like this, breaking in two lines to avoid 80+ chars in the
same line:

     dram_end = dram_base;
     dram_end += dram_size ? MIN(ms->ram_size, dram_size) : ms->ram_size;


Daniel

> 
> 
> Thanks,
> 
> Daniel
> 
>>
>>> +
>>>       /*
>>>        * We should put fdt as far as possible to avoid kernel/initrd overwriting
>>>        * its content. But it should be addressable by 32 bit system as well.
>>> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
>>> index a30203db85..e81bbd12df 100644
>>> --- a/hw/riscv/microchip_pfsoc.c
>>> +++ b/hw/riscv/microchip_pfsoc.c
>>> @@ -634,7 +634,8 @@ static void microchip_icicle_kit_machine_init(MachineState *machine)
>>>
>>>           /* Compute the fdt load address in dram */
>>>           fdt_load_addr = riscv_compute_fdt_addr(memmap[MICROCHIP_PFSOC_DRAM_LO].base,
>>> -                                               machine->ram_size, machine->fdt);
>>> +                                               memmap[MICROCHIP_PFSOC_DRAM_LO].size,
>>> +                                               machine);
>>>           riscv_load_fdt(fdt_load_addr, machine->fdt);
>>>
>>>           /* Load the reset vector */
>>> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
>>> index 6bbdbe5fb7..ad3bb35b34 100644
>>> --- a/hw/riscv/sifive_u.c
>>> +++ b/hw/riscv/sifive_u.c
>>> @@ -609,7 +609,8 @@ static void sifive_u_machine_init(MachineState *machine)
>>>       }
>>>
>>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[SIFIVE_U_DEV_DRAM].base,
>>> -                                           machine->ram_size, machine->fdt);
>>> +                                           memmap[SIFIVE_U_DEV_DRAM].size,
>>> +                                           machine);
>>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>>
>>>       if (!riscv_is_32bit(&s->soc.u_cpus)) {
>>> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
>>> index ceebe34c5f..b5979eddd6 100644
>>> --- a/hw/riscv/spike.c
>>> +++ b/hw/riscv/spike.c
>>> @@ -317,7 +317,8 @@ static void spike_board_init(MachineState *machine)
>>>       }
>>>
>>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[SPIKE_DRAM].base,
>>> -                                           machine->ram_size, machine->fdt);
>>> +                                           memmap[SPIKE_DRAM].size,
>>> +                                           machine);
>>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>>
>>>       /* load the reset vector */
>>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>>> index 43fca597f0..f079a30b60 100644
>>> --- a/hw/riscv/virt.c
>>> +++ b/hw/riscv/virt.c
>>> @@ -1293,7 +1293,8 @@ static void virt_machine_done(Notifier *notifier, void *data)
>>>       }
>>>
>>>       fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>>> -                                           machine->ram_size, machine->fdt);
>>> +                                           memmap[VIRT_DRAM].size,
>>> +                                           machine);
>>>       riscv_load_fdt(fdt_load_addr, machine->fdt);
>>>
>>>       /* load the reset vector */
>>> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
>>> index 7babd669c7..a6099c2dc6 100644
>>> --- a/include/hw/riscv/boot.h
>>> +++ b/include/hw/riscv/boot.h
>>> @@ -48,8 +48,8 @@ target_ulong riscv_load_kernel(MachineState *machine,
>>>                                  target_ulong firmware_end_addr,
>>>                                  bool load_initrd,
>>>                                  symbol_fn_t sym_cb);
>>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
>>> -                                void *fdt);
>>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
>>> +                              MachineState *ms);
>>>   void riscv_load_fdt(hwaddr fdt_addr, void *fdt);
>>>   void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts,
>>>                                  hwaddr saddr,
>>> -- 
>>
>> Regards,
>> Bin
Bin Meng Jan. 31, 2023, 1 a.m. UTC | #4
On Tue, Jan 31, 2023 at 1:16 AM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 1/29/23 02:45, Bin Meng wrote:
> > On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
> > <dbarboza@ventanamicro.com> wrote:
> >>
> >> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
> >> mem_size (which is defaulted to MachineState::ram_size in all boards)
> >> and the FDT pointer. And it makes a very important assumption: the DRAM
> >> interval dram_base + mem_size is contiguous. This is indeed the case for
> >> most boards that uses a FDT.
> >
> > s/uses/use
> >
> >>
> >> The Icicle Kit board works with 2 distinct RAM banks that are separated
> >> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
> >> 64GiB the high memory starts. MachineClass::default_ram_size for this
> >> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
> >> size, meaning that there we'll always have at least 512 MiB in the Hi
> >> RAM area.
> >>
> >> Using riscv_compute_fdt_addr() in this board is weird because not only
> >> the board has sparse RAM, and it's calling it using the base address of
> >> the Lo RAM area, but it's also using a mem_size that we have guarantees
> >> that it will go up to the Hi RAM. All the function assumptions doesn't
> >> work for this board.
> >>
> >> In fact, what makes the function works at all in this case is a
> >> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
> >> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
> >> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
> >> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
> >> the FDT under a 3Gb address, which happens to be exactly at the end of
> >> DRAM_LO. If the base address of the Lo area started later than 3Gb this
> >> function would be unusable by the board. Changing any assumptions inside
> >> riscv_compute_fdt_addr() can also break it by accident as well.
> >>
> >> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
> >> Icicle Kit board and for future boards that might have sparse RAM
> >> topologies to worry about:
> >>
> >> - relieve the condition that the dram_base + mem_size area is contiguous,
> >> since this is already not the case today;
> >>
> >> - receive an extra 'dram_size' size attribute that refers to a contiguous
> >> RAM block that the board wants the FDT to reside on.
> >>
> >> Together with 'mem_size' and 'fdt', which are now now being consumed by a
> >> MachineState pointer, we're able to make clear assumptions based on the
> >> DRAM block and total mem_size available to ensure that the FDT will be put
> >> in a valid RAM address.
> >>
> >
> > Well written commit message. Thanks!
> >
> >> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> >> ---
> >>   hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
> >>   hw/riscv/microchip_pfsoc.c |  3 ++-
> >>   hw/riscv/sifive_u.c        |  3 ++-
> >>   hw/riscv/spike.c           |  3 ++-
> >>   hw/riscv/virt.c            |  3 ++-
> >>   include/hw/riscv/boot.h    |  4 ++--
> >>   6 files changed, 36 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> >> index a6f7b8ae8e..8f4991480b 100644
> >> --- a/hw/riscv/boot.c
> >> +++ b/hw/riscv/boot.c
> >> @@ -284,33 +284,47 @@ out:
> >>   }
> >>
> >>   /*
> >> - * The FDT should be put at the farthest point possible to
> >> - * avoid overwriting it with the kernel/initrd.
> >> + * This function makes an assumption that the DRAM interval
> >> + * 'dram_base' + 'dram_size' is contiguous.
> >>    *
> >> - * This function makes an assumption that the DRAM is
> >> - * contiguous. It also cares about 32-bit systems and
> >> - * will limit fdt_addr to be addressable by them even for
> >> - * 64-bit CPUs.
> >> + * Considering that 'dram_end' is the lowest value between
> >> + * the end of the DRAM block and MachineState->ram_size, the
> >> + * FDT location will vary according to 'dram_base':
> >> + *
> >> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
> >> + * put at the lowest value between 3072 MiB and 'dram_end';
> >> + *
> >> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
> >> + * put at 'dram_end'.
> >>    *
> >>    * The FDT is fdt_packed() during the calculation.
> >>    */
> >> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
> >> -                                void *fdt)
> >> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
> >
> > Using hwaddr to represent a size looks weird. Although technically
> > they are the same ... I would leave this as it is.
>
> I'll leave it as it was back in patch 2 (uint64_t).
>
> >
> >> +                              MachineState *ms)
> >>   {
> >> -    uint64_t temp;
> >> -    hwaddr dram_end = dram_base + mem_size;
> >> -    int ret = fdt_pack(fdt);
> >> +    int ret = fdt_pack(ms->fdt);
> >> +    hwaddr dram_end, temp;
> >>       int fdtsize;
> >>
> >>       /* Should only fail if we've built a corrupted tree */
> >>       g_assert(ret == 0);
> >>
> >> -    fdtsize = fdt_totalsize(fdt);
> >> +    fdtsize = fdt_totalsize(ms->fdt);
> >>       if (fdtsize <= 0) {
> >>           error_report("invalid device-tree");
> >>           exit(1);
> >>       }
> >>
> >> +    /*
> >> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
> >> +     * means that the DRAM block goes all the way to ms->ram_size.
> >> +     */
> >> +    if (dram_size == 0x0) {
> >> +        dram_end = dram_base + ms->ram_size;
> >> +    } else {
> >> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
> >> +    }
> >
> > How about:
> >
> > g_assert(dram_size < ms->ram_size);
>
> I don't believe that dram_size > ms->ram_size should be an error. A board can
> have a declared MemMapEntry.size that is larger than its current setting of
> ms->ram_size.

What use case is that? This updated function now has the assumption that:

1. dram_size being 0 meaning contiguous system RAM region from dram_base
2. otherwise dram_size being the *first* contiguous system RAM region
from dram_base

We can use g_assert(dram_size < ms->ram_size) to catch either case.

>
>
> > dram_end = dram_base + (dram_size ? dram_size : ms->ram_size);
>
> I can change the if/else statement up there for a ternary:
>
> dram_end = dram_base + (dram_size ? ms->ram_size : MIN(ms->ram_size, dram_size))
>

Regards,
Bin
Daniel Henrique Barboza Jan. 31, 2023, 9:57 a.m. UTC | #5
On 1/30/23 22:00, Bin Meng wrote:
> On Tue, Jan 31, 2023 at 1:16 AM Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
>>
>>
>>
>> On 1/29/23 02:45, Bin Meng wrote:
>>> On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
>>> <dbarboza@ventanamicro.com> wrote:
>>>>
>>>> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
>>>> mem_size (which is defaulted to MachineState::ram_size in all boards)
>>>> and the FDT pointer. And it makes a very important assumption: the DRAM
>>>> interval dram_base + mem_size is contiguous. This is indeed the case for
>>>> most boards that uses a FDT.
>>>
>>> s/uses/use
>>>
>>>>
>>>> The Icicle Kit board works with 2 distinct RAM banks that are separated
>>>> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
>>>> 64GiB the high memory starts. MachineClass::default_ram_size for this
>>>> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
>>>> size, meaning that there we'll always have at least 512 MiB in the Hi
>>>> RAM area.
>>>>
>>>> Using riscv_compute_fdt_addr() in this board is weird because not only
>>>> the board has sparse RAM, and it's calling it using the base address of
>>>> the Lo RAM area, but it's also using a mem_size that we have guarantees
>>>> that it will go up to the Hi RAM. All the function assumptions doesn't
>>>> work for this board.
>>>>
>>>> In fact, what makes the function works at all in this case is a
>>>> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
>>>> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
>>>> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
>>>> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
>>>> the FDT under a 3Gb address, which happens to be exactly at the end of
>>>> DRAM_LO. If the base address of the Lo area started later than 3Gb this
>>>> function would be unusable by the board. Changing any assumptions inside
>>>> riscv_compute_fdt_addr() can also break it by accident as well.
>>>>
>>>> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
>>>> Icicle Kit board and for future boards that might have sparse RAM
>>>> topologies to worry about:
>>>>
>>>> - relieve the condition that the dram_base + mem_size area is contiguous,
>>>> since this is already not the case today;
>>>>
>>>> - receive an extra 'dram_size' size attribute that refers to a contiguous
>>>> RAM block that the board wants the FDT to reside on.
>>>>
>>>> Together with 'mem_size' and 'fdt', which are now now being consumed by a
>>>> MachineState pointer, we're able to make clear assumptions based on the
>>>> DRAM block and total mem_size available to ensure that the FDT will be put
>>>> in a valid RAM address.
>>>>
>>>
>>> Well written commit message. Thanks!
>>>
>>>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
>>>> ---
>>>>    hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
>>>>    hw/riscv/microchip_pfsoc.c |  3 ++-
>>>>    hw/riscv/sifive_u.c        |  3 ++-
>>>>    hw/riscv/spike.c           |  3 ++-
>>>>    hw/riscv/virt.c            |  3 ++-
>>>>    include/hw/riscv/boot.h    |  4 ++--
>>>>    6 files changed, 36 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
>>>> index a6f7b8ae8e..8f4991480b 100644
>>>> --- a/hw/riscv/boot.c
>>>> +++ b/hw/riscv/boot.c
>>>> @@ -284,33 +284,47 @@ out:
>>>>    }
>>>>
>>>>    /*
>>>> - * The FDT should be put at the farthest point possible to
>>>> - * avoid overwriting it with the kernel/initrd.
>>>> + * This function makes an assumption that the DRAM interval
>>>> + * 'dram_base' + 'dram_size' is contiguous.
>>>>     *
>>>> - * This function makes an assumption that the DRAM is
>>>> - * contiguous. It also cares about 32-bit systems and
>>>> - * will limit fdt_addr to be addressable by them even for
>>>> - * 64-bit CPUs.
>>>> + * Considering that 'dram_end' is the lowest value between
>>>> + * the end of the DRAM block and MachineState->ram_size, the
>>>> + * FDT location will vary according to 'dram_base':
>>>> + *
>>>> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
>>>> + * put at the lowest value between 3072 MiB and 'dram_end';
>>>> + *
>>>> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
>>>> + * put at 'dram_end'.
>>>>     *
>>>>     * The FDT is fdt_packed() during the calculation.
>>>>     */
>>>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
>>>> -                                void *fdt)
>>>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
>>>
>>> Using hwaddr to represent a size looks weird. Although technically
>>> they are the same ... I would leave this as it is.
>>
>> I'll leave it as it was back in patch 2 (uint64_t).
>>
>>>
>>>> +                              MachineState *ms)
>>>>    {
>>>> -    uint64_t temp;
>>>> -    hwaddr dram_end = dram_base + mem_size;
>>>> -    int ret = fdt_pack(fdt);
>>>> +    int ret = fdt_pack(ms->fdt);
>>>> +    hwaddr dram_end, temp;
>>>>        int fdtsize;
>>>>
>>>>        /* Should only fail if we've built a corrupted tree */
>>>>        g_assert(ret == 0);
>>>>
>>>> -    fdtsize = fdt_totalsize(fdt);
>>>> +    fdtsize = fdt_totalsize(ms->fdt);
>>>>        if (fdtsize <= 0) {
>>>>            error_report("invalid device-tree");
>>>>            exit(1);
>>>>        }
>>>>
>>>> +    /*
>>>> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
>>>> +     * means that the DRAM block goes all the way to ms->ram_size.
>>>> +     */
>>>> +    if (dram_size == 0x0) {
>>>> +        dram_end = dram_base + ms->ram_size;
>>>> +    } else {
>>>> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
>>>> +    }
>>>
>>> How about:
>>>
>>> g_assert(dram_size < ms->ram_size);
>>
>> I don't believe that dram_size > ms->ram_size should be an error. A board can
>> have a declared MemMapEntry.size that is larger than its current setting of
>> ms->ram_size.
> 
> What use case is that? This updated function now has the assumption that:
> 
> 1. dram_size being 0 meaning contiguous system RAM region from dram_base
> 2. otherwise dram_size being the *first* contiguous system RAM region
> from dram_base

Yes, but that doesn't mean that dram_size is necessarily smaller than ms->ram_size.

We don't have any board where this happens today but let's pretend that the Icicle
Kit board didn't have the 1.5Gb minimal RAM restriction. Its first region has
dram_size 1Gb:

[MICROCHIP_PFSOC_DRAM_LO] =         { 0x80000000,   0x40000000 },

So, if I start the board with 512M, ms->ram_size = 512M and this assert will trigger
because dram_size = 1Gb.


Thanks,


Daniel


> 
> We can use g_assert(dram_size < ms->ram_size) to catch either case.
> 
>>
>>
>>> dram_end = dram_base + (dram_size ? dram_size : ms->ram_size);
>>
>> I can change the if/else statement up there for a ternary:
>>
>> dram_end = dram_base + (dram_size ? ms->ram_size : MIN(ms->ram_size, dram_size))
>>
> 
> Regards,
> Bin
Bin Meng Jan. 31, 2023, 12:11 p.m. UTC | #6
On Tue, Jan 31, 2023 at 5:57 PM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 1/30/23 22:00, Bin Meng wrote:
> > On Tue, Jan 31, 2023 at 1:16 AM Daniel Henrique Barboza
> > <dbarboza@ventanamicro.com> wrote:
> >>
> >>
> >>
> >> On 1/29/23 02:45, Bin Meng wrote:
> >>> On Thu, Jan 26, 2023 at 9:54 PM Daniel Henrique Barboza
> >>> <dbarboza@ventanamicro.com> wrote:
> >>>>
> >>>> As it is now, riscv_compute_fdt_addr() is receiving a dram_base, a
> >>>> mem_size (which is defaulted to MachineState::ram_size in all boards)
> >>>> and the FDT pointer. And it makes a very important assumption: the DRAM
> >>>> interval dram_base + mem_size is contiguous. This is indeed the case for
> >>>> most boards that uses a FDT.
> >>>
> >>> s/uses/use
> >>>
> >>>>
> >>>> The Icicle Kit board works with 2 distinct RAM banks that are separated
> >>>> by a gap. We have a lower bank with 1GiB size, a gap follows, then at
> >>>> 64GiB the high memory starts. MachineClass::default_ram_size for this
> >>>> board is set to 1.5Gb, and machine_init() is enforcing it as minimal RAM
> >>>> size, meaning that there we'll always have at least 512 MiB in the Hi
> >>>> RAM area.
> >>>>
> >>>> Using riscv_compute_fdt_addr() in this board is weird because not only
> >>>> the board has sparse RAM, and it's calling it using the base address of
> >>>> the Lo RAM area, but it's also using a mem_size that we have guarantees
> >>>> that it will go up to the Hi RAM. All the function assumptions doesn't
> >>>> work for this board.
> >>>>
> >>>> In fact, what makes the function works at all in this case is a
> >>>> coincidence.  Commit 1a475d39ef54 introduced a 3GB boundary for the FDT,
> >>>> down from 4Gb, that is enforced if dram_base is lower than 3072 MiB. For
> >>>> the Icicle Kit board, memmap[MICROCHIP_PFSOC_DRAM_LO].base is 0x80000000
> >>>> (2 Gb) and it has a 1Gb size, so it will fall in the conditions to put
> >>>> the FDT under a 3Gb address, which happens to be exactly at the end of
> >>>> DRAM_LO. If the base address of the Lo area started later than 3Gb this
> >>>> function would be unusable by the board. Changing any assumptions inside
> >>>> riscv_compute_fdt_addr() can also break it by accident as well.
> >>>>
> >>>> Let's change riscv_compute_fdt_addr() semantics to be appropriate to the
> >>>> Icicle Kit board and for future boards that might have sparse RAM
> >>>> topologies to worry about:
> >>>>
> >>>> - relieve the condition that the dram_base + mem_size area is contiguous,
> >>>> since this is already not the case today;
> >>>>
> >>>> - receive an extra 'dram_size' size attribute that refers to a contiguous
> >>>> RAM block that the board wants the FDT to reside on.
> >>>>
> >>>> Together with 'mem_size' and 'fdt', which are now now being consumed by a
> >>>> MachineState pointer, we're able to make clear assumptions based on the
> >>>> DRAM block and total mem_size available to ensure that the FDT will be put
> >>>> in a valid RAM address.
> >>>>
> >>>
> >>> Well written commit message. Thanks!
> >>>
> >>>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> >>>> ---
> >>>>    hw/riscv/boot.c            | 38 ++++++++++++++++++++++++++------------
> >>>>    hw/riscv/microchip_pfsoc.c |  3 ++-
> >>>>    hw/riscv/sifive_u.c        |  3 ++-
> >>>>    hw/riscv/spike.c           |  3 ++-
> >>>>    hw/riscv/virt.c            |  3 ++-
> >>>>    include/hw/riscv/boot.h    |  4 ++--
> >>>>    6 files changed, 36 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> >>>> index a6f7b8ae8e..8f4991480b 100644
> >>>> --- a/hw/riscv/boot.c
> >>>> +++ b/hw/riscv/boot.c
> >>>> @@ -284,33 +284,47 @@ out:
> >>>>    }
> >>>>
> >>>>    /*
> >>>> - * The FDT should be put at the farthest point possible to
> >>>> - * avoid overwriting it with the kernel/initrd.
> >>>> + * This function makes an assumption that the DRAM interval
> >>>> + * 'dram_base' + 'dram_size' is contiguous.
> >>>>     *
> >>>> - * This function makes an assumption that the DRAM is
> >>>> - * contiguous. It also cares about 32-bit systems and
> >>>> - * will limit fdt_addr to be addressable by them even for
> >>>> - * 64-bit CPUs.
> >>>> + * Considering that 'dram_end' is the lowest value between
> >>>> + * the end of the DRAM block and MachineState->ram_size, the
> >>>> + * FDT location will vary according to 'dram_base':
> >>>> + *
> >>>> + * - if 'dram_base' is less that 3072 MiB, the FDT will be
> >>>> + * put at the lowest value between 3072 MiB and 'dram_end';
> >>>> + *
> >>>> + * - if 'dram_base' is higher than 3072 MiB, the FDT will be
> >>>> + * put at 'dram_end'.
> >>>>     *
> >>>>     * The FDT is fdt_packed() during the calculation.
> >>>>     */
> >>>> -uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
> >>>> -                                void *fdt)
> >>>> +hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
> >>>
> >>> Using hwaddr to represent a size looks weird. Although technically
> >>> they are the same ... I would leave this as it is.
> >>
> >> I'll leave it as it was back in patch 2 (uint64_t).
> >>
> >>>
> >>>> +                              MachineState *ms)
> >>>>    {
> >>>> -    uint64_t temp;
> >>>> -    hwaddr dram_end = dram_base + mem_size;
> >>>> -    int ret = fdt_pack(fdt);
> >>>> +    int ret = fdt_pack(ms->fdt);
> >>>> +    hwaddr dram_end, temp;
> >>>>        int fdtsize;
> >>>>
> >>>>        /* Should only fail if we've built a corrupted tree */
> >>>>        g_assert(ret == 0);
> >>>>
> >>>> -    fdtsize = fdt_totalsize(fdt);
> >>>> +    fdtsize = fdt_totalsize(ms->fdt);
> >>>>        if (fdtsize <= 0) {
> >>>>            error_report("invalid device-tree");
> >>>>            exit(1);
> >>>>        }
> >>>>
> >>>> +    /*
> >>>> +     * A dram_size == 0, usually from a MemMapEntry[].size element,
> >>>> +     * means that the DRAM block goes all the way to ms->ram_size.
> >>>> +     */
> >>>> +    if (dram_size == 0x0) {
> >>>> +        dram_end = dram_base + ms->ram_size;
> >>>> +    } else {
> >>>> +        dram_end = dram_base + MIN(ms->ram_size, dram_size);
> >>>> +    }
> >>>
> >>> How about:
> >>>
> >>> g_assert(dram_size < ms->ram_size);
> >>
> >> I don't believe that dram_size > ms->ram_size should be an error. A board can
> >> have a declared MemMapEntry.size that is larger than its current setting of
> >> ms->ram_size.
> >
> > What use case is that? This updated function now has the assumption that:
> >
> > 1. dram_size being 0 meaning contiguous system RAM region from dram_base
> > 2. otherwise dram_size being the *first* contiguous system RAM region
> > from dram_base
>
> Yes, but that doesn't mean that dram_size is necessarily smaller than ms->ram_size.
>
> We don't have any board where this happens today but let's pretend that the Icicle
> Kit board didn't have the 1.5Gb minimal RAM restriction. Its first region has
> dram_size 1Gb:
>
> [MICROCHIP_PFSOC_DRAM_LO] =         { 0x80000000,   0x40000000 },
>
> So, if I start the board with 512M, ms->ram_size = 512M and this assert will trigger
> because dram_size = 1Gb.
>

Ah, I see. So for a machine whose memory size can be dynamically
configured, yes that makes sense.

Regards,
Bin
diff mbox series

Patch

diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index a6f7b8ae8e..8f4991480b 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -284,33 +284,47 @@  out:
 }
 
 /*
- * The FDT should be put at the farthest point possible to
- * avoid overwriting it with the kernel/initrd.
+ * This function makes an assumption that the DRAM interval
+ * 'dram_base' + 'dram_size' is contiguous.
  *
- * This function makes an assumption that the DRAM is
- * contiguous. It also cares about 32-bit systems and
- * will limit fdt_addr to be addressable by them even for
- * 64-bit CPUs.
+ * Considering that 'dram_end' is the lowest value between
+ * the end of the DRAM block and MachineState->ram_size, the
+ * FDT location will vary according to 'dram_base':
+ *
+ * - if 'dram_base' is less that 3072 MiB, the FDT will be
+ * put at the lowest value between 3072 MiB and 'dram_end';
+ *
+ * - if 'dram_base' is higher than 3072 MiB, the FDT will be
+ * put at 'dram_end'.
  *
  * The FDT is fdt_packed() during the calculation.
  */
-uint32_t riscv_compute_fdt_addr(hwaddr dram_base, uint64_t mem_size,
-                                void *fdt)
+hwaddr riscv_compute_fdt_addr(hwaddr dram_base, hwaddr dram_size,
+                              MachineState *ms)
 {
-    uint64_t temp;
-    hwaddr dram_end = dram_base + mem_size;
-    int ret = fdt_pack(fdt);
+    int ret = fdt_pack(ms->fdt);
+    hwaddr dram_end, temp;
     int fdtsize;
 
     /* Should only fail if we've built a corrupted tree */
     g_assert(ret == 0);
 
-    fdtsize = fdt_totalsize(fdt);
+    fdtsize = fdt_totalsize(ms->fdt);
     if (fdtsize <= 0) {
         error_report("invalid device-tree");
         exit(1);
     }
 
+    /*
+     * A dram_size == 0, usually from a MemMapEntry[].size element,
+     * means that the DRAM block goes all the way to ms->ram_size.
+     */
+    if (dram_size == 0x0) {
+        dram_end = dram_base + ms->ram_size;
+    } else {
+        dram_end = dram_base + MIN(ms->ram_size, dram_size);
+    }
+
     /*
      * We should put fdt as far as possible to avoid kernel/initrd overwriting
      * its content. But it should be addressable by 32 bit system as well.
diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
index a30203db85..e81bbd12df 100644
--- a/hw/riscv/microchip_pfsoc.c
+++ b/hw/riscv/microchip_pfsoc.c
@@ -634,7 +634,8 @@  static void microchip_icicle_kit_machine_init(MachineState *machine)
 
         /* Compute the fdt load address in dram */
         fdt_load_addr = riscv_compute_fdt_addr(memmap[MICROCHIP_PFSOC_DRAM_LO].base,
-                                               machine->ram_size, machine->fdt);
+                                               memmap[MICROCHIP_PFSOC_DRAM_LO].size,
+                                               machine);
         riscv_load_fdt(fdt_load_addr, machine->fdt);
 
         /* Load the reset vector */
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 6bbdbe5fb7..ad3bb35b34 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -609,7 +609,8 @@  static void sifive_u_machine_init(MachineState *machine)
     }
 
     fdt_load_addr = riscv_compute_fdt_addr(memmap[SIFIVE_U_DEV_DRAM].base,
-                                           machine->ram_size, machine->fdt);
+                                           memmap[SIFIVE_U_DEV_DRAM].size,
+                                           machine);
     riscv_load_fdt(fdt_load_addr, machine->fdt);
 
     if (!riscv_is_32bit(&s->soc.u_cpus)) {
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index ceebe34c5f..b5979eddd6 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -317,7 +317,8 @@  static void spike_board_init(MachineState *machine)
     }
 
     fdt_load_addr = riscv_compute_fdt_addr(memmap[SPIKE_DRAM].base,
-                                           machine->ram_size, machine->fdt);
+                                           memmap[SPIKE_DRAM].size,
+                                           machine);
     riscv_load_fdt(fdt_load_addr, machine->fdt);
 
     /* load the reset vector */
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 43fca597f0..f079a30b60 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1293,7 +1293,8 @@  static void virt_machine_done(Notifier *notifier, void *data)
     }
 
     fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
-                                           machine->ram_size, machine->fdt);
+                                           memmap[VIRT_DRAM].size,
+                                           machine);
     riscv_load_fdt(fdt_load_addr, machine->fdt);
 
     /* load the reset vector */
diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
index 7babd669c7..a6099c2dc6 100644
--- a/include/hw/riscv/boot.h
+++ b/include/hw/riscv/boot.h
@@ -48,8 +48,8 @@  target_ulong riscv_load_kernel(MachineState *machine,
                                target_ulong firmware_end_addr,
                                bool load_initrd,
                                symbol_fn_t sym_cb);
-uint32_t riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
-                                void *fdt);
+hwaddr riscv_compute_fdt_addr(hwaddr dram_start, uint64_t dram_size,
+                              MachineState *ms);
 void riscv_load_fdt(hwaddr fdt_addr, void *fdt);
 void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts,
                                hwaddr saddr,