diff mbox series

hw/riscv: virt: pass random seed to fdt

Message ID 20220613115810.178210-1-Jason@zx2c4.com (mailing list archive)
State New, archived
Headers show
Series hw/riscv: virt: pass random seed to fdt | expand

Commit Message

Jason A. Donenfeld June 13, 2022, 11:58 a.m. UTC
If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to
initialize early. Set this using the usual guest random number
generation function. This is confirmed to successfully initialize the
RNG on Linux 5.19-rc2.

Cc: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 hw/riscv/virt.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Bin Meng June 15, 2022, 4:05 a.m. UTC | #1
On Mon, Jun 13, 2022 at 8:08 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to
> initialize early. Set this using the usual guest random number
> generation function. This is confirmed to successfully initialize the
> RNG on Linux 5.19-rc2.
>
> Cc: Alistair Francis <alistair.francis@wdc.com>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  hw/riscv/virt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index bc424dd2f5..368a723bf6 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -21,6 +21,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
>  #include "qemu/error-report.h"
> +#include "qemu/guest-random.h"
>  #include "qapi/error.h"
>  #include "hw/boards.h"
>  #include "hw/loader.h"
> @@ -998,6 +999,7 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
>      MachineState *mc = MACHINE(s);
>      uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
>      uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
> +    uint8_t rng_seed[32];
>
>      if (mc->dtb) {
>          mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
> @@ -1046,6 +1048,10 @@ update_bootargs:
>      if (cmdline && *cmdline) {
>          qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
>      }
> +
> +    /* Pass seed to RNG. */

nits: please remove the ending period

> +    qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
> +    qemu_fdt_setprop(mc->fdt, "/chosen", "rng-seed", rng_seed, sizeof(rng_seed));
>  }
>
>  static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
> --

Otherwise,
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Alistair Francis June 16, 2022, 2:32 a.m. UTC | #2
On Wed, Jun 15, 2022 at 2:07 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Mon, Jun 13, 2022 at 8:08 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to
> > initialize early. Set this using the usual guest random number
> > generation function. This is confirmed to successfully initialize the
> > RNG on Linux 5.19-rc2.
> >
> > Cc: Alistair Francis <alistair.francis@wdc.com>
> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Thanks!

Applied to riscv-to-apply.next with the full stop removed

Alistair

> > ---
> >  hw/riscv/virt.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> > index bc424dd2f5..368a723bf6 100644
> > --- a/hw/riscv/virt.c
> > +++ b/hw/riscv/virt.c
> > @@ -21,6 +21,7 @@
> >  #include "qemu/osdep.h"
> >  #include "qemu/units.h"
> >  #include "qemu/error-report.h"
> > +#include "qemu/guest-random.h"
> >  #include "qapi/error.h"
> >  #include "hw/boards.h"
> >  #include "hw/loader.h"
> > @@ -998,6 +999,7 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
> >      MachineState *mc = MACHINE(s);
> >      uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
> >      uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
> > +    uint8_t rng_seed[32];
> >
> >      if (mc->dtb) {
> >          mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
> > @@ -1046,6 +1048,10 @@ update_bootargs:
> >      if (cmdline && *cmdline) {
> >          qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
> >      }
> > +
> > +    /* Pass seed to RNG. */
>
> nits: please remove the ending period
>
> > +    qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
> > +    qemu_fdt_setprop(mc->fdt, "/chosen", "rng-seed", rng_seed, sizeof(rng_seed));
> >  }
> >
> >  static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
> > --
>
> Otherwise,
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
>
Jason A. Donenfeld June 16, 2022, 10:01 a.m. UTC | #3
Hi Alistair,

On Thu, Jun 16, 2022 at 12:32:36PM +1000, Alistair Francis wrote:
> Applied to riscv-to-apply.next with the full stop removed

Great, thanks. Just wondering: am I looking in the right repo? I don't
see it here: https://github.com/alistair23/qemu/commits/riscv-to-apply.next

Jason
Alistair Francis June 16, 2022, 12:17 p.m. UTC | #4
On Thu, Jun 16, 2022 at 8:01 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Alistair,
>
> On Thu, Jun 16, 2022 at 12:32:36PM +1000, Alistair Francis wrote:
> > Applied to riscv-to-apply.next with the full stop removed
>
> Great, thanks. Just wondering: am I looking in the right repo? I don't
> see it here: https://github.com/alistair23/qemu/commits/riscv-to-apply.next

That's the right repo, I just have to push the latest updates. You
should see it there tomorrow

Alistair

>
> Jason
Alistair Francis June 29, 2022, 2:09 a.m. UTC | #5
On Mon, Jun 13, 2022 at 10:10 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> If the FDT contains /chosen/rng-seed, then the Linux RNG will use it to
> initialize early. Set this using the usual guest random number
> generation function. This is confirmed to successfully initialize the
> RNG on Linux 5.19-rc2.

I have a Linux 5.8 test case that is failing due to this patch.

The command line is:

qemu-system-riscv64 \
-machine virt -m 64M \
-cpu rv64,mmu=false \
-serial mon:stdio -serial null -nographic \
-append "root=/dev/vda rw highres=off  console=ttyS0 mem=1G ip=dhcp
earlycon=sbi" \
-device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 \
-netdev user,id=net0 \
-object rng-random,filename=/dev/urandom,id=rng0 \
-device virtio-rng-device,rng=rng0 \
-smp 1  \
-d guest_errors \
-kernel ./images/qemuriscv64/nommu-Image \
-drive id=disk0,file=./images/qemuriscv64/nommu-rootfs.ext2,if=none,format=raw \
-device virtio -blk-device,drive=disk0 \
-bios none

The working log (before this commit) is:

[    0.000000] Linux version 5.8.0 (alistair@risc6-mainframe)
(riscv64-elf-gcc (Arch Linux Repositories) 10.1.0, GNU ld (GNU
Binutils) 2.34) #2 SMP Wed Sep
30 12:02:11 PDT 2020
[    0.000000] earlycon: uart8250 at MMIO 0x0000000010000000 (options
'115200n8')
[    0.000000] printk: bootconsole [uart8250] enabled
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000080000000-0x0000000083ffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x0000000083ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x0000000083ffffff]
[    0.000000] riscv: ISA extensions abcdefhimnrs
[    0.000000] riscv: ELF capabilities acdfim
[    0.000000] percpu: max_distance=0xc000 too large for vmalloc space 0x0
[    0.000000] percpu: Embedded 12 pages/cpu s18528 r0 d30624 u49152
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 16160
[    0.000000] Kernel command line: root=/dev/vda rw
earlycon=uart8250,mmio,0x10000000,115200n8 console=ttyS0
[    0.000000] Dentry cache hash table entries: 8192 (order: 4, 65536
bytes, linear)
[    0.000000] Inode-cache hash table entries: 4096 (order: 3, 32768
bytes, linear)
[    0.000000] Sorting __ex_table...
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 62472K/65536K available (1369K kernel code,
144K rwdata, 238K rodata, 106K init, 134K bss, 3064K reserved, 0K
cma-reserved)
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] riscv-intc: 64 local interrupts mapped
sifive_plic_read: Invalid register read 0x200c
sifive_plic_write: Invalid enable write 0x200c
[    0.000000] plic: plic@c000000: mapped 96 interrupts with 1
handlers for 2 contexts.
[    0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [0]
[    0.000000] clocksource: riscv_clocksource: mask:
0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120
ns
[    0.000106] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
every 4398046511100ns
[    0.002649] Console: colour dummy device 80x25
[    0.003599] Calibrating delay loop (skipped), value calculated
using timer frequency.. 20.00 BogoMIPS (lpj=40000)
[    0.003960] pid_max: default: 4096 minimum: 301
[    0.004718] Mount-cache hash table entries: 512 (order: 0, 4096
bytes, linear)
[    0.004922] Mountpoint-cache hash table entries: 512 (order: 0,
4096 bytes, linear)
[    0.022781] rcu: Hierarchical SRCU implementation.
[    0.024374] smp: Bringing up secondary CPUs ...
[    0.024583] smp: Brought up 1 node, 1 CPU
[    0.030450] devtmpfs: initialized
[    0.035183] clocksource: jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.035600] futex hash table entries: 16 (order: -2, 1024 bytes, linear)
[    0.055175] clocksource: Switched to clocksource riscv_clocksource
[    0.073226] workingset: timestamp_bits=62 max_order=14 bucket_order=0
[    0.078326] Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
[    0.082509] printk: console [ttyS0] disabled
[    0.083408] 10000000.uart: ttyS0 at MMIO 0x10000000 (irq = 2,
base_baud = 230400) is a 16550A
[    0.084805] printk: console [ttyS0] enabled
[    0.084805] printk: console [ttyS0] enabled
[    0.085242] printk: bootconsole [uart8250] disabled
[    0.085242] printk: bootconsole [uart8250] disabled
virtio_mmio_write: attempt to write guest features with
guest_features_sel > 0 in legacy mode
[    0.095810] virtio_blk virtio2: [vda] 122880 512-byte logical
blocks (62.9 MB/60.0 MiB)
[    0.096155] vda: detected capacity change from 0 to 62914560
[    0.099882] random: get_random_bytes called from 0x0000000080020c8c
with crng_init=0
[    0.120690] EXT4-fs (vda): warning: mounting unchecked fs, running
e2fsck is recommended
[    0.160540] EXT4-fs (vda): mounted filesystem without journal. Opts: (null)
[    0.160910] VFS: Mounted root (ext4 filesystem) on device 254:0.
[    0.162645] devtmpfs: mounted
[    0.171902] Freeing unused kernel memory: 104K
[    0.172061] This architecture does not have kernel memory protection.
[    0.172387] Run /sbin/init as init process
[    0.174104] Run /etc/init as init process
[    0.174534] Run /bin/init as init process
[    0.174964] Run /bin/sh as init process


BusyBox v1.32.0 (2020-09-24 13:17:53 PDT) hush - the humble shell
Enter 'help' for a list of built-in commands.

After applying this commit all I see is:

Invalid read at addr 0x0, size 8, region '(null)', reason: rejected

It looks like the rng-seed property is causing failures on older kernels.

Alistair

>
> Cc: Alistair Francis <alistair.francis@wdc.com>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  hw/riscv/virt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index bc424dd2f5..368a723bf6 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -21,6 +21,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
>  #include "qemu/error-report.h"
> +#include "qemu/guest-random.h"
>  #include "qapi/error.h"
>  #include "hw/boards.h"
>  #include "hw/loader.h"
> @@ -998,6 +999,7 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
>      MachineState *mc = MACHINE(s);
>      uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
>      uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
> +    uint8_t rng_seed[32];
>
>      if (mc->dtb) {
>          mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
> @@ -1046,6 +1048,10 @@ update_bootargs:
>      if (cmdline && *cmdline) {
>          qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
>      }
> +
> +    /* Pass seed to RNG. */
> +    qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
> +    qemu_fdt_setprop(mc->fdt, "/chosen", "rng-seed", rng_seed, sizeof(rng_seed));
>  }
>
>  static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
> --
> 2.35.1
>
>
Jason A. Donenfeld July 5, 2022, 1:09 a.m. UTC | #6
Hi Alistair,

On Wed, Jun 29, 2022 at 4:09 AM Alistair Francis <alistair23@gmail.com> wrote:
> I have a Linux 5.8 test case that is failing due to this patch.

Before I started fixing things in random.c, there were a lot of early
boot bugs with the RNG in Linux. I backported the fixes for these to
all stable kernels. It's a bummer that risc-v got hit by these bugs,
but I think that's just the way things go unfortunately.

Jason
Jason A. Donenfeld July 7, 2022, 1:04 a.m. UTC | #7
Hey Alistair,

On Tue, Jul 05, 2022 at 03:09:09AM +0200, Jason A. Donenfeld wrote:
> Hi Alistair,
> 
> On Wed, Jun 29, 2022 at 4:09 AM Alistair Francis <alistair23@gmail.com> wrote:
> > I have a Linux 5.8 test case that is failing due to this patch.
> 
> Before I started fixing things in random.c, there were a lot of early
> boot bugs with the RNG in Linux. I backported the fixes for these to
> all stable kernels. It's a bummer that risc-v got hit by these bugs,
> but I think that's just the way things go unfortunately.
> 
> Jason
> 

By the way, I still can't find this in your github tree. I was hoping we
could get this in for 7.1.

As for your 5.8 issue, I've been trying to reproduce that to understand
more about it, but I'm unable to. I've been trying with
nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
possible in testing this out you were testing the wrong branch? Anyway,
it'd be nice to get this queued up...

Jason
Alistair Francis July 8, 2022, 7:59 a.m. UTC | #8
On Thu, Jul 7, 2022 at 11:04 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hey Alistair,
>
> On Tue, Jul 05, 2022 at 03:09:09AM +0200, Jason A. Donenfeld wrote:
> > Hi Alistair,
> >
> > On Wed, Jun 29, 2022 at 4:09 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > I have a Linux 5.8 test case that is failing due to this patch.
> >
> > Before I started fixing things in random.c, there were a lot of early
> > boot bugs with the RNG in Linux. I backported the fixes for these to
> > all stable kernels. It's a bummer that risc-v got hit by these bugs,
> > but I think that's just the way things go unfortunately.

Hmm... That's a pain. So there is a bug in older kernels where they
won't boot if we specify this?

Can you point to the fixes?

> >
> > Jason
> >
>
> By the way, I still can't find this in your github tree. I was hoping we
> could get this in for 7.1.

Yeah, it's hard to accept when it will break users. I would rather
avoid someone upgrading to QEMU 7.1 and the kernel failing to boot
with no information.

>
> As for your 5.8 issue, I've been trying to reproduce that to understand
> more about it, but I'm unable to. I've been trying with
> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
> possible in testing this out you were testing the wrong branch? Anyway,
> it'd be nice to get this queued up...

Hmm... you can't reproduce it?

Alistair

>
> Jason
Jason A. Donenfeld July 8, 2022, 9:56 a.m. UTC | #9
Hi Alistair,

On 7/8/22, Alistair Francis <alistair23@gmail.com> wrote:

>> > but I think that's just the way things go unfortunately.
>
> Hmm... That's a pain. So there is a bug in older kernels where they
> won't boot if we specify this?
>
> Can you point to the fixes?

Actually, in trying to reproduce this, I don't actually think this is
affected by those old random.c bugs.


>> As for your 5.8 issue, I've been trying to reproduce that to understand
>> more about it, but I'm unable to. I've been trying with
>> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
>> possible in testing this out you were testing the wrong branch? Anyway,
>> it'd be nice to get this queued up...
>
> Hmm... you can't reproduce it?

No, I can't, and I'm now no longer convinced that there *is* a bug.
Can you try to repro again and send me detailed reproduction steps?

Thanks,
Jason
Alistair Francis July 11, 2022, 12:25 a.m. UTC | #10
On Fri, Jul 8, 2022 at 7:56 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Alistair,
>
> On 7/8/22, Alistair Francis <alistair23@gmail.com> wrote:
>
> >> > but I think that's just the way things go unfortunately.
> >
> > Hmm... That's a pain. So there is a bug in older kernels where they
> > won't boot if we specify this?
> >
> > Can you point to the fixes?
>
> Actually, in trying to reproduce this, I don't actually think this is
> affected by those old random.c bugs.
>
>
> >> As for your 5.8 issue, I've been trying to reproduce that to understand
> >> more about it, but I'm unable to. I've been trying with
> >> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
> >> possible in testing this out you were testing the wrong branch? Anyway,
> >> it'd be nice to get this queued up...
> >
> > Hmm... you can't reproduce it?
>
> No, I can't, and I'm now no longer convinced that there *is* a bug.
> Can you try to repro again and send me detailed reproduction steps?

I just checked again and I can confirm it is this patch that causes
the regression.

This is the command line:

qemu-system-riscv64 \
-machine virt -m 64M \
-cpu rv64,mmu=false \
-serial mon:stdio -serial null -nographic \
-append "root=/dev/vda rw highres=off  console=ttyS0 mem=1G ip=dhcp
earlycon=sbi" \
-device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 -netdev
user,id=net0 \
-object rng-random,filename=/dev/urandom,id=rng0 -device
virtio-rng-device,rng=rng0 \
-smp 1 -d guest_errors \
-kernel ./images/qemuriscv64/nommu-Image \
-drive id=disk0,file=./images/qemuriscv64/nommu-rootfs.ext2,if=none,format=raw \
-device virtio-blk-device,drive=disk0 -bios none

You can access the images from:
https://nextcloud.alistair23.me/index.php/s/a2zrtbT7DjdTx9t

Alistair

>
> Thanks,
> Jason
Jason A. Donenfeld July 11, 2022, 12:27 a.m. UTC | #11
On 7/11/22, Alistair Francis <alistair23@gmail.com> wrote:
> On Fri, Jul 8, 2022 at 7:56 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>>
>> Hi Alistair,
>>
>> On 7/8/22, Alistair Francis <alistair23@gmail.com> wrote:
>>
>> >> > but I think that's just the way things go unfortunately.
>> >
>> > Hmm... That's a pain. So there is a bug in older kernels where they
>> > won't boot if we specify this?
>> >
>> > Can you point to the fixes?
>>
>> Actually, in trying to reproduce this, I don't actually think this is
>> affected by those old random.c bugs.
>>
>>
>> >> As for your 5.8 issue, I've been trying to reproduce that to
>> >> understand
>> >> more about it, but I'm unable to. I've been trying with
>> >> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
>> >> possible in testing this out you were testing the wrong branch?
>> >> Anyway,
>> >> it'd be nice to get this queued up...
>> >
>> > Hmm... you can't reproduce it?
>>
>> No, I can't, and I'm now no longer convinced that there *is* a bug.
>> Can you try to repro again and send me detailed reproduction steps?
>
> I just checked again and I can confirm it is this patch that causes
> the regression.
>
> This is the command line:
>
> qemu-system-riscv64 \
> -machine virt -m 64M \
> -cpu rv64,mmu=false \
> -serial mon:stdio -serial null -nographic \
> -append "root=/dev/vda rw highres=off  console=ttyS0 mem=1G ip=dhcp
> earlycon=sbi" \
> -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 -netdev
> user,id=net0 \
> -object rng-random,filename=/dev/urandom,id=rng0 -device
> virtio-rng-device,rng=rng0 \
> -smp 1 -d guest_errors \
> -kernel ./images/qemuriscv64/nommu-Image \
> -drive
> id=disk0,file=./images/qemuriscv64/nommu-rootfs.ext2,if=none,format=raw \
> -device virtio-blk-device,drive=disk0 -bios none
>
> You can access the images from:
> https://nextcloud.alistair23.me/index.php/s/a2zrtbT7DjdTx9t
>

Thanks. Can you send the kernel . config too?

Jason
Alistair Francis July 11, 2022, 3:36 a.m. UTC | #12
On Mon, Jul 11, 2022 at 10:28 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On 7/11/22, Alistair Francis <alistair23@gmail.com> wrote:
> > On Fri, Jul 8, 2022 at 7:56 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >>
> >> Hi Alistair,
> >>
> >> On 7/8/22, Alistair Francis <alistair23@gmail.com> wrote:
> >>
> >> >> > but I think that's just the way things go unfortunately.
> >> >
> >> > Hmm... That's a pain. So there is a bug in older kernels where they
> >> > won't boot if we specify this?
> >> >
> >> > Can you point to the fixes?
> >>
> >> Actually, in trying to reproduce this, I don't actually think this is
> >> affected by those old random.c bugs.
> >>
> >>
> >> >> As for your 5.8 issue, I've been trying to reproduce that to
> >> >> understand
> >> >> more about it, but I'm unable to. I've been trying with
> >> >> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
> >> >> possible in testing this out you were testing the wrong branch?
> >> >> Anyway,
> >> >> it'd be nice to get this queued up...
> >> >
> >> > Hmm... you can't reproduce it?
> >>
> >> No, I can't, and I'm now no longer convinced that there *is* a bug.
> >> Can you try to repro again and send me detailed reproduction steps?
> >
> > I just checked again and I can confirm it is this patch that causes
> > the regression.
> >
> > This is the command line:
> >
> > qemu-system-riscv64 \
> > -machine virt -m 64M \
> > -cpu rv64,mmu=false \
> > -serial mon:stdio -serial null -nographic \
> > -append "root=/dev/vda rw highres=off  console=ttyS0 mem=1G ip=dhcp
> > earlycon=sbi" \
> > -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 -netdev
> > user,id=net0 \
> > -object rng-random,filename=/dev/urandom,id=rng0 -device
> > virtio-rng-device,rng=rng0 \
> > -smp 1 -d guest_errors \
> > -kernel ./images/qemuriscv64/nommu-Image \
> > -drive
> > id=disk0,file=./images/qemuriscv64/nommu-rootfs.ext2,if=none,format=raw \
> > -device virtio-blk-device,drive=disk0 -bios none
> >
> > You can access the images from:
> > https://nextcloud.alistair23.me/index.php/s/a2zrtbT7DjdTx9t
> >
>
> Thanks. Can you send the kernel . config too?

Unfortunately I don't have it, it should just be the 5.8 no MMU defconfig though

Alistair

>
> Jason
Jason A. Donenfeld July 11, 2022, 4:45 p.m. UTC | #13
Hi Alistair,

On Mon, Jul 11, 2022 at 01:36:28PM +1000, Alistair Francis wrote:
> On Mon, Jul 11, 2022 at 10:28 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > On 7/11/22, Alistair Francis <alistair23@gmail.com> wrote:
> > > On Fri, Jul 8, 2022 at 7:56 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >>
> > >> Hi Alistair,
> > >>
> > >> On 7/8/22, Alistair Francis <alistair23@gmail.com> wrote:
> > >>
> > >> >> > but I think that's just the way things go unfortunately.
> > >> >
> > >> > Hmm... That's a pain. So there is a bug in older kernels where they
> > >> > won't boot if we specify this?
> > >> >
> > >> > Can you point to the fixes?
> > >>
> > >> Actually, in trying to reproduce this, I don't actually think this is
> > >> affected by those old random.c bugs.
> > >>
> > >>
> > >> >> As for your 5.8 issue, I've been trying to reproduce that to
> > >> >> understand
> > >> >> more about it, but I'm unable to. I've been trying with
> > >> >> nommu_virt_defconfig using my patch ontop of qemu master. Maybe it's
> > >> >> possible in testing this out you were testing the wrong branch?
> > >> >> Anyway,
> > >> >> it'd be nice to get this queued up...
> > >> >
> > >> > Hmm... you can't reproduce it?
> > >>
> > >> No, I can't, and I'm now no longer convinced that there *is* a bug.
> > >> Can you try to repro again and send me detailed reproduction steps?
> > >
> > > I just checked again and I can confirm it is this patch that causes
> > > the regression.
> > >
> > > This is the command line:
> > >
> > > qemu-system-riscv64 \
> > > -machine virt -m 64M \
> > > -cpu rv64,mmu=false \
> > > -serial mon:stdio -serial null -nographic \
> > > -append "root=/dev/vda rw highres=off  console=ttyS0 mem=1G ip=dhcp
> > > earlycon=sbi" \
> > > -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 -netdev
> > > user,id=net0 \
> > > -object rng-random,filename=/dev/urandom,id=rng0 -device
> > > virtio-rng-device,rng=rng0 \
> > > -smp 1 -d guest_errors \
> > > -kernel ./images/qemuriscv64/nommu-Image \
> > > -drive
> > > id=disk0,file=./images/qemuriscv64/nommu-rootfs.ext2,if=none,format=raw \
> > > -device virtio-blk-device,drive=disk0 -bios none
> > >
> > > You can access the images from:
> > > https://nextcloud.alistair23.me/index.php/s/a2zrtbT7DjdTx9t
> > >
> >
> > Thanks. Can you send the kernel . config too?
> 
> Unfortunately I don't have it, it should just be the 5.8 no MMU defconfig though

I've reproduced the problem and determined the root cause. This is a
generic issue with the mmio get_cycles() implementation before 5.9 on
no-MMU configs, which was fixed during the 5.9 cycle. I don't believe
that this is the only thing affected on that .0 kernel, where fixes were
ostensibly backported. Given the relative age of risc-v, the fact that
5.8.0 was broken anyway, and that likely nobody is using this kernel in
that configuration without applying updates, I'm pretty sure my patch is
safe to apply. I'd recommend updating the broken kernel in your CI.

Meanwhile, the rng-seed field is part of the DT spec. Holding back the
(virtual) hardware just because some random dot-zero non-LTS release had
a quickly fixed bug seems ridiculous, and the way in which progress gets
held up, hacks accumulate, and generally nothing good gets done. It will
only hamper security, functionality, and boot speed, while helping no
real practical case that can't be fixed in a better way.

So I believe you should apply the rng-seed commit so that the RISC-V
machine honors that DT field.

Regards,
Jason
Jason A. Donenfeld July 13, 2022, 5:29 p.m. UTC | #14
Hi again,

On Mon, Jul 11, 2022 at 06:45:42PM +0200, Jason A. Donenfeld wrote:
> I've reproduced the problem and determined the root cause. This is a
> generic issue with the mmio get_cycles() implementation before 5.9 on
> no-MMU configs, which was fixed during the 5.9 cycle. I don't believe
> that this is the only thing affected on that .0 kernel, where fixes were
> ostensibly backported. Given the relative age of risc-v, the fact that
> 5.8.0 was broken anyway, and that likely nobody is using this kernel in
> that configuration without applying updates, I'm pretty sure my patch is
> safe to apply. I'd recommend updating the broken kernel in your CI.
> 
> Meanwhile, the rng-seed field is part of the DT spec. Holding back the
> (virtual) hardware just because some random dot-zero non-LTS release had
> a quickly fixed bug seems ridiculous, and the way in which progress gets
> held up, hacks accumulate, and generally nothing good gets done. It will
> only hamper security, functionality, and boot speed, while helping no
> real practical case that can't be fixed in a better way.
> 
> So I believe you should apply the rng-seed commit so that the RISC-V
> machine honors that DT field.
> 
> Regards,
> Jason
> 

Just following up on this... Hoping we can get this into a tree soon.

Thanks,
Jason
Alistair Francis July 18, 2022, 10:39 p.m. UTC | #15
On Thu, Jul 14, 2022 at 3:29 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi again,
>
> On Mon, Jul 11, 2022 at 06:45:42PM +0200, Jason A. Donenfeld wrote:
> > I've reproduced the problem and determined the root cause. This is a
> > generic issue with the mmio get_cycles() implementation before 5.9 on
> > no-MMU configs, which was fixed during the 5.9 cycle. I don't believe
> > that this is the only thing affected on that .0 kernel, where fixes were
> > ostensibly backported. Given the relative age of risc-v, the fact that
> > 5.8.0 was broken anyway, and that likely nobody is using this kernel in
> > that configuration without applying updates, I'm pretty sure my patch is
> > safe to apply. I'd recommend updating the broken kernel in your CI.
> >
> > Meanwhile, the rng-seed field is part of the DT spec. Holding back the
> > (virtual) hardware just because some random dot-zero non-LTS release had
> > a quickly fixed bug seems ridiculous, and the way in which progress gets
> > held up, hacks accumulate, and generally nothing good gets done. It will
> > only hamper security, functionality, and boot speed, while helping no
> > real practical case that can't be fixed in a better way.
> >
> > So I believe you should apply the rng-seed commit so that the RISC-V
> > machine honors that DT field.
> >
> > Regards,
> > Jason
> >
>
> Just following up on this... Hoping we can get this into a tree soon.

Yep! Sorry, I have been off sick for the last week.

I just updated my test images to a newer kernel, which means this
passes my tests

Thanks!

Applied to riscv-to-apply.next

Alistair

>
> Thanks,
> Jason
diff mbox series

Patch

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index bc424dd2f5..368a723bf6 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -21,6 +21,7 @@ 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
 #include "qemu/error-report.h"
+#include "qemu/guest-random.h"
 #include "qapi/error.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
@@ -998,6 +999,7 @@  static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
     MachineState *mc = MACHINE(s);
     uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
     uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
+    uint8_t rng_seed[32];
 
     if (mc->dtb) {
         mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
@@ -1046,6 +1048,10 @@  update_bootargs:
     if (cmdline && *cmdline) {
         qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
     }
+
+    /* Pass seed to RNG. */
+    qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
+    qemu_fdt_setprop(mc->fdt, "/chosen", "rng-seed", rng_seed, sizeof(rng_seed));
 }
 
 static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,