Message ID | cover.1538683492.git.alistair.francis@wdc.com
---|---
Series | Connect a PCIe host and graphics support to RISC-V
On Thu, 2018-10-04 at 20:06 +0000, Alistair Francis wrote:
> Alistair Francis (5):
>   hw/riscv/virt: Increase the number of interrupts
>   hw/riscv/virt: Connect the gpex PCIe
>   riscv: Enable VGA and PCIE_VGA
>   hw/riscv/sifive_u: Connect the Xilinx PCIe
>   hw/riscv/virt: Connect a VirtIO net PCIe device
>
>  default-configs/riscv32-softmmu.mak | 10 +++-
>  default-configs/riscv64-softmmu.mak | 10 +++-
>  hw/riscv/sifive_u.c                 | 64 +++++++++++++++++++++++++
>  hw/riscv/virt.c                     | 72 +++++++++++++++++++++++++++++
>  include/hw/riscv/sifive_u.h         |  4 +-
>  include/hw/riscv/virt.h             |  6 ++-
>  6 files changed, 161 insertions(+), 5 deletions(-)

I gave v4 a try a few weeks ago because I wanted to see what would be needed to wire this up on the libvirt side. Turns out, not much really :)

I still have a couple of questions that hopefully you'll be able to answer:

  * what should libvirt look for to figure out whether or not a RISC-V guest will have PCI support? For aarch64 we look for the presence of the 'gpex-pcihost' device, but of course that won't work for RISC-V, so we need something else;

  * I have successfully started a RISC-V guest with virtio-pci devices attached but, while they show up in 'info qtree' and friends, the guest OS itself doesn't seem to recognize any of them - not even pcie.0! I'm using the guest images listed at [1] and following the corresponding instructions, but I think the BBL build (config at [2]) is missing some feature... Any ideas what we would need to add there?

If you can help with these I'll give the patches another spin and gladly provide my Tested-by :)

[1] https://fedoraproject.org/wiki/Architectures/RISC-V/Installing
[2] https://github.com/rwmjones/fedora-riscv-kernel/blob/master/config
Andrea and Alistair

+Keith since he developed a lot of the NVMe drive model

>> Alistair Francis (5):
>>   hw/riscv/virt: Increase the number of interrupts
>>   hw/riscv/virt: Connect the gpex PCIe
>>   riscv: Enable VGA and PCIE_VGA
>>   hw/riscv/sifive_u: Connect the Xilinx PCIe
>>   hw/riscv/virt: Connect a VirtIO net PCIe device

I also tried these out, but I was interested in seeing if I could create NVMe models inside the new PCIe subsystem (for both the virt and sifive_u machines). The sifive_u machine did not work at all (so I'll leave that one for now). The virt machine successfully mapped in the NVMe devices and the guest OS was able to probe the nvme driver against them. However, something seems to be broken with interrupts, as I see messages like these in the OS dmesg:

[ 62.852000] nvme nvme0: I/O 856 QID 1 timeout, completion polled
[ 64.832000] nvme nvme1: I/O 819 QID 1 timeout, completion polled
[ 64.836000] nvme nvme1: I/O 820 QID 1 timeout, completion polled
[ 64.840000] nvme nvme1: I/O 821 QID 1 timeout, completion polled
[ 64.844000] nvme nvme1: I/O 822 QID 1 timeout, completion polled
[ 64.848000] nvme nvme0: I/O 856 QID 1 timeout, completion polled
[ 64.852000] nvme nvme0: I/O 857 QID 1 timeout, completion polled

These imply the driver hit an admin queue timeout, but when it reaped the NVMe admin completion queue it found the commands were done even though no interrupt was detected by the OS.

Also, on starting QEMU I see this:

bbl loader
qemu-system-riscv64: plic: invalid register write: 00002090
qemu-system-riscv64: plic: invalid register write: 00002094
qemu-system-riscv64: plic: invalid register write: 00002098
qemu-system-riscv64: plic: invalid register write: 0000209c
qemu-system-riscv64: plic: invalid register write: 000020a0
qemu-system-riscv64: plic: invalid register write: 000020a4
qemu-system-riscv64: plic: invalid register write: 000020a8
qemu-system-riscv64: plic: invalid register write: 000020ac
qemu-system-riscv64: plic: invalid register write: 000020b0
qemu-system-riscv64: plic: invalid register write: 000020b4
In pci_offset = [2080] msem = [2164] parent = [2]

My command to start QEMU was:

$QEMU -nographic \
    -machine virt \
    -smp 1 -m 8G \
    -append "console=hvc0 ro root=/dev/vda" \
    -kernel $KERNEL \
    -drive file=${ROOTFS},format=raw,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0 \
    -device nvme,drive=nvme0,serial=nvme0,cmb_size_mb=16 \
    -drive file=nvme0.qcow2,if=none,id=nvme0,snapshot=on \
    -drive file=nvme1.qcow2,if=none,id=nvme1,snapshot=on \
    -device nvme,drive=nvme1,serial=nvme1,cmb_size_mb=64

I plan to also try with an e1000 network interface model tomorrow and see how that behaves...

Cheers

Stephen
On Wed, 2018-10-10 at 13:11 +0000, Stephen Bates wrote:
> I also tried these out, but I was interested in seeing if I could create NVMe models inside the new PCIe subsystem (for both the virt and sifive_u machines). The sifive_u machine did not work at all (so I'll leave that one for now). The virt machine successfully mapped in the NVMe devices and the guest OS was able to probe the nvme driver against them. However, something seems to be broken with interrupts, as I see messages like these in the OS dmesg:
>
> [ 62.852000] nvme nvme0: I/O 856 QID 1 timeout, completion polled
> [ 64.832000] nvme nvme1: I/O 819 QID 1 timeout, completion polled
> [ 64.836000] nvme nvme1: I/O 820 QID 1 timeout, completion polled
> [ 64.840000] nvme nvme1: I/O 821 QID 1 timeout, completion polled
> [ 64.844000] nvme nvme1: I/O 822 QID 1 timeout, completion polled
> [ 64.848000] nvme nvme0: I/O 856 QID 1 timeout, completion polled
> [ 64.852000] nvme nvme0: I/O 857 QID 1 timeout, completion polled
>
> These imply the driver hit an admin queue timeout, but when it reaped the NVMe admin completion queue it found the commands were done even though no interrupt was detected by the OS.

So it looks like you at least got to the point where the guest OS would find PCIe devices... Can you share the output of 'lspci' as well as the configuration you used when building your bbl?

> I plan to also try with an e1000 network interface model tomorrow and see how that behaves...

Please do :)
> So it looks like you at least got to the point where the guest OS
> would find PCIe devices...

Yes, and in fact NVMe I/O against those devices does succeed (I can write and read the NVMe namespaces). It is just slow, because the interrupts are not getting to the OS and hence NVMe timeouts are how the completions are discovered.

> Can you share the output of 'lspci' as well as the configuration you used when building your bbl?

Below is lspci -vvv for the QEMU command I sent earlier. The kernel source is here [1] and the .config is here [2].

00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
Subsystem: Red Hat, Inc QEMU PCIe Host bridge
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

lspci: Unable to load libkmod resources: error -12   <-- [Note: this is an error due to poor kmod support in riscv Linux at this time]

00:01.0 Non-Volatile memory controller: Intel Corporation QEMU NVM Express Controller (rev 02) (prog-if 02 [NVM Express])
Subsystem: Red Hat, Inc QEMU Virtual Machine
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
Latency: 0
Interrupt: pin A routed to IRQ 1
Region 0: Memory at 45000000 (64-bit, non-prefetchable) [size=8K]
Region 2: Memory at 44000000 (64-bit, prefetchable) [size=16M]
Capabilities: [80] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
Kernel driver in use: nvme

00:02.0 Non-Volatile memory controller: Intel Corporation QEMU NVM Express Controller (rev 02) (prog-if 02 [NVM Express])
Subsystem: Red Hat, Inc QEMU Virtual Machine
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
Latency: 0
Interrupt: pin A routed to IRQ 1
Region 0: Memory at 45002000 (64-bit, non-prefetchable) [size=8K]
Region 2: Memory at 40000000 (64-bit, prefetchable) [size=64M]
Capabilities: [80] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
Kernel driver in use: nvme

Stephen

[1] https://github.com/sbates130272/linux-p2pmem/tree/riscv-p2p-sifive
[2] https://github.com/Eideticom/kernel-configs/blob/master/riscv-good-config-updated-p2pdma
>> I plan to also try with an e1000 network interface model tomorrow and see how that behaves...
>
> Please do :)

I added e1000 and e1000e support to my kernel and changed the QEMU command to:

$QEMU -nographic \
    -machine virt \
    -smp 1 -m 8G \
    -append "console=hvc0 ro root=/dev/vda nvme.admin_timeout=1" \
    -kernel $KERNEL \
    -drive file=${ROOTFS},format=raw,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0 \
    -device e1000,netdev=net1 \
    -netdev user,id=net1

And the kernel oopses:

[ 0.224000] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 0.224000] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 0.224000] e1000 0000:00:01.0: enabling device (0000 -> 0002)
[ 0.244000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 0.244000] Oops [#1]
[ 0.244000] Modules linked in:
[ 0.244000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc6-eideticom-riscv-00038-gc2b45b2fe26a-dirty #41
[ 0.244000] sepc: ffffffd20040cc18 ra : ffffffd20040e912 sp : ffffffd3f7a77b60
[ 0.244000] gp : ffffffd2007e5960 tp : ffffffd3f7ac0000 t0 : ffffffd3f754b4c0
[ 0.244000] t1 : 0000000000000000 t2 : 00000000000003af s0 : ffffffd3f7a77b70
[ 0.244000] s1 : ffffffd3f7554b20 a0 : ffffffd3f7554b20 a1 : 0000000000000000
[ 0.244000] a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000002
[ 0.244000] a5 : 0000000000000002 a6 : 00000000eac0c6e6 a7 : 0000000000000000
[ 0.244000] s2 : 0000000004140240 s3 : 0000000000000000 s4 : ffffffd3f7554f08
[ 0.244000] s5 : ffffffd3f7554000 s6 : ffffffd2007e7794 s7 : ffffffd3f7555000
[ 0.244000] s8 : ffffffd3f75546c0 s9 : ffffffd3f7554b20 s10: 0000000000001000
[ 0.244000] s11: 0000000000000000 t3 : ffffffd20078e918 t4 : ffffffd20078e920
[ 0.244000] t5 : 0000000000000007 t6 : 0000000000000006
[ 0.244000] sstatus: 0000000000000120 sbadaddr: 0000000000000000 scause: 000000000000000f
[ 0.252000] ---[ end trace 371f7702831e633b ]---
On 10/10/2018 05:26 AM, Andrea Bolognani wrote:
> On Thu, 2018-10-04 at 20:06 +0000, Alistair Francis wrote:
>> Alistair Francis (5):
>>   hw/riscv/virt: Increase the number of interrupts
>>   hw/riscv/virt: Connect the gpex PCIe
>>   riscv: Enable VGA and PCIE_VGA
>>   hw/riscv/sifive_u: Connect the Xilinx PCIe
>>   hw/riscv/virt: Connect a VirtIO net PCIe device
>>
>>  default-configs/riscv32-softmmu.mak | 10 +++-
>>  default-configs/riscv64-softmmu.mak | 10 +++-
>>  hw/riscv/sifive_u.c                 | 64 +++++++++++++++++++++++++
>>  hw/riscv/virt.c                     | 72 +++++++++++++++++++++++++++++
>>  include/hw/riscv/sifive_u.h         |  4 +-
>>  include/hw/riscv/virt.h             |  6 ++-
>>  6 files changed, 161 insertions(+), 5 deletions(-)
>
> I gave v4 a try a few weeks ago because I wanted to see what would be needed to wire this up on the libvirt side. Turns out, not much really :)

Great!

> I still have a couple of questions that hopefully you'll be able to answer:
>
>   * what should libvirt look for to figure out whether or not a RISC-V guest will have PCI support? For aarch64 we look for the presence of the 'gpex-pcihost' device, but of course that won't work for RISC-V, so we need something else;

I'm not sure what you mean here. Why can we not do the same thing with RISC-V?

>   * I have successfully started a RISC-V guest with virtio-pci devices attached but, while they show up in 'info qtree' and friends, the guest OS itself doesn't seem to recognize any of them - not even pcie.0! I'm using the guest images listed at [1] and following the corresponding instructions, but I think the BBL build (config at [2]) is missing some feature... Any ideas what we would need to add there?

I use this monolithic config:
https://github.com/alistair23/meta-riscv/blob/7a950aa705b439b5ec19bb6f094930888335ba7b/recipes-kernel/linux/files/freedom-u540/defconfig

It has way too much enabled, but I think if you copy the PCIe part that should be enough.

My colleague Atish has Fedora booting on real hardware with the MicroSemi PCIe support. You can also see his config here:
https://github.com/westerndigitalcorporation/RISC-V-Linux/blob/master/riscv-linux-conf/config_fedora_success_4.19_demo_sep11

Obviously on top of that you will need to enable the VirtIO support, as that doesn't exist in the hardware.

Alistair

> If you can help with these I'll give the patches another spin and gladly provide my Tested-by :)
>
> [1] https://fedoraproject.org/wiki/Architectures/RISC-V/Installing
> [2] https://github.com/rwmjones/fedora-riscv-kernel/blob/master/config
On 10/10/2018 10:32 AM, Stephen Bates wrote:
>>> I plan to also try with an e1000 network interface model tomorrow and see how that behaves...
>>
>> Please do :)
>
> I added e1000 and e1000e support to my kernel and changed the QEMU command to:
>
> $QEMU -nographic \
>     -machine virt \
>     -smp 1 -m 8G \
>     -append "console=hvc0 ro root=/dev/vda nvme.admin_timeout=1" \
>     -kernel $KERNEL \
>     -drive file=${ROOTFS},format=raw,id=hd0 \
>     -device virtio-blk-device,drive=hd0 \
>     -device virtio-net-device,netdev=net0 \
>     -netdev user,id=net0 \
>     -device e1000,netdev=net1 \
>     -netdev user,id=net1

Why do you need two networking options?

> And the kernel oopses:
>
> [ 0.224000] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
> [ 0.224000] e1000: Copyright (c) 1999-2006 Intel Corporation.
> [ 0.224000] e1000 0000:00:01.0: enabling device (0000 -> 0002)
> [ 0.244000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 0.244000] Oops [#1]
> [ 0.244000] Modules linked in:
> [ 0.244000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc6-eideticom-riscv-00038-gc2b45b2fe26a-dirty #41
> [ 0.244000] sepc: ffffffd20040cc18 ra : ffffffd20040e912 sp : ffffffd3f7a77b60
> [ 0.244000] gp : ffffffd2007e5960 tp : ffffffd3f7ac0000 t0 : ffffffd3f754b4c0
> [ 0.244000] t1 : 0000000000000000 t2 : 00000000000003af s0 : ffffffd3f7a77b70
> [ 0.244000] s1 : ffffffd3f7554b20 a0 : ffffffd3f7554b20 a1 : 0000000000000000
> [ 0.244000] a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000002
> [ 0.244000] a5 : 0000000000000002 a6 : 00000000eac0c6e6 a7 : 0000000000000000
> [ 0.244000] s2 : 0000000004140240 s3 : 0000000000000000 s4 : ffffffd3f7554f08
> [ 0.244000] s5 : ffffffd3f7554000 s6 : ffffffd2007e7794 s7 : ffffffd3f7555000
> [ 0.244000] s8 : ffffffd3f75546c0 s9 : ffffffd3f7554b20 s10: 0000000000001000
> [ 0.244000] s11: 0000000000000000 t3 : ffffffd20078e918 t4 : ffffffd20078e920
> [ 0.244000] t5 : 0000000000000007 t6 : 0000000000000006
> [ 0.244000] sstatus: 0000000000000120 sbadaddr: 0000000000000000 scause: 000000000000000f
> [ 0.252000] ---[ end trace 371f7702831e633b ]---

Strange. Is there any reason you need to use the e1000? The VirtIO networking device works for me.

Alistair
> Why do you need two networking options?

I don't need the e1000 for networking. The e1000 option is there to test the PCIe support, since it implements a PCIe model of the e1000 NIC. Basically it's another test path for your PCIe patches, and it was used for testing when PCIe support was added to the arm virt model [1].

> Strange. Is there any reason you need to use the e1000? The VirtIO
> networking device works for me.

As per above. The e1000 is there to test PCIe, not networking.

Stephen

[1] https://github.com/qemu/qemu/commit/4ab29b8214cc4b54e0c1a8270b610a340311470e
> I added e1000 and e1000e support to my kernel and changed the QEMU command to:
So using -device e1000e rather than -device e1000 seems to work. I am not sure why -device e1000 causes a kernel panic. The MSI-X message is interesting and may be related to why NVMe interrupts are not reaching the guest OS.
[ 0.216000] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 0.216000] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 0.216000] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 0.216000] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 0.220000] e1000e 0000:00:01.0: assign IRQ: got 1
[ 0.220000] e1000e 0000:00:01.0: enabling device (0000 -> 0002)
[ 0.220000] e1000e 0000:00:01.0: enabling bus mastering
[ 0.220000] e1000e 0000:00:01.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 0.220000] e1000e 0000:00:01.0 0000:00:01.0 (uninitialized): Failed to initialize MSI-X interrupts. Falling back to MSI interrupts.
[ 0.220000] e1000e 0000:00:01.0 0000:00:01.0 (uninitialized): Failed to initialize MSI interrupts. Falling back to legacy interrupts.
[ 0.348000] e1000e 0000:00:01.0 eth0: (PCI Express:2.5GT/s:Width x1) 52:54:00:12:34:56
[ 0.356000] e1000e 0000:00:01.0 eth0: Intel(R) PRO/1000 Network Connection
[ 0.356000] e1000e 0000:00:01.0 eth0: MAC: 3, PHY: 8, PBA No: 000000-000
root@libertas:~# lspci -vvv
00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
Subsystem: Red Hat, Inc QEMU PCIe Host bridge
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
lspci: Unable to load libkmod resources: error -12
00:01.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Subsystem: Intel Corporation 82574L Gigabit Network Connection
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 1
Region 0: Memory at 40040000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at 40060000 (32-bit, non-prefetchable) [size=128K]
Region 2: I/O ports at <unassigned> [disabled]
Region 3: Memory at 40080000 (32-bit, non-prefetchable) [size=16K]
[virtual] Expansion ROM at 40000000 [disabled] [size=256K]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] Device Serial Number 52-54-00-ff-ff-12-34-56
Kernel driver in use: e1000e
On 10/10/2018 11:47 AM, Stephen Bates wrote:
>> Why do you need two networking options?
>
> I don't need the e1000 for networking. The e1000 option is there to test the PCIe support, since it implements a PCIe model of the e1000 NIC. Basically it's another test path for your PCIe patches, and it was used for testing when PCIe support was added to the arm virt model [1].
>
>> Strange. Is there any reason you need to use the e1000? The VirtIO
>> networking device works for me.
>
> As per above. The e1000 is there to test PCIe, not networking.

Ah, my mistake. I thought that the VirtIO networking device was a PCIe device for some reason.

Alistair

> Stephen
>
> [1] https://github.com/qemu/qemu/commit/4ab29b8214cc4b54e0c1a8270b610a340311470e
On 10/10/2018 12:01 PM, Stephen Bates wrote:
>> I added e1000 and e1000e support to my kernel and changed the QEMU command to:
>
> So using -device e1000e rather than -device e1000 seems to work. I am not sure why -device e1000 causes a kernel panic. The MSI-X message is interesting and may be related to why NVMe interrupts are not reaching the guest OS.

Great! I'm glad that it works. So it looks like PCIe is working, but with some limitations in the interrupts (as seen here and with the NVMe).

Unless anyone has any objections, I still think it makes sense to merge the current patches, as they work for a variety of PCIe devices. We can continue to look into the interrupt issues after that.

Do you want to add a Tested-by tag, Stephen?

Alistair

[...]
On Wed, 2018-10-10 at 12:53 -0700, Alistair wrote:
> On 10/10/2018 11:47 AM, Stephen Bates wrote:
>>> Strange. Is there any reason you need to use the e1000? The VirtIO
>>> networking device works for me.
>>
>> As per above. The e1000 is there to test PCIe, not networking.

Unless I'm mistaken, e1000 is a conventional PCI device, with e1000e being the PCI Express equivalent.

> Ah, my mistake. I thought that the VirtIO networking device was a PCIe device for some reason.

Most VirtIO devices, including virtio-net, show up as either conventional PCI or PCI Express based on the slot they're plugged into, so if you have

  -device virtio-net-pci,bus=pcie.0

it will show up as a conventional PCI device, but if you have

  -device pcie-root-port,id=pci.1,bus=pcie.0 \
  -device virtio-net-pci,bus=pci.1

instead, it will show up as a PCI Express device.
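As a complete invocation of the second layout, here is a sketch only (the netdev backend and ids are illustrative, not taken from the thread); the root port simply sits between the root complex and the virtio device:

$QEMU -machine virt -nographic \
    -device pcie-root-port,id=pci.1,bus=pcie.0 \
    -device virtio-net-pci,bus=pci.1,netdev=net0 \
    -netdev user,id=net0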
On Wed, 2018-10-10 at 10:57 -0700, Alistair wrote:
> On 10/10/2018 05:26 AM, Andrea Bolognani wrote:
>>   * what should libvirt look for to figure out whether or not a RISC-V guest will have PCI support? For aarch64 we look for the presence of the 'gpex-pcihost' device, but of course that won't work for RISC-V, so we need something else;
>
> I'm not sure what you mean here. Why can we not do the same thing with RISC-V?

The gpex-pcihost device was introduced at the same time as aarch64/virt gained PCI support, so we can probe the binary[1] and, if the device is present, we know that aarch64/virt guests will be able to use PCI. We cannot do the same for RISC-V for obvious reasons: the device is already there :)

Is there any RISC-V device / property that was introduced along with PCI support that we can use as a witness?

>>   * I have successfully started a RISC-V guest with virtio-pci devices attached but, while they show up in 'info qtree' and friends, the guest OS itself doesn't seem to recognize any of them - not even pcie.0! I'm using the guest images listed at [1] and following the corresponding instructions, but I think the BBL build (config at [2]) is missing some feature... Any ideas what we would need to add there?
>
> I use this monolithic config:
> https://github.com/alistair23/meta-riscv/blob/7a950aa705b439b5ec19bb6f094930888335ba7b/recipes-kernel/linux/files/freedom-u540/defconfig
>
> It has way too much enabled, but I think if you copy the PCIe part that should be enough.

Looks like there's quite a few CONFIG*PCI* options we're missing!

Rich, can you try incorporating those and kicking off a BBL build? If I remember correctly you might also have to backport a commit or two to make some of the options available on RISC-V, but it should be very feasible.

[...]
> Obviously on top of that you will need to enable the VirtIO support, as that doesn't exist in the hardware.

Yeah, we already have that enabled and non-PCI VirtIO devices work just fine.

[1] Not the guest, mind you: libvirt will probe each QEMU binary only once to reduce the overhead, so anyth
On Thu, Oct 11, 2018 at 07:59:59AM +0200, Andrea Bolognani wrote:
> On Wed, 2018-10-10 at 10:57 -0700, Alistair wrote:
>> On 10/10/2018 05:26 AM, Andrea Bolognani wrote:
>>>   * what should libvirt look for to figure out whether or not a RISC-V guest will have PCI support? For aarch64 we look for the presence of the 'gpex-pcihost' device, but of course that won't work for RISC-V, so we need something else;
>>
>> I'm not sure what you mean here. Why can we not do the same thing with RISC-V?
>
> The gpex-pcihost device was introduced at the same time as aarch64/virt gained PCI support, so we can probe the binary[1] and, if the device is present, we know that aarch64/virt guests will be able to use PCI. We cannot do the same for RISC-V for obvious reasons: the device is already there :)
>
> Is there any RISC-V device / property that was introduced along with PCI support that we can use as a witness?

Can we ignore non-PCIe guests? virtio-mmio should die.

Also, if we ever get to the stage where (real) RISC-V servers are shipping that don't have PCIe, then I'm afraid I will have lost a critical battle :-( Non-PCI might be OK for tiny embedded stuff, but for proper machines - real and virtual - we should require PCIe.

>>>   * I have successfully started a RISC-V guest with virtio-pci devices attached but, while they show up in 'info qtree' and friends, the guest OS itself doesn't seem to recognize any of them - not even pcie.0! I'm using the guest images listed at [1] and following the corresponding instructions, but I think the BBL build (config at [2]) is missing some feature... Any ideas what we would need to add there?
>>
>> I use this monolithic config:
>> https://github.com/alistair23/meta-riscv/blob/7a950aa705b439b5ec19bb6f094930888335ba7b/recipes-kernel/linux/files/freedom-u540/defconfig
>>
>> It has way too much enabled, but I think if you copy the PCIe part that should be enough.
>
> Looks like there's quite a few CONFIG*PCI* options we're missing!
>
> Rich, can you try incorporating those and kicking off a BBL build? If I remember correctly you might also have to backport a commit or two to make some of the options available on RISC-V, but it should be very feasible.

I'll have a look.

Rich.
On Thu, Oct 11, 2018 at 07:59:59AM +0200, Andrea Bolognani wrote:
> On Wed, 2018-10-10 at 10:57 -0700, Alistair wrote:
>> I use this monolithic config:
>> https://github.com/alistair23/meta-riscv/blob/7a950aa705b439b5ec19bb6f094930888335ba7b/recipes-kernel/linux/files/freedom-u540/defconfig
>>
>> It has way too much enabled, but I think if you copy the PCIe part that should be enough.
>
> Looks like there's quite a few CONFIG*PCI* options we're missing!

PCI settings in this file:

CONFIG_BLK_MQ_PCI=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_HOTPLUG_PCI=y
CONFIG_MEDIA_PCI_SUPPORT=y
CONFIG_PCI_ATS=y
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_DEBUG=y
CONFIG_PCI_DMA_32=y
CONFIG_PCI_DOMAINS_GENERIC=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEAER=y
CONFIG_PCIEASPM_DEFAULT=y
CONFIG_PCIEASPM=y
CONFIG_PCI_ECAM=y
CONFIG_PCIE_MICROSEMI=y
CONFIG_PCI_ENDPOINT=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCIE_XILINX=y
CONFIG_PCI_HOST_COMMON=y
CONFIG_PCI_HOST_GENERIC=y
CONFIG_PCI_IOV=y
CONFIG_PCI_MSI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_PRI=y
CONFIG_PCI_QUIRKS=y
CONFIG_PCI_SW_SWITCHTEC=y
CONFIG_PCI=y
CONFIG_USB_BDC_PCI=y
CONFIG_USB_EHCI_PCI=y
CONFIG_USB_OHCI_HCD_PCI=y
CONFIG_USB_PCI=y
CONFIG_USB_XHCI_PCI=y

Here are the settings we currently do NOT have in my RV kernel:

CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_HOTPLUG_PCI=y
CONFIG_MEDIA_PCI_SUPPORT=y
CONFIG_PCI_ATS=y
CONFIG_PCI_DEBUG=y
CONFIG_PCIEAER=y
CONFIG_PCIEASPM_DEFAULT=y
CONFIG_PCIEASPM=y
CONFIG_PCI_ECAM=y
CONFIG_PCIE_MICROSEMI=y
CONFIG_PCI_ENDPOINT=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCI_HOST_COMMON=y
CONFIG_PCI_HOST_GENERIC=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PASID=y
CONFIG_PCI_PRI=y
CONFIG_PCI_SW_SWITCHTEC=y
CONFIG_USB_BDC_PCI=y

If you're happy with it, I can add all of those. If there are any which shouldn't be added let me know.

Rich.
On Thu, Oct 11, 2018 at 09:01:14AM +0100, Richard W.M. Jones wrote:
> Here are the settings we currently do NOT have in my RV kernel:
>
> CONFIG_HOTPLUG_PCI_PCIE=y
> CONFIG_HOTPLUG_PCI=y
> CONFIG_MEDIA_PCI_SUPPORT=y
> CONFIG_PCI_ATS=y
> CONFIG_PCI_DEBUG=y
> CONFIG_PCIEAER=y
> CONFIG_PCIEASPM_DEFAULT=y
> CONFIG_PCIEASPM=y
> CONFIG_PCI_ECAM=y
> CONFIG_PCIE_MICROSEMI=y
> CONFIG_PCI_ENDPOINT=y
> CONFIG_PCIEPORTBUS=y
> CONFIG_PCI_HOST_COMMON=y
> CONFIG_PCI_HOST_GENERIC=y
> CONFIG_PCI_IOV=y
> CONFIG_PCI_PASID=y
> CONFIG_PCI_PRI=y
> CONFIG_PCI_SW_SWITCHTEC=y
> CONFIG_USB_BDC_PCI=y
>
> If you're happy with it, I can add all of those. If there are any which shouldn't be added let me know.

I didn't see an answer but in any case I have tried to enable all of these. The only settings which could not be enabled were:

CONFIG_MEDIA_PCI_SUPPORT=y
CONFIG_PCI_ECAM=y
CONFIG_PCI_HOST_COMMON=y
CONFIG_PCI_HOST_GENERIC=y
CONFIG_USB_BDC_PCI=y

Probably missing deps or missing arch support. I didn't track them down yet, but note I'm still using kernel 4.15.

https://github.com/rwmjones/fedora-riscv-kernel/commits/master

In any case a new kernel/bbl has been built, available in the usual place:

https://fedorapeople.org/groups/risc-v/disk-images/

Rich.
On 11 October 2018 at 08:55, Richard W.M. Jones <rjones@redhat.com> wrote:
> Can we ignore non-PCIe guests? virtio-mmio should die.
I note that Edgar's recent Xilinx Versal patchset adds a
new user of virtio-mmio. If that statement is meant to
apply generally it might be worth starting the discussion
in that patchthread too.
thanks
-- PMM
On Thu, 2018-10-11 at 12:45 +0100, Richard W.M. Jones wrote:
> On Thu, Oct 11, 2018 at 09:01:14AM +0100, Richard W.M. Jones wrote:
>> Here are the settings we currently do NOT have in my RV kernel:
>>
>> CONFIG_HOTPLUG_PCI_PCIE=y
>> CONFIG_HOTPLUG_PCI=y
>> CONFIG_MEDIA_PCI_SUPPORT=y
>> CONFIG_PCI_ATS=y
>> CONFIG_PCI_DEBUG=y
>> CONFIG_PCIEAER=y
>> CONFIG_PCIEASPM_DEFAULT=y
>> CONFIG_PCIEASPM=y
>> CONFIG_PCI_ECAM=y
>> CONFIG_PCIE_MICROSEMI=y
>> CONFIG_PCI_ENDPOINT=y
>> CONFIG_PCIEPORTBUS=y
>> CONFIG_PCI_HOST_COMMON=y
>> CONFIG_PCI_HOST_GENERIC=y
>> CONFIG_PCI_IOV=y
>> CONFIG_PCI_PASID=y
>> CONFIG_PCI_PRI=y
>> CONFIG_PCI_SW_SWITCHTEC=y
>> CONFIG_USB_BDC_PCI=y
>>
>> If you're happy with it, I can add all of those. If there are any which shouldn't be added let me know.
>
> I didn't see an answer but in any case I have tried to enable all of these. The only settings which could not be enabled were:
>
> CONFIG_MEDIA_PCI_SUPPORT=y
> CONFIG_PCI_ECAM=y
> CONFIG_PCI_HOST_COMMON=y
> CONFIG_PCI_HOST_GENERIC=y

I believe these last two are the ones we really miss. I could be wrong though - as you know, I've made a few guesses that didn't quite pan out already :)

> CONFIG_USB_BDC_PCI=y
>
> Probably missing deps or missing arch support. I didn't track them down yet, but note I'm still using kernel 4.15.
>
> https://github.com/rwmjones/fedora-riscv-kernel/commits/master

We talked about this on IRC a while ago: assuming my guess above is correct, you would need to backport

  https://github.com/torvalds/linux/commit/51bc085d6454214b02dba7a259ee1fdfe3ee8d9f

because that's the one that makes PCI_HOST_GENERIC available to non-ARM architectures.

> In any case a new kernel/bbl has been built, available in the usual place:
>
> https://fedorapeople.org/groups/risc-v/disk-images/

I gave it a spin: it doesn't work :(
Hi All

> because that's the one that makes PCI_HOST_GENERIC available to
> non-ARM architectures.

You are going to need PCI_HOST_GENERIC for sure to get the driver for the GPEX host PCIe port.

I did my testing on a 4.19 based kernel [1] and the .config is here [2] (sorry, the link I sent for the config yesterday was a 404). This kernel has the non-ARM support for CONFIG_PCI_HOST_GENERIC.

Stephen

[1] https://github.com/sbates130272/linux-p2pmem/tree/riscv-p2p-sifive
[2] https://drive.google.com/file/d/190vcLhF3_pZvyUbNEtjP5gwMI9IaKEsv/view?usp=sharing
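Pulling the kernel-side discussion together, a minimal guest config fragment for the GPEX-based virt machine would look roughly like the following. This is only a sketch assembled from options named in this thread (exact dependencies vary by kernel version), and CONFIG_VIRTIO_PCI is an assumed addition for anyone who also wants virtio-pci devices:

# Sketch: PCIe options discussed in this thread, not a complete .config
CONFIG_PCI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_ECAM=y
CONFIG_PCI_HOST_COMMON=y
CONFIG_PCI_HOST_GENERIC=y   # driver for the GPEX ECAM host bridge
CONFIG_PCIEPORTBUS=y
CONFIG_VIRTIO_PCI=y         # assumed: only needed for virtio-pci devices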
On Wed, Oct 10, 2018 at 11:00 PM Andrea Bolognani <abologna@redhat.com> wrote:
> On Wed, 2018-10-10 at 10:57 -0700, Alistair wrote:
>> On 10/10/2018 05:26 AM, Andrea Bolognani wrote:
>>>   * what should libvirt look for to figure out whether or not a RISC-V guest will have PCI support? For aarch64 we look for the presence of the 'gpex-pcihost' device, but of course that won't work for RISC-V, so we need something else;
>>
>> I'm not sure what you mean here. Why can we not do the same thing with RISC-V?
>
> The gpex-pcihost device was introduced at the same time as aarch64/virt gained PCI support, so we can probe the binary[1] and, if the device is present, we know that aarch64/virt guests will be able to use PCI. We cannot do the same for RISC-V for obvious reasons: the device is already there :)

The device shouldn't exist in the RISC-V QEMU binary though until after this patch series. So it should have the same effect.

Alistair

[...]
On Thu, 2018-10-11 at 10:40 -0700, Alistair Francis wrote:
> On Wed, Oct 10, 2018 at 11:00 PM Andrea Bolognani <abologna@redhat.com> wrote:
>> The gpex-pcihost device was introduced at the same time as aarch64/virt gained PCI support, so we can probe the binary[1] and, if the device is present, we know that aarch64/virt guests will be able to use PCI. We cannot do the same for RISC-V for obvious reasons: the device is already there :)
>
> The device shouldn't exist in the RISC-V QEMU binary though until after this patch series. So it should have the same effect.

Duh! I just checked and *of course* you're right!

Problem solved then :)
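For reference, the kind of check libvirt performs here can be reproduced by hand with the same '-device help' listing used later in this thread; a sketch (the binary path is illustrative):

  $ qemu-system-riscv64 -device help 2>&1 | grep gpex-pcihost
  # a match means the GPEX PCIe host is built in, so PCI support can be expected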
On Fri, Oct 12, 2018 at 6:46 AM Andrea Bolognani <abologna@redhat.com> wrote:
> On Thu, 2018-10-11 at 10:40 -0700, Alistair Francis wrote:
>> On Wed, Oct 10, 2018 at 11:00 PM Andrea Bolognani <abologna@redhat.com> wrote:
>>> The gpex-pcihost device was introduced at the same time as aarch64/virt gained PCI support, so we can probe the binary[1] and, if the device is present, we know that aarch64/virt guests will be able to use PCI. We cannot do the same for RISC-V for obvious reasons: the device is already there :)
>>
>> The device shouldn't exist in the RISC-V QEMU binary though until after this patch series. So it should have the same effect.
>
> Duh! I just checked and *of course* you're right!
>
> Problem solved then :)

Does that mean you have it working?

Alistair
On Fri, 2018-10-12 at 09:12 -0700, Alistair Francis wrote:
> On Fri, Oct 12, 2018 at 6:46 AM Andrea Bolognani <abologna@redhat.com> wrote:
>> Problem solved then :)
>
> Does that mean you have it working?

Yeah, it was trivial once I knew what to look for ;)

One more thing that I forgot to bring up earlier: at the same time as PCIe support is added, we should also make sure that the pcie-root-port device is built into the qemu-system-riscv* binaries by default, as that device being missing will cause PCI-enabled libvirt guests to fail to start.
On Mon, Oct 15, 2018 at 7:39 AM Andrea Bolognani <abologna@redhat.com> wrote:
> On Fri, 2018-10-12 at 09:12 -0700, Alistair Francis wrote:
>> On Fri, Oct 12, 2018 at 6:46 AM Andrea Bolognani <abologna@redhat.com> wrote:
>>> Problem solved then :)
>>
>> Does that mean you have it working?
>
> Yeah, it was trivial once I knew what to look for ;)
>
> One more thing that I forgot to bring up earlier: at the same time as PCIe support is added, we should also make sure that the pcie-root-port device is built into the qemu-system-riscv* binaries by default, as that device being missing will cause PCI-enabled libvirt guests to fail to start.

We are doing that, aren't we?

Alistair
On Mon, 2018-10-15 at 09:59 -0700, Alistair Francis wrote:
> On Mon, Oct 15, 2018 at 7:39 AM Andrea Bolognani <abologna@redhat.com> wrote:
>> One more thing that I forgot to bring up earlier: at the same time as PCIe support is added, we should also make sure that the pcie-root-port device is built into the qemu-system-riscv* binaries by default, as that device being missing will cause PCI-enabled libvirt guests to fail to start.
>
> We are doing that, aren't we?

Doesn't look that way:

  $ riscv64-softmmu/qemu-system-riscv64 -device help 2>&1 | head -5
  Controller/Bridge/Hub devices:
  name "pci-bridge", bus PCI, desc "Standard PCI Bridge"
  name "pci-bridge-seat", bus PCI, desc "Standard PCI Bridge (multiseat)"
  name "vfio-pci-igd-lpc-bridge", bus PCI, desc "VFIO dummy ISA/LPC bridge for IGD assignment"
  $

Looking at the output of '-device help' in its entirety, I think there's a lot of stuff in there that doesn't quite belong with a RISC-V guest and that it would probably make sense to compile out.
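In the QEMU build system of this era that comes down to the default-configs entries for the RISC-V targets. A sketch only: CONFIG_PCIE_PORT is the option mentioned in the reply below, while CONFIG_PCI_GENERIC is assumed to be the switch that pulls in the GPEX model:

  # default-configs/riscv64-softmmu.mak (sketch)
  CONFIG_PCI=y
  CONFIG_PCI_GENERIC=y   # GPEX host bridge (assumed option name)
  CONFIG_PCIE_PORT=y     # builds pcie-root-port and related port devices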
On Tue, 2018-10-16 at 09:38 +0200, Andrea Bolognani wrote:
> On Mon, 2018-10-15 at 09:59 -0700, Alistair Francis wrote:
>> On Mon, Oct 15, 2018 at 7:39 AM Andrea Bolognani <abologna@redhat.com> wrote:
>>> One more thing that I forgot to bring up earlier: at the same time as PCIe support is added, we should also make sure that the pcie-root-port device is built into the qemu-system-riscv* binaries by default, as that device being missing will cause PCI-enabled libvirt guests to fail to start.
>>
>> We are doing that, aren't we?
>
> Doesn't look that way:
>
>   $ riscv64-softmmu/qemu-system-riscv64 -device help 2>&1 | head -5
>   Controller/Bridge/Hub devices:
>   name "pci-bridge", bus PCI, desc "Standard PCI Bridge"
>   name "pci-bridge-seat", bus PCI, desc "Standard PCI Bridge (multiseat)"
>   name "vfio-pci-igd-lpc-bridge", bus PCI, desc "VFIO dummy ISA/LPC bridge for IGD assignment"
>   $

Okay, I've (slow) cooked myself a BBL with CONFIG_PCI_HOST_GENERIC=y, a QEMU with CONFIG_PCIE_PORT=y and a libvirt with RISC-V PCI support.

With all of the above in place, I could finally define a mmio-less guest which... failed to boot pretty much right away:

error: Failed to start domain riscv
error: internal error: process exited while connecting to monitor: 2018-10-16T13:32:20.713064Z qemu-system-riscv64: -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1: MSI-X is not supported by interrupt controller

Well, okay then. As a second attempt, I manually placed all virtio devices on pcie.0, overriding libvirt's own address assignment algorithm and getting rid of pcie-root-ports at the same time. Now the guest will actually start, but soon enough

OF: PCI: host bridge /pci@2000000000 ranges:
OF: PCI: No bus range found for /pci@2000000000, using [bus 00-ff]
OF: PCI: MEM 0x40000000..0x5fffffff -> 0x40000000
pci-host-generic 2000000000.pci: ECAM area [mem 0x2000000000-0x2003ffffff] can only accommodate [bus 00-3f] (reduced from [bus 00-ff] desired)
pci-host-generic 2000000000.pci: ECAM at [mem 0x2000000000-0x2003ffffff] for [bus 00-3f]
pci-host-generic 2000000000.pci: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x40000000-0x5fffffff]
pci 0000:00:02.0: BAR 6: assigned [mem 0x40000000-0x4003ffff pref]
pci 0000:00:01.0: BAR 4: assigned [mem 0x40040000-0x40043fff 64bit pref]
pci 0000:00:02.0: BAR 4: assigned [mem 0x40044000-0x40047fff 64bit pref]
pci 0000:00:03.0: BAR 4: assigned [mem 0x40048000-0x4004bfff 64bit pref]
pci 0000:00:04.0: BAR 4: assigned [mem 0x4004c000-0x4004ffff 64bit pref]
pci 0000:00:01.0: BAR 0: no space for [io size 0x0040]
pci 0000:00:01.0: BAR 0: failed to assign [io size 0x0040]
pci 0000:00:02.0: BAR 0: no space for [io size 0x0020]
pci 0000:00:02.0: BAR 0: failed to assign [io size 0x0020]
pci 0000:00:03.0: BAR 0: no space for [io size 0x0020]
pci 0000:00:03.0: BAR 0: failed to assign [io size 0x0020]
pci 0000:00:04.0: BAR 0: no space for [io size 0x0020]
pci 0000:00:04.0: BAR 0: failed to assign [io size 0x0020]
virtio-pci 0000:00:01.0: enabling device (0000 -> 0002)
virtio-pci 0000:00:02.0: enabling device (0000 -> 0002)
virtio-pci 0000:00:03.0: enabling device (0000 -> 0002)
virtio-pci 0000:00:04.0: enabling device (0000 -> 0002)

will show up on the console and boot will not progress any further.

I tried making only the disk virtio-pci, leaving all other devices as virtio-mmio, but that too failed to boot, with a similar message about IO space exhaustion.

If the network device is the only one using virtio-pci, though, despite still getting

pci 0000:00:01.0: BAR 0: no space for [io size 0x0020]
pci 0000:00:01.0: BAR 0: failed to assign [io size 0x0020]

I can get all the way to a prompt, and the device will show up in the output of lspci:

00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
Subsystem: Red Hat, Inc. Device 1100
Flags: fast devsel

lspci: Unable to load libkmod resources: error -12

00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device
Subsystem: Red Hat, Inc. Device 0001
Flags: bus master, fast devsel, latency 0, IRQ 1
I/O ports at <unassigned> [disabled]
Memory at 40040000 (64-bit, prefetchable) [size=16K]
[virtual] Expansion ROM at 40000000 [disabled] [size=256K]
Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
Capabilities: [70] Vendor Specific Information: VirtIO: Notify
Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
Capabilities: [50] Vendor Specific Information: VirtIO: ISR
Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
Kernel driver in use: virtio-pci

So it looks like virtio-pci is not quite usable yet; still, this is definitely some progress over the status quo!

Does anyone have any ideas on how to bridge the gap separating us from a pure virtio-pci RISC-V guest?
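One hedged idea for the failed I/O BAR assignments (an untested sketch, not something established in this thread): the [io size 0x0020] BAR is the virtio legacy interface, and this host bridge setup exposes no PIO window, so forcing the devices into modern-only (virtio-1) mode should drop that BAR entirely, along the lines of:

  -device virtio-net-pci,disable-legacy=on,disable-modern=off,bus=pcie.0,netdev=net0

Whether that also helps the boot hang is a separate question, since the MSI-X/interrupt limitations discussed above are still in play.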
On Tue, 2018-10-16 at 16:11 +0200, Andrea Bolognani wrote:
[...]
> If the network device is the only one using virtio-pci, though, despite still getting
>
> pci 0000:00:01.0: BAR 0: no space for [io size 0x0020]
> pci 0000:00:01.0: BAR 0: failed to assign [io size 0x0020]
>
> I can get all the way to a prompt, and the device will show up in the output of lspci:
[...]
> 00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device
> Subsystem: Red Hat, Inc. Device 0001
> Flags: bus master, fast devsel, latency 0, IRQ 1
> I/O ports at <unassigned> [disabled]
> Memory at 40040000 (64-bit, prefetchable) [size=16K]
> [virtual] Expansion ROM at 40000000 [disabled] [size=256K]
> Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
> Capabilities: [70] Vendor Specific Information: VirtIO: Notify
> Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
> Capabilities: [50] Vendor Specific Information: VirtIO: ISR
> Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
> Kernel driver in use: virtio-pci

Forgot to mention that, despite showing up, the device doesn't quite work: for example, it doesn't seem to be able to acquire an IP address.
> Forgot to mention that, despite showing up, the device doesn't quite
> work: for example, it doesn't seem to be able to acquire an IP address.

Andrea and Alistair

A lot of your issues look very similar to what I saw. The PCIe devices can be accessed via MMIO, but interrupts are broken, which causes issues with the devices (e.g. NVMe or e1000e).

I think we need to root-cause and resolve this issue before we can merge the series. Alistair, I think you need to review how interrupts are wired up in the PCIe root device. Maybe compare it against how it was done for arm, as it looks different.

For now I would concentrate on the virt machine (since it has a working equivalent in the arm tree). We can then look at the sifive_u machine after that.

Cheers

Stephen