[00/28] arm64: Dom0 ITS emulation

Message ID CALicx6sbypKc1x9Qu+J0rf6iKGmK7eDpTwUn14b51-QmdUfKhQ@mail.gmail.com
State New, archived

Commit Message

Vijay Kilari Feb. 13, 2017, 1:53 p.m. UTC
Hi Andre,

  I tried your patch series on HW. Dom0 boots, but no LPIs are delivered to Dom0.
So I wrote the patch below to include the segment ID when generating the devid.
With it applied, I see the panic below from _xmalloc().

The complete log is here:
http://pastebin.com/btythn2V

[    3.280979] iommu: Adding device 0000:01:07.2 to group 26
[    3.286373] pci 0000:01:07.3: [177d:a02f] type 00 class 0x058000
[    3.292463] pci 0000:01:07.3: BAR 0: [mem 0x87e05b000000-0x87e05b7fffff 64bit] (from Enhanced Allocation, properties 0x0)
[    3.303457] pci 0000:01:07.3: BAR 4: [mem 0x87e05bf00000-0x87e05bffffff 64bit] (from Enhanced Allocation, properties 0x0)
(XEN) In do_physdev_op calling gicv3_its_map_device for S: 0 B: 1 F:59 DEVID 315
(XEN) Hypervisor Trap. HSR=0x96000044 EC=0x25 IL=1 Syndrome=0x44
(XEN) CPU0: Unexpected Trap: Hypervisor
(XEN) ----[ Xen-4.9-unstable  arm64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) PC:     0000000000235e3c xmem_pool_alloc+0x33c/0x49c
(XEN) LR:     0000000000235c28
(XEN) SP:     0000801ffaba7c10
(XEN) CPSR:   20000349 MODE:64-bit EL2h (Hypervisor, handler)
(XEN)      X0: 000000000000001a  X1: 8022bf008022be00  X2: 0000801ffff165d0
(XEN)      X3: 000000000a0e0a0e  X4: 0000000000000001  X5: 0000000000000000
(XEN)      X6: 0000000000000004  X7: 0000000000000003  X8: 00000000fffffffb
(XEN)      X9: 000000000000000a X10: 0000801ffaba7ad8 X11: 0000000000000033
(XEN)     X12: 0000000000000003 X13: 0000000000268df8 X14: 0000000000000020
(XEN)     X15: 0000000000000000 X16: 0000000000000021 X17: 000000000000000b
(XEN)     X18: 000000000000000d X19: 0000801ffff16000 X20: 0000000000000005
(XEN)     X21: 0000000000000150 X22: 0000801ffff17868 X23: 000000000000003a
(XEN)     X24: 00000000ffffffff X25: 000000000000001f X26: 0000000000000039
(XEN)     X27: 0000000000000000 X28: 0000801fff8e9160  FP: 0000801ffaba7c10
(XEN)
(XEN)   VTCR_EL2: 80053590
(XEN)  VTTBR_EL2: 0001001ffabac000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 000000008038663f
(XEN)  TTBR0_EL2: 0000001fffefe000
(XEN)
(XEN)    ESR_EL2: 96000044
(XEN)  HPFAR_EL2: 0000008010000000
(XEN)    FAR_EL2: 8022bf008022be10
(XEN)
(XEN) Xen stack trace from sp=0000801ffaba7c10:
(XEN)    0000801ffaba7c70 0000000000236394 0000000000000100 0000000000000060
(XEN)    0000000000000150 0000801ff5a62530 0000000000000000 0000801ff5a62528
(XEN)    0000801ffa32f0e0 0000000000000005 000000000000013b ffff8000ea2c6800
(XEN)    0000801ffaba7cc0 000000000024c154 0000801fff8ebec0 000000000000013b
(XEN)    0000801ff5a62000 0000801ff5a62530 0000000000000000 0000801ff5a62528
(XEN)    0000801ffa32f0e0 0000801ffaba7d70 0000801ffaba7d70 0000000000253590
(XEN)    000000000000013b 0000801ffaba7f30 ffff8000000ba1b8 ffff8000000ba1b8
(XEN)    0000000060000045 ffff800000e40000 ffff800000f50000 0000000000000000
(XEN)    ffff800000e30ec8 ffff8000ea2c6800 0000801ffaba7d30 00000000ffffffc8
(XEN)    0000000060000045 000000000026be48 0000000000000000 0000000000000001
(XEN)    000000000000003b 000000000000013b 0000801ffaba7da4 ffff8012f21bb450
(XEN)    0000801ffaba7db0 0000000000254de4 0000801ffaba7eb0 00000000002549f4
(XEN)    000000005a000ea1 000000013b010000 0000000000000000 ffff8000000ba1b8
(XEN)    0000801ffaba7e10 00000000002572a4 000000005a000ea1 0000801ffaba7eb0
(XEN)    000000005a000ea1 0000000000256550 0000801ffaba7e70 0000000000248e34
(XEN)    0000000000313c80 ffff800000d96000 ffffffffffffffff ffff800000525484
(XEN)    ffff8012f3aeb780 000000000025fb54 ffff8012f21cf098 ffff8012f21cf000
(XEN)    ffffffffffffffff ffff8000000ba1b8 0000000060000045 ffff800000e40000
(XEN)    ffff800000f50000 0000000000000000 ffff800000e30ec8 ffff8000ea2c6800
(XEN)    0000801ffaba7e90 000000000025827c 0000000000000002 0000000000258298
(XEN)    ffff8012f3aeafd0 000000000025fb58 0000000000000002 ffff800000d96000
(XEN)    0000000000000019 ffff8012f3aeb7c0 0000000000000007 0000000000000000
(XEN)    0000000000000001 ffff800000e34330 0000000080808080 ffff8012f21bb450
(XEN)    7f7f7f7f7f7f7f7f 5e646c68736d7471 7f7f7f7f7f7f7f7f 0101010101010101
(XEN)    0000000000000020 6962343620666666 6d6f726628205d74 0000000000000000
(XEN)    0000000000000021 000000000000000b 000000000000000d ffff8012f21cf098
(XEN)    ffff8012f21cf000 ffff800000d90000 0000000000000000 ffff8012f21cf098
(XEN)    ffff800000e40000 ffff800000f50000 0000000000000000 ffff800000e30ec8
(XEN)    ffff8000ea2c6800 ffff8012f3aeb780 ffff800000599718 ffffffffffffffff
(XEN)    ffff8000000ba1b8 0000000060000045 0000000060000045 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff8012f3aeb780 ffff80000010b068
(XEN)    0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<0000000000235e3c>] xmem_pool_alloc+0x33c/0x49c (PC)
(XEN)    [<0000000000235c28>] xmem_pool_alloc+0x128/0x49c (LR)
(XEN)    [<0000000000236394>] _xmalloc+0xfc/0x274
(XEN)    [<000000000024c154>] gicv3_its_map_guest_device+0xb0/0x2a0
(XEN)    [<0000000000253590>] do_physdev_op+0xc4/0x114
(XEN)    [<0000000000254de4>] traps.c#do_trap_hypercall+0x90/0x12c
(XEN)    [<00000000002572a4>] do_trap_hypervisor+0xd88/0x1c6c
(XEN)    [<000000000025fb54>] entry.o#guest_sync+0x90/0xc0
(XEN)
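
For reference, the syndrome value in the trap line can be decoded by hand. The following is a minimal sketch, assuming the standard ARMv8-A ESR_ELx layout for data aborts; `struct esr_fields` and `decode_esr()` are illustrative names and not Xen functions.

```c
#include <stdint.h>

/* Selected ESR_ELx fields for a data abort (ARMv8-A layout). */
struct esr_fields {
    unsigned int ec;    /* exception class, bits [31:26] */
    unsigned int il;    /* instruction length bit, bit [25] */
    unsigned int wnr;   /* write-not-read, ISS bit [6] */
    unsigned int dfsc;  /* data fault status code, ISS bits [5:0] */
};

/* Hypothetical helper: split a raw syndrome value into its fields. */
static struct esr_fields decode_esr(uint32_t esr)
{
    struct esr_fields f;

    f.ec   = (esr >> 26) & 0x3f;
    f.il   = (esr >> 25) & 0x1;
    f.wnr  = (esr >> 6) & 0x1;
    f.dfsc = esr & 0x3f;

    return f;
}
```

For 0x96000044 this gives EC=0x25 (data abort taken without a change of exception level), IL=1, WnR=1 (a write) and DFSC=0x04 (translation fault, level 0). FAR_EL2 equals X1 + 0x10, so the faulting store appears to go through the pointer held in X1.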

Note: I added a debug print to do_physdev_op(); it produces the following line in the log:
(XEN) In do_physdev_op calling gicv3_its_map_device for S: 0 B: 1 F:59 DEVID 315
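
The DEVID in that line follows directly from the segment/bus/devfn packing used in the patch. A small sketch of the arithmetic; `pci_sbdf_to_devid()` is a hypothetical helper name that only mirrors the expression from the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack a PCI segment, bus and devfn into a device ID
 * the same way the patch does (seg << 16 | bus << 8 | devfn).
 */
static uint32_t pci_sbdf_to_devid(uint16_t seg, uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)seg << 16) | ((uint32_t)bus << 8) | devfn;
}

/* devfn holds the device number in bits [7:3] and the function in [2:0]. */
static unsigned int devfn_to_dev(uint8_t devfn)
{
    return devfn >> 3;
}

static unsigned int devfn_to_fn(uint8_t devfn)
{
    return devfn & 0x7;
}
```

With S: 0, B: 1, F: 59 this yields 0x13b = 315, and devfn 59 splits into device 7, function 3, which matches 0000:01:07.3 in the Linux log above.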


On Tue, Jan 31, 2017 at 12:01 AM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi,
>
> after the two RFC versions now the first "serious" attempt for emulating
> an ARM GICv3 ITS interrupt controller, for Dom0 only at the moment.
> The ITS is an interrupt controller widget providing a sophisticated way
> of dealing with MSIs in a scalable manner.
> For hardware which relies on the ITS to provide interrupts for its
> peripherals this code is needed to get a machine booted into Dom0 at all.
> ITS emulation for DomUs is only really useful with PCI passthrough,
> which is not yet available for ARM. It is expected that this feature
> will be co-developed with the ITS DomU code. However, this code drop
> already takes DomU emulation into account, to keep later architectural
> changes to a minimum.
>
> Some generic design principles:
>
> * The current GIC code statically allocates structures for each supported
> IRQ (both for the host and the guest), which due to the potentially
> millions of LPI interrupts is not feasible to copy for the ITS.
> So we refrain from introducing the ITS as a first class Xen interrupt
> controller, also we don't hold struct irq_desc's or struct pending_irq's
> for each possible LPI.
> Fortunately LPIs are only interesting to guests, so we get away with
> storing only the virtual IRQ number and the guest VCPU for each allocated
> host LPI, which can be stashed into one uint64_t. This data is stored in
> a two-level table, which is both memory efficient and quick to access.
> We hook into the existing IRQ handling and VGIC code to avoid accessing
> the normal structures, providing alternative methods for getting the
> needed information (priority, is enabled?) for LPIs.
> For interrupts which are queued to or are actually in a guest we
> allocate struct pending_irq's on demand. As it is expected that only a
> very small number of interrupts is ever on a VCPU at the same time, this
> seems like the best approach. For now allocated structs are re-used and
> held in a linked list. Should it emerge that traversing a linked list
> is a performance issue, this can be changed to use a hash table.
>
> * On the guest side we (later will) have to deal with malicious guests
> trying to hog Xen with mapping requests for a lot of LPIs, for instance.
> As the ITS actually uses system memory for storing status information,
> we use this memory (which the guest has to provide) to naturally limit
> a guest. For those tables which are page sized (devices, collections (CPUs),
> LPI properties) we map those pages into Xen, so we can easily access
> them from the virtual GIC code.
> Unfortunately the actual interrupt mapping tables are not necessarily
> page aligned, also can be much smaller than a page, so mapping all of
> them permanently is fiddly. As ITS commands that need to iterate those
> tables are pretty rare after all, we for now map them on demand upon
> emulating a virtual ITS command. This is acceptable because "mapping"
> them is actually very cheap on arm64. Also as we can't properly protect
> those areas due to their sub-page-size property, we validate the data
> in there before actually using it. The vITS code basically just stores
> the data in there which the guest has actually transferred via the
> virtual ITS command queue before, so there is no secret revealed nor
> does it create an attack vector for a malicious guest.
>
> * An obvious approach to handling some guest ITS commands would be to
> propagate them to the host, for instance to map devices and LPIs and
> to enable or disable LPIs.
> However this (later with DomU support) will create an attack vector, as
> a malicious guest could try to fill the host command queue with
> propagated commands.
> So (in contrast to the first RFC post) we completely avoid this situation.
> For mapping devices and LPIs we rely on this being done via a hypercall
> prior to the actual guest run. For enabling and disabling LPIs we keep
> this bit on the virtual side and let LPIs always be enabled on the host side,
> dealing with the consequences this approach creates.
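
The host LPI bookkeeping from the first design principle above (one uint64_t per host LPI, kept in a two-level table) could look roughly like the sketch below. This is not the code from the series: the field widths (32-bit virtual IRQ, 16-bit VCPU ID), the 512-entry second-level pages, the first-level size, the use of calloc() and all function names are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

#define LPI_BASE          8192u  /* first LPI INTID in GICv3 */
#define ENTRIES_PER_PAGE  512u   /* assumed: 4 KiB page / sizeof(uint64_t) */
#define MAX_L1_ENTRIES    64u    /* assumed cap, illustration only */

/* First level: pointers to second-level pages, allocated on demand. */
static uint64_t *lpi_l1[MAX_L1_ENTRIES];

/* Assumed packing: VCPU ID in bits [47:32], virtual IRQ in bits [31:0]. */
static uint64_t pack_lpi(uint32_t virq, uint16_t vcpu)
{
    return ((uint64_t)vcpu << 32) | virq;
}

static uint32_t lpi_virq(uint64_t entry) { return (uint32_t)entry; }
static uint16_t lpi_vcpu(uint64_t entry) { return (uint16_t)(entry >> 32); }

/* Record the target of a host LPI; returns 0 on success, -1 on error. */
static int lpi_set(uint32_t host_lpi, uint32_t virq, uint16_t vcpu)
{
    uint32_t idx, l1;

    if (host_lpi < LPI_BASE)
        return -1;
    idx = host_lpi - LPI_BASE;
    l1 = idx / ENTRIES_PER_PAGE;
    if (l1 >= MAX_L1_ENTRIES)
        return -1;

    if (!lpi_l1[l1]) {
        lpi_l1[l1] = calloc(ENTRIES_PER_PAGE, sizeof(uint64_t));
        if (!lpi_l1[l1])
            return -1;
    }
    lpi_l1[l1][idx % ENTRIES_PER_PAGE] = pack_lpi(virq, vcpu);

    return 0;
}

/* Look up a host LPI; returns -1 if its second-level page was never allocated. */
static int lpi_get(uint32_t host_lpi, uint64_t *entry)
{
    uint32_t idx, l1;

    if (host_lpi < LPI_BASE)
        return -1;
    idx = host_lpi - LPI_BASE;
    l1 = idx / ENTRIES_PER_PAGE;
    if (l1 >= MAX_L1_ENTRIES || !lpi_l1[l1])
        return -1;

    *entry = lpi_l1[l1][idx % ENTRIES_PER_PAGE];
    return 0;
}
```

Keeping each entry in a single 64-bit word means it can be read or written in one access, and second-level pages are only allocated for LPI ranges that are actually in use.
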
>
> As it is expected that the ITS support will become a tech preview in the
> first release, there is a Kconfig option to enable it. Also it is
> supported on arm64 only, which will most likely not change in the future.
> This leads to some hideous constructs like an #ifdef'ed header file with
> empty function stubs, I have some hope we can still clean this up.
> Also some parameters are config options which can be overridden on the
> Xen commandline. This is to support experimentation and adaption to
> various platforms, ideally we find either one-size-fits-all values or
> find another way of getting rid of this.
>
> Compared to the previous post (RFC-v2) this has seen a lot of reworks
> and cleanups in various areas.
> I tried to address all of the review comments, though some are hard to
> follow due to rewrites. So apologies if some points have slipped through.
> Allocating and mapping of memory for both the physical and virtual ITS
> and redistributor tables has been improved, though I didn't manage to
> write protect the virtual tables from a guest without impacting access
> from Xen at the same time. I will need to take a deeper look into this,
> but ideally it's only a small change in get_guest_pages().
>
> This code boots Dom0 on an ARM Fast Model with ITS support. I tried to
> address the issues seen by people running the previous version on real
> hardware, though couldn't verify this here for myself.
> So any testing, bug reports (and possibly even fixes) are very welcome.
>
> The code can also be found on the its/v1 branch here:
> git://linux-arm.org/xen-ap.git
> http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/v1
>
> Cheers,
> Andre
>
> (Rough) changelog RFC-v2 .. v1:
> - split host ITS driver into gic-v3-lpi.c and gic-v3-its.c part
> - rename virtual ITS driver file to vgic-v3-its.c
> - use macros and named constants for all magic numbers
> - use atomic accessors for accessing the host LPI data
> - remove leftovers from connecting virtual and host ITSes
> - bail out if host ITS is disabled in the DT
> - rework map/unmap_guest_pages():
>     - split off p2m part as get/put_guest_pages (to be done on allocation)
>     - get rid of vmap, using map_domain_page() instead
> - delay allocation of virtual tables until actual LPI/ITS enablement
> - properly size both virtual and physical tables upon allocation
> - fix put_domain() locking issues in physdev_op and LPI handling code
> - add and extend comments in various areas
> - fix lotsa coding style and white space issues, including comment style
> - add locking to data structures not yet covered
> - fix various locking issues
> - use an rbtree to deal with ITS devices (instead of a list)
> - properly handle memory attributes for ITS tables
> - handle cacheable/non-cacheable ITS table mappings
> - sanitize guest provided ITS/LPI table attributes
> - fix breakage on non-GICv2 compatible host GICv3 controllers
> - add command line parameters on top of Kconfig options
> - properly wait for an ITS to become quiescent before enabling it
> - handle host ITS command queue errors
> - actually wait for host ITS command completion (READR==WRITER)
> - fix ARM32 compilation
> - various patch splits and reorderings
>
> Andre Przywara (28):
>   ARM: export __flush_dcache_area()
>   ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
>   ARM: GICv3: allocate LPI pending and property table
>   ARM: GICv3 ITS: allocate device and collection table
>   ARM: GICv3 ITS: map ITS command buffer
>   ARM: GICv3 ITS: introduce ITS command handling
>   ARM: GICv3 ITS: introduce device mapping
>   ARM: GICv3 ITS: introduce host LPI array
>   ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
>   ARM: GICv3: introduce separate pending_irq structs for LPIs
>   ARM: GICv3: forward pending LPIs to guests
>   ARM: GICv3: enable ITS and LPIs on the host
>   ARM: vGICv3: handle virtual LPI pending and property tables
>   ARM: vGICv3: Handle disabled LPIs
>   ARM: vGICv3: introduce basic ITS emulation bits
>   ARM: vITS: introduce translation table walks
>   ARM: vITS: handle CLEAR command
>   ARM: vITS: handle INT command
>   ARM: vITS: handle MAPC command
>   ARM: vITS: handle MAPD command
>   ARM: vITS: handle MAPTI command
>   ARM: vITS: handle MOVI command
>   ARM: vITS: handle DISCARD command
>   ARM: vITS: handle INV command
>   ARM: vITS: handle INVALL command
>   ARM: vITS: create and initialize virtual ITSes for Dom0
>   ARM: vITS: create ITS subnodes for Dom0 DT
>   ARM: vGIC: advertising LPI support
>
>  xen/arch/arm/Kconfig              |  33 ++
>  xen/arch/arm/Makefile             |   3 +
>  xen/arch/arm/efi/efi-boot.h       |   1 -
>  xen/arch/arm/gic-v3-its.c         | 825 +++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3-lpi.c         | 414 +++++++++++++++++
>  xen/arch/arm/gic-v3.c             |  98 +++-
>  xen/arch/arm/gic.c                |   9 +-
>  xen/arch/arm/physdev.c            |  21 +
>  xen/arch/arm/vgic-v3-its.c        | 929 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            | 347 ++++++++++++--
>  xen/arch/arm/vgic.c               |  68 ++-
>  xen/include/asm-arm/atomic.h      |   6 +-
>  xen/include/asm-arm/bitops.h      |   1 +
>  xen/include/asm-arm/cache.h       |   4 +
>  xen/include/asm-arm/domain.h      |  14 +-
>  xen/include/asm-arm/gic.h         |   7 +
>  xen/include/asm-arm/gic_v3_defs.h |  73 ++-
>  xen/include/asm-arm/gic_v3_its.h  | 241 ++++++++++
>  xen/include/asm-arm/irq.h         |   8 +
>  xen/include/asm-arm/vgic.h        |  34 ++
>  20 files changed, 3089 insertions(+), 47 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-its.c
>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
>  create mode 100644 xen/arch/arm/vgic-v3-its.c
>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
>
> --
> 2.9.0
>

Comments

Stefano Stabellini Feb. 14, 2017, 10 p.m. UTC | #1
On Mon, 13 Feb 2017, Vijay Kilari wrote:
> Hi Andre,
> 
>   I tried your patch series on HW. Dom0 boots but no LPIs are coming to Dom0.
> So I made below patch to consider segment ID in generating devid,
>  I see below panic from _xmalloc().
> 
> Complete log is here
> http://pastebin.com/btythn2V
> 
> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
> index 6e02de4..72ffe9f 100644
> --- a/xen/arch/arm/physdev.c
> +++ b/xen/arch/arm/physdev.c
> @@ -17,6 +17,7 @@
>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      struct physdev_manage_pci manage;
> +    struct physdev_pci_device_add pci_add;
>      u32 devid;
>      int ret;
> 
> @@ -33,6 +34,19 @@ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>                                               cmd == PHYSDEVOP_manage_pci_add);
> 
>              return ret;
> +       case PHYSDEVOP_pci_device_add:
> +            if ( copy_from_guest(&pci_add, arg, 1) != 0 )
> +                return -EFAULT;
> +            devid = pci_add.seg << 16 | pci_add.bus << 8 | pci_add.devfn;
> +
> +            printk("In %s calling gicv3_its_map_device for S: %d B: %d F:%d DEVID %u\n",
> +                    __func__, pci_add.seg, pci_add.bus, pci_add.devfn, devid);
> +            /* Allocate an ITS device table with space for 32 MSIs */
> +            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
> +                                       cmd == PHYSDEVOP_pci_device_add);
> +
> +            return ret;
>      }

Hi Vijay, thanks for testing the series. Instead of implementing
PHYSDEVOP_pci_device_add here, could you call gicv3_its_map_guest_device
for each device statically from a Cavium specific platform file under
xen/arch/arm/platforms?

Once we have a clearer idea about which hypercalls to implement and how,
we'll do this properly.
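
A rough sketch of what such static mapping might look like. Everything here is hypothetical: the device table, the `platform_map_its_devices()` name and the stubbed `gicv3_its_map_guest_device()` (whose parameter types are guessed from the call in the patch) are only for illustration, and no existing Xen platform hook is implied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct domain;  /* opaque in this sketch */

/* Stub standing in for the series' function; it only records the calls. */
static unsigned int mapped_count;
static uint32_t last_devid;

static int gicv3_its_map_guest_device(struct domain *d, uint32_t host_devid,
                                      uint32_t guest_devid, uint32_t bits,
                                      bool valid)
{
    (void)d; (void)guest_devid; (void)bits; (void)valid;
    last_devid = host_devid;
    mapped_count++;
    return 0;
}

struct pci_sbdf {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Hypothetical static list; the two entries are devices seen in the log. */
static const struct pci_sbdf platform_its_devices[] = {
    { 0, 1, 59 },  /* 0000:01:07.3 */
    { 0, 1, 58 },  /* 0000:01:07.2 */
};

/* Map every listed device for the given domain, stopping at the first error. */
static int platform_map_its_devices(struct domain *hwdom)
{
    size_t i;

    for (i = 0; i < sizeof(platform_its_devices) /
                    sizeof(platform_its_devices[0]); i++) {
        const struct pci_sbdf *p = &platform_its_devices[i];
        uint32_t devid = ((uint32_t)p->seg << 16) |
                         ((uint32_t)p->bus << 8) | p->devfn;
        /* 5 bits of event ID space, i.e. room for 32 MSIs, as in the patch. */
        int ret = gicv3_its_map_guest_device(hwdom, devid, devid, 5, true);

        if (ret)
            return ret;
    }

    return 0;
}
```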

Patch

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index 6e02de4..72ffe9f 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -17,6 +17,7 @@ 
 int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct physdev_manage_pci manage;
+    struct physdev_pci_device_add pci_add;
     u32 devid;
     int ret;

@@ -33,6 +34,19 @@  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
                                              cmd == PHYSDEVOP_manage_pci_add);

             return ret;
+       case PHYSDEVOP_pci_device_add:
+            if ( copy_from_guest(&pci_add, arg, 1) != 0 )
+                return -EFAULT;
+            devid = pci_add.seg << 16 | pci_add.bus << 8 | pci_add.devfn;
+
+            printk("In %s calling gicv3_its_map_device for S: %d B: %d F:%d DEVID %u\n",
+                    __func__, pci_add.seg, pci_add.bus, pci_add.devfn, devid);
+            /* Allocate an ITS device table with space for 32 MSIs */
+            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
+                                       cmd == PHYSDEVOP_pci_device_add);
+
+            return ret;
     }