Message ID | 20170316112030.20419-3-andre.przywara@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Andre, On 03/16/2017 11:20 AM, Andre Przywara wrote: > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > index bf64c61..86f7b53 100644 > --- a/xen/arch/arm/Kconfig > +++ b/xen/arch/arm/Kconfig > @@ -49,6 +49,21 @@ config HAS_ITS > bool "GICv3 ITS MSI controller support" > depends on HAS_GICV3 > > +config MAX_PHYS_LPI_BITS > + depends on HAS_ITS > + int "Maximum bits for GICv3 host LPIs (14-32)" > + range 14 32 > + default "20" > + help > + Specifies the maximum number of LPIs (in bits) Xen should take > + care of. The host ITS may provide support for a very large number > + of supported LPIs, for all of which we may not want to allocate > + memory, so this number here allows to limit this. > + Xen itself does not know how many LPIs domains will ever need > + beforehand. > + This can be overridden on the command line with the max_lpi_bits > + parameter. I continue to disagree on this Kconfig option. There is no sensible default value and I don't see how a distribution will be able to pick-up one value here. If the number of LPIs have to be restricted, this should be done via the command line or a per-platform option. I have already made my point on previous e-mails so I am not going to argue more. > + > endmenu > > menu "ARM errata workaround via the alternative framework" > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > index 54860e0..02a8737 100644 > --- a/xen/arch/arm/Makefile > +++ b/xen/arch/arm/Makefile > @@ -19,6 +19,7 @@ obj-y += gic.o > obj-y += gic-v2.o > obj-$(CONFIG_HAS_GICV3) += gic-v3.o > obj-$(CONFIG_HAS_ITS) += gic-v3-its.o > +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o > obj-y += guestcopy.o > obj-y += hvm.o > obj-y += io.o > diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c > new file mode 100644 > index 0000000..4f8414b > --- /dev/null > +++ b/xen/arch/arm/gic-v3-lpi.c > @@ -0,0 +1,201 @@ > +/* > + * xen/arch/arm/gic-v3-lpi.c > + * > + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support > + * > + * Copyright (C) 2016,2017 - ARM Ltd > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; under version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; If not, see <http://www.gnu.org/licenses/>. > + */ > + > +#include <xen/lib.h> > +#include <xen/mm.h> > +#include <asm/gic.h> > +#include <asm/gic_v3_defs.h> > +#include <asm/gic_v3_its.h> > +#include <asm/io.h> > +#include <asm/page.h> > + > +#define LPI_PROPTABLE_NEEDS_FLUSHING (1U << 0) NIT: Newline here please. > +/* Global state */ > +static struct { > + /* The global LPI property table, shared by all redistributors. */ > + uint8_t *lpi_property; > + /* > + * Number of physical LPIs the host supports. This is a property of > + * the GIC hardware. We depart from the habit of naming these things > + * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI" > + * in a different context to differentiate them from "virtual LPIs". > + */ > + unsigned long int nr_host_lpis; Why unsigned long? > + unsigned int flags; > +} lpi_data; > + > +struct lpi_redist_data { > + void *pending_table; > +}; > + > +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist); > + > +#define MAX_PHYS_LPIS (lpi_data.nr_host_lpis - LPI_OFFSET) This is fairly confusing. When I read "nr_host_lpis" I understand that you store the number of LPIs. But in fact you store the maximum LPI ID. Also it is very unclear the difference between MAX_PHYS_LPIS and nr_host_lpis and will likely cause trouble in the future. So it looks like to me that nr_host_lpis should be named max_host_lpis_ids and MAX_PHYS_LPIS should be MAX_NR_PHYS_LPIS. Also, please stay consistent with the naming. If you use host then use host everywhere and not a mix between phys and host for the same purpose. > + > +static int gicv3_lpi_allocate_pendtable(uint64_t *reg) > +{ > + uint64_t val; > + unsigned int order; > + void *pendtable; > + > + if ( this_cpu(lpi_redist).pending_table ) > + return -EBUSY; > + > + val = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT; > + val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT; > + val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT; > + > + /* > + * The pending table holds one bit per LPI and even covers bits for > + * interrupt IDs below 8192, so we allocate the full range. > + * The GICv3 imposes a 64KB alignment requirement, also requires > + * physically contigious memory. The bit after the comma seems quite obvious. Both xalloc and alloc_xenheap_pages will ensure that the region will be contiguous in the physical address space. [...] > +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) > +{ > + uint64_t reg; > + > + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; > + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; > + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; > + > + /* > + * The property table is shared across all redistributors, so allocate > + * this only once, but return the same value on subsequent calls. > + */ > + if ( !lpi_data.lpi_property ) > + { > + /* The property table holds one byte per LPI. */ > + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); > + void *table = alloc_xenheap_pages(order, 0); > + > + if ( !table ) > + return -ENOMEM; > + > + /* Make sure the physical address can be encoded in the register. */ > + if ( (virt_to_maddr(table) & ~GENMASK(51, 12)) ) > + { > + free_xenheap_pages(table, 0); > + return -ERANGE; > + } > + memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS); > + clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS); > + lpi_data.lpi_property = table; > + } > + > + /* Encode the number of bits needed, minus one */ > + reg |= ((fls(lpi_data.nr_host_lpis - 1) - 1) << 0); As said on the previous version, please avoid hardcoded shift. [...] > +int gicv3_lpi_init_rdist(void __iomem * rdist_base) > +{ > + uint32_t reg; > + uint64_t table_reg; > + int ret; > + > + /* We don't support LPIs without an ITS. */ > + if ( !gicv3_its_host_has_its() ) > + return -ENODEV; > + > + /* Make sure LPIs are disabled before setting up the tables. */ > + reg = readl_relaxed(rdist_base + GICR_CTLR); > + if ( reg & GICR_CTLR_ENABLE_LPIS ) > + return -EBUSY; > + > + ret = gicv3_lpi_allocate_pendtable(&table_reg); > + if (ret) > + return ret; > + writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER); Same question as on the previous version. The cacheability and shareability may not stick. This was a bug fix in Linux (see commit 241a386) and I am struggling to understand why Xen would not be affected? > + > + return gicv3_lpi_set_proptable(rdist_base); > +} > + > +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS; > +integer_param("max_lpi_bits", max_lpi_bits); > + > +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits) > +{ Please add a comment to explain why you don't sanitize the value from the command line. > + lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits)); nr_host_lpis is "unsigned long" so why are you using BIT_ULL? > + > + printk("GICv3: using at most %ld LPIs on the host.\n", MAX_PHYS_LPIS); s/%ld/%lu/ [...] > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c > index 1512521..ed78363 100644 > --- a/xen/arch/arm/gic-v3.c > +++ b/xen/arch/arm/gic-v3.c > @@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void) > type = readl_relaxed(GICD + GICD_TYPER); > nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1); > > + if ( type & GICD_TYPE_LPIS ) > + gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1); Again, a macro has been suggested on in the last 2 versions to avoid hardcoded value. You said you will fix it and it is not done. Please do it. [...] > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > index 836a103..12bd155 100644 > --- a/xen/include/asm-arm/gic.h > +++ b/xen/include/asm-arm/gic.h > @@ -220,6 +220,8 @@ enum gic_version { > GIC_V3, > }; > > +#define LPI_OFFSET 8192 My comments on the previous version about LPI_OFFSET are not addressed. And one question was left unanswered. > + > extern enum gic_version gic_hw_version(void); > > /* Program the IRQ type into the GIC */ [...] > diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h > index 765a655..219d109 100644 > --- a/xen/include/asm-arm/gic_v3_its.h > +++ b/xen/include/asm-arm/gic_v3_its.h > @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node); > > bool gicv3_its_host_has_its(void); > > +int gicv3_lpi_init_rdist(void __iomem * rdist_base); > + > +/* Initialize the host structures for LPIs. */ > +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis); In the implementation, the parameter is called "hw_lpi_bits". Please stay consistent. > + > #else > > static LIST_HEAD(host_its_list); > @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void) > return false; > } > > +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base) > +{ > + return -ENODEV; > +} > + > +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis) Ditto. > +{ > + return 0; > +} > #endif /* CONFIG_HAS_ITS */ > > #endif >
On Thu, 16 Mar 2017, Andre Przywara wrote: > The ARM GICv3 provides a new kind of interrupt called LPIs. > The pending bits and the configuration data (priority, enable bits) for > those LPIs are stored in tables in normal memory, which software has to > provide to the hardware. > Allocate the required memory, initialize it and hand it over to each > redistributor. The maximum number of LPIs to be used can be adjusted with > the command line option "max_lpi_bits", which defaults to a compile time > constant exposed in Kconfig. > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > --- > docs/misc/xen-command-line.markdown | 9 ++ > xen/arch/arm/Kconfig | 15 +++ > xen/arch/arm/Makefile | 1 + > xen/arch/arm/gic-v3-lpi.c | 201 ++++++++++++++++++++++++++++++++++++ > xen/arch/arm/gic-v3.c | 17 +++ > xen/include/asm-arm/bitops.h | 1 + > xen/include/asm-arm/gic.h | 2 + > xen/include/asm-arm/gic_v3_defs.h | 52 +++++++++- > xen/include/asm-arm/gic_v3_its.h | 14 +++ > 9 files changed, 311 insertions(+), 1 deletion(-) > create mode 100644 xen/arch/arm/gic-v3-lpi.c > > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown > index a11fdf9..619016d 100644 > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI. > ### maxcpus > > `= <integer>` > > +### max\_lpi\_bits > +> `= <integer>` > + > +Specifies the number of ARM GICv3 LPI interrupts to allocate on the host, > +presented as the number of bits needed to encode it. This must be at least > +14 and not exceed 32, and each LPI requires one byte (configuration) and > +one pending bit to be allocated. > +Defaults to 20 bits (to cover at most 1048576 interrupts). > + > ### mce > > `= <integer>` > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > index bf64c61..86f7b53 100644 > --- a/xen/arch/arm/Kconfig > +++ b/xen/arch/arm/Kconfig > @@ -49,6 +49,21 @@ config HAS_ITS > bool "GICv3 ITS MSI controller support" > depends on HAS_GICV3 > > +config MAX_PHYS_LPI_BITS > + depends on HAS_ITS > + int "Maximum bits for GICv3 host LPIs (14-32)" > + range 14 32 > + default "20" > + help > + Specifies the maximum number of LPIs (in bits) Xen should take > + care of. The host ITS may provide support for a very large number > + of supported LPIs, for all of which we may not want to allocate > + memory, so this number here allows to limit this. > + Xen itself does not know how many LPIs domains will ever need > + beforehand. > + This can be overridden on the command line with the max_lpi_bits > + parameter. > + > endmenu > > menu "ARM errata workaround via the alternative framework" > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > index 54860e0..02a8737 100644 > --- a/xen/arch/arm/Makefile > +++ b/xen/arch/arm/Makefile > @@ -19,6 +19,7 @@ obj-y += gic.o > obj-y += gic-v2.o > obj-$(CONFIG_HAS_GICV3) += gic-v3.o > obj-$(CONFIG_HAS_ITS) += gic-v3-its.o > +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o > obj-y += guestcopy.o > obj-y += hvm.o > obj-y += io.o > diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c > new file mode 100644 > index 0000000..4f8414b > --- /dev/null > +++ b/xen/arch/arm/gic-v3-lpi.c > @@ -0,0 +1,201 @@ > +/* > + * xen/arch/arm/gic-v3-lpi.c > + * > + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support > + * > + * Copyright (C) 2016,2017 - ARM Ltd > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; under version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; If not, see <http://www.gnu.org/licenses/>. > + */ > + > +#include <xen/lib.h> > +#include <xen/mm.h> > +#include <asm/gic.h> > +#include <asm/gic_v3_defs.h> > +#include <asm/gic_v3_its.h> > +#include <asm/io.h> > +#include <asm/page.h> > + > +#define LPI_PROPTABLE_NEEDS_FLUSHING (1U << 0) > +/* Global state */ > +static struct { > + /* The global LPI property table, shared by all redistributors. */ > + uint8_t *lpi_property; > + /* > + * Number of physical LPIs the host supports. This is a property of > + * the GIC hardware. We depart from the habit of naming these things > + * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI" > + * in a different context to differentiate them from "virtual LPIs". > + */ > + unsigned long int nr_host_lpis; > + unsigned int flags; > +} lpi_data; > + > +struct lpi_redist_data { > + void *pending_table; > +}; > + > +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist); > + > +#define MAX_PHYS_LPIS (lpi_data.nr_host_lpis - LPI_OFFSET) > + > +static int gicv3_lpi_allocate_pendtable(uint64_t *reg) > +{ > + uint64_t val; > + unsigned int order; > + void *pendtable; > + > + if ( this_cpu(lpi_redist).pending_table ) > + return -EBUSY; > + > + val = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT; > + val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT; > + val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT; > + > + /* > + * The pending table holds one bit per LPI and even covers bits for > + * interrupt IDs below 8192, so we allocate the full range. > + * The GICv3 imposes a 64KB alignment requirement, also requires > + * physically contigious memory. > + */ > + order = max(get_order_from_bytes(lpi_data.nr_host_lpis / 8), 4U); > + pendtable = alloc_xenheap_pages(order, 0); > + if ( !pendtable ) > + return -ENOMEM; > + > + /* Make sure the physical address can be encoded in the register. */ > + if ( (virt_to_maddr(pendtable) & ~GENMASK(51, 16)) ) > + { > + free_xenheap_pages(pendtable, 0); > + return -ERANGE; > + } > + memset(pendtable, 0, lpi_data.nr_host_lpis / 8); > + clean_and_invalidate_dcache_va_range(pendtable, > + lpi_data.nr_host_lpis / 8); > + > + this_cpu(lpi_redist).pending_table = pendtable; > + > + val |= GICR_PENDBASER_PTZ; > + > + val |= virt_to_maddr(pendtable); > + > + *reg = val; > + > + return 0; > +} > + > +/* > + * Tell a redistributor about the (shared) property table, allocating one > + * if not already done. > + */ > +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) > +{ > + uint64_t reg; > + > + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; > + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; > + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; > + > + /* > + * The property table is shared across all redistributors, so allocate > + * this only once, but return the same value on subsequent calls. > + */ > + if ( !lpi_data.lpi_property ) > + { > + /* The property table holds one byte per LPI. */ > + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); > + void *table = alloc_xenheap_pages(order, 0); you investigated the possibility of using _xmalloc to 64K align it? > + if ( !table ) > + return -ENOMEM; > + > + /* Make sure the physical address can be encoded in the register. */ > + if ( (virt_to_maddr(table) & ~GENMASK(51, 12)) ) > + { > + free_xenheap_pages(table, 0); > + return -ERANGE; > + } > + memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS); > + clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS); > + lpi_data.lpi_property = table; > + } > + > + /* Encode the number of bits needed, minus one */ > + reg |= ((fls(lpi_data.nr_host_lpis - 1) - 1) << 0); > + > + reg |= virt_to_maddr(lpi_data.lpi_property); > + > + writeq_relaxed(reg, rdist_base + GICR_PROPBASER); > + reg = readq_relaxed(rdist_base + GICR_PROPBASER); > + > + /* If we can't do shareable, we have to drop cacheability as well. */ > + if ( !(reg & GICR_PROPBASER_SHAREABILITY_MASK) ) > + { > + reg &= ~GICR_PROPBASER_INNER_CACHEABILITY_MASK; > + reg |= GIC_BASER_CACHE_nC << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; > + } > + > + /* Remember that we have to flush the property table if non-cacheable. */ > + if ( (reg & GICR_PROPBASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC ) > + { > + lpi_data.flags |= LPI_PROPTABLE_NEEDS_FLUSHING; > + /* Update the redistributors knowledge about the attributes. */ > + writeq_relaxed(reg, rdist_base + GICR_PROPBASER); > + } > + > + return 0; > +} > + > +int gicv3_lpi_init_rdist(void __iomem * rdist_base) > +{ > + uint32_t reg; > + uint64_t table_reg; > + int ret; > + > + /* We don't support LPIs without an ITS. */ > + if ( !gicv3_its_host_has_its() ) > + return -ENODEV; > + > + /* Make sure LPIs are disabled before setting up the tables. */ > + reg = readl_relaxed(rdist_base + GICR_CTLR); > + if ( reg & GICR_CTLR_ENABLE_LPIS ) > + return -EBUSY; > + > + ret = gicv3_lpi_allocate_pendtable(&table_reg); > + if (ret) > + return ret; > + writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER); > + > + return gicv3_lpi_set_proptable(rdist_base); > +} > + > +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS; > +integer_param("max_lpi_bits", max_lpi_bits); > + > +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits) > +{ > + lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits)); > + > + printk("GICv3: using at most %ld LPIs on the host.\n", MAX_PHYS_LPIS); > + > + return 0; > +} > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c > index 1512521..ed78363 100644 > --- a/xen/arch/arm/gic-v3.c > +++ b/xen/arch/arm/gic-v3.c > @@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void) > type = readl_relaxed(GICD + GICD_TYPER); > nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1); > > + if ( type & GICD_TYPE_LPIS ) > + gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1); > + > printk("GICv3: %d lines, (IID %8.8x).\n", > nr_lines, readl_relaxed(GICD + GICD_IIDR)); > > @@ -660,6 +663,20 @@ static int __init gicv3_populate_rdist(void) > if ( (typer >> 32) == aff ) > { > this_cpu(rbase) = ptr; > + > + if ( typer & GICR_TYPER_PLPIS ) > + { > + int ret; > + > + ret = gicv3_lpi_init_rdist(ptr); > + if ( ret && ret != -ENODEV ) > + { > + printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n", > + smp_processor_id(), ret); > + break; > + } > + } > + > printk("GICv3: CPU%d: Found redistributor in region %d @%p\n", > smp_processor_id(), i, ptr); > return 0; > diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h > index bda8898..1cbfb9e 100644 > --- a/xen/include/asm-arm/bitops.h > +++ b/xen/include/asm-arm/bitops.h > @@ -24,6 +24,7 @@ > #define BIT(nr) (1UL << (nr)) > #define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_WORD)) > #define BIT_WORD(nr) ((nr) / BITS_PER_WORD) > +#define BIT_ULL(nr) (1ULL << (nr)) > #define BITS_PER_BYTE 8 > > #define ADDR (*(volatile int *) addr) > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > index 836a103..12bd155 100644 > --- a/xen/include/asm-arm/gic.h > +++ b/xen/include/asm-arm/gic.h > @@ -220,6 +220,8 @@ enum gic_version { > GIC_V3, > }; > > +#define LPI_OFFSET 8192 > + > extern enum gic_version gic_hw_version(void); > > /* Program the IRQ type into the GIC */ > diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h > index 6bd25a5..b307322 100644 > --- a/xen/include/asm-arm/gic_v3_defs.h > +++ b/xen/include/asm-arm/gic_v3_defs.h > @@ -44,7 +44,8 @@ > #define GICC_SRE_EL2_ENEL1 (1UL << 3) > > /* Additional bits in GICD_TYPER defined by GICv3 */ > -#define GICD_TYPE_ID_BITS_SHIFT 19 > +#define GICD_TYPE_ID_BITS_SHIFT 19 > +#define GICD_TYPE_LPIS (1U << 17) > > #define GICD_CTLR_RWP (1UL << 31) > #define GICD_CTLR_ARE_NS (1U << 4) > @@ -95,12 +96,61 @@ > #define GICR_IGRPMODR0 (0x0D00) > #define GICR_NSACR (0x0E00) > > +#define GICR_CTLR_ENABLE_LPIS (1U << 0) > + > #define GICR_TYPER_PLPIS (1U << 0) > #define GICR_TYPER_VLPIS (1U << 1) > #define GICR_TYPER_LAST (1U << 4) > > +/* For specifying the inner cacheability type only */ > +#define GIC_BASER_CACHE_nCnB 0ULL > +/* For specifying the outer cacheability type only */ > +#define GIC_BASER_CACHE_SameAsInner 0ULL > +#define GIC_BASER_CACHE_nC 1ULL > +#define GIC_BASER_CACHE_RaWt 2ULL > +#define GIC_BASER_CACHE_RaWb 3ULL > +#define GIC_BASER_CACHE_WaWt 4ULL > +#define GIC_BASER_CACHE_WaWb 5ULL > +#define GIC_BASER_CACHE_RaWaWt 6ULL > +#define GIC_BASER_CACHE_RaWaWb 7ULL > +#define GIC_BASER_CACHE_MASK 7ULL > + > +#define GIC_BASER_NonShareable 0ULL > +#define GIC_BASER_InnerShareable 1ULL > +#define GIC_BASER_OuterShareable 2ULL > + > +#define GICR_PROPBASER_SHAREABILITY_SHIFT 10 > +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT 7 > +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT 56 > +#define GICR_PROPBASER_SHAREABILITY_MASK \ > + (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT) > +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK \ > + (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT) > +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK \ > + (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT) > +#define GICR_PROPBASER_RES0_MASK \ > + (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5)) > + > +#define GICR_PENDBASER_SHAREABILITY_SHIFT 10 > +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT 7 > +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT 56 > +#define GICR_PENDBASER_SHAREABILITY_MASK \ > + (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT) > +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK \ > + (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT) > +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK \ > + (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT) > +#define GICR_PENDBASER_PTZ BIT(62) > +#define GICR_PENDBASER_RES0_MASK \ > + (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) | \ > + GENMASK(15, 12) | GENMASK(6, 0)) > + > #define DEFAULT_PMR_VALUE 0xff > > +#define LPI_PROP_PRIO_MASK 0xfc > +#define LPI_PROP_RES1 (1 << 1) > +#define LPI_PROP_ENABLED (1 << 0) > + > #define GICH_VMCR_EOI (1 << 9) > #define GICH_VMCR_VENG1 (1 << 1) > > diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h > index 765a655..219d109 100644 > --- a/xen/include/asm-arm/gic_v3_its.h > +++ b/xen/include/asm-arm/gic_v3_its.h > @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node); > > bool gicv3_its_host_has_its(void); > > +int gicv3_lpi_init_rdist(void __iomem * rdist_base); > + > +/* Initialize the host structures for LPIs. */ > +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis); > + > #else > > static LIST_HEAD(host_its_list); > @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void) > return false; > } > > +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base) > +{ > + return -ENODEV; > +} > + > +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis) > +{ > + return 0; > +} > #endif /* CONFIG_HAS_ITS */ > > #endif > -- > 2.9.0 >
On 21/03/17 22:57, Stefano Stabellini wrote: > On Thu, 16 Mar 2017, Andre Przywara wrote: >> The ARM GICv3 provides a new kind of interrupt called LPIs. >> The pending bits and the configuration data (priority, enable bits) for >> those LPIs are stored in tables in normal memory, which software has to >> provide to the hardware. >> Allocate the required memory, initialize it and hand it over to each >> redistributor. The maximum number of LPIs to be used can be adjusted with >> the command line option "max_lpi_bits", which defaults to a compile time >> constant exposed in Kconfig. >> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com> .... >> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c >> new file mode 100644 >> index 0000000..4f8414b >> --- /dev/null >> +++ b/xen/arch/arm/gic-v3-lpi.c .... >> +/* >> + * Tell a redistributor about the (shared) property table, allocating one >> + * if not already done. >> + */ >> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) >> +{ >> + uint64_t reg; >> + >> + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; >> + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; >> + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; >> + >> + /* >> + * The property table is shared across all redistributors, so allocate >> + * this only once, but return the same value on subsequent calls. >> + */ >> + if ( !lpi_data.lpi_property ) >> + { >> + /* The property table holds one byte per LPI. */ >> + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); >> + void *table = alloc_xenheap_pages(order, 0); > > you investigated the possibility of using _xmalloc to 64K align it? Yes, _xmalloc gives me the alignment, but AFAICT not the physical contiguous memory guarantee the table needs. If there are other ways to satisfy those requirements, I am all ears. Otherwise I'd just add a comment to motivate this. Cheers, Andre.
On Tue, 21 Mar 2017, André Przywara wrote: > On 21/03/17 22:57, Stefano Stabellini wrote: > > On Thu, 16 Mar 2017, Andre Przywara wrote: > >> The ARM GICv3 provides a new kind of interrupt called LPIs. > >> The pending bits and the configuration data (priority, enable bits) for > >> those LPIs are stored in tables in normal memory, which software has to > >> provide to the hardware. > >> Allocate the required memory, initialize it and hand it over to each > >> redistributor. The maximum number of LPIs to be used can be adjusted with > >> the command line option "max_lpi_bits", which defaults to a compile time > >> constant exposed in Kconfig. > >> > >> Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > .... > > >> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c > >> new file mode 100644 > >> index 0000000..4f8414b > >> --- /dev/null > >> +++ b/xen/arch/arm/gic-v3-lpi.c > > .... > > >> +/* > >> + * Tell a redistributor about the (shared) property table, allocating one > >> + * if not already done. > >> + */ > >> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) > >> +{ > >> + uint64_t reg; > >> + > >> + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; > >> + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; > >> + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; > >> + > >> + /* > >> + * The property table is shared across all redistributors, so allocate > >> + * this only once, but return the same value on subsequent calls. > >> + */ > >> + if ( !lpi_data.lpi_property ) > >> + { > >> + /* The property table holds one byte per LPI. */ > >> + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); > >> + void *table = alloc_xenheap_pages(order, 0); > > > > you investigated the possibility of using _xmalloc to 64K align it? > > Yes, _xmalloc gives me the alignment, but AFAICT not the physical > contiguous memory guarantee the table needs. If there are other ways to > satisfy those requirements, I am all ears. Otherwise I'd just add a > comment to motivate this. OK, but the problem is that I am not sure whether alloc_xenheap_pages makes any guarantees on the alignment of the allocated pages. Is it possible to allocate 1<<4 pages that are not 64K aligned with it?
Hi, On 21/03/17 23:27, Stefano Stabellini wrote: > On Tue, 21 Mar 2017, André Przywara wrote: >> On 21/03/17 22:57, Stefano Stabellini wrote: >>> On Thu, 16 Mar 2017, Andre Przywara wrote: >>>> The ARM GICv3 provides a new kind of interrupt called LPIs. >>>> The pending bits and the configuration data (priority, enable bits) for >>>> those LPIs are stored in tables in normal memory, which software has to >>>> provide to the hardware. >>>> Allocate the required memory, initialize it and hand it over to each >>>> redistributor. The maximum number of LPIs to be used can be adjusted with >>>> the command line option "max_lpi_bits", which defaults to a compile time >>>> constant exposed in Kconfig. >>>> >>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >> >> .... >> >>>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c >>>> new file mode 100644 >>>> index 0000000..4f8414b >>>> --- /dev/null >>>> +++ b/xen/arch/arm/gic-v3-lpi.c >> >> .... >> >>>> +/* >>>> + * Tell a redistributor about the (shared) property table, allocating one >>>> + * if not already done. >>>> + */ >>>> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) >>>> +{ >>>> + uint64_t reg; >>>> + >>>> + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; >>>> + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; >>>> + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; >>>> + >>>> + /* >>>> + * The property table is shared across all redistributors, so allocate >>>> + * this only once, but return the same value on subsequent calls. >>>> + */ >>>> + if ( !lpi_data.lpi_property ) >>>> + { >>>> + /* The property table holds one byte per LPI. */ >>>> + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); >>>> + void *table = alloc_xenheap_pages(order, 0); >>> >>> you investigated the possibility of using _xmalloc to 64K align it? >> >> Yes, _xmalloc gives me the alignment, but AFAICT not the physical >> contiguous memory guarantee the table needs. If there are other ways to >> satisfy those requirements, I am all ears. Otherwise I'd just add a >> comment to motivate this. > > OK, but the problem is that I am not sure whether alloc_xenheap_pages > makes any guarantees on the alignment of the allocated pages. Is it > possible to allocate 1<<4 pages that are not 64K aligned with it? My understanding of the code is that it naturally aligns the allocation to the order you requested: So calling it with (4, 0) would give you (4K << 4) = 64K alignment (because it's taken from the 64K bucket). But anyway the property table needs only to be 4K aligned, isn't it? I know that the GICv3 is amazing in this aspect: every table has slightly different requirements when it comes to alignment, support of physical address space, page sizes and so on... Cheers, Andre.
Hi, On 21/03/17 21:23, Julien Grall wrote: > Hi Andre, > > On 03/16/2017 11:20 AM, Andre Przywara wrote: >> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig >> index bf64c61..86f7b53 100644 >> --- a/xen/arch/arm/Kconfig >> +++ b/xen/arch/arm/Kconfig >> @@ -49,6 +49,21 @@ config HAS_ITS >> bool "GICv3 ITS MSI controller support" >> depends on HAS_GICV3 >> >> +config MAX_PHYS_LPI_BITS >> + depends on HAS_ITS >> + int "Maximum bits for GICv3 host LPIs (14-32)" >> + range 14 32 >> + default "20" >> + help >> + Specifies the maximum number of LPIs (in bits) Xen should take >> + care of. The host ITS may provide support for a very large >> number >> + of supported LPIs, for all of which we may not want to >> allocate >> + memory, so this number here allows to limit this. >> + Xen itself does not know how many LPIs domains will ever need >> + beforehand. >> + This can be overridden on the command line with the >> max_lpi_bits >> + parameter. > > I continue to disagree on this Kconfig option. There is no sensible > default value and I don't see how a distribution will be able to pick-up > one value here. > > If the number of LPIs have to be restricted, this should be done via the > command line or a per-platform option. So are you opting for taking the hardware provided value, unless there is a command line option or a per-platform limit? Please keep in mind that the situation here is different from the device table: - There is no indirect table option for the property and pending table. Any redistributor supporting 32 bits worth of LPIs would lead to a 4GB property table and 512MB pending table allocation. - An OS like Linux can make sensible assumptions about the number of LPIs needed and only allocate as much LPIs as required. Linux for instance uses at most 65536. Xen cannot easily make this decision. So we would need to go as high as possible, but not too high to not waste memory. I see different tradeoffs here between a distribution targeting servers or another one aiming at some embedded devices, for instance. So I would expect exactly this decision to be covered by a Kconfig option. - In contrast to the device ID, which is guest provided, the host LPI number is chosen by Xen. We allocate them in chunks of 32, starting at the beginning and just counting up. So we can limit them without loosing too much flexibility, because it's not that sparse as device IDs, for instance. > I have already made my point on previous e-mails so I am not going to > argue more. So as I mentioned before, I am happy to loose the Kconfig option, but then we need some sensible default value. In our case we could be cheeky here for now and just use the Linux value, because a Linux Dom0 would be the only user. But that doesn't sound very future proof, though this may not matter for 4.9. >> + >> endmenu >> >> menu "ARM errata workaround via the alternative framework" >> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile >> index 54860e0..02a8737 100644 >> --- a/xen/arch/arm/Makefile >> +++ b/xen/arch/arm/Makefile >> @@ -19,6 +19,7 @@ obj-y += gic.o >> obj-y += gic-v2.o >> obj-$(CONFIG_HAS_GICV3) += gic-v3.o >> obj-$(CONFIG_HAS_ITS) += gic-v3-its.o >> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o >> obj-y += guestcopy.o >> obj-y += hvm.o >> obj-y += io.o >> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c >> new file mode 100644 >> index 0000000..4f8414b >> --- /dev/null >> +++ b/xen/arch/arm/gic-v3-lpi.c >> @@ -0,0 +1,201 @@ >> +/* >> + * xen/arch/arm/gic-v3-lpi.c >> + * >> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support >> + * >> + * Copyright (C) 2016,2017 - ARM Ltd >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; under version 2 of the License. >> + * >> + * This program is distributed in the hope that it will be useful, >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> + * GNU General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License >> + * along with this program; If not, see <http://www.gnu.org/licenses/>. >> + */ >> + >> +#include <xen/lib.h> >> +#include <xen/mm.h> >> +#include <asm/gic.h> >> +#include <asm/gic_v3_defs.h> >> +#include <asm/gic_v3_its.h> >> +#include <asm/io.h> >> +#include <asm/page.h> >> + >> +#define LPI_PROPTABLE_NEEDS_FLUSHING (1U << 0) > > NIT: Newline here please. > >> +/* Global state */ >> +static struct { >> + /* The global LPI property table, shared by all redistributors. */ >> + uint8_t *lpi_property; >> + /* >> + * Number of physical LPIs the host supports. This is a property of >> + * the GIC hardware. We depart from the habit of naming these things >> + * "physical" in Xen, as the GICv3/4 spec uses the term "physical >> LPI" >> + * in a different context to differentiate them from "virtual LPIs". >> + */ >> + unsigned long int nr_host_lpis; > > Why unsigned long? Because someone requested that I store the actual number of LPIs, not the number of bits required to encode them. So I need to cover the case of exactly 32 bits worth of LPIs, which u32 cannot hold. >> + unsigned int flags; >> +} lpi_data; >> + >> +struct lpi_redist_data { >> + void *pending_table; >> +}; >> + >> +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist); >> + >> +#define MAX_PHYS_LPIS (lpi_data.nr_host_lpis - LPI_OFFSET) > > This is fairly confusing. When I read "nr_host_lpis" I understand that > you store the number of LPIs. But in fact you store the maximum LPI ID. One of the reasons I argued for storing the number of bits .... > Also it is very unclear the difference between MAX_PHYS_LPIS and > nr_host_lpis and will likely cause trouble in the future. > > So it looks like to me that nr_host_lpis should be named > max_host_lpis_ids and MAX_PHYS_LPIS should be MAX_NR_PHYS_LPIS. > > Also, please stay consistent with the naming. If you use host then use > host everywhere and not a mix between phys and host for the same purpose. OK, can do. >> + >> +static int gicv3_lpi_allocate_pendtable(uint64_t *reg) >> +{ >> + uint64_t val; >> + unsigned int order; >> + void *pendtable; >> + >> + if ( this_cpu(lpi_redist).pending_table ) >> + return -EBUSY; >> + >> + val = GIC_BASER_CACHE_RaWaWb << >> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT; >> + val |= GIC_BASER_CACHE_SameAsInner << >> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT; >> + val |= GIC_BASER_InnerShareable << >> GICR_PENDBASER_SHAREABILITY_SHIFT; >> + >> + /* >> + * The pending table holds one bit per LPI and even covers bits for >> + * interrupt IDs below 8192, so we allocate the full range. >> + * The GICv3 imposes a 64KB alignment requirement, also requires >> + * physically contigious memory. > > The bit after the comma seems quite obvious. Both xalloc and > alloc_xenheap_pages will ensure that the region will be contiguous in > the physical address space. Mmh, somehow I missed that when I tried to decide which allocation function is the right one. But indeed xmalloc() eventually calls alloc_xenheap_pages() with the order of alignment, so we get the same guarantees. > > [...] > >> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) >> +{ >> + uint64_t reg; >> + >> + reg = GIC_BASER_CACHE_RaWaWb << >> GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; >> + reg |= GIC_BASER_CACHE_SameAsInner << >> GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; >> + reg |= GIC_BASER_InnerShareable << >> GICR_PROPBASER_SHAREABILITY_SHIFT; >> + >> + /* >> + * The property table is shared across all redistributors, so >> allocate >> + * this only once, but return the same value on subsequent calls. >> + */ >> + if ( !lpi_data.lpi_property ) >> + { >> + /* The property table holds one byte per LPI. */ >> + unsigned int order = >> get_order_from_bytes(lpi_data.nr_host_lpis); >> + void *table = alloc_xenheap_pages(order, 0); >> + >> + if ( !table ) >> + return -ENOMEM; >> + >> + /* Make sure the physical address can be encoded in the >> register. */ >> + if ( (virt_to_maddr(table) & ~GENMASK(51, 12)) ) >> + { >> + free_xenheap_pages(table, 0); >> + return -ERANGE; >> + } >> + memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS); >> + clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS); >> + lpi_data.lpi_property = table; >> + } >> + >> + /* Encode the number of bits needed, minus one */ >> + reg |= ((fls(lpi_data.nr_host_lpis - 1) - 1) << 0); > > As said on the previous version, please avoid hardcoded shift. > > [...] > >> +int gicv3_lpi_init_rdist(void __iomem * rdist_base) >> +{ >> + uint32_t reg; >> + uint64_t table_reg; >> + int ret; >> + >> + /* We don't support LPIs without an ITS. */ >> + if ( !gicv3_its_host_has_its() ) >> + return -ENODEV; >> + >> + /* Make sure LPIs are disabled before setting up the tables. */ >> + reg = readl_relaxed(rdist_base + GICR_CTLR); >> + if ( reg & GICR_CTLR_ENABLE_LPIS ) >> + return -EBUSY; >> + >> + ret = gicv3_lpi_allocate_pendtable(&table_reg); >> + if (ret) >> + return ret; >> + writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER); > > Same question as on the previous version. The cacheability and > shareability may not stick. This was a bug fix in Linux (see commit > 241a386) and I am struggling to understand why Xen would not be affected? Oh, OK, we need to remove cacheability if shareability is denied, though that sounds like a hardware bug to me if this is not observed by the redistributor. The reason I didn't care about it here was that software doesn't use the pending table, so there would be nothing on the code side to be done in case the redistributor shoots down one of our requests. The mentioned Linux patch covers all ITS tables and the property table as well. >> + >> + return gicv3_lpi_set_proptable(rdist_base); >> +} >> + >> +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS; >> +integer_param("max_lpi_bits", max_lpi_bits); >> + >> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits) >> +{ > > > Please add a comment to explain why you don't sanitize the value from > the command line. > >> + lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits)); > > nr_host_lpis is "unsigned long" so why are you using BIT_ULL? Right, creeping over from Linux where the ITS is supported by ARM(32) as well. >> + >> + printk("GICv3: using at most %ld LPIs on the host.\n", >> MAX_PHYS_LPIS); > > s/%ld/%lu/ > > [...] > >> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c >> index 1512521..ed78363 100644 >> --- a/xen/arch/arm/gic-v3.c >> +++ b/xen/arch/arm/gic-v3.c >> @@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void) >> type = readl_relaxed(GICD + GICD_TYPER); >> nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1); >> >> + if ( type & GICD_TYPE_LPIS ) >> + gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & >> 0x1f) + 1); > > Again, a macro has been suggested on in the last 2 versions to avoid > hardcoded value. You said you will fix it and it is not done. Please do it. Sorry for that, I did this for GITS_TYPER_DEVICE_ID_BITS, so mentally ticked this one off as well. Will be fixed. > > [...] > >> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h >> index 836a103..12bd155 100644 >> --- a/xen/include/asm-arm/gic.h >> +++ b/xen/include/asm-arm/gic.h >> @@ -220,6 +220,8 @@ enum gic_version { >> GIC_V3, >> }; >> >> +#define LPI_OFFSET 8192 > > My comments on the previous version about LPI_OFFSET are not addressed. > And one question was left unanswered. Do you mean this? >> It would make much sense to have this definition moved in irq.h >> close to NR_IRQS. Admittedly I missed that over the next question... I think I wasn't convinced in the first place to put this GIC specific detail into the generic irq.h, but I see that it makes sense since we have the do_LPI() prototype in there as well. >> Also, I am a bit surprised that NR_IRQS & co has not been modified. >> Is there any reason for that? But I answered on this one. Is there anything else you want to have answered? > >> + >> extern enum gic_version gic_hw_version(void); >> >> /* Program the IRQ type into the GIC */ > > [...] > >> diff --git a/xen/include/asm-arm/gic_v3_its.h >> b/xen/include/asm-arm/gic_v3_its.h >> index 765a655..219d109 100644 >> --- a/xen/include/asm-arm/gic_v3_its.h >> +++ b/xen/include/asm-arm/gic_v3_its.h >> @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node >> *node); >> >> bool gicv3_its_host_has_its(void); >> >> +int gicv3_lpi_init_rdist(void __iomem * rdist_base); >> + >> +/* Initialize the host structures for LPIs. */ >> +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis); > > In the implementation, the parameter is called "hw_lpi_bits". Please > stay consistent. Yeah, this header file is a constant source of merge conflicts when squashing fixes and reworks, which makes it a pain to work with. Sorry for that. Cheers, Andre. >> + >> #else >> >> static LIST_HEAD(host_its_list); >> @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void) >> return false; >> } >> >> +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base) >> +{ >> + return -ENODEV; >> +} >> + >> +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis) > > Ditto. > >> +{ >> + return 0; >> +} >> #endif /* CONFIG_HAS_ITS */ >> >> #endif >> >
On 23/03/17 14:40, Andre Przywara wrote: > Hi, Hi Andre, > > On 21/03/17 21:23, Julien Grall wrote: >> Hi Andre, >> >> On 03/16/2017 11:20 AM, Andre Przywara wrote: >>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig >>> index bf64c61..86f7b53 100644 >>> --- a/xen/arch/arm/Kconfig >>> +++ b/xen/arch/arm/Kconfig >>> @@ -49,6 +49,21 @@ config HAS_ITS >>> bool "GICv3 ITS MSI controller support" >>> depends on HAS_GICV3 >>> >>> +config MAX_PHYS_LPI_BITS >>> + depends on HAS_ITS >>> + int "Maximum bits for GICv3 host LPIs (14-32)" >>> + range 14 32 >>> + default "20" >>> + help >>> + Specifies the maximum number of LPIs (in bits) Xen should take >>> + care of. The host ITS may provide support for a very large >>> number >>> + of supported LPIs, for all of which we may not want to >>> allocate >>> + memory, so this number here allows to limit this. >>> + Xen itself does not know how many LPIs domains will ever need >>> + beforehand. >>> + This can be overridden on the command line with the >>> max_lpi_bits >>> + parameter. >> >> I continue to disagree on this Kconfig option. There is no sensible >> default value and I don't see how a distribution will be able to pick-up >> one value here. >> >> If the number of LPIs have to be restricted, this should be done via the >> command line or a per-platform option. > > So are you opting for taking the hardware provided value, unless there > is a command line option or a per-platform limit? > > Please keep in mind that the situation here is different from the device > table: > - There is no indirect table option for the property and pending table. > Any redistributor supporting 32 bits worth of LPIs would lead to a > 4GB property table and 512MB pending table allocation. > - An OS like Linux can make sensible assumptions about the number of > LPIs needed and only allocate as much LPIs as required. Linux for > instance uses at most 65536. Xen cannot easily make this decision. > So we would need to go as high as possible, but not too high to not > waste memory. I see different tradeoffs here between a distribution > targeting servers or another one aiming at some embedded devices, for > instance. So I would expect exactly this decision to be covered by a > Kconfig option. Hence why a parameter on the command line is the better place for that. The decision is left to the user. >> I have already made my point on previous e-mails so I am not going to >> argue more. > > So as I mentioned before, I am happy to loose the Kconfig option, but > then we need some sensible default value. In our case we could be cheeky > here for now and just use the Linux value, because a Linux Dom0 would be > the only user. But that doesn't sound very future proof, though this may > not matter for 4.9. I don't think we need a sensible default value and IHMO there is none. I would left the user to decide the exact number. >>> +/* Global state */ >>> +static struct { >>> + /* The global LPI property table, shared by all redistributors. */ >>> + uint8_t *lpi_property; >>> + /* >>> + * Number of physical LPIs the host supports. This is a property of >>> + * the GIC hardware. We depart from the habit of naming these things >>> + * "physical" in Xen, as the GICv3/4 spec uses the term "physical >>> LPI" >>> + * in a different context to differentiate them from "virtual LPIs". >>> + */ >>> + unsigned long int nr_host_lpis; >> >> Why unsigned long? > > Because someone requested that I store the actual number of LPIs, not > the number of bits required to encode them. So I need to cover the case > of exactly 32 bits worth of LPIs, which u32 cannot hold. Fair enough. [...] >>> +int gicv3_lpi_init_rdist(void __iomem * rdist_base) >>> +{ >>> + uint32_t reg; >>> + uint64_t table_reg; >>> + int ret; >>> + >>> + /* We don't support LPIs without an ITS. */ >>> + if ( !gicv3_its_host_has_its() ) >>> + return -ENODEV; >>> + >>> + /* Make sure LPIs are disabled before setting up the tables. */ >>> + reg = readl_relaxed(rdist_base + GICR_CTLR); >>> + if ( reg & GICR_CTLR_ENABLE_LPIS ) >>> + return -EBUSY; >>> + >>> + ret = gicv3_lpi_allocate_pendtable(&table_reg); >>> + if (ret) >>> + return ret; >>> + writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER); >> >> Same question as on the previous version. The cacheability and >> shareability may not stick. This was a bug fix in Linux (see commit >> 241a386) and I am struggling to understand why Xen would not be affected? > > Oh, OK, we need to remove cacheability if shareability is denied, though > that sounds like a hardware bug to me if this is not observed by the > redistributor. > The reason I didn't care about it here was that software doesn't use the > pending table, so there would be nothing on the code side to be done in > case the redistributor shoots down one of our requests. > The mentioned Linux patch covers all ITS tables and the property table > as well. The patch was sent by Marc and I trust him to do the right things. Looking at the commit message, it is to avoid cacheable traffic. Unless you give me a good reason for not doing that, please do the change. I don't want to end-up digging into weird bug later on because we decided that "the hardware is not supposed to do that". [...] >>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h >>> index 836a103..12bd155 100644 >>> --- a/xen/include/asm-arm/gic.h >>> +++ b/xen/include/asm-arm/gic.h >>> @@ -220,6 +220,8 @@ enum gic_version { >>> GIC_V3, >>> }; >>> >>> +#define LPI_OFFSET 8192 >> >> My comments on the previous version about LPI_OFFSET are not addressed. >> And one question was left unanswered. > > Do you mean this? >>> It would make much sense to have this definition moved in irq.h >>> close to NR_IRQS. > > Admittedly I missed that over the next question... > I think I wasn't convinced in the first place to put this GIC specific > detail into the generic irq.h, but I see that it makes sense since we > have the do_LPI() prototype in there as well. > >>> Also, I am a bit surprised that NR_IRQS & co has not been modified. >>> Is there any reason for that? > > But I answered on this one. > Is there anything else you want to have answered? Yes, the question in <dae3afdb-d64d-a81b-7df3-ed8a61e00ad1@arm.com>. "Can we have a comment on top of NR_IRQS explaining why LPIs are not taken into account?" Cheers,
On Thu, 23 Mar 2017, Julien Grall wrote: > On 23/03/17 14:40, Andre Przywara wrote: > > Hi, > > Hi Andre, > > > > > On 21/03/17 21:23, Julien Grall wrote: > > > Hi Andre, > > > > > > On 03/16/2017 11:20 AM, Andre Przywara wrote: > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > > > > index bf64c61..86f7b53 100644 > > > > --- a/xen/arch/arm/Kconfig > > > > +++ b/xen/arch/arm/Kconfig > > > > @@ -49,6 +49,21 @@ config HAS_ITS > > > > bool "GICv3 ITS MSI controller support" > > > > depends on HAS_GICV3 > > > > > > > > +config MAX_PHYS_LPI_BITS > > > > + depends on HAS_ITS > > > > + int "Maximum bits for GICv3 host LPIs (14-32)" > > > > + range 14 32 > > > > + default "20" > > > > + help > > > > + Specifies the maximum number of LPIs (in bits) Xen should > > > > take > > > > + care of. The host ITS may provide support for a very large > > > > number > > > > + of supported LPIs, for all of which we may not want to > > > > allocate > > > > + memory, so this number here allows to limit this. > > > > + Xen itself does not know how many LPIs domains will ever need > > > > + beforehand. > > > > + This can be overridden on the command line with the > > > > max_lpi_bits > > > > + parameter. > > > > > > I continue to disagree on this Kconfig option. There is no sensible > > > default value and I don't see how a distribution will be able to pick-up > > > one value here. > > > > > > If the number of LPIs have to be restricted, this should be done via the > > > command line or a per-platform option. > > > > So are you opting for taking the hardware provided value, unless there > > is a command line option or a per-platform limit? > > > > Please keep in mind that the situation here is different from the device > > table: > > - There is no indirect table option for the property and pending table. > > Any redistributor supporting 32 bits worth of LPIs would lead to a > > 4GB property table and 512MB pending table allocation. > > - An OS like Linux can make sensible assumptions about the number of > > LPIs needed and only allocate as much LPIs as required. Linux for > > instance uses at most 65536. Xen cannot easily make this decision. > > So we would need to go as high as possible, but not too high to not > > waste memory. I see different tradeoffs here between a distribution > > targeting servers or another one aiming at some embedded devices, for > > instance. So I would expect exactly this decision to be covered by a > > Kconfig option. > > Hence why a parameter on the command line is the better place for that. The > decision is left to the user. > > > > I have already made my point on previous e-mails so I am not going to > > > argue more. > > > > So as I mentioned before, I am happy to loose the Kconfig option, but > > then we need some sensible default value. In our case we could be cheeky > > here for now and just use the Linux value, because a Linux Dom0 would be > > the only user. But that doesn't sound very future proof, though this may > > not matter for 4.9. > > I don't think we need a sensible default value and IHMO there is none. I would > left the user to decide the exact number. In that case, the command line parameter becomes mandatory: we need to force the user to specify it as we do for dom0_mem today.
On 23/03/17 10:50, Andre Przywara wrote: > Hi, Hi Andre, > On 21/03/17 23:27, Stefano Stabellini wrote: >> On Tue, 21 Mar 2017, André Przywara wrote: >>> On 21/03/17 22:57, Stefano Stabellini wrote: >>>> On Thu, 16 Mar 2017, Andre Przywara wrote: >>>>> The ARM GICv3 provides a new kind of interrupt called LPIs. >>>>> The pending bits and the configuration data (priority, enable bits) for >>>>> those LPIs are stored in tables in normal memory, which software has to >>>>> provide to the hardware. >>>>> Allocate the required memory, initialize it and hand it over to each >>>>> redistributor. The maximum number of LPIs to be used can be adjusted with >>>>> the command line option "max_lpi_bits", which defaults to a compile time >>>>> constant exposed in Kconfig. >>>>> >>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >>> >>> .... >>> >>>>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c >>>>> new file mode 100644 >>>>> index 0000000..4f8414b >>>>> --- /dev/null >>>>> +++ b/xen/arch/arm/gic-v3-lpi.c >>> >>> .... >>> >>>>> +/* >>>>> + * Tell a redistributor about the (shared) property table, allocating one >>>>> + * if not already done. >>>>> + */ >>>>> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) >>>>> +{ >>>>> + uint64_t reg; >>>>> + >>>>> + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; >>>>> + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; >>>>> + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; >>>>> + >>>>> + /* >>>>> + * The property table is shared across all redistributors, so allocate >>>>> + * this only once, but return the same value on subsequent calls. >>>>> + */ >>>>> + if ( !lpi_data.lpi_property ) >>>>> + { >>>>> + /* The property table holds one byte per LPI. */ >>>>> + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); >>>>> + void *table = alloc_xenheap_pages(order, 0); >>>> >>>> you investigated the possibility of using _xmalloc to 64K align it? >>> >>> Yes, _xmalloc gives me the alignment, but AFAICT not the physical >>> contiguous memory guarantee the table needs. If there are other ways to >>> satisfy those requirements, I am all ears. Otherwise I'd just add a >>> comment to motivate this. As already said numerous times, _x*alloc will guarantee contiguous physical memory. The same as kzalloc. >> >> OK, but the problem is that I am not sure whether alloc_xenheap_pages >> makes any guarantees on the alignment of the allocated pages. Is it >> possible to allocate 1<<4 pages that are not 64K aligned with it? > > My understanding of the code is that it naturally aligns the allocation > to the order you requested: So calling it with (4, 0) would give you (4K > << 4) = 64K alignment (because it's taken from the 64K bucket). > > But anyway the property table needs only to be 4K aligned, isn't it? > > I know that the GICv3 is amazing in this aspect: every table has > slightly different requirements when it comes to alignment, support of > physical address space, page sizes and so on... alloc_xenheap_pages will guarantee allocation aligned to the (1 << order) * PAGE_SIZE. Cheers,
Hi Stefano, On 23/03/17 17:45, Stefano Stabellini wrote: > On Thu, 23 Mar 2017, Julien Grall wrote: >>> So as I mentioned before, I am happy to loose the Kconfig option, but >>> then we need some sensible default value. In our case we could be cheeky >>> here for now and just use the Linux value, because a Linux Dom0 would be >>> the only user. But that doesn't sound very future proof, though this may >>> not matter for 4.9. >> >> I don't think we need a sensible default value and IHMO there is none. I would >> left the user to decide the exact number. > > In that case, the command line parameter becomes mandatory: we need to > force the user to specify it as we do for dom0_mem today. Not really. We should use the hardware value by default. If a user thinks the number allocated is too big for his use case, then it can limit using the command line. Anyway, I will not oppose to make this command option mandatory when ITS is in use. Cheers,
On Thu, 23 Mar 2017, Julien Grall wrote: > Hi Stefano, > > On 23/03/17 17:45, Stefano Stabellini wrote: > > On Thu, 23 Mar 2017, Julien Grall wrote: > > > > So as I mentioned before, I am happy to loose the Kconfig option, but > > > > then we need some sensible default value. In our case we could be cheeky > > > > here for now and just use the Linux value, because a Linux Dom0 would be > > > > the only user. But that doesn't sound very future proof, though this may > > > > not matter for 4.9. > > > > > > I don't think we need a sensible default value and IHMO there is none. I > > > would > > > left the user to decide the exact number. > > > > In that case, the command line parameter becomes mandatory: we need to > > force the user to specify it as we do for dom0_mem today. > > Not really. We should use the hardware value by default. If a user thinks the > number allocated is too big for his use case, then it can limit using the > command line. > > Anyway, I will not oppose to make this command option mandatory when ITS is in > use. Andre wrote: Any redistributor supporting 32 bits worth of LPIs would lead to a 4GB property table and 512MB pending table allocation. Let's assume that such scenario is realistic (if it is not, then this discussion is fruitless), in this case the user most surely is not going to want to use the hardware provided value. But how can she knows it? How can she find out that she is wasting too much memory on her system? Is there an easy and obvious way to know? She could find out if Xen printed a big warning such as: USING 4G OF MEMORY FOR ITS PROPTABLE, CONSIDER PASSING max_lpi_bits to XEN but to do that, we need a threshold value in Xen, above which the hypervisor prints the warning. But if we have a threshold value in Xen, then we might as well consider making it the default ceiling: Xen uses the hardware provided value, unless it's greater than threshold, in that case it uses threshold and prints a warning, for example: LIMITING ITS PROPTABLE MEMORY TO 1G, CHANGE IT WITH max_lpi_bits PARAMETER Does it make sense?
Hi, On 23/03/17 18:01, Stefano Stabellini wrote: > On Thu, 23 Mar 2017, Julien Grall wrote: >> Hi Stefano, >> >> On 23/03/17 17:45, Stefano Stabellini wrote: >>> On Thu, 23 Mar 2017, Julien Grall wrote: >>>>> So as I mentioned before, I am happy to loose the Kconfig option, but >>>>> then we need some sensible default value. In our case we could be cheeky >>>>> here for now and just use the Linux value, because a Linux Dom0 would be >>>>> the only user. But that doesn't sound very future proof, though this may >>>>> not matter for 4.9. >>>> >>>> I don't think we need a sensible default value and IHMO there is none. I >>>> would >>>> left the user to decide the exact number. >>> >>> In that case, the command line parameter becomes mandatory: we need to >>> force the user to specify it as we do for dom0_mem today. >> >> Not really. We should use the hardware value by default. If a user thinks the >> number allocated is too big for his use case, then it can limit using the >> command line. >> >> Anyway, I will not oppose to make this command option mandatory when ITS is in >> use. > > Andre wrote: > > Any redistributor supporting 32 bits worth of LPIs would lead to a > 4GB property table and 512MB pending table allocation. > > Let's assume that such scenario is realistic (if it is not, then this > discussion is fruitless), in this case the user most surely is not going > to want to use the hardware provided value. But how can she knows it? > How can she find out that she is wasting too much memory on her system? > Is there an easy and obvious way to know? > > She could find out if Xen printed a big warning such as: > > USING 4G OF MEMORY FOR ITS PROPTABLE, CONSIDER PASSING max_lpi_bits to XEN > > but to do that, we need a threshold value in Xen, above which the > hypervisor prints the warning. But if we have a threshold value in Xen, > then we might as well consider making it the default ceiling: Xen uses > the hardware provided value, unless it's greater than threshold, in that > case it uses threshold and prints a warning, for example: > > LIMITING ITS PROPTABLE MEMORY TO 1G, CHANGE IT WITH max_lpi_bits PARAMETER > > > Does it make sense? Yes, that is exactly what I was after. In contrast to the number of device IDs I think the number of LPI bits is _not_ a value that software should use directly, it's more a limit, actually a GICv3 implementation choice (how wide the LPI ID fields internally are, for instance). So do we want to limit to 1GB and warn starting at 256MB? Cheers, Andre.
Hi Andre, On 03/23/2017 06:21 PM, Andre Przywara wrote: > Hi, > > On 23/03/17 18:01, Stefano Stabellini wrote: >> On Thu, 23 Mar 2017, Julien Grall wrote: >>> Hi Stefano, >>> >>> On 23/03/17 17:45, Stefano Stabellini wrote: >>>> On Thu, 23 Mar 2017, Julien Grall wrote: >>>>>> So as I mentioned before, I am happy to loose the Kconfig option, but >>>>>> then we need some sensible default value. In our case we could be cheeky >>>>>> here for now and just use the Linux value, because a Linux Dom0 would be >>>>>> the only user. But that doesn't sound very future proof, though this may >>>>>> not matter for 4.9. >>>>> >>>>> I don't think we need a sensible default value and IHMO there is none. I >>>>> would >>>>> left the user to decide the exact number. >>>> >>>> In that case, the command line parameter becomes mandatory: we need to >>>> force the user to specify it as we do for dom0_mem today. >>> >>> Not really. We should use the hardware value by default. If a user thinks the >>> number allocated is too big for his use case, then it can limit using the >>> command line. >>> >>> Anyway, I will not oppose to make this command option mandatory when ITS is in >>> use. >> >> Andre wrote: >> >> Any redistributor supporting 32 bits worth of LPIs would lead to a >> 4GB property table and 512MB pending table allocation. >> >> Let's assume that such scenario is realistic (if it is not, then this >> discussion is fruitless), in this case the user most surely is not going >> to want to use the hardware provided value. But how can she knows it? >> How can she find out that she is wasting too much memory on her system? >> Is there an easy and obvious way to know? >> >> She could find out if Xen printed a big warning such as: >> >> USING 4G OF MEMORY FOR ITS PROPTABLE, CONSIDER PASSING max_lpi_bits to XEN >> >> but to do that, we need a threshold value in Xen, above which the >> hypervisor prints the warning. But if we have a threshold value in Xen, >> then we might as well consider making it the default ceiling: Xen uses >> the hardware provided value, unless it's greater than threshold, in that >> case it uses threshold and prints a warning, for example: >> >> LIMITING ITS PROPTABLE MEMORY TO 1G, CHANGE IT WITH max_lpi_bits PARAMETER >> >> >> Does it make sense? > > Yes, that is exactly what I was after. > In contrast to the number of device IDs I think the number of LPI bits > is _not_ a value that software should use directly, it's more a limit, > actually a GICv3 implementation choice (how wide the LPI ID fields > internally are, for instance). > > So do we want to limit to 1GB and warn starting at 256MB? I think Stefano was suggesting to make the command line mandatory. Cheers,
On Fri, 24 Mar 2017, Julien Grall wrote: > Hi Andre, > > On 03/23/2017 06:21 PM, Andre Przywara wrote: > > Hi, > > > > On 23/03/17 18:01, Stefano Stabellini wrote: > > > On Thu, 23 Mar 2017, Julien Grall wrote: > > > > Hi Stefano, > > > > > > > > On 23/03/17 17:45, Stefano Stabellini wrote: > > > > > On Thu, 23 Mar 2017, Julien Grall wrote: > > > > > > > So as I mentioned before, I am happy to loose the Kconfig option, > > > > > > > but > > > > > > > then we need some sensible default value. In our case we could be > > > > > > > cheeky > > > > > > > here for now and just use the Linux value, because a Linux Dom0 > > > > > > > would be > > > > > > > the only user. But that doesn't sound very future proof, though > > > > > > > this may > > > > > > > not matter for 4.9. > > > > > > > > > > > > I don't think we need a sensible default value and IHMO there is > > > > > > none. I > > > > > > would > > > > > > left the user to decide the exact number. > > > > > > > > > > In that case, the command line parameter becomes mandatory: we need to > > > > > force the user to specify it as we do for dom0_mem today. > > > > > > > > Not really. We should use the hardware value by default. If a user > > > > thinks the > > > > number allocated is too big for his use case, then it can limit using > > > > the > > > > command line. > > > > > > > > Anyway, I will not oppose to make this command option mandatory when ITS > > > > is in > > > > use. > > > > > > Andre wrote: > > > > > > Any redistributor supporting 32 bits worth of LPIs would lead to a > > > 4GB property table and 512MB pending table allocation. > > > > > > Let's assume that such scenario is realistic (if it is not, then this > > > discussion is fruitless), in this case the user most surely is not going > > > to want to use the hardware provided value. But how can she knows it? > > > How can she find out that she is wasting too much memory on her system? > > > Is there an easy and obvious way to know? > > > > > > She could find out if Xen printed a big warning such as: > > > > > > USING 4G OF MEMORY FOR ITS PROPTABLE, CONSIDER PASSING max_lpi_bits to > > > XEN > > > > > > but to do that, we need a threshold value in Xen, above which the > > > hypervisor prints the warning. But if we have a threshold value in Xen, > > > then we might as well consider making it the default ceiling: Xen uses > > > the hardware provided value, unless it's greater than threshold, in that > > > case it uses threshold and prints a warning, for example: > > > > > > LIMITING ITS PROPTABLE MEMORY TO 1G, CHANGE IT WITH max_lpi_bits > > > PARAMETER > > > > > > > > > Does it make sense? > > > > Yes, that is exactly what I was after. > > In contrast to the number of device IDs I think the number of LPI bits > > is _not_ a value that software should use directly, it's more a limit, > > actually a GICv3 implementation choice (how wide the LPI ID fields > > internally are, for instance). > > > > So do we want to limit to 1GB and warn starting at 256MB? > > I think Stefano was suggesting to make the command line mandatory. Either way works, but the idea is that the user should not be surprised by the amount of memory utilized because either 1) she chose max_lpi_bits 2) Xen printed a WARNING at boot impossible to miss
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index a11fdf9..619016d 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI. ### maxcpus > `= <integer>` +### max\_lpi\_bits +> `= <integer>` + +Specifies the number of ARM GICv3 LPI interrupts to allocate on the host, +presented as the number of bits needed to encode it. This must be at least +14 and not exceed 32, and each LPI requires one byte (configuration) and +one pending bit to be allocated. +Defaults to 20 bits (to cover at most 1048576 interrupts). + ### mce > `= <integer>` diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig index bf64c61..86f7b53 100644 --- a/xen/arch/arm/Kconfig +++ b/xen/arch/arm/Kconfig @@ -49,6 +49,21 @@ config HAS_ITS bool "GICv3 ITS MSI controller support" depends on HAS_GICV3 +config MAX_PHYS_LPI_BITS + depends on HAS_ITS + int "Maximum bits for GICv3 host LPIs (14-32)" + range 14 32 + default "20" + help + Specifies the maximum number of LPIs (in bits) Xen should take + care of. The host ITS may provide support for a very large number + of supported LPIs, for all of which we may not want to allocate + memory, so this number here allows to limit this. + Xen itself does not know how many LPIs domains will ever need + beforehand. + This can be overridden on the command line with the max_lpi_bits + parameter. + endmenu menu "ARM errata workaround via the alternative framework" diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index 54860e0..02a8737 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -19,6 +19,7 @@ obj-y += gic.o obj-y += gic-v2.o obj-$(CONFIG_HAS_GICV3) += gic-v3.o obj-$(CONFIG_HAS_ITS) += gic-v3-its.o +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o obj-y += guestcopy.o obj-y += hvm.o obj-y += io.o diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c new file mode 100644 index 0000000..4f8414b --- /dev/null +++ b/xen/arch/arm/gic-v3-lpi.c @@ -0,0 +1,201 @@ +/* + * xen/arch/arm/gic-v3-lpi.c + * + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support + * + * Copyright (C) 2016,2017 - ARM Ltd + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; If not, see <http://www.gnu.org/licenses/>. + */ + +#include <xen/lib.h> +#include <xen/mm.h> +#include <asm/gic.h> +#include <asm/gic_v3_defs.h> +#include <asm/gic_v3_its.h> +#include <asm/io.h> +#include <asm/page.h> + +#define LPI_PROPTABLE_NEEDS_FLUSHING (1U << 0) +/* Global state */ +static struct { + /* The global LPI property table, shared by all redistributors. */ + uint8_t *lpi_property; + /* + * Number of physical LPIs the host supports. This is a property of + * the GIC hardware. We depart from the habit of naming these things + * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI" + * in a different context to differentiate them from "virtual LPIs". + */ + unsigned long int nr_host_lpis; + unsigned int flags; +} lpi_data; + +struct lpi_redist_data { + void *pending_table; +}; + +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist); + +#define MAX_PHYS_LPIS (lpi_data.nr_host_lpis - LPI_OFFSET) + +static int gicv3_lpi_allocate_pendtable(uint64_t *reg) +{ + uint64_t val; + unsigned int order; + void *pendtable; + + if ( this_cpu(lpi_redist).pending_table ) + return -EBUSY; + + val = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT; + val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT; + val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT; + + /* + * The pending table holds one bit per LPI and even covers bits for + * interrupt IDs below 8192, so we allocate the full range. + * The GICv3 imposes a 64KB alignment requirement, also requires + * physically contigious memory. + */ + order = max(get_order_from_bytes(lpi_data.nr_host_lpis / 8), 4U); + pendtable = alloc_xenheap_pages(order, 0); + if ( !pendtable ) + return -ENOMEM; + + /* Make sure the physical address can be encoded in the register. */ + if ( (virt_to_maddr(pendtable) & ~GENMASK(51, 16)) ) + { + free_xenheap_pages(pendtable, 0); + return -ERANGE; + } + memset(pendtable, 0, lpi_data.nr_host_lpis / 8); + clean_and_invalidate_dcache_va_range(pendtable, + lpi_data.nr_host_lpis / 8); + + this_cpu(lpi_redist).pending_table = pendtable; + + val |= GICR_PENDBASER_PTZ; + + val |= virt_to_maddr(pendtable); + + *reg = val; + + return 0; +} + +/* + * Tell a redistributor about the (shared) property table, allocating one + * if not already done. + */ +static int gicv3_lpi_set_proptable(void __iomem * rdist_base) +{ + uint64_t reg; + + reg = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; + reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT; + reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT; + + /* + * The property table is shared across all redistributors, so allocate + * this only once, but return the same value on subsequent calls. + */ + if ( !lpi_data.lpi_property ) + { + /* The property table holds one byte per LPI. */ + unsigned int order = get_order_from_bytes(lpi_data.nr_host_lpis); + void *table = alloc_xenheap_pages(order, 0); + + if ( !table ) + return -ENOMEM; + + /* Make sure the physical address can be encoded in the register. */ + if ( (virt_to_maddr(table) & ~GENMASK(51, 12)) ) + { + free_xenheap_pages(table, 0); + return -ERANGE; + } + memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS); + clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS); + lpi_data.lpi_property = table; + } + + /* Encode the number of bits needed, minus one */ + reg |= ((fls(lpi_data.nr_host_lpis - 1) - 1) << 0); + + reg |= virt_to_maddr(lpi_data.lpi_property); + + writeq_relaxed(reg, rdist_base + GICR_PROPBASER); + reg = readq_relaxed(rdist_base + GICR_PROPBASER); + + /* If we can't do shareable, we have to drop cacheability as well. */ + if ( !(reg & GICR_PROPBASER_SHAREABILITY_MASK) ) + { + reg &= ~GICR_PROPBASER_INNER_CACHEABILITY_MASK; + reg |= GIC_BASER_CACHE_nC << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT; + } + + /* Remember that we have to flush the property table if non-cacheable. */ + if ( (reg & GICR_PROPBASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC ) + { + lpi_data.flags |= LPI_PROPTABLE_NEEDS_FLUSHING; + /* Update the redistributors knowledge about the attributes. */ + writeq_relaxed(reg, rdist_base + GICR_PROPBASER); + } + + return 0; +} + +int gicv3_lpi_init_rdist(void __iomem * rdist_base) +{ + uint32_t reg; + uint64_t table_reg; + int ret; + + /* We don't support LPIs without an ITS. */ + if ( !gicv3_its_host_has_its() ) + return -ENODEV; + + /* Make sure LPIs are disabled before setting up the tables. */ + reg = readl_relaxed(rdist_base + GICR_CTLR); + if ( reg & GICR_CTLR_ENABLE_LPIS ) + return -EBUSY; + + ret = gicv3_lpi_allocate_pendtable(&table_reg); + if (ret) + return ret; + writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER); + + return gicv3_lpi_set_proptable(rdist_base); +} + +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS; +integer_param("max_lpi_bits", max_lpi_bits); + +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits) +{ + lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits)); + + printk("GICv3: using at most %ld LPIs on the host.\n", MAX_PHYS_LPIS); + + return 0; +} + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index 1512521..ed78363 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void) type = readl_relaxed(GICD + GICD_TYPER); nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1); + if ( type & GICD_TYPE_LPIS ) + gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1); + printk("GICv3: %d lines, (IID %8.8x).\n", nr_lines, readl_relaxed(GICD + GICD_IIDR)); @@ -660,6 +663,20 @@ static int __init gicv3_populate_rdist(void) if ( (typer >> 32) == aff ) { this_cpu(rbase) = ptr; + + if ( typer & GICR_TYPER_PLPIS ) + { + int ret; + + ret = gicv3_lpi_init_rdist(ptr); + if ( ret && ret != -ENODEV ) + { + printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n", + smp_processor_id(), ret); + break; + } + } + printk("GICv3: CPU%d: Found redistributor in region %d @%p\n", smp_processor_id(), i, ptr); return 0; diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h index bda8898..1cbfb9e 100644 --- a/xen/include/asm-arm/bitops.h +++ b/xen/include/asm-arm/bitops.h @@ -24,6 +24,7 @@ #define BIT(nr) (1UL << (nr)) #define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_WORD)) #define BIT_WORD(nr) ((nr) / BITS_PER_WORD) +#define BIT_ULL(nr) (1ULL << (nr)) #define BITS_PER_BYTE 8 #define ADDR (*(volatile int *) addr) diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h index 836a103..12bd155 100644 --- a/xen/include/asm-arm/gic.h +++ b/xen/include/asm-arm/gic.h @@ -220,6 +220,8 @@ enum gic_version { GIC_V3, }; +#define LPI_OFFSET 8192 + extern enum gic_version gic_hw_version(void); /* Program the IRQ type into the GIC */ diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h index 6bd25a5..b307322 100644 --- a/xen/include/asm-arm/gic_v3_defs.h +++ b/xen/include/asm-arm/gic_v3_defs.h @@ -44,7 +44,8 @@ #define GICC_SRE_EL2_ENEL1 (1UL << 3) /* Additional bits in GICD_TYPER defined by GICv3 */ -#define GICD_TYPE_ID_BITS_SHIFT 19 +#define GICD_TYPE_ID_BITS_SHIFT 19 +#define GICD_TYPE_LPIS (1U << 17) #define GICD_CTLR_RWP (1UL << 31) #define GICD_CTLR_ARE_NS (1U << 4) @@ -95,12 +96,61 @@ #define GICR_IGRPMODR0 (0x0D00) #define GICR_NSACR (0x0E00) +#define GICR_CTLR_ENABLE_LPIS (1U << 0) + #define GICR_TYPER_PLPIS (1U << 0) #define GICR_TYPER_VLPIS (1U << 1) #define GICR_TYPER_LAST (1U << 4) +/* For specifying the inner cacheability type only */ +#define GIC_BASER_CACHE_nCnB 0ULL +/* For specifying the outer cacheability type only */ +#define GIC_BASER_CACHE_SameAsInner 0ULL +#define GIC_BASER_CACHE_nC 1ULL +#define GIC_BASER_CACHE_RaWt 2ULL +#define GIC_BASER_CACHE_RaWb 3ULL +#define GIC_BASER_CACHE_WaWt 4ULL +#define GIC_BASER_CACHE_WaWb 5ULL +#define GIC_BASER_CACHE_RaWaWt 6ULL +#define GIC_BASER_CACHE_RaWaWb 7ULL +#define GIC_BASER_CACHE_MASK 7ULL + +#define GIC_BASER_NonShareable 0ULL +#define GIC_BASER_InnerShareable 1ULL +#define GIC_BASER_OuterShareable 2ULL + +#define GICR_PROPBASER_SHAREABILITY_SHIFT 10 +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT 7 +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT 56 +#define GICR_PROPBASER_SHAREABILITY_MASK \ + (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT) +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK \ + (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT) +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK \ + (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT) +#define GICR_PROPBASER_RES0_MASK \ + (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5)) + +#define GICR_PENDBASER_SHAREABILITY_SHIFT 10 +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT 7 +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT 56 +#define GICR_PENDBASER_SHAREABILITY_MASK \ + (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT) +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK \ + (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT) +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK \ + (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT) +#define GICR_PENDBASER_PTZ BIT(62) +#define GICR_PENDBASER_RES0_MASK \ + (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) | \ + GENMASK(15, 12) | GENMASK(6, 0)) + #define DEFAULT_PMR_VALUE 0xff +#define LPI_PROP_PRIO_MASK 0xfc +#define LPI_PROP_RES1 (1 << 1) +#define LPI_PROP_ENABLED (1 << 0) + #define GICH_VMCR_EOI (1 << 9) #define GICH_VMCR_VENG1 (1 << 1) diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h index 765a655..219d109 100644 --- a/xen/include/asm-arm/gic_v3_its.h +++ b/xen/include/asm-arm/gic_v3_its.h @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node); bool gicv3_its_host_has_its(void); +int gicv3_lpi_init_rdist(void __iomem * rdist_base); + +/* Initialize the host structures for LPIs. */ +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis); + #else static LIST_HEAD(host_its_list); @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void) return false; } +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base) +{ + return -ENODEV; +} + +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis) +{ + return 0; +} #endif /* CONFIG_HAS_ITS */ #endif
The ARM GICv3 provides a new kind of interrupt called LPIs. The pending bits and the configuration data (priority, enable bits) for those LPIs are stored in tables in normal memory, which software has to provide to the hardware. Allocate the required memory, initialize it and hand it over to each redistributor. The maximum number of LPIs to be used can be adjusted with the command line option "max_lpi_bits", which defaults to a compile time constant exposed in Kconfig. Signed-off-by: Andre Przywara <andre.przywara@arm.com> --- docs/misc/xen-command-line.markdown | 9 ++ xen/arch/arm/Kconfig | 15 +++ xen/arch/arm/Makefile | 1 + xen/arch/arm/gic-v3-lpi.c | 201 ++++++++++++++++++++++++++++++++++++ xen/arch/arm/gic-v3.c | 17 +++ xen/include/asm-arm/bitops.h | 1 + xen/include/asm-arm/gic.h | 2 + xen/include/asm-arm/gic_v3_defs.h | 52 +++++++++- xen/include/asm-arm/gic_v3_its.h | 14 +++ 9 files changed, 311 insertions(+), 1 deletion(-) create mode 100644 xen/arch/arm/gic-v3-lpi.c