diff mbox series

[v5,5/8] xen/arm: assign devices to boot domains

Message ID 20190925184924.21691-5-sstabellini@kernel.org (mailing list archive)
State Superseded
Headers show
Series [v5,1/8] xen/arm: introduce handle_device_interrupts | expand

Commit Message

Stefano Stabellini Sept. 25, 2019, 6:49 p.m. UTC
Scan the user provided dtb fragment at boot. For each device node, map
memory to guests, and route interrupts and setup the iommu.

The memory region to remap is specified by the "xen,reg" property.

The iommu is setup by passing the node of the device to assign on the
host device tree. The path is specified in the device tree fragment as
the "xen,path" string property.

The interrupts are remapped based on the information from the
corresponding node on the host device tree. Call
handle_device_interrupts to remap interrupts. Interrupts related device
tree properties are copied from the device tree fragment, same as all
the other properties.

Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
the IOMMU.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
---
Changes in v5:
- use local variable for name
- use map_regions_p2mt
- add warning for not page aligned addresses/sizes
- introduce handle_passthrough_prop

Changes in v4:
- use unsigned
- improve commit message
- code style
- use dt_prop_cmp
- use device_tree_get_reg
- don't copy over xen,reg and xen,path
- don't create special interrupt properties for domU: copy them from the
  fragment
- in-code comment

Changes in v3:
- improve commit message
- remove superfluous cast
- merge code with the copy code
- add interrup-parent
- demove depth > 2 check
- reuse code from handle_device_interrupts
- copy interrupts from host dt

Changes in v2:
- rename "path" to "xen,path"
- grammar fix
- use gaddr_to_gfn and maddr_to_mfn
- remove depth <= 2 limitation in scanning the dtb fragment
- introduce and parse xen,reg
- code style
- support more than one interrupt per device
- specify only the GIC is supported
---
 xen/arch/arm/domain_build.c | 101 ++++++++++++++++++++++++++++++++++--
 1 file changed, 97 insertions(+), 4 deletions(-)

Comments

Julien Grall Sept. 25, 2019, 9:12 p.m. UTC | #1
Hi Stefano,

On 25/09/2019 19:49, Stefano Stabellini wrote:
> Scan the user provided dtb fragment at boot. For each device node, map
> memory to guests, and route interrupts and setup the iommu.
> 
> The memory region to remap is specified by the "xen,reg" property.
> 
> The iommu is setup by passing the node of the device to assign on the
> host device tree. The path is specified in the device tree fragment as
> the "xen,path" string property.
> 
> The interrupts are remapped based on the information from the
> corresponding node on the host device tree. Call
> handle_device_interrupts to remap interrupts. Interrupts related device
> tree properties are copied from the device tree fragment, same as all
> the other properties.
> 
> Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
> the IOMMU.
> 
> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
> ---
> Changes in v5:
> - use local variable for name
> - use map_regions_p2mt
> - add warning for not page aligned addresses/sizes
> - introduce handle_passthrough_prop
> 
> Changes in v4:
> - use unsigned
> - improve commit message
> - code style
> - use dt_prop_cmp
> - use device_tree_get_reg
> - don't copy over xen,reg and xen,path
> - don't create special interrupt properties for domU: copy them from the
>    fragment
> - in-code comment
> 
> Changes in v3:
> - improve commit message
> - remove superfluous cast
> - merge code with the copy code
> - add interrup-parent
> - demove depth > 2 check
> - reuse code from handle_device_interrupts
> - copy interrupts from host dt
> 
> Changes in v2:
> - rename "path" to "xen,path"
> - grammar fix
> - use gaddr_to_gfn and maddr_to_mfn
> - remove depth <= 2 limitation in scanning the dtb fragment
> - introduce and parse xen,reg
> - code style
> - support more than one interrupt per device
> - specify only the GIC is supported
> ---
>   xen/arch/arm/domain_build.c | 101 ++++++++++++++++++++++++++++++++++--
>   1 file changed, 97 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 9d985d3bbe..414893bc24 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
>   }
>   #endif
>   
> +/*
> + * Scan device tree properties for passthrough specific information.
> + * Returns -ENOENT when no passthrough properties are found
> + *         < 0 on error
> + *         0 on success
> + */
> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
> +                                          const struct fdt_property *prop,
> +                                          const char *name,
> +                                          uint32_t address_cells, uint32_t size_cells)
> +{
> +    const __be32 *cell;
> +    unsigned int i, len;
> +    struct dt_device_node *node;
> +    int res;
> +
> +    /* xen,reg specifies where to map the MMIO region */
> +    if ( dt_prop_cmp("xen,reg", name) == 0 )
> +    {
> +        paddr_t mstart, size, gstart;
> +        cell = (const __be32 *)prop->data;
> +        len = fdt32_to_cpu(prop->len) /
> +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
> +
> +        for ( i = 0; i < len; i++ )
> +        {
> +            device_tree_get_reg(&cell, address_cells, size_cells,
> +                    &mstart, &size);
> +            gstart = dt_next_cell(address_cells, &cell);
> +
> +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
> +                dprintk(XENLOG_WARNING,
> +                        "DomU passthrough config has not page aligned addresses/sizes\n");

I don't think this is wise to continue, the more this is a printk that 
can only happen in debug build. So someone using a release build may not 
notice the error.

So I think this wants to be a printk(XENLOG_ERR,...) and also return an 
error.

> +
> +            res = map_regions_p2mt(kinfo->d,
> +                    gaddr_to_gfn(gstart),
> +                    PFN_DOWN(size),
> +                    maddr_to_mfn(mstart),
> +                    p2m_mmio_direct_dev);

Coding style.

> +            if ( res < 0 )
> +            {
> +                dprintk(XENLOG_ERR,
> +                        "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
> +                        mstart, gstart);

Similarly, this wants to be a printk.

> +                return -EFAULT;
> +            }
> +        }
> +
> +        return 0;
> +    }
> +    /*
> +     * xen,path specifies the corresponding node in the host DT.
> +     * Both interrupt mappings and IOMMU settings are based on it,
> +     * as they are done based on the corresponding host DT node.
> +     */
> +    else if ( dt_prop_cmp("xen,path", name) == 0 )
> +    {
> +        node = dt_find_node_by_path(prop->data);
> +        if ( node == NULL )
> +        {
> +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
> +                    (char *)prop->data);

Same here.

> +            return -EINVAL;
> +        }
> +
> +        res = iommu_assign_dt_device(kinfo->d, node);
> +        if ( res < 0 )
> +            return res;
> +
> +        res = handle_device_interrupts(kinfo->d, node, true);
> +        if ( res < 0 )
> +            return res;
> +
> +        return 0;
> +    }
> +
> +    return -ENOENT;
> +}
> +
>   static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>                                      const void *pfdt, int nodeoff,
>                                      uint32_t address_cells, uint32_t size_cells,
> @@ -1709,6 +1788,7 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>       void *fdt = kinfo->fdt;
>       int propoff, nameoff, res;
>       const struct fdt_property *prop;
> +    const char *name;
>   
>       for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
>             propoff >= 0;
> @@ -1717,11 +1797,23 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>           if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
>               return -FDT_ERR_INTERNAL;
>   
> +        res = 0;
>           nameoff = fdt32_to_cpu(prop->nameoff);
> -        res = fdt_property(fdt, fdt_string(pfdt, nameoff),
> -                           prop->data, fdt32_to_cpu(prop->len));
> -        if ( res )
> +        name = fdt_string(pfdt, nameoff);
> +
> +        if ( scan_passthrough_prop )
> +            res = handle_passthrough_prop(kinfo, prop, name,
> +                                          address_cells, size_cells);
> +        if ( res < 0 && res != -ENOENT )
>               return res;
> +
> +        /* copy all other properties */
> +        if ( !scan_passthrough_prop || res == -ENOENT )
> +        {
> +            res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
> +            if ( res )
> +                return res;
> +        }
>       }
>   
>       /* FDT_ERR_NOTFOUND => There is no more properties for this node */
> @@ -2254,7 +2346,8 @@ void __init create_domUs(void)
>           struct xen_domctl_createdomain d_cfg = {
>               .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>               .arch.nr_spis = 0,
> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> +                     XEN_DOMCTL_CDF_iommu,

This will break dom0less on platform without an IOMMU because setting 
CDF_iommu for them will be invalid.

I also don't think this is wise to always enable the IOMMU. This should 
only be done if there is a partial device-tree present (most likely it 
means passthrough will be used).

>               .max_evtchn_port = -1,
>               .max_grant_frames = 64,
>               .max_maptrack_frames = 1024,
> 

Cheers,
Stefano Stabellini Sept. 26, 2019, 3:45 a.m. UTC | #2
On Wed, 25 Sep 2019, Julien Grall wrote:
> Hi Stefano,
> 
> On 25/09/2019 19:49, Stefano Stabellini wrote:
> > Scan the user provided dtb fragment at boot. For each device node, map
> > memory to guests, and route interrupts and setup the iommu.
> > 
> > The memory region to remap is specified by the "xen,reg" property.
> > 
> > The iommu is setup by passing the node of the device to assign on the
> > host device tree. The path is specified in the device tree fragment as
> > the "xen,path" string property.
> > 
> > The interrupts are remapped based on the information from the
> > corresponding node on the host device tree. Call
> > handle_device_interrupts to remap interrupts. Interrupts related device
> > tree properties are copied from the device tree fragment, same as all
> > the other properties.
> > 
> > Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
> > the IOMMU.
> > 
> > Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
> > ---
> > Changes in v5:
> > - use local variable for name
> > - use map_regions_p2mt
> > - add warning for not page aligned addresses/sizes
> > - introduce handle_passthrough_prop
> > 
> > Changes in v4:
> > - use unsigned
> > - improve commit message
> > - code style
> > - use dt_prop_cmp
> > - use device_tree_get_reg
> > - don't copy over xen,reg and xen,path
> > - don't create special interrupt properties for domU: copy them from the
> >    fragment
> > - in-code comment
> > 
> > Changes in v3:
> > - improve commit message
> > - remove superfluous cast
> > - merge code with the copy code
> > - add interrup-parent
> > - demove depth > 2 check
> > - reuse code from handle_device_interrupts
> > - copy interrupts from host dt
> > 
> > Changes in v2:
> > - rename "path" to "xen,path"
> > - grammar fix
> > - use gaddr_to_gfn and maddr_to_mfn
> > - remove depth <= 2 limitation in scanning the dtb fragment
> > - introduce and parse xen,reg
> > - code style
> > - support more than one interrupt per device
> > - specify only the GIC is supported
> > ---
> >   xen/arch/arm/domain_build.c | 101 ++++++++++++++++++++++++++++++++++--
> >   1 file changed, 97 insertions(+), 4 deletions(-)
> > 
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 9d985d3bbe..414893bc24 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
> >   }
> >   #endif
> >   
> > +/*
> > + * Scan device tree properties for passthrough specific information.
> > + * Returns -ENOENT when no passthrough properties are found
> > + *         < 0 on error
> > + *         0 on success
> > + */
> > +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
> > +                                          const struct fdt_property *prop,
> > +                                          const char *name,
> > +                                          uint32_t address_cells, uint32_t size_cells)
> > +{
> > +    const __be32 *cell;
> > +    unsigned int i, len;
> > +    struct dt_device_node *node;
> > +    int res;
> > +
> > +    /* xen,reg specifies where to map the MMIO region */
> > +    if ( dt_prop_cmp("xen,reg", name) == 0 )
> > +    {
> > +        paddr_t mstart, size, gstart;
> > +        cell = (const __be32 *)prop->data;
> > +        len = fdt32_to_cpu(prop->len) /
> > +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
> > +
> > +        for ( i = 0; i < len; i++ )
> > +        {
> > +            device_tree_get_reg(&cell, address_cells, size_cells,
> > +                    &mstart, &size);
> > +            gstart = dt_next_cell(address_cells, &cell);
> > +
> > +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
> > +                dprintk(XENLOG_WARNING,
> > +                        "DomU passthrough config has not page aligned addresses/sizes\n");
> 
> I don't think this is wise to continue, the more this is a printk that 
> can only happen in debug build. So someone using a release build may not 
> notice the error.
> 
> So I think this wants to be a printk(XENLOG_ERR,...) and also return an 
> error.

I'll fix the code style issues, use printk instead of dprintk and return
error here.


> > +
> > +            res = map_regions_p2mt(kinfo->d,
> > +                    gaddr_to_gfn(gstart),
> > +                    PFN_DOWN(size),
> > +                    maddr_to_mfn(mstart),
> > +                    p2m_mmio_direct_dev);
> 
> Coding style.
> 
> > +            if ( res < 0 )
> > +            {
> > +                dprintk(XENLOG_ERR,
> > +                        "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
> > +                        mstart, gstart);
> 
> Similarly, this wants to be a printk.
> 
> > +                return -EFAULT;
> > +            }
> > +        }
> > +
> > +        return 0;
> > +    }
> > +    /*
> > +     * xen,path specifies the corresponding node in the host DT.
> > +     * Both interrupt mappings and IOMMU settings are based on it,
> > +     * as they are done based on the corresponding host DT node.
> > +     */
> > +    else if ( dt_prop_cmp("xen,path", name) == 0 )
> > +    {
> > +        node = dt_find_node_by_path(prop->data);
> > +        if ( node == NULL )
> > +        {
> > +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
> > +                    (char *)prop->data);
> 
> Same here.
> 
> > +            return -EINVAL;
> > +        }
> > +
> > +        res = iommu_assign_dt_device(kinfo->d, node);
> > +        if ( res < 0 )
> > +            return res;
> > +
> > +        res = handle_device_interrupts(kinfo->d, node, true);
> > +        if ( res < 0 )
> > +            return res;
> > +
> > +        return 0;
> > +    }
> > +
> > +    return -ENOENT;
> > +}
> > +
> >   static int __init handle_prop_pfdt(struct kernel_info *kinfo,
> >                                      const void *pfdt, int nodeoff,
> >                                      uint32_t address_cells, uint32_t size_cells,
> > @@ -1709,6 +1788,7 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
> >       void *fdt = kinfo->fdt;
> >       int propoff, nameoff, res;
> >       const struct fdt_property *prop;
> > +    const char *name;
> >   
> >       for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
> >             propoff >= 0;
> > @@ -1717,11 +1797,23 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
> >           if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
> >               return -FDT_ERR_INTERNAL;
> >   
> > +        res = 0;
> >           nameoff = fdt32_to_cpu(prop->nameoff);
> > -        res = fdt_property(fdt, fdt_string(pfdt, nameoff),
> > -                           prop->data, fdt32_to_cpu(prop->len));
> > -        if ( res )
> > +        name = fdt_string(pfdt, nameoff);
> > +
> > +        if ( scan_passthrough_prop )
> > +            res = handle_passthrough_prop(kinfo, prop, name,
> > +                                          address_cells, size_cells);
> > +        if ( res < 0 && res != -ENOENT )
> >               return res;
> > +
> > +        /* copy all other properties */
> > +        if ( !scan_passthrough_prop || res == -ENOENT )
> > +        {
> > +            res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
> > +            if ( res )
> > +                return res;
> > +        }
> >       }
> >   
> >       /* FDT_ERR_NOTFOUND => There is no more properties for this node */
> > @@ -2254,7 +2346,8 @@ void __init create_domUs(void)
> >           struct xen_domctl_createdomain d_cfg = {
> >               .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> >               .arch.nr_spis = 0,
> > -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> > +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> > +                     XEN_DOMCTL_CDF_iommu,
> 
> This will break dom0less on platform without an IOMMU because setting 
> CDF_iommu for them will be invalid.
> 
> I also don't think this is wise to always enable the IOMMU. This should 
> only be done if there is a partial device-tree present (most likely it 
> means passthrough will be used).

That's interesting, good idea, I'll do that
Oleksandr Tyshchenko Sept. 27, 2019, 2:40 p.m. UTC | #3
On 26.09.19 00:12, Julien Grall wrote:
> Hi Stefano,


Hi Stefano, Julien


>
> On 25/09/2019 19:49, Stefano Stabellini wrote:
>> Scan the user provided dtb fragment at boot. For each device node, map
>> memory to guests, and route interrupts and setup the iommu.
>>
>> The memory region to remap is specified by the "xen,reg" property.
>>
>> The iommu is setup by passing the node of the device to assign on the
>> host device tree. The path is specified in the device tree fragment as
>> the "xen,path" string property.
>>
>> The interrupts are remapped based on the information from the
>> corresponding node on the host device tree. Call
>> handle_device_interrupts to remap interrupts. Interrupts related device
>> tree properties are copied from the device tree fragment, same as all
>> the other properties.
>>
>> Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
>> the IOMMU.
>>
>> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>> ---
>> Changes in v5:
>> - use local variable for name
>> - use map_regions_p2mt
>> - add warning for not page aligned addresses/sizes
>> - introduce handle_passthrough_prop
>>
>> Changes in v4:
>> - use unsigned
>> - improve commit message
>> - code style
>> - use dt_prop_cmp
>> - use device_tree_get_reg
>> - don't copy over xen,reg and xen,path
>> - don't create special interrupt properties for domU: copy them from the
>>     fragment
>> - in-code comment
>>
>> Changes in v3:
>> - improve commit message
>> - remove superfluous cast
>> - merge code with the copy code
>> - add interrup-parent
>> - demove depth > 2 check
>> - reuse code from handle_device_interrupts
>> - copy interrupts from host dt
>>
>> Changes in v2:
>> - rename "path" to "xen,path"
>> - grammar fix
>> - use gaddr_to_gfn and maddr_to_mfn
>> - remove depth <= 2 limitation in scanning the dtb fragment
>> - introduce and parse xen,reg
>> - code style
>> - support more than one interrupt per device
>> - specify only the GIC is supported
>> ---
>>    xen/arch/arm/domain_build.c | 101 ++++++++++++++++++++++++++++++++++--
>>    1 file changed, 97 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 9d985d3bbe..414893bc24 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
>>    }
>>    #endif
>>    
>> +/*
>> + * Scan device tree properties for passthrough specific information.
>> + * Returns -ENOENT when no passthrough properties are found
>> + *         < 0 on error
>> + *         0 on success
>> + */
>> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
>> +                                          const struct fdt_property *prop,
>> +                                          const char *name,
>> +                                          uint32_t address_cells, uint32_t size_cells)
>> +{
>> +    const __be32 *cell;
>> +    unsigned int i, len;
>> +    struct dt_device_node *node;
>> +    int res;
>> +
>> +    /* xen,reg specifies where to map the MMIO region */
>> +    if ( dt_prop_cmp("xen,reg", name) == 0 )
>> +    {
>> +        paddr_t mstart, size, gstart;
>> +        cell = (const __be32 *)prop->data;
>> +        len = fdt32_to_cpu(prop->len) /
>> +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
>> +
>> +        for ( i = 0; i < len; i++ )
>> +        {
>> +            device_tree_get_reg(&cell, address_cells, size_cells,
>> +                    &mstart, &size);
>> +            gstart = dt_next_cell(address_cells, &cell);
>> +
>> +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
>> +                dprintk(XENLOG_WARNING,
>> +                        "DomU passthrough config has not page aligned addresses/sizes\n");
> I don't think this is wise to continue, the more this is a printk that
> can only happen in debug build. So someone using a release build may not
> notice the error.
>
> So I think this wants to be a printk(XENLOG_ERR,...) and also return an
> error.
>
>> +
>> +            res = map_regions_p2mt(kinfo->d,
>> +                    gaddr_to_gfn(gstart),
>> +                    PFN_DOWN(size),
>> +                    maddr_to_mfn(mstart),
>> +                    p2m_mmio_direct_dev);
> Coding style.
>
>> +            if ( res < 0 )
>> +            {
>> +                dprintk(XENLOG_ERR,
>> +                        "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
>> +                        mstart, gstart);
> Similarly, this wants to be a printk.
>
>> +                return -EFAULT;
>> +            }
>> +        }
>> +
>> +        return 0;
>> +    }
>> +    /*
>> +     * xen,path specifies the corresponding node in the host DT.
>> +     * Both interrupt mappings and IOMMU settings are based on it,
>> +     * as they are done based on the corresponding host DT node.
>> +     */
>> +    else if ( dt_prop_cmp("xen,path", name) == 0 )
>> +    {
>> +        node = dt_find_node_by_path(prop->data);
>> +        if ( node == NULL )
>> +        {
>> +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
>> +                    (char *)prop->data);
> Same here.
>
>> +            return -EINVAL;
>> +        }

I have to admit that I don't know about dom0less feature enough ...


But, shouldn't we check if the device is behind the IOMMU and try to add 
it (iommu_add_dt_device) before assigning it (this is needed for drivers 
which support generic IOMMU DT bindings in the first place).

[please take a look at 
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html 
if so]

Julien, what do you think?


>> +
>> +        res = iommu_assign_dt_device(kinfo->d, node);
>> +        if ( res < 0 )
>> +            return res;
>> +
>> +        res = handle_device_interrupts(kinfo->d, node, true);
>> +        if ( res < 0 )
>> +            return res;
>> +
>> +        return 0;
>> +    }
>> +
>> +    return -ENOENT;
>> +}
>> +
>>    static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>>                                       const void *pfdt, int nodeoff,
>>                                       uint32_t address_cells, uint32_t size_cells,
>> @@ -1709,6 +1788,7 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>>        void *fdt = kinfo->fdt;
>>        int propoff, nameoff, res;
>>        const struct fdt_property *prop;
>> +    const char *name;
>>    
>>        for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
>>              propoff >= 0;
>> @@ -1717,11 +1797,23 @@ static int __init handle_prop_pfdt(struct kernel_info *kinfo,
>>            if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
>>                return -FDT_ERR_INTERNAL;
>>    
>> +        res = 0;
>>            nameoff = fdt32_to_cpu(prop->nameoff);
>> -        res = fdt_property(fdt, fdt_string(pfdt, nameoff),
>> -                           prop->data, fdt32_to_cpu(prop->len));
>> -        if ( res )
>> +        name = fdt_string(pfdt, nameoff);
>> +
>> +        if ( scan_passthrough_prop )
>> +            res = handle_passthrough_prop(kinfo, prop, name,
>> +                                          address_cells, size_cells);
>> +        if ( res < 0 && res != -ENOENT )
>>                return res;
>> +
>> +        /* copy all other properties */
>> +        if ( !scan_passthrough_prop || res == -ENOENT )
>> +        {
>> +            res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
>> +            if ( res )
>> +                return res;
>> +        }
>>        }
>>    
>>        /* FDT_ERR_NOTFOUND => There is no more properties for this node */
>> @@ -2254,7 +2346,8 @@ void __init create_domUs(void)
>>            struct xen_domctl_createdomain d_cfg = {
>>                .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>>                .arch.nr_spis = 0,
>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>> +                     XEN_DOMCTL_CDF_iommu,
> This will break dom0less on platform without an IOMMU because setting
> CDF_iommu for them will be invalid.
>
> I also don't think this is wise to always enable the IOMMU. This should
> only be done if there is a partial device-tree present (most likely it
> means passthrough will be used).
>
>>                .max_evtchn_port = -1,
>>                .max_grant_frames = 64,
>>                .max_maptrack_frames = 1024,
>>
> Cheers,
>
Julien Grall Sept. 27, 2019, 8:02 p.m. UTC | #4
Hi,

On 27/09/2019 15:40, Oleksandr wrote:
> 
> On 26.09.19 00:12, Julien Grall wrote:
>> On 25/09/2019 19:49, Stefano Stabellini wrote:
>>> Scan the user provided dtb fragment at boot. For each device node, map
>>> memory to guests, and route interrupts and setup the iommu.
>>>
>>> The memory region to remap is specified by the "xen,reg" property.
>>>
>>> The iommu is setup by passing the node of the device to assign on the
>>> host device tree. The path is specified in the device tree fragment as
>>> the "xen,path" string property.
>>>
>>> The interrupts are remapped based on the information from the
>>> corresponding node on the host device tree. Call
>>> handle_device_interrupts to remap interrupts. Interrupts related device
>>> tree properties are copied from the device tree fragment, same as all
>>> the other properties.
>>>
>>> Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
>>> the IOMMU.
>>>
>>> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>>> ---
>>> Changes in v5:
>>> - use local variable for name
>>> - use map_regions_p2mt
>>> - add warning for not page aligned addresses/sizes
>>> - introduce handle_passthrough_prop
>>>
>>> Changes in v4:
>>> - use unsigned
>>> - improve commit message
>>> - code style
>>> - use dt_prop_cmp
>>> - use device_tree_get_reg
>>> - don't copy over xen,reg and xen,path
>>> - don't create special interrupt properties for domU: copy them from the
>>>     fragment
>>> - in-code comment
>>>
>>> Changes in v3:
>>> - improve commit message
>>> - remove superfluous cast
>>> - merge code with the copy code
>>> - add interrup-parent
>>> - demove depth > 2 check
>>> - reuse code from handle_device_interrupts
>>> - copy interrupts from host dt
>>>
>>> Changes in v2:
>>> - rename "path" to "xen,path"
>>> - grammar fix
>>> - use gaddr_to_gfn and maddr_to_mfn
>>> - remove depth <= 2 limitation in scanning the dtb fragment
>>> - introduce and parse xen,reg
>>> - code style
>>> - support more than one interrupt per device
>>> - specify only the GIC is supported
>>> ---
>>>    xen/arch/arm/domain_build.c | 101 
>>> ++++++++++++++++++++++++++++++++++--
>>>    1 file changed, 97 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 9d985d3bbe..414893bc24 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct 
>>> kernel_info *kinfo)
>>>    }
>>>    #endif
>>> +/*
>>> + * Scan device tree properties for passthrough specific information.
>>> + * Returns -ENOENT when no passthrough properties are found
>>> + *         < 0 on error
>>> + *         0 on success
>>> + */
>>> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
>>> +                                          const struct fdt_property 
>>> *prop,
>>> +                                          const char *name,
>>> +                                          uint32_t address_cells, 
>>> uint32_t size_cells)
>>> +{
>>> +    const __be32 *cell;
>>> +    unsigned int i, len;
>>> +    struct dt_device_node *node;
>>> +    int res;
>>> +
>>> +    /* xen,reg specifies where to map the MMIO region */
>>> +    if ( dt_prop_cmp("xen,reg", name) == 0 )
>>> +    {
>>> +        paddr_t mstart, size, gstart;
>>> +        cell = (const __be32 *)prop->data;
>>> +        len = fdt32_to_cpu(prop->len) /
>>> +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
>>> +
>>> +        for ( i = 0; i < len; i++ )
>>> +        {
>>> +            device_tree_get_reg(&cell, address_cells, size_cells,
>>> +                    &mstart, &size);
>>> +            gstart = dt_next_cell(address_cells, &cell);
>>> +
>>> +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size 
>>> & ~PAGE_MASK )
>>> +                dprintk(XENLOG_WARNING,
>>> +                        "DomU passthrough config has not page 
>>> aligned addresses/sizes\n");
>> I don't think this is wise to continue, the more this is a printk that
>> can only happen in debug build. So someone using a release build may not
>> notice the error.
>>
>> So I think this wants to be a printk(XENLOG_ERR,...) and also return an
>> error.
>>
>>> +
>>> +            res = map_regions_p2mt(kinfo->d,
>>> +                    gaddr_to_gfn(gstart),
>>> +                    PFN_DOWN(size),
>>> +                    maddr_to_mfn(mstart),
>>> +                    p2m_mmio_direct_dev);
>> Coding style.
>>
>>> +            if ( res < 0 )
>>> +            {
>>> +                dprintk(XENLOG_ERR,
>>> +                        "Failed to map %"PRIpaddr" to the guest 
>>> at%"PRIpaddr"\n",
>>> +                        mstart, gstart);
>> Similarly, this wants to be a printk.
>>
>>> +                return -EFAULT;
>>> +            }
>>> +        }
>>> +
>>> +        return 0;
>>> +    }
>>> +    /*
>>> +     * xen,path specifies the corresponding node in the host DT.
>>> +     * Both interrupt mappings and IOMMU settings are based on it,
>>> +     * as they are done based on the corresponding host DT node.
>>> +     */
>>> +    else if ( dt_prop_cmp("xen,path", name) == 0 )
>>> +    {
>>> +        node = dt_find_node_by_path(prop->data);
>>> +        if ( node == NULL )
>>> +        {
>>> +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
>>> +                    (char *)prop->data);
>> Same here.
>>
>>> +            return -EINVAL;
>>> +        }
> 
> I have to admit that I don't know about dom0less feature enough ...
> 
> 
> But, shouldn't we check if the device is behind the IOMMU and try to add 
> it (iommu_add_dt_device) before assigning it (this is needed for drivers 
> which support generic IOMMU DT bindings in the first place).
> 
> [please take a look at 
> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html 
> if so]
> 
> Julien, what do you think?

Yes you are right.

@Stefano, this is a recently merged feature. Without it, you will not be 
able to use passthrough with dom0less guest when the IOMMU (such as 
IPMMU) is using the generic DT bindings.

Cheers,
Stefano Stabellini Sept. 27, 2019, 11:28 p.m. UTC | #5
On Fri, 27 Sep 2019, Julien Grall wrote:
> Hi,
> 
> On 27/09/2019 15:40, Oleksandr wrote:
> > 
> > On 26.09.19 00:12, Julien Grall wrote:
> >> On 25/09/2019 19:49, Stefano Stabellini wrote:
> >>> Scan the user provided dtb fragment at boot. For each device node, map
> >>> memory to guests, and route interrupts and setup the iommu.
> >>>
> >>> The memory region to remap is specified by the "xen,reg" property.
> >>>
> >>> The iommu is setup by passing the node of the device to assign on the
> >>> host device tree. The path is specified in the device tree fragment as
> >>> the "xen,path" string property.
> >>>
> >>> The interrupts are remapped based on the information from the
> >>> corresponding node on the host device tree. Call
> >>> handle_device_interrupts to remap interrupts. Interrupts related device
> >>> tree properties are copied from the device tree fragment, same as all
> >>> the other properties.
> >>>
> >>> Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
> >>> the IOMMU.
> >>>
> >>> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
> >>> ---
> >>> Changes in v5:
> >>> - use local variable for name
> >>> - use map_regions_p2mt
> >>> - add warning for not page aligned addresses/sizes
> >>> - introduce handle_passthrough_prop
> >>>
> >>> Changes in v4:
> >>> - use unsigned
> >>> - improve commit message
> >>> - code style
> >>> - use dt_prop_cmp
> >>> - use device_tree_get_reg
> >>> - don't copy over xen,reg and xen,path
> >>> - don't create special interrupt properties for domU: copy them from the
> >>>     fragment
> >>> - in-code comment
> >>>
> >>> Changes in v3:
> >>> - improve commit message
> >>> - remove superfluous cast
> >>> - merge code with the copy code
> >>> - add interrup-parent
> >>> - demove depth > 2 check
> >>> - reuse code from handle_device_interrupts
> >>> - copy interrupts from host dt
> >>>
> >>> Changes in v2:
> >>> - rename "path" to "xen,path"
> >>> - grammar fix
> >>> - use gaddr_to_gfn and maddr_to_mfn
> >>> - remove depth <= 2 limitation in scanning the dtb fragment
> >>> - introduce and parse xen,reg
> >>> - code style
> >>> - support more than one interrupt per device
> >>> - specify only the GIC is supported
> >>> ---
> >>>    xen/arch/arm/domain_build.c | 101 
> >>> ++++++++++++++++++++++++++++++++++--
> >>>    1 file changed, 97 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>> index 9d985d3bbe..414893bc24 100644
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct 
> >>> kernel_info *kinfo)
> >>>    }
> >>>    #endif
> >>> +/*
> >>> + * Scan device tree properties for passthrough specific information.
> >>> + * Returns -ENOENT when no passthrough properties are found
> >>> + *         < 0 on error
> >>> + *         0 on success
> >>> + */
> >>> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
> >>> +                                          const struct fdt_property 
> >>> *prop,
> >>> +                                          const char *name,
> >>> +                                          uint32_t address_cells, 
> >>> uint32_t size_cells)
> >>> +{
> >>> +    const __be32 *cell;
> >>> +    unsigned int i, len;
> >>> +    struct dt_device_node *node;
> >>> +    int res;
> >>> +
> >>> +    /* xen,reg specifies where to map the MMIO region */
> >>> +    if ( dt_prop_cmp("xen,reg", name) == 0 )
> >>> +    {
> >>> +        paddr_t mstart, size, gstart;
> >>> +        cell = (const __be32 *)prop->data;
> >>> +        len = fdt32_to_cpu(prop->len) /
> >>> +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
> >>> +
> >>> +        for ( i = 0; i < len; i++ )
> >>> +        {
> >>> +            device_tree_get_reg(&cell, address_cells, size_cells,
> >>> +                    &mstart, &size);
> >>> +            gstart = dt_next_cell(address_cells, &cell);
> >>> +
> >>> +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size 
> >>> & ~PAGE_MASK )
> >>> +                dprintk(XENLOG_WARNING,
> >>> +                        "DomU passthrough config has not page 
> >>> aligned addresses/sizes\n");
> >> I don't think this is wise to continue, the more this is a printk that
> >> can only happen in debug build. So someone using a release build may not
> >> notice the error.
> >>
> >> So I think this wants to be a printk(XENLOG_ERR,...) and also return an
> >> error.
> >>
> >>> +
> >>> +            res = map_regions_p2mt(kinfo->d,
> >>> +                    gaddr_to_gfn(gstart),
> >>> +                    PFN_DOWN(size),
> >>> +                    maddr_to_mfn(mstart),
> >>> +                    p2m_mmio_direct_dev);
> >> Coding style.
> >>
> >>> +            if ( res < 0 )
> >>> +            {
> >>> +                dprintk(XENLOG_ERR,
> >>> +                        "Failed to map %"PRIpaddr" to the guest 
> >>> at%"PRIpaddr"\n",
> >>> +                        mstart, gstart);
> >> Similarly, this wants to be a printk.
> >>
> >>> +                return -EFAULT;
> >>> +            }
> >>> +        }
> >>> +
> >>> +        return 0;
> >>> +    }
> >>> +    /*
> >>> +     * xen,path specifies the corresponding node in the host DT.
> >>> +     * Both interrupt mappings and IOMMU settings are based on it,
> >>> +     * as they are done based on the corresponding host DT node.
> >>> +     */
> >>> +    else if ( dt_prop_cmp("xen,path", name) == 0 )
> >>> +    {
> >>> +        node = dt_find_node_by_path(prop->data);
> >>> +        if ( node == NULL )
> >>> +        {
> >>> +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
> >>> +                    (char *)prop->data);
> >> Same here.
> >>
> >>> +            return -EINVAL;
> >>> +        }
> > 
> > I have to admit that I don't know about dom0less feature enough ...
> > 
> > 
> > But, shouldn't we check if the device is behind the IOMMU and try to add 
> > it (iommu_add_dt_device) before assigning it (this is needed for drivers 
> > which support generic IOMMU DT bindings in the first place).
> > 
> > [please take a look at 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html 
> > if so]
> > 
> > Julien, what do you think?
> 
> Yes you are right.
> 
> @Stefano, this is a recently merged feature. Without it, you will not be 
> able to use passthrough with dom0less guest when the IOMMU (such as 
> IPMMU) is using the generic DT bindings.

Just double-checking but it should be only a matter of the following,
right?

+        res = iommu_add_dt_device(node);
+        if ( res < 0 )
+            return res;
+
+        if ( dt_device_is_protected(node) )
+        {
+            res = iommu_assign_dt_device(kinfo->d, node);
+            if ( res < 0 )
+                return res;
+        }
+

(I am asking because I couldn't quite test it due to the error with
mmu-masters I mentioned in the other email.)
Oleksandr Tyshchenko Sept. 30, 2019, 9:34 a.m. UTC | #6
On 28.09.19 02:28, Stefano Stabellini wrote:

Hi Stefano

> On Fri, 27 Sep 2019, Julien Grall wrote:
>> Hi,
>>
>> On 27/09/2019 15:40, Oleksandr wrote:
>>> On 26.09.19 00:12, Julien Grall wrote:
>>>> On 25/09/2019 19:49, Stefano Stabellini wrote:
>>>>> Scan the user provided dtb fragment at boot. For each device node, map
>>>>> memory to guests, and route interrupts and setup the iommu.
>>>>>
>>>>> The memory region to remap is specified by the "xen,reg" property.
>>>>>
>>>>> The iommu is setup by passing the node of the device to assign on the
>>>>> host device tree. The path is specified in the device tree fragment as
>>>>> the "xen,path" string property.
>>>>>
>>>>> The interrupts are remapped based on the information from the
>>>>> corresponding node on the host device tree. Call
>>>>> handle_device_interrupts to remap interrupts. Interrupts related device
>>>>> tree properties are copied from the device tree fragment, same as all
>>>>> the other properties.
>>>>>
>>>>> Also add the new flag XEN_DOMCTL_CDF_iommu to that dom0less domU can use
>>>>> the IOMMU.
>>>>>
>>>>> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>>>>> ---
>>>>> Changes in v5:
>>>>> - use local variable for name
>>>>> - use map_regions_p2mt
>>>>> - add warning for not page aligned addresses/sizes
>>>>> - introduce handle_passthrough_prop
>>>>>
>>>>> Changes in v4:
>>>>> - use unsigned
>>>>> - improve commit message
>>>>> - code style
>>>>> - use dt_prop_cmp
>>>>> - use device_tree_get_reg
>>>>> - don't copy over xen,reg and xen,path
>>>>> - don't create special interrupt properties for domU: copy them from the
>>>>>      fragment
>>>>> - in-code comment
>>>>>
>>>>> Changes in v3:
>>>>> - improve commit message
>>>>> - remove superfluous cast
>>>>> - merge code with the copy code
>>>>> - add interrup-parent
>>>>> - demove depth > 2 check
>>>>> - reuse code from handle_device_interrupts
>>>>> - copy interrupts from host dt
>>>>>
>>>>> Changes in v2:
>>>>> - rename "path" to "xen,path"
>>>>> - grammar fix
>>>>> - use gaddr_to_gfn and maddr_to_mfn
>>>>> - remove depth <= 2 limitation in scanning the dtb fragment
>>>>> - introduce and parse xen,reg
>>>>> - code style
>>>>> - support more than one interrupt per device
>>>>> - specify only the GIC is supported
>>>>> ---
>>>>>     xen/arch/arm/domain_build.c | 101
>>>>> ++++++++++++++++++++++++++++++++++--
>>>>>     1 file changed, 97 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>> index 9d985d3bbe..414893bc24 100644
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -1701,6 +1701,85 @@ static int __init make_vpl011_uart_node(struct
>>>>> kernel_info *kinfo)
>>>>>     }
>>>>>     #endif
>>>>> +/*
>>>>> + * Scan device tree properties for passthrough specific information.
>>>>> + * Returns -ENOENT when no passthrough properties are found
>>>>> + *         < 0 on error
>>>>> + *         0 on success
>>>>> + */
>>>>> +static int __init handle_passthrough_prop(struct kernel_info *kinfo,
>>>>> +                                          const struct fdt_property
>>>>> *prop,
>>>>> +                                          const char *name,
>>>>> +                                          uint32_t address_cells,
>>>>> uint32_t size_cells)
>>>>> +{
>>>>> +    const __be32 *cell;
>>>>> +    unsigned int i, len;
>>>>> +    struct dt_device_node *node;
>>>>> +    int res;
>>>>> +
>>>>> +    /* xen,reg specifies where to map the MMIO region */
>>>>> +    if ( dt_prop_cmp("xen,reg", name) == 0 )
>>>>> +    {
>>>>> +        paddr_t mstart, size, gstart;
>>>>> +        cell = (const __be32 *)prop->data;
>>>>> +        len = fdt32_to_cpu(prop->len) /
>>>>> +            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
>>>>> +
>>>>> +        for ( i = 0; i < len; i++ )
>>>>> +        {
>>>>> +            device_tree_get_reg(&cell, address_cells, size_cells,
>>>>> +                    &mstart, &size);
>>>>> +            gstart = dt_next_cell(address_cells, &cell);
>>>>> +
>>>>> +            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size
>>>>> & ~PAGE_MASK )
>>>>> +                dprintk(XENLOG_WARNING,
>>>>> +                        "DomU passthrough config has not page
>>>>> aligned addresses/sizes\n");
>>>> I don't think this is wise to continue, the more this is a printk that
>>>> can only happen in debug build. So someone using a release build may not
>>>> notice the error.
>>>>
>>>> So I think this wants to be a printk(XENLOG_ERR,...) and also return an
>>>> error.
>>>>
>>>>> +
>>>>> +            res = map_regions_p2mt(kinfo->d,
>>>>> +                    gaddr_to_gfn(gstart),
>>>>> +                    PFN_DOWN(size),
>>>>> +                    maddr_to_mfn(mstart),
>>>>> +                    p2m_mmio_direct_dev);
>>>> Coding style.
>>>>
>>>>> +            if ( res < 0 )
>>>>> +            {
>>>>> +                dprintk(XENLOG_ERR,
>>>>> +                        "Failed to map %"PRIpaddr" to the guest
>>>>> at%"PRIpaddr"\n",
>>>>> +                        mstart, gstart);
>>>> Similarly, this wants to be a printk.
>>>>
>>>>> +                return -EFAULT;
>>>>> +            }
>>>>> +        }
>>>>> +
>>>>> +        return 0;
>>>>> +    }
>>>>> +    /*
>>>>> +     * xen,path specifies the corresponding node in the host DT.
>>>>> +     * Both interrupt mappings and IOMMU settings are based on it,
>>>>> +     * as they are done based on the corresponding host DT node.
>>>>> +     */
>>>>> +    else if ( dt_prop_cmp("xen,path", name) == 0 )
>>>>> +    {
>>>>> +        node = dt_find_node_by_path(prop->data);
>>>>> +        if ( node == NULL )
>>>>> +        {
>>>>> +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
>>>>> +                    (char *)prop->data);
>>>> Same here.
>>>>
>>>>> +            return -EINVAL;
>>>>> +        }
>>> I have to admit that I don't know about dom0less feature enough ...
>>>
>>>
>>> But, shouldn't we check if the device is behind the IOMMU and try to add
>>> it (iommu_add_dt_device) before assigning it (this is needed for drivers
>>> which support generic IOMMU DT bindings in the first place).
>>>
>>> [please take a look at
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html
>>> if so]
>>>
>>> Julien, what do you think?
>> Yes you are right.
>>
>> @Stefano, this is a recently merged feature. Without it, you will not be
>> able to use passthrough with dom0less guest when the IOMMU (such as
>> IPMMU) is using the generic DT bindings.
> Just double-checking but it should be only a matter of the following,
> right?
>
> +        res = iommu_add_dt_device(node);
> +        if ( res < 0 )
> +            return res;

I think, the change above is correct.


> +
> +        if ( dt_device_is_protected(node) )
> +        {
> +            res = iommu_assign_dt_device(kinfo->d, node);
> +            if ( res < 0 )
> +                return res;
> +        }
> +
>
> (I am asking because I couldn't quite test it due to the error with
> mmu-masters I mentioned in the other email.)
Regarding the check "if (dt_device_is_protected(node))" here. I think, 
it depends on the "xen,path" purpose.

1. If "xen,path" property is, let say, close to "dtdev" property in 
domain config file, where we describe master devices which are behind 
the IOMMU, so *must* be protected, then that check should be removed. 
Please see iommu_do_dt_domctl().

2. If "xen,path" property can also be used to describe devices which are 
not behind the IOMMU (so don't need to be protected), but just for the 
"interrupt mappings" purposes, then that check is correct and should remain.
Julien Grall Sept. 30, 2019, 11:05 a.m. UTC | #7
Hi Oleksandr,

On 30/09/2019 10:34, Oleksandr wrote:
> On 28.09.19 02:28, Stefano Stabellini wrote:
>>>> I have to admit that I don't know about dom0less feature enough ...
>>>>
>>>>
>>>> But, shouldn't we check if the device is behind the IOMMU and try to add
>>>> it (iommu_add_dt_device) before assigning it (this is needed for drivers
>>>> which support generic IOMMU DT bindings in the first place).
>>>>
>>>> [please take a look at
>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html
>>>> if so]
>>>>
>>>> Julien, what do you think?
>>> Yes you are right.
>>>
>>> @Stefano, this is a recently merged feature. Without it, you will not be
>>> able to use passthrough with dom0less guest when the IOMMU (such as
>>> IPMMU) is using the generic DT bindings.
>> Just double-checking but it should be only a matter of the following,
>> right?
>>
>> +        res = iommu_add_dt_device(node);
>> +        if ( res < 0 )
>> +            return res;
> 
> I think, the change above is correct.
> 
> 
>> +
>> +        if ( dt_device_is_protected(node) )
>> +        {
>> +            res = iommu_assign_dt_device(kinfo->d, node);
>> +            if ( res < 0 )
>> +                return res;
>> +        }
>> +
>>
>> (I am asking because I couldn't quite test it due to the error with
>> mmu-masters I mentioned in the other email.)
> Regarding the check "if (dt_device_is_protected(node))" here. I think, it 
> depends on the "xen,path" purpose.
> 
> 1. If "xen,path" property is, let say, close to "dtdev" property in domain 
> config file, where we describe master devices which are behind the IOMMU, so 
> *must* be protected, then that check should be removed. Please see 
> iommu_do_dt_domctl().
> 
> 2. If "xen,path" property can also be used to describe devices which are not 
> behind the IOMMU (so don't need to be protected), but just for the "interrupt 
> mappings" purposes, then that check is correct and should remain.

Some device may not be behind an IOMMU but still do DMA. We are not doing a 
favor to the user to continue the assignment as this could lead to at best to a 
non-working device (at worst a security issue).

Therefore I am against the solution 2).

However, this raises some questions why MMIOs are treated differently (i.e they 
don't need an IOMMU).

In the current setup, you would not be able to passthrough a non DMA-capable to 
a guest if they needs interrupts (e.g. an UART) but you would be if they don't 
use interrupts.

So I think we need a couple of more changes:
    1) Introduce an option to allow the user to ignore IOMMU issues (something 
like "xen,force-assign-without-iommu").
    2) "xen,reg" cannot be specified without "xen,path". This allows us to 
police the user DT.

Any opinions?

Cheers,
Stefano Stabellini Sept. 30, 2019, 11:24 p.m. UTC | #8
On Mon, 30 Sep 2019, Julien Grall wrote:
> Hi Oleksandr,
> 
> On 30/09/2019 10:34, Oleksandr wrote:
> > On 28.09.19 02:28, Stefano Stabellini wrote:
> > > > > I have to admit that I don't know about dom0less feature enough ...
> > > > > 
> > > > > 
> > > > > But, shouldn't we check if the device is behind the IOMMU and try to
> > > > > add
> > > > > it (iommu_add_dt_device) before assigning it (this is needed for
> > > > > drivers
> > > > > which support generic IOMMU DT bindings in the first place).
> > > > > 
> > > > > [please take a look at
> > > > > https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html
> > > > > if so]
> > > > > 
> > > > > Julien, what do you think?
> > > > Yes you are right.
> > > > 
> > > > @Stefano, this is a recently merged feature. Without it, you will not be
> > > > able to use passthrough with dom0less guest when the IOMMU (such as
> > > > IPMMU) is using the generic DT bindings.
> > > Just double-checking but it should be only a matter of the following,
> > > right?
> > > 
> > > +        res = iommu_add_dt_device(node);
> > > +        if ( res < 0 )
> > > +            return res;
> > 
> > I think, the change above is correct.
> > 
> > 
> > > +
> > > +        if ( dt_device_is_protected(node) )
> > > +        {
> > > +            res = iommu_assign_dt_device(kinfo->d, node);
> > > +            if ( res < 0 )
> > > +                return res;
> > > +        }
> > > +
> > > 
> > > (I am asking because I couldn't quite test it due to the error with
> > > mmu-masters I mentioned in the other email.)
> > Regarding the check "if (dt_device_is_protected(node))" here. I think, it
> > depends on the "xen,path" purpose.
> > 
> > 1. If "xen,path" property is, let say, close to "dtdev" property in domain
> > config file, where we describe master devices which are behind the IOMMU, so
> > *must* be protected, then that check should be removed. Please see
> > iommu_do_dt_domctl().
> > 
> > 2. If "xen,path" property can also be used to describe devices which are not
> > behind the IOMMU (so don't need to be protected), but just for the
> > "interrupt mappings" purposes, then that check is correct and should remain.
> 
> Some device may not be behind an IOMMU but still do DMA. We are not doing a
> favor to the user to continue the assignment as this could lead to at best to
> a non-working device (at worst a security issue).
> 
> Therefore I am against the solution 2).

I agree. (And honestly, "xen,path" was introduced as an equivalent of
"dtdev" initially.)


> However, this raises some questions why MMIOs are treated differently (i.e
> they don't need an IOMMU).
> 
> In the current setup, you would not be able to passthrough a non DMA-capable
> to a guest if they needs interrupts (e.g. an UART) but you would be if they
> don't use interrupts.
> 
> So I think we need a couple of more changes:
>    1) Introduce an option to allow the user to ignore IOMMU issues (something
> like "xen,force-assign-without-iommu").
>    2) "xen,reg" cannot be specified without "xen,path". This allows us to
> police the user DT.

Interesting questions.

Something like "xen,force-assign-without-iommu" would be useful. The
upside of being able to assign a non-IOMMU-protected non-DMA-capable
device outweighs the downsides.

I am less sure about having to specify "xen,reg" together with
"xen,path". It is fairly common to have a control register MMIO region
page in FPGA that doesn't do any DMA and has no related interrupts. In
those cases, it is nice to be able to handle it by just having one
"xen,reg" property. But maybe if the user also passes
"xen,force-assign-without-iommu" then we could also ignore a missing
"xen,path".

In any case, my preference would be to keep the series as is for now,
and make these changes later. However, for the sake of moving things
forward quickly, I also implemented Julien's suggestions. So I'll send
two v7 updates to this series:

- v7a: the minimal changes version, without things discussed here except
       for removing the "if (dt_device_is_protected(node))" check
- v7b: a version with all the changes discussed here

Julien, I'll let you pick your favorite, hopefully one of them will be
to your liking.
Julien Grall Oct. 1, 2019, 9:30 a.m. UTC | #9
Hi Stefano,

On 01/10/2019 00:24, Stefano Stabellini wrote:
> On Mon, 30 Sep 2019, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 30/09/2019 10:34, Oleksandr wrote:
>>> On 28.09.19 02:28, Stefano Stabellini wrote:
>>>>>> I have to admit that I don't know about dom0less feature enough ...
>>>>>>
>>>>>>
>>>>>> But, shouldn't we check if the device is behind the IOMMU and try to
>>>>>> add
>>>>>> it (iommu_add_dt_device) before assigning it (this is needed for
>>>>>> drivers
>>>>>> which support generic IOMMU DT bindings in the first place).
>>>>>>
>>>>>> [please take a look at
>>>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html
>>>>>> if so]
>>>>>>
>>>>>> Julien, what do you think?
>>>>> Yes you are right.
>>>>>
>>>>> @Stefano, this is a recently merged feature. Without it, you will not be
>>>>> able to use passthrough with dom0less guest when the IOMMU (such as
>>>>> IPMMU) is using the generic DT bindings.
>>>> Just double-checking but it should be only a matter of the following,
>>>> right?
>>>>
>>>> +        res = iommu_add_dt_device(node);
>>>> +        if ( res < 0 )
>>>> +            return res;
>>>
>>> I think, the change above is correct.
>>>
>>>
>>>> +
>>>> +        if ( dt_device_is_protected(node) )
>>>> +        {
>>>> +            res = iommu_assign_dt_device(kinfo->d, node);
>>>> +            if ( res < 0 )
>>>> +                return res;
>>>> +        }
>>>> +
>>>>
>>>> (I am asking because I couldn't quite test it due to the error with
>>>> mmu-masters I mentioned in the other email.)
>>> Regarding the check "if (dt_device_is_protected(node))" here. I think, it
>>> depends on the "xen,path" purpose.
>>>
>>> 1. If "xen,path" property is, let say, close to "dtdev" property in domain
>>> config file, where we describe master devices which are behind the IOMMU, so
>>> *must* be protected, then that check should be removed. Please see
>>> iommu_do_dt_domctl().
>>>
>>> 2. If "xen,path" property can also be used to describe devices which are not
>>> behind the IOMMU (so don't need to be protected), but just for the
>>> "interrupt mappings" purposes, then that check is correct and should remain.
>>
>> Some device may not be behind an IOMMU but still do DMA. We are not doing a
>> favor to the user to continue the assignment as this could lead to at best to
>> a non-working device (at worst a security issue).
>>
>> Therefore I am against the solution 2).
> 
> I agree. (And honestly, "xen,path" was introduced as an equivalent of
> "dtdev" initially.)
> 
> 
>> However, this raises some questions why MMIOs are treated differently (i.e
>> they don't need an IOMMU).
>>
>> In the current setup, you would not be able to passthrough a non DMA-capable
>> to a guest if they needs interrupts (e.g. an UART) but you would be if they
>> don't use interrupts.
>>
>> So I think we need a couple of more changes:
>>     1) Introduce an option to allow the user to ignore IOMMU issues (something
>> like "xen,force-assign-without-iommu").
>>     2) "xen,reg" cannot be specified without "xen,path". This allows us to
>> police the user DT.
> 
> Interesting questions.
> 
> Something like "xen,force-assign-without-iommu" would be useful. The
> upside of being able to assign a non-IOMMU-protected non-DMA-capable
> device outweighs the downsides.
> 
> I am less sure about having to specify "xen,reg" together with
> "xen,path". It is fairly common to have a control register MMIO region
> page in FPGA that doesn't do any DMA and has no related interrupts. In
> those cases, it is nice to be able to handle it by just having one
> "xen,reg" property. But maybe if the user also passes
> "xen,force-assign-without-iommu" then we could also ignore a missing
> "xen,path".
> 
> In any case, my preference would be to keep the series as is for now,
> and make these changes later. 

Bindings are meant to be stable, so we would end up to have to create a new 
bindings to cater the solution discussed here. So I would rather avoid to take 
that approach.

> However, for the sake of moving things
> forward quickly, I also implemented Julien's suggestions. So I'll send
> two v7 updates to this series:
> 
> - v7a: the minimal changes version, without things discussed here except
>         for removing the "if (dt_device_is_protected(node))" check
> - v7b: a version with all the changes discussed here
> 
> Julien, I'll let you pick your favorite, hopefully one of them will be
> to your liking.

Thank you for suggesting the two versions. I will have a look at them.

Cheers,
Stefano Stabellini Oct. 1, 2019, 5:41 p.m. UTC | #10
On Tue, 1 Oct 2019, Julien Grall wrote:
> Hi Stefano,
> 
> On 01/10/2019 00:24, Stefano Stabellini wrote:
> > On Mon, 30 Sep 2019, Julien Grall wrote:
> > > Hi Oleksandr,
> > > 
> > > On 30/09/2019 10:34, Oleksandr wrote:
> > > > On 28.09.19 02:28, Stefano Stabellini wrote:
> > > > > > > I have to admit that I don't know about dom0less feature enough
> > > > > > > ...
> > > > > > > 
> > > > > > > 
> > > > > > > But, shouldn't we check if the device is behind the IOMMU and try
> > > > > > > to
> > > > > > > add
> > > > > > > it (iommu_add_dt_device) before assigning it (this is needed for
> > > > > > > drivers
> > > > > > > which support generic IOMMU DT bindings in the first place).
> > > > > > > 
> > > > > > > [please take a look at
> > > > > > > https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02714.html
> > > > > > > if so]
> > > > > > > 
> > > > > > > Julien, what do you think?
> > > > > > Yes you are right.
> > > > > > 
> > > > > > @Stefano, this is a recently merged feature. Without it, you will
> > > > > > not be
> > > > > > able to use passthrough with dom0less guest when the IOMMU (such as
> > > > > > IPMMU) is using the generic DT bindings.
> > > > > Just double-checking but it should be only a matter of the following,
> > > > > right?
> > > > > 
> > > > > +        res = iommu_add_dt_device(node);
> > > > > +        if ( res < 0 )
> > > > > +            return res;
> > > > 
> > > > I think, the change above is correct.
> > > > 
> > > > 
> > > > > +
> > > > > +        if ( dt_device_is_protected(node) )
> > > > > +        {
> > > > > +            res = iommu_assign_dt_device(kinfo->d, node);
> > > > > +            if ( res < 0 )
> > > > > +                return res;
> > > > > +        }
> > > > > +
> > > > > 
> > > > > (I am asking because I couldn't quite test it due to the error with
> > > > > mmu-masters I mentioned in the other email.)
> > > > Regarding the check "if (dt_device_is_protected(node))" here. I think,
> > > > it
> > > > depends on the "xen,path" purpose.
> > > > 
> > > > 1. If "xen,path" property is, let say, close to "dtdev" property in
> > > > domain
> > > > config file, where we describe master devices which are behind the
> > > > IOMMU, so
> > > > *must* be protected, then that check should be removed. Please see
> > > > iommu_do_dt_domctl().
> > > > 
> > > > 2. If "xen,path" property can also be used to describe devices which are
> > > > not
> > > > behind the IOMMU (so don't need to be protected), but just for the
> > > > "interrupt mappings" purposes, then that check is correct and should
> > > > remain.
> > > 
> > > Some device may not be behind an IOMMU but still do DMA. We are not doing
> > > a
> > > favor to the user to continue the assignment as this could lead to at best
> > > to
> > > a non-working device (at worst a security issue).
> > > 
> > > Therefore I am against the solution 2).
> > 
> > I agree. (And honestly, "xen,path" was introduced as an equivalent of
> > "dtdev" initially.)
> > 
> > 
> > > However, this raises some questions why MMIOs are treated differently (i.e
> > > they don't need an IOMMU).
> > > 
> > > In the current setup, you would not be able to passthrough a non
> > > DMA-capable
> > > to a guest if they needs interrupts (e.g. an UART) but you would be if
> > > they
> > > don't use interrupts.
> > > 
> > > So I think we need a couple of more changes:
> > >     1) Introduce an option to allow the user to ignore IOMMU issues
> > > (something
> > > like "xen,force-assign-without-iommu").
> > >     2) "xen,reg" cannot be specified without "xen,path". This allows us to
> > > police the user DT.
> > 
> > Interesting questions.
> > 
> > Something like "xen,force-assign-without-iommu" would be useful. The
> > upside of being able to assign a non-IOMMU-protected non-DMA-capable
> > device outweighs the downsides.
> > 
> > I am less sure about having to specify "xen,reg" together with
> > "xen,path". It is fairly common to have a control register MMIO region
> > page in FPGA that doesn't do any DMA and has no related interrupts. In
> > those cases, it is nice to be able to handle it by just having one
> > "xen,reg" property. But maybe if the user also passes
> > "xen,force-assign-without-iommu" then we could also ignore a missing
> > "xen,path".
> > 
> > In any case, my preference would be to keep the series as is for now,
> > and make these changes later. 
> 
> Bindings are meant to be stable, so we would end up to have to create a new
> bindings to cater the solution discussed here. So I would rather avoid to take
> that approach.

Adding a note here from our discussion on IRC. One idea would be to keep
the code as is (v7a) but to make sure the docs reflect that xen,reg and
xen,path are both required. That would be good. However, the docs
already imply it so I didn't actually need to make any changes in that
respect in v7a. In any case, I could certainly add a statement or two to
the docs if it helps.


> > However, for the sake of moving things
> > forward quickly, I also implemented Julien's suggestions. So I'll send
> > two v7 updates to this series:
> > 
> > - v7a: the minimal changes version, without things discussed here except
> >         for removing the "if (dt_device_is_protected(node))" check
> > - v7b: a version with all the changes discussed here
> > 
> > Julien, I'll let you pick your favorite, hopefully one of them will be
> > to your liking.
> 
> Thank you for suggesting the two versions. I will have a look at them.
diff mbox series

Patch

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9d985d3bbe..414893bc24 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1701,6 +1701,85 @@  static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
 }
 #endif
 
+/*
+ * Scan device tree properties for passthrough specific information.
+ * Returns -ENOENT when no passthrough properties are found
+ *         < 0 on error
+ *         0 on success
+ */
+static int __init handle_passthrough_prop(struct kernel_info *kinfo,
+                                          const struct fdt_property *prop,
+                                          const char *name,
+                                          uint32_t address_cells, uint32_t size_cells)
+{
+    const __be32 *cell;
+    unsigned int i, len;
+    struct dt_device_node *node;
+    int res;
+
+    /* xen,reg specifies where to map the MMIO region */
+    if ( dt_prop_cmp("xen,reg", name) == 0 )
+    {
+        paddr_t mstart, size, gstart;
+        cell = (const __be32 *)prop->data;
+        len = fdt32_to_cpu(prop->len) /
+            ((address_cells * 2 + size_cells) * sizeof(uint32_t));
+
+        for ( i = 0; i < len; i++ )
+        {
+            device_tree_get_reg(&cell, address_cells, size_cells,
+                    &mstart, &size);
+            gstart = dt_next_cell(address_cells, &cell);
+
+            if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & ~PAGE_MASK )
+                dprintk(XENLOG_WARNING,
+                        "DomU passthrough config has not page aligned addresses/sizes\n");
+
+            res = map_regions_p2mt(kinfo->d,
+                    gaddr_to_gfn(gstart),
+                    PFN_DOWN(size),
+                    maddr_to_mfn(mstart),
+                    p2m_mmio_direct_dev);
+            if ( res < 0 )
+            {
+                dprintk(XENLOG_ERR,
+                        "Failed to map %"PRIpaddr" to the guest at%"PRIpaddr"\n",
+                        mstart, gstart);
+                return -EFAULT;
+            }
+        }
+
+        return 0;
+    }
+    /*
+     * xen,path specifies the corresponding node in the host DT.
+     * Both interrupt mappings and IOMMU settings are based on it,
+     * as they are done based on the corresponding host DT node.
+     */
+    else if ( dt_prop_cmp("xen,path", name) == 0 )
+    {
+        node = dt_find_node_by_path(prop->data);
+        if ( node == NULL )
+        {
+            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
+                    (char *)prop->data);
+            return -EINVAL;
+        }
+
+        res = iommu_assign_dt_device(kinfo->d, node);
+        if ( res < 0 )
+            return res;
+
+        res = handle_device_interrupts(kinfo->d, node, true);
+        if ( res < 0 )
+            return res;
+
+        return 0;
+    }
+
+    return -ENOENT;
+}
+
 static int __init handle_prop_pfdt(struct kernel_info *kinfo,
                                    const void *pfdt, int nodeoff,
                                    uint32_t address_cells, uint32_t size_cells,
@@ -1709,6 +1788,7 @@  static int __init handle_prop_pfdt(struct kernel_info *kinfo,
     void *fdt = kinfo->fdt;
     int propoff, nameoff, res;
     const struct fdt_property *prop;
+    const char *name;
 
     for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
           propoff >= 0;
@@ -1717,11 +1797,23 @@  static int __init handle_prop_pfdt(struct kernel_info *kinfo,
         if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
             return -FDT_ERR_INTERNAL;
 
+        res = 0;
         nameoff = fdt32_to_cpu(prop->nameoff);
-        res = fdt_property(fdt, fdt_string(pfdt, nameoff),
-                           prop->data, fdt32_to_cpu(prop->len));
-        if ( res )
+        name = fdt_string(pfdt, nameoff);
+
+        if ( scan_passthrough_prop )
+            res = handle_passthrough_prop(kinfo, prop, name,
+                                          address_cells, size_cells);
+        if ( res < 0 && res != -ENOENT )
             return res;
+
+        /* copy all other properties */
+        if ( !scan_passthrough_prop || res == -ENOENT )
+        {
+            res = fdt_property(fdt, name, prop->data, fdt32_to_cpu(prop->len));
+            if ( res )
+                return res;
+        }
     }
 
     /* FDT_ERR_NOTFOUND => There is no more properties for this node */
@@ -2254,7 +2346,8 @@  void __init create_domUs(void)
         struct xen_domctl_createdomain d_cfg = {
             .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
             .arch.nr_spis = 0,
-            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
+            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
+                     XEN_DOMCTL_CDF_iommu,
             .max_evtchn_port = -1,
             .max_grant_frames = 64,
             .max_maptrack_frames = 1024,