diff mbox series

[03/12] xen/arm: introduce 1:1 mapping for domUs

Message ID 20200415010255.10081-3-sstabellini@kernel.org (mailing list archive)
State New, archived
Headers show
Series [01/12] xen: introduce xen_dom_flags | expand

Commit Message

Stefano Stabellini April 15, 2020, 1:02 a.m. UTC
In some cases it is desirable to map domU memory 1:1 (guest physical ==
physical.) For instance, because we want to assign a device to the domU
but the IOMMU is not present or cannot be used. In these cases, other
mechanisms should be used for DMA protection, e.g. a MPU.

This patch introduces a new device tree option for dom0less guests to
request a domain to be directly mapped. It also specifies the memory
ranges. This patch documents the new attribute and parses it at boot
time. (However, the implementation of 1:1 mapping is missing and just
BUG() out at the moment.)  Finally the patch sets the new direct_map
flag for DomU domains.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 docs/misc/arm/device-tree/booting.txt | 13 +++++++
 docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
 xen/arch/arm/domain_build.c           | 52 +++++++++++++++++++++++++--
 3 files changed, 98 insertions(+), 2 deletions(-)
 create mode 100644 docs/misc/arm/passthrough-noiommu.txt

Comments

Julien Grall April 15, 2020, 1:36 p.m. UTC | #1
On 15/04/2020 02:02, Stefano Stabellini wrote:
> In some cases it is desirable to map domU memory 1:1 (guest physical ==
> physical.) For instance, because we want to assign a device to the domU
> but the IOMMU is not present or cannot be used. In these cases, other
> mechanisms should be used for DMA protection, e.g. a MPU.

I am not against this, however the documentation should clearly explain 
that you are making your platform insecure unless you have other mean 
for DMA protection.

> 
> This patch introduces a new device tree option for dom0less guests to
> request a domain to be directly mapped. It also specifies the memory
> ranges. This patch documents the new attribute and parses it at boot
> time. (However, the implementation of 1:1 mapping is missing and just
> BUG() out at the moment.)  Finally the patch sets the new direct_map
> flag for DomU domains.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> ---
>   docs/misc/arm/device-tree/booting.txt | 13 +++++++
>   docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
>   xen/arch/arm/domain_build.c           | 52 +++++++++++++++++++++++++--
>   3 files changed, 98 insertions(+), 2 deletions(-)
>   create mode 100644 docs/misc/arm/passthrough-noiommu.txt
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
> index 5243bc7fd3..fce5f7ed5a 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -159,6 +159,19 @@ with the following properties:
>       used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
>       greater.
>   
> +- direct-map
> +
> +    Optional. An array of integer pairs specifying addresses and sizes.
> +    direct_map requests the memory of the domain to be 1:1 mapped with
> +    the memory ranges specified as argument. Only sizes that are a
> +    power of two number of pages are allowed.
> +
> +- #direct-map-addr-cells and #direct-map-size-cells
> +
> +    The number of cells to use for the addresses and for the sizes in
> +    direct-map. Default and maximum are 2 cells for both addresses and
> +    sizes.
> +

As this is going to be mostly used for passthrough, can't we take 
advantage of the partial device-tree and describe the memory region 
using memory node?

>   - #address-cells and #size-cells
>   
>       Both #address-cells and #size-cells need to be specified because
> diff --git a/docs/misc/arm/passthrough-noiommu.txt b/docs/misc/arm/passthrough-noiommu.txt
> new file mode 100644
> index 0000000000..d2bfaf26c3
> --- /dev/null
> +++ b/docs/misc/arm/passthrough-noiommu.txt
> @@ -0,0 +1,35 @@
> +Request Device Assignment without IOMMU support
> +===============================================
> +
> +Add xen,force-assign-without-iommu; to the device tree snippet
> +
> +    ethernet: ethernet@ff0e0000 {
> +        compatible = "cdns,zynqmp-gem";
> +        xen,path = "/amba/ethernet@ff0e0000";
> +        xen,reg = <0x0 0xff0e0000 0x1000 0x0 0xff0e0000>;
> +        xen,force-assign-without-iommu;
> +
> +Optionally, if none of the domains require an IOMMU, then it could be
> +disabled (not recommended). For instance by adding status = "disabled";
> +under the smmu node:
> +
> +    smmu@fd800000 {
> +        compatible = "arm,mmu-500";
> +        status = "disabled";

I am not sure why this section is added in this patch. Furthermore, the 
file is named "noiommu" but here you mention the IOMMU.

> +
> +
> +Request 1:1 memory mapping for the dom0-less domain
> +===================================================
> +
> +Add a direct-map property under the appropriate /chosen/domU node with
> +the memory ranges you want to assign to your domain. If you are using
> +imagebuilder, you can add to boot.source something like the following:
> +
> +    fdt set /chosen/domU0 direct-map <0x0 0x10000000 0x0 0x10000000 0x0 0x60000000 0x0 0x10000000>
> +
> +Which will assign the ranges:
> +
> +    0x10000000 - 0x20000000
> +    0x60000000 - 0x70000000
> +
> +to the first dom0less domU.
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 2ec7453aa3..a2bb411568 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2435,7 +2435,51 @@ static int __init construct_domU(struct domain *d,
>       /* type must be set before allocate memory */
>       d->arch.type = kinfo.type;
>   #endif
> -    allocate_memory(d, &kinfo);
> +
> +    if ( !is_domain_direct_mapped(d) )
> +        allocate_memory(d, &kinfo);
> +    else
> +    {
> +        struct membank banks[NR_MEM_BANKS];
> +        const struct dt_property *prop;
> +        u32 direct_map_len, direct_map_addr_len = 2, direct_map_size_len = 2;
> +        unsigned int i;
> +        __be32 *p;
> +
> +        prop = dt_find_property(node, "direct-map", &direct_map_len);
> +        BUG_ON(!prop);
> +
> +        dt_property_read_u32(node,
> +                             "#direct-map-addr-cells",
> +                             &direct_map_addr_len);
> +        dt_property_read_u32(node,
> +                             "#direct-map-size-cells",
> +                             &direct_map_size_len);
> +        BUG_ON(direct_map_size_len > 2 || direct_map_addr_len > 2);
> +
> +        for ( i = 0, p = prop->value;
> +              i < direct_map_len /
> +                  (4 * (direct_map_addr_len + direct_map_size_len));
> +              i++)
> +        {
> +            int j;
> +
> +            banks[i].start = 0;
> +            for ( j = 0; j < direct_map_addr_len; j++, p++ )
> +                banks[i].start |= __be32_to_cpu(*p) << (32*j);

Please use dt_read_number();

> +
> +            banks[i].size = 0;
> +            for ( j = 0; j < direct_map_size_len; j++, p++ )
> +                banks[i].size |= __be32_to_cpu(*p) << (32*j);
> +

Same here.

> +            printk(XENLOG_DEBUG
> +                   "direct_map start=%#"PRIpaddr" size=%#"PRIpaddr"\n",
> +                   banks[i].start, banks[i].size);
> +        }
> +
> +        /* reserve_memory_11(d, &kinfo, &banks[0], i); */
> +        BUG();

This can be avoided by re-ordering the patches and folding #6 in this patch.

> +    }
>   
>       rc = prepare_dtb_domU(d, &kinfo);
>       if ( rc < 0 )
> @@ -2455,6 +2499,7 @@ void __init create_domUs(void)
>   {
>       struct dt_device_node *node;
>       const struct dt_device_node *chosen = dt_find_node_by_path("/chosen");
> +    u32 direct_map = 0;
>   
>       BUG_ON(chosen == NULL);
>       dt_for_each_child_node(chosen, node)
> @@ -2476,8 +2521,11 @@ void __init create_domUs(void)
>               panic("Missing property 'cpus' for domain %s\n",`
>                     dt_node_name(node));
>   
> -        if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
> +        dt_find_property(node, "direct-map", &direct_map);

Please use dt_property_read_bool().

> +        if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") &&
> +             !direct_map )
>               d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> +        flags.arch.is_direct_map = direct_map != 0;
>   
>           if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
>           {
> 

Cheers,
Stefano Stabellini May 1, 2020, 1:26 a.m. UTC | #2
On Wed, 15 Apr 2020, Julien Grall wrote:
> On 15/04/2020 02:02, Stefano Stabellini wrote:
> > In some cases it is desirable to map domU memory 1:1 (guest physical ==
> > physical.) For instance, because we want to assign a device to the domU
> > but the IOMMU is not present or cannot be used. In these cases, other
> > mechanisms should be used for DMA protection, e.g. a MPU.
> 
> I am not against this, however the documentation should clearly explain that
> you are making your platform insecure unless you have other mean for DMA
> protection.

I'll expand the docs


> > 
> > This patch introduces a new device tree option for dom0less guests to
> > request a domain to be directly mapped. It also specifies the memory
> > ranges. This patch documents the new attribute and parses it at boot
> > time. (However, the implementation of 1:1 mapping is missing and just
> > BUG() out at the moment.)  Finally the patch sets the new direct_map
> > flag for DomU domains.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > ---
> >   docs/misc/arm/device-tree/booting.txt | 13 +++++++
> >   docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
> >   xen/arch/arm/domain_build.c           | 52 +++++++++++++++++++++++++--
> >   3 files changed, 98 insertions(+), 2 deletions(-)
> >   create mode 100644 docs/misc/arm/passthrough-noiommu.txt
> > 
> > diff --git a/docs/misc/arm/device-tree/booting.txt
> > b/docs/misc/arm/device-tree/booting.txt
> > index 5243bc7fd3..fce5f7ed5a 100644
> > --- a/docs/misc/arm/device-tree/booting.txt
> > +++ b/docs/misc/arm/device-tree/booting.txt
> > @@ -159,6 +159,19 @@ with the following properties:
> >       used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
> >       greater.
> >   +- direct-map
> > +
> > +    Optional. An array of integer pairs specifying addresses and sizes.
> > +    direct_map requests the memory of the domain to be 1:1 mapped with
> > +    the memory ranges specified as argument. Only sizes that are a
> > +    power of two number of pages are allowed.
> > +
> > +- #direct-map-addr-cells and #direct-map-size-cells
> > +
> > +    The number of cells to use for the addresses and for the sizes in
> > +    direct-map. Default and maximum are 2 cells for both addresses and
> > +    sizes.
> > +
> 
> As this is going to be mostly used for passthrough, can't we take advantage of
> the partial device-tree and describe the memory region using memory node?

With the system device tree bindings that are under discussion the role
of the partial device tree might be reduce going forward, and might even
go away in the long term. For this reason, I would prefer not to add
more things to the partial device tree.


> >   - #address-cells and #size-cells
> >         Both #address-cells and #size-cells need to be specified because
> > diff --git a/docs/misc/arm/passthrough-noiommu.txt
> > b/docs/misc/arm/passthrough-noiommu.txt
> > new file mode 100644
> > index 0000000000..d2bfaf26c3
> > --- /dev/null
> > +++ b/docs/misc/arm/passthrough-noiommu.txt
> > @@ -0,0 +1,35 @@
> > +Request Device Assignment without IOMMU support
> > +===============================================
> > +
> > +Add xen,force-assign-without-iommu; to the device tree snippet
> > +
> > +    ethernet: ethernet@ff0e0000 {
> > +        compatible = "cdns,zynqmp-gem";
> > +        xen,path = "/amba/ethernet@ff0e0000";
> > +        xen,reg = <0x0 0xff0e0000 0x1000 0x0 0xff0e0000>;
> > +        xen,force-assign-without-iommu;
> > +
> > +Optionally, if none of the domains require an IOMMU, then it could be
> > +disabled (not recommended). For instance by adding status = "disabled";
> > +under the smmu node:
> > +
> > +    smmu@fd800000 {
> > +        compatible = "arm,mmu-500";
> > +        status = "disabled";
> 
> I am not sure why this section is added in this patch. Furthermore, the file
> is named "noiommu" but here you mention the IOMMU.

I took the habit of writing user and testing docs at the time of writing
the patch series. I'll move this doc to the end of the the series. Also,
the words here are inaccurate, I'll improve it in the next version.


I have addressed all other comments.
Julien Grall May 1, 2020, 8:30 a.m. UTC | #3
On 01/05/2020 02:26, Stefano Stabellini wrote:
> On Wed, 15 Apr 2020, Julien Grall wrote:
>> On 15/04/2020 02:02, Stefano Stabellini wrote:
>>> In some cases it is desirable to map domU memory 1:1 (guest physical ==
>>> physical.) For instance, because we want to assign a device to the domU
>>> but the IOMMU is not present or cannot be used. In these cases, other
>>> mechanisms should be used for DMA protection, e.g. a MPU.
>>
>> I am not against this, however the documentation should clearly explain that
>> you are making your platform insecure unless you have other mean for DMA
>> protection.
> 
> I'll expand the docs
> 
> 
>>>
>>> This patch introduces a new device tree option for dom0less guests to
>>> request a domain to be directly mapped. It also specifies the memory
>>> ranges. This patch documents the new attribute and parses it at boot
>>> time. (However, the implementation of 1:1 mapping is missing and just
>>> BUG() out at the moment.)  Finally the patch sets the new direct_map
>>> flag for DomU domains.
>>>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> ---
>>>    docs/misc/arm/device-tree/booting.txt | 13 +++++++
>>>    docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
>>>    xen/arch/arm/domain_build.c           | 52 +++++++++++++++++++++++++--
>>>    3 files changed, 98 insertions(+), 2 deletions(-)
>>>    create mode 100644 docs/misc/arm/passthrough-noiommu.txt
>>>
>>> diff --git a/docs/misc/arm/device-tree/booting.txt
>>> b/docs/misc/arm/device-tree/booting.txt
>>> index 5243bc7fd3..fce5f7ed5a 100644
>>> --- a/docs/misc/arm/device-tree/booting.txt
>>> +++ b/docs/misc/arm/device-tree/booting.txt
>>> @@ -159,6 +159,19 @@ with the following properties:
>>>        used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
>>>        greater.
>>>    +- direct-map
>>> +
>>> +    Optional. An array of integer pairs specifying addresses and sizes.
>>> +    direct_map requests the memory of the domain to be 1:1 mapped with
>>> +    the memory ranges specified as argument. Only sizes that are a
>>> +    power of two number of pages are allowed.
>>> +
>>> +- #direct-map-addr-cells and #direct-map-size-cells
>>> +
>>> +    The number of cells to use for the addresses and for the sizes in
>>> +    direct-map. Default and maximum are 2 cells for both addresses and
>>> +    sizes.
>>> +
>>
>> As this is going to be mostly used for passthrough, can't we take advantage of
>> the partial device-tree and describe the memory region using memory node?
> 
> With the system device tree bindings that are under discussion the role
> of the partial device tree might be reduce going forward, and might even
> go away in the long term. For this reason, I would prefer not to add
> more things to the partial device tree.

Was the interface you suggested approved by the committee behind system 
device tree? If not, we will still have to support your proposal + 
whatever the committee come up with. So I am not entirely sure why using 
the partial device-tree will be an issue.

It is actually better to keep everything in the partial device-tree as 
it would avoid to clash with whatever you come up with the system device 
tree.

Also, I don't think the partial device-tree could ever go away at least 
in Xen. This is an external interface we provide to the user, removing 
it would mean users would not be able to upgrade from Xen 4.x to 4.y 
without any major rewrite of there DT.

Cheers,
Stefano Stabellini May 9, 2020, 12:07 a.m. UTC | #4
On Fri, 1 May 2020, Julien Grall wrote:
> On 01/05/2020 02:26, Stefano Stabellini wrote:
> > On Wed, 15 Apr 2020, Julien Grall wrote:
> > > On 15/04/2020 02:02, Stefano Stabellini wrote:
> > > > In some cases it is desirable to map domU memory 1:1 (guest physical ==
> > > > physical.) For instance, because we want to assign a device to the domU
> > > > but the IOMMU is not present or cannot be used. In these cases, other
> > > > mechanisms should be used for DMA protection, e.g. a MPU.
> > > 
> > > I am not against this, however the documentation should clearly explain
> > > that
> > > you are making your platform insecure unless you have other mean for DMA
> > > protection.
> > 
> > I'll expand the docs
> > 
> > 
> > > > 
> > > > This patch introduces a new device tree option for dom0less guests to
> > > > request a domain to be directly mapped. It also specifies the memory
> > > > ranges. This patch documents the new attribute and parses it at boot
> > > > time. (However, the implementation of 1:1 mapping is missing and just
> > > > BUG() out at the moment.)  Finally the patch sets the new direct_map
> > > > flag for DomU domains.
> > > > 
> > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > > > ---
> > > >    docs/misc/arm/device-tree/booting.txt | 13 +++++++
> > > >    docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
> > > >    xen/arch/arm/domain_build.c           | 52
> > > > +++++++++++++++++++++++++--
> > > >    3 files changed, 98 insertions(+), 2 deletions(-)
> > > >    create mode 100644 docs/misc/arm/passthrough-noiommu.txt
> > > > 
> > > > diff --git a/docs/misc/arm/device-tree/booting.txt
> > > > b/docs/misc/arm/device-tree/booting.txt
> > > > index 5243bc7fd3..fce5f7ed5a 100644
> > > > --- a/docs/misc/arm/device-tree/booting.txt
> > > > +++ b/docs/misc/arm/device-tree/booting.txt
> > > > @@ -159,6 +159,19 @@ with the following properties:
> > > >        used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
> > > >        greater.
> > > >    +- direct-map
> > > > +
> > > > +    Optional. An array of integer pairs specifying addresses and sizes.
> > > > +    direct_map requests the memory of the domain to be 1:1 mapped with
> > > > +    the memory ranges specified as argument. Only sizes that are a
> > > > +    power of two number of pages are allowed.
> > > > +
> > > > +- #direct-map-addr-cells and #direct-map-size-cells
> > > > +
> > > > +    The number of cells to use for the addresses and for the sizes in
> > > > +    direct-map. Default and maximum are 2 cells for both addresses and
> > > > +    sizes.
> > > > +
> > > 
> > > As this is going to be mostly used for passthrough, can't we take
> > > advantage of
> > > the partial device-tree and describe the memory region using memory node?
> > 
> > With the system device tree bindings that are under discussion the role
> > of the partial device tree might be reduce going forward, and might even
> > go away in the long term. For this reason, I would prefer not to add
> > more things to the partial device tree.
> 
> Was the interface you suggested approved by the committee behind system device
> tree? If not, we will still have to support your proposal + whatever the
> committee come up with. So I am not entirely sure why using the partial
> device-tree will be an issue.

Not yet


> It is actually better to keep everything in the partial device-tree as it
> would avoid to clash with whatever you come up with the system device tree.

I don't think we want to support both in the long term. The closer we
are to it the better for transitioning.


> Also, I don't think the partial device-tree could ever go away at least in
> Xen. This is an external interface we provide to the user, removing it would
> mean users would not be able to upgrade from Xen 4.x to 4.y without any major
> rewrite of there DT.

I don't want to put the memory ranges inside the multiboot,device-tree
module because that is clearly for device assignment:

"Device Assignment (Passthrough) is supported by adding another module,
alongside the kernel and ramdisk, with the device tree fragment
corresponding to the device node to assign to the guest."

One could do 1:1 memory mapping without device assignment.


Genuine question: did we write down any compatibility promise on that
interface? If so, do you know where? I'd like to take a look.

In any case backward compatible interfaces can be deprecated although it
takes time. Alternatively it could be made optional (even for device
assignment). So expanding its scope beyond device assignment
configuration it is not a good idea.
Julien Grall May 9, 2020, 9:56 a.m. UTC | #5
Hi,

On 09/05/2020 01:07, Stefano Stabellini wrote:
> On Fri, 1 May 2020, Julien Grall wrote:
>> On 01/05/2020 02:26, Stefano Stabellini wrote:
>>> On Wed, 15 Apr 2020, Julien Grall wrote:
>>>> On 15/04/2020 02:02, Stefano Stabellini wrote:
>>>>> In some cases it is desirable to map domU memory 1:1 (guest physical ==
>>>>> physical.) For instance, because we want to assign a device to the domU
>>>>> but the IOMMU is not present or cannot be used. In these cases, other
>>>>> mechanisms should be used for DMA protection, e.g. a MPU.
>>>>
>>>> I am not against this, however the documentation should clearly explain
>>>> that
>>>> you are making your platform insecure unless you have other mean for DMA
>>>> protection.
>>>
>>> I'll expand the docs
>>>
>>>
>>>>>
>>>>> This patch introduces a new device tree option for dom0less guests to
>>>>> request a domain to be directly mapped. It also specifies the memory
>>>>> ranges. This patch documents the new attribute and parses it at boot
>>>>> time. (However, the implementation of 1:1 mapping is missing and just
>>>>> BUG() out at the moment.)  Finally the patch sets the new direct_map
>>>>> flag for DomU domains.
>>>>>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> ---
>>>>>     docs/misc/arm/device-tree/booting.txt | 13 +++++++
>>>>>     docs/misc/arm/passthrough-noiommu.txt | 35 ++++++++++++++++++
>>>>>     xen/arch/arm/domain_build.c           | 52
>>>>> +++++++++++++++++++++++++--
>>>>>     3 files changed, 98 insertions(+), 2 deletions(-)
>>>>>     create mode 100644 docs/misc/arm/passthrough-noiommu.txt
>>>>>
>>>>> diff --git a/docs/misc/arm/device-tree/booting.txt
>>>>> b/docs/misc/arm/device-tree/booting.txt
>>>>> index 5243bc7fd3..fce5f7ed5a 100644
>>>>> --- a/docs/misc/arm/device-tree/booting.txt
>>>>> +++ b/docs/misc/arm/device-tree/booting.txt
>>>>> @@ -159,6 +159,19 @@ with the following properties:
>>>>>         used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
>>>>>         greater.
>>>>>     +- direct-map
>>>>> +
>>>>> +    Optional. An array of integer pairs specifying addresses and sizes.
>>>>> +    direct_map requests the memory of the domain to be 1:1 mapped with
>>>>> +    the memory ranges specified as argument. Only sizes that are a
>>>>> +    power of two number of pages are allowed.
>>>>> +
>>>>> +- #direct-map-addr-cells and #direct-map-size-cells
>>>>> +
>>>>> +    The number of cells to use for the addresses and for the sizes in
>>>>> +    direct-map. Default and maximum are 2 cells for both addresses and
>>>>> +    sizes.
>>>>> +
>>>>
>>>> As this is going to be mostly used for passthrough, can't we take
>>>> advantage of
>>>> the partial device-tree and describe the memory region using memory node?
>>>
>>> With the system device tree bindings that are under discussion the role
>>> of the partial device tree might be reduce going forward, and might even
>>> go away in the long term. For this reason, I would prefer not to add
>>> more things to the partial device tree.
>>
>> Was the interface you suggested approved by the committee behind system device
>> tree? If not, we will still have to support your proposal + whatever the
>> committee come up with. So I am not entirely sure why using the partial
>> device-tree will be an issue.
> 
> Not yet

This answer...

> 
> 
>> It is actually better to keep everything in the partial device-tree as it
>> would avoid to clash with whatever you come up with the system device tree.
> 
> I don't think we want to support both in the long term. The closer we
> are to it the better for transitioning.

... raises the question how your solution is going to be closer? Do you 
mind providing more details on the system device-tree?

> 
> 
>> Also, I don't think the partial device-tree could ever go away at least in
>> Xen. This is an external interface we provide to the user, removing it would
>> mean users would not be able to upgrade from Xen 4.x to 4.y without any major
>> rewrite of there DT.
> 
> I don't want to put the memory ranges inside the multiboot,device-tree
> module because that is clearly for device assignment:
> 
> "Device Assignment (Passthrough) is supported by adding another module,
> alongside the kernel and ramdisk, with the device tree fragment
> corresponding to the device node to assign to the guest."

Thanks you for copying the documentation here... As you know, when the 
partial device-tree was designed, it was only focused on device 
assignment. However, I don't see how this prevents us to extend it to 
more use cases.

Describing the RAM regions in the partial device-tree means you have a 
single place where you can understand the memory layout of your guest.

You have also much more flexibility for describing your guests over the 
/chosen node and avoid to clash with the rest of the host device-tree.

> 
> One could do 1:1 memory mapping without device assignment.
 >
> Genuine question: did we write down any compatibility promise on that
> interface? If so, do you know where? I'd like to take a look.

Nothing written in Xen, however a Device-Tree bindings are meant to be 
stable.

This would be a pretty bad user experience if you had to rewrite your 
device-tree when upgrading Xen 4.14 to Xen 5.x. This also means 
roll-back would be more difficult as there are more components dependency.

> In any case backward compatible interfaces can be deprecated although it
> takes time. Alternatively it could be made optional (even for device
> assignment). So expanding its scope beyond device assignment
> configuration it is not a good idea.

What would be the replacement? I still haven't seen any light of the 
so-called system device-tree.

At the moment, I can better picture how this can work with the partial 
device-tree. One of the advantage is you could describe your guest 
layout in one place and then re-use it for booting a guest from the 
toolstack (not implemented yet) or the hypervisor.

I could change my mind if it turns out to be genuinely more complicated 
to implement in Xen and/or you provide more details on how this is going 
to work out with the system device-tree.

Cheers,
diff mbox series

Patch

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 5243bc7fd3..fce5f7ed5a 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -159,6 +159,19 @@  with the following properties:
     used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
     greater.
 
+- direct-map
+
+    Optional. An array of integer pairs specifying addresses and sizes.
+    direct_map requests the memory of the domain to be 1:1 mapped with
+    the memory ranges specified as argument. Only sizes that are a
+    power of two number of pages are allowed.
+
+- #direct-map-addr-cells and #direct-map-size-cells
+
+    The number of cells to use for the addresses and for the sizes in
+    direct-map. Default and maximum are 2 cells for both addresses and
+    sizes.
+
 - #address-cells and #size-cells
 
     Both #address-cells and #size-cells need to be specified because
diff --git a/docs/misc/arm/passthrough-noiommu.txt b/docs/misc/arm/passthrough-noiommu.txt
new file mode 100644
index 0000000000..d2bfaf26c3
--- /dev/null
+++ b/docs/misc/arm/passthrough-noiommu.txt
@@ -0,0 +1,35 @@ 
+Request Device Assignment without IOMMU support
+===============================================
+
+Add xen,force-assign-without-iommu; to the device tree snippet
+
+    ethernet: ethernet@ff0e0000 {
+        compatible = "cdns,zynqmp-gem";
+        xen,path = "/amba/ethernet@ff0e0000";
+        xen,reg = <0x0 0xff0e0000 0x1000 0x0 0xff0e0000>;
+        xen,force-assign-without-iommu;
+
+Optionally, if none of the domains require an IOMMU, then it could be
+disabled (not recommended). For instance by adding status = "disabled";
+under the smmu node:
+
+    smmu@fd800000 {
+        compatible = "arm,mmu-500";
+        status = "disabled";
+
+
+Request 1:1 memory mapping for the dom0-less domain
+===================================================
+
+Add a direct-map property under the appropriate /chosen/domU node with
+the memory ranges you want to assign to your domain. If you are using
+imagebuilder, you can add to boot.source something like the following:
+
+    fdt set /chosen/domU0 direct-map <0x0 0x10000000 0x0 0x10000000 0x0 0x60000000 0x0 0x10000000>
+
+Which will assign the ranges:
+
+    0x10000000 - 0x20000000
+    0x60000000 - 0x70000000
+
+to the first dom0less domU.
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 2ec7453aa3..a2bb411568 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2435,7 +2435,51 @@  static int __init construct_domU(struct domain *d,
     /* type must be set before allocate memory */
     d->arch.type = kinfo.type;
 #endif
-    allocate_memory(d, &kinfo);
+
+    if ( !is_domain_direct_mapped(d) )
+        allocate_memory(d, &kinfo);
+    else
+    {
+        struct membank banks[NR_MEM_BANKS];
+        const struct dt_property *prop;
+        u32 direct_map_len, direct_map_addr_len = 2, direct_map_size_len = 2;
+        unsigned int i;
+        __be32 *p;
+
+        prop = dt_find_property(node, "direct-map", &direct_map_len);
+        BUG_ON(!prop);
+
+        dt_property_read_u32(node,
+                             "#direct-map-addr-cells",
+                             &direct_map_addr_len);
+        dt_property_read_u32(node,
+                             "#direct-map-size-cells",
+                             &direct_map_size_len);
+        BUG_ON(direct_map_size_len > 2 || direct_map_addr_len > 2);
+
+        for ( i = 0, p = prop->value;
+              i < direct_map_len /
+                  (4 * (direct_map_addr_len + direct_map_size_len));
+              i++)
+        {
+            int j;
+
+            banks[i].start = 0;
+            for ( j = 0; j < direct_map_addr_len; j++, p++ )
+                banks[i].start |= __be32_to_cpu(*p) << (32*j);
+
+            banks[i].size = 0;
+            for ( j = 0; j < direct_map_size_len; j++, p++ )
+                banks[i].size |= __be32_to_cpu(*p) << (32*j);
+
+            printk(XENLOG_DEBUG
+                   "direct_map start=%#"PRIpaddr" size=%#"PRIpaddr"\n",
+                   banks[i].start, banks[i].size);
+        }
+
+        /* reserve_memory_11(d, &kinfo, &banks[0], i); */
+        BUG();
+    }
 
     rc = prepare_dtb_domU(d, &kinfo);
     if ( rc < 0 )
@@ -2455,6 +2499,7 @@  void __init create_domUs(void)
 {
     struct dt_device_node *node;
     const struct dt_device_node *chosen = dt_find_node_by_path("/chosen");
+    u32 direct_map = 0;
 
     BUG_ON(chosen == NULL);
     dt_for_each_child_node(chosen, node)
@@ -2476,8 +2521,11 @@  void __init create_domUs(void)
             panic("Missing property 'cpus' for domain %s\n",
                   dt_node_name(node));
 
-        if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
+        dt_find_property(node, "direct-map", &direct_map);
+        if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") &&
+             !direct_map )
             d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
+        flags.arch.is_direct_map = direct_map != 0;
 
         if ( !dt_property_read_u32(node, "nr_spis", &d_cfg.arch.nr_spis) )
         {