Message ID | 1651947548-4055-6-git-send-email-olekstysh@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | virtio: Solution to restrict memory access under Xen using xen-grant DMA-mapping layer | expand |
On Sat, 7 May 2022, Oleksandr Tyshchenko wrote: > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > > Introduce Xen specific binding for the virtualized device (e.g. virtio) > to be used by Xen grant DMA-mapping layer in the subsequent commit. > > This binding indicates that Xen grant mappings scheme needs to be > enabled for the device which DT node contains that property and specifies > the ID of Xen domain where the corresponding backend resides. The ID > (domid) is used as an argument to the grant mapping APIs. > > This is needed for the option to restrict memory access using Xen grant > mappings to work which primary goal is to enable using virtio devices > in Xen guests. > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> The binding is OK and the wording is OK too. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> I am not an expert on the details of writing a good schema, I'll defer to Rob if he has any comments on that. > --- > Changes RFC -> V1: > - update commit subject/description and text in description > - move to devicetree/bindings/arm/ > > Changes V1 -> V2: > - update text in description > - change the maintainer of the binding > - fix validation issue > - reference xen,dev-domid.yaml schema from virtio/mmio.yaml > --- > .../devicetree/bindings/arm/xen,dev-domid.yaml | 37 ++++++++++++++++++++++ > Documentation/devicetree/bindings/virtio/mmio.yaml | 7 ++++ > 2 files changed, 44 insertions(+) > create mode 100644 Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > > diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > new file mode 100644 > index 00000000..750e89e > --- /dev/null > +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > @@ -0,0 +1,37 @@ > +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Xen specific binding for virtualized devices (e.g. virtio) > + > +maintainers: > + - Stefano Stabellini <sstabellini@kernel.org> > + > +select: true > + > +description: > + This binding indicates that Xen grant mappings need to be enabled for > + the device, and it specifies the ID of the domain where the corresponding > + device (backend) resides. The property is required to restrict memory > + access using Xen grant mappings. > + > +properties: > + xen,dev-domid: > + $ref: /schemas/types.yaml#/definitions/uint32 > + description: > + The domid (domain ID) of the domain where the device (backend) is running. > + > +additionalProperties: true > + > +examples: > + - | > + virtio@3000 { > + compatible = "virtio,mmio"; > + reg = <0x3000 0x100>; > + interrupts = <41>; > + > + /* The device is located in Xen domain with ID 1 */ > + xen,dev-domid = <1>; > + }; > diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml > index 10c22b5..29a0932 100644 > --- a/Documentation/devicetree/bindings/virtio/mmio.yaml > +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml > @@ -13,6 +13,9 @@ description: > See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for > more details. > > +allOf: > + - $ref: /schemas/arm/xen,dev-domid.yaml# > + > properties: > compatible: > const: virtio,mmio > @@ -33,6 +36,10 @@ properties: > description: Required for devices making accesses thru an IOMMU. > maxItems: 1 > > + xen,dev-domid: > + description: Required when Xen grant mappings need to be enabled for device. > + $ref: /schemas/types.yaml#/definitions/uint32 > + > required: > - compatible > - reg > -- > 2.7.4 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On Sat, May 07, 2022 at 09:19:06PM +0300, Oleksandr Tyshchenko wrote: > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > > Introduce Xen specific binding for the virtualized device (e.g. virtio) > to be used by Xen grant DMA-mapping layer in the subsequent commit. > > This binding indicates that Xen grant mappings scheme needs to be > enabled for the device which DT node contains that property and specifies > the ID of Xen domain where the corresponding backend resides. The ID > (domid) is used as an argument to the grant mapping APIs. > > This is needed for the option to restrict memory access using Xen grant > mappings to work which primary goal is to enable using virtio devices > in Xen guests. > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > --- > Changes RFC -> V1: > - update commit subject/description and text in description > - move to devicetree/bindings/arm/ > > Changes V1 -> V2: > - update text in description > - change the maintainer of the binding > - fix validation issue > - reference xen,dev-domid.yaml schema from virtio/mmio.yaml > --- > .../devicetree/bindings/arm/xen,dev-domid.yaml | 37 ++++++++++++++++++++++ > Documentation/devicetree/bindings/virtio/mmio.yaml | 7 ++++ > 2 files changed, 44 insertions(+) > create mode 100644 Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > > diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > new file mode 100644 > index 00000000..750e89e > --- /dev/null > +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml > @@ -0,0 +1,37 @@ > +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Xen specific binding for virtualized devices (e.g. virtio) > + > +maintainers: > + - Stefano Stabellini <sstabellini@kernel.org> > + > +select: true Omit. No need to apply this on every single node. > + > +description: > + This binding indicates that Xen grant mappings need to be enabled for > + the device, and it specifies the ID of the domain where the corresponding > + device (backend) resides. The property is required to restrict memory > + access using Xen grant mappings. > + > +properties: > + xen,dev-domid: I kind of think 'dev' is redundant. Is there another kind of domid possible? Maybe xen,backend-domid or just xen,domid? I don't know Xen too well, so ultimately up to you all. > + $ref: /schemas/types.yaml#/definitions/uint32 > + description: > + The domid (domain ID) of the domain where the device (backend) is running. > + > +additionalProperties: true > + > +examples: > + - | > + virtio@3000 { > + compatible = "virtio,mmio"; > + reg = <0x3000 0x100>; > + interrupts = <41>; > + > + /* The device is located in Xen domain with ID 1 */ > + xen,dev-domid = <1>; > + }; > diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml > index 10c22b5..29a0932 100644 > --- a/Documentation/devicetree/bindings/virtio/mmio.yaml > +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml > @@ -13,6 +13,9 @@ description: > See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for > more details. > > +allOf: > + - $ref: /schemas/arm/xen,dev-domid.yaml# > + > properties: > compatible: > const: virtio,mmio > @@ -33,6 +36,10 @@ properties: > description: Required for devices making accesses thru an IOMMU. > maxItems: 1 > > + xen,dev-domid: > + description: Required when Xen grant mappings need to be enabled for device. > + $ref: /schemas/types.yaml#/definitions/uint32 No need to define the type again nor describe it again. Instead, just change additionalProperties to unevaluateProperties in this doc. The diff is the latter takes $ref's into account. > + > required: > - compatible > - reg > -- > 2.7.4 > >
On 17.05.22 03:27, Rob Herring wrote: Hello Rob, all > On Sat, May 07, 2022 at 09:19:06PM +0300, Oleksandr Tyshchenko wrote: >> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> >> Introduce Xen specific binding for the virtualized device (e.g. virtio) >> to be used by Xen grant DMA-mapping layer in the subsequent commit. >> >> This binding indicates that Xen grant mappings scheme needs to be >> enabled for the device which DT node contains that property and specifies >> the ID of Xen domain where the corresponding backend resides. The ID >> (domid) is used as an argument to the grant mapping APIs. >> >> This is needed for the option to restrict memory access using Xen grant >> mappings to work which primary goal is to enable using virtio devices >> in Xen guests. >> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> --- >> Changes RFC -> V1: >> - update commit subject/description and text in description >> - move to devicetree/bindings/arm/ >> >> Changes V1 -> V2: >> - update text in description >> - change the maintainer of the binding >> - fix validation issue >> - reference xen,dev-domid.yaml schema from virtio/mmio.yaml >> --- >> .../devicetree/bindings/arm/xen,dev-domid.yaml | 37 ++++++++++++++++++++++ >> Documentation/devicetree/bindings/virtio/mmio.yaml | 7 ++++ >> 2 files changed, 44 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/arm/xen,dev-domid.yaml >> >> diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml >> new file mode 100644 >> index 00000000..750e89e >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml >> @@ -0,0 +1,37 @@ >> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause) >> +%YAML 1.2 >> +--- >> +$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml# >> +$schema: http://devicetree.org/meta-schemas/core.yaml# >> + >> +title: Xen specific binding for virtualized devices (e.g. virtio) >> + >> +maintainers: >> + - Stefano Stabellini <sstabellini@kernel.org> >> + >> +select: true > Omit. No need to apply this on every single node. ok, will do > >> + >> +description: >> + This binding indicates that Xen grant mappings need to be enabled for >> + the device, and it specifies the ID of the domain where the corresponding >> + device (backend) resides. The property is required to restrict memory >> + access using Xen grant mappings. >> + >> +properties: >> + xen,dev-domid: > I kind of think 'dev' is redundant. Is there another kind of domid > possible? In general, yes. It is driver(frontend) domid. But, at least for now, I don't see why we will need an additional property for that. > Maybe xen,backend-domid or just xen,domid? I don't know Xen > too well, so ultimately up to you all. xen,domid sounds ambiguous to me. xen,backend-domid sounds perfectly fine to me, I even think it fits better. Stefano, Juergen, would you be happy with new xen,backend-domid name? If yes, Stefano could you please clarify, would you be OK if I retained your R-b tags (for all patches in current series which touch that property) after doing such renaming? > >> + $ref: /schemas/types.yaml#/definitions/uint32 >> + description: >> + The domid (domain ID) of the domain where the device (backend) is running. >> + >> +additionalProperties: true >> + >> +examples: >> + - | >> + virtio@3000 { >> + compatible = "virtio,mmio"; >> + reg = <0x3000 0x100>; >> + interrupts = <41>; >> + >> + /* The device is located in Xen domain with ID 1 */ >> + xen,dev-domid = <1>; >> + }; >> diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml >> index 10c22b5..29a0932 100644 >> --- a/Documentation/devicetree/bindings/virtio/mmio.yaml >> +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml >> @@ -13,6 +13,9 @@ description: >> See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for >> more details. >> >> +allOf: >> + - $ref: /schemas/arm/xen,dev-domid.yaml# >> + >> properties: >> compatible: >> const: virtio,mmio >> @@ -33,6 +36,10 @@ properties: >> description: Required for devices making accesses thru an IOMMU. >> maxItems: 1 >> >> + xen,dev-domid: >> + description: Required when Xen grant mappings need to be enabled for device. >> + $ref: /schemas/types.yaml#/definitions/uint32 > No need to define the type again nor describe it again. > > Instead, just change additionalProperties to unevaluateProperties in > this doc. The diff is the latter takes $ref's into account. ok, will do. Could you please clarify, shall I use? unevaluatedProperties: false or unevaluatedProperties: type: object I am not too familiar with this stuff. Both variants seem to pass validation. Thank you. > >> + >> required: >> - compatible >> - reg >> -- >> 2.7.4 >> >>
On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: > > diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml > index 10c22b5..29a0932 100644 > --- a/Documentation/devicetree/bindings/virtio/mmio.yaml > +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml > @@ -13,6 +13,9 @@ description: > See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for > more details. > > +allOf: > + - $ref: /schemas/arm/xen,dev-domid.yaml# > + > properties: > compatible: > const: virtio,mmio > @@ -33,6 +36,10 @@ properties: > description: Required for devices making accesses thru an IOMMU. > maxItems: 1 > > + xen,dev-domid: > + description: Required when Xen grant mappings need to be enabled for device. > + $ref: /schemas/types.yaml#/definitions/uint32 > + > required: > - compatible > - reg Sorry for joining the discussion late. Have you considered using the generic iommu binding here instead of a custom property? This would mean having a device node for the grant-table mechanism that can be referred to using the 'iommus' phandle property, with the domid as an additional argument. It does not quite fit the model that Linux currently uses for iommus, as that has an allocator for dma_addr_t space, but it would think it's conceptually close enough that it makes sense for the binding. Arnd
On 18.05.22 17:32, Arnd Bergmann wrote: Hello Arnd > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: >> diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml >> index 10c22b5..29a0932 100644 >> --- a/Documentation/devicetree/bindings/virtio/mmio.yaml >> +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml >> @@ -13,6 +13,9 @@ description: >> See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for >> more details. >> >> +allOf: >> + - $ref: /schemas/arm/xen,dev-domid.yaml# >> + >> properties: >> compatible: >> const: virtio,mmio >> @@ -33,6 +36,10 @@ properties: >> description: Required for devices making accesses thru an IOMMU. >> maxItems: 1 >> >> + xen,dev-domid: >> + description: Required when Xen grant mappings need to be enabled for device. >> + $ref: /schemas/types.yaml#/definitions/uint32 >> + >> required: >> - compatible >> - reg > Sorry for joining the discussion late. Have you considered using the > generic iommu > binding here instead of a custom property? I have to admit - no, I haven't. I was thinking that Xen specific feature should be communicated using Xen specific DT property. > This would mean having a device > node for the grant-table mechanism that can be referred to using the 'iommus' > phandle property, with the domid as an additional argument. I assume, you are speaking about something like the following? xen_dummy_iommu { compatible = "xen,dummy-iommu"; #iommu-cells = <1>; }; virtio@3000 { compatible = "virtio,mmio"; reg = <0x3000 0x100>; interrupts = <41>; /* The device is located in Xen domain with ID 1 */ iommus = <&xen_dummy_iommu 1>; }; > > It does not quite fit the model that Linux currently uses for iommus, > as that has an allocator for dma_addr_t space yes (# 3/7 adds grant-table based allocator) > , but it would think it's > conceptually close enough that it makes sense for the binding. Interesting idea. I am wondering, do we need an extra actions for this to work in Linux guest (dummy IOMMU driver, etc)? > > Arnd
On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: > On 18.05.22 17:32, Arnd Bergmann wrote: > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: > > > This would mean having a device > > node for the grant-table mechanism that can be referred to using the 'iommus' > > phandle property, with the domid as an additional argument. > > I assume, you are speaking about something like the following? > > > xen_dummy_iommu { > compatible = "xen,dummy-iommu"; > #iommu-cells = <1>; > }; > > virtio@3000 { > compatible = "virtio,mmio"; > reg = <0x3000 0x100>; > interrupts = <41>; > > /* The device is located in Xen domain with ID 1 */ > iommus = <&xen_dummy_iommu 1>; > }; Right, that's that's the idea, except I would not call it a 'dummy'. From the perspective of the DT, this behaves just like an IOMMU, even if the exact mechanism is different from most hardware IOMMU implementations. > > It does not quite fit the model that Linux currently uses for iommus, > > as that has an allocator for dma_addr_t space > > yes (# 3/7 adds grant-table based allocator) > > > > , but it would think it's > > conceptually close enough that it makes sense for the binding. > > Interesting idea. I am wondering, do we need an extra actions for this > to work in Linux guest (dummy IOMMU driver, etc)? It depends on how closely the guest implementation can be made to resemble a normal iommu. If you do allocate dma_addr_t addresses, it may actually be close enough that you can just turn the grant-table code into a normal iommu driver and change nothing else. Arnd
On Wed, May 18, 2022 at 03:32:27PM +0100, Arnd Bergmann wrote: > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: > > > > diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml > > index 10c22b5..29a0932 100644 > > --- a/Documentation/devicetree/bindings/virtio/mmio.yaml > > +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml > > @@ -13,6 +13,9 @@ description: > > See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for > > more details. > > > > +allOf: > > + - $ref: /schemas/arm/xen,dev-domid.yaml# > > + > > properties: > > compatible: > > const: virtio,mmio > > @@ -33,6 +36,10 @@ properties: > > description: Required for devices making accesses thru an IOMMU. > > maxItems: 1 > > > > + xen,dev-domid: > > + description: Required when Xen grant mappings need to be enabled for device. > > + $ref: /schemas/types.yaml#/definitions/uint32 > > + > > required: > > - compatible > > - reg > > Sorry for joining the discussion late. Have you considered using the > generic iommu > binding here instead of a custom property? This would mean having a device > node for the grant-table mechanism that can be referred to using the 'iommus' > phandle property, with the domid as an additional argument. > > It does not quite fit the model that Linux currently uses for iommus, > as that has an allocator for dma_addr_t space, but it would think it's > conceptually close enough that it makes sense for the binding. Something common is almost always better. That may also have the issue that fw_devlink will make the 'iommu' driver a dependency to probe. Rob
On 18.05.22 19:39, Arnd Bergmann wrote: Hello Arnd > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: >> On 18.05.22 17:32, Arnd Bergmann wrote: >>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: >>> This would mean having a device >>> node for the grant-table mechanism that can be referred to using the 'iommus' >>> phandle property, with the domid as an additional argument. >> I assume, you are speaking about something like the following? >> >> >> xen_dummy_iommu { >> compatible = "xen,dummy-iommu"; >> #iommu-cells = <1>; >> }; >> >> virtio@3000 { >> compatible = "virtio,mmio"; >> reg = <0x3000 0x100>; >> interrupts = <41>; >> >> /* The device is located in Xen domain with ID 1 */ >> iommus = <&xen_dummy_iommu 1>; >> }; > Right, that's that's the idea, thank you for the confirmation > except I would not call it a 'dummy'. > From the perspective of the DT, this behaves just like an IOMMU, > even if the exact mechanism is different from most hardware IOMMU > implementations. well, agree > >>> It does not quite fit the model that Linux currently uses for iommus, >>> as that has an allocator for dma_addr_t space >> yes (# 3/7 adds grant-table based allocator) >> >> >>> , but it would think it's >>> conceptually close enough that it makes sense for the binding. >> Interesting idea. I am wondering, do we need an extra actions for this >> to work in Linux guest (dummy IOMMU driver, etc)? > It depends on how closely the guest implementation can be made to > resemble a normal iommu. If you do allocate dma_addr_t addresses, > it may actually be close enough that you can just turn the grant-table > code into a normal iommu driver and change nothing else. Unfortunately, I failed to find a way how use grant references at the iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I am not too familiar with that, so what is written below might be wrong or at least not precise. The normal IOMMU driver in Linux doesn’t allocate DMA addresses by itself, it just maps (IOVA-PA) what was requested to be mapped by the upper layer. The DMA address allocation is done by the upper layer (DMA-IOMMU which is the glue layer between DMA API and IOMMU API allocates IOVA for PA?). But, all what we need here is just to allocate our specific grant-table based DMA addresses (DMA address = grant reference + offset in the page), so let’s say we need an entity to take a physical address as parameter and return a DMA address (what actually commit #3/7 is doing), and that’s all. So working at the dma_ops layer we get exactly what we need, with the minimal changes to guest infrastructure. In our case the Xen itself acts as an IOMMU. Assuming that we want to reuse the IOMMU infrastructure somehow for our needs. I think, in that case we will likely need to introduce a new specific IOVA allocator (alongside with a generic one) to be hooked up by the DMA-IOMMU layer if we run on top of Xen. But, even having the specific IOVA allocator to return what we indeed need (DMA address = grant reference + offset in the page) we will still need the specific minimal required IOMMU driver to be present in the system anyway in order to track the mappings(?) and do nothing with them, returning a success (this specific IOMMU driver should have all mandatory callbacks implemented). I completely agree, it would be really nice to reuse generic IOMMU bindings rather than introducing Xen specific property if what we are trying to implement in current patch series fits in the usage of "iommus" in Linux more-less. But, if we will have to add more complexity/more components to the code for the sake of reusing device tree binding, this raises a question whether that’s worthwhile. Or I really missed something? > > Arnd
On 18.05.22 21:59, Rob Herring wrote: Hello Rob, Arnd > On Wed, May 18, 2022 at 03:32:27PM +0100, Arnd Bergmann wrote: >> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote: >>> diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml >>> index 10c22b5..29a0932 100644 >>> --- a/Documentation/devicetree/bindings/virtio/mmio.yaml >>> +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml >>> @@ -13,6 +13,9 @@ description: >>> See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for >>> more details. >>> >>> +allOf: >>> + - $ref: /schemas/arm/xen,dev-domid.yaml# >>> + >>> properties: >>> compatible: >>> const: virtio,mmio >>> @@ -33,6 +36,10 @@ properties: >>> description: Required for devices making accesses thru an IOMMU. >>> maxItems: 1 >>> >>> + xen,dev-domid: >>> + description: Required when Xen grant mappings need to be enabled for device. >>> + $ref: /schemas/types.yaml#/definitions/uint32 >>> + >>> required: >>> - compatible >>> - reg >> Sorry for joining the discussion late. Have you considered using the >> generic iommu >> binding here instead of a custom property? This would mean having a device >> node for the grant-table mechanism that can be referred to using the 'iommus' >> phandle property, with the domid as an additional argument. >> >> It does not quite fit the model that Linux currently uses for iommus, >> as that has an allocator for dma_addr_t space, but it would think it's >> conceptually close enough that it makes sense for the binding. > Something common is almost always better. agree > > That may also have the issue that fw_devlink will make the 'iommu' > driver a dependency to probe. Looks like I ran into it while experimenting. I generated the following nodes in guest DT using Xen toolstack: [snip] xen_dummy_iommu { compatible = "xen,dummy-iommu"; #iommu-cells = <0x01>; phandle = <0xfde9>; }; virtio@2000000 { compatible = "virtio,mmio"; reg = <0x00 0x2000000 0x00 0x200>; interrupts = <0x00 0x01 0xf01>; interrupt-parent = <0xfde8>; dma-coherent; iommus = <0xfde9 0x01>; }; [snip] And got: virtio-mmio 2000000.virtio: deferred probe timeout, ignoring dependency > > Rob
On Thu, 19 May 2022, Oleksandr wrote: > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: > > > On 18.05.22 17:32, Arnd Bergmann wrote: > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko > > > > <olekstysh@gmail.com> wrote: > > > > This would mean having a device > > > > node for the grant-table mechanism that can be referred to using the > > > > 'iommus' > > > > phandle property, with the domid as an additional argument. > > > I assume, you are speaking about something like the following? > > > > > > > > > xen_dummy_iommu { > > > compatible = "xen,dummy-iommu"; > > > #iommu-cells = <1>; > > > }; > > > > > > virtio@3000 { > > > compatible = "virtio,mmio"; > > > reg = <0x3000 0x100>; > > > interrupts = <41>; > > > > > > /* The device is located in Xen domain with ID 1 */ > > > iommus = <&xen_dummy_iommu 1>; > > > }; > > Right, that's that's the idea, > > thank you for the confirmation > > > > > except I would not call it a 'dummy'. > > From the perspective of the DT, this behaves just like an IOMMU, > > even if the exact mechanism is different from most hardware IOMMU > > implementations. > > well, agree > > > > > > > > It does not quite fit the model that Linux currently uses for iommus, > > > > as that has an allocator for dma_addr_t space > > > yes (# 3/7 adds grant-table based allocator) > > > > > > > > > > , but it would think it's > > > > conceptually close enough that it makes sense for the binding. > > > Interesting idea. I am wondering, do we need an extra actions for this > > > to work in Linux guest (dummy IOMMU driver, etc)? > > It depends on how closely the guest implementation can be made to > > resemble a normal iommu. If you do allocate dma_addr_t addresses, > > it may actually be close enough that you can just turn the grant-table > > code into a normal iommu driver and change nothing else. > > Unfortunately, I failed to find a way how use grant references at the > iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I am > not too familiar with that, so what is written below might be wrong or at > least not precise. > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by itself, it > just maps (IOVA-PA) what was requested to be mapped by the upper layer. The > DMA address allocation is done by the upper layer (DMA-IOMMU which is the glue > layer between DMA API and IOMMU API allocates IOVA for PA?). But, all what we > need here is just to allocate our specific grant-table based DMA addresses > (DMA address = grant reference + offset in the page), so let’s say we need an > entity to take a physical address as parameter and return a DMA address (what > actually commit #3/7 is doing), and that’s all. So working at the dma_ops > layer we get exactly what we need, with the minimal changes to guest > infrastructure. In our case the Xen itself acts as an IOMMU. > > Assuming that we want to reuse the IOMMU infrastructure somehow for our needs. > I think, in that case we will likely need to introduce a new specific IOVA > allocator (alongside with a generic one) to be hooked up by the DMA-IOMMU > layer if we run on top of Xen. But, even having the specific IOVA allocator to > return what we indeed need (DMA address = grant reference + offset in the > page) we will still need the specific minimal required IOMMU driver to be > present in the system anyway in order to track the mappings(?) and do nothing > with them, returning a success (this specific IOMMU driver should have all > mandatory callbacks implemented). > > I completely agree, it would be really nice to reuse generic IOMMU bindings > rather than introducing Xen specific property if what we are trying to > implement in current patch series fits in the usage of "iommus" in Linux > more-less. But, if we will have to add more complexity/more components to the > code for the sake of reusing device tree binding, this raises a question > whether that’s worthwhile. > > Or I really missed something? I think Arnd was primarily suggesting to reuse the IOMMU Device Tree bindings, not necessarily the IOMMU drivers framework in Linux (although that would be an added bonus.) I know from previous discussions with you that making the grant table fit in the existing IOMMU drivers model is difficult, but just reusing the Device Tree bindings seems feasible?
On 19.05.22 04:06, Stefano Stabellini wrote: Hello Stefano > On Thu, 19 May 2022, Oleksandr wrote: >>> On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: >>>> On 18.05.22 17:32, Arnd Bergmann wrote: >>>>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko >>>>> <olekstysh@gmail.com> wrote: >>>>> This would mean having a device >>>>> node for the grant-table mechanism that can be referred to using the >>>>> 'iommus' >>>>> phandle property, with the domid as an additional argument. >>>> I assume, you are speaking about something like the following? >>>> >>>> >>>> xen_dummy_iommu { >>>> compatible = "xen,dummy-iommu"; >>>> #iommu-cells = <1>; >>>> }; >>>> >>>> virtio@3000 { >>>> compatible = "virtio,mmio"; >>>> reg = <0x3000 0x100>; >>>> interrupts = <41>; >>>> >>>> /* The device is located in Xen domain with ID 1 */ >>>> iommus = <&xen_dummy_iommu 1>; >>>> }; >>> Right, that's that's the idea, >> thank you for the confirmation >> >> >> >>> except I would not call it a 'dummy'. >>> From the perspective of the DT, this behaves just like an IOMMU, >>> even if the exact mechanism is different from most hardware IOMMU >>> implementations. >> well, agree >> >> >>>>> It does not quite fit the model that Linux currently uses for iommus, >>>>> as that has an allocator for dma_addr_t space >>>> yes (# 3/7 adds grant-table based allocator) >>>> >>>> >>>>> , but it would think it's >>>>> conceptually close enough that it makes sense for the binding. >>>> Interesting idea. I am wondering, do we need an extra actions for this >>>> to work in Linux guest (dummy IOMMU driver, etc)? >>> It depends on how closely the guest implementation can be made to >>> resemble a normal iommu. If you do allocate dma_addr_t addresses, >>> it may actually be close enough that you can just turn the grant-table >>> code into a normal iommu driver and change nothing else. >> Unfortunately, I failed to find a way how use grant references at the >> iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I am >> not too familiar with that, so what is written below might be wrong or at >> least not precise. >> >> The normal IOMMU driver in Linux doesn’t allocate DMA addresses by itself, it >> just maps (IOVA-PA) what was requested to be mapped by the upper layer. The >> DMA address allocation is done by the upper layer (DMA-IOMMU which is the glue >> layer between DMA API and IOMMU API allocates IOVA for PA?). But, all what we >> need here is just to allocate our specific grant-table based DMA addresses >> (DMA address = grant reference + offset in the page), so let’s say we need an >> entity to take a physical address as parameter and return a DMA address (what >> actually commit #3/7 is doing), and that’s all. So working at the dma_ops >> layer we get exactly what we need, with the minimal changes to guest >> infrastructure. In our case the Xen itself acts as an IOMMU. >> >> Assuming that we want to reuse the IOMMU infrastructure somehow for our needs. >> I think, in that case we will likely need to introduce a new specific IOVA >> allocator (alongside with a generic one) to be hooked up by the DMA-IOMMU >> layer if we run on top of Xen. But, even having the specific IOVA allocator to >> return what we indeed need (DMA address = grant reference + offset in the >> page) we will still need the specific minimal required IOMMU driver to be >> present in the system anyway in order to track the mappings(?) and do nothing >> with them, returning a success (this specific IOMMU driver should have all >> mandatory callbacks implemented). >> >> I completely agree, it would be really nice to reuse generic IOMMU bindings >> rather than introducing Xen specific property if what we are trying to >> implement in current patch series fits in the usage of "iommus" in Linux >> more-less. But, if we will have to add more complexity/more components to the >> code for the sake of reusing device tree binding, this raises a question >> whether that’s worthwhile. >> >> Or I really missed something? > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree > bindings, not necessarily the IOMMU drivers framework in Linux (although > that would be an added bonus.) > > I know from previous discussions with you that making the grant table > fit in the existing IOMMU drivers model is difficult, but just reusing > the Device Tree bindings seems feasible? I started experimenting with that. As wrote in a separate email, I got a deferred probe timeout, after inserting required nodes into guest device tree, which seems to be a consequence of the unavailability of IOMMU, I will continue to investigate this question.
On 19.05.22 09:03, Oleksandr wrote: Hello Stefano, all > > On 19.05.22 04:06, Stefano Stabellini wrote: > > > Hello Stefano, all > >> On Thu, 19 May 2022, Oleksandr wrote: >>>> On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: >>>>> On 18.05.22 17:32, Arnd Bergmann wrote: >>>>>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko >>>>>> <olekstysh@gmail.com> wrote: >>>>>> This would mean having a device >>>>>> node for the grant-table mechanism that can be referred to using the >>>>>> 'iommus' >>>>>> phandle property, with the domid as an additional argument. >>>>> I assume, you are speaking about something like the following? >>>>> >>>>> >>>>> xen_dummy_iommu { >>>>> compatible = "xen,dummy-iommu"; >>>>> #iommu-cells = <1>; >>>>> }; >>>>> >>>>> virtio@3000 { >>>>> compatible = "virtio,mmio"; >>>>> reg = <0x3000 0x100>; >>>>> interrupts = <41>; >>>>> >>>>> /* The device is located in Xen domain with ID 1 */ >>>>> iommus = <&xen_dummy_iommu 1>; >>>>> }; >>>> Right, that's that's the idea, >>> thank you for the confirmation >>> >>> >>> >>>> except I would not call it a 'dummy'. >>>> From the perspective of the DT, this behaves just like an IOMMU, >>>> even if the exact mechanism is different from most hardware IOMMU >>>> implementations. >>> well, agree >>> >>> >>>>>> It does not quite fit the model that Linux currently uses for >>>>>> iommus, >>>>>> as that has an allocator for dma_addr_t space >>>>> yes (# 3/7 adds grant-table based allocator) >>>>> >>>>> >>>>>> , but it would think it's >>>>>> conceptually close enough that it makes sense for the binding. >>>>> Interesting idea. I am wondering, do we need an extra actions for >>>>> this >>>>> to work in Linux guest (dummy IOMMU driver, etc)? >>>> It depends on how closely the guest implementation can be made to >>>> resemble a normal iommu. If you do allocate dma_addr_t addresses, >>>> it may actually be close enough that you can just turn the grant-table >>>> code into a normal iommu driver and change nothing else. >>> Unfortunately, I failed to find a way how use grant references at the >>> iommu_ops level (I mean to fully pretend that we are an IOMMU >>> driver). I am >>> not too familiar with that, so what is written below might be wrong >>> or at >>> least not precise. >>> >>> The normal IOMMU driver in Linux doesn’t allocate DMA addresses by >>> itself, it >>> just maps (IOVA-PA) what was requested to be mapped by the upper >>> layer. The >>> DMA address allocation is done by the upper layer (DMA-IOMMU which >>> is the glue >>> layer between DMA API and IOMMU API allocates IOVA for PA?). But, >>> all what we >>> need here is just to allocate our specific grant-table based DMA >>> addresses >>> (DMA address = grant reference + offset in the page), so let’s say >>> we need an >>> entity to take a physical address as parameter and return a DMA >>> address (what >>> actually commit #3/7 is doing), and that’s all. So working at the >>> dma_ops >>> layer we get exactly what we need, with the minimal changes to guest >>> infrastructure. In our case the Xen itself acts as an IOMMU. >>> >>> Assuming that we want to reuse the IOMMU infrastructure somehow for >>> our needs. >>> I think, in that case we will likely need to introduce a new >>> specific IOVA >>> allocator (alongside with a generic one) to be hooked up by the >>> DMA-IOMMU >>> layer if we run on top of Xen. But, even having the specific IOVA >>> allocator to >>> return what we indeed need (DMA address = grant reference + offset >>> in the >>> page) we will still need the specific minimal required IOMMU driver >>> to be >>> present in the system anyway in order to track the mappings(?) and >>> do nothing >>> with them, returning a success (this specific IOMMU driver should >>> have all >>> mandatory callbacks implemented). >>> >>> I completely agree, it would be really nice to reuse generic IOMMU >>> bindings >>> rather than introducing Xen specific property if what we are trying to >>> implement in current patch series fits in the usage of "iommus" in >>> Linux >>> more-less. But, if we will have to add more complexity/more >>> components to the >>> code for the sake of reusing device tree binding, this raises a >>> question >>> whether that’s worthwhile. >>> >>> Or I really missed something? >> I think Arnd was primarily suggesting to reuse the IOMMU Device Tree >> bindings, not necessarily the IOMMU drivers framework in Linux (although >> that would be an added bonus.) >> >> I know from previous discussions with you that making the grant table >> fit in the existing IOMMU drivers model is difficult, but just reusing >> the Device Tree bindings seems feasible? > > I started experimenting with that. As wrote in a separate email, I got > a deferred probe timeout, > > after inserting required nodes into guest device tree, which seems to > be a consequence of the unavailability of IOMMU, I will continue to > investigate this question. I have experimented with that. Yes, just reusing the Device Tree bindings is technically feasible (and we are able to do this by only touching grant-dma-ops.c), although deferred probe timeout still stands (as there is no IOMMU driver being present actually). [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring dependency [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 GB/1.95 GiB) Below the working diff (on top of current series): diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c index da9c7ff..6586152 100644 --- a/drivers/xen/grant-dma-ops.c +++ b/drivers/xen/grant-dma-ops.c @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { bool xen_is_grant_dma_device(struct device *dev) { + struct device_node *iommu_np; + bool has_iommu; + /* XXX Handle only DT devices for now */ if (!dev->of_node) return false; - return of_property_read_bool(dev->of_node, "xen,backend-domid"); + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); + has_iommu = iommu_np && of_device_is_compatible(iommu_np, "xen,grant-dma"); + of_node_put(iommu_np); + + return has_iommu; } void xen_grant_setup_dma_ops(struct device *dev) { struct xen_grant_dma_data *data; - uint32_t domid; + struct of_phandle_args iommu_spec; data = find_xen_grant_dma_data(dev); if (data) { @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) if (!dev->of_node) goto err; - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { - dev_err(dev, "xen,backend-domid property is not present\n"); + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", + 0, &iommu_spec)) { + dev_err(dev, "Cannot parse iommus property\n"); + goto err; + } + + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || + iommu_spec.args_count != 1) { + dev_err(dev, "Incompatible IOMMU node\n"); + of_node_put(iommu_spec.np); goto err; } + of_node_put(iommu_spec.np); + data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); if (!data) goto err; - data->backend_domid = domid; + /* + * The endpoint ID here means the ID of the domain where the corresponding + * backend is running + */ + data->backend_domid = iommu_spec.args[0]; if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, GFP_KERNEL))) { (END) Below, the nodes generated by Xen toolstack: xen_grant_dma { compatible = "xen,grant-dma"; #iommu-cells = <0x01>; phandle = <0xfde9>; }; virtio@2000000 { compatible = "virtio,mmio"; reg = <0x00 0x2000000 0x00 0x200>; interrupts = <0x00 0x01 0xf01>; interrupt-parent = <0xfde8>; dma-coherent; iommus = <0xfde9 0x01>; }; I am wondering, would be the proper solution to eliminate deferred probe timeout issue in our particular case (without introducing an extra IOMMU driver)? > > > >
On Mon, 23 May 2022, Oleksandr wrote: > > > On Thu, 19 May 2022, Oleksandr wrote: > > > > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: > > > > > > On 18.05.22 17:32, Arnd Bergmann wrote: > > > > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko > > > > > > > <olekstysh@gmail.com> wrote: > > > > > > > This would mean having a device > > > > > > > node for the grant-table mechanism that can be referred to using > > > > > > > the > > > > > > > 'iommus' > > > > > > > phandle property, with the domid as an additional argument. > > > > > > I assume, you are speaking about something like the following? > > > > > > > > > > > > > > > > > > xen_dummy_iommu { > > > > > > compatible = "xen,dummy-iommu"; > > > > > > #iommu-cells = <1>; > > > > > > }; > > > > > > > > > > > > virtio@3000 { > > > > > > compatible = "virtio,mmio"; > > > > > > reg = <0x3000 0x100>; > > > > > > interrupts = <41>; > > > > > > > > > > > > /* The device is located in Xen domain with ID 1 */ > > > > > > iommus = <&xen_dummy_iommu 1>; > > > > > > }; > > > > > Right, that's that's the idea, > > > > thank you for the confirmation > > > > > > > > > > > > > > > > > except I would not call it a 'dummy'. > > > > > From the perspective of the DT, this behaves just like an IOMMU, > > > > > even if the exact mechanism is different from most hardware IOMMU > > > > > implementations. > > > > well, agree > > > > > > > > > > > > > > > It does not quite fit the model that Linux currently uses for > > > > > > > iommus, > > > > > > > as that has an allocator for dma_addr_t space > > > > > > yes (# 3/7 adds grant-table based allocator) > > > > > > > > > > > > > > > > > > > , but it would think it's > > > > > > > conceptually close enough that it makes sense for the binding. > > > > > > Interesting idea. I am wondering, do we need an extra actions for > > > > > > this > > > > > > to work in Linux guest (dummy IOMMU driver, etc)? > > > > > It depends on how closely the guest implementation can be made to > > > > > resemble a normal iommu. If you do allocate dma_addr_t addresses, > > > > > it may actually be close enough that you can just turn the grant-table > > > > > code into a normal iommu driver and change nothing else. > > > > Unfortunately, I failed to find a way how use grant references at the > > > > iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I > > > > am > > > > not too familiar with that, so what is written below might be wrong or > > > > at > > > > least not precise. > > > > > > > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by > > > > itself, it > > > > just maps (IOVA-PA) what was requested to be mapped by the upper layer. > > > > The > > > > DMA address allocation is done by the upper layer (DMA-IOMMU which is > > > > the glue > > > > layer between DMA API and IOMMU API allocates IOVA for PA?). But, all > > > > what we > > > > need here is just to allocate our specific grant-table based DMA > > > > addresses > > > > (DMA address = grant reference + offset in the page), so let’s say we > > > > need an > > > > entity to take a physical address as parameter and return a DMA address > > > > (what > > > > actually commit #3/7 is doing), and that’s all. So working at the > > > > dma_ops > > > > layer we get exactly what we need, with the minimal changes to guest > > > > infrastructure. In our case the Xen itself acts as an IOMMU. > > > > > > > > Assuming that we want to reuse the IOMMU infrastructure somehow for our > > > > needs. > > > > I think, in that case we will likely need to introduce a new specific > > > > IOVA > > > > allocator (alongside with a generic one) to be hooked up by the > > > > DMA-IOMMU > > > > layer if we run on top of Xen. But, even having the specific IOVA > > > > allocator to > > > > return what we indeed need (DMA address = grant reference + offset in > > > > the > > > > page) we will still need the specific minimal required IOMMU driver to > > > > be > > > > present in the system anyway in order to track the mappings(?) and do > > > > nothing > > > > with them, returning a success (this specific IOMMU driver should have > > > > all > > > > mandatory callbacks implemented). > > > > > > > > I completely agree, it would be really nice to reuse generic IOMMU > > > > bindings > > > > rather than introducing Xen specific property if what we are trying to > > > > implement in current patch series fits in the usage of "iommus" in Linux > > > > more-less. But, if we will have to add more complexity/more components > > > > to the > > > > code for the sake of reusing device tree binding, this raises a question > > > > whether that’s worthwhile. > > > > > > > > Or I really missed something? > > > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree > > > bindings, not necessarily the IOMMU drivers framework in Linux (although > > > that would be an added bonus.) > > > > > > I know from previous discussions with you that making the grant table > > > fit in the existing IOMMU drivers model is difficult, but just reusing > > > the Device Tree bindings seems feasible? > > > > I started experimenting with that. As wrote in a separate email, I got a > > deferred probe timeout, > > > > after inserting required nodes into guest device tree, which seems to be a > > consequence of the unavailability of IOMMU, I will continue to investigate > > this question. > > > I have experimented with that. Yes, just reusing the Device Tree bindings is > technically feasible (and we are able to do this by only touching > grant-dma-ops.c), although deferred probe timeout still stands (as there is no > IOMMU driver being present actually). > > [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring > dependency > [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 > GB/1.95 GiB) > > > Below the working diff (on top of current series): > > diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c > index da9c7ff..6586152 100644 > --- a/drivers/xen/grant-dma-ops.c > +++ b/drivers/xen/grant-dma-ops.c > @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { > > bool xen_is_grant_dma_device(struct device *dev) > { > + struct device_node *iommu_np; > + bool has_iommu; > + > /* XXX Handle only DT devices for now */ > if (!dev->of_node) > return false; > > - return of_property_read_bool(dev->of_node, "xen,backend-domid"); > + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); > + has_iommu = iommu_np && of_device_is_compatible(iommu_np, > "xen,grant-dma"); > + of_node_put(iommu_np); > + > + return has_iommu; > } > > void xen_grant_setup_dma_ops(struct device *dev) > { > struct xen_grant_dma_data *data; > - uint32_t domid; > + struct of_phandle_args iommu_spec; > > data = find_xen_grant_dma_data(dev); > if (data) { > @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) > if (!dev->of_node) > goto err; > > - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { > - dev_err(dev, "xen,backend-domid property is not present\n"); > + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", > + 0, &iommu_spec)) { > + dev_err(dev, "Cannot parse iommus property\n"); > + goto err; > + } > + > + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || > + iommu_spec.args_count != 1) { > + dev_err(dev, "Incompatible IOMMU node\n"); > + of_node_put(iommu_spec.np); > goto err; > } > > + of_node_put(iommu_spec.np); > + > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > if (!data) > goto err; > > - data->backend_domid = domid; > + /* > + * The endpoint ID here means the ID of the domain where the > corresponding > + * backend is running > + */ > + data->backend_domid = iommu_spec.args[0]; > > if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, > GFP_KERNEL))) { > (END) > > > > Below, the nodes generated by Xen toolstack: > > xen_grant_dma { > compatible = "xen,grant-dma"; > #iommu-cells = <0x01>; > phandle = <0xfde9>; > }; > > virtio@2000000 { > compatible = "virtio,mmio"; > reg = <0x00 0x2000000 0x00 0x200>; > interrupts = <0x00 0x01 0xf01>; > interrupt-parent = <0xfde8>; > dma-coherent; > iommus = <0xfde9 0x01>; > }; Not bad! I like it. > I am wondering, would be the proper solution to eliminate deferred probe > timeout issue in our particular case (without introducing an extra IOMMU > driver)? In reality I don't think there is a way to do that. I would create an empty skelethon IOMMU driver for xen,grant-dma.
+Saravana On Mon, May 23, 2022 at 06:58:13PM -0700, Stefano Stabellini wrote: > On Mon, 23 May 2022, Oleksandr wrote: > > > > On Thu, 19 May 2022, Oleksandr wrote: > > > > > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: > > > > > > > On 18.05.22 17:32, Arnd Bergmann wrote: > > > > > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko > > > > > > > > <olekstysh@gmail.com> wrote: > > > > > > > > This would mean having a device > > > > > > > > node for the grant-table mechanism that can be referred to using > > > > > > > > the > > > > > > > > 'iommus' > > > > > > > > phandle property, with the domid as an additional argument. > > > > > > > I assume, you are speaking about something like the following? > > > > > > > > > > > > > > > > > > > > > xen_dummy_iommu { > > > > > > > compatible = "xen,dummy-iommu"; > > > > > > > #iommu-cells = <1>; > > > > > > > }; > > > > > > > > > > > > > > virtio@3000 { > > > > > > > compatible = "virtio,mmio"; > > > > > > > reg = <0x3000 0x100>; > > > > > > > interrupts = <41>; > > > > > > > > > > > > > > /* The device is located in Xen domain with ID 1 */ > > > > > > > iommus = <&xen_dummy_iommu 1>; > > > > > > > }; > > > > > > Right, that's that's the idea, > > > > > thank you for the confirmation > > > > > > > > > > > > > > > > > > > > > except I would not call it a 'dummy'. > > > > > > From the perspective of the DT, this behaves just like an IOMMU, > > > > > > even if the exact mechanism is different from most hardware IOMMU > > > > > > implementations. > > > > > well, agree > > > > > > > > > > > > > > > > > > It does not quite fit the model that Linux currently uses for > > > > > > > > iommus, > > > > > > > > as that has an allocator for dma_addr_t space > > > > > > > yes (# 3/7 adds grant-table based allocator) > > > > > > > > > > > > > > > > > > > > > > , but it would think it's > > > > > > > > conceptually close enough that it makes sense for the binding. > > > > > > > Interesting idea. I am wondering, do we need an extra actions for > > > > > > > this > > > > > > > to work in Linux guest (dummy IOMMU driver, etc)? > > > > > > It depends on how closely the guest implementation can be made to > > > > > > resemble a normal iommu. If you do allocate dma_addr_t addresses, > > > > > > it may actually be close enough that you can just turn the grant-table > > > > > > code into a normal iommu driver and change nothing else. > > > > > Unfortunately, I failed to find a way how use grant references at the > > > > > iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I > > > > > am > > > > > not too familiar with that, so what is written below might be wrong or > > > > > at > > > > > least not precise. > > > > > > > > > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by > > > > > itself, it > > > > > just maps (IOVA-PA) what was requested to be mapped by the upper layer. > > > > > The > > > > > DMA address allocation is done by the upper layer (DMA-IOMMU which is > > > > > the glue > > > > > layer between DMA API and IOMMU API allocates IOVA for PA?). But, all > > > > > what we > > > > > need here is just to allocate our specific grant-table based DMA > > > > > addresses > > > > > (DMA address = grant reference + offset in the page), so let’s say we > > > > > need an > > > > > entity to take a physical address as parameter and return a DMA address > > > > > (what > > > > > actually commit #3/7 is doing), and that’s all. So working at the > > > > > dma_ops > > > > > layer we get exactly what we need, with the minimal changes to guest > > > > > infrastructure. In our case the Xen itself acts as an IOMMU. > > > > > > > > > > Assuming that we want to reuse the IOMMU infrastructure somehow for our > > > > > needs. > > > > > I think, in that case we will likely need to introduce a new specific > > > > > IOVA > > > > > allocator (alongside with a generic one) to be hooked up by the > > > > > DMA-IOMMU > > > > > layer if we run on top of Xen. But, even having the specific IOVA > > > > > allocator to > > > > > return what we indeed need (DMA address = grant reference + offset in > > > > > the > > > > > page) we will still need the specific minimal required IOMMU driver to > > > > > be > > > > > present in the system anyway in order to track the mappings(?) and do > > > > > nothing > > > > > with them, returning a success (this specific IOMMU driver should have > > > > > all > > > > > mandatory callbacks implemented). > > > > > > > > > > I completely agree, it would be really nice to reuse generic IOMMU > > > > > bindings > > > > > rather than introducing Xen specific property if what we are trying to > > > > > implement in current patch series fits in the usage of "iommus" in Linux > > > > > more-less. But, if we will have to add more complexity/more components > > > > > to the > > > > > code for the sake of reusing device tree binding, this raises a question > > > > > whether that’s worthwhile. > > > > > > > > > > Or I really missed something? > > > > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree > > > > bindings, not necessarily the IOMMU drivers framework in Linux (although > > > > that would be an added bonus.) > > > > > > > > I know from previous discussions with you that making the grant table > > > > fit in the existing IOMMU drivers model is difficult, but just reusing > > > > the Device Tree bindings seems feasible? > > > > > > I started experimenting with that. As wrote in a separate email, I got a > > > deferred probe timeout, > > > > > > after inserting required nodes into guest device tree, which seems to be a > > > consequence of the unavailability of IOMMU, I will continue to investigate > > > this question. > > > > > > I have experimented with that. Yes, just reusing the Device Tree bindings is > > technically feasible (and we are able to do this by only touching > > grant-dma-ops.c), although deferred probe timeout still stands (as there is no > > IOMMU driver being present actually). > > > > [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring > > dependency > > [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 > > GB/1.95 GiB) > > > > > > Below the working diff (on top of current series): > > > > diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c > > index da9c7ff..6586152 100644 > > --- a/drivers/xen/grant-dma-ops.c > > +++ b/drivers/xen/grant-dma-ops.c > > @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { > > > > bool xen_is_grant_dma_device(struct device *dev) > > { > > + struct device_node *iommu_np; > > + bool has_iommu; > > + > > /* XXX Handle only DT devices for now */ > > if (!dev->of_node) > > return false; > > > > - return of_property_read_bool(dev->of_node, "xen,backend-domid"); > > + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); > > + has_iommu = iommu_np && of_device_is_compatible(iommu_np, > > "xen,grant-dma"); > > + of_node_put(iommu_np); > > + > > + return has_iommu; > > } > > > > void xen_grant_setup_dma_ops(struct device *dev) > > { > > struct xen_grant_dma_data *data; > > - uint32_t domid; > > + struct of_phandle_args iommu_spec; > > > > data = find_xen_grant_dma_data(dev); > > if (data) { > > @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) > > if (!dev->of_node) > > goto err; > > > > - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { > > - dev_err(dev, "xen,backend-domid property is not present\n"); > > + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", > > + 0, &iommu_spec)) { > > + dev_err(dev, "Cannot parse iommus property\n"); > > + goto err; > > + } > > + > > + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || > > + iommu_spec.args_count != 1) { > > + dev_err(dev, "Incompatible IOMMU node\n"); > > + of_node_put(iommu_spec.np); > > goto err; > > } > > > > + of_node_put(iommu_spec.np); > > + > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > if (!data) > > goto err; > > > > - data->backend_domid = domid; > > + /* > > + * The endpoint ID here means the ID of the domain where the > > corresponding > > + * backend is running > > + */ > > + data->backend_domid = iommu_spec.args[0]; > > > > if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, > > GFP_KERNEL))) { > > (END) > > > > > > > > Below, the nodes generated by Xen toolstack: > > > > xen_grant_dma { Nit: iommu { > > compatible = "xen,grant-dma"; > > #iommu-cells = <0x01>; > > phandle = <0xfde9>; > > }; > > > > virtio@2000000 { > > compatible = "virtio,mmio"; > > reg = <0x00 0x2000000 0x00 0x200>; > > interrupts = <0x00 0x01 0xf01>; > > interrupt-parent = <0xfde8>; > > dma-coherent; > > iommus = <0xfde9 0x01>; > > }; > > Not bad! I like it. > > > > I am wondering, would be the proper solution to eliminate deferred probe > > timeout issue in our particular case (without introducing an extra IOMMU > > driver)? > > In reality I don't think there is a way to do that. I would create an > empty skelethon IOMMU driver for xen,grant-dma. Does it have to be an empty driver? Originally, IOMMU 'drivers' were not drivers, but they've been getting converted. Can that be done here? Short of that, I think we could have some sort of skip probe list for deferred probe. Not sure if that would be easiest as IOMMU specific or global. Rob
On 24.05.22 04:58, Stefano Stabellini wrote: Hello Stefano, all > On Mon, 23 May 2022, Oleksandr wrote: >>>> On Thu, 19 May 2022, Oleksandr wrote: >>>>>> On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: >>>>>>> On 18.05.22 17:32, Arnd Bergmann wrote: >>>>>>>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko >>>>>>>> <olekstysh@gmail.com> wrote: >>>>>>>> This would mean having a device >>>>>>>> node for the grant-table mechanism that can be referred to using >>>>>>>> the >>>>>>>> 'iommus' >>>>>>>> phandle property, with the domid as an additional argument. >>>>>>> I assume, you are speaking about something like the following? >>>>>>> >>>>>>> >>>>>>> xen_dummy_iommu { >>>>>>> compatible = "xen,dummy-iommu"; >>>>>>> #iommu-cells = <1>; >>>>>>> }; >>>>>>> >>>>>>> virtio@3000 { >>>>>>> compatible = "virtio,mmio"; >>>>>>> reg = <0x3000 0x100>; >>>>>>> interrupts = <41>; >>>>>>> >>>>>>> /* The device is located in Xen domain with ID 1 */ >>>>>>> iommus = <&xen_dummy_iommu 1>; >>>>>>> }; >>>>>> Right, that's that's the idea, >>>>> thank you for the confirmation >>>>> >>>>> >>>>> >>>>>> except I would not call it a 'dummy'. >>>>>> From the perspective of the DT, this behaves just like an IOMMU, >>>>>> even if the exact mechanism is different from most hardware IOMMU >>>>>> implementations. >>>>> well, agree >>>>> >>>>> >>>>>>>> It does not quite fit the model that Linux currently uses for >>>>>>>> iommus, >>>>>>>> as that has an allocator for dma_addr_t space >>>>>>> yes (# 3/7 adds grant-table based allocator) >>>>>>> >>>>>>> >>>>>>>> , but it would think it's >>>>>>>> conceptually close enough that it makes sense for the binding. >>>>>>> Interesting idea. I am wondering, do we need an extra actions for >>>>>>> this >>>>>>> to work in Linux guest (dummy IOMMU driver, etc)? >>>>>> It depends on how closely the guest implementation can be made to >>>>>> resemble a normal iommu. If you do allocate dma_addr_t addresses, >>>>>> it may actually be close enough that you can just turn the grant-table >>>>>> code into a normal iommu driver and change nothing else. >>>>> Unfortunately, I failed to find a way how use grant references at the >>>>> iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I >>>>> am >>>>> not too familiar with that, so what is written below might be wrong or >>>>> at >>>>> least not precise. >>>>> >>>>> The normal IOMMU driver in Linux doesn’t allocate DMA addresses by >>>>> itself, it >>>>> just maps (IOVA-PA) what was requested to be mapped by the upper layer. >>>>> The >>>>> DMA address allocation is done by the upper layer (DMA-IOMMU which is >>>>> the glue >>>>> layer between DMA API and IOMMU API allocates IOVA for PA?). But, all >>>>> what we >>>>> need here is just to allocate our specific grant-table based DMA >>>>> addresses >>>>> (DMA address = grant reference + offset in the page), so let’s say we >>>>> need an >>>>> entity to take a physical address as parameter and return a DMA address >>>>> (what >>>>> actually commit #3/7 is doing), and that’s all. So working at the >>>>> dma_ops >>>>> layer we get exactly what we need, with the minimal changes to guest >>>>> infrastructure. In our case the Xen itself acts as an IOMMU. >>>>> >>>>> Assuming that we want to reuse the IOMMU infrastructure somehow for our >>>>> needs. >>>>> I think, in that case we will likely need to introduce a new specific >>>>> IOVA >>>>> allocator (alongside with a generic one) to be hooked up by the >>>>> DMA-IOMMU >>>>> layer if we run on top of Xen. But, even having the specific IOVA >>>>> allocator to >>>>> return what we indeed need (DMA address = grant reference + offset in >>>>> the >>>>> page) we will still need the specific minimal required IOMMU driver to >>>>> be >>>>> present in the system anyway in order to track the mappings(?) and do >>>>> nothing >>>>> with them, returning a success (this specific IOMMU driver should have >>>>> all >>>>> mandatory callbacks implemented). >>>>> >>>>> I completely agree, it would be really nice to reuse generic IOMMU >>>>> bindings >>>>> rather than introducing Xen specific property if what we are trying to >>>>> implement in current patch series fits in the usage of "iommus" in Linux >>>>> more-less. But, if we will have to add more complexity/more components >>>>> to the >>>>> code for the sake of reusing device tree binding, this raises a question >>>>> whether that’s worthwhile. >>>>> >>>>> Or I really missed something? >>>> I think Arnd was primarily suggesting to reuse the IOMMU Device Tree >>>> bindings, not necessarily the IOMMU drivers framework in Linux (although >>>> that would be an added bonus.) >>>> >>>> I know from previous discussions with you that making the grant table >>>> fit in the existing IOMMU drivers model is difficult, but just reusing >>>> the Device Tree bindings seems feasible? >>> I started experimenting with that. As wrote in a separate email, I got a >>> deferred probe timeout, >>> >>> after inserting required nodes into guest device tree, which seems to be a >>> consequence of the unavailability of IOMMU, I will continue to investigate >>> this question. >> >> I have experimented with that. Yes, just reusing the Device Tree bindings is >> technically feasible (and we are able to do this by only touching >> grant-dma-ops.c), although deferred probe timeout still stands (as there is no >> IOMMU driver being present actually). >> >> [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring >> dependency >> [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 >> GB/1.95 GiB) >> >> >> Below the working diff (on top of current series): >> >> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c >> index da9c7ff..6586152 100644 >> --- a/drivers/xen/grant-dma-ops.c >> +++ b/drivers/xen/grant-dma-ops.c >> @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { >> >> bool xen_is_grant_dma_device(struct device *dev) >> { >> + struct device_node *iommu_np; >> + bool has_iommu; >> + >> /* XXX Handle only DT devices for now */ >> if (!dev->of_node) >> return false; >> >> - return of_property_read_bool(dev->of_node, "xen,backend-domid"); >> + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); >> + has_iommu = iommu_np && of_device_is_compatible(iommu_np, >> "xen,grant-dma"); >> + of_node_put(iommu_np); >> + >> + return has_iommu; >> } >> >> void xen_grant_setup_dma_ops(struct device *dev) >> { >> struct xen_grant_dma_data *data; >> - uint32_t domid; >> + struct of_phandle_args iommu_spec; >> >> data = find_xen_grant_dma_data(dev); >> if (data) { >> @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) >> if (!dev->of_node) >> goto err; >> >> - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { >> - dev_err(dev, "xen,backend-domid property is not present\n"); >> + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", >> + 0, &iommu_spec)) { >> + dev_err(dev, "Cannot parse iommus property\n"); >> + goto err; >> + } >> + >> + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || >> + iommu_spec.args_count != 1) { >> + dev_err(dev, "Incompatible IOMMU node\n"); >> + of_node_put(iommu_spec.np); >> goto err; >> } >> >> + of_node_put(iommu_spec.np); >> + >> data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); >> if (!data) >> goto err; >> >> - data->backend_domid = domid; >> + /* >> + * The endpoint ID here means the ID of the domain where the >> corresponding >> + * backend is running >> + */ >> + data->backend_domid = iommu_spec.args[0]; >> >> if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, >> GFP_KERNEL))) { >> (END) >> >> >> >> Below, the nodes generated by Xen toolstack: >> >> xen_grant_dma { >> compatible = "xen,grant-dma"; >> #iommu-cells = <0x01>; >> phandle = <0xfde9>; >> }; >> >> virtio@2000000 { >> compatible = "virtio,mmio"; >> reg = <0x00 0x2000000 0x00 0x200>; >> interrupts = <0x00 0x01 0xf01>; >> interrupt-parent = <0xfde8>; >> dma-coherent; >> iommus = <0xfde9 0x01>; >> }; > > Not bad! I like it. Good. > > >> I am wondering, would be the proper solution to eliminate deferred probe >> timeout issue in our particular case (without introducing an extra IOMMU >> driver)? > In reality I don't think there is a way to do that. I would create an > empty skelethon IOMMU driver for xen,grant-dma. Ok, I found yet another option how we can avoid deferred probe timeout issue. I am not sure whether it will be welcome. But it doesn't really require introducing stub IOMMU driver or other changes in the guest. The idea is to make IOMMU device unavailable (status = "disabled"), this way of_iommu_configure() will treat that as success condition also. https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L31 https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L149 xen_grant_dma { compatible = "xen,grant-dma"; #iommu-cells = <0x01>; phandle = <0xfde9>; status = "disabled"; }; virtio@2000000 { compatible = "virtio,mmio"; reg = <0x00 0x2000000 0x00 0x200>; interrupts = <0x00 0x01 0xf01>; interrupt-parent = <0xfde8>; dma-coherent; iommus = <0xfde9 0x01>; }; I have checked, this "fixes" deferred probe timeout issue. Or we indeed need to introduce stub IOMMU driver (I placed it to driver/xen instead of driver/iommu, also we can even squash it with grant-dma-ops.c?). This stub driver also results in NO_IOMMU condition (as "of_xlate" callback is not implemented). diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index a7bd8ce..35b91b9 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -335,6 +335,10 @@ config XEN_UNPOPULATED_ALLOC having to balloon out RAM regions in order to obtain physical memory space to create such mappings. +config XEN_GRANT_DMA_IOMMU + bool + select IOMMU_API + config XEN_GRANT_DMA_OPS bool select DMA_OPS @@ -343,6 +347,7 @@ config XEN_VIRTIO bool "Xen virtio support" depends on VIRTIO select XEN_GRANT_DMA_OPS + select XEN_GRANT_DMA_IOMMU help Enable virtio support for running as Xen guest. Depending on the guest type this will require special support on the backend side diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index 1a23cb0..c0503f1 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -40,3 +40,4 @@ xen-privcmd-y := privcmd.o privcmd-buf.o obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF) += xen-front-pgdir-shbuf.o obj-$(CONFIG_XEN_UNPOPULATED_ALLOC) += unpopulated-alloc.o obj-$(CONFIG_XEN_GRANT_DMA_OPS) += grant-dma-ops.o +obj-$(CONFIG_XEN_GRANT_DMA_IOMMU) += grant-dma-iommu.o diff --git a/drivers/xen/grant-dma-iommu.c b/drivers/xen/grant-dma-iommu.c new file mode 100644 index 00000000..b8aad8a --- /dev/null +++ b/drivers/xen/grant-dma-iommu.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Stub IOMMU driver which does nothing. + * The main purpose of it being present is to reuse generic device-tree IOMMU + * bindings by Xen grant DMA-mapping layer. + */ + +#include <linux/iommu.h> +#include <linux/of.h> +#include <linux/platform_device.h> + +struct grant_dma_iommu_device { + struct device *dev; + struct iommu_device iommu; +}; + +/* Nothing is really needed here */ +static const struct iommu_ops grant_dma_iommu_ops; + +static const struct of_device_id grant_dma_iommu_of_match[] = { + { .compatible = "xen,grant-dma" }, + { }, +}; + +static int grant_dma_iommu_probe(struct platform_device *pdev) +{ + struct grant_dma_iommu_device *mmu; + int ret; + + mmu = devm_kzalloc(&pdev->dev, sizeof(*mmu), GFP_KERNEL); + if (!mmu) + return -ENOMEM; + + mmu->dev = &pdev->dev; + + ret = iommu_device_register(&mmu->iommu, &grant_dma_iommu_ops, &pdev->dev); + if (ret) + return ret; + + platform_set_drvdata(pdev, mmu); + + return 0; +} + +static int grant_dma_iommu_remove(struct platform_device *pdev) +{ + struct grant_dma_iommu_device *mmu = platform_get_drvdata(pdev); + + platform_set_drvdata(pdev, NULL); + iommu_device_unregister(&mmu->iommu); + + return 0; +} + +static struct platform_driver grant_dma_iommu_driver = { + .driver = { + .name = "grant-dma-iommu", + .of_match_table = grant_dma_iommu_of_match, + }, + .probe = grant_dma_iommu_probe, + .remove = grant_dma_iommu_remove, +}; + +static int __init grant_dma_iommu_init(void) +{ + struct device_node *iommu_np; + + iommu_np = of_find_matching_node(NULL, grant_dma_iommu_of_match); + if (!iommu_np) + return 0; + + of_node_put(iommu_np); + + return platform_driver_register(&grant_dma_iommu_driver); +} +subsys_initcall(grant_dma_iommu_init); I have checked, this also "fixes" deferred probe timeout issue. Personally I would prefer the first option, but I would be also happy to use second option in order to unblock the series. What do the maintainers think?
On Tue, 24 May 2022, Oleksandr wrote: > > On Mon, 23 May 2022, Oleksandr wrote: > > > > > On Thu, 19 May 2022, Oleksandr wrote: > > > > > > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> > > > > > > > wrote: > > > > > > > > On 18.05.22 17:32, Arnd Bergmann wrote: > > > > > > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko > > > > > > > > > <olekstysh@gmail.com> wrote: > > > > > > > > > This would mean having a device > > > > > > > > > node for the grant-table mechanism that can be referred to > > > > > > > > > using > > > > > > > > > the > > > > > > > > > 'iommus' > > > > > > > > > phandle property, with the domid as an additional argument. > > > > > > > > I assume, you are speaking about something like the following? > > > > > > > > > > > > > > > > > > > > > > > > xen_dummy_iommu { > > > > > > > > compatible = "xen,dummy-iommu"; > > > > > > > > #iommu-cells = <1>; > > > > > > > > }; > > > > > > > > > > > > > > > > virtio@3000 { > > > > > > > > compatible = "virtio,mmio"; > > > > > > > > reg = <0x3000 0x100>; > > > > > > > > interrupts = <41>; > > > > > > > > > > > > > > > > /* The device is located in Xen domain with ID 1 */ > > > > > > > > iommus = <&xen_dummy_iommu 1>; > > > > > > > > }; > > > > > > > Right, that's that's the idea, > > > > > > thank you for the confirmation > > > > > > > > > > > > > > > > > > > > > > > > > except I would not call it a 'dummy'. > > > > > > > From the perspective of the DT, this behaves just like an > > > > > > > IOMMU, > > > > > > > even if the exact mechanism is different from most hardware IOMMU > > > > > > > implementations. > > > > > > well, agree > > > > > > > > > > > > > > > > > > > > > It does not quite fit the model that Linux currently uses for > > > > > > > > > iommus, > > > > > > > > > as that has an allocator for dma_addr_t space > > > > > > > > yes (# 3/7 adds grant-table based allocator) > > > > > > > > > > > > > > > > > > > > > > > > > , but it would think it's > > > > > > > > > conceptually close enough that it makes sense for the binding. > > > > > > > > Interesting idea. I am wondering, do we need an extra actions > > > > > > > > for > > > > > > > > this > > > > > > > > to work in Linux guest (dummy IOMMU driver, etc)? > > > > > > > It depends on how closely the guest implementation can be made to > > > > > > > resemble a normal iommu. If you do allocate dma_addr_t addresses, > > > > > > > it may actually be close enough that you can just turn the > > > > > > > grant-table > > > > > > > code into a normal iommu driver and change nothing else. > > > > > > Unfortunately, I failed to find a way how use grant references at > > > > > > the > > > > > > iommu_ops level (I mean to fully pretend that we are an IOMMU > > > > > > driver). I > > > > > > am > > > > > > not too familiar with that, so what is written below might be wrong > > > > > > or > > > > > > at > > > > > > least not precise. > > > > > > > > > > > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by > > > > > > itself, it > > > > > > just maps (IOVA-PA) what was requested to be mapped by the upper > > > > > > layer. > > > > > > The > > > > > > DMA address allocation is done by the upper layer (DMA-IOMMU which > > > > > > is > > > > > > the glue > > > > > > layer between DMA API and IOMMU API allocates IOVA for PA?). But, > > > > > > all > > > > > > what we > > > > > > need here is just to allocate our specific grant-table based DMA > > > > > > addresses > > > > > > (DMA address = grant reference + offset in the page), so let’s say > > > > > > we > > > > > > need an > > > > > > entity to take a physical address as parameter and return a DMA > > > > > > address > > > > > > (what > > > > > > actually commit #3/7 is doing), and that’s all. So working at the > > > > > > dma_ops > > > > > > layer we get exactly what we need, with the minimal changes to guest > > > > > > infrastructure. In our case the Xen itself acts as an IOMMU. > > > > > > > > > > > > Assuming that we want to reuse the IOMMU infrastructure somehow for > > > > > > our > > > > > > needs. > > > > > > I think, in that case we will likely need to introduce a new > > > > > > specific > > > > > > IOVA > > > > > > allocator (alongside with a generic one) to be hooked up by the > > > > > > DMA-IOMMU > > > > > > layer if we run on top of Xen. But, even having the specific IOVA > > > > > > allocator to > > > > > > return what we indeed need (DMA address = grant reference + offset > > > > > > in > > > > > > the > > > > > > page) we will still need the specific minimal required IOMMU driver > > > > > > to > > > > > > be > > > > > > present in the system anyway in order to track the mappings(?) and > > > > > > do > > > > > > nothing > > > > > > with them, returning a success (this specific IOMMU driver should > > > > > > have > > > > > > all > > > > > > mandatory callbacks implemented). > > > > > > > > > > > > I completely agree, it would be really nice to reuse generic IOMMU > > > > > > bindings > > > > > > rather than introducing Xen specific property if what we are trying > > > > > > to > > > > > > implement in current patch series fits in the usage of "iommus" in > > > > > > Linux > > > > > > more-less. But, if we will have to add more complexity/more > > > > > > components > > > > > > to the > > > > > > code for the sake of reusing device tree binding, this raises a > > > > > > question > > > > > > whether that’s worthwhile. > > > > > > > > > > > > Or I really missed something? > > > > > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree > > > > > bindings, not necessarily the IOMMU drivers framework in Linux > > > > > (although > > > > > that would be an added bonus.) > > > > > > > > > > I know from previous discussions with you that making the grant table > > > > > fit in the existing IOMMU drivers model is difficult, but just reusing > > > > > the Device Tree bindings seems feasible? > > > > I started experimenting with that. As wrote in a separate email, I got a > > > > deferred probe timeout, > > > > > > > > after inserting required nodes into guest device tree, which seems to be > > > > a > > > > consequence of the unavailability of IOMMU, I will continue to > > > > investigate > > > > this question. > > > > > > I have experimented with that. Yes, just reusing the Device Tree bindings > > > is > > > technically feasible (and we are able to do this by only touching > > > grant-dma-ops.c), although deferred probe timeout still stands (as there > > > is no > > > IOMMU driver being present actually). > > > > > > [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, > > > ignoring > > > dependency > > > [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks > > > (2.10 > > > GB/1.95 GiB) > > > > > > > > > Below the working diff (on top of current series): > > > > > > diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c > > > index da9c7ff..6586152 100644 > > > --- a/drivers/xen/grant-dma-ops.c > > > +++ b/drivers/xen/grant-dma-ops.c > > > @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = > > > { > > > > > > bool xen_is_grant_dma_device(struct device *dev) > > > { > > > + struct device_node *iommu_np; > > > + bool has_iommu; > > > + > > > /* XXX Handle only DT devices for now */ > > > if (!dev->of_node) > > > return false; > > > > > > - return of_property_read_bool(dev->of_node, "xen,backend-domid"); > > > + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); > > > + has_iommu = iommu_np && of_device_is_compatible(iommu_np, > > > "xen,grant-dma"); > > > + of_node_put(iommu_np); > > > + > > > + return has_iommu; > > > } > > > > > > void xen_grant_setup_dma_ops(struct device *dev) > > > { > > > struct xen_grant_dma_data *data; > > > - uint32_t domid; > > > + struct of_phandle_args iommu_spec; > > > > > > data = find_xen_grant_dma_data(dev); > > > if (data) { > > > @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) > > > if (!dev->of_node) > > > goto err; > > > > > > - if (of_property_read_u32(dev->of_node, "xen,backend-domid", > > > &domid)) { > > > - dev_err(dev, "xen,backend-domid property is not > > > present\n"); > > > + if (of_parse_phandle_with_args(dev->of_node, "iommus", > > > "#iommu-cells", > > > + 0, &iommu_spec)) { > > > + dev_err(dev, "Cannot parse iommus property\n"); > > > + goto err; > > > + } > > > + > > > + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || > > > + iommu_spec.args_count != 1) { > > > + dev_err(dev, "Incompatible IOMMU node\n"); > > > + of_node_put(iommu_spec.np); > > > goto err; > > > } > > > > > > + of_node_put(iommu_spec.np); > > > + > > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > > if (!data) > > > goto err; > > > > > > - data->backend_domid = domid; > > > + /* > > > + * The endpoint ID here means the ID of the domain where the > > > corresponding > > > + * backend is running > > > + */ > > > + data->backend_domid = iommu_spec.args[0]; > > > > > > if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, > > > data, > > > GFP_KERNEL))) { > > > (END) > > > > > > > > > > > > Below, the nodes generated by Xen toolstack: > > > > > > xen_grant_dma { > > > compatible = "xen,grant-dma"; > > > #iommu-cells = <0x01>; > > > phandle = <0xfde9>; > > > }; > > > > > > virtio@2000000 { > > > compatible = "virtio,mmio"; > > > reg = <0x00 0x2000000 0x00 0x200>; > > > interrupts = <0x00 0x01 0xf01>; > > > interrupt-parent = <0xfde8>; > > > dma-coherent; > > > iommus = <0xfde9 0x01>; > > > }; > > Not bad! I like it. > > > Good. > > > > > > > > I am wondering, would be the proper solution to eliminate deferred probe > > > timeout issue in our particular case (without introducing an extra IOMMU > > > driver)? > > In reality I don't think there is a way to do that. I would create an > > empty skelethon IOMMU driver for xen,grant-dma. > > Ok, I found yet another option how we can avoid deferred probe timeout issue. > I am not sure whether it will be welcome. But it doesn't really require > introducing stub IOMMU driver or other changes in the guest. The idea is to > make IOMMU device unavailable (status = "disabled"), this way > of_iommu_configure() will treat that as success condition also. > > https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L31 > https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L149 > > xen_grant_dma { > compatible = "xen,grant-dma"; > #iommu-cells = <0x01>; > phandle = <0xfde9>; > status = "disabled"; > }; > virtio@2000000 { > compatible = "virtio,mmio"; > reg = <0x00 0x2000000 0x00 0x200>; > interrupts = <0x00 0x01 0xf01>; > interrupt-parent = <0xfde8>; > dma-coherent; > iommus = <0xfde9 0x01>; > }; > > I have checked, this "fixes" deferred probe timeout issue. > > > Or we indeed need to introduce stub IOMMU driver (I placed it to driver/xen > instead of driver/iommu, also we can even squash it with grant-dma-ops.c?). > This stub driver also results in NO_IOMMU condition (as "of_xlate" callback is > not implemented). > > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig > index a7bd8ce..35b91b9 100644 > --- a/drivers/xen/Kconfig > +++ b/drivers/xen/Kconfig > @@ -335,6 +335,10 @@ config XEN_UNPOPULATED_ALLOC > having to balloon out RAM regions in order to obtain physical memory > space to create such mappings. > > +config XEN_GRANT_DMA_IOMMU > + bool > + select IOMMU_API > + > config XEN_GRANT_DMA_OPS > bool > select DMA_OPS > @@ -343,6 +347,7 @@ config XEN_VIRTIO > bool "Xen virtio support" > depends on VIRTIO > select XEN_GRANT_DMA_OPS > + select XEN_GRANT_DMA_IOMMU > help > Enable virtio support for running as Xen guest. Depending on the > guest type this will require special support on the backend side > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile > index 1a23cb0..c0503f1 100644 > --- a/drivers/xen/Makefile > +++ b/drivers/xen/Makefile > @@ -40,3 +40,4 @@ xen-privcmd-y := privcmd.o > privcmd-buf.o > obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF) += xen-front-pgdir-shbuf.o > obj-$(CONFIG_XEN_UNPOPULATED_ALLOC) += unpopulated-alloc.o > obj-$(CONFIG_XEN_GRANT_DMA_OPS) += grant-dma-ops.o > +obj-$(CONFIG_XEN_GRANT_DMA_IOMMU) += grant-dma-iommu.o > diff --git a/drivers/xen/grant-dma-iommu.c b/drivers/xen/grant-dma-iommu.c > new file mode 100644 > index 00000000..b8aad8a > --- /dev/null > +++ b/drivers/xen/grant-dma-iommu.c > @@ -0,0 +1,76 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Stub IOMMU driver which does nothing. > + * The main purpose of it being present is to reuse generic device-tree IOMMU > + * bindings by Xen grant DMA-mapping layer. > + */ > + > +#include <linux/iommu.h> > +#include <linux/of.h> > +#include <linux/platform_device.h> > + > +struct grant_dma_iommu_device { > + struct device *dev; > + struct iommu_device iommu; > +}; > + > +/* Nothing is really needed here */ > +static const struct iommu_ops grant_dma_iommu_ops; > + > +static const struct of_device_id grant_dma_iommu_of_match[] = { > + { .compatible = "xen,grant-dma" }, > + { }, > +}; > + > +static int grant_dma_iommu_probe(struct platform_device *pdev) > +{ > + struct grant_dma_iommu_device *mmu; > + int ret; > + > + mmu = devm_kzalloc(&pdev->dev, sizeof(*mmu), GFP_KERNEL); > + if (!mmu) > + return -ENOMEM; > + > + mmu->dev = &pdev->dev; > + > + ret = iommu_device_register(&mmu->iommu, &grant_dma_iommu_ops, > &pdev->dev); > + if (ret) > + return ret; > + > + platform_set_drvdata(pdev, mmu); > + > + return 0; > +} > + > +static int grant_dma_iommu_remove(struct platform_device *pdev) > +{ > + struct grant_dma_iommu_device *mmu = platform_get_drvdata(pdev); > + > + platform_set_drvdata(pdev, NULL); > + iommu_device_unregister(&mmu->iommu); > + > + return 0; > +} > + > +static struct platform_driver grant_dma_iommu_driver = { > + .driver = { > + .name = "grant-dma-iommu", > + .of_match_table = grant_dma_iommu_of_match, > + }, > + .probe = grant_dma_iommu_probe, > + .remove = grant_dma_iommu_remove, > +}; > + > +static int __init grant_dma_iommu_init(void) > +{ > + struct device_node *iommu_np; > + > + iommu_np = of_find_matching_node(NULL, grant_dma_iommu_of_match); > + if (!iommu_np) > + return 0; > + > + of_node_put(iommu_np); > + > + return platform_driver_register(&grant_dma_iommu_driver); > +} > +subsys_initcall(grant_dma_iommu_init); > > I have checked, this also "fixes" deferred probe timeout issue. > > Personally I would prefer the first option, but I would be also happy to use > second option in order to unblock the series. > > What do the maintainers think? I don't think it is a good idea to mark the fake IOMMU as disabled because it implies that there is no need to use it (no need to use dma_ops) which is a problem. If we don't want the skelethon driver then Rob's suggestion of having a skip list for deferred probe is better. I think the skelethon driver also is totally fine.
" On Tue, May 24, 2022 at 9:01 AM Rob Herring <robh@kernel.org> wrote: > > +Saravana > > On Mon, May 23, 2022 at 06:58:13PM -0700, Stefano Stabellini wrote: > > On Mon, 23 May 2022, Oleksandr wrote: > > > > > On Thu, 19 May 2022, Oleksandr wrote: > > > > > > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: > > > > > > > > On 18.05.22 17:32, Arnd Bergmann wrote: > > > > > > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko > > > > > > > > > <olekstysh@gmail.com> wrote: > > > > > > > > > This would mean having a device > > > > > > > > > node for the grant-table mechanism that can be referred to using > > > > > > > > > the > > > > > > > > > 'iommus' > > > > > > > > > phandle property, with the domid as an additional argument. > > > > > > > > I assume, you are speaking about something like the following? > > > > > > > > > > > > > > > > > > > > > > > > xen_dummy_iommu { > > > > > > > > compatible = "xen,dummy-iommu"; > > > > > > > > #iommu-cells = <1>; > > > > > > > > }; > > > > > > > > > > > > > > > > virtio@3000 { > > > > > > > > compatible = "virtio,mmio"; > > > > > > > > reg = <0x3000 0x100>; > > > > > > > > interrupts = <41>; > > > > > > > > > > > > > > > > /* The device is located in Xen domain with ID 1 */ > > > > > > > > iommus = <&xen_dummy_iommu 1>; > > > > > > > > }; > > > > > > > Right, that's that's the idea, > > > > > > thank you for the confirmation > > > > > > > > > > > > > > > > > > > > > > > > > except I would not call it a 'dummy'. > > > > > > > From the perspective of the DT, this behaves just like an IOMMU, > > > > > > > even if the exact mechanism is different from most hardware IOMMU > > > > > > > implementations. > > > > > > well, agree > > > > > > > > > > > > > > > > > > > > > It does not quite fit the model that Linux currently uses for > > > > > > > > > iommus, > > > > > > > > > as that has an allocator for dma_addr_t space > > > > > > > > yes (# 3/7 adds grant-table based allocator) > > > > > > > > > > > > > > > > > > > > > > > > > , but it would think it's > > > > > > > > > conceptually close enough that it makes sense for the binding. > > > > > > > > Interesting idea. I am wondering, do we need an extra actions for > > > > > > > > this > > > > > > > > to work in Linux guest (dummy IOMMU driver, etc)? > > > > > > > It depends on how closely the guest implementation can be made to > > > > > > > resemble a normal iommu. If you do allocate dma_addr_t addresses, > > > > > > > it may actually be close enough that you can just turn the grant-table > > > > > > > code into a normal iommu driver and change nothing else. > > > > > > Unfortunately, I failed to find a way how use grant references at the > > > > > > iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I > > > > > > am > > > > > > not too familiar with that, so what is written below might be wrong or > > > > > > at > > > > > > least not precise. > > > > > > > > > > > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by > > > > > > itself, it > > > > > > just maps (IOVA-PA) what was requested to be mapped by the upper layer. > > > > > > The > > > > > > DMA address allocation is done by the upper layer (DMA-IOMMU which is > > > > > > the glue > > > > > > layer between DMA API and IOMMU API allocates IOVA for PA?). But, all > > > > > > what we > > > > > > need here is just to allocate our specific grant-table based DMA > > > > > > addresses > > > > > > (DMA address = grant reference + offset in the page), so let’s say we > > > > > > need an > > > > > > entity to take a physical address as parameter and return a DMA address > > > > > > (what > > > > > > actually commit #3/7 is doing), and that’s all. So working at the > > > > > > dma_ops > > > > > > layer we get exactly what we need, with the minimal changes to guest > > > > > > infrastructure. In our case the Xen itself acts as an IOMMU. > > > > > > > > > > > > Assuming that we want to reuse the IOMMU infrastructure somehow for our > > > > > > needs. > > > > > > I think, in that case we will likely need to introduce a new specific > > > > > > IOVA > > > > > > allocator (alongside with a generic one) to be hooked up by the > > > > > > DMA-IOMMU > > > > > > layer if we run on top of Xen. But, even having the specific IOVA > > > > > > allocator to > > > > > > return what we indeed need (DMA address = grant reference + offset in > > > > > > the > > > > > > page) we will still need the specific minimal required IOMMU driver to > > > > > > be > > > > > > present in the system anyway in order to track the mappings(?) and do > > > > > > nothing > > > > > > with them, returning a success (this specific IOMMU driver should have > > > > > > all > > > > > > mandatory callbacks implemented). > > > > > > > > > > > > I completely agree, it would be really nice to reuse generic IOMMU > > > > > > bindings > > > > > > rather than introducing Xen specific property if what we are trying to > > > > > > implement in current patch series fits in the usage of "iommus" in Linux > > > > > > more-less. But, if we will have to add more complexity/more components > > > > > > to the > > > > > > code for the sake of reusing device tree binding, this raises a question > > > > > > whether that’s worthwhile. > > > > > > > > > > > > Or I really missed something? > > > > > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree > > > > > bindings, not necessarily the IOMMU drivers framework in Linux (although > > > > > that would be an added bonus.) > > > > > > > > > > I know from previous discussions with you that making the grant table > > > > > fit in the existing IOMMU drivers model is difficult, but just reusing > > > > > the Device Tree bindings seems feasible? > > > > > > > > I started experimenting with that. As wrote in a separate email, I got a > > > > deferred probe timeout, > > > > > > > > after inserting required nodes into guest device tree, which seems to be a > > > > consequence of the unavailability of IOMMU, I will continue to investigate > > > > this question. > > > > > > > > > I have experimented with that. Yes, just reusing the Device Tree bindings is > > > technically feasible (and we are able to do this by only touching > > > grant-dma-ops.c), although deferred probe timeout still stands (as there is no > > > IOMMU driver being present actually). > > > > > > [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring > > > dependency > > > [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 > > > GB/1.95 GiB) > > > > > > > > > Below the working diff (on top of current series): > > > > > > diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c > > > index da9c7ff..6586152 100644 > > > --- a/drivers/xen/grant-dma-ops.c > > > +++ b/drivers/xen/grant-dma-ops.c > > > @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { > > > > > > bool xen_is_grant_dma_device(struct device *dev) > > > { > > > + struct device_node *iommu_np; > > > + bool has_iommu; > > > + > > > /* XXX Handle only DT devices for now */ > > > if (!dev->of_node) > > > return false; > > > > > > - return of_property_read_bool(dev->of_node, "xen,backend-domid"); > > > + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); > > > + has_iommu = iommu_np && of_device_is_compatible(iommu_np, > > > "xen,grant-dma"); > > > + of_node_put(iommu_np); > > > + > > > + return has_iommu; > > > } > > > > > > void xen_grant_setup_dma_ops(struct device *dev) > > > { > > > struct xen_grant_dma_data *data; > > > - uint32_t domid; > > > + struct of_phandle_args iommu_spec; > > > > > > data = find_xen_grant_dma_data(dev); > > > if (data) { > > > @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) > > > if (!dev->of_node) > > > goto err; > > > > > > - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { > > > - dev_err(dev, "xen,backend-domid property is not present\n"); > > > + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", > > > + 0, &iommu_spec)) { > > > + dev_err(dev, "Cannot parse iommus property\n"); > > > + goto err; > > > + } > > > + > > > + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || > > > + iommu_spec.args_count != 1) { > > > + dev_err(dev, "Incompatible IOMMU node\n"); > > > + of_node_put(iommu_spec.np); > > > goto err; > > > } > > > > > > + of_node_put(iommu_spec.np); > > > + > > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > > if (!data) > > > goto err; > > > > > > - data->backend_domid = domid; > > > + /* > > > + * The endpoint ID here means the ID of the domain where the > > > corresponding > > > + * backend is running > > > + */ > > > + data->backend_domid = iommu_spec.args[0]; > > > > > > if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, > > > GFP_KERNEL))) { > > > (END) > > > > > > > > > > > > Below, the nodes generated by Xen toolstack: > > > > > > xen_grant_dma { > > Nit: iommu { > > > > compatible = "xen,grant-dma"; > > > #iommu-cells = <0x01>; > > > phandle = <0xfde9>; > > > }; > > > > > > virtio@2000000 { > > > compatible = "virtio,mmio"; > > > reg = <0x00 0x2000000 0x00 0x200>; > > > interrupts = <0x00 0x01 0xf01>; > > > interrupt-parent = <0xfde8>; > > > dma-coherent; > > > iommus = <0xfde9 0x01>; > > > }; > > > > Not bad! I like it. > > > > > > > I am wondering, would be the proper solution to eliminate deferred probe > > > timeout issue in our particular case (without introducing an extra IOMMU > > > driver)? > > > > In reality I don't think there is a way to do that. I would create an > > empty skelethon IOMMU driver for xen,grant-dma. > > Does it have to be an empty driver? Originally, IOMMU 'drivers' were not > drivers, but they've been getting converted. Can that be done here? > > Short of that, I think we could have some sort of skip probe list for > deferred probe. Not sure if that would be easiest as IOMMU specific or > global. Hi Oleksandr, If you do fw_devlink.strict=1, you'll notice that the consumers of this "iommu" won't probe at all or will delay the boot by some number of seconds. The eventual goal is to go towards fw_devlink.strict=1 being the default. From a fw_devlik perspective, please implement a driver. Ideally a real one, but at least an empty one. The empty one doesn't need to be an IOMMU driver, but at least just do a return 0 in the probe function. Also, if it's not a device, why even have a "compatible" property (removing it won't necessarily remove the deferred probe timeout issue you see)? Will any code be using "xen,grant-dma" to look up the node? If so, that driver could be the one that probes this device. At least from a fw_devlink perspective, it just needs to have a driver that binds to this device. Also, if we aren't going to implement a driver and have the supplier ("xen,grant-dma") behave like a device (as in, have a driver that probes), I'd rather that the iommu binding not be used at all as this would be an exception to how every other iommu device behaves. -Saravana
On 24.05.22 20:59, Stefano Stabellini wrote: Hello Stefano > On Tue, 24 May 2022, Oleksandr wrote: >>> On Mon, 23 May 2022, Oleksandr wrote: >>>>>> On Thu, 19 May 2022, Oleksandr wrote: >>>>>>>> On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> >>>>>>>> wrote: >>>>>>>>> On 18.05.22 17:32, Arnd Bergmann wrote: >>>>>>>>>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko >>>>>>>>>> <olekstysh@gmail.com> wrote: >>>>>>>>>> This would mean having a device >>>>>>>>>> node for the grant-table mechanism that can be referred to >>>>>>>>>> using >>>>>>>>>> the >>>>>>>>>> 'iommus' >>>>>>>>>> phandle property, with the domid as an additional argument. >>>>>>>>> I assume, you are speaking about something like the following? >>>>>>>>> >>>>>>>>> >>>>>>>>> xen_dummy_iommu { >>>>>>>>> compatible = "xen,dummy-iommu"; >>>>>>>>> #iommu-cells = <1>; >>>>>>>>> }; >>>>>>>>> >>>>>>>>> virtio@3000 { >>>>>>>>> compatible = "virtio,mmio"; >>>>>>>>> reg = <0x3000 0x100>; >>>>>>>>> interrupts = <41>; >>>>>>>>> >>>>>>>>> /* The device is located in Xen domain with ID 1 */ >>>>>>>>> iommus = <&xen_dummy_iommu 1>; >>>>>>>>> }; >>>>>>>> Right, that's that's the idea, >>>>>>> thank you for the confirmation >>>>>>> >>>>>>> >>>>>>> >>>>>>>> except I would not call it a 'dummy'. >>>>>>>> From the perspective of the DT, this behaves just like an >>>>>>>> IOMMU, >>>>>>>> even if the exact mechanism is different from most hardware IOMMU >>>>>>>> implementations. >>>>>>> well, agree >>>>>>> >>>>>>> >>>>>>>>>> It does not quite fit the model that Linux currently uses for >>>>>>>>>> iommus, >>>>>>>>>> as that has an allocator for dma_addr_t space >>>>>>>>> yes (# 3/7 adds grant-table based allocator) >>>>>>>>> >>>>>>>>> >>>>>>>>>> , but it would think it's >>>>>>>>>> conceptually close enough that it makes sense for the binding. >>>>>>>>> Interesting idea. I am wondering, do we need an extra actions >>>>>>>>> for >>>>>>>>> this >>>>>>>>> to work in Linux guest (dummy IOMMU driver, etc)? >>>>>>>> It depends on how closely the guest implementation can be made to >>>>>>>> resemble a normal iommu. If you do allocate dma_addr_t addresses, >>>>>>>> it may actually be close enough that you can just turn the >>>>>>>> grant-table >>>>>>>> code into a normal iommu driver and change nothing else. >>>>>>> Unfortunately, I failed to find a way how use grant references at >>>>>>> the >>>>>>> iommu_ops level (I mean to fully pretend that we are an IOMMU >>>>>>> driver). I >>>>>>> am >>>>>>> not too familiar with that, so what is written below might be wrong >>>>>>> or >>>>>>> at >>>>>>> least not precise. >>>>>>> >>>>>>> The normal IOMMU driver in Linux doesn’t allocate DMA addresses by >>>>>>> itself, it >>>>>>> just maps (IOVA-PA) what was requested to be mapped by the upper >>>>>>> layer. >>>>>>> The >>>>>>> DMA address allocation is done by the upper layer (DMA-IOMMU which >>>>>>> is >>>>>>> the glue >>>>>>> layer between DMA API and IOMMU API allocates IOVA for PA?). But, >>>>>>> all >>>>>>> what we >>>>>>> need here is just to allocate our specific grant-table based DMA >>>>>>> addresses >>>>>>> (DMA address = grant reference + offset in the page), so let’s say >>>>>>> we >>>>>>> need an >>>>>>> entity to take a physical address as parameter and return a DMA >>>>>>> address >>>>>>> (what >>>>>>> actually commit #3/7 is doing), and that’s all. So working at the >>>>>>> dma_ops >>>>>>> layer we get exactly what we need, with the minimal changes to guest >>>>>>> infrastructure. In our case the Xen itself acts as an IOMMU. >>>>>>> >>>>>>> Assuming that we want to reuse the IOMMU infrastructure somehow for >>>>>>> our >>>>>>> needs. >>>>>>> I think, in that case we will likely need to introduce a new >>>>>>> specific >>>>>>> IOVA >>>>>>> allocator (alongside with a generic one) to be hooked up by the >>>>>>> DMA-IOMMU >>>>>>> layer if we run on top of Xen. But, even having the specific IOVA >>>>>>> allocator to >>>>>>> return what we indeed need (DMA address = grant reference + offset >>>>>>> in >>>>>>> the >>>>>>> page) we will still need the specific minimal required IOMMU driver >>>>>>> to >>>>>>> be >>>>>>> present in the system anyway in order to track the mappings(?) and >>>>>>> do >>>>>>> nothing >>>>>>> with them, returning a success (this specific IOMMU driver should >>>>>>> have >>>>>>> all >>>>>>> mandatory callbacks implemented). >>>>>>> >>>>>>> I completely agree, it would be really nice to reuse generic IOMMU >>>>>>> bindings >>>>>>> rather than introducing Xen specific property if what we are trying >>>>>>> to >>>>>>> implement in current patch series fits in the usage of "iommus" in >>>>>>> Linux >>>>>>> more-less. But, if we will have to add more complexity/more >>>>>>> components >>>>>>> to the >>>>>>> code for the sake of reusing device tree binding, this raises a >>>>>>> question >>>>>>> whether that’s worthwhile. >>>>>>> >>>>>>> Or I really missed something? >>>>>> I think Arnd was primarily suggesting to reuse the IOMMU Device Tree >>>>>> bindings, not necessarily the IOMMU drivers framework in Linux >>>>>> (although >>>>>> that would be an added bonus.) >>>>>> >>>>>> I know from previous discussions with you that making the grant table >>>>>> fit in the existing IOMMU drivers model is difficult, but just reusing >>>>>> the Device Tree bindings seems feasible? >>>>> I started experimenting with that. As wrote in a separate email, I got a >>>>> deferred probe timeout, >>>>> >>>>> after inserting required nodes into guest device tree, which seems to be >>>>> a >>>>> consequence of the unavailability of IOMMU, I will continue to >>>>> investigate >>>>> this question. >>>> I have experimented with that. Yes, just reusing the Device Tree bindings >>>> is >>>> technically feasible (and we are able to do this by only touching >>>> grant-dma-ops.c), although deferred probe timeout still stands (as there >>>> is no >>>> IOMMU driver being present actually). >>>> >>>> [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, >>>> ignoring >>>> dependency >>>> [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks >>>> (2.10 >>>> GB/1.95 GiB) >>>> >>>> >>>> Below the working diff (on top of current series): >>>> >>>> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c >>>> index da9c7ff..6586152 100644 >>>> --- a/drivers/xen/grant-dma-ops.c >>>> +++ b/drivers/xen/grant-dma-ops.c >>>> @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = >>>> { >>>> >>>> bool xen_is_grant_dma_device(struct device *dev) >>>> { >>>> + struct device_node *iommu_np; >>>> + bool has_iommu; >>>> + >>>> /* XXX Handle only DT devices for now */ >>>> if (!dev->of_node) >>>> return false; >>>> >>>> - return of_property_read_bool(dev->of_node, "xen,backend-domid"); >>>> + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); >>>> + has_iommu = iommu_np && of_device_is_compatible(iommu_np, >>>> "xen,grant-dma"); >>>> + of_node_put(iommu_np); >>>> + >>>> + return has_iommu; >>>> } >>>> >>>> void xen_grant_setup_dma_ops(struct device *dev) >>>> { >>>> struct xen_grant_dma_data *data; >>>> - uint32_t domid; >>>> + struct of_phandle_args iommu_spec; >>>> >>>> data = find_xen_grant_dma_data(dev); >>>> if (data) { >>>> @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) >>>> if (!dev->of_node) >>>> goto err; >>>> >>>> - if (of_property_read_u32(dev->of_node, "xen,backend-domid", >>>> &domid)) { >>>> - dev_err(dev, "xen,backend-domid property is not >>>> present\n"); >>>> + if (of_parse_phandle_with_args(dev->of_node, "iommus", >>>> "#iommu-cells", >>>> + 0, &iommu_spec)) { >>>> + dev_err(dev, "Cannot parse iommus property\n"); >>>> + goto err; >>>> + } >>>> + >>>> + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || >>>> + iommu_spec.args_count != 1) { >>>> + dev_err(dev, "Incompatible IOMMU node\n"); >>>> + of_node_put(iommu_spec.np); >>>> goto err; >>>> } >>>> >>>> + of_node_put(iommu_spec.np); >>>> + >>>> data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); >>>> if (!data) >>>> goto err; >>>> >>>> - data->backend_domid = domid; >>>> + /* >>>> + * The endpoint ID here means the ID of the domain where the >>>> corresponding >>>> + * backend is running >>>> + */ >>>> + data->backend_domid = iommu_spec.args[0]; >>>> >>>> if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, >>>> data, >>>> GFP_KERNEL))) { >>>> (END) >>>> >>>> >>>> >>>> Below, the nodes generated by Xen toolstack: >>>> >>>> xen_grant_dma { >>>> compatible = "xen,grant-dma"; >>>> #iommu-cells = <0x01>; >>>> phandle = <0xfde9>; >>>> }; >>>> >>>> virtio@2000000 { >>>> compatible = "virtio,mmio"; >>>> reg = <0x00 0x2000000 0x00 0x200>; >>>> interrupts = <0x00 0x01 0xf01>; >>>> interrupt-parent = <0xfde8>; >>>> dma-coherent; >>>> iommus = <0xfde9 0x01>; >>>> }; >>> Not bad! I like it. >> >> Good. >> >> >> >>> >>>> I am wondering, would be the proper solution to eliminate deferred probe >>>> timeout issue in our particular case (without introducing an extra IOMMU >>>> driver)? >>> In reality I don't think there is a way to do that. I would create an >>> empty skelethon IOMMU driver for xen,grant-dma. >> Ok, I found yet another option how we can avoid deferred probe timeout issue. >> I am not sure whether it will be welcome. But it doesn't really require >> introducing stub IOMMU driver or other changes in the guest. The idea is to >> make IOMMU device unavailable (status = "disabled"), this way >> of_iommu_configure() will treat that as success condition also. >> >> https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L31 >> https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L149 >> >> xen_grant_dma { >> compatible = "xen,grant-dma"; >> #iommu-cells = <0x01>; >> phandle = <0xfde9>; >> status = "disabled"; >> }; >> virtio@2000000 { >> compatible = "virtio,mmio"; >> reg = <0x00 0x2000000 0x00 0x200>; >> interrupts = <0x00 0x01 0xf01>; >> interrupt-parent = <0xfde8>; >> dma-coherent; >> iommus = <0xfde9 0x01>; >> }; >> >> I have checked, this "fixes" deferred probe timeout issue. >> >> >> Or we indeed need to introduce stub IOMMU driver (I placed it to driver/xen >> instead of driver/iommu, also we can even squash it with grant-dma-ops.c?). >> This stub driver also results in NO_IOMMU condition (as "of_xlate" callback is >> not implemented). >> >> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig >> index a7bd8ce..35b91b9 100644 >> --- a/drivers/xen/Kconfig >> +++ b/drivers/xen/Kconfig >> @@ -335,6 +335,10 @@ config XEN_UNPOPULATED_ALLOC >> having to balloon out RAM regions in order to obtain physical memory >> space to create such mappings. >> >> +config XEN_GRANT_DMA_IOMMU >> + bool >> + select IOMMU_API >> + >> config XEN_GRANT_DMA_OPS >> bool >> select DMA_OPS >> @@ -343,6 +347,7 @@ config XEN_VIRTIO >> bool "Xen virtio support" >> depends on VIRTIO >> select XEN_GRANT_DMA_OPS >> + select XEN_GRANT_DMA_IOMMU >> help >> Enable virtio support for running as Xen guest. Depending on the >> guest type this will require special support on the backend side >> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile >> index 1a23cb0..c0503f1 100644 >> --- a/drivers/xen/Makefile >> +++ b/drivers/xen/Makefile >> @@ -40,3 +40,4 @@ xen-privcmd-y := privcmd.o >> privcmd-buf.o >> obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF) += xen-front-pgdir-shbuf.o >> obj-$(CONFIG_XEN_UNPOPULATED_ALLOC) += unpopulated-alloc.o >> obj-$(CONFIG_XEN_GRANT_DMA_OPS) += grant-dma-ops.o >> +obj-$(CONFIG_XEN_GRANT_DMA_IOMMU) += grant-dma-iommu.o >> diff --git a/drivers/xen/grant-dma-iommu.c b/drivers/xen/grant-dma-iommu.c >> new file mode 100644 >> index 00000000..b8aad8a >> --- /dev/null >> +++ b/drivers/xen/grant-dma-iommu.c >> @@ -0,0 +1,76 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> +/* >> + * Stub IOMMU driver which does nothing. >> + * The main purpose of it being present is to reuse generic device-tree IOMMU >> + * bindings by Xen grant DMA-mapping layer. >> + */ >> + >> +#include <linux/iommu.h> >> +#include <linux/of.h> >> +#include <linux/platform_device.h> >> + >> +struct grant_dma_iommu_device { >> + struct device *dev; >> + struct iommu_device iommu; >> +}; >> + >> +/* Nothing is really needed here */ >> +static const struct iommu_ops grant_dma_iommu_ops; >> + >> +static const struct of_device_id grant_dma_iommu_of_match[] = { >> + { .compatible = "xen,grant-dma" }, >> + { }, >> +}; >> + >> +static int grant_dma_iommu_probe(struct platform_device *pdev) >> +{ >> + struct grant_dma_iommu_device *mmu; >> + int ret; >> + >> + mmu = devm_kzalloc(&pdev->dev, sizeof(*mmu), GFP_KERNEL); >> + if (!mmu) >> + return -ENOMEM; >> + >> + mmu->dev = &pdev->dev; >> + >> + ret = iommu_device_register(&mmu->iommu, &grant_dma_iommu_ops, >> &pdev->dev); >> + if (ret) >> + return ret; >> + >> + platform_set_drvdata(pdev, mmu); >> + >> + return 0; >> +} >> + >> +static int grant_dma_iommu_remove(struct platform_device *pdev) >> +{ >> + struct grant_dma_iommu_device *mmu = platform_get_drvdata(pdev); >> + >> + platform_set_drvdata(pdev, NULL); >> + iommu_device_unregister(&mmu->iommu); >> + >> + return 0; >> +} >> + >> +static struct platform_driver grant_dma_iommu_driver = { >> + .driver = { >> + .name = "grant-dma-iommu", >> + .of_match_table = grant_dma_iommu_of_match, >> + }, >> + .probe = grant_dma_iommu_probe, >> + .remove = grant_dma_iommu_remove, >> +}; >> + >> +static int __init grant_dma_iommu_init(void) >> +{ >> + struct device_node *iommu_np; >> + >> + iommu_np = of_find_matching_node(NULL, grant_dma_iommu_of_match); >> + if (!iommu_np) >> + return 0; >> + >> + of_node_put(iommu_np); >> + >> + return platform_driver_register(&grant_dma_iommu_driver); >> +} >> +subsys_initcall(grant_dma_iommu_init); >> >> I have checked, this also "fixes" deferred probe timeout issue. >> >> Personally I would prefer the first option, but I would be also happy to use >> second option in order to unblock the series. >> >> What do the maintainers think? > > > I don't think it is a good idea to mark the fake IOMMU as disabled > because it implies that there is no need to use it (no need to use > dma_ops) which is a problem. I got your point. You are right, this indeed sounds weird. I expected this simple solution wouldn't be welcome. > > If we don't want the skelethon driver then Rob's suggestion of having a > skip list for deferred probe is better. I am not sure I understand the idea completely. Does it mean that we will need new command line option (?) to pass some string (I assume, the "compatible" for IOMMU device) to not defer probe if corresponding driver is missing, but just return NO_IOMMU right away? > > I think the skelethon driver also is totally fine. ok, thank you for the feedback.
On 24.05.22 21:34, Saravana Kannan wrote: Hello all > " > > On Tue, May 24, 2022 at 9:01 AM Rob Herring <robh@kernel.org> wrote: >> +Saravana >> >> On Mon, May 23, 2022 at 06:58:13PM -0700, Stefano Stabellini wrote: >>> On Mon, 23 May 2022, Oleksandr wrote: >>>>>> On Thu, 19 May 2022, Oleksandr wrote: >>>>>>>> On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@gmail.com> wrote: >>>>>>>>> On 18.05.22 17:32, Arnd Bergmann wrote: >>>>>>>>>> On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko >>>>>>>>>> <olekstysh@gmail.com> wrote: >>>>>>>>>> This would mean having a device >>>>>>>>>> node for the grant-table mechanism that can be referred to using >>>>>>>>>> the >>>>>>>>>> 'iommus' >>>>>>>>>> phandle property, with the domid as an additional argument. >>>>>>>>> I assume, you are speaking about something like the following? >>>>>>>>> >>>>>>>>> >>>>>>>>> xen_dummy_iommu { >>>>>>>>> compatible = "xen,dummy-iommu"; >>>>>>>>> #iommu-cells = <1>; >>>>>>>>> }; >>>>>>>>> >>>>>>>>> virtio@3000 { >>>>>>>>> compatible = "virtio,mmio"; >>>>>>>>> reg = <0x3000 0x100>; >>>>>>>>> interrupts = <41>; >>>>>>>>> >>>>>>>>> /* The device is located in Xen domain with ID 1 */ >>>>>>>>> iommus = <&xen_dummy_iommu 1>; >>>>>>>>> }; >>>>>>>> Right, that's that's the idea, >>>>>>> thank you for the confirmation >>>>>>> >>>>>>> >>>>>>> >>>>>>>> except I would not call it a 'dummy'. >>>>>>>> From the perspective of the DT, this behaves just like an IOMMU, >>>>>>>> even if the exact mechanism is different from most hardware IOMMU >>>>>>>> implementations. >>>>>>> well, agree >>>>>>> >>>>>>> >>>>>>>>>> It does not quite fit the model that Linux currently uses for >>>>>>>>>> iommus, >>>>>>>>>> as that has an allocator for dma_addr_t space >>>>>>>>> yes (# 3/7 adds grant-table based allocator) >>>>>>>>> >>>>>>>>> >>>>>>>>>> , but it would think it's >>>>>>>>>> conceptually close enough that it makes sense for the binding. >>>>>>>>> Interesting idea. I am wondering, do we need an extra actions for >>>>>>>>> this >>>>>>>>> to work in Linux guest (dummy IOMMU driver, etc)? >>>>>>>> It depends on how closely the guest implementation can be made to >>>>>>>> resemble a normal iommu. If you do allocate dma_addr_t addresses, >>>>>>>> it may actually be close enough that you can just turn the grant-table >>>>>>>> code into a normal iommu driver and change nothing else. >>>>>>> Unfortunately, I failed to find a way how use grant references at the >>>>>>> iommu_ops level (I mean to fully pretend that we are an IOMMU driver). I >>>>>>> am >>>>>>> not too familiar with that, so what is written below might be wrong or >>>>>>> at >>>>>>> least not precise. >>>>>>> >>>>>>> The normal IOMMU driver in Linux doesn’t allocate DMA addresses by >>>>>>> itself, it >>>>>>> just maps (IOVA-PA) what was requested to be mapped by the upper layer. >>>>>>> The >>>>>>> DMA address allocation is done by the upper layer (DMA-IOMMU which is >>>>>>> the glue >>>>>>> layer between DMA API and IOMMU API allocates IOVA for PA?). But, all >>>>>>> what we >>>>>>> need here is just to allocate our specific grant-table based DMA >>>>>>> addresses >>>>>>> (DMA address = grant reference + offset in the page), so let’s say we >>>>>>> need an >>>>>>> entity to take a physical address as parameter and return a DMA address >>>>>>> (what >>>>>>> actually commit #3/7 is doing), and that’s all. So working at the >>>>>>> dma_ops >>>>>>> layer we get exactly what we need, with the minimal changes to guest >>>>>>> infrastructure. In our case the Xen itself acts as an IOMMU. >>>>>>> >>>>>>> Assuming that we want to reuse the IOMMU infrastructure somehow for our >>>>>>> needs. >>>>>>> I think, in that case we will likely need to introduce a new specific >>>>>>> IOVA >>>>>>> allocator (alongside with a generic one) to be hooked up by the >>>>>>> DMA-IOMMU >>>>>>> layer if we run on top of Xen. But, even having the specific IOVA >>>>>>> allocator to >>>>>>> return what we indeed need (DMA address = grant reference + offset in >>>>>>> the >>>>>>> page) we will still need the specific minimal required IOMMU driver to >>>>>>> be >>>>>>> present in the system anyway in order to track the mappings(?) and do >>>>>>> nothing >>>>>>> with them, returning a success (this specific IOMMU driver should have >>>>>>> all >>>>>>> mandatory callbacks implemented). >>>>>>> >>>>>>> I completely agree, it would be really nice to reuse generic IOMMU >>>>>>> bindings >>>>>>> rather than introducing Xen specific property if what we are trying to >>>>>>> implement in current patch series fits in the usage of "iommus" in Linux >>>>>>> more-less. But, if we will have to add more complexity/more components >>>>>>> to the >>>>>>> code for the sake of reusing device tree binding, this raises a question >>>>>>> whether that’s worthwhile. >>>>>>> >>>>>>> Or I really missed something? >>>>>> I think Arnd was primarily suggesting to reuse the IOMMU Device Tree >>>>>> bindings, not necessarily the IOMMU drivers framework in Linux (although >>>>>> that would be an added bonus.) >>>>>> >>>>>> I know from previous discussions with you that making the grant table >>>>>> fit in the existing IOMMU drivers model is difficult, but just reusing >>>>>> the Device Tree bindings seems feasible? >>>>> I started experimenting with that. As wrote in a separate email, I got a >>>>> deferred probe timeout, >>>>> >>>>> after inserting required nodes into guest device tree, which seems to be a >>>>> consequence of the unavailability of IOMMU, I will continue to investigate >>>>> this question. >>>> >>>> I have experimented with that. Yes, just reusing the Device Tree bindings is >>>> technically feasible (and we are able to do this by only touching >>>> grant-dma-ops.c), although deferred probe timeout still stands (as there is no >>>> IOMMU driver being present actually). >>>> >>>> [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring >>>> dependency >>>> [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 >>>> GB/1.95 GiB) >>>> >>>> >>>> Below the working diff (on top of current series): >>>> >>>> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c >>>> index da9c7ff..6586152 100644 >>>> --- a/drivers/xen/grant-dma-ops.c >>>> +++ b/drivers/xen/grant-dma-ops.c >>>> @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = { >>>> >>>> bool xen_is_grant_dma_device(struct device *dev) >>>> { >>>> + struct device_node *iommu_np; >>>> + bool has_iommu; >>>> + >>>> /* XXX Handle only DT devices for now */ >>>> if (!dev->of_node) >>>> return false; >>>> >>>> - return of_property_read_bool(dev->of_node, "xen,backend-domid"); >>>> + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0); >>>> + has_iommu = iommu_np && of_device_is_compatible(iommu_np, >>>> "xen,grant-dma"); >>>> + of_node_put(iommu_np); >>>> + >>>> + return has_iommu; >>>> } >>>> >>>> void xen_grant_setup_dma_ops(struct device *dev) >>>> { >>>> struct xen_grant_dma_data *data; >>>> - uint32_t domid; >>>> + struct of_phandle_args iommu_spec; >>>> >>>> data = find_xen_grant_dma_data(dev); >>>> if (data) { >>>> @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev) >>>> if (!dev->of_node) >>>> goto err; >>>> >>>> - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) { >>>> - dev_err(dev, "xen,backend-domid property is not present\n"); >>>> + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", >>>> + 0, &iommu_spec)) { >>>> + dev_err(dev, "Cannot parse iommus property\n"); >>>> + goto err; >>>> + } >>>> + >>>> + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") || >>>> + iommu_spec.args_count != 1) { >>>> + dev_err(dev, "Incompatible IOMMU node\n"); >>>> + of_node_put(iommu_spec.np); >>>> goto err; >>>> } >>>> >>>> + of_node_put(iommu_spec.np); >>>> + >>>> data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); >>>> if (!data) >>>> goto err; >>>> >>>> - data->backend_domid = domid; >>>> + /* >>>> + * The endpoint ID here means the ID of the domain where the >>>> corresponding >>>> + * backend is running >>>> + */ >>>> + data->backend_domid = iommu_spec.args[0]; >>>> >>>> if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data, >>>> GFP_KERNEL))) { >>>> (END) >>>> >>>> >>>> >>>> Below, the nodes generated by Xen toolstack: >>>> >>>> xen_grant_dma { >> Nit: iommu { >> >>>> compatible = "xen,grant-dma"; >>>> #iommu-cells = <0x01>; >>>> phandle = <0xfde9>; >>>> }; >>>> >>>> virtio@2000000 { >>>> compatible = "virtio,mmio"; >>>> reg = <0x00 0x2000000 0x00 0x200>; >>>> interrupts = <0x00 0x01 0xf01>; >>>> interrupt-parent = <0xfde8>; >>>> dma-coherent; >>>> iommus = <0xfde9 0x01>; >>>> }; >>> Not bad! I like it. >>> >>> >>>> I am wondering, would be the proper solution to eliminate deferred probe >>>> timeout issue in our particular case (without introducing an extra IOMMU >>>> driver)? >>> In reality I don't think there is a way to do that. I would create an >>> empty skelethon IOMMU driver for xen,grant-dma. >> Does it have to be an empty driver? Originally, IOMMU 'drivers' were not >> drivers, but they've been getting converted. Can that be done here? >> >> Short of that, I think we could have some sort of skip probe list for >> deferred probe. Not sure if that would be easiest as IOMMU specific or >> global. > Hi Oleksandr, > > If you do fw_devlink.strict=1, you'll notice that the consumers of > this "iommu" won't probe at all or will delay the boot by some number > of seconds. The eventual goal is to go towards fw_devlink.strict=1 > being the default. ok, I got it. Let's me please explain our particular case in details, sorry I may repeat some information which I have already mentioned elsewhere, but it maybe better to keep the whole context here. We have Xen grant DMA-mapping layer added by previous commit [1]. For it to operate properly we need a way to communicate some per-device information using device-tree, and this information is Xen specific. This is what the current commit is doing by introducing new binding to describe that. The next commit [2] will use that new binding to retrieve required information. There was a suggestion to consider reusing generic device-tree IOMMU bindings to communicate this specific information instead of introducing a custom property. Although it requires more effort for the Xen toolstack (instead of adding a code to insert a single "xen,backend-domid" property, we need to generate fake IOMMU node, reserve phandle for it, etc), from the device tree PoV it looks indeed good (we reuse endpoint ID to pass the ID of the domain where the corresponding backend is running), and resulting code to retrieve this information in our DMA-mapping layer also looks simple enough [3]. Using generic device-tree IOMMU bindings: iommu { compatible = "xen,grant-dma"; #iommu-cells = <0x01>; phandle = <0xfde9>; }; virtio@2000000 { compatible = "virtio,mmio"; reg = <0x00 0x2000000 0x00 0x200>; interrupts = <0x00 0x01 0xf01>; interrupt-parent = <0xfde8>; dma-coherent; iommus = <0xfde9 0x01>; }; Using Xen specific property: virtio@2000000 { compatible = "virtio,mmio"; reg = <0x00 0x2000000 0x00 0x200>; interrupts = <0x00 0x01 0xf01>; interrupt-parent = <0xfde8>; dma-coherent; xen,backend-domid = <0x01>; }; The main problem is that idea doesn't quite fit into how Linux currently behaves for the "iommus" property. Of course, just reusing IOMMU bindings (without having a corresponding driver) leads to the deferred probe timeout issue afterwards, because the IOMMU device never becomes available. From my understanding, our DMA-mapping layer we are consider to reuse IOMMU bindings for, *cannot* be converted into the proper IOMMU driver. Sure, we will need to find a way how to deal with it, if we really want to reuse the IOMMU bindings. And yes, one idea was to just implement stub IOMMU driver for that purpose, I have rechecked, it works fine with that stub driver [4]. > > From a fw_devlik perspective, please implement a driver. Ideally a > real one, but at least an empty one. The empty one doesn't need to be > an IOMMU driver, but at least just do a return 0 in the probe > function. If I got things right, I am afraid, for the "of_iommu" case the empty driver is not enough. The driver should at least register iommu_ops, but the "of_xlate" callback should be *not* implemented. In that case, we will get NO_IOMMU (>0 : there is no IOMMU, or one was unavailable for non-fatal reasons) which is also a success condition, so -EPROBE_DEFER won't be returned. https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L32 Otherwise, of_iommu_xlate() will call driver_deferred_probe_check_state(). https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L43 > Also, if it's not a device, why even have a "compatible" > property (removing it won't necessarily remove the deferred probe > timeout issue you see)? Will any code be using "xen,grant-dma" to look > up the node? Yes > If so, that driver could be the one that probes this > device. At least from a fw_devlink perspective, it just needs to have > a driver that binds to this device. Agree > > Also, if we aren't going to implement a driver and have the supplier > ("xen,grant-dma") behave like a device (as in, have a driver that > probes), I'd rather that the iommu binding not be used at all as this > would be an exception to how every other iommu device behaves. Agree Saravana, thank you for the explanation. To summarize, as I understand, we have three options (the first two are clear enough, the third is unclear yet): 1. Do not try to reuse IOMMU bindings for current xen-virtio enabling work, use "xen,backend-domid" property. 2. Reuse IOMMU bindings, for that purpose introduce stub IOMMU driver. It is a standalone entity in my example, but it can be a part of grant-dma-ops.c which actually uses "xen,grant-dma" compatible to look up a node. 3. Try to find other options how to reuse IOMMU bindings but *without* introducing stub IOMMU driver, such as skip list for deferred probe, etc. What do the maintainers think regarding the option to go forward? [1] https://lore.kernel.org/xen-devel/1651947548-4055-4-git-send-email-olekstysh@gmail.com/ [2] https://lore.kernel.org/xen-devel/1651947548-4055-7-git-send-email-olekstysh@gmail.com/ [3] https://lore.kernel.org/xen-devel/390ba7bb-ee9e-b7b7-5f08-71a7245fa4ec@gmail.com/ [4] https://lore.kernel.org/xen-devel/606dfdcc-ec10-0c4a-04e9-72cd73ee6676@gmail.com/ > > -Saravana
diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml new file mode 100644 index 00000000..750e89e --- /dev/null +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml @@ -0,0 +1,37 @@ +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Xen specific binding for virtualized devices (e.g. virtio) + +maintainers: + - Stefano Stabellini <sstabellini@kernel.org> + +select: true + +description: + This binding indicates that Xen grant mappings need to be enabled for + the device, and it specifies the ID of the domain where the corresponding + device (backend) resides. The property is required to restrict memory + access using Xen grant mappings. + +properties: + xen,dev-domid: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + The domid (domain ID) of the domain where the device (backend) is running. + +additionalProperties: true + +examples: + - | + virtio@3000 { + compatible = "virtio,mmio"; + reg = <0x3000 0x100>; + interrupts = <41>; + + /* The device is located in Xen domain with ID 1 */ + xen,dev-domid = <1>; + }; diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml b/Documentation/devicetree/bindings/virtio/mmio.yaml index 10c22b5..29a0932 100644 --- a/Documentation/devicetree/bindings/virtio/mmio.yaml +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml @@ -13,6 +13,9 @@ description: See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for more details. +allOf: + - $ref: /schemas/arm/xen,dev-domid.yaml# + properties: compatible: const: virtio,mmio @@ -33,6 +36,10 @@ properties: description: Required for devices making accesses thru an IOMMU. maxItems: 1 + xen,dev-domid: + description: Required when Xen grant mappings need to be enabled for device. + $ref: /schemas/types.yaml#/definitions/uint32 + required: - compatible - reg