[v3,0/8] vfio/mdev: IOMMU aware mediated device

Message ID 20181012051632.26064-1-baolu.lu@linux.intel.com (mailing list archive)

Baolu Lu Oct. 12, 2018, 5:16 a.m. UTC
Hi,

The Mediated Device (mdev) is a framework for fine-grained physical
device sharing across isolated domains. Currently the mdev framework
is designed to be independent of the platform IOMMU support. As a
result, DMA isolation relies on the mdev parent device in a
vendor specific way.

There are several cases where a mediated device could be protected
and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
[1] introduces a new translation mode called 'scalable mode', which
enables PASID-granular translations. The vt-d scalable mode is the
key ingredient for Scalable I/O Virtualization [2] [3], which allows
sharing a device at the minimal possible granularity (ADI - Assignable
Device Interface).

A mediated device backed by an ADI could be protected and isolated
by the IOMMU since 1) the parent device supports tagging all DMA
traffic out of the mediated device with a unique PASID; and 2) the DMA
translation unit (IOMMU) supports PASID-granular translation.
We can apply IOMMU protection and isolation to this kind of device
just as we do with an assignable PCI device.

In order to distinguish IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds two
new members to struct mdev_device.

* iommu_device
  - This, if set, indicates that the mediated device could
    be fully isolated and protected by the IOMMU via attaching
    an iommu domain to this device. If unset, vendor defined
    isolation is used.

* iommu_domain
  - This is a placeholder for an iommu domain. A domain
    could be stored here for later use once it has been
    attached to the iommu_device of this mdev.

The helpers below are added to the mdev core implementation to set
and get the iommu device and iommu domain pointers described above.

* mdev_set/get_iommu_device(dev, iommu_device)
  - Set or get the iommu device which represents this mdev
    in IOMMU's device scope. A driver doesn't need to set the
    iommu device if it uses vendor defined isolation.

* mdev_set/get_iommu_domain(domain)
  - An iommu domain which has been attached to the iommu
    device in order to protect and isolate the mediated
    device will be kept in the mdev data structure and
    could be retrieved later.

The mdev parent device driver can opt in to having the mdev fully
isolated and protected by the IOMMU by invoking mdev_set_iommu_device()
in its @create() callback when the mdev is being created.
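
To make the two new members and their helpers concrete, here is a
minimal userspace sketch in C. It is not the code from this series:
struct device and struct iommu_domain are bare stand-ins, and the
mdev_model_* and parent_model_create() names and signatures are made
up for illustration. Only the opt-in flow described above is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel types; assumptions for this sketch only. */
struct device { const char *name; };
struct iommu_domain { int id; };

/* Models only the two members described in the cover letter. */
struct mdev_device {
	struct device dev;
	struct device *iommu_device;       /* NULL: vendor defined isolation */
	struct iommu_domain *iommu_domain; /* domain attached to iommu_device */
};

void mdev_model_set_iommu_device(struct mdev_device *mdev,
				 struct device *iommu_device)
{
	assert(mdev != NULL);
	mdev->iommu_device = iommu_device;
}

struct device *mdev_model_get_iommu_device(const struct mdev_device *mdev)
{
	assert(mdev != NULL);
	return mdev->iommu_device;
}

void mdev_model_set_iommu_domain(struct mdev_device *mdev,
				 struct iommu_domain *domain)
{
	assert(mdev != NULL);
	mdev->iommu_domain = domain;
}

struct iommu_domain *mdev_model_get_iommu_domain(const struct mdev_device *mdev)
{
	assert(mdev != NULL);
	return mdev->iommu_domain;
}

/* Hypothetical create() callback of a parent driver that opts in. */
int parent_model_create(struct mdev_device *mdev, struct device *pci_dev)
{
	mdev_model_set_iommu_device(mdev, pci_dev);
	return 0;
}

/* True when the IOMMU, not the vendor driver, provides the isolation. */
bool mdev_model_is_iommu_backed(const struct mdev_device *mdev)
{
	return mdev_model_get_iommu_device(mdev) != NULL;
}
```

A parent driver that relies on vendor defined isolation would simply
never call the setter, so the iommu device pointer stays NULL.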

In vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if
an iommu device has been set. Otherwise, the dummy external domain
will be used and all DMA isolation and protection is left to the
parent driver.
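
The decision made at attach time can be pictured as below. This is a
toy model, not the vfio type1 code: the enum, struct mdev_model and
attach_group_model() are hypothetical names, and the real function
also has to allocate and attach the domain.

```c
#include <assert.h>
#include <stddef.h>

struct device { const char *name; }; /* stand-in for the kernel type */

enum isolation_path {
	ISOLATION_IOMMU_DOMAIN,    /* domain attached to the mdev iommu device */
	ISOLATION_EXTERNAL_DOMAIN, /* dummy external domain, parent isolates */
};

struct mdev_model {
	struct device *iommu_device; /* NULL unless the parent driver opted in */
};

/* Pick the isolation path purely from the presence of an iommu device. */
enum isolation_path attach_group_model(const struct mdev_model *mdev)
{
	assert(mdev != NULL);
	return mdev->iommu_device ? ISOLATION_IOMMU_DOMAIN
				  : ISOLATION_EXTERNAL_DOMAIN;
}
```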

On the IOMMU side, a basic requirement is to allow attaching multiple
domains to a PCI device if the device advertises the capability
and the IOMMU hardware supports finer granularity translations than
the normal PCI Source ID based translation.

As a result, a PCI device could work in two modes: normal mode
and auxiliary mode. In the normal mode, a PCI device could be
isolated at the Source ID granularity; the PCI device itself could
be assigned to a user application by attaching a single domain
to it. In the auxiliary mode, a PCI device could be isolated at a
finer granularity, hence subsets of the device could be assigned
to different user level applications by attaching a different domain
to each subset.

The device driver can switch between the two modes with the
interfaces below:

* iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY)
  - Represents the ability of supporting multiple domains
    per device.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE)
  - Enable the multiple domains capability for the device
    referenced by @dev.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE)
  - Disable the multiple domains capability for the device
    referenced by @dev.

* iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID)
  - Return ID used for finer-granularity DMA translation.

The existing interfaces for attaching/detaching domains remain the
same as before. The differences in behavior between the normal mode
and the auxiliary mode are handled in the vendor specific iommu
drivers.
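
The mode switch can be illustrated with a small userspace model. The
MODEL_* attribute names mirror the IOMMU_DEV_ATTR_AUXD_* attributes
listed above, but every type, function and the domain limit here is
an assumption made for the sketch, not the interface added by this
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_AUX_DOMAINS 4 /* arbitrary limit for this sketch */

enum model_dev_attr {
	MODEL_ATTR_AUXD_CAPABILITY,
	MODEL_ATTR_AUXD_ENABLE,
	MODEL_ATTR_AUXD_DISABLE,
};

struct model_domain { int id; };

struct model_device {
	bool aux_capable; /* device and IOMMU support finer granularity */
	bool aux_enabled; /* auxiliary mode currently switched on */
	const struct model_domain *domains[MODEL_MAX_AUX_DOMAINS];
	size_t nr_domains;
};

/* Query: only the capability attribute can be read in this model. */
int model_get_dev_attr(const struct model_device *dev,
		       enum model_dev_attr attr, bool *out)
{
	assert(dev != NULL && out != NULL);
	if (attr != MODEL_ATTR_AUXD_CAPABILITY)
		return -1;
	*out = dev->aux_capable;
	return 0;
}

/* Enable or disable auxiliary mode; enabling needs the capability. */
int model_set_dev_attr(struct model_device *dev, enum model_dev_attr attr)
{
	assert(dev != NULL);
	switch (attr) {
	case MODEL_ATTR_AUXD_ENABLE:
		if (!dev->aux_capable)
			return -1;
		dev->aux_enabled = true;
		return 0;
	case MODEL_ATTR_AUXD_DISABLE:
		dev->aux_enabled = false;
		return 0;
	default:
		return -1;
	}
}

/* Same attach call in both modes; only the allowed count differs. */
int model_attach_device(struct model_device *dev,
			const struct model_domain *domain)
{
	size_t limit;

	assert(dev != NULL && domain != NULL);
	limit = dev->aux_enabled ? MODEL_MAX_AUX_DOMAINS : 1;
	if (dev->nr_domains >= limit)
		return -1;
	dev->domains[dev->nr_domains++] = domain;
	return 0;
}
```

In normal mode the second attach is refused; once auxiliary mode is
enabled the very same call accepts further domains, which mirrors the
point that the attach/detach interface itself does not change.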

For ease of discussion, we sometimes say 'a domain in auxiliary
mode', or simply 'an auxiliary domain', when a domain is
attached to a device for finer granularity translations. But we need
to keep in mind that this doesn't mean there is a different domain
type. The same domain could be bound to one device for Source ID based
translation, and bound to another device for finer granularity
translation at the same time.

This patch series extends both the IOMMU and vfio components to support
mdev pass-through when the mdev could be isolated and protected
by the IOMMU units. The first part of this series (PATCH 1/8 to 5/8)
adds the interfaces and implementation of multiple domains per
device. The second part (PATCH 6/8 to 8/8) adds the iommu device
attribute to each mdev, determines the isolation type according to the
existence of an iommu device when attaching a group in the vfio type1
iommu module, and attaches the domain to iommu aware mediated devices.

This patch series depends on a patch set posted here [4] for discussion,
which adds scalable mode support to the Intel IOMMU driver.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[4] https://lkml.org/lkml/2018/10/7/54

Best regards,
Lu Baolu

Change log:
  v2->v3:
  - Remove domain type enum and use a pointer on mdev_device instead.
  - Add a generic interface for getting/setting per device iommu
    attributes, and use it to query the aux domain capability and to
    enable and disable aux domains.
  - Reuse iommu_domain_get_attr() to retrieve the id in an aux domain.
  - We discussed the impact of the default domain implementation
    on reusing iommu_at(de)tach_device() interfaces. We agreed
    that reusing iommu_at(de)tach_device() interfaces is the right
    direction and we could tweak the code to remove the impact.
    https://www.spinics.net/lists/kvm/msg175285.html  
  - Removed the RFC tag since no objections received.
  - This patch has been submitted separately.
    https://www.spinics.net/lists/kvm/msg173936.html

  v1->v2:
  - Rewrite the patches with the concept of auxiliary domains.

Lu Baolu (8):
  iommu: Add APIs for multiple domains per device
  iommu/vt-d: Add multiple domains per device query
  iommu/vt-d: Enable/disable multiple domains per device
  iommu/vt-d: Attach/detach domains in auxiliary mode
  iommu/vt-d: Return ID associated with an auxiliary domain
  vfio/mdev: Add iommu place holders in mdev_device
  vfio/type1: Add domain at(de)taching group helpers
  vfio/type1: Handle different mdev isolation type

 drivers/iommu/intel-iommu.c      | 249 ++++++++++++++++++++++++++++++-
 drivers/iommu/iommu.c            |  25 ++++
 drivers/vfio/mdev/mdev_core.c    |  36 +++++
 drivers/vfio/mdev/mdev_private.h |   2 +
 drivers/vfio/vfio_iommu_type1.c  | 146 ++++++++++++++++--
 include/linux/intel-iommu.h      |  11 ++
 include/linux/iommu.h            |  33 ++++
 include/linux/mdev.h             |  23 +++
 8 files changed, 509 insertions(+), 16 deletions(-)

Comments

Xu Zaibo Oct. 13, 2018, 8:25 a.m. UTC | #1
Hi,

On 2018/10/12 13:16, Lu Baolu wrote:
> The mdev parent device driver can opt in to having the mdev fully
> isolated and protected by the IOMMU by invoking mdev_set_iommu_device()
> in its @create() callback when the mdev is being created.
I just cannot understand this: how do I get an iommu_device while I
create a mediated device in my parent device driver?

And why not reuse the device of MDEV instead of adding a new device here?

Thanks,
Zaibo

Baolu Lu Oct. 15, 2018, 2:48 a.m. UTC | #2
Hi,

On 10/13/2018 04:25 PM, Xu Zaibo wrote:
> I just cannot understand this: how do I get an iommu_device while I
> create a mediated device in my parent device driver?

When you are creating an mdev in your parent driver, you should know
which PCI device this mdev belongs to.

> 
> And why not reuse the device of MDEV instead of adding a new device here?

iommu_device in the mdev_device structure represents the PCI device
that represents this mdev in iommu's device scope. The IOMMU is only
aware of PCI devices; it is not aware of mdev devices.

Best regards,
Lu Baolu
Xu Zaibo Oct. 15, 2018, 8:50 a.m. UTC | #3
Hi,

On 2018/10/15 10:48, Lu Baolu wrote:
>> I just cannot understand this: how do I get an iommu_device while I
>> create a mediated device in my parent device driver?
>
> When you are creating an mdev in your parent driver, you should know
> which PCI device this mdev belongs to.
>

So, generally, I can set the parent device as the mdev's iommu_device?
If so, however, the mdev already holds its parent device. So I am just
trying to figure out what the differences between the mdev's parent
device and iommu_device are.
>>
>> And why not reuse the device of MDEV instead of adding a new device
>> here?
>
> iommu_device in the mdev_device structure represents the PCI device
> that represents this mdev in iommu's device scope. The IOMMU is only
> aware of PCI devices; it is not aware of mdev devices.

Could I understand it like this: the IOMMU can be aware of the parent
device of the mdev?
Moreover, I doubt the necessity of iommu_device in the mdev.

Thanks,
Zaibo

Baolu Lu Oct. 16, 2018, 1:21 a.m. UTC | #4
Hi,

On 10/15/2018 04:50 PM, Xu Zaibo wrote:
> So, generally, I can set the parent device as the mdev's iommu_device?
> If so, however, the mdev already holds its parent device. So I am just
> trying to figure out what the differences between the mdev's parent
> device and iommu_device are.
>>>
>>> And why not reuse the device of MDEV instead of adding a new device
>>> here?
>>
>> iommu_device in the mdev_device structure represents the PCI device
>> that represents this mdev in iommu's device scope. The IOMMU is only
>> aware of PCI devices; it is not aware of mdev devices.
>
> Could I understand it like this: the IOMMU can be aware of the parent
> device of the mdev?
> Moreover, I doubt the necessity of iommu_device in the mdev.
> 

The "mdev parent device" and "mdev iommu device" are different although
they might be the same in practice. "mdev parent device" represents the
device that created the mdev. "mdev iommu device" represents the device
that shares the device context entry in the iommu tables.

"mdev iommu device" is always a PCI/PCIe device since the IOMMU always
uses the source id (bus:dev:func) to walk the device context table. But
there is no limitation on who can create an mdev, right?
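
For reference, the source id mentioned above is the usual PCI
bus:dev:func triple packed into 16 bits (8-bit bus, 5-bit device,
3-bit function). A small userspace sketch shows the distinction
between the two devices. pci_source_id(), struct mdev_model and
mdev_model_source_id() are hypothetical names used for illustration
only, not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pci_addr { uint8_t bus, dev, func; };

/* Pack bus:dev:func into the 16-bit id used for the context lookup. */
uint16_t pci_source_id(struct pci_addr a)
{
	assert(a.dev < 32 && a.func < 8);
	return (uint16_t)(((unsigned int)a.bus << 8) |
			  ((unsigned int)a.dev << 3) | a.func);
}

struct mdev_model {
	const char *parent_name;          /* whoever created the mdev */
	const struct pci_addr *iommu_dev; /* NULL: vendor defined isolation */
};

/* Returns 0 and stores the source id, or -1 if no iommu device is set. */
int mdev_model_source_id(const struct mdev_model *m, uint16_t *sid)
{
	assert(m != NULL && sid != NULL);
	if (!m->iommu_dev)
		return -1;
	*sid = pci_source_id(*m->iommu_dev);
	return 0;
}
```

The parent is only a name here because nothing in the lookup depends
on it; only the iommu device needs a bus:dev:func address.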

Best regards,
Lu Baolu
Xu Zaibo Oct. 17, 2018, 2:02 a.m. UTC | #5
Hi,

On 2018/10/16 9:21, Lu Baolu wrote:
> The "mdev parent device" and "mdev iommu device" are different although
> they might be the same in practice. "mdev parent device" represents the
> device that created the mdev. "mdev iommu device" represents the device
> that shares the device context entry in the iommu tables.
>
> "mdev iommu device" is always a PCI/PCIe device since the IOMMU always
> uses the source id (bus:dev:func) to walk the device context table. But
> there is no limitation on who can create an mdev, right?
>
Actually, I am not sure.

My understanding:
The DMA address will be issued by the parent device, with a PASID or
something like that, to the IOMMU facilities. However, the translation
structures, such as the iommu (PASID/page etc.) tables, are from another
device node. I cannot figure out how to control this at the hardware
level, or whether there will be conflicts between the DMA translation of
the iommu_device and the parent device.

Thanks,
Zaibo
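
The lookup order discussed above (source id first, then PASID) can be sketched as a small userspace model. This is only an illustration under stated assumptions: the names model_context and model_lookup_domain and the table size are hypothetical, and the flat array does not reflect the real VT-d table layout. It shows why two mdevs behind the same PCI function need not conflict, as long as each one tags its DMA with its own PASID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PASID 16 /* arbitrary small size, an assumption of this model */

/* 16-bit PCI source id: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
static uint16_t pci_source_id(uint8_t bus, uint8_t dev, uint8_t func)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (func & 0x7));
}

/* Hypothetical per-device context: one domain id per PASID, 0 means "not set". */
struct model_context {
    uint16_t source_id;
    int pasid_domain[MODEL_MAX_PASID];
};

/*
 * Find the context by source id, then pick the domain by PASID.
 * Returns the domain id, or -1 if no context or no domain matches.
 */
static int model_lookup_domain(const struct model_context *ctx, size_t count,
                               uint16_t source_id, uint32_t pasid)
{
    assert(ctx != NULL);
    for (size_t i = 0; i < count; i++) {
        if (ctx[i].source_id != source_id)
            continue;
        if (pasid >= MODEL_MAX_PASID || ctx[i].pasid_domain[pasid] == 0)
            return -1;
        return ctx[i].pasid_domain[pasid];
    }
    return -1;
}
```

In this model, DMA from the same bus:dev:func resolves to different domains when the PASID differs, and DMA with an unknown source id or PASID resolves to nothing.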

Baolu Lu Oct. 17, 2018, 2:10 a.m. UTC | #6
Hi,

On 10/17/18 10:02 AM, Xu Zaibo wrote:
> Hi,
> 
> On 2018/10/16 9:21, Lu Baolu wrote:
>> Hi,
>>
>> On 10/15/2018 04:50 PM, Xu Zaibo wrote:
>>> Hi,
>>>
>>> On 2018/10/15 10:48, Lu Baolu wrote:
>>>> Hi,
>>>>
>>>> On 10/13/2018 04:25 PM, Xu Zaibo wrote:
>>>>> Hi,
>>>>>
>>>>> On 2018/10/12 13:16, Lu Baolu wrote:
>>>>>> Hi,
>>>>>>
>>>>>> The Mediated Device is a framework for fine-grained physical device
>>>>>> sharing across the isolated domains. Currently the mdev framework
>>>>>> is designed to be independent of the platform IOMMU support. As a
>>>>>> result, the DMA isolation relies on the mdev parent device in a
>>>>>> vendor specific way.
>>>>>>
>>>>>> There are several cases where a mediated device could be protected
>>>>>> and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
>>>>>> [1] introduces a new translation mode called 'scalable mode', which
>>>>>> enables PASID-granular translations. The vt-d scalable mode is the
>>>>>> key ingredient for Scalable I/O Virtualization [2] [3] which allows
>>>>>> sharing a device in minimal possible granularity (ADI - Assignable
>>>>>> Device Interface).
>>>>>>
>>>>>> A mediated device backed by an ADI could be protected and isolated
>>>>>> by the IOMMU since 1) the parent device supports tagging a unique
>>>>>> PASID to all DMA traffic out of the mediated device; and 2) the DMA
>>>>>> translation unit (IOMMU) supports the PASID granular translation.
>>>>>> We can apply IOMMU protection and isolation to this kind of device
>>>>>> just as we do with an assignable PCI device.
>>>>>>
>>>>>> In order to distinguish the IOMMU-capable mediated devices from those
>>>>>> which still need to rely on parent devices, this patch set adds two
>>>>>> new members in struct mdev_device.
>>>>>>
>>>>>> * iommu_device
>>>>>>    - This, if set, indicates that the mediated device could
>>>>>>      be fully isolated and protected by IOMMU via attaching
>>>>>>      an iommu domain to this device. If empty, it indicates
>>>>>>      using vendor defined isolation.
>>>>>>
>>>>>> * iommu_domain
>>>>>>    - This is a place holder for an iommu domain. A domain
>>>>>>      could be stored here for later use once it has been
>>>>>>      attached to the iommu_device of this mdev.
>>>>>>
>>>>>> The helpers below are added to set and get the above iommu device
>>>>>> and iommu domain pointers in mdev core implementation.
>>>>>>
>>>>>> * mdev_set/get_iommu_device(dev, iommu_device)
>>>>>>    - Set or get the iommu device which represents this mdev
>>>>>>      in IOMMU's device scope. Drivers don't need to set the
>>>>>>      iommu device if it uses vendor defined isolation.
>>>>>>
>>>>>> * mdev_set/get_iommu_domain(domain)
>>>>>>    - An iommu domain which has been attached to the iommu
>>>>>>      device in order to protect and isolate the mediated
>>>>>>      device will be kept in the mdev data structure and
>>>>>>      could be retrieved later.
>>>>>>
>>>>>> The mdev parent device driver could opt in to having the mdev
>>>>>> fully isolated and protected by the IOMMU when the mdev is being
>>>>>> created, by invoking mdev_set_iommu_device() in its @create().
>>>>> I just cannot understand this: how do I get an iommu_device while
>>>>> creating a mediated device in my parent device driver?
>>>>
>>>> When you are creating an mdev in your parent driver, you should know
>>>> which PCI device this mdev belongs to.
>>>>
>>>
>>> So, generally, I can set the parent device as the mdev's iommu_device?
>>> If so, the mdev already holds its parent device, so I am trying to
>>> figure out what the differences between the mdev's parent device and
>>> its iommu_device are.
>>>>>
>>>>> And why not reuse the device of the mdev instead of adding a new
>>>>> device here?
>>>>
>>>> iommu_device in the mdev_device structure is the PCI device that
>>>> represents this mdev in the IOMMU's device scope. The IOMMU is only
>>>> aware of PCI devices; it is not aware of mdev devices.
>>>
>>> Could I understand it like this: the IOMMU can be aware of the parent
>>> device of the mdev?
>>> Moreover, I doubt the necessity of iommu_device in the mdev.
>>>
>>
>> The "mdev parent device" and "mdev iommu device" are different, although
>> they might be the same in practice. "mdev parent device" represents the
>> device that created the mdev. "mdev iommu device" represents the device
>> that shares the device context entry in the IOMMU tables.
>>
>> "mdev iommu device" is always a PCI/PCIe device since the IOMMU always
>> uses the source id (bus:dev:func) to walk the device context table. But
>> there is no limitation on who can create an mdev, right?
>>
> Actually, I am not sure.
> 
> My understanding:
> The DMA address will be issued by the parent device, with a PASID or
> something like that, to the IOMMU facilities. However, the translation
> units such as the IOMMU tables (PASID, page, etc.) come from another
> device node. I cannot figure out how to control this at the hardware
> level, or whether there will be conflicts between the DMA translation of
> the iommu_device and the parent device.


Yes. That's the reason why these two devices are the same in practice. But
conceptually, they might be different.

Best regards,
Lu Baolu
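
To make the parent device versus iommu device distinction concrete, here is a minimal userspace sketch of the two pointers. It is not the kernel code from this series: struct model_mdev and every model_* function are hypothetical stand-ins, and the real helpers are described above as taking a device pointer rather than this simplified structure. The sketch assumes the common case from this reply, where the parent driver passes its own device as the iommu device in its create callback.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only pointer identity matters here. */
struct device { const char *name; };
struct iommu_domain { int id; };

/* Hypothetical model of an mdev carrying the two new members. */
struct model_mdev {
    struct device *parent;             /* device that created the mdev */
    struct device *iommu_device;       /* NULL means vendor defined isolation */
    struct iommu_domain *iommu_domain; /* domain attached to iommu_device, if any */
};

/* Model of the "set" helper: record which device represents the mdev to the IOMMU. */
static void model_set_iommu_device(struct model_mdev *mdev, struct device *iommu_device)
{
    assert(mdev != NULL);
    mdev->iommu_device = iommu_device;
}

/* Model of the "get" helper. */
static struct device *model_get_iommu_device(const struct model_mdev *mdev)
{
    assert(mdev != NULL);
    return mdev->iommu_device;
}

/* Hypothetical create callback: the parent opts in by naming itself. */
static int model_parent_create(struct model_mdev *mdev, struct device *parent)
{
    mdev->parent = parent;
    mdev->iommu_device = NULL;
    mdev->iommu_domain = NULL;
    model_set_iommu_device(mdev, parent);
    return 0;
}

/* A consumer would pick IOMMU-based isolation only when the pointer is set. */
static int model_uses_iommu_isolation(const struct model_mdev *mdev)
{
    return model_get_iommu_device(mdev) != NULL;
}
```

Here the two pointers end up equal, which matches "the same in practice". Keeping them as separate members leaves room for a parent whose mdevs are represented to the IOMMU by a different PCI device.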