
[v3,0/6] vfio/hisilicon: add acc live migration driver

Message ID 20210915095037.1149-1-shameerali.kolothum.thodi@huawei.com (mailing list archive)

Message

Shameerali Kolothum Thodi Sept. 15, 2021, 9:50 a.m. UTC
Hi,

Thanks to the introduction of the vfio_pci_core subsystem framework[0],
it is now possible to provide vendor-specific functionality to
vfio-pci devices. This series attempts to add vfio live migration
support for HiSilicon ACC VF devices based on the new framework.

HiSilicon ACC VF device MMIO space includes both the functional
register space and migration control register space. As discussed
in RFCv1[1], this may create security issues as these regions get
shared between the Guest driver and the migration driver.
Based on the feedback, we tried to address those concerns in
this version. 

This is now sanity tested on HiSilicon platforms that support these
ACC devices.

Thanks,
Shameer

[0] https://lore.kernel.org/kvm/20210826103912.128972-1-yishaih@nvidia.com/
[1] https://lore.kernel.org/lkml/20210415220137.GA1672608@nvidia.com/

Change History:

RFC v2 --> v3
 -Dropped RFC tag as the vfio_pci_core subsystem framework is now 
  part of 5.15-rc1.
 -Added override methods for the vfio_device_ops read/write/mmap calls
  to limit access to the functional register space.
 -Patches 1 to 3 are code refactoring to move the common ACC QM 
  definitions and header around.
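The access-limiting check added to the read/write/mmap overrides can be sketched as a small toy model. The BAR layout, sizes, and names below are invented purely for illustration; the real driver's offsets differ:

```python
# Toy model of the "limit access to the functional register space" check.
# Hypothetical layout: BAR2 is 64 KiB, with the migration control
# registers occupying the upper 32 KiB.
BAR2_SIZE = 0x10000
MIG_REGION_OFFSET = 0x8000  # start of the migration control space


def access_allowed(offset: int, count: int) -> bool:
    """Return True if a guest read/write stays in the functional space."""
    if offset < 0 or count <= 0 or offset + count > BAR2_SIZE:
        return False  # outside the BAR entirely
    return offset + count <= MIG_REGION_OFFSET  # must not touch mig space


assert access_allowed(0x0, 0x1000)               # functional space: ok
assert access_allowed(0x7FF0, 0x10)              # ends exactly at boundary
assert not access_allowed(0x7FF0, 0x20)          # crosses into mig space
assert not access_allowed(MIG_REGION_OFFSET, 4)  # migration space: denied
```

The real override would perform this check before delegating to the vfio-pci-core default handlers.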

RFCv1 --> RFCv2

 -Adds a new vendor-specific vfio_pci driver (hisi-acc-vfio-pci)
  for HiSilicon ACC VF devices based on the new vfio-pci-core
  framework proposal.

 -Since HiSilicon ACC VF device MMIO space contains both the
  functional register space and migration control register space,
  override the vfio_device_ops ioctl method to report only the
  functional space to VMs.

 -For a successful migration, we still need access to VF dev
  functional register space mainly to read the status registers.
  But accessing these while the Guest vCPUs are running may leave
  a security hole. To avoid any potential security issues, we
  map/unmap the MMIO regions on an as-needed basis, which is safe to do.
  (Please see hisi_acc_vf_ioremap/unmap() fns in patch #4).
 
 -Dropped debugfs support for now.
 -Uses common QM functions for mailbox access (patch #3).
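The map/unmap-on-demand idea from the hisi_acc_vf_ioremap/unmap() functions can be sketched as a toy model; the class and register contents below are invented for illustration and only mimic the lifetime pattern, not the driver's actual code:

```python
# Toy model of map-on-demand: the migration driver maps the VF
# functional space only for the duration of a status read, instead of
# holding a mapping open while guest vCPUs are running.
class VfMapping:
    def __init__(self, hw_regs):
        self._hw_regs = hw_regs  # stands in for the device MMIO
        self.base = None         # None while unmapped

    def ioremap(self):
        self.base = self._hw_regs    # real driver: ioremap()

    def iounmap(self):
        self.base = None             # real driver: iounmap()

    def read_status(self, idx):
        """Read one status register under a short-lived mapping."""
        self.ioremap()
        try:
            return self.base[idx]
        finally:
            self.iounmap()           # window is closed again immediately


m = VfMapping([0xDEAD, 0x1, 0x2])
assert m.read_status(0) == 0xDEAD
assert m.base is None  # no mapping left behind after the read
```

Keeping the mapping window this short is what limits the exposure of the functional register space while the guest owns the device.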

Longfang Liu (2):
  crypto: hisilicon/qm: Move few definitions to common header
  hisi_acc_vfio_pci: Add support for VFIO live migration

Shameer Kolothum (4):
  crypto: hisilicon/qm: Move the QM header to include/linux
  hisi_acc_qm: Move PCI device IDs to common header
  hisi-acc-vfio-pci: add new vfio_pci driver for HiSilicon ACC devices
  hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region

 drivers/crypto/hisilicon/hpre/hpre.h          |    2 +-
 drivers/crypto/hisilicon/hpre/hpre_main.c     |   12 +-
 drivers/crypto/hisilicon/qm.c                 |   34 +-
 drivers/crypto/hisilicon/sec2/sec.h           |    2 +-
 drivers/crypto/hisilicon/sec2/sec_main.c      |    2 -
 drivers/crypto/hisilicon/sgl.c                |    2 +-
 drivers/crypto/hisilicon/zip/zip.h            |    2 +-
 drivers/crypto/hisilicon/zip/zip_main.c       |   11 +-
 drivers/vfio/pci/Kconfig                      |   13 +
 drivers/vfio/pci/Makefile                     |    3 +
 drivers/vfio/pci/hisi_acc_vfio_pci.c          | 1217 +++++++++++++++++
 drivers/vfio/pci/hisi_acc_vfio_pci.h          |  117 ++
 .../qm.h => include/linux/hisi_acc_qm.h       |   45 +
 13 files changed, 1414 insertions(+), 48 deletions(-)
 create mode 100644 drivers/vfio/pci/hisi_acc_vfio_pci.c
 create mode 100644 drivers/vfio/pci/hisi_acc_vfio_pci.h
 rename drivers/crypto/hisilicon/qm.h => include/linux/hisi_acc_qm.h (88%)

Comments

Tian, Kevin Sept. 29, 2021, 3:57 a.m. UTC | #1
Hi, Shameer,

> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Sent: Wednesday, September 15, 2021 5:51 PM
> 
> Hi,
> 
> Thanks to the introduction of vfio_pci_core subsystem framework[0],
> now it is possible to provide vendor specific functionality to
> vfio pci devices. This series attempts to add vfio live migration
> support for HiSilicon ACC VF devices based on the new framework.
> 
> HiSilicon ACC VF device MMIO space includes both the functional
> register space and migration control register space. As discussed
> in RFCv1[1], this may create security issues as these regions get
> shared between the Guest driver and the migration driver.
> Based on the feedback, we tried to address those concerns in
> this version.

This series doesn't mention anything related to dirty page tracking.
Are you relying on Keqian's series for utilizing the hardware IOMMU dirty
bit (e.g. SMMU HTTU)? If not, I suppose some software trick is still
required, e.g. dynamically mediating guest-submitted descriptors
to compose the dirty page information in the migration phase...

Thanks
Kevin
Shameerali Kolothum Thodi Sept. 29, 2021, 8:34 a.m. UTC | #2
Hi Kevin,

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 29 September 2021 04:58
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-crypto@vger.kernel.org
> Cc: alex.williamson@redhat.com; jgg@nvidia.com; mgurtovoy@nvidia.com;
> Linuxarm <linuxarm@huawei.com>; liulongfang <liulongfang@huawei.com>;
> Zengtao (B) <prime.zeng@hisilicon.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; He, Shaopeng <shaopeng.he@intel.com>; Zhao,
> Yan Y <yan.y.zhao@intel.com>
> Subject: RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver
> 
> Hi, Shameer,
> 
> > From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > Sent: Wednesday, September 15, 2021 5:51 PM
> >
> > Hi,
> >
> > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > now it is possible to provide vendor specific functionality to
> > vfio pci devices. This series attempts to add vfio live migration
> > support for HiSilicon ACC VF devices based on the new framework.
> >
> > HiSilicon ACC VF device MMIO space includes both the functional
> > register space and migration control register space. As discussed
> > in RFCv1[1], this may create security issues as these regions get
> > shared between the Guest driver and the migration driver.
> > Based on the feedback, we tried to address those concerns in
> > this version.
> 
> This series doesn't mention anything related to dirty page tracking.
> Are you rely on Keqian's series for utilizing hardware iommu dirty
> bit (e.g. SMMU HTTU)? 

Yes, this doesn't have dirty page tracking and the plan is to make use of
Keqian's SMMU HTTU work to improve performance. We have done basic
sanity testing with those patches.  

Thanks,
Shameer

> If not, suppose some software trick is still
> required e.g. by dynamically mediating guest-submitted descriptors
> to compose the dirty page information in the migration phase...
> 
> Thanks
> Kevin
Tian, Kevin Sept. 29, 2021, 9:05 a.m. UTC | #3
> From: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>
> 
> Hi Kevin,
> 
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 29 September 2021 04:58
> >
> > Hi, Shameer,
> >
> > > From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > Sent: Wednesday, September 15, 2021 5:51 PM
> > >
> > > Hi,
> > >
> > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > now it is possible to provide vendor specific functionality to
> > > vfio pci devices. This series attempts to add vfio live migration
> > > support for HiSilicon ACC VF devices based on the new framework.
> > >
> > > HiSilicon ACC VF device MMIO space includes both the functional
> > > register space and migration control register space. As discussed
> > > in RFCv1[1], this may create security issues as these regions get
> > > shared between the Guest driver and the migration driver.
> > > Based on the feedback, we tried to address those concerns in
> > > this version.
> >
> > This series doesn't mention anything related to dirty page tracking.
> > Are you rely on Keqian's series for utilizing hardware iommu dirty
> > bit (e.g. SMMU HTTU)?
> 
> Yes, this doesn't have dirty page tracking and the plan is to make use of
> Keqian's SMMU HTTU work to improve performance. We have done basic
> sanity testing with those patches.
> 

Do you plan to support migration w/o HTTU as the fallback option?
Generally one would expect the basic functionality to be ready before
talking about optimization.

If not, how does userspace know that migration of a given device can
be allowed only when the IOMMU supports a hardware dirty bit?

Thanks
Kevin
Shameerali Kolothum Thodi Sept. 29, 2021, 9:16 a.m. UTC | #4
> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 29 September 2021 10:06
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-crypto@vger.kernel.org
> Cc: alex.williamson@redhat.com; jgg@nvidia.com; mgurtovoy@nvidia.com;
> Linuxarm <linuxarm@huawei.com>; liulongfang <liulongfang@huawei.com>;
> Zengtao (B) <prime.zeng@hisilicon.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; He, Shaopeng <shaopeng.he@intel.com>; Zhao,
> Yan Y <yan.y.zhao@intel.com>
> Subject: RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver
> 
> > From: Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>
> >
> > Hi Kevin,
> >
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 29 September 2021 04:58
> > >
> > > Hi, Shameer,
> > >
> > > > From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > >
> > > > Hi,
> > > >
> > > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > > now it is possible to provide vendor specific functionality to
> > > > vfio pci devices. This series attempts to add vfio live migration
> > > > support for HiSilicon ACC VF devices based on the new framework.
> > > >
> > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > register space and migration control register space. As discussed
> > > > in RFCv1[1], this may create security issues as these regions get
> > > > shared between the Guest driver and the migration driver.
> > > > Based on the feedback, we tried to address those concerns in
> > > > this version.
> > >
> > > This series doesn't mention anything related to dirty page tracking.
> > > Are you rely on Keqian's series for utilizing hardware iommu dirty
> > > bit (e.g. SMMU HTTU)?
> >
> > Yes, this doesn't have dirty page tracking and the plan is to make use of
> > Keqian's SMMU HTTU work to improve performance. We have done basic
> > sanity testing with those patches.
> >
> 
> Do you plan to support migration w/o HTTU as the fallback option?
> Generally one would expect the basic functionality ready before talking
> about optimization.

Yes, the plan is to get the basic live migration working and then we can optimize
it with SMMU HTTU when it is available.

> If not, how does userspace know that migration of a given device can
> be allowed only when the iommu supports hardware dirty bit?

Yes, it would be nice to have that info available, I guess. But these
are integrated endpoint devices, and the platforms that support them
will have SMMUv3 with HTTU.

Thanks,
Shameer

> Thanks
> Kevin
Tian, Kevin Sept. 30, 2021, 12:42 a.m. UTC | #5
> From: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>
> 
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 29 September 2021 10:06
> >
> > > From: Shameerali Kolothum Thodi
> > > <shameerali.kolothum.thodi@huawei.com>
> > >
> > > Hi Kevin,
> > >
> > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > Sent: 29 September 2021 04:58
> > > >
> > > > Hi, Shameer,
> > > >
> > > > > From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > > > now it is possible to provide vendor specific functionality to
> > > > > vfio pci devices. This series attempts to add vfio live migration
> > > > > support for HiSilicon ACC VF devices based on the new framework.
> > > > >
> > > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > > register space and migration control register space. As discussed
> > > > > in RFCv1[1], this may create security issues as these regions get
> > > > > shared between the Guest driver and the migration driver.
> > > > > Based on the feedback, we tried to address those concerns in
> > > > > this version.
> > > >
> > > > This series doesn't mention anything related to dirty page tracking.
> > > > Are you rely on Keqian's series for utilizing hardware iommu dirty
> > > > bit (e.g. SMMU HTTU)?
> > >
> > > Yes, this doesn't have dirty page tracking and the plan is to make use of
> > > Keqian's SMMU HTTU work to improve performance. We have done basic
> > > sanity testing with those patches.
> > >
> >
> > Do you plan to support migration w/o HTTU as the fallback option?
> > Generally one would expect the basic functionality ready before talking
> > about optimization.
> 
> Yes, the plan is to get the basic live migration working and then we can
> optimize
> it with SMMU HTTU when it is available.

The interesting thing is that w/o HTTU vfio will just report every pinned
page as dirty, i.e. the entire guest memory is dirty. This completely kills
the benefit of the precopy phase, since Qemu still needs to transfer the
entire guest memory in the stop-copy phase. This is not a 'working' model
for live migration.
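The effect can be sketched with a small toy model (invented page counts; this is not vfio's actual dirty-bitmap API):

```python
# Without hardware dirty tracking, vfio must assume every pinned page
# may have been written by the device, so the reported bitmap is fully
# set and precopy transfers nothing useful ahead of stop-copy.
def dirty_bitmap(num_pages, hw_dirty_pages, have_httu):
    if not have_httu:
        return [True] * num_pages          # everything reported dirty
    return [i in hw_dirty_pages for i in range(num_pages)]


hw_dirty = {1, 5}  # hardware observed writes to two pages only
assert sum(dirty_bitmap(8, hw_dirty, have_httu=False)) == 8  # all dirty
assert sum(dirty_bitmap(8, hw_dirty, have_httu=True)) == 2   # real subset
```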

So it needs to be clear whether HTTU is really an optimization or
a hard functional requirement for migrating such a device. If the latter,
the migration region info is not a nice-to-have thing.

btw the fallback option that I raised earlier is more like some software
mitigation for collecting dirty pages, e.g. analyzing the ring descriptors
to build software-tracked dirty info by mediating the cmd portal
(which requires dynamically unmapping the cmd portal from the fast path
to enable mediation). We are looking into this option for some platforms
which lack IOMMU dirty-bit support.
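The descriptor-mediation fallback could look roughly like the toy model below; the descriptor layout (a destination IOVA plus length) is invented purely for illustration:

```python
# Toy model of the software fallback: with the command portal mediated,
# the host parses guest-submitted descriptors and records which guest
# pages the device will write, instead of using an IOMMU dirty bit.
PAGE_SHIFT = 12


def track_descriptor(dst_iova, length, dirty):
    """Mark every page the device may write for this descriptor."""
    first = dst_iova >> PAGE_SHIFT
    last = (dst_iova + length - 1) >> PAGE_SHIFT
    for pfn in range(first, last + 1):
        dirty.add(pfn)


dirty = set()
track_descriptor(0x1000, 0x100, dirty)   # falls entirely in page 1
track_descriptor(0x2F00, 0x200, dirty)   # straddles pages 2 and 3
assert dirty == {1, 2, 3}
```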

Thanks
Kevin
Shameerali Kolothum Thodi Sept. 30, 2021, 6:34 a.m. UTC | #6
> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 30 September 2021 01:42
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-crypto@vger.kernel.org
> Cc: alex.williamson@redhat.com; jgg@nvidia.com; mgurtovoy@nvidia.com;
> Linuxarm <linuxarm@huawei.com>; liulongfang <liulongfang@huawei.com>;
> Zengtao (B) <prime.zeng@hisilicon.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; He, Shaopeng <shaopeng.he@intel.com>; Zhao,
> Yan Y <yan.y.zhao@intel.com>
> Subject: RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver
> 
> > From: Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>
> >
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 29 September 2021 10:06
> > >
> > > > From: Shameerali Kolothum Thodi
> > > > <shameerali.kolothum.thodi@huawei.com>
> > > >
> > > > Hi Kevin,
> > > >
> > > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > > Sent: 29 September 2021 04:58
> > > > >
> > > > > Hi, Shameer,
> > > > >
> > > > > > From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > > > > now it is possible to provide vendor specific functionality to
> > > > > > vfio pci devices. This series attempts to add vfio live migration
> > > > > > support for HiSilicon ACC VF devices based on the new framework.
> > > > > >
> > > > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > > > register space and migration control register space. As discussed
> > > > > > in RFCv1[1], this may create security issues as these regions get
> > > > > > shared between the Guest driver and the migration driver.
> > > > > > Based on the feedback, we tried to address those concerns in
> > > > > > this version.
> > > > >
> > > > > This series doesn't mention anything related to dirty page tracking.
> > > > > Are you rely on Keqian's series for utilizing hardware iommu dirty
> > > > > bit (e.g. SMMU HTTU)?
> > > >
> > > > Yes, this doesn't have dirty page tracking and the plan is to make use of
> > > > Keqian's SMMU HTTU work to improve performance. We have done
> basic
> > > > sanity testing with those patches.
> > > >
> > >
> > > Do you plan to support migration w/o HTTU as the fallback option?
> > > Generally one would expect the basic functionality ready before talking
> > > about optimization.
> >
> > Yes, the plan is to get the basic live migration working and then we can
> > optimize
> > it with SMMU HTTU when it is available.
> 
> The interesting thing is that w/o HTTU vfio will just report every pinned
> page as dirty, i.e. the entire guest memory is dirty. This completely kills
> the benefit of precopy phase since Qemu still needs to transfer the entire
> guest memory in the stop-copy phase. This is not a 'working' model for
> live migration.
> 
> So it needs to be clear whether HTTU is really an optimization or
> a hard functional-requirement for migrating such device. If the latter
> the migration region info is not a nice-to-have thing.

Yes, agreed that we have to transfer the entire Guest memory in this case.
But I don't think that is a killer here, as we would still like to have
basic live migration enabled on these platforms, where it can be used
when the cost of the memory transfer is acceptable.
 
> btw the fallback option that I raised earlier is more like some software
> mitigation for collecting dirty pages, e.g. analyzing the ring descriptors
> to build software-tracked dirty info by mediating the cmd portal
> (which requires dynamically unmapping cmd portal from the fast-path
> to enable mediation). We are looking into this option for some platform
> which lacks of IOMMU dirty bit support.

Interesting. Is there anything available publicly so that we can take a look?

Thanks,
Shameer
Tian, Kevin Oct. 15, 2021, 5:53 a.m. UTC | #7
> From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Sent: Thursday, September 30, 2021 2:35 PM
> 
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 30 September 2021 01:42
> >
> > > From: Shameerali Kolothum Thodi
> > > <shameerali.kolothum.thodi@huawei.com>
> > >
> > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > Sent: 29 September 2021 10:06
> > > >
> > > > > From: Shameerali Kolothum Thodi
> > > > > <shameerali.kolothum.thodi@huawei.com>
> > > > >
> > > > > Hi Kevin,
> > > > >
> > > > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > > > Sent: 29 September 2021 04:58
> > > > > >
> > > > > > Hi, Shameer,
> > > > > >
> > > > > > > From: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > > > > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Thanks to the introduction of vfio_pci_core subsystem
> framework[0],
> > > > > > > now it is possible to provide vendor specific functionality to
> > > > > > > vfio pci devices. This series attempts to add vfio live migration
> > > > > > > support for HiSilicon ACC VF devices based on the new framework.
> > > > > > >
> > > > > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > > > > register space and migration control register space. As discussed
> > > > > > > in RFCv1[1], this may create security issues as these regions get
> > > > > > > shared between the Guest driver and the migration driver.
> > > > > > > Based on the feedback, we tried to address those concerns in
> > > > > > > this version.
> > > > > >
> > > > > > This series doesn't mention anything related to dirty page tracking.
> > > > > > Are you rely on Keqian's series for utilizing hardware iommu dirty
> > > > > > bit (e.g. SMMU HTTU)?
> > > > >
> > > > > Yes, this doesn't have dirty page tracking and the plan is to make use
> of
> > > > > Keqian's SMMU HTTU work to improve performance. We have done
> > basic
> > > > > sanity testing with those patches.
> > > > >
> > > >
> > > > Do you plan to support migration w/o HTTU as the fallback option?
> > > > Generally one would expect the basic functionality ready before talking
> > > > about optimization.
> > >
> > > Yes, the plan is to get the basic live migration working and then we can
> > > optimize
> > > it with SMMU HTTU when it is available.
> >
> > The interesting thing is that w/o HTTU vfio will just report every pinned
> > page as dirty, i.e. the entire guest memory is dirty. This completely kills
> > the benefit of precopy phase since Qemu still needs to transfer the entire
> > guest memory in the stop-copy phase. This is not a 'working' model for
> > live migration.
> >
> > So it needs to be clear whether HTTU is really an optimization or
> > a hard functional-requirement for migrating such device. If the latter
> > the migration region info is not a nice-to-have thing.
> 
> Yes, agree that we have to transfer the entire Guest memory in this case.
> But don't think that is a killer here as we would still like to have the
> basic live migration enabled on these platforms and can be used
> where the constraints of memory transfer is acceptable.
> 
> > btw the fallback option that I raised earlier is more like some software
> > mitigation for collecting dirty pages, e.g. analyzing the ring descriptors
> > to build software-tracked dirty info by mediating the cmd portal
> > (which requires dynamically unmapping cmd portal from the fast-path
> > to enable mediation). We are looking into this option for some platform
> > which lacks of IOMMU dirty bit support.
> 
> Interesting. Is there anything available publicly so that we can take a look?
> 

Not yet. We once had an implementation based on an old approach, before
vfio-pci-core was ready. Now I suppose it needs rework based on the new
framework.

+Shaopeng, who owns this work.

Thanks
Kevin