mbox series

[0/4] intel_iommu: Reset vIOMMU after all the rest of devices

Message ID 20240117091559.144730-1-peterx@redhat.com (mailing list archive)
Headers show
Series intel_iommu: Reset vIOMMU after all the rest of devices | expand

Message

Peter Xu Jan. 17, 2024, 9:15 a.m. UTC
From: Peter Xu <peterx@redhat.com>

There're issue reported that when syetem_reset the VM with an intel iommu
device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error.

  https://issues.redhat.com/browse/RHEL-7188

Alex quickly spot a possible issue on ordering of device resets.

It's verified by our QE team then that it is indeed the root cause of the
problem.  Consider when vIOMMU is reset before a VFIO device in a system
reset: the device can be doing DMAs even if the vIOMMU is gone; in this
specific context it means the shadow mapping can already be completely
destroyed.  Host will see these DMAs as malicious and report.

To fix it, we'll need to make sure all devices under the vIOMMU device
hierachy will be reset before the vIOMMU itself.  There's plenty of trick
inside, one can get those by reading the last patch.

I didn't check other vIOMMUs, but this series should fix the issue for VT-d
as of now.  The solution can be slightly ugly, but a beautiful one can be
very non-trivial.

Review comments welcomed, thanks.

Peter Xu (4):
  reset: qemu_register_reset_one()
  reset: Allow multiple stages of system resets
  intel_iommu: Tear down address spaces before IOMMU reset
  intel_iommu: Reset vIOMMU at the last stage of system reset

 include/sysemu/reset.h |  5 ++++
 hw/core/reset.c        | 67 ++++++++++++++++++++++++++++++------------
 hw/i386/intel_iommu.c  | 56 +++++++++++++++++++++++++++++++++--
 3 files changed, 107 insertions(+), 21 deletions(-)

Comments

Eric Auger Jan. 17, 2024, 10:29 a.m. UTC | #1
Hi Peter,

On 1/17/24 10:15, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
>
> There're issue reported that when syetem_reset the VM with an intel iommu
system_reset
> device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error.
>
>   https://issues.redhat.com/browse/RHEL-7188
>
> Alex quickly spot a possible issue on ordering of device resets.
>
> It's verified by our QE team then that it is indeed the root cause of the
> problem.  Consider when vIOMMU is reset before a VFIO device in a system
> reset: the device can be doing DMAs even if the vIOMMU is gone; in this
> specific context it means the shadow mapping can already be completely
> destroyed.  Host will see these DMAs as malicious and report.
That's curious we did not get this earlier?
>
> To fix it, we'll need to make sure all devices under the vIOMMU device
> hierachy will be reset before the vIOMMU itself.  There's plenty of trick
> inside, one can get those by reading the last patch.
Not sure what you meant here ;-)
>
> I didn't check other vIOMMUs, but this series should fix the issue for VT-d
> as of now.  The solution can be slightly ugly, but a beautiful one can be
> very non-trivial.
>
> Review comments welcomed, thanks.
Thanks

Eric
>
> Peter Xu (4):
>   reset: qemu_register_reset_one()
>   reset: Allow multiple stages of system resets
>   intel_iommu: Tear down address spaces before IOMMU reset
>   intel_iommu: Reset vIOMMU at the last stage of system reset
>
>  include/sysemu/reset.h |  5 ++++
>  hw/core/reset.c        | 67 ++++++++++++++++++++++++++++++------------
>  hw/i386/intel_iommu.c  | 56 +++++++++++++++++++++++++++++++++--
>  3 files changed, 107 insertions(+), 21 deletions(-)
>
Peter Xu Jan. 19, 2024, 10:46 a.m. UTC | #2
On Wed, Jan 17, 2024 at 11:29:08AM +0100, Eric Auger wrote:
> Hi Peter,

Hi, Eric,

Thanks for the reviews!

> 
> On 1/17/24 10:15, peterx@redhat.com wrote:
> > From: Peter Xu <peterx@redhat.com>
> >
> > There're issue reported that when syetem_reset the VM with an intel iommu
> system_reset
> > device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error.
> >
> >   https://issues.redhat.com/browse/RHEL-7188
> >
> > Alex quickly spot a possible issue on ordering of device resets.
> >
> > It's verified by our QE team then that it is indeed the root cause of the
> > problem.  Consider when vIOMMU is reset before a VFIO device in a system
> > reset: the device can be doing DMAs even if the vIOMMU is gone; in this
> > specific context it means the shadow mapping can already be completely
> > destroyed.  Host will see these DMAs as malicious and report.
> That's curious we did not get this earlier?

I sincerely don't know.  It could be that we just didn't test much on
system resets. Or, we could have overlooked the host dmesgs; after all the
error messages can be benign from functional pov.

> >
> > To fix it, we'll need to make sure all devices under the vIOMMU device
> > hierachy will be reset before the vIOMMU itself.  There's plenty of trick
> > inside, one can get those by reading the last patch.
> Not sure what you meant here ;-)

I meant "how to make sure all the vIOMMU managed devices will be reset
before the vIOMMU" is tricky on the implementation.  I didn't reference any
of those in the cover letter, because I think I stated mostly in patch 4, I
want to reference that patch for the details.  Since I think it's very
tricky, I left that major comment in the code to persist.

Thanks,