Message ID | 20201222154338.9459-3-julien@xen.org (mailing list archive) |
---|---|
State | Superseded |
Series | xen/iommu: Collection of bug fixes for IOMMU teardown |
On 22.12.2020 16:43, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> The pgtables.lock is protecting access to the page list pgtables.list.
> However, iommu_free_pgtables() will not hold it. I guess it was assumed
> that page-tables cannot be allocated while the domain is dying.
>
> Unfortunately, there is no guarantee that iommu_map() will not be
> called while a domain is dying (it looks to be possible from
> XEN_DOMCTL_memory_mapping).

I'd rather disallow any new allocations for a dying domain, not
the least because ...

> So it would be possible to concurrently
> allocate memory and free the page-tables.
>
> Therefore, we need to hold the lock when freeing the page tables.

... we should try to avoid holding locks across allocation /
freeing functions wherever possible.

As to where to place a respective check - I wonder if we wouldn't
be better off disallowing a majority of domctl-s (and perhaps
other operations) on dying domains. Thoughts?

Jan
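The allocation-side check Jan suggests could look roughly like the sketch below. It is written against a simplified 4.15-era iommu_alloc_pgtable() (NUMA placement and the exact page-clearing details are elided), it is not the posted patch, and the follow-up message explains why a bare d->is_dying test on its own is still racy.

/* Sketch only: refuse to create new IOMMU page tables for a dying domain. */
struct page_info *iommu_alloc_pgtable(struct domain *d)
{
    struct domain_iommu *hd = dom_iommu(d);
    struct page_info *pg;

    if ( d->is_dying )              /* hypothetical early bail-out */
        return NULL;

    pg = alloc_domheap_page(NULL, 0);
    if ( !pg )
        return NULL;

    clear_domain_page(page_to_mfn(pg));

    spin_lock(&hd->arch.pgtables.lock);
    page_list_add(pg, &hd->arch.pgtables.list);
    spin_unlock(&hd->arch.pgtables.lock);

    return pg;
}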
Hi Jan,

On 23/12/2020 13:48, Jan Beulich wrote:
> On 22.12.2020 16:43, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> The pgtables.lock is protecting access to the page list pgtables.list.
>> However, iommu_free_pgtables() will not hold it. I guess it was assumed
>> that page-tables cannot be allocated while the domain is dying.
>>
>> Unfortunately, there is no guarantee that iommu_map() will not be
>> called while a domain is dying (it looks to be possible from
>> XEN_DOMCTL_memory_mapping).
>
> I'd rather disallow any new allocations for a dying domain, not
> the least because ...

Patch #4 will disallow such allocation. However...

>> So it would be possible to concurrently
>> allocate memory and free the page-tables.
>>
>> Therefore, we need to hold the lock when freeing the page tables.
>
> ... we should try to avoid holding locks across allocation /
> freeing functions wherever possible.
>
> As to where to place a respective check - I wonder if we wouldn't
> be better off disallowing a majority of domctl-s (and perhaps
> other operations) on dying domains. Thoughts?

... this is still pretty racy, because you need to guarantee that
d->is_dying is seen by the other processors to prevent allocation.

As to whether we can forbid most of the domctl-s, I would agree this
is a good move. But it doesn't remove the underlying problem here: we
would be relying on a top-level function to protect us against the
race. Given that the IOMMU code is quite deep in the call stack, this
is pretty hard to guarantee against future changes.

So I still think we need a way to mitigate the issue.

Cheers,
On 23.12.2020 15:01, Julien Grall wrote:
> Hi Jan,
>
> On 23/12/2020 13:48, Jan Beulich wrote:
>> On 22.12.2020 16:43, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> The pgtables.lock is protecting access to the page list pgtables.list.
>>> However, iommu_free_pgtables() will not hold it. I guess it was assumed
>>> that page-tables cannot be allocated while the domain is dying.
>>>
>>> Unfortunately, there is no guarantee that iommu_map() will not be
>>> called while a domain is dying (it looks to be possible from
>>> XEN_DOMCTL_memory_mapping).
>>
>> I'd rather disallow any new allocations for a dying domain, not
>> the least because ...
>
> Patch #4 will disallow such allocation. However...
>
>>> So it would be possible to concurrently
>>> allocate memory and free the page-tables.
>>>
>>> Therefore, we need to hold the lock when freeing the page tables.
>>
>> ... we should try to avoid holding locks across allocation /
>> freeing functions wherever possible.
>>
>> As to where to place a respective check - I wonder if we wouldn't
>> be better off disallowing a majority of domctl-s (and perhaps
>> other operations) on dying domains. Thoughts?
>
> ... this is still pretty racy, because you need to guarantee that
> d->is_dying is seen by the other processors to prevent allocation.

The function freeing the page tables will need a spin_barrier() or
the like, similar to evtchn_destroy(). Aiui this will eliminate all
potential for races.

Jan
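A sketch of the pattern Jan describes, modelled on evtchn_destroy(): the freeing side issues a spin_barrier() once the domain has been marked dying, and the allocation side re-checks d->is_dying under pgtables.lock. The helper name pgtables_track() is made up for illustration, and the preemption handling of the real iommu_free_pgtables() is elided.

/* Free side, called only after domain_kill() has set d->is_dying. */
int iommu_free_pgtables(struct domain *d)
{
    struct domain_iommu *hd = dom_iommu(d);
    struct page_info *pg;

    ASSERT(d->is_dying);

    /*
     * Wait for any CPU that entered a pgtables.lock critical section
     * before d->is_dying became visible; afterwards no new page can be
     * added to the list.
     */
    spin_barrier(&hd->arch.pgtables.lock);

    while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
        free_domheap_page(pg);

    return 0;
}

/* Allocation side: only track a new page table if the domain is alive. */
static bool pgtables_track(struct domain *d, struct page_info *pg)
{
    struct domain_iommu *hd = dom_iommu(d);
    bool ok = true;

    spin_lock(&hd->arch.pgtables.lock);
    if ( d->is_dying )
        ok = false;                 /* caller must free pg itself */
    else
        page_list_add(pg, &hd->arch.pgtables.list);
    spin_unlock(&hd->arch.pgtables.lock);

    return ok;
}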
On 23/12/2020 14:01, Julien Grall wrote:
> Hi Jan,
>
> On 23/12/2020 13:48, Jan Beulich wrote:
>> On 22.12.2020 16:43, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> The pgtables.lock is protecting access to the page list pgtables.list.
>>> However, iommu_free_pgtables() will not hold it. I guess it was assumed
>>> that page-tables cannot be allocated while the domain is dying.
>>>
>>> Unfortunately, there is no guarantee that iommu_map() will not be
>>> called while a domain is dying (it looks to be possible from
>>> XEN_DOMCTL_memory_mapping).
>>
>> I'd rather disallow any new allocations for a dying domain, not
>> the least because ...
>
> Patch #4 will disallow such allocation. However...

It turns out that I can't disallow it, because there is at least one
call of iommu_map() while an HVM domain is being destroyed:

(XEN) [<ffff82d04026e399>] R iommu_map+0xf2/0x171
(XEN) [<ffff82d04026e43e>] F iommu_legacy_map+0x26/0x63
(XEN) [<ffff82d040302a42>] F arch/x86/mm/p2m-ept.c#ept_set_entry+0x6b2/0x730
(XEN) [<ffff82d0402f761d>] F p2m_set_entry+0x91/0x128
(XEN) [<ffff82d0402f8643>] F guest_physmap_add_entry+0x3d7/0x4e0
(XEN) [<ffff82d0402f87cb>] F guest_physmap_add_page+0x7f/0xfe
(XEN) [<ffff82d0402ba0a2>] F arch/x86/hvm/ioreq.c#hvm_add_ioreq_gfn+0x6f/0x8d
(XEN) [<ffff82d0402ba0f9>] F arch/x86/hvm/ioreq.c#hvm_ioreq_server_disable+0x39/0x4f
(XEN) [<ffff82d0402bb50e>] F hvm_destroy_all_ioreq_servers+0x6f/0x9b
(XEN) [<ffff82d0402afc15>] F hvm_domain_relinquish_resources+0x3e/0x92
(XEN) [<ffff82d04031f278>] F domain_relinquish_resources+0x355/0x372
(XEN) [<ffff82d040205dd4>] F domain_kill+0xc7/0x150
(XEN) [<ffff82d04023baf0>] F do_domctl+0xba7/0x1a09
(XEN) [<ffff82d040312c8f>] F pv_hypercall+0x2f0/0x55f
(XEN) [<ffff82d040391432>] F lstar_enter+0x112/0x120

This one resulted in a domain crash rather than a clean destruction.

I would still like to disallow page-table allocation, so I am thinking
of ignoring any request in iommu_map() if the domain is dying.

Cheers,
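What "ignoring the request" could look like, as a sketch only: the check is shown at the top of iommu_map() with its 4.15-era prototype, the per-page mapping loop is elided, and returning 0 (success) for a dying domain is an assumption rather than the posted fix.

int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
              unsigned int page_order, unsigned int flags,
              unsigned int *flush_flags)
{
    if ( !is_iommu_enabled(d) )
        return 0;

    /*
     * Hypothetical: a dying domain no longer needs new DMA mappings, so
     * treat the request as a no-op instead of allocating page tables
     * that would race with iommu_free_pgtables().
     */
    if ( d->is_dying )
        return 0;

    /* ... existing per-page mapping loop ... */
}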
On 14.01.2021 20:19, Julien Grall wrote:
> On 23/12/2020 14:01, Julien Grall wrote:
>> Hi Jan,
>>
>> On 23/12/2020 13:48, Jan Beulich wrote:
>>> On 22.12.2020 16:43, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> The pgtables.lock is protecting access to the page list pgtables.list.
>>>> However, iommu_free_pgtables() will not hold it. I guess it was assumed
>>>> that page-tables cannot be allocated while the domain is dying.
>>>>
>>>> Unfortunately, there is no guarantee that iommu_map() will not be
>>>> called while a domain is dying (it looks to be possible from
>>>> XEN_DOMCTL_memory_mapping).
>>>
>>> I'd rather disallow any new allocations for a dying domain, not
>>> the least because ...
>>
>> Patch #4 will disallow such allocation. However...
>
> It turns out that I can't disallow it, because there is at least one
> call of iommu_map() while an HVM domain is being destroyed:
>
> (XEN) [<ffff82d04026e399>] R iommu_map+0xf2/0x171
> (XEN) [<ffff82d04026e43e>] F iommu_legacy_map+0x26/0x63
> (XEN) [<ffff82d040302a42>] F arch/x86/mm/p2m-ept.c#ept_set_entry+0x6b2/0x730
> (XEN) [<ffff82d0402f761d>] F p2m_set_entry+0x91/0x128
> (XEN) [<ffff82d0402f8643>] F guest_physmap_add_entry+0x3d7/0x4e0
> (XEN) [<ffff82d0402f87cb>] F guest_physmap_add_page+0x7f/0xfe
> (XEN) [<ffff82d0402ba0a2>] F arch/x86/hvm/ioreq.c#hvm_add_ioreq_gfn+0x6f/0x8d
> (XEN) [<ffff82d0402ba0f9>] F arch/x86/hvm/ioreq.c#hvm_ioreq_server_disable+0x39/0x4f
> (XEN) [<ffff82d0402bb50e>] F hvm_destroy_all_ioreq_servers+0x6f/0x9b
> (XEN) [<ffff82d0402afc15>] F hvm_domain_relinquish_resources+0x3e/0x92
> (XEN) [<ffff82d04031f278>] F domain_relinquish_resources+0x355/0x372
> (XEN) [<ffff82d040205dd4>] F domain_kill+0xc7/0x150
> (XEN) [<ffff82d04023baf0>] F do_domctl+0xba7/0x1a09
> (XEN) [<ffff82d040312c8f>] F pv_hypercall+0x2f0/0x55f
> (XEN) [<ffff82d040391432>] F lstar_enter+0x112/0x120
>
> This one resulted in a domain crash rather than a clean destruction.

A domain crash despite the domain already getting cleaned up is
something we may at least want to consider doing better: There
already is a !d->is_shutting_down conditional printk() there.
What would people think about avoiding the domain_crash() call
in similar ways? (It could even be considered to make
domain_crash() itself behave like this, but such a step may make
it necessary to first audit all use sites.)

> I would still like to disallow page-table allocation, so I am thinking
> of ignoring any request in iommu_map() if the domain is dying.

Ignoring requests there seems fragile to me. Paul - what are your
thoughts about bailing early from hvm_add_ioreq_gfn() when the
domain is dying?

Jan
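A sketch of the early bail-out Jan asks about, assuming hvm_add_ioreq_gfn() keeps roughly its shape from arch/x86/hvm/ioreq.c at the time of the trace above; the remainder of the function is elided and the placement of the check is an assumption.

static void hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
{
    struct domain *d = s->target;

    /*
     * Hypothetical: a dying domain will never be migrated or resumed, so
     * there is no point re-inserting the ioreq server page into the P2M
     * (the guest_physmap_add_page() call that ends up in iommu_map() in
     * the trace above).
     */
    if ( d->is_dying )
        return;

    /* ... existing code putting the ioreq page back into the guest P2M ... */
}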
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 15 January 2021 11:07
> To: Julien Grall <julien@xen.org>; Paul Durrant <paul@xen.org>
> Cc: hongyxia@amazon.co.uk; Julien Grall <jgrall@amazon.com>; xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH for-4.15 2/4] xen/iommu: x86: Free the IOMMU page-tables with the pgtables.lock held
>
> On 14.01.2021 20:19, Julien Grall wrote:
> > On 23/12/2020 14:01, Julien Grall wrote:
> >> Hi Jan,
> >>
> >> On 23/12/2020 13:48, Jan Beulich wrote:
> >>> On 22.12.2020 16:43, Julien Grall wrote:
> >>>> From: Julien Grall <jgrall@amazon.com>
> >>>>
> >>>> The pgtables.lock is protecting access to the page list pgtables.list.
> >>>> However, iommu_free_pgtables() will not hold it. I guess it was assumed
> >>>> that page-tables cannot be allocated while the domain is dying.
> >>>>
> >>>> Unfortunately, there is no guarantee that iommu_map() will not be
> >>>> called while a domain is dying (it looks to be possible from
> >>>> XEN_DOMCTL_memory_mapping).
> >>>
> >>> I'd rather disallow any new allocations for a dying domain, not
> >>> the least because ...
> >>
> >> Patch #4 will disallow such allocation. However...
> >
> > It turns out that I can't disallow it, because there is at least one
> > call of iommu_map() while an HVM domain is being destroyed:
> >
> > (XEN) [<ffff82d04026e399>] R iommu_map+0xf2/0x171
> > (XEN) [<ffff82d04026e43e>] F iommu_legacy_map+0x26/0x63
> > (XEN) [<ffff82d040302a42>] F arch/x86/mm/p2m-ept.c#ept_set_entry+0x6b2/0x730
> > (XEN) [<ffff82d0402f761d>] F p2m_set_entry+0x91/0x128
> > (XEN) [<ffff82d0402f8643>] F guest_physmap_add_entry+0x3d7/0x4e0
> > (XEN) [<ffff82d0402f87cb>] F guest_physmap_add_page+0x7f/0xfe
> > (XEN) [<ffff82d0402ba0a2>] F arch/x86/hvm/ioreq.c#hvm_add_ioreq_gfn+0x6f/0x8d
> > (XEN) [<ffff82d0402ba0f9>] F arch/x86/hvm/ioreq.c#hvm_ioreq_server_disable+0x39/0x4f
> > (XEN) [<ffff82d0402bb50e>] F hvm_destroy_all_ioreq_servers+0x6f/0x9b
> > (XEN) [<ffff82d0402afc15>] F hvm_domain_relinquish_resources+0x3e/0x92
> > (XEN) [<ffff82d04031f278>] F domain_relinquish_resources+0x355/0x372
> > (XEN) [<ffff82d040205dd4>] F domain_kill+0xc7/0x150
> > (XEN) [<ffff82d04023baf0>] F do_domctl+0xba7/0x1a09
> > (XEN) [<ffff82d040312c8f>] F pv_hypercall+0x2f0/0x55f
> > (XEN) [<ffff82d040391432>] F lstar_enter+0x112/0x120
> >
> > This one resulted in a domain crash rather than a clean destruction.
>
> A domain crash despite the domain already getting cleaned up is
> something we may at least want to consider doing better: There
> already is a !d->is_shutting_down conditional printk() there.
> What would people think about avoiding the domain_crash() call
> in similar ways? (It could even be considered to make
> domain_crash() itself behave like this, but such a step may make
> it necessary to first audit all use sites.)
>
> > I would still like to disallow page-table allocation, so I am thinking
> > of ignoring any request in iommu_map() if the domain is dying.
>
> Ignoring requests there seems fragile to me. Paul - what are your
> thoughts about bailing early from hvm_add_ioreq_gfn() when the
> domain is dying?

I think that's OK. Really, the only reason for putting the pages back
in the P2M is to allow migration to work, so if the domain is dying
then we don't really care, do we?

  Paul

> Jan
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index cea1032b3d02..779dbb5b98ba 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -267,13 +267,18 @@ int iommu_free_pgtables(struct domain *d)
     struct page_info *pg;
     unsigned int done = 0;
 
+    spin_lock(&hd->arch.pgtables.lock);
     while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
     {
         free_domheap_page(pg);
 
         if ( !(++done & 0xff) && general_preempt_check() )
+        {
+            spin_unlock(&hd->arch.pgtables.lock);
             return -ERESTART;
+        }
     }
+    spin_unlock(&hd->arch.pgtables.lock);
 
     return 0;
 }