Message ID | 20210217142458.3769-3-julien@xen.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen/iommu: Collection of bug fixes for IOMMU teadorwn | expand |
On 17.02.2021 15:24, Julien Grall wrote: > From: Julien Grall <jgrall@amazon.com> > > The new x86 IOMMU page-tables allocator will release the pages when > relinquishing the domain resources. However, this is not sufficient > when the domain is dying because nothing prevents page-table to be > allocated. > > Currently page-table allocations can only happen from iommu_map(). As > the domain is dying, there is no good reason to continue to modify the > IOMMU page-tables. While I agree this to be the case right now, I'm not sure it is a good idea to build on it (in that you leave the unmap paths untouched). Imo there's a fair chance this would be overlooked at the point super page mappings get introduced (which has been long overdue), and I thought prior discussion had lead to a possible approach without risking use-after-free due to squashed unmap requests. > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -273,6 +273,9 @@ int iommu_free_pgtables(struct domain *d) > /* > * Pages will be moved to the free list below. So we want to > * clear the root page-table to avoid any potential use after-free. > + * > + * After this call, no more IOMMU mapping can happen. > + * > */ > hd->platform_ops->clear_root_pgtable(d); I.e. you utilize the call in place of spin_barrier(). Maybe worth saying in the comment? Also, nit: Stray blank comment line. Jan
Hi Jan, On 17/02/2021 15:01, Jan Beulich wrote: > On 17.02.2021 15:24, Julien Grall wrote: >> From: Julien Grall <jgrall@amazon.com> >> >> The new x86 IOMMU page-tables allocator will release the pages when >> relinquishing the domain resources. However, this is not sufficient >> when the domain is dying because nothing prevents page-table to be >> allocated. >> >> Currently page-table allocations can only happen from iommu_map(). As >> the domain is dying, there is no good reason to continue to modify the >> IOMMU page-tables. > > While I agree this to be the case right now, I'm not sure it is a > good idea to build on it (in that you leave the unmap paths > untouched). I don't build on that assumption. See next patch. > Imo there's a fair chance this would be overlooked at > the point super page mappings get introduced (which has been long > overdue), and I thought prior discussion had lead to a possible > approach without risking use-after-free due to squashed unmap > requests. I know you suggested to zap the root page-tables... However, I don't think this is 4.15 material and you agree with this (you were the one pointed out that out). >> --- a/xen/drivers/passthrough/x86/iommu.c >> +++ b/xen/drivers/passthrough/x86/iommu.c >> @@ -273,6 +273,9 @@ int iommu_free_pgtables(struct domain *d) >> /* >> * Pages will be moved to the free list below. So we want to >> * clear the root page-table to avoid any potential use after-free. >> + * >> + * After this call, no more IOMMU mapping can happen. >> + * >> */ >> hd->platform_ops->clear_root_pgtable(d); > > I.e. you utilize the call in place of spin_barrier(). Maybe worth > saying in the comment? Sure. > > Also, nit: Stray blank comment line. Cheers,
On 17.02.2021 17:07, Julien Grall wrote: > On 17/02/2021 15:01, Jan Beulich wrote: >> On 17.02.2021 15:24, Julien Grall wrote: >>> From: Julien Grall <jgrall@amazon.com> >>> >>> The new x86 IOMMU page-tables allocator will release the pages when >>> relinquishing the domain resources. However, this is not sufficient >>> when the domain is dying because nothing prevents page-table to be >>> allocated. >>> >>> Currently page-table allocations can only happen from iommu_map(). As >>> the domain is dying, there is no good reason to continue to modify the >>> IOMMU page-tables. >> >> While I agree this to be the case right now, I'm not sure it is a >> good idea to build on it (in that you leave the unmap paths >> untouched). > > I don't build on that assumption. See next patch. Yet as said there that patch makes unmapping perhaps more fragile, by introducing a latent error source into the path. >> Imo there's a fair chance this would be overlooked at >> the point super page mappings get introduced (which has been long >> overdue), and I thought prior discussion had lead to a possible >> approach without risking use-after-free due to squashed unmap >> requests. > > I know you suggested to zap the root page-tables... However, I don't > think this is 4.15 material and you agree with this (you were the one > pointed out that out). Paul - do you have any thoughts here? Outright zapping isn't going to work, we'd need to switch to quarantine page tables at the very least to prevent the issue with babbling devices. But that still seems better to me than the introduction of a latent issue on the unmap paths. >>> --- a/xen/drivers/passthrough/x86/iommu.c >>> +++ b/xen/drivers/passthrough/x86/iommu.c >>> @@ -273,6 +273,9 @@ int iommu_free_pgtables(struct domain *d) >>> /* >>> * Pages will be moved to the free list below. So we want to >>> * clear the root page-table to avoid any potential use after-free. >>> + * >>> + * After this call, no more IOMMU mapping can happen. >>> + * >>> */ >>> hd->platform_ops->clear_root_pgtable(d); >> >> I.e. you utilize the call in place of spin_barrier(). Maybe worth >> saying in the comment? > > Sure. Btw - "no more IOMMU mapping" is also possibly ambiguous here: One might take it to mean both maps and unmaps. How about "no new IOMMU mappings can be inserted", as long as the unmap paths don't follow suit? Jan
Hi, On 18/02/2021 13:05, Jan Beulich wrote: > On 17.02.2021 17:07, Julien Grall wrote: >> On 17/02/2021 15:01, Jan Beulich wrote: >>> On 17.02.2021 15:24, Julien Grall wrote: >>>> From: Julien Grall <jgrall@amazon.com> >>>> >>>> The new x86 IOMMU page-tables allocator will release the pages when >>>> relinquishing the domain resources. However, this is not sufficient >>>> when the domain is dying because nothing prevents page-table to be >>>> allocated. >>>> >>>> Currently page-table allocations can only happen from iommu_map(). As >>>> the domain is dying, there is no good reason to continue to modify the >>>> IOMMU page-tables. >>> >>> While I agree this to be the case right now, I'm not sure it is a >>> good idea to build on it (in that you leave the unmap paths >>> untouched). >> >> I don't build on that assumption. See next patch. > > Yet as said there that patch makes unmapping perhaps more fragile, > by introducing a latent error source into the path. I still don't see what latent error my patch will introduce. Allocation of page-tables are not guarantee to succeed. So are you implying that a code may rely on the page allocation to succeed? > >>> Imo there's a fair chance this would be overlooked at >>> the point super page mappings get introduced (which has been long >>> overdue), and I thought prior discussion had lead to a possible >>> approach without risking use-after-free due to squashed unmap >>> requests. >> >> I know you suggested to zap the root page-tables... However, I don't >> think this is 4.15 material and you agree with this (you were the one >> pointed out that out). > > Paul - do you have any thoughts here? Outright zapping isn't > going to work, we'd need to switch to quarantine page tables at > the very least to prevent the issue with babbling devices. But > that still seems better to me than the introduction of a latent > issue on the unmap paths. I am afraid I am not going to be able to come up with such patch for 4.15. If you want to go that route for 4.15, then feel free to suggest a patch. [...] > Btw - "no more IOMMU mapping" is also possibly ambiguous here: > One might take it to mean both maps and unmaps. How about "no > new IOMMU mappings can be inserted", as long as the unmap paths > don't follow suit? Sure. Cheers,
On 18/02/2021 13:05, Jan Beulich wrote: > On 17.02.2021 17:07, Julien Grall wrote: >> On 17/02/2021 15:01, Jan Beulich wrote: >>> On 17.02.2021 15:24, Julien Grall wrote: >>>> From: Julien Grall <jgrall@amazon.com> >>>> >>>> The new x86 IOMMU page-tables allocator will release the pages when >>>> relinquishing the domain resources. However, this is not sufficient >>>> when the domain is dying because nothing prevents page-table to be >>>> allocated. >>>> >>>> Currently page-table allocations can only happen from iommu_map(). As >>>> the domain is dying, there is no good reason to continue to modify the >>>> IOMMU page-tables. >>> >>> While I agree this to be the case right now, I'm not sure it is a >>> good idea to build on it (in that you leave the unmap paths >>> untouched). >> >> I don't build on that assumption. See next patch. > > Yet as said there that patch makes unmapping perhaps more fragile, > by introducing a latent error source into the path. > >>> Imo there's a fair chance this would be overlooked at >>> the point super page mappings get introduced (which has been long >>> overdue), and I thought prior discussion had lead to a possible >>> approach without risking use-after-free due to squashed unmap >>> requests. >> >> I know you suggested to zap the root page-tables... However, I don't >> think this is 4.15 material and you agree with this (you were the one >> pointed out that out). > > Paul - do you have any thoughts here? Outright zapping isn't > going to work, we'd need to switch to quarantine page tables at > the very least to prevent the issue with babbling devices. But > that still seems better to me than the introduction of a latent > issue on the unmap paths. > I've not really been following this. AFAICT clear_root_pgtable() does not actually mess with any context entries so the device view of the tables won't change, will it? Paul
On 18.02.2021 14:25, Julien Grall wrote: > Hi, > > On 18/02/2021 13:05, Jan Beulich wrote: >> On 17.02.2021 17:07, Julien Grall wrote: >>> On 17/02/2021 15:01, Jan Beulich wrote: >>>> On 17.02.2021 15:24, Julien Grall wrote: >>>>> From: Julien Grall <jgrall@amazon.com> >>>>> >>>>> The new x86 IOMMU page-tables allocator will release the pages when >>>>> relinquishing the domain resources. However, this is not sufficient >>>>> when the domain is dying because nothing prevents page-table to be >>>>> allocated. >>>>> >>>>> Currently page-table allocations can only happen from iommu_map(). As >>>>> the domain is dying, there is no good reason to continue to modify the >>>>> IOMMU page-tables. >>>> >>>> While I agree this to be the case right now, I'm not sure it is a >>>> good idea to build on it (in that you leave the unmap paths >>>> untouched). >>> >>> I don't build on that assumption. See next patch. >> >> Yet as said there that patch makes unmapping perhaps more fragile, >> by introducing a latent error source into the path. > > I still don't see what latent error my patch will introduce. Allocation > of page-tables are not guarantee to succeed. > > So are you implying that a code may rely on the page allocation to succeed? I'm implying that failure to split a superpage may have unknown consequences. Since we make no use of superpages when not sharing page tables, I call this a latent issue which may go unnoticed for quite some time once no longer latent. Jan
On 18.02.2021 15:00, Paul Durrant wrote: > On 18/02/2021 13:05, Jan Beulich wrote: >> On 17.02.2021 17:07, Julien Grall wrote: >>> On 17/02/2021 15:01, Jan Beulich wrote: >>>> On 17.02.2021 15:24, Julien Grall wrote: >>>>> From: Julien Grall <jgrall@amazon.com> >>>>> >>>>> The new x86 IOMMU page-tables allocator will release the pages when >>>>> relinquishing the domain resources. However, this is not sufficient >>>>> when the domain is dying because nothing prevents page-table to be >>>>> allocated. >>>>> >>>>> Currently page-table allocations can only happen from iommu_map(). As >>>>> the domain is dying, there is no good reason to continue to modify the >>>>> IOMMU page-tables. >>>> >>>> While I agree this to be the case right now, I'm not sure it is a >>>> good idea to build on it (in that you leave the unmap paths >>>> untouched). >>> >>> I don't build on that assumption. See next patch. >> >> Yet as said there that patch makes unmapping perhaps more fragile, >> by introducing a latent error source into the path. >> >>>> Imo there's a fair chance this would be overlooked at >>>> the point super page mappings get introduced (which has been long >>>> overdue), and I thought prior discussion had lead to a possible >>>> approach without risking use-after-free due to squashed unmap >>>> requests. >>> >>> I know you suggested to zap the root page-tables... However, I don't >>> think this is 4.15 material and you agree with this (you were the one >>> pointed out that out). >> >> Paul - do you have any thoughts here? Outright zapping isn't >> going to work, we'd need to switch to quarantine page tables at >> the very least to prevent the issue with babbling devices. But >> that still seems better to me than the introduction of a latent >> issue on the unmap paths. >> > > I've not really been following this. AFAICT clear_root_pgtable() does > not actually mess with any context entries so the device view of the > tables won't change, will it? Correct. Elsewhere I said that "zapping" here has two meanings, one is what Julien does and the other is what I mean and what I understand you also refer to - to actually replace the mappings referenced by a context entry as soon as we no longer guarantee to update them correctly. My concern with not touching the unmap paths is that if there was any "best effort" unmap left anywhere in the tree, we may still end up with a use-after-free when late unmaps are now made possibly fail by failing late allocation attempts. Jan
Hi Jan, On 19/02/2021 08:49, Jan Beulich wrote: > On 18.02.2021 14:25, Julien Grall wrote: >> Hi, >> >> On 18/02/2021 13:05, Jan Beulich wrote: >>> On 17.02.2021 17:07, Julien Grall wrote: >>>> On 17/02/2021 15:01, Jan Beulich wrote: >>>>> On 17.02.2021 15:24, Julien Grall wrote: >>>>>> From: Julien Grall <jgrall@amazon.com> >>>>>> >>>>>> The new x86 IOMMU page-tables allocator will release the pages when >>>>>> relinquishing the domain resources. However, this is not sufficient >>>>>> when the domain is dying because nothing prevents page-table to be >>>>>> allocated. >>>>>> >>>>>> Currently page-table allocations can only happen from iommu_map(). As >>>>>> the domain is dying, there is no good reason to continue to modify the >>>>>> IOMMU page-tables. >>>>> >>>>> While I agree this to be the case right now, I'm not sure it is a >>>>> good idea to build on it (in that you leave the unmap paths >>>>> untouched). >>>> >>>> I don't build on that assumption. See next patch. >>> >>> Yet as said there that patch makes unmapping perhaps more fragile, >>> by introducing a latent error source into the path. >> >> I still don't see what latent error my patch will introduce. Allocation >> of page-tables are not guarantee to succeed. >> >> So are you implying that a code may rely on the page allocation to succeed? > > I'm implying that failure to split a superpage may have unknown > consequences. As failure to flush the TLBs (see below). > Since we make no use of superpages when not > sharing page tables, I call this a latent issue which may go > unnoticed for quite some time once no longer latent. And it would take a lot longer to unnotice an OOM as this should happen less often ;). But, I think this is wrong to blame the lower layer for a (latent) bug in the upper layer... Even with your solution to zap page-tables, there is still a risk of failure because the TLB flush may fail (the operation can return an error). So regardless the solution, we still need to have callers that function correctly. Anyway, what matters right now is fixing a host crash when running PV guest with passthrough. Neither patch #3 nor the iommu_unmap() part is strictly necessary for 4.15 (as you say this is latent). So I would suggest to defer this discussion post-4.15. Cheers,
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c index d3a8b1aec766..ed78a083ba12 100644 --- a/xen/drivers/passthrough/amd/iommu_map.c +++ b/xen/drivers/passthrough/amd/iommu_map.c @@ -285,6 +285,19 @@ int amd_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn, spin_lock(&hd->arch.mapping_lock); + /* + * IOMMU mapping request can be safely ignored when the domain is dying. + * + * hd->arch.mapping_lock guarantees that d->is_dying will be observed + * before any page tables are freed (see iommu_free_pgtables() and + * iommu_clear_root_pgtable()). + */ + if ( d->is_dying ) + { + spin_unlock(&hd->arch.mapping_lock); + return 0; + } + rc = amd_iommu_alloc_root(d); if ( rc ) { diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index e1871f6c2bc1..239a63f74f64 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1771,6 +1771,19 @@ static int __must_check intel_iommu_map_page(struct domain *d, dfn_t dfn, spin_lock(&hd->arch.mapping_lock); + /* + * IOMMU mapping request can be safely ignored when the domain is dying. + * + * hd->arch.mapping_lock guarantees that d->is_dying will be observed + * before any page tables are freed (see iommu_free_pgtables() and + * iommu_clear_root_pgtable()). + */ + if ( d->is_dying ) + { + spin_unlock(&hd->arch.mapping_lock); + return 0; + } + pg_maddr = addr_to_dma_page_maddr(d, dfn_to_daddr(dfn), 1); if ( !pg_maddr ) { diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c index f54fc8093f18..faa0078db595 100644 --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -273,6 +273,9 @@ int iommu_free_pgtables(struct domain *d) /* * Pages will be moved to the free list below. So we want to * clear the root page-table to avoid any potential use after-free. + * + * After this call, no more IOMMU mapping can happen. + * */ hd->platform_ops->clear_root_pgtable(d);