Message ID: 3430b8fa-6700-b5ec-a160-2e7609dd38a3@suse.com (mailing list archive)
State: New, archived
Series: IOMMU: assorted follow-on to XSA-400
On 11/04/2022 10:35, Jan Beulich wrote:
> Prior extension of these functions to enable per-device quarantine page
> tables already didn't add more locking there, but merely left in place
> what had been there before. But really locking is unnecessary here:
> We're running with pcidevs_lock held (i.e. multiple invocations of the
> same function [or their teardown equivalents] are impossible, and hence
> there are no "local" races), while all consuming of the data being
> populated here can't race anyway due to happening sequentially
> afterwards. See also the comment in struct arch_pci_dev.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

It is only legitimate to drop these calls if you delete the mapping_lock
entirely.  Otherwise you're breaking the semantics of mapping_lock.

Your argument of "well this is already guarded by the pci lock" means
that these are uncontended lock/unlock operations, and therefore not
interesting to drop in the first place.

This patch is specifically setting us up for an XSA in the future when
the behaviour of the PCI lock changes, the fix for which will be to
revert this patch.

~Andrew
On 11.04.2022 12:01, Andrew Cooper wrote:
> On 11/04/2022 10:35, Jan Beulich wrote:
>> Prior extension of these functions to enable per-device quarantine page
>> tables already didn't add more locking there, but merely left in place
>> what had been there before. But really locking is unnecessary here:
>> We're running with pcidevs_lock held (i.e. multiple invocations of the
>> same function [or their teardown equivalents] are impossible, and hence
>> there are no "local" races), while all consuming of the data being
>> populated here can't race anyway due to happening sequentially
>> afterwards. See also the comment in struct arch_pci_dev.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> It is only legitimate to drop these calls if you delete the mapping_lock
> entirely.  Otherwise you're breaking the semantics of mapping_lock.

Not for all domains other than DomIO. And DomIO is, well, special. As is
at what times quarantine_init() is actually being invoked.

> Your argument of "well this is already guarded by the pci lock" means
> that these are uncontended lock/unlock operations, and therefore not
> interesting to drop in the first place.
>
> This patch is specifically setting us up for an XSA in the future when
> the behaviour of the PCI lock changes, the fix for which will be to
> revert this patch.

That wouldn't suffice then. As said in the description, and as discussed
during the development of the XSA-400 series, further locking would need
adding then.

Jan
On Mon, Apr 11, 2022 at 11:35:54AM +0200, Jan Beulich wrote:
> Prior extension of these functions to enable per-device quarantine page
> tables already didn't add more locking there, but merely left in place
> what had been there before. But really locking is unnecessary here:
> We're running with pcidevs_lock held (i.e. multiple invocations of the
> same function [or their teardown equivalents] are impossible, and hence
> there are no "local" races), while all consuming of the data being
> populated here can't race anyway due to happening sequentially
> afterwards. See also the comment in struct arch_pci_dev.

I would explicitly say that none of the code in the locked region
touches any data in the domain_iommu struct, so taking the
mapping_lock is unneeded.

Long term we might wish to implement a per-device lock that could be
used here, instead of relying on the pcidevs lock.

>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks,
Roger.
On 12.04.2022 13:05, Roger Pau Monné wrote:
> On Mon, Apr 11, 2022 at 11:35:54AM +0200, Jan Beulich wrote:
>> Prior extension of these functions to enable per-device quarantine page
>> tables already didn't add more locking there, but merely left in place
>> what had been there before. But really locking is unnecessary here:
>> We're running with pcidevs_lock held (i.e. multiple invocations of the
>> same function [or their teardown equivalents] are impossible, and hence
>> there are no "local" races), while all consuming of the data being
>> populated here can't race anyway due to happening sequentially
>> afterwards. See also the comment in struct arch_pci_dev.
>
> I would explicitly say that none of the code in the locked region
> touches any data in the domain_iommu struct, so taking the
> mapping_lock is unneeded.

But that would limit what the mapping_lock protects more than it actually
does: The entire page tables hanging off of the root table are also
protected by that lock. It's just that for a pdev, after having
installed identity mappings, the root doesn't hang off of hd. But in
principle - i.e. if the per-device mappings weren't static once created -
the lock would be the one to hold whenever any of these page tables was
modified.

> Long term we might wish to implement a per-device lock that could be
> used here, instead of relying on the pcidevs lock.

Well, I would want to avoid this unless a need arises to not hold the
pcidevs lock here. Or, of course, if a need arose to dynamically alter
these page tables.

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Jan
On Tue, Apr 12, 2022 at 02:17:16PM +0200, Jan Beulich wrote:
> On 12.04.2022 13:05, Roger Pau Monné wrote:
>> On Mon, Apr 11, 2022 at 11:35:54AM +0200, Jan Beulich wrote:
>>> Prior extension of these functions to enable per-device quarantine page
>>> tables already didn't add more locking there, but merely left in place
>>> what had been there before. But really locking is unnecessary here:
>>> We're running with pcidevs_lock held (i.e. multiple invocations of the
>>> same function [or their teardown equivalents] are impossible, and hence
>>> there are no "local" races), while all consuming of the data being
>>> populated here can't race anyway due to happening sequentially
>>> afterwards. See also the comment in struct arch_pci_dev.
>>
>> I would explicitly say that none of the code in the locked region
>> touches any data in the domain_iommu struct, so taking the
>> mapping_lock is unneeded.
>
> But that would limit what the mapping_lock protects more than it actually
> does: The entire page tables hanging off of the root table are also
> protected by that lock.

Right, but at the point where fill_qpt() gets called
hd->arch.amd.root_table == NULL, and hence it seems completely
pointless to wrap this in a mapping_lock locked region.

> It's just that for a pdev, after having
> installed identity mappings, the root doesn't hang off of hd. But in
> principle - i.e. if the per-device mappings weren't static once created -
> the lock would be the one to hold whenever any of these page tables was
> modified.

The lock would need to be held if page tables are modified while
being set in hd->arch.amd.root_table, or at least that's my
understanding.

This is a special case anyway, as the page tables are not per-domain
but per-device, but it seems clear to me that if the page tables are
not set in hd->arch.amd.root_table the lock in hd->arch.mapping_lock
is not supposed to be protecting them.

>> Long term we might wish to implement a per-device lock that could be
>> used here, instead of relying on the pcidevs lock.
>
> Well, I would want to avoid this unless a need arises to not hold the
> pcidevs lock here. Or, of course, if a need arose to dynamically alter
> these page tables.

I think it's likely we will need such a lock for other purposes if we
ever manage to convert the pcidevs lock into a rwlock, so my comment
was not so much that it's required for the use case here, but a side
effect if we ever manage to change the pcidevs lock.

Thanks,
Roger.
On 12.04.2022 14:54, Roger Pau Monné wrote:
> On Tue, Apr 12, 2022 at 02:17:16PM +0200, Jan Beulich wrote:
>> On 12.04.2022 13:05, Roger Pau Monné wrote:
>>> On Mon, Apr 11, 2022 at 11:35:54AM +0200, Jan Beulich wrote:
>>>> Prior extension of these functions to enable per-device quarantine page
>>>> tables already didn't add more locking there, but merely left in place
>>>> what had been there before. But really locking is unnecessary here:
>>>> We're running with pcidevs_lock held (i.e. multiple invocations of the
>>>> same function [or their teardown equivalents] are impossible, and hence
>>>> there are no "local" races), while all consuming of the data being
>>>> populated here can't race anyway due to happening sequentially
>>>> afterwards. See also the comment in struct arch_pci_dev.
>>>
>>> I would explicitly say that none of the code in the locked region
>>> touches any data in the domain_iommu struct, so taking the
>>> mapping_lock is unneeded.
>>
>> But that would limit what the mapping_lock protects more than it actually
>> does: The entire page tables hanging off of the root table are also
>> protected by that lock.
>
> Right, but at the point where fill_qpt() gets called
> hd->arch.amd.root_table == NULL, and hence it seems completely
> pointless to wrap this in a mapping_lock locked region.
>
>> It's just that for a pdev, after having
>> installed identity mappings, the root doesn't hang off of hd. But in
>> principle - i.e. if the per-device mappings weren't static once created -
>> the lock would be the one to hold whenever any of these page tables was
>> modified.
>
> The lock would need to be held if page tables are modified while
> being set in hd->arch.amd.root_table, or at least that's my
> understanding.
>
> This is a special case anyway, as the page tables are not per-domain
> but per-device, but it seems clear to me that if the page tables are
> not set in hd->arch.amd.root_table the lock in hd->arch.mapping_lock
> is not supposed to be protecting them.

There are multiple models possible, one being that for per-device page
tables DomIO's lock protects all of them. Hence my hesitance to say
something along these lines in the description.

>>> Long term we might wish to implement a per-device lock that could be
>>> used here, instead of relying on the pcidevs lock.
>>
>> Well, I would want to avoid this unless a need arises to not hold the
>> pcidevs lock here. Or, of course, if a need arose to dynamically alter
>> these page tables.
>
> I think it's likely we will need such a lock for other purposes if we
> ever manage to convert the pcidevs lock into a rwlock, so my comment
> was not so much that it's required for the use case here, but a side
> effect if we ever manage to change the pcidevs lock.

Such a need would further depend on whether the code paths leading here
would hold the lock in read or write mode. But yes, it is reasonable to
expect that it would only be read mode.

Jan
On 11.04.2022 12:01, Andrew Cooper wrote:
> On 11/04/2022 10:35, Jan Beulich wrote:
>> Prior extension of these functions to enable per-device quarantine page
>> tables already didn't add more locking there, but merely left in place
>> what had been there before. But really locking is unnecessary here:
>> We're running with pcidevs_lock held (i.e. multiple invocations of the
>> same function [or their teardown equivalents] are impossible, and hence
>> there are no "local" races), while all consuming of the data being
>> populated here can't race anyway due to happening sequentially
>> afterwards. See also the comment in struct arch_pci_dev.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> It is only legitimate to drop these calls if you delete the mapping_lock
> entirely.  Otherwise you're breaking the semantics of mapping_lock.
>
> Your argument of "well this is already guarded by the pci lock" means
> that these are uncontended lock/unlock operations, and therefore not
> interesting to drop in the first place.
>
> This patch is specifically setting us up for an XSA in the future when
> the behaviour of the PCI lock changes, the fix for which will be to
> revert this patch.

Further to my earlier reply, may I remind you that changes to the PCI
lock won't go unnoticed here, as there are respective ASSERT()s in
place.

Jan
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, April 11, 2022 5:36 PM
>
> Prior extension of these functions to enable per-device quarantine page
> tables already didn't add more locking there, but merely left in place
> what had been there before. But really locking is unnecessary here:
> We're running with pcidevs_lock held (i.e. multiple invocations of the
> same function [or their teardown equivalents] are impossible, and hence
> there are no "local" races), while all consuming of the data being
> populated here can't race anyway due to happening sequentially
> afterwards. See also the comment in struct arch_pci_dev.

Better add some explanation (as you explained in other replies) of why
the above rationale around pcidevs_lock leads to the drop of a different
lock (mapping_lock).

>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -699,15 +699,11 @@ int cf_check amd_iommu_quarantine_init(s
         union amd_iommu_pte *root;
         struct page_info *pgs[IOMMU_MAX_PT_LEVELS] = {};
 
-        spin_lock(&hd->arch.mapping_lock);
-
         root = __map_domain_page(pdev->arch.amd.root_table);
         rc = fill_qpt(root, level - 1, pgs);
         unmap_domain_page(root);
 
         pdev->arch.leaf_mfn = page_to_mfn(pgs[0]);
-
-        spin_unlock(&hd->arch.mapping_lock);
     }
 
     page_list_move(&pdev->arch.pgtables_list, &hd->arch.pgtables.list);
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -3054,15 +3054,11 @@ static int cf_check intel_iommu_quaranti
         struct dma_pte *root;
         struct page_info *pgs[6] = {};
 
-        spin_lock(&hd->arch.mapping_lock);
-
         root = map_vtd_domain_page(pdev->arch.vtd.pgd_maddr);
         rc = fill_qpt(root, level - 1, pgs);
         unmap_vtd_domain_page(root);
 
         pdev->arch.leaf_mfn = page_to_mfn(pgs[0]);
-
-        spin_unlock(&hd->arch.mapping_lock);
     }
 
     page_list_move(&pdev->arch.pgtables_list, &hd->arch.pgtables.list);
Prior extension of these functions to enable per-device quarantine page
tables already didn't add more locking there, but merely left in place
what had been there before. But really locking is unnecessary here:
We're running with pcidevs_lock held (i.e. multiple invocations of the
same function [or their teardown equivalents] are impossible, and hence
there are no "local" races), while all consuming of the data being
populated here can't race anyway due to happening sequentially
afterwards. See also the comment in struct arch_pci_dev.

Signed-off-by: Jan Beulich <jbeulich@suse.com>