Message ID | 1493035176-3633-1-git-send-email-gakula@caviumnetworks.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Apr 24, 2017 at 05:29:36PM +0530, Geetha sowjanya wrote:
> From: Geetha <gakula@cavium.com>
>
> When large memory is being unmapped, huge no of tlb invalidation cmds are
> submitted followed by a SYNC command. This sometimes hits CMD queue full and
> poll on queue drain is being timedout throwing error message 'CMD_SYNC timeout'.
>
> Although there is no functional issue, error message confuses user. Hence increased
> poll timeout to 500us

Hmm, what are you doing to unmap that much? Is this VFIO teardown? Do you
have 7c6d90e2bb1a ("iommu/io-pgtable-arm: Fix iova_to_phys for block
entries") applied?

Will

>
> Signed-off-by: Geetha <gakula@cavium.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 591bb96..1dcd154 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -407,7 +407,7 @@
>  #define PRIQ_1_ADDR_MASK 0xfffffffffffffUL
>
>  /* High-level queue structures */
> -#define ARM_SMMU_POLL_TIMEOUT_US 100
> +#define ARM_SMMU_POLL_TIMEOUT_US 500
>
>  #define MSI_IOVA_BASE 0x8000000
>  #define MSI_IOVA_LENGTH 0x100000
> --
> 1.9.1
On Mon, Apr 24, 2017 at 9:38 PM, Will Deacon <will.deacon@arm.com> wrote:
> Hmm, what are you doing to unmap that much? Is this VFIO teardown? Do you
> have 7c6d90e2bb1a ("iommu/io-pgtable-arm: Fix iova_to_phys for block
> entries") applied?

Yes, it's VFIO teardown, and yes, the above fix is applied.
But I didn't get how the above fix is related.
TLB invalidation commands are submitted in 'arm_smmu_tlb_inv_range_nosync()',
which loops over the range one granule at a time.

    do {
            arm_smmu_cmdq_issue_cmd(smmu, &cmd);
            cmd.tlbi.addr += granule;
    } while (size -= granule);

So if the invalidation size is big, a huge number of invalidation commands
will be submitted irrespective of the fix that you pointed to above, right?

Thanks,
Sunil.
On Mon, Apr 24, 2017 at 10:26:53PM +0530, Sunil Kovvuri wrote:
> Yes, it's VFIO teardown, and yes, the above fix is applied.
> But I didn't get how the above fix is related.
> TLB invalidation commands are submitted in 'arm_smmu_tlb_inv_range_nosync()',
> which loops over the range one granule at a time.
>
>     do {
>             arm_smmu_cmdq_issue_cmd(smmu, &cmd);
>             cmd.tlbi.addr += granule;
>     } while (size -= granule);
>
> So if the invalidation size is big, a huge number of invalidation commands
> will be submitted irrespective of the fix that you pointed to above, right?

VFIO has some logic to batch up invalidations, but this didn't work properly
for us without the fix above. However, I guess you have a huge memory range
that's mapped with 2M sections or something, so there are still loads of
entries to invalidate.

I would much prefer it if VFIO could just tear down the whole address space
so that we could do an invalidate all, but there's a chicken-and-egg problem
with page accounting iirc.

Will
On Mon, Apr 24, 2017 at 10:35 PM, Will Deacon <will.deacon@arm.com> wrote:
> VFIO has some logic to batch up invalidations, but this didn't work properly
> for us without the fix above. However, I guess you have a huge memory range
> that's mapped with 2M sections or something, so there are still loads of
> entries to invalidate.
>
> I would much prefer it if VFIO could just tear down the whole address space
> so that we could do an invalidate all, but there's a chicken-and-egg problem
> with page accounting iirc.

We can definitely look into this from the VFIO perspective, but for now I am
guessing this patch is fine, as no functionality is being changed.
What do you say?

Thanks,
Sunil.
On Wed, Apr 26, 2017 at 02:50:04PM +0530, Sunil Kovvuri wrote:
> We can definitely look into this from the VFIO perspective, but for now I am
> guessing this patch is fine, as no functionality is being changed.
> What do you say?

Thinking about it some more, I'd rather we rework the polling loop so that:

1. It's structured more like the arm-smmu.c TLB loop queued for 4.11 (so we
   don't udelay(1) if the thing doesn't sync immediately)

2. It has a larger timeout for the drain case, which I think is what you're
   running into. This could even be 1s, like arm-smmu.c.

Will
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 591bb96..1dcd154 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -407,7 +407,7 @@
 #define PRIQ_1_ADDR_MASK 0xfffffffffffffUL
 
 /* High-level queue structures */
-#define ARM_SMMU_POLL_TIMEOUT_US 100
+#define ARM_SMMU_POLL_TIMEOUT_US 500
 
 #define MSI_IOVA_BASE 0x8000000
 #define MSI_IOVA_LENGTH 0x100000