iommu/arm-smmu-v3: Align size in __arm_smmu_tlb_inv_range

Message ID 20220413041745.35174-1-nicolinc@nvidia.com (mailing list archive)
State New, archived
Series iommu/arm-smmu-v3: Align size in __arm_smmu_tlb_inv_range

Commit Message

Nicolin Chen April 13, 2022, 4:17 a.m. UTC
To calculate num_pages, the size should be aligned to the
"page size" determined by the tg value. Otherwise, the
following "while (iova < end)" loop might become infinite
if the unaligned size is slightly greater than 1 << tg.
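
To illustrate the failure mode, here is a minimal C model of the
loop's shape (a simplified sketch, not the exact driver code; the
variable values and the per-command page limit are assumptions):

	unsigned long iova = 0x0, size = 0x10001;	/* unaligned size */
	unsigned long tg = 12;				/* 4K granule */
	unsigned long end = iova + size;
	unsigned long num_pages = size >> tg;		/* truncates to 0x10 */

	while (iova < end) {
		/* each command covers up to 32 pages (simplified) */
		unsigned long num = num_pages < 32 ? num_pages : 32;
		unsigned long inv_range = num << tg;

		num_pages -= num;
		iova += inv_range;	/* stalls at 0x10000 once inv_range is 0 */
	}

With the ALIGN() change below, num_pages becomes 0x11, so the loop
covers the full [iova, end) range and terminates.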

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Robin Murphy April 13, 2022, 1:40 p.m. UTC | #1
On 2022-04-13 05:17, Nicolin Chen wrote:
> To calculate num_pages, the size should be aligned to the
> "page size" determined by the tg value. Otherwise, the
> following "while (iova < end)" loop might become infinite
> if the unaligned size is slightly greater than 1 << tg.

Hmm, how does a non-page-aligned invalidation request get generated in 
the first place?

Robin.

> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..8249dad5ae44 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1879,7 +1879,7 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
>   		/* Determine what level the granule is at */
>   		cmd->tlbi.ttl = 4 - ((ilog2(granule) - 3) / (tg - 3));
>   
> -		num_pages = size >> tg;
> +		num_pages = ALIGN(size, 1 << tg) >> tg;
>   	}
>   
>   	cmds.num = 0;
Nicolin Chen April 13, 2022, 8:19 p.m. UTC | #2
Hi Robin,

On Wed, Apr 13, 2022 at 02:40:31PM +0100, Robin Murphy wrote:
> On 2022-04-13 05:17, Nicolin Chen wrote:
> > To calculate num_pages, the size should be aligned to the
> > "page size" determined by the tg value. Otherwise, the
> > following "while (iova < end)" loop might become infinite
> > if the unaligned size is slightly greater than 1 << tg.
> 
> Hmm, how does a non-page-aligned invalidation request get generated in
> the first place?

I don't have a test environment, as this was a bug
reported by a client who uses the SVA feature on top of SMMU.

But judging from the log, the non-page-aligned invalidation
request came from a likely incorrect end address, e.g.
	{ start = 0xff10000, end = 0xff20000 }
So the size turned out to be 0x10001, unaligned.

I don't have a full call trace on hand right now to see if
the upper callers are doing something wrong when calculating
the end address, though I've asked the owner to check.

Looking at the call trace within the arm_smmu_* functions:
  __arm_smmu_tlb_inv_range
  arm_smmu_tlb_inv_range_asid
  arm_smmu_mm_invalidate_range
  {from mmu_notifier_* functions}

There's no address alignment check. Although I do think we
should fix the source that passes down the non-page-aligned
parameter, the SMMU driver shouldn't silently loop forever
if a set of unaligned inputs is given, IMHO.

Thanks
Nic
Robin Murphy April 14, 2022, 10:32 a.m. UTC | #3
On 2022-04-13 21:19, Nicolin Chen wrote:
> Hi Robin,
> 
> On Wed, Apr 13, 2022 at 02:40:31PM +0100, Robin Murphy wrote:
>> On 2022-04-13 05:17, Nicolin Chen wrote:
>>> To calculate num_pages, the size should be aligned to the
>>> "page size" determined by the tg value. Otherwise, the
>>> following "while (iova < end)" loop might become infinite
>>> if the unaligned size is slightly greater than 1 << tg.
>>
>> Hmm, how does a non-page-aligned invalidation request get generated in
>> the first place?
> 
> I don't have a test environment, as this was a bug
> reported by a client who uses the SVA feature on top of SMMU.
> 
> But judging from the log, the non-page-aligned invalidation
> request came from a likely incorrect end address, e.g.
> 	{ start = 0xff10000, end = 0xff20000 }
> So the size turned out to be 0x10001, unaligned.
> 
> I don't have a full call trace on hand right now to see if
> the upper callers are doing something wrong when calculating
> the end address, though I've asked the owner to check.
> 
> Looking at the call trace within the arm_smmu_* functions:
>    __arm_smmu_tlb_inv_range
>    arm_smmu_tlb_inv_range_asid
>    arm_smmu_mm_invalidate_range
>    {from mmu_notifier_* functions}
> 
> There's no address alignment check. Although I do think we
> should fix the source that passes down the non-page-aligned
> parameter, the SMMU driver shouldn't silently loop forever
> if a set of unaligned inputs is given, IMHO.

Oh, sure, I'm not saying we definitely don't need to fix anything, I'd 
just like to get a better understanding of *what* we're fixing. I'd have 
(naively) expected the mm layer to give us page-aligned quantities even 
in the SVA notifier case, so if we've got a clear off-by-one somewhere 
in that path we should fix that before just blindly over-invalidating to 
paper over it; if we still also want to be robust at the SMMU driver end 
just in case, something like "if (WARN_ON(num_pages == 0)) num_pages = 
1;" might be more appropriate. However if it turns out that we *can* 
actually end up with unsanitised input from some userspace unmap 
interface getting this far, then a silent fixup is the best option, but 
if so I'd still like to confirm that we're rounding in the same 
direction as whoever touched the pagetables (since it can't have been us).
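
For concreteness, that defensive check might look like this in
__arm_smmu_tlb_inv_range (an untested sketch of the idea, not a
proposed patch):

	num_pages = size >> tg;
	/* never emit a zero-length command that would stall the loop */
	if (WARN_ON(num_pages == 0))
		num_pages = 1;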

Thanks,
Robin.
Nicolin Chen April 15, 2022, 3:56 a.m. UTC | #4
On Thu, Apr 14, 2022 at 11:32:38AM +0100, Robin Murphy wrote:
> On 2022-04-13 21:19, Nicolin Chen wrote:
> > Hi Robin,
> > 
> > On Wed, Apr 13, 2022 at 02:40:31PM +0100, Robin Murphy wrote:
> > > On 2022-04-13 05:17, Nicolin Chen wrote:
> > > > To calculate num_pages, the size should be aligned to the
> > > > "page size" determined by the tg value. Otherwise, the
> > > > following "while (iova < end)" loop might become infinite
> > > > if the unaligned size is slightly greater than 1 << tg.
> > > 
> > > Hmm, how does a non-page-aligned invalidation request get generated in
> > > the first place?
> > 
> > I don't have a test environment, as this was a bug
> > reported by a client who uses the SVA feature on top of SMMU.
> > 
> > But judging from the log, the non-page-aligned invalidation
> > request came from a likely incorrect end address, e.g.
> >       { start = 0xff10000, end = 0xff20000 }
> > So the size turned out to be 0x10001, unaligned.
> > 
> > I don't have a full call trace on hand right now to see if
> > the upper callers are doing something wrong when calculating
> > the end address, though I've asked the owner to check.
> > 
> > Looking at the call trace within the arm_smmu_* functions:
> >    __arm_smmu_tlb_inv_range
> >    arm_smmu_tlb_inv_range_asid
> >    arm_smmu_mm_invalidate_range
> >    {from mmu_notifier_* functions}
> > 
> > There's no address alignment check. Although I do think we
> > should fix the source that passes down the non-page-aligned
> > parameter, the SMMU driver shouldn't silently loop forever
> > if a set of unaligned inputs is given, IMHO.
> 
> Oh, sure, I'm not saying we definitely don't need to fix anything, I'd
> just like to get a better understanding of *what* we're fixing. I'd have
> (naively) expected the mm layer to give us page-aligned quantities even
> in the SVA notifier case, so if we've got a clear off-by-one somewhere
> in that path we should fix that before just blindly over-invalidating to
> paper over it;

I see. Yea, we definitely should fix the source too. I've asked
the owner to track it down. I sent this change thinking that we
could do both in parallel.

> if we still also want to be robust at the SMMU driver end
> just in case, something like "if (WARN_ON(num_pages == 0)) num_pages =
> 1;" might be more appropriate. However if it turns out that we *can*
> actually end up with unsanitised input from some userspace unmap
> interface getting this far, then a silent fixup is the best option, but
> if so I'd still like to confirm that we're rounding in the same
> direction as whoever touched the pagetables (since it can't have been us).

I see. I'll give an update once I have the debugging result.

Thanks!
Nic
Nicolin Chen April 16, 2022, 2:03 a.m. UTC | #5
On Thu, Apr 14, 2022 at 11:32:38AM +0100, Robin Murphy wrote:
> > Looking at the call trace within the arm_smmu_* functions:
> >    __arm_smmu_tlb_inv_range
> >    arm_smmu_tlb_inv_range_asid
> >    arm_smmu_mm_invalidate_range
> >    {from mmu_notifier_* functions}
> > 
> > There's no address alignment check. Although I do think we
> > should fix the source that passes down the non-page-aligned
> > parameter, the SMMU driver shouldn't silently loop forever
> > if a set of unaligned inputs is given, IMHO.
> 
> Oh, sure, I'm not saying we definitely don't need to fix anything, I'd
> just like to get a better understanding of *what* we're fixing. I'd have
> (naively) expected the mm layer to give us page-aligned quantities even
> in the SVA notifier case, so if we've got a clear off-by-one somewhere
> in that path we should fix that before just blindly over-invalidating to
> paper over it; if we still also want to be robust at the SMMU driver end
> just in case, something like "if (WARN_ON(num_pages == 0)) num_pages =
> 1;" might be more appropriate. However if it turns out that we *can*
> actually end up with unsanitised input from some userspace unmap
> interface getting this far, then a silent fixup is the best option, but
> if so I'd still like to confirm that we're rounding in the same
> direction as whoever touched the pagetables (since it can't have been us).

I got some details:

[ 1008.868735] mmap: -------__do_munmap: range [ffffa4fd0000, ffffa4fe0000] len 10000
[ 1008.869183] -------arm_smmu_mm_invalidate_range: range [ffffa4fd0000, ffffa4fe0000] len 10001
[ 1009.056127] ------------[ cut here ]------------
[ 1009.345791] WARNING: CPU: 0 PID: 131 at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c:189 arm_smmu_mm_invalidate_range+0x4c/0xa8
[ 1009.605439] Modules linked in: nvidia(O)
[ 1009.692799] CPU: 0 PID: 131 Comm: dmaTest Tainted: G        W  O      5.15.0-tegra #30
[ 1009.865535] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 1010.015871] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1010.168191] pc : arm_smmu_mm_invalidate_range+0x4c/0xa8
[ 1010.283136] lr : arm_smmu_mm_invalidate_range+0x48/0xa8
[ 1010.397119] sp : ffff80001436fa60
[ 1010.469568] x29: ffff80001436fa60 x28: ffff00001840be80 x27: ffff000007b3fff0
[ 1010.629631] x26: 00e80000589f0f43 x25: ffff00001aa20288 x24: 0000000000000000
[ 1010.786432] x23: ffff0000138c1000 x22: ffff00001783aa00 x21: ffff00001c021380
[ 1010.944448] x20: 0000ffffa4fd0000 x19: 0000000000010001 x18: 0000000000000000
[ 1011.101568] x17: ffff80000e4b0000 x16: ffff800010010000 x15: 000081a13a744e89
[ 1011.259839] x14: 00000000000000ce x13: 00000000000000ce x12: 0000000000000000
[ 1011.415616] x11: 0000000000000010 x10: 00000000000009c0 x9 : ffff80001436f7f0
[ 1011.575552] x8 : ffff000013563420 x7 : ffff00001feb9180 x6 : 00000000000035aa
[ 1011.731775] x5 : 0000000000000000 x4 : ffff00001feb29e0 x3 : ffff00001feb5a78
[ 1011.887615] x2 : 66f9034381513000 x1 : 0000000000000000 x0 : 0000000000000051
[ 1012.042944] Call trace:
[ 1012.097919]  arm_smmu_mm_invalidate_range+0x4c/0xa8
[ 1012.204480]  __mmu_notifier_invalidate_range+0x68/0xb0
[ 1012.318208]  unmap_page_range+0x730/0x740
[ 1012.405951]  unmap_single_vma+0x4c/0xb0
[ 1012.521920]  unmap_vmas+0x70/0xf0
[ 1012.633727]  unmap_region+0xb0/0x110
[ 1012.753856]  __do_munmap+0x36c/0x460
[ 1012.855168]  __vm_munmap+0x70/0xd0
[ 1012.929791]  __arm64_sys_munmap+0x34/0x50
[ 1013.018944]  invoke_syscall.constprop.0+0x4c/0xe0
[ 1013.122047]  do_el0_svc+0x50/0x150
[ 1013.196415]  el0_svc+0x28/0xc0
[ 1013.262848]  el0t_64_sync_handler+0xb0/0xc0
[ 1013.355584]  el0t_64_sync+0x1a0/0x1a4
[ 1013.435903] ---[ end trace c95eb7dc909f29ba ]---

We can see from the call trace and logs that the invalidation
range comes from __do_munmap() with an end address of
0xffffa4fe0000.

The problem seems to be the difference between how the mm and
iommu cores express their end addresses: the mm core computes
end as start + size, while the iommu core subtracts 1 from that.
So the end address 0xffffa4fe0000 would be 0xffffa4fdffff in the
iommu's convention.
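
To make the off-by-one concrete (a tiny sketch using the addresses
from the log above):

	/* mm/VMA convention: 'end' is exclusive (first byte after) */
	unsigned long start = 0xffffa4fd0000UL;
	unsigned long end   = 0xffffa4fe0000UL;

	size_t size_mm    = end - start;	/* 0x10000, page-aligned  */
	size_t size_iommu = end - start + 1;	/* 0x10001, unaligned bug */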

Perhaps we should simply do something like the following?

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index d816759a6bcf..e280568bb513 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -183,7 +183,7 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
 {
        struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
        struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
-       size_t size = end - start + 1;
+       size_t size = end - start;

        if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
                arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,

Thanks
Nic
Jason Gunthorpe April 19, 2022, 8:02 p.m. UTC | #6
On Fri, Apr 15, 2022 at 07:03:20PM -0700, Nicolin Chen wrote:

> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index d816759a6bcf..e280568bb513 100644
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -183,7 +183,7 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
>  {
>         struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
>         struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> -       size_t size = end - start + 1;
> +       size_t size = end - start;

+1 to this bug fix. You should send a formal patch for stable with a Fixes: tag, etc.

mmu notifiers use 'end', not 'last', in alignment with how VMAs work:

include/linux/mm_types.h:       unsigned long vm_end;           /* The first byte after our end address

Jason
Nicolin Chen April 19, 2022, 8:05 p.m. UTC | #7
On Tue, Apr 19, 2022 at 05:02:33PM -0300, Jason Gunthorpe wrote:

> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> > index d816759a6bcf..e280568bb513 100644
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> > @@ -183,7 +183,7 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
> >  {
> >         struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
> >         struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> > -       size_t size = end - start + 1;
> > +       size_t size = end - start;
> 
> +1 to this bug fix. You should send a formal patch for stable with a Fixes: tag, etc.
> 
> mmu notifiers use 'end', not 'last', in alignment with how VMAs work:
> 
> include/linux/mm_types.h:       unsigned long vm_end;           /* The first byte after our end address

Thanks for the review!

Yea, I will send a new patch.

Nic
Robin Murphy April 19, 2022, 8:15 p.m. UTC | #8
On 2022-04-19 21:05, Nicolin Chen wrote:
> On Tue, Apr 19, 2022 at 05:02:33PM -0300, Jason Gunthorpe wrote:
> 
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
>>> index d816759a6bcf..e280568bb513 100644
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
>>> @@ -183,7 +183,7 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
>>>   {
>>>          struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
>>>          struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
>>> -       size_t size = end - start + 1;
>>> +       size_t size = end - start;
>>
>> +1 to this bug fix. You should send a formal patch for stable with a Fixes: tag, etc.
>>
>> mmu notifiers use 'end', not 'last', in alignment with how VMAs work:
>>
>> include/linux/mm_types.h:       unsigned long vm_end;           /* The first byte after our end address
> 
> Thanks for the review!
> 
> Yea, I will send a new patch.

Yup, +1 from me too - this is exactly the kind of thing I suspected - 
and I reckon it might even be worth a comment in the code here that mm's 
"end" is an exclusive limit, to help us remember in future. If there 
doesn't look to be any way for completely arbitrarily-aligned addresses 
to slip through then I'd be tempted to leave it at that (i.e. reason 
that if the infinite loop can only happen due to catastrophic failure 
then it's beyond the scope of things that are worth trying to mitigate), 
but I'll let Jean and Will have the final say there.
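
Such a comment, combined with the fix above, might look something
like this (a sketch of the suggestion, not the final patch):

	/*
	 * mm's "end" is an exclusive limit: it is the first byte
	 * *after* the range, as with VMAs (see vm_end), so no +1.
	 */
	size_t size = end - start;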

Cheers,
Robin.

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8f..8249dad5ae44 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1879,7 +1879,7 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
 		/* Determine what level the granule is at */
 		cmd->tlbi.ttl = 4 - ((ilog2(granule) - 3) / (tg - 3));
 
-		num_pages = size >> tg;
+		num_pages = ALIGN(size, 1 << tg) >> tg;
 	}
 
 	cmds.num = 0;