Message ID | 20210820213117.13050-2-alyssa.rosenzweig@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/panfrost: Bug fixes for lock_region | expand |
On 20/08/2021 22:31, Alyssa Rosenzweig wrote: > In lock_region, simplify the calculation of the region_width parameter. > This field is the size, but encoded as log2(ceil(size)) - 1. > log2(ceil(size)) may be computed directly as fls(size - 1). However, we > want to use the 64-bit versions as the amount to lock can exceed > 32-bits. > > This avoids undefined behaviour when locking all memory (size ~0), > caught by UBSAN. It might have been useful to mention what it is that UBSAN specifically picked up (it took me a while to spot) - but anyway I think there's a bigger issue with it being completely wrong when size == ~0 (see below). > Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> > Reported-and-tested-by: Chris Morgan <macromorgan@hotmail.com> > Cc: <stable@vger.kernel.org> However, I've confirmed this returns the same values and is certainly more simple, so: Reviewed-by: Steven Price <steven.price@arm.com> > --- > drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++-------------- > 1 file changed, 5 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c > index 0da5b3100ab1..f6e02d0392f4 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c > @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr, > { > u8 region_width; > u64 region = iova & PAGE_MASK; > - /* > - * fls returns: > - * 1 .. 32 > - * > - * 10 + fls(num_pages) > - * results in the range (11 .. 42) > - */ > - > - size = round_up(size, PAGE_SIZE); This seems to be the first issue - ~0 will be 'rounded up' to 0. > > - region_width = 10 + fls(size >> PAGE_SHIFT); fls(0) == 0, so region_width == 10. > - if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) { Presumably here's where UBSAN objects - we're shifting by a negative value, which even it it happens to works means the lock region is tiny and certainly not what was intended! It might well be worth a: Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Note for anyone following along at (working-from-) home: although this code was cargo culted from kbase - kbase is fine because it takes a pfn and doesn't do the round_up() stage. Which also exposes the second bug (fixed in patch 2): a size_t isn't big enough on 32 bit platforms (all Midgard/Bifrost GPUs have a VA size bigger than 32 bits). Again kbase gets away with a u32 because it's a pfn. There is potentially a third bug which kbase only recently attempted to fix. The lock address is effectively rounded down by the hardware (the bottom bits are ignored). So if you have mask=(1<<region_width)-1 but (iova & mask) != ((iova + size) & mask) then you are potentially failing to lock the end of the intended region. kbase has added some code to handle this: > /* Round up if some memory pages spill into the next region. */ > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT); > region_frame_number_end = > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT); > > if (region_frame_number_start < region_frame_number_end) > lockaddr_size_log2 += 1; I guess we should too? Steve > - /* not pow2, so must go up to the next pow2 */ > - region_width += 1; > - } > + /* The size is encoded as ceil(log2) minus(1), which may be calculated > + * with fls. The size must be clamped to hardware bounds. > + */ > + size = max_t(u64, size, PAGE_SIZE); > + region_width = fls64(size - 1) - 1; > region |= region_width; > > /* Lock the region that needs to be updated */ >
> > In lock_region, simplify the calculation of the region_width parameter. > > This field is the size, but encoded as log2(ceil(size)) - 1. > > log2(ceil(size)) may be computed directly as fls(size - 1). However, we > > want to use the 64-bit versions as the amount to lock can exceed > > 32-bits. > > > > This avoids undefined behaviour when locking all memory (size ~0), > > caught by UBSAN. > > It might have been useful to mention what it is that UBSAN specifically > picked up (it took me a while to spot) - but anyway I think there's a > bigger issue with it being completely wrong when size == ~0 (see below). Indeed. I've updated the commit message in v2 to explain what goes wrong (your analysis was spot on, but a mailing list message is more ephermal than a commit message). I'll send out v2 tomorrow assuming nobody objects to v1 in the mean time. Thanks for the review. > There is potentially a third bug which kbase only recently attempted to > fix. The lock address is effectively rounded down by the hardware (the > bottom bits are ignored). So if you have mask=(1<<region_width)-1 but > (iova & mask) != ((iova + size) & mask) then you are potentially failing > to lock the end of the intended region. kbase has added some code to > handle this: > > > /* Round up if some memory pages spill into the next region. */ > > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT); > > region_frame_number_end = > > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT); > > > > if (region_frame_number_start < region_frame_number_end) > > lockaddr_size_log2 += 1; > > I guess we should too? Oh, I missed this one. Guess we have 4 bugs with this code instead of just 3, yikes. How could such a short function be so deeply and horribly broken?
On 23 August 2021 22:09:08 BST, Alyssa Rosenzweig <alyssa@collabora.com> wrote: >> > In lock_region, simplify the calculation of the region_width parameter. >> > This field is the size, but encoded as log2(ceil(size)) - 1. >> > log2(ceil(size)) may be computed directly as fls(size - 1). However, we >> > want to use the 64-bit versions as the amount to lock can exceed >> > 32-bits. >> > >> > This avoids undefined behaviour when locking all memory (size ~0), >> > caught by UBSAN. >> >> It might have been useful to mention what it is that UBSAN specifically >> picked up (it took me a while to spot) - but anyway I think there's a >> bigger issue with it being completely wrong when size == ~0 (see below). > >Indeed. I've updated the commit message in v2 to explain what goes >wrong (your analysis was spot on, but a mailing list message is more >ephermal than a commit message). I'll send out v2 tomorrow assuming >nobody objects to v1 in the mean time. > >Thanks for the review. > >> There is potentially a third bug which kbase only recently attempted to >> fix. The lock address is effectively rounded down by the hardware (the >> bottom bits are ignored). So if you have mask=(1<<region_width)-1 but >> (iova & mask) != ((iova + size) & mask) then you are potentially failing >> to lock the end of the intended region. kbase has added some code to >> handle this: >> >> > /* Round up if some memory pages spill into the next region. */ >> > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT); >> > region_frame_number_end = >> > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT); >> > >> > if (region_frame_number_start < region_frame_number_end) >> > lockaddr_size_log2 += 1; >> >> I guess we should too? > >Oh, I missed this one. Guess we have 4 bugs with this code instead of >just 3, yikes. How could such a short function be so deeply and horribly >broken? ���� > >Should I add a fourth patch to the series to fix this? Yes please! Thanks, Steve
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 0da5b3100ab1..f6e02d0392f4 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr, { u8 region_width; u64 region = iova & PAGE_MASK; - /* - * fls returns: - * 1 .. 32 - * - * 10 + fls(num_pages) - * results in the range (11 .. 42) - */ - - size = round_up(size, PAGE_SIZE); - region_width = 10 + fls(size >> PAGE_SHIFT); - if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) { - /* not pow2, so must go up to the next pow2 */ - region_width += 1; - } + /* The size is encoded as ceil(log2) minus(1), which may be calculated + * with fls. The size must be clamped to hardware bounds. + */ + size = max_t(u64, size, PAGE_SIZE); + region_width = fls64(size - 1) - 1; region |= region_width; /* Lock the region that needs to be updated */
In lock_region, simplify the calculation of the region_width parameter. This field is the size, but encoded as log2(ceil(size)) - 1. log2(ceil(size)) may be computed directly as fls(size - 1). However, we want to use the 64-bit versions as the amount to lock can exceed 32-bits. This avoids undefined behaviour when locking all memory (size ~0), caught by UBSAN. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-and-tested-by: Chris Morgan <macromorgan@hotmail.com> Cc: <stable@vger.kernel.org> --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-)