diff mbox

[1/3] radix-tree: 'slot' can be NULL in radix_tree_next_slot()

Message ID 20160808185747.21028-1-ross.zwisler@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ross Zwisler Aug. 8, 2016, 6:57 p.m. UTC
There are four cases I can see where we could end up with a NULL 'slot' in
radix_tree_next_slot().  Yet radix_tree_next_slot() never actually checks
whether 'slot' is NULL.  It just happens that for the cases where 'slot' is
NULL, some other combination of factors prevents us from dereferencing it.

It would be very easy for someone to unwittingly change one of these
factors without realizing that we are implicitly depending on it to save us
from a NULL pointer dereference.

So, explicitly account for the fact that that 'slot' can be NULL in
radix_tree_next_slot() and save ourselves from future crashes and debugging
efforts.

Here are details on the four cases:

1) radix_tree_iter_retry() via a non-tagged iteration like
radix_tree_for_each_slot().  In this case we currently aren't seeing a bug
because radix_tree_iter_retry() sets

	iter->next_index = iter->index;

which means that in in the else case in radix_tree_next_slot(), 'count' is
zero, so we skip over the while() loop and effectively just return NULL
without ever dereferencing 'slot'.

2) radix_tree_iter_retry() via tagged iteration like
radix_tree_for_each_tagged().  This case was giving us NULL pointer
dereferences in testing, and was fixed with this commit:

commit 3cb9185c6730 ("radix-tree: fix radix_tree_iter_retry() for tagged
iterators.")

This fix doesn't explicitly check for 'slot' being NULL, though, it works
around the NULL pointer dereference by instead zeroing iter->tags in
radix_tree_iter_retry(), which makes us bail out of the if() case in
radix_tree_next_slot() before we dereference 'slot'.

3) radix_tree_iter_next() via via a non-tagged iteration like
radix_tree_for_each_slot().  This currently happens in shmem_tag_pins()
and shmem_partial_swap_usage().

As with non-tagged iteration, 'count' in the else case of
radix_tree_next_slot() is zero, so we skip over the while() loop and
effectively just return NULL without ever dereferencing 'slot'.

4) radix_tree_iter_next() via tagged iteration like
radix_tree_for_each_tagged().  This happens in shmem_wait_for_pins().

radix_tree_iter_next() zeros out iter->tags, so we end up exiting
radix_tree_next_slot() here:

	if (flags & RADIX_TREE_ITER_TAGGED) {
		void *canon = slot;

		iter->tags >>= 1;
		if (unlikely(!iter->tags))
			return NULL;

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 include/linux/radix-tree.h | 3 +++
 1 file changed, 3 insertions(+)

Comments

Konstantin Khlebnikov Aug. 8, 2016, 7:21 p.m. UTC | #1
On Mon, Aug 8, 2016 at 9:57 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> There are four cases I can see where we could end up with a NULL 'slot' in
> radix_tree_next_slot().  Yet radix_tree_next_slot() never actually checks
> whether 'slot' is NULL.  It just happens that for the cases where 'slot' is
> NULL, some other combination of factors prevents us from dereferencing it.
>
> It would be very easy for someone to unwittingly change one of these
> factors without realizing that we are implicitly depending on it to save us
> from a NULL pointer dereference.
>
> So, explicitly account for the fact that that 'slot' can be NULL in
> radix_tree_next_slot() and save ourselves from future crashes and debugging
> efforts.
>
> Here are details on the four cases:
>
> 1) radix_tree_iter_retry() via a non-tagged iteration like
> radix_tree_for_each_slot().  In this case we currently aren't seeing a bug
> because radix_tree_iter_retry() sets
>
>         iter->next_index = iter->index;
>
> which means that in in the else case in radix_tree_next_slot(), 'count' is
> zero, so we skip over the while() loop and effectively just return NULL
> without ever dereferencing 'slot'.
>
> 2) radix_tree_iter_retry() via tagged iteration like
> radix_tree_for_each_tagged().  This case was giving us NULL pointer
> dereferences in testing, and was fixed with this commit:
>
> commit 3cb9185c6730 ("radix-tree: fix radix_tree_iter_retry() for tagged
> iterators.")
>
> This fix doesn't explicitly check for 'slot' being NULL, though, it works
> around the NULL pointer dereference by instead zeroing iter->tags in
> radix_tree_iter_retry(), which makes us bail out of the if() case in
> radix_tree_next_slot() before we dereference 'slot'.
>
> 3) radix_tree_iter_next() via via a non-tagged iteration like
> radix_tree_for_each_slot().  This currently happens in shmem_tag_pins()
> and shmem_partial_swap_usage().
>
> As with non-tagged iteration, 'count' in the else case of
> radix_tree_next_slot() is zero, so we skip over the while() loop and
> effectively just return NULL without ever dereferencing 'slot'.
>
> 4) radix_tree_iter_next() via tagged iteration like
> radix_tree_for_each_tagged().  This happens in shmem_wait_for_pins().
>
> radix_tree_iter_next() zeros out iter->tags, so we end up exiting
> radix_tree_next_slot() here:
>
>         if (flags & RADIX_TREE_ITER_TAGGED) {
>                 void *canon = slot;
>
>                 iter->tags >>= 1;
>                 if (unlikely(!iter->tags))
>                         return NULL;
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> ---
>  include/linux/radix-tree.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
> index 4c45105..1bf16ed 100644
> --- a/include/linux/radix-tree.h
> +++ b/include/linux/radix-tree.h
> @@ -465,6 +465,9 @@ static inline struct radix_tree_node *entry_to_node(void *ptr)
>  static __always_inline void **
>  radix_tree_next_slot(void **slot, struct radix_tree_iter *iter, unsigned flags)
>  {
> +       if (unlikely(!slot))
> +               return NULL;
> +
>         if (flags & RADIX_TREE_ITER_TAGGED) {
>                 void *canon = slot;
>

NAK. This is fast path and it's already bloated.
I want to revert most changes here and rework "multiorder" entries.

Here you can find almost ready patchset for that
https://github.com/koct9i/linux/commits/radix-tree

> --
> 2.9.0
>
Ross Zwisler Aug. 9, 2016, 3:27 p.m. UTC | #2
On Mon, Aug 08, 2016 at 10:21:39PM +0300, Konstantin Khlebnikov wrote:
<>
> NAK. This is fast path and it's already bloated.
> I want to revert most changes here and rework "multiorder" entries.
> 
> Here you can find almost ready patchset for that
> https://github.com/koct9i/linux/commits/radix-tree

Okay...are you okay with the second 2 patches in the series?  They stand
alone, and I believe are both good to have.
Konstantin Khlebnikov Aug. 10, 2016, 6:29 a.m. UTC | #3
On Tue, Aug 9, 2016 at 6:27 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> On Mon, Aug 08, 2016 at 10:21:39PM +0300, Konstantin Khlebnikov wrote:
> <>
>> NAK. This is fast path and it's already bloated.
>> I want to revert most changes here and rework "multiorder" entries.
>>
>> Here you can find almost ready patchset for that
>> https://github.com/koct9i/linux/commits/radix-tree
>
> Okay...are you okay with the second 2 patches in the series?  They stand
> alone, and I believe are both good to have.

They looks good.

If you're worried about complicated paths - I'll add comment for
*_next_slot() about that.
Ross Zwisler Aug. 10, 2016, 2:52 p.m. UTC | #4
On Wed, Aug 10, 2016 at 09:29:23AM +0300, Konstantin Khlebnikov wrote:
> On Tue, Aug 9, 2016 at 6:27 PM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > On Mon, Aug 08, 2016 at 10:21:39PM +0300, Konstantin Khlebnikov wrote:
> > <>
> >> NAK. This is fast path and it's already bloated.
> >> I want to revert most changes here and rework "multiorder" entries.
> >>
> >> Here you can find almost ready patchset for that
> >> https://github.com/koct9i/linux/commits/radix-tree
> >
> > Okay...are you okay with the second 2 patches in the series?  They stand
> > alone, and I believe are both good to have.
> 
> They looks good.

Cool, can I interpret that as an Acked-by for my v2? :)

> If you're worried about complicated paths - I'll add comment for
> *_next_slot() about that.

Yea, if we can document the exact ways in which we're protected from not
dereferencing a NULL 'slot', that would be great.  I think it essentially
boils down to:

1) For tagged iteration, if 'slot' is NULL then iter->tags must be cleared
2) For non-tagged iteration, it 'slot' is NULL then
   radix_tree_chunk_size(iter) must return 1 or less.
diff mbox

Patch

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 4c45105..1bf16ed 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -465,6 +465,9 @@  static inline struct radix_tree_node *entry_to_node(void *ptr)
 static __always_inline void **
 radix_tree_next_slot(void **slot, struct radix_tree_iter *iter, unsigned flags)
 {
+	if (unlikely(!slot))
+		return NULL;
+
 	if (flags & RADIX_TREE_ITER_TAGGED) {
 		void *canon = slot;