
[3/3] mm/kmemleak: Prevent soft lockup in first object iteration loop of kmemleak_scan()

Message ID 20220612183301.981616-4-longman@redhat.com (mailing list archive)
State New
Series mm/kmemleak: Avoid soft lockup in kmemleak_scan()

Commit Message

Waiman Long June 12, 2022, 6:33 p.m. UTC
The first RCU-based object iteration loop has to put almost all the
objects into the gray list and so cannot skip taking the object lock.

One way to avoid a soft lockup is to insert an occasional cond_resched()
into the loop. This cannot be done while holding the RCU read lock,
which protects objects from being removed. However, putting an object
into the gray list means taking a reference to it, and that reference
prevents the object from being removed just as well, without the need
to hold the RCU read lock. So insert a cond_resched() call after every
64k objects are put into the gray list.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 mm/kmemleak.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
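
For context, the reference that pins a gray-list entry is kmemleak's
per-object use_count. Roughly (paraphrased, so a sketch rather than an
exact copy of mm/kmemleak.c), the reference helpers look like this; a
successful get_object() keeps the object alive even with the RCU read
lock dropped, and the final free is itself deferred through RCU:

	/* a reference can only be taken while the object is still live */
	static int get_object(struct kmemleak_object *object)
	{
		return atomic_inc_not_zero(&object->use_count);
	}

	static void put_object(struct kmemleak_object *object)
	{
		if (!atomic_dec_and_test(&object->use_count))
			return;

		/* last reference gone: free only after an RCU grace period */
		call_rcu(&object->rcu, free_object_rcu);
	}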

Comments

Catalin Marinas June 14, 2022, 5:15 p.m. UTC | #1
On Sun, Jun 12, 2022 at 02:33:01PM -0400, Waiman Long wrote:
> @@ -1437,10 +1440,25 @@ static void kmemleak_scan(void)
>  #endif
>  		/* reset the reference count (whiten the object) */
>  		object->count = 0;
> -		if (color_gray(object) && get_object(object))
> +		if (color_gray(object) && get_object(object)) {
>  			list_add_tail(&object->gray_list, &gray_list);
> +			gray_list_cnt++;
> +			object_pinned = true;
> +		}
>  
>  		raw_spin_unlock_irq(&object->lock);
> +
> +		/*
> +		 * With object pinned by a positive reference count, it
> +		 * won't go away and we can safely release the RCU read
> +		 * lock and do a cond_resched() to avoid soft lockup every
> +		 * 64k objects.
> +		 */
> +		if (object_pinned && !(gray_list_cnt & 0xffff)) {
> +			rcu_read_unlock();
> +			cond_resched();
> +			rcu_read_lock();
> +		}

I'm not sure this gains much. There should be very few gray objects
initially (those passed to kmemleak_not_leak() for example). The
majority should be white objects.

If we drop the fine-grained object->lock, we could instead take
kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
within the loop. I think we can get away with not having an
rcu_read_lock() at all for list traversal with the big lock outside the
loop.
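
A rough sketch of that shape, assuming for illustration that
kmemleak_lock were an ordinary spinlock_t (in the current code it is a
raw_spinlock_t, so an actual conversion would need more than this):

	spin_lock(&kmemleak_lock);
	list_for_each_entry(object, &object_list, object_list) {
		/* ... whiten the object, move gray objects to gray_list ... */

		/* drops and re-takes the lock when a reschedule is needed */
		cond_resched_lock(&kmemleak_lock);
	}
	spin_unlock(&kmemleak_lock);

The catch, raised in the follow-up below, is that the list iteration is
no longer protected while the lock is dropped.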

The reason I added it in the first kmemleak incarnation was to defer
kmemleak_object freeing as it was causing a re-entrant call into the
slab allocator. I later went for fine-grained locking and RCU list
traversal but I may have overdone it ;).
Catalin Marinas June 14, 2022, 5:27 p.m. UTC | #2
On Tue, Jun 14, 2022 at 06:15:14PM +0100, Catalin Marinas wrote:
> On Sun, Jun 12, 2022 at 02:33:01PM -0400, Waiman Long wrote:
> > @@ -1437,10 +1440,25 @@ static void kmemleak_scan(void)
> >  #endif
> >  		/* reset the reference count (whiten the object) */
> >  		object->count = 0;
> > -		if (color_gray(object) && get_object(object))
> > +		if (color_gray(object) && get_object(object)) {
> >  			list_add_tail(&object->gray_list, &gray_list);
> > +			gray_list_cnt++;
> > +			object_pinned = true;
> > +		}
> >  
> >  		raw_spin_unlock_irq(&object->lock);
> > +
> > +		/*
> > +		 * With object pinned by a positive reference count, it
> > +		 * won't go away and we can safely release the RCU read
> > +		 * lock and do a cond_resched() to avoid soft lockup every
> > +		 * 64k objects.
> > +		 */
> > +		if (object_pinned && !(gray_list_cnt & 0xffff)) {
> > +			rcu_read_unlock();
> > +			cond_resched();
> > +			rcu_read_lock();
> > +		}
> 
> I'm not sure this gains much. There should be very few gray objects
> initially (those passed to kmemleak_not_leak() for example). The
> majority should be white objects.
> 
> If we drop the fine-grained object->lock, we could instead take
> kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
> within the loop. I think we can get away with not having an
> rcu_read_lock() at all for list traversal with the big lock outside the
> loop.

Actually this doesn't work if the current object in the iteration is
freed. Does list_for_each_rcu_safe() help?
Waiman Long June 14, 2022, 6:22 p.m. UTC | #3
On 6/14/22 13:27, Catalin Marinas wrote:
> On Tue, Jun 14, 2022 at 06:15:14PM +0100, Catalin Marinas wrote:
>> On Sun, Jun 12, 2022 at 02:33:01PM -0400, Waiman Long wrote:
>>> @@ -1437,10 +1440,25 @@ static void kmemleak_scan(void)
>>>   #endif
>>>   		/* reset the reference count (whiten the object) */
>>>   		object->count = 0;
>>> -		if (color_gray(object) && get_object(object))
>>> +		if (color_gray(object) && get_object(object)) {
>>>   			list_add_tail(&object->gray_list, &gray_list);
>>> +			gray_list_cnt++;
>>> +			object_pinned = true;
>>> +		}
>>>   
I may have had the mistaken belief that setting count to 0 would make
most objects gray. Apparently, that is not the case here.
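
For reference, kmemleak's color test is roughly (paraphrased from
mm/kmemleak.c, a sketch rather than an exact quote):

	static bool color_gray(const struct kmemleak_object *object)
	{
		/* gray once the found references reach min_count */
		return object->min_count != KMEMLEAK_BLACK &&
		       object->count >= object->min_count;
	}

so after count is reset to 0, only objects registered with min_count == 0
stay gray; the common min_count == 1 objects turn white.
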
>>>   		raw_spin_unlock_irq(&object->lock);
>>> +
>>> +		/*
>>> +		 * With object pinned by a positive reference count, it
>>> +		 * won't go away and we can safely release the RCU read
>>> +		 * lock and do a cond_resched() to avoid soft lockup every
>>> +		 * 64k objects.
>>> +		 */
>>> +		if (object_pinned && !(gray_list_cnt & 0xffff)) {
>>> +			rcu_read_unlock();
>>> +			cond_resched();
>>> +			rcu_read_lock();
>>> +		}
>> I'm not sure this gains much. There should be very few gray objects
>> initially (those passed to kmemleak_not_leak() for example). The
>> majority should be white objects.
>>
>> If we drop the fine-grained object->lock, we could instead take
>> kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
>> within the loop. I think we can get away with not having an
>> rcu_read_lock() at all for list traversal with the big lock outside the
>> loop.
> Actually this doesn't work if the current object in the iteration is
> freed. Does list_for_each_rcu_safe() help?

list_for_each_rcu_safe() helps if we are worrying about the object being 
freed. However, it won't help if object->next is freed instead.

How about something like:

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7dd64139a7c7..fd836e43cb16 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
         struct zone *zone;
         int __maybe_unused i;
         int new_leaks = 0;
+       int loop1_cnt = 0;

         jiffies_last_scan = jiffies;

         /* prepare the kmemleak_object's */
         rcu_read_lock();
         list_for_each_entry_rcu(object, &object_list, object_list) {
+               bool obj_pinned = false;
+
+               loop1_cnt++;
                 raw_spin_lock_irq(&object->lock);
  #ifdef DEBUG
                 /*
@@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
  #endif
                 /* reset the reference count (whiten the object) */
                 object->count = 0;
-               if (color_gray(object) && get_object(object))
+               if (color_gray(object) && get_object(object)) {
                         list_add_tail(&object->gray_list, &gray_list);
+                       obj_pinned = true;
+               }

                 raw_spin_unlock_irq(&object->lock);
+
+               /*
+                * Do a cond_resched() to avoid soft lockup every 64k objects.
+                * Make sure a reference has been taken so that the object
+                * won't go away without RCU read lock.
+                */
+               if (loop1_cnt & 0xffff) {
+                       if (!obj_pinned && !get_object(object)) {
+                               /* Try the next object instead */
+                               loop1_cnt--;
+                               continue;
+                       }
+
+                       rcu_read_unlock();
+                       cond_resched();
+                       rcu_read_lock();
+
+                       if (!obj_pinned)
+                               put_object(object);
+               }
         }
         rcu_read_unlock();

Cheers,
Longman
Waiman Long June 14, 2022, 6:28 p.m. UTC | #4
On 6/14/22 14:22, Waiman Long wrote:
> On 6/14/22 13:27, Catalin Marinas wrote:
>
>>>> raw_spin_unlock_irq(&object->lock);
>>>> +
>>>> +        /*
>>>> +         * With object pinned by a positive reference count, it
>>>> +         * won't go away and we can safely release the RCU read
>>>> +         * lock and do a cond_resched() to avoid soft lockup every
>>>> +         * 64k objects.
>>>> +         */
>>>> +        if (object_pinned && !(gray_list_cnt & 0xffff)) {
>>>> +            rcu_read_unlock();
>>>> +            cond_resched();
>>>> +            rcu_read_lock();
>>>> +        }
>>> I'm not sure this gains much. There should be very few gray objects
>>> initially (those passed to kmemleak_not_leak() for example). The
>>> majority should be white objects.
>>>
>>> If we drop the fine-grained object->lock, we could instead take
>>> kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
>>> within the loop. I think we can get away with not having an
>>> rcu_read_lock() at all for list traversal with the big lock outside the
>>> loop.
>> Actually this doesn't work is the current object in the iteration is
>> freed. Does list_for_each_rcu_safe() help?
>
> list_for_each_rcu_safe() helps if we are worrying about the object being 
> freed. However, it won't help if object->next is freed instead.
>
> How about something like:
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 7dd64139a7c7..fd836e43cb16 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
>         struct zone *zone;
>         int __maybe_unused i;
>         int new_leaks = 0;
> +       int loop1_cnt = 0;
>
>         jiffies_last_scan = jiffies;
>
>         /* prepare the kmemleak_object's */
>         rcu_read_lock();
>         list_for_each_entry_rcu(object, &object_list, object_list) {
> +               bool obj_pinned = false;
> +
> +               loop1_cnt++;
>                 raw_spin_lock_irq(&object->lock);
>  #ifdef DEBUG
>                 /*
> @@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
>  #endif
>                 /* reset the reference count (whiten the object) */
>                 object->count = 0;
> -               if (color_gray(object) && get_object(object))
> +               if (color_gray(object) && get_object(object)) {
>                         list_add_tail(&object->gray_list, &gray_list);
> +                       obj_pinned = true;
> +               }
>
>                 raw_spin_unlock_irq(&object->lock);
> +
> +               /*
> +                * Do a cond_resched() to avoid soft lockup every 64k objects.
> +                * Make sure a reference has been taken so that the object
> +                * won't go away without RCU read lock.
> +                */
> +               if (loop1_cnt & 0xffff) {


Sorry, should be "(!(loop1_cnt & 0xffff))".

> +                       if (!obj_pinned && !get_object(object)) {
> +                               /* Try the next object instead */
> +                               loop1_cnt--;
> +                               continue;
> +                       }
> +
> +                       rcu_read_unlock();
> +                       cond_resched();
> +                       rcu_read_lock();
> +
> +                       if (!obj_pinned)
> +                               put_object(object);
> +               }
>         }
>         rcu_read_unlock();
>
Cheers,
Longman

Patch

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7dd64139a7c7..a7c42e134fa1 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1417,12 +1417,15 @@  static void kmemleak_scan(void)
 	struct zone *zone;
 	int __maybe_unused i;
 	int new_leaks = 0;
+	int gray_list_cnt = 0;
 
 	jiffies_last_scan = jiffies;
 
 	/* prepare the kmemleak_object's */
 	rcu_read_lock();
 	list_for_each_entry_rcu(object, &object_list, object_list) {
+		bool object_pinned = false;
+
 		raw_spin_lock_irq(&object->lock);
 #ifdef DEBUG
 		/*
@@ -1437,10 +1440,25 @@  static void kmemleak_scan(void)
 #endif
 		/* reset the reference count (whiten the object) */
 		object->count = 0;
-		if (color_gray(object) && get_object(object))
+		if (color_gray(object) && get_object(object)) {
 			list_add_tail(&object->gray_list, &gray_list);
+			gray_list_cnt++;
+			object_pinned = true;
+		}
 
 		raw_spin_unlock_irq(&object->lock);
+
+		/*
+		 * With object pinned by a positive reference count, it
+		 * won't go away and we can safely release the RCU read
+		 * lock and do a cond_resched() to avoid soft lockup every
+		 * 64k objects.
+		 */
+		if (object_pinned && !(gray_list_cnt & 0xffff)) {
+			rcu_read_unlock();
+			cond_resched();
+			rcu_read_lock();
+		}
 	}
 	rcu_read_unlock();