diff mbox series

[v3] kasan: Don't call find_vm_area() in RT kernel

Message ID 20250217042108.185932-1-longman@redhat.com (mailing list archive)
State New
Headers show
Series [v3] kasan: Don't call find_vm_area() in RT kernel | expand

Commit Message

Waiman Long Feb. 17, 2025, 4:21 a.m. UTC
The following bug report appeared with a test run in a RT debug kernel.

[ 3359.353842] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 3359.353848] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 140605, name: kunit_try_catch
[ 3359.353853] preempt_count: 1, expected: 0
  :
[ 3359.353933] Call trace:
  :
[ 3359.353955]  rt_spin_lock+0x70/0x140
[ 3359.353959]  find_vmap_area+0x84/0x168
[ 3359.353963]  find_vm_area+0x1c/0x50
[ 3359.353966]  print_address_description.constprop.0+0x2a0/0x320
[ 3359.353972]  print_report+0x108/0x1f8
[ 3359.353976]  kasan_report+0x90/0xc8
[ 3359.353980]  __asan_load1+0x60/0x70

Commit e30a0361b851 ("kasan: make report_lock a raw spinlock")
changes report_lock to a raw_spinlock_t to avoid a similar RT problem.
The print_address_description() function is called with report_lock
acquired and interrupt disabled.  However, the find_vm_area() function
still needs to acquire a spinlock_t which becomes a sleeping lock in
the RT kernel. IOW, we can't call find_vm_area() in a RT kernel and
changing report_lock to a raw_spinlock_t is not enough to completely
solve this RT kernel problem.

Fix this bug report by skipping the find_vm_area() call in this case
and just print out the address as is.

For !RT kernel, follow the example set in commit 0cce06ba859a
("debugobjects,locking: Annotate debug_object_fill_pool() wait type
violation") and use DEFINE_WAIT_OVERRIDE_MAP() to avoid a spinlock_t
inside raw_spinlock_t warning.

Fixes: e30a0361b851 ("kasan: make report_lock a raw spinlock")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 mm/kasan/report.c | 43 ++++++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 13 deletions(-)

 [v3] Rename helper to print_vmalloc_info_set_page.

Comments

Sebastian Andrzej Siewior Feb. 17, 2025, 3:43 p.m. UTC | #1
On 2025-02-16 23:21:08 [-0500], Waiman Long wrote:

I would skip the first part. The backtrace is not really helpful here.

> The following bug report appeared with a test run in a RT debug kernel.
> 
> [ 3359.353842] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [ 3359.353848] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 140605, name: kunit_try_catch
> [ 3359.353853] preempt_count: 1, expected: 0
>   :
> [ 3359.353933] Call trace:
>   :
> [ 3359.353955]  rt_spin_lock+0x70/0x140
> [ 3359.353959]  find_vmap_area+0x84/0x168
> [ 3359.353963]  find_vm_area+0x1c/0x50
> [ 3359.353966]  print_address_description.constprop.0+0x2a0/0x320
> [ 3359.353972]  print_report+0x108/0x1f8
> [ 3359.353976]  kasan_report+0x90/0xc8
> [ 3359.353980]  __asan_load1+0x60/0x70
> 
> Commit e30a0361b851 ("kasan: make report_lock a raw spinlock")
> changes report_lock to a raw_spinlock_t to avoid a similar RT problem.

s/to avoid.*//. This has nothing to do with the problem at hand.

> The print_address_description() function is called with report_lock
> acquired and interrupt disabled.  However, the find_vm_area() function
> still needs to acquire a spinlock_t which becomes a sleeping lock in
> the RT kernel. IOW, we can't call find_vm_area() in a RT kernel and
> changing report_lock to a raw_spinlock_t is not enough to completely
> solve this RT kernel problem.

This function is always invoked under the report_lock which is a
raw_spinlock_t. The context under this lock is always atomic even on
PREEMPT_RT. find_vm_area() acquires vmap_node::busy.lock which is a
spinlock_t, becoming a sleeping lock on PREEMPT_RT and must not be
acquired in atomic context.

> Fix this bug report by skipping the find_vm_area() call in this case
> and just print out the address as is.

Please use PREEMPT_RT instead of RT.

Don't invoke find_vm_area() on PREEMPT_RT and just print the address.
Non-PREEMPT_RT builds remain unchanged. Add a DEFINE_WAIT_OVERRIDE_MAP()
is to tell lockdep that this lock nesting allowed because the PREEMPT_RT
part (which is invalid) has been taken care of.

> For !RT kernel, follow the example set in commit 0cce06ba859a
> ("debugobjects,locking: Annotate debug_object_fill_pool() wait type
> violation") and use DEFINE_WAIT_OVERRIDE_MAP() to avoid a spinlock_t
> inside raw_spinlock_t warning.


> Fixes: e30a0361b851 ("kasan: make report_lock a raw spinlock")
> Signed-off-by: Waiman Long <longman@redhat.com>

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> ---
>  mm/kasan/report.c | 43 ++++++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
>  [v3] Rename helper to print_vmalloc_info_set_page.
> 
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index 3fe77a360f1c..7c8c2e173aa4 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -370,6 +370,34 @@ static inline bool init_task_stack_addr(const void *addr)
>  			sizeof(init_thread_union.stack));
>  }
>  
> +/*
> + * RT kernel cannot call find_vm_area() in atomic context. For !RT kernel,
> + * prevent spinlock_t inside raw_spinlock_t warning by raising wait-type
> + * to WAIT_SLEEP.
> + */

Do we need this comment? I lacks context of why it is atomic. And we
have it in the commit description.

> +static inline void print_vmalloc_info_set_page(void *addr, struct page **ppage)
> +{
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +		static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
> +		struct vm_struct *va;
> +
> +		lock_map_acquire_try(&vmalloc_map);
> +		va = find_vm_area(addr);
> +		if (va) {
> +			pr_err("The buggy address belongs to the virtual mapping at\n"
> +			       " [%px, %px) created by:\n"
> +			       " %pS\n",
> +			       va->addr, va->addr + va->size, va->caller);
> +			pr_err("\n");
> +
> +			*ppage = vmalloc_to_page(addr);
> +		}
> +		lock_map_release(&vmalloc_map);
> +		return;
> +	}
> +	pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
> +}
> +
>  static void print_address_description(void *addr, u8 tag,
>  				      struct kasan_report_info *info)
>  {

Sebastian
Andrey Konovalov Feb. 17, 2025, 4:28 p.m. UTC | #2
On Mon, Feb 17, 2025 at 5:21 AM Waiman Long <longman@redhat.com> wrote:
>
> The following bug report appeared with a test run in a RT debug kernel.
>
> [ 3359.353842] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> [ 3359.353848] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 140605, name: kunit_try_catch
> [ 3359.353853] preempt_count: 1, expected: 0
>   :
> [ 3359.353933] Call trace:
>   :
> [ 3359.353955]  rt_spin_lock+0x70/0x140
> [ 3359.353959]  find_vmap_area+0x84/0x168
> [ 3359.353963]  find_vm_area+0x1c/0x50
> [ 3359.353966]  print_address_description.constprop.0+0x2a0/0x320
> [ 3359.353972]  print_report+0x108/0x1f8
> [ 3359.353976]  kasan_report+0x90/0xc8
> [ 3359.353980]  __asan_load1+0x60/0x70
>
> Commit e30a0361b851 ("kasan: make report_lock a raw spinlock")
> changes report_lock to a raw_spinlock_t to avoid a similar RT problem.
> The print_address_description() function is called with report_lock
> acquired and interrupt disabled.  However, the find_vm_area() function
> still needs to acquire a spinlock_t which becomes a sleeping lock in
> the RT kernel. IOW, we can't call find_vm_area() in a RT kernel and
> changing report_lock to a raw_spinlock_t is not enough to completely
> solve this RT kernel problem.
>
> Fix this bug report by skipping the find_vm_area() call in this case
> and just print out the address as is.
>
> For !RT kernel, follow the example set in commit 0cce06ba859a
> ("debugobjects,locking: Annotate debug_object_fill_pool() wait type
> violation") and use DEFINE_WAIT_OVERRIDE_MAP() to avoid a spinlock_t
> inside raw_spinlock_t warning.
>
> Fixes: e30a0361b851 ("kasan: make report_lock a raw spinlock")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  mm/kasan/report.c | 43 ++++++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 13 deletions(-)
>
>  [v3] Rename helper to print_vmalloc_info_set_page.
>
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index 3fe77a360f1c..7c8c2e173aa4 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -370,6 +370,34 @@ static inline bool init_task_stack_addr(const void *addr)
>                         sizeof(init_thread_union.stack));
>  }
>
> +/*
> + * RT kernel cannot call find_vm_area() in atomic context. For !RT kernel,
> + * prevent spinlock_t inside raw_spinlock_t warning by raising wait-type
> + * to WAIT_SLEEP.

Quoting your response from the other thread:

> Lockdep currently issues warnings for taking spinlock_t inside
> raw_spinlock_t because it is not allowed in RT. Test coverage of RT
> kernels is likely less than !RT kernel and so less bug of this kind will
> be caught. By making !RT doing the same check, we increase coverage.
> However, we do allow override in the !RT case, but it has to be done on
> a case-by-case basis.

Got it.

So let's put this exactly this explanation in the comment, otherwise
it's unclear why we need something special for the !RT case.

> + */
> +static inline void print_vmalloc_info_set_page(void *addr, struct page **ppage)
> +{
> +       if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +               static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
> +               struct vm_struct *va;
> +
> +               lock_map_acquire_try(&vmalloc_map);
> +               va = find_vm_area(addr);
> +               if (va) {
> +                       pr_err("The buggy address belongs to the virtual mapping at\n"
> +                              " [%px, %px) created by:\n"
> +                              " %pS\n",
> +                              va->addr, va->addr + va->size, va->caller);
> +                       pr_err("\n");
> +
> +                       *ppage = vmalloc_to_page(addr);

Looking at the code again, I actually like the Andrey Ryabinin's
suggestion from the v1 thread: add a separate function that contains
an annotated call of find_vm_area(). And keep vmalloc_to_page()
outside of it, just as done in the upstream version now.

> +               }
> +               lock_map_release(&vmalloc_map);
> +               return;
> +       }
> +       pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
> +}
> +
>  static void print_address_description(void *addr, u8 tag,
>                                       struct kasan_report_info *info)
>  {
> @@ -398,19 +426,8 @@ static void print_address_description(void *addr, u8 tag,
>                 pr_err("\n");
>         }
>
> -       if (is_vmalloc_addr(addr)) {
> -               struct vm_struct *va = find_vm_area(addr);
> -
> -               if (va) {
> -                       pr_err("The buggy address belongs to the virtual mapping at\n"
> -                              " [%px, %px) created by:\n"
> -                              " %pS\n",
> -                              va->addr, va->addr + va->size, va->caller);
> -                       pr_err("\n");
> -
> -                       page = vmalloc_to_page(addr);
> -               }
> -       }
> +       if (is_vmalloc_addr(addr))
> +               print_vmalloc_info_set_page(addr, &page);
>
>         if (page) {
>                 pr_err("The buggy address belongs to the physical page:\n");
> --
> 2.48.1
>
Andrey Konovalov Feb. 17, 2025, 4:29 p.m. UTC | #3
On Mon, Feb 17, 2025 at 4:43 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> > --- a/mm/kasan/report.c
> > +++ b/mm/kasan/report.c
> > @@ -370,6 +370,34 @@ static inline bool init_task_stack_addr(const void *addr)
> >                       sizeof(init_thread_union.stack));
> >  }
> >
> > +/*
> > + * RT kernel cannot call find_vm_area() in atomic context. For !RT kernel,
> > + * prevent spinlock_t inside raw_spinlock_t warning by raising wait-type
> > + * to WAIT_SLEEP.
> > + */
>
> Do we need this comment? I lacks context of why it is atomic. And we
> have it in the commit description.

I would prefer to have this in the comment, but with a full
explanation of why this needs to be done.
Waiman Long Feb. 17, 2025, 5:56 p.m. UTC | #4
On 2/17/25 11:28 AM, Andrey Konovalov wrote:
> On Mon, Feb 17, 2025 at 5:21 AM Waiman Long <longman@redhat.com> wrote:
>> The following bug report appeared with a test run in a RT debug kernel.
>>
>> [ 3359.353842] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
>> [ 3359.353848] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 140605, name: kunit_try_catch
>> [ 3359.353853] preempt_count: 1, expected: 0
>>    :
>> [ 3359.353933] Call trace:
>>    :
>> [ 3359.353955]  rt_spin_lock+0x70/0x140
>> [ 3359.353959]  find_vmap_area+0x84/0x168
>> [ 3359.353963]  find_vm_area+0x1c/0x50
>> [ 3359.353966]  print_address_description.constprop.0+0x2a0/0x320
>> [ 3359.353972]  print_report+0x108/0x1f8
>> [ 3359.353976]  kasan_report+0x90/0xc8
>> [ 3359.353980]  __asan_load1+0x60/0x70
>>
>> Commit e30a0361b851 ("kasan: make report_lock a raw spinlock")
>> changes report_lock to a raw_spinlock_t to avoid a similar RT problem.
>> The print_address_description() function is called with report_lock
>> acquired and interrupt disabled.  However, the find_vm_area() function
>> still needs to acquire a spinlock_t which becomes a sleeping lock in
>> the RT kernel. IOW, we can't call find_vm_area() in a RT kernel and
>> changing report_lock to a raw_spinlock_t is not enough to completely
>> solve this RT kernel problem.
>>
>> Fix this bug report by skipping the find_vm_area() call in this case
>> and just print out the address as is.
>>
>> For !RT kernel, follow the example set in commit 0cce06ba859a
>> ("debugobjects,locking: Annotate debug_object_fill_pool() wait type
>> violation") and use DEFINE_WAIT_OVERRIDE_MAP() to avoid a spinlock_t
>> inside raw_spinlock_t warning.
>>
>> Fixes: e30a0361b851 ("kasan: make report_lock a raw spinlock")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   mm/kasan/report.c | 43 ++++++++++++++++++++++++++++++-------------
>>   1 file changed, 30 insertions(+), 13 deletions(-)
>>
>>   [v3] Rename helper to print_vmalloc_info_set_page.
>>
>> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
>> index 3fe77a360f1c..7c8c2e173aa4 100644
>> --- a/mm/kasan/report.c
>> +++ b/mm/kasan/report.c
>> @@ -370,6 +370,34 @@ static inline bool init_task_stack_addr(const void *addr)
>>                          sizeof(init_thread_union.stack));
>>   }
>>
>> +/*
>> + * RT kernel cannot call find_vm_area() in atomic context. For !RT kernel,
>> + * prevent spinlock_t inside raw_spinlock_t warning by raising wait-type
>> + * to WAIT_SLEEP.
> Quoting your response from the other thread:
>
>> Lockdep currently issues warnings for taking spinlock_t inside
>> raw_spinlock_t because it is not allowed in RT. Test coverage of RT
>> kernels is likely less than !RT kernel and so less bug of this kind will
>> be caught. By making !RT doing the same check, we increase coverage.
>> However, we do allow override in the !RT case, but it has to be done on
>> a case-by-case basis.
> Got it.
>
> So let's put this exactly this explanation in the comment, otherwise
> it's unclear why we need something special for the !RT case.

Sure. Will do that.


>> + */
>> +static inline void print_vmalloc_info_set_page(void *addr, struct page **ppage)
>> +{
>> +       if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
>> +               static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
>> +               struct vm_struct *va;
>> +
>> +               lock_map_acquire_try(&vmalloc_map);
>> +               va = find_vm_area(addr);
>> +               if (va) {
>> +                       pr_err("The buggy address belongs to the virtual mapping at\n"
>> +                              " [%px, %px) created by:\n"
>> +                              " %pS\n",
>> +                              va->addr, va->addr + va->size, va->caller);
>> +                       pr_err("\n");
>> +
>> +                       *ppage = vmalloc_to_page(addr);
> Looking at the code again, I actually like the Andrey Ryabinin's
> suggestion from the v1 thread: add a separate function that contains
> an annotated call of find_vm_area(). And keep vmalloc_to_page()
> outside of it, just as done in the upstream version now.

I can make the change if it is what you want.

Cheers,
Longman
Andrey Konovalov Feb. 17, 2025, 6:59 p.m. UTC | #5
On Mon, Feb 17, 2025 at 6:56 PM Waiman Long <llong@redhat.com> wrote:
>
> >> + */
> >> +static inline void print_vmalloc_info_set_page(void *addr, struct page **ppage)
> >> +{
> >> +       if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
> >> +               static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
> >> +               struct vm_struct *va;
> >> +
> >> +               lock_map_acquire_try(&vmalloc_map);
> >> +               va = find_vm_area(addr);
> >> +               if (va) {
> >> +                       pr_err("The buggy address belongs to the virtual mapping at\n"
> >> +                              " [%px, %px) created by:\n"
> >> +                              " %pS\n",
> >> +                              va->addr, va->addr + va->size, va->caller);
> >> +                       pr_err("\n");
> >> +
> >> +                       *ppage = vmalloc_to_page(addr);
> > Looking at the code again, I actually like the Andrey Ryabinin's
> > suggestion from the v1 thread: add a separate function that contains
> > an annotated call of find_vm_area(). And keep vmalloc_to_page()
> > outside of it, just as done in the upstream version now.
>
> I can make the change if it is what you want.

Yes, please, I think splitting out the call that requires an
annotation into a separate function makes sense.
diff mbox series

Patch

diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 3fe77a360f1c..7c8c2e173aa4 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -370,6 +370,34 @@  static inline bool init_task_stack_addr(const void *addr)
 			sizeof(init_thread_union.stack));
 }
 
+/*
+ * RT kernel cannot call find_vm_area() in atomic context. For !RT kernel,
+ * prevent spinlock_t inside raw_spinlock_t warning by raising wait-type
+ * to WAIT_SLEEP.
+ */
+static inline void print_vmalloc_info_set_page(void *addr, struct page **ppage)
+{
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
+		struct vm_struct *va;
+
+		lock_map_acquire_try(&vmalloc_map);
+		va = find_vm_area(addr);
+		if (va) {
+			pr_err("The buggy address belongs to the virtual mapping at\n"
+			       " [%px, %px) created by:\n"
+			       " %pS\n",
+			       va->addr, va->addr + va->size, va->caller);
+			pr_err("\n");
+
+			*ppage = vmalloc_to_page(addr);
+		}
+		lock_map_release(&vmalloc_map);
+		return;
+	}
+	pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+}
+
 static void print_address_description(void *addr, u8 tag,
 				      struct kasan_report_info *info)
 {
@@ -398,19 +426,8 @@  static void print_address_description(void *addr, u8 tag,
 		pr_err("\n");
 	}
 
-	if (is_vmalloc_addr(addr)) {
-		struct vm_struct *va = find_vm_area(addr);
-
-		if (va) {
-			pr_err("The buggy address belongs to the virtual mapping at\n"
-			       " [%px, %px) created by:\n"
-			       " %pS\n",
-			       va->addr, va->addr + va->size, va->caller);
-			pr_err("\n");
-
-			page = vmalloc_to_page(addr);
-		}
-	}
+	if (is_vmalloc_addr(addr))
+		print_vmalloc_info_set_page(addr, &page);
 
 	if (page) {
 		pr_err("The buggy address belongs to the physical page:\n");