diff mbox series

[-next,v2] uprobes: fix two zero old_folio bugs in __replace_page()

Message ID 20250221015056.1269344-1-tongtiangen@huawei.com (mailing list archive)
State New
Headers show
Series [-next,v2] uprobes: fix two zero old_folio bugs in __replace_page() | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Tong Tiangen Feb. 21, 2025, 1:50 a.m. UTC
We triggered the following error logs in syzkaller test:

  BUG: Bad page state in process syz.7.38  pfn:1eff3
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1eff3
  flags: 0x3fffff00004004(referenced|reserved|node=0|zone=1|lastcpupid=0x1fffff)
  raw: 003fffff00004004 ffffe6c6c07bfcc8 ffffe6c6c07bfcc8 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000fffffffe 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x32/0x50
   bad_page+0x69/0xf0
   free_unref_page_prepare+0x401/0x500
   free_unref_page+0x6d/0x1b0
   uprobe_write_opcode+0x460/0x8e0
   install_breakpoint.part.0+0x51/0x80
   register_for_each_vma+0x1d9/0x2b0
   __uprobe_register+0x245/0x300
   bpf_uprobe_multi_link_attach+0x29b/0x4f0
   link_create+0x1e2/0x280
   __sys_bpf+0x75f/0xac0
   __x64_sys_bpf+0x1a/0x30
   do_syscall_64+0x56/0x100
   entry_SYSCALL_64_after_hwframe+0x78/0xe2

   BUG: Bad rss-counter state mm:00000000452453e0 type:MM_FILEPAGES val:-1

The following syzkaller test case can be used to reproduce:

  r2 = creat(&(0x7f0000000000)='./file0\x00', 0x8)
  write$nbd(r2, &(0x7f0000000580)=ANY=[], 0x10)
  r4 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x42, 0x0)
  mmap$IORING_OFF_SQ_RING(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x0, 0x12, r4, 0x0)
  r5 = userfaultfd(0x80801)
  ioctl$UFFDIO_API(r5, 0xc018aa3f, &(0x7f0000000040)={0xaa, 0x20})
  r6 = userfaultfd(0x80801)
  ioctl$UFFDIO_API(r6, 0xc018aa3f, &(0x7f0000000140))
  ioctl$UFFDIO_REGISTER(r6, 0xc020aa00, &(0x7f0000000100)={{&(0x7f0000ffc000/0x4000)=nil, 0x4000}, 0x2})
  ioctl$UFFDIO_ZEROPAGE(r5, 0xc020aa04, &(0x7f0000000000)={{&(0x7f0000ffd000/0x1000)=nil, 0x1000}})
  r7 = bpf$PROG_LOAD(0x5, &(0x7f0000000140)={0x2, 0x3, &(0x7f0000000200)=ANY=[@ANYBLOB="1800000000120000000000000000000095"], &(0x7f0000000000)='GPL\x00', 0x7, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, @fallback=0x30, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x94)
  bpf$BPF_LINK_CREATE_XDP(0x1c, &(0x7f0000000040)={r7, 0x0, 0x30, 0x1e, @val=@uprobe_multi={&(0x7f0000000080)='./file0\x00', &(0x7f0000000100)=[0x2], 0x0, 0x0, 0x1}}, 0x40)

The cause is that zero pfn is set to the pte without increasing the rss
count in mfill_atomic_pte_zeropage() and the refcount of zero folio does
not increase accordingly. Then, the operation on the same pfn is performed
in uprobe_write_opcode()->__replace_page() to unconditional decrease the
rss count and old_folio's refcount.

Therefore, two bugs are introduced:
1. The rss count is incorrect, when process exit, the check_mm() report
   error "Bad rss-count".
2. The reserved folio (zero folio) is freed when folio->refcount is zero,
   then free_pages_prepare->free_page_is_bad() report error
   "Bad page state".

Considering that uprobe hit the zero folio is a very rare scene, just
reject zero old folio immediately after get_user_page_vma_remote().

Fixes: 7396fa818d62 ("uprobes/core: Make background page replacement logic account for rss_stat counters")
Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
---
v2: Modified according to the comments of David and Oleg. 
---
 kernel/events/uprobes.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

David Hildenbrand Feb. 21, 2025, 8:12 a.m. UTC | #1
On 21.02.25 02:50, Tong Tiangen wrote:
> We triggered the following error logs in syzkaller test:
> 
>    BUG: Bad page state in process syz.7.38  pfn:1eff3
>    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1eff3
>    flags: 0x3fffff00004004(referenced|reserved|node=0|zone=1|lastcpupid=0x1fffff)
>    raw: 003fffff00004004 ffffe6c6c07bfcc8 ffffe6c6c07bfcc8 0000000000000000
>    raw: 0000000000000000 0000000000000000 00000000fffffffe 0000000000000000
>    page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>    Call Trace:
>     <TASK>
>     dump_stack_lvl+0x32/0x50
>     bad_page+0x69/0xf0
>     free_unref_page_prepare+0x401/0x500
>     free_unref_page+0x6d/0x1b0
>     uprobe_write_opcode+0x460/0x8e0
>     install_breakpoint.part.0+0x51/0x80
>     register_for_each_vma+0x1d9/0x2b0
>     __uprobe_register+0x245/0x300
>     bpf_uprobe_multi_link_attach+0x29b/0x4f0
>     link_create+0x1e2/0x280
>     __sys_bpf+0x75f/0xac0
>     __x64_sys_bpf+0x1a/0x30
>     do_syscall_64+0x56/0x100
>     entry_SYSCALL_64_after_hwframe+0x78/0xe2
> 
>     BUG: Bad rss-counter state mm:00000000452453e0 type:MM_FILEPAGES val:-1
> 
> The following syzkaller test case can be used to reproduce:
> 
>    r2 = creat(&(0x7f0000000000)='./file0\x00', 0x8)
>    write$nbd(r2, &(0x7f0000000580)=ANY=[], 0x10)
>    r4 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x42, 0x0)
>    mmap$IORING_OFF_SQ_RING(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x0, 0x12, r4, 0x0)
>    r5 = userfaultfd(0x80801)
>    ioctl$UFFDIO_API(r5, 0xc018aa3f, &(0x7f0000000040)={0xaa, 0x20})
>    r6 = userfaultfd(0x80801)
>    ioctl$UFFDIO_API(r6, 0xc018aa3f, &(0x7f0000000140))
>    ioctl$UFFDIO_REGISTER(r6, 0xc020aa00, &(0x7f0000000100)={{&(0x7f0000ffc000/0x4000)=nil, 0x4000}, 0x2})
>    ioctl$UFFDIO_ZEROPAGE(r5, 0xc020aa04, &(0x7f0000000000)={{&(0x7f0000ffd000/0x1000)=nil, 0x1000}})
>    r7 = bpf$PROG_LOAD(0x5, &(0x7f0000000140)={0x2, 0x3, &(0x7f0000000200)=ANY=[@ANYBLOB="1800000000120000000000000000000095"], &(0x7f0000000000)='GPL\x00', 0x7, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, @fallback=0x30, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x94)
>    bpf$BPF_LINK_CREATE_XDP(0x1c, &(0x7f0000000040)={r7, 0x0, 0x30, 0x1e, @val=@uprobe_multi={&(0x7f0000000080)='./file0\x00', &(0x7f0000000100)=[0x2], 0x0, 0x0, 0x1}}, 0x40)
> 
> The cause is that zero pfn is set to the pte without increasing the rss
> count in mfill_atomic_pte_zeropage() and the refcount of zero folio does
> not increase accordingly. Then, the operation on the same pfn is performed
> in uprobe_write_opcode()->__replace_page() to unconditional decrease the
> rss count and old_folio's refcount.
> 
> Therefore, two bugs are introduced:
> 1. The rss count is incorrect, when process exit, the check_mm() report
>     error "Bad rss-count".
> 2. The reserved folio (zero folio) is freed when folio->refcount is zero,
>     then free_pages_prepare->free_page_is_bad() report error
>     "Bad page state".

Well, there is more, like triggering the

	VM_WARN_ON_FOLIO(is_zero_folio(folio), folio);

in __folio_rmap_sanity_checks() I assume.

So maybe just call the patch

	"uprobes: reject the share zeropage in uprobe_write_opcode)()"

Thanks!
Oleg Nesterov Feb. 21, 2025, 3:28 p.m. UTC | #2
On 02/21, Tong Tiangen wrote:
>
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -506,6 +506,11 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	if (ret <= 0)
>  		goto put_old;
>
> +	if (is_zero_page(old_page)) {
> +		ret = -EINVAL;
> +		goto put_old;
> +	}

I agree with David, the subject looks a bit misleading.

And. I won't insist, this is cosmetic, but if you send V2 please consider
moving the "verify_opcode()" check down, after the is_zero_page/PageCompound
checks.

Oleg.
Tong Tiangen Feb. 22, 2025, 2:33 a.m. UTC | #3
在 2025/2/21 16:12, David Hildenbrand 写道:
> On 21.02.25 02:50, Tong Tiangen wrote:
>> We triggered the following error logs in syzkaller test:
>>
>>    BUG: Bad page state in process syz.7.38  pfn:1eff3
>>    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 
>> pfn:0x1eff3
>>    flags: 
>> 0x3fffff00004004(referenced|reserved|node=0|zone=1|lastcpupid=0x1fffff)
>>    raw: 003fffff00004004 ffffe6c6c07bfcc8 ffffe6c6c07bfcc8 
>> 0000000000000000
>>    raw: 0000000000000000 0000000000000000 00000000fffffffe 
>> 0000000000000000
>>    page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
>>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> 1.13.0-1ubuntu1.1 04/01/2014
>>    Call Trace:
>>     <TASK>
>>     dump_stack_lvl+0x32/0x50
>>     bad_page+0x69/0xf0
>>     free_unref_page_prepare+0x401/0x500
>>     free_unref_page+0x6d/0x1b0
>>     uprobe_write_opcode+0x460/0x8e0
>>     install_breakpoint.part.0+0x51/0x80
>>     register_for_each_vma+0x1d9/0x2b0
>>     __uprobe_register+0x245/0x300
>>     bpf_uprobe_multi_link_attach+0x29b/0x4f0
>>     link_create+0x1e2/0x280
>>     __sys_bpf+0x75f/0xac0
>>     __x64_sys_bpf+0x1a/0x30
>>     do_syscall_64+0x56/0x100
>>     entry_SYSCALL_64_after_hwframe+0x78/0xe2
>>
>>     BUG: Bad rss-counter state mm:00000000452453e0 type:MM_FILEPAGES 
>> val:-1
>>
>> The following syzkaller test case can be used to reproduce:
>>
>>    r2 = creat(&(0x7f0000000000)='./file0\x00', 0x8)
>>    write$nbd(r2, &(0x7f0000000580)=ANY=[], 0x10)
>>    r4 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 
>> 0x42, 0x0)
>>    mmap$IORING_OFF_SQ_RING(&(0x7f0000ffd000/0x3000)=nil, 0x3000, 0x0, 
>> 0x12, r4, 0x0)
>>    r5 = userfaultfd(0x80801)
>>    ioctl$UFFDIO_API(r5, 0xc018aa3f, &(0x7f0000000040)={0xaa, 0x20})
>>    r6 = userfaultfd(0x80801)
>>    ioctl$UFFDIO_API(r6, 0xc018aa3f, &(0x7f0000000140))
>>    ioctl$UFFDIO_REGISTER(r6, 0xc020aa00, 
>> &(0x7f0000000100)={{&(0x7f0000ffc000/0x4000)=nil, 0x4000}, 0x2})
>>    ioctl$UFFDIO_ZEROPAGE(r5, 0xc020aa04, 
>> &(0x7f0000000000)={{&(0x7f0000ffd000/0x1000)=nil, 0x1000}})
>>    r7 = bpf$PROG_LOAD(0x5, &(0x7f0000000140)={0x2, 0x3, 
>> &(0x7f0000000200)=ANY=[@ANYBLOB="1800000000120000000000000000000095"], 
>> &(0x7f0000000000)='GPL\x00', 0x7, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, 
>> @fallback=0x30, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
>> 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, @void, @value}, 0x94)
>>    bpf$BPF_LINK_CREATE_XDP(0x1c, &(0x7f0000000040)={r7, 0x0, 0x30, 
>> 0x1e, @val=@uprobe_multi={&(0x7f0000000080)='./file0\x00', 
>> &(0x7f0000000100)=[0x2], 0x0, 0x0, 0x1}}, 0x40)
>>
>> The cause is that zero pfn is set to the pte without increasing the rss
>> count in mfill_atomic_pte_zeropage() and the refcount of zero folio does
>> not increase accordingly. Then, the operation on the same pfn is 
>> performed
>> in uprobe_write_opcode()->__replace_page() to unconditional decrease the
>> rss count and old_folio's refcount.
>>
>> Therefore, two bugs are introduced:
>> 1. The rss count is incorrect, when process exit, the check_mm() report
>>     error "Bad rss-count".
>> 2. The reserved folio (zero folio) is freed when folio->refcount is zero,
>>     then free_pages_prepare->free_page_is_bad() report error
>>     "Bad page state".
> 
> Well, there is more, like triggering the
> 
>      VM_WARN_ON_FOLIO(is_zero_folio(folio), folio);
> 
> in __folio_rmap_sanity_checks() I assume.
> 
> So maybe just call the patch
> 
>      "uprobes: reject the share zeropage in uprobe_write_opcode)()"
> 
> Thanks!

OK, This subject is more appropriate.

Thanks.

>
Tong Tiangen Feb. 22, 2025, 2:37 a.m. UTC | #4
在 2025/2/21 23:28, Oleg Nesterov 写道:
> On 02/21, Tong Tiangen wrote:
>>
>> --- a/kernel/events/uprobes.c
>> +++ b/kernel/events/uprobes.c
>> @@ -506,6 +506,11 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>>   	if (ret <= 0)
>>   		goto put_old;
>>
>> +	if (is_zero_page(old_page)) {
>> +		ret = -EINVAL;
>> +		goto put_old;
>> +	}
> 
> I agree with David, the subject looks a bit misleading.
> 
> And. I won't insist, this is cosmetic, but if you send V2 please consider
> moving the "verify_opcode()" check down, after the is_zero_page/PageCompound
> checks.
> 
> Oleg.

OK, check the validity of the old page first and modify the subject in
v3 .

Thanks.

> 
> 
> .
diff mbox series

Patch

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 46ddf3a2334d..daa46868faf3 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -506,6 +506,11 @@  int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	if (ret <= 0)
 		goto put_old;
 
+	if (is_zero_page(old_page)) {
+		ret = -EINVAL;
+		goto put_old;
+	}
+
 	if (WARN(!is_register && PageCompound(old_page),
 		 "uprobe unregister should never work on compound page\n")) {
 		ret = -EINVAL;