diff mbox

[v3,7/8] proc/kcore: optimize multiple page reads

Message ID fd346c11090cf93d867e01b8d73a6567c5ac6361.1531953780.git.osandov@fb.com (mailing list archive)
State New, archived
Headers show

Commit Message

Omar Sandoval July 18, 2018, 10:58 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

The current code does a full search of the segment list every time for
every page. This is wasteful, since it's almost certain that the next
page will be in the same segment. Instead, check if the previous segment
covers the current page before doing the list search.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/proc/kcore.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Dominique Martinet Aug. 28, 2018, 10:59 a.m. UTC | #1
> The current code does a full search of the segment list every time for
> every page. This is wasteful, since it's almost certain that the next
> page will be in the same segment. Instead, check if the previous segment
> covers the current page before doing the list search.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>  fs/proc/kcore.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index e2ca58d49938..25fefdd05ee5 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -428,10 +428,18 @@ read_kcore(struct file *file, char __user *buffer,
> size_t buflen, loff_t *fpos)
>  	if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
>  		tsz = buflen;
>  
> +	m = NULL;
>  	while (buflen) {
> -		list_for_each_entry(m, &kclist_head, list) {
> -			if (start >= m->addr && start < (m->addr+m->size))
> -				break;
> +		/*
> +		 * If this is the first iteration or the address is not within
> +		 * the previous entry, search for a matching entry.
> +		 */
> +		if (!m || start < m->addr || start >= m->addr + m->size) {

This line apparently triggers a KASAN warning since I rebooted on
4.19-rc1

This is 100% reproductible on my machine when the kdump service starts
(fedora28 x86_64 VM), here's the full stack (on 4.19-rc1):
[   38.161102] BUG: KASAN: global-out-of-bounds in read_kcore+0xd5c/0xf20
[   38.162123] Read of size 8 at addr ffffffffa6c0f770 by task kexec/6201

[   38.163386] CPU: 16 PID: 6201 Comm: kexec Not tainted 4.19.0-rc1+ #13
[   38.164374] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
[   38.166211] Call Trace:
[   38.166658]  dump_stack+0x71/0xa9
[   38.167443]  print_address_description+0x65/0x22e
[   38.168194]  ? read_kcore+0xd5c/0xf20
[   38.168778]  kasan_report.cold.6+0x241/0x306
[   38.169458]  read_kcore+0xd5c/0xf20
[   38.170018]  ? open_kcore+0x1d0/0x1d0
[   38.170605]  ? avc_has_perm_noaudit+0x370/0x370
[   38.171291]  ? kasan_unpoison_shadow+0x30/0x40
[   38.171973]  ? kasan_kmalloc+0xbf/0xe0
[   38.172562]  ? kmem_cache_alloc_trace+0x105/0x200
[   38.173289]  ? open_kcore+0x5f/0x1d0
[   38.173858]  ? open_kcore+0x5f/0x1d0
[   38.174428]  ? deref_stack_reg+0xe0/0xe0
[   38.175038]  proc_reg_read+0x18b/0x220
[   38.175652]  ? proc_reg_unlocked_ioctl+0x210/0x210
[   38.176399]  __vfs_read+0xe1/0x6b0
[   38.176930]  ? __x64_sys_copy_file_range+0x450/0x450
[   38.177723]  ? do_filp_open+0x190/0x250
[   38.178313]  ? may_open_dev+0xc0/0xc0
[   38.178886]  ? __fsnotify_update_child_dentry_flags.part.3+0x330/0x330
[   38.179883]  ? __fsnotify_inode_delete+0x20/0x20
[   38.180608]  ? __inode_security_revalidate+0x8e/0xb0
[   38.181378]  vfs_read+0xde/0x2c0
[   38.181889]  ksys_read+0xb2/0x160
[   38.182413]  ? kernel_write+0x130/0x130
[   38.183000]  ? task_work_run+0x74/0x1c0
[   38.183621]  do_syscall_64+0xa0/0x2e0
[   38.184183]  ? async_page_fault+0x8/0x30
[   38.184802]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   38.185610] RIP: 0033:0x7fc525d74091
[   38.186155] Code: fe ff ff 50 48 8d 3d b6 b6 09 00 e8 59 05 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 51 39 2d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
[   38.188980] RSP: 002b:00007fffd6802a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   38.190153] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc525d74091
[   38.191240] RDX: 0000000000010000 RSI: 00000000017f90b0 RDI: 0000000000000004
[   38.192329] RBP: 0000000000010000 R08: 00007fc526043420 R09: 0000000000000001
[   38.194593] R10: 00000000017e8010 R11: 0000000000000246 R12: 00000000017f90b0
[   38.196823] R13: 0000000000000004 R14: 00007fffd6802ac8 R15: 00007fffd6802cb0

[   38.200526] The buggy address belongs to the variable:
[   38.202539]  kclist_head+0x10/0x440

[   38.205568] Memory state around the buggy address:
[   38.207411]  ffffffffa6c0f600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   38.209648]  ffffffffa6c0f680: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fa fa
[   38.211812] >ffffffffa6c0f700: 00 00 00 00 00 fa fa fa fa fa fa fa 00 00 fa fa
[   38.213936]                                                              ^
[   38.216010]  ffffffffa6c0f780: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[   38.218178]  ffffffffa6c0f800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   38.220306] ==================================================================

where
[   37.636589]  read_kcore+0xd5c/0xf20
symbolizes to
[<        none        >] read_kcore+0xd5c/0xf20 fs/proc/kcore.c:454
, the above check.


I haven't checked but I think I am in the first case below:
                if (&m->list == &kclist_head) {
meaning no address matched in the list, so you cannot check m->addr and
m->size in this case -- I'm afraid you will have to run through the list
just in case if that happens even if there likely won't be any match for
the next address either.


> +			list_for_each_entry(m, &kclist_head, list) {
> +				if (start >= m->addr &&
> +				    start < m->addr + m->size)
> +					break;
> +			}
>  		}
>  
>  		if (&m->list == &kclist_head) {
diff mbox

Patch

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index e2ca58d49938..25fefdd05ee5 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -428,10 +428,18 @@  read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 	if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
 		tsz = buflen;
 
+	m = NULL;
 	while (buflen) {
-		list_for_each_entry(m, &kclist_head, list) {
-			if (start >= m->addr && start < (m->addr+m->size))
-				break;
+		/*
+		 * If this is the first iteration or the address is not within
+		 * the previous entry, search for a matching entry.
+		 */
+		if (!m || start < m->addr || start >= m->addr + m->size) {
+			list_for_each_entry(m, &kclist_head, list) {
+				if (start >= m->addr &&
+				    start < m->addr + m->size)
+					break;
+			}
 		}
 
 		if (&m->list == &kclist_head) {