From patchwork Wed Jul 20 00:21:23 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 9238565 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4B74F600CB for ; Wed, 20 Jul 2016 00:21:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3D8E3268AE for ; Wed, 20 Jul 2016 00:21:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3201B2787C; Wed, 20 Jul 2016 00:21:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from oss.sgi.com (oss.sgi.com [192.48.182.195]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 964EB268AE for ; Wed, 20 Jul 2016 00:21:49 +0000 (UTC) Received: from oss.sgi.com (localhost [IPv6:::1]) by oss.sgi.com (Postfix) with ESMTP id 4AE397CA1; Tue, 19 Jul 2016 19:21:47 -0500 (CDT) X-Original-To: xfs@oss.sgi.com Delivered-To: xfs@oss.sgi.com Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (Postfix) with ESMTP id 15F527CA0 for ; Tue, 19 Jul 2016 19:21:45 -0500 (CDT) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by relay2.corp.sgi.com (Postfix) with ESMTP id CED8B304043 for ; Tue, 19 Jul 2016 17:21:41 -0700 (PDT) X-ASG-Debug-ID: 1468974097-04cbb00355b6fd0001-NocioJ Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by cuda.sgi.com with ESMTP id r5QuEtNCrpG1u5sG for ; Tue, 19 Jul 2016 17:21:38 -0700 (PDT) X-Barracuda-Envelope-From: dave@fromorbit.com X-Barracuda-Effective-Source-IP: ipmail06.adl6.internode.on.net[150.101.137.145] X-Barracuda-Apparent-Source-IP: 150.101.137.145 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CjCwCVwo5XEH6zLHldgz+BUoZwnQYBAQEBAQEGkDeCJotXTQEBAQEBAQcBAQEBAQECPkCFCi87GGoDBy2IL59WngeFYokdFAGFfgWGCpMajAGCY4FpiAiFQgJIhheJP4IOAQsBRByBXioyhmEBDheBHgEBAQ Received: from ppp121-44-179-126.lns20.syd7.internode.on.net (HELO dastard) ([121.44.179.126]) by ipmail06.adl6.internode.on.net with ESMTP; 20 Jul 2016 09:51:37 +0930 Received: from disappointment.disaster.area ([192.168.1.110] helo=disappointment) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1bPfGa-0000Z6-Do for xfs@oss.sgi.com; Wed, 20 Jul 2016 10:21:36 +1000 Received: from dave by disappointment with local (Exim 4.87) (envelope-from ) id 1bPfGN-0000xk-E5 for xfs@oss.sgi.com; Wed, 20 Jul 2016 10:21:23 +1000 From: Dave Chinner To: xfs@oss.sgi.com Subject: [PATCH] xfs: bufferhead chains are invalid after end_page_writeback Date: Wed, 20 Jul 2016 10:21:23 +1000 X-ASG-Orig-Subj: [PATCH] xfs: bufferhead chains are invalid after end_page_writeback Message-Id: <1468974083-3660-1-git-send-email-david@fromorbit.com> X-Mailer: git-send-email 2.8.0.rc3 X-Barracuda-Connect: ipmail06.adl6.internode.on.net[150.101.137.145] X-Barracuda-Start-Time: 1468974098 X-Barracuda-URL: https://192.48.176.25:443/cgi-mod/mark.cgi X-Barracuda-Scan-Msg-Size: 4606 X-Virus-Scanned: by bsmtpd at sgi.com X-Barracuda-BRTS-Status: 1 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.7 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.31386 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-BeenThere: xfs@oss.sgi.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner In xfs_finish_page_writeback(), we have a loop that looks like this: do { if (off < bvec->bv_offset) goto next_bh; if (off > end) break; bh->b_end_io(bh, !error); next_bh: off += bh->b_size; } while ((bh = bh->b_this_page) != head); The b_end_io function is end_buffer_async_write(), which will call end_page_writeback() once all the buffers have marked as no longer under IO. This issue here is that the only thing currently protecting both the bufferhead chain and the page from being reclaimed is the PageWriteback state held on the page. While we attempt to limit the loop to just the buffers covered by the IO, we still read from the buffer size and follow the next pointer in the bufferhead chain. There is no guarantee that either of these are valid after the PageWriteback flag has been cleared. Hence, loops like this are completely unsafe, and result in use-after-free issues. One such problem was caught by Calvin Owens with KASAN: ..... INFO: Freed in 0x103fc80ec age=18446651500051355200 cpu=2165122683 pid=-1 free_buffer_head+0x41/0x90 __slab_free+0x1ed/0x340 kmem_cache_free+0x270/0x300 free_buffer_head+0x41/0x90 try_to_free_buffers+0x171/0x240 xfs_vm_releasepage+0xcb/0x3b0 try_to_release_page+0x106/0x190 shrink_page_list+0x118e/0x1a10 shrink_inactive_list+0x42c/0xdf0 shrink_zone_memcg+0xa09/0xfa0 shrink_zone+0x2c3/0xbc0 ..... Call Trace: [] dump_stack+0x68/0x94 [] print_trailer+0x115/0x1a0 [] object_err+0x34/0x40 [] kasan_report_error+0x217/0x530 [] __asan_report_load8_noabort+0x43/0x50 [] xfs_destroy_ioend+0x3bf/0x4c0 [] xfs_end_bio+0x154/0x220 [] bio_endio+0x158/0x1b0 [] blk_update_request+0x18b/0xb80 [] scsi_end_request+0x97/0x5a0 [] scsi_io_completion+0x438/0x1690 [] scsi_finish_command+0x375/0x4e0 [] scsi_softirq_done+0x280/0x340 Where the access is occuring during IO completion after the buffer had been freed from direct memory reclaim. Prevent use-after-free accidents in this end_io processing loop by pre-calculating the loop conditionals before calling bh->b_end_io(). The loop is already limited to just the bufferheads covered by the IO in progress, so the offset checks are sufficient to prevent accessing buffers in the chain after end_page_writeback() has been called by the the bh->b_end_io() callout. Yet another example of why Bufferheads Must Die. Signed-off-by: Dave Chinner Reported-and-Tested-by: Calvin Owens Reviewed-by: Brian Foster Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_aops.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 80714eb..0cfb944 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -87,6 +87,12 @@ xfs_find_bdev_for_inode( * We're now finished for good with this page. Update the page state via the * associated buffer_heads, paying attention to the start and end offsets that * we need to process on the page. + * + * Landmine Warning: bh->b_end_io() will call end_page_writeback() on the last + * buffer in the IO. Once it does this, it is unsafe to access the bufferhead or + * the page at all, as we may be racing with memory reclaim and it can free both + * the bufferhead chain and the page as it will see the page as clean and + * unused. */ static void xfs_finish_page_writeback( @@ -95,8 +101,9 @@ xfs_finish_page_writeback( int error) { unsigned int end = bvec->bv_offset + bvec->bv_len - 1; - struct buffer_head *head, *bh; + struct buffer_head *head, *bh, *next; unsigned int off = 0; + unsigned int bsize; ASSERT(bvec->bv_offset < PAGE_SIZE); ASSERT((bvec->bv_offset & ((1 << inode->i_blkbits) - 1)) == 0); @@ -105,15 +112,17 @@ xfs_finish_page_writeback( bh = head = page_buffers(bvec->bv_page); + bsize = bh->b_size; do { + next = bh->b_this_page; if (off < bvec->bv_offset) goto next_bh; if (off > end) break; bh->b_end_io(bh, !error); next_bh: - off += bh->b_size; - } while ((bh = bh->b_this_page) != head); + off += bsize; + } while ((bh = next) != head); } /*