From patchwork Tue Aug 18 10:39:03 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 7030521
Date: Tue, 18 Aug 2015 12:39:03 +0200
From: Michal Hocko
To: LKML
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Andrew Morton,
 Johannes Weiner, Tetsuo Handa, Dave Chinner, Theodore Ts'o,
 linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org, Jan Kara
Subject: [RFC -v2 5/8] ext4: Do not fail journal due to block allocator
Message-ID: <20150818103903.GD5033@dhcp22.suse.cz>
References: <1438768284-30927-1-git-send-email-mhocko@kernel.org>
 <1438768284-30927-6-git-send-email-mhocko@kernel.org>
Content-Disposition: inline
In-Reply-To: <1438768284-30927-6-git-send-email-mhocko@kernel.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Michal Hocko

Since "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM" the
memory allocator no longer loops endlessly to satisfy low-order
allocations; it fails them instead so that callers can handle the
failure gracefully. Some callers are not yet prepared for this
behavior, though.

The ext4 block allocator relies solely on GFP_NOFS allocation requests,
and allocation failures lead to aborting the journal too easily:

[ 345.028333] oom-trash: page allocation failure: order:0, mode:0x50
[ 345.028336] CPU: 1 PID: 8334 Comm: oom-trash Tainted: G W 4.0.0-nofs3-00006-gdfe9931f5f68 #588
[ 345.028337] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150428_134905-gandalf 04/01/2014
[ 345.028339] 0000000000000000 ffff880005a17708 ffffffff81538a54 ffffffff8107a40f
[ 345.028341] 0000000000000050 ffff880005a17798 ffffffff810fe854 0000000180000000
[ 345.028342] 0000000000000046 0000000000000000 ffffffff81a52100 0000000000000246
[ 345.028343] Call Trace:
[ 345.028348] [] dump_stack+0x4f/0x7b
[ 345.028370] [] warn_alloc_failed+0x12a/0x13f
[ 345.028373] [] __alloc_pages_nodemask+0x7f3/0x8aa
[ 345.028375] [] pagecache_get_page+0x12a/0x1c9
[ 345.028390] [] ext4_mb_load_buddy+0x220/0x367 [ext4]
[ 345.028414] [] ext4_free_blocks+0x522/0xa4c [ext4]
[ 345.028425] [] ext4_ext_remove_space+0x833/0xf22 [ext4]
[ 345.028434] [] ext4_ext_truncate+0x8c/0xb0 [ext4]
[ 345.028441] [] ext4_truncate+0x20b/0x38d [ext4]
[ 345.028462] [] ext4_evict_inode+0x32b/0x4c1 [ext4]
[ 345.028464] [] evict+0xa0/0x148
[ 345.028466] [] iput+0x1a1/0x1f0
[ 345.028468] [] __dentry_kill+0x136/0x1a6
[ 345.028470] [] dput+0x21a/0x243
[ 345.028472] [] __fput+0x184/0x19b
[ 345.028473] [] ____fput+0xe/0x10
[ 345.028475] [] task_work_run+0x8a/0xa1
[ 345.028477] [] do_exit+0x3c6/0x8dc
[ 345.028482] [] do_group_exit+0x4d/0xb2
[ 345.028483] [] get_signal+0x5b1/0x5f5
[ 345.028488] [] do_signal+0x28/0x5d0
[...]
[ 345.028624] EXT4-fs error (device hdb1) in ext4_free_blocks:4879: Out of memory
[ 345.033097] Aborting journal on device hdb1-8.
[ 345.036339] EXT4-fs (hdb1): Remounting filesystem read-only
[ 345.036344] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
[ 345.036766] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
[ 345.038583] EXT4-fs error (device hdb1) in ext4_ext_remove_space:3048: Journal has aborted
[ 345.049115] EXT4-fs error (device hdb1) in ext4_ext_truncate:4669: Journal has aborted
[ 345.050434] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
[ 345.053064] EXT4-fs error (device hdb1) in ext4_truncate:3668: Journal has aborted
[ 345.053582] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
[ 345.053946] EXT4-fs error (device hdb1) in ext4_orphan_del:2686: Journal has aborted
[ 345.055367] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted

The failure is really premature because the GFP_NOFS allocation context
is very restricted, especially under fs metadata heavy loads. Before we
go with a more sophisticated solution, let's simply imitate the previous
behavior of non-failing NOFS allocations and use __GFP_NOFAIL for the
buddy block allocator.

I wasn't able to trigger the issue with this patch anymore.

Signed-off-by: Michal Hocko
---
 fs/ext4/mballoc.c | 52 ++++++++++++++++++++++++----------------------------
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5b1613a54307..0360ea32c30f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -992,9 +992,8 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
 	block = group * 2;
 	pnum = block / blocks_per_page;
 	poff = block % blocks_per_page;
-	page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
-	if (!page)
-		return -ENOMEM;
+	page = find_or_create_page(inode->i_mapping, pnum,
+				   GFP_NOFS|__GFP_NOFAIL);
 	BUG_ON(page->mapping != inode->i_mapping);
 	e4b->bd_bitmap_page = page;
 	e4b->bd_bitmap = page_address(page) + (poff * sb->s_blocksize);
@@ -1006,9 +1005,8 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
 
 	block++;
 	pnum = block / blocks_per_page;
-	page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
-	if (!page)
-		return -ENOMEM;
+	page = find_or_create_page(inode->i_mapping, pnum,
+				   GFP_NOFS|__GFP_NOFAIL);
 	BUG_ON(page->mapping != inode->i_mapping);
 	e4b->bd_buddy_page = page;
 	return 0;
@@ -1158,20 +1156,19 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
 			 * wait for it to initialize.
 			 */
 			page_cache_release(page);
-		page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
-		if (page) {
-			BUG_ON(page->mapping != inode->i_mapping);
-			if (!PageUptodate(page)) {
-				ret = ext4_mb_init_cache(page, NULL);
-				if (ret) {
-					unlock_page(page);
-					goto err;
-				}
-				mb_cmp_bitmaps(e4b, page_address(page) +
-					       (poff * sb->s_blocksize));
+		page = find_or_create_page(inode->i_mapping, pnum,
+					   GFP_NOFS|__GFP_NOFAIL);
+		BUG_ON(page->mapping != inode->i_mapping);
+		if (!PageUptodate(page)) {
+			ret = ext4_mb_init_cache(page, NULL);
+			if (ret) {
+				unlock_page(page);
+				goto err;
 			}
-			unlock_page(page);
+			mb_cmp_bitmaps(e4b, page_address(page) +
+				       (poff * sb->s_blocksize));
 		}
+		unlock_page(page);
 	}
 	if (page == NULL) {
 		ret = -ENOMEM;
@@ -1194,18 +1191,17 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
 	if (page == NULL || !PageUptodate(page)) {
 		if (page)
 			page_cache_release(page);
-		page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
-		if (page) {
-			BUG_ON(page->mapping != inode->i_mapping);
-			if (!PageUptodate(page)) {
-				ret = ext4_mb_init_cache(page, e4b->bd_bitmap);
-				if (ret) {
-					unlock_page(page);
-					goto err;
-				}
+		page = find_or_create_page(inode->i_mapping, pnum,
+					   GFP_NOFS|__GFP_NOFAIL);
+		BUG_ON(page->mapping != inode->i_mapping);
+		if (!PageUptodate(page)) {
+			ret = ext4_mb_init_cache(page, e4b->bd_bitmap);
+			if (ret) {
+				unlock_page(page);
+				goto err;
 			}
-			unlock_page(page);
 		}
+		unlock_page(page);
 	}
 	if (page == NULL) {
 		ret = -ENOMEM;