From patchwork Mon Apr 1 08:29:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nikolay Borisov
X-Patchwork-Id: 10879363
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 123621805
 for ; Mon, 1 Apr 2019 08:30:06 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F2349285DD
 for ; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id E697228639; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
 RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8BED0285DD
 for ; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1732029AbfDAIaD (ORCPT );
 Mon, 1 Apr 2019 04:30:03 -0400
Received: from mx2.suse.de ([195.135.220.15]:51638 "EHLO mx1.suse.de"
 rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
 id S1732000AbfDAIaD (ORCPT );
 Mon, 1 Apr 2019 04:30:03 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx1.suse.de (Postfix) with ESMTP id F09E9AED7
 for ; Mon, 1 Apr 2019 08:30:01 +0000 (UTC)
From: Nikolay Borisov
To: linux-btrfs@vger.kernel.org
Cc: Nikolay Borisov
Subject: [PATCH 1/2] btrfs: Use kvmalloc for allocating compressed path context
Date: Mon, 1 Apr 2019 11:29:57 +0300
Message-Id: <20190401082958.26470-2-nborisov@suse.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190401082958.26470-1-nborisov@suse.com>
References:
 <20190401082958.26470-1-nborisov@suse.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Recent refactoring of cow_file_range_async means it's now possible to
request a rather large, physically contiguous chunk of memory via
kmalloc. The size depends on the number of 512k chunks that the
compressed range consists of. David reported multiple OOM messages on
such large allocations. Fix it by switching to kvmalloc.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/inode.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 43ee890c715f..85f61913b92d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "ctree.h"
 #include "disk-io.h"
@@ -1172,7 +1173,7 @@ static noinline void async_cow_free(struct btrfs_work *work)
 	 * async_chunk's, freeing it ensures the whole array has been freed.
 	 */
 	if (atomic_dec_and_test(async_chunk->pending))
-		kfree(async_chunk->pending);
+		kvfree(async_chunk->pending);
 }
 
 static int cow_file_range_async(struct inode *inode, struct page *locked_page,
@@ -1188,6 +1189,7 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
 	u64 num_chunks = DIV_ROUND_UP(end - start, SZ_512K);
 	int i;
 	bool should_compress;
+	unsigned nofs_flag;
 
 	unlock_extent(&BTRFS_I(inode)->io_tree, start, end);
 
@@ -1199,7 +1201,10 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
 		should_compress = true;
 	}
 
-	ctx = kmalloc(struct_size(ctx, chunks, num_chunks), GFP_NOFS);
+	nofs_flag = memalloc_nofs_save();
+	ctx = kvmalloc(struct_size(ctx, chunks, num_chunks), GFP_KERNEL);
+	memalloc_nofs_restore(nofs_flag);
+
 	if (!ctx) {
 		unsigned clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC |
 			EXTENT_DELALLOC_NEW | EXTENT_DEFRAG |

From patchwork Mon Apr 1 08:29:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nikolay Borisov
X-Patchwork-Id: 10879361
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 616E3922
 for ; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4A86F285DD
 for ; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id 3EA8A28639; Mon, 1 Apr 2019 08:30:05 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
 RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D0031285E2
 for ; Mon, 1 Apr 2019 08:30:04
 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1732066AbfDAIaD (ORCPT );
 Mon, 1 Apr 2019 04:30:03 -0400
Received: from mx2.suse.de ([195.135.220.15]:51650 "EHLO mx1.suse.de"
 rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
 id S1726206AbfDAIaD (ORCPT );
 Mon, 1 Apr 2019 04:30:03 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
 by mx1.suse.de (Postfix) with ESMTP id 409F2AF11
 for ; Mon, 1 Apr 2019 08:30:02 +0000 (UTC)
From: Nikolay Borisov
To: linux-btrfs@vger.kernel.org
Cc: Nikolay Borisov
Subject: [PATCH 2/2] btrfs: Switch memory allocations in async csum calculation path to kvmalloc
Date: Mon, 1 Apr 2019 11:29:58 +0300
Message-Id: <20190401082958.26470-3-nborisov@suse.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190401082958.26470-1-nborisov@suse.com>
References: <20190401082958.26470-1-nborisov@suse.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

For performance reasons checksum calculation on pending writes is
performed in asynchronous context. This requires doing potentially
large order memory allocations (in my testing I've seen up to order 6 or
512k). This could put quite a strain on the slab allocator since it's
not guaranteed that such a large, physically contiguous memory
allocation can succeed: the allocation could fail because of memory
fragmentation in addition to exhaustion. To add insult to injury, the
code path in question can't handle allocation failure gracefully;
instead it just BUGs.

This patch tries to alleviate the issue by switching the allocation
from kmalloc to kvmalloc. For small writes this is unlikely to have any
visible effect since kmalloc will still satisfy the allocation
requests. For larger requests the code will just fall back to vmalloc.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/file-item.c    | 15 +++++++++++----
 fs/btrfs/ordered-data.c |  3 ++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 7e2cd9c81eb1..06757b67a7be 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -427,9 +428,13 @@ blk_status_t btrfs_csum_one_bio(struct inode *inode, struct bio *bio,
 	unsigned long this_sum_bytes = 0;
 	int i;
 	u64 offset;
+	unsigned nofs_flag;
+
+	nofs_flag = memalloc_nofs_save();
+	sums = kvmalloc(btrfs_ordered_sum_size(fs_info, bio->bi_iter.bi_size),
+		       GFP_KERNEL);
+	memalloc_nofs_restore(nofs_flag);
 
-	sums = kzalloc(btrfs_ordered_sum_size(fs_info, bio->bi_iter.bi_size),
-		       GFP_NOFS);
 	if (!sums)
 		return BLK_STS_RESOURCE;
 
@@ -469,8 +474,10 @@ blk_status_t btrfs_csum_one_bio(struct inode *inode, struct bio *bio,
 
 				bytes_left = bio->bi_iter.bi_size - total_bytes;
 
-				sums = kzalloc(btrfs_ordered_sum_size(fs_info, bytes_left),
-					       GFP_NOFS);
+				nofs_flag = memalloc_nofs_save();
+				sums = kzalloc(btrfs_ordered_sum_size(fs_info,
+						bytes_left), GFP_KERNEL);
+				memalloc_nofs_restore(nofs_flag);
 				BUG_ON(!sums); /* -ENOMEM */
 				sums->len = bytes_left;
 				ordered = btrfs_lookup_ordered_extent(inode,
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 4d9bb0dea9af..f6bb6039fa4c 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 #include "ctree.h"
 #include "transaction.h"
 #include "btrfs_inode.h"
@@ -441,7 +442,7 @@ void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)
 			cur = entry->list.next;
 			sum = list_entry(cur, struct btrfs_ordered_sum, list);
 			list_del(&sum->list);
-			kfree(sum);
+			kvfree(sum);
 		}
 		kmem_cache_free(btrfs_ordered_extent_cache, entry);
 	}
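
Not part of the series: a rough userspace sketch of the sizing arithmetic
behind the allocations discussed above, in case it helps when reading the
two patches. It only mirrors the DIV_ROUND_UP(end - start, SZ_512K)
computation from cow_file_range_async and shows which page order a
physically contiguous request of a given size would need. Everything
prefixed demo_ is hypothetical: the struct layouts are stand-ins (the
real struct async_chunk has different fields and a different size), 4 KiB
pages are assumed, and demo_ctx_size() is a simplification of the
kernel's struct_size(), which additionally saturates on overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K   4096ULL			/* assumed page size */
#define SZ_512K (512ULL * 1024)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for one per-chunk work item. */
struct demo_chunk {
	uint64_t start;
	uint64_t end;
	void *owner;
};

/* Hypothetical context: a header followed by a flexible array of chunks. */
struct demo_ctx {
	uint64_t num_chunks;
	struct demo_chunk chunks[];
};

/* Number of 512K chunks covering the range, as computed in the patch. */
uint64_t demo_num_chunks(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return DIV_ROUND_UP(end - start, SZ_512K);
}

/* Bytes needed for the header plus num_chunks array entries (no overflow check). */
size_t demo_ctx_size(uint64_t num_chunks)
{
	return sizeof(struct demo_ctx) + num_chunks * sizeof(struct demo_chunk);
}

/* Smallest order such that (page size << order) covers size bytes. */
unsigned int demo_page_order(size_t size)
{
	unsigned int order = 0;

	while ((SZ_4K << order) < size)
		order++;
	return order;
}
```

For instance, demo_num_chunks(0, 128 MiB) gives 256 chunks, and
demo_page_order() returns 6 for a 256 KiB request and 7 for a 512 KiB
request under the 4 KiB page assumption. The higher the order, the more
likely a contiguous request is to fail on a fragmented system, which is
the case the vmalloc fallback in kvmalloc is meant to cover.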