From patchwork Wed Mar 6 06:27:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 10840423
From: Damien Le Moal
To: jaegeuk@kernel.org, yuchao0@huawei.com, linux-f2fs-devel@lists.sourceforge.net
Cc: linux-fsdevel@vger.kernel.org, Matias Bjorling
Subject: [PATCH v2] f2fs: Reduce zoned block device memory usage
Date: Wed, 6 Mar 2019 15:27:11 +0900
Message-Id: <20190306062711.14456-1-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.20.1
X-Mailing-List: linux-fsdevel@vger.kernel.org

For zoned block devices, an array of zone types for each device is
allocated and initialized in order to determine if a section is stored
on a sequential zone (zone reset needed) or a conventional zone (no
zone reset needed and
regular discard applies).

Considering this usage, the zone types stored in memory can be replaced
with a bitmap that provides equivalent information, that is, whether a
zone is sequential or not. This reduces the memory usage for each zoned
device by a factor of roughly 8: on a 14TB disk with zones of 256 MB,
the zone type array consumes 13x4KB pages while the bitmap uses only
2x4KB pages.

This patch changes the f2fs_dev_info structure blkz_type field to the
bitmap blkz_seq. Access to this bitmap is done using the helper
function f2fs_blkz_is_seq(), which is a rewrite of the function
get_blkz_type().

Signed-off-by: Damien Le Moal
---
Changes from v1:
* Use kvfree() instead of kfree() to free the zone bitmap

 fs/f2fs/f2fs.h    | 13 +++++++------
 fs/f2fs/segment.c | 23 +++++++----------------
 fs/f2fs/super.c   | 13 ++++++++-----
 3 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 12fabd6735dd..d7b2de930352 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1067,8 +1067,8 @@ struct f2fs_dev_info {
 	block_t start_blk;
 	block_t end_blk;
 #ifdef CONFIG_BLK_DEV_ZONED
-	unsigned int nr_blkz;			/* Total number of zones */
-	u8 *blkz_type;				/* Array of zones type */
+	unsigned int nr_blkz;		/* Total number of zones */
+	unsigned long *blkz_seq;	/* Bitmap indicating sequential zones */
 #endif
 };
 
@@ -3508,16 +3508,17 @@ F2FS_FEATURE_FUNCS(lost_found, LOST_FOUND);
 F2FS_FEATURE_FUNCS(sb_chksum, SB_CHKSUM);
 
 #ifdef CONFIG_BLK_DEV_ZONED
-static inline int get_blkz_type(struct f2fs_sb_info *sbi,
-			struct block_device *bdev, block_t blkaddr)
+static inline bool f2fs_blkz_is_seq(struct f2fs_sb_info *sbi,
+				    struct block_device *bdev, block_t blkaddr)
 {
 	unsigned int zno = blkaddr >> sbi->log_blocks_per_blkz;
 	int i;
 
 	for (i = 0; i < sbi->s_ndevs; i++)
 		if (FDEV(i).bdev == bdev)
-			return FDEV(i).blkz_type[zno];
-	return -EINVAL;
+			return test_bit(zno, FDEV(i).blkz_seq);
+	WARN_ON_ONCE(1);
+	return false;
 }
 #endif
 
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 9b79056d705d..65941070776c 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1703,19 +1703,8 @@ static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
 		blkstart -= FDEV(devi).start_blk;
 	}
 
-	/*
-	 * We need to know the type of the zone: for conventional zones,
-	 * use regular discard if the drive supports it. For sequential
-	 * zones, reset the zone write pointer.
-	 */
-	switch (get_blkz_type(sbi, bdev, blkstart)) {
-
-	case BLK_ZONE_TYPE_CONVENTIONAL:
-		if (!blk_queue_discard(bdev_get_queue(bdev)))
-			return 0;
-		return __queue_discard_cmd(sbi, bdev, lblkstart, blklen);
-	case BLK_ZONE_TYPE_SEQWRITE_REQ:
-	case BLK_ZONE_TYPE_SEQWRITE_PREF:
+	/* For sequential zones, reset the zone write pointer */
+	if (f2fs_blkz_is_seq(sbi, bdev, blkstart)) {
 		sector = SECTOR_FROM_BLOCK(blkstart);
 		nr_sects = SECTOR_FROM_BLOCK(blklen);
 
@@ -1730,10 +1719,12 @@ static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
 		trace_f2fs_issue_reset_zone(bdev, blkstart);
 		return blkdev_reset_zones(bdev, sector,
 					  nr_sects, GFP_NOFS);
-	default:
-		/* Unknown zone type: broken device ? */
-		return -EIO;
 	}
+
+	/* For conventional zones, use regular discard if supported */
+	if (!blk_queue_discard(bdev_get_queue(bdev)))
+		return 0;
+	return __queue_discard_cmd(sbi, bdev, lblkstart, blklen);
 }
 #endif
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c46a1d4318d4..91d7429be554 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1017,7 +1017,7 @@ static void destroy_device_list(struct f2fs_sb_info *sbi)
 	for (i = 0; i < sbi->s_ndevs; i++) {
 		blkdev_put(FDEV(i).bdev, FMODE_EXCL);
 #ifdef CONFIG_BLK_DEV_ZONED
-		kvfree(FDEV(i).blkz_type);
+		kvfree(FDEV(i).blkz_seq);
 #endif
 	}
 	kvfree(sbi->devs);
@@ -2765,9 +2765,11 @@ static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 	if (nr_sectors & (bdev_zone_sectors(bdev) - 1))
 		FDEV(devi).nr_blkz++;
 
-	FDEV(devi).blkz_type = f2fs_kmalloc(sbi, FDEV(devi).nr_blkz,
-								GFP_KERNEL);
-	if (!FDEV(devi).blkz_type)
+	FDEV(devi).blkz_seq = f2fs_kzalloc(sbi,
+					BITS_TO_LONGS(FDEV(devi).nr_blkz)
+					* sizeof(unsigned long),
+					GFP_KERNEL);
+	if (!FDEV(devi).blkz_seq)
 		return -ENOMEM;
 
 #define F2FS_REPORT_NR_ZONES   4096
@@ -2794,7 +2796,8 @@ static int init_blkz_info(struct f2fs_sb_info *sbi, int devi)
 	}
 
 	for (i = 0; i < nr_zones; i++) {
-		FDEV(devi).blkz_type[n] = zones[i].type;
+		if (zones[i].type != BLK_ZONE_TYPE_CONVENTIONAL)
+			set_bit(n, FDEV(devi).blkz_seq);
 		sector += zones[i].len;
 		n++;
 	}
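
Note on the memory figures in the commit message: the 13-page versus
2-page comparison can be checked with a small user-space sketch. This
is not kernel code and not part of the patch. All sketch_* names are
hypothetical stand-ins for BITS_TO_LONGS(), set_bit() and test_bit(),
the bit operations are non-atomic, and the sketch assumes a decimal
14TB (14 * 10^12 bytes), 256 MiB zones and 4096-byte pages.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel helpers; all names are hypothetical. */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)
#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */

/* Number of zones on a device, counting a smaller last zone. */
static size_t sketch_nr_zones(uint64_t dev_bytes, uint64_t zone_bytes)
{
	assert(zone_bytes != 0);
	return (size_t)((dev_bytes + zone_bytes - 1) / zone_bytes);
}

/* Bytes needed with one u8 zone type per zone (layout before the patch). */
static size_t sketch_type_array_bytes(size_t nr_zones)
{
	return nr_zones * sizeof(uint8_t);
}

/* Bytes needed with one bit per zone (layout after the patch). */
static size_t sketch_bitmap_bytes(size_t nr_zones)
{
	return SKETCH_BITS_TO_LONGS(nr_zones) * sizeof(unsigned long);
}

/* Round a byte count up to whole pages. */
static size_t sketch_pages(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Zeroed bitmap: every zone starts as conventional (bit clear). */
static unsigned long *sketch_bitmap_alloc(size_t nr_zones)
{
	return calloc(SKETCH_BITS_TO_LONGS(nr_zones), sizeof(unsigned long));
}

/* Mark zone zno as sequential (non-atomic, unlike the kernel set_bit()). */
static void sketch_set_seq(unsigned long *map, size_t zno)
{
	map[zno / SKETCH_BITS_PER_LONG] |= 1UL << (zno % SKETCH_BITS_PER_LONG);
}

/* Return true if zone zno is marked sequential. */
static bool sketch_is_seq(const unsigned long *map, size_t zno)
{
	return (map[zno / SKETCH_BITS_PER_LONG] >>
		(zno % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

With these assumptions, sketch_nr_zones(14000000000000ULL, 256ULL << 20)
gives 52155 zones, so the u8 array rounds up to 13 pages and the bitmap
to 2 pages, matching the figures quoted in the commit message.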