From patchwork Tue Jun 15 05:49:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0A89C49EA2 for ; Tue, 15 Jun 2021 05:49:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B47BC6141B for ; Tue, 15 Jun 2021 05:49:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229811AbhFOFvk (ORCPT ); Tue, 15 Jun 2021 01:51:40 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57268 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230038AbhFOFvi (ORCPT ); Tue, 15 Jun 2021 01:51:38 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 53C43219C5; Tue, 15 Jun 2021 05:49:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736172; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TR9Qxjp87eDoqfnf8N5jLZ9ERrRCE+jLsCBNMtHhrSs=; b=X0eo75ALDLmLZlvnlqDLDAyX52MmwDQNCbSfmNRPQ9grLkko/2Zctp9ytylKvy8TDn0SCn 22DklRIKVvrePvuBsp7kzMsKFzRUDboiX44V+F0JU2iqrO6Qlf6TTBfrtk0ewkfxwFyHZM pg4FKS1TfgkiTIwFvlTD5W01RqmOF+Y= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736172; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TR9Qxjp87eDoqfnf8N5jLZ9ERrRCE+jLsCBNMtHhrSs=; b=lEsT8ytEudYIPyKvm1KmkGwAn9m7PB5RiYqmGJl18oaGtBTe9o0r2jEjwqttpac6GJVj96 vc8uOigAdHq+JWCg== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id B505DA3B98; Tue, 15 Jun 2021 05:49:30 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Chao Yu , Coly Li Subject: [PATCH 01/14] bcache: fix error info in register_bcache() Date: Tue, 15 Jun 2021 13:49:08 +0800 Message-Id: <20210615054921.101421-2-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Chao Yu In register_bcache(), there are several cases we didn't set correct error info (return value and/or error message): - if kzalloc() fails, it needs to return ENOMEM and print "cannot allocate memory"; - if register_cache() fails, it's better to propagate its return value rather than using default EINVAL. 
Signed-off-by: Chao Yu Signed-off-by: Coly Li Reviewed-by: Hannes Reinecke --- drivers/md/bcache/super.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index bea8c4429ae8..0a20ccf5a1db 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -2620,8 +2620,11 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, if (SB_IS_BDEV(sb)) { struct cached_dev *dc = kzalloc(sizeof(*dc), GFP_KERNEL); - if (!dc) + if (!dc) { + ret = -ENOMEM; + err = "cannot allocate memory"; goto out_put_sb_page; + } mutex_lock(&bch_register_lock); ret = register_bdev(sb, sb_disk, bdev, dc); @@ -2632,11 +2635,15 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, } else { struct cache *ca = kzalloc(sizeof(*ca), GFP_KERNEL); - if (!ca) + if (!ca) { + ret = -ENOMEM; + err = "cannot allocate memory"; goto out_put_sb_page; + } /* blkdev_put() will be called in bch_cache_release() */ - if (register_cache(sb, sb_disk, bdev, ca) != 0) + ret = register_cache(sb, sb_disk, bdev, ca); + if (ret) goto out_free_sb; } From patchwork Tue Jun 15 05:49:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320619 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58D81C48BDF for ; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 377A561410 for ; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230084AbhFOFvk (ORCPT ); Tue, 15 Jun 2021 01:51:40 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45584 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230060AbhFOFvk (ORCPT ); Tue, 15 Jun 2021 01:51:40 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 9635C1FD2A; Tue, 15 Jun 2021 05:49:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736174; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9fo5bo8mzPI37nH0cbbOFJn0vZDu8ORoqn2KbFGq/mE=; b=G0BT+GR4kJQtsK79hYUzHKSGMI6EKzZEdvl4ydvmcmGzlJZi1WRiTTd1p3622gxH2rEMVn ljGUv3ehDzdCC0NZsWKbjemo++k4pgV6Kcxgo1pOXzcT+Ae1e15Cwe+OtnwTJhpO/COMbQ BGfcwX2f0bd4IrG8xS4261V/4QWX2ls= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736174; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9fo5bo8mzPI37nH0cbbOFJn0vZDu8ORoqn2KbFGq/mE=; b=Pz9fVln/f62jPwhZ+ATg5spsmgQImwMPL1rHD4R3DYEjsWJLlFAwx6GFxXjeYJOnhmyc1g y35Nmi78+ocWocDw== Received: from localhost.localdomain 
(unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id BACEAA3B94; Tue, 15 Jun 2021 05:49:32 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Ding Senjie , Coly Li Subject: [PATCH 02/14] md: bcache: Fix spelling of 'acquire' Date: Tue, 15 Jun 2021 13:49:09 +0800 Message-Id: <20210615054921.101421-3-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Ding Senjie acqurie -> acquire Signed-off-by: Ding Senjie Signed-off-by: Coly Li Reviewed-by: Hannes Reinecke --- drivers/md/bcache/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 0a20ccf5a1db..2f1ee4fbf4d5 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -2760,7 +2760,7 @@ static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) * The reason bch_register_lock is not held to call * bch_cache_set_stop() and bcache_device_stop() is to * avoid potential deadlock during reboot, because cache - * set or bcache device stopping process will acqurie + * set or bcache device stopping process will acquire * bch_register_lock too. * * We are safe here because bcache_is_reboot sets to From patchwork Tue Jun 15 05:49:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320621 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EBF6EC49EA4 for ; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CCB326141B for ; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230188AbhFOFvm (ORCPT ); Tue, 15 Jun 2021 01:51:42 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57280 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230130AbhFOFvl (ORCPT ); Tue, 15 Jun 2021 01:51:41 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id D00DE2199E; Tue, 15 Jun 2021 05:49:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736176; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=la1spYHj4zHqQAXb0uaDT982RykUFblkHmqSblbpGsQ=; b=rd/mFt2JCaE++Y+geoVKkG23XdPBG/mODDrx79t9qASnSyxj6cko/AeptRHT2I/ZrnYjLN y+NZgmGRIK5KZKGw9zymDXXjyj03EsXofbxPKGPETOoYqZQL9ACWiZz1GbV1G1Ywsq3n6l sbM3+ZcV9SDv9X/HnhPm6UEtrevHNGk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736176; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: 
mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=la1spYHj4zHqQAXb0uaDT982RykUFblkHmqSblbpGsQ=; b=S6icO6F5Uue7FLVginQKqGfbKoqvDtf3iqIjMIdmKagkZvwCiuaBEywkYw/uboP/UMCOcg w4X9W8tZI/hXDQAw== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 0A3CBA3B94; Tue, 15 Jun 2021 05:49:34 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 03/14] bcache: add initial data structures for nvm pages Date: Tue, 15 Jun 2021 13:49:10 +0800 Message-Id: <20210615054921.101421-4-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch initializes the prototype data structures for nvm pages allocator, - struct bch_nvm_pages_sb This is the super block allocated on each nvdimm namespace. A nvdimm set may have multiple namespaces, bch_nvm_pages_sb->set_uuid is used to mark which nvdimm set this name space belongs to. Normally we will use the bcache's cache set UUID to initialize this uuid, to connect this nvdimm set to a specified bcache cache set. - struct bch_owner_list_head This is a table for all heads of all owner lists. A owner list records which page(s) allocated to which owner. After reboot from power failure, the ownwer may find all its requested and allocated pages from the owner list by a handler which is converted by a UUID. - struct bch_nvm_pages_owner_head This is a head of an owner list. Each owner only has one owner list, and a nvm page only belongs to an specific owner. uuid[] will be set to owner's uuid, for bcache it is the bcache's cache set uuid. label is not mandatory, it is a human-readable string for debug purpose. The pointer *recs references to separated nvm page which hold the table of struct bch_nvm_pgalloc_rec. - struct bch_nvm_pgalloc_recs This struct occupies a whole page, owner_uuid should match the uuid in struct bch_nvm_pages_owner_head. recs[] is the real table contains all allocated records. - struct bch_nvm_pgalloc_rec Each structure records a range of allocated nvm pages. - Bits 0 - 51: is pages offset of the allocated pages. - Bits 52 - 57: allocaed size in page_size * order-of-2 - Bits 58 - 63: reserved. Since each of the allocated nvm pages are power of 2, using 6 bits to represent allocated size can have (1<<(1<<64) - 1) * PAGE_SIZE maximum value. It can be a 76 bits width range size in byte for 4KB page size, which is large enough currently. Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren --- include/uapi/linux/bcache-nvm.h | 200 ++++++++++++++++++++++++++++++++ 1 file changed, 200 insertions(+) create mode 100644 include/uapi/linux/bcache-nvm.h diff --git a/include/uapi/linux/bcache-nvm.h b/include/uapi/linux/bcache-nvm.h new file mode 100644 index 000000000000..5094a6797679 --- /dev/null +++ b/include/uapi/linux/bcache-nvm.h @@ -0,0 +1,200 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +#ifndef _UAPI_BCACHE_NVM_H +#define _UAPI_BCACHE_NVM_H + +#if (__BITS_PER_LONG == 64) +/* + * Bcache on NVDIMM data structures + */ + +/* + * - struct bch_nvm_pages_sb + * This is the super block allocated on each nvdimm namespace. 
A nvdimm + * set may have multiple namespaces, bch_nvm_pages_sb->set_uuid is used to mark + * which nvdimm set this name space belongs to. Normally we will use the + * bcache's cache set UUID to initialize this uuid, to connect this nvdimm + * set to a specified bcache cache set. + * + * - struct bch_owner_list_head + * This is a table for all heads of all owner lists. A owner list records + * which page(s) allocated to which owner. After reboot from power failure, + * the ownwer may find all its requested and allocated pages from the owner + * list by a handler which is converted by a UUID. + * + * - struct bch_nvm_pages_owner_head + * This is a head of an owner list. Each owner only has one owner list, + * and a nvm page only belongs to an specific owner. uuid[] will be set to + * owner's uuid, for bcache it is the bcache's cache set uuid. label is not + * mandatory, it is a human-readable string for debug purpose. The pointer + * recs references to separated nvm page which hold the table of struct + * bch_pgalloc_rec. + * + *- struct bch_nvm_pgalloc_recs + * This structure occupies a whole page, owner_uuid should match the uuid + * in struct bch_nvm_pages_owner_head. recs[] is the real table contains all + * allocated records. + * + * - struct bch_pgalloc_rec + * Each structure records a range of allocated nvm pages. pgoff is offset + * in unit of page size of this allocated nvm page range. The adjoint page + * ranges of same owner can be merged into a larger one, therefore pages_nr + * is NOT always power of 2. + * + * + * Memory layout on nvdimm namespace 0 + * + * 0 +---------------------------------+ + * | | + * 4KB +---------------------------------+ + * | bch_nvm_pages_sb | + * 8KB +---------------------------------+ <--- bch_nvm_pages_sb.bch_owner_list_head + * | bch_owner_list_head | + * | | + * 16KB +---------------------------------+ <--- bch_owner_list_head.heads[0].recs[0] + * | bch_nvm_pgalloc_recs | + * | (nvm pages internal usage) | + * 24KB +---------------------------------+ + * | | + * | | + * 16MB +---------------------------------+ + * | allocable nvm pages | + * | for buddy allocator | + * end +---------------------------------+ + * + * + * + * Memory layout on nvdimm namespace N + * (doesn't have owner list) + * + * 0 +---------------------------------+ + * | | + * 4KB +---------------------------------+ + * | bch_nvm_pages_sb | + * 8KB +---------------------------------+ + * | | + * | | + * | | + * | | + * | | + * | | + * 16MB +---------------------------------+ + * | allocable nvm pages | + * | for buddy allocator | + * end +---------------------------------+ + * + */ + +#include + +/* In sectors */ +#define BCH_NVM_PAGES_SB_OFFSET 4096 +#define BCH_NVM_PAGES_OFFSET (16 << 20) + +#define BCH_NVM_PAGES_LABEL_SIZE 32 +#define BCH_NVM_PAGES_NAMESPACES_MAX 8 + +#define BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET (8<<10) +#define BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET (16<<10) + +#define BCH_NVM_PAGES_SB_VERSION 0 +#define BCH_NVM_PAGES_SB_VERSION_MAX 0 + +static const unsigned char bch_nvm_pages_magic[] = { + 0x17, 0xbd, 0x53, 0x7f, 0x1b, 0x23, 0xd6, 0x83, + 0x46, 0xa4, 0xf8, 0x28, 0x17, 0xda, 0xec, 0xa9 }; +static const unsigned char bch_nvm_pages_pgalloc_magic[] = { + 0x39, 0x25, 0x3f, 0xf7, 0x27, 0x17, 0xd0, 0xb9, + 0x10, 0xe6, 0xd2, 0xda, 0x38, 0x68, 0x26, 0xae }; + +/* takes 64bit width */ +struct bch_pgalloc_rec { + __u64 pgoff:52; + __u64 order:6; + __u64 reserved:6; +}; + +struct bch_nvm_pgalloc_recs { +union { + struct { + struct bch_nvm_pages_owner_head *owner; + 
struct bch_nvm_pgalloc_recs *next; + unsigned char magic[16]; + unsigned char owner_uuid[16]; + unsigned int size; + unsigned int used; + unsigned long _pad[4]; + struct bch_pgalloc_rec recs[]; + }; + unsigned char pad[8192]; +}; +}; + +#define BCH_MAX_RECS \ + ((sizeof(struct bch_nvm_pgalloc_recs) - \ + offsetof(struct bch_nvm_pgalloc_recs, recs)) / \ + sizeof(struct bch_pgalloc_rec)) + +struct bch_nvm_pages_owner_head { + unsigned char uuid[16]; + unsigned char label[BCH_NVM_PAGES_LABEL_SIZE]; + /* Per-namespace own lists */ + struct bch_nvm_pgalloc_recs *recs[BCH_NVM_PAGES_NAMESPACES_MAX]; +}; + +/* heads[0] is always for nvm_pages internal usage */ +struct bch_owner_list_head { +union { + struct { + unsigned int size; + unsigned int used; + unsigned long _pad[4]; + struct bch_nvm_pages_owner_head heads[]; + }; + unsigned char pad[8192]; +}; +}; +#define BCH_MAX_OWNER_LIST \ + ((sizeof(struct bch_owner_list_head) - \ + offsetof(struct bch_owner_list_head, heads)) / \ + sizeof(struct bch_nvm_pages_owner_head)) + +/* The on-media bit order is local CPU order */ +struct bch_nvm_pages_sb { + unsigned long csum; + unsigned long ns_start; + unsigned long sb_offset; + unsigned long version; + unsigned char magic[16]; + unsigned char uuid[16]; + unsigned int page_size; + unsigned int total_namespaces_nr; + unsigned int this_namespace_nr; + union { + unsigned char set_uuid[16]; + unsigned long set_magic; + }; + + unsigned long flags; + unsigned long seq; + + unsigned long feature_compat; + unsigned long feature_incompat; + unsigned long feature_ro_compat; + + /* For allocable nvm pages from buddy systems */ + unsigned long pages_offset; + unsigned long pages_total; + + unsigned long pad[8]; + + /* Only on the first name space */ + struct bch_owner_list_head *owner_list_head; + + /* Just for csum_set() */ + unsigned int keys; + unsigned long d[0]; +}; +#endif /* __BITS_PER_LONG == 64 */ + +#endif /* _UAPI_BCACHE_NVM_H */ From patchwork Tue Jun 15 05:49:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320623 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85434C48BDF for ; Tue, 15 Jun 2021 05:49:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 66E3D61410 for ; Tue, 15 Jun 2021 05:49:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230190AbhFOFvo (ORCPT ); Tue, 15 Jun 2021 01:51:44 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57334 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230130AbhFOFvo (ORCPT ); Tue, 15 Jun 2021 01:51:44 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 50F39219C7; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736179; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dzS4eJY8fpv4cUNgO3ycq+r2sPsKfPsU+vE2PyZ0xcg=; b=QkyV1idltB6OKWYs0gqrQERwOmgWacCW+BItVHxzC5lv1GzmdKaOmeWyLkRpshc6Hv1ysc os3/9jkgWpDvrE+eC29MHY68mEQPCvkl3ce1Nn92xhp79SmMqlmCB7r0Nkd9xeP0T8LuPV ugBPQO4ExIp0drkb1bT1Qo2GTPa/VpI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736179; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dzS4eJY8fpv4cUNgO3ycq+r2sPsKfPsU+vE2PyZ0xcg=; b=wKSvJgcC4v00Xq12BnHN/IjTQN8PvtTRKfu+cmaVUNulkQsLBsHkZE0vXg2ShKVIMHI3aB XMeyMOejkpljtKDg== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 4469CA3B8F; Tue, 15 Jun 2021 05:49:37 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Jianpeng Ma , Randy Dunlap , Qiaowei Ren , Coly Li Subject: [PATCH 04/14] bcache: initialize the nvm pages allocator Date: Tue, 15 Jun 2021 13:49:11 +0800 Message-Id: <20210615054921.101421-5-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Jianpeng Ma This patch define the prototype data structures in memory and initializes the nvm pages allocator. The nvm address space which is managed by this allocator can consist of many nvm namespaces, and some namespaces can compose into one nvm set, like cache set. For this initial implementation, only one set can be supported. The users of this nvm pages allocator need to call register_namespace() to register the nvdimm device (like /dev/pmemX) into this allocator as the instance of struct nvm_namespace. Reported-by: Randy Dunlap Signed-off-by: Jianpeng Ma Co-developed-by: Qiaowei Ren Signed-off-by: Qiaowei Ren Signed-off-by: Coly Li --- drivers/md/bcache/Kconfig | 10 ++ drivers/md/bcache/Makefile | 1 + drivers/md/bcache/nvm-pages.c | 295 ++++++++++++++++++++++++++++++++++ drivers/md/bcache/nvm-pages.h | 74 +++++++++ drivers/md/bcache/super.c | 3 + 5 files changed, 383 insertions(+) create mode 100644 drivers/md/bcache/nvm-pages.c create mode 100644 drivers/md/bcache/nvm-pages.h diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig index d1ca4d059c20..a69f6c0e0507 100644 --- a/drivers/md/bcache/Kconfig +++ b/drivers/md/bcache/Kconfig @@ -35,3 +35,13 @@ config BCACHE_ASYNC_REGISTRATION device path into this file will returns immediately and the real registration work is handled in kernel work queue in asynchronous way. + +config BCACHE_NVM_PAGES + bool "NVDIMM support for bcache (EXPERIMENTAL)" + depends on BCACHE + depends on 64BIT + depends on LIBNVDIMM + depends on DAX + help + Allocate/release NV-memory pages for bcache and provide allocated pages + for each requestor after system reboot. 
diff --git a/drivers/md/bcache/Makefile b/drivers/md/bcache/Makefile index 5b87e59676b8..2397bb7c7ffd 100644 --- a/drivers/md/bcache/Makefile +++ b/drivers/md/bcache/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_BCACHE) += bcache.o bcache-y := alloc.o bset.o btree.o closure.o debug.o extents.o\ io.o journal.o movinggc.o request.o stats.o super.o sysfs.o trace.o\ util.o writeback.o features.o +bcache-$(CONFIG_BCACHE_NVM_PAGES) += nvm-pages.o diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c new file mode 100644 index 000000000000..18fdadbc502f --- /dev/null +++ b/drivers/md/bcache/nvm-pages.c @@ -0,0 +1,295 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Nvdimm page-buddy allocator + * + * Copyright (c) 2021, Intel Corporation. + * Copyright (c) 2021, Qiaowei Ren . + * Copyright (c) 2021, Jianpeng Ma . + */ + +#if defined(CONFIG_BCACHE_NVM_PAGES) + +#include "bcache.h" +#include "nvm-pages.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct bch_nvm_set *only_set; + +static void release_nvm_namespaces(struct bch_nvm_set *nvm_set) +{ + int i; + struct bch_nvm_namespace *ns; + + for (i = 0; i < nvm_set->total_namespaces_nr; i++) { + ns = nvm_set->nss[i]; + if (ns) { + blkdev_put(ns->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC); + kfree(ns); + } + } + + kfree(nvm_set->nss); +} + +static void release_nvm_set(struct bch_nvm_set *nvm_set) +{ + release_nvm_namespaces(nvm_set); + kfree(nvm_set); +} + +static int init_owner_info(struct bch_nvm_namespace *ns) +{ + struct bch_owner_list_head *owner_list_head = ns->sb->owner_list_head; + + mutex_lock(&only_set->lock); + only_set->owner_list_head = owner_list_head; + only_set->owner_list_size = owner_list_head->size; + only_set->owner_list_used = owner_list_head->used; + mutex_unlock(&only_set->lock); + + return 0; +} + +static int attach_nvm_set(struct bch_nvm_namespace *ns) +{ + int rc = 0; + + mutex_lock(&only_set->lock); + if (only_set->nss) { + if (memcmp(ns->sb->set_uuid, only_set->set_uuid, 16)) { + pr_info("namespace id doesn't match nvm set\n"); + rc = -EINVAL; + goto unlock; + } + + if (only_set->nss[ns->sb->this_namespace_nr]) { + pr_info("already has the same position(%d) nvm\n", + ns->sb->this_namespace_nr); + rc = -EEXIST; + goto unlock; + } + } else { + memcpy(only_set->set_uuid, ns->sb->set_uuid, 16); + only_set->total_namespaces_nr = ns->sb->total_namespaces_nr; + only_set->nss = kcalloc(only_set->total_namespaces_nr, + sizeof(struct bch_nvm_namespace *), GFP_KERNEL); + if (!only_set->nss) { + rc = -ENOMEM; + goto unlock; + } + } + + only_set->nss[ns->sb->this_namespace_nr] = ns; + + /* Firstly attach */ + if ((unsigned long)ns->sb->owner_list_head == BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET) { + struct bch_nvm_pages_owner_head *sys_owner_head; + struct bch_nvm_pgalloc_recs *sys_pgalloc_recs; + + ns->sb->owner_list_head = ns->kaddr + BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET; + sys_pgalloc_recs = ns->kaddr + BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET; + + sys_owner_head = &(ns->sb->owner_list_head->heads[0]); + sys_owner_head->recs[0] = sys_pgalloc_recs; + ns->sb->csum = csum_set(ns->sb); + + sys_pgalloc_recs->owner = sys_owner_head; + } else + BUG_ON(ns->sb->owner_list_head != + (ns->kaddr + BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET)); + +unlock: + mutex_unlock(&only_set->lock); + return rc; +} + +static int read_nvdimm_meta_super(struct block_device *bdev, + struct bch_nvm_namespace *ns) +{ + struct page *page; + struct bch_nvm_pages_sb *sb; + int r = 0; + uint64_t 
expected_csum = 0; + + page = read_cache_page_gfp(bdev->bd_inode->i_mapping, + BCH_NVM_PAGES_SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL); + + if (IS_ERR(page)) + return -EIO; + + sb = (struct bch_nvm_pages_sb *)(page_address(page) + + offset_in_page(BCH_NVM_PAGES_SB_OFFSET)); + r = -EINVAL; + expected_csum = csum_set(sb); + if (expected_csum != sb->csum) { + pr_info("csum is not match with expected one\n"); + goto put_page; + } + + if (memcmp(sb->magic, bch_nvm_pages_magic, 16)) { + pr_info("invalid bch_nvm_pages_magic\n"); + goto put_page; + } + + if (sb->total_namespaces_nr != 1) { + pr_info("currently only support one nvm device\n"); + goto put_page; + } + + if (sb->sb_offset != BCH_NVM_PAGES_SB_OFFSET) { + pr_info("invalid superblock offset\n"); + goto put_page; + } + + r = 0; + /* temporary use for DAX API */ + ns->page_size = sb->page_size; + ns->pages_total = sb->pages_total; + +put_page: + put_page(page); + return r; +} + +struct bch_nvm_namespace *bch_register_namespace(const char *dev_path) +{ + struct bch_nvm_namespace *ns; + int err; + pgoff_t pgoff; + char buf[BDEVNAME_SIZE]; + struct block_device *bdev; + int id; + char *path = NULL; + + path = kstrndup(dev_path, 512, GFP_KERNEL); + if (!path) { + pr_err("kstrndup failed\n"); + return ERR_PTR(-ENOMEM); + } + + bdev = blkdev_get_by_path(strim(path), + FMODE_READ|FMODE_WRITE|FMODE_EXEC, + only_set); + if (IS_ERR(bdev)) { + pr_info("get %s error: %ld\n", dev_path, PTR_ERR(bdev)); + kfree(path); + return ERR_PTR(PTR_ERR(bdev)); + } + + err = -ENOMEM; + ns = kzalloc(sizeof(struct bch_nvm_namespace), GFP_KERNEL); + if (!ns) + goto bdput; + + err = -EIO; + if (read_nvdimm_meta_super(bdev, ns)) { + pr_info("%s read nvdimm meta super block failed.\n", + bdevname(bdev, buf)); + goto free_ns; + } + + err = -EOPNOTSUPP; + if (!bdev_dax_supported(bdev, ns->page_size)) { + pr_info("%s don't support DAX\n", bdevname(bdev, buf)); + goto free_ns; + } + + err = -EINVAL; + if (bdev_dax_pgoff(bdev, 0, ns->page_size, &pgoff)) { + pr_info("invalid offset of %s\n", bdevname(bdev, buf)); + goto free_ns; + } + + err = -ENOMEM; + ns->dax_dev = fs_dax_get_by_bdev(bdev); + if (!ns->dax_dev) { + pr_info("can't by dax device by %s\n", bdevname(bdev, buf)); + goto free_ns; + } + + err = -EINVAL; + id = dax_read_lock(); + if (dax_direct_access(ns->dax_dev, pgoff, ns->pages_total, + &ns->kaddr, &ns->start_pfn) <= 0) { + pr_info("dax_direct_access error\n"); + dax_read_unlock(id); + goto free_ns; + } + dax_read_unlock(id); + + ns->sb = ns->kaddr + BCH_NVM_PAGES_SB_OFFSET; + + err = -EINVAL; + /* Check magic again to make sure DAX mapping is correct */ + if (memcmp(ns->sb->magic, bch_nvm_pages_magic, 16)) { + pr_info("invalid bch_nvm_pages_magic after DAX mapping\n"); + goto free_ns; + } + + err = attach_nvm_set(ns); + if (err < 0) + goto free_ns; + + ns->page_size = ns->sb->page_size; + ns->pages_offset = ns->sb->pages_offset; + ns->pages_total = ns->sb->pages_total; + ns->free = 0; + ns->bdev = bdev; + ns->nvm_set = only_set; + mutex_init(&ns->lock); + + if (ns->sb->this_namespace_nr == 0) { + pr_info("only first namespace contain owner info\n"); + err = init_owner_info(ns); + if (err < 0) { + pr_info("init_owner_info met error %d\n", err); + only_set->nss[ns->sb->this_namespace_nr] = NULL; + goto free_ns; + } + } + + kfree(path); + return ns; +free_ns: + kfree(ns); +bdput: + blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC); + kfree(path); + return ERR_PTR(err); +} +EXPORT_SYMBOL_GPL(bch_register_namespace); + +int __init bch_nvm_init(void) +{ + only_set = 
kzalloc(sizeof(*only_set), GFP_KERNEL); + if (!only_set) + return -ENOMEM; + + only_set->total_namespaces_nr = 0; + only_set->owner_list_head = NULL; + only_set->nss = NULL; + + mutex_init(&only_set->lock); + + pr_info("bcache nvm init\n"); + return 0; +} + +void bch_nvm_exit(void) +{ + release_nvm_set(only_set); + pr_info("bcache nvm exit\n"); +} + +#endif /* CONFIG_BCACHE_NVM_PAGES */ diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h new file mode 100644 index 000000000000..3e24c4dee7fd --- /dev/null +++ b/drivers/md/bcache/nvm-pages.h @@ -0,0 +1,74 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _BCACHE_NVM_PAGES_H +#define _BCACHE_NVM_PAGES_H + +#if defined(CONFIG_BCACHE_NVM_PAGES) +#include +#endif /* CONFIG_BCACHE_NVM_PAGES */ + +/* + * Bcache NVDIMM in memory data structures + */ + +/* + * The following three structures in memory records which page(s) allocated + * to which owner. After reboot from power failure, they will be initialized + * based on nvm pages superblock in NVDIMM device. + */ +struct bch_nvm_namespace { + struct bch_nvm_pages_sb *sb; + void *kaddr; + + u8 uuid[16]; + u64 free; + u32 page_size; + u64 pages_offset; + u64 pages_total; + pfn_t start_pfn; + + struct dax_device *dax_dev; + struct block_device *bdev; + struct bch_nvm_set *nvm_set; + + struct mutex lock; +}; + +/* + * A set of namespaces. Currently only one set can be supported. + */ +struct bch_nvm_set { + u8 set_uuid[16]; + u32 total_namespaces_nr; + + u32 owner_list_size; + u32 owner_list_used; + struct bch_owner_list_head *owner_list_head; + + struct bch_nvm_namespace **nss; + + struct mutex lock; +}; +extern struct bch_nvm_set *only_set; + +#if defined(CONFIG_BCACHE_NVM_PAGES) + +struct bch_nvm_namespace *bch_register_namespace(const char *dev_path); +int bch_nvm_init(void); +void bch_nvm_exit(void); + +#else + +static inline struct bch_nvm_namespace *bch_register_namespace(const char *dev_path) +{ + return NULL; +} +static inline int bch_nvm_init(void) +{ + return 0; +} +static inline void bch_nvm_exit(void) { } + +#endif /* CONFIG_BCACHE_NVM_PAGES */ + +#endif /* _BCACHE_NVM_PAGES_H */ diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 2f1ee4fbf4d5..ce22aefb1352 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -14,6 +14,7 @@ #include "request.h" #include "writeback.h" #include "features.h" +#include "nvm-pages.h" #include #include @@ -2823,6 +2824,7 @@ static void bcache_exit(void) { bch_debug_exit(); bch_request_exit(); + bch_nvm_exit(); if (bcache_kobj) kobject_put(bcache_kobj); if (bcache_wq) @@ -2921,6 +2923,7 @@ static int __init bcache_init(void) bch_debug_init(); closure_debug_init(); + bch_nvm_init(); bcache_is_reboot = false; From patchwork Tue Jun 15 05:49:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320625 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FAA3C49361 for ; Tue, 15 Jun 2021 05:49:45 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4A02561412 for ; Tue, 15 Jun 2021 05:49:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230299AbhFOFvr (ORCPT ); Tue, 15 Jun 2021 01:51:47 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45636 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230269AbhFOFvq (ORCPT ); Tue, 15 Jun 2021 01:51:46 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id EAECF1FD2A; Tue, 15 Jun 2021 05:49:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736181; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cMgkASmFuD6HvdIGhZRe5oM3JCTCcKwe/KnRzPROy30=; b=ZWtTgD7fdkXrGgqif1REIsEeut9A83VoHRKY0uDu+FofUwO3qFRSe1nwm8e24WjcP+3Yfz yNB+xAh85N5wLyTZ9TqYOW0QdBC2BIaQ7ijN2mrvijHEal/2UD+Lq/qWty7Bj6E6vBh5U8 wmZhDyUkTpRVZOMyJ9uAYYBxhDMDYsY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736181; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cMgkASmFuD6HvdIGhZRe5oM3JCTCcKwe/KnRzPROy30=; b=7Rm5LRSW/XzsQ/+/w9JmoWQg05o1qH0Sys/EBzRt2B+5x8m0nXpSr8l5tN5nQcmX4LNfSM lT+PTfCrAHEBqUAQ== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id B134AA3BAA; Tue, 15 Jun 2021 05:49:39 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Jianpeng Ma , kernel test robot , Dan Carpenter , Qiaowei Ren , Coly Li Subject: [PATCH 05/14] bcache: initialization of the buddy Date: Tue, 15 Jun 2021 13:49:12 +0800 Message-Id: <20210615054921.101421-6-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Jianpeng Ma This nvm pages allocator will implement the simple buddy to manage the nvm address space. This patch initializes this buddy for new namespace. the unit of alloc/free of the buddy is page. DAX device has their struct page(in dram or PMEM). struct { /* ZONE_DEVICE pages */ /** @pgmap: Points to the hosting device page map. */ struct dev_pagemap *pgmap; void *zone_device_data; /* * ZONE_DEVICE private pages are counted as being * mapped so the next 3 words hold the mapping, index, * and private fields from the source anonymous or * page cache page while the page is migrated to device * private memory. * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also * use the mapping, index, and private fields when * pmem backed DAX files are mapped. */ }; ZONE_DEVICE pages only use pgmap. Other 4 words[16/32 bytes] don't use. So the second/third word will be used as 'struct list_head ' which list in buddy. The fourth word(that is normal struct page::index) store pgoff which the page-offset in the dax device. And the fifth word (that is normal struct page::private) store order of buddy. page_type will be used to store buddy flags. 
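[Editor's note] For clarity, the field mapping described above can be shown as a minimal sketch. This is illustrative only, not part of the patch: it condenses what the real code in the diff below does inside init_nvm_free_space()/init_owner_info(), using the ns->free_area[] lists and struct bch_nvm_namespace introduced by this patch.

	/*
	 * Illustrative sketch: thread a ZONE_DEVICE struct page onto a buddy
	 * free list by reusing its otherwise-unused fields, as described above.
	 */
	static void example_add_to_free_list(struct bch_nvm_namespace *ns,
					     struct page *page, pgoff_t pgoff,
					     int order)
	{
		page->index = pgoff;		/* page offset inside the DAX device */
		set_page_private(page, order);	/* buddy order kept in page->private */
		__SetPageBuddy(page);		/* page_type marks the page as a free buddy */
		/* second/third words (zone_device_data) reused as a struct list_head */
		list_add((struct list_head *)&page->zone_device_data,
			 &ns->free_area[order]);
	}

The actual patch performs exactly these steps when it seeds the free lists and when it splits or merges buddies.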
Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Jianpeng Ma Co-developed-by: Qiaowei Ren Signed-off-by: Qiaowei Ren Signed-off-by: Coly Li --- drivers/md/bcache/nvm-pages.c | 156 +++++++++++++++++++++++++++++++- drivers/md/bcache/nvm-pages.h | 6 ++ include/uapi/linux/bcache-nvm.h | 10 +- 3 files changed, 165 insertions(+), 7 deletions(-) diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c index 18fdadbc502f..804ee66e97be 100644 --- a/drivers/md/bcache/nvm-pages.c +++ b/drivers/md/bcache/nvm-pages.c @@ -34,6 +34,10 @@ static void release_nvm_namespaces(struct bch_nvm_set *nvm_set) for (i = 0; i < nvm_set->total_namespaces_nr; i++) { ns = nvm_set->nss[i]; if (ns) { + kvfree(ns->pages_bitmap); + if (ns->pgalloc_recs_bitmap) + bitmap_free(ns->pgalloc_recs_bitmap); + blkdev_put(ns->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC); kfree(ns); } @@ -48,17 +52,130 @@ static void release_nvm_set(struct bch_nvm_set *nvm_set) kfree(nvm_set); } +static struct page *nvm_vaddr_to_page(struct bch_nvm_namespace *ns, void *addr) +{ + return virt_to_page(addr); +} + +static void *nvm_pgoff_to_vaddr(struct bch_nvm_namespace *ns, pgoff_t pgoff) +{ + return ns->kaddr + (pgoff << PAGE_SHIFT); +} + +static inline void remove_owner_space(struct bch_nvm_namespace *ns, + pgoff_t pgoff, u64 nr) +{ + while (nr > 0) { + unsigned int num = nr > UINT_MAX ? UINT_MAX : nr; + + bitmap_set(ns->pages_bitmap, pgoff, num); + nr -= num; + pgoff += num; + } +} + +#define BCH_PGOFF_TO_KVADDR(pgoff) ((void *)((unsigned long)pgoff << PAGE_SHIFT)) + static int init_owner_info(struct bch_nvm_namespace *ns) { struct bch_owner_list_head *owner_list_head = ns->sb->owner_list_head; + struct bch_nvm_pgalloc_recs *sys_recs; + int i, j, k, rc = 0; mutex_lock(&only_set->lock); only_set->owner_list_head = owner_list_head; only_set->owner_list_size = owner_list_head->size; only_set->owner_list_used = owner_list_head->used; + + /* remove used space */ + remove_owner_space(ns, 0, div_u64(ns->pages_offset, ns->page_size)); + + sys_recs = ns->kaddr + BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET; + /* suppose no hole in array */ + for (i = 0; i < owner_list_head->used; i++) { + struct bch_nvm_pages_owner_head *head = &owner_list_head->heads[i]; + + for (j = 0; j < BCH_NVM_PAGES_NAMESPACES_MAX; j++) { + struct bch_nvm_pgalloc_recs *pgalloc_recs = head->recs[j]; + unsigned long offset = (unsigned long)ns->kaddr >> PAGE_SHIFT; + struct page *page; + + while (pgalloc_recs) { + u32 pgalloc_recs_pos = (unsigned int)(pgalloc_recs - sys_recs); + + if (memcmp(pgalloc_recs->magic, bch_nvm_pages_pgalloc_magic, 16)) { + pr_info("invalid bch_nvm_pages_pgalloc_magic\n"); + rc = -EINVAL; + goto unlock; + } + if (memcmp(pgalloc_recs->owner_uuid, head->uuid, 16)) { + pr_info("invalid owner_uuid in bch_nvm_pgalloc_recs\n"); + rc = -EINVAL; + goto unlock; + } + if (pgalloc_recs->owner != head) { + pr_info("invalid owner in bch_nvm_pgalloc_recs\n"); + rc = -EINVAL; + goto unlock; + } + + /* recs array can has hole */ + for (k = 0; k < pgalloc_recs->size; k++) { + struct bch_pgalloc_rec *rec = &pgalloc_recs->recs[k]; + + if (rec->pgoff) { + BUG_ON(rec->pgoff <= offset); + + /* init struct page: index/private */ + page = nvm_vaddr_to_page(ns, + BCH_PGOFF_TO_KVADDR(rec->pgoff)); + + set_page_private(page, rec->order); + page->index = rec->pgoff - offset; + + remove_owner_space(ns, + rec->pgoff - offset, + 1L << rec->order); + } + } + bitmap_set(ns->pgalloc_recs_bitmap, pgalloc_recs_pos, 1); + pgalloc_recs = pgalloc_recs->next; + } + } + } 
+unlock: mutex_unlock(&only_set->lock); - return 0; + return rc; +} + +static void init_nvm_free_space(struct bch_nvm_namespace *ns) +{ + unsigned int start, end, pages; + int i; + struct page *page; + pgoff_t pgoff_start; + + bitmap_for_each_clear_region(ns->pages_bitmap, start, end, 0, ns->pages_total) { + pgoff_start = start; + pages = end - start; + + while (pages) { + for (i = BCH_MAX_ORDER - 1; i >= 0 ; i--) { + if ((pgoff_start % (1L << i) == 0) && (pages >= (1L << i))) + break; + } + + page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, pgoff_start)); + page->index = pgoff_start; + set_page_private(page, i); + __SetPageBuddy(page); + list_add((struct list_head *)&page->zone_device_data, &ns->free_area[i]); + + pgoff_start += 1L << i; + pages -= 1L << i; + } + } } static int attach_nvm_set(struct bch_nvm_namespace *ns) @@ -165,7 +282,7 @@ static int read_nvdimm_meta_super(struct block_device *bdev, struct bch_nvm_namespace *bch_register_namespace(const char *dev_path) { struct bch_nvm_namespace *ns; - int err; + int i, err; pgoff_t pgoff; char buf[BDEVNAME_SIZE]; struct block_device *bdev; @@ -249,18 +366,49 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path) ns->nvm_set = only_set; mutex_init(&ns->lock); + /* + * parameters of bitmap_set/clear are unsigned int. + * Given currently size of nvm is far from exceeding this limit, + * so only add a WARN_ON message. + */ + WARN_ON(BITS_TO_LONGS(ns->pages_total) > UINT_MAX); + ns->pages_bitmap = kvcalloc(BITS_TO_LONGS(ns->pages_total), + sizeof(unsigned long), GFP_KERNEL); + if (!ns->pages_bitmap) { + err = -ENOMEM; + goto clear_ns_nr; + } + + if (ns->sb->this_namespace_nr == 0) { + ns->pgalloc_recs_bitmap = bitmap_zalloc(BCH_MAX_PGALLOC_RECS, GFP_KERNEL); + if (ns->pgalloc_recs_bitmap == NULL) { + err = -ENOMEM; + goto free_pages_bitmap; + } + } + + for (i = 0; i < BCH_MAX_ORDER; i++) + INIT_LIST_HEAD(&ns->free_area[i]); + if (ns->sb->this_namespace_nr == 0) { pr_info("only first namespace contain owner info\n"); err = init_owner_info(ns); if (err < 0) { pr_info("init_owner_info met error %d\n", err); - only_set->nss[ns->sb->this_namespace_nr] = NULL; - goto free_ns; + goto free_recs_bitmap; } + /* init buddy allocator */ + init_nvm_free_space(ns); } kfree(path); return ns; +free_recs_bitmap: + bitmap_free(ns->pgalloc_recs_bitmap); +free_pages_bitmap: + kvfree(ns->pages_bitmap); +clear_ns_nr: + only_set->nss[ns->sb->this_namespace_nr] = NULL; free_ns: kfree(ns); bdput: diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h index 3e24c4dee7fd..71beb244b9be 100644 --- a/drivers/md/bcache/nvm-pages.h +++ b/drivers/md/bcache/nvm-pages.h @@ -16,6 +16,7 @@ * to which owner. After reboot from power failure, they will be initialized * based on nvm pages superblock in NVDIMM device. 
*/ +#define BCH_MAX_ORDER 20 struct bch_nvm_namespace { struct bch_nvm_pages_sb *sb; void *kaddr; @@ -27,6 +28,11 @@ struct bch_nvm_namespace { u64 pages_total; pfn_t start_pfn; + unsigned long *pages_bitmap; + struct list_head free_area[BCH_MAX_ORDER]; + + unsigned long *pgalloc_recs_bitmap; + struct dax_device *dax_dev; struct block_device *bdev; struct bch_nvm_set *nvm_set; diff --git a/include/uapi/linux/bcache-nvm.h b/include/uapi/linux/bcache-nvm.h index 5094a6797679..1fdb3eaabf7e 100644 --- a/include/uapi/linux/bcache-nvm.h +++ b/include/uapi/linux/bcache-nvm.h @@ -130,11 +130,15 @@ union { }; }; -#define BCH_MAX_RECS \ - ((sizeof(struct bch_nvm_pgalloc_recs) - \ - offsetof(struct bch_nvm_pgalloc_recs, recs)) / \ +#define BCH_MAX_RECS \ + ((sizeof(struct bch_nvm_pgalloc_recs) - \ + offsetof(struct bch_nvm_pgalloc_recs, recs)) / \ sizeof(struct bch_pgalloc_rec)) +#define BCH_MAX_PGALLOC_RECS \ + ((BCH_NVM_PAGES_OFFSET - BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET) / \ + sizeof(struct bch_nvm_pgalloc_recs)) + struct bch_nvm_pages_owner_head { unsigned char uuid[16]; unsigned char label[BCH_NVM_PAGES_LABEL_SIZE]; From patchwork Tue Jun 15 05:49:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320627 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 478E2C49EA2 for ; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 365126140C for ; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230212AbhFOFvu (ORCPT ); Tue, 15 Jun 2021 01:51:50 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45648 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230233AbhFOFvt (ORCPT ); Tue, 15 Jun 2021 01:51:49 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 30F091FD55; Tue, 15 Jun 2021 05:49:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736184; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vpE7f2Fh4rhHS8BvwF5DiUomCxJwBM6D+bRkTRUCx6U=; b=Cym2Paek+5W/CjXblpSwMRzJ1D0oSEG8Dij7+dBdjXc7PLl6KL99BLG40AvvqYKSDsqE4u m2IYhTbVmOlyX1gtc5YuEceAIomNipYMuhDuGAP0aEoVS4T0w0XLu3j4QSJWLxEirJonke a0FLkGlDvggU58QIq1QdEuY/AogIs84= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736184; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vpE7f2Fh4rhHS8BvwF5DiUomCxJwBM6D+bRkTRUCx6U=; b=25+gO1NZvYqAZ8FG8vuW2kOsb7zWuf4dQJwzuV2rii6ARJyMK0MerADH+JpPDcn9/5dC/B 6QeqPKsXG33a2NDg== Received: from localhost.localdomain (unknown [10.163.16.22]) by 
relay2.suse.de (Postfix) with ESMTP id 5EE89A3BB4; Tue, 15 Jun 2021 05:49:42 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Jianpeng Ma , Qiaowei Ren , Coly Li Subject: [PATCH 06/14] bcache: bch_nvm_alloc_pages() of the buddy Date: Tue, 15 Jun 2021 13:49:13 +0800 Message-Id: <20210615054921.101421-7-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Jianpeng Ma This patch implements the bch_nvm_alloc_pages() of the buddy. In terms of function, this func is like current-page-buddy-alloc. But the differences are: a: it need owner_uuid as parameter which record owner info. And it make those info persistence. b: it don't need flags like GFP_*. All allocs are the equal. c: it don't trigger other ops etc swap/recycle. Signed-off-by: Jianpeng Ma Co-developed-by: Qiaowei Ren Signed-off-by: Qiaowei Ren Signed-off-by: Coly Li --- drivers/md/bcache/nvm-pages.c | 174 ++++++++++++++++++++++++++++++++ drivers/md/bcache/nvm-pages.h | 6 ++ include/uapi/linux/bcache-nvm.h | 6 +- 3 files changed, 184 insertions(+), 2 deletions(-) diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c index 804ee66e97be..5d095d241483 100644 --- a/drivers/md/bcache/nvm-pages.c +++ b/drivers/md/bcache/nvm-pages.c @@ -74,6 +74,180 @@ static inline void remove_owner_space(struct bch_nvm_namespace *ns, } } +/* If not found, it will create if create == true */ +static struct bch_nvm_pages_owner_head *find_owner_head(const char *owner_uuid, bool create) +{ + struct bch_owner_list_head *owner_list_head = only_set->owner_list_head; + struct bch_nvm_pages_owner_head *owner_head = NULL; + int i; + + if (owner_list_head == NULL) + goto out; + + for (i = 0; i < only_set->owner_list_used; i++) { + if (!memcmp(owner_uuid, owner_list_head->heads[i].uuid, 16)) { + owner_head = &(owner_list_head->heads[i]); + break; + } + } + + if (!owner_head && create) { + u32 used = only_set->owner_list_used; + + if (only_set->owner_list_size > used) { + memcpy_flushcache(owner_list_head->heads[used].uuid, owner_uuid, 16); + only_set->owner_list_used++; + + owner_list_head->used++; + owner_head = &(owner_list_head->heads[used]); + } else + pr_info("no free bch_nvm_pages_owner_head\n"); + } + +out: + return owner_head; +} + +static struct bch_nvm_pgalloc_recs *find_empty_pgalloc_recs(void) +{ + unsigned int start; + struct bch_nvm_namespace *ns = only_set->nss[0]; + struct bch_nvm_pgalloc_recs *recs; + + start = bitmap_find_next_zero_area(ns->pgalloc_recs_bitmap, BCH_MAX_PGALLOC_RECS, 0, 1, 0); + if (start > BCH_MAX_PGALLOC_RECS) { + pr_info("no free struct bch_nvm_pgalloc_recs\n"); + return NULL; + } + + bitmap_set(ns->pgalloc_recs_bitmap, start, 1); + recs = (struct bch_nvm_pgalloc_recs *)(ns->kaddr + BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET) + + start; + return recs; +} + +static struct bch_nvm_pgalloc_recs *find_nvm_pgalloc_recs(struct bch_nvm_namespace *ns, + struct bch_nvm_pages_owner_head *owner_head, bool create) +{ + int ns_nr = ns->sb->this_namespace_nr; + struct bch_nvm_pgalloc_recs *prev_recs = NULL, *recs = owner_head->recs[ns_nr]; + + /* If create=false, we return recs[nr] */ + if (!create) + return recs; + + /* + * If create=true, it mean we need a empty struct bch_pgalloc_rec + * So we should find non-empty struct bch_nvm_pgalloc_recs or alloc + * new struct 
bch_nvm_pgalloc_recs. And return this bch_nvm_pgalloc_recs + */ + while (recs && (recs->used == recs->size)) { + prev_recs = recs; + recs = recs->next; + } + + /* Found empty struct bch_nvm_pgalloc_recs */ + if (recs) + return recs; + /* Need alloc new struct bch_nvm_galloc_recs */ + recs = find_empty_pgalloc_recs(); + if (recs) { + recs->next = NULL; + recs->owner = owner_head; + memcpy_flushcache(recs->magic, bch_nvm_pages_pgalloc_magic, 16); + memcpy_flushcache(recs->owner_uuid, owner_head->uuid, 16); + recs->size = BCH_MAX_RECS; + recs->used = 0; + + if (prev_recs) + prev_recs->next = recs; + else + owner_head->recs[ns_nr] = recs; + } + + return recs; +} + +static void add_pgalloc_rec(struct bch_nvm_pgalloc_recs *recs, void *kaddr, int order) +{ + int i; + + for (i = 0; i < recs->size; i++) { + if (recs->recs[i].pgoff == 0) { + recs->recs[i].pgoff = (unsigned long)kaddr >> PAGE_SHIFT; + recs->recs[i].order = order; + recs->used++; + break; + } + } + BUG_ON(i == recs->size); +} + +void *bch_nvm_alloc_pages(int order, const char *owner_uuid) +{ + void *kaddr = NULL; + struct bch_nvm_pgalloc_recs *pgalloc_recs; + struct bch_nvm_pages_owner_head *owner_head; + int i, j; + + mutex_lock(&only_set->lock); + owner_head = find_owner_head(owner_uuid, true); + + if (!owner_head) { + pr_err("can't find bch_nvm_pgalloc_recs by(uuid=%s)\n", owner_uuid); + goto unlock; + } + + for (j = 0; j < only_set->total_namespaces_nr; j++) { + struct bch_nvm_namespace *ns = only_set->nss[j]; + + if (!ns || (ns->free < (1L << order))) + continue; + + for (i = order; i < BCH_MAX_ORDER; i++) { + struct list_head *list; + struct page *page, *buddy_page; + + if (list_empty(&ns->free_area[i])) + continue; + + list = ns->free_area[i].next; + page = container_of((void *)list, struct page, zone_device_data); + + list_del(list); + + while (i != order) { + buddy_page = nvm_vaddr_to_page(ns, + nvm_pgoff_to_vaddr(ns, page->index + (1L << (i - 1)))); + set_page_private(buddy_page, i - 1); + buddy_page->index = page->index + (1L << (i - 1)); + __SetPageBuddy(buddy_page); + list_add((struct list_head *)&buddy_page->zone_device_data, + &ns->free_area[i - 1]); + i--; + } + + set_page_private(page, order); + __ClearPageBuddy(page); + ns->free -= 1L << order; + kaddr = nvm_pgoff_to_vaddr(ns, page->index); + break; + } + + if (i < BCH_MAX_ORDER) { + pgalloc_recs = find_nvm_pgalloc_recs(ns, owner_head, true); + /* ToDo: handle pgalloc_recs==NULL */ + add_pgalloc_rec(pgalloc_recs, kaddr, order); + break; + } + } + +unlock: + mutex_unlock(&only_set->lock); + return kaddr; +} +EXPORT_SYMBOL_GPL(bch_nvm_alloc_pages); + #define BCH_PGOFF_TO_KVADDR(pgoff) ((void *)((unsigned long)pgoff << PAGE_SHIFT)) static int init_owner_info(struct bch_nvm_namespace *ns) diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h index 71beb244b9be..f2583723aca6 100644 --- a/drivers/md/bcache/nvm-pages.h +++ b/drivers/md/bcache/nvm-pages.h @@ -62,6 +62,7 @@ extern struct bch_nvm_set *only_set; struct bch_nvm_namespace *bch_register_namespace(const char *dev_path); int bch_nvm_init(void); void bch_nvm_exit(void); +void *bch_nvm_alloc_pages(int order, const char *owner_uuid); #else @@ -74,6 +75,11 @@ static inline int bch_nvm_init(void) return 0; } static inline void bch_nvm_exit(void) { } +static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid) +{ + return NULL; +} + #endif /* CONFIG_BCACHE_NVM_PAGES */ diff --git a/include/uapi/linux/bcache-nvm.h b/include/uapi/linux/bcache-nvm.h index 1fdb3eaabf7e..9cb937292202 100644 
--- a/include/uapi/linux/bcache-nvm.h +++ b/include/uapi/linux/bcache-nvm.h @@ -135,9 +135,11 @@ union { offsetof(struct bch_nvm_pgalloc_recs, recs)) / \ sizeof(struct bch_pgalloc_rec)) +/* Currently 64 struct bch_nvm_pgalloc_recs is enough */ #define BCH_MAX_PGALLOC_RECS \ - ((BCH_NVM_PAGES_OFFSET - BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET) / \ - sizeof(struct bch_nvm_pgalloc_recs)) + (min_t(unsigned int, 64, \ + (BCH_NVM_PAGES_OFFSET - BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET) / \ + sizeof(struct bch_nvm_pgalloc_recs))) struct bch_nvm_pages_owner_head { unsigned char uuid[16]; From patchwork Tue Jun 15 05:49:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320629 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4775C48BE5 for ; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D246D6140C for ; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230283AbhFOFvv (ORCPT ); Tue, 15 Jun 2021 01:51:51 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57354 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230236AbhFOFvv (ORCPT ); Tue, 15 Jun 2021 01:51:51 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 5CDAF2199E; Tue, 15 Jun 2021 05:49:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736186; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7NClXLAhe812Qn+7Gk+KJdd/JeWG7kRHXVUpsn9g7KE=; b=DJEbORD+BQDSYf+qQIaNbZeKnVCjwhzI13sH8CWfW/EfhkXmxd+5I07SKGCdDJjbHnslPF xj2IhK7U2hhkgYYfROX7pWDlr1uj6vKN07jOMMIUuSHlMzy0iO86QXgcCo3OuE8Iamtfef ugGhfFLjhEvOUwdsDoh4FLUJ178JiQo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736186; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7NClXLAhe812Qn+7Gk+KJdd/JeWG7kRHXVUpsn9g7KE=; b=n9YJu+7rSj0pQvEoZo+G2axayghrvVb1xkhMGt42CToa01O5lqUoFhBhgPUw+IW1321xO5 qvh+CxLvGyfViRBA== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 91114A3BB4; Tue, 15 Jun 2021 05:49:44 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Jianpeng Ma , Qiaowei Ren , Coly Li Subject: [PATCH 07/14] bcache: bch_nvm_free_pages() of the buddy Date: Tue, 15 Jun 2021 13:49:14 +0800 Message-Id: <20210615054921.101421-8-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk 
List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Jianpeng Ma This patch implements bch_nvm_free_pages() of the buddy allocator. The difference between this and the page buddy free path is that it needs owner_uuid to free the owner's allocated pages, and the free operation must be persistent after it completes. Signed-off-by: Jianpeng Ma Co-developed-by: Qiaowei Ren Signed-off-by: Qiaowei Ren Signed-off-by: Coly Li --- drivers/md/bcache/nvm-pages.c | 164 ++++++++++++++++++++++++++++++++-- drivers/md/bcache/nvm-pages.h | 3 +- 2 files changed, 159 insertions(+), 8 deletions(-) diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c index 5d095d241483..74d08950c67c 100644 --- a/drivers/md/bcache/nvm-pages.c +++ b/drivers/md/bcache/nvm-pages.c @@ -52,7 +52,7 @@ static void release_nvm_set(struct bch_nvm_set *nvm_set) kfree(nvm_set); } -static struct page *nvm_vaddr_to_page(struct bch_nvm_namespace *ns, void *addr) +static struct page *nvm_vaddr_to_page(void *addr) { return virt_to_page(addr); } @@ -183,6 +183,155 @@ static void add_pgalloc_rec(struct bch_nvm_pgalloc_recs *recs, void *kaddr, int BUG_ON(i == recs->size); } +static inline void *nvm_end_addr(struct bch_nvm_namespace *ns) +{ + return ns->kaddr + (ns->pages_total << PAGE_SHIFT); +} + +static inline bool in_nvm_range(struct bch_nvm_namespace *ns, + void *start_addr, void *end_addr) +{ + return (start_addr >= ns->kaddr) && (end_addr < nvm_end_addr(ns)); +} + +static struct bch_nvm_namespace *find_nvm_by_addr(void *addr, int order) +{ + int i; + struct bch_nvm_namespace *ns; + + for (i = 0; i < only_set->total_namespaces_nr; i++) { + ns = only_set->nss[i]; + if (ns && in_nvm_range(ns, addr, addr + (1L << order))) + return ns; + } + return NULL; +} + +static int remove_pgalloc_rec(struct bch_nvm_pgalloc_recs *pgalloc_recs, int ns_nr, + void *kaddr, int order) +{ + struct bch_nvm_pages_owner_head *owner_head = pgalloc_recs->owner; + struct bch_nvm_pgalloc_recs *prev_recs, *sys_recs; + u64 pgoff = (unsigned long)kaddr >> PAGE_SHIFT; + struct bch_nvm_namespace *ns = only_set->nss[0]; + int i; + + prev_recs = pgalloc_recs; + sys_recs = ns->kaddr + BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET; + while (pgalloc_recs) { + for (i = 0; i < pgalloc_recs->size; i++) { + struct bch_pgalloc_rec *rec = &(pgalloc_recs->recs[i]); + + if (rec->pgoff == pgoff) { + WARN_ON(rec->order != order); + rec->pgoff = 0; + rec->order = 0; + pgalloc_recs->used--; + + if (pgalloc_recs->used == 0) { + int recs_pos = pgalloc_recs - sys_recs; + + if (pgalloc_recs == prev_recs) + owner_head->recs[ns_nr] = pgalloc_recs->next; + else + prev_recs->next = pgalloc_recs->next; + + pgalloc_recs->next = NULL; + pgalloc_recs->owner = NULL; + + bitmap_clear(ns->pgalloc_recs_bitmap, recs_pos, 1); + } + goto exit; + } + } + prev_recs = pgalloc_recs; + pgalloc_recs = pgalloc_recs->next; + } +exit: + return pgalloc_recs ?
0 : -ENOENT; +} + +static void __free_space(struct bch_nvm_namespace *ns, void *addr, int order) +{ + unsigned long add_pages = (1L << order); + pgoff_t pgoff; + struct page *page; + + page = nvm_vaddr_to_page(addr); + WARN_ON((!page) || (page->private != order)); + pgoff = page->index; + + while (order < BCH_MAX_ORDER - 1) { + struct page *buddy_page; + + pgoff_t buddy_pgoff = pgoff ^ (1L << order); + pgoff_t parent_pgoff = pgoff & ~(1L << order); + + if ((parent_pgoff + (1L << (order + 1)) > ns->pages_total)) + break; + + buddy_page = nvm_vaddr_to_page(nvm_pgoff_to_vaddr(ns, buddy_pgoff)); + WARN_ON(!buddy_page); + + if (PageBuddy(buddy_page) && (buddy_page->private == order)) { + list_del((struct list_head *)&buddy_page->zone_device_data); + __ClearPageBuddy(buddy_page); + pgoff = parent_pgoff; + order++; + continue; + } + break; + } + + page = nvm_vaddr_to_page(nvm_pgoff_to_vaddr(ns, pgoff)); + WARN_ON(!page); + list_add((struct list_head *)&page->zone_device_data, &ns->free_area[order]); + page->index = pgoff; + set_page_private(page, order); + __SetPageBuddy(page); + ns->free += add_pages; +} + +void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid) +{ + struct bch_nvm_namespace *ns; + struct bch_nvm_pages_owner_head *owner_head; + struct bch_nvm_pgalloc_recs *pgalloc_recs; + int r; + + mutex_lock(&only_set->lock); + + ns = find_nvm_by_addr(addr, order); + if (!ns) { + pr_err("can't find nvm_dev by kaddr %p\n", addr); + goto unlock; + } + + owner_head = find_owner_head(owner_uuid, false); + if (!owner_head) { + pr_err("can't found bch_nvm_pages_owner_head by(uuid=%s)\n", owner_uuid); + goto unlock; + } + + pgalloc_recs = find_nvm_pgalloc_recs(ns, owner_head, false); + if (!pgalloc_recs) { + pr_err("can't find bch_nvm_pgalloc_recs by(uuid=%s)\n", owner_uuid); + goto unlock; + } + + r = remove_pgalloc_rec(pgalloc_recs, ns->sb->this_namespace_nr, addr, order); + if (r < 0) { + pr_err("can't find bch_pgalloc_rec\n"); + goto unlock; + } + + __free_space(ns, addr, order); + +unlock: + mutex_unlock(&only_set->lock); +} +EXPORT_SYMBOL_GPL(bch_nvm_free_pages); + void *bch_nvm_alloc_pages(int order, const char *owner_uuid) { void *kaddr = NULL; @@ -217,7 +366,7 @@ void *bch_nvm_alloc_pages(int order, const char *owner_uuid) list_del(list); while (i != order) { - buddy_page = nvm_vaddr_to_page(ns, + buddy_page = nvm_vaddr_to_page( nvm_pgoff_to_vaddr(ns, page->index + (1L << (i - 1)))); set_page_private(buddy_page, i - 1); buddy_page->index = page->index + (1L << (i - 1)); @@ -301,7 +450,7 @@ static int init_owner_info(struct bch_nvm_namespace *ns) BUG_ON(rec->pgoff <= offset); /* init struct page: index/private */ - page = nvm_vaddr_to_page(ns, + page = nvm_vaddr_to_page( BCH_PGOFF_TO_KVADDR(rec->pgoff)); set_page_private(page, rec->order); @@ -340,11 +489,12 @@ static void init_nvm_free_space(struct bch_nvm_namespace *ns) break; } - page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, pgoff_start)); + page = nvm_vaddr_to_page(nvm_pgoff_to_vaddr(ns, pgoff_start)); page->index = pgoff_start; set_page_private(page, i); - __SetPageBuddy(page); - list_add((struct list_head *)&page->zone_device_data, &ns->free_area[i]); + + /* in order to update ns->free */ + __free_space(ns, nvm_pgoff_to_vaddr(ns, pgoff_start), i); pgoff_start += 1L << i; pages -= 1L << i; @@ -535,7 +685,7 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path) ns->page_size = ns->sb->page_size; ns->pages_offset = ns->sb->pages_offset; ns->pages_total = ns->sb->pages_total; - ns->free = 0; + 
ns->free = 0; /* increase by __free_space() */ ns->bdev = bdev; ns->nvm_set = only_set; mutex_init(&ns->lock); diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h index f2583723aca6..0ca699166855 100644 --- a/drivers/md/bcache/nvm-pages.h +++ b/drivers/md/bcache/nvm-pages.h @@ -63,6 +63,7 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path); int bch_nvm_init(void); void bch_nvm_exit(void); void *bch_nvm_alloc_pages(int order, const char *owner_uuid); +void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid); #else @@ -79,7 +80,7 @@ static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid) { return NULL; } - +static inline void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid) { } #endif /* CONFIG_BCACHE_NVM_PAGES */ From patchwork Tue Jun 15 05:49:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320631 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB53DC49361 for ; Tue, 15 Jun 2021 05:49:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A85B76141E for ; Tue, 15 Jun 2021 05:49:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230286AbhFOFvy (ORCPT ); Tue, 15 Jun 2021 01:51:54 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45658 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229520AbhFOFvx (ORCPT ); Tue, 15 Jun 2021 01:51:53 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 992801FD2A; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736188; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3kgOGOtFbni91YpCkPSMNm/9rRn/zSB0E64MCYStyWY=; b=i712REx7WkUcQW3TZgNHhk2VpxKX4CPfzDpcTuBG4fkEZnmWKQ4mGWtoZT64Wua5s2urO/ fYMByJp3jUG5Rv0kJPiZLJ3aqn2hj3IScbYqFxtIXIhR9j1iM9JC1k4l9096roAdNV4wyf HejunKPeWVOmvTHLwicBcIKsmlNXck4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736188; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3kgOGOtFbni91YpCkPSMNm/9rRn/zSB0E64MCYStyWY=; b=lzWdyabIDiFZK2NaaF9j5TyKN2SSV+c/QrX8h2nhtZEhpMSimmerADJXUj4LQLicm4r5H8 /Q4M1g7gB6UCm+BA== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id C3172A3BC3; Tue, 15 Jun 2021 05:49:46 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Jianpeng Ma , Qiaowei Ren , Coly Li Subject: [PATCH 08/14] bcache: get allocated pages from specific owner Date: Tue, 15 Jun 2021 
13:49:15 +0800 Message-Id: <20210615054921.101421-9-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Jianpeng Ma This patch implements bch_get_allocated_pages() of the buddy to be used to get allocated pages from specific owner. Signed-off-by: Jianpeng Ma Co-developed-by: Qiaowei Ren Signed-off-by: Qiaowei Ren Signed-off-by: Coly Li Reviewed-by: Hannes Reinecke --- drivers/md/bcache/nvm-pages.c | 6 ++++++ drivers/md/bcache/nvm-pages.h | 5 +++++ 2 files changed, 11 insertions(+) diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c index 74d08950c67c..42b0504d9564 100644 --- a/drivers/md/bcache/nvm-pages.c +++ b/drivers/md/bcache/nvm-pages.c @@ -397,6 +397,12 @@ void *bch_nvm_alloc_pages(int order, const char *owner_uuid) } EXPORT_SYMBOL_GPL(bch_nvm_alloc_pages); +struct bch_nvm_pages_owner_head *bch_get_allocated_pages(const char *owner_uuid) +{ + return find_owner_head(owner_uuid, false); +} +EXPORT_SYMBOL_GPL(bch_get_allocated_pages); + #define BCH_PGOFF_TO_KVADDR(pgoff) ((void *)((unsigned long)pgoff << PAGE_SHIFT)) static int init_owner_info(struct bch_nvm_namespace *ns) diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h index 0ca699166855..c763bf2e2721 100644 --- a/drivers/md/bcache/nvm-pages.h +++ b/drivers/md/bcache/nvm-pages.h @@ -64,6 +64,7 @@ int bch_nvm_init(void); void bch_nvm_exit(void); void *bch_nvm_alloc_pages(int order, const char *owner_uuid); void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid); +struct bch_nvm_pages_owner_head *bch_get_allocated_pages(const char *owner_uuid); #else @@ -81,6 +82,10 @@ static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid) return NULL; } static inline void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid) { } +static inline struct bch_nvm_pages_owner_head *bch_get_allocated_pages(const char *owner_uuid) +{ + return NULL; +} #endif /* CONFIG_BCACHE_NVM_PAGES */ From patchwork Tue Jun 15 05:49:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B7D4C49EA4 for ; Tue, 15 Jun 2021 05:49:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7238161410 for ; Tue, 15 Jun 2021 05:49:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229463AbhFOFv4 (ORCPT ); Tue, 15 Jun 2021 01:51:56 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57364 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230236AbhFOFvz (ORCPT ); Tue, 15 Jun 2021 01:51:55 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id CC3042199E; Tue, 15 Jun 2021 05:49:50 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RKY2SuCedQvRkq+r7VAHyvOA1CuXQQP9ZHUD2VsGEk0=; b=xURQbTvJIQ9/RLK1nRjaW18NqvhVlK0bbvV3mxulbVH7yhNcgDTrQ+3EYUkjKhOyFRHhWD +pAZjCHnKro8yYGliUW7gc17iJ9cYWB9J0OvTiStsaex2XtMePfTYyet569zYk7xdJVZ9Z dUM3E6UoQcK8AbNiC0TACmoFF9feP0U= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RKY2SuCedQvRkq+r7VAHyvOA1CuXQQP9ZHUD2VsGEk0=; b=hwQZFscdVJqHILCAwjKfQ1zuV3CAEWEDn97PVtRqqZX3iYtBEebRSZ6/0IudvGF526WLJu 3wR0Cbd8UgocybDw== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 0102FA3BA5; Tue, 15 Jun 2021 05:49:48 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 09/14] bcache: use bucket index to set GC_MARK_METADATA for journal buckets in bch_btree_gc_finish() Date: Tue, 15 Jun 2021 13:49:16 +0800 Message-Id: <20210615054921.101421-10-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Currently the meta data bucket locations on the cache device are still reserved after the meta data is stored on NVDIMM pages, to keep the meta data layout consistent temporarily. So these buckets are still marked as meta data by SET_GC_MARK() in bch_btree_gc_finish(). When BCH_FEATURE_INCOMPAT_NVDIMM_META is set, sb.d[] stores the linear addresses of NVDIMM pages and not bucket indexes anymore. Therefore we should avoid looking up bucket indexes from sb.d[], and directly use the bucket indexes from ca->sb.first_bucket to (ca->sb.first_bucket + ca->sb.njournal_buckets) for setting the gc mark of the journal buckets.
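Condensed, the marking logic described above amounts to the following sketch (not the patch itself; the helper name is hypothetical, and it assumes the usual bcache definitions of struct cache, SET_GC_MARK() and GC_MARK_METADATA):

static void mark_journal_buckets_sketch(struct cache *ca)
{
	unsigned int i;

	/*
	 * With BCH_FEATURE_INCOMPAT_NVDIMM_META set, sb.d[] may hold
	 * NVDIMM linear addresses rather than bucket indexes, so the
	 * journal buckets reserved on the cache device are marked by
	 * index: they occupy the fixed range
	 * [first_bucket, first_bucket + njournal_buckets).
	 */
	for (i = ca->sb.first_bucket;
	     i < ca->sb.first_bucket + ca->sb.njournal_buckets; i++)
		SET_GC_MARK(ca->buckets + i, GC_MARK_METADATA);
}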
Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren Reviewed-by: Hannes Reinecke --- drivers/md/bcache/btree.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 183a58c89377..e0d7135669ca 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -1761,8 +1761,10 @@ static void bch_btree_gc_finish(struct cache_set *c) ca = c->cache; ca->invalidate_needs_gc = 0; - for (k = ca->sb.d; k < ca->sb.d + ca->sb.keys; k++) - SET_GC_MARK(ca->buckets + *k, GC_MARK_METADATA); + /* Range [first_bucket, first_bucket + keys) is for journal buckets */ + for (i = ca->sb.first_bucket; + i < ca->sb.first_bucket + ca->sb.njournal_buckets; i++) + SET_GC_MARK(ca->buckets + i, GC_MARK_METADATA); for (k = ca->prio_buckets; k < ca->prio_buckets + prio_buckets(ca) * 2; k++) From patchwork Tue Jun 15 05:49:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320635 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DAB1C48BE5 for ; Tue, 15 Jun 2021 05:49:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 88EE66140C for ; Tue, 15 Jun 2021 05:49:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230344AbhFOFv6 (ORCPT ); Tue, 15 Jun 2021 01:51:58 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45668 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230297AbhFOFv5 (ORCPT ); Tue, 15 Jun 2021 01:51:57 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id F25061FD2A; Tue, 15 Jun 2021 05:49:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736192; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HBAZ0ZjuLQW/pckeEosVADDOYazldmp97Y5zmm0Pj0s=; b=FHbd41dh3BuUTGIuxezkr60iBHOjVh/egrPwfNLErNm1YiJV6S6ab4yVaGEekJd2g8W4gt uC9SNHGrGe1jkc1PMj5v7KGvGKvoCXI+1FCKjSaMTACdCyFtJDaacTyvjCSio/cAQp4JUF /xpMxNZmt2Pn5jkxtoEznRogMXUvoTg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736192; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HBAZ0ZjuLQW/pckeEosVADDOYazldmp97Y5zmm0Pj0s=; b=J6BMbYUdUdE+l4uQt0Yn9O8+5Y89aBeiFJBBuH5+b3nnanYeM905e1Q5iyu2Xv/RCKeWTB iRyu2ShgN+s59RBg== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 32B89A3BAD; Tue, 15 Jun 2021 05:49:50 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 10/14] bcache: add 
BCH_FEATURE_INCOMPAT_NVDIMM_META into incompat feature set Date: Tue, 15 Jun 2021 13:49:17 +0800 Message-Id: <20210615054921.101421-11-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch adds BCH_FEATURE_INCOMPAT_NVDIMM_META (value 0x0004) into the incompat feature set. When this bit is set by bcache-tools, it indicates bcache meta data should be stored on specific NVDIMM meta device. The bcache meta data mainly includes journal and btree nodes, when this bit is set in incompat feature set, bcache will ask the nvm-pages allocator for NVDIMM space to store the meta data. Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren Reviewed-by: Hannes Reinecke --- drivers/md/bcache/features.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/md/bcache/features.h b/drivers/md/bcache/features.h index d1c8fd3977fc..45d2508d5532 100644 --- a/drivers/md/bcache/features.h +++ b/drivers/md/bcache/features.h @@ -17,11 +17,19 @@ #define BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET 0x0001 /* real bucket size is (1 << bucket_size) */ #define BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE 0x0002 +/* store bcache meta data on nvdimm */ +#define BCH_FEATURE_INCOMPAT_NVDIMM_META 0x0004 #define BCH_FEATURE_COMPAT_SUPP 0 #define BCH_FEATURE_RO_COMPAT_SUPP 0 +#if defined(CONFIG_BCACHE_NVM_PAGES) +#define BCH_FEATURE_INCOMPAT_SUPP (BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET| \ + BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE| \ + BCH_FEATURE_INCOMPAT_NVDIMM_META) +#else #define BCH_FEATURE_INCOMPAT_SUPP (BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET| \ BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE) +#endif #define BCH_HAS_COMPAT_FEATURE(sb, mask) \ ((sb)->feature_compat & (mask)) @@ -89,6 +97,7 @@ static inline void bch_clear_feature_##name(struct cache_sb *sb) \ BCH_FEATURE_INCOMPAT_FUNCS(obso_large_bucket, OBSO_LARGE_BUCKET); BCH_FEATURE_INCOMPAT_FUNCS(large_bucket, LOG_LARGE_BUCKET_SIZE); +BCH_FEATURE_INCOMPAT_FUNCS(nvdimm_meta, NVDIMM_META); static inline bool bch_has_unknown_compat_features(struct cache_sb *sb) { From patchwork Tue Jun 15 05:49:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9E41C48BE8 for ; Tue, 15 Jun 2021 05:49:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B515161413 for ; Tue, 15 Jun 2021 05:49:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230297AbhFOFwA (ORCPT ); Tue, 15 Jun 2021 01:52:00 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:45678 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230360AbhFOFv7 (ORCPT ); Tue, 15 Jun 2021 01:51:59 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de 
(Postfix) with ESMTP id 3490C1FD2A; Tue, 15 Jun 2021 05:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736195; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Mk3/cDxQAgmfGogAUUYameJI02lw8sRtnnw7a1AL2ew=; b=hAvJ6jIJgO9XyYCbaOTMl90RO8yhhSUot+LZidbtF7Cd2G+Nzj/W7LNFFstHko/pDpdhC5 UBbHp8+71LjFmzDYTE7UVmfhTrKJNz3Rnnhe7gnNIVBlQe+LJzyDngf6u9p3FqPgrOz/2/ uMZwbRM3Oxk5l9IDNG6iy777kMF1Kcc= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736195; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Mk3/cDxQAgmfGogAUUYameJI02lw8sRtnnw7a1AL2ew=; b=i732GA7wJQVt8GUR/1imczo5EzOvhIoT8pDBEm0RmcYupIucXmWxTOmHIbLBqA/eUGXwKc nVNEnTn5iguVX/AQ== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 65977A3BAD; Tue, 15 Jun 2021 05:49:53 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 11/14] bcache: initialize bcache journal for NVDIMM meta device Date: Tue, 15 Jun 2021 13:49:18 +0800 Message-Id: <20210615054921.101421-12-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org The nvm-pages allocator may store and index the NVDIMM pages allocated for bcache journal. This patch adds the initialization to store bcache journal space on NVDIMM pages if BCH_FEATURE_INCOMPAT_NVDIMM_META bit is set by bcache-tools. If BCH_FEATURE_INCOMPAT_NVDIMM_META is set, get_nvdimm_journal_space() will return the linear address of NVDIMM pages for bcache journal, - If there is previously allocated space, find it from nvm-pages owner list and return to bch_journal_init(). - If there is no previously allocated space, require a new NVDIMM range from the nvm-pages allocator, and return it to bch_journal_init(). And in bch_journal_init(), keys in sb.d[] store the corresponding linear address from NVDIMM into sb.d[i].ptr[0] where 'i' is the bucket index to iterate all journal buckets. Later when bcache journaling code stores the journaling jset, the target NVDIMM linear address stored (and updated) in sb.d[i].ptr[0] can be used directly in memory copy from DRAM pages into NVDIMM pages. 
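Condensed, the two initialization modes described above look as follows (a simplified sketch of bch_journal_init() and __bch_journal_nvdimm_init() from the diff below; journal_nvm_base stands for the linear address returned by get_nvdimm_journal_space(), and the helper name is hypothetical):

static void journal_slots_init_sketch(struct cache *ca, void *journal_nvm_base)
{
	int i;

	if (!bch_has_feature_nvdimm_meta(&ca->sb)) {
		/* Legacy layout: sb.d[i] is a bucket index on the cache device. */
		for (i = 0; i < ca->sb.keys; i++)
			ca->sb.d[i] = ca->sb.first_bucket + i;
	} else {
		/* NVDIMM meta: sb.d[i] is a linear address into the NVDIMM pages. */
		for (i = 0; i < ca->sb.keys; i++)
			ca->sb.d[i] =
				(u64)(journal_nvm_base + (ca->sb.bucket_size * i));
	}
}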
Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren --- drivers/md/bcache/journal.c | 105 ++++++++++++++++++++++++++++++++++++ drivers/md/bcache/journal.h | 2 +- drivers/md/bcache/super.c | 16 +++--- 3 files changed, 115 insertions(+), 8 deletions(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 61bd79babf7a..32599d2ff5d2 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -9,6 +9,8 @@ #include "btree.h" #include "debug.h" #include "extents.h" +#include "nvm-pages.h" +#include "features.h" #include @@ -982,3 +984,106 @@ int bch_journal_alloc(struct cache_set *c) return 0; } + +#if defined(CONFIG_BCACHE_NVM_PAGES) + +static void *find_journal_nvm_base(struct bch_nvm_pages_owner_head *owner_list, + struct cache *ca) +{ + unsigned long addr = 0; + struct bch_nvm_pgalloc_recs *recs_list = owner_list->recs[0]; + + while (recs_list) { + struct bch_pgalloc_rec *rec; + unsigned long jnl_pgoff; + int i; + + jnl_pgoff = ((unsigned long)ca->sb.d[0]) >> PAGE_SHIFT; + rec = recs_list->recs; + for (i = 0; i < recs_list->used; i++) { + if (rec->pgoff == jnl_pgoff) + break; + rec++; + } + if (i < recs_list->used) { + addr = rec->pgoff << PAGE_SHIFT; + break; + } + recs_list = recs_list->next; + } + return (void *)addr; +} + +static void *get_nvdimm_journal_space(struct cache *ca) +{ + struct bch_nvm_pages_owner_head *owner_list = NULL; + void *ret = NULL; + int order; + + owner_list = bch_get_allocated_pages(ca->sb.set_uuid); + if (owner_list) { + ret = find_journal_nvm_base(owner_list, ca); + if (ret) + goto found; + } + + order = ilog2(ca->sb.bucket_size * + ca->sb.njournal_buckets / PAGE_SECTORS); + ret = bch_nvm_alloc_pages(order, ca->sb.set_uuid); + if (ret) + memset(ret, 0, (1 << order) * PAGE_SIZE); + +found: + return ret; +} + +static int __bch_journal_nvdimm_init(struct cache *ca) +{ + int i, ret = 0; + void *journal_nvm_base = NULL; + + journal_nvm_base = get_nvdimm_journal_space(ca); + if (!journal_nvm_base) { + pr_err("Failed to get journal space from nvdimm\n"); + ret = -1; + goto out; + } + + /* Iniialized and reloaded from on-disk super block already */ + if (ca->sb.d[0] != 0) + goto out; + + for (i = 0; i < ca->sb.keys; i++) + ca->sb.d[i] = + (u64)(journal_nvm_base + (ca->sb.bucket_size * i)); + +out: + return ret; +} + +#else /* CONFIG_BCACHE_NVM_PAGES */ + +static int __bch_journal_nvdimm_init(struct cache *ca) +{ + return -1; +} + +#endif /* CONFIG_BCACHE_NVM_PAGES */ + +int bch_journal_init(struct cache_set *c) +{ + int i, ret = 0; + struct cache *ca = c->cache; + + ca->sb.keys = clamp_t(int, ca->sb.nbuckets >> 7, + 2, SB_JOURNAL_BUCKETS); + + if (!bch_has_feature_nvdimm_meta(&ca->sb)) { + for (i = 0; i < ca->sb.keys; i++) + ca->sb.d[i] = ca->sb.first_bucket + i; + } else { + ret = __bch_journal_nvdimm_init(ca); + } + + return ret; +} diff --git a/drivers/md/bcache/journal.h b/drivers/md/bcache/journal.h index f2ea34d5f431..e3a7fa5a8fda 100644 --- a/drivers/md/bcache/journal.h +++ b/drivers/md/bcache/journal.h @@ -179,7 +179,7 @@ void bch_journal_mark(struct cache_set *c, struct list_head *list); void bch_journal_meta(struct cache_set *c, struct closure *cl); int bch_journal_read(struct cache_set *c, struct list_head *list); int bch_journal_replay(struct cache_set *c, struct list_head *list); - +int bch_journal_init(struct cache_set *c); void bch_journal_free(struct cache_set *c); int bch_journal_alloc(struct cache_set *c); diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index ce22aefb1352..cce0f6bf0944 
100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -147,10 +147,15 @@ static const char *read_super_common(struct cache_sb *sb, struct block_device * goto err; err = "Journal buckets not sequential"; +#if defined(CONFIG_BCACHE_NVM_PAGES) + if (!bch_has_feature_nvdimm_meta(sb)) { +#endif for (i = 0; i < sb->keys; i++) if (sb->d[i] != sb->first_bucket + i) goto err; - +#ifdef CONFIG_BCACHE_NVM_PAGES + } /* bch_has_feature_nvdimm_meta */ +#endif err = "Too many journal buckets"; if (sb->first_bucket + sb->keys > sb->nbuckets) goto err; @@ -2072,14 +2077,11 @@ static int run_cache_set(struct cache_set *c) if (bch_journal_replay(c, &journal)) goto err; } else { - unsigned int j; - pr_notice("invalidating existing data\n"); - ca->sb.keys = clamp_t(int, ca->sb.nbuckets >> 7, - 2, SB_JOURNAL_BUCKETS); - for (j = 0; j < ca->sb.keys; j++) - ca->sb.d[j] = ca->sb.first_bucket + j; + err = "error initializing journal"; + if (bch_journal_init(c)) + goto err; bch_initial_gc_finish(c); From patchwork Tue Jun 15 05:49:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E17ABC49EA4 for ; Tue, 15 Jun 2021 05:49:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D0CA361410 for ; Tue, 15 Jun 2021 05:49:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230303AbhFOFwC (ORCPT ); Tue, 15 Jun 2021 01:52:02 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57374 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230329AbhFOFwC (ORCPT ); Tue, 15 Jun 2021 01:52:02 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 66D592199E; Tue, 15 Jun 2021 05:49:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736197; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bHOYnErxEwK+iSc8zW5hZs8/76WYbhB2kLHTmjEiVyw=; b=kORNYoRiRcTJZV9oXUW4nNvYLLeIGOwK2BEsfpnC9lu/RBydOP6Bh/EHeJxKBPz+AxRHuA fLc+eu/3seSXktCrmRmMuCBkByGvUrt2Rr1/7f/KL/CTeljvLEZdxW/iyL86KBxbnMDUoR 6YW0Ak5DuWTaTWu1mdJ9ehReHQlSsU0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736197; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bHOYnErxEwK+iSc8zW5hZs8/76WYbhB2kLHTmjEiVyw=; b=f8kGHTOCgHRrKVJoXl1B9wNin/XenbK9HnK+zdHvJnbJLQ4N67MPeEVsHN/DGD+w/PvN69 6clZ2m6I/ipqFKAg== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 9C275A3BB7; Tue, 15 Jun 2021 05:49:55 +0000 (UTC) From: Coly Li To: 
axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 12/14] bcache: support storing bcache journal into NVDIMM meta device Date: Tue, 15 Jun 2021 13:49:19 +0800 Message-Id: <20210615054921.101421-13-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch implements two methods to store the bcache journal, 1) __journal_write_unlocked() for block interface device The legacy method to compose a bio and issue the jset bio to the cache device (e.g. SSD). c->journal.key.ptr[0] indicates the LBA on the cache device to store the journal jset. 2) __journal_nvdimm_write_unlocked() for memory interface NVDIMM Use the memory interface to access NVDIMM pages and store the jset by memcpy_flushcache(). c->journal.key.ptr[0] indicates the linear address from the NVDIMM pages to store the journal jset. For legacy configurations without an NVDIMM meta device, journal I/O is handled by __journal_write_unlocked() with the existing code logic. If the NVDIMM meta device is used (by bcache-tools), the journal I/O will be handled by __journal_nvdimm_write_unlocked() and go into the NVDIMM pages. And when the NVDIMM meta device is used, sb.d[] stores the linear addresses from NVDIMM pages (no longer bucket indexes), so in journal_reclaim() the journaling location in c->journal.key.ptr[0] should also be updated with the linear address from NVDIMM pages (no longer an LBA combined from sector offset and bucket index). Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren Reviewed-by: Hannes Reinecke --- drivers/md/bcache/journal.c | 119 ++++++++++++++++++++++++---------- drivers/md/bcache/nvm-pages.h | 1 + drivers/md/bcache/super.c | 28 +++++++- 3 files changed, 110 insertions(+), 38 deletions(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 32599d2ff5d2..03ecedf813b0 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -596,6 +596,8 @@ static void do_journal_discard(struct cache *ca) return; } + BUG_ON(bch_has_feature_nvdimm_meta(&ca->sb)); + switch (atomic_read(&ja->discard_in_flight)) { case DISCARD_IN_FLIGHT: return; @@ -661,9 +663,13 @@ static void journal_reclaim(struct cache_set *c) goto out; ja->cur_idx = next; - k->ptr[0] = MAKE_PTR(0, - bucket_to_sector(c, ca->sb.d[ja->cur_idx]), - ca->sb.nr_this_dev); + if (!bch_has_feature_nvdimm_meta(&ca->sb)) + k->ptr[0] = MAKE_PTR(0, + bucket_to_sector(c, ca->sb.d[ja->cur_idx]), + ca->sb.nr_this_dev); + else + k->ptr[0] = ca->sb.d[ja->cur_idx]; + atomic_long_inc(&c->reclaimed_journal_buckets); bkey_init(k); @@ -729,46 +735,21 @@ static void journal_write_unlock(struct closure *cl) spin_unlock(&c->journal.lock); } -static void journal_write_unlocked(struct closure *cl) + +static void __journal_write_unlocked(struct cache_set *c) __releases(c->journal.lock) { - struct cache_set *c = container_of(cl, struct cache_set, journal.io); - struct cache *ca = c->cache; - struct journal_write *w = c->journal.cur; struct bkey *k = &c->journal.key; - unsigned int i, sectors = set_blocks(w->data, block_bytes(ca)) * - ca->sb.block_size; - + struct journal_write *w = c->journal.cur; + struct closure *cl = &c->journal.io; + struct cache *ca = c->cache; struct bio *bio; struct bio_list list; + unsigned int i, sectors = set_blocks(w->data, block_bytes(ca)) * + ca->sb.block_size; bio_list_init(&list); - if
(!w->need_write) { - closure_return_with_destructor(cl, journal_write_unlock); - return; - } else if (journal_full(&c->journal)) { - journal_reclaim(c); - spin_unlock(&c->journal.lock); - - btree_flush_write(c); - continue_at(cl, journal_write, bch_journal_wq); - return; - } - - c->journal.blocks_free -= set_blocks(w->data, block_bytes(ca)); - - w->data->btree_level = c->root->level; - - bkey_copy(&w->data->btree_root, &c->root->key); - bkey_copy(&w->data->uuid_bucket, &c->uuid_bucket); - - w->data->prio_bucket[ca->sb.nr_this_dev] = ca->prio_buckets[0]; - w->data->magic = jset_magic(&ca->sb); - w->data->version = BCACHE_JSET_VERSION; - w->data->last_seq = last_seq(&c->journal); - w->data->csum = csum_set(w->data); - for (i = 0; i < KEY_PTRS(k); i++) { ca = c->cache; bio = &ca->journal.bio; @@ -793,7 +774,6 @@ static void journal_write_unlocked(struct closure *cl) ca->journal.seq[ca->journal.cur_idx] = w->data->seq; } - /* If KEY_PTRS(k) == 0, this jset gets lost in air */ BUG_ON(i == 0); @@ -805,6 +785,73 @@ static void journal_write_unlocked(struct closure *cl) while ((bio = bio_list_pop(&list))) closure_bio_submit(c, bio, cl); +} + +#if defined(CONFIG_BCACHE_NVM_PAGES) + +static void __journal_nvdimm_write_unlocked(struct cache_set *c) + __releases(c->journal.lock) +{ + struct journal_write *w = c->journal.cur; + struct cache *ca = c->cache; + unsigned int sectors; + + sectors = set_blocks(w->data, block_bytes(ca)) * ca->sb.block_size; + atomic_long_add(sectors, &ca->meta_sectors_written); + + memcpy_flushcache((void *)c->journal.key.ptr[0], w->data, sectors << 9); + + c->journal.key.ptr[0] += sectors << 9; + ca->journal.seq[ca->journal.cur_idx] = w->data->seq; + + atomic_dec_bug(&fifo_back(&c->journal.pin)); + bch_journal_next(&c->journal); + journal_reclaim(c); + + spin_unlock(&c->journal.lock); +} + +#else /* CONFIG_BCACHE_NVM_PAGES */ + +static void __journal_nvdimm_write_unlocked(struct cache_set *c) { } + +#endif /* CONFIG_BCACHE_NVM_PAGES */ + +static void journal_write_unlocked(struct closure *cl) +{ + struct cache_set *c = container_of(cl, struct cache_set, journal.io); + struct cache *ca = c->cache; + struct journal_write *w = c->journal.cur; + + if (!w->need_write) { + closure_return_with_destructor(cl, journal_write_unlock); + return; + } else if (journal_full(&c->journal)) { + journal_reclaim(c); + spin_unlock(&c->journal.lock); + + btree_flush_write(c); + continue_at(cl, journal_write, bch_journal_wq); + return; + } + + c->journal.blocks_free -= set_blocks(w->data, block_bytes(ca)); + + w->data->btree_level = c->root->level; + + bkey_copy(&w->data->btree_root, &c->root->key); + bkey_copy(&w->data->uuid_bucket, &c->uuid_bucket); + + w->data->prio_bucket[ca->sb.nr_this_dev] = ca->prio_buckets[0]; + w->data->magic = jset_magic(&ca->sb); + w->data->version = BCACHE_JSET_VERSION; + w->data->last_seq = last_seq(&c->journal); + w->data->csum = csum_set(w->data); + + if (!bch_has_feature_nvdimm_meta(&ca->sb)) + __journal_write_unlocked(c); + else + __journal_nvdimm_write_unlocked(c); continue_at(cl, journal_write_done, NULL); } diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h index c763bf2e2721..736a661777b7 100644 --- a/drivers/md/bcache/nvm-pages.h +++ b/drivers/md/bcache/nvm-pages.h @@ -5,6 +5,7 @@ #if defined(CONFIG_BCACHE_NVM_PAGES) #include +#include #endif /* CONFIG_BCACHE_NVM_PAGES */ /* diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index cce0f6bf0944..4d6666d03aa7 100644 --- a/drivers/md/bcache/super.c +++ 
b/drivers/md/bcache/super.c @@ -1686,7 +1686,32 @@ void bch_cache_set_release(struct kobject *kobj) static void cache_set_free(struct closure *cl) { struct cache_set *c = container_of(cl, struct cache_set, cl); - struct cache *ca; + struct cache *ca = c->cache; + +#if defined(CONFIG_BCACHE_NVM_PAGES) + /* Flush cache if journal stored in NVDIMM */ + if (ca && bch_has_feature_nvdimm_meta(&ca->sb)) { + unsigned long bucket_size = ca->sb.bucket_size; + int i; + + for (i = 0; i < ca->sb.keys; i++) { + unsigned long offset = 0; + unsigned int len = round_down(UINT_MAX, 2); + + if ((void *)ca->sb.d[i] == NULL) + continue; + + while (bucket_size > 0) { + if (len > bucket_size) + len = bucket_size; + arch_invalidate_pmem( + (void *)(ca->sb.d[i] + offset), len); + offset += len; + bucket_size -= len; + } + } + } +#endif /* CONFIG_BCACHE_NVM_PAGES */ debugfs_remove(c->debug); @@ -1698,7 +1723,6 @@ static void cache_set_free(struct closure *cl) bch_bset_sort_state_free(&c->sort); free_pages((unsigned long) c->uuids, ilog2(meta_bucket_pages(&c->cache->sb))); - ca = c->cache; if (ca) { ca->set = NULL; c->cache = NULL; From patchwork Tue Jun 15 05:49:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320641 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D22EBC48BE8 for ; Tue, 15 Jun 2021 05:50:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B880B6140C for ; Tue, 15 Jun 2021 05:50:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230349AbhFOFwG (ORCPT ); Tue, 15 Jun 2021 01:52:06 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57384 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230360AbhFOFwE (ORCPT ); Tue, 15 Jun 2021 01:52:04 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 970E92199E; Tue, 15 Jun 2021 05:49:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736199; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FZu5PZoGiJu1s3gSxse3FZD6RNCgqcNg5dhUg6wKbbE=; b=ujdI6jx5IS4qVr89E1kzKTDV8ZkTqBbsD9PmBlikYiZZI6KCz4zblKsUuu0he0c4CZuRQ/ 1CS0y2t5cqOLLd+d35k9G9Q8wJUrp3jzv4UEZUP6BbQRE5kXu6MKyWEvsLlmVpyC4Pm1s7 5Bhkq97MEm3cwvcWVQFIVTjMDKL4JrI= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736199; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FZu5PZoGiJu1s3gSxse3FZD6RNCgqcNg5dhUg6wKbbE=; b=dX5XLpvlzl1B2UpXzuJYT24fE1W9Or43BXEMZAvEC0JQCj7REGsyyTGBEP2mIm4F01nxMz QziubqBwt5uJvnDg== Received: from localhost.localdomain (unknown [10.163.16.22]) by 
relay2.suse.de (Postfix) with ESMTP id CD5A7A3BB7; Tue, 15 Jun 2021 05:49:57 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 13/14] bcache: read jset from NVDIMM pages for journal replay Date: Tue, 15 Jun 2021 13:49:20 +0800 Message-Id: <20210615054921.101421-14-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch implements two methods to read jset from media for journal replay, - __jnl_rd_bkt() for block device This is the legacy method to read jset via block device interface. - __jnl_rd_nvm_bkt() for NVDIMM This is the method to read jset from NVDIMM memory interface, a.k.a memcopy() from NVDIMM pages to DRAM pages. If BCH_FEATURE_INCOMPAT_NVDIMM_META is set in incompat feature set, during running cache set, journal_read_bucket() will read the journal content from NVDIMM by __jnl_rd_nvm_bkt(). The linear addresses of NVDIMM pages to read jset are stored in sb.d[SB_JOURNAL_BUCKETS], which were initialized and maintained in previous runs of the cache set. A thing should be noticed is, when bch_journal_read() is called, the linear address of NVDIMM pages is not loaded and initialized yet, it is necessary to call __bch_journal_nvdimm_init() before reading the jset from NVDIMM pages. Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren --- drivers/md/bcache/journal.c | 93 +++++++++++++++++++++++++++---------- 1 file changed, 69 insertions(+), 24 deletions(-) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 03ecedf813b0..23e5ccf125df 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -34,60 +34,96 @@ static void journal_read_endio(struct bio *bio) closure_put(cl); } +static struct jset *__jnl_rd_bkt(struct cache *ca, unsigned int bkt_idx, + unsigned int len, unsigned int offset, + struct closure *cl) +{ + sector_t bucket = bucket_to_sector(ca->set, ca->sb.d[bkt_idx]); + struct bio *bio = &ca->journal.bio; + struct jset *data = ca->set->journal.w[0].data; + + bio_reset(bio); + bio->bi_iter.bi_sector = bucket + offset; + bio_set_dev(bio, ca->bdev); + bio->bi_iter.bi_size = len << 9; + bio->bi_end_io = journal_read_endio; + bio->bi_private = cl; + bio_set_op_attrs(bio, REQ_OP_READ, 0); + bch_bio_map(bio, data); + + closure_bio_submit(ca->set, bio, cl); + closure_sync(cl); + + /* Indeed journal.w[0].data */ + return data; +} + +#if defined(CONFIG_BCACHE_NVM_PAGES) + +static struct jset *__jnl_rd_nvm_bkt(struct cache *ca, unsigned int bkt_idx, + unsigned int len, unsigned int offset) +{ + void *jset_addr = (void *)ca->sb.d[bkt_idx] + (offset << 9); + struct jset *data = ca->set->journal.w[0].data; + + memcpy(data, jset_addr, len << 9); + + /* Indeed journal.w[0].data */ + return data; +} + +#else /* CONFIG_BCACHE_NVM_PAGES */ + +static struct jset *__jnl_rd_nvm_bkt(struct cache *ca, unsigned int bkt_idx, + unsigned int len, unsigned int offset) +{ + return NULL; +} + +#endif /* CONFIG_BCACHE_NVM_PAGES */ + static int journal_read_bucket(struct cache *ca, struct list_head *list, - unsigned int bucket_index) + unsigned int bucket_idx) { struct journal_device *ja = &ca->journal; - struct bio *bio = &ja->bio; struct journal_replay *i; - struct jset *j, *data = ca->set->journal.w[0].data; + struct jset *j; struct closure cl; unsigned int 
len, left, offset = 0; int ret = 0; - sector_t bucket = bucket_to_sector(ca->set, ca->sb.d[bucket_index]); closure_init_stack(&cl); - pr_debug("reading %u\n", bucket_index); + pr_debug("reading %u\n", bucket_idx); while (offset < ca->sb.bucket_size) { reread: left = ca->sb.bucket_size - offset; len = min_t(unsigned int, left, PAGE_SECTORS << JSET_BITS); - bio_reset(bio); - bio->bi_iter.bi_sector = bucket + offset; - bio_set_dev(bio, ca->bdev); - bio->bi_iter.bi_size = len << 9; - - bio->bi_end_io = journal_read_endio; - bio->bi_private = &cl; - bio_set_op_attrs(bio, REQ_OP_READ, 0); - bch_bio_map(bio, data); - - closure_bio_submit(ca->set, bio, &cl); - closure_sync(&cl); + if (!bch_has_feature_nvdimm_meta(&ca->sb)) + j = __jnl_rd_bkt(ca, bucket_idx, len, offset, &cl); + else + j = __jnl_rd_nvm_bkt(ca, bucket_idx, len, offset); /* This function could be simpler now since we no longer write * journal entries that overlap bucket boundaries; this means * the start of a bucket will always have a valid journal entry * if it has any journal entries at all. */ - - j = data; while (len) { struct list_head *where; size_t blocks, bytes = set_bytes(j); if (j->magic != jset_magic(&ca->sb)) { - pr_debug("%u: bad magic\n", bucket_index); + pr_debug("%u: bad magic\n", bucket_idx); return ret; } if (bytes > left << 9 || bytes > PAGE_SIZE << JSET_BITS) { pr_info("%u: too big, %zu bytes, offset %u\n", - bucket_index, bytes, offset); + bucket_idx, bytes, offset); return ret; } @@ -96,7 +132,7 @@ reread: left = ca->sb.bucket_size - offset; if (j->csum != csum_set(j)) { pr_info("%u: bad csum, %zu bytes, offset %u\n", - bucket_index, bytes, offset); + bucket_idx, bytes, offset); return ret; } @@ -158,8 +194,8 @@ reread: left = ca->sb.bucket_size - offset; list_add(&i->list, where); ret = 1; - if (j->seq > ja->seq[bucket_index]) - ja->seq[bucket_index] = j->seq; + if (j->seq > ja->seq[bucket_idx]) + ja->seq[bucket_idx] = j->seq; next_set: offset += blocks * ca->sb.block_size; len -= blocks * ca->sb.block_size; @@ -170,6 +206,8 @@ reread: left = ca->sb.bucket_size - offset; return ret; } +static int __bch_journal_nvdimm_init(struct cache *ca); + int bch_journal_read(struct cache_set *c, struct list_head *list) { #define read_bucket(b) \ @@ -188,6 +226,13 @@ int bch_journal_read(struct cache_set *c, struct list_head *list) unsigned int i, l, r, m; uint64_t seq; + /* + * Linear addresses of NVDIMM pages for journaling is not + * initialized yet, do it before read jset from NVDIMM pages. 
+ */ + if (bch_has_feature_nvdimm_meta(&ca->sb)) + __bch_journal_nvdimm_init(ca); + bitmap_zero(bitmap, SB_JOURNAL_BUCKETS); pr_debug("%u journal buckets\n", ca->sb.njournal_buckets); From patchwork Tue Jun 15 05:49:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Coly Li X-Patchwork-Id: 12320643 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 951E3C49361 for ; Tue, 15 Jun 2021 05:50:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 78B9461412 for ; Tue, 15 Jun 2021 05:50:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229918AbhFOFwI (ORCPT ); Tue, 15 Jun 2021 01:52:08 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]:57394 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230331AbhFOFwG (ORCPT ); Tue, 15 Jun 2021 01:52:06 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id C9BE8219C5; Tue, 15 Jun 2021 05:50:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1623736201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=C3Np6UpmqYzZ4kH5SUJzYj4UbsK192DUNvW0ehDEFvc=; b=0sdVN7Hz8iMI0Vf1UUB6ZRDa1Y58PjjZ2+7kehlOoG7SmjWCDwtmagBk14T+27yIVM4wpQ ltLn00X3kKRnNDA4p1bYMWpTgise7tFN6HLgcYWI1ahXzk6sKlArSAXQtgMf6I/c/cxj5W wO29l5Fd7ldSiSQFCQudC8iMg14/uTk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1623736201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=C3Np6UpmqYzZ4kH5SUJzYj4UbsK192DUNvW0ehDEFvc=; b=WZ1JcYJie3OGdDArDpcqNs6cfA64enkm+26hb8dGdbuVnLImafTmQ6V+FA5fqNXyKTw/s5 hjuekLxXWrKuy5Ag== Received: from localhost.localdomain (unknown [10.163.16.22]) by relay2.suse.de (Postfix) with ESMTP id 0B026A3BB7; Tue, 15 Jun 2021 05:49:59 +0000 (UTC) From: Coly Li To: axboe@kernel.dk Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Coly Li , Jianpeng Ma , Qiaowei Ren Subject: [PATCH 14/14] bcache: add sysfs interface register_nvdimm_meta to register NVDIMM meta device Date: Tue, 15 Jun 2021 13:49:21 +0800 Message-Id: <20210615054921.101421-15-colyli@suse.de> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210615054921.101421-1-colyli@suse.de> References: <20210615054921.101421-1-colyli@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch adds a sysfs interface register_nvdimm_meta to register NVDIMM meta device. The sysfs interface file only shows up when CONFIG_BCACHE_NVM_PAGES=y. 
Then a NVDIMM name space formatted by bcache-tools can be registered into bcache by e.g., echo /dev/pmem0 > /sys/fs/bcache/register_nvdimm_meta Signed-off-by: Coly Li Cc: Jianpeng Ma Cc: Qiaowei Ren Reviewed-by: Hannes Reinecke --- drivers/md/bcache/super.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 4d6666d03aa7..9d506d053548 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -2439,10 +2439,18 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, static ssize_t bch_pending_bdevs_cleanup(struct kobject *k, struct kobj_attribute *attr, const char *buffer, size_t size); +#if defined(CONFIG_BCACHE_NVM_PAGES) +static ssize_t register_nvdimm_meta(struct kobject *k, + struct kobj_attribute *attr, + const char *buffer, size_t size); +#endif kobj_attribute_write(register, register_bcache); kobj_attribute_write(register_quiet, register_bcache); kobj_attribute_write(pendings_cleanup, bch_pending_bdevs_cleanup); +#if defined(CONFIG_BCACHE_NVM_PAGES) +kobj_attribute_write(register_nvdimm_meta, register_nvdimm_meta); +#endif static bool bch_is_open_backing(dev_t dev) { @@ -2556,6 +2564,24 @@ static void register_device_async(struct async_reg_args *args) queue_delayed_work(system_wq, &args->reg_work, 10); } +#if defined(CONFIG_BCACHE_NVM_PAGES) +static ssize_t register_nvdimm_meta(struct kobject *k, struct kobj_attribute *attr, + const char *buffer, size_t size) +{ + ssize_t ret = size; + + struct bch_nvm_namespace *ns = bch_register_namespace(buffer); + + if (IS_ERR(ns)) { + pr_err("register nvdimm namespace %s for meta device failed.\n", + buffer); + ret = -EINVAL; + } + + return ret; +} +#endif + static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, const char *buffer, size_t size) { @@ -2898,6 +2924,9 @@ static int __init bcache_init(void) static const struct attribute *files[] = { &ksysfs_register.attr, &ksysfs_register_quiet.attr, +#if defined(CONFIG_BCACHE_NVM_PAGES) + &ksysfs_register_nvdimm_meta.attr, +#endif &ksysfs_pendings_cleanup.attr, NULL };
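Once an NVDIMM namespace has been registered through the sysfs file above, in-kernel users reach the allocator through the interfaces exported earlier in this series. A minimal, hypothetical usage sketch follows (the owner UUID and the surrounding function are placeholders, not part of these patches):

/* Placeholder owner UUID; real callers pass their own 16-byte UUID,
 * e.g. the cache set UUID used by the journal code. */
static const char demo_owner_uuid[16] = "nvm-pages-demo";

static void nvm_pages_usage_sketch(void)
{
	void *buf;

	/* order 2: four contiguous NVDIMM pages for this owner */
	buf = bch_nvm_alloc_pages(2, demo_owner_uuid);
	if (!buf)
		return;

	/* Writes must be flushed explicitly to be persistent. */
	memcpy_flushcache(buf, "hello", 5);

	/*
	 * The allocation is recorded persistently under demo_owner_uuid,
	 * so after a reboot it can be found again via
	 * bch_get_allocated_pages(demo_owner_uuid), or released as below.
	 */
	bch_nvm_free_pages(buf, 2, demo_owner_uuid);
}

Note that registration through sysfs only makes the namespace visible to the nvm-pages allocator; whether a cache set actually places its journal there is still controlled by the BCH_FEATURE_INCOMPAT_NVDIMM_META bit written by bcache-tools.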