From patchwork Thu Nov 19 16:49:23 2020
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11918343
From: Hannes Reinecke
To: Mike Snitzer
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com, Christoph Hellwig, Sergei Shtepa
Date: Thu, 19 Nov 2020 17:49:23 +0100
Message-Id: <20201119164924.74401-2-hare@suse.de>
In-Reply-To: <20201119164924.74401-1-hare@suse.de>
References: <20201119164924.74401-1-hare@suse.de>
Subject: [dm-devel] [PATCH 1/2] blk_interposer - Block Layer Interposer

From: Sergei Shtepa

The block layer interposer allows bio requests to be intercepted.
This makes it possible to connect device mapper and other kernel
modules to the block device stack on the fly.

Signed-off-by: Sergei Shtepa
Signed-off-by: Hannes Reinecke
---
 block/blk-core.c          | 34 +++++++++++++++++++++++++++++
 block/genhd.c             | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blk_types.h |  6 ++++--
 include/linux/genhd.h     | 19 ++++++++++++++++
 4 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2db8bda43b6e..130f0124d939 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1030,6 +1030,37 @@ static blk_qc_t __submit_bio_noacct_mq(struct bio *bio)
 	return ret;
 }
 
+static blk_qc_t __submit_bio_interposed(struct bio *bio)
+{
+	struct bio_list bio_list[2] = { };
+	blk_qc_t ret = BLK_QC_T_NONE;
+
+	current->bio_list = bio_list;
+	if (likely(bio_queue_enter(bio) == 0)) {
+		struct gendisk *disk = bio->bi_disk;
+
+		bio_set_flag(bio, BIO_INTERPOSED);
+		if (likely(blk_has_interposer(disk))) {
+			struct blk_interposer *ip = disk->interposer;
+			ret = ip->ip_submit_bio(ip, bio);
+
+			if (ret == BLK_QC_T_NONE)
+				blk_queue_exit(disk->queue);
+		} else {
+			/* interposer was removed */
+			bio_list_add(&current->bio_list[0], bio);
+			blk_queue_exit(bio->bi_disk->queue);
+		}
+	}
+	current->bio_list = NULL;
+
+	/* Resubmit remaining bios */
+	while ((bio = bio_list_pop(&bio_list[0])))
+		ret = submit_bio_noacct(bio);
+
+	return ret;
+}
+
 /**
  * submit_bio_noacct - re-submit a bio to the block device layer for I/O
  * @bio: The bio describing the location in memory and on the device.
@@ -1055,6 +1086,9 @@ blk_qc_t submit_bio_noacct(struct bio *bio)
 		return BLK_QC_T_NONE;
 	}
 
+	if (blk_has_interposer(bio->bi_disk) &&
+	    !bio_flagged(bio, BIO_INTERPOSED))
+		return __submit_bio_interposed(bio);
 	if (!bio->bi_disk->fops->submit_bio)
 		return __submit_bio_noacct_mq(bio);
 	return __submit_bio_noacct(bio);
diff --git a/block/genhd.c b/block/genhd.c
index 9387f050c248..17b43e2ed06a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -2364,3 +2364,58 @@ static void disk_release_events(struct gendisk *disk)
 	WARN_ON_ONCE(disk->ev && disk->ev->block != 1);
 	kfree(disk->ev);
 }
+
+/**
+ * blk_interposer_attach - Attach interposer to disk
+ * @disk: target disk
+ * @interposer: block device interposer
+ *
+ * Returns:
+ *     -EINVAL if @interposer is NULL.
+ *     -ENODEV if interposer is not initialized,
+ *     -EBUSY if the block device already has interposer.
+ */
+int blk_interposer_attach(struct gendisk *disk,
+			  struct blk_interposer *interposer)
+{
+	int ret = 0;
+
+	if (!interposer)
+		return -EINVAL;
+
+	blk_mq_freeze_queue(disk->queue);
+	blk_mq_quiesce_queue(disk->queue);
+	if (blk_has_interposer(disk))
+		ret = -EBUSY;
+	else
+		disk->interposer = interposer;
+	blk_mq_unquiesce_queue(disk->queue);
+	blk_mq_unfreeze_queue(disk->queue);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(blk_interposer_attach);
+
+/**
+ * blk_interposer_detach - Detach interposer from disk
+ * @disk: target disk
+ *
+ * Returns the attached interposer or NULL if none was attached.
+ */
+struct blk_interposer *blk_interposer_detach(struct gendisk *disk)
+{
+	struct blk_interposer *interposer;
+
+	if (WARN_ON(!disk))
+		return NULL;
+
+	blk_mq_freeze_queue(disk->queue);
+	blk_mq_quiesce_queue(disk->queue);
+	interposer = disk->interposer;
+	disk->interposer = NULL;
+	blk_mq_unquiesce_queue(disk->queue);
+	blk_mq_unfreeze_queue(disk->queue);
+
+	return interposer;
+}
+EXPORT_SYMBOL_GPL(blk_interposer_detach);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d9b69bbde5cc..996b803e5aa1 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -207,7 +207,7 @@ struct bio {
 						 * top bits REQ_OP. Use
 						 * accessors.
 						 */
-	unsigned short		bi_flags;	/* status, etc and bvec pool number */
+	unsigned int		bi_flags;	/* status, etc and bvec pool number */
 	unsigned short		bi_ioprio;
 	unsigned short		bi_write_hint;
 	blk_status_t		bi_status;
@@ -284,6 +284,8 @@ enum {
 				 * of this bio. */
 	BIO_CGROUP_ACCT,	/* has been accounted to a cgroup */
 	BIO_TRACKED,		/* set if bio goes through the rq_qos path */
+	BIO_INTERPOSED,		/* bio has been interposed and can be moved to
+				 * a different disk */
 	BIO_FLAG_LAST
 };
 
@@ -302,7 +304,7 @@ enum {
  * freed.
  */
 #define BVEC_POOL_BITS		(3)
-#define BVEC_POOL_OFFSET	(16 - BVEC_POOL_BITS)
+#define BVEC_POOL_OFFSET	(32 - BVEC_POOL_BITS)
 #define BVEC_POOL_IDX(bio)	((bio)->bi_flags >> BVEC_POOL_OFFSET)
 #if (1<< BVEC_POOL_BITS) < (BVEC_POOL_NR+1)
 # error "BVEC_POOL_BITS is too small"
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 03da3f603d30..f58ab391deda 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -164,6 +164,15 @@ struct blk_integrity {
 	unsigned char				tag_size;
 };
 
+struct blk_interposer;
+typedef blk_qc_t (*ip_submit_bio_t) (struct blk_interposer *ip, struct bio *bio);
+
+struct blk_interposer {
+	struct gendisk *ip_disk;
+	ip_submit_bio_t ip_submit_bio;
+	void *ip_private;
+};
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly. Use disk_devt() and disk_max_parts().
@@ -188,6 +197,7 @@ struct gendisk {
 
 	const struct block_device_operations *fops;
 	struct request_queue *queue;
+	struct blk_interposer *interposer;
 	void *private_data;
 
 	int flags;
@@ -409,4 +419,13 @@ static inline dev_t blk_lookup_devt(const char *name, int partno)
 }
 #endif /* CONFIG_BLOCK */
 
+/*
+ * Block device interposing
+ */
+#define blk_has_interposer(d) ((d)->interposer != NULL)
+#define blk_interposer_active(ip) ((ip)->ip_disk != NULL)
+
+int blk_interposer_attach(struct gendisk *, struct blk_interposer *);
+struct blk_interposer * blk_interposer_detach(struct gendisk *);
+
 #endif /* _LINUX_GENHD_H */

From patchwork Thu Nov 19 16:49:24 2020
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11918341
From: Hannes Reinecke
To: Mike Snitzer
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com, Christoph Hellwig, Sergei Shtepa
Date: Thu, 19 Nov 2020 17:49:24 +0100
Message-Id: <20201119164924.74401-3-hare@suse.de>
In-Reply-To: <20201119164924.74401-1-hare@suse.de>
References: <20201119164924.74401-1-hare@suse.de>
Subject: [dm-devel] [PATCH 2/2] dm_interposer - blk_interposer for device-mapper

From: Sergei Shtepa

Implement a block interposer for device-mapper to attach to an existing
block layer stack.

Signed-off-by: Sergei Shtepa
Signed-off-by: Hannes Reinecke
---
 drivers/md/dm-table.c |  59 +++++++++++++++++++++
 drivers/md/dm.c       | 140 ++++++++++++++++++++++++++++++++++++++++++++++----
 drivers/md/dm.h       |   4 +-
 3 files changed, 191 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index ce543b761be7..56923d0795b2 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1809,6 +1809,65 @@ static bool dm_table_requires_stable_pages(struct dm_table *t)
 	return false;
 }
 
+static char *_dm_interposer_claim_ptr = "device-mapper interposer";
+
+static int device_activate_interposer(struct dm_target *ti,
+				      struct dm_dev *dev, sector_t start,
+				      sector_t len, void *data)
+{
+	struct blk_interposer *blk_ip = dev->bdev->bd_disk->interposer;
+	struct mapped_device *md = data;
+
+	if (!blk_ip)
+		return false;
+	if (md) {
+		struct block_device *bdev;
+
+		bdev = blkdev_get_by_dev(md->bdev->bd_dev,
+					 dev->mode | FMODE_EXCL,
+					 _dm_interposer_claim_ptr);
+		if (!bdev)
+			return false;
+		blk_ip->ip_private = md;
+	} else if (blk_ip->ip_private) {
+		md = blk_ip->ip_private;
+		blkdev_put(md->bdev, dev->mode | FMODE_EXCL);
+		blk_ip->ip_private = NULL;
+	}
+	return true;
+}
+
+bool dm_table_activate_interposer(struct dm_table *t, struct mapped_device *md)
+{
+	struct dm_target *ti;
+
+	if (t->num_targets) {
+		ti = t->targets;
+
+		if (!ti->type->iterate_devices ||
+		    !ti->type->iterate_devices(ti,
+				device_activate_interposer, md))
+			return false;
+		DMINFO("%s: activated interposer", dm_device_name(md));
+	}
+	return true;
+}
+
+bool dm_table_deactivate_interposer(struct dm_table *t)
+{
+	struct dm_target *ti;
+
+	if (t->num_targets) {
+		ti = t->targets;
+
+		if (!ti->type->iterate_devices ||
+		    !ti->type->iterate_devices(ti,
+				device_activate_interposer, NULL))
+			return false;
+	}
+	return true;
+}
+
 void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			       struct queue_limits *limits)
 {
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c18fc2548518..ec6df4ad4c85 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -48,6 +48,7 @@ static DEFINE_IDR(_minor_idr);
 static DEFINE_SPINLOCK(_minor_lock);
 
 static void do_deferred_remove(struct work_struct *w);
+static blk_qc_t dm_submit_bio_interposed(struct blk_interposer *ip, struct bio *bio);
 
 static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
 
@@ -730,6 +731,38 @@ static void dm_put_live_table_fast(struct mapped_device *md) __releases(RCU)
 	rcu_read_unlock();
 }
 
+static inline int dm_install_interposer(struct gendisk *disk,
+					struct mapped_device *md)
+{
+	struct blk_interposer *blk_ip;
+	int ret;
+
+	blk_ip = kzalloc(sizeof(struct blk_interposer), GFP_KERNEL);
+	if (!blk_ip)
+		return -ENOMEM;
+
+	blk_ip->ip_submit_bio = dm_submit_bio_interposed;
+	blk_ip->ip_disk = disk;
+	blk_ip->ip_private = md;
+
+	ret = blk_interposer_attach(disk, blk_ip);
+	if (ret) {
+		kfree(blk_ip);
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline void dm_uninstall_interposer(struct gendisk *disk)
+{
+	struct blk_interposer *blk_ip;
+
+	blk_ip = blk_interposer_detach(disk);
+	if (blk_ip)
+		kfree(blk_ip);
+}
+
 static char *_dm_claim_ptr = "I belong to device-mapper";
 
 /*
@@ -739,19 +772,38 @@ static int open_table_device(struct table_device *td, dev_t dev,
 			     struct mapped_device *md)
 {
 	struct block_device *bdev;
-
-	int r;
+	fmode_t fmode = td->dm_dev.mode | FMODE_EXCL;
+	int ret;
 
 	BUG_ON(td->dm_dev.bdev);
 
-	bdev = blkdev_get_by_dev(dev, td->dm_dev.mode | FMODE_EXCL, _dm_claim_ptr);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
+	bdev = blkdev_get_by_dev(dev, fmode, _dm_claim_ptr);
+	if (IS_ERR(bdev)) {
+		ret = PTR_ERR(bdev);
+		if (ret != -EBUSY)
+			return ret;
 
-	r = bd_link_disk_holder(bdev, dm_disk(md));
-	if (r) {
-		blkdev_put(bdev, td->dm_dev.mode | FMODE_EXCL);
-		return r;
+		/*
+		 * If device cannot be opened in exclusive mode,
+		 * then try to use blk_interpose.
+		 */
+		fmode = td->dm_dev.mode;
+		bdev = blkdev_get_by_dev(dev, fmode, NULL);
+		if (IS_ERR(bdev))
+			return PTR_ERR(bdev);
+
+		ret = dm_install_interposer(bdev->bd_disk, md);
+		if (ret) {
+			blkdev_put(bdev, fmode);
+			return ret;
+		}
+	}
+
+	ret = bd_link_disk_holder(bdev, dm_disk(md));
+	if (ret) {
+		dm_uninstall_interposer(bdev->bd_disk);
+		blkdev_put(bdev, fmode);
+		return ret;
 	}
 
 	td->dm_dev.bdev = bdev;
@@ -764,11 +816,18 @@ static int open_table_device(struct table_device *td, dev_t dev,
  */
 static void close_table_device(struct table_device *td, struct mapped_device *md)
 {
+	fmode_t fmode = td->dm_dev.mode | FMODE_EXCL;
+
 	if (!td->dm_dev.bdev)
 		return;
 
 	bd_unlink_disk_holder(td->dm_dev.bdev, dm_disk(md));
-	blkdev_put(td->dm_dev.bdev, td->dm_dev.mode | FMODE_EXCL);
+	if (blk_has_interposer(td->dm_dev.bdev->bd_disk)) {
+		dm_uninstall_interposer(td->dm_dev.bdev->bd_disk);
+		fmode = td->dm_dev.mode;
+	}
+	blkdev_put(td->dm_dev.bdev, fmode);
+
 	put_dax(td->dm_dev.dax_dev);
 	td->dm_dev.bdev = NULL;
 	td->dm_dev.dax_dev = NULL;
@@ -1666,6 +1725,62 @@ static blk_qc_t dm_submit_bio(struct bio *bio)
 	return ret;
 }
 
+static void dm_interposed_endio(struct bio *clone)
+{
+	struct bio *bio = clone->bi_private;
+
+	bio->bi_status = clone->bi_status;
+	bio_endio(bio);
+	bio_put(bio);
+
+	bio_put(clone);
+}
+
+static blk_qc_t dm_submit_bio_interposed(struct blk_interposer *interposer,
+					 struct bio *bio)
+{
+	struct mapped_device *md = interposer->ip_private;
+	blk_qc_t ret = BLK_QC_T_NONE;
+	int srcu_idx;
+	struct dm_table *map;
+	struct bio *clone;
+
+	if (unlikely(!md))
+		return submit_bio_noacct(bio);
+
+	map = dm_get_live_table(md, &srcu_idx);
+	if (unlikely(!map)) {
+		DMERR_LIMIT("%s: mapping table unavailable, erroring io",
+			    dm_device_name(md));
+		goto out;
+	}
+
+	clone = bio_clone_fast(bio, GFP_NOIO, NULL);
+	if (unlikely(!clone)) {
+		DMERR_LIMIT("%s: failed to clone bio",
+			    dm_device_name(md));
+		goto out;
+	}
+
+	bio_get(bio);
+	bio_set_flag(clone, BIO_INTERPOSED);
+	clone->bi_private = bio;
+	clone->bi_disk = dm_disk(md);
+	clone->bi_end_io = dm_interposed_endio;
+
+	trace_block_bio_remap(clone->bi_disk->queue, bio, bio_dev(bio),
+			      bio->bi_iter.bi_sector);
+
+	ret = submit_bio_noacct(clone);
+	dm_put_live_table(md, srcu_idx);
+	return ret;
+
+out:
+	bio_io_error(bio);
+	dm_put_live_table(md, srcu_idx);
+	return ret;
+}
+
 /*-----------------------------------------------------------------
  * An IDR is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
@@ -2005,8 +2120,11 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		md->immutable_target_type = dm_table_get_immutable_target_type(t);
 
 	dm_table_set_restrictions(t, q, limits);
-	if (old_map)
+	if (old_map) {
+		dm_table_deactivate_interposer(old_map);
 		dm_sync_table(md);
+	}
+	dm_table_activate_interposer(t, md);
 
 out:
 	return old_map;
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index fffe1e289c53..51b98e10e9bb 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -75,7 +75,9 @@ bool dm_table_supports_dax(struct dm_table *t, iterate_devices_callout_fn fn,
 			   int *blocksize);
 int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 			   sector_t start, sector_t len, void *data);
-
+bool dm_table_activate_interposer(struct dm_table *t,
+				  struct mapped_device *md);
+bool dm_table_deactivate_interposer(struct dm_table *t);
 void dm_lock_md_type(struct mapped_device *md);
 void dm_unlock_md_type(struct mapped_device *md);
 void dm_set_md_type(struct mapped_device *md, enum dm_queue_mode type);
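
Illustration only, not part of either patch: the routing rule that patch 1/2 adds to submit_bio_noacct() can be modeled in user space. The sketch below is a simplified model under stated assumptions. Every toy_* name is hypothetical, and queue freezing, current->bio_list handling and bio cloning are all left out. It only shows how a per-bio flag in the spirit of BIO_INTERPOSED keeps a bio from being handed to the interposer twice, and how attach/detach report -EINVAL, -EBUSY and NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel types in the patch. */
enum { TOY_BIO_INTERPOSED = 1u << 0 };

struct toy_bio;
struct toy_interposer;
typedef int (*toy_submit_fn)(struct toy_interposer *ip, struct toy_bio *bio);

struct toy_interposer {
	toy_submit_fn submit;	/* plays the role of ip_submit_bio */
	int seen;		/* bios routed through the interposer */
};

struct toy_disk {
	struct toy_interposer *interposer;
	int direct;		/* bios that reached the disk itself */
};

struct toy_bio {
	struct toy_disk *disk;
	unsigned int flags;
};

/* Modeled on blk_interposer_attach(): -EINVAL for NULL, -EBUSY if taken. */
static int toy_attach(struct toy_disk *disk, struct toy_interposer *ip)
{
	if (!ip)
		return -EINVAL;
	if (disk->interposer)
		return -EBUSY;
	disk->interposer = ip;
	return 0;
}

/* Modeled on blk_interposer_detach(): returns the old interposer or NULL. */
static struct toy_interposer *toy_detach(struct toy_disk *disk)
{
	struct toy_interposer *ip = disk->interposer;

	disk->interposer = NULL;
	return ip;
}

/* Modeled on the check added to submit_bio_noacct(). */
static int toy_submit(struct toy_bio *bio)
{
	struct toy_disk *disk = bio->disk;

	if (disk->interposer && !(bio->flags & TOY_BIO_INTERPOSED)) {
		/* Mark the bio first so a resubmission skips the interposer. */
		bio->flags |= TOY_BIO_INTERPOSED;
		return disk->interposer->submit(disk->interposer, bio);
	}
	disk->direct++;
	return 0;
}

/* Example interposer: count the bio, then pass it on to the same disk. */
static int toy_passthrough(struct toy_interposer *ip, struct toy_bio *bio)
{
	ip->seen++;
	return toy_submit(bio);	/* flag is already set, so no recursion */
}
```

To try it, attach an interposer whose submit callback is toy_passthrough, then call toy_submit() on a bio for that disk: the interposer sees the bio once and the disk receives it once. After toy_detach(), new bios go straight to the disk.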