From patchwork Wed Aug 7 21:54:17 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kent Overstreet X-Patchwork-Id: 2840647 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Original-To: patchwork-dm-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 407D9BF535 for ; Wed, 7 Aug 2013 22:55:59 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1C7CD2050D for ; Wed, 7 Aug 2013 22:55:58 +0000 (UTC) Received: from mx3-phx2.redhat.com (mx3-phx2.redhat.com [209.132.183.24]) by mail.kernel.org (Postfix) with ESMTP id 3717B20512 for ; Wed, 7 Aug 2013 22:55:56 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by mx3-phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r77Mqpo7027251; Wed, 7 Aug 2013 18:52:51 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r77LtI2u010822 for ; Wed, 7 Aug 2013 17:55:18 -0400 Received: from mx1.redhat.com (ext-mx14.extmail.prod.ext.phx2.redhat.com [10.5.110.19]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r77LtHoo032752 for ; Wed, 7 Aug 2013 17:55:17 -0400 Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r77LtE0W012889 for ; Wed, 7 Aug 2013 17:55:14 -0400 Received: by mail-pa0-f41.google.com with SMTP id bj1so2655765pad.0 for ; Wed, 07 Aug 2013 14:55:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Rsp8Qrxw7SPUlNmXlSRAEUNmu/6pFJTS2Kn006a0s9w=; b=OkcQGZOBZvvs/b4o3bBImKMBXfcIrYNcjOodepJk4IKrglxsQpIi3UmmQjVvvNUmJU FufgqCEzKWWzc9Gx+hag3nODeWAHNhQRF+4pVSoRezjl5btT/qyRwRRNtOALsfYoOuMp cjyvh55y5RPgrZx8CaHhw/LPzYt3vvzm4o5RA1UEL9obOmvm/41X78sXb3OEvJYNT/OC 3KNV8q7tZZR+iZcUbPwaom9sjxOCH731/tnrQfXgD3Pf+XUoIONBS1gtO0xlVL3QJD8x tf2Nz7dfN6WbbSfWro6yStCxWyi38eqtdC0yFEdKUwoL8p4g5pDkL/SuhMbMZFs+PgWO 9qOg== X-Gm-Message-State: ALoCoQlNU1jjr14SHAPADXvAnDnJ69CzbivYrJya01IWN2AFgaqCQeJCyJlCkmlkl1Au+7P5wiD/ X-Received: by 10.66.4.71 with SMTP id i7mr2727121pai.82.1375912513817; Wed, 07 Aug 2013 14:55:13 -0700 (PDT) Received: from localhost.localdomain (173-13-132-141-sfba.hfc.comcastbusiness.net. [173.13.132.141]) by mx.google.com with ESMTPSA id xe9sm12505640pab.0.2013.08.07.14.55.12 for (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 07 Aug 2013 14:55:13 -0700 (PDT) From: Kent Overstreet To: axboe@kernel.dk Date: Wed, 7 Aug 2013 14:54:17 -0700 Message-Id: <1375912471-5106-9-git-send-email-kmo@daterainc.com> In-Reply-To: <1375912471-5106-1-git-send-email-kmo@daterainc.com> References: <1375912471-5106-1-git-send-email-kmo@daterainc.com> X-RedHat-Spam-Score: -1.3 (BAYES_00, DCC_REPUT_00_12, KHOP_BIG_TO_CC, RCVD_IN_DNSWL_LOW, URIBL_BLOCKED) X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-Scanned-By: MIMEDefang 2.68 on 10.5.110.19 X-loop: dm-devel@redhat.com X-Mailman-Approved-At: Wed, 07 Aug 2013 18:51:48 -0400 Cc: nbd-general@lists.sourceforge.net, Kent Overstreet , Paul Clements , linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, dm-devel@redhat.com, linux-fsdevel@vger.kernel.org, drbd-user@lists.linbit.com, Lars Ellenberg Subject: [dm-devel] [PATCH 08/22] block: Immutable bio vecs X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk Reply-To: device-mapper development List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This adds a mechanism by which we can advance a bio by an arbitrary number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done indicates the number of bytes completed in the current bvec. Various driver code still needs to be updated to not refer to the bvec directly before we can use this for interesting things, like efficient bio splitting. Signed-off-by: Kent Overstreet Cc: Jens Axboe Cc: Lars Ellenberg Cc: Paul Clements Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net --- drivers/block/drbd/drbd_main.c | 4 +-- drivers/block/nbd.c | 2 +- fs/bio.c | 27 ++------------ include/linux/bio.h | 81 +++++++++++++++++++++++++++++++++++++----- include/linux/blk_types.h | 2 ++ include/linux/blkdev.h | 4 +-- 6 files changed, 82 insertions(+), 38 deletions(-) diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 1589ea4..fee8b51 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1546,7 +1546,7 @@ static int _drbd_send_bio(struct drbd_conf *mdev, struct bio *bio) err = _drbd_no_send_page(mdev, bvec.bv_page, bvec.bv_offset, bvec.bv_len, - bio_iter_last(bio, iter) + bio_iter_last(bvec, iter) ? 0 : MSG_MORE); if (err) return err; @@ -1565,7 +1565,7 @@ static int _drbd_send_zc_bio(struct drbd_conf *mdev, struct bio *bio) err = _drbd_send_page(mdev, bvec.bv_page, bvec.bv_offset, bvec.bv_len, - bio_iter_last(bio, iter) ? 0 : MSG_MORE); + bio_iter_last(bvec, iter) ? 0 : MSG_MORE); if (err) return err; } diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index aa362f4..55298db 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -278,7 +278,7 @@ static int nbd_send_req(struct nbd_device *nbd, struct request *req) */ rq_for_each_segment(bvec, req, iter) { flags = 0; - if (!rq_iter_last(req, iter)) + if (!rq_iter_last(bvec, iter)) flags = MSG_MORE; dprintk(DBG_TX, "%s: request %p: sending %d bytes data\n", nbd->disk->disk_name, req, bvec.bv_len); diff --git a/fs/bio.c b/fs/bio.c index 805befd..1c10ccc 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -532,13 +532,11 @@ void __bio_clone(struct bio *bio, struct bio *bio_src) * most users will be overriding ->bi_bdev with a new target, * so we don't set nor calculate new physical/hw segment counts here */ - bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector; bio->bi_bdev = bio_src->bi_bdev; bio->bi_flags |= 1 << BIO_CLONED; bio->bi_rw = bio_src->bi_rw; bio->bi_vcnt = bio_src->bi_vcnt; - bio->bi_iter.bi_size = bio_src->bi_iter.bi_size; - bio->bi_iter.bi_idx = bio_src->bi_iter.bi_idx; + bio->bi_iter = bio_src->bi_iter; } EXPORT_SYMBOL(__bio_clone); @@ -808,28 +806,7 @@ void bio_advance(struct bio *bio, unsigned bytes) if (bio_integrity(bio)) bio_integrity_advance(bio, bytes); - bio->bi_iter.bi_sector += bytes >> 9; - bio->bi_iter.bi_size -= bytes; - - if (bio->bi_rw & BIO_NO_ADVANCE_ITER_MASK) - return; - - while (bytes) { - if (unlikely(bio->bi_iter.bi_idx >= bio->bi_vcnt)) { - WARN_ONCE(1, "bio idx %d >= vcnt %d\n", - bio->bi_iter.bi_idx, bio->bi_vcnt); - break; - } - - if (bytes >= bio_iovec(bio).bv_len) { - bytes -= bio_iovec(bio).bv_len; - bio->bi_iter.bi_idx++; - } else { - bio_iovec(bio).bv_len -= bytes; - bio_iovec(bio).bv_offset += bytes; - bytes = 0; - } - } + bio_advance_iter(bio, &bio->bi_iter, bytes); } EXPORT_SYMBOL(bio_advance); diff --git a/include/linux/bio.h b/include/linux/bio.h index 5724feb..151868e 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -64,11 +64,38 @@ #define bio_iovec_idx(bio, idx) (&((bio)->bi_io_vec[(idx)])) #define __bio_iovec(bio) bio_iovec_idx((bio), (bio)->bi_iter.bi_idx) -#define bio_iter_iovec(bio, iter) ((bio)->bi_io_vec[(iter).bi_idx]) +#define __bvec_iter_bvec(bvec, iter) (&(bvec)[(iter).bi_idx]) -#define bio_page(bio) (bio_iovec((bio)).bv_page) -#define bio_offset(bio) (bio_iovec((bio)).bv_offset) -#define bio_iovec(bio) (*__bio_iovec(bio)) +#define bvec_iter_page(bvec, iter) \ + (__bvec_iter_bvec((bvec), (iter))->bv_page) + +#define bvec_iter_len(bvec, iter) \ + min((iter).bi_size, \ + __bvec_iter_bvec((bvec), (iter))->bv_len - (iter).bi_bvec_done) + +#define bvec_iter_offset(bvec, iter) \ + (__bvec_iter_bvec((bvec), (iter))->bv_offset + (iter).bi_bvec_done) + +#define bvec_iter_bvec(bvec, iter) \ +((struct bio_vec) { \ + .bv_page = bvec_iter_page((bvec), (iter)), \ + .bv_len = bvec_iter_len((bvec), (iter)), \ + .bv_offset = bvec_iter_offset((bvec), (iter)), \ +}) + +#define bio_iter_iovec(bio, iter) \ + bvec_iter_bvec((bio)->bi_io_vec, (iter)) + +#define bio_iter_page(bio, iter) \ + bvec_iter_page((bio)->bi_io_vec, (iter)) +#define bio_iter_len(bio, iter) \ + bvec_iter_len((bio)->bi_io_vec, (iter)) +#define bio_iter_offset(bio, iter) \ + bvec_iter_offset((bio)->bi_io_vec, (iter)) + +#define bio_page(bio) bio_iter_page((bio), (bio)->bi_iter) +#define bio_offset(bio) bio_iter_offset((bio), (bio)->bi_iter) +#define bio_iovec(bio) bio_iter_iovec((bio), (bio)->bi_iter) #define bio_segments(bio) ((bio)->bi_vcnt - (bio)->bi_iter.bi_idx) #define bio_sectors(bio) ((bio)->bi_iter.bi_size >> 9) @@ -145,16 +172,54 @@ static inline void *bio_data(struct bio *bio) bvl = bio_iovec_idx((bio), (i)), i < (bio)->bi_vcnt; \ i++) +static inline void bvec_iter_advance(struct bio_vec *bv, struct bvec_iter *iter, + unsigned bytes) +{ + WARN_ONCE(bytes > iter->bi_size, + "Attempted to advance past end of bvec iter\n"); + + while (bytes) { + unsigned len = min(bytes, bvec_iter_len(bv, *iter)); + + bytes -= len; + iter->bi_size -= len; + iter->bi_bvec_done += len; + + if (iter->bi_bvec_done == __bvec_iter_bvec(bv, *iter)->bv_len) { + iter->bi_bvec_done = 0; + iter->bi_idx++; + } + } +} + +#define for_each_bvec(bvl, bio_vec, iter, start) \ + for ((iter) = start; \ + (bvl) = bvec_iter_bvec((bio_vec), (iter)), \ + (iter).bi_size; \ + bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) + + +static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter, + unsigned bytes) +{ + iter->bi_sector += bytes >> 9; + + if (bio->bi_rw & BIO_NO_ADVANCE_ITER_MASK) + iter->bi_size -= bytes; + else + bvec_iter_advance(bio->bi_io_vec, iter, bytes); +} + #define __bio_for_each_segment(bvl, bio, iter, start) \ for (iter = (start); \ - bvl = bio_iter_iovec((bio), (iter)), \ - (iter).bi_idx < (bio)->bi_vcnt; \ - (iter).bi_idx++) + (iter).bi_size && \ + ((bvl = bio_iter_iovec((bio), (iter))), 1); \ + bio_advance_iter((bio), &(iter), (bvl).bv_len)) #define bio_for_each_segment(bvl, bio, iter) \ __bio_for_each_segment(bvl, bio, iter, (bio)->bi_iter) -#define bio_iter_last(bio, iter) ((iter).bi_idx == (bio)->bi_vcnt - 1) +#define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len) /* * get a reference to a bio, so it won't disappear. the intended use is diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index d46e8a6..72f1274 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -34,6 +34,8 @@ struct bvec_iter { unsigned int bi_size; /* residual I/O count */ unsigned int bi_idx; /* current index into bvl_vec */ + + unsigned int bi_bvec_done; }; /* diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 1b9d47b..2a16de2 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -714,9 +714,9 @@ struct req_iterator { __rq_for_each_bio(_iter.bio, _rq) \ bio_for_each_segment(bvl, _iter.bio, _iter.iter) -#define rq_iter_last(rq, _iter) \ +#define rq_iter_last(bvec, _iter) \ (_iter.bio->bi_next == NULL && \ - bio_iter_last(_iter.bio, _iter.iter)) + bio_iter_last(bvec, _iter.iter)) #ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE # error "You should define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE for your platform"