Message ID | 1481971751-4016-1-git-send-email-ming.lei@canonical.com (mailing list archive) |
---|---|
State | New, archived |
On 12/17/2016 03:49 AM, Ming Lei wrote:
> If the last bvec of the 1st bio and the 1st bvec of the next
> bio are physically contiguous, and the latter can be merged
> into the last segment of the 1st bio, we should consider that they
> don't violate the sg gap (or virt boundary) limit.
>
> Both Vitaly and Dexuan reported that lots of unmergeable small bios
> are observed when running mkfs on Hyper-V virtual storage, and
> performance becomes quite low, so this patch was worked out to
> fix the performance issue.
>
> The same issue should exist on NVMe too, since it sets a virt boundary too.

It looks pretty reasonable to me. I'll queue it up for some testing,
changes like this always make me a little nervous.
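For readers unfamiliar with the limit under discussion, the following is a minimal userspace sketch of the situation the patch description refers to. It is not kernel code: `struct seg`, `gap_to_prev()` and `phys_contiguous()` are hypothetical names, and the gap rule is a simplified model of a virt boundary mask (no gap only if the previous range ends on the boundary and the next one starts on it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bio_vec: a physical address range. */
struct seg {
	unsigned long phys;	/* physical start address */
	unsigned int len;	/* length in bytes */
};

/*
 * Simplified model of the virt boundary rule (an assumption, not the
 * kernel's code): two adjacent ranges "gap" unless the first ends on
 * the boundary and the second starts on it.
 */
static bool gap_to_prev(unsigned long mask, const struct seg *prev,
			const struct seg *next)
{
	return ((prev->phys + prev->len) & mask) || (next->phys & mask);
}

/* True if next starts exactly where prev ends. */
static bool phys_contiguous(const struct seg *prev, const struct seg *next)
{
	return prev->phys + prev->len == next->phys;
}

/* Two 512-byte pieces of the same page, as small back-to-back bios might carry. */
void example_usage(void)
{
	unsigned long mask = 4096 - 1;		/* 4 KiB virt boundary */
	struct seg a = { 0x10000, 512 };
	struct seg b = { 0x10200, 512 };

	assert(gap_to_prev(mask, &a, &b));	/* strict rule: refuses the merge */
	assert(phys_contiguous(&a, &b));	/* yet the data is one contiguous range */
}
```

In this model the strict rule rejects the merge even though both pieces form a single contiguous range, which is the case the patch aims to allow.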
On Sun, Dec 18, 2016 at 12:49 AM, Jens Axboe <axboe@fb.com> wrote:
> On 12/17/2016 03:49 AM, Ming Lei wrote:
> [...]
>
> It looks pretty reasonable to me. I'll queue it up for some testing,
> changes like this always make me a little nervous.

Understood.

But given that it is still early in the 4.10 cycle, it seems fine to expose
it now, and we should have enough time to fix it if there are
regressions.

BTW, it passes my xfstests run (ext4) over SATA/NVMe.

Thanks,
Ming
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/19/2016 07:07 PM, Ming Lei wrote:
> On Sun, Dec 18, 2016 at 12:49 AM, Jens Axboe <axboe@fb.com> wrote:
> [...]
>> It looks pretty reasonable to me. I'll queue it up for some testing,
>> changes like this always make me a little nervous.
>
> Understood.
>
> But given it is still in early stage of 4.10 cycle, seems fine to expose
> it now, and we should have enough time to fix it if there might be
> regressions.
>
> BTW, it passes my xfstest(ext4) over sata/NVMe.

It's been fine here in testing, too. I'm not worried about performance
regressions, those we can always fix. Merging makes me worried about
corruption, and those regressions are much worse.

Any reason we need to rush this? I'd be more comfortable pushing this to
4.11, unless there are strong reasons this should make 4.10.
> From: Jens Axboe [mailto:axboe@fb.com]
> Sent: Tuesday, December 20, 2016 10:31
> To: Ming Lei <ming.lei@canonical.com>
> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>;
> linux-block <linux-block@vger.kernel.org>;
> Christoph Hellwig <hch@infradead.org>; Dexuan Cui <decui@microsoft.com>;
> Vitaly Kuznetsov <vkuznets@redhat.com>; Keith Busch <keith.busch@intel.com>;
> Hannes Reinecke <hare@suse.de>; Mike Christie <mchristi@redhat.com>;
> Martin K. Petersen <martin.petersen@oracle.com>;
> Toshi Kani <toshi.kani@hpe.com>; Dan Williams <dan.j.williams@intel.com>;
> Damien Le Moal <damien.lemoal@hgst.com>
> Subject: Re: [PATCH] block: loose check on sg gap
>
> On 12/19/2016 07:07 PM, Ming Lei wrote:
> [...]
> > BTW, it passes my xfstest(ext4) over sata/NVMe.
>
> It's been fine here in testing, too. I'm not worried about performance
> regressions, those we can always fix. Merging makes me worried about
> corruption, and those regressions are much worse.
>
> Any reason we need to rush this? I'd be more comfortable pushing this to
> 4.11, unless there are strong reasons this should make 4.10.
>
> --
> Jens Axboe

Hi Jens,

As far as I know, the patch is important to popular Linux distros,
e.g. at least Ubuntu 14.04.5, 16.x and RHEL 7.3, when they run on
Hyper-V/Azure, because they can suffer from pretty bad throughput/latency
in some cases, e.g. mkfs.ext4 for a 100GB partition can take 8 minutes, but
with the patch, it only takes 1 second.

Thanks,
-- Dexuan
> From: Dexuan Cui
> Sent: Tuesday, December 20, 2016 11:41
> To: 'Jens Axboe' <axboe@fb.com>; Ming Lei <ming.lei@canonical.com>
> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>;
> linux-block <linux-block@vger.kernel.org>;
> Christoph Hellwig <hch@infradead.org>;
> Vitaly Kuznetsov <vkuznets@redhat.com>; Keith Busch <keith.busch@intel.com>;
> Hannes Reinecke <hare@suse.de>; Mike Christie <mchristi@redhat.com>;
> Martin K. Petersen <martin.petersen@oracle.com>;
> Toshi Kani <toshi.kani@hpe.com>; Dan Williams <dan.j.williams@intel.com>;
> Damien Le Moal <damien.lemoal@hgst.com>
> Subject: RE: [PATCH] block: loose check on sg gap
>
> [...]
>
> Hi Jens,
>
> As far as I know, the patch is important to popular Linux distros,
> e.g. at least Ubuntu 14.04.5, 16.x and RHEL 7.3, when they run on
> Hyper-V/Azure, because they can suffer from a pretty bad
> throughput/latency in some cases, e.g. mkfs.ext4 for a 100GB partition
> can take 8 minutes, but with the patch, it only takes 1 second.
>
> -- Dexuan

Hi Ming, Jens,
Did you find any issues later when testing with the patch?

May I know if it's possible to have it in 4.10, considering the above impact?

Is it on some temporary branch of linux-block.git? It looks like it is not.

Thanks,
-- Dexuan
On Wed, Jan 11, 2017 at 1:10 PM, Dexuan Cui <decui@microsoft.com> wrote:
> [...]
>
> Hi Ming, Jens,
> Did you find any issue later when testing with the patch?
>
> May I know if it's possible to have it in 4.10 considering the above impact?
>
> Is it on some temporary branch of linux-block.git? Looks not.

Dexuan, Jens has said that this patch may land in v4.11, so just wait a
release and let it be exposed to more testing.

Thanks,
Ming
> From: Ming Lei [mailto:ming.lei@canonical.com]
> Sent: Thursday, January 12, 2017 10:54
> To: Dexuan Cui <decui@microsoft.com>
> Cc: Jens Axboe <axboe@fb.com>;
> Linux Kernel Mailing List <linux-kernel@vger.kernel.org>;
> linux-block <linux-block@vger.kernel.org>;
> Christoph Hellwig <hch@infradead.org>;
> Vitaly Kuznetsov <vkuznets@redhat.com>; Keith Busch <keith.busch@intel.com>;
> Hannes Reinecke <hare@suse.de>; Mike Christie <mchristi@redhat.com>;
> Martin K. Petersen <martin.petersen@oracle.com>;
> Toshi Kani <toshi.kani@hpe.com>; Dan Williams <dan.j.williams@intel.com>;
> Damien Le Moal <damien.lemoal@hgst.com>; KY Srinivasan <kys@microsoft.com>
> Subject: Re: [PATCH] block: loose check on sg gap
>
> On Wed, Jan 11, 2017 at 1:10 PM, Dexuan Cui <decui@microsoft.com> wrote:
> [...]
> > Is it on some temporary branch of linux-block.git? Looks not.
>
> Dexuan, Jens has said that this patch may land v4.11, so just wait a release
> and let it expose into more tests.
>
> Thanks,
> Ming

Thanks for the reply! Sorry, I didn't mean to be pushy -- I just wanted to
get a better idea of the status of the patch, since I'm unfamiliar with the
linux-block repo. :-)

BTW, I've been using the patch for ~1 month and I haven't hit any issues.

Thanks,
-- Dexuan
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 286b2a264383..1ce26e771bcc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1608,6 +1608,25 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 	return __bvec_gap_to_prev(q, bprv, offset);
 }
 
+/*
+ * Check if the two bvecs from two bios can be merged to one segment.
+ * If yes, no need to check gap between the two bios since the 1st bio
+ * and the 1st bvec in the 2nd bio can be handled in one segment.
+ */
+static inline bool bios_segs_mergeable(struct request_queue *q,
+		struct bio *prev, struct bio_vec *prev_last_bv,
+		struct bio_vec *next_first_bv)
+{
+	if (!BIOVEC_PHYS_MERGEABLE(prev_last_bv, next_first_bv))
+		return false;
+	if (!BIOVEC_SEG_BOUNDARY(q, prev_last_bv, next_first_bv))
+		return false;
+	if (prev->bi_seg_back_size + next_first_bv->bv_len >
+			queue_max_segment_size(q))
+		return false;
+	return true;
+}
+
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 			 struct bio *next)
 {
@@ -1617,7 +1636,8 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
-		return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
+		if (!bios_segs_mergeable(q, prev, &pb, &nb))
+			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 	}
 
 	return false;
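
The decision flow in the diff above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct model_queue`, `struct model_bvec` and the `model_*` functions are hypothetical stand-ins for the kernel types and macros, physical addresses replace page/offset pairs, and the boundary tests are simplified approximations rather than the kernel's exact definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the queue limits that the patch consults. */
struct model_queue {
	unsigned long virt_boundary_mask;	/* 0 means no virt boundary */
	unsigned long seg_boundary_mask;
	unsigned int max_segment_size;
};

/* Hypothetical stand-in for a bio_vec, using a flat physical address. */
struct model_bvec {
	unsigned long phys;
	unsigned int len;
};

/* Simplified gap test: both sides of the join must sit on the boundary. */
static bool model_gap(const struct model_queue *q,
		      const struct model_bvec *prev,
		      const struct model_bvec *next)
{
	if (!q->virt_boundary_mask)
		return false;
	return ((prev->phys + prev->len) & q->virt_boundary_mask) ||
	       (next->phys & q->virt_boundary_mask);
}

/* Mirrors the three checks of bios_segs_mergeable() in the diff. */
static bool model_segs_mergeable(const struct model_queue *q,
				 unsigned int prev_back_size,
				 const struct model_bvec *prev,
				 const struct model_bvec *next)
{
	/* Assumed meaning of BIOVEC_PHYS_MERGEABLE: physically contiguous. */
	if (prev->phys + prev->len != next->phys)
		return false;
	/* Assumed meaning of BIOVEC_SEG_BOUNDARY: same boundary window. */
	if ((prev->phys | q->seg_boundary_mask) !=
	    ((next->phys + next->len - 1) | q->seg_boundary_mask))
		return false;
	/* The merged segment must not exceed the maximum segment size. */
	if (prev_back_size + next->len > q->max_segment_size)
		return false;
	return true;
}

/* Patched behaviour: only consult the gap rule if the bvecs cannot merge. */
static bool model_bio_will_gap(const struct model_queue *q,
			       unsigned int prev_back_size,
			       const struct model_bvec *prev,
			       const struct model_bvec *next)
{
	if (model_segs_mergeable(q, prev_back_size, prev, next))
		return false;
	return model_gap(q, prev, next);
}

/* Contiguous sub-page pieces merge; a scattered piece still gaps. */
void example_usage(void)
{
	struct model_queue q = { 4095, 0xffffffffUL, 65536 };
	struct model_bvec a = { 0x10000, 512 };
	struct model_bvec b = { 0x10200, 512 };	/* directly follows a */
	struct model_bvec c = { 0x20200, 512 };	/* elsewhere in memory */

	assert(model_gap(&q, &a, &b));			/* old check: gap */
	assert(!model_bio_will_gap(&q, 512, &a, &b));	/* new check: mergeable */
	assert(model_bio_will_gap(&q, 512, &a, &c));	/* still a gap */
}
```

Under these assumptions the gap rule remains in force for non-contiguous data, so the relaxation applies only to the case the patch description names.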