Message ID | 1488296503-4987-10-git-send-email-tom.leiming@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
> Use this helper, instead of direct access to .bi_vcnt.

What we really need to do for the behind IO is:
- allocate memory and copy bio data to the memory
- let behind bio do IO against the memory

The behind bio doesn't need to have exactly the same bio_vec setting. If we
just track the new memory, we don't need to use bio_segments_all or access
bio_vec either.

Thanks,
Shaohua

> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/md/raid1.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 316bd6dd6cc1..7396c99ff7b1 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1091,7 +1091,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>  {
>  	int i;
>  	struct bio_vec *bvec;
> -	struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
> +	unsigned vcnt = bio_segments_all(bio);
> +	struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
>  					GFP_NOIO);
>  	if (unlikely(!bvecs))
>  		return;
> @@ -1107,12 +1108,12 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>  		kunmap(bvec->bv_page);
>  	}
>  	r1_bio->behind_bvecs = bvecs;
> -	r1_bio->behind_page_count = bio->bi_vcnt;
> +	r1_bio->behind_page_count = vcnt;
>  	set_bit(R1BIO_BehindIO, &r1_bio->state);
>  	return;
>
>  do_sync_io:
> -	for (i = 0; i < bio->bi_vcnt; i++)
> +	for (i = 0; i < vcnt; i++)
>  		if (bvecs[i].bv_page)
>  			put_page(bvecs[i].bv_page);
>  	kfree(bvecs);
> --
> 2.7.4
>
Hi Shaohua,

On Wed, Mar 1, 2017 at 7:42 AM, Shaohua Li <shli@kernel.org> wrote:
> On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
>> Use this helper, instead of direct access to .bi_vcnt.
>
> What we really need to do for the behind IO is:
> - allocate memory and copy bio data to the memory
> - let behind bio do IO against the memory
>
> The behind bio doesn't need to have exactly the same bio_vec setting. If we
> just track the new memory, we don't need to use bio_segments_all or access
> bio_vec either.

But we need to figure out how many vecs (each vec stores one page) need to be
allocated for the cloned/behind bio, and that is the only value of
bio_segments_all() here. Or do you have an idea to avoid that?

Thanks,
Ming
On Thu, Mar 02, 2017 at 10:34:25AM +0800, Ming Lei wrote:
> Hi Shaohua,
>
> On Wed, Mar 1, 2017 at 7:42 AM, Shaohua Li <shli@kernel.org> wrote:
> > On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
> >> Use this helper, instead of direct access to .bi_vcnt.
> >
> > What we really need to do for the behind IO is:
> > - allocate memory and copy bio data to the memory
> > - let behind bio do IO against the memory
> >
> > The behind bio doesn't need to have exactly the same bio_vec setting. If we
> > just track the new memory, we don't need to use bio_segments_all or access
> > bio_vec either.
>
> But we need to figure out how many vecs (each vec stores one page) need to be
> allocated for the cloned/behind bio, and that is the only value of
> bio_segments_all() here. Or do you have an idea to avoid that?

As I said, the behind bio doesn't need to have exactly the same bio_vec
setting. We just allocate memory and copy original bio data to the memory,
then do IO against the new memory. The behind bio
segments == (bio->bi_iter.bi_size + PAGE_SIZE - 1) >> PAGE_SHIFT

Thanks,
Shaohua
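For readers following the arithmetic, the formula in the message above is a round-up of a byte count to whole pages. The snippet below is a minimal userspace sketch, not kernel code: MODEL_PAGE_SHIFT, MODEL_PAGE_SIZE and behind_pages_needed() are hypothetical names, and the 4 KiB page size is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>

/* Stand-ins for the kernel's PAGE_SHIFT and PAGE_SIZE; 4 KiB pages are an
 * assumption made for this example. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Round a byte count up to whole pages, as in the quoted formula. */
static inline unsigned long behind_pages_needed(unsigned long bi_size)
{
	/* Guard against wraparound in the addition below. */
	assert(bi_size <= ULONG_MAX - (MODEL_PAGE_SIZE - 1));
	return (bi_size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}
```

Under these assumptions, a 512-byte payload and a 4096-byte payload both need one page, while 4097 bytes need two.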
On Thu, Mar 2, 2017 at 3:52 PM, Shaohua Li <shli@kernel.org> wrote:
> On Thu, Mar 02, 2017 at 10:34:25AM +0800, Ming Lei wrote:
>> Hi Shaohua,
>>
>> On Wed, Mar 1, 2017 at 7:42 AM, Shaohua Li <shli@kernel.org> wrote:
>> > On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
>> >> Use this helper, instead of direct access to .bi_vcnt.
>> >
>> > What we really need to do for the behind IO is:
>> > - allocate memory and copy bio data to the memory
>> > - let behind bio do IO against the memory
>> >
>> > The behind bio doesn't need to have exactly the same bio_vec setting. If we
>> > just track the new memory, we don't need to use bio_segments_all or access
>> > bio_vec either.
>>
>> But we need to figure out how many vecs (each vec stores one page) need to be
>> allocated for the cloned/behind bio, and that is the only value of
>> bio_segments_all() here. Or do you have an idea to avoid that?
>
> As I said, the behind bio doesn't need to have exactly the same bio_vec
> setting. We just allocate memory and copy original bio data to the memory,
> then do IO against the new memory. The behind bio
> segments == (bio->bi_iter.bi_size + PAGE_SIZE - 1) >> PAGE_SHIFT

The equation isn't always correct, especially when a bvec covers just
part of a page, which happens quite often in the case of mkfs, where one
bvec often holds a 512-byte buffer.

Thanks,
Ming Lei
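The mismatch described in the message above can be made concrete with a small userspace model. This is only a sketch under stated assumptions: struct model_bvec is a hypothetical, simplified stand-in for struct bio_vec, the helper names are invented for illustration, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 4 KiB pages; stand-ins for the kernel's PAGE_SHIFT/PAGE_SIZE. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical, simplified stand-in for struct bio_vec. */
struct model_bvec {
	unsigned int len;     /* bytes covered by this vector */
	unsigned int offset;  /* offset of the data within its page */
};

/* Sum the payload bytes across all vectors. */
static inline unsigned long model_total_bytes(const struct model_bvec *v,
					      size_t n)
{
	unsigned long total = 0;
	size_t i;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += v[i].len;
	return total;
}

/* Pages needed if the payload is packed contiguously into fresh pages. */
static inline unsigned long model_pages_for_bytes(unsigned long bytes)
{
	return (bytes + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}
```

With eight vectors of 512 bytes each, the vector count is 8 but the byte-based formula yields 1 page. The two numbers only need to agree if the behind bio mirrors the original bio_vec layout, as alloc_behind_pages() does in the patch; if the data is copied into freshly packed pages, the byte-based count is sufficient.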
On Fri, Mar 3, 2017 at 10:20 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> On Thu, Mar 2, 2017 at 3:52 PM, Shaohua Li <shli@kernel.org> wrote:
>> On Thu, Mar 02, 2017 at 10:34:25AM +0800, Ming Lei wrote:
>>> Hi Shaohua,
>>>
>>> On Wed, Mar 1, 2017 at 7:42 AM, Shaohua Li <shli@kernel.org> wrote:
>>> > On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
>>> >> Use this helper, instead of direct access to .bi_vcnt.
>>> >
>>> > What we really need to do for the behind IO is:
>>> > - allocate memory and copy bio data to the memory
>>> > - let behind bio do IO against the memory
>>> >
>>> > The behind bio doesn't need to have exactly the same bio_vec setting. If we
>>> > just track the new memory, we don't need to use bio_segments_all or access
>>> > bio_vec either.
>>>
>>> But we need to figure out how many vecs (each vec stores one page) need to be
>>> allocated for the cloned/behind bio, and that is the only value of
>>> bio_segments_all() here. Or do you have an idea to avoid that?
>>
>> As I said, the behind bio doesn't need to have exactly the same bio_vec
>> setting. We just allocate memory and copy original bio data to the memory,
>> then do IO against the new memory. The behind bio
>> segments == (bio->bi_iter.bi_size + PAGE_SIZE - 1) >> PAGE_SHIFT
>
> The equation isn't always correct, especially when a bvec covers just
> part of a page, which happens quite often in the case of mkfs, where one
> bvec often holds a 512-byte buffer.

Thinking about it further, your idea could be workable and cleaner, but the
change can be a bit big. It looks like we need to switch write-behind handling
to the following:

1) replace bio_clone_bioset_partial() with bio_allocate(nr_vecs), and 'nr_vecs'
is computed with your equation;

2) allocate 'nr_vecs' pages once and share them among all created bios in 1)

3) for each created bio, add each page into the bio via bio_add_page()

4) only for the 1st created bio, call bio_copy_data() to copy data from
master bio.

Let me know if you are OK with the above implementation.

Thanks,
Ming Lei
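The steps proposed in the message above can be sketched as a userspace model. This is not the raid1 implementation and uses no kernel API: struct src_seg, struct behind_buf and both helpers are hypothetical, 4 KiB pages are assumed, and only the sizing, the one-time page allocation and the copy (steps 1, 2 and 4) are modelled. Attaching the shared pages to each bio (step 3) has no userspace equivalent here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL  /* assumed page size for this model */

/* Hypothetical stand-in for one segment of the master bio. */
struct src_seg {
	const unsigned char *data;
	size_t len;
};

/* Page-sized buffers that every behind bio in the model would share. */
struct behind_buf {
	unsigned char **pages;
	size_t nr_pages;
	size_t size;  /* total payload bytes */
};

static inline void behind_buf_free(struct behind_buf *b)
{
	size_t i;

	if (b->pages)
		for (i = 0; i < b->nr_pages; i++)
			free(b->pages[i]);
	free(b->pages);
	b->pages = NULL;
	b->nr_pages = 0;
	b->size = 0;
}

/* Returns 0 on success, -1 if an allocation fails. */
static inline int behind_buf_init(struct behind_buf *b,
				  const struct src_seg *segs, size_t nr_segs)
{
	size_t i, off = 0;

	b->size = 0;
	for (i = 0; i < nr_segs; i++)
		b->size += segs[i].len;

	/* Steps 1 and 2: size by byte count, then allocate the pages once. */
	b->nr_pages = (b->size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	b->pages = calloc(b->nr_pages ? b->nr_pages : 1, sizeof(*b->pages));
	if (!b->pages) {
		b->nr_pages = 0;
		b->size = 0;
		return -1;
	}
	for (i = 0; i < b->nr_pages; i++) {
		b->pages[i] = calloc(1, MODEL_PAGE_SIZE);
		if (!b->pages[i]) {
			behind_buf_free(b);
			return -1;
		}
	}

	/* Step 4: copy the payload once, packing it across page boundaries. */
	for (i = 0; i < nr_segs; i++) {
		size_t done = 0;

		while (done < segs[i].len) {
			size_t pg = off / MODEL_PAGE_SIZE;
			size_t pg_off = off % MODEL_PAGE_SIZE;
			size_t chunk = MODEL_PAGE_SIZE - pg_off;

			if (chunk > segs[i].len - done)
				chunk = segs[i].len - done;
			memcpy(b->pages[pg] + pg_off, segs[i].data + done, chunk);
			done += chunk;
			off += chunk;
		}
	}
	assert(off == b->size);
	return 0;
}
```

In this model the source layout never influences the destination layout: two 3000-byte segments end up packed into two pages, with the second segment straddling the page boundary. That is the property the proposal relies on to avoid touching the original bio_vec array.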
On Fri, Mar 03, 2017 at 02:22:30PM +0800, Ming Lei wrote:
> On Fri, Mar 3, 2017 at 10:20 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> > On Thu, Mar 2, 2017 at 3:52 PM, Shaohua Li <shli@kernel.org> wrote:
> >> On Thu, Mar 02, 2017 at 10:34:25AM +0800, Ming Lei wrote:
> >>> Hi Shaohua,
> >>>
> >>> On Wed, Mar 1, 2017 at 7:42 AM, Shaohua Li <shli@kernel.org> wrote:
> >>> > On Tue, Feb 28, 2017 at 11:41:39PM +0800, Ming Lei wrote:
> >>> >> Use this helper, instead of direct access to .bi_vcnt.
> >>> >
> >>> > What we really need to do for the behind IO is:
> >>> > - allocate memory and copy bio data to the memory
> >>> > - let behind bio do IO against the memory
> >>> >
> >>> > The behind bio doesn't need to have exactly the same bio_vec setting. If we
> >>> > just track the new memory, we don't need to use bio_segments_all or access
> >>> > bio_vec either.
> >>>
> >>> But we need to figure out how many vecs (each vec stores one page) need to be
> >>> allocated for the cloned/behind bio, and that is the only value of
> >>> bio_segments_all() here. Or do you have an idea to avoid that?
> >>
> >> As I said, the behind bio doesn't need to have exactly the same bio_vec
> >> setting. We just allocate memory and copy original bio data to the memory,
> >> then do IO against the new memory. The behind bio
> >> segments == (bio->bi_iter.bi_size + PAGE_SIZE - 1) >> PAGE_SHIFT
> >
> > The equation isn't always correct, especially when a bvec covers just
> > part of a page, which happens quite often in the case of mkfs, where one
> > bvec often holds a 512-byte buffer.
>
> Thinking about it further, your idea could be workable and cleaner, but the
> change can be a bit big. It looks like we need to switch write-behind handling
> to the following:
>
> 1) replace bio_clone_bioset_partial() with bio_allocate(nr_vecs), and 'nr_vecs'
> is computed with your equation;
>
> 2) allocate 'nr_vecs' pages once and share them among all created bios in 1)
>
> 3) for each created bio, add each page into the bio via bio_add_page()
>
> 4) only for the 1st created bio, call bio_copy_data() to copy data from
> master bio.
>
> Let me know if you are OK with the above implementation.

Right, this is exactly what I'd like to do. This way we don't need to touch bvec
and it should be much cleaner.

Thanks,
Shaohua
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 316bd6dd6cc1..7396c99ff7b1 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1091,7 +1091,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
 {
 	int i;
 	struct bio_vec *bvec;
-	struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
+	unsigned vcnt = bio_segments_all(bio);
+	struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
 					GFP_NOIO);
 	if (unlikely(!bvecs))
 		return;
@@ -1107,12 +1108,12 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
 		kunmap(bvec->bv_page);
 	}
 	r1_bio->behind_bvecs = bvecs;
-	r1_bio->behind_page_count = bio->bi_vcnt;
+	r1_bio->behind_page_count = vcnt;
 	set_bit(R1BIO_BehindIO, &r1_bio->state);
 	return;
 
 do_sync_io:
-	for (i = 0; i < bio->bi_vcnt; i++)
+	for (i = 0; i < vcnt; i++)
 		if (bvecs[i].bv_page)
 			put_page(bvecs[i].bv_page);
 	kfree(bvecs);
Use this helper, instead of direct access to .bi_vcnt.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)