| Message ID | 20170710072538.GA32208@ming.t460p (mailing list archive) |
| --- | --- |
| State | New, archived |
On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote:
> On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote:
> > On Mon, Jul 10 2017, Ming Lei wrote:
> >
> > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote:
> > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote:
> > ...
> > >> >> +
> > >> >> +	rp->idx = 0;
> > >> >
> > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc().
> > >> > The mempool alloc function is suppose to allocate memory, not initialize
> > >> > it.
> > >> >
> > >> > If the mempool_alloc() call cannot allocate memory it will use memory
> > >> > from the pool. If this memory has already been used, then it will no
> > >> > longer have the initialized value.
> > >> >
> > >> > In short: you need to initialise memory *after* calling
> > >> > mempool_alloc(), unless you ensure it is reset to the init values before
> > >> > calling mempool_free().
> > >> >
> > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307
> > >>
> > >> OK, thanks for posting it out.
> > >>
> > >> Another fix might be to reinitialize the variable(rp->idx = 0) in
> > >> r1buf_pool_free().
> > >> Or just set it as zero every time when it is used.
> > >>
> > >> But I don't understand why mempool_free() calls pool->free() at the end of
> > >> this function, which may cause to run pool->free() on a new allocated buf,
> > >> seems a bug in mempool?
> > >
> > > Looks I missed the 'return' in mempool_free(), so it is fine.
> > >
> > > How about the following fix?
> >
> > It looks like it would probably work, but it is rather unusual to
> > initialise something just before freeing it.
> >
> > Couldn't you just move the initialization to shortly after the
> > mempool_alloc() call. There looks like a good place that already loops
> > over all the bios....
>
> OK, follows the revised patch according to your suggestion.
>
> ---
>
> From 68f9936635b3dda13c87a6b6125ac543145bb940 Mon Sep 17 00:00:00 2001
> From: Ming Lei <ming.lei@redhat.com>
> Date: Mon, 10 Jul 2017 15:16:16 +0800
> Subject: [PATCH] MD: move initialization of resync pages' index out of mempool
>  allocator
>
> mempool_alloc() is only responsible for allocation, not for initialization,
> so we need to move the initialization of resync pages's index out of the
> allocator function.
>
> Reported-by: NeilBrown <neilb@suse.com>
> Fixes: f0250618361d(md: raid10: don't use bio's vec table to manage resync pages)
> Fixes: 98d30c5812c3(md: raid1: don't use bio's vec table to manage resync pages)
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/md/raid1.c  | 4 +++-
>  drivers/md/raid10.c | 6 +++++-
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index e1a7e3d4c5e4..26f5efba0504 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -170,7 +170,6 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
>  			resync_get_all_pages(rp);
>  		}
>
> -		rp->idx = 0;
>  		rp->raid_bio = r1_bio;
>  		bio->bi_private = rp;
>  	}
> @@ -2698,6 +2697,9 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
>  		struct md_rdev *rdev;
>  		bio = r1_bio->bios[i];
>
> +		/* This initialization should follow mempool_alloc() */
> +		get_resync_pages(bio)->idx = 0;
> +

This is fragile and hard to maintain. Can we add a wrap for the
allocation/init?

Thanks,
Shaohua

>  		rdev = rcu_dereference(conf->mirrors[i].rdev);
>  		if (rdev == NULL ||
>  		    test_bit(Faulty, &rdev->flags)) {
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 797ed60abd5e..5ebcb7487284 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -221,7 +221,6 @@ static void * r10buf_pool_alloc(gfp_t gfp_flags, void *data)
>  			resync_get_all_pages(rp);
>  		}
>
> -		rp->idx = 0;
>  		rp->raid_bio = r10_bio;
>  		bio->bi_private = rp;
>  		if (rbio) {
> @@ -3095,6 +3094,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  		bio = r10_bio->devs[0].bio;
>  		bio->bi_next = biolist;
>  		biolist = bio;
> +		get_resync_pages(bio)->idx = 0;
>  		bio->bi_end_io = end_sync_read;
>  		bio_set_op_attrs(bio, REQ_OP_READ, 0);
>  		if (test_bit(FailFast, &rdev->flags))
> @@ -3120,6 +3120,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  		bio = r10_bio->devs[1].bio;
>  		bio->bi_next = biolist;
>  		biolist = bio;
> +		get_resync_pages(bio)->idx = 0;
>  		bio->bi_end_io = end_sync_write;
>  		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>  		bio->bi_iter.bi_sector = to_addr
> @@ -3146,6 +3147,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  			break;
>  		bio->bi_next = biolist;
>  		biolist = bio;
> +		get_resync_pages(bio)->idx = 0;
>  		bio->bi_end_io = end_sync_write;
>  		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>  		bio->bi_iter.bi_sector = to_addr +
> @@ -3291,6 +3293,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  		atomic_inc(&r10_bio->remaining);
>  		bio->bi_next = biolist;
>  		biolist = bio;
> +		get_resync_pages(bio)->idx = 0;
>  		bio->bi_end_io = end_sync_read;
>  		bio_set_op_attrs(bio, REQ_OP_READ, 0);
>  		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
> @@ -3314,6 +3317,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  		sector = r10_bio->devs[i].addr;
>  		bio->bi_next = biolist;
>  		biolist = bio;
> +		get_resync_pages(bio)->idx = 0;
>  		bio->bi_end_io = end_sync_write;
>  		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
>  		if (test_bit(FailFast, &conf->mirrors[d].rdev->flags))
> --
> 2.9.4
>
>
>
> --
> Ming
On Mon, Jul 10, 2017 at 12:05:49PM -0700, Shaohua Li wrote: > On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote: > > On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote: > > > On Mon, Jul 10 2017, Ming Lei wrote: > > > > > > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote: > > > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote: > > > ... > > > >> >> + > > > >> >> + rp->idx = 0; > > > >> > > > > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc(). > > > >> > The mempool alloc function is suppose to allocate memory, not initialize > > > >> > it. > > > >> > > > > >> > If the mempool_alloc() call cannot allocate memory it will use memory > > > >> > from the pool. If this memory has already been used, then it will no > > > >> > longer have the initialized value. > > > >> > > > > >> > In short: you need to initialise memory *after* calling > > > >> > mempool_alloc(), unless you ensure it is reset to the init values before > > > >> > calling mempool_free(). > > > >> > > > > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307 > > > >> > > > >> OK, thanks for posting it out. > > > >> > > > >> Another fix might be to reinitialize the variable(rp->idx = 0) in > > > >> r1buf_pool_free(). > > > >> Or just set it as zero every time when it is used. > > > >> > > > >> But I don't understand why mempool_free() calls pool->free() at the end of > > > >> this function, which may cause to run pool->free() on a new allocated buf, > > > >> seems a bug in mempool? > > > > > > > > Looks I missed the 'return' in mempool_free(), so it is fine. > > > > > > > > How about the following fix? > > > > > > It looks like it would probably work, but it is rather unusual to > > > initialise something just before freeing it. > > > > > > Couldn't you just move the initialization to shortly after the > > > mempool_alloc() call. There looks like a good place that already loops > > > over all the bios.... 
> > > > OK, follows the revised patch according to your suggestion. > > --- > > > > From 68f9936635b3dda13c87a6b6125ac543145bb940 Mon Sep 17 00:00:00 2001 > > From: Ming Lei <ming.lei@redhat.com> > > Date: Mon, 10 Jul 2017 15:16:16 +0800 > > Subject: [PATCH] MD: move initialization of resync pages' index out of mempool > > allocator > > > > mempool_alloc() is only responsible for allocation, not for initialization, > > so we need to move the initialization of resync pages's index out of the > > allocator function. > > > > Reported-by: NeilBrown <neilb@suse.com> > > Fixes: f0250618361d(md: raid10: don't use bio's vec table to manage resync pages) > > Fixes: 98d30c5812c3(md: raid1: don't use bio's vec table to manage resync pages) > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > --- > > drivers/md/raid1.c | 4 +++- > > drivers/md/raid10.c | 6 +++++- > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > > index e1a7e3d4c5e4..26f5efba0504 100644 > > --- a/drivers/md/raid1.c > > +++ b/drivers/md/raid1.c > > @@ -170,7 +170,6 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data) > > resync_get_all_pages(rp); > > } > > > > - rp->idx = 0; > > rp->raid_bio = r1_bio; > > bio->bi_private = rp; > > } > > @@ -2698,6 +2697,9 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr, > > struct md_rdev *rdev; > > bio = r1_bio->bios[i]; > > > > + /* This initialization should follow mempool_alloc() */ > > + get_resync_pages(bio)->idx = 0; > > + > > This is fragile and hard to maintain. Can we add a wrap for the > allocation/init? The resync pages's index init is done in the following loop after mempool_alloc(), actually the whole loop is still sort of init, so I think it isn't fragile, and it should be fine.
On Mon, Jul 10 2017, Shaohua Li wrote: > On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote: >> On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote: >> > On Mon, Jul 10 2017, Ming Lei wrote: >> > >> > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote: >> > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote: >> > ... >> > >> >> + >> > >> >> + rp->idx = 0; >> > >> > >> > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc(). >> > >> > The mempool alloc function is suppose to allocate memory, not initialize >> > >> > it. >> > >> > >> > >> > If the mempool_alloc() call cannot allocate memory it will use memory >> > >> > from the pool. If this memory has already been used, then it will no >> > >> > longer have the initialized value. >> > >> > >> > >> > In short: you need to initialise memory *after* calling >> > >> > mempool_alloc(), unless you ensure it is reset to the init values before >> > >> > calling mempool_free(). >> > >> > >> > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307 >> > >> >> > >> OK, thanks for posting it out. >> > >> >> > >> Another fix might be to reinitialize the variable(rp->idx = 0) in >> > >> r1buf_pool_free(). >> > >> Or just set it as zero every time when it is used. >> > >> >> > >> But I don't understand why mempool_free() calls pool->free() at the end of >> > >> this function, which may cause to run pool->free() on a new allocated buf, >> > >> seems a bug in mempool? >> > > >> > > Looks I missed the 'return' in mempool_free(), so it is fine. >> > > >> > > How about the following fix? >> > >> > It looks like it would probably work, but it is rather unusual to >> > initialise something just before freeing it. >> > >> > Couldn't you just move the initialization to shortly after the >> > mempool_alloc() call. There looks like a good place that already loops >> > over all the bios.... >> >> OK, follows the revised patch according to your suggestion. Thanks. 
That isn't as tidy as I hoped. So I went deeper into the code to try to
understand why...

I think that maybe we should just discard the ->idx field completely.
It is only used in this code:

	do {
		struct page *page;
		int len = PAGE_SIZE;
		if (sector_nr + (len>>9) > max_sector)
			len = (max_sector - sector_nr) << 9;
		if (len == 0)
			break;
		for (bio= biolist ; bio ; bio=bio->bi_next) {
			struct resync_pages *rp = get_resync_pages(bio);
			page = resync_fetch_page(rp, rp->idx++);
			/*
			 * won't fail because the vec table is big enough
			 * to hold all these pages
			 */
			bio_add_page(bio, page, len, 0);
		}
		nr_sectors += len>>9;
		sector_nr += len>>9;
	} while (get_resync_pages(biolist)->idx < RESYNC_PAGES);

and all of the different 'rp' always have the same value for 'idx'.
This code is more complex than it needs to be. This is because it used
to be possible for bio_add_page() to fail. That cannot happen any more.
So we can make the code something like:

	for (idx = 0; idx < RESYNC_PAGES; idx++) {
		struct page *page;
		int len = PAGE_SIZE;
		if (sector_nr + (len >> 9) > max_sector)
			len = (max_sector - sector_nr) << 9;
		if (len == 0)
			break;
		for (bio = biolist; bio; bio = bio->bi_next) {
			struct resync_pages *rp = get_resync_pages(bio);
			page = resync_fetch_page(rp, idx);
			bio_add_page(bio, page, len, 0);
		}
		nr_sectors += len >> 9;
		sector_nr += len >> 9;
	}

Or did I miss something?

Thanks,
NeilBrown
On Tue, Jul 11, 2017 at 7:14 AM, NeilBrown <neilb@suse.com> wrote: > On Mon, Jul 10 2017, Shaohua Li wrote: > >> On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote: >>> On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote: >>> > On Mon, Jul 10 2017, Ming Lei wrote: >>> > >>> > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote: >>> > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote: >>> > ... >>> > >> >> + >>> > >> >> + rp->idx = 0; >>> > >> > >>> > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc(). >>> > >> > The mempool alloc function is suppose to allocate memory, not initialize >>> > >> > it. >>> > >> > >>> > >> > If the mempool_alloc() call cannot allocate memory it will use memory >>> > >> > from the pool. If this memory has already been used, then it will no >>> > >> > longer have the initialized value. >>> > >> > >>> > >> > In short: you need to initialise memory *after* calling >>> > >> > mempool_alloc(), unless you ensure it is reset to the init values before >>> > >> > calling mempool_free(). >>> > >> > >>> > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307 >>> > >> >>> > >> OK, thanks for posting it out. >>> > >> >>> > >> Another fix might be to reinitialize the variable(rp->idx = 0) in >>> > >> r1buf_pool_free(). >>> > >> Or just set it as zero every time when it is used. >>> > >> >>> > >> But I don't understand why mempool_free() calls pool->free() at the end of >>> > >> this function, which may cause to run pool->free() on a new allocated buf, >>> > >> seems a bug in mempool? >>> > > >>> > > Looks I missed the 'return' in mempool_free(), so it is fine. >>> > > >>> > > How about the following fix? >>> > >>> > It looks like it would probably work, but it is rather unusual to >>> > initialise something just before freeing it. >>> > >>> > Couldn't you just move the initialization to shortly after the >>> > mempool_alloc() call. 
There looks like a good place that already loops >>> > over all the bios.... >>> >>> OK, follows the revised patch according to your suggestion. > > Thanks. > > That isn't as tidy as I hoped. So I went deeper into the code to try to > understand why... > > I think that maybe we should just discard the ->idx field completely. > It is only used in this code: > > do { > struct page *page; > int len = PAGE_SIZE; > if (sector_nr + (len>>9) > max_sector) > len = (max_sector - sector_nr) << 9; > if (len == 0) > break; > for (bio= biolist ; bio ; bio=bio->bi_next) { > struct resync_pages *rp = get_resync_pages(bio); > page = resync_fetch_page(rp, rp->idx++); > /* > * won't fail because the vec table is big enough > * to hold all these pages > */ > bio_add_page(bio, page, len, 0); > } > nr_sectors += len>>9; > sector_nr += len>>9; > } while (get_resync_pages(biolist)->idx < RESYNC_PAGES); > > and all of the different 'rp' always have the same value for 'idx'. > This code is more complex than it needs to be. This is because it used > to be possible for bio_add_page() to fail. That cannot happen any more. > So we can make the code something like: > > for (idx = 0; idx < RESYNC_PAGES; idx++) { > struct page *page; > int len = PAGE_SIZE; > if (sector_nr + (len >> 9) > max_sector) > len = (max_sector - sector_nr) << 9 > if (len == 0) > break; > for (bio = biolist; bio; bio = bio->bi_next) { > struct resync_pages *rp = get_resync_pages(bio); > page = resync_fetch_page(rp, idx); > bio_add_page(bio, page, len, 0); > } > nr_sectors += len >> 9; > sector_nr += len >> 9; > } > > Or did I miss something? I think this approach is much clean. -- Ming Lei
On Wed, Jul 12, 2017 at 09:40:10AM +0800, Ming Lei wrote: > On Tue, Jul 11, 2017 at 7:14 AM, NeilBrown <neilb@suse.com> wrote: > > On Mon, Jul 10 2017, Shaohua Li wrote: > > > >> On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote: > >>> On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote: > >>> > On Mon, Jul 10 2017, Ming Lei wrote: > >>> > > >>> > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote: > >>> > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote: > >>> > ... > >>> > >> >> + > >>> > >> >> + rp->idx = 0; > >>> > >> > > >>> > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc(). > >>> > >> > The mempool alloc function is suppose to allocate memory, not initialize > >>> > >> > it. > >>> > >> > > >>> > >> > If the mempool_alloc() call cannot allocate memory it will use memory > >>> > >> > from the pool. If this memory has already been used, then it will no > >>> > >> > longer have the initialized value. > >>> > >> > > >>> > >> > In short: you need to initialise memory *after* calling > >>> > >> > mempool_alloc(), unless you ensure it is reset to the init values before > >>> > >> > calling mempool_free(). > >>> > >> > > >>> > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307 > >>> > >> > >>> > >> OK, thanks for posting it out. > >>> > >> > >>> > >> Another fix might be to reinitialize the variable(rp->idx = 0) in > >>> > >> r1buf_pool_free(). > >>> > >> Or just set it as zero every time when it is used. > >>> > >> > >>> > >> But I don't understand why mempool_free() calls pool->free() at the end of > >>> > >> this function, which may cause to run pool->free() on a new allocated buf, > >>> > >> seems a bug in mempool? > >>> > > > >>> > > Looks I missed the 'return' in mempool_free(), so it is fine. > >>> > > > >>> > > How about the following fix? > >>> > > >>> > It looks like it would probably work, but it is rather unusual to > >>> > initialise something just before freeing it. 
> >>> > > >>> > Couldn't you just move the initialization to shortly after the > >>> > mempool_alloc() call. There looks like a good place that already loops > >>> > over all the bios.... > >>> > >>> OK, follows the revised patch according to your suggestion. > > > > Thanks. > > > > That isn't as tidy as I hoped. So I went deeper into the code to try to > > understand why... > > > > I think that maybe we should just discard the ->idx field completely. > > It is only used in this code: > > > > do { > > struct page *page; > > int len = PAGE_SIZE; > > if (sector_nr + (len>>9) > max_sector) > > len = (max_sector - sector_nr) << 9; > > if (len == 0) > > break; > > for (bio= biolist ; bio ; bio=bio->bi_next) { > > struct resync_pages *rp = get_resync_pages(bio); > > page = resync_fetch_page(rp, rp->idx++); > > /* > > * won't fail because the vec table is big enough > > * to hold all these pages > > */ > > bio_add_page(bio, page, len, 0); > > } > > nr_sectors += len>>9; > > sector_nr += len>>9; > > } while (get_resync_pages(biolist)->idx < RESYNC_PAGES); > > > > and all of the different 'rp' always have the same value for 'idx'. > > This code is more complex than it needs to be. This is because it used > > to be possible for bio_add_page() to fail. That cannot happen any more. > > So we can make the code something like: > > > > for (idx = 0; idx < RESYNC_PAGES; idx++) { > > struct page *page; > > int len = PAGE_SIZE; > > if (sector_nr + (len >> 9) > max_sector) > > len = (max_sector - sector_nr) << 9 > > if (len == 0) > > break; > > for (bio = biolist; bio; bio = bio->bi_next) { > > struct resync_pages *rp = get_resync_pages(bio); > > page = resync_fetch_page(rp, idx); > > bio_add_page(bio, page, len, 0); > > } > > nr_sectors += len >> 9; > > sector_nr += len >> 9; > > } > > > > Or did I miss something? > > I think this approach is much clean. 
I thought I suggested not using the 'idx' in your previous post, but you said
there was a reason (not because of bio_add_page) not to do it. Has that
changed? I can't remember the details; I need to dig through the mail
archives.

Thanks,
Shaohua
On Wed, Jul 12, 2017 at 09:30:50AM -0700, Shaohua Li wrote: > On Wed, Jul 12, 2017 at 09:40:10AM +0800, Ming Lei wrote: > > On Tue, Jul 11, 2017 at 7:14 AM, NeilBrown <neilb@suse.com> wrote: > > > On Mon, Jul 10 2017, Shaohua Li wrote: > > > > > >> On Mon, Jul 10, 2017 at 03:25:41PM +0800, Ming Lei wrote: > > >>> On Mon, Jul 10, 2017 at 02:38:19PM +1000, NeilBrown wrote: > > >>> > On Mon, Jul 10 2017, Ming Lei wrote: > > >>> > > > >>> > > On Mon, Jul 10, 2017 at 11:35:12AM +0800, Ming Lei wrote: > > >>> > >> On Mon, Jul 10, 2017 at 7:09 AM, NeilBrown <neilb@suse.com> wrote: > > >>> > ... > > >>> > >> >> + > > >>> > >> >> + rp->idx = 0; > > >>> > >> > > > >>> > >> > This is the only place the ->idx is initialized, in r1buf_pool_alloc(). > > >>> > >> > The mempool alloc function is suppose to allocate memory, not initialize > > >>> > >> > it. > > >>> > >> > > > >>> > >> > If the mempool_alloc() call cannot allocate memory it will use memory > > >>> > >> > from the pool. If this memory has already been used, then it will no > > >>> > >> > longer have the initialized value. > > >>> > >> > > > >>> > >> > In short: you need to initialise memory *after* calling > > >>> > >> > mempool_alloc(), unless you ensure it is reset to the init values before > > >>> > >> > calling mempool_free(). > > >>> > >> > > > >>> > >> > https://bugzilla.kernel.org/show_bug.cgi?id=196307 > > >>> > >> > > >>> > >> OK, thanks for posting it out. > > >>> > >> > > >>> > >> Another fix might be to reinitialize the variable(rp->idx = 0) in > > >>> > >> r1buf_pool_free(). > > >>> > >> Or just set it as zero every time when it is used. > > >>> > >> > > >>> > >> But I don't understand why mempool_free() calls pool->free() at the end of > > >>> > >> this function, which may cause to run pool->free() on a new allocated buf, > > >>> > >> seems a bug in mempool? > > >>> > > > > >>> > > Looks I missed the 'return' in mempool_free(), so it is fine. > > >>> > > > > >>> > > How about the following fix? 
> > >>> > > > >>> > It looks like it would probably work, but it is rather unusual to > > >>> > initialise something just before freeing it. > > >>> > > > >>> > Couldn't you just move the initialization to shortly after the > > >>> > mempool_alloc() call. There looks like a good place that already loops > > >>> > over all the bios.... > > >>> > > >>> OK, follows the revised patch according to your suggestion. > > > > > > Thanks. > > > > > > That isn't as tidy as I hoped. So I went deeper into the code to try to > > > understand why... > > > > > > I think that maybe we should just discard the ->idx field completely. > > > It is only used in this code: > > > > > > do { > > > struct page *page; > > > int len = PAGE_SIZE; > > > if (sector_nr + (len>>9) > max_sector) > > > len = (max_sector - sector_nr) << 9; > > > if (len == 0) > > > break; > > > for (bio= biolist ; bio ; bio=bio->bi_next) { > > > struct resync_pages *rp = get_resync_pages(bio); > > > page = resync_fetch_page(rp, rp->idx++); > > > /* > > > * won't fail because the vec table is big enough > > > * to hold all these pages > > > */ > > > bio_add_page(bio, page, len, 0); > > > } > > > nr_sectors += len>>9; > > > sector_nr += len>>9; > > > } while (get_resync_pages(biolist)->idx < RESYNC_PAGES); > > > > > > and all of the different 'rp' always have the same value for 'idx'. > > > This code is more complex than it needs to be. This is because it used > > > to be possible for bio_add_page() to fail. That cannot happen any more. 
> > > So we can make the code something like: > > > > > > for (idx = 0; idx < RESYNC_PAGES; idx++) { > > > struct page *page; > > > int len = PAGE_SIZE; > > > if (sector_nr + (len >> 9) > max_sector) > > > len = (max_sector - sector_nr) << 9 > > > if (len == 0) > > > break; > > > for (bio = biolist; bio; bio = bio->bi_next) { > > > struct resync_pages *rp = get_resync_pages(bio); > > > page = resync_fetch_page(rp, idx); > > > bio_add_page(bio, page, len, 0); > > > } > > > nr_sectors += len >> 9; > > > sector_nr += len >> 9; > > > } > > > > > > Or did I miss something? > > > > I think this approach is much clean. > > Thought I suggested not using the 'idx' in your previous post, but you said > there is reason (not because of bio_add_page) not to do it. Is that changed? > can't remember the details, I need to dig the mail archives. I found it: http://marc.info/?l=linux-raid&m=148847751302825&w=2 Not sure why I didn't change to this way in v3, but the idea is correct. Maybe I misunderstood it that time.