diff mbox

fix mirror device creation with lvcreate failed

Message ID 1409127505-5246-1-git-send-email-lwang@suse.com (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Liuhua Wang Aug. 27, 2014, 8:18 a.m. UTC
On some devices, when create mirror device with lvcreate, it fails.
 device-mapper: reload ioctl on  failed: Invalid argument
 Failed to activate new LV.

 We can use read size 4096 since md always use 4096.

Signed-off-by: Liuhua Wang <lwang@suse.com>
---
 drivers/md/dm-raid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Alasdair G Kergon Sept. 16, 2014, 11:21 p.m. UTC | #1
On Wed, Aug 27, 2014 at 04:18:25PM +0800, Liuhua Wang wrote:
>  On some devices, 

Which devices?  What is the necessary condition to trigger the problem?

> when create mirror device with lvcreate, it fails.
>  device-mapper: reload ioctl on  failed: Invalid argument
>  Failed to activate new LV.
 
What exactly causes this error?
- Do you get a kernel error message too?

>  We can use read size 4096 since md always use 4096.
 
How is that connected to problem?
Why does it fix it?

> +++ b/drivers/md/dm-raid.c
> @@ -860,7 +860,7 @@ static int super_load(struct md_rdev *rdev, struct md_rdev *refdev)
>  	rdev->sb_start = 0;
>  	rdev->sb_size = sizeof(*sb);
  
What should sb_size really be?

> -	ret = read_disk_sb(rdev, rdev->sb_size);
> +	ret = read_disk_sb(rdev, 4096);

Why aren't they both MD_SB_BYTES?
(Strictly, does sb_size need recalculating after the read?)

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Sept. 16, 2014, 11:35 p.m. UTC | #2
On Wed, Sep 17, 2014 at 12:21:04AM +0100, Alasdair G Kergon wrote:
> >  	rdev->sb_size = sizeof(*sb);
> What should sb_size really be?
> > -	ret = read_disk_sb(rdev, rdev->sb_size);
 
Or are the sizes in the existing code correct (the dm sb is always 512 bytes?), 
with the problem coming from something in sync_page_io()?

(4k sector device? large page size? incorrectly uninitialised field?)

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Sept. 16, 2014, 11:40 p.m. UTC | #3
On Wed, Sep 17, 2014 at 12:35:29AM +0100, Alasdair G Kergon wrote:
> > > -	ret = read_disk_sb(rdev, rdev->sb_size);

And if that needed changing to 4096, why does this similar instance elsewhere
one not also need changing?

  static void attempt_restore_of_faulty_devices(struct raid_set *rs)
  {
  ...
                    sync_page_io(r, 0, r->sb_size, r->sb_page, READ, 1)) {

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Liuhua Wang Sept. 17, 2014, 5:11 a.m. UTC | #4
Hi, Alasdair

>>> On 9/17/2014 at 07:21 AM, in message
<20140916232104.GC2407@agk-dp.fab.redhat.com>, Alasdair G Kergon
<agk@redhat.com> wrote: 
> On Wed, Aug 27, 2014 at 04:18:25PM +0800, Liuhua Wang wrote:
> >  On some devices, 
> 
> Which devices?  What is the necessary condition to trigger the problem?
> 
When using dasd device formated with bocksize of 4096bytes.
the problem occurs, while formated with blocksie of 512bytes,
no problem.

> > when create mirror device with lvcreate, it fails.
> >  device-mapper: reload ioctl on  failed: Invalid argument
> >  Failed to activate new LV.
>  
> What exactly causes this error?
> - Do you get a kernel error message too?
There are error messages:
---------------------
[22798.077877] end_request: I/O error, dev dasdd, sector 2240
[22798.077902] device-mapper: raid: Failed to read superblock of device at position 1
[23087.875420] type=1006 audit(1404313201.346:28): pid=1596 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=27 res=1
[23127.707296] dasd-eckd.d2bf98: 0.0.141d: ERP 0000000036ab1e40 has run out of retries and failed
----------------------

> >  We can use read size 4096 since md always use 4096.
>  
> How is that connected to problem?
> Why does it fix it?
> 
> > +++ b/drivers/md/dm-raid.c
> > @@ -860,7 +860,7 @@ static int super_load(struct md_rdev *rdev, struct 
> md_rdev *refdev)
> >  	rdev->sb_start = 0;
> >  	rdev->sb_size = sizeof(*sb);
>   
> What should sb_size really be?
512bytes
> 
> > -	ret = read_disk_sb(rdev, rdev->sb_size);
> > +	ret = read_disk_sb(rdev, 4096);
> 
> Why aren't they both MD_SB_BYTES?
> (Strictly, does sb_size need recalculating after the read?)
> 
> Alasdair
> 

Best regards,
Liuhua 



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Sept. 17, 2014, 11:41 a.m. UTC | #5
On Tue, Sep 16, 2014 at 11:11:35PM -0600, Liuhua Wang wrote:
> There are error messages:
> ---------------------
> [22798.077877] end_request: I/O error, dev dasdd, sector 2240
> [22798.077902] device-mapper: raid: Failed to read superblock of device at position 1
> [23087.875420] type=1006 audit(1404313201.346:28): pid=1596 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=27 res=1
> [23127.707296] dasd-eckd.d2bf98: 0.0.141d: ERP 0000000036ab1e40 has run out of retries and failed
> ----------------------
 
OK, so it sounds like Neil's patch will be enough to fix this.
Can you test that and confirm it solves the problem?

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
NeilBrown Sept. 22, 2014, 1:20 a.m. UTC | #6
On Thu, 18 Sep 2014 00:56:33 -0600 "Liuhua Wang" <lwang@suse.com> wrote:

> Hi, Alasdair
> Hi, Neil
> 
> >>> On 9/17/2014 at 07:41 PM, in message
> <20140917114151.GF2407@agk-dp.fab.redhat.com>, Alasdair G Kergon
> <agk@redhat.com> wrote: 
> > On Tue, Sep 16, 2014 at 11:11:35PM -0600, Liuhua Wang wrote:
> > > There are error messages:
> > > ---------------------
> > > [22798.077877] end_request: I/O error, dev dasdd, sector 2240
> > > [22798.077902] device-mapper: raid: Failed to read superblock of device at 
> > position 1
> > > [23087.875420] type=1006 audit(1404313201.346:28): pid=1596 uid=0 old 
> > auid=4294967295 new auid=0 old ses=4294967295 new ses=27 res=1
> > > [23127.707296] dasd-eckd.d2bf98: 0.0.141d: ERP 0000000036ab1e40 has run out 
> > of retries and failed
> > > ----------------------
> >  
> > OK, so it sounds like Neil's patch will be enough to fix this.
> > Can you test that and confirm it solves the problem?
> > 
> 
> I tested Neil's patch, lvcreate succeed, the lv created can display,
> be written successfully, but only there still are error messages:
> --------------------------------
> 2014-09-18T06:39:52.653239+02:00 s390vsl158 kernel: device-mapper: raid: Superblocks created for new array
> 2014-09-18T06:39:52.663361+02:00 s390vsl158 kernel: md/raid1:mdX: not clean -- starting background reconstruction
> 2014-09-18T06:39:52.663366+02:00 s390vsl158 kernel: md/raid1:mdX: active with 2 out of 2 mirrors
> 2014-09-18T06:39:52.663366+02:00 s390vsl158 kernel: Choosing daemon_sleep default (5 sec)
> 2014-09-18T06:39:52.663367+02:00 s390vsl158 kernel: created bitmap (1 pages) for device mdX
> 2014-09-18T06:39:52.663368+02:00 s390vsl158 kernel: end_request: I/O error, dev dasdb, sector 19202144
> 2014-09-18T06:39:52.663369+02:00 s390vsl158 kernel: md: super_written gets error=-5, uptodate=0
> 2014-09-18T06:39:52.663369+02:00 s390vsl158 kernel: md/raid1:mdX: Disk failure on dm-3, disabling device.
> -------------------------
> 
> So I add also roundup to md_super_write() as following,
> then the above error messages disappears.
> --------------------------
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1294238..18cd2e7 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -754,6 +754,8 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
>  	struct bio *bio = bio_alloc_mddev(GFP_NOIO, 1, mddev);
>  
>  	bio->bi_bdev = rdev->meta_bdev ? rdev->meta_bdev : rdev->bdev;
> +        size = roundup(size, bdev_logical_block_size(bio->bi_bdev));
> +	
>  	bio->bi_iter.bi_sector = sector;
>  	bio_add_page(bio, page, size, 0);
>  	bio->bi_private = rdev;

I'm not really comfortable with this change.

It is always safe to read more than requested.  Writing more than requested
is quite different.

I think I would rather that rdev->sb_size were rounded up by the code that
sets it.  That would make it clear to anyone reading the code that a full
block was always written.

Jon: how does lvm allocate space for the dm-raid metadata?  Does it always
allocate some minimum?  Is the allocation aware of underlying block size?
Is it reasonable to round up rdev->sb_size in load_super() ??


Thanks,
NeilBrown


> @@ -784,6 +786,10 @@ int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
>  
>  	bio->bi_bdev = (metadata_op && rdev->meta_bdev) ?
>  		rdev->meta_bdev : rdev->bdev;
> +        size = roundup(size, bdev_logical_block_size(bio->bi_bdev));
> +        if (size > PAGE_SIZE)
> +                return -EINVAL;
> +
>  	if (metadata_op)
>  		bio->bi_iter.bi_sector = sector + rdev->sb_start;
>  	else if (rdev->mddev->reshape_position != MaxSector &&
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 4880b69..621afbb 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -860,7 +860,7 @@  static int super_load(struct md_rdev *rdev, struct md_rdev *refdev)
 	rdev->sb_start = 0;
 	rdev->sb_size = sizeof(*sb);
 
-	ret = read_disk_sb(rdev, rdev->sb_size);
+	ret = read_disk_sb(rdev, 4096);
 	if (ret)
 		return ret;