diff mbox

rbd: revalidate_disk upon rbd resize

Message ID 1365609989-14493-1-git-send-email-laurent@ksperis.com (mailing list archive)
State New, archived
Headers show

Commit Message

Laurent Barbe April 10, 2013, 4:06 p.m. UTC
If rbd disk is open and rbd resize is done, new size is not visible by
filesystem.
Like is done in virtio-blk and dm driver, revalidate_disk() permits to
update the bd_inode size.

Signed-off-by: Laurent Barbe <laurent@ksperis.com>
---
 drivers/block/rbd.c |    1 +
 1 file changed, 1 insertion(+)

Comments

Alex Elder April 10, 2013, 7:30 p.m. UTC | #1
On 04/10/2013 11:06 AM, Laurent Barbe wrote:
> If rbd disk is open and rbd resize is done, new size is not visible by
> filesystem.
> Like is done in virtio-blk and dm driver, revalidate_disk() permits to
> update the bd_inode size.

Looks good to me.  I'll take this in via the ceph-client tree.
Thanks a lot.

Reviewed-by: Alex Elder <elder@inktank.com>

> Signed-off-by: Laurent Barbe <laurent@ksperis.com>
> ---
>  drivers/block/rbd.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index f556f8a..1963025 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -2293,6 +2293,7 @@ static void rbd_update_mapping_size(struct rbd_device *rbd_dev)
>  	dout("setting size to %llu sectors", (unsigned long long) size);
>  	rbd_dev->mapping.size = (u64) size;
>  	set_capacity(rbd_dev->disk, size);
> +	revalidate_disk(rbd_dev->disk);
>  }
>  
>  /*
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elder April 11, 2013, 2:53 a.m. UTC | #2
On 04/10/2013 02:30 PM, Alex Elder wrote:
> On 04/10/2013 11:06 AM, Laurent Barbe wrote:
>> If rbd disk is open and rbd resize is done, new size is not visible by
>> filesystem.
>> Like is done in virtio-blk and dm driver, revalidate_disk() permits to
>> update the bd_inode size.
> 
> Looks good to me.  I'll take this in via the ceph-client tree.
> Thanks a lot.

Unfortunately this leads to a lock inversion.  I'm going to
think about how to go about resolving it, so I won't be
committing it just yet.

					-Alex

> 
> Reviewed-by: Alex Elder <elder@inktank.com>
> 
>> Signed-off-by: Laurent Barbe <laurent@ksperis.com>
>> ---
>>  drivers/block/rbd.c |    1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index f556f8a..1963025 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -2293,6 +2293,7 @@ static void rbd_update_mapping_size(struct rbd_device *rbd_dev)
>>  	dout("setting size to %llu sectors", (unsigned long long) size);
>>  	rbd_dev->mapping.size = (u64) size;
>>  	set_capacity(rbd_dev->disk, size);
>> +	revalidate_disk(rbd_dev->disk);
>>  }
>>  
>>  /*
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elder April 11, 2013, 1:27 p.m. UTC | #3
On 04/11/2013 03:47 AM, Laurent Barbe wrote:
> Thanks Alex,
> 
> What do you mean about "lock inversion", deadlock ?
> Is it the mutex_lock in block_dev.c:revalidate_disk() and the mutex
> in rbd_dev_refresh ?

Yes.

When we refresh an rbd device we get the rbd device header
semaphore, and with this patch we then acquire the bdev's
mutex.

Meanhwhile, via blkdev_close() __blkdev_put() acquires the
bdev->bd_mutex, and eventually we get down to creating an
image object request.  If it's a write, we need to get the
snapshot context, and to do that we get the rbd device
header mutex.

So we're acquiring the locks in two different orders, and
that's what I meant by "lock inversion," and yes, that
can lead to deadlock.

There may be a simple fix--like holding off calling
revalidate_disk() until after we release the lock,
most likely in rbd_dev_refresh().  But I just haven't
really thought it through yet.

					-Alex

> 
> 2013/4/11 Alex Elder <elder@inktank.com>
> 
>> On 04/10/2013 02:30 PM, Alex Elder wrote:
>>> On 04/10/2013 11:06 AM, Laurent Barbe wrote:
>>>> If rbd disk is open and rbd resize is done, new size is not visible by
>>>> filesystem.
>>>> Like is done in virtio-blk and dm driver, revalidate_disk() permits to
>>>> update the bd_inode size.
>>>
>>> Looks good to me.  I'll take this in via the ceph-client tree.
>>> Thanks a lot.
>>
>> Unfortunately this leads to a lock inversion.  I'm going to
>> think about how to go about resolving it, so I won't be
>> committing it just yet.
>>
>>                                         -Alex
>>
>>>
>>> Reviewed-by: Alex Elder <elder@inktank.com>
>>>
>>>> Signed-off-by: Laurent Barbe <laurent@ksperis.com>
>>>> ---
>>>>  drivers/block/rbd.c |    1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>> index f556f8a..1963025 100644
>>>> --- a/drivers/block/rbd.c
>>>> +++ b/drivers/block/rbd.c
>>>> @@ -2293,6 +2293,7 @@ static void rbd_update_mapping_size(struct
>> rbd_device *rbd_dev)
>>>>      dout("setting size to %llu sectors", (unsigned long long) size);
>>>>      rbd_dev->mapping.size = (u64) size;
>>>>      set_capacity(rbd_dev->disk, size);
>>>> +    revalidate_disk(rbd_dev->disk);
>>>>  }
>>>>
>>>>  /*
>>>>
>>>
>>
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Laurent Barbe April 11, 2013, 1:59 p.m. UTC | #4
Ok,
thank you for your explanation, I'll look a little later.

2013/4/11 Alex Elder <elder@inktank.com>:
> On 04/11/2013 03:47 AM, Laurent Barbe wrote:
>> Thanks Alex,
>>
>> What do you mean about "lock inversion", deadlock ?
>> Is it the mutex_lock in block_dev.c:revalidate_disk() and the mutex
>> in rbd_dev_refresh ?
>
> Yes.
>
> When we refresh an rbd device we get the rbd device header
> semaphore, and with this patch we then acquire the bdev's
> mutex.
>
> Meanhwhile, via blkdev_close() __blkdev_put() acquires the
> bdev->bd_mutex, and eventually we get down to creating an
> image object request.  If it's a write, we need to get the
> snapshot context, and to do that we get the rbd device
> header mutex.
>
> So we're acquiring the locks in two different orders, and
> that's what I meant by "lock inversion," and yes, that
> can lead to deadlock.
>
> There may be a simple fix--like holding off calling
> revalidate_disk() until after we release the lock,
> most likely in rbd_dev_refresh().  But I just haven't
> really thought it through yet.
>
>                                         -Alex
>
>>
>> 2013/4/11 Alex Elder <elder@inktank.com>
>>
>>> On 04/10/2013 02:30 PM, Alex Elder wrote:
>>>> On 04/10/2013 11:06 AM, Laurent Barbe wrote:
>>>>> If rbd disk is open and rbd resize is done, new size is not visible by
>>>>> filesystem.
>>>>> Like is done in virtio-blk and dm driver, revalidate_disk() permits to
>>>>> update the bd_inode size.
>>>>
>>>> Looks good to me.  I'll take this in via the ceph-client tree.
>>>> Thanks a lot.
>>>
>>> Unfortunately this leads to a lock inversion.  I'm going to
>>> think about how to go about resolving it, so I won't be
>>> committing it just yet.
>>>
>>>                                         -Alex
>>>
>>>>
>>>> Reviewed-by: Alex Elder <elder@inktank.com>
>>>>
>>>>> Signed-off-by: Laurent Barbe <laurent@ksperis.com>
>>>>> ---
>>>>>  drivers/block/rbd.c |    1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>>> index f556f8a..1963025 100644
>>>>> --- a/drivers/block/rbd.c
>>>>> +++ b/drivers/block/rbd.c
>>>>> @@ -2293,6 +2293,7 @@ static void rbd_update_mapping_size(struct
>>> rbd_device *rbd_dev)
>>>>>      dout("setting size to %llu sectors", (unsigned long long) size);
>>>>>      rbd_dev->mapping.size = (u64) size;
>>>>>      set_capacity(rbd_dev->disk, size);
>>>>> +    revalidate_disk(rbd_dev->disk);
>>>>>  }
>>>>>
>>>>>  /*
>>>>>
>>>>
>>>
>>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index f556f8a..1963025 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2293,6 +2293,7 @@  static void rbd_update_mapping_size(struct rbd_device *rbd_dev)
 	dout("setting size to %llu sectors", (unsigned long long) size);
 	rbd_dev->mapping.size = (u64) size;
 	set_capacity(rbd_dev->disk, size);
+	revalidate_disk(rbd_dev->disk);
 }
 
 /*