diff mbox

ceph:Update the file time after mmap-write.

Message ID 201307110917146231510@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

majianpeng July 11, 2013, 1:17 a.m. UTC
Although, mmap-write of ceph update the time of file using
file_update_time.Because it don't mark the relative cap so the time
can't save.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>

---
 fs/ceph/addr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
1.8.1.2

Comments

Yan, Zheng July 11, 2013, 1:42 a.m. UTC | #1
On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:
> Although, mmap-write of ceph update the time of file using
> file_update_time.Because it don't mark the relative cap so the time
> can't save.

I think cephfs' mmap IO support is still broken. mmap IO does not respect
ceph capabilities at all. It's possible that the kclient has no Fw cap when
ceph_page_mkwrite is called.

Yan, Zheng


>
> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
>  fs/ceph/addr.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 38b5c1b..52f7c7c 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -1231,10 +1231,18 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
>
>         ret = ceph_update_writeable_page(vma->vm_file, off, len, page);
>         if (ret == 0) {
> +               int dirty;
> +               struct ceph_inode_info *ci = ceph_inode(inode);
>                 /* success.  we'll keep the page locked. */
>                 set_page_dirty(page);
>                 up_read(&mdsc->snap_rwsem);
>                 ret = VM_FAULT_LOCKED;
> +
> +               spin_lock(&ci->i_ceph_lock);
> +               dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR);
> +               spin_unlock(&ci->i_ceph_lock);
> +               if (dirty)
> +                       __mark_inode_dirty(inode, dirty);
>         } else {
>                 if (ret == -ENOMEM)
>                         ret = VM_FAULT_OOM;
> --
> 1.8.1.2
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
majianpeng July 11, 2013, 1:49 a.m. UTC | #2
>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:

>> Although, mmap-write of ceph update the time of file using

>> file_update_time.Because it don't mark the relative cap so the time

>> can't save.

>

>I think cephfs' mmap IO support is still broken. mmap IO does not respect

>ceph capabilities at all. It's possible that the kclient has no Fw cap when

>ceph_page_mkwrite is called.

>

Thanks, i'll add this check and resent.

Jianpeng Ma
>Yan, Zheng

>

>

>>

>> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>

>> ---

>>  fs/ceph/addr.c | 8 ++++++++

>>  1 file changed, 8 insertions(+)

>>

>> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c

>> index 38b5c1b..52f7c7c 100644

>> --- a/fs/ceph/addr.c

>> +++ b/fs/ceph/addr.c

>> @@ -1231,10 +1231,18 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)

>>

>>         ret = ceph_update_writeable_page(vma->vm_file, off, len, page);

>>         if (ret == 0) {

>> +               int dirty;

>> +               struct ceph_inode_info *ci = ceph_inode(inode);

>>                 /* success.  we'll keep the page locked. */

>>                 set_page_dirty(page);

>>                 up_read(&mdsc->snap_rwsem);

>>                 ret = VM_FAULT_LOCKED;

>> +

>> +               spin_lock(&ci->i_ceph_lock);

>> +               dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR);

>> +               spin_unlock(&ci->i_ceph_lock);

>> +               if (dirty)

>> +                       __mark_inode_dirty(inode, dirty);

>>         } else {

>>                 if (ret == -ENOMEM)

>>                         ret = VM_FAULT_OOM;

>> --

>> 1.8.1.2
majianpeng July 11, 2013, 7:53 a.m. UTC | #3
>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:

>> Although, mmap-write of ceph update the time of file using

>> file_update_time.Because it don't mark the relative cap so the time

>> can't save.

>

>I think cephfs' mmap IO support is still broken. mmap IO does not respect

>ceph capabilities at all. It's possible that the kclient has no Fw cap when

>ceph_page_mkwrite is called.

>

>Yan, Zheng

>

>

Hi Yan,
	For the read of mmap, i think we also add cap-check.
And i think we can rewrite the filemap_fault ect ceph_filemap_fault.
In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation.
How about this or can you suggest anthor method?

Thanks!
Jianpeng Ma
Yan, Zheng July 11, 2013, 9:03 a.m. UTC | #4
On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote:
>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:
>>> Although, mmap-write of ceph update the time of file using
>>> file_update_time.Because it don't mark the relative cap so the time
>>> can't save.
>>
>>I think cephfs' mmap IO support is still broken. mmap IO does not respect
>>ceph capabilities at all. It's possible that the kclient has no Fw cap when
>>ceph_page_mkwrite is called.
>>
>>Yan, Zheng
>>
>>
> Hi Yan,
>         For the read of mmap, i think we also add cap-check.
> And i think we can rewrite the filemap_fault ect ceph_filemap_fault.
> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation.
> How about this or can you suggest anthor method?
>

This doesn't work. if a file is opened by multiple clients, the MDS
doesn't issue Fcb caps to the client.
If we add cap check to filemap_fault and page_mkwrite, they can block
forever. To make mmap IO work
properly, I think the only solution is introducing a new cache
coherence protocol for mmap IO


> Thanks!
> Jianpeng Ma
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
majianpeng July 12, 2013, 2:26 a.m. UTC | #5
>On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote:

>>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:

>>>> Although, mmap-write of ceph update the time of file using

>>>> file_update_time.Because it don't mark the relative cap so the time

>>>> can't save.

>>>

>>>I think cephfs' mmap IO support is still broken. mmap IO does not respect

>>>ceph capabilities at all. It's possible that the kclient has no Fw cap when

>>>ceph_page_mkwrite is called.

>>>

>>>Yan, Zheng

>>>

>>>

>> Hi Yan,

>>         For the read of mmap, i think we also add cap-check.

>> And i think we can rewrite the filemap_fault ect ceph_filemap_fault.

>> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation.

>> How about this or can you suggest anthor method?

>>

>

>This doesn't work. if a file is opened by multiple clients, the MDS

>doesn't issue Fcb caps to the client.

>If we add cap check to filemap_fault and page_mkwrite, they can block

>forever. To make mmap IO work

>properly, I think the only solution is introducing a new cache

>coherence protocol for mmap IO

>

Thanks your suggestion.
Unless we get the Fcb cap, we can do the mmap-io.How about this?
Because the cap can change,so we can using Fx to lock the file, so cap can't change.

>

>> Thanks!

>> Jianpeng Ma
Gregory Farnum July 12, 2013, 2:46 a.m. UTC | #6
On Thu, Jul 11, 2013 at 2:03 AM, Yan, Zheng <ukernel@gmail.com> wrote:
> On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote:
>>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:
>>>> Although, mmap-write of ceph update the time of file using
>>>> file_update_time.Because it don't mark the relative cap so the time
>>>> can't save.
>>>
>>>I think cephfs' mmap IO support is still broken. mmap IO does not respect
>>>ceph capabilities at all. It's possible that the kclient has no Fw cap when
>>>ceph_page_mkwrite is called.
>>>
>>>Yan, Zheng
>>>
>>>
>> Hi Yan,
>>         For the read of mmap, i think we also add cap-check.
>> And i think we can rewrite the filemap_fault ect ceph_filemap_fault.
>> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation.
>> How about this or can you suggest anthor method?
>>
>
> This doesn't work. if a file is opened by multiple clients, the MDS
> doesn't issue Fcb caps to the client.
> If we add cap check to filemap_fault and page_mkwrite, they can block
> forever. To make mmap IO work
> properly, I think the only solution is introducing a new cache
> coherence protocol for mmap IO

I'm not familiar with the mmap IO paths here at all, but is there some
reason we need Fcb for it? Not having those just means everything
needs to be synchronous to the OSD, which is going to be hella slow
but should be legal?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yan, Zheng July 12, 2013, 3:16 a.m. UTC | #7
On Fri, Jul 12, 2013 at 10:46 AM, Gregory Farnum <greg@inktank.com> wrote:
> On Thu, Jul 11, 2013 at 2:03 AM, Yan, Zheng <ukernel@gmail.com> wrote:
>> On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote:
>>>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote:
>>>>> Although, mmap-write of ceph update the time of file using
>>>>> file_update_time.Because it don't mark the relative cap so the time
>>>>> can't save.
>>>>
>>>>I think cephfs' mmap IO support is still broken. mmap IO does not respect
>>>>ceph capabilities at all. It's possible that the kclient has no Fw cap when
>>>>ceph_page_mkwrite is called.
>>>>
>>>>Yan, Zheng
>>>>
>>>>
>>> Hi Yan,
>>>         For the read of mmap, i think we also add cap-check.
>>> And i think we can rewrite the filemap_fault ect ceph_filemap_fault.
>>> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation.
>>> How about this or can you suggest anthor method?
>>>
>>
>> This doesn't work. if a file is opened by multiple clients, the MDS
>> doesn't issue Fcb caps to the client.
>> If we add cap check to filemap_fault and page_mkwrite, they can block
>> forever. To make mmap IO work
>> properly, I think the only solution is introducing a new cache
>> coherence protocol for mmap IO
>
> I'm not familiar with the mmap IO paths here at all, but is there some
> reason we need Fcb for it? Not having those just means everything
> needs to be synchronous to the OSD, which is going to be hella slow
> but should be legal?

For mmap write, the kernel has no way to know when an write operation
is finished,
so it can't do synchronous write to the OSD.

Yan, Zheng
Regards

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 38b5c1b..52f7c7c 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1231,10 +1231,18 @@  static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	ret = ceph_update_writeable_page(vma->vm_file, off, len, page);
 	if (ret == 0) {
+		int dirty;
+		struct ceph_inode_info *ci = ceph_inode(inode);
 		/* success.  we'll keep the page locked. */
 		set_page_dirty(page);
 		up_read(&mdsc->snap_rwsem);
 		ret = VM_FAULT_LOCKED;
+
+		spin_lock(&ci->i_ceph_lock);
+		dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR);
+		spin_unlock(&ci->i_ceph_lock);
+		if (dirty)
+		       	__mark_inode_dirty(inode, dirty);
 	} else {
 		if (ret == -ENOMEM)
 			ret = VM_FAULT_OOM;