Message ID | 201307110917146231510@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: > Although, mmap-write of ceph update the time of file using > file_update_time.Because it don't mark the relative cap so the time > can't save. I think cephfs' mmap IO support is still broken. mmap IO does not respect ceph capabilities at all. It's possible that the kclient has no Fw cap when ceph_page_mkwrite is called. Yan, Zheng > > Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> > --- > fs/ceph/addr.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index 38b5c1b..52f7c7c 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -1231,10 +1231,18 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) > > ret = ceph_update_writeable_page(vma->vm_file, off, len, page); > if (ret == 0) { > + int dirty; > + struct ceph_inode_info *ci = ceph_inode(inode); > /* success. we'll keep the page locked. */ > set_page_dirty(page); > up_read(&mdsc->snap_rwsem); > ret = VM_FAULT_LOCKED; > + > + spin_lock(&ci->i_ceph_lock); > + dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR); > + spin_unlock(&ci->i_ceph_lock); > + if (dirty) > + __mark_inode_dirty(inode, dirty); > } else { > if (ret == -ENOMEM) > ret = VM_FAULT_OOM; > -- > 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >> Although, mmap-write of ceph update the time of file using >> file_update_time.Because it don't mark the relative cap so the time >> can't save. > >I think cephfs' mmap IO support is still broken. mmap IO does not respect >ceph capabilities at all. It's possible that the kclient has no Fw cap when >ceph_page_mkwrite is called. > Thanks, i'll add this check and resent. Jianpeng Ma >Yan, Zheng > > >> >> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> >> --- >> fs/ceph/addr.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c >> index 38b5c1b..52f7c7c 100644 >> --- a/fs/ceph/addr.c >> +++ b/fs/ceph/addr.c >> @@ -1231,10 +1231,18 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) >> >> ret = ceph_update_writeable_page(vma->vm_file, off, len, page); >> if (ret == 0) { >> + int dirty; >> + struct ceph_inode_info *ci = ceph_inode(inode); >> /* success. we'll keep the page locked. */ >> set_page_dirty(page); >> up_read(&mdsc->snap_rwsem); >> ret = VM_FAULT_LOCKED; >> + >> + spin_lock(&ci->i_ceph_lock); >> + dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR); >> + spin_unlock(&ci->i_ceph_lock); >> + if (dirty) >> + __mark_inode_dirty(inode, dirty); >> } else { >> if (ret == -ENOMEM) >> ret = VM_FAULT_OOM; >> -- >> 1.8.1.2
>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >> Although, mmap-write of ceph update the time of file using >> file_update_time.Because it don't mark the relative cap so the time >> can't save. > >I think cephfs' mmap IO support is still broken. mmap IO does not respect >ceph capabilities at all. It's possible that the kclient has no Fw cap when >ceph_page_mkwrite is called. > >Yan, Zheng > > Hi Yan, For the read of mmap, i think we also add cap-check. And i think we can rewrite the filemap_fault ect ceph_filemap_fault. In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation. How about this or can you suggest anthor method? Thanks! Jianpeng Ma
On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote: >>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >>> Although, mmap-write of ceph update the time of file using >>> file_update_time.Because it don't mark the relative cap so the time >>> can't save. >> >>I think cephfs' mmap IO support is still broken. mmap IO does not respect >>ceph capabilities at all. It's possible that the kclient has no Fw cap when >>ceph_page_mkwrite is called. >> >>Yan, Zheng >> >> > Hi Yan, > For the read of mmap, i think we also add cap-check. > And i think we can rewrite the filemap_fault ect ceph_filemap_fault. > In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation. > How about this or can you suggest anthor method? > This doesn't work. if a file is opened by multiple clients, the MDS doesn't issue Fcb caps to the client. If we add cap check to filemap_fault and page_mkwrite, they can block forever. To make mmap IO work properly, I think the only solution is introducing a new cache coherence protocol for mmap IO > Thanks! > Jianpeng Ma -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote: >>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >>>> Although, mmap-write of ceph update the time of file using >>>> file_update_time.Because it don't mark the relative cap so the time >>>> can't save. >>> >>>I think cephfs' mmap IO support is still broken. mmap IO does not respect >>>ceph capabilities at all. It's possible that the kclient has no Fw cap when >>>ceph_page_mkwrite is called. >>> >>>Yan, Zheng >>> >>> >> Hi Yan, >> For the read of mmap, i think we also add cap-check. >> And i think we can rewrite the filemap_fault ect ceph_filemap_fault. >> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation. >> How about this or can you suggest anthor method? >> > >This doesn't work. if a file is opened by multiple clients, the MDS >doesn't issue Fcb caps to the client. >If we add cap check to filemap_fault and page_mkwrite, they can block >forever. To make mmap IO work >properly, I think the only solution is introducing a new cache >coherence protocol for mmap IO > Thanks your suggestion. Unless we get the Fcb cap, we can do the mmap-io.How about this? Because the cap can change,so we can using Fx to lock the file, so cap can't change. > >> Thanks! >> Jianpeng Ma
On Thu, Jul 11, 2013 at 2:03 AM, Yan, Zheng <ukernel@gmail.com> wrote: > On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote: >>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >>>> Although, mmap-write of ceph update the time of file using >>>> file_update_time.Because it don't mark the relative cap so the time >>>> can't save. >>> >>>I think cephfs' mmap IO support is still broken. mmap IO does not respect >>>ceph capabilities at all. It's possible that the kclient has no Fw cap when >>>ceph_page_mkwrite is called. >>> >>>Yan, Zheng >>> >>> >> Hi Yan, >> For the read of mmap, i think we also add cap-check. >> And i think we can rewrite the filemap_fault ect ceph_filemap_fault. >> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation. >> How about this or can you suggest anthor method? >> > > This doesn't work. if a file is opened by multiple clients, the MDS > doesn't issue Fcb caps to the client. > If we add cap check to filemap_fault and page_mkwrite, they can block > forever. To make mmap IO work > properly, I think the only solution is introducing a new cache > coherence protocol for mmap IO I'm not familiar with the mmap IO paths here at all, but is there some reason we need Fcb for it? Not having those just means everything needs to be synchronous to the OSD, which is going to be hella slow but should be legal? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 12, 2013 at 10:46 AM, Gregory Farnum <greg@inktank.com> wrote: > On Thu, Jul 11, 2013 at 2:03 AM, Yan, Zheng <ukernel@gmail.com> wrote: >> On Thu, Jul 11, 2013 at 3:53 PM, majianpeng <majianpeng@gmail.com> wrote: >>>>On Thu, Jul 11, 2013 at 9:17 AM, majianpeng <majianpeng@gmail.com> wrote: >>>>> Although, mmap-write of ceph update the time of file using >>>>> file_update_time.Because it don't mark the relative cap so the time >>>>> can't save. >>>> >>>>I think cephfs' mmap IO support is still broken. mmap IO does not respect >>>>ceph capabilities at all. It's possible that the kclient has no Fw cap when >>>>ceph_page_mkwrite is called. >>>> >>>>Yan, Zheng >>>> >>>> >>> Hi Yan, >>> For the read of mmap, i think we also add cap-check. >>> And i think we can rewrite the filemap_fault ect ceph_filemap_fault. >>> In ceph_filemap_fault, most code from filemap_fault, we only add cap-check for read-operation. >>> How about this or can you suggest anthor method? >>> >> >> This doesn't work. if a file is opened by multiple clients, the MDS >> doesn't issue Fcb caps to the client. >> If we add cap check to filemap_fault and page_mkwrite, they can block >> forever. To make mmap IO work >> properly, I think the only solution is introducing a new cache >> coherence protocol for mmap IO > > I'm not familiar with the mmap IO paths here at all, but is there some > reason we need Fcb for it? Not having those just means everything > needs to be synchronous to the OSD, which is going to be hella slow > but should be legal? For mmap write, the kernel has no way to know when an write operation is finished, so it can't do synchronous write to the OSD. Yan, Zheng Regards > -Greg > Software Engineer #42 @ http://inktank.com | http://ceph.com -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 38b5c1b..52f7c7c 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -1231,10 +1231,18 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf) ret = ceph_update_writeable_page(vma->vm_file, off, len, page); if (ret == 0) { + int dirty; + struct ceph_inode_info *ci = ceph_inode(inode); /* success. we'll keep the page locked. */ set_page_dirty(page); up_read(&mdsc->snap_rwsem); ret = VM_FAULT_LOCKED; + + spin_lock(&ci->i_ceph_lock); + dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR); + spin_unlock(&ci->i_ceph_lock); + if (dirty) + __mark_inode_dirty(inode, dirty); } else { if (ret == -ENOMEM) ret = VM_FAULT_OOM;
Although, mmap-write of ceph update the time of file using file_update_time.Because it don't mark the relative cap so the time can't save. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> --- fs/ceph/addr.c | 8 ++++++++ 1 file changed, 8 insertions(+) -- 1.8.1.2