Message ID | 20181019152932.32462-3-olga.kornievskaia@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | client-side support for "inter" SSC copy |
On Fri, 2018-10-19 at 11:29 -0400, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <kolga@netapp.com>
>
> Allow copy_file_range to copy between different superblocks but only
> of the same file system types. This feature was of interest to CIFS
> as well as NFS.
>
> This feature is needed by NFSv4.2 to perform file copy operation on
> the same server or file copy between different NFSv4.2 servers.
>
> If a file system's fileoperations copy_file_range operation prohibits
> cross-device copies, fall back to do_splice_direct. This would be
> needed for the NFS (destination) server side implementation of the
> file copy and currently for CIFS.
>
> Besides NFS, there is only 1 implementor of the copy_file_range FS
> operation -- CIFS. CIFS assumes incoming file descriptors are both
> CIFS but it will check if they are coming from different servers and
> return error code to fall back to do_splice_direct.
>
> NFS will allow for copies between different NFS servers.
>
> Adding to the vfs.txt documentation to explicitly warn about allowing
> for different superblocks of the same file type to be passed into the
> copy_file_range for the future users of copy_file_range method.
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  Documentation/filesystems/vfs.txt |  4 +++-
>  fs/read_write.c                   | 13 ++++++-------
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index a6c6a8a..5e520de 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -958,7 +958,9 @@ otherwise noted.
>
>    fallocate: called by the VFS to preallocate blocks or punch a hole.
>
> -  copy_file_range: called by the copy_file_range(2) system call.
> +  copy_file_range: called by copy_file_range(2) system call. This method
> +		   works on two file descriptors that might reside on
> +		   different superblocks of the same type of file system.
>
>    clone_file_range: called by the ioctl(2) system call for FICLONERANGE and
>  	FICLONE commands.
> diff --git a/fs/read_write.c b/fs/read_write.c
> index c60790f..474e740 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1578,10 +1578,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	    (file_out->f_flags & O_APPEND))
>  		return -EBADF;
>
> -	/* this could be relaxed once a method supports cross-fs copies */
> -	if (inode_in->i_sb != inode_out->i_sb)
> -		return -EXDEV;
> -
>  	if (len == 0)
>  		return 0;
>
> @@ -1591,7 +1587,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	 * Try cloning first, this is supported by more file systems, and
>  	 * more efficient if both clone and copy are supported (e.g. NFS).
>  	 */
> -	if (file_in->f_op->clone_file_range) {
> +	if (inode_in->i_sb == inode_out->i_sb &&
> +			file_in->f_op->clone_file_range) {
>  		ret = file_in->f_op->clone_file_range(file_in, pos_in,
>  				file_out, pos_out, len);
>  		if (ret == 0) {
> @@ -1600,10 +1597,12 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  		}
>  	}
>
> -	if (file_out->f_op->copy_file_range) {
> +	if (file_out->f_op->copy_file_range &&
> +		(file_in->f_op->copy_file_range ==
> +			file_out->f_op->copy_file_range)) {
>  		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
>  						      pos_out, len, flags);
> -		if (ret != -EOPNOTSUPP)
> +		if (ret != -EOPNOTSUPP && ret != -EXDEV)
>  			goto done;
>  	}
>

Ditto. This also needs an ACK from the VFS maintainers.

Cc: Al and linux-fsdevel
On Fri, Oct 19, 2018 at 12:14 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Fri, 2018-10-19 at 11:29 -0400, Olga Kornievskaia wrote:
> > From: Olga Kornievskaia <kolga@netapp.com>
> >
> > [...]
>
> Ditto. This also needs an ACK from the VFS maintainers.
>
> Cc: Al and linux-fsdevel

Yeah, I sent the VFS changes as separate patches to linux-fsdevel and
included other folks (glibc/CIFS?) that were interested in this
functionality. Apologies for the double send. It was easier to send this
as a patch series to just linux-nfs first.

> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index a6c6a8a..5e520de 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -958,7 +958,9 @@ otherwise noted.
 
   fallocate: called by the VFS to preallocate blocks or punch a hole.
 
-  copy_file_range: called by the copy_file_range(2) system call.
+  copy_file_range: called by copy_file_range(2) system call. This method
+		   works on two file descriptors that might reside on
+		   different superblocks of the same type of file system.
 
   clone_file_range: called by the ioctl(2) system call for FICLONERANGE and
 	FICLONE commands.
diff --git a/fs/read_write.c b/fs/read_write.c
index c60790f..474e740 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1578,10 +1578,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	    (file_out->f_flags & O_APPEND))
 		return -EBADF;
 
-	/* this could be relaxed once a method supports cross-fs copies */
-	if (inode_in->i_sb != inode_out->i_sb)
-		return -EXDEV;
-
 	if (len == 0)
 		return 0;
 
@@ -1591,7 +1587,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	 * Try cloning first, this is supported by more file systems, and
 	 * more efficient if both clone and copy are supported (e.g. NFS).
 	 */
-	if (file_in->f_op->clone_file_range) {
+	if (inode_in->i_sb == inode_out->i_sb &&
+			file_in->f_op->clone_file_range) {
 		ret = file_in->f_op->clone_file_range(file_in, pos_in,
 				file_out, pos_out, len);
 		if (ret == 0) {
@@ -1600,10 +1597,12 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 		}
 	}
 
-	if (file_out->f_op->copy_file_range) {
+	if (file_out->f_op->copy_file_range &&
+		(file_in->f_op->copy_file_range ==
+			file_out->f_op->copy_file_range)) {
 		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
 						      pos_out, len, flags);
-		if (ret != -EOPNOTSUPP)
+		if (ret != -EOPNOTSUPP && ret != -EXDEV)
 			goto done;
 	}
 
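
For readers following the fs/read_write.c hunks, the dispatch order after this patch can be roughly modelled in plain userspace C. This is only a sketch under stated assumptions: every model_* type, enum copy_path, and the nfs_like_copy, cifs_like_copy and clone_ok callbacks are hypothetical stand-ins invented for illustration, not kernel definitions. The last branch merely marks where the kernel would fall back to do_splice_direct, and the earlier checks in vfs_copy_file_range (file modes, len == 0) are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct file_operations. */
struct model_file;

typedef int (*model_clone_op)(struct model_file *in, struct model_file *out,
			      size_t len);
typedef long (*model_copy_op)(struct model_file *in, struct model_file *out,
			      size_t len);

struct model_fops {
	model_clone_op clone_file_range;
	model_copy_op copy_file_range;
};

struct model_file {
	const void *sb;			/* identity of the superblock */
	const struct model_fops *f_op;
};

/* Which branch ended up handling the request. */
enum copy_path { PATH_CLONE, PATH_FS_COPY, PATH_SPLICE };

/* Assumed NFS-like callback: accepts files on different superblocks. */
long nfs_like_copy(struct model_file *in, struct model_file *out, size_t len)
{
	(void)in;
	(void)out;
	return (long)len;
}

/* Assumed CIFS-like callback: refuses cross-superblock copies with -EXDEV. */
long cifs_like_copy(struct model_file *in, struct model_file *out, size_t len)
{
	if (in->sb != out->sb)
		return -EXDEV;
	return (long)len;
}

/* Assumed clone callback that always succeeds. */
int clone_ok(struct model_file *in, struct model_file *out, size_t len)
{
	(void)in;
	(void)out;
	(void)len;
	return 0;
}

/* Mirrors the order of checks in the patched hunks. */
enum copy_path model_copy_dispatch(struct model_file *in,
				   struct model_file *out,
				   size_t len, long *ret)
{
	/* Cloning is only attempted within a single superblock. */
	if (in->sb == out->sb && in->f_op->clone_file_range) {
		if (in->f_op->clone_file_range(in, out, len) == 0) {
			*ret = (long)len;
			return PATH_CLONE;
		}
	}

	/* The fs method is used only if both files share the same method. */
	if (out->f_op->copy_file_range &&
	    in->f_op->copy_file_range == out->f_op->copy_file_range) {
		*ret = out->f_op->copy_file_range(in, out, len);
		/* -EOPNOTSUPP and -EXDEV both mean "fall back". */
		if (*ret != -EOPNOTSUPP && *ret != -EXDEV)
			return PATH_FS_COPY;
	}

	/* Stand-in for the do_splice_direct fallback. */
	*ret = (long)len;
	return PATH_SPLICE;
}
```

In this model, two files on different superblocks that share nfs_like_copy take PATH_FS_COPY, while a pair sharing cifs_like_copy, or a pair whose copy_file_range callbacks differ, ends up on PATH_SPLICE.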