
[v1,02/11] VFS permit cross device vfs_copy_file_range

Message ID 20181019152932.32462-3-olga.kornievskaia@gmail.com (mailing list archive)
State New, archived
Series client-side support for "inter" SSC copy

Commit Message

Olga Kornievskaia Oct. 19, 2018, 3:29 p.m. UTC
From: Olga Kornievskaia <kolga@netapp.com>

Allow copy_file_range to copy between different superblocks, but only
when both belong to the same file system type. This feature was of
interest to CIFS as well as NFS.

This feature is needed by NFSv4.2 to perform file copy operation on
the same server or file copy between different NFSv4.2 servers.

If a file system's file_operations copy_file_range method prohibits
cross-device copies, fall back to do_splice_direct. This is needed
for the NFS (destination) server-side implementation of the file copy,
and currently for CIFS.

Besides NFS, there is only one implementor of the copy_file_range file
operation: CIFS. CIFS assumes that both incoming file descriptors are
CIFS, but it checks whether they come from different servers and
returns an error code so that the VFS falls back to do_splice_direct.

NFS will allow for copies between different NFS servers.

Update the vfs.txt documentation to explicitly warn future
implementors of the copy_file_range method that it may be passed files
from different superblocks of the same file system type.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 Documentation/filesystems/vfs.txt |  4 +++-
 fs/read_write.c                   | 13 ++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)
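
The fallback rules described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: `struct model_file`, `model_dispatch`, `cifs_like_copy`, and `nfs_like_copy` are hypothetical names, an integer `sb_id` stands in for `inode->i_sb`, and a differing `sb_id` stands in for "different servers" in the CIFS-like method.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_file;
typedef long (*clone_method)(const struct model_file *in,
			     const struct model_file *out);
typedef long (*copy_method)(const struct model_file *in,
			    const struct model_file *out);

/* Hypothetical stand-in for struct file plus its f_op table. */
struct model_file {
	int sb_id;			/* models inode->i_sb identity */
	clone_method clone_file_range;	/* may be NULL */
	copy_method copy_file_range;	/* may be NULL */
};

enum copy_path { PATH_CLONE, PATH_FS_COPY, PATH_SPLICE_FALLBACK };

/* Hypothetical method that refuses cross-superblock copies. */
long cifs_like_copy(const struct model_file *in, const struct model_file *out)
{
	return in->sb_id != out->sb_id ? -EXDEV : 100;
}

/* Hypothetical method that accepts cross-superblock copies. */
long nfs_like_copy(const struct model_file *in, const struct model_file *out)
{
	(void)in;
	(void)out;
	return 100;
}

/* Mirrors the order of checks in the patched vfs_copy_file_range(). */
enum copy_path model_dispatch(const struct model_file *in,
			      const struct model_file *out, long *ret)
{
	assert(in && out && ret);

	/* Cloning is only attempted within a single superblock. */
	if (in->sb_id == out->sb_id && in->clone_file_range) {
		*ret = in->clone_file_range(in, out);
		if (*ret == 0)
			return PATH_CLONE;
	}

	/* The method is only called when both files share it. */
	if (out->copy_file_range &&
	    in->copy_file_range == out->copy_file_range) {
		*ret = out->copy_file_range(in, out);
		if (*ret != -EOPNOTSUPP && *ret != -EXDEV)
			return PATH_FS_COPY;
	}

	*ret = 0;	/* placeholder for the do_splice_direct() result */
	return PATH_SPLICE_FALLBACK;
}
```

In this model, two files on different superblocks that share `cifs_like_copy` end up on the splice path, while two files that share `nfs_like_copy` are handled by the method itself. Files whose methods differ never reach either method.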

Comments

Trond Myklebust Oct. 19, 2018, 4:14 p.m. UTC | #1
On Fri, 2018-10-19 at 11:29 -0400, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <kolga@netapp.com>
> 
> Allow copy_file_range to copy between different superblocks but only
> of the same file system types. This feature was of interest to CIFS
> as well as NFS.
> 
> This feature is needed by NFSv4.2 to perform file copy operation on
> the same server or file copy between different NFSv4.2 servers.
> 
> If a file system's fileoperations copy_file_range operation prohibits
> cross-device copies, fall back to do_splice_direct. This would be
> needed for the NFS (destination) server side implementation of the
> file copy and currently for CIFS.
> 
> Besides NFS, there is only 1 implementor of the copy_file_range FS
> operation -- CIFS. CIFS assumes incoming file descriptors are both
> CIFS but it will check if they are coming from different servers and
> return error code to fall back to do_splice_direct.
> 
> NFS will allow for copies between different NFS servers.
> 
> Adding to the vfs.txt documentation to explicitly warn about allowing
> for different superblocks of the same file type to be passed into the
> copy_file_range for the future users of copy_file_range method.
> 
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  Documentation/filesystems/vfs.txt |  4 +++-
>  fs/read_write.c                   | 13 ++++++-------
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/filesystems/vfs.txt
> b/Documentation/filesystems/vfs.txt
> index a6c6a8a..5e520de 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -958,7 +958,9 @@ otherwise noted.
>  
>    fallocate: called by the VFS to preallocate blocks or punch a
> hole.
>  
> -  copy_file_range: called by the copy_file_range(2) system call.
> +  copy_file_range: called by copy_file_range(2) system call. This
> method
> +		   works on two file descriptors that might reside on
> +		   different superblocks of the same type of file
> system.
>  
>    clone_file_range: called by the ioctl(2) system call for
> FICLONERANGE and
>  	FICLONE commands.
> diff --git a/fs/read_write.c b/fs/read_write.c
> index c60790f..474e740 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1578,10 +1578,6 @@ ssize_t vfs_copy_file_range(struct file
> *file_in, loff_t pos_in,
>  	    (file_out->f_flags & O_APPEND))
>  		return -EBADF;
>  
> -	/* this could be relaxed once a method supports cross-fs copies
> */
> -	if (inode_in->i_sb != inode_out->i_sb)
> -		return -EXDEV;
> -
>  	if (len == 0)
>  		return 0;
>  
> @@ -1591,7 +1587,8 @@ ssize_t vfs_copy_file_range(struct file
> *file_in, loff_t pos_in,
>  	 * Try cloning first, this is supported by more file systems,
> and
>  	 * more efficient if both clone and copy are supported (e.g.
> NFS).
>  	 */
> -	if (file_in->f_op->clone_file_range) {
> +	if (inode_in->i_sb == inode_out->i_sb &&
> +			file_in->f_op->clone_file_range) {
>  		ret = file_in->f_op->clone_file_range(file_in, pos_in,
>  				file_out, pos_out, len);
>  		if (ret == 0) {
> @@ -1600,10 +1597,12 @@ ssize_t vfs_copy_file_range(struct file
> *file_in, loff_t pos_in,
>  		}
>  	}
>  
> -	if (file_out->f_op->copy_file_range) {
> +	if (file_out->f_op->copy_file_range &&
> +			(file_in->f_op->copy_file_range ==
> +				file_out->f_op->copy_file_range)) {
>  		ret = file_out->f_op->copy_file_range(file_in, pos_in,
> file_out,
>  						      pos_out, len,
> flags);
> -		if (ret != -EOPNOTSUPP)
> +		if (ret != -EOPNOTSUPP && ret != -EXDEV)
>  			goto done;
>  	}
>  

Ditto.  This also needs an ACK from the VFS maintainers.

Cc: Al and linux-fsdevel
Olga Kornievskaia Oct. 19, 2018, 4:26 p.m. UTC | #2
On Fri, Oct 19, 2018 at 12:14 PM Trond Myklebust
<trondmy@hammerspace.com> wrote:
>
> On Fri, 2018-10-19 at 11:29 -0400, Olga Kornievskaia wrote:
> > From: Olga Kornievskaia <kolga@netapp.com>
> >
> > Allow copy_file_range to copy between different superblocks but only
> > of the same file system types. This feature was of interest to CIFS
> > as well as NFS.
> >
> > This feature is needed by NFSv4.2 to perform file copy operation on
> > the same server or file copy between different NFSv4.2 servers.
> >
> > If a file system's fileoperations copy_file_range operation prohibits
> > cross-device copies, fall back to do_splice_direct. This would be
> > needed for the NFS (destination) server side implementation of the
> > file copy and currently for CIFS.
> >
> > Besides NFS, there is only 1 implementor of the copy_file_range FS
> > operation -- CIFS. CIFS assumes incoming file descriptors are both
> > CIFS but it will check if they are coming from different servers and
> > return error code to fall back to do_splice_direct.
> >
> > NFS will allow for copies between different NFS servers.
> >
> > Adding to the vfs.txt documentation to explicitly warn about allowing
> > for different superblocks of the same file type to be passed into the
> > copy_file_range for the future users of copy_file_range method.
> >
> > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > ---
> >  Documentation/filesystems/vfs.txt |  4 +++-
> >  fs/read_write.c                   | 13 ++++++-------
> >  2 files changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/Documentation/filesystems/vfs.txt
> > b/Documentation/filesystems/vfs.txt
> > index a6c6a8a..5e520de 100644
> > --- a/Documentation/filesystems/vfs.txt
> > +++ b/Documentation/filesystems/vfs.txt
> > @@ -958,7 +958,9 @@ otherwise noted.
> >
> >    fallocate: called by the VFS to preallocate blocks or punch a
> > hole.
> >
> > -  copy_file_range: called by the copy_file_range(2) system call.
> > +  copy_file_range: called by copy_file_range(2) system call. This
> > method
> > +                works on two file descriptors that might reside on
> > +                different superblocks of the same type of file
> > system.
> >
> >    clone_file_range: called by the ioctl(2) system call for
> > FICLONERANGE and
> >       FICLONE commands.
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index c60790f..474e740 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -1578,10 +1578,6 @@ ssize_t vfs_copy_file_range(struct file
> > *file_in, loff_t pos_in,
> >           (file_out->f_flags & O_APPEND))
> >               return -EBADF;
> >
> > -     /* this could be relaxed once a method supports cross-fs copies
> > */
> > -     if (inode_in->i_sb != inode_out->i_sb)
> > -             return -EXDEV;
> > -
> >       if (len == 0)
> >               return 0;
> >
> > @@ -1591,7 +1587,8 @@ ssize_t vfs_copy_file_range(struct file
> > *file_in, loff_t pos_in,
> >        * Try cloning first, this is supported by more file systems,
> > and
> >        * more efficient if both clone and copy are supported (e.g.
> > NFS).
> >        */
> > -     if (file_in->f_op->clone_file_range) {
> > +     if (inode_in->i_sb == inode_out->i_sb &&
> > +                     file_in->f_op->clone_file_range) {
> >               ret = file_in->f_op->clone_file_range(file_in, pos_in,
> >                               file_out, pos_out, len);
> >               if (ret == 0) {
> > @@ -1600,10 +1597,12 @@ ssize_t vfs_copy_file_range(struct file
> > *file_in, loff_t pos_in,
> >               }
> >       }
> >
> > -     if (file_out->f_op->copy_file_range) {
> > +     if (file_out->f_op->copy_file_range &&
> > +                     (file_in->f_op->copy_file_range ==
> > +                             file_out->f_op->copy_file_range)) {
> >               ret = file_out->f_op->copy_file_range(file_in, pos_in,
> > file_out,
> >                                                     pos_out, len,
> > flags);
> > -             if (ret != -EOPNOTSUPP)
> > +             if (ret != -EOPNOTSUPP && ret != -EXDEV)
> >                       goto done;
> >       }
> >
>
> Ditto.  This also needs an ACK from the VFS maintainers.
>
> Cc: Al and linux-fsdevel

Yeah, I sent the VFS changes as separate patches to linux-fsdevel and
included other folks (glibc/CIFS?) who were interested in this
functionality. Apologies for the double send. It was easier to send
this as a patch series to just linux-nfs first.

> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>
>

Patch

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index a6c6a8a..5e520de 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -958,7 +958,9 @@  otherwise noted.
 
   fallocate: called by the VFS to preallocate blocks or punch a hole.
 
-  copy_file_range: called by the copy_file_range(2) system call.
+  copy_file_range: called by copy_file_range(2) system call. This method
+		   works on two file descriptors that might reside on
+		   different superblocks of the same type of file system.
 
   clone_file_range: called by the ioctl(2) system call for FICLONERANGE and
 	FICLONE commands.
diff --git a/fs/read_write.c b/fs/read_write.c
index c60790f..474e740 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1578,10 +1578,6 @@  ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	    (file_out->f_flags & O_APPEND))
 		return -EBADF;
 
-	/* this could be relaxed once a method supports cross-fs copies */
-	if (inode_in->i_sb != inode_out->i_sb)
-		return -EXDEV;
-
 	if (len == 0)
 		return 0;
 
@@ -1591,7 +1587,8 @@  ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	 * Try cloning first, this is supported by more file systems, and
 	 * more efficient if both clone and copy are supported (e.g. NFS).
 	 */
-	if (file_in->f_op->clone_file_range) {
+	if (inode_in->i_sb == inode_out->i_sb &&
+			file_in->f_op->clone_file_range) {
 		ret = file_in->f_op->clone_file_range(file_in, pos_in,
 				file_out, pos_out, len);
 		if (ret == 0) {
@@ -1600,10 +1597,12 @@  ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 		}
 	}
 
-	if (file_out->f_op->copy_file_range) {
+	if (file_out->f_op->copy_file_range &&
+			(file_in->f_op->copy_file_range ==
+				file_out->f_op->copy_file_range)) {
 		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
 						      pos_out, len, flags);
-		if (ret != -EOPNOTSUPP)
+		if (ret != -EOPNOTSUPP && ret != -EXDEV)
 			goto done;
 	}
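
From userspace, a caller that must also run on kernels without this change may still see EXDEV from copy_file_range(2) when the two files are on different superblocks. The following is a minimal sketch of how such a caller might cope, assuming a libc that provides the `copy_file_range()` wrapper (glibc 2.27 or later). The helper names `copy_by_read_write`, `copy_range_with_fallback`, and `copy_file` are hypothetical and not part of this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read/write loop used when the in-kernel copy is refused. */
int copy_by_read_write(int fd_in, int fd_out, size_t len)
{
	char buf[65536];

	while (len > 0) {
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t n = read(fd_in, buf, want);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;		/* input ended early */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(fd_out, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		len -= (size_t)n;
	}
	return 0;
}

/* Try copy_file_range(2) first; fall back on EXDEV and similar errors. */
int copy_range_with_fallback(int fd_in, int fd_out, size_t len)
{
	assert(fd_in >= 0 && fd_out >= 0);

	while (len > 0) {
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL, len, 0);

		if (n > 0) {
			len -= (size_t)n;
			continue;
		}
		if (n == 0)
			return 0;	/* input ended early */
		if (errno == EXDEV || errno == ENOSYS || errno == EOPNOTSUPP)
			return copy_by_read_write(fd_in, fd_out, len);
		return -1;
	}
	return 0;
}

/* Copy the whole of src into dst, creating or truncating dst. */
int copy_file(const char *src, const char *dst)
{
	struct stat st;
	int ret = -1;
	int fd_in = open(src, O_RDONLY);
	int fd_out;

	if (fd_in < 0)
		return -1;
	fd_out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd_out >= 0) {
		if (fstat(fd_in, &st) == 0)
			ret = copy_range_with_fallback(fd_in, fd_out,
						       (size_t)st.st_size);
		close(fd_out);
	}
	close(fd_in);
	return ret;
}
```

Because both offsets are passed as NULL, the kernel advances the file positions itself, so the read/write fallback resumes exactly where a partially successful copy_file_range() stopped.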