Message ID | 20170302160211.30451-3-kolga@netapp.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote: > Allow nfs_copy_file_range to copy across devices. > NFSv4.2 inter server to server copy always copies across devices, and > NFSv4.2 intra server to server copy can copy across devices on the same > server. > > If a file system's fileoperations copy_file_range operation prohibits > cross-device copies, fall back to do_splice_direct. This is needed for > nfsd_copy_file_range() which is called by the inter server to server > destination server acting as an NFS client, and reading the file from > the source server. NAK, we really should not do operations between different superblocks. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <hch@lst.de> wrote: > > On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote: >> Allow nfs_copy_file_range to copy across devices. >> NFSv4.2 inter server to server copy always copies across devices, and >> NFSv4.2 intra server to server copy can copy across devices on the same >> server. >> >> If a file system's fileoperations copy_file_range operation prohibits >> cross-device copies, fall back to do_splice_direct. This is needed for >> nfsd_copy_file_range() which is called by the inter server to server >> destination server acting as an NFS client, and reading the file from >> the source server. > > NAK, we really should not do operations between different superblocks. Can you provide some reasoning as to why? What would it break? The reasoning for including one is to allow for a file system to achieve better performance which seems like a feature that would be of great benefit. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Mar 2, 2017, at 11:38 AM, Olga Kornievskaia <kolga@netapp.com> wrote: > > >> On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <hch@lst.de> wrote: >> >> On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote: >>> Allow nfs_copy_file_range to copy across devices. >>> NFSv4.2 inter server to server copy always copies across devices, and >>> NFSv4.2 intra server to server copy can copy across devices on the same >>> server. >>> >>> If a file system's fileoperations copy_file_range operation prohibits >>> cross-device copies, fall back to do_splice_direct. This is needed for >>> nfsd_copy_file_range() which is called by the inter server to server >>> destination server acting as an NFS client, and reading the file from >>> the source server. >> >> NAK, we really should not do operations between different superblocks. > > Can you provide some reasoning as to why? What would it break? The reasoning for including one is to allow for a file system to achieve better performance which seems like a feature that would be of great benefit. Christoph, could you please elaborate on your objection. Al, could you weigh in with regards to if and what would it take to allow for copy_file_range() to allow for copies between different superblocks. We would appreciate the feedback of how can we enable this performance feature to be useful to users. Thank you. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 02, 2017 at 11:38:03AM -0500, Olga Kornievskaia wrote: > > On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <hch@lst.de> wrote: > > NAK, we really should not do operations between different > > superblocks. > > Can you provide some reasoning as to why? What would it break? The > reasoning for including one is to allow for a file system to achieve > better performance which seems like a feature that would be of great > benefit. -- Yes, this has come up a few times. What's going on?: - There was an explanation, and I missed it. - The explanation is complicated and Christoph hasn't had time to write it up. - Christoph has a strong suspicion there are issues without being sure exactly where they are. - Something else? I mean, the whole enterprise is dead in the water without this, as far as I can tell, so I'd really like to understand the answer one way or the other. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 15, 2017 at 02:09:13PM -0400, J. Bruce Fields wrote: > On Thu, Mar 02, 2017 at 11:38:03AM -0500, Olga Kornievskaia wrote: > > > On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <hch@lst.de> wrote: > > > NAK, we really should not do operations between different > > > superblocks. > > > > Can you provide some reasoning as to why? What would it break? The > > reasoning for including one is to allow for a file system to achieve > > better performance which seems like a feature that would be of great > > benefit. -- > > Yes, this has come up a few times. What's going on?: > > - There was an explanation, and I missed it. > - The explanation is complicated and Christoph hasn't had time > to write it up. > - Christoph has a strong suspicion there are issues without > being sure exactly where they are. > - Something else? As far as I can tell from talking at LSF: - file system operations crossing superblocks is unusual. No specific known issue. Also not sure I understand why splice isn't precedent. - implementation (with server acting as a client, long-running process, etc.) will be complicated and ugly. Sure, I guess we'll see how complicated and weigh that against any advantages. (Though I'm curious about case e.g. of btrfs clone--can't it easily clone across filesystem boundaries in some cases when they share storage?) Anyway. I'll read the patches. Probably not this week. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 21, 2017 at 12:03:08PM -0400, Olga Kornievskaia wrote: > Thank you for the update. I guess I don’t see how the proposed NFS > implementation is complicated and ugly (but I’m biased). I’ll try to > give you some performance number. My 1 data point (1gb) inter copy > showed 30% improvement (how can that be ignored). That would be useful, thanks--if it comes with some details about the setup. I'm not so curious about percent improvement, as how to predict the performance on a given network. If server-to-server copy looks like it's normally able to use close to the available bandwidth between the two servers, and if a traditional read-write-copy loop is similarly able to use the available bandwidth, then I can figure out whether server-to-server copy will help on my setup. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 21, 2017 at 3:03 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > On Tue, Mar 21, 2017 at 12:03:08PM -0400, Olga Kornievskaia wrote: >> Thank you for the update. I guess I don’t see how the proposed NFS >> implementation is complicated and ugly (but I’m biased). I’ll try to >> give you some performance number. My 1 data point (1gb) inter copy >> showed 30% improvement (how can that be ignored). > > That would be useful, thanks--if it comes with some details about the > setup. What I have available to me are two laptops that I run my VMs on. It is not a setup that is representative of a real setup. I think this setup can only provide percent improvement numbers. Andy was suggesting that perhaps the performance lab at Redhat would be able to do some testing of the patches for some real world performance? > I'm not so curious about percent improvement, as how to predict the > performance on a given network. > > If server-to-server copy looks like it's normally able to use close to > the available bandwidth between the two servers, and if a traditional > read-write-copy loop is similarly able to use the available bandwidth, > then I can figure out whether server-to-server copy will help on my > setup. > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/02/2017 05:02 PM, Olga Kornievskaia wrote: > Allow nfs_copy_file_range to copy across devices. > NFSv4.2 inter server to server copy always copies across devices, and > NFSv4.2 intra server to server copy can copy across devices on the same > server. > > If a file system's fileoperations copy_file_range operation prohibits > cross-device copies, fall back to do_splice_direct. This is needed for > nfsd_copy_file_range() which is called by the inter server to server > destination server acting as an NFS client, and reading the file from > the source server. > > Signed-off-by: Andy Adamson<andros@netapp.com> What has happened to the patch? We unwittingly used copy_file_range in the glibc test suite, without realizing that it does not support cross-device copies. Thanks, Florian
On Fri, Aug 31, 2018 at 12:14 PM Florian Weimer <fweimer@redhat.com> wrote: > > On 03/02/2017 05:02 PM, Olga Kornievskaia wrote: > > Allow nfs_copy_file_range to copy across devices. > > NFSv4.2 inter server to server copy always copies across devices, and > > NFSv4.2 intra server to server copy can copy across devices on the same > > server. > > > > If a file system's fileoperations copy_file_range operation prohibits > > cross-device copies, fall back to do_splice_direct. This is needed for > > nfsd_copy_file_range() which is called by the inter server to server > > destination server acting as an NFS client, and reading the file from > > the source server. > > > > Signed-off-by: Andy Adamson<andros@netapp.com> > > What has happened to the patch? We unwittingly used copy_file_range in > the glibc test suite, without realizing that it does not support > cross-device copies. I'm still planning to fight for the patch to be included. As per maintainers request, I separated out the async copy patches out and hopefully that will be going into 4.20. I'm working on the "inter" copy offload functionality which does require this patch. I will start submitting and pushing this patch along with the rest of the patches. Are you interested in the copy_file_range() functionality that does support cross-device copies? If so can you tell me how are you using it? It would be also helpful to watch for the submission of the patch and argue in its favor. Thank you. > > Thanks, > Florian
On Fri, Aug 31, 2018 at 11:41 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > On Fri, Aug 31, 2018 at 12:14 PM Florian Weimer <fweimer@redhat.com> wrote: > > > > On 03/02/2017 05:02 PM, Olga Kornievskaia wrote: > > > Allow nfs_copy_file_range to copy across devices. > > > NFSv4.2 inter server to server copy always copies across devices, and > > > NFSv4.2 intra server to server copy can copy across devices on the same > > > server. > > > > > > If a file system's fileoperations copy_file_range operation prohibits > > > cross-device copies, fall back to do_splice_direct. This is needed for > > > nfsd_copy_file_range() which is called by the inter server to server > > > destination server acting as an NFS client, and reading the file from > > > the source server. > > > > > > Signed-off-by: Andy Adamson<andros@netapp.com> > > > > What has happened to the patch? We unwittingly used copy_file_range in > > the glibc test suite, without realizing that it does not support > > cross-device copies. > > I'm still planning to fight for the patch to be included. As per > maintainers request, I separated out the async copy patches out and > hopefully that will be going into 4.20. I'm working on the "inter" > copy offload functionality which does require this patch. I will start > submitting and pushing this patch along with the rest of the patches. > > Are you interested in the copy_file_range() functionality that does > support cross-device copies? If so can you tell me how are you using > it? It would be also helpful to watch for the submission of the patch > and argue in its favor. SMB3 clients and server (Windows, Macs, Samba and probably most every modern NAS out there) support the SMB3 "CopyChunk" operation used by the Linux client. I would expect that one of the most useful cases is cross device For example you have two mounts to your server or server cluster \\server\share1 is mounted to /mnt1 and \\server\backup is mounted to /mnt2 and you want to do copy_file or copy_file_range of various files from /mnt1 to /mnt2 As long as they are both the same file system type, why not let the file system tell you whether it can do the copy. Given that this is 10x to 100x faster than the alternative and this is a common case (and there are many 100s of millions of SMB3 server capable systems out there which already support copychunk (and thus would support copy file) - why would we want to restrict it. It is much saner to let the file system tell the VFS if it can't support the cross device copy. ---- and to make it even more obvious why this patch matters --- Virtualization clients (and various Windows and NetApp server) support another copy offload mechanism (not just CopyChunk) ie via ODX which would support cross server not just cross share copy. Enabling this would be a BIG incentive for finishing up ODX copy support in Linux (cifs.ko using SMB3). This is fairly widely supported by servers.
diff --git a/fs/read_write.c b/fs/read_write.c index 1d9e305..75084cd 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1534,10 +1534,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, (file_out->f_flags & O_APPEND)) return -EBADF; - /* this could be relaxed once a method supports cross-fs copies */ - if (inode_in->i_sb != inode_out->i_sb) - return -EXDEV; - if (len == 0) return 0; @@ -1559,7 +1555,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, if (file_out->f_op->copy_file_range) { ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out, pos_out, len, flags); - if (ret != -EOPNOTSUPP) + if (ret != -EOPNOTSUPP && ret != -EXDEV) goto done; }
Allow nfs_copy_file_range to copy across devices. NFSv4.2 inter server to server copy always copies across devices, and NFSv4.2 intra server to server copy can copy across devices on the same server. If a file system's fileoperations copy_file_range operation prohibits cross-device copies, fall back to do_splice_direct. This is needed for nfsd_copy_file_range() which is called by the inter server to server destination server acting as an NFS client, and reading the file from the source server. Signed-off-by: Andy Adamson <andros@netapp.com> --- fs/read_write.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)