Message ID | 20161102225340.11613-1-jabolopes@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
> Syscall 'ftruncate' makes the 'file' struct available to filesystem
> handlers. This makes it possible, e.g., for filesystems, such as,
> FUSE, to access the file handle associated with the file descriptor
> that was passed to 'ftruncate'. In the specific case of FUSE, this
> also makes it possible for (userspace) FUSE-based filesystems to
> distinguish between calls to 'truncate' and 'ftruncate'.

Why FUSE is such a precious snowflake that it needs to make that distinction,
unlike all other filesystems?

> In a future patch, make a similar change to the 'fchown' and
> 'futimens' syscalls.

I'm thucking frilled... NAK, unless you can give a better reason than
"what if somebody would want that piece of information?".
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Al Viro wrote:
> On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>> handlers. This makes it possible, e.g., for filesystems, such as,
>> FUSE, to access the file handle associated with the file descriptor
>> that was passed to 'ftruncate'. In the specific case of FUSE, this
>> also makes it possible for (userspace) FUSE-based filesystems to
>> distinguish between calls to 'truncate' and 'ftruncate'.
>
> Why FUSE is such a precious snowflake that it needs to make that distinction,
> unlike all other filesystems?

For fuse file system which delegate the permission checks
to user space (and have to do so because of cacheing
issues), the write permission has to be checked for
truncate(), and not checked for ftruncate() : the file
may have been opened for writing and then its permissions
set to read-only before the ftruncate() is requested.
The user space file system can check current permissions,
not the ones which were set when the file was opened.

Jean-Pierre

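The difference Jean-Pierre describes can be observed from an ordinary process on a regular local filesystem. The following is a minimal sketch, not code from the thread: the struct and function names are made up for illustration, and it assumes a writable /tmp. It opens a scratch file for writing, makes it read-only, then tries both calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Outcome of the two calls; the field names are illustrative only. */
struct trunc_result {
    int ftruncate_rc;   /* ftruncate() on the already-open descriptor */
    int truncate_rc;    /* truncate() on the path */
    int truncate_errno; /* errno from truncate(), or 0 if it succeeded */
};

/* Returns 0 if the experiment could be set up, -1 otherwise. */
static int truncate_vs_ftruncate(struct trunc_result *res)
{
    char path[] = "/tmp/trunc-demo-XXXXXX";
    int fd;

    assert(res != NULL);

    fd = mkstemp(path);             /* creates the file, opened read-write */
    if (fd < 0)
        return -1;

    if (fchmod(fd, 0444) != 0) {    /* drop write permission after the open */
        close(fd);
        unlink(path);
        return -1;
    }

    /* Judged by how the descriptor was opened, not by the current mode. */
    res->ftruncate_rc = ftruncate(fd, 16);

    /* Judged by the current mode: EACCES for an unprivileged caller. */
    errno = 0;
    res->truncate_rc = truncate(path, 32);
    res->truncate_errno = (res->truncate_rc != 0) ? errno : 0;

    close(fd);
    unlink(path);
    return 0;
}
```

A privileged caller (root, or one holding CAP_DAC_OVERRIDE) bypasses the mode check, so truncate() may succeed there as well. A userspace filesystem that performs its own permission checks has to reproduce this distinction itself, which it can only do if it knows which of the two calls was made.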
On Nov 03 2016, Al Viro <viro@ZenIV.linux.org.uk> wrote:
> On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>> handlers. This makes it possible, e.g., for filesystems, such as,
>> FUSE, to access the file handle associated with the file descriptor
>> that was passed to 'ftruncate'. In the specific case of FUSE, this
>> also makes it possible for (userspace) FUSE-based filesystems to
>> distinguish between calls to 'truncate' and 'ftruncate'.
>
> Why FUSE is such a precious snowflake that it needs to make that distinction,
> unlike all other filesystems?

FUSE filesystems are often used as an extra layer on top of another
filesystem (that stores the actual data). If the user opens a file (in
the fuse filesystem), deletes it, and then truncates it, FUSE currently
cannot do the same operation in the underlying filesystem: it receives
a truncate() call with the inode, but there is no syscall that allows
truncation for an inode. If FUSE had access to the file handle, it can
use that to store a file descriptor for the file on the underlying
filesystem and use ftruncate.

Best,
-Nikolaus

On Nov 03 2016, Nikolaus Rath <Nikolaus-BTH8mxji4b0@public.gmane.org> wrote:
> On Nov 03 2016, Al Viro <viro@ZenIV.linux.org.uk> wrote:
>> On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>>> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>>> handlers. This makes it possible, e.g., for filesystems, such as,
>>> FUSE, to access the file handle associated with the file descriptor
>>> that was passed to 'ftruncate'. In the specific case of FUSE, this
>>> also makes it possible for (userspace) FUSE-based filesystems to
>>> distinguish between calls to 'truncate' and 'ftruncate'.
>>
>> Why FUSE is such a precious snowflake that it needs to make that distinction,
>> unlike all other filesystems?
>
> FUSE filesystems are often used as an extra layer on top of another
> filesystem (that stores the actual data). If the user opens a file (in
> the fuse filesystem), deletes it, and then truncates it, FUSE currently
> cannot do the same operation in the underlying filesystem: it receives
> a truncate() call with the inode, but there is no syscall that allows
> truncation for an inode. If FUSE had access to the file handle, it can
> use that to store a file descriptor for the file on the underlying
> filesystem and use ftruncate.

In case it wasn't clear: for the case of ftruncate, fuse *does* have
access to the file handle so the problem does not arise. For the case of
fchmod, this is currently a problem (and the patch we're discussing
would fix it).

Best,
-Nikolaus

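The open-then-delete scenario can be reproduced on a plain local filesystem, where the descriptor keeps working after the name is gone. This is a small sketch under that assumption (the helper and struct names are hypothetical, and /tmp is assumed writable). It shows what a layered filesystem would like to do on the backing store: act through a stored descriptor rather than a path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Outcome of changing the mode by descriptor and by path; names are illustrative. */
struct chmod_result {
    int fchmod_rc;      /* fchmod() through the still-open descriptor */
    int chmod_rc;       /* chmod() through the (now removed) path */
    int chmod_errno;    /* errno from chmod(), or 0 if it succeeded */
    mode_t mode_after;  /* permission bits as seen through the descriptor */
};

/* Returns 0 if the experiment could be set up, -1 otherwise. */
static int chmod_after_unlink(struct chmod_result *res)
{
    char path[] = "/tmp/chmod-demo-XXXXXX";
    struct stat st;
    int fd;

    assert(res != NULL);

    fd = mkstemp(path);             /* creates the file with mode 0600 */
    if (fd < 0)
        return -1;

    if (unlink(path) != 0) {        /* the name is gone, the inode lives on */
        close(fd);
        return -1;
    }

    /* The descriptor still refers to the inode, so this works. */
    res->fchmod_rc = fchmod(fd, 0640);

    /* The path no longer resolves, so this fails with ENOENT. */
    errno = 0;
    res->chmod_rc = chmod(path, 0640);
    res->chmod_errno = (res->chmod_rc != 0) ? errno : 0;

    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    res->mode_after = st.st_mode & 07777;

    close(fd);
    return 0;
}
```

If another file had been created under the same name in the meantime, the path-based call would have changed that file instead, which is the wrong-file case raised later in the thread.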
On Nov 03 2016, Jose Lopes <jabolopes@gmail.com> wrote:
> Syscall 'ftruncate' makes the 'file' struct available to filesystem
> handlers. This makes it possible, e.g., for filesystems, such as,
> FUSE, to access the file handle associated with the file descriptor
> that was passed to 'ftruncate'. In the specific case of FUSE, this
> also makes it possible for (userspace) FUSE-based filesystems to
> distinguish between calls to 'truncate' and 'ftruncate'.
>
> From an implementation point of view, this is possible because the
> 'ftruncate' syscall passes the 'file' struct to the underlying
> filesystem handlers via the 'ia_file' field and the 'ia_valid' field
> mask.
>
> Similarly to 'ftruncate', pass the 'file' struct to 'fchmod', which
> allows filesystem handlers to get the file handle associated with the
> file descriptor passed to 'fchmod' and allows FUSE-based filesystems
> to distinguish between calls to 'chmod' and 'fchmod'.
>
> In a future patch, make a similar change to the 'fchown' and
> 'futimens' syscalls.
>
> Signed-off-by: Jose Lopes <jabolopes@gmail.com>

Tested-by: Nikolaus Rath <Nikolaus@rath.org>

Best,
-Nikolaus

Hi Al,

Can you please take another look?

Thanks,
Jose

On Mon, Nov 7, 2016 at 5:51 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
> On Nov 03 2016, Nikolaus Rath <Nikolaus-BTH8mxji4b0@public.gmane.org> wrote:
>> On Nov 03 2016, Al Viro <viro@ZenIV.linux.org.uk> wrote:
>>> On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>>>> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>>>> handlers. This makes it possible, e.g., for filesystems, such as,
>>>> FUSE, to access the file handle associated with the file descriptor
>>>> that was passed to 'ftruncate'. In the specific case of FUSE, this
>>>> also makes it possible for (userspace) FUSE-based filesystems to
>>>> distinguish between calls to 'truncate' and 'ftruncate'.
>>>
>>> Why FUSE is such a precious snowflake that it needs to make that distinction,
>>> unlike all other filesystems?
>>
>> FUSE filesystems are often used as an extra layer on top of another
>> filesystem (that stores the actual data). If the user opens a file (in
>> the fuse filesystem), deletes it, and then truncates it, FUSE currently
>> cannot do the same operation in the underlying filesystem: it receives
>> a truncate() call with the inode, but there is no syscall that allows
>> truncation for an inode. If FUSE had access to the file handle, it can
>> use that to store a file descriptor for the file on the underlying
>> filesystem and use ftruncate.
>
> In case it wasn't clear: for the case of ftruncate, fuse *does* have
> access to the file handle so the problem does not arise. For the case of
> fchmod, this is currently a problem (and the patch we're discussing
> would fix it).
>
> Best,
> -Nikolaus
>
> --
> GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
> Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
> »Time flies like an arrow, fruit flies like a Banana.«

Jose Lopes <jabolopes@gmail.com> writes:

> Hi,
>
> On Thu, Nov 3, 2016 at 9:22 AM Jean-Pierre André <jean-pierre.andre@wanadoo.fr> wrote:
>
> Al Viro wrote:
> > On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
> >> Syscall 'ftruncate' makes the 'file' struct available to filesystem
> >> handlers. This makes it possible, e.g., for filesystems, such as,
> >> FUSE, to access the file handle associated with the file descriptor
> >> that was passed to 'ftruncate'. In the specific case of FUSE, this
> >> also makes it possible for (userspace) FUSE-based filesystems to
> >> distinguish between calls to 'truncate' and 'ftruncate'.
> >
> > Why FUSE is such a precious snowflake that it needs to make that distinction,
> > unlike all other filesystems?
>
> For fuse file system which delegate the permission checks
> to user space (and have to do so because of cacheing
> issues), the write permission has to be checked for
> truncate(), and not checked for ftruncate() : the file
> may have been opened for writing and then its permissions
> set to read-only before the ftruncate() is requested.
> The user space file system can check current permissions,
> not the ones which were set when the file was opened.
>
> +1 what Jean-Pierre said.
>
> Also, I work on a FUSE-based network filesystem and the fact that we cannot
> distinguish between calls to fchmod and chmod produces incorrect results.
> For example, in the cases where a file was unlinked or moved, calling fchmod
> should apply the change directly in the open file. However, since the fchmod
> call arrives to FUSE as chmod (because of the missing file handle), FUSE will
> try to resolve the path to get to the open file, which fails because the file was
> moved or unlinked, or it will apply the change to the wrong file if in the meantime
> another file was open under the same path of the previous file.

I read through this and I agree with Al. Semantically ftruncate needs
the file handle to operate correctly. Semantically fchmod does not need
the file handle. The file handle to fchmod is just a way to pass it the
specific inode.

Given that a file handle exists presumably userspace has state cached
for this file already. So a lookup by inode in the userspace
filesystems data structures should get the job done.

Beyond that the kernel does have interfaces for dealing with things like
this if you don't want to have lots and lots of open files in userspace.
These are the system calls name_to_handle_at and open_by_handle_at.

Eric

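One possible reading of Eric's "lookup by inode in the userspace filesystems data structures" is a table, kept by the FUSE server, that maps the inode identifier carried by each request to a descriptor on the backing filesystem. The sketch below is purely illustrative: every type and function name is invented here, the table is fixed-size with no removal, and it is not taken from libfuse or from any filesystem mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 64  /* fixed power-of-two size, enough for a sketch */

/* One remembered open file; all names here are hypothetical. */
struct open_entry {
    uint64_t nodeid;     /* inode identifier that arrives with a request */
    int backing_fd;      /* descriptor on the underlying filesystem */
    int used;            /* non-zero once the slot holds an entry */
};

struct open_table {
    struct open_entry slots[TABLE_SLOTS];
};

/* Multiplicative hash; the top 6 bits select one of the 64 slots. */
static size_t slot_for(uint64_t nodeid)
{
    return (size_t)((nodeid * 0x9E3779B97F4A7C15ULL) >> 58);
}

/* Remember (or update) the descriptor for an inode. Returns -1 if full. */
static int table_insert(struct open_table *t, uint64_t nodeid, int backing_fd)
{
    size_t start, i;

    assert(t != NULL);
    start = slot_for(nodeid);
    for (i = 0; i < TABLE_SLOTS; i++) {
        struct open_entry *e = &t->slots[(start + i) % TABLE_SLOTS];
        if (!e->used || e->nodeid == nodeid) {
            e->nodeid = nodeid;
            e->backing_fd = backing_fd;
            e->used = 1;
            return 0;
        }
    }
    return -1;
}

/* Find the descriptor for an inode, or -1 if none is recorded. */
static int table_lookup(const struct open_table *t, uint64_t nodeid)
{
    size_t start, i;

    assert(t != NULL);
    start = slot_for(nodeid);
    for (i = 0; i < TABLE_SLOTS; i++) {
        const struct open_entry *e = &t->slots[(start + i) % TABLE_SLOTS];
        if (!e->used)
            return -1;      /* probe chain ends: never inserted */
        if (e->nodeid == nodeid)
            return e->backing_fd;
    }
    return -1;
}
```

With such a table, a chmod request that carries only the inode could still be turned into an fchmod() on the stored descriptor. The cost is one lookup per request, which is the trade-off Nikolaus questions in his reply.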
On Nov 17 2016, ebiederm@xmission.com (Eric W. Biederman) wrote:
> Jose Lopes <jabolopes@gmail.com> writes:
>
>> Hi,
>>
>> On Thu, Nov 3, 2016 at 9:22 AM Jean-Pierre André <jean-pierre.andre@wanadoo.fr> wrote:
>>
>> Al Viro wrote:
>> > On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>> >> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>> >> handlers. This makes it possible, e.g., for filesystems, such as,
>> >> FUSE, to access the file handle associated with the file descriptor
>> >> that was passed to 'ftruncate'. In the specific case of FUSE, this
>> >> also makes it possible for (userspace) FUSE-based filesystems to
>> >> distinguish between calls to 'truncate' and 'ftruncate'.
>> >
>> > Why FUSE is such a precious snowflake that it needs to make that distinction,
>> > unlike all other filesystems?
>>
>> For fuse file system which delegate the permission checks
>> to user space (and have to do so because of cacheing
>> issues), the write permission has to be checked for
>> truncate(), and not checked for ftruncate() : the file
>> may have been opened for writing and then its permissions
>> set to read-only before the ftruncate() is requested.
>> The user space file system can check current permissions,
>> not the ones which were set when the file was opened.
>>
>> +1 what Jean-Pierre said.
>>
>> Also, I work on a FUSE-based network filesystem and the fact that we cannot
>> distinguish between calls to fchmod and chmod produces incorrect results.
>> For example, in the cases where a file was unlinked or moved, calling fchmod
>> should apply the change directly in the open file. However, since the fchmod
>> call arrives to FUSE as chmod (because of the missing file handle), FUSE will
>> try to resolve the path to get to the open file, which fails because the file was
>> moved or unlinked, or it will apply the change to the wrong file if in the meantime
>> another file was open under the same path of the previous file.
>
> I read through this and I agree with Al. Semantically ftruncate needs
> the file handle to operate correctly. Semantically fchmod does not need
> the file handle. The file handle to fchmod is just a way to pass it the
> specific inode.

Could you explain this in more detail? What does ftruncate need the file
handle for other than to obtain the inode?

> Given that a file handle exists presumably userspace has state cached
> for this file already. So a lookup by inode in the userspace
> filesystems data structures should get the job done.

True. But passing the information from the kernel is just copying some
bytes around, obtaining it in userspace would mean a hash table lookup
for every request (including those that don't have a file handle).

I presume this is the reason why ftruncate gets the information from the
kernel (it could also just do lookup by inode). Why doesn't the same
argument apply to eg fchmod?

Best,
-Nikolaus

Nikolaus Rath <Nikolaus@rath.org> writes:

> On Nov 17 2016, ebiederm@xmission.com (Eric W. Biederman) wrote:
>> Jose Lopes <jabolopes@gmail.com> writes:
>>
>>> Hi,
>>>
>>> On Thu, Nov 3, 2016 at 9:22 AM Jean-Pierre André <jean-pierre.andre@wanadoo.fr> wrote:
>>>
>>> Al Viro wrote:
>>> > On Wed, Nov 02, 2016 at 11:53:40PM +0100, Jose Lopes wrote:
>>> >> Syscall 'ftruncate' makes the 'file' struct available to filesystem
>>> >> handlers. This makes it possible, e.g., for filesystems, such as,
>>> >> FUSE, to access the file handle associated with the file descriptor
>>> >> that was passed to 'ftruncate'. In the specific case of FUSE, this
>>> >> also makes it possible for (userspace) FUSE-based filesystems to
>>> >> distinguish between calls to 'truncate' and 'ftruncate'.
>>> >
>>> > Why FUSE is such a precious snowflake that it needs to make that distinction,
>>> > unlike all other filesystems?
>>>
>>> For fuse file system which delegate the permission checks
>>> to user space (and have to do so because of cacheing
>>> issues), the write permission has to be checked for
>>> truncate(), and not checked for ftruncate() : the file
>>> may have been opened for writing and then its permissions
>>> set to read-only before the ftruncate() is requested.
>>> The user space file system can check current permissions,
>>> not the ones which were set when the file was opened.
>>>
>>> +1 what Jean-Pierre said.
>>>
>>> Also, I work on a FUSE-based network filesystem and the fact that we cannot
>>> distinguish between calls to fchmod and chmod produces incorrect results.
>>> For example, in the cases where a file was unlinked or moved, calling fchmod
>>> should apply the change directly in the open file. However, since the fchmod
>>> call arrives to FUSE as chmod (because of the missing file handle), FUSE will
>>> try to resolve the path to get to the open file, which fails because the file was
>>> moved or unlinked, or it will apply the change to the wrong file if in the meantime
>>> another file was open under the same path of the previous file.
>>
>> I read through this and I agree with Al. Semantically ftruncate needs
>> the file handle to operate correctly. Semantically fchmod does not need
>> the file handle. The file handle to fchmod is just a way to pass it the
>> specific inode.
>
> Could you explain this in more detail? What does ftruncate need the file
> handle for other than to obtain the inode?

ftruncate requires the file to be opened for writing.

>> Given that a file handle exists presumably userspace has state cached
>> for this file already. So a lookup by inode in the userspace
>> filesystems data structures should get the job done.
>
> True. But passing the information from the kernel is just copying some
> bytes around, obtaining it in userspace would mean a hash table lookup
> for every request (including those that don't have a file handle).
>
> I presume this is the reason why ftruncate gets the information from the
> kernel (it could also just do lookup by inode). Why doesn't the same
> argument apply to eg fchmod?

fchmod does not require the file to be opened for writing.

There might be an argument for better tokens between fuse and the kernel
for inodes, but that is another story.

Eric

2016-11-17 20:20 GMT+01:00 Eric W. Biederman <ebiederm@xmission.com>:
> Nikolaus Rath <Nikolaus@rath.org> writes:
>
> ftruncate requires the file to be opened for writing.
>
> fchmod does not require the file to be opened for writing.
>

Makes sense of course. ftruncate is acting on the file, fchmod does work on
the attributes (inode information).

Stef

diff --git a/fs/open.c b/fs/open.c
index d3ed817..00214c5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -516,7 +516,8 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
 	return error;
 }
 
-static int chmod_common(const struct path *path, umode_t mode)
+static int chmod_common(const struct path *path, umode_t mode,
+			struct file *filp)
 {
 	struct inode *inode = path->dentry->d_inode;
 	struct inode *delegated_inode = NULL;
@@ -533,6 +534,10 @@ static int chmod_common(const struct path *path, umode_t mode)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
+	if (filp) {
+		newattrs.ia_file = filp;
+		newattrs.ia_valid |= ATTR_FILE;
+	}
 	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 out_unlock:
 	inode_unlock(inode);
@@ -552,7 +557,7 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 
 	if (f.file) {
 		audit_file(f.file);
-		err = chmod_common(&f.file->f_path, mode);
+		err = chmod_common(&f.file->f_path, mode, f.file);
 		fdput(f);
 	}
 	return err;
@@ -566,7 +571,7 @@ SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename, umode_t, mode
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
 	if (!error) {
-		error = chmod_common(&path, mode);
+		error = chmod_common(&path, mode, NULL);
 		path_put(&path);
 		if (retry_estale(error, lookup_flags)) {
 			lookup_flags |= LOOKUP_REVAL;

Syscall 'ftruncate' makes the 'file' struct available to filesystem
handlers. This makes it possible, e.g., for filesystems, such as,
FUSE, to access the file handle associated with the file descriptor
that was passed to 'ftruncate'. In the specific case of FUSE, this
also makes it possible for (userspace) FUSE-based filesystems to
distinguish between calls to 'truncate' and 'ftruncate'.

From an implementation point of view, this is possible because the
'ftruncate' syscall passes the 'file' struct to the underlying
filesystem handlers via the 'ia_file' field and the 'ia_valid' field
mask.

Similarly to 'ftruncate', pass the 'file' struct to 'fchmod', which
allows filesystem handlers to get the file handle associated with the
file descriptor passed to 'fchmod' and allows FUSE-based filesystems
to distinguish between calls to 'chmod' and 'fchmod'.

In a future patch, make a similar change to the 'fchown' and
'futimens' syscalls.

Signed-off-by: Jose Lopes <jabolopes@gmail.com>
---
 fs/open.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
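
The 'ia_file' / 'ia_valid' mechanism described in the commit message can be modelled outside the kernel. The following is a userspace mock, not kernel code: the `mock_` types and the flag values are stand-ins invented for this sketch, and the real definitions are those in the kernel headers. It mirrors the patched chmod_common() logic and shows how a setattr handler could tell a descriptor-based call from a path-based one.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's ATTR_* flags; the values are illustrative only. */
#define MOCK_ATTR_MODE  (1u << 0)
#define MOCK_ATTR_CTIME (1u << 1)
#define MOCK_ATTR_FILE  (1u << 2)

/* Stand-in for 'struct file'; a real one carries far more state. */
struct mock_file {
    int handle;
};

/* Stand-in for the fields of 'struct iattr' that the patch touches. */
struct mock_iattr {
    unsigned int ia_valid;      /* bitmask saying which fields are meaningful */
    unsigned int ia_mode;
    struct mock_file *ia_file;  /* only meaningful if MOCK_ATTR_FILE is set */
};

/*
 * Mirrors the patched chmod_common(): attach the file only when the caller
 * has one (fchmod), and leave the flag clear otherwise (chmod, fchmodat).
 * Unlike the kernel code, the mock also clears ia_file in the latter case.
 */
static void fill_chmod_attrs(struct mock_iattr *attrs, unsigned int mode,
                             struct mock_file *filp)
{
    assert(attrs != NULL);
    attrs->ia_mode = mode;
    attrs->ia_valid = MOCK_ATTR_MODE | MOCK_ATTR_CTIME;
    attrs->ia_file = NULL;
    if (filp) {
        attrs->ia_file = filp;
        attrs->ia_valid |= MOCK_ATTR_FILE;
    }
}

/*
 * What a filesystem's setattr handler could do with the result: return the
 * file handle for a descriptor-based call, or -1 for a path-based one.
 */
static int handle_for_setattr(const struct mock_iattr *attrs)
{
    assert(attrs != NULL);
    if ((attrs->ia_valid & MOCK_ATTR_FILE) && attrs->ia_file)
        return attrs->ia_file->handle;
    return -1;
}
```

Checking the flag before reading the pointer matters in the real structure too, since 'ia_file' is only filled in when ATTR_FILE is set in 'ia_valid'.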