diff mbox

[1/2] vfs: allow dedupe of user owned read-only files

Message ID 20180511192651.21324-2-mfasheh@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Mark Fasheh May 11, 2018, 7:26 p.m. UTC
The permission check in vfs_dedupe_file_range() is too coarse - We
only allow dedupe of the destination file if the user is root, or
they have the file open for write.

This effectively limits a non-root user from deduping their own
read-only files. As file data during a dedupe does not change,
this is unexpected behavior and this has caused a number of issue
reports. For an example, see:

https://github.com/markfasheh/duperemove/issues/129

So change the check so we allow dedupe on the target if:

- the root or admin is asking for it
- the owner of the file is asking for the dedupe
- the process has write access

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/read_write.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Darrick J. Wong May 11, 2018, 11:58 p.m. UTC | #1
On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote:
> The permission check in vfs_dedupe_file_range() is too coarse - We
> only allow dedupe of the destination file if the user is root, or
> they have the file open for write.
> 
> This effectively limits a non-root user from deduping their own
> read-only files. As file data during a dedupe does not change,
> this is unexpected behavior and this has caused a number of issue
> reports. For an example, see:
> 
> https://github.com/markfasheh/duperemove/issues/129
> 
> So change the check so we allow dedupe on the target if:
> 
> - the root or admin is asking for it
> - the owner of the file is asking for the dedupe
> - the process has write access
> 
> Signed-off-by: Mark Fasheh <mfasheh@suse.de>

Sounds fine I guess....
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/read_write.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index c4eabbfc90df..77986a2e2a3b 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -2036,7 +2036,8 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
>  
>  		if (info->reserved) {
>  			info->status = -EINVAL;
> -		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE))) {
> +		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE) ||
> +			     uid_eq(current_fsuid(), dst->i_uid))) {
>  			info->status = -EINVAL;
>  		} else if (file->f_path.mnt != dst_file->f_path.mnt) {
>  			info->status = -EXDEV;
> -- 
> 2.15.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adam Borowski May 12, 2018, 2:49 a.m. UTC | #2
On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote:
> The permission check in vfs_dedupe_file_range() is too coarse - We
> only allow dedupe of the destination file if the user is root, or
> they have the file open for write.
> 
> This effectively limits a non-root user from deduping their own
> read-only files. As file data during a dedupe does not change,
> this is unexpected behavior and this has caused a number of issue
> reports. For an example, see:
> 
> https://github.com/markfasheh/duperemove/issues/129
> 
> So change the check so we allow dedupe on the target if:
> 
> - the root or admin is asking for it
> - the owner of the file is asking for the dedupe
> - the process has write access

I submitted a similar patch in May 2016, yet it has never been applied
despite multiple pings, with no NAK.  My version allowed dedupe if:
- the root or admin is asking for it
- the file has w permission (on the inode -- ie, could have been opened rw)

There was a request to include in xfstests a test case for the ETXTBSY race
this patch fixes, but there's no reasonable way to make such a test case:
the race condition is not a bug, it's write-xor-exec working as designed.

Another idea discussed was about possibly just allowing everyone who can
open the file to deduplicate it, as the file contents are not modified in
any way.  Zygo Blaxell expressed a concern that it could be used by an
unprivileged user who can trigger a crash to abuse writeout bugs.

I like this new version better than mine: "root or owner or w" is more
Unixy than "could have been opened w".

> Signed-off-by: Mark Fasheh <mfasheh@suse.de>
> ---
>  fs/read_write.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index c4eabbfc90df..77986a2e2a3b 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -2036,7 +2036,8 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
>  
>  		if (info->reserved) {
>  			info->status = -EINVAL;
> -		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE))) {
> +		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE) ||
> +			     uid_eq(current_fsuid(), dst->i_uid))) {
I had:
  +		} else if (!(is_admin || !inode_permission(dst, MAY_WRITE))) {
>  			info->status = -EINVAL;
>  		} else if (file->f_path.mnt != dst_file->f_path.mnt) {
>  			info->status = -EXDEV;
> -- 

Meow!
Mark Fasheh May 13, 2018, 6:16 p.m. UTC | #3
Hey Adam,

On Sat, May 12, 2018 at 04:49:20AM +0200, Adam Borowski wrote:
> On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote:
> > The permission check in vfs_dedupe_file_range() is too coarse - We
> > only allow dedupe of the destination file if the user is root, or
> > they have the file open for write.
> > 
> > This effectively limits a non-root user from deduping their own
> > read-only files. As file data during a dedupe does not change,
> > this is unexpected behavior and this has caused a number of issue
> > reports. For an example, see:
> > 
> > https://github.com/markfasheh/duperemove/issues/129
> > 
> > So change the check so we allow dedupe on the target if:
> > 
> > - the root or admin is asking for it
> > - the owner of the file is asking for the dedupe
> > - the process has write access
> 
> I submitted a similar patch in May 2016, yet it has never been applied
> despite multiple pings, with no NAK.  My version allowed dedupe if:
> - the root or admin is asking for it
> - the file has w permission (on the inode -- ie, could have been opened rw)

Ahh, yes I see that now. I did wind up acking it too :)

> There was a request to include in xfstests a test case for the ETXTBSY race
> this patch fixes, but there's no reasonable way to make such a test case:
> the race condition is not a bug, it's write-xor-exec working as designed.
> 
> Another idea discussed was about possibly just allowing everyone who can
> open the file to deduplicate it, as the file contents are not modified in
> any way.  Zygo Blaxell expressed a concern that it could be used by an
> unprivileged user who can trigger a crash to abuse writeout bugs.
> 
> I like this new version better than mine: "root or owner or w" is more
> Unixy than "could have been opened w".

I agree, IMHO the behavior in this patch is intuitive. What we had before
would surprise users.

Yeah we've been careful about not letting a user dedupe some other users
file. Data-wise it technically might be ok but I can imagine several
scenarios where you might not want some other user deduping your files.

Thanks for the review,
	--Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adam Borowski May 13, 2018, 8:50 p.m. UTC | #4
On Sun, May 13, 2018 at 06:16:53PM +0000, Mark Fasheh wrote:
> On Sat, May 12, 2018 at 04:49:20AM +0200, Adam Borowski wrote:
> > On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote:
> > > The permission check in vfs_dedupe_file_range() is too coarse - We
> > > only allow dedupe of the destination file if the user is root, or
> > > they have the file open for write.
> > > 
> > > This effectively limits a non-root user from deduping their own
> > > read-only files. As file data during a dedupe does not change,
> > > this is unexpected behavior and this has caused a number of issue
> > > reports.
[...]
> > > So change the check so we allow dedupe on the target if:
> > > 
> > > - the root or admin is asking for it
> > > - the owner of the file is asking for the dedupe
> > > - the process has write access
> > 
> > I submitted a similar patch in May 2016, yet it has never been applied
> > despite multiple pings, with no NAK.  My version allowed dedupe if:
> > - the root or admin is asking for it
> > - the file has w permission (on the inode -- ie, could have been opened rw)
> 
> Ahh, yes I see that now. I did wind up acking it too :)
> > 
> > I like this new version better than mine: "root or owner or w" is more
> > Unixy than "could have been opened w".
> 
> I agree, IMHO the behavior in this patch is intuitive. What we had before
> would surprise users.

Actually, there's one reason to still consider "could have been opened w":
with it, deduplication programs can simply open the file r and not care
about ETXTBSY at all.  Otherwise, every program needs to stat() and have
logic to pick the proper argument to the open() call (r if owner/root,
rw or w if not).

I also have a sister patch: btrfs_ioctl_defrag wants the same change, for
the very same reason.  But, let's discuss dedupe first to avoid unnecessary
round trips.


Meow!
Mark Fasheh May 17, 2018, 11:01 p.m. UTC | #5
On Sun, May 13, 2018 at 10:50:25PM +0200, Adam Borowski wrote:
> On Sun, May 13, 2018 at 06:16:53PM +0000, Mark Fasheh wrote:
> > On Sat, May 12, 2018 at 04:49:20AM +0200, Adam Borowski wrote:
> > > On Fri, May 11, 2018 at 12:26:50PM -0700, Mark Fasheh wrote:
> > > > The permission check in vfs_dedupe_file_range() is too coarse - We
> > > > only allow dedupe of the destination file if the user is root, or
> > > > they have the file open for write.
> > > > 
> > > > This effectively limits a non-root user from deduping their own
> > > > read-only files. As file data during a dedupe does not change,
> > > > this is unexpected behavior and this has caused a number of issue
> > > > reports.
> [...]
> > > > So change the check so we allow dedupe on the target if:
> > > > 
> > > > - the root or admin is asking for it
> > > > - the owner of the file is asking for the dedupe
> > > > - the process has write access
> > > 
> > > I submitted a similar patch in May 2016, yet it has never been applied
> > > despite multiple pings, with no NAK.  My version allowed dedupe if:
> > > - the root or admin is asking for it
> > > - the file has w permission (on the inode -- ie, could have been opened rw)
> > 
> > Ahh, yes I see that now. I did wind up acking it too :)
> > > 
> > > I like this new version better than mine: "root or owner or w" is more
> > > Unixy than "could have been opened w".
> > 
> > I agree, IMHO the behavior in this patch is intuitive. What we had before
> > would surprise users.
> 
> Actually, there's one reason to still consider "could have been opened w":
> with it, deduplication programs can simply open the file r and not care
> about ETXTBSY at all.  Otherwise, every program needs to stat() and have
> logic to pick the proper argument to the open() call (r if owner/root,
> rw or w if not).

That makes sense. The goal after all is to be able to just open r and not
worry about it. I hadn't considered the other possibilities that
inode_permission() covers. I'll make that change for my next series.

Thanks,
  --Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/read_write.c b/fs/read_write.c
index c4eabbfc90df..77986a2e2a3b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -2036,7 +2036,8 @@  int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
 
 		if (info->reserved) {
 			info->status = -EINVAL;
-		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE))) {
+		} else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE) ||
+			     uid_eq(current_fsuid(), dst->i_uid))) {
 			info->status = -EINVAL;
 		} else if (file->f_path.mnt != dst_file->f_path.mnt) {
 			info->status = -EXDEV;