[v2] Btrfs: fix error handling in btrfs_truncate()

Message ID d181fa0f3fc581208a0258132c8dc3305b01be1a.1527007575.git.osandov@fb.com
State New

Commit Message

Omar Sandoval May 22, 2018, 4:47 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().

btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).

Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle")
Reported-by: Jun Wu <quark@fb.com>
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
This version makes the minimal fix which should be good for v4.17 and
stable. I'll submit a cleanup separately.

 fs/btrfs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
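
The ret/err distinction described in the commit message can be illustrated outside the kernel. The following is only a user-space sketch, not the kernel code: record_result_old() and record_result_new() are hypothetical helpers that mirror the two versions of the changed hunk, and NEED_TRUNCATE_BLOCK is defined locally as 1 on the assumption that it matches the value reported in this thread.

```c
#include <errno.h>

/* Assumed stand-in for the kernel constant; 1 is the value that the
 * report in this thread saw coming back from ftruncate(). */
#define NEED_TRUNCATE_BLOCK 1

/*
 * Hypothetical model of the loop body before the fix: every value that
 * is not a retry condition is copied into *err, including positive ones.
 * Returns 1 when the loop would break, 0 when it would retry.
 */
int record_result_old(int ret, int *err)
{
	if (ret != -ENOSPC && ret != -EAGAIN) {
		*err = ret;
		return 1;
	}
	return 0;
}

/*
 * Hypothetical model of the loop body after the fix: only negative
 * values (real errors) are copied into *err.
 */
int record_result_new(int ret, int *err)
{
	if (ret != -ENOSPC && ret != -EAGAIN) {
		if (ret < 0)
			*err = ret;
		return 1;
	}
	return 0;
}
```

With err initialized to 0, passing NEED_TRUNCATE_BLOCK to the old helper leaves err at 1, while the new helper leaves it at 0. Both still break out of the loop, and both still record a negative value such as -EIO.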

Comments

David Sterba May 22, 2018, 5:17 p.m. UTC | #1
On Tue, May 22, 2018 at 09:47:58AM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Jun Wu at Facebook reported that an internal service was seeing a return
> value of 1 from ftruncate() on Btrfs in some cases.

Do you have a reproducer? I'd like to estimate how likely it is to hit the
problem in practice.

> This is coming from
> the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
> 
> btrfs_truncate() uses two variables for error handling, ret and err.
> When btrfs_truncate_inode_items() returns non-zero, we set err to the
> return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
> only set err if ret is an error (i.e., negative).
> 
> Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle")
> Reported-by: Jun Wu <quark@fb.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> This version makes the minimal fix which should be good for v4.17 and
> stable. I'll submit a cleanup separately.
> 
>  fs/btrfs/inode.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index d241285a0d2a..f276da70f659 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9117,7 +9117,8 @@ static int btrfs_truncate(struct inode *inode, bool skip_writeback)
>  						 BTRFS_EXTENT_DATA_KEY);
>  		trans->block_rsv = &fs_info->trans_block_rsv;
>  		if (ret != -ENOSPC && ret != -EAGAIN) {
> -			err = ret;
> +			if (ret < 0)
> +				err = ret;
>  			break;
>  		}
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Omar Sandoval May 22, 2018, 5:37 p.m. UTC | #2
On Tue, May 22, 2018 at 07:17:48PM +0200, David Sterba wrote:
> On Tue, May 22, 2018 at 09:47:58AM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > Jun Wu at Facebook reported that an internal service was seeing a return
> > value of 1 from ftruncate() on Btrfs in some cases.
> 
> Do you have a reproducer? I'd like to estimate how likely it is to hit the
> problem in practice.

This reproduces it every time when mounted with compress-force=zstd:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
	int i;

	for (i = 0; i < 2; i++) {
		char buf[256] = { 0 };
		int fd, ret;
		char *p;

		fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
		if (fd == -1) {
			perror("open");
			return EXIT_FAILURE;
		}
		if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
			perror("write");
			close(fd);
			return EXIT_FAILURE;
		}
		close(fd);

		fd = open("test", O_RDONLY, 0666);
		if (fd == -1) {
			perror("open");
			return EXIT_FAILURE;
		}
		p = mmap(NULL, 256, PROT_READ, MAP_SHARED, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			close(fd);
			return EXIT_FAILURE;
		}
		if (p[0] != 0)
			return 1;
		close(fd);

		fd = open("test", O_WRONLY, 0666);
		if (fd == -1) {
			perror("open");
			return EXIT_FAILURE;
		}
		ret = ftruncate(fd, 128);
		if (ret) {
			printf("ftruncate() returned %d\n", ret);
			close(fd);
			return EXIT_FAILURE;
		}
		close(fd);
	}

	return EXIT_SUCCESS;
}

This happens any time NEED_TRUNCATE_BLOCK is returned. The file has to
be inline and compressed, and there's some other condition that I
haven't figured out yet which the mmap() is there for.
Omar Sandoval May 22, 2018, 5:41 p.m. UTC | #3
On Tue, May 22, 2018 at 10:37:14AM -0700, Omar Sandoval wrote:
> On Tue, May 22, 2018 at 07:17:48PM +0200, David Sterba wrote:
> > On Tue, May 22, 2018 at 09:47:58AM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > > 
> > > Jun Wu at Facebook reported that an internal service was seeing a return
> > > value of 1 from ftruncate() on Btrfs in some cases.
> > 
> > Do you have a reproducer? I'd like to estimate how likely it is to hit the
> > problem in practice.

Here's an even easier one: truncating a compressed, inline file twice:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
	char buf[256] = { 0 };
	int ret;
	int fd;

	fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
	if (fd == -1) {
		perror("open");
		return EXIT_FAILURE;
	}
	if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
		perror("write");
		close(fd);
		return EXIT_FAILURE;
	}
	close(fd);

	fd = open("test", O_WRONLY, 0666);
	if (fd == -1) {
		perror("open");
		return EXIT_FAILURE;
	}
	ret = ftruncate(fd, 128);
	if (ret) {
		printf("first ftruncate() returned %d\n", ret);
		close(fd);
		return EXIT_FAILURE;
	}
	close(fd);

	fd = open("test", O_WRONLY, 0666);
	if (fd == -1) {
		perror("open");
		return EXIT_FAILURE;
	}
	ret = ftruncate(fd, 64);
	if (ret) {
		printf("second ftruncate() returned %d\n", ret);
		close(fd);
		return EXIT_FAILURE;
	}
	close(fd);

	return EXIT_SUCCESS;
}

The output is

second ftruncate() returned 1
Omar Sandoval May 22, 2018, 5:48 p.m. UTC | #4
On Tue, May 22, 2018 at 10:41:11AM -0700, Omar Sandoval wrote:
> On Tue, May 22, 2018 at 10:37:14AM -0700, Omar Sandoval wrote:
> > On Tue, May 22, 2018 at 07:17:48PM +0200, David Sterba wrote:
> > > On Tue, May 22, 2018 at 09:47:58AM -0700, Omar Sandoval wrote:
> > > > From: Omar Sandoval <osandov@fb.com>
> > > > 
> > > > Jun Wu at Facebook reported that an internal service was seeing a return
> > > > value of 1 from ftruncate() on Btrfs in some cases.
> > > 
> > > Do you have a reproducer? I'd like to estimate how likely it is to hit the
> > > problem in practice.

Okay last one, I promise, we just need the extent items to be on disk:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
	char buf[256] = { 0 };
	int ret;
	int fd;

	fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
	if (fd == -1) {
		perror("open");
		return EXIT_FAILURE;
	}

	if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
		perror("write");
		close(fd);
		return EXIT_FAILURE;
	}

	if (fsync(fd) == -1) {
		perror("fsync");
		close(fd);
		return EXIT_FAILURE;
	}

	ret = ftruncate(fd, 128);
	if (ret) {
		printf("ftruncate() returned %d\n", ret);
		close(fd);
		return EXIT_FAILURE;
	}

	close(fd);

	return EXIT_SUCCESS;
}

Basically, any time we truncate a compressed, inline file, as long as
its extents are already on disk, we get the erroneous return value.
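
For callers, the practical consequence is that the return value falls outside what ftruncate() is documented to produce (0 on success, -1 with errno set on failure). Below is a minimal sketch of a caller-side check that keeps the three cases apart. The names truncate_status, classify_ftruncate() and checked_ftruncate() are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
enum truncate_status {
	TRUNCATE_OK,		/* ftruncate() returned 0 */
	TRUNCATE_FAILED,	/* ftruncate() returned -1; errno is set */
	TRUNCATE_UNEXPECTED	/* any other value, e.g. the 1 seen here */
};

/* Map a raw ftruncate() return value onto the three cases. */
enum truncate_status classify_ftruncate(int ret)
{
	if (ret == 0)
		return TRUNCATE_OK;
	if (ret == -1)
		return TRUNCATE_FAILED;
	return TRUNCATE_UNEXPECTED;
}

/* Call ftruncate() and classify its result in one step. */
enum truncate_status checked_ftruncate(int fd, off_t length)
{
	return classify_ftruncate(ftruncate(fd, length));
}
```

A caller that only tests for ret == -1 would silently treat the erroneous 1 as success, whereas one that tests for ret != 0, as the reproducers in this thread do, would report a failure even though the truncate itself went through.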
David Sterba May 24, 2018, 11:06 a.m. UTC | #5
On Tue, May 22, 2018 at 10:48:38AM -0700, Omar Sandoval wrote:
> On Tue, May 22, 2018 at 10:41:11AM -0700, Omar Sandoval wrote:
> > On Tue, May 22, 2018 at 10:37:14AM -0700, Omar Sandoval wrote:
> > > On Tue, May 22, 2018 at 07:17:48PM +0200, David Sterba wrote:
> > > > On Tue, May 22, 2018 at 09:47:58AM -0700, Omar Sandoval wrote:
> > > > > From: Omar Sandoval <osandov@fb.com>
> > > > > 
> > > > > Jun Wu at Facebook reported that an internal service was seeing a return
> > > > > value of 1 from ftruncate() on Btrfs in some cases.
> > > > 
> > > > Do you have a reproducer? I'd like to estimate how likely it is to hit the
> > > > problem in practice.
> 
> Okay last one, I promise, we just need the extent items to be on disk:

[...]

Thanks, I'll add it to the changelog and send the pull request later today.

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d241285a0d2a..f276da70f659 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9117,7 +9117,8 @@  static int btrfs_truncate(struct inode *inode, bool skip_writeback)
 						 BTRFS_EXTENT_DATA_KEY);
 		trans->block_rsv = &fs_info->trans_block_rsv;
 		if (ret != -ENOSPC && ret != -EAGAIN) {
-			err = ret;
+			if (ret < 0)
+				err = ret;
 			break;
 		}