From patchwork Fri Oct 25 06:32:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13850089 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A22FD18BC1C for ; Fri, 25 Oct 2024 06:32:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837963; cv=none; b=PNVPPaNhrIPYK64rXrUmScTkFE0tEDfYq2gnqjtJOHYuj98pEYttXAyzoJA5S8F0ddNhalbRw7LrTz6Q2rghBhApxZ7N0rFk41+GzvCJn7vEhwm3SpFIqE53yNyI7dv7IJfeC/LFC03BbCGhdaa1249Pw73626ixncTJBzUDPDg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837963; c=relaxed/simple; bh=HLO/KxwjDwB7nvej2QS4r8emKO3J1k0o4fQ8pP3jVVI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ZlZqW1UYhuo9Pcs5z0wxif5FGJUwMq8cU4q/ftss/sntqlEpWEXee/Xt2WXGIN+hy/QQ96S1tuA4LZbUPvDqaHiN5Wa4Evs9saNLWTeW+WjgfB9IanWcaD00jQrkbUP7ZvFR8RHr8zDyHTXDfXRlr8aAe/Z/S8GLJw/AORMB8dk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=j0ILZVPg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="j0ILZVPg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 775E1C4CEC3; Fri, 25 Oct 2024 06:32:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729837963; bh=HLO/KxwjDwB7nvej2QS4r8emKO3J1k0o4fQ8pP3jVVI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=j0ILZVPgkwMpy9huP7T238viocdFp9DkbllvDfJMDflglhHe9w9uYovjJOpHTTRfS 
uqkr0GxC4ZjsJqZWpw0CduGyzgi6+kf2+TtFjy2NT5tGgtSDJnLlyK8/6lWLx8MxJh
 Iq8CP33LRTpQ4PLQDrvyGtQDRNsatW08AjP5+dfSvGCF3vcxvJyRGTSUrVkWM3YEnp
 WoJK96V3wFRK2KBVPZluuf900msCTeVnd6b6ToUudmyiStA7ncBflWgJqGX3GvAmlG
 aqgLV6ScOr/TyANVzM24xDTVXCy4n1AsN217AVSmOutvyqWM/ff6jgb68JyZfwVGVO
 do8MQluzOZ3iQ==
Date: Thu, 24 Oct 2024 23:32:42 -0700
Subject: [PATCH 1/7] man: document file range commit ioctls
From: "Darrick J. Wong"
To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Message-ID: <172983773345.3040944.13699158820293279920.stgit@frogsfrogsfrogs>
In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Document the two new ioctls to support committing arbitrary dirty data
ranges of two files.

Signed-off-by: Darrick J. Wong
---
 man/man2/ioctl_xfs_commit_range.2 |  296 +++++++++++++++++++++++++++++++++++++
 man/man2/ioctl_xfs_fsgeometry.2   |    2
 man/man2/ioctl_xfs_start_commit.2 |    1
 3 files changed, 298 insertions(+), 1 deletion(-)
 create mode 100644 man/man2/ioctl_xfs_commit_range.2
 create mode 100644 man/man2/ioctl_xfs_start_commit.2

diff --git a/man/man2/ioctl_xfs_commit_range.2 b/man/man2/ioctl_xfs_commit_range.2
new file mode 100644
index 00000000000000..3244e52c3e0946
--- /dev/null
+++ b/man/man2/ioctl_xfs_commit_range.2
@@ -0,0 +1,296 @@
+.\" Copyright (c) 2020-2024 Oracle. All rights reserved.
+.\"
+.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
+.\" This is free documentation; you can redistribute it and/or
+.\" modify it under the terms of the GNU General Public License as
+.\" published by the Free Software Foundation; either version 2 of
+.\" the License, or (at your option) any later version.
+.\"
+.\" The GNU General Public License's references to "object code"
+.\" and "executables" are to be interpreted as the output of any
+.\" document formatting or typesetting system, including
+.\" intermediate and printed output.
+.\"
+.\" This manual is distributed in the hope that it will be useful,
+.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
+.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.\" GNU General Public License for more details.
+.\"
+.\" You should have received a copy of the GNU General Public
+.\" License along with this manual; if not, see
+.\" .
+.\" %%%LICENSE_END
+.TH IOCTL-XFS-COMMIT-RANGE 2 2024-02-18 "XFS"
+.SH NAME
+ioctl_xfs_start_commit \- prepare to exchange the contents of two files
+ioctl_xfs_commit_range \- conditionally exchange the contents of parts of two files
+.SH SYNOPSIS
+.br
+.B #include
+.br
+.B #include
+.PP
+.BI "int ioctl(int " file2_fd ", XFS_IOC_START_COMMIT, struct xfs_commit_range *" arg );
+.PP
+.BI "int ioctl(int " file2_fd ", XFS_IOC_COMMIT_RANGE, struct xfs_commit_range *" arg );
+.SH DESCRIPTION
+Given a range of bytes in a first file
+.B file1_fd
+and a second range of bytes in a second file
+.BR file2_fd ,
+this
+.BR ioctl (2)
+exchanges the contents of the two ranges if
+.B file2_fd
+passes certain freshness criteria.
+
+Before exchanging the contents, the program must call the
+.B XFS_IOC_START_COMMIT
+ioctl to sample freshness data for
+.BR file2_fd .
+If the sampled metadata does not match the file metadata at commit time,
+.B XFS_IOC_COMMIT_RANGE
+will return
+.BR EBUSY .
+.PP
+Exchanges are atomic with regard to concurrent file operations.
+Implementations must guarantee that readers see either the old contents or the
+new contents in their entirety, even if the system fails.
+.PP
+The system call parameters are conveyed in structures of the following form:
+.PP
+.in +4n
+.EX
+struct xfs_commit_range {
+    __s32 file1_fd;
+    __u32 pad;
+    __u64 file1_offset;
+    __u64 file2_offset;
+    __u64 length;
+    __u64 flags;
+    __u64 file2_freshness[5];
+};
+.EE
+.in
+.PP
+The field
+.I pad
+must be zero.
+.PP
+The fields
+.IR file1_fd ", " file1_offset ", and " length
+define the first range of bytes to be exchanged.
+.PP
+The fields
+.IR file2_fd ", " file2_offset ", and " length
+define the second range of bytes to be exchanged.
+.PP
+The field
+.I file2_freshness
+is an opaque field whose contents are determined by the kernel.
+These file attributes are used to confirm that
+.B file2_fd
+has not been changed by another thread since the current thread began staging its
+own update.
+.PP
+Both files must be from the same filesystem mount.
+If the two file descriptors represent the same file, the byte ranges must not
+overlap.
+Most disk-based filesystems require that the starts of both ranges be
+aligned to the file block size.
+If this is the case, the ends of the ranges must also be so aligned unless the
+.B XFS_EXCHANGE_RANGE_TO_EOF
+flag is set.
+
+.PP
+The field
+.I flags
+controls the behavior of the exchange operation.
+.RS 0.4i
+.TP
+.B XFS_EXCHANGE_RANGE_TO_EOF
+Ignore the
+.I length
+parameter.
+All bytes in
+.I file1_fd
+from
+.I file1_offset
+to EOF are moved to
+.IR file2_fd ,
+and file2's size is set to
+.RI ( file2_offset "+(" file1_length - file1_offset )).
+Meanwhile, all bytes in file2 from
+.I file2_offset
+to EOF are moved to file1 and file1's size is set to
+.RI ( file1_offset "+(" file2_length - file2_offset )).
+.TP
+.B XFS_EXCHANGE_RANGE_DSYNC
+Ensure that all modified in-core data in both file ranges and all metadata
+updates pertaining to the exchange operation are flushed to persistent storage
+before the call returns.
+Opening either file descriptor with
+.BR O_SYNC " or " O_DSYNC
+will have the same effect.
+.TP
+.B XFS_EXCHANGE_RANGE_FILE1_WRITTEN
+Only exchange sub-ranges of
+.I file1_fd
+that are known to contain data written by application software.
+Each sub-range may be expanded (both upwards and downwards) to align with the
+file allocation unit.
+For files on the data device, this is one filesystem block.
+For files on the realtime device, this is the realtime extent size.
+This facility can be used to implement fast atomic scatter-gather writes of any
+complexity for software-defined storage targets if all writes are aligned to
+the file allocation unit.
+.TP
+.B XFS_EXCHANGE_RANGE_DRY_RUN
+Check the parameters and the feasibility of the operation, but do not change
+anything.
+.RE
+.PP
+.SH RETURN VALUE
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.PP
+.SH ERRORS
+Error codes can be one of, but are not limited to, the following:
+.TP
+.B EBADF
+.IR file1_fd
+is not open for reading and writing or is open for append-only writes; or
+.IR file2_fd
+is not open for reading and writing or is open for append-only writes.
+.TP
+.B EBUSY
+The file2 inode number and timestamps supplied do not match
+.IR file2_fd .
+.TP
+.B EINVAL
+The parameters are not correct for these files.
+This error can also appear if either file descriptor represents
+a device, FIFO, or socket.
+Disk filesystems generally require the offset and length arguments
+to be aligned to the fundamental block sizes of both files.
+.TP
+.B EIO
+An I/O error occurred.
+.TP
+.B EISDIR
+One of the files is a directory.
+.TP
+.B ENOMEM
+The kernel was unable to allocate sufficient memory to perform the
+operation.
+.TP
+.B ENOSPC
+There is not enough free space in the filesystem to exchange the contents safely.
+.TP
+.B EOPNOTSUPP
+The filesystem does not support exchanging bytes between the two
+files.
+.TP
+.B EPERM
+.IR file1_fd " or " file2_fd
+is immutable.
+.TP
+.B ETXTBSY
+One of the files is a swap file.
+.TP
+.B EUCLEAN
+The filesystem is corrupt.
+.TP
+.B EXDEV
+.IR file1_fd " and " file2_fd
+are not on the same mounted filesystem.
+.SH CONFORMING TO
+This API is XFS-specific.
+.SH USE CASES
+.PP
+Several use cases are imagined for this system call.
+Coordination between multiple threads is performed by the kernel.
+.PP
+The first is a filesystem defragmenter, which copies the contents of a file
+into another file and wishes to exchange the space mappings of the two files,
+provided that the original file has not changed.
+.PP
+An example program might look like this:
+.PP
+.in +4n
+.EX
+int fd = open("/some/file", O_RDWR);
+int temp_fd = open("/some", O_TMPFILE | O_RDWR);
+struct stat sb;
+struct xfs_commit_range args = {
+    .flags = XFS_EXCHANGE_RANGE_TO_EOF,
+};
+
+/* gather file2's freshness information */
+ioctl(fd, XFS_IOC_START_COMMIT, &args);
+fstat(fd, &sb);
+
+/* make a fresh copy of the file with terrible alignment to avoid reflink */
+copy_file_range(fd, NULL, temp_fd, NULL, 1, 0);
+copy_file_range(fd, NULL, temp_fd, NULL, sb.st_size - 1, 0);
+
+/* commit the entire update */
+args.file1_fd = temp_fd;
+ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args);
+if (ret && errno == EBUSY)
+    printf("file changed while defrag was underway\\n");
+.EE
+.in
+.PP
+The second is a data storage program that wants to commit non-contiguous updates
+to a file atomically.
+This program cannot coordinate updates to the file and therefore relies on the
+kernel to reject the COMMIT_RANGE command if the file has been updated by
+someone else.
+This can be done by creating a temporary file, calling
+.BR FICLONE (2)
+to share the contents, and staging the updates into the temporary file.
+The
+.B XFS_EXCHANGE_RANGE_TO_EOF
+flag is recommended for this purpose.
+The temporary file can be deleted or punched out afterwards.
+.PP
+An example program might look like this:
+.PP
+.in +4n
+.EX
+int fd = open("/some/file", O_RDWR);
+int temp_fd = open("/some", O_TMPFILE | O_RDWR);
+struct xfs_commit_range args = {
+    .flags = XFS_EXCHANGE_RANGE_TO_EOF,
+};
+
+/* gather file2's freshness information */
+ioctl(fd, XFS_IOC_START_COMMIT, &args);
+
+ioctl(temp_fd, FICLONE, fd);
+
+/* append 1MB of records */
+lseek(temp_fd, 0, SEEK_END);
+write(temp_fd, data1, 1000000);
+
+/* update record index */
+pwrite(temp_fd, data1, 600, 98765);
+pwrite(temp_fd, data2, 320, 54321);
+pwrite(temp_fd, data2, 15, 0);
+
+/* commit the entire update */
+args.file1_fd = temp_fd;
+ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args);
+if (ret && errno == EBUSY)
+    printf("file changed before commit; will roll back\\n");
+.EE
+.in
+.B
+.SH NOTES
+.PP
+Some filesystems may limit the amount of data or the number of extents that can
+be exchanged in a single call.
+.SH SEE ALSO
+.BR ioctl (2)
diff --git a/man/man2/ioctl_xfs_fsgeometry.2 b/man/man2/ioctl_xfs_fsgeometry.2
index 54fd89390883c1..db7698fa922b87 100644
--- a/man/man2/ioctl_xfs_fsgeometry.2
+++ b/man/man2/ioctl_xfs_fsgeometry.2
@@ -212,7 +212,7 @@ .SH FILESYSTEM FEATURE FLAGS
 .B XFS_FSOP_GEOM_FLAGS_REFLINK
 Filesystem supports sharing blocks between files.
 .TP
-.B XFS_FSOP_GEOM_FLAGS_EXCHRANGE
+.B XFS_FSOP_GEOM_FLAGS_EXCHANGE_RANGE
 Filesystem can exchange file contents atomically via XFS_IOC_EXCHANGE_RANGE.
 .RE
 .SH XFS METADATA HEALTH REPORTING
diff --git a/man/man2/ioctl_xfs_start_commit.2 b/man/man2/ioctl_xfs_start_commit.2
new file mode 100644
index 00000000000000..f11410120f698d
--- /dev/null
+++ b/man/man2/ioctl_xfs_start_commit.2
@@ -0,0 +1 @@
+.so man2/ioctl_xfs_commit_range.2
From patchwork Fri Oct 25 06:32:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13850090 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56162170A16 for ; Fri, 25 Oct 2024 06:32:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837979; cv=none; b=O7ZcEVMV0bAmrLQX7oMYLDbnmWtnOLZ7Y/HAbpoXB7ZSKMalb8OCNzwvpzTlMjPuwfCw7TmxUgDwFxWtUHfDxczyEw5XHN3hORP9pHHu7QWVRvkoMX8XzdZrfI367bTkYufCTsIFfpSips7kE+vBjSpQmr5jhcezZaxtdDe4FCU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837979; c=relaxed/simple; bh=B/RI4XbFns3zF4U8VioPN8izM+6spRkg8jyjZr5SvFE=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ub2ty5XI9gnrvoTbSoNnaLPirsZ3uCJVyYVtF6hlOITppLhsM5Sf1C2C4i3hUVq1O16EUnY31g+Cpmlj4aKx2EbTh9AxzRol4lW1vhjaMYwxDtXFJ39vwa4j0/uObwdgl4J1ruXL6gwSoJJ86QkKIHnDd7LeMgi5hanLC4NZKEc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gFaFGz/2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gFaFGz/2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 26DAFC4CEC3; Fri, 25 Oct 2024 06:32:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729837979; bh=B/RI4XbFns3zF4U8VioPN8izM+6spRkg8jyjZr5SvFE=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=gFaFGz/26OiqzfVRe6PqX1LWkBpJ7neNBCCVyGkmAYMkrsr6BqgbzkukeJ0tdf8fo vHnYbXxEaRdCzr86YCVLXJ0LHFiWisF0FwoQjAZyF553iCLbLmRPBMQLxDqY7P+AGa F/f9ejuLgi5caEyv3LrIqS3gFwtiCfU+dO/LQqTGwft34Cec02P/teqolfmLAZYQZA 
JN81JmMX0FP1gkhXul/dgkKTDiurL/KZ6l2/HOu76jnhNydiMnau7n6p899GMfkDr8
 DRQ1UVKCgf2UlQEHKMDd7vTB1nERFU+E5Hxr0attbFf6m9Krjvu5Wg0R0MXqqKokne
 IFp6UQ8T9q8kA==
Date: Thu, 24 Oct 2024 23:32:58 -0700
Subject: [PATCH 2/7] libfrog: add support for commit range ioctl family
From: "Darrick J. Wong"
To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Message-ID: <172983773360.3040944.3327362870969316613.stgit@frogsfrogsfrogs>
In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Add some library code to support the new file range commit ioctls. This
will be used to test the atomic file commit functionality in fstests.

Signed-off-by: Darrick J. Wong
---
 libfrog/file_exchange.c |  194 +++++++++++++++++++++++++++++++++++++++++++++++
 libfrog/file_exchange.h |   10 ++
 2 files changed, 204 insertions(+)

diff --git a/libfrog/file_exchange.c b/libfrog/file_exchange.c
index 29fdc17e598ce4..e6c3f486b0ffdc 100644
--- a/libfrog/file_exchange.c
+++ b/libfrog/file_exchange.c
@@ -50,3 +50,197 @@ xfrog_exchangerange(
 
 	return 0;
 }
+
+/*
+ * Prepare for committing a file contents exchange if nobody changes file2 in
+ * the meantime by asking the kernel to sample file2's change attributes.
+ *
+ * Returns 0 for success or a negative errno.
+ */
+int
+xfrog_commitrange_prep(
+	struct xfs_commit_range	*xcr,
+	int			file2_fd,
+	off_t			file2_offset,
+	int			file1_fd,
+	off_t			file1_offset,
+	uint64_t		length)
+{
+	int			ret;
+
+	memset(xcr, 0, sizeof(*xcr));
+
+	xcr->file1_fd = file1_fd;
+	xcr->file1_offset = file1_offset;
+	xcr->length = length;
+	xcr->file2_offset = file2_offset;
+
+	ret = ioctl(file2_fd, XFS_IOC_START_COMMIT, xcr);
+	if (ret)
+		return -errno;
+
+	return 0;
+}
+
+/*
+ * Execute an exchange-commit operation. Returns 0 for success or a negative
+ * errno.
+ */
+int
+xfrog_commitrange(
+	int			file2_fd,
+	struct xfs_commit_range	*xcr,
+	uint64_t		flags)
+{
+	int			ret;
+
+	xcr->flags = flags;
+
+	ret = ioctl(file2_fd, XFS_IOC_COMMIT_RANGE, xcr);
+	if (ret)
+		return -errno;
+
+	return 0;
+}
+
+/* Opaque freshness blob for XFS_IOC_COMMIT_RANGE */
+struct xfs_commit_range_fresh {
+	xfs_fsid_t	fsid;		/* m_fixedfsid */
+	__u64		file2_ino;	/* inode number */
+	__s64		file2_mtime;	/* modification time */
+	__s64		file2_ctime;	/* change time */
+	__s32		file2_mtime_nsec; /* mod time, nsec */
+	__s32		file2_ctime_nsec; /* change time, nsec */
+	__u32		file2_gen;	/* inode generation */
+	__u32		magic;		/* zero */
+};
+
+/* magic flag to force use of swapext */
+#define XCR_SWAPEXT_MAGIC	0x43524150	/* CRAP */
+
+/*
+ * Import file2 freshness information for an XFS_IOC_SWAPEXT call from bulkstat
+ * information. We can skip the fsid and file2_gen members because old swapext
+ * did not verify those things.
+ */
+static void
+xfrog_swapext_prep(
+	struct xfs_commit_range		*xdf,
+	const struct xfs_bulkstat	*file2_stat)
+{
+	struct xfs_commit_range_fresh	*f;
+
+	f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness;
+	f->file2_ino = file2_stat->bs_ino;
+	f->file2_mtime = file2_stat->bs_mtime;
+	f->file2_mtime_nsec = file2_stat->bs_mtime_nsec;
+	f->file2_ctime = file2_stat->bs_ctime;
+	f->file2_ctime_nsec = file2_stat->bs_ctime_nsec;
+	f->magic = XCR_SWAPEXT_MAGIC;
+}
+
+/* Invoke the old swapext ioctl. */
+static int
+xfrog_ioc_swapext(
+	int				file2_fd,
+	struct xfs_commit_range		*xdf)
+{
+	struct xfs_swapext		args = {
+		.sx_version = XFS_SX_VERSION,
+		.sx_fdtarget = file2_fd,
+		.sx_length = xdf->length,
+		.sx_fdtmp = xdf->file1_fd,
+	};
+	struct xfs_commit_range_fresh	*f;
+	int				ret;
+
+	BUILD_BUG_ON(sizeof(struct xfs_commit_range_fresh) !=
+		     sizeof(xdf->file2_freshness));
+
+	f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness;
+	args.sx_stat.bs_ino = f->file2_ino;
+	args.sx_stat.bs_mtime.tv_sec = f->file2_mtime;
+	args.sx_stat.bs_mtime.tv_nsec = f->file2_mtime_nsec;
+	args.sx_stat.bs_ctime.tv_sec = f->file2_ctime;
+	args.sx_stat.bs_ctime.tv_nsec = f->file2_ctime_nsec;
+
+	ret = ioctl(file2_fd, XFS_IOC_SWAPEXT, &args);
+	if (ret) {
+		/*
+		 * Old swapext returns EFAULT if file1 or file2 length doesn't
+		 * match. The new COMMIT_RANGE doesn't check the file
+		 * length, but the freshness checks will trip and return EBUSY.
+		 * If we see EFAULT from the old ioctl, turn that into EBUSY.
+		 */
+		if (errno == EFAULT)
+			return -EBUSY;
+		return -errno;
+	}
+
+	return 0;
+}
+
+/*
+ * Prepare for defragmenting a file by committing a file contents exchange if
+ * nobody changes file2 in the meantime by asking the kernel to sample
+ * file2's change attributes.
+ *
+ * If the kernel supports only the old XFS_IOC_SWAPEXT ioctl, the @file2_stat
+ * information will be used to sample the change attributes.
+ *
+ * Returns 0 or a negative errno.
+ */
+int
+xfrog_defragrange_prep(
+	struct xfs_commit_range		*xdf,
+	int				file2_fd,
+	const struct xfs_bulkstat	*file2_stat,
+	int				file1_fd)
+{
+	int				ret;
+
+	memset(xdf, 0, sizeof(*xdf));
+
+	xdf->file1_fd = file1_fd;
+	xdf->length = file2_stat->bs_size;
+
+	ret = ioctl(file2_fd, XFS_IOC_START_COMMIT, xdf);
+	if (ret && (errno == EOPNOTSUPP || errno == ENOTTY)) {
+		xfrog_swapext_prep(xdf, file2_stat);
+		return 0;
+	}
+	if (ret)
+		return -errno;
+
+	return 0;
+}
+
+/* Execute an exchange operation. Returns 0 for success or a negative errno. */
+int
+xfrog_defragrange(
+	int				file2_fd,
+	struct xfs_commit_range		*xdf)
+{
+	struct xfs_commit_range_fresh	*f;
+	int				ret;
+
+	f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness;
+	if (f->magic == XCR_SWAPEXT_MAGIC)
+		goto legacy_fallback;
+
+	ret = ioctl(file2_fd, XFS_IOC_COMMIT_RANGE, xdf);
+	if (ret) {
+		if (errno == EOPNOTSUPP || errno == ENOTTY)
+			goto legacy_fallback;
+		return -errno;
+	}
+
+	return 0;
+
+legacy_fallback:
+	ret = xfrog_ioc_swapext(file2_fd, xdf);
+	if (ret)
+		return ret;
+
+	return 0;
+}
diff --git a/libfrog/file_exchange.h b/libfrog/file_exchange.h
index b6f6f9f698a8c9..98d3b867c317ee 100644
--- a/libfrog/file_exchange.h
+++ b/libfrog/file_exchange.h
@@ -12,4 +12,14 @@ void xfrog_exchangerange_prep(struct xfs_exchange_range *fxr,
 int xfrog_exchangerange(int file2_fd, struct xfs_exchange_range *fxr,
 		uint64_t flags);
 
+int xfrog_commitrange_prep(struct xfs_commit_range *xcr, int file2_fd,
+		off_t file2_offset, int file1_fd, off_t file1_offset,
+		uint64_t length);
+int xfrog_commitrange(int file2_fd, struct xfs_commit_range *xcr,
+		uint64_t flags);
+
+int xfrog_defragrange_prep(struct xfs_commit_range *xdf, int file2_fd,
+		const struct xfs_bulkstat *file2_stat, int file1_fd);
+int xfrog_defragrange(int file2_fd, struct xfs_commit_range *xdf);
+
 #endif /* __LIBFROG_FILE_EXCHANGE_H__ */
From patchwork Fri Oct 25 06:33:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13850091 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4103418BC1C for ; Fri, 25 Oct 2024 06:33:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837995; cv=none; b=TOw9SxhZgUEtUGIJlpqdCCVp5LCf1BPX6XwFKf4XtiyW9NUn2IPyzs8fVgDoXIQCQGHyG4xRiVnSN03onkTSdLOa+tqxQSrZ54lfs/MsobChGXvt7kP9yjMI+I66p1vgWsid3Gm23BKXshMLzcVZ3QQnIt548Z+QYR9iw341FB0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729837995; c=relaxed/simple; bh=4DS8L5WVIa4gPNTrXKdfv8k0OFViohtUcUB3oluJjQQ=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=oAz2kkgA+lpwGn6r/hSyAk9Ao+rL1h13s0mFuoxIL8LZBLJh4NPT6sUm+2v01JgUZFErOMj9TS46Kt6BrvCeMVXAW2nHVgV3AME4AXIiE8Cw66JWALrzu8DIDbSIy/LNiuGIKbjxIXMOfLXYpavLu1dHRgEIVV//dYh9uD+pspQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=A68byGc6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="A68byGc6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B9B74C4CEC3; Fri, 25 Oct 2024 06:33:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729837994; bh=4DS8L5WVIa4gPNTrXKdfv8k0OFViohtUcUB3oluJjQQ=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=A68byGc6kg3SK92WP4910LDeTdZL+FKKEhk+CXY9i0tJSSaFMltdsrZHHbnMqmbGr SwO0SNqQ23TSjnn/0RVNzHsxoRc9EIadyt25qP1dWqjDUUwuwMQJtA02qUkgI94wuq 0i9QmHslGw/rvLW9YuaF30BDctGJXI18gST82pjasqvAuImu5pL5ygWCIJFSQ3G7PR 
qbSWpxcCJEdAkxgi+fVhi4zdi4Lafeji3Y+90BhgzdfIIFkxhWl82nZQgiuGbx2jpc
 RdZLSVWh6ZKpwKNj/MQp2xPfCr1vM71DrRE5pILLyElZ5pZs097cJTUx4pcQkiEznw
 1D0Hf18iyXCmw==
Date: Thu, 24 Oct 2024 23:33:14 -0700
Subject: [PATCH 3/7] libxfs: remove unused xfs_inode fields
From: "Darrick J. Wong"
To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Message-ID: <172983773375.3040944.3625887079395000900.stgit@frogsfrogsfrogs>
In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Remove these unused fields; on the author's system this reduces the
struct size from 560 bytes to 448.

Signed-off-by: Darrick J. Wong
---
 include/xfs_inode.h |    4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 170cc5288d3645..f250102ff19d65 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -215,7 +215,6 @@ typedef struct xfs_inode {
 	struct xfs_mount	*i_mount;	/* fs mount struct ptr */
 	xfs_ino_t		i_ino;		/* inode number (agno/agino) */
 	struct xfs_imap		i_imap;		/* location for xfs_imap() */
-	struct xfs_buftarg	i_dev;		/* dev for this inode */
 	struct xfs_ifork	*i_cowfp;	/* copy on write extents */
 	struct xfs_ifork	i_df;		/* data fork */
 	struct xfs_ifork	i_af;		/* attribute fork */
@@ -239,9 +238,6 @@ typedef struct xfs_inode {
 	xfs_agino_t		i_next_unlinked;
 	xfs_agino_t		i_prev_unlinked;
 
-	xfs_extnum_t		i_cnextents;	/* # of extents in cow fork */
-	unsigned int		i_cformat;	/* format of cow fork */
-
 	xfs_fsize_t		i_size;		/* in-memory size */
 	struct inode		i_vnode;
 } xfs_inode_t;
From patchwork Fri Oct 25 06:33:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13850092 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6C2E1CF96 for ; Fri, 25 Oct 2024 06:33:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838010; cv=none; b=W/VnJ3cbKmCbKAPuCtxjvpsmX8IRLOnIzQdww24dx/yRLLRTIx8pLfOlqkXLwUz4ik9aWoGB/jKTFfqUaz6GJU0RGFepJM0+NRPSfIcFqmKwfy+dnvpOjtXHLLWCCKfGNuw2BverLuHrzxz2GHsdi+6OV7EdHy6UDV5HjNaEBmQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838010; c=relaxed/simple; bh=KWJkctQUa0j6Xx07tu6NhDAg+q+dMKXu7XWIh1DO1io=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=J4aRaKdAD8Y0NGQLgEyvf6DHwpSy4MZhT0CvhoDKvmG26Z50VGrJWar9RzvA27wsOKHofAx8ifXvz4ctzRsDCMtIanhf1Sdd9zrX4f+E/NoYftTe67QWemol/GhxgVCZikAuUlCpu84fiotXFZU7jTkquIEncyte4nZ7y6MHFjg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=p0zeh77J; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="p0zeh77J" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 56194C4CEC3; Fri, 25 Oct 2024 06:33:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729838010; bh=KWJkctQUa0j6Xx07tu6NhDAg+q+dMKXu7XWIh1DO1io=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=p0zeh77JJ/Gq6ga0X33YsVqfyd5XMtrtfc2XJg5g7FEQsNBkiuKjUROpZkhLobIDH BCJNA26U7H6Cu8aNlL7XTjz2KdRgggQsORtAJ/RUUiRmKbL/lsdeZG2mf2M7mgzcBd mAr//z6fgVv7+EIOJscFFRK5u05TRNgr9oIyPfofSUisSqVg7bn4YJ4ybxID/hnu59 
e/15JRKLyes5BpGPE7pClCZdYa3+PQ3NVMHSgNXNMuPurvIQlKJYafTUcQ2qPJMRXG
 Dqaoh9tvbfoxpF0tdk6pijbNSphTESEthpUm7aJy/BiX2S/5VG4POF7asuzEzdR1J1
 TA17likkdnpyw==
Date: Thu, 24 Oct 2024 23:33:29 -0700
Subject: [PATCH 4/7] libxfs: validate inumber in xfs_iget
From: "Darrick J. Wong"
To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Message-ID: <172983773390.3040944.4686468112875777629.stgit@frogsfrogsfrogs>
In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Actually use the inumber validator to check the argument passed in here,
just like we now do in the kernel.

Signed-off-by: Darrick J. Wong
---
 libxfs/inode.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libxfs/inode.c b/libxfs/inode.c
index 2062ecf54486cf..9230ad24a5cb6c 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -143,7 +143,7 @@ libxfs_iget(
 	int			error = 0;
 
 	/* reject inode numbers outside existing AGs */
-	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
+	if (!xfs_verify_ino(mp, ino))
 		return -EINVAL;
 
 	ip = kmem_cache_zalloc(xfs_inode_cache, 0);
From patchwork Fri Oct 25 06:33:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13850093 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7254E170A16 for ; Fri, 25 Oct 2024 06:33:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838026; cv=none; b=bgUVs4V5k/XKYRK1shY4UVfDp9IRYZgD/TBrrOrO2tiDi0HH7TrQun5SBKFMThzyVE9MXcalHnRPQf9l+QlDo7vnKfgHvlgiuuL5VfgvS1JvMTazUdtK15h388juBf0eGOkscsskV99UPvOCAaJwnqMwK0wjCE14e55It7VA/wg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838026; c=relaxed/simple; bh=vS77g9y2vwqQciMlX71HgjaBSrkw3Kt6j/3hcuLOVV8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GUu8BUObJn2sAhP8vlR6YxoZY/GUug6dArvwwOmf7pqiCLAEiG7S0OPpyyzqdyOIMipvX81kSVm74AfE2yHirm00Zvs3bb4XMxyT5Z1vUwLecNQyE/jDnLW1cfjBvUl56BJ6xAB3h748Y7aXH1BLKN8fN4kuBKhWB4BNPyqIrmY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=qwACuzKf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="qwACuzKf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 04173C4CEC3; Fri, 25 Oct 2024 06:33:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729838026; bh=vS77g9y2vwqQciMlX71HgjaBSrkw3Kt6j/3hcuLOVV8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=qwACuzKfOsDcBIjMSJbcqbUiO35SQlzEwWazP+v4GLF7P0beSglT6I2ObbsE0vVa1 jkzh+SfXdW3wrdzKJpewURRkp468AshE2HxStAsPi2XJNNdgkYx8KJti4rkfQDbfwq i3lbc2NscOs7KieY7xU8y8/n1u7rLu1XWT9Cjaw5qJsBS4VYWJ3tgFv2gOc4uCmh56 
L1nHQlGu8xg9esY1rgDMNRwder79pKzV8pT0tdnUlgZFZyl7t2XNi0CZOgbjvOFULB
 CRdLr+XDhPZMRadc9xSZs9fuJovm6ieQNznnfjMy6MI/eFQ+GYAxJUHOyJ6bHBg38D
 vm+SP1tbC7VvQ==
Date: Thu, 24 Oct 2024 23:33:45 -0700
Subject: [PATCH 5/7] xfs_fsr: port to new file exchange library function
From: "Darrick J. Wong"
To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Message-ID: <172983773405.3040944.316630518119574344.stgit@frogsfrogsfrogs>
In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Port fsr to use the new libfrog library functions to handle exchanging
mappings between the target and donor files.

Signed-off-by: Darrick J. Wong
---
 fsr/xfs_fsr.c |   74 ++++++++++++++++++++++++++-------------------------------
 1 file changed, 34 insertions(+), 40 deletions(-)

diff --git a/fsr/xfs_fsr.c b/fsr/xfs_fsr.c
index 22e134adfd73ab..8845ff172fcb2e 100644
--- a/fsr/xfs_fsr.c
+++ b/fsr/xfs_fsr.c
@@ -13,6 +13,7 @@
 #include "libfrog/paths.h"
 #include "libfrog/fsgeom.h"
 #include "libfrog/bulkstat.h"
+#include "libfrog/file_exchange.h"
 
 #include
 #include
@@ -122,12 +123,6 @@ open_handle(
 	return 0;
 }
 
-static int
-xfs_swapext(int fd, xfs_swapext_t *sx)
-{
-	return ioctl(fd, XFS_IOC_SWAPEXT, sx);
-}
-
 static int
 xfs_fscounts(int fd, xfs_fsop_counts_t *counts)
 {
@@ -1189,14 +1184,13 @@ packfile(
 	struct xfs_bulkstat	*statp,
 	struct fsxattr		*fsxp)
 {
+	struct xfs_commit_range	xdf;
 	int			tfd = -1;
-	int			srval;
 	int			retval = -1;	/* Failure is the default */
 	int			nextents, extent, cur_nextents, new_nextents;
 	unsigned		blksz_dio;
 	unsigned		dio_min;
 	struct dioattr		dio;
-	static xfs_swapext_t	sx;
 	struct xfs_flock64	space;
 	off_t			cnt, pos;
 	void			*fbuf = NULL;
@@ -1239,6 +1233,16 @@ packfile(
 		goto out;
 	}
 
+	/*
+	 * Snapshot file_fd before we start copying data but after tweaking
+	 * forkoff.
+	 */
+	error = -xfrog_defragrange_prep(&xdf, file_fd->fd, statp, tfd);
+	if (error) {
+		fsrprintf(_("failed to prep for defrag: %s\n"), strerror(error));
+		goto out;
+	}
+
 	/* Setup extended inode flags, project identifier, etc */
 	if (fsxp->fsx_xflags || fsxp->fsx_projid) {
 		if (ioctl(tfd, FS_IOC_FSSETXATTR, fsxp) < 0) {
@@ -1446,19 +1450,6 @@ packfile(
 		goto out;
 	}
 
-	error = -xfrog_bulkstat_v5_to_v1(file_fd, &sx.sx_stat, statp);
-	if (error) {
-		fsrprintf(_("bstat conversion error on %s: %s\n"),
-				fname, strerror(error));
-		goto out;
-	}
-
-	sx.sx_version = XFS_SX_VERSION;
-	sx.sx_fdtarget = file_fd->fd;
-	sx.sx_fdtmp = tfd;
-	sx.sx_offset = 0;
-	sx.sx_length = statp->bs_size;
-
 	/* switch to the owner's id, to keep quota in line */
 	if (fchown(tfd, statp->bs_uid, statp->bs_gid) < 0) {
 		if (vflag)
@@ -1468,25 +1459,28 @@ packfile(
 	}
 
 	/* Swap the extents */
-	srval = xfs_swapext(file_fd->fd, &sx);
-	if (srval < 0) {
-		if (errno == ENOTSUP) {
-			if (vflag || dflag)
-				fsrprintf(_("%s: file type not supported\n"), fname);
-		} else if (errno == EFAULT) {
-			/* The file has changed since we started the copy */
-			if (vflag || dflag)
-				fsrprintf(_("%s: file modified defrag aborted\n"),
-					  fname);
-		} else if (errno == EBUSY) {
-			/* Timestamp has changed or mmap'ed file */
-			if (vflag || dflag)
-				fsrprintf(_("%s: file busy\n"), fname);
-		} else {
-			fsrprintf(_("XFS_IOC_SWAPEXT failed: %s: %s\n"),
-				  fname, strerror(errno));
-		}
-		goto out;
+	error = -xfrog_defragrange(file_fd->fd, &xdf);
+	switch (error) {
+	case 0:
+		break;
+	case ENOTSUP:
+		if (vflag || dflag)
+			fsrprintf(_("%s: file type not supported\n"), fname);
+		break;
+	case EFAULT:
+		/* The file has changed since we started the copy */
+		if (vflag || dflag)
+			fsrprintf(_("%s: file modified defrag aborted\n"),
+				  fname);
+		break;
+	case EBUSY:
+		/* Timestamp has changed or mmap'ed file */
+		if (vflag || dflag)
+			fsrprintf(_("%s: file busy\n"), fname);
+		break;
+	default:
+		fsrprintf(_("XFS_IOC_SWAPEXT failed: %s: %s\n"),
+			  fname, strerror(error));
 	}
 
 	/* Report progress */
From patchwork Fri Oct 25 06:34:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13850094
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BAB9D1CF96 for ; Fri, 25 Oct 2024 06:34:01 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838041; cv=none; b=ZWJQeDwjaQZVfTtLRqowozhfOiVyvaXeTQc3xI3EYQx+pM82CLYIgG4giKtU67t8Uv4Gkqu4McX59F135eYqzH76dJZETTXSXyjgwT82Ig17+mgN5GWKXg5/djrMQF1S8DE23h/oywT+llUAhJo68UmpRBAPXRTGq61sLjqcITk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838041; c=relaxed/simple; bh=rXHBFprqFZ0S+yOqBS1KfHMnVEFYdReAsH5q7AKWnU0=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GwU1Jdck9STUa2Vnlqg9uOSqE8X0lSej4HQa8e5GD00se3I0cFLD4YZEhRqDmN6IciKPRE7DkP+cymW3omhWrz9IFRqw70uJ8IeGB7i7qNeRALm9yZpceqB8MKvOUdac+rLdvMRjWcEHGU6bf3QwTn1jpnCoU7lS3CYG7viJefA=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pNGIqZBy; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pNGIqZBy"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 96CD0C4CEC3; Fri, 25 Oct 2024 06:34:01 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729838041; bh=rXHBFprqFZ0S+yOqBS1KfHMnVEFYdReAsH5q7AKWnU0=;
h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=pNGIqZByqLR5d2LeEaPUP8lCsBz47uEOTHq08tMkd64hptYwkL6YaAmtwjH7B+wOV sUQ0mVSj8q4lTQ0CaLDlVGkcqmYJwGvHdhE7hbfLWYQl3k66ia9/lMuG+UpI0q/eiw eHsUbfUdgoIFexCzdGRv8rYPh8fTcHghYbcZMewgx+WZ4QoItwA22NnK69P6HdD5pc i1jQt1VSndO8jDCicHGdgeBfmfDy4fkA+rIl5rKGMh7MQNuew7JFOKlwoPAPJ0yaEY +eZnK3LKQXSduBVDYMdF+Ijc47c12dbhTectfp8AzsXiPCJH4ZSAO8+j6FX5B8ogUO k07q+HGaiJWTg== Date: Thu, 24 Oct 2024 23:34:01 -0700 Subject: [PATCH 6/7] xfs_io: add a commitrange option to the exchangerange command From: "Darrick J. Wong" To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <172983773420.3040944.255060526461776194.stgit@frogsfrogsfrogs> In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs> References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Teach the xfs_io exchangerange command to be able to use the commit range functionality so that we can test it piece by piece. Signed-off-by: Darrick J. 
Wong --- io/exchrange.c | 26 ++++++++++++++++++++++---- man/man8/xfs_io.8 | 3 +++ 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/io/exchrange.c b/io/exchrange.c index 016429280e2717..0a3750f1eb2607 100644 --- a/io/exchrange.c +++ b/io/exchrange.c @@ -19,6 +19,7 @@ exchangerange_help(void) "\n" " Exchange file data between the open file descriptor and the supplied filename.\n" " -C -- Print timing information in a condensed format\n" +" -c -- Commit to the exchange only if file2 has not changed.\n" " -d N -- Start exchanging contents at this position in the open file\n" " -f -- Flush changed file data and metadata to disk\n" " -l N -- Exchange this many bytes between the two files instead of to EOF\n" @@ -34,9 +35,9 @@ exchangerange_f( int argc, char **argv) { - struct xfs_exchange_range fxr; struct stat stat; struct timeval t1, t2; + bool use_commit = false; uint64_t flags = XFS_EXCHANGE_RANGE_TO_EOF; int64_t src_offset = 0; int64_t dest_offset = 0; @@ -53,6 +54,9 @@ exchangerange_f( case 'C': condensed = 1; break; + case 'c': + use_commit = true; + break; case 'd': dest_offset = cvtnum(fsblocksize, fssectsize, optarg); if (dest_offset < 0) { @@ -117,8 +121,22 @@ exchangerange_f( if (length < 0) length = stat.st_size; - xfrog_exchangerange_prep(&fxr, dest_offset, fd, src_offset, length); - ret = xfrog_exchangerange(file->fd, &fxr, flags); + if (use_commit) { + struct xfs_commit_range xcr; + + ret = xfrog_commitrange_prep(&xcr, file->fd, dest_offset, fd, + src_offset, length); + if (!ret) { + gettimeofday(&t1, NULL); + ret = xfrog_commitrange(file->fd, &xcr, flags); + } + } else { + struct xfs_exchange_range fxr; + + xfrog_exchangerange_prep(&fxr, dest_offset, fd, src_offset, + length); + ret = xfrog_exchangerange(file->fd, &fxr, flags); + } if (ret) { xfrog_perror(ret, "exchangerange"); exitcode = 1; @@ -149,7 +167,7 @@ static struct cmdinfo exchangerange_cmd = { void exchangerange_init(void) { - exchangerange_cmd.args = _("[-Cfntw] [-d dest_offset] 
[-s src_offset] [-l length] "); + exchangerange_cmd.args = _("[-Ccfntw] [-d dest_offset] [-s src_offset] [-l length] "); exchangerange_cmd.oneline = _("Exchange contents between files."); add_command(&exchangerange_cmd); diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8 index 1e7901393ff4d4..49d4057bb069ed 100644 --- a/man/man8/xfs_io.8 +++ b/man/man8/xfs_io.8 @@ -732,6 +732,9 @@ .SH FILE I/O COMMANDS .B \-C Print timing information in a condensed format. .TP +.B \-c +Exchange contents only if file2 has not changed. +.TP .BI \-d " dest_offset" Swap extents with open file beginning at .IR dest_offset . From patchwork Fri Oct 25 06:34:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13850095 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6FC921CF96 for ; Fri, 25 Oct 2024 06:34:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838057; cv=none; b=XdJLblLHIDh/hFJB+Bbgqoyv+tPbmGshitYFo1wyyJmtXJ0qiOikmT0W2vZFvb3l/M8v07cZeI9SMstTtpbq6PLDaTnCi6J+9cy9hQdkZ6JoYRZXymN7leOeH+sUghcvSz0zVQgihWaoo2EupcVv9uPfUtPeF/IgTr4Gca9qqao= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729838057; c=relaxed/simple; bh=VMlnZuHjBiD6M2h6NQalcon/08D20hDwYhTgose8m2A=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=if5xaADnK7BOq6i/gyp9qj7DQ7XsXBlG9Wf6LhrsFMILIWmd/B1YdCNFcOnZeBaZkdxHIaPMQXOIRlVeITIddNySlMk1dNXy/0cZlvvBvq7g725jq91zUZSaskQhkAu5pXsaK5P8PXh0WbkF+PZlMs3SxX4kDLB2W+DCbMFU/FY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b=Jd5HdjP4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Jd5HdjP4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3CE2CC4CEC3; Fri, 25 Oct 2024 06:34:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729838057; bh=VMlnZuHjBiD6M2h6NQalcon/08D20hDwYhTgose8m2A=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Jd5HdjP45xgn8T71T/R4jGhaIff/s9INwDaAPeLY9jPbMzTbR9six3VSj3tOpI6qv 7y2IfhX0YstcmIouQ6RetYsXgAM/7VXMLi9b6e6eXKNoeDchadXCR7As+xLCVJweKI B3BFEMTGTg8uY3DgypPwenPh8HKbNpdIr8gtQhcF0/XSWJt8SMaKJ7zONUDnNFM2QE 6kzzC/TFK1qm13tDvHxXUCmkQkFlXxAxG5IU8A/qFSsERcdExeF01RHrxC7bNa6e9s GwGEtxHEc8IwHyIcXE752Ewc5s4mCxOpZNsE/ePI0a3J2C3xTWNbCJNw4ygKrFgqkv l+8OiqY2Q+GkQ== Date: Thu, 24 Oct 2024 23:34:16 -0700 Subject: [PATCH 7/7] xfs_io: add atomic file update commands to exercise file commit range From: "Darrick J. Wong" To: cem@kernel.org, aalbersh@kernel.org, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <172983773435.3040944.11571503838591968979.stgit@frogsfrogsfrogs> In-Reply-To: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs> References: <172983773323.3040944.5615240418900510348.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add three commands to xfs_io so that we can exercise atomic file updates as provided by reflink and the start-commit / commit-range functionality. Signed-off-by: Darrick J. 
Wong --- io/exchrange.c | 364 +++++++++++++++++++++++++++++++++++++++++++++++++++++ io/io.h | 4 + io/open.c | 27 +++- man/man8/xfs_io.8 | 32 +++++ 4 files changed, 421 insertions(+), 6 deletions(-) diff --git a/io/exchrange.c b/io/exchrange.c index 0a3750f1eb2607..707d78d8e624fe 100644 --- a/io/exchrange.c +++ b/io/exchrange.c @@ -164,6 +164,358 @@ static struct cmdinfo exchangerange_cmd = { .help = exchangerange_help, }; +/* Atomic file updates commands */ + +struct update_info { + /* File that we're updating. */ + int fd; + + /* ioctl data to commit the changes */ + struct xfs_commit_range xcr; + + /* Name of the file we're updating. */ + char *old_fname; + + /* fd we're using to stage the updates. */ + int temp_fd; +}; + +enum finish_how { + FINISH_ABORT, + FINISH_COMMIT, + FINISH_CHECK +}; + +static struct update_info *updates; +static unsigned int nr_updates; + +static void +startupdate_help(void) +{ + printf(_( +"\n" +" Prepare for an atomic file update, if supported by the filesystem.\n" +" A temporary file will be opened for writing and inserted into the file\n" +" table. The current file will be changed to this temporary file. Neither\n" +" file can be closed for the duration of the update.\n" +"\n" +" -e -- Start with an empty file\n" +"\n")); +} + +static int +startupdate_f( + int argc, + char *argv[]) +{ + struct fsxattr attr; + struct xfs_fsop_geom fsgeom; + struct fs_path fspath; + struct stat stat; + struct update_info *p; + char *fname; + char *path = NULL, *d; + size_t fname_len; + int flags = IO_TMPFILE | IO_ATOMICUPDATE; + int temp_fd = -1; + bool clone_file = true; + int c; + int ret; + + while ((c = getopt(argc, argv, "e")) != -1) { + switch (c) { + case 'e': + clone_file = false; + break; + default: + startupdate_help(); + return 0; + } + } + if (optind != argc) { + startupdate_help(); + return 0; + } + + /* Allocate a new slot. 
*/ + p = realloc(updates, (++nr_updates) * sizeof(*p)); + if (!p) { + perror("startupdate realloc"); + goto fail; + } + updates = p; + + /* Fill out the update information so that we can commit later. */ + p = &updates[nr_updates - 1]; + memset(p, 0, sizeof(*p)); + + ret = fstat(file->fd, &stat); + if (ret) { + perror(file->name); + goto fail; + } + + /* Is the current file realtime? If so, the temp file must match. */ + ret = ioctl(file->fd, FS_IOC_FSGETXATTR, &attr); + if (ret == 0 && attr.fsx_xflags & FS_XFLAG_REALTIME) + flags |= IO_REALTIME; + + /* Compute path to the directory that the current file is in. */ + path = strdup(file->name); + d = strrchr(path, '/'); + if (!d) { + fprintf(stderr, _("%s: cannot compute dirname?"), path); + goto fail; + } + *d = 0; + + /* Open a temporary file to stage the new contents. */ + temp_fd = openfile(path, &fsgeom, flags, 0600, &fspath); + if (temp_fd < 0) { + perror(path); + goto fail; + } + + /* + * Snapshot the original file metadata in anticipation of the later + * file mapping exchange request. + */ + ret = xfrog_commitrange_prep(&p->xcr, file->fd, 0, temp_fd, 0, + stat.st_size); + if (ret) { + perror("update prep"); + goto fail; + } + + /* Clone all the data from the original file into the temporary file. */ + if (clone_file) { + ret = ioctl(temp_fd, XFS_IOC_CLONE, file->fd); + if (ret) { + perror(path); + goto fail; + } + } + + /* Prepare a new path string for the duration of the update. */ +#define FILEUPDATE_STR " (fileupdate)" + fname_len = strlen(file->name) + strlen(FILEUPDATE_STR); + fname = malloc(fname_len + 1); + if (!fname) { + perror("new path"); + goto fail; + } + snprintf(fname, fname_len + 1, "%s%s", file->name, FILEUPDATE_STR); + + /* + * Install the temporary file into the same slot of the file table as + * the original file. Ensure that the original file cannot be closed. 
+ */ + file->flags |= IO_ATOMICUPDATE; + p->old_fname = file->name; + file->name = fname; + p->fd = file->fd; + p->temp_fd = file->fd = temp_fd; + + free(path); + return 0; +fail: + if (temp_fd >= 0) + close(temp_fd); + free(path); + nr_updates--; + exitcode = 1; + return 1; +} + +static long long +finish_update( + enum finish_how how, + uint64_t flags, + long long *offset) +{ + struct update_info *p; + long long committed_bytes = 0; + size_t length; + unsigned int i; + unsigned int upd_offset; + int temp_fd; + int ret; + + /* Find our update descriptor. */ + for (i = 0, p = updates; i < nr_updates; i++, p++) { + if (p->temp_fd == file->fd) + break; + } + + if (i == nr_updates) { + fprintf(stderr, + _("Current file is not the staging file for an atomic update.\n")); + exitcode = 1; + return -1; + } + + /* + * Commit our changes, if desired. If the mapping exchange fails, we + * stop processing immediately so that we can run more xfs_io commands. + */ + switch (how) { + case FINISH_CHECK: + flags |= XFS_EXCHANGE_RANGE_DRY_RUN; + fallthrough; + case FINISH_COMMIT: + ret = xfrog_commitrange(p->fd, &p->xcr, flags); + if (ret) { + xfrog_perror(ret, _("committing update")); + exitcode = 1; + return -1; + } + printf(_("Committed updates to '%s'.\n"), p->old_fname); + *offset = p->xcr.file2_offset; + committed_bytes = p->xcr.length; + break; + case FINISH_ABORT: + printf(_("Cancelled updates to '%s'.\n"), p->old_fname); + break; + } + + /* + * Reset the filetable to point to the original file, and close the + * temporary file. + */ + free(file->name); + file->name = p->old_fname; + file->flags &= ~IO_ATOMICUPDATE; + temp_fd = file->fd; + file->fd = p->fd; + ret = close(temp_fd); + if (ret) + perror(_("closing temporary file")); + + /* Remove the atomic update context, shifting things down. 
*/ + upd_offset = p - updates; + length = nr_updates * sizeof(struct update_info); + length -= (upd_offset + 1) * sizeof(struct update_info); + if (length) + memmove(p, p + 1, length); + + nr_updates--; + return committed_bytes; +} + +static void +cancelupdate_help(void) +{ + printf(_( +"\n" +" Cancels an atomic file update. The temporary file will be closed, and the\n" +" current file set back to the original file.\n" +"\n")); +} + +static int +cancelupdate_f( + int argc, + char *argv[]) +{ + return finish_update(FINISH_ABORT, 0, NULL); +} + +static void +commitupdate_help(void) +{ + printf(_( +"\n" +" Commits an atomic file update. File contents written to the temporary file\n" +" will be exchanged atomically with the corresponding range in the original\n" +" file. The temporary file will be closed, and the current file set back to\n" +" the original file.\n" +"\n" +" -C -- Print timing information in a condensed format.\n" +" -h -- Only exchange written ranges in the temporary file.\n" +" -k -- Exchange to end of file, ignore any length previously set.\n" +" -n -- Check parameters but do not change anything.\n" +" -q -- Do not print timing information at all.\n")); +} + +static int +commitupdate_f( + int argc, + char *argv[]) +{ + struct timeval t1, t2; + enum finish_how how = FINISH_COMMIT; + uint64_t flags = XFS_EXCHANGE_RANGE_TO_EOF; + long long offset, len; + int condensed = 0, quiet_flag = 0; + int c; + + while ((c = getopt(argc, argv, "Chknq")) != -1) { + switch (c) { + case 'C': + condensed = 1; + break; + case 'h': + flags |= XFS_EXCHANGE_RANGE_FILE1_WRITTEN; + break; + case 'k': + flags &= ~XFS_EXCHANGE_RANGE_TO_EOF; + break; + case 'n': + how = FINISH_CHECK; + break; + case 'q': + quiet_flag = 1; + break; + default: + commitupdate_help(); + return 0; + } + } + if (optind != argc) { + commitupdate_help(); + return 0; + } + + gettimeofday(&t1, NULL); + len = finish_update(how, flags, &offset); + if (len < 0) + return 1; + if (quiet_flag) + return 0; + + 
gettimeofday(&t2, NULL); + t2 = tsub(t2, t1); + report_io_times("commitupdate", &t2, offset, len, len, 1, condensed); + return 0; +} + +static struct cmdinfo startupdate_cmd = { + .name = "startupdate", + .cfunc = startupdate_f, + .argmin = 0, + .argmax = -1, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = startupdate_help, +}; + +static struct cmdinfo cancelupdate_cmd = { + .name = "cancelupdate", + .cfunc = cancelupdate_f, + .argmin = 0, + .argmax = 0, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = cancelupdate_help, +}; + +static struct cmdinfo commitupdate_cmd = { + .name = "commitupdate", + .cfunc = commitupdate_f, + .argmin = 0, + .argmax = -1, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = commitupdate_help, +}; + void exchangerange_init(void) { @@ -171,4 +523,16 @@ exchangerange_init(void) exchangerange_cmd.oneline = _("Exchange contents between files."); add_command(&exchangerange_cmd); + + startupdate_cmd.oneline = _("start an atomic update of a file"); + startupdate_cmd.args = _("[-e]"); + + cancelupdate_cmd.oneline = _("cancel an atomic update"); + + commitupdate_cmd.oneline = _("commit a file update atomically"); + commitupdate_cmd.args = _("[-C] [-h] [-k] [-n] [-q]"); + + add_command(&startupdate_cmd); + add_command(&cancelupdate_cmd); + add_command(&commitupdate_cmd); } diff --git a/io/io.h b/io/io.h index 8c5e59100c5cbd..4daedac06419ae 100644 --- a/io/io.h +++ b/io/io.h @@ -31,6 +31,9 @@ #define IO_PATH (1<<10) #define IO_NOFOLLOW (1<<11) +/* undergoing atomic update, do not close */ +#define IO_ATOMICUPDATE (1<<12) + /* * Regular file I/O control */ @@ -74,6 +77,7 @@ extern int openfile(char *, struct xfs_fsop_geom *, int, mode_t, struct fs_path *); extern int addfile(char *, int , struct xfs_fsop_geom *, int, struct fs_path *); +extern int closefile(void); extern void printxattr(uint, int, int, const char *, int, int); extern unsigned int recurse_all; diff --git a/io/open.c b/io/open.c index 15850b5557bc5b..a30dd89a1fd56c 100644 ---
a/io/open.c +++ b/io/open.c @@ -338,14 +338,19 @@ open_f( return 0; } -static int -close_f( - int argc, - char **argv) +int +closefile(void) { size_t length; unsigned int offset; + if (file->flags & IO_ATOMICUPDATE) { + fprintf(stderr, + _("%s: atomic update in progress, cannot close.\n"), + file->name); + exitcode = 1; + return 0; + } if (close(file->fd) < 0) { perror("close"); exitcode = 1; @@ -371,7 +376,19 @@ close_f( free(filetable); file = filetable = NULL; } - filelist_f(); + return 0; +} + +static int +close_f( + int argc, + char **argv) +{ + int ret; + + ret = closefile(); + if (!ret) + filelist_f(); return 0; } diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8 index 49d4057bb069ed..eb2201fca74380 100644 --- a/man/man8/xfs_io.8 +++ b/man/man8/xfs_io.8 @@ -1058,7 +1058,37 @@ .SH FILE I/O COMMANDS nsec is the nanoseconds since the sec. This value needs to be in the range 0-999999999 with UTIME_NOW and UTIME_OMIT being exceptions. Each (sec, nsec) pair constitutes a single timestamp value. - +.TP +.BI "startupdate [ " -e ] +Create a temporary clone of a file in which to stage file updates. +The +.B \-e +option creates an empty staging file. +.TP +.B cancelupdate +Abandon changes from an update staging file. +.TP +.BI "commitupdate [" OPTIONS ] +Commit changes from an update staging file to the real file. +.RS 1.0i +.PD 0 +.TP 0.4i +.B \-C +Print timing information in a condensed format. +.TP 0.4i +.B \-h +Only swap ranges in the update staging file that were actually written. +.TP 0.4i +.B \-k +Do not change file size. +.TP 0.4i +.B \-n +Check parameters without changing anything. +.TP 0.4i +.B \-q +Do not print timing information at all. +.PD +.RE .SH MEMORY MAPPED I/O COMMANDS .TP