From patchwork Thu Oct 31 23:19:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13858468 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C4091E0DE5 for ; Thu, 31 Oct 2024 23:19:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416780; cv=none; b=E5nFDm0v4nUCgEgDRdDLCqmiKMv5+nC3jAQz3oGJKO2/WGN3aYeSbNDPRkYBsi6saYjbrDTzIIp29pMOZFy6U3bcZ5zkZlbPLbEbrhkGjzSRvxEOyRfdyq66fYM1Tk0TTd62kkLKbUJMiUuo5qcmzsIAfhh+biKi8+hHz4h0Vmw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416780; c=relaxed/simple; bh=k63De8AZPDHAyZuP/x4thDkXrhacT+g8Wob6PymeRX4=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=kUyVnA/puQmdpTFmPB8AcvmcEhIQn9pr2j49NuWlPi8g01pcxvN9yp08g2iB5VvF4w5HIWvAxX1SEtdAdeOgui2HYHOKDiMVB3mMB/+5xCNSqdqQ0DxpdArLrUeA1mZ6xWukpySgPdJzGoFWC76kQgZJ5XZvywae7q5U5pqqWEo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IUh/i1vk; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IUh/i1vk" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3925C4CEC3; Thu, 31 Oct 2024 23:19:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416778; bh=k63De8AZPDHAyZuP/x4thDkXrhacT+g8Wob6PymeRX4=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=IUh/i1vkT+HuD5nP256oC8cbEChSio1ANnh09Pt7KaLzddms/8Up+yln2w3XXvSZ2 
kOTDMDpQEnqmSyCxAY0MinKALOjKmPmDxREUw9o0zsWwnKkZ2Vfysx8aHXx4Z/aJ4Q Se/jnpYCWusPuX8SORLgPJCGodMDEM8Q2ROzL7xHsmA6jPDCd9IQctcW6ofrlkw7sU 2PvBRBO4Sbi3r4CefoETDeH7yjfN06pQfDBzshBnW3jrQ3qnJzJzYyAf4igm0myJ5t j3WIXf6GEuidOFoVOwCe929jDaBFfpmtPweWk7pGrZPYbv7cTHNk3y+PzsZhHyNSPX 79K0lGePv20Bw== Date: Thu, 31 Oct 2024 16:19:38 -0700 Subject: [PATCH 1/7] man: document file range commit ioctls From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566921.963918.12171244688114984215.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Document the two new ioctls to support committing arbitrary dirty data ranges of two files. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- man/man2/ioctl_xfs_commit_range.2 | 296 +++++++++++++++++++++++++++++++++++++ man/man2/ioctl_xfs_fsgeometry.2 | 2 man/man2/ioctl_xfs_start_commit.2 | 1 3 files changed, 298 insertions(+), 1 deletion(-) create mode 100644 man/man2/ioctl_xfs_commit_range.2 create mode 100644 man/man2/ioctl_xfs_start_commit.2 diff --git a/man/man2/ioctl_xfs_commit_range.2 b/man/man2/ioctl_xfs_commit_range.2 new file mode 100644 index 00000000000000..3244e52c3e0946 --- /dev/null +++ b/man/man2/ioctl_xfs_commit_range.2 @@ -0,0 +1,296 @@ +.\" Copyright (c) 2020-2024 Oracle. All rights reserved. +.\" +.\" %%%LICENSE_START(GPLv2+_DOC_FULL) +.\" This is free documentation; you can redistribute it and/or +.\" modify it under the terms of the GNU General Public License as +.\" published by the Free Software Foundation; either version 2 of +.\" the License, or (at your option) any later version. 
+.\" +.\" The GNU General Public License's references to "object code" +.\" and "executables" are to be interpreted as the output of any +.\" document formatting or typesetting system, including +.\" intermediate and printed output. +.\" +.\" This manual is distributed in the hope that it will be useful, +.\" but WITHOUT ANY WARRANTY; without even the implied warranty of +.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +.\" GNU General Public License for more details. +.\" +.\" You should have received a copy of the GNU General Public +.\" License along with this manual; if not, see +.\" . +.\" %%%LICENSE_END +.TH IOCTL-XFS-COMMIT-RANGE 2 2024-02-18 "XFS" +.SH NAME +ioctl_xfs_start_commit \- prepare to exchange the contents of two files +ioctl_xfs_commit_range \- conditionally exchange the contents of parts of two files +.SH SYNOPSIS +.br +.B #include +.br +.B #include +.PP +.BI "int ioctl(int " file2_fd ", XFS_IOC_START_COMMIT, struct xfs_commit_range *" arg ); +.PP +.BI "int ioctl(int " file2_fd ", XFS_IOC_COMMIT_RANGE, struct xfs_commit_range *" arg ); +.SH DESCRIPTION +Given a range of bytes in a first file +.B file1_fd +and a second range of bytes in a second file +.BR file2_fd , +this +.BR ioctl (2) +exchanges the contents of the two ranges if +.B file2_fd +passes certain freshness criteria. + +Before exchanging the contents, the program must call the +.B XFS_IOC_START_COMMIT +ioctl to sample freshness data for +.BR file2_fd . +If the sampled metadata does not match the file metadata at commit time, +.B XFS_IOC_COMMIT_RANGE +will return +.BR EBUSY . +.PP +Exchanges are atomic with regards to concurrent file operations. +Implementations must guarantee that readers see either the old contents or the +new contents in their entirety, even if the system fails. 
+.PP +The system call parameters are conveyed in structures of the following form: +.PP +.in +4n +.EX +struct xfs_commit_range { + __s32 file1_fd; + __u32 pad; + __u64 file1_offset; + __u64 file2_offset; + __u64 length; + __u64 flags; + __u64 file2_freshness[5]; +}; +.EE +.in +.PP +The field +.I pad +must be zero. +.PP +The fields +.IR file1_fd ", " file1_offset ", and " length +define the first range of bytes to be exchanged. +.PP +The fields +.IR file2_fd ", " file2_offset ", and " length +define the second range of bytes to be exchanged. +.PP +The field +.I file2_freshness +is an opaque field whose contents are determined by the kernel. +These file attributes are used to confirm that +.B file2_fd +has not been changed by another thread since the current thread began staging its +own update. +.PP +Both files must be from the same filesystem mount. +If the two file descriptors represent the same file, the byte ranges must not +overlap. +Most disk-based filesystems require that the starts of both ranges be +aligned to the file block size. +If this is the case, the ends of the ranges must also be so aligned unless the +.B XFS_EXCHANGE_RANGE_TO_EOF +flag is set. + +.PP +The field +.I flags +controls the behavior of the exchange operation. +.RS 0.4i +.TP +.B XFS_EXCHANGE_RANGE_TO_EOF +Ignore the +.I length +parameter. +All bytes in +.I file1_fd +from +.I file1_offset +to EOF are moved to +.IR file2_fd , +and file2's size is set to +.RI ( file2_offset "+(" file1_length - file1_offset )). +Meanwhile, all bytes in file2 from +.I file2_offset +to EOF are moved to file1 and file1's size is set to +.RI ( file1_offset "+(" file2_length - file2_offset )). +.TP +.B XFS_EXCHANGE_RANGE_DSYNC +Ensure that all modified in-core data in both file ranges and all metadata +updates pertaining to the exchange operation are flushed to persistent storage +before the call returns. +Opening either file descriptor with +.BR O_SYNC " or " O_DSYNC +will have the same effect. 
+.TP +.B XFS_EXCHANGE_RANGE_FILE1_WRITTEN +Only exchange sub-ranges of +.I file1_fd +that are known to contain data written by application software. +Each sub-range may be expanded (both upwards and downwards) to align with the +file allocation unit. +For files on the data device, this is one filesystem block. +For files on the realtime device, this is the realtime extent size. +This facility can be used to implement fast atomic scatter-gather writes of any +complexity for software-defined storage targets if all writes are aligned to +the file allocation unit. +.TP +.B XFS_EXCHANGE_RANGE_DRY_RUN +Check the parameters and the feasibility of the operation, but do not change +anything. +.RE +.PP +.SH RETURN VALUE +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.PP +.SH ERRORS +Error codes can be one of, but are not limited to, the following: +.TP +.B EBADF +.IR file1_fd +is not open for reading and writing or is open for append-only writes; or +.IR file2_fd +is not open for reading and writing or is open for append-only writes. +.TP +.B EBUSY +The file2 inode number and timestamps supplied do not match +.IR file2_fd . +.TP +.B EINVAL +The parameters are not correct for these files. +This error can also appear if either file descriptor represents +a device, FIFO, or socket. +Disk filesystems generally require the offset and length arguments +to be aligned to the fundamental block sizes of both files. +.TP +.B EIO +An I/O error occurred. +.TP +.B EISDIR +One of the files is a directory. +.TP +.B ENOMEM +The kernel was unable to allocate sufficient memory to perform the +operation. +.TP +.B ENOSPC +There is not enough free space in the filesystem to exchange the contents safely. +.TP +.B EOPNOTSUPP +The filesystem does not support exchanging bytes between the two +files. +.TP +.B EPERM +.IR file1_fd " or " file2_fd +are immutable. +.TP +.B ETXTBSY +One of the files is a swap file. +.TP +.B EUCLEAN +The filesystem is corrupt. 
+.TP +.B EXDEV +.IR file1_fd " and " file2_fd +are not on the same mounted filesystem. +.SH CONFORMING TO +This API is XFS-specific. +.SH USE CASES +.PP +Several use cases are imagined for this system call. +Coordination between multiple threads is performed by the kernel. +.PP +The first is a filesystem defragmenter, which copies the contents of a file +into another file and wishes to exchange the space mappings of the two files, +provided that the original file has not changed. +.PP +An example program might look like this: +.PP +.in +4n +.EX +int fd = open("/some/file", O_RDWR); +int temp_fd = open("/some", O_TMPFILE | O_RDWR); +struct stat sb; +struct xfs_commit_range args = { + .flags = XFS_EXCHANGE_RANGE_TO_EOF, +}; + +/* gather file2's freshness information */ +ioctl(fd, XFS_IOC_START_COMMIT, &args); +fstat(fd, &sb); + +/* make a fresh copy of the file with terrible alignment to avoid reflink */ +clone_file_range(fd, NULL, temp_fd, NULL, 1, 0); +clone_file_range(fd, NULL, temp_fd, NULL, sb.st_size - 1, 0); + +/* commit the entire update */ +args.file1_fd = temp_fd; +ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args); +if (ret && errno == EBUSY) + printf("file changed while defrag was underway\\n"); +.EE +.in +.PP +The second is a data storage program that wants to commit non-contiguous updates +to a file atomically. +This program cannot coordinate updates to the file and therefore relies on the +kernel to reject the COMMIT_RANGE command if the file has been updated by +someone else. +This can be done by creating a temporary file, calling +.BR FICLONE (2) +to share the contents, and staging the updates into the temporary file. +The +.B FULL_FILES +flag is recommended for this purpose. +The temporary file can be deleted or punched out afterwards. 
+.PP +An example program might look like this: +.PP +.in +4n +.EX +int fd = open("/some/file", O_RDWR); +int temp_fd = open("/some", O_TMPFILE | O_RDWR); +struct xfs_commit_range args = { + .flags = XFS_EXCHANGE_RANGE_TO_EOF, +}; + +/* gather file2's freshness information */ +ioctl(fd, XFS_IOC_START_COMMIT, &args); + +ioctl(temp_fd, FICLONE, fd); + +/* append 1MB of records */ +lseek(temp_fd, 0, SEEK_END); +write(temp_fd, data1, 1000000); + +/* update record index */ +pwrite(temp_fd, data1, 600, 98765); +pwrite(temp_fd, data2, 320, 54321); +pwrite(temp_fd, data2, 15, 0); + +/* commit the entire update */ +args.file1_fd = temp_fd; +ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args); +if (ret && errno == EBUSY) + printf("file changed before commit; will roll back\\n"); +.EE +.in +.B +.SH NOTES +.PP +Some filesystems may limit the amount of data or the number of extents that can +be exchanged in a single call. +.SH SEE ALSO +.BR ioctl (2) diff --git a/man/man2/ioctl_xfs_fsgeometry.2 b/man/man2/ioctl_xfs_fsgeometry.2 index 54fd89390883c1..db7698fa922b87 100644 --- a/man/man2/ioctl_xfs_fsgeometry.2 +++ b/man/man2/ioctl_xfs_fsgeometry.2 @@ -212,7 +212,7 @@ .SH FILESYSTEM FEATURE FLAGS .B XFS_FSOP_GEOM_FLAGS_REFLINK Filesystem supports sharing blocks between files. .TP -.B XFS_FSOP_GEOM_FLAGS_EXCHRANGE +.B XFS_FSOP_GEOM_FLAGS_EXCHANGE_RANGE Filesystem can exchange file contents atomically via XFS_IOC_EXCHANGE_RANGE. .RE .SH XFS METADATA HEALTH REPORTING diff --git a/man/man2/ioctl_xfs_start_commit.2 b/man/man2/ioctl_xfs_start_commit.2 new file mode 100644 index 00000000000000..f11410120f698d --- /dev/null +++ b/man/man2/ioctl_xfs_start_commit.2 @@ -0,0 +1 @@ +.so man2/ioctl_xfs_commit_range.2 From patchwork Thu Oct 31 23:19:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13858469 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B01961CF7B7 for ; Thu, 31 Oct 2024 23:19:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416794; cv=none; b=qPDHGHxiAXKaeE4k4B0Wij+8jrRofsYj98Wf9nwCZ/m5O5RWmw8ze4Ndps1hQrUjVZj30GZ77dARXmX6x10k8vJyHEL3JmsM8zh9dcAN2gAAxSg3yWXTRBGOaic9L4ipPlqm1toAGbjlh2aqcmrDZFLvxOMB5I/X0LKy/561D68= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416794; c=relaxed/simple; bh=pjs4h4+z2Gm9dNPIpWcTAKSUiOYTTGTNvPqiZQZe9DI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=t0W+p8uztCQoo+qSaZsf8IPMRFR11s6yP1U0DzU7HqLHYElxXi2u/U9Vddk2iMYnhclwrLHBhi+o/6acpZBQA4rimJ8rwUi0meE8razTEA8GgioouL9fmd1uoJeGyjJyyfmKYdur4OTyUQXQ6Sgjks09E/IHPcyUBvLV7LlcRA8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pLyZLL4L; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pLyZLL4L" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7F6A8C4CEC3; Thu, 31 Oct 2024 23:19:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416794; bh=pjs4h4+z2Gm9dNPIpWcTAKSUiOYTTGTNvPqiZQZe9DI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=pLyZLL4Luf8bvJxDIb7nvsiM0Er2ywjlltszz4NcoP5lIu5hT7aLFLCzdKa21ZUiG RuoAVXDBfyLxbOIOlC96WRxescAFPnGnVCYGL3oyKtQ0VteejMO/GA1z1bz8/6gMDX mDSjBZaxuFrBCeuY2vy7ZhQYFiZ/hhrRpzA9TBZePW/RNqBzhxYYSllL0iPdbDv38W 
5NkQ+lwavwGviuozSZq7LOEIFAXCWPIHMj/8REU9zY0nTFmqiipmCaCXiEJ5iQv6d2 LF5sboB+CGLlxaZZW0rqegjvKpQ8BmU8y2YokbEwCEUddRlXkMLh7rBubwEKsjaisT DirTTQmlofJqw== Date: Thu, 31 Oct 2024 16:19:54 -0700 Subject: [PATCH 2/7] libfrog: add support for commit range ioctl family From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566936.963918.13182092769896210494.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add some library code to support the new file range commit ioctls. This will be used to test the atomic file commit functionality in fstests. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- libfrog/file_exchange.c | 194 +++++++++++++++++++++++++++++++++++++++++++++++ libfrog/file_exchange.h | 10 ++ 2 files changed, 204 insertions(+) diff --git a/libfrog/file_exchange.c b/libfrog/file_exchange.c index 29fdc17e598ce4..e6c3f486b0ffdc 100644 --- a/libfrog/file_exchange.c +++ b/libfrog/file_exchange.c @@ -50,3 +50,197 @@ xfrog_exchangerange( return 0; } + +/* + * Prepare for committing a file contents exchange if nobody changes file2 in + * the meantime by asking the kernel to sample file2's change attributes. + * + * Returns 0 for success or a negative errno. + */ +int +xfrog_commitrange_prep( + struct xfs_commit_range *xcr, + int file2_fd, + off_t file2_offset, + int file1_fd, + off_t file1_offset, + uint64_t length) +{ + int ret; + + memset(xcr, 0, sizeof(*xcr)); + + xcr->file1_fd = file1_fd; + xcr->file1_offset = file1_offset; + xcr->length = length; + xcr->file2_offset = file2_offset; + + ret = ioctl(file2_fd, XFS_IOC_START_COMMIT, xcr); + if (ret) + return -errno; + + return 0; +} + +/* + * Execute an exchange-commit operation. 
Returns 0 for success or a negative + * errno. + */ +int +xfrog_commitrange( + int file2_fd, + struct xfs_commit_range *xcr, + uint64_t flags) +{ + int ret; + + xcr->flags = flags; + + ret = ioctl(file2_fd, XFS_IOC_COMMIT_RANGE, xcr); + if (ret) + return -errno; + + return 0; +} + +/* Opaque freshness blob for XFS_IOC_COMMIT_RANGE */ +struct xfs_commit_range_fresh { + xfs_fsid_t fsid; /* m_fixedfsid */ + __u64 file2_ino; /* inode number */ + __s64 file2_mtime; /* modification time */ + __s64 file2_ctime; /* change time */ + __s32 file2_mtime_nsec; /* mod time, nsec */ + __s32 file2_ctime_nsec; /* change time, nsec */ + __u32 file2_gen; /* inode generation */ + __u32 magic; /* zero */ +}; + +/* magic flag to force use of swapext */ +#define XCR_SWAPEXT_MAGIC 0x43524150 /* CRAP */ + +/* + * Import file2 freshness information for a XFS_IOC_SWAPEXT call from bulkstat + * information. We can skip the fsid and file2_gen members because old swapext + * did not verify those things. + */ +static void +xfrog_swapext_prep( + struct xfs_commit_range *xdf, + const struct xfs_bulkstat *file2_stat) +{ + struct xfs_commit_range_fresh *f; + + f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness; + f->file2_ino = file2_stat->bs_ino; + f->file2_mtime = file2_stat->bs_mtime; + f->file2_mtime_nsec = file2_stat->bs_mtime_nsec; + f->file2_ctime = file2_stat->bs_ctime; + f->file2_ctime_nsec = file2_stat->bs_ctime_nsec; + f->magic = XCR_SWAPEXT_MAGIC; +} + +/* Invoke the old swapext ioctl. 
*/ +static int +xfrog_ioc_swapext( + int file2_fd, + struct xfs_commit_range *xdf) +{ + struct xfs_swapext args = { + .sx_version = XFS_SX_VERSION, + .sx_fdtarget = file2_fd, + .sx_length = xdf->length, + .sx_fdtmp = xdf->file1_fd, + }; + struct xfs_commit_range_fresh *f; + int ret; + + BUILD_BUG_ON(sizeof(struct xfs_commit_range_fresh) != + sizeof(xdf->file2_freshness)); + + f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness; + args.sx_stat.bs_ino = f->file2_ino; + args.sx_stat.bs_mtime.tv_sec = f->file2_mtime; + args.sx_stat.bs_mtime.tv_nsec = f->file2_mtime_nsec; + args.sx_stat.bs_ctime.tv_sec = f->file2_ctime; + args.sx_stat.bs_ctime.tv_nsec = f->file2_ctime_nsec; + + ret = ioctl(file2_fd, XFS_IOC_SWAPEXT, &args); + if (ret) { + /* + * Old swapext returns EFAULT if file1 or file2 length doesn't + * match. The new COMMIT_RANGE doesn't check the file + * length, but the freshness checks will trip and return EBUSY. + * If we see EFAULT from the old ioctl, turn that into EBUSY. + */ + if (errno == EFAULT) + return -EBUSY; + return -errno; + } + + return 0; +} + +/* + * Prepare for defragmenting a file by committing a file contents exchange if + * nobody changes file2 in the meantime by asking the kernel to sample + * file2's change attributes. + * + * If the kernel supports only the old XFS_IOC_SWAPEXT ioctl, the @file2_stat + * information will be used to sample the change attributes. + * + * Returns 0 or a negative errno. + */ +int +xfrog_defragrange_prep( + struct xfs_commit_range *xdf, + int file2_fd, + const struct xfs_bulkstat *file2_stat, + int file1_fd) +{ + int ret; + + memset(xdf, 0, sizeof(*xdf)); + + xdf->file1_fd = file1_fd; + xdf->length = file2_stat->bs_size; + + ret = ioctl(file2_fd, XFS_IOC_START_COMMIT, xdf); + if (ret && (errno == EOPNOTSUPP || errno == ENOTTY)) { + xfrog_swapext_prep(xdf, file2_stat); + return 0; + } + if (ret) + return -errno; + + return 0; +} + +/* Execute an exchange operation. 
Returns 0 for success or a negative errno. */ +int +xfrog_defragrange( + int file2_fd, + struct xfs_commit_range *xdf) +{ + struct xfs_commit_range_fresh *f; + int ret; + + f = (struct xfs_commit_range_fresh *)&xdf->file2_freshness; + if (f->magic == XCR_SWAPEXT_MAGIC) + goto legacy_fallback; + + ret = ioctl(file2_fd, XFS_IOC_COMMIT_RANGE, xdf); + if (ret) { + if (errno == EOPNOTSUPP || errno == ENOTTY) + goto legacy_fallback; + return -errno; + } + + return 0; + +legacy_fallback: + ret = xfrog_ioc_swapext(file2_fd, xdf); + if (ret) + return ret; + + return 0; +} diff --git a/libfrog/file_exchange.h b/libfrog/file_exchange.h index b6f6f9f698a8c9..98d3b867c317ee 100644 --- a/libfrog/file_exchange.h +++ b/libfrog/file_exchange.h @@ -12,4 +12,14 @@ void xfrog_exchangerange_prep(struct xfs_exchange_range *fxr, int xfrog_exchangerange(int file2_fd, struct xfs_exchange_range *fxr, uint64_t flags); +int xfrog_commitrange_prep(struct xfs_commit_range *xcr, int file2_fd, + off_t file2_offset, int file1_fd, off_t file1_offset, + uint64_t length); +int xfrog_commitrange(int file2_fd, struct xfs_commit_range *xcr, + uint64_t flags); + +int xfrog_defragrange_prep(struct xfs_commit_range *xdf, int file2_fd, + const struct xfs_bulkstat *file2_stat, int file1_fd); +int xfrog_defragrange(int file2_fd, struct xfs_commit_range *xdf); + #endif /* __LIBFROG_FILE_EXCHANGE_H__ */ From patchwork Thu Oct 31 23:20:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13858470 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4537C1CF29C for ; Thu, 31 Oct 2024 23:20:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416810; cv=none; b=b+wUA+LNMVH3b9x6sz7mzIZmxo/ZcDB4nuamrI8o28wXkWjvTZp82JKX9Mwgu+xkXKxtqWKEhrBUTNDRWHd2hVWRRYG2oTRqrMB4Kd8Qiy00bJ1kBzZR9OFbkWIOZ4sdk/jlHYukFhbimTjKgqqysl4YUQ1DNtLIUnumlhUx8Mc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416810; c=relaxed/simple; bh=Tafz5opgBwt8wSSVY3T1qgX0rq3DMyJ2ZMoMUZz3cZ8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=c1n3s1oHPBLwOU5ntclBkgOR6aREn/WJZsP41JkQ6nIw1aFvCVG8E+PVCibxcogR5stL+A7Ea7XN5H0b/DpzKUj7rghoM02DfsMOkyTfYiQp0WfEu7CbrpVqDwNdIZNdiYOsEyGJva9gS87BMVZiqlXi+Nc6JymZ6LfRWeKw0E8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=r0a8GUJ4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="r0a8GUJ4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 22950C4CEC3; Thu, 31 Oct 2024 23:20:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416810; bh=Tafz5opgBwt8wSSVY3T1qgX0rq3DMyJ2ZMoMUZz3cZ8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=r0a8GUJ4V2QMqAtTp8z9dHLqzIe45sYupQ0l2LAdZeuKBo5IRpOMxvXL+7FbhF+d7 EupElYeWGRFvUstn8bIYK7hMQvZdVyTvltaRw+wemcE6d1IaPFMbPtxnLZTOwAj2kG pCaECtb+t9w4WU7SNsRuoRkick58PTi1iT7xV7DTHXhgAi4hwpPRabHkT49DpP7nV9 
Zurfpecv/TApGy6ntz8tFLhjCRPZv7yvSFsb+R/jLeQCu2aRoeDU9IGl8Q5q+y4rhm Rjri0TA7FvS4thM7tQroTmSmf8NLGd6ZdA1q83IHiiXOYcpS5juCPnFEQ14yRUPXeW c0djPmVK3HpJw== Date: Thu, 31 Oct 2024 16:20:09 -0700 Subject: [PATCH 3/7] libxfs: remove unused xfs_inode fields From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566952.963918.16788605025792856310.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Remove these unused fields; on the author's system this reduces the struct size from 560 bytes to 448. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- include/xfs_inode.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/include/xfs_inode.h b/include/xfs_inode.h index 170cc5288d3645..f250102ff19d65 100644 --- a/include/xfs_inode.h +++ b/include/xfs_inode.h @@ -215,7 +215,6 @@ typedef struct xfs_inode { struct xfs_mount *i_mount; /* fs mount struct ptr */ xfs_ino_t i_ino; /* inode number (agno/agino) */ struct xfs_imap i_imap; /* location for xfs_imap() */ - struct xfs_buftarg i_dev; /* dev for this inode */ struct xfs_ifork *i_cowfp; /* copy on write extents */ struct xfs_ifork i_df; /* data fork */ struct xfs_ifork i_af; /* attribute fork */ @@ -239,9 +238,6 @@ typedef struct xfs_inode { xfs_agino_t i_next_unlinked; xfs_agino_t i_prev_unlinked; - xfs_extnum_t i_cnextents; /* # of extents in cow fork */ - unsigned int i_cformat; /* format of cow fork */ - xfs_fsize_t i_size; /* in-memory size */ struct inode i_vnode; } xfs_inode_t; From patchwork Thu Oct 31 23:20:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13858471 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 471BD1CC8B7 for ; Thu, 31 Oct 2024 23:20:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416828; cv=none; b=k5l0y84xN3Sfxk8QPrZWJxdSCMdW/Dvko85eA4UFhmvYTCubMdirP9I0/Q36U3RE9FSKR9gDVupHcQgH46wHz7soiqhtF1VKqIYO/g5m1jH9lfUC4wMCOvYNFOCcwCZ09mYDYuiaBUU2NbNc0Q4+p8JstyoobtbFbnmCpfxuBzU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416828; c=relaxed/simple; bh=UWu5lIBkkewIAqf8qTbdU8tp2O/NQsZAjTIN6focLbI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=syOymFDyAgW2eDN89LMvgSjTX1wYWgawjf71pVgiXIgeaZw7SdnwwPEhG29Rg6Szf3qJlMu67w8Y2jLnFyJ8v+F4CM1TrdX9vspDXT/CikAx61u7WODOqip3oJZkfLgFh3oL6rlv6+5gQOdN1qhD7LI9GhOpkvvNz9wUWDsAZfA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jKfk54kC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jKfk54kC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B4277C4CEC3; Thu, 31 Oct 2024 23:20:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416825; bh=UWu5lIBkkewIAqf8qTbdU8tp2O/NQsZAjTIN6focLbI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=jKfk54kCOb/av9DkzUPmPHEr8n+vqB3G3CED65Z9NQMphTHVPGBm8Fnugd+7DxMWc 79/INx4QG1cxIrCJwwLIdGuEHjYMNTBs9dSFZb0gMG9Ox7JmUdxsfbstnrS2q2+tVz FR8Nft1Ci9upf/Z7oO8gHdhvi6LV6RmztHeoJQMkItcvlG43wQsnxuGQqwuuXsoIb0 
aXZVJIVK2PafWkQNRbSmRzdhEQCJlDc53pfCbJJwieT5S2bgUHuvOyIkTPIc9EcN4a dDRcGlQRDVoGZIykIVBOV8kqwmygFBO6ezcdAddSHDLbk8I/o2kedEK2DiT7ifrzL/ TYSstXR48yWaA== Date: Thu, 31 Oct 2024 16:20:25 -0700 Subject: [PATCH 4/7] libxfs: validate inumber in xfs_iget From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566967.963918.4965622256532633715.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Actually use the inumber validator to check the argument passed in here, just like we now do in the kernel. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- libxfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libxfs/inode.c b/libxfs/inode.c index 2062ecf54486cf..9230ad24a5cb6c 100644 --- a/libxfs/inode.c +++ b/libxfs/inode.c @@ -143,7 +143,7 @@ libxfs_iget( int error = 0; /* reject inode numbers outside existing AGs */ - if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount) + if (!xfs_verify_ino(mp, ino)) return -EINVAL; ip = kmem_cache_zalloc(xfs_inode_cache, 0); From patchwork Thu Oct 31 23:20:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13858472 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E47619CC1D for ; Thu, 31 Oct 2024 23:20:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416841; cv=none; b=q7GxtMT99gZp9eNoIhzCUJp/SkpB8q/vN0fHO/8jBG4AiS6l1u66mdR+TQ2WIicJ7T7S4eOrs2kfftlvaXdXeYxw1U/2faoNxpti0fZeDWlN8M6kQOi6ZYN+0ShL1We7HLAAzpStRf8TqcDD3Khs1R4kPLkcdfOq4KZwjs6tVvM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416841; c=relaxed/simple; bh=QMF/EGUSuYbcNiL36nO5jqGaxJRoii4RIkG5cbN507g=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tFdWRhF4wWr6eEkhG+TFc/Hg4BPonS5J9Ny2IWxjbGagvLkdlbWBPryAweIjb5k/RsbBeHkxyv8j+sIxEdQCv+cavgrNLWoJVzVhpmx2KUgopqFTZYVkMxYJi+LumqfQkds+9wTCwfMwLjujMtuR4lEOcO/cD0bYVG6a9MTqcq4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=J0LlJs8H; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="J0LlJs8H" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 569DDC4CEC3; Thu, 31 Oct 2024 23:20:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416841; bh=QMF/EGUSuYbcNiL36nO5jqGaxJRoii4RIkG5cbN507g=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=J0LlJs8HAdJcX9yTTwJEhppnrRIJEnsVywO9kepUnI38+R0V3ajxROsS6/9iwoo51 ja719iZu5AZxHBGetg9KaBoWT/Abu9T06EzKgByJVfFqGYwSHvEHBx2VmHrj5QLubt Qf+dhvf6zPYZLBx4vLdcJueQAUjegMJfwWIlbMOW2JdWUN1B2+U6pJf3jlpCI9gMhK 
uGOkvQpLJAGpQYHnXK+aJWanzR1htrI574IYXHZPgolap8D+XEGJcuS9G/1LLbVavs mc+0XztGk2UZESuwc9quk3uBbeg2UWSsQFBR+u3AZxGfET3KgVigSpFxeOMMYZuMbW 2eXYUc++XcEvg== Date: Thu, 31 Oct 2024 16:20:40 -0700 Subject: [PATCH 5/7] xfs_fsr: port to new file exchange library function From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566982.963918.11438785130491146100.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Port fsr to use the new libfrog library functions to handle exchanging mappings between the target and donor files. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fsr/xfs_fsr.c | 74 ++++++++++++++++++++++++++------------------------------- 1 file changed, 34 insertions(+), 40 deletions(-) diff --git a/fsr/xfs_fsr.c b/fsr/xfs_fsr.c index 22e134adfd73ab..8845ff172fcb2e 100644 --- a/fsr/xfs_fsr.c +++ b/fsr/xfs_fsr.c @@ -13,6 +13,7 @@ #include "libfrog/paths.h" #include "libfrog/fsgeom.h" #include "libfrog/bulkstat.h" +#include "libfrog/file_exchange.h" #include #include @@ -122,12 +123,6 @@ open_handle( return 0; } -static int -xfs_swapext(int fd, xfs_swapext_t *sx) -{ - return ioctl(fd, XFS_IOC_SWAPEXT, sx); -} - static int xfs_fscounts(int fd, xfs_fsop_counts_t *counts) { @@ -1189,14 +1184,13 @@ packfile( struct xfs_bulkstat *statp, struct fsxattr *fsxp) { + struct xfs_commit_range xdf; int tfd = -1; - int srval; int retval = -1; /* Failure is the default */ int nextents, extent, cur_nextents, new_nextents; unsigned blksz_dio; unsigned dio_min; struct dioattr dio; - static xfs_swapext_t sx; struct xfs_flock64 space; off_t cnt, pos; void *fbuf = NULL; @@ -1239,6 +1233,16 @@ packfile( goto out; } + /* + * Snapshot file_fd before we 
start copying data but after tweaking + * forkoff. + */ + error = xfrog_defragrange_prep(&xdf, file_fd->fd, statp, tfd); + if (error) { + fsrprintf(_("failed to prep for defrag: %s\n"), strerror(error)); + goto out; + } + /* Setup extended inode flags, project identifier, etc */ if (fsxp->fsx_xflags || fsxp->fsx_projid) { if (ioctl(tfd, FS_IOC_FSSETXATTR, fsxp) < 0) { @@ -1446,19 +1450,6 @@ packfile( goto out; } - error = -xfrog_bulkstat_v5_to_v1(file_fd, &sx.sx_stat, statp); - if (error) { - fsrprintf(_("bstat conversion error on %s: %s\n"), - fname, strerror(error)); - goto out; - } - - sx.sx_version = XFS_SX_VERSION; - sx.sx_fdtarget = file_fd->fd; - sx.sx_fdtmp = tfd; - sx.sx_offset = 0; - sx.sx_length = statp->bs_size; - /* switch to the owner's id, to keep quota in line */ if (fchown(tfd, statp->bs_uid, statp->bs_gid) < 0) { if (vflag) @@ -1468,25 +1459,28 @@ packfile( } /* Swap the extents */ - srval = xfs_swapext(file_fd->fd, &sx); - if (srval < 0) { - if (errno == ENOTSUP) { - if (vflag || dflag) - fsrprintf(_("%s: file type not supported\n"), fname); - } else if (errno == EFAULT) { - /* The file has changed since we started the copy */ - if (vflag || dflag) - fsrprintf(_("%s: file modified defrag aborted\n"), - fname); - } else if (errno == EBUSY) { - /* Timestamp has changed or mmap'ed file */ - if (vflag || dflag) - fsrprintf(_("%s: file busy\n"), fname); - } else { - fsrprintf(_("XFS_IOC_SWAPEXT failed: %s: %s\n"), - fname, strerror(errno)); - } - goto out; + error = xfrog_defragrange(file_fd->fd, &xdf); + switch (error) { + case 0: + break; + case ENOTSUP: + if (vflag || dflag) + fsrprintf(_("%s: file type not supported\n"), fname); + break; + case EFAULT: + /* The file has changed since we started the copy */ + if (vflag || dflag) + fsrprintf(_("%s: file modified defrag aborted\n"), + fname); + break; + case EBUSY: + /* Timestamp has changed or mmap'ed file */ + if (vflag || dflag) + fsrprintf(_("%s: file busy\n"), fname); + break; + default: + 
fsrprintf(_("XFS_IOC_SWAPEXT failed: %s: %s\n"), + fname, strerror(error)); } /* Report progress */ From patchwork Thu Oct 31 23:20:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13858473 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 662CD19CC1D for ; Thu, 31 Oct 2024 23:20:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416857; cv=none; b=uiCZTRHbqoYpXkWFiqOrASDGLkmPuRYx2Mw8EovdMIeMRNC1Sb4Vwj+e0tmkfLen2B6FBiDgbrRRN/PBUvyeLljpyPVomR0mTqhKC2n0CrDrZwaAxt3fcmRv+iKgMNfx4fNkA/bDzM17zljm9qR2ddru6eoRG9GJKAM5BZ3uKAU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416857; c=relaxed/simple; bh=g8gDjylM/sRBAGbkp4JiaqYd51lX+Yyix5Vmjaj4Guo=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=rKLImF8N3QVHQ0VNFtPzp5rre08/5ozPIm2JqkL7BNfGxzHZ+9CaB/WSziSXcy52yQ5J8a0aoRYCCnvanjT3vfrpbxVa/xMsiiyR1hDVQaqHBJJwrOXfVJGEbYzny+eo/+QCmbVXfO7w0zVm7z291JlqnVAja8nqk1Nzyk0oHyM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dfmj4PbN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dfmj4PbN" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F3194C4CED1; Thu, 31 Oct 2024 23:20:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416857; bh=g8gDjylM/sRBAGbkp4JiaqYd51lX+Yyix5Vmjaj4Guo=; 
h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=dfmj4PbNCrgLqgPz6ztZzxu78vsVMSRP0+aisb7ogKWUQaX3DJGEitiGZgnPEBLWf sJFdtAyy9FoSNkZinOJziyV5LQhFNFRqLYPsrrQrfdh/ICjNUvaaY/FtCHlHHsKC8c ZsKeJAQ2TD+slIhkG9yfNEghI4RqL1qQfG5fIwVRXjLqOAT2LkU384i3EcwjzPFZSI CkXAzFn+Llw5+/oVbCMR48L44zK1+ql4x/6cg8GqY3Z/SUOR343r5STUkncoVnu9Tp t76R0VR/dhWOvOUgP98/bmGnRYugQlMhgi6t7yxb4Rmg0mFsu0NGlsVMN4k/g8pRre F0ScYIsdma4Qw== Date: Thu, 31 Oct 2024 16:20:56 -0700 Subject: [PATCH 6/7] xfs_io: add a commitrange option to the exchangerange command From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041566997.963918.18116189770113184408.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Teach the xfs_io exchangerange command to be able to use the commit range functionality so that we can test it piece by piece. Signed-off-by: Darrick J. 
Wong Reviewed-by: Christoph Hellwig --- io/exchrange.c | 26 ++++++++++++++++++++++---- man/man8/xfs_io.8 | 3 +++ 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/io/exchrange.c b/io/exchrange.c index 016429280e2717..0a3750f1eb2607 100644 --- a/io/exchrange.c +++ b/io/exchrange.c @@ -19,6 +19,7 @@ exchangerange_help(void) "\n" " Exchange file data between the open file descriptor and the supplied filename.\n" " -C -- Print timing information in a condensed format\n" +" -c -- Commit to the exchange only if file2 has not changed.\n" " -d N -- Start exchanging contents at this position in the open file\n" " -f -- Flush changed file data and metadata to disk\n" " -l N -- Exchange this many bytes between the two files instead of to EOF\n" @@ -34,9 +35,9 @@ exchangerange_f( int argc, char **argv) { - struct xfs_exchange_range fxr; struct stat stat; struct timeval t1, t2; + bool use_commit = false; uint64_t flags = XFS_EXCHANGE_RANGE_TO_EOF; int64_t src_offset = 0; int64_t dest_offset = 0; @@ -53,6 +54,9 @@ exchangerange_f( case 'C': condensed = 1; break; + case 'c': + use_commit = true; + break; case 'd': dest_offset = cvtnum(fsblocksize, fssectsize, optarg); if (dest_offset < 0) { @@ -117,8 +121,22 @@ exchangerange_f( if (length < 0) length = stat.st_size; - xfrog_exchangerange_prep(&fxr, dest_offset, fd, src_offset, length); - ret = xfrog_exchangerange(file->fd, &fxr, flags); + if (use_commit) { + struct xfs_commit_range xcr; + + ret = xfrog_commitrange_prep(&xcr, file->fd, dest_offset, fd, + src_offset, length); + if (!ret) { + gettimeofday(&t1, NULL); + ret = xfrog_commitrange(file->fd, &xcr, flags); + } + } else { + struct xfs_exchange_range fxr; + + xfrog_exchangerange_prep(&fxr, dest_offset, fd, src_offset, + length); + ret = xfrog_exchangerange(file->fd, &fxr, flags); + } if (ret) { xfrog_perror(ret, "exchangerange"); exitcode = 1; @@ -149,7 +167,7 @@ static struct cmdinfo exchangerange_cmd = { void exchangerange_init(void) { - exchangerange_cmd.args 
= _("[-Cfntw] [-d dest_offset] [-s src_offset] [-l length] "); + exchangerange_cmd.args = _("[-Ccfntw] [-d dest_offset] [-s src_offset] [-l length] "); exchangerange_cmd.oneline = _("Exchange contents between files."); add_command(&exchangerange_cmd); diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8 index 1e7901393ff4d4..49d4057bb069ed 100644 --- a/man/man8/xfs_io.8 +++ b/man/man8/xfs_io.8 @@ -732,6 +732,9 @@ .SH FILE I/O COMMANDS .B \-C Print timing information in a condensed format. .TP +.B \-c +Exchange contents only if file2 has not changed. +.TP .BI \-d " dest_offset" Swap extents with open file beginning at .IR dest_offset . From patchwork Thu Oct 31 23:21:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13858474 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BE8F119CC1D for ; Thu, 31 Oct 2024 23:21:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416872; cv=none; b=rBkOZ3AYWBd7puIzjIflMiN/uzxUOdHkuQW1QED9E4uBPQQmMCcdUrNOt95fgv1SPd2EBgdEQFmIdwW2+VUOtqub5nKlJLIa3LuKbv8KelsVKeHCC6MVRZtrMxtFhEIlkq237tj3oGeFSiM75jfzae8nX0HLfxFBIUD9LK7g77U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730416872; c=relaxed/simple; bh=VoESM7WrAGAyItSxhKr5Mi9xeGOeYLg4THb1UwAkm7A=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=F8J1PvVST0IzWOy39KmI4loptK7ZP1KuBk0hg6eQwlnjhPzYP4SBWpveXaLKPmE8R7KxMVzfuDXc0YvcEBQgy+v6mhZrf95hvcBO9jzWjWdDQNIhVxcs7BKaYqGBwCNWdTqZwcpqQQhtJcw5usur5JHv0xejtd8IOT2abmJMUYs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uCakTu6v; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uCakTu6v" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 968BCC4CECF; Thu, 31 Oct 2024 23:21:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730416872; bh=VoESM7WrAGAyItSxhKr5Mi9xeGOeYLg4THb1UwAkm7A=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=uCakTu6vXKlW7M51MCMV/jn2BS6qgZbZ4mhH+br/UObAqNT18sTs3OXs7awSN18Sk fbYkML0S91jhrfGGznNejPcsQuK95m22Xx4dVN77DCXOWtPisoGWIXI05lOI7wnCTW 7FkgKO47jJGgTufm7vusupF5hYzCn38lzmOhmfndOGpU+ym0d9cZ6aLZwoHAXKuOJQ dVz4ovW4tZcLvH2S9HzCZdSxvvcs3+yJlmb8kbQE8iWKtv88GAjhRiAzqVfI6gvBS9 EE/kApZuoehXy4sz5/pHofmumn/SWwjPBAXpGCnVWDu2bUlACgp1uvvX19bEklRmjF cVPrXpw4pzcgA== Date: Thu, 31 Oct 2024 16:21:12 -0700 Subject: [PATCH 7/7] xfs_io: add atomic file update commands to exercise file commit range From: "Darrick J. Wong" To: djwong@kernel.org, aalbersh@kernel.org Cc: linux-xfs@vger.kernel.org Message-ID: <173041567012.963918.15313616588220944071.stgit@frogsfrogsfrogs> In-Reply-To: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> References: <173041566899.963918.1566223803606797457.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add three commands to xfs_io so that we can exercise atomic file updates as provided by reflink and the start-commit / commit-range functionality. Signed-off-by: Darrick J. 
Wong Reviewed-by: Christoph Hellwig --- io/exchrange.c | 364 +++++++++++++++++++++++++++++++++++++++++++++++++++++ io/io.h | 4 + io/open.c | 27 +++- man/man8/xfs_io.8 | 32 +++++ 4 files changed, 421 insertions(+), 6 deletions(-) diff --git a/io/exchrange.c b/io/exchrange.c index 0a3750f1eb2607..707d78d8e624fe 100644 --- a/io/exchrange.c +++ b/io/exchrange.c @@ -164,6 +164,358 @@ static struct cmdinfo exchangerange_cmd = { .help = exchangerange_help, }; +/* Atomic file updates commands */ + +struct update_info { + /* File that we're updating. */ + int fd; + + /* ioctl data to commit the changes */ + struct xfs_commit_range xcr; + + /* Name of the file we're updating. */ + char *old_fname; + + /* fd we're using to stage the updates. */ + int temp_fd; +}; + +enum finish_how { + FINISH_ABORT, + FINISH_COMMIT, + FINISH_CHECK +}; + +static struct update_info *updates; +static unsigned int nr_updates; + +static void +startupdate_help(void) +{ + printf(_( +"\n" +" Prepare for an atomic file update, if supported by the filesystem.\n" +" A temporary file will be opened for writing and inserted into the file\n" +" table. The current file will be changed to this temporary file. Neither\n" +" file can be closed for the duration of the update.\n" +"\n" +" -e -- Start with an empty file\n" +"\n")); +} + +static int +startupdate_f( + int argc, + char *argv[]) +{ + struct fsxattr attr; + struct xfs_fsop_geom fsgeom; + struct fs_path fspath; + struct stat stat; + struct update_info *p; + char *fname; + char *path = NULL, *d; + size_t fname_len; + int flags = IO_TMPFILE | IO_ATOMICUPDATE; + int temp_fd = -1; + bool clone_file = true; + int c; + int ret; + + while ((c = getopt(argc, argv, "e")) != -1) { + switch (c) { + case 'e': + clone_file = false; + break; + default: + startupdate_help(); + return 0; + } + } + if (optind != argc) { + startupdate_help(); + return 0; + } + + /* Allocate a new slot. 
*/ + p = realloc(updates, (++nr_updates) * sizeof(*p)); + if (!p) { + perror("startupdate realloc"); + goto fail; + } + updates = p; + + /* Fill out the update information so that we can commit later. */ + p = &updates[nr_updates - 1]; + memset(p, 0, sizeof(*p)); + + ret = fstat(file->fd, &stat); + if (ret) { + perror(file->name); + goto fail; + } + + /* Is the current file realtime? If so, the temp file must match. */ + ret = ioctl(file->fd, FS_IOC_FSGETXATTR, &attr); + if (ret == 0 && attr.fsx_xflags & FS_XFLAG_REALTIME) + flags |= IO_REALTIME; + + /* Compute path to the directory that the current file is in. */ + path = strdup(file->name); + d = strrchr(path, '/'); + if (!d) { + fprintf(stderr, _("%s: cannot compute dirname?"), path); + goto fail; + } + *d = 0; + + /* Open a temporary file to stage the new contents. */ + temp_fd = openfile(path, &fsgeom, flags, 0600, &fspath); + if (temp_fd < 0) { + perror(path); + goto fail; + } + + /* + * Snapshot the original file metadata in anticipation of the later + * file mapping exchange request. + */ + ret = xfrog_commitrange_prep(&p->xcr, file->fd, 0, temp_fd, 0, + stat.st_size); + if (ret) { + perror("update prep"); + goto fail; + } + + /* Clone all the data from the original file into the temporary file. */ + if (clone_file) { + ret = ioctl(temp_fd, XFS_IOC_CLONE, file->fd); + if (ret) { + perror(path); + goto fail; + } + } + + /* Prepare a new path string for the duration of the update. */ +#define FILEUPDATE_STR " (fileupdate)" + fname_len = strlen(file->name) + strlen(FILEUPDATE_STR); + fname = malloc(fname_len + 1); + if (!fname) { + perror("new path"); + goto fail; + } + snprintf(fname, fname_len + 1, "%s%s", file->name, FILEUPDATE_STR); + + /* + * Install the temporary file into the same slot of the file table as + * the original file. Ensure that the original file cannot be closed. 
+ */ + file->flags |= IO_ATOMICUPDATE; + p->old_fname = file->name; + file->name = fname; + p->fd = file->fd; + p->temp_fd = file->fd = temp_fd; + + free(path); + return 0; +fail: + if (temp_fd >= 0) + close(temp_fd); + free(path); + nr_updates--; + exitcode = 1; + return 1; +} + +static long long +finish_update( + enum finish_how how, + uint64_t flags, + long long *offset) +{ + struct update_info *p; + long long committed_bytes = 0; + size_t length; + unsigned int i; + unsigned int upd_offset; + int temp_fd; + int ret; + + /* Find our update descriptor. */ + for (i = 0, p = updates; i < nr_updates; i++, p++) { + if (p->temp_fd == file->fd) + break; + } + + if (i == nr_updates) { + fprintf(stderr, + _("Current file is not the staging file for an atomic update.\n")); + exitcode = 1; + return -1; + } + + /* + * Commit our changes, if desired. If the mapping exchange fails, we + * stop processing immediately so that we can run more xfs_io commands. + */ + switch (how) { + case FINISH_CHECK: + flags |= XFS_EXCHANGE_RANGE_DRY_RUN; + fallthrough; + case FINISH_COMMIT: + ret = xfrog_commitrange(p->fd, &p->xcr, flags); + if (ret) { + xfrog_perror(ret, _("committing update")); + exitcode = 1; + return -1; + } + printf(_("Committed updates to '%s'.\n"), p->old_fname); + *offset = p->xcr.file2_offset; + committed_bytes = p->xcr.length; + break; + case FINISH_ABORT: + printf(_("Cancelled updates to '%s'.\n"), p->old_fname); + break; + } + + /* + * Reset the filetable to point to the original file, and close the + * temporary file. + */ + free(file->name); + file->name = p->old_fname; + file->flags &= ~IO_ATOMICUPDATE; + temp_fd = file->fd; + file->fd = p->fd; + ret = close(temp_fd); + if (ret) + perror(_("closing temporary file")); + + /* Remove the atomic update context, shifting things down. 
*/ + upd_offset = p - updates; + length = nr_updates * sizeof(struct update_info); + length -= (upd_offset + 1) * sizeof(struct update_info); + if (length) + memmove(p, p + 1, length); + + nr_updates--; + return committed_bytes; +} + +static void +cancelupdate_help(void) +{ + printf(_( +"\n" +" Cancels an atomic file update. The temporary file will be closed, and the\n" +" current file set back to the original file.\n" +"\n")); +} + +static int +cancelupdate_f( + int argc, + char *argv[]) +{ + return finish_update(FINISH_ABORT, 0, NULL); +} + +static void +commitupdate_help(void) +{ + printf(_( +"\n" +" Commits an atomic file update. File contents written to the temporary file\n" +" will be exchanged atomically with the corresponding range in the original\n" +" file. The temporary file will be closed, and the current file set back to\n" +" the original file.\n" +"\n" +" -C -- Print timing information in a condensed format.\n" +" -h -- Only exchange written ranges in the temporary file.\n" +" -k -- Exchange to end of file, ignore any length previously set.\n" +" -n -- Check parameters but do not change anything.\n" +" -q -- Do not print timing information at all.\n")); +} + +static int +commitupdate_f( + int argc, + char *argv[]) +{ + struct timeval t1, t2; + enum finish_how how = FINISH_COMMIT; + uint64_t flags = XFS_EXCHANGE_RANGE_TO_EOF; + long long offset, len; + int condensed = 0, quiet_flag = 0; + int c; + + while ((c = getopt(argc, argv, "Chknq")) != -1) { + switch (c) { + case 'C': + condensed = 1; + break; + case 'h': + flags |= XFS_EXCHANGE_RANGE_FILE1_WRITTEN; + break; + case 'k': + flags &= ~XFS_EXCHANGE_RANGE_TO_EOF; + break; + case 'n': + how = FINISH_CHECK; + break; + case 'q': + quiet_flag = 1; + break; + default: + commitupdate_help(); + return 0; + } + } + if (optind != argc) { + commitupdate_help(); + return 0; + } + + gettimeofday(&t1, NULL); + len = finish_update(how, flags, &offset); + if (len < 0) + return 1; + if (quiet_flag) + return 0; + + 
gettimeofday(&t2, NULL); + t2 = tsub(t2, t1); + report_io_times("commitupdate", &t2, offset, len, len, 1, condensed); + return 0; +} + +static struct cmdinfo startupdate_cmd = { + .name = "startupdate", + .cfunc = startupdate_f, + .argmin = 0, + .argmax = -1, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = startupdate_help, +}; + +static struct cmdinfo cancelupdate_cmd = { + .name = "cancelupdate", + .cfunc = cancelupdate_f, + .argmin = 0, + .argmax = 0, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = cancelupdate_help, +}; + +static struct cmdinfo commitupdate_cmd = { + .name = "commitupdate", + .cfunc = commitupdate_f, + .argmin = 0, + .argmax = -1, + .flags = CMD_FLAG_ONESHOT | CMD_NOMAP_OK, + .help = commitupdate_help, +}; + void exchangerange_init(void) { @@ -171,4 +523,16 @@ exchangerange_init(void) exchangerange_cmd.oneline = _("Exchange contents between files."); add_command(&exchangerange_cmd); + + startupdate_cmd.oneline = _("start an atomic update of a file"); + startupdate_cmd.args = _("[-e]"); + + cancelupdate_cmd.oneline = _("cancel an atomic update"); + + commitupdate_cmd.oneline = _("commit a file update atomically"); + commitupdate_cmd.args = _("[-C] [-h] [-n] [-q]"); + + add_command(&startupdate_cmd); + add_command(&cancelupdate_cmd); + add_command(&commitupdate_cmd); } diff --git a/io/io.h b/io/io.h index 8c5e59100c5cbd..4daedac06419ae 100644 --- a/io/io.h +++ b/io/io.h @@ -31,6 +31,9 @@ #define IO_PATH (1<<10) #define IO_NOFOLLOW (1<<11) +/* undergoing atomic update, do not close */ +#define IO_ATOMICUPDATE (1<<12) + /* * Regular file I/O control */ @@ -74,6 +77,7 @@ extern int openfile(char *, struct xfs_fsop_geom *, int, mode_t, struct fs_path *); extern int addfile(char *, int , struct xfs_fsop_geom *, int, struct fs_path *); +extern int closefile(void); extern void printxattr(uint, int, int, const char *, int, int); extern unsigned int recurse_all; diff --git a/io/open.c b/io/open.c index 15850b5557bc5b..a30dd89a1fd56c 100644 --- 
a/io/open.c +++ b/io/open.c @@ -338,14 +338,19 @@ open_f( return 0; } -static int -close_f( - int argc, - char **argv) +int +closefile(void) { size_t length; unsigned int offset; + if (file->flags & IO_ATOMICUPDATE) { + fprintf(stderr, + _("%s: atomic update in progress, cannot close.\n"), + file->name); + exitcode = 1; + return 0; + } if (close(file->fd) < 0) { perror("close"); exitcode = 1; @@ -371,7 +376,19 @@ close_f( free(filetable); file = filetable = NULL; } - filelist_f(); + return 0; +} + +static int +close_f( + int argc, + char **argv) +{ + int ret; + + ret = closefile(); + if (!ret) + filelist_f(); return 0; } diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8 index 49d4057bb069ed..eb2201fca74380 100644 --- a/man/man8/xfs_io.8 +++ b/man/man8/xfs_io.8 @@ -1058,7 +1058,37 @@ .SH FILE I/O COMMANDS nsec is the nanoseconds since the sec. This value needs to be in the range 0-999999999 with UTIME_NOW and UTIME_OMIT being exceptions. Each (sec, nsec) pair constitutes a single timestamp value. - +.TP +.BI "startupdate [ " -e ] +Create a temporary clone of a file in which to stage file updates. +The +.B \-e +option creates an empty staging file. +.TP +.B cancelupdate +Abandon changes from an update staging file. +.TP +.BI "commitupdate [" OPTIONS ] +Commit changes from an update staging file to the real file. +.RS 1.0i +.PD 0 +.TP 0.4i +.B \-C +Print timing information in a condensed format. +.TP 0.4i +.B \-h +Only swap ranges in the update staging file that were actually written. +.TP 0.4i +.B \-k +Do not change file size. +.TP 0.4i +.B \-n +Check parameters without changing anything. +.TP 0.4i +.B \-q +Do not print timing information at all. +.PD +.RE .SH MEMORY MAPPED I/O COMMANDS .TP
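For readers following the bookkeeping in patch 7/7: startupdate_f() grows the updates array with realloc(), and finish_update() finds the entry whose temp_fd matches the current file and then closes the gap with memmove(). The standalone sketch below illustrates that pattern only. It is not code from the series: struct slot and the slot_add(), slot_find() and slot_remove() helpers are hypothetical names made up for this example, and unlike the patch the sketch bumps the count only after realloc() has succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct update_info; only the fds matter here. */
struct slot {
	int	fd;		/* original file */
	int	temp_fd;	/* staging file */
};

/*
 * Append one zeroed slot to the array at *tablep, growing it with realloc().
 * Returns the new slot, or NULL on allocation failure, in which case the
 * array and the count are left untouched.
 */
static struct slot *
slot_add(struct slot **tablep, unsigned int *nr)
{
	struct slot	*p;

	p = realloc(*tablep, ((size_t)*nr + 1) * sizeof(*p));
	if (!p)
		return NULL;
	*tablep = p;
	p = &p[*nr];
	memset(p, 0, sizeof(*p));
	(*nr)++;
	return p;
}

/* Return the index of the slot staging through temp_fd, or nr if none. */
static unsigned int
slot_find(const struct slot *table, unsigned int nr, int temp_fd)
{
	unsigned int	i;

	for (i = 0; i < nr; i++)
		if (table[i].temp_fd == temp_fd)
			break;
	return i;
}

/* Drop slot idx by shifting every later slot down by one position. */
static void
slot_remove(struct slot *table, unsigned int *nr, unsigned int idx)
{
	size_t		tail;

	assert(idx < *nr);
	tail = (size_t)(*nr - idx - 1) * sizeof(*table);
	if (tail)
		memmove(&table[idx], &table[idx + 1], tail);
	(*nr)--;
}
```

A caller would pair slot_find() with slot_remove() the way finish_update() pairs its search loop with the memmove(): look up the staging fd first, bail out if the returned index equals the count, and only then remove the entry. memmove() rather than memcpy() is needed because the source and destination ranges overlap.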