From patchwork Thu Feb 16 21:06:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13143913
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 2C383C636CC
	for ; Thu, 16 Feb 2023 21:06:08 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230092AbjBPVGH (ORCPT ); Thu, 16 Feb 2023 16:06:07 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40504 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S230078AbjBPVGG (ORCPT ); Thu, 16 Feb 2023 16:06:06 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2AD71505D3
	for ; Thu, 16 Feb 2023 13:06:05 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id D7462B8217A
	for ; Thu, 16 Feb 2023 21:06:03 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85B27C433D2;
	Thu, 16 Feb 2023 21:06:02 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1676581562;
	bh=49CSVlrWibAsZstRHwNy0rgc8vpFgWoeaHThscIc0Uw=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=ezNZk218xsHD7YOYE/VGXxdUMYYgHAwlPuSdQx/MR9BvSfb4zH67yd67fwqVNkJR9
	 63yAvtx1k57eJjFEYbl11QB9LvB2kNAyDuDNL2q14KSbfPLVK/Y+KTQtURu21sW2Yi
	 liucyOONqz6TjSldZ2nWCgLTV0WTwOZR2Qhq2l/GgEha/PmPTenXb5SZ9czv5rs+pm
	 5W3jxseWCt8TvKrhHsNbgZuZ95Mh8Qiv0AyKu8JrhL6CYGC82eEiky8zfL97BxplaX
	 E42L0UTsfT3KvJj2BBhmlfnK0CrlZmZ4UGwcVkjUqIAZGbA2Zud0CpImSlhlW/Af/l
	 nDhbRvgAvbO2Q==
Date: Thu, 16 Feb 2023 13:06:02 -0800
Subject: [PATCH 1/4] libxfs: add xfile support
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: allison.henderson@oracle.com, linux-xfs@vger.kernel.org
Message-ID: <167657880693.3477371.11194291382483826413.stgit@magnolia>
In-Reply-To: <167657880680.3477371.18364607478868446486.stgit@magnolia>
References: <167657880680.3477371.18364607478868446486.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Port the xfile functionality (anonymous pageable file-index memory) from
the kernel. In userspace, we try to use memfd_create() to create tmpfs
files that are not in any namespace, matching the kernel.

Signed-off-by: Darrick J. Wong
---
 configure.ac          |    3 +
 include/builddefs.in  |    3 +
 libxfs/Makefile       |   12 +++
 libxfs/xfile.c        |  224 +++++++++++++++++++++++++++++++++++++++++++++++++
 libxfs/xfile.h        |   56 ++++++++++++
 m4/package_libcdev.m4 |   50 +++++++++++
 repair/xfs_repair.c   |   15 +++
 7 files changed, 363 insertions(+)
 create mode 100644 libxfs/xfile.c
 create mode 100644 libxfs/xfile.h

diff --git a/configure.ac b/configure.ac
index 63cc18cc..2472b32f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -251,6 +251,9 @@ AC_CHECK_SIZEOF([char *])
 AC_TYPE_UMODE_T
 AC_MANUAL_FORMAT
 AC_HAVE_LIBURCU_ATOMIC64
+AC_HAVE_MEMFD_CLOEXEC
+AC_HAVE_O_TMPFILE
+AC_HAVE_MKOSTEMP_CLOEXEC
 
 AC_CONFIG_FILES([include/builddefs])
 AC_OUTPUT
diff --git a/include/builddefs.in b/include/builddefs.in
index e0a2f3cb..60c1320a 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -127,6 +127,9 @@ SYSTEMD_SYSTEM_UNIT_DIR = @systemd_system_unit_dir@
 HAVE_CROND = @have_crond@
 CROND_DIR = @crond_dir@
 HAVE_LIBURCU_ATOMIC64 = @have_liburcu_atomic64@
+HAVE_MEMFD_CLOEXEC = @have_memfd_cloexec@
+HAVE_O_TMPFILE = @have_o_tmpfile@
+HAVE_MKOSTEMP_CLOEXEC = @have_mkostemp_cloexec@
 
 GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
 # -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/libxfs/Makefile b/libxfs/Makefile
index 89d29dc9..17978006 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -26,6 +26,7 @@ HFILES = \
 	libxfs_priv.h \
 	linux-err.h \
 	topology.h \
+	xfile.h \
 	xfs_ag_resv.h \
 	xfs_alloc.h \
 	xfs_alloc_btree.h \
@@ -66,6 +67,7 @@ CFILES = cache.c \
 	topology.c \
 	trans.c \
 	util.c \
+	xfile.c \
 	xfs_ag.c \
 	xfs_ag_resv.c \
 	xfs_alloc.c \
@@ -113,6 +115,16 @@ CFILES = cache.c \
 #
 #LCFLAGS +=
 
+ifeq ($(HAVE_MEMFD_CLOEXEC),yes)
+	LCFLAGS += -DHAVE_MEMFD_CLOEXEC
+endif
+ifeq ($(HAVE_O_TMPFILE),yes)
+	LCFLAGS += -DHAVE_O_TMPFILE
+endif
+ifeq ($(HAVE_MKOSTEMP_CLOEXEC),yes)
+	LCFLAGS += -DHAVE_MKOSTEMP_CLOEXEC
+endif
+
 FCFLAGS = -I.
 
 LTLIBS = $(LIBPTHREAD) $(LIBRT)
diff --git a/libxfs/xfile.c b/libxfs/xfile.c
new file mode 100644
index 00000000..f551aef5
--- /dev/null
+++ b/libxfs/xfile.c
@@ -0,0 +1,224 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2022 Oracle. All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#include "libxfs_priv.h"
+#include "libxfs.h"
+#include "libxfs/xfile.h"
+#include
+#include
+#include
+
+/*
+ * Swappable Temporary Memory
+ * ==========================
+ *
+ * Offline checking sometimes needs to be able to stage a large amount of data
+ * in memory. This information might not fit in the available memory and it
+ * doesn't all need to be accessible at all times. In other words, we want an
+ * indexed data buffer to store data that can be paged out.
+ *
+ * memfd files meet those requirements. Therefore, the xfile mechanism uses
+ * one to store our staging data. The xfile must be freed with xfile_destroy.
+ *
+ * xfiles assume that the caller will handle all required concurrency
+ * management; file locks are not taken.
+ */
+
+/*
+ * Open a memory-backed fd to back an xfile. We require close-on-exec here,
+ * because these memfd files function as windowed RAM and hence should never
+ * be shared with other processes.
+ */
+static int
+xfile_create_fd(
+	const char *description)
+{
+	int fd = -1;
+
+#ifdef HAVE_MEMFD_CLOEXEC
+	/* memfd_create exists in kernel 3.17 (2014) and glibc 2.27 (2018). */
+	fd = memfd_create(description, MFD_CLOEXEC);
+	if (fd >= 0)
+		return fd;
+#endif
+
+#ifdef HAVE_O_TMPFILE
+	/*
+	 * O_TMPFILE exists as of kernel 3.11 (2013), which means that if we
+	 * find it, we're pretty safe in assuming O_CLOEXEC exists too.
+	 */
+	fd = open("/dev/shm", O_TMPFILE | O_CLOEXEC | O_RDWR, 0600);
+	if (fd >= 0)
+		return fd;
+
+	fd = open("/tmp", O_TMPFILE | O_CLOEXEC | O_RDWR, 0600);
+	if (fd >= 0)
+		return fd;
+#endif
+
+#ifdef HAVE_MKOSTEMP_CLOEXEC
+	/*
+	 * mkostemp exists as of glibc 2.7 (2007) and O_CLOEXEC as of kernel
+	 * 2.6.23 (2007). mkostemp rewrites its template, so pass an array.
+	 */
+	fd = mkostemp((char[]){"libxfsXXXXXX"}, O_CLOEXEC);
+	if (fd >= 0)
+		return fd;
+#endif
+
+#if !defined(HAVE_MEMFD_CLOEXEC) && \
+    !defined(HAVE_O_TMPFILE) && \
+    !defined(HAVE_MKOSTEMP_CLOEXEC)
+# error System needs memfd_create, O_TMPFILE, or O_CLOEXEC to build!
+#endif
+
+	return fd;
+}
+
+/*
+ * Create an xfile for the given mount. The description will be used in the
+ * trace output.
+ */
+int
+xfile_create(
+	struct xfs_mount *mp,
+	const char *description,
+	struct xfile **xfilep)
+{
+	struct xfile *xf;
+	char fname[MAXNAMELEN];
+	int error;
+
+	snprintf(fname, MAXNAMELEN - 1, "XFS (%s): %s", mp->m_fsname,
+			description);
+	fname[MAXNAMELEN - 1] = 0;
+
+	xf = kmem_alloc(sizeof(struct xfile), KM_MAYFAIL);
+	if (!xf)
+		return -ENOMEM;
+
+	xf->fd = xfile_create_fd(fname);
+	if (xf->fd < 0) {
+		error = -errno;
+		kmem_free(xf);
+		return error;
+	}
+
+	*xfilep = xf;
+	return 0;
+}
+
+/* Close the file and release all resources. */
+void
+xfile_destroy(
+	struct xfile *xf)
+{
+	close(xf->fd);
+	kmem_free(xf);
+}
+
+static inline loff_t
+xfile_maxbytes(
+	struct xfile *xf)
+{
+	if (sizeof(loff_t) == 8)
+		return LLONG_MAX;
+	return LONG_MAX;
+}
+
+/*
+ * Read a memory object directly from the xfile's page cache. Unlike regular
+ * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
+ * high an offset, instead of truncating the read. Otherwise, we return
+ * bytes read or an error code, like regular pread.
+ */
+ssize_t
+xfile_pread(
+	struct xfile *xf,
+	void *buf,
+	size_t count,
+	loff_t pos)
+{
+	ssize_t ret;
+
+	if (count > INT_MAX)
+		return -E2BIG;
+	if (xfile_maxbytes(xf) - pos < count)
+		return -EFBIG;
+
+	ret = pread(xf->fd, buf, count, pos);
+	if (ret >= 0)
+		return ret;
+	return -errno;
+}
+
+/*
+ * Write a memory object directly to the xfile's page cache. Unlike regular
+ * pwrite, we return -E2BIG and -EFBIG for writes that are too large or at too
+ * high an offset, instead of truncating the write. Otherwise, we return
+ * bytes written or an error code, like regular pwrite.
+ */
+ssize_t
+xfile_pwrite(
+	struct xfile *xf,
+	const void *buf,
+	size_t count,
+	loff_t pos)
+{
+	ssize_t ret;
+
+	if (count > INT_MAX)
+		return -E2BIG;
+	if (xfile_maxbytes(xf) - pos < count)
+		return -EFBIG;
+
+	ret = pwrite(xf->fd, buf, count, pos);
+	if (ret >= 0)
+		return ret;
+	return -errno;
+}
+
+/* Query stat information for an xfile. */
+int
+xfile_stat(
+	struct xfile *xf,
+	struct xfile_stat *statbuf)
+{
+	struct stat ks;
+	int error;
+
+	error = fstat(xf->fd, &ks);
+	if (error)
+		return -errno;
+
+	statbuf->size = ks.st_size;
+	statbuf->bytes = (unsigned long long)ks.st_blocks << 9;
+	return 0;
+}
+
+/* Dump an xfile to stdout. */
+int
+xfile_dump(
+	struct xfile *xf)
+{
+	char *argv[] = {"od", "-tx1", "-Ad", "-c", NULL};
+	pid_t child;
+	int i;
+
+	child = fork();
+	if (child != 0) {
+		int wstatus;
+
+		wait(&wstatus);
+		return wstatus == 0 ? 0 : -EIO;
+	}
+
+	/* reroute our xfile to stdin and shut everything else */
+	dup2(xf->fd, 0);
+	for (i = 3; i < 1024; i++)
+		close(i);
+
+	return execvp("od", argv);
+}
diff --git a/libxfs/xfile.h b/libxfs/xfile.h
new file mode 100644
index 00000000..1389ff8f
--- /dev/null
+++ b/libxfs/xfile.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2022 Oracle. All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#ifndef __LIBXFS_XFILE_H__
+#define __LIBXFS_XFILE_H__
+
+struct xfile {
+	int fd;
+};
+
+int xfile_create(struct xfs_mount *mp, const char *description,
+		struct xfile **xfilep);
+void xfile_destroy(struct xfile *xf);
+
+ssize_t xfile_pread(struct xfile *xf, void *buf, size_t count, loff_t pos);
+ssize_t xfile_pwrite(struct xfile *xf, const void *buf, size_t count, loff_t pos);
+
+/*
+ * Load an object. Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t pos)
+{
+	ssize_t ret = xfile_pread(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+/*
+ * Store an object. Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t pos)
+{
+	ssize_t ret = xfile_pwrite(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+struct xfile_stat {
+	loff_t size;
+	unsigned long long bytes;
+};
+
+int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
+int xfile_dump(struct xfile *xf);
+
+#endif /* __LIBXFS_XFILE_H__ */
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index bb1ab49c..119d1bda 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -507,3 +507,53 @@ AC_DEFUN([AC_PACKAGE_CHECK_LTO],
     AC_SUBST(lto_cflags)
     AC_SUBST(lto_ldflags)
   ])
+
+#
+# Check if we have a memfd_create syscall with a MFD_CLOEXEC flag
+#
+AC_DEFUN([AC_HAVE_MEMFD_CLOEXEC],
+  [ AC_MSG_CHECKING([for memfd_create and MFD_CLOEXEC])
+    AC_LINK_IFELSE([AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include
+    ]], [[
+         return memfd_create("xfs", MFD_CLOEXEC);
+    ]])],[have_memfd_cloexec=yes
+          AC_MSG_RESULT(yes)],[AC_MSG_RESULT(no)])
+    AC_SUBST(have_memfd_cloexec)
+  ])
+
+#
+# Check if we have the O_TMPFILE flag
+#
+AC_DEFUN([AC_HAVE_O_TMPFILE],
+  [ AC_MSG_CHECKING([for O_TMPFILE])
+    AC_LINK_IFELSE([AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include
+#include
+#include
+    ]], [[
+         return open("nowhere", O_TMPFILE, 0600);
+    ]])],[have_o_tmpfile=yes
+          AC_MSG_RESULT(yes)],[AC_MSG_RESULT(no)])
+    AC_SUBST(have_o_tmpfile)
+  ])
+
+#
+# Check if we have mkostemp with the O_CLOEXEC flag
+#
+AC_DEFUN([AC_HAVE_MKOSTEMP_CLOEXEC],
+  [ AC_MSG_CHECKING([for mkostemp and O_CLOEXEC])
+    AC_LINK_IFELSE([AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include
+#include
+#include
+#include
+    ]], [[
+         return mkostemp("nowhere", O_CLOEXEC);
+    ]])],[have_mkostemp_cloexec=yes
+          AC_MSG_RESULT(yes)],[AC_MSG_RESULT(no)])
+    AC_SUBST(have_mkostemp_cloexec)
+  ])
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index ff29bea9..65cb9387 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -953,6 +953,20 @@ phase_end(int phase)
 		platform_crash();
 }
 
+/* Try to allow as many memfds as possible. */
+static void
+bump_max_fds(void)
+{
+	struct rlimit rlim = { };
+	int ret;
+
+	ret = getrlimit(RLIMIT_NOFILE, &rlim);
+	if (!ret) {
+		rlim.rlim_cur = rlim.rlim_max;
+		setrlimit(RLIMIT_NOFILE, &rlim);
+	}
+}
+
 int
 main(int argc, char **argv)
 {
@@ -972,6 +986,7 @@ main(int argc, char **argv)
 	bindtextdomain(PACKAGE, LOCALEDIR);
 	textdomain(PACKAGE);
 	dinode_bmbt_translation_init();
+	bump_max_fds();
 
 	temp_mp = &xfs_m;
 	setbuf(stdout, NULL);

From patchwork Thu Feb 16 21:06:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13143914
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 24D18C61DA4
	for ; Thu, 16 Feb 2023 21:06:22 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230093AbjBPVGU (ORCPT ); Thu, 16 Feb 2023 16:06:20 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40558 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S230078AbjBPVGT (ORCPT ); Thu, 16 Feb 2023 16:06:19 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3BDCC505D3
	for ; Thu, 16 Feb 2023 13:06:19 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id C6CDF60C69
	for ; Thu, 16 Feb 2023 21:06:18 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3081FC433EF;
	Thu, 16 Feb 2023 21:06:18 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1676581578;
	bh=ENq+QlZjsUfJkzc0ic9Y/rGVYdquRMVSvOp8sZ1Ms8c=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=MLO+qJ0CDAjraesXE6Nz5tq1v61jQZMTrnCsuAvix9a88t6+fT7bAzoL8Y9fHoy+4
	 SdQRlMxoljRelQVZGlIfRxHAiwd/e3xX+W7XzUep0/4ye3IUaGvyztDfdT7dXMJke/
	 xXHCwDzLlr3JcXMZamjLLAViKH5fErbUnfEriPwfudE/VdyPepDI2Rlb7yObfffWci
	 YLEiIXiTudog9Ejfr+P1EDdDPUaNAWOvqeD58s41aIY/7dVNm98iX6YIU8B3mRebhC
	 j22B0DPbSL1c7az3GwJ+JsVngi3eTFYP2RYKCtCDX2TWzZAmmjMnujWx135A3W+TIl
	 TNviYOmNwTxew==
Date: Thu, 16 Feb 2023 13:06:17 -0800
Subject: [PATCH 2/4] xfs: track file link count updates during live nlinks fsck
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: allison.henderson@oracle.com, linux-xfs@vger.kernel.org
Message-ID: <167657880707.3477371.12711120588680798848.stgit@magnolia>
In-Reply-To: <167657880680.3477371.18364607478868446486.stgit@magnolia>
References: <167657880680.3477371.18364607478868446486.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Create the necessary hooks in the file create/unlink/rename code so that
our live nlink scrub code can stay up to date with the rest of the
filesystem. This will be the means to keep our shadow link count
information up to date while the scan runs in real time.

Signed-off-by: Darrick J. Wong
---
 libxfs/xfs_dir2.c |    6 ++++++
 libxfs/xfs_dir2.h |    1 +
 repair/phase6.c   |    4 ----
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index 43b4e46b..4bbe83f9 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -24,6 +24,12 @@ const struct xfs_name xfs_name_dotdot = {
 	.type = XFS_DIR3_FT_DIR,
 };
 
+const struct xfs_name xfs_name_dot = {
+	.name = (const unsigned char *)".",
+	.len = 1,
+	.type = XFS_DIR3_FT_DIR,
+};
+
 /*
  * Convert inode mode to directory entry filetype
  */
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index ff59f009..ac360c0b 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -22,6 +22,7 @@ struct xfs_dir3_icfree_hdr;
 struct xfs_dir3_icleaf_hdr;
 
 extern const struct xfs_name xfs_name_dotdot;
+extern const struct xfs_name xfs_name_dot;
 
 /*
  * Convert inode mode to directory entry filetype
diff --git a/repair/phase6.c b/repair/phase6.c
index e202398e..0d253701 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -23,10 +23,6 @@ static struct cred zerocr;
 static struct fsxattr zerofsx;
 static xfs_ino_t orphanage_ino;
 
-static struct xfs_name xfs_name_dot = {(unsigned char *)".",
-						1,
-						XFS_DIR3_FT_DIR};
-
 /*
  * Data structures used to keep track of directories where the ".."
  * entries are updated. These must be rebuilt after the initial pass

From patchwork Thu Feb 16 21:06:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13143915
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 3213EC61DA4
	for ; Thu, 16 Feb 2023 21:06:37 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230097AbjBPVGg (ORCPT ); Thu, 16 Feb 2023 16:06:36 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40614 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S230078AbjBPVGf (ORCPT ); Thu, 16 Feb 2023 16:06:35 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org
	[IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3F57528A1
	for ; Thu, 16 Feb 2023 13:06:34 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 6468D60C1A
	for ; Thu, 16 Feb 2023 21:06:34 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id C9669C433D2;
	Thu, 16 Feb 2023 21:06:33 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1676581593;
	bh=dXKEzg/qkL7Bo6qY0Np/0mImyT6jU5nEhDvtcGhA14w=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=f/Ily+SfLOhmWhJZMLfR6dae58SeEWdFwSK5uHfQHp0hWQdF2xxfqqwDqrE68WKfQ
	 GKVXJDrP5+/gs/S8cxrlJk0ZUv39yeXwyo1UucfC4s6GOOhoo2VHXPbDS8orY5XBCN
	 i9eN3kmHjmCdIPHpcCYhORZdLjB/ZF4BnqTicBaJRdSeOG63WmUr0ku7reG5tx3LTF
	 wb5aVJQPyI4mSXrvAujYdLWl5X5OHTk1QswpOYe3jj5bUyFHKT7eV1hd0DfRwf0fj1
	 zpzvyFEORrmhXElxgGO94WSSlbXqUHfVFENLNCKY5QxeuzTLgBqEk56KiZ+OSYTNrb
	 w+u+548/jS3Uw==
Date: Thu, 16 Feb 2023 13:06:33 -0800
Subject: [PATCH 3/4] xfs: create a blob array data structure
From: "Darrick J.
	Wong"
To: djwong@kernel.org
Cc: allison.henderson@oracle.com, linux-xfs@vger.kernel.org
Message-ID: <167657880720.3477371.15024482783988794017.stgit@magnolia>
In-Reply-To: <167657880680.3477371.18364607478868446486.stgit@magnolia>
References: <167657880680.3477371.18364607478868446486.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Create a simple 'blob array' data structure for storage of arbitrarily
sized metadata objects that will be used to reconstruct metadata. For
the intended usage (temporarily storing extended attribute names and
values) we only have to support storing objects and retrieving them.
Use the xfile abstraction to store the attribute information in memory
that can be swapped out.

Signed-off-by: Darrick J. Wong
---
 libxfs/Makefile |    2 +
 libxfs/xfblob.c |  148 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 libxfs/xfblob.h |   25 +++++++++
 libxfs/xfile.c  |   11 ++++
 libxfs/xfile.h  |    1
 5 files changed, 187 insertions(+)
 create mode 100644 libxfs/xfblob.c
 create mode 100644 libxfs/xfblob.h

diff --git a/libxfs/Makefile b/libxfs/Makefile
index 17978006..cac0c948 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -26,6 +26,7 @@ HFILES = \
 	libxfs_priv.h \
 	linux-err.h \
 	topology.h \
+	xfblob.h \
 	xfile.h \
 	xfs_ag_resv.h \
 	xfs_alloc.h \
@@ -67,6 +68,7 @@ CFILES = cache.c \
 	topology.c \
 	trans.c \
 	util.c \
+	xfblob.c \
 	xfile.c \
 	xfs_ag.c \
 	xfs_ag_resv.c \
diff --git a/libxfs/xfblob.c b/libxfs/xfblob.c
new file mode 100644
index 00000000..6c1c8e6f
--- /dev/null
+++ b/libxfs/xfblob.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2022 Oracle. All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#include "libxfs_priv.h"
+#include "libxfs.h"
+#include "libxfs/xfile.h"
+#include "libxfs/xfblob.h"
+
+/*
+ * XFS Blob Storage
+ * ================
+ * Stores and retrieves blobs using an xfile. Objects are appended to the file
+ * and the offset is returned as a magic cookie for retrieval.
+ */
+
+#define XB_KEY_MAGIC 0xABAADDAD
+struct xb_key {
+	uint32_t xb_magic; /* XB_KEY_MAGIC */
+	uint32_t xb_size; /* size of the blob, in bytes */
+	loff_t xb_offset; /* byte offset of this key */
+	/* blob comes after here */
+} __packed;
+
+/* Initialize a blob storage object. */
+int
+xfblob_create(
+	struct xfs_mount *mp,
+	const char *description,
+	struct xfblob **blobp)
+{
+	struct xfblob *blob;
+	struct xfile *xfile;
+	int error;
+
+	error = xfile_create(mp, description, &xfile);
+	if (error)
+		return error;
+
+	blob = malloc(sizeof(struct xfblob));
+	if (!blob) {
+		error = -ENOMEM;
+		goto out_xfile;
+	}
+
+	blob->xfile = xfile;
+	blob->last_offset = PAGE_SIZE;
+
+	*blobp = blob;
+	return 0;
+
+out_xfile:
+	xfile_destroy(xfile);
+	return error;
+}
+
+/* Destroy a blob storage object. */
+void
+xfblob_destroy(
+	struct xfblob *blob)
+{
+	xfile_destroy(blob->xfile);
+	free(blob);
+}
+
+/* Retrieve a blob. */
+int
+xfblob_load(
+	struct xfblob *blob,
+	xfblob_cookie cookie,
+	void *ptr,
+	uint32_t size)
+{
+	struct xb_key key;
+	int error;
+
+	error = xfile_obj_load(blob->xfile, &key, sizeof(key), cookie);
+	if (error)
+		return error;
+
+	if (key.xb_magic != XB_KEY_MAGIC || key.xb_offset != cookie) {
+		ASSERT(0);
+		return -ENODATA;
+	}
+	if (size < key.xb_size) {
+		ASSERT(0);
+		return -EFBIG;
+	}
+
+	return xfile_obj_load(blob->xfile, ptr, key.xb_size,
+			cookie + sizeof(key));
+}
+
+/* Store a blob. */
+int
+xfblob_store(
+	struct xfblob *blob,
+	xfblob_cookie *cookie,
+	const void *ptr,
+	uint32_t size)
+{
+	struct xb_key key = {
+		.xb_offset = blob->last_offset,
+		.xb_magic = XB_KEY_MAGIC,
+		.xb_size = size,
+	};
+	loff_t pos = blob->last_offset;
+	int error;
+
+	error = xfile_obj_store(blob->xfile, &key, sizeof(key), pos);
+	if (error)
+		return error;
+
+	pos += sizeof(key);
+	error = xfile_obj_store(blob->xfile, ptr, size, pos);
+	if (error)
+		goto out_err;
+
+	*cookie = blob->last_offset;
+	blob->last_offset += sizeof(key) + size;
+	return 0;
+out_err:
+	xfile_discard(blob->xfile, blob->last_offset, sizeof(key));
+	return error;
+}
+
+/* Free a blob. */
+int
+xfblob_free(
+	struct xfblob *blob,
+	xfblob_cookie cookie)
+{
+	struct xb_key key;
+	int error;
+
+	error = xfile_obj_load(blob->xfile, &key, sizeof(key), cookie);
+	if (error)
+		return error;
+
+	if (key.xb_magic != XB_KEY_MAGIC || key.xb_offset != cookie) {
+		ASSERT(0);
+		return -ENODATA;
+	}
+
+	xfile_discard(blob->xfile, cookie, sizeof(key) + key.xb_size);
+	return 0;
+}
diff --git a/libxfs/xfblob.h b/libxfs/xfblob.h
new file mode 100644
index 00000000..d1282810
--- /dev/null
+++ b/libxfs/xfblob.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2022 Oracle. All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#ifndef __XFS_SCRUB_XFBLOB_H__
+#define __XFS_SCRUB_XFBLOB_H__
+
+struct xfblob {
+	struct xfile *xfile;
+	loff_t last_offset;
+};
+
+typedef loff_t xfblob_cookie;
+
+int xfblob_create(struct xfs_mount *mp, const char *descr,
+		struct xfblob **blobp);
+void xfblob_destroy(struct xfblob *blob);
+int xfblob_load(struct xfblob *blob, xfblob_cookie cookie, void *ptr,
+		uint32_t size);
+int xfblob_store(struct xfblob *blob, xfblob_cookie *cookie, const void *ptr,
+		uint32_t size);
+int xfblob_free(struct xfblob *blob, xfblob_cookie cookie);
+
+#endif /* __XFS_SCRUB_XFBLOB_H__ */
diff --git a/libxfs/xfile.c b/libxfs/xfile.c
index f551aef5..57542507 100644
--- a/libxfs/xfile.c
+++ b/libxfs/xfile.c
@@ -222,3 +222,14 @@ xfile_dump(
 
 	return execvp("od", argv);
 }
+
+/* Discard pages backing a range of the xfile. */
+void
+xfile_discard(
+	struct xfile *xf,
+	loff_t pos,
+	unsigned long long count)
+{
+	fallocate(xf->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+			pos, count);
+}
diff --git a/libxfs/xfile.h b/libxfs/xfile.h
index 1389ff8f..89431f6f 100644
--- a/libxfs/xfile.h
+++ b/libxfs/xfile.h
@@ -52,5 +52,6 @@ struct xfile_stat {
 
 int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
 int xfile_dump(struct xfile *xf);
+void xfile_discard(struct xfile *xf, loff_t pos, unsigned long long count);
 
 #endif /* __LIBXFS_XFILE_H__ */

From patchwork Thu Feb 16 21:06:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
	Wong"
X-Patchwork-Id: 13143916
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 776A6C636CC
	for ; Thu, 16 Feb 2023 21:06:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230078AbjBPVGx (ORCPT ); Thu, 16 Feb 2023 16:06:53 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40678 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S230100AbjBPVGx (ORCPT ); Thu, 16 Feb 2023 16:06:53 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org
	[IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D8972B632
	for ; Thu, 16 Feb 2023 13:06:52 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id B1418B8217A
	for ; Thu, 16 Feb 2023 21:06:50 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 714FBC433D2;
	Thu, 16 Feb 2023 21:06:49 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1676581609;
	bh=rlZY2hureLjwd7HVSepenAKow1b7C3bQF03fZOGtYsc=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=FYx6iCD92sLNCLz5/r2Lr7DCvls9JQJhE0V/Zifvj7OuTlZyMFGGSeXR69E4TmYso
	 k4otDzTYBhIDaRMpkmd8WcoeavLYTOsMBGhTpKVUQ/ZNtgHoxJDhv9AwSIhnPQa1kj
	 djGToWXy4w9XrR5VOkxjEJ7jvJpjQsd0Kgt1IFuzEApx+RDXr0abVK3uNvqoqpn/BU
	 vHX73usJjLiwldbKeXjInTFWR+oayXDEvz46Prep5YPrZ1MBAnDkA1HsI/I/HB+29c
	 /uwIip0sxSZaW8ngMvP13CduAmB3nSSzFRYtvNiQgsSCGTbblXb9dnavSIfviCwYS0
	 h0e11LJ+9NpNA==
Date: Thu, 16 Feb 2023 13:06:48 -0800
Subject: [PATCH 4/4] libxfs: export attr3_leaf_hdr_from_disk via
	libxfs_api_defs.h
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: allison.henderson@oracle.com, linux-xfs@vger.kernel.org
Message-ID: <167657880733.3477371.14571769745474857902.stgit@magnolia>
In-Reply-To: <167657880680.3477371.18364607478868446486.stgit@magnolia>
References: <167657880680.3477371.18364607478868446486.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Do the xfs -> libxfs switcheroo and cleanups separately so the next patch
doesn't become an even larger mess.

Signed-off-by: Darrick J. Wong
---
 db/attr.c                |    2 +-
 db/metadump.c            |    2 +-
 libxfs/libxfs_api_defs.h |    1 +
 repair/attr_repair.c     |    6 +++---
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/db/attr.c b/db/attr.c
index db7cf54b..8ea7b36e 100644
--- a/db/attr.c
+++ b/db/attr.c
@@ -253,7 +253,7 @@ attr_leaf_entry_walk(
 		return 0;
 
 	off = byteize(startoff);
-	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
+	libxfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
 	entries = xfs_attr3_leaf_entryp(leaf);
 
 	for (i = 0; i < leafhdr.count; i++) {
diff --git a/db/metadump.c b/db/metadump.c
index bb441fbb..4be23993 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -1757,7 +1757,7 @@ process_attr_block(
 	}
 
 	/* Ok, it's a leaf - get header; accounts for crc & non-crc */
-	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &hdr, leaf);
+	libxfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &hdr, leaf);
 
 	nentries = hdr.count;
 	if (nentries == 0 ||
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index 055d2862..6d045867 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -33,6 +33,7 @@
 #define xfs_alloc_read_agf libxfs_alloc_read_agf
 #define xfs_alloc_vextent libxfs_alloc_vextent
 
+#define xfs_attr3_leaf_hdr_from_disk libxfs_attr3_leaf_hdr_from_disk
 #define xfs_attr_get libxfs_attr_get
 #define xfs_attr_leaf_newentsize libxfs_attr_leaf_newentsize
 #define xfs_attr_namecheck libxfs_attr_namecheck
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index afe8073c..d3fd7a47 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -579,7 +579,7 @@ process_leaf_attr_block(
 	da_freemap_t *attr_freemap;
 	struct xfs_attr3_icleaf_hdr leafhdr;
 
-	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
+	libxfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
 	clearit = usedbs = 0;
 	firstb = mp->m_sb.sb_blocksize;
 	stop = xfs_attr3_leaf_hdr_size(leaf);
@@ -802,7 +802,7 @@ process_leaf_attr_level(xfs_mount_t *mp,
 		}
 
 		leaf = bp->b_addr;
-		xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
+		libxfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
 
 		/* check magic number for leaf directory btree block */
 		if (!(leafhdr.magic == XFS_ATTR_LEAF_MAGIC ||
@@ -1000,7 +1000,7 @@ process_longform_leaf_root(
 	 * check sibling pointers in leaf block or root block 0 before
 	 * we have to release the btree block
 	 */
-	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, bp->b_addr);
+	libxfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, bp->b_addr);
 	if (leafhdr.forw != 0 || leafhdr.back != 0) {
 		if (!no_modify) {
 			do_warn(