From patchwork Wed Aug 7 19:14:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13756651 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5E8E5B646 for ; Wed, 7 Aug 2024 19:14:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058050; cv=none; b=oN+0wI/76M8pQ1ZlhTDBdTAdMlHweKRoxZQcAaHAGHluPmWRlfCeAqSdtntby5XwHEjtI9c/39rH+8mTE0igwPRlLIpJU3/X2dpHuYGqiDxaXDhIae/aBXRLB22RjghBURCxBqu2iqj0FfWj5ihZG6lCoj5m70L+LZIHlVXqoMw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058050; c=relaxed/simple; bh=w+gWr/s1kxwhyQONgmut6VRZibjd4xDWnpSme9HDZKs=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ol3l2/vhtxklikG+g+DAQTSCPdk9cMsKsw6/dJOQMviyRLpKCxs1oD9lVGAfQAD42pVwLtbm1oGBwed7pT4AZKTk/QT/YVQEQ3QCRQnfkWB3Ly6I7GyXA7lcqwcfpQqpWOI9TLFu+I97ZNwgn2sqxQwIi2UwZBOGSKoBuKU6HL8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fmTjnNRA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fmTjnNRA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF92BC32781; Wed, 7 Aug 2024 19:14:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723058049; bh=w+gWr/s1kxwhyQONgmut6VRZibjd4xDWnpSme9HDZKs=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=fmTjnNRAGiZlxH+mw/CW/IJ4JXnawVeDCRMcCssI2czCG0TxW9sR5f5PC0YaIpX4n 
ljz2YOy/F7qZFqTwgmmgCTjDmwOrACsTqG6tg4KmtpzmtiZ5ZlefcRtB0LZsKSfv16 B5BBnhOUfuhlJEQsh69aZdv0KPybjlefLBz8bkucDeIJIrCC6s6CM+gAjHgRW8S8dJ +Ya469f8ltehlq4RleELn6SKb8N6dp6BejcAm9uG2btL1Aa7VlE+kQO0JotmlEXwzd Xz10U/u3t0u/nFrlqPa6zhe2fkEnZD4y0D1c8GkOwSKOdjYIVAAwsrn0bsRVrU0c1/ p2thNN4awNKXQ== Date: Wed, 07 Aug 2024 12:14:09 -0700 Subject: [PATCH 1/5] design: document atomic file mapping exchange log intent structures From: "Darrick J. Wong" To: djwong@kernel.org Cc: chandanbabu@kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <172305794103.969463.851368852347319574.stgit@frogsfrogsfrogs> In-Reply-To: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> References: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Document the log intent item formats for the mapping exchange feature. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- .../allocation_groups.asciidoc | 10 ++ .../journaling_log.asciidoc | 123 ++++++++++++++++++++ design/XFS_Filesystem_Structure/magic.asciidoc | 2 3 files changed, 135 insertions(+) diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc index c0ba16a8..e22c7344 100644 --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc @@ -458,6 +458,16 @@ xfs_repair before it can be mounted. Large file fork extent counts. This greatly expands the maximum number of space mappings allowed in data and extended attribute file forks. +| +XFS_SB_FEAT_INCOMPAT_EXCHRANGE+ | +Atomic file mapping exchanges. The filesystem is capable of exchanging a range +of mappings between two arbitrary ranges of a file's fork by using log intent +items to track the progress of the high level exchange operation. 
In other +words, the exchange operation can be restarted if the system goes down, which +is necessary for userspace to commit new file contents atomically. This +flag has user-visible impacts, which is why it is a permanent incompat flag. +See the section about xref:XMI_Log_Item[mapping exchange log intents] for more +information. + |===== *sb_features_log_incompat*:: diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc index 8ff437fe..9d9fa836 100644 --- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc +++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc @@ -217,6 +217,8 @@ magic number to distinguish themselves. Buffer data items only appear after | +XFS_LI_BUD+ | 0x1245 | xref:BUD_Log_Item[File Block Mapping Update Done] | +XFS_LI_ATTRI+ | 0x1246 | xref:ATTRI_Log_Item[Extended Attribute Update Intent] | +XFS_LI_ATTRD+ | 0x1247 | xref:ATTRD_Log_Item[Extended Attribute Update Done] +| +XFS_LI_XMI+ | 0x1248 | xref:XMI_Log_Item[File Mapping Exchange Intent] +| +XFS_LI_XMD+ | 0x1249 | xref:XMD_Log_Item[File Mapping Exchange Done] |===== Note that all log items (except for transaction headers) MUST start with @@ -649,6 +651,8 @@ file block mapping operation we want. The upper three bytes are flag bits. | Value | Description | +XFS_BMAP_EXTENT_ATTR_FORK+ | Extent is for the attribute fork. | +XFS_BMAP_EXTENT_UNWRITTEN+ | Extent is unwritten. +| +XFS_BMAP_EXTENT_REALTIME+ | Mapping applies to the data fork of a +realtime file. This flag cannot be combined with +XFS_BMAP_EXTENT_ATTR_FORK+. |===== The ``file block mapping update intent'' operation comes first; it tells the @@ -821,6 +825,125 @@ These regions contain the name and value components of the extended attribute being updated, as needed. There are no magic numbers; each region contains the data and nothing else. 
+[[XMI_Log_Item]] +=== File Mapping Exchange Intent + +These two log items work together to track the exchange of mapped extents +between the forks of two files. Each operation requires a separate XMI/XMD +pair. The log intent item has the following format: + +[source, c] +---- +struct xfs_xmi_log_format { + uint16_t xmi_type; + uint16_t xmi_size; + uint32_t __pad; + uint64_t xmi_id; + uint64_t xmi_inode1; + uint64_t xmi_inode2; + uint32_t xmi_igen1; + uint32_t xmi_igen2; + uint64_t xmi_startoff1; + uint64_t xmi_startoff2; + uint64_t xmi_blockcount; + uint64_t xmi_flags; + int64_t xmi_isize1; + int64_t xmi_isize2; +}; +---- + +*xmi_type*:: +The signature of an XMI operation, 0x1248. This value is in host-endian order, +not big-endian like the rest of XFS. + +*xmi_size*:: +Size of this log item. Should be 1. + +*__pad*:: +Must be zero. + +*xmi_id*:: +A 64-bit number that binds the corresponding XMD log item to this XMI log item. + +*xmi_inode1*:: +Inode number of the first file involved in the operation. + +*xmi_inode2*:: +Inode number of the second file involved in the operation. + +*xmi_igen1*:: +Generation number of the first file involved in the operation. + +*xmi_igen2*:: +Generation number of the second file involved in the operation. + +*xmi_startoff1*:: +Starting point within the first file, in units of filesystem blocks. + +*xmi_startoff2*:: +Starting point within the second file, in units of filesystem blocks. + +*xmi_blockcount*:: +The length to be exchanged, in units of filesystem blocks. + +*xmi_flags*:: +Behavioral changes to the operation, as follows: + +.File Extent Swap Intent Item Flags +[options="header"] +|===== +| Value | Description +| +XFS_EXCHMAPS_ATTR_FORK+ | Exchange extents between attribute forks. +| +XFS_EXCHMAPS_SET_SIZES+ | Exchange the file sizes of the two files +after the operation completes. 
+| +XFS_EXCHMAPS_INO1_WRITTEN+ | Exchange the mappings of two files only +if the file allocation units mapped to file1's range have been written. +| +XFS_EXCHMAPS_CLEAR_INO1_REFLINK+ | Clear the reflink flag from inode1 after +the operation. +| +XFS_EXCHMAPS_CLEAR_INO2_REFLINK+ | Clear the reflink flag from inode2 after +the operation. +|===== + +*xmi_isize1*:: +The original size of the first file, in bytes. This is zero if the ++XFS_EXCHMAPS_SET_SIZES+ flag is not set. + +*xmi_isize2*:: +The original size of the second file, in bytes. This is zero if the ++XFS_EXCHMAPS_SET_SIZES+ flag is not set. + +[[XMD_Log_Item]] +=== Completion of File Mapping Exchange + +The ``file mapping exchange done'' operation complements the ``file mapping +exchange intent'' operation. This second operation indicates that the update +actually happened, so that log recovery needn't replay the update. The XMD +item and the actual updates are typically found in a new transaction following +the transaction in which the XMI was logged. The completion has this format: + +[source, c] +---- +struct xfs_xmd_log_format { + uint16_t xmd_type; + uint16_t xmd_size; + uint32_t __pad; + uint64_t xmd_xmi_id; +}; +---- + +*xmd_type*:: +The signature of an XMD operation, 0x1249. This value is in host-endian order, +not big-endian like the rest of XFS. + +*xmd_size*:: +Size of this log item. Should be 1. + +*__pad*:: +Must be zero. + +*xmd_xmi_id*:: +A 64-bit number that binds the corresponding XMI log item to this XMD log item. + [[Inode_Log_Item]] === Inode Updates diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc index a343271a..60952aeb 100644 --- a/design/XFS_Filesystem_Structure/magic.asciidoc +++ b/design/XFS_Filesystem_Structure/magic.asciidoc @@ -73,6 +73,8 @@ are not aligned to blocks. 
| +XFS_LI_BUD+ | 0x1245 | | xref:BUD_Log_Item[File Block Mapping Update Done] | +XFS_LI_ATTRI+ | 0x1246 | | xref:ATTRI_Log_Item[Extended Attribute Update Intent] | +XFS_LI_ATTRD+ | 0x1247 | | xref:ATTRD_Log_Item[Extended Attribute Update Done] +| +XFS_LI_XMI+ | 0x1248 | | xref:XMI_Log_Item[File Mapping Exchange Intent] +| +XFS_LI_XMD+ | 0x1249 | | xref:XMD_Log_Item[File Mapping Exchange Done] |===== = Theoretical Limits From patchwork Wed Aug 7 19:14:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13756652 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B7ADB13CFA3 for ; Wed, 7 Aug 2024 19:14:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058065; cv=none; b=Rh7QKd12UJD1gLOAP2nfKSHq4EiLoFnVP4G2eOW6nnps1FeEP/vXd7QC3E+zuXg5n1WF8JBviivxF6aI0y2NrtuRUmVBoSAQ03FRiHilLSwRwcrd+nwWviCAM9r00YHPiiooBH5/IFCOKG4+42qhhWrwsRhKp5yG/0AUwVtRPV8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058065; c=relaxed/simple; bh=QPw9IRCI95tfLgM0IyAnsp+Ioqzye0YX8YqK3eG+D4o=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=kf2PbYvHsltK4WPRfRTEaBu10+52l5kDkoz2wlzeuQOfg4/W9ZGcVJTqJQNIVN3adwffSdWG4zwklsf+aGoFywakW1g3FOLZQsanyXQ14qVCkLw84Ew7FSBpAjRZtBohCQWIextOUfMDlKUcrd44ucL0wvRigYaKFeRwbTEU+M4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PlydN1CY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b="PlydN1CY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 888BDC32781; Wed, 7 Aug 2024 19:14:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723058065; bh=QPw9IRCI95tfLgM0IyAnsp+Ioqzye0YX8YqK3eG+D4o=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=PlydN1CYyQG8KMsCjMHp+AvtVum+ri/ovKBb4mkxR0zIAiN6721Em7jvGj2J2a4K3 Lqu3zQ05mx9xUxf0M+HFlhFzzftiDhvbLWPvdJuxDkvuoNcfzcKYh8v2kCuN2jAhqY PjjFkQb7lKg6fEBZIxk+NrY0eAsmaBVeoSaC5sqLcSyiNHYxjnZSEN7rK9lm/rLChK h7L/BVyP9uPpcODsATEVzzI5vMKGkItRpYmec2/TQaXi0OHbmAii0JV2adqS7svnpE xTXS25LB+ojMLVjonXGrNAxCW/kmW/SShDWJJCVZvEvKUAYMx9xTi7JtuXWVCPmwMe wAOXRoH860S1g== Date: Wed, 07 Aug 2024 12:14:25 -0700 Subject: [PATCH 2/5] design: document new name-value logged attribute variants From: "Darrick J. Wong" To: djwong@kernel.org Cc: chandanbabu@kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <172305794118.969463.1580394382652832046.stgit@frogsfrogsfrogs> In-Reply-To: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> References: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong In preparation for parent pointers, we added a few new opcodes for logged extended attribute updates. Document them now. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- .../journaling_log.asciidoc | 54 ++++++++++++++++++-- 1 file changed, 48 insertions(+), 6 deletions(-) diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc index 9d9fa836..6b9d65c3 100644 --- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc +++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc @@ -730,10 +730,18 @@ of file block mapping operation we want. 
.Extended attribute update log intent types [options="header"] |===== -| Value | Description -| +XFS_ATTRI_OP_FLAGS_SET+ | Set a key/value pair. -| +XFS_ATTRI_OP_FLAGS_REMOVE+ | Remove a key/value pair. -| +XFS_ATTRI_OP_FLAGS_REPLACE+ | Replace one key/value pair with another. +| Value | Description +| +XFS_ATTRI_OP_FLAGS_SET+ | Associate an attribute name with the +given value, creating an entry for the name if necessary. +| +XFS_ATTRI_OP_FLAGS_REMOVE+ | Remove an attribute name and any +value associated with it. +| +XFS_ATTRI_OP_FLAGS_REPLACE+ | Remove any value associated with an +attribute name, then associate the name with the given value. +| +XFS_ATTRI_OP_FLAGS_PPTR_SET+ | Add a parent pointer associating a directory entry name with a file handle to the parent directory. The (name, handle) tuple must not exist in the attribute structure. +| +XFS_ATTRI_OP_FLAGS_PPTR_REMOVE+ | Remove a parent pointer from the attribute structure. The (name, handle) tuple must already exist. +| +XFS_ATTRI_OP_FLAGS_PPTR_REPLACE+ | Remove a specific (name, handle) tuple from +the attribute structure, then add a new (name, handle) tuple to the attribute structure. +The two names and handles need not be the same. |===== The ``extended attribute update intent'' operation comes first; it tells the @@ -747,11 +755,17 @@ through the complex update will be replayed fully during log recovery. struct xfs_attri_log_format { uint16_t alfi_type; uint16_t alfi_size; - uint32_t __pad; + uint32_t alfi_igen; uint64_t alfi_id; uint64_t alfi_ino; uint32_t alfi_op_flags; - uint32_t alfi_name_len; + union { + uint32_t alfi_name_len; + struct { + uint16_t alfi_old_name_len; + uint16_t alfi_new_name_len; + }; + }; uint32_t alfi_value_len; uint32_t alfi_attr_filter; }; @@ -764,6 +778,9 @@ order, not big-endian like the rest of XFS. *alfi_size*:: Size of this log item. Should be 1. +*alfi_igen*:: +Generation number of the file being updated. 
+ *alfi_id*:: A 64-bit number that binds the corresponding ATTRD log item to this ATTRI log item. @@ -778,6 +795,13 @@ The operation being performed. The lower byte must be one of the *alfi_name_len*:: Length of the name of the extended attribute. This must not be zero. The attribute name itself is captured in the next log item. +This field is not defined for the PPTR_REPLACE opcode. + +*alfi_old_name_len*:: +For PPTR_REPLACE, this is the length of the old name. + +*alfi_new_name_len*:: +For PPTR_REPLACE, this is the length of the new name. *alfi_value_len*:: Length of the value of the extended attribute. This must be zero for remove @@ -789,6 +813,24 @@ name. Attribute namespace filter flags. This must be one of +ATTR_ROOT+, +ATTR_SECURE+, or +ATTR_INCOMPLETE+. +For a SET or REPLACE opcode, there should be two regions after the ATTRI intent +item. The first region contains the attribute name and the second contains the +attribute value. + +For a REMOVE opcode, there should only be one region after the ATTRI intent +item, and it will contain the attribute name. + +For a PPTR_SET or PPTR_REMOVE opcode, there should be two regions after the +ATTRI intent item. The first region contains the dirent name as the attribute +name. The second region contains a file handle to the parent directory as the +attribute value. + +For a PPTR_REPLACE opcode, there should be four regions after the +ATTRI intent item. The first region contains the dirent name to remove. +The second region contains the dirent name to create. The third region +contains the parent directory file handle to remove. The fourth region +contains the parent directory file handle to add. + [[ATTRD_Log_Item]] === Completion of Extended Attribute Updates From patchwork Wed Aug 7 19:14:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13756653 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE0C9144D11 for ; Wed, 7 Aug 2024 19:14:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058081; cv=none; b=JKRn1WaNajHjQgFJZvC2TGGvJGexELmb8bsYUJX6sFphYLGdqcG7GoPs/dhyPfBJvVZm/q+hsSIWP/8UjbH91pe4M5WfhJaLK5hmCW46CCXlQRL0N1aseB+QsSbg67bTqCk/tQn4GhHV/lwspSfygDSizl1TVq1SMXkKK3AXDqM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058081; c=relaxed/simple; bh=OXw/5vWLjPeoPYWIx+/x9r8jow5ET0ubh6Dxl+/5U1E=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ask5MI9Z32BNcb2iq4zDQpEmzg84oEOtMVo7wSBdv0HPZEmIup+1wMaqVIGiNeIEW70C9cnzOntkj/99Atj48SLEHDCokfb7LC+IMjmzRlVI+rgQFiBJcpmMtVz9iDt4gdyd8eMvp/pzteC7XQagUnXv4SxLJ+tcOS50d1zrsrY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=EwYGD62J; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="EwYGD62J" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3C1D1C32781; Wed, 7 Aug 2024 19:14:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723058081; bh=OXw/5vWLjPeoPYWIx+/x9r8jow5ET0ubh6Dxl+/5U1E=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=EwYGD62JBAUc3v5vdAZeejpGZ40p6Qizs/aQgb9tfefaT1aD70eL++NJXXTUHzBKV QZR6r8qx6fX1ITydDfp87eRzK0ZzekbILNMuotnic9UgeQFD9QWMAIRMe/7/1TXr73 iZ+FTToXNsblO4GfazQ4cJrxeqQ3sgqmIIn2wOmWldmISaOPhI/hzG9vG+DVtSgkQI 
wAalK1gcMMXyOOGem2Vv+RTILch78hBZnPkiInXbMmYcEwYed2eiF1Qfx80OLG7VKw aH4UvRZ/+s3ovp2E0BPlohY1LBRLYff6bboW24Orek37Ce/m3vifwCIUrtStDoIjQ1 HX3L49hnc//Og== Date: Wed, 07 Aug 2024 12:14:40 -0700 Subject: [PATCH 3/5] design: document the parent pointer ondisk format From: "Darrick J. Wong" To: djwong@kernel.org Cc: chandanbabu@kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <172305794133.969463.2869086475470560475.stgit@frogsfrogsfrogs> In-Reply-To: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> References: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add parent pointers to the ondisk format documentation. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- .../allocation_groups.asciidoc | 4 + .../extended_attributes.asciidoc | 95 ++++++++++++++++++++ 2 files changed, 99 insertions(+) diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc index e22c7344..d7fd63ea 100644 --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc @@ -468,6 +468,10 @@ flag has user-visible impacts, which is why it is a permanent incompat flag. See the section about xref:XMI_Log_Item[mapping exchange log intents] for more information. +| +XFS_SB_FEAT_INCOMPAT_PARENT+ | +Directory parent pointers. See the section about xref:Parent_Pointers[parent +pointers] for more information. 
+ |===== *sb_features_log_incompat*:: diff --git a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc index 19bff70f..4000c002 100644 --- a/design/XFS_Filesystem_Structure/extended_attributes.asciidoc +++ b/design/XFS_Filesystem_Structure/extended_attributes.asciidoc @@ -90,6 +90,7 @@ A combination of the following: | +XFS_ATTR_SECURE+ | The attribute's namespace is ``secure''. | +XFS_ATTR_INCOMPLETE+ | This attribute is being modified. | +XFS_ATTR_LOCAL+ | The attribute value is contained within this block. +| +XFS_ATTR_PARENT+ | This attribute is a parent pointer. |===== .Short form attribute layout @@ -911,6 +912,100 @@ Log sequence number of the last write to this block. Filesystems formatted prior to v5 do not have this header in the remote block. Value data begins immediately at offset zero. +[[Parent_Pointers]] +== Directory Parent Pointers + +If this feature is enabled, each directory entry pointing from a parent +directory to a child file has a corresponding back link from the child file +back to the parent. In other words, if directory P has an entry "foo" pointing +to child C, then child C will have a parent pointer entry "foo" pointing to +parent P. This redundancy enables validation and repairs of the directory tree +if the tree structure is damaged. + +Parent pointers are stored in the private ATTR_PARENT namespace within the +extended attribute structure. Attribute names in this namespace use a custom +hash function, which is defined as the dirent name hash of the dirent name XORd +with the upper and lower 32 bits of the parent inumber. This hash function +reduces collisions if the same file is hard linked into multiple directories +under identical names. 
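The hash described above can be sketched in C as follows. This is an illustration under stated assumptions, not the kernel's code: the function names +dirent_name_hash+ and +parent_pointer_hash+ are hypothetical, and the dirent name hash is assumed to be the default (case-sensitive) rotate-and-XOR directory name hash. Filesystems formatted with a case-insensitive directory hash would need a different first step.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left; n is only ever 7, 14, 21, or 28 here. */
static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * Hypothetical helper: dirent name hash, assuming the default
 * (case-sensitive) XFS directory name hash.
 */
static uint32_t dirent_name_hash(const uint8_t *name, size_t len)
{
	uint32_t hash = 0;

	/* Mix in four bytes of the name at a time. */
	for (; len >= 4; len -= 4, name += 4)
		hash = ((uint32_t)name[0] << 21) ^ ((uint32_t)name[1] << 14) ^
		       ((uint32_t)name[2] << 7) ^ (uint32_t)name[3] ^
		       rol32(hash, 28);

	/* Mix in the remaining zero to three bytes. */
	switch (len) {
	case 3:
		return ((uint32_t)name[0] << 14) ^ ((uint32_t)name[1] << 7) ^
		       (uint32_t)name[2] ^ rol32(hash, 21);
	case 2:
		return ((uint32_t)name[0] << 7) ^ (uint32_t)name[1] ^
		       rol32(hash, 14);
	case 1:
		return (uint32_t)name[0] ^ rol32(hash, 7);
	default:
		return hash;
	}
}

/*
 * Hypothetical helper: parent pointer attribute hash, i.e. the dirent
 * name hash XORed with the upper and lower 32 bits of the parent inumber.
 */
static uint32_t parent_pointer_hash(const uint8_t *name, size_t len,
				    uint64_t parent_ino)
{
	return dirent_name_hash(name, len) ^
	       (uint32_t)(parent_ino >> 32) ^ (uint32_t)parent_ino;
}
```

As a cross-check, the dirent hashes this sketch produces for +autoexec.bat+ and +config.sys+ agree with the values that xfs_db prints in the example in this patch (0x5a1f6ea0 and 0x9a01678c). A call such as +parent_pointer_hash((const uint8_t *)"autoexec.bat", 12, 131)+ then XORs 0x5a1f6ea0 with the parent inumber 131, so the same name linked into a different directory hashes differently.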
+ +The attribute name contains the dirent name in +the parent, and the attribute value contains a file handle to the parent +directory: + +[source, c] +---- +struct xfs_parent_rec { + __be64 p_ino; + __be32 p_gen; +}; +---- + +*p_ino*:: +Inode number of the parent directory. + +*p_gen*:: +Generation number of the parent directory. + +=== xfs_db Parent Pointer Example + +Create a directory tree with the following structure, assuming that the +XFS filesystem is mounted on +/mnt+: + +---- +$ mkdir /mnt/a/ /mnt/b +$ touch /mnt/a/autoexec.bat +$ ln /mnt/a/autoexec.bat /mnt/b/config.sys +---- + +Now we open this up in the debugger: + +---- +xfs_db> path /a +xfs_db> ls +8 131 directory 0x0000002e 1 . (good) +10 128 directory 0x0000172e 2 .. (good) +12 132 regular 0x5a1f6ea0 12 autoexec.bat (good) +xfs_db> path /b +xfs_db> ls +8 16777344 directory 0x0000002e 1 . (good) +10 128 directory 0x0000172e 2 .. (good) +15 132 regular 0x9a01678c 10 config.sys (good) +xfs_db> path /b/config.sys +xfs_db> p a +a.sfattr.hdr.totsize = 56 +a.sfattr.hdr.count = 2 +a.sfattr.list[0].namelen = 12 +a.sfattr.list[0].valuelen = 12 +a.sfattr.list[0].root = 0 +a.sfattr.list[0].secure = 0 +a.sfattr.list[0].parent = 1 +a.sfattr.list[0].name = "autoexec.bat" +a.sfattr.list[0].parent_dir.inumber = 131 +a.sfattr.list[0].parent_dir.gen = 3204669414 +a.sfattr.list[1].namelen = 10 +a.sfattr.list[1].valuelen = 12 +a.sfattr.list[1].root = 0 +a.sfattr.list[1].secure = 0 +a.sfattr.list[1].parent = 1 +a.sfattr.list[1].name = "config.sys" +a.sfattr.list[1].parent_dir.inumber = 16777344 +a.sfattr.list[1].parent_dir.gen = 4137450876 + +---- + +In this example, +/a+ and +/b+ are subdirectories of the root. A regular file +is hardlinked into both subdirectories, under different names. Directory +/a+ +is inode 131 and has an entry +autoexec.bat+ pointing to the child file. +Directory +/b+ is inode 16777344 and has an entry +config.sys+ pointing to the +same child file. 
+ +Within the child file, notice that there are two parent pointers in the +extended attribute structure. The first parent pointer tells us that directory +inode 131 should have an entry +autoexec.bat+ pointing down to the child; the +second parent pointer tells us that directory inode 16777344 should have an +entry +config.sys+ pointing down to the child. + == Key Differences Between Directories and Extended Attributes Directories and extended attributes share the function of mapping names to From patchwork Wed Aug 7 19:14:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13756654 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1B5FE13CFA3 for ; Wed, 7 Aug 2024 19:14:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058097; cv=none; b=pnaR16P+WBnbM1ps3xB/WmAYq3qhGTv8mw4Vp1FyRSTgUm8qpCnmWxbvnI0DgNr7oFCHnFv7FQm4sJ7plbXdgrs2ofWPTFW9BbHeFE9I3AZMUPOzjcyiNcCN0sxS2SBIZjgJ2q5YR4DmjZje6uX54i/wEXmzECxWzWjJu4rGjg4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058097; c=relaxed/simple; bh=kgkb46v7suy+I5vqP3isDpQz1hwwm+X+K3j8fc3dy5Q=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=SJbV0CZ/mDo5grG2BL3JoKL7C46nC4e++F1FydZ3SJngbcJ3sB0hm1lsVC0E8EQZDXVyBNNyMRSJrTL55fAmEDrUvV9mVWeaA/4vT33dnlBWggD6u5VHbCxxeZAwYQTQ9PW3pu/cajvtVhDWMYimRpyMsmR49sUSu4c3tQMFzCg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GSoPdEQJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GSoPdEQJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3549C32781; Wed, 7 Aug 2024 19:14:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723058097; bh=kgkb46v7suy+I5vqP3isDpQz1hwwm+X+K3j8fc3dy5Q=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=GSoPdEQJK9kSqkHj0Vk/Hi1maJ0w28GJDUHxEV1Psex8iLnpLKSdsFMqXKcvK9fPN 17b9B8ZDGiclEALrZ3IzaLNs2kaeQZh/AJxcNl5zAEIbDG63U7R6hbL5c/Lzv3wGUA pMtLPflZOF49R9RpY6WnMgiZA3Rge5AhBOAKhNwLscx4tuFWm85gBdfPZSMIyPYSpM ZNEkp5WEowjv7kBrRA9LhosgOUWLp5N3DaT1ywimKUOSwjweJi8Z3IdKBS1ZA7QTy2 qmUYwn+ZBO2B8mptMTrbYbTWNKpUUY8MxmLsSxwv2IRF4ixQJgoMofnSeBG9W/EyJ4 Wvo1qX0jMDUqw== Date: Wed, 07 Aug 2024 12:14:56 -0700 Subject: [PATCH 4/5] design: document the metadump v2 format From: "Darrick J. Wong" To: djwong@kernel.org Cc: chandanbabu@kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <172305794147.969463.227865134024435978.stgit@frogsfrogsfrogs> In-Reply-To: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> References: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Document the ondisk format of v2 metadumps. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- design/XFS_Filesystem_Structure/metadump.asciidoc | 112 ++++++++++++++++++++- 1 file changed, 109 insertions(+), 3 deletions(-) diff --git a/design/XFS_Filesystem_Structure/metadump.asciidoc b/design/XFS_Filesystem_Structure/metadump.asciidoc index 2bddb77f..a32d6423 100644 --- a/design/XFS_Filesystem_Structure/metadump.asciidoc +++ b/design/XFS_Filesystem_Structure/metadump.asciidoc @@ -6,6 +6,9 @@ snapshot of a live file system and to restore that snapshot onto a block device for debugging purposes. 
Only the metadata are captured in the snapshot, and the metadata blocks may be obscured for privacy reasons. +[[Metadump_v1]] +== Metadump v1 + A metadump file starts with a +xfs_metablock+ that records the addresses of the blocks that follow. Following that are the metadata blocks captured from the filesystem. The first block following the first superblock @@ -21,7 +24,7 @@ struct xfs_metablock { __be32 mb_magic; __be16 mb_count; uint8_t mb_blocklog; - uint8_t mb_reserved; + uint8_t mb_info; __be64 mb_daddr[]; }; ---- @@ -37,14 +40,117 @@ Number of blocks indexed by this record. This value must not exceed +(1 The log size of a metadump block. This size of a metadump block 512 bytes, so this value should be 9. -*mb_reserved*:: -Reserved. Should be zero. +*mb_info*:: +A combination of the following flags: + +.Metadump information flags +[options="header"] +|===== +| Flag | Description +| +XFS_METADUMP_INFO_FLAGS+ | +This field is nonzero. + +| +XFS_METADUMP_OBFUSCATED+ | +User-supplied directory entry and extended attribute names have been obscured, +and extended attribute values are zeroed to protect privacy. + +| +XFS_METADUMP_FULLBLOCKS+ | +Entire metadata blocks have been dumped, including unused areas. +If not set, the unused areas are zeroed. + +| +XFS_METADUMP_DIRTYLOG+ | +The log was dirty when the dump was captured. + +|===== *mb_daddr*:: An array of disk addresses. Each of the +mb_count+ blocks (of size +(1 << mb_blocklog+) following the +xfs_metablock+ should be written back to the address pointed to by the corresponding +mb_daddr+ entry. +[[Metadump_v2]] +== Metadump v2 + +A v2 metadump file starts with a +xfs_metadump_header+ structure that records +information about the dump itself. Immediately after this header is a sequence +of records, each consisting of a +xfs_meta_extent+ structure describing an extent of data and then the data +itself. Data areas must be a multiple of 512 bytes in length. 
+ +.Metadata v2 Dump Format + +[source, c] +---- +struct xfs_metadump_header { + __be32 xmh_magic; + __be32 xmh_version; + __be32 xmh_compat_flags; + __be32 xmh_incompat_flags; + __be64 xmh_reserved; +} __packed; +---- + +*xmh_magic*:: +The magic number, ``XMD2'' (0x584D4432). + +*xmh_version*:: +The value 2. + +*xmh_compat_flags*:: +A combination of the following flags: + +.Metadump v2 compat flags +[options="header"] +|===== +| Flag | Description +| +XFS_MD2_COMPAT_OBFUSCATED+ | +User-supplied directory entry and extended attribute names have been obscured, +and extended attribute values are zeroed to protect privacy. + +| +XFS_MD2_COMPAT_FULLBLOCKS+ | +Entire metadata blocks have been dumped, including unused areas. +If not set, the unused areas are zeroed. + +| +XFS_MD2_COMPAT_DIRTYLOG+ | +The log was dirty when the dump was captured. + +| +XFS_MD2_COMPAT_EXTERNALLOG+ | +Dump contains external log contents. + +|===== + +*xmh_incompat_flags*:: +Must be zero. + +*xmh_reserved*:: +Must be zero. + +.Metadata v2 Extent Format + +[source, c] +---- +struct xfs_meta_extent { + __be64 xme_addr; + __be32 xme_len; +} __packed; +---- + +*xme_addr*:: +Bits 55-56 determine the device from which the metadata dump data was extracted. + +.Metadump v2 extent flags +[options="header"] +|===== +| Value | Description +| 0 | Data device +| 1 | External log +|===== + +The lower 54 bits determine the device address from which the dump data was +extracted, in units of 512 bytes. + +*xme_len*:: +Length of the metadata dump data region, in units of 512 bytes. + == Dump Obfuscation Unless explicitly disabled, the +xfs_metadump+ tool obfuscates empty block From patchwork Wed Aug 7 19:15:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13756655 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 266CA13E41F for ; Wed, 7 Aug 2024 19:15:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058114; cv=none; b=MjiNDYL4cyCaktmuWLWYln/tcNsvdlVsalRc2ZSVh5zHNCdCgV/mIarAXePl8HtPouf4+7gMVBCkLM89uKQZK1rC4uBTLEJs/uQNhAhfJgH2g3Qntsc4w+6FOmKsLu4eh0rr1jmgVF70djL/V+4Lrappl1MEXdCS7q1B55RI7c0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723058114; c=relaxed/simple; bh=Pd6KU/ivC7dWbu070Vz+ll3x5aAQmTiFSJG/E2Zakj8=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AV2Ad+9DAwAaJBD/V+2cqP2TQIXF2R9aAIBTBA2CEzFP1cLGwXLRxiIlOK2CmVBiW+2vG/bIRxUvv0aDo2pFvoZ99PA6U0ORnbErGlVVnvP37FFAiJQUV+EaJZOSGZ5T3u9ez0X5W2fcQ+RHlW5BosX7zSUxi7gg79Ynxun2O74= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=csR10qbS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="csR10qbS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 97BD4C32781; Wed, 7 Aug 2024 19:15:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1723058112; bh=Pd6KU/ivC7dWbu070Vz+ll3x5aAQmTiFSJG/E2Zakj8=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=csR10qbSGyFhLnicUr0Ih21aMF1vZCCt+P6O8cQ5Ab9PogxQxZD35yhdPE39effox IhEi+/itCzHygcFC9fqT3kagosIdxbZEUL1wrpP8hvDxVG0SJzGzDKClzaasOXfc82 Nr2sBKyCqoozBXz7xe4neD2vG73MIWHwffeywPG0U3yfBXSi+PsqqMOEcZsGiXxBXS 
k4+zmJyWOWfzKOXl5MkR40kaRnpcLhoJ2iNk+MPbYudfNkFNgMEB7isvL95hw2Wgb3 QHiHbgILMICZXnofhYQVhH4I427727UVsQzrcdw1OavVecaQDUxosZlNxuPhPhAdoh JEegHh2e6YCuw== Date: Wed, 07 Aug 2024 12:15:12 -0700 Subject: [PATCH 5/5] design: fix the changelog to reflect the new changes From: "Darrick J. Wong" To: djwong@kernel.org Cc: chandanbabu@kernel.org, hch@lst.de, linux-xfs@vger.kernel.org Message-ID: <172305794161.969463.5451370159032939139.stgit@frogsfrogsfrogs> In-Reply-To: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> References: <172305794084.969463.781862996787293755.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Minor updates to the changelog to reflect the last change (where we forgot to do this) and this one. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- design/XFS_Filesystem_Structure/docinfo.xml | 32 +++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/design/XFS_Filesystem_Structure/docinfo.xml b/design/XFS_Filesystem_Structure/docinfo.xml index c2acce19..1eddb1f4 100644 --- a/design/XFS_Filesystem_Structure/docinfo.xml +++ b/design/XFS_Filesystem_Structure/docinfo.xml @@ -198,4 +198,36 @@ + + 3.14159265 + February 2023 + + Darrick + Wong + djwong@kernel.org + + + + Add epub output. + large extent counts + logged extended attribute updates + + + + + 3.141592653 + August 2024 + + Darrick + Wong + djwong@kernel.org + + + + metadump v2 + exchange range log items + parent pointers + + +