diff mbox series

docs: de-indent first paragraph of gitformat-* to flow the text

Message ID patch-1.1-1c1434bba31-20221121T141411Z-avarab@gmail.com (mailing list archive)
State New, archived
Series docs: de-indent first paragraph of gitformat-* to flow the text

Commit Message

Ævar Arnfjörð Bjarmason Nov. 21, 2022, 2:15 p.m. UTC
Fix formatting issues with the documentation added to the new
gitformat-* namespace in c0f6dd49f19 (Merge branch
'ab/tech-docs-to-help', 2022-08-14).

As this documentation didn't have a wide audience before that, some of
these formatting issues pre-dated that change. See [1] for details.

But the end result of having some paragraphs use "<p>" in the HTML
output, and to have others wrapped in "<pre><code>" doesn't look
nice. The most minimal way to fix this is to de-indent the opening
line of paragraphs that don't start with another formatting
element (e.g. "*" or "-" would already format them as text). Let's do
that.

1. https://lore.kernel.org/git/221109.86bkpgriso.gmgdl@evledraar.gmail.com/

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

There's a lot more to fix in these gitformat-* docs (as their
formatting was funny before), but this is a minimal change to fix the
most major issues.

 Documentation/gitformat-commit-graph.txt | 10 ++--
 Documentation/gitformat-index.txt        | 74 ++++++++++++------------
 Documentation/gitformat-pack.txt         |  4 +-
 3 files changed, 44 insertions(+), 44 deletions(-)

Comments

Jeff King Nov. 21, 2022, 5:59 p.m. UTC | #1
On Mon, Nov 21, 2022 at 03:15:50PM +0100, Ævar Arnfjörð Bjarmason wrote:

> But the end result of having some paragraphs use "<p>" in the HTML
> output, and to have others wrapped in "<pre><code>" doesn't look
> nice. The most minimal way to fix this is to de-indent the opening
> line of paragraphs that don't start with another formatting
> element (e.g. "*" or "-" would already format them as text). Let's do
> that.

Is there any reason to just touch the opening line of the paragraphs,
and not change the indent of the whole thing? I understand that doing
the first line is sufficient to convince asciidoc to do what we want,
and the diff is technically fewer lines, but the result is rather
confusing for people who will work on the source in the future.

> There's a lot more to fix in these gitformat-* docs (as their
> formatting was funny before), but this is a minimal change to fix the
> most major issues.

I briefly looked at doc-diff output here. Most of them looked obviously
correct (though since there isn't a lot of context in the doc-diff, it's
sometimes hard to tell), but a few were questionable:

> @@ -103,7 +103,7 @@ Git index format
>  
>    Object name for the represented object
>  
> -  A 16-bit 'flags' field split into (high to low bits)
> +A 16-bit 'flags' field split into (high to low bits)
>  
>      1-bit assume-valid flag

This puts "A 16-bit flags field" at a different indent than the
paragraph before, but I think it is part of a list. It is less indented
than the follow-on paragraph ("1-bit assume-valid flag"), but I think
that is correct, as it is a sub-list of the 16-bit flags.

I know you said that formatting problems remain, and certainly this was
not right before your patch either. But I think your patch makes it
worse, because it pulls the touched line out of the list (and all of the
adjacent parts are still rendered as code blocks, making the structure
even less clear).

> @@ -114,7 +114,7 @@ Git index format
>      12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
>      is stored in this field.
>  
> -  (Version 3 or later) A 16-bit field, only applicable if the
> +(Version 3 or later) A 16-bit field, only applicable if the
>    "extended flag" above is 1, split into (high to low bits).
>  
>      1-bit reserved for future

This is the same (it's the same list).

> @@ -125,16 +125,16 @@ Git index format
>  
>      13-bit unused, must be zero
>  
> -  Entry path name (variable length) relative to top level directory
> +Entry path name (variable length) relative to top level directory
>      (without leading slash). '/' is used as path separator. The special
>      path components ".", ".." and ".git" (without quotes) are disallowed.
>      Trailing slash is also disallowed.

And I think this is supposed to be part of that same list, too, but now
is de-dented.

> -    The exact encoding is undefined, but the '.' and '/' characters
> +The exact encoding is undefined, but the '.' and '/' characters
>      are encoded in 7-bit ASCII and the encoding cannot contain a NUL
>      byte (iow, this is a UNIX pathname).

This is supposed to be a continuation of the earlier "Entry path name"
item, but now is at the same top-level. Which is not right, but arguably
is not worse than before your patch.

>  
> -  (Version 4) In version 4, the entry path name is prefix-compressed
> +(Version 4) In version 4, the entry path name is prefix-compressed
>      relative to the path name for the previous entry (the very first
>      entry is encoded as if the path name for the previous entry is an
>      empty string).  At the beginning of an entry, an integer N in the

And this one is another top-level part of the list.

> --- a/Documentation/gitformat-pack.txt
> +++ b/Documentation/gitformat-pack.txt
> @@ -294,10 +294,10 @@ Pack file entry: <+
>  
>    - The same trailer as a v1 pack file:
>  
> -    A copy of the pack checksum at the end of
> +A copy of the pack checksum at the end of
>      corresponding packfile.
>  
> -    Index checksum of all of the above.
> +Index checksum of all of the above.

These are supposed to be list items of "The same trailer as..." above.
Now they're de-dented, which I think is worse than the state before your
patch.

I don't want to say "we must fix all format problems at once", but I
think in the cases I pointed out that trying to fix the code-block
problem is a losing battle, because the list-like nature is being made
worse. And they probably should remain untouched until somebody is
willing to turn them into actual list elements and continuation markers.

-Peff

Junio C Hamano Nov. 22, 2022, 12:36 a.m. UTC | #2
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix formatting issues with the documentation added to the new
> gitformat-* namespace in c0f6dd49f19 (Merge branch
> 'ab/tech-docs-to-help', 2022-08-14).

I think I saw you do this before, but please refrain from blaming a
merge UNLESS there is a regression between the tip of the topic
getting merged (c0f6dd49f19^2) and the result of the merge
(c0f6dd49f19), as it is confusing.

>  === CHUNK LOOKUP:
>  
> -  (C + 1) * 12 bytes listing the table of contents for the chunks:
> +(C + 1) * 12 bytes listing the table of contents for the chunks:
>        First 4 bytes describe the chunk id. Value 0 is a terminating label.
>        Other 8 bytes provide the byte-offset in current file for chunk to
>        start. (Chunks are ordered contiguously in the file, so you can infer
>        the length using the next chunk position if necessary.) Each chunk
>        ID appears at most once.
>  
> -  The CHUNK LOOKUP matches the table of contents from
> +The CHUNK LOOKUP matches the table of contents from
>    the chunk-based file format, see linkgit:gitformat-chunk[5]

This makes the result awkward to read for those of us who consume
the text in the source form.  I do not think a one-time cost of
reindenting the whole paragraph (and reviewing the patch to do so)
outweighs the cost of burdening the readers with the awkwardness.
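
The first-line rule discussed in this thread, and the whole-paragraph re-indent both reviewers ask for, can be sketched in Python. This is only an illustration and not part of the patch: the helper names (`split_paragraphs`, `is_literal`, `dedent_paragraph`) are hypothetical, and `is_literal` is a deliberately simplified stand-in for AsciiDoc's real parsing. It looks only at the leading whitespace of a paragraph's first line and at the "*" and "-" list markers mentioned in the commit message.

```python
def split_paragraphs(text):
    """Split text into paragraphs (lists of lines) separated by blank lines."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


def is_literal(paragraph):
    """Simplified rule (an assumption, not AsciiDoc's full grammar):
    a paragraph whose first line is indented renders as a literal
    (code-like) block, unless that line starts a "*" or "-" list item."""
    first = paragraph[0]
    if not first[:1].isspace():
        return False
    return not first.lstrip().startswith(("* ", "- "))


def dedent_paragraph(paragraph):
    """Re-indent the whole paragraph flush left, not just its first line,
    so the source reads consistently."""
    return [line.lstrip() for line in paragraph]
```

Under these assumptions, a paragraph de-indented only on its first line is no longer literal, which is why the patch changes the rendered output. Running `dedent_paragraph` over such a paragraph gives the consistently indented source the reviewers prefer. List-like paragraphs, such as the index entry fields pointed out above, would still need real list markup and continuation markers, which this sketch does not attempt.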

Patch

diff --git a/Documentation/gitformat-commit-graph.txt b/Documentation/gitformat-commit-graph.txt
index 31cad585e23..219265a0c7a 100644
--- a/Documentation/gitformat-commit-graph.txt
+++ b/Documentation/gitformat-commit-graph.txt
@@ -67,17 +67,17 @@  All multi-byte numbers are in network byte order.
 
 === CHUNK LOOKUP:
 
-  (C + 1) * 12 bytes listing the table of contents for the chunks:
+(C + 1) * 12 bytes listing the table of contents for the chunks:
       First 4 bytes describe the chunk id. Value 0 is a terminating label.
       Other 8 bytes provide the byte-offset in current file for chunk to
       start. (Chunks are ordered contiguously in the file, so you can infer
       the length using the next chunk position if necessary.) Each chunk
       ID appears at most once.
 
-  The CHUNK LOOKUP matches the table of contents from
+The CHUNK LOOKUP matches the table of contents from
   the chunk-based file format, see linkgit:gitformat-chunk[5]
 
-  The remaining data in the body is described one chunk at a time, and
+The remaining data in the body is described one chunk at a time, and
   these chunks may be given in any order. Chunks are required unless
   otherwise specified.
 
@@ -126,7 +126,7 @@  All multi-byte numbers are in network byte order.
       be stored within 31 bits.
 
 ==== Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
-      This list of 4-byte values store the second through nth parents for
+This list of 4-byte values store the second through nth parents for
       all octopus merges. The second parent value in the commit data stores
       an array position within this list along with the most-significant bit
       on. Starting at that array position, iterate through this list of commit
@@ -161,7 +161,7 @@  All multi-byte numbers are in network byte order.
     * The BDAT chunk is present if and only if BIDX is present.
 
 ==== Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional]
-      This list of H-byte hashes describe a set of B commit-graph files that
+This list of H-byte hashes describe a set of B commit-graph files that
       form a commit-graph chain. The graph position for the ith commit in this
       file's OID Lookup chunk is equal to i plus the number of commits in all
       base graphs.  If B is non-zero, this chunk must exist.
diff --git a/Documentation/gitformat-index.txt b/Documentation/gitformat-index.txt
index 015cb21bdc0..bbc188b9e65 100644
--- a/Documentation/gitformat-index.txt
+++ b/Documentation/gitformat-index.txt
@@ -17,7 +17,7 @@  Git index format
 
 == The Git index file has the following format
 
-  All binary numbers are in network byte order.
+All binary numbers are in network byte order.
   In a repository using the traditional SHA-1, checksums and object IDs
   (object names) mentioned below are all computed using SHA-1.  Similarly,
   in SHA-256 repositories, these values are computed using SHA-256.
@@ -51,12 +51,12 @@  Git index format
 
 == Index entry
 
-  Index entries are sorted in ascending order on the name field,
+Index entries are sorted in ascending order on the name field,
   interpreted as a string of unsigned bytes (i.e. memcmp() order, no
   localization, no special casing of directory separator '/'). Entries
   with the same name are sorted by their stage field.
 
-  An index entry typically represents a file. However, if sparse-checkout
+An index entry typically represents a file. However, if sparse-checkout
   is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
   `extensions.sparseIndex` extension is enabled, then the index may
   contain entries for directories outside of the sparse-checkout definition.
@@ -103,7 +103,7 @@  Git index format
 
   Object name for the represented object
 
-  A 16-bit 'flags' field split into (high to low bits)
+A 16-bit 'flags' field split into (high to low bits)
 
     1-bit assume-valid flag
 
@@ -114,7 +114,7 @@  Git index format
     12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
     is stored in this field.
 
-  (Version 3 or later) A 16-bit field, only applicable if the
+(Version 3 or later) A 16-bit field, only applicable if the
   "extended flag" above is 1, split into (high to low bits).
 
     1-bit reserved for future
@@ -125,16 +125,16 @@  Git index format
 
     13-bit unused, must be zero
 
-  Entry path name (variable length) relative to top level directory
+Entry path name (variable length) relative to top level directory
     (without leading slash). '/' is used as path separator. The special
     path components ".", ".." and ".git" (without quotes) are disallowed.
     Trailing slash is also disallowed.
 
-    The exact encoding is undefined, but the '.' and '/' characters
+The exact encoding is undefined, but the '.' and '/' characters
     are encoded in 7-bit ASCII and the encoding cannot contain a NUL
     byte (iow, this is a UNIX pathname).
 
-  (Version 4) In version 4, the entry path name is prefix-compressed
+(Version 4) In version 4, the entry path name is prefix-compressed
     relative to the path name for the previous entry (the very first
     entry is encoded as if the path name for the previous entry is an
     empty string).  At the beginning of an entry, an integer N in the
@@ -144,20 +144,20 @@  Git index format
     path name for the previous entry, and replacing it with the string S
     yields the path name for this entry.
 
-  1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
   while keeping the name NUL-terminated.
 
-  (Version 4) In version 4, the padding after the pathname does not
+(Version 4) In version 4, the padding after the pathname does not
   exist.
 
-  Interpretation of index entries in split index mode is completely
+Interpretation of index entries in split index mode is completely
   different. See below for details.
 
 == Extensions
 
 === Cache tree
 
-  Since the index does not record entries for directories, the cache
+Since the index does not record entries for directories, the cache
   entries cannot describe tree objects that already exist in the object
   database for regions of the index that are unchanged from an existing
   commit. The cache tree extension stores a recursive tree structure that
@@ -168,26 +168,26 @@  Git index format
   as `HEAD^{tree}`, since sections of the index can be skipped when a tree
   comparison demonstrates equality.
 
-  The recursive tree structure uses nodes that store a number of cache
+The recursive tree structure uses nodes that store a number of cache
   entries, a list of subnodes, and an object ID (OID). The OID references
   the existing tree for that node, if it is known to exist. The subnodes
   correspond to subdirectories that themselves have cache tree nodes. The
   number of cache entries corresponds to the number of cache entries in
   the index that describe paths within that tree's directory.
 
-  The extension tracks the full directory structure in the cache tree
+The extension tracks the full directory structure in the cache tree
   extension, but this is generally smaller than the full cache entry list.
 
-  When a path is updated in index, Git invalidates all nodes of the
+When a path is updated in index, Git invalidates all nodes of the
   recursive cache tree corresponding to the parent directories of that
   path. We store these tree nodes as being "invalid" by using "-1" as the
   number of cache entries. Invalid nodes still store a span of index
   entries, allowing Git to focus its efforts when reconstructing a full
   cache tree.
 
-  The signature for this extension is { 'T', 'R', 'E', 'E' }.
+The signature for this extension is { 'T', 'R', 'E', 'E' }.
 
-  A series of entries fill the entire extension; each of which
+A series of entries fill the entire extension; each of which
   consists of:
 
   - NUL-terminated path component (relative to its parent directory);
@@ -205,12 +205,12 @@  Git index format
   - Object name for the object that would result from writing this span
     of index as a tree.
 
-  An entry can be in an invalidated state and is represented by having
+An entry can be in an invalidated state and is represented by having
   a negative number in the entry_count field. In this case, there is no
   object name and the next entry starts immediately after the newline.
   When writing an invalid entry, -1 should always be used as entry_count.
 
-  The entries are written out in the top-down, depth-first order.  The
+The entries are written out in the top-down, depth-first order.  The
   first entry represents the root level of the repository, followed by the
   first subtree--let's call this A--of the root level (with its name
   relative to the root level), followed by the first subtree of A (with
@@ -219,19 +219,19 @@  Git index format
 
 === Resolve undo
 
-  A conflict is represented in the index as a set of higher stage entries.
+A conflict is represented in the index as a set of higher stage entries.
   When a conflict is resolved (e.g. with "git add path"), these higher
   stage entries will be removed and a stage-0 entry with proper resolution
   is added.
 
-  When these higher stage entries are removed, they are saved in the
+When these higher stage entries are removed, they are saved in the
   resolve undo extension, so that conflicts can be recreated (e.g. with
   "git checkout -m"), in case users want to redo a conflict resolution
   from scratch.
 
-  The signature for this extension is { 'R', 'E', 'U', 'C' }.
+The signature for this extension is { 'R', 'E', 'U', 'C' }.
 
-  A series of entries fill the entire extension; each of which
+A series of entries fill the entire extension; each of which
   consists of:
 
   - NUL-terminated pathname the entry describes (relative to the root of
@@ -246,13 +246,13 @@  Git index format
 
 === Split index
 
-  In split index mode, the majority of index entries could be stored
+In split index mode, the majority of index entries could be stored
   in a separate file. This extension records the changes to be made on
   top of that to produce the final index.
 
-  The signature for this extension is { 'l', 'i', 'n', 'k' }.
+The signature for this extension is { 'l', 'i', 'n', 'k' }.
 
-  The extension consists of:
+The extension consists of:
 
   - Hash of the shared index file. The shared index file path
     is $GIT_DIR/sharedindex.<hash>. If all bits are zero, the
@@ -273,17 +273,17 @@  Git index format
     first index entry, the second "1" bit to the second entry and so
     on. Replaced entries may have empty path names to save space.
 
-  The remaining index entries after replaced ones will be added to the
+The remaining index entries after replaced ones will be added to the
   final index. These added entries are also sorted by entry name then
   stage.
 
 == Untracked cache
 
-  Untracked cache saves the untracked file list and necessary data to
+Untracked cache saves the untracked file list and necessary data to
   verify the cache. The signature for this extension is { 'U', 'N',
   'T', 'R' }.
 
-  The extension starts with
+The extension starts with
 
   - A sequence of NUL-terminated strings, preceded by the size of the
     sequence in variable width encoding. Each string describes the
@@ -341,11 +341,11 @@  The remaining data of each directory block is grouped by type:
 
 == File System Monitor cache
 
-  The file system monitor cache tracks files for which the core.fsmonitor
+The file system monitor cache tracks files for which the core.fsmonitor
   hook has told us about changes.  The signature for this extension is
   { 'F', 'S', 'M', 'N' }.
 
-  The extension starts with
+The extension starts with
 
   - 32-bit version number: the current supported versions are 1 and 2.
 
@@ -366,16 +366,16 @@  The remaining data of each directory block is grouped by type:
 
 == End of Index Entry
 
-  The End of Index Entry (EOIE) is used to locate the end of the variable
+The End of Index Entry (EOIE) is used to locate the end of the variable
   length index entries and the beginning of the extensions. Code can take
   advantage of this to quickly locate the index extensions without having
   to parse through all of the index entries.
 
-  Because it must be able to be loaded before the variable length cache
+Because it must be able to be loaded before the variable length cache
   entries and other index extensions, this extension must be written last.
   The signature for this extension is { 'E', 'O', 'I', 'E' }.
 
-  The extension consists of:
+The extension consists of:
 
   - 32-bit offset to the end of the index entries
 
@@ -389,12 +389,12 @@  The remaining data of each directory block is grouped by type:
 
 == Index Entry Offset Table
 
-  The Index Entry Offset Table (IEOT) is used to help address the CPU
+The Index Entry Offset Table (IEOT) is used to help address the CPU
   cost of loading the index by enabling multi-threading the process of
   converting cache entries from the on-disk format to the in-memory format.
   The signature for this extension is { 'I', 'E', 'O', 'T' }.
 
-  The extension consists of:
+The extension consists of:
 
   - 32-bit version (currently 1)
 
@@ -407,7 +407,7 @@  The remaining data of each directory block is grouped by type:
 
 == Sparse Directory Entries
 
-  When using sparse-checkout in cone mode, some entire directories within
+When using sparse-checkout in cone mode, some entire directories within
   the index can be summarized by pointing to a tree object instead of the
   entire expanded list of paths within that tree. An index containing such
   entries is a "sparse index". Index format versions 4 and less were not
diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
index e06af02f211..8a0e8dd160d 100644
--- a/Documentation/gitformat-pack.txt
+++ b/Documentation/gitformat-pack.txt
@@ -294,10 +294,10 @@  Pack file entry: <+
 
   - The same trailer as a v1 pack file:
 
-    A copy of the pack checksum at the end of
+A copy of the pack checksum at the end of
     corresponding packfile.
 
-    Index checksum of all of the above.
+Index checksum of all of the above.
 
 == pack-*.rev files have the format: