diff mbox series

[v5,5/8] fast-import: improve documentation for path quoting

Message ID e1a1b0395d0679f386b07c6dfeec4f481ccc7420.1713056559.git.thalia@archibald.dev (mailing list archive)
State Accepted
Commit 22915955caaff8b49671426aba31cbcee19ed0ac
Headers show
Series fast-import: tighten parsing of paths | expand

Commit Message

Thalia Archibald April 14, 2024, 1:12 a.m. UTC
It describes what characters cannot be in an unquoted path, but not
their semantics. Reframe it as a definition of unquoted paths. From the
perspective of the parser, whether it starts with `"` is what defines
whether it will parse it as quoted or unquoted.

The restrictions on characters in unquoted paths (with starting-", LF,
and spaces) are explained in the quoted paragraph. Move it to the
unquoted paragraph and reword.

The restriction that the source paths of filecopy and filerename cannot
contain SP is only stated in their respective sections. Restate it in
the <path> section.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
---
 Documentation/git-fast-import.txt | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)
diff mbox series

Patch

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index b2607366b9..1882758b8a 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -630,18 +630,24 @@  in octal.  Git only supports the following modes:
 In both formats `<path>` is the complete path of the file to be added
 (if not already existing) or modified (if already existing).
 
-A `<path>` string must use UNIX-style directory separators (forward
-slash `/`), may contain any byte other than `LF`, and must not
-start with double quote (`"`).
+A `<path>` can be written as unquoted bytes or a C-style quoted string.
 
-A path can use C-style string quoting; this is accepted in all cases
-and mandatory if the filename starts with double quote or contains
-`LF`. In C-style quoting, the complete name should be surrounded with
-double quotes, and any `LF`, backslash, or double quote characters
-must be escaped by preceding them with a backslash (e.g.,
-`"path/with\n, \\ and \" in it"`).
+When a `<path>` does not start with a double quote (`"`), it is an
+unquoted string and is parsed as literal bytes without any escape
+sequences. However, if the filename contains `LF` or starts with double
+quote, it cannot be represented as an unquoted string and must be
+quoted. Additionally, the source `<path>` in `filecopy` or `filerename`
+must be quoted if it contains SP.
 
-The value of `<path>` must be in canonical form. That is it must not:
+When a `<path>` starts with a double quote (`"`), it is a C-style quoted
+string, where the complete filename is enclosed in a pair of double
+quotes and escape sequences are used. Certain characters must be escaped
+by preceding them with a backslash: `LF` is written as `\n`, backslash
+as `\\`, and double quote as `\"`. All filenames can be represented as
+quoted strings.
+
+A `<path>` must use UNIX-style directory separators (forward slash `/`)
+and its value must be in canonical form. That is it must not:
 
 * contain an empty directory component (e.g. `foo//bar` is invalid),
 * end with a directory separator (e.g. `foo/` is invalid),