diff mbox series

[v3,5/8] fast-import: improve documentation for unquoted paths

Message ID 2e78690023f653436ec2eba0c5fb9dfa6bb62624.1712741871.git.thalia@archibald.dev (mailing list archive)
State Superseded
Headers show
Series fast-import: tighten parsing of paths | expand

Commit Message

Thalia Archibald April 10, 2024, 9:55 a.m. UTC
It describes what cannot be in an unquoted path, but not what it is.
Reframe it as a definition of unquoted paths. The requirement that it
not start with `"` is the core element that implies the rest.

The restriction that the source paths of filecopy and filerename cannot
contain SP is only stated in their respective sections. Restate it in
the <path> section.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
---
 Documentation/git-fast-import.txt | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

Comments

Junio C Hamano April 11, 2024, 7:51 p.m. UTC | #1
Thalia Archibald <thalia@archibald.dev> writes:

> It describes what cannot be in an unquoted path, but not what it is.
> Reframe it as a definition of unquoted paths. The requirement that it
> not start with `"` is the core element that implies the rest.

The other is that a path with LF in it cannot be written unquoted,
which should be treated the same way as ones that begin with dq in
this explanation, I think.

> The restriction that the source paths of filecopy and filerename cannot
> contain SP is only stated in their respective sections. Restate it in
> the <path> section.

Elsewhere later in the series we clarify that NUL cannot appear in a
path, so the above looks perfect at this point in the series.

> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  Documentation/git-fast-import.txt | 19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
> index b2607366b9..f26d7a8571 100644
> --- a/Documentation/git-fast-import.txt
> +++ b/Documentation/git-fast-import.txt
> @@ -630,18 +630,23 @@ in octal.  Git only supports the following modes:
>  In both formats `<path>` is the complete path of the file to be added
>  (if not already existing) or modified (if already existing).
>  
> -A `<path>` string must use UNIX-style directory separators (forward
> -slash `/`), may contain any byte other than `LF`, and must not
> -start with double quote (`"`).
>  
> -A path can use C-style string quoting; this is accepted in all cases
> -and mandatory if the filename starts with double quote or contains
> -`LF`. In C-style quoting, the complete name should be surrounded with

> +A `<path>` can be written as unquoted bytes or a C-style quoted string:

If it is followed by two-bullet-point enumeration (one for unquoted,
the other for quoted), then sentence ending in a colon here is
perfectly fine, but we are not doing so in the next two paragraphs,
so let's give it a normal full-stop (period).  And make it a
paragraph on its own, which you did correctly.

> +When a `<path>` does not start with double quote (`"`), it is an
> +unquoted string and is parsed as literal bytes without any escape
> +sequences. However, if the filename contains `LF` or starts with double
> +quote, it must be written as a quoted string. Additionally, the source
> +`<path>` in `filecopy` or `filerename` must be quoted if it contains SP.

As the description for <path> in filecopy and filerename refers back
to this description, this "Additionally" is a good clarification to
have here.  Nicely done.

> +A `<path>` can use C-style string quoting; this is accepted in all cases
> +and mandatory in the cases where the filename cannot be represented as
> +an unquoted string. In C-style quoting, the complete name should be surrounded with

I somehow think the early part (before "In C-style quoting") is
redundant and unnecessary, as we started the section with "as
unquoted bytes or a C-style quoted string."  As the previous
paragraph about unquoted path begins with "When a <path> does not
start with a double quote", I would have expected this paragraph to
begin like so:

	When a `<path>` begins with a double quote (`"`), it is a
	C-style quoted string, where the complete name is enclosed
	in a pair of double quotes, and ...

>  double quotes, and any `LF`, backslash, or double quote characters
>  must be escaped by preceding them with a backslash (e.g.,
>  `"path/with\n, \\ and \" in it"`).
>  
> -The value of `<path>` must be in canonical form. That is it must not:
> +A `<path>` must use UNIX-style directory separators (forward slash `/`)
> +and must be in canonical form. That is it must not:
>  
>  * contain an empty directory component (e.g. `foo//bar` is invalid),
>  * end with a directory separator (e.g. `foo/` is invalid),

Thanks.
diff mbox series

Patch

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index b2607366b9..f26d7a8571 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -630,18 +630,23 @@  in octal.  Git only supports the following modes:
 In both formats `<path>` is the complete path of the file to be added
 (if not already existing) or modified (if already existing).
 
-A `<path>` string must use UNIX-style directory separators (forward
-slash `/`), may contain any byte other than `LF`, and must not
-start with double quote (`"`).
+A `<path>` can be written as unquoted bytes or a C-style quoted string:
 
-A path can use C-style string quoting; this is accepted in all cases
-and mandatory if the filename starts with double quote or contains
-`LF`. In C-style quoting, the complete name should be surrounded with
+When a `<path>` does not start with double quote (`"`), it is an
+unquoted string and is parsed as literal bytes without any escape
+sequences. However, if the filename contains `LF` or starts with double
+quote, it must be written as a quoted string. Additionally, the source
+`<path>` in `filecopy` or `filerename` must be quoted if it contains SP.
+
+A `<path>` can use C-style string quoting; this is accepted in all cases
+and mandatory in the cases where the filename cannot be represented as
+an unquoted string. In C-style quoting, the complete name should be surrounded with
 double quotes, and any `LF`, backslash, or double quote characters
 must be escaped by preceding them with a backslash (e.g.,
 `"path/with\n, \\ and \" in it"`).
 
-The value of `<path>` must be in canonical form. That is it must not:
+A `<path>` must use UNIX-style directory separators (forward slash `/`)
+and must be in canonical form. That is it must not:
 
 * contain an empty directory component (e.g. `foo//bar` is invalid),
 * end with a directory separator (e.g. `foo/` is invalid),