diff mbox series

[v3,1/1] docs: rewrite the documentation of the text and eol attributes

Message ID 20230501023533.35370-2-alexhenrie24@gmail.com (mailing list archive)
State Superseded
Headers show
Series docs: rewrite the documentation of the text and eol attributes | expand

Commit Message

Alex Henrie May 1, 2023, 2:35 a.m. UTC
These two sentences are confusing because the description of the text
attribute sounds exactly the same as the description of the text=auto
attribute:

"Setting the text attribute on a path enables end-of-line normalization"

"When text is set to "auto", the path is marked for automatic
end-of-line conversion"

Unless the reader is already familiar with the two variants, there's a
high probability that they will think that "end-of-line normalization"
is the same thing as "automatic end-of-line conversion".

It's also not clear that the phrase "When the file has been committed
with CRLF, no conversion is done" in the paragraph for text=auto does
not apply equally to the bare text attribute which is described earlier.
Moreover, it falsely implies that normalization is only suppressed if
the file has been committed. In fact, running `git add` on a CRLF file,
adding the text=auto attribute to the file, and running `git add` again
does not do anything to the line endings either.

On top of that, in several places the documentation for the eol
attribute sounds like it can turn on normalization on checkin, but eol
only controls conversion on checkout. It also sounds like setting eol
(or setting a config variable) is required to turn on conversion on
checkout, but the text attribute can turn on conversion on checkout by
itself if eol is unspecified.

Rephrase the documentation of text, text=auto, eol, eol=crlf, and eol=lf
to be clear about how they are the same, how they are different, and in
what cases conversion is performed.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
---
 Documentation/gitattributes.txt | 60 ++++++++++++++++++---------------
 1 file changed, 32 insertions(+), 28 deletions(-)

Comments

Torsten Bögershausen May 2, 2023, 4:17 a.m. UTC | #1
This looks much better, thanks.
I have some minor comments:
a) The commit message from above:
>  docs: rewrite the documentation of the text and eol attributes

The word "doc" is used 2 times, that feel a little bit redundant
(and should start with a uppercase letter)

"Docs: rewrite the documentation of the text and eol attributes"

Or may be shorter:
"Rewrite the documentation of the text and eol attributes"


b) Some more comments inline:
On Sun, Apr 30, 2023 at 08:35:33PM -0600, Alex Henrie wrote:
> These two sentences are confusing because the description of the text
> attribute sounds exactly the same as the description of the text=auto
> attribute:

The word "These" is somewhat dangling: Which ones ?
May be "The following two sentences" ?

>
> "Setting the text attribute on a path enables end-of-line normalization"
>
> "When text is set to "auto", the path is marked for automatic
> end-of-line conversion"
>
> Unless the reader is already familiar with the two variants, there's a
> high probability that they will think that "end-of-line normalization"
> is the same thing as "automatic end-of-line conversion".
Good.
>
> It's also not clear that the phrase "When the file has been committed
> with CRLF, no conversion is done" in the paragraph for text=auto does
> not apply equally to the bare text attribute which is described earlier.
> Moreover, it falsely implies that normalization is only suppressed if
> the file has been committed. In fact, running `git add` on a CRLF file,
> adding the text=auto attribute to the file, and running `git add` again
> does not do anything to the line endings either.
True.
>
> On top of that, in several places the documentation for the eol
> attribute sounds like it can turn on normalization on checkin, but eol
> only controls conversion on checkout. It also sounds like setting eol
> (or setting a config variable) is required to turn on conversion on
> checkout, but the text attribute can turn on conversion on checkout by
> itself if eol is unspecified.
>
> Rephrase the documentation of text, text=auto, eol, eol=crlf, and eol=lf
> to be clear about how they are the same, how they are different, and in
> what cases conversion is performed.

That's all good.

>
> Helped-by: Torsten Bögershausen <tboegi@web.de>
> Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
> ---
>  Documentation/gitattributes.txt | 60 ++++++++++++++++++---------------
>  1 file changed, 32 insertions(+), 28 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 39bfbca1ff..076a056a72 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -120,20 +120,19 @@ repository upon 'git add' and 'git commit'.
>  `text`
>  ^^^^^^
>
> -This attribute enables and controls end-of-line normalization.  When a
> -text file is normalized, its line endings are converted to LF in the
> -repository.  To control what line ending style is used in the working
> -directory, use the `eol` attribute for a single file and the
> -`core.eol` configuration variable for all text files.
> -Note that setting `core.autocrlf` to `true` or `input` overrides
> -`core.eol` (see the definitions of those options in
> -linkgit:git-config[1]).
> +This attribute marks the path as a text file, which enables end-of-line
> +conversion: When a matching file is added to the index, the file's line
> +endings are normalized to LF in the index.  Conversely, when the file is
> +copied from the index to the working directory, its line endings may be
I still stumble accross "copied". May be shorter:

"the file is written into the working directory"

> +converted from LF to CRLF depending on the `eol` attribute, the Git
> +config, and the platform (see explanation of `eol` below).
Good.
>
>  Set::
>
>  	Setting the `text` attribute on a path enables end-of-line
> -	normalization and marks the path as a text file.  End-of-line
> -	conversion takes place without guessing the content type.
> +	conversion on checkin and checkout as described above.  Line endings
> +	are normalized to LF in the index every time the file is checked in,
> +	even if the file was previously added to Git with CRLF line endings.
>
>  Unset::
>
> @@ -142,10 +141,11 @@ Unset::
>
>  Set to string value "auto"::
>
> -	When `text` is set to "auto", the path is marked for automatic
> -	end-of-line conversion.  If Git decides that the content is
> -	text, its line endings are converted to LF on checkin.
> -	When the file has been committed with CRLF, no conversion is done.
> +	When `text` is set to "auto", Git decides by itself whether the file
> +	is text or binary.  If it is text and the file was not already in
> +	Git with CRLF endings, line endings are converted on checkin and
> +	checkout as described above.  Otherwise, no conversion is done on
> +	checkin or checkout.
>
>  Unspecified::
>
> @@ -159,26 +159,30 @@ unspecified.
>  `eol`
>  ^^^^^
>
> -This attribute sets a specific line-ending style to be used in the
> -working directory.  This attribute has effect only if the `text`
> -attribute is set or unspecified, or if it is set to `auto`, the file is
> -detected as text, and it is stored with LF endings in the index.  Note
> -that setting this attribute on paths which are in the index with CRLF
> -line endings may make the paths to be considered dirty unless
> -`text=auto` is set. Adding the path to the index again will normalize
> -the line endings in the index.
> +This attribute marks a path to use a specific line-ending style in the
> +working tree when it is checked out.
It enables even the normalization at checkin, see
$ mkdir ttt
$ cd ttt
$ git init
$ echo "*.sh eol=lf" >.gitattributes
$ printf '#!/bin/sh\r\necho hello\r\n' >xx.sh
$ git add xx.sh
warning: CRLF will be replaced by LF in xx.sh.
The file will have its original line endings in your working directory



> + This attribute has effect only if
> +the `text` attribute is set or unspecified, or if it is set to `auto`,
> +the file is detected as text, and it is stored with LF endings in the
> +index.
It took me a while to understand it.
Should the "," after "unspecified" be removed ?

Or, should we write:

The `eol` attribute automatically sets `text`, unless `-text`, `binary` or
`text=auto` is specified.

I dunno.

>
>  Set to string value "crlf"::
>
> -	This setting forces Git to normalize line endings for this
> -	file on checkin and convert them to CRLF when the file is
> -	checked out.
> +	This setting converts the file's line endings in the working
> +	directory to CRLF when the file is checked out.
>
>  Set to string value "lf"::
>
> -	This setting forces Git to normalize line endings to LF on
> -	checkin and prevents conversion to CRLF when the file is
> -	checked out.
> +	This setting uses the same line endings in the working directory as
> +	in the index when the file is checked out.
> +
> +Unspecified::
> +
> +	If the `eol` attribute is unspecified for a file, its line endings
> +	in the working directory are determined by the `core.autocrlf` or
> +	`core.eol` configuration variable (see the definitions of those
> +	options in linkgit:git-config[1]).  The default if `text` is set but
> +	neither of those variables is is `eol=lf` on Unix and `eol=crlf` on
> +	Windows.

That's good - I wonder if everyone understands Linux, MacOs and others as Unix.

May be something like this:
The default, if `text` is set but neither of those variables, is `eol=crlf` on
Windows and `eol=lf` on all other systems.
Alex Henrie May 2, 2023, 5:59 a.m. UTC | #2
On Mon, May 1, 2023 at 10:17 PM Torsten Bögershausen <tboegi@web.de> wrote:
>
> This looks much better, thanks.
> I have some minor comments:
> a) The commit message from above:
> >  docs: rewrite the documentation of the text and eol attributes
>
> The word "doc" is used 2 times, that feel a little bit redundant
> (and should start with a uppercase letter)
>
> "Docs: rewrite the documentation of the text and eol attributes"
>
> Or may be shorter:
> "Rewrite the documentation of the text and eol attributes"

The commit message is modeled after 8c591dbfce, which is incidentally
the commit that introduced the confusing wording discussed below.
Unless there is a convention on how to write commit messages like
this, I would rather follow the existing precedent.

> On Sun, Apr 30, 2023 at 08:35:33PM -0600, Alex Henrie wrote:
> > These two sentences are confusing because the description of the text
> > attribute sounds exactly the same as the description of the text=auto
> > attribute:
>
> The word "These" is somewhat dangling: Which ones ?
> May be "The following two sentences" ?

The colon indicates that the sentence refers to the following sentences.

> > -This attribute enables and controls end-of-line normalization.  When a
> > -text file is normalized, its line endings are converted to LF in the
> > -repository.  To control what line ending style is used in the working
> > -directory, use the `eol` attribute for a single file and the
> > -`core.eol` configuration variable for all text files.
> > -Note that setting `core.autocrlf` to `true` or `input` overrides
> > -`core.eol` (see the definitions of those options in
> > -linkgit:git-config[1]).
> > +This attribute marks the path as a text file, which enables end-of-line
> > +conversion: When a matching file is added to the index, the file's line
> > +endings are normalized to LF in the index.  Conversely, when the file is
> > +copied from the index to the working directory, its line endings may be
> I still stumble accross "copied". May be shorter:
>
> "the file is written into the working directory"

Maybe, but then it almost sounds like Git watches the directory for
changes and transforms files as soon as the user creates them.
Conversion only happens when writing files _from the index_. Moreover,
the word "copy" does not always mean a byte-for-byte copy, and clearly
does not mean that here.

> > -This attribute sets a specific line-ending style to be used in the
> > -working directory.  This attribute has effect only if the `text`
> > -attribute is set or unspecified, or if it is set to `auto`, the file is
> > -detected as text, and it is stored with LF endings in the index.  Note
> > -that setting this attribute on paths which are in the index with CRLF
> > -line endings may make the paths to be considered dirty unless
> > -`text=auto` is set. Adding the path to the index again will normalize
> > -the line endings in the index.
> > +This attribute marks a path to use a specific line-ending style in the
> > +working tree when it is checked out.
> It enables even the normalization at checkin, see
> $ mkdir ttt
> $ cd ttt
> $ git init
> $ echo "*.sh eol=lf" >.gitattributes
> $ printf '#!/bin/sh\r\necho hello\r\n' >xx.sh
> $ git add xx.sh
> warning: CRLF will be replaced by LF in xx.sh.
> The file will have its original line endings in your working directory

This has got to be the most confusing Git feature ever. Clearly I did
not understand how it works.

> > + This attribute has effect only if
> > +the `text` attribute is set or unspecified, or if it is set to `auto`,
> > +the file is detected as text, and it is stored with LF endings in the
> > +index.
> It took me a while to understand it.
> Should the "," after "unspecified" be removed ?
>
> Or, should we write:
>
> The `eol` attribute automatically sets `text`, unless `-text`, `binary` or
> `text=auto` is specified.

Your proposed wording makes the relationship between `text` and `eol`
far more clear, and it avoids the need to repeat the detailed
explanation of how normalization on checkin works. However, I think we
still need to say in the introduction that `eol` does not work without
`text`. How about this:

It has effect only if `text` or `text=auto` is set (see above), but
specifying `eol` automatically sets `text` if `text` was left
unspecified.

I've CCed the author of the original sentence on this email in case he
wants to give any feedback on the rewrite.

> > +Unspecified::
> > +
> > +     If the `eol` attribute is unspecified for a file, its line endings
> > +     in the working directory are determined by the `core.autocrlf` or
> > +     `core.eol` configuration variable (see the definitions of those
> > +     options in linkgit:git-config[1]).  The default if `text` is set but
> > +     neither of those variables is is `eol=lf` on Unix and `eol=crlf` on
> > +     Windows.
>
> That's good - I wonder if everyone understands Linux, MacOs and others as Unix.
>
> May be something like this:
> The default, if `text` is set but neither of those variables, is `eol=crlf` on
> Windows and `eol=lf` on all other systems.

That's fine, we can say "all other platforms" instead of "Unix" here.
It would only become a problem if someone ports Git to DOS, which
seems unlikely.

Thanks for the insight and help,

-Alex
Felipe Contreras May 2, 2023, 4:27 p.m. UTC | #3
Torsten Bögershausen wrote:
> This looks much better, thanks.
> I have some minor comments:
> a) The commit message from above:
> >  docs: rewrite the documentation of the text and eol attributes
> 
> The word "doc" is used 2 times, that feel a little bit redundant
> (and should start with a uppercase letter)
> 
> "Docs: rewrite the documentation of the text and eol attributes"
> 
> Or may be shorter:
> "Rewrite the documentation of the text and eol attributes"

This suggestion goes against the Git guideline in SubmittingPatches [1]:

  The title sentence after the "area:" prefix omits the full stop at the
  end, and its first word is not capitalized (the omission of
  capitalization applies only to the word after the "area:" prefix of
  the title) unless there is a reason to capitalize it other than
  because it is the first word in the sentence.  E.g. "doc: clarify...",
  not "doc: Clarify...", or "githooks.txt: improve...", not
  "githooks.txt: Improve...".

If this is touching the documentation, it should have the "docs: "
prefix, or "doc: ".

[1] https://git-scm.com/docs/SubmittingPatches
diff mbox series

Patch

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 39bfbca1ff..076a056a72 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -120,20 +120,19 @@  repository upon 'git add' and 'git commit'.
 `text`
 ^^^^^^
 
-This attribute enables and controls end-of-line normalization.  When a
-text file is normalized, its line endings are converted to LF in the
-repository.  To control what line ending style is used in the working
-directory, use the `eol` attribute for a single file and the
-`core.eol` configuration variable for all text files.
-Note that setting `core.autocrlf` to `true` or `input` overrides
-`core.eol` (see the definitions of those options in
-linkgit:git-config[1]).
+This attribute marks the path as a text file, which enables end-of-line
+conversion: When a matching file is added to the index, the file's line
+endings are normalized to LF in the index.  Conversely, when the file is
+copied from the index to the working directory, its line endings may be
+converted from LF to CRLF depending on the `eol` attribute, the Git
+config, and the platform (see explanation of `eol` below).
 
 Set::
 
 	Setting the `text` attribute on a path enables end-of-line
-	normalization and marks the path as a text file.  End-of-line
-	conversion takes place without guessing the content type.
+	conversion on checkin and checkout as described above.  Line endings
+	are normalized to LF in the index every time the file is checked in,
+	even if the file was previously added to Git with CRLF line endings.
 
 Unset::
 
@@ -142,10 +141,11 @@  Unset::
 
 Set to string value "auto"::
 
-	When `text` is set to "auto", the path is marked for automatic
-	end-of-line conversion.  If Git decides that the content is
-	text, its line endings are converted to LF on checkin.
-	When the file has been committed with CRLF, no conversion is done.
+	When `text` is set to "auto", Git decides by itself whether the file
+	is text or binary.  If it is text and the file was not already in
+	Git with CRLF endings, line endings are converted on checkin and
+	checkout as described above.  Otherwise, no conversion is done on
+	checkin or checkout.
 
 Unspecified::
 
@@ -159,26 +159,30 @@  unspecified.
 `eol`
 ^^^^^
 
-This attribute sets a specific line-ending style to be used in the
-working directory.  This attribute has effect only if the `text`
-attribute is set or unspecified, or if it is set to `auto`, the file is
-detected as text, and it is stored with LF endings in the index.  Note
-that setting this attribute on paths which are in the index with CRLF
-line endings may make the paths to be considered dirty unless
-`text=auto` is set. Adding the path to the index again will normalize
-the line endings in the index.
+This attribute marks a path to use a specific line-ending style in the
+working tree when it is checked out.  This attribute has effect only if
+the `text` attribute is set or unspecified, or if it is set to `auto`,
+the file is detected as text, and it is stored with LF endings in the
+index.
 
 Set to string value "crlf"::
 
-	This setting forces Git to normalize line endings for this
-	file on checkin and convert them to CRLF when the file is
-	checked out.
+	This setting converts the file's line endings in the working
+	directory to CRLF when the file is checked out.
 
 Set to string value "lf"::
 
-	This setting forces Git to normalize line endings to LF on
-	checkin and prevents conversion to CRLF when the file is
-	checked out.
+	This setting uses the same line endings in the working directory as
+	in the index when the file is checked out.
+
+Unspecified::
+
+	If the `eol` attribute is unspecified for a file, its line endings
+	in the working directory are determined by the `core.autocrlf` or
+	`core.eol` configuration variable (see the definitions of those
+	options in linkgit:git-config[1]).  The default if `text` is set but
+	neither of those variables is is `eol=lf` on Unix and `eol=crlf` on
+	Windows.
 
 Backwards compatibility with `crlf` attribute
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^