Message ID | 20230421055641.550199-1-alexhenrie24@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | docs: clarify how the text and text=auto attributes are different | expand |
Alex Henrie <alexhenrie24@gmail.com> writes: > These two sentences are confusing because the description of the text > attribute sounds exactly the same as the description of the text=auto > attribute: > > "Setting the text attribute on a path enables end-of-line normalization" > > "When text is set to "auto", the path is marked for automatic > end-of-line conversion" > > Unless the reader is already familiar with the two variants, there's a > high probability that they will think that "end-of-line normalization" > is the same thing as "automatic end-of-line conversion". True. The prerequisite to understand these two seemingly contradicting sentences is to know that "attribute" can be "set" (in the sense of Boolean "yes") and can be "set to value" (here in this case, to the string "auto"). They are treated differently. It is very good that you decided to clarify it. > It's also confusing that the explanation of how end-of-line conversion > works is in the paragraph for text=auto even though it applies equally > to the text attribute which is described earlier. Good observation. Instead of half-duplicating the same explanation in two places, it is very good to do so just once, in the introductory description of the 'text' attribute before the individual description of what happens when the attribute is in any of the four states. > On top of that, "When the file has been committed with CRLF, no > conversion is done" implies that normalization is only suppressed if the > file has been committed. In fact, running `git add` on a CRLF file, > adding the text attribute to the file, and running `git add` again does > not do anything to the line endings either. True again. The conversion happens between the index and the working tree, and "committing" does not have much say in the process. Interestingly, the description of what happens when the attribute is "Unset" (in the sense of setting it to Boolean "no") uses a better terminology---"checkin or checkout", which is about getting things out of or into the index. It is very good that you decided to clarify it. > Rephrase the documentation of text and text=auto to be clear about how > they are the same, how they are different, and in what cases > normalization is performed. Thanks. > Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> > --- > Documentation/gitattributes.txt | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt > index 39bfbca1ff..6db4ecd794 100644 > --- a/Documentation/gitattributes.txt > +++ b/Documentation/gitattributes.txt > @@ -131,9 +131,12 @@ linkgit:git-config[1]). > > Set:: > > - Setting the `text` attribute on a path enables end-of-line > - normalization and marks the path as a text file. End-of-line > - conversion takes place without guessing the content type. > + Setting the `text` attribute on a path marks the path as a text > + file, which enables end-of-line normalization: When a matching file > + is added to the index, even if it has CRLF line endings in the > + working directory, the file is stored in Git with LF line endings. > + However, if the file was already in Git with CRLF endings, no > + conversion is done. I think most of the new text should go to the paragraph before this hunk (i.e. "... its line endings are converted to ..."), where not just CRLF->LF conversion is described, but how it does and does not happen depending on the eol, core.eol, and core.autocrlf are set. > Unset:: > > @@ -142,10 +145,9 @@ Unset:: > > Set to string value "auto":: > > - When `text` is set to "auto", the path is marked for automatic > - end-of-line conversion. If Git decides that the content is > - text, its line endings are converted to LF on checkin. > - When the file has been committed with CRLF, no conversion is done. > + When text is set to "auto", Git decides by itself whether the file > + is text or binary. If it is text, line endings are converted as > + described above. If it is binary, they are not. This side is good. "As described above" can still be used as-is even after the explanation of eol conversion is made before we start enumerating Set/Unset/set to auto/unspecified. Thanks.
On Thu, Apr 20, 2023 at 11:56:41PM -0600, Alex Henrie wrote: Thanks for that, fully agree with comments from Junio. > Set:: > > - Setting the `text` attribute on a path enables end-of-line > - normalization and marks the path as a text file. End-of-line > - conversion takes place without guessing the content type. > + Setting the `text` attribute on a path marks the path as a text > + file, which enables end-of-line normalization: When a matching file > + is added to the index, even if it has CRLF line endings in the > + working directory, the file is stored in Git with LF line endings. Hm, I stumbled a little bit over the word "even" here. But it makes things very clear, so this is good. > + However, if the file was already in Git with CRLF endings, no > + conversion is done. But I think that this is wrong, unless I missed something. Using *.txt text in .gitattributes will convert CRLF into LF in the repo at the next git commit/git add When you create a file with CRLF (no atttibutes), commit it into the repo, and later add the attribute "text", the file may be reported as modified. Not always. A `touch file.txt` typically asks Git to re-investigate things, and it may show up as modified. After a `git clone` it may be shown as modified or not. End of side-note. Setting > > Unset:: > > @@ -142,10 +145,9 @@ Unset:: > > Set to string value "auto":: > > - When `text` is set to "auto", the path is marked for automatic > - end-of-line conversion. If Git decides that the content is > - text, its line endings are converted to LF on checkin. > - When the file has been committed with CRLF, no conversion is done. > + When text is set to "auto", Git decides by itself whether the file > + is text or binary. If it is text, line endings are converted as > + described above. If it is binary, they are not. Here we need to have the lines from above: When text is set to "auto", Git decides by itself whether the file is text or binary. If it is text, line endings are converted as described above. If it is binary, they are not. However, if the file was already in Git with CRLF endings, no conversion is done. I hope that I didn't messed up while reading this patch ? Thanks again for bringing this up
On Fri, Apr 21, 2023 at 12:26 PM Torsten Bögershausen <tboegi@web.de> wrote: > Using > *.txt text > in .gitattributes > will convert CRLF into LF in the repo at the next git commit/git add > > When you create a file with CRLF (no atttibutes), commit it into the repo, > and later add the attribute "text", the file may be reported as modified. > Not always. A `touch file.txt` typically asks Git to re-investigate things, > and it may show up as modified. > After a `git clone` it may be shown as modified or not. > End of side-note. Thank you very much for pointing that out. I had assumed that "text" and "text=auto" behaved the same for text files, but they do not: "text" normalizes line endings even if the file is already in Git. That means that my proposed text is not correct, and I will have to change it similarly to how you suggested below. > > Set to string value "auto":: > > > > - When `text` is set to "auto", the path is marked for automatic > > - end-of-line conversion. If Git decides that the content is > > - text, its line endings are converted to LF on checkin. > > - When the file has been committed with CRLF, no conversion is done. > > + When text is set to "auto", Git decides by itself whether the file > > + is text or binary. If it is text, line endings are converted as > > + described above. If it is binary, they are not. > > Here we need to have the lines from above: > > When text is set to "auto", Git decides by itself whether the file > is text or binary. If it is text, line endings are converted as > described above. If it is binary, they are not. > However, if the file was already in Git with CRLF endings, no > conversion is done. Yes, the note about what happens if the file is already in Git with CRLF endings applies specifically to text=auto, so it needs to be in the text=auto section. The documentation of the eol attribute actually explains it more clearly: "Note that setting this attribute on paths which are in the index with CRLF line endings may make the paths to be considered dirty unless text=auto is set. Adding the path to the index again will normalize the line endings in the index." I think that explanation needs to be moved from the eol attribute documentation to the text attribute documentation and edited for clarity. I will send a v2 that revises the documentation of both attributes, taking into account your and Junio's feedback. Thanks again, -Alex
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index 39bfbca1ff..6db4ecd794 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -131,9 +131,12 @@ linkgit:git-config[1]). Set:: - Setting the `text` attribute on a path enables end-of-line - normalization and marks the path as a text file. End-of-line - conversion takes place without guessing the content type. + Setting the `text` attribute on a path marks the path as a text + file, which enables end-of-line normalization: When a matching file + is added to the index, even if it has CRLF line endings in the + working directory, the file is stored in Git with LF line endings. + However, if the file was already in Git with CRLF endings, no + conversion is done. Unset:: @@ -142,10 +145,9 @@ Unset:: Set to string value "auto":: - When `text` is set to "auto", the path is marked for automatic - end-of-line conversion. If Git decides that the content is - text, its line endings are converted to LF on checkin. - When the file has been committed with CRLF, no conversion is done. + When text is set to "auto", Git decides by itself whether the file + is text or binary. If it is text, line endings are converted as + described above. If it is binary, they are not. Unspecified::
These two sentences are confusing because the description of the text attribute sounds exactly the same as the description of the text=auto attribute: "Setting the text attribute on a path enables end-of-line normalization" "When text is set to "auto", the path is marked for automatic end-of-line conversion" Unless the reader is already familiar with the two variants, there's a high probability that they will think that "end-of-line normalization" is the same thing as "automatic end-of-line conversion". It's also confusing that the explanation of how end-of-line conversion works is in the paragraph for text=auto even though it applies equally to the text attribute which is described earlier. On top of that, "When the file has been committed with CRLF, no conversion is done" implies that normalization is only suppressed if the file has been committed. In fact, running `git add` on a CRLF file, adding the text attribute to the file, and running `git add` again does not do anything to the line endings either. Rephrase the documentation of text and text=auto to be clear about how they are the same, how they are different, and in what cases normalization is performed. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> --- Documentation/gitattributes.txt | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)