[v2] git-p4: fix CR LF handling for utf16 files

Message ID pull.1294.v2.git.git.1658341065221.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 4d35f744219335d8b32df363891b6336dcf02a6e

Commit Message

Baumann, Moritz July 20, 2022, 6:17 p.m. UTC
From: Moritz Baumann <moritz.baumann@sap.com>

Perforce silently replaces LF with CR LF for "utf16" files if the client
is a native Windows client. Since git's autocrlf logic does not undo
this transformation for UTF-16 encoded files, git-p4 replaces CR LF with
LF during the sync if the file type "utf16" is detected and the Perforce
client platform indicates that this conversion is performed.

Windows only runs on little-endian architectures, therefore the encoding
of the byte stream received from the Perforce client is UTF-16-LE and
the relevant byte sequence is 0D 00 0A 00.

Signed-off-by: Moritz Baumann <moritz.baumann@sap.com>
---
    git-p4: fix crlf handling for utf16 files on Windows

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1294%2Fmbs-c%2Ffix-crlf-conversion-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1294/mbs-c/fix-crlf-conversion-v2
Pull-Request: https://github.com/git/git/pull/1294

Range-diff vs v1:

 1:  4a7a14eec28 ! 1:  4d0043712d3 git-p4: fix crlf handling for utf16 files on Windows
     @@ Metadata
      Author: Moritz Baumann <moritz.baumann@sap.com>
      
       ## Commit message ##
     -    git-p4: fix crlf handling for utf16 files on Windows
     +    git-p4: fix CR LF handling for utf16 files
     +
     +    Perforce silently replaces LF with CR LF for "utf16" files if the client
     +    is a native Windows client. Since git's autocrlf logic does not undo
     +    this transformation for UTF-16 encoded files, git-p4 replaces CR LF with
     +    LF during the sync if the file type "utf16" is detected and the Perforce
     +    client platform indicates that this conversion is performed.
     +
     +    Windows only runs on little-endian architectures, therefore the encoding
     +    of the byte stream received from the Perforce client is UTF-16-LE and
     +    the relevant byte sequence is 0D 00 0A 00.
      
          Signed-off-by: Moritz Baumann <moritz.baumann@sap.com>
      


 git-p4.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: bbea4dcf42b28eb7ce64a6306cdde875ae5d09ca

Comments

Junio C Hamano July 20, 2022, 6:42 p.m. UTC | #1
"Moritz Baumann via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Moritz Baumann <moritz.baumann@sap.com>
>
> Perforce silently replaces LF with CR LF for "utf16" files if the client
> is a native Windows client. Since git's autocrlf logic does not undo
> this transformation for UTF-16 encoded files, git-p4 replaces CR LF with
> LF during the sync if the file type "utf16" is detected and the Perforce
> client platform indicates that this conversion is performed.
>
> Windows only runs on little-endian architectures, therefore the encoding
> of the byte stream received from the Perforce client is UTF-16-LE and
> the relevant byte sequence is 0D 00 0A 00.
>
> Signed-off-by: Moritz Baumann <moritz.baumann@sap.com>
> ---

Will queue.  Thanks.

Patch

diff --git a/git-p4.py b/git-p4.py
index 8fbf6eb1fe3..0a9d7e2ed7c 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -3148,7 +3148,7 @@  class P4Sync(Command, P4UserMap):
                     raise e
             else:
                 if p4_version_string().find('/NT') >= 0:
-                    text = text.replace(b'\r\n', b'\n')
+                    text = text.replace(b'\x0d\x00\x0a\x00', b'\x0a\x00')
                 contents = [text]
 
         if type_base == "apple":
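
The effect of the one-line change in the hunk above can be checked in isolation. The following is a minimal sketch, not code from git-p4.py: the helper name strip_utf16le_cr and the sample string are hypothetical, and the sample is assumed to be UTF-16-LE without a BOM. It shows that the old two-byte pattern never matches UTF-16-LE data, because every CR and LF is followed by a 00 byte, while the new four-byte pattern does.

```python
# Hypothetical helper for illustration only; git-p4.py performs the
# replacement inline on the bytes received from the Perforce client.
def strip_utf16le_cr(data: bytes) -> bytes:
    """Replace UTF-16-LE encoded CR LF (0D 00 0A 00) with LF (0A 00)."""
    return data.replace(b'\x0d\x00\x0a\x00', b'\x0a\x00')


# Assumed sample: two lines with Windows line endings, encoded as UTF-16-LE.
sample = "a\r\nb\r\n".encode("utf-16-le")

# Old behaviour: the ASCII pattern 0D 0A does not occur in the byte stream,
# so the data comes back unchanged.
old_result = sample.replace(b'\r\n', b'\n')

# New behaviour: the UTF-16-LE pattern matches and the CRs are dropped.
new_result = strip_utf16le_cr(sample)

print(old_result == sample)
print(new_result.decode("utf-16-le") == "a\nb\n")
```

Both comparisons should evaluate to true for this sample. The sketch covers only the little-endian case the commit message describes.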