[v5,12/15] git-p4: p4CmdList - support Unicode encoding

Message ID e97ac0af8a33bc55c32b96d76136aef106d7b337.1575740863.git.gitgitgadget@gmail.com
State New
  • git-p4.py: Cast byte strings to unicode strings in python3

Commit Message

Matthew Rogers via GitGitGadget Dec. 7, 2019, 5:47 p.m. UTC
From: Ben Keene <seraphire@gmail.com>

The p4CmdList is a commonly used function in the git-p4 code. It is used
to execute a command in P4 and return the results of the call in a list.

The problem is that p4CmdList takes its input data as bytes and returns
bytes in the dictionaries of its result list.

Add a new optional parameter to the signature, encode_cmd_output, that
determines if the dictionary values returned in the function output are
treated as bytes or as strings.

Change the code to conditionally pass the output data through the
as_string() function when encode_cmd_output is true. Otherwise the
function should return the data as bytes.

Change the code so that, regardless of the setting of encode_cmd_output,
the dictionary keys in the return value are always encoded with
as_string().
as_string(bytes) is a function defined in this project that treats the
byte data as a string. The word "string" is used because the meaning
varies depending on the version of Python:

  - Python 2: The "bytes" are returned as "str", functionally a no-op.
  - Python 3: The "bytes" are returned as a Unicode string.
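
The version-dependent behavior listed above can be sketched roughly as
follows. This is an illustration only: the real as_string() is defined
elsewhere in this series and is not shown in this patch, and the choice
of UTF-8 for decoding is an assumption.

```python
import sys


def as_string(data):
    """Illustrative stand-in for the project's as_string() helper."""
    if sys.version_info[0] >= 3 and isinstance(data, bytes):
        # Assumption: the bytes hold UTF-8 encoded text.
        return data.decode('utf-8')
    # Python 2, or a value that is already a str: return it unchanged.
    return data
```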

The p4CmdList function returns a list of dictionaries that contain
the result of the p4 command. If a callback (cb) is given, the
output of the p4 command is passed to the callback instead.
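
The per-entry conversion performed before a dictionary is added to the
result list can be sketched as below. The names convert_entry and the
sample record are hypothetical, and as_string() here is a simplified
stand-in that assumes UTF-8; the loop body mirrors the one added by this
patch.

```python
import sys


def as_string(data):
    """Simplified stand-in for the project's helper (assumes UTF-8)."""
    if sys.version_info[0] >= 3 and isinstance(data, bytes):
        return data.decode('utf-8')
    return data


def convert_entry(entry, encode_cmd_output=True):
    """Hypothetical helper: keys are always converted, values only on request."""
    out = {}
    for key, value in entry.items():
        out[as_string(key)] = as_string(value) if encode_cmd_output else value
    return out


# Hypothetical record, shaped like one dictionary read from "p4 -G".
entry = {b'code': b'stat', b'depotFile': b'//depot/main/a.txt'}
decoded = convert_entry(entry)
raw = convert_entry(entry, encode_cmd_output=False)
```

Under Python 3, a caller that passes encode_cmd_output=False can still
look up keys with ordinary strings, but must compare the values against
bytes literals.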

Data that is passed to the standard input of the P4 process should be
passed through as_bytes() first to avoid Unicode encoding errors.

as_bytes(text) is a function defined in this project that treats the text
data as a string that should be converted to a byte array (bytes). The
behavior of this function depends on the version of Python:

  - Python 2: The "text" is returned as "str", functionally a no-op.
  - Python 3: The "text" is treated as a Unicode string and is
        encoded to bytes as UTF-8.
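
A rough sketch of this behavior, together with the way the stdin lines
are written to a binary temporary file, is shown below. The real
as_bytes() is defined elsewhere in this series; this stand-in and the
helper write_stdin_lines are hypothetical and assume UTF-8.

```python
import sys
import tempfile


def as_bytes(text):
    """Illustrative stand-in for the project's as_bytes() helper."""
    if sys.version_info[0] >= 3 and isinstance(text, str):
        # Assumption: text is encoded as UTF-8.
        return text.encode('utf-8')
    # Python 2, or a value that is already bytes: return it unchanged.
    return text


def write_stdin_lines(lines):
    """Hypothetical helper: write one line per item to a 'w+b' temp file."""
    stdin_file = tempfile.TemporaryFile(mode='w+b')
    for line in lines:
        # Both operands are bytes, so this works in binary mode.
        stdin_file.write(as_bytes(line) + b'\n')
    stdin_file.flush()
    stdin_file.seek(0)
    return stdin_file
```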

Additionally, change the string literals in the code that evaluates the
standard output of the p4 call, before conversion, to be bytes
literals.
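
The reason bytes literals are needed can be seen with a small
experiment. The record below is made up and is marshalled locally rather
than read from a real p4 process, but it has the same shape as the
dictionaries that marshal.load() yields from "p4 -G" output.

```python
import marshal

# Simulate one record: a marshalled dictionary of byte strings.
raw = marshal.dumps({b'code': b'info', b'data': b'some message'})
entry = marshal.loads(raw)

# Check the entry before any conversion, as the skip_info test does.
is_info_bytes = b'code' in entry and entry[b'code'] == b'info'

# The same check with text literals; under Python 3 the key is not found.
is_info_text = 'code' in entry and entry['code'] == 'info'
```

Under Python 3 only the bytes comparison matches; under Python 2 both
do, because "str" and bytes are the same type there.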

Add encode_cmd_output to p4Cmd as well, since it is a helper function
that wraps p4CmdList.

Signed-off-by: Ben Keene <seraphire@gmail.com>
 git-p4.py | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)


diff --git a/git-p4.py b/git-p4.py
index 03829f796d..e8f31339e4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -716,7 +716,23 @@  def isModeExecChanged(src_mode, dst_mode):
     return isModeExec(src_mode) != isModeExec(dst_mode)
 def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
-        errors_as_exceptions=False):
+        errors_as_exceptions=False, encode_cmd_output=True):
+    """ Executes a P4 command:  'cmd' optionally passing 'stdin' to the command's
+        standard input via a temporary file with 'stdin_mode' mode.
+        Output from the command is optionally passed to the callback function 'cb'.
+        If 'cb' is None, the response from the command is parsed into a list
+        of resulting dictionaries. (For each block read from the process pipe.)
+        If 'skip_info' is true, information in a block read that has a code type of
+        'info' will be skipped.
+        If 'errors_as_exceptions' is set to true (the default is false) the error
+        code returned from the execution will generate an exception.
+        If 'encode_cmd_output' is set to true (the default) the data that is returned
+        by this function will be passed through the "as_string" function.
+    """
     if not isinstance(cmd, list):
         cmd = "-G " + cmd
@@ -739,7 +755,7 @@  def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
             for i in stdin:
-                stdin_file.write(i + '\n')
+                stdin_file.write(as_bytes(i) + b'\n')
@@ -753,12 +769,15 @@  def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
         while True:
             entry = marshal.load(p4.stdout)
             if skip_info:
-                if 'code' in entry and entry['code'] == 'info':
+                if b'code' in entry and entry[b'code'] == b'info':
             if cb is not None:
-                result.append(entry)
+                out = {}
+                for key, value in entry.items():
+                    out[as_string(key)] = (as_string(value) if encode_cmd_output else value)
+                result.append(out)
     except EOFError:
     exitCode = p4.wait()
@@ -785,8 +804,9 @@  def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
     return result
-def p4Cmd(cmd):
-    list = p4CmdList(cmd)
+def p4Cmd(cmd, encode_cmd_output=True):
+    """Executes a P4 command and returns the results in a dictionary"""
+    list = p4CmdList(cmd, encode_cmd_output=encode_cmd_output)
     result = {}
     for entry in list:
@@ -1165,7 +1185,7 @@  def getClientSpec():
     """Look at the p4 client spec, create a View() object that contains
        all the mappings, and return it."""
-    specList = p4CmdList("client -o")
+    specList = p4CmdList("client -o", encode_cmd_output=False)
     if len(specList) != 1:
         die('Output from "client -o" is %d lines, expecting 1' %
@@ -2609,7 +2629,7 @@  def update_client_spec_path_cache(self, files):
         if len(fileArgs) == 0:
             return  # All files in cache
-        where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs)
+        where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs, encode_cmd_output=False)
         for res in where_result:
             if "code" in res and res["code"] == "error":
                 # assume error is "... file(s) not in client view"