From patchwork Wed Dec 4 22:29:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273673 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC4AE13B6 for ; Wed, 4 Dec 2019 22:29:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A8D702073C for ; Wed, 4 Dec 2019 22:29:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="f8/KUmJZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728343AbfLDW3m (ORCPT ); Wed, 4 Dec 2019 17:29:42 -0500 Received: from mail-wm1-f66.google.com ([209.85.128.66]:39380 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727989AbfLDW3m (ORCPT ); Wed, 4 Dec 2019 17:29:42 -0500 Received: by mail-wm1-f66.google.com with SMTP id s14so1522370wmh.4 for ; Wed, 04 Dec 2019 14:29:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=XJqUfapH7odAB2bWQ1ubCrdJoAcc5L6/ywib7uw6o9w=; b=f8/KUmJZOgqH0/V7veTqjNDeOQf3/3G6z4Gbstc3/35tRHLpN04Sz667YI4c7quP6I 6cfArLwOcYUMZvMBgV8sUd2xUPdQn4fZiNeRON96qmEy45lRBHHLrIsfXAy8j/RTH2gq eKpG15GLAIwMAgRU6W39EyYYAe/m4TO+iRYQjs0dQTTZrT3EbBBrtDIDxdsbfzNZB1kz xJjuiqxxwQc4tZ12G2Lkj30UVEBgFgednFcRCOzQXx/ToVM8fFZ/w9zwVLIhaI7MLyvt +mcD17Y+lX2jKLsZNnEwUrJ1cekYUPkh+/K6DGEdpOQ9yzgx+RJoKeVtVGkKJri41VnB VMvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; 
bh=XJqUfapH7odAB2bWQ1ubCrdJoAcc5L6/ywib7uw6o9w=; b=UgPVzSAQ8l3Sec4QBLYHxXzEyQ9Fyw28D7pnT7AmVvtK9oqV3c0qnZv9GqVH+/Iqre 8barUBIN+VXgHojnZ8svPXeHblmyRc7ZrAn8E7GiY0TxM/R2JmE4NceqQ+75elGQK1R/ StgzUYuRLAv/upiAcF1qjrGHRekenpm+YL8Wnf90rS4M7baRTzoRBcm291tOUELEGymA V3ZOh35OLCuGkfhv8cRwyPqtKaZSau6vuv6RJ1p1QUV9qwM9Jls4ba7/5FHeDcvpxR7W U616X0yVUy++odpXNX0/eq6gJePgGxon9LkmrfXN7M284us52VX+geF5yLe0oo30Xabc IjZA== X-Gm-Message-State: APjAAAUvjDl1UvhZ4WX7ED1RyeK3uZuejJjwE/RwnJCEnfoXKUU0mAR7 ZouMXod0ovXYjBvK3XC4OIT13YpF X-Google-Smtp-Source: APXvYqwWqrQsChu08h9XvF0QBzWEaQM8bvpG3bkJaTusPWMGq94hp3818QaBa2K+2wI5pezVuoXOjQ== X-Received: by 2002:a7b:c444:: with SMTP id l4mr1841060wmi.178.1575498580354; Wed, 04 Dec 2019 14:29:40 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a7sm10227070wrr.50.2019.12.04.14.29.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:40 -0800 (PST) Message-Id: <40124269933691796ef57fd8df50f9e740d103b1.1575498577.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:27 +0000 Subject: [PATCH v4 01/11] git-p4: select p4 binary by operating-system Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene

Depending on the versions of Git and Python installed, the Perforce program (p4) may not resolve on Windows without its file extension.

Check the operating system with platform.system() and, if it reports "Windows", use the full filename "p4.exe" instead of "p4".

The original code unconditionally used "p4" as the binary filename.

This change is compatible with both Python 2 and Python 3.

Thanks to Junio C Hamano and Denton Liu for patiently explaining the proper format for my submissions.
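As a rough illustration of the selection logic described above (not the patch itself), the sketch below picks the executable name from platform.system(). The helper name p4_binary_name and its system_name parameter are hypothetical, introduced only so that both branches can be exercised on any host.

```python
import platform


def p4_binary_name(system_name=None):
    """Return the p4 executable name for the given (or current) OS.

    Hypothetical helper for illustration only; the patch inlines the
    equivalent check inside p4_build_cmd().
    """
    if system_name is None:
        system_name = platform.system()
    # platform.system() reports "Windows" on Windows hosts.
    if system_name == "Windows":
        return "p4.exe"
    return "p4"


# The command list starts with the selected binary name.
real_cmd = [p4_binary_name()]
```

Passing the system name explicitly is only a testing convenience; in normal use the helper would be called without arguments.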
Signed-off-by: Ben Keene (cherry picked from commit 9a3a5c4e6d29dbef670072a9605c7a82b3729434) --- git-p4.py | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/git-p4.py b/git-p4.py index 60c73b6a37..b2ffbc057b 100755 --- a/git-p4.py +++ b/git-p4.py @@ -75,7 +75,11 @@ def p4_build_cmd(cmd): location. It means that hooking into the environment, or other configuration can be done more easily. """ - real_cmd = ["p4"] + # Look for the P4 binary + if (platform.system() == "Windows"): + real_cmd = ["p4.exe"] + else: + real_cmd = ["p4"] user = gitConfig("git-p4.user") if len(user) > 0: From patchwork Wed Dec 4 22:29:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273675 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E46C5930 for ; Wed, 4 Dec 2019 22:29:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C29D6206DB for ; Wed, 4 Dec 2019 22:29:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="hkBieDct" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728379AbfLDW3o (ORCPT ); Wed, 4 Dec 2019 17:29:44 -0500 Received: from mail-wm1-f43.google.com ([209.85.128.43]:52713 "EHLO mail-wm1-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728121AbfLDW3o (ORCPT ); Wed, 4 Dec 2019 17:29:44 -0500 Received: by mail-wm1-f43.google.com with SMTP id p9so1482848wmc.2 for ; Wed, 04 Dec 2019 14:29:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=JcaZgyysud3FVYO9lXHt87MA5RbnQXWDEH6oMQULAD0=; b=hkBieDctosvHMaNWd5LVissrmI7M5Kwnxi/pUpj84sFoq1EHr5YRj9xCeJbo78qt3d u8FHjXcA2vWevrM1vvYHyjUGO8XwwL4tSreiS3PcwD57gW5EwGY7htn8yFOv20LdDD1r mtX7+/OzNm3/abS5pntmp/jeS7f8yULtK6ooDPHV7bOhal7XD7Y/Cl36d8zmymsKieKF wGBWYpu+g336DshFOyoHS3bKGK8V4o1aXituoaRxayebFedPLTG9aBEgDIbJRyJawQei Ah5YEa6N1DyQ0RjoIOSP7hPd6nXLCLvayvOIuc9O2UziXxSK2JNJHv732/cku54SVlD/ FoGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=JcaZgyysud3FVYO9lXHt87MA5RbnQXWDEH6oMQULAD0=; b=qVUkW6GM5m6iauWmGGpUYKwDlHyC7xX7A1rVae2jQXGCFjqozUPnYTrbKLqhpUZK5j vfrNOMe4gQwnZlIPaebw+NVEisn8afc6+bZjYfXrOqFisQ1NI06EjWoZuK10wRh7IQWE w282mB8M+oeL18CvwgBzvyq/rTkOhHj2SeS8vUnU6pBfstQcv/lWOQGPj9E436PPUWs3 bgdaVAfCepl+cjnLDyifu4wJwfQFR04YFvNiEMmxPAYJmbJx7PXGyLk7EDf05rUernC6 ZktPQQZQKKymOpyrIbG3lPNwaYi2/jSzUqEEt6GIU/WPw16pKwRh0e3W6BaBTczwgNlN Ul7Q== X-Gm-Message-State: APjAAAXmdUyOxtS46vhr5oQE7WQPD9k43w70suJOtC3UyPkl7bfp2COQ BRP033DYQOv4OHkylVQl/5QxFAiO X-Google-Smtp-Source: APXvYqxSrNUCCEyxT3d33L1gxQggq99D5xnQrCzORY7inLxsyQPPM0/o4SgRiSGm59yQE9HVCKHbmQ== X-Received: by 2002:a1c:7306:: with SMTP id d6mr2009290wmb.164.1575498581177; Wed, 04 Dec 2019 14:29:41 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n30sm8156843wmd.3.2019.12.04.14.29.40 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:40 -0800 (PST) Message-Id: <0ef2f56b04803cad2e60bf881e86d8bdd69463a6.1575498577.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:28 +0000 Subject: [PATCH v4 02/11] git-p4: change the expansion test from basestring to list Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: 
X-Mailing-List: git@vger.kernel.org From: Ben Keene

Python 3 handles strings differently than Python 2.7. Since Python 2 is reaching its end of life, a series of changes is being submitted to enable Python 3.7+ support. The current code fails basic tests under Python 3.7.

Change the isinstance tests that reference basestring to test against list instead. This prepares the code for the removal of all references to basestring.

The original code used basestring in nine places to determine whether a list or a literal string was passed in. The result decides whether the shell should be invoked when calling subprocess methods.

Signed-off-by: Ben Keene (cherry picked from commit 5b1b1c145479b5d5fd242122737a3134890409e6) --- git-p4.py | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/git-p4.py b/git-p4.py index b2ffbc057b..0f27996393 100755 --- a/git-p4.py +++ b/git-p4.py @@ -109,7 +109,7 @@ def p4_build_cmd(cmd): # Provide a way to not pass this option by setting git-p4.retries to 0 real_cmd += ["-r", str(retries)] - if isinstance(cmd,basestring): + if not isinstance(cmd, list): real_cmd = ' '.join(real_cmd) + ' ' + cmd else: real_cmd += cmd @@ -175,7 +175,7 @@ def write_pipe(c, stdin): if verbose: sys.stderr.write('Writing pipe: %s\n' % str(c)) - expand = isinstance(c,basestring) + expand = not isinstance(c, list) p = subprocess.Popen(c, stdin=subprocess.PIPE, shell=expand) pipe = p.stdin val = pipe.write(stdin) @@ -197,7 +197,7 @@ def read_pipe_full(c): if verbose: sys.stderr.write('Reading pipe: %s\n' % str(c)) - expand = isinstance(c,basestring) + expand = not isinstance(c, list) p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=expand) (out, err) = p.communicate() return (p.returncode, out, err) @@ -233,7 +233,7 @@ def read_pipe_lines(c): if verbose: sys.stderr.write('Reading pipe: %s\n' % str(c)) - expand = isinstance(c, basestring) + expand = not isinstance(c, list) p = subprocess.Popen(c, 
stdout=subprocess.PIPE, shell=expand) pipe = p.stdout val = pipe.readlines() @@ -276,7 +276,7 @@ def p4_has_move_command(): return True def system(cmd, ignore_error=False): - expand = isinstance(cmd,basestring) + expand = not isinstance(cmd, list) if verbose: sys.stderr.write("executing %s\n" % str(cmd)) retcode = subprocess.call(cmd, shell=expand) @@ -288,7 +288,7 @@ def system(cmd, ignore_error=False): def p4_system(cmd): """Specifically invoke p4 as the system command. """ real_cmd = p4_build_cmd(cmd) - expand = isinstance(real_cmd, basestring) + expand = not isinstance(real_cmd, list) retcode = subprocess.call(real_cmd, shell=expand) if retcode: raise CalledProcessError(retcode, real_cmd) @@ -526,7 +526,7 @@ def getP4OpenedType(file): # Return the set of all p4 labels def getP4Labels(depotPaths): labels = set() - if isinstance(depotPaths,basestring): + if not isinstance(depotPaths, list): depotPaths = [depotPaths] for l in p4CmdList(["labels"] + ["%s..." % p for p in depotPaths]): @@ -613,7 +613,7 @@ def isModeExecChanged(src_mode, dst_mode): def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, errors_as_exceptions=False): - if isinstance(cmd,basestring): + if not isinstance(cmd, list): cmd = "-G " + cmd expand = True else: @@ -630,7 +630,7 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, stdin_file = None if stdin is not None: stdin_file = tempfile.TemporaryFile(prefix='p4-stdin', mode=stdin_mode) - if isinstance(stdin,basestring): + if not isinstance(stdin, list): stdin_file.write(stdin) else: for i in stdin: From patchwork Wed Dec 4 22:29:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273677 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6EF6E13B6 for ; Wed, 4 Dec 2019 
22:29:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 439E52073C for ; Wed, 4 Dec 2019 22:29:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OLYDdcFD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728401AbfLDW3q (ORCPT ); Wed, 4 Dec 2019 17:29:46 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:40571 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728372AbfLDW3o (ORCPT ); Wed, 4 Dec 2019 17:29:44 -0500 Received: by mail-wr1-f68.google.com with SMTP id c14so1109107wrn.7 for ; Wed, 04 Dec 2019 14:29:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=DzCCkTzgUyIYBFSfzj6iOR0jXzzfHfA2nkvr+WW+YTg=; b=OLYDdcFDY3eYTY+kyIJE+WcC81H2vJfV5GmabrM/60a7g5yQ+Ak1b2qy2HisIai2lz HfiTRX3u5iKUZA0ZBsJuMn5PtcpM0nmVhXlryDSDYtRqbb8UDGbW4a5SHK2xygoDjFj2 FNifTTNQ4GGZgq7ugvJpP+KuoCtcZ+5eeW4w3qmyeDV+8JUc1CwIyuivcCKUgG1U4vy5 ggmTKrmPyW4kMIX8Ro2gM0SmhQ+Ix1Ae0ZR90J2LUv4U44cqL5Vgu2Oa6hwnyAcADxDt BVsZsfPBc4Yoak4ETlF/3mLOthPq/RFTI2LZuT1uWw4EXz+QFiQavVgCuYFAmGFJCQwh w1Dw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=DzCCkTzgUyIYBFSfzj6iOR0jXzzfHfA2nkvr+WW+YTg=; b=GHLrqqjGi5aEI6bCGJmdQfdZB2Hfzd4anFC235grA4/swkKPG2CeaQmk120LGJQ5wK w6J2LhUnw1Yx99haNpwLYnJJVX3RD1jn30i3uek9ekyzA071sOWH0Rta4M1RAz9M4aeE rcF1JQpo3ZxOjYke/Mo/v8UVcbvs627drzDHStg8wROLT1ACqbQX3U9687xf4HT+pXRR Adwz+HxgsLbUPQ72Qlg7vkeeBylU3nattcziALWcza6GRF2lGxbFfzbLgyFJ9pv3LNwR ZE+VfoAzjevyxJ/tquHWqg8quWU/8grVNv33sqaKx0pTn5EXFR8ug8/S2sn+VWFLbZXN Hwlw== X-Gm-Message-State: 
APjAAAVm8dH7IzfI5X7jet7XlB0ZKvfWl9wzF94N+HmFfjxVyWSN9uHH ah9Urfw6LgWih2U2+/mbtXDxp/nK X-Google-Smtp-Source: APXvYqwuWk4wADZCTIgjkblSDgMklOH+iNWyCpa3DK84qNFHnhBvzC1kxwmY9i2Fdh3imNUOgZr/8w== X-Received: by 2002:a5d:50d2:: with SMTP id f18mr6564178wrt.366.1575498581993; Wed, 04 Dec 2019 14:29:41 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p5sm9495990wrt.79.2019.12.04.14.29.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:41 -0800 (PST) Message-Id: In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:29 +0000 Subject: [PATCH v4 03/11] git-p4: add new helper functions for python3 conversion Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene

Python 3 handles strings differently than Python 2.7. Since Python 2 is reaching its end of life, a series of changes is being submitted to enable Python 3.7+ support. The current code fails basic tests under Python 3.7.

Change the existing unicode test and add new support functions for Python 2 and Python 3 compatibility.

Define the following variables:

- isunicode - a boolean that states whether the running version of Python natively supports unicode. This is True for Python 3 and False for Python 2.
- unicode - a type alias for the datatype that holds a unicode string. It is assigned to str under Python 3 and to the unicode type under Python 2.
- bytes - a type alias for an array of bytes. It is assigned the native bytes type under Python 3 and str under Python 2.

Add the following new functions:

- as_string(text) - converts a byte array to a unicode string (decoded as UTF-8) under Python 3. Under Python 2, this returns the string unchanged.
- as_bytes(text) - converts a unicode string to a byte array (encoded as UTF-8) under Python 3. Under Python 2, this returns the string unchanged.
- to_unicode(text) - converts text to a unicode string (decoded as UTF-8) on both Python 2 and Python 3.
- path_as_string(path) - passes a path through encodeWithUTF8(). Under Python 3 the result is decoded to a unicode string; under Python 2 it stays as UTF-8 encoded bytes.

Add a new function alias raw_input: if raw_input does not exist (it was renamed to input in Python 3), alias input as raw_input.

The as_string() and as_bytes() functions allow the code to be modified with minimal impact on Python 2 support. When a string is expected, as_string() is used to convert ("cast") the incoming bytes to a string type. Conversely, as_bytes() is used to convert a string to a byte array type. Since Python 2 overloads the str datatype to serve both purposes, the Python 2 versions of these functions do not change the data.

basestring is removed since its only references were in tests that were changed in the previous change.

Signed-off-by: Ben Keene (cherry picked from commit 7921aeb3136b07643c1a503c2d9d8b5ada620356) --- git-p4.py | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 66 insertions(+), 4 deletions(-) diff --git a/git-p4.py b/git-p4.py index 0f27996393..93dfd0920a 100755 --- a/git-p4.py +++ b/git-p4.py @@ -32,16 +32,78 @@ unicode = unicode except NameError: # 'unicode' is undefined, must be Python 3 - str = str + # + # For Python3 which is natively unicode, we will use + # unicode for internal information but all P4 Data + # will remain in bytes + isunicode = True unicode = str bytes = bytes - basestring = (str,bytes) + + def as_string(text): + """Return a byte array as a unicode string""" + if text == None: + return None + if isinstance(text, bytes): + return unicode(text, "utf-8") + else: + return text + + def as_bytes(text): + """Return a Unicode string as a byte array""" + if text == None: + return None + if isinstance(text, bytes): + return text + else: + return bytes(text, "utf-8") + + def to_unicode(text): + """Return a byte array as a unicode string""" + return 
as_string(text) + + def path_as_string(path): + """ Converts a path to the UTF8 encoded string """ + if isinstance(path, unicode): + return path + return encodeWithUTF8(path).decode('utf-8') + else: # 'unicode' exists, must be Python 2 - str = str + # + # We will treat the data as: + # str -> str + # bytes -> str + # So for Python2 these functions are no-ops + # and will leave the data in the ambiguious + # string/bytes state + isunicode = False unicode = unicode bytes = str - basestring = basestring + + def as_string(text): + """ Return text unaltered (for Python3 support) """ + return text + + def as_bytes(text): + """ Return text unaltered (for Python3 support) """ + return text + + def to_unicode(text): + """Return a string as a unicode string""" + return text.decode('utf-8') + + def path_as_string(path): + """ Converts a path to the UTF8 encoded bytes """ + return encodeWithUTF8(path) + + + +# Check for raw_input support +try: + raw_input +except NameError: + raw_input = input try: from subprocess import CalledProcessError From patchwork Wed Dec 4 22:29:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273681 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4FDE159A for ; Wed, 4 Dec 2019 22:29:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B2FE22073C for ; Wed, 4 Dec 2019 22:29:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="P6ju5XOY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728422AbfLDW3t (ORCPT ); Wed, 4 Dec 2019 17:29:49 -0500 Received: from mail-wr1-f50.google.com ([209.85.221.50]:39350 "EHLO mail-wr1-f50.google.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727989AbfLDW3q (ORCPT ); Wed, 4 Dec 2019 17:29:46 -0500 Received: by mail-wr1-f50.google.com with SMTP id y11so1112166wrt.6 for ; Wed, 04 Dec 2019 14:29:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=R6auDpzwkB0Sojq7Ppg7Wt15Zp6cHNblRehr26rK1no=; b=P6ju5XOYIjCDNyBvTNLHZqfkbeuc/wOleSydi3mlknGStfQ2NJ+Z0vMJK07Wl1rbAH aUjuUQUK42yx9p0i/NAldOgpAn/Z1XTrlS38weI5+fe6v3jMtfGInbc5qmHaZHa/4yVe G1784wHTu43w3EuB52hO9vKqpt8ouL9Dcglvu1q9v73C3c3X76n8cZHmIW9/Aa3LS3tx kI8ppAI2xS0sVz+A00Mx0tJ+bktLJ9untO/JBTGok3bGcYPrKIaupioRKx1wy/Za4LrP rHHXuc3Gzb6hPe7XDzZpjdY7IoCrGWBUiqTTV9KggNmT/jzAABMNyXTPohp2MsI4iT2o LY8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=R6auDpzwkB0Sojq7Ppg7Wt15Zp6cHNblRehr26rK1no=; b=NrP4JJy0zQmZ7XqOBEpbU+sUR0yDECIVN2tnIvzGBh1RMRRA2f8aVB9ZUr0+WH/oRF 5tz1rnir1sYwKZMzgqSH4YFm2EI2Q/kyDoKF7m+6JGoL+tmzgtgYGdYfPSPlGPwk4jEL ZKK9o2zqdR7DdimlsArAAWdx4Gvtt0bRKoY4xuJkrF0Iu1dC1FKNGDF5svSUxYYxGNBO dIq9QHrfqxJtghRal7uEMDgBzOTLHyy6Wb3XX6PU/Pmrm1nc0UMYgxPZYgJoT/w9GFAn TvknE0A4tlIQP4qQ00MZcR/UbR9vdAaDq3Q+oTrmAUQo6HJPRbTz99W8uX79PEjza4Gv ERwg== X-Gm-Message-State: APjAAAUj7omRXh/FbGwdp3t2LwW4zapzYOfQ7DFOW6TWBLPbpz1sxHBf GH4yJjR4duPcuqWFXilQxeRZo/xR X-Google-Smtp-Source: APXvYqxsbUceZ7KsmA1hGdKoxRFGox2eiXAQHHKsCIZC4rswqcGPeRNxRBjf9N/mQTqzSlu4x9YIYQ== X-Received: by 2002:a5d:4b47:: with SMTP id w7mr6939375wrs.276.1575498582836; Wed, 04 Dec 2019 14:29:42 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u10sm8176239wmd.1.2019.12.04.14.29.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:42 -0800 (PST) Message-Id: 
<3c41db3e9157e20aeed41d3eff373183c9834bff.1575498577.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:30 +0000 Subject: [PATCH v4 04/11] git-p4: python3 syntax changes Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene

Python 3 handles strings differently than Python 2.7. Since Python 2 is reaching its end of life, a series of changes is being submitted to enable Python 3.7+ support. The current code fails basic tests under Python 3.7.

A number of translations suggested by modernize/futurize should be applied to fix issues that are not specific to strings.

Change calls to the X.next() iterator method to the built-in next(X), which is compatible with both Python 2 and Python 3.

Change references to X.keys() to list(X.keys()) so that a list is returned on both Python 2 and Python 3.

Add (object) to the end of class definitions to be consistent with Python 3 class definitions.

Change integer division to use "//" instead of "/". Under both Python 2 and Python 3, "//" returns a floored result, which matches the existing behavior.

Change the format string for progress percentages from %d to %4.1f. This avoids displaying long repeating decimals in user-visible text.
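The idioms listed above behave the same under Python 2.7 and Python 3. A small sketch of each follows; the dictionary contents and variable names are made up for illustration and are not taken from git-p4.py.

```python
# All names and values here are illustrative, not taken from git-p4.py.
branches = {"main": "abc123", "dev": "def456"}

# next(X) works on any iterator in both Python 2.6+ and Python 3,
# whereas the X.next() method exists only in Python 2.
first_key = next(iter(sorted(branches)))

# list(d.keys()) takes a snapshot of the keys, so the dict can be
# modified safely inside the loop on either version.
for name in list(branches.keys()):
    if name == "dev":
        del branches[name]

# "//" floors on both versions; "/" would yield a float on Python 3.
size_mb = 5 * 1024 * 1024 // 1024 // 1024

# Multiplying by 100.0 forces float division on Python 2 as well, and
# "%4.1f" limits the displayed value to one decimal place.
progress = "%4.1f%%" % (100.0 * 1 / 3)
```

With integer operands, "/" still truncates on Python 2, which is why the sketch multiplies by 100.0 before formatting with %4.1f.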
Signed-off-by: Ben Keene (cherry picked from commit bde6b83296aa9b3e7a584c5ce2b571c7287d8f9f) --- git-p4.py | 55 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 26 deletions(-) diff --git a/git-p4.py b/git-p4.py index 93dfd0920a..b283ef1029 100755 --- a/git-p4.py +++ b/git-p4.py @@ -26,6 +26,9 @@ import zlib import ctypes import errno +import os.path +import codecs +import io # support basestring in python3 try: @@ -631,7 +634,7 @@ def parseDiffTreeEntry(entry): If the pattern is not matched, None is returned.""" - match = diffTreePattern().next().match(entry) + match = next(diffTreePattern()).match(entry) if match: return { 'src_mode': match.group(1), @@ -935,7 +938,7 @@ def findUpstreamBranchPoint(head = "HEAD"): branches = p4BranchesInGit() # map from depot-path to branch name branchByDepotPath = {} - for branch in branches.keys(): + for branch in list(branches.keys()): tip = branches[branch] log = extractLogMessageFromGitCommit(tip) settings = extractSettingsGitLog(log) @@ -1129,7 +1132,7 @@ def getClientSpec(): client_name = entry["Client"] # just the keys that start with "View" - view_keys = [ k for k in entry.keys() if k.startswith("View") ] + view_keys = [ k for k in list(entry.keys()) if k.startswith("View") ] # hold this new View view = View(client_name) @@ -1371,7 +1374,7 @@ def processContent(self, git_mode, relPath, contents): else: return LargeFileSystem.processContent(self, git_mode, relPath, contents) -class Command: +class Command(object): delete_actions = ( "delete", "move/delete", "purge" ) add_actions = ( "add", "branch", "move/add" ) @@ -1386,7 +1389,7 @@ def ensure_value(self, attr, value): setattr(self, attr, value) return getattr(self, attr) -class P4UserMap: +class P4UserMap(object): def __init__(self): self.userMapFromPerforceServer = False self.myP4UserId = None @@ -1437,7 +1440,7 @@ def getUserMapFromPerforceServer(self): self.emails[email] = user s = '' - for (key, val) in self.users.items(): + 
for (key, val) in list(self.users.items()): s += "%s\t%s\n" % (key.expandtabs(1), val.expandtabs(1)) open(self.getUserCacheFilename(), "wb").write(s) @@ -1788,7 +1791,7 @@ def prepareSubmitTemplate(self, changelist=None): break if not change_entry: die('Failed to decode output of p4 change -o') - for key, value in change_entry.iteritems(): + for key, value in list(change_entry.items()): if key.startswith('File'): if 'depot-paths' in settings: if not [p for p in settings['depot-paths'] @@ -2032,7 +2035,7 @@ def applyCommit(self, id): p4_delete(f) # Set/clear executable bits - for f in filesToChangeExecBit.keys(): + for f in list(filesToChangeExecBit.keys()): mode = filesToChangeExecBit[f] setP4ExecBit(f, mode) @@ -2285,7 +2288,7 @@ def run(self, args): self.clientSpecDirs = getClientSpec() # Check for the existence of P4 branches - branchesDetected = (len(p4BranchesInGit().keys()) > 1) + branchesDetected = (len(list(p4BranchesInGit().keys())) > 1) if self.useClientSpec and not branchesDetected: # all files are relative to the client spec @@ -2676,7 +2679,7 @@ def __init__(self): self.knownBranches = {} self.initialParents = {} - self.tz = "%+03d%02d" % (- time.timezone / 3600, ((- time.timezone % 3600) / 60)) + self.tz = "%+03d%02d" % (- time.timezone // 3600, ((- time.timezone % 3600) // 60)) self.labels = {} # Force a checkpoint in fast-import and wait for it to finish @@ -2793,7 +2796,7 @@ def splitFilesIntoBranches(self, commit): else: relPath = self.stripRepoPath(path, self.depotPaths) - for branch in self.knownBranches.keys(): + for branch in list(self.knownBranches.keys()): # add a trailing slash so that a commit into qt/4.2foo # doesn't end up in qt/4.2, e.g. 
if p4PathStartsWith(relPath, branch + "/"): @@ -2834,7 +2837,7 @@ def streamOneP4File(self, file, contents): size = int(self.stream_file['fileSize']) else: size = 0 # deleted files don't get a fileSize apparently - sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024)) + sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size//1024//1024)) sys.stdout.flush() (type_base, type_mods) = split_p4_type(file["type"]) @@ -2934,7 +2937,7 @@ def streamP4FilesCb(self, marshalled): required_bytes = int((4 * int(self.stream_file["fileSize"])) - calcDiskFree()) if required_bytes > 0: err = 'Not enough space left on %s! Free at least %i MB.' % ( - os.getcwd(), required_bytes/1024/1024 + os.getcwd(), required_bytes//1024//1024 ) if err: @@ -2963,7 +2966,7 @@ def streamP4FilesCb(self, marshalled): # pick up the new file information... for the # 'data' field we need to append to our array - for k in marshalled.keys(): + for k in list(marshalled.keys()): if k == 'data': if 'streamContentSize' not in self.stream_file: self.stream_file['streamContentSize'] = 0 @@ -2978,8 +2981,8 @@ def streamP4FilesCb(self, marshalled): 'depotFile' in self.stream_file): size = int(self.stream_file["fileSize"]) if size > 0: - progress = 100*self.stream_file['streamContentSize']/size - sys.stdout.write('\r%s %d%% (%i MB)' % (self.stream_file['depotFile'], progress, int(size/1024/1024))) + progress = 100.0*self.stream_file['streamContentSize']/size + sys.stdout.write('\r%s %4.1f%% (%i MB)' % (self.stream_file['depotFile'], progress, int(size//1024//1024))) sys.stdout.flush() self.stream_have_file_info = True @@ -3060,7 +3063,7 @@ def streamTag(self, gitStream, labelName, labelDetails, commit, epoch): gitStream.write("tagger %s\n" % tagger) - print("labelDetails=",labelDetails) + print(("labelDetails=",labelDetails)) if 'Description' in labelDetails: description = labelDetails['Description'] else: @@ -3199,7 +3202,7 @@ def getLabels(self): 
self.labels[newestChange] = [output, revisions] if self.verbose: - print("Label changes: %s" % self.labels.keys()) + print("Label changes: %s" % list(self.labels.keys())) # Import p4 labels as git tags. A direct mapping does not # exist, so assume that if all the files are at the same revision @@ -3342,7 +3345,7 @@ def getBranchMapping(self): def getBranchMappingFromGitBranches(self): branches = p4BranchesInGit(self.importIntoRemotes) - for branch in branches.keys(): + for branch in list(branches.keys()): if branch == "master": branch = "main" else: @@ -3454,14 +3457,14 @@ def importChanges(self, changes, origin_revision=0): self.updateOptionDict(description) if not self.silent: - sys.stdout.write("\rImporting revision %s (%s%%)" % (change, cnt * 100 / len(changes))) + sys.stdout.write("\rImporting revision %s (%4.1f%%)" % (change, cnt * 100 / len(changes))) sys.stdout.flush() cnt = cnt + 1 try: if self.detectBranches: branches = self.splitFilesIntoBranches(description) - for branch in branches.keys(): + for branch in list(branches.keys()): ## HACK --hwn branchPrefix = self.depotPaths[0] + branch + "/" self.branchPrefixes = [ branchPrefix ] @@ -3650,13 +3653,13 @@ def run(self, args): if short in branches: self.p4BranchesInGit = [ short ] else: - self.p4BranchesInGit = branches.keys() + self.p4BranchesInGit = list(branches.keys()) if len(self.p4BranchesInGit) > 1: if not self.silent: print("Importing from/into multiple branches") self.detectBranches = True - for branch in branches.keys(): + for branch in list(branches.keys()): self.initialParents[self.refPrefix + branch] = \ branches[branch] @@ -4040,7 +4043,7 @@ def findLastP4Revision(self, starting_point): to find the P4 commit we are based on, and the depot-paths. 
""" - for parent in (range(65535)): + for parent in (list(range(65535))): log = extractLogMessageFromGitCommit("{0}^{1}".format(starting_point, parent)) settings = extractSettingsGitLog(log) if 'change' in settings: @@ -4179,7 +4182,7 @@ def printUsage(commands): def main(): if len(sys.argv[1:]) == 0: - printUsage(commands.keys()) + printUsage(list(commands.keys())) sys.exit(2) cmdName = sys.argv[1] @@ -4189,7 +4192,7 @@ def main(): except KeyError: print("unknown command %s" % cmdName) print("") - printUsage(commands.keys()) + printUsage(list(commands.keys())) sys.exit(2) options = cmd.options From patchwork Wed Dec 4 22:29:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273679 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 52F2C13B6 for ; Wed, 4 Dec 2019 22:29:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 26BDA206DB for ; Wed, 4 Dec 2019 22:29:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ucuuZIe0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728410AbfLDW3s (ORCPT ); Wed, 4 Dec 2019 17:29:48 -0500 Received: from mail-wr1-f43.google.com ([209.85.221.43]:39345 "EHLO mail-wr1-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728121AbfLDW3q (ORCPT ); Wed, 4 Dec 2019 17:29:46 -0500 Received: by mail-wr1-f43.google.com with SMTP id y11so1112207wrt.6 for ; Wed, 04 Dec 2019 14:29:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=N2pqbez6QuyRw2bMBb/v8SGaZ3M53Bs8crmhS9TCJyQ=; b=ucuuZIe0L7dUsjesrTS4Mv0EaOBr3HRijsdzBGdt4HtohP4D3TqlMW9ZF2gjGacLUi T4K/rJ28kek5RzJOveO3fI8lQaSjqw94C23nhry3+UQor7lYd9CeA3qeolYSRsUaXhgc DkTOwCXMWpRXXeddvb0xmn2k3fkAAmadvzxvLyQ1moc/3kbnoh/3rOq0hUyNFaD+Su2b qFxsvFpDIQgEt8ItlGkcutHLtczePnICnGuIs3CnZ2ctUlrqdA2x/9Kz/moFnVPgkyrl mqMEmELnTKkuRbxjCOD0fKsaJMhHi+lN+wvEf8FGipzXNOyFyXrRHweX5B7vG5NHlpCT N+xQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=N2pqbez6QuyRw2bMBb/v8SGaZ3M53Bs8crmhS9TCJyQ=; b=pHNLm47xjdyQPVRavRIOeRmAJfF87YSYlBNcKy2vbtTFhknjwKQGUUFAvANF2rb/Xw SFI4HHdKjcDrhkXoS3fiMXAd+YBJILAwvsxyv9af+AbEMdmXQchS845shUuCu/GD3mVT toqH7c4dq5jp5o6mvqvpgvyHhI/Kr96SQMf7gcVSXh+it1qyUEDhhdamxr2hV+nQ0yp9 wh29b4M4U/f8vxGXKzhuxstAattuIPoKMa52lIVPB9cam4P/n10dViT/ZecsymRoJJjx +840GwYybs94XNsWgIuVAH9uGamQsFjWtDyjP6GTSEuaT+cv+BzU2bZYnx0xIXuQsWyl 84bg== X-Gm-Message-State: APjAAAWxIAXtnhDBBEujD2EduuEvBvcapNQg08o6TmEBFMkbzfDsKV6M UCuli+OESvceoMlsweFu2N3SHTyp X-Google-Smtp-Source: APXvYqzGFygr+hDpTkac/FkitSpTcpYQI8HUWz7RG87htDsca9Gs2sp3mmSli+OsLdhbS+eX60uVGA== X-Received: by 2002:adf:f508:: with SMTP id q8mr3398517wro.334.1575498583782; Wed, 04 Dec 2019 14:29:43 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z64sm8991043wmg.30.2019.12.04.14.29.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:43 -0800 (PST) Message-Id: <1bf7b073b047ca7625d0861b160a9602135f7baf.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:31 +0000 Subject: [PATCH v4 05/11] git-p4: Add new functions in preparation of usage Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org
From: Ben Keene

This changelist is an intermediate submission for migrating the P4 support from Python 2 to Python 3.

The code needs access to encodeWithUTF8() to support non-UTF-8 filenames in the clone class as well as the sync class.

Move encodeWithUTF8() from the P4Sync class to a stand-alone function so that other classes can use it without instantiating P4Sync. Replace the self.verbose reference with an optional "verbose" parameter, and update the existing callers to pass self.verbose, since the function no longer has access to "self".

Modify write_pipe() and p4_write_pipe() to remove the return value. Both functions returned the number of bytes written, but under Python 3 that count no longer matches the number of characters that were encoded. The return value was also never used, so it is removed to avoid future ambiguity.

Add a new function gitConfigSet(), which sets a value in the cached git configuration (_gitConfig) for the current session.
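As a rough illustration only (not part of this patch), the sketch below shows why a byte count is a misleading return value once text is encoded, and what a session-only configuration cache might look like. The names write_text, config_set, config_get and _config_cache are hypothetical stand-ins, and unlike git-p4's gitConfig() the cache here never consults "git config".

```python
import io

def write_text(stream, text, encoding="utf-8"):
    """Hypothetical helper: encode 'text' and write it to a binary stream.

    Returns the number of BYTES written, which is what a pipe write
    reports; this is not the number of characters in 'text'.
    """
    return stream.write(text.encode(encoding))

buffer = io.BytesIO()
text = "caf\u00e9\n"                      # 5 characters, one of them non-ASCII
bytes_written = write_text(buffer, text)  # the non-ASCII character takes 2 bytes
char_count = len(text)

# Hypothetical session-only cache, assumed to behave like a plain dict.
_config_cache = {}

def config_set(key, value):
    """Set 'key' to 'value' for this session only (nothing is persisted)."""
    _config_cache[key] = value

def config_get(key, default=None):
    """Return the cached value for 'key', or 'default' if it was never set."""
    return _config_cache.get(key, default)

config_set("git-p4.pathEncoding", "cp1252")
```

Because the two counts differ for any non-ASCII text, a caller that treated the return value as a character count would be wrong, which is consistent with dropping the return value altogether.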
Signed-off-by: Ben Keene (cherry picked from commit affe888f432bb6833df78962e8671fccdf76c47a) --- git-p4.py | 60 ++++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 44 insertions(+), 16 deletions(-) diff --git a/git-p4.py b/git-p4.py index b283ef1029..2659531c2e 100755 --- a/git-p4.py +++ b/git-p4.py @@ -237,6 +237,8 @@ def die(msg): sys.exit(1) def write_pipe(c, stdin): + """ Executes the command 'c', passing 'stdin' on the standard input + """ if verbose: sys.stderr.write('Writing pipe: %s\n' % str(c)) @@ -248,11 +250,12 @@ def write_pipe(c, stdin): if p.wait(): die('Command failed: %s' % str(c)) - return val def p4_write_pipe(c, stdin): + """ Runs a P4 command 'c', passing 'stdin' data to P4 + """ real_cmd = p4_build_cmd(c) - return write_pipe(real_cmd, stdin) + write_pipe(real_cmd, stdin) def read_pipe_full(c): """ Read output from command. Returns a tuple @@ -653,6 +656,38 @@ def isModeExec(mode): # otherwise False. return mode[-3:] == "755" +def encodeWithUTF8(path, verbose = False): + """ Ensure that the path is encoded as a UTF-8 string + + Returns bytes(P3)/str(P2) + """ + + if isunicode: + try: + if isinstance(path, unicode): + # It is already unicode, cast it as a bytes + # that is encoded as utf-8. + return path.encode('utf-8', 'strict') + path.decode('ascii', 'strict') + except: + encoding = 'utf8' + if gitConfig('git-p4.pathEncoding'): + encoding = gitConfig('git-p4.pathEncoding') + path = path.decode(encoding, 'replace').encode('utf8', 'replace') + if verbose: + print('\nNOTE:Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, to_unicode(path))) + else: + try: + path.decode('ascii') + except: + encoding = 'utf8' + if gitConfig('git-p4.pathEncoding'): + encoding = gitConfig('git-p4.pathEncoding') + path = path.decode(encoding, 'replace').encode('utf8', 'replace') + if verbose: + print('Path with non-ASCII characters detected. 
Used %s to encode: %s ' % (encoding, path)) + return path + class P4Exception(Exception): """ Base class for exceptions from the p4 client """ def __init__(self, exit_code): @@ -891,6 +926,11 @@ def gitConfigList(key): _gitConfig[key] = [] return _gitConfig[key] +def gitConfigSet(key, value): + """ Set the git configuration key 'key' to 'value' for this session + """ + _gitConfig[key] = value + def p4BranchesInGit(branchesAreInRemotes=True): """Find all the branches whose names start with "p4/", looking in remotes or heads as specified by the argument. Return @@ -2814,24 +2854,12 @@ def writeToGitStream(self, gitMode, relPath, contents): self.gitStream.write(d) self.gitStream.write('\n') - def encodeWithUTF8(self, path): - try: - path.decode('ascii') - except: - encoding = 'utf8' - if gitConfig('git-p4.pathEncoding'): - encoding = gitConfig('git-p4.pathEncoding') - path = path.decode(encoding, 'replace').encode('utf8', 'replace') - if self.verbose: - print('Path with non-ASCII characters detected. 
Used %s to encode: %s ' % (encoding, path)) - return path - # output one file from the P4 stream # - helper for streamP4Files def streamOneP4File(self, file, contents): relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes) - relPath = self.encodeWithUTF8(relPath) + relPath = encodeWithUTF8(relPath, self.verbose) if verbose: if 'fileSize' in self.stream_file: size = int(self.stream_file['fileSize']) @@ -2914,7 +2942,7 @@ def streamOneP4File(self, file, contents): def streamOneP4Deletion(self, file): relPath = self.stripRepoPath(file['path'], self.branchPrefixes) - relPath = self.encodeWithUTF8(relPath) + relPath = encodeWithUTF8(relPath, self.verbose) if verbose: sys.stdout.write("delete %s\n" % relPath) sys.stdout.flush() From patchwork Wed Dec 4 22:29:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273689 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9009D930 for ; Wed, 4 Dec 2019 22:29:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6C8652073C for ; Wed, 4 Dec 2019 22:29:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="InwIWT5f" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728407AbfLDW3s (ORCPT ); Wed, 4 Dec 2019 17:29:48 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:46791 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728383AbfLDW3q (ORCPT ); Wed, 4 Dec 2019 17:29:46 -0500 Received: by mail-wr1-f68.google.com with SMTP id z7so1058146wrl.13 for ; Wed, 04 Dec 2019 14:29:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Gl/kCKnStjrJrZ8zYMAsY2UtbM6GFFGrW5BTTTWrAJo=; b=InwIWT5f0TWB+CWZDzIb8hZ7i9caFbha13Llo188+2lvOPKMX85rLyCXGrRW50GUVN /lHnSn2MbY6PjBe4Gn/WsB5yAUSOBNacQ7XnY4pfpCik5fVfB8FvqmHRLM1FfD7WoY8J ZSTUQTGJkGXmy02bhRUYxSmzRy2AMuwzit75hO/7vR5O5I79q+JW0UWQKsu3F9E5DcX5 0xcRcumZAuKRSaRfTxAirZ+l04huq2rso3iCduzKr+AK0bKVLTUkCvJzio24UHsCUU/1 ADwI7M3kV4p/AnR6RqV8R1HbQtk/0t4yOwoIjRQXPAj2qdpbge3E89j+vDfUYch33a0y Ezbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Gl/kCKnStjrJrZ8zYMAsY2UtbM6GFFGrW5BTTTWrAJo=; b=gEOJHmsnypDdv31lqyEobJylv+rzkfpCLt4UzHhA1Uken5U5KP8IEClR8HpJVU+Khc 6mILw+0PFn4vf6mdOAQh8ZEGFpCnIrihVSD065a3iHQVS3PmJdjPlYiD21tzQcE4wxHq DwZ3q1x8KBCunf3VHQ9RfkDWhAkvyBXXR5KZE0LruhzaMfrlQqdzSyGBfznD22HlRYCs KGhmqN3LaEJSk8vI29Oy6rwbvme5zbYQ9OohMdvey3S8cOoKw8kPbdvJKdnq9pX69LQ6 78S7OGCpQoEFavQzdVbWLjnlUOvrJLnb1muWGpQp8jXaCcfNd4Iw+/MOsw+PaQmhwRqz mRRQ== X-Gm-Message-State: APjAAAWNp4/mA+fs5Ap4gnWaU4OWgsuY8wpOwQ9LY0SQ6d24zJ5LSt/X E67ys1PdkMHmrl72A09UPC0OKnPl X-Google-Smtp-Source: APXvYqzxmb29+gDHimSo89zXuslpXY7zRA5R/oopMPUi3Cse2Q4wMUMQ+jmTSauBE3tdcBlP1XKz9w== X-Received: by 2002:a5d:65c5:: with SMTP id e5mr6488357wrw.311.1575498584553; Wed, 04 Dec 2019 14:29:44 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s10sm9960434wrw.12.2019.12.04.14.29.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:44 -0800 (PST) Message-Id: <8f5752c12737fd861274609fdafac095ad95c519.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:32 +0000 Subject: [PATCH v4 06/11] git-p4: Fix assumed path separators to be more Windows friendly Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org 
Cc: Ben Keene , Junio C Hamano , Ben Keene
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Ben Keene

When a computer is configured to use Git for Windows and Python for Windows, and not a Unix subsystem like Cygwin or WSL, the directory separator changes and causes git-p4 to fail to properly determine paths.

Fix 3 path separator errors:

1. getUserCacheFilename should not use string concatenation. Change this code to use os.path.join to build an OS-tolerant path.
2. defaultDestination used os.path.split to split depot paths. This is incorrect on Windows. Change the code to split on a forward slash (/) instead, since depot paths use this character regardless of the operating system.
3. The call to isValidGitDir() in the main code also used a literal forward slash. Change the code to use os.path.join to correctly format the path for the operating system.

These three changes allow the suggested Windows configuration to properly locate files while retaining the existing behavior on non-Windows operating systems.

Signed-off-by: Ben Keene (cherry picked from commit a5b45c12c3861638a933b05a1ffee0c83978dcb2) --- git-p4.py | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/git-p4.py b/git-p4.py index 2659531c2e..7ac8cb42ef 100755 --- a/git-p4.py +++ b/git-p4.py @@ -1454,8 +1454,10 @@ def p4UserIsMe(self, p4User): return True def getUserCacheFilename(self): + """ Returns the filename of the username cache + """ home = os.environ.get("HOME", os.environ.get("USERPROFILE")) - return home + "/.gitp4-usercache.txt" + return os.path.join(home, ".gitp4-usercache.txt") def getUserMapFromPerforceServer(self): if self.userMapFromPerforceServer: @@ -3973,13 +3975,16 @@ def __init__(self): self.cloneBare = False def defaultDestination(self, args): + """ Returns the last path component as the default git + repository directory name + """ ## TODO: use common prefix of args?
depotPath = args[0] depotDir = re.sub("(@[^@]*)$", "", depotPath) depotDir = re.sub("(#[^#]*)$", "", depotDir) depotDir = re.sub(r"\.\.\.$", "", depotDir) depotDir = re.sub(r"/$", "", depotDir) - return os.path.split(depotDir)[1] + return depotDir.split('/')[-1] def run(self, args): if len(args) < 1: @@ -4252,8 +4257,8 @@ def main(): chdir(cdup); if not isValidGitDir(cmd.gitdir): - if isValidGitDir(cmd.gitdir + "/.git"): - cmd.gitdir += "/.git" + if isValidGitDir(os.path.join(cmd.gitdir, ".git")): + cmd.gitdir = os.path.join(cmd.gitdir, ".git") else: die("fatal: cannot locate git repository at %s" % cmd.gitdir) From patchwork Wed Dec 4 22:29:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273683 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 73B06159A for ; Wed, 4 Dec 2019 22:29:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 50E64206DB for ; Wed, 4 Dec 2019 22:29:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="QnSMhrq8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728436AbfLDW3w (ORCPT ); Wed, 4 Dec 2019 17:29:52 -0500 Received: from mail-wr1-f49.google.com ([209.85.221.49]:45140 "EHLO mail-wr1-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728395AbfLDW3s (ORCPT ); Wed, 4 Dec 2019 17:29:48 -0500 Received: by mail-wr1-f49.google.com with SMTP id j42so1068703wrj.12 for ; Wed, 04 Dec 2019 14:29:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=qmCmURoMqMcKK7hPDa63KxUxtMekLT5cisZPYmvN+kg=; b=QnSMhrq8xj+wwnezTztT5hmE1IU6qcF0BxQitGI0tW0h73b3w5wAbe6v/r48KgT2RF VI6spOOPuwH0pzeRUR4wss0gQcuL6csIzegMuQBZlB5d5XhfWx4Cyi9RYS8QtVQ8+pDQ 0wC+luVTwK3yK1ZJx2azBzffgo42K5Jn2owZbv2DV3o0mucAmxTS4vOAiDCkWFWFhl9o da9MkgyCTBcDhmMkfhs0cnS+I+dp5Mz/NN0GIynse36S2kCr+EwClRBXEl+7rD36iDeD D7M6Jxe9oDNQVnd7m54bL26UNEhuoaM2fRxxOJjIGQN/4wPZM5fAEsalfta0YVuR9b8n mtLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=qmCmURoMqMcKK7hPDa63KxUxtMekLT5cisZPYmvN+kg=; b=XlLtNVmIiBLVX0w4ojEebMLFgOijYI7y/h+m33936ZudTU8f4W+oBCubkiG8yHdly5 UBzHGepBbpyWc/xVKjovForIR9LPCuJOm0mpzBMVFnqYcTH2jKuATDWLHnS8jXsLFRQz 7bPYwRqGfHuo4Lrj2+7RKBwBOggap/j/7Fyvsu9T0SvgonEdQBRMjeSCCgWgZF/3HVGp EJXk4CrEHRE6Hts+1YoV/PbjaETkwMhliRV3DHaujGZpYaqTHKupkuSb6uT3vBcw78WY wmG6IGVMS3i4LXa895ljz7NAvLLgNBxPemZXWr6ZJgmCsSgFSMYkgtypTA/H3H5H6lSD tpVw== X-Gm-Message-State: APjAAAUTGCoKcfSu9V6yt0X7/3vaEFxOQoKgkbzexG3QU706TFxvRO4Y 10HengvhMeY9Ldw5KUmbkkYiuwgy X-Google-Smtp-Source: APXvYqzBuuPm716Ypias6BTA5nK78HHvl4gYJ1c7x1yyPrMq6iXT88BAIA/IIHtSqxMpfoXzKhCdmw== X-Received: by 2002:adf:cd0a:: with SMTP id w10mr6386867wrm.107.1575498585283; Wed, 04 Dec 2019 14:29:45 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o15sm10535465wra.83.2019.12.04.14.29.44 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:44 -0800 (PST) Message-Id: <10dc059444b965c3db3fda5600de64da32de53b4.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:33 +0000 Subject: [PATCH v4 07/11] git-p4: Add a helper class for stream writing Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org
From: Ben Keene

This is a transitional commit that does not change current behavior. It adds a new class, Py23File.

Following the Python recommendation of keeping text as unicode internally and only converting to and from bytes on input and output, this class provides an interface for the methods used for reading and writing files and file-like streams.

Create a class that wraps the input and output functions used by the git-p4.py code for reading and writing to standard file handles. The methods of this class should take a unicode string for writing and return unicode strings from reads. This class should be a drop-in replacement for existing file-like streams.

The following methods should be implemented to support the existing read/write calls:

* write - this should write a unicode string to the underlying stream
* read - this should read from the underlying stream and cast the bytes as a unicode string
* readline - this should read one line of text from the underlying stream and cast it as a unicode string
* readlines - this should read a number of lines, optionally hinted, and cast each line as a unicode string

The expression "cast as a unicode string" is used because the code should use the as_bytes() and as_string() functions instead of coercing the data to actual unicode strings or bytes. This allows Python 2 code to continue to use the internal "str" data type instead of converting the data back and forth to actual unicode strings. This retains current Python 2 support, while Python 3 support may be incomplete.
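For readers unfamiliar with the pattern, here is a minimal Python 3 sketch of a text-over-bytes wrapper of the kind described above. It is an illustration under stated assumptions, not the Py23File class from this patch: the class name TextOverBytesStream is hypothetical, it always encodes and decodes as UTF-8, and it does not use the as_bytes()/as_string() helpers.

```python
import io

class TextOverBytesStream:
    """Hypothetical wrapper: callers pass and receive str, the
    underlying stream only ever sees bytes."""

    def __init__(self, stream_handle, encoding="utf-8"):
        self.stream_handle = stream_handle
        self.encoding = encoding

    def write(self, text):
        # Encode as late as possible, right at the output boundary.
        self.stream_handle.write(text.encode(self.encoding))

    def read(self, size=-1):
        # 'size' counts bytes, so a partial read can split a multi-byte
        # character and fail to decode; reading everything avoids that.
        return self.stream_handle.read(size).decode(self.encoding)

    def readline(self):
        return self.stream_handle.readline().decode(self.encoding)

    def readlines(self, hint=-1):
        return [line.decode(self.encoding)
                for line in self.stream_handle.readlines(hint)]

    def flush(self):
        self.stream_handle.flush()

    def close(self):
        self.stream_handle.close()

raw = io.BytesIO()                      # stands in for a pipe to a child process
wrapped = TextOverBytesStream(raw)
wrapped.write("commit refs/heads/p4/master\n")
wrapped.write("data caf\u00e9\n")
wrapped.flush()

raw.seek(0)
first_line = wrapped.readline()
remaining = wrapped.readlines()
```

The design point is that encoding decisions live in one place, so the rest of the code can treat every stream as text.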
Signed-off-by: Ben Keene (cherry picked from commit 12919111fbaa3e4c0c4c2fdd4f79744cc683d860) --- git-p4.py | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/git-p4.py b/git-p4.py index 7ac8cb42ef..0da640be93 100755 --- a/git-p4.py +++ b/git-p4.py @@ -4182,6 +4182,72 @@ def run(self, args): print("%s <= %s (%s)" % (branch, ",".join(settings["depot-paths"]), settings["change"])) return True +class Py23File(): + """ Python2/3 Unicode File Wrapper + """ + + stream_handle = None + verbose = False + debug_handle = None + + def __init__(self, stream_handle, verbose = False): + """ Create a Python3 compliant Unicode to Byte String + Windows compatible wrapper + + stream_handle = the underlying file-like handle + verbose = Boolean if content should be echoed + """ + self.stream_handle = stream_handle + self.verbose = verbose + + def write(self, utf8string): + """ Writes the utf8 encoded string to the underlying + file stream + """ + self.stream_handle.write(as_bytes(utf8string)) + if self.verbose: + sys.stderr.write("Stream Output: %s" % utf8string) + sys.stderr.flush() + + def read(self, size = None): + """ Reads int charcters from the underlying stream + and converts it to utf8. + + Be aware, the size value is for reading the underlying + bytes so the value may be incorrect. Usage of the size + value is discouraged. + """ + if size == None: + return as_string(self.stream_handle.read()) + else: + return as_string(self.stream_handle.read(size)) + + def readline(self): + """ Reads a line from the underlying byte stream + and converts it to utf8 + """ + return as_string(self.stream_handle.readline()) + + def readlines(self, sizeHint = None): + """ Returns a list containing lines from the file converted to unicode. + + sizehint - Optional. If the optional sizehint argument is + present, instead of reading up to EOF, whole lines totalling + approximately sizehint bytes are read. 
+ """ + lines = self.stream_handle.readlines(sizeHint) + for i in range(0, len(lines)): + lines[i] = as_string(lines[i]) + return lines + + def close(self): + """ Closes the underlying byte stream """ + self.stream_handle.close() + + def flush(self): + """ Flushes the underlying byte stream """ + self.stream_handle.flush() + class HelpFormatter(optparse.IndentedHelpFormatter): def __init__(self): optparse.IndentedHelpFormatter.__init__(self) From patchwork Wed Dec 4 22:29:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273685 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC89013B6 for ; Wed, 4 Dec 2019 22:29:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8994A2073C for ; Wed, 4 Dec 2019 22:29:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="gvfYXyzW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728427AbfLDW3v (ORCPT ); Wed, 4 Dec 2019 17:29:51 -0500 Received: from mail-wr1-f45.google.com ([209.85.221.45]:38616 "EHLO mail-wr1-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728326AbfLDW3s (ORCPT ); Wed, 4 Dec 2019 17:29:48 -0500 Received: by mail-wr1-f45.google.com with SMTP id y17so1124158wrh.5 for ; Wed, 04 Dec 2019 14:29:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=annvhjwxSXPrbFLi5I+ldURL5XncKDANKEh5DqKKyLo=; b=gvfYXyzWMD+6Ijr6Yt4945Ch802PaG+FNdNNQz3q78gmEuDJeWeerG6qpKJfvztVLX 2bLac3kNuDkp0BOGIQqgbjj1AziZ6kn5q+Jzys6gIHkEGTv6TAcbdLx/gQmn7OG/AzIi 
cT2iVrhDcHuzRbV9C4MZMVv6NdVZBEJewhK8SvEduaWIPGgwUXSnVwqcsZS9HR7U/2Mk 5ESsL6+G2R6s5kAPo/p05XIKi74rsx9tF9h7x3fmUWscsxOX/NbqRwckUNNCBtv2ftX3 wGzK5Kj5Dc/bv+VpHil6iVljKusWHHJy+JBg7LxbtdtY+BF8V95MpYUWyR/alehCQqmc a06g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=annvhjwxSXPrbFLi5I+ldURL5XncKDANKEh5DqKKyLo=; b=OjN4DHt0OEBJAORgy0mRJTQi4dFx8k4g/A0mxBhi0IEU7GnJ6zUHfoxKSBL2hioOKi 00AJcnNcnvNbMD8MGFDNxt+wm1CpLPdpyHOuFGiXX7LsiOkvR2cVx9Z5Hjvu18Y+t9GL BwTLh7yAqoUuV7QPivDFBctKJ4VaURnHshAI98NDF1LQmWnb2ver4X59UbsW9D6un8ws 6uT7RDmQNhzkkPNsjDGav8D8mDVjRizwPZbJZQ2IDlV21MD4X6gr4dqUNVlmI9jm/N8r cte0mWS0KRJpox2kH2hibqSZkuVNxpJT4QHeOtEiK5GWMEP5671fvk6PgDF/6Y5avC+B A3Mw== X-Gm-Message-State: APjAAAUcqWD7OjGdqTp0Stkz2Fpk7lqHucuW7fYEUhnc6cWPFkG6rHO3 /BtieLWHJ4ccHG6DAwpoJ6Wjl0UI X-Google-Smtp-Source: APXvYqwI80Ww5ngxVf9sKwSPCE10j6aUtKDKfoQOY+b6EyjTBnSitr2xWZKUMjOxtTL7SihQXxdLZg== X-Received: by 2002:adf:9427:: with SMTP id 36mr6627156wrq.166.1575498586052; Wed, 04 Dec 2019 14:29:46 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x18sm9748666wrr.75.2019.12.04.14.29.45 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:45 -0800 (PST) Message-Id: In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:34 +0000 Subject: [PATCH v4 08/11] git-p4: p4CmdList - support Unicode encoding Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene The p4CmdList is a commonly used function in the git-p4 code. It is used to execute a command in P4 and return the results of the call in a list. 
Change this code to take a new optional parameter, encode_data, that controls whether the values returned by the function are converted with as_string(). Dictionary keys are always converted with as_string().

Data that is passed for standard input (stdin) should be converted with as_bytes() to ensure that any unicode text supplied is written out as bytes.

Additionally, change literal text that is compared before conversion to be literal bytes.

Signed-off-by: Ben Keene (cherry picked from commit 88306ac269186cbd0f6dc6cfd366b50b28ee4886) --- git-p4.py | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/git-p4.py b/git-p4.py index 0da640be93..f7c0ef0c53 100755 --- a/git-p4.py +++ b/git-p4.py @@ -711,7 +711,23 @@ def isModeExecChanged(src_mode, dst_mode): return isModeExec(src_mode) != isModeExec(dst_mode) def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, - errors_as_exceptions=False): + errors_as_exceptions=False, encode_data=True): + """ Executes a P4 command: 'cmd' optionally passing 'stdin' to the command's + standard input via a temporary file with 'stdin_mode' mode. + + Output from the command is optionally passed to the callback function 'cb'. + If 'cb' is None, the response from the command is parsed into a list + of resulting dictionaries. (For each block read from the process pipe.) + + If 'skip_info' is true, information in a block read that has a code type of + 'info' will be skipped. + + If 'errors_as_exceptions' is set to true (the default is false) the error + code returned from the execution will generate an exception. + + If 'encode_data' is set to true (the default) the data that is returned + by this function will be passed through the "as_string" function.
+ """ if not isinstance(cmd, list): cmd = "-G " + cmd @@ -734,7 +750,7 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, stdin_file.write(stdin) else: for i in stdin: - stdin_file.write(i + '\n') + stdin_file.write(as_bytes(i) + b'\n') stdin_file.flush() stdin_file.seek(0) @@ -748,12 +764,15 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, while True: entry = marshal.load(p4.stdout) if skip_info: - if 'code' in entry and entry['code'] == 'info': + if b'code' in entry and entry[b'code'] == b'info': continue if cb is not None: cb(entry) else: - result.append(entry) + out = {} + for key, value in entry.items(): + out[as_string(key)] = (as_string(value) if encode_data else value) + result.append(out) except EOFError: pass exitCode = p4.wait() From patchwork Wed Dec 4 22:29:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273691 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B182313B6 for ; Wed, 4 Dec 2019 22:29:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 906412073C for ; Wed, 4 Dec 2019 22:29:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BJDEFBlq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728449AbfLDW34 (ORCPT ); Wed, 4 Dec 2019 17:29:56 -0500 Received: from mail-wm1-f65.google.com ([209.85.128.65]:37071 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728279AbfLDW3t (ORCPT ); Wed, 4 Dec 2019 17:29:49 -0500 Received: by mail-wm1-f65.google.com with SMTP id f129so1539459wmf.2 for ; Wed, 04 Dec 2019 14:29:47 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=P3NS7SOafgOGxasVXp35ajnW/qv3DzAtEW2MjvpUDls=; b=BJDEFBlqog8qiZJpfMr7GVQaqDNcbkibJC+M18qVSb9PYUjZowJ9wQiIYUVare0oT7 bE0zkiIHCy2e6g57p+29P5wbeZZ8EC/PdyWl+Ru7x6xxasHoL5y7WBLhYSlOMCE0Hgct zeAXSLFCtndSDta+35e8MOmmLo8uPcbw2ArM7iGQcnXV194IzyufJ5QFrhoOrl86+3lt bsFHmBX4Rktme047C2vf3TkqNBWyKl62XkG/YAi9p1bYutBRXAAfC8PmQYzE0UghKhdu tXRvFfZgZ6kJHfyj3HRmN9LiLZnrRt4HPkswLVIWpnaLJxT0PutyKOnWW93gqtG++2// dXTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=P3NS7SOafgOGxasVXp35ajnW/qv3DzAtEW2MjvpUDls=; b=ivbKSLB37a7IgWNcWW3eck+FdKGa5aaKSk12r2deU7kGWQ43lgA/Z7s8ScxUbv5Z4H SPW1kH54Yvb2Z3fB896z1jhbLkfeZzYfIgAfbtbLGDye5CDOk0mLI+qvH6Imwyr5tcX1 bFB1rkcnshEdCcO7mYXGYByTcYtaGXlFI/aPG9ZFDOqFl5XLMvZJ7N9AsYWxMvI9Syo2 LkQ2t3Fwsu1/awDJhBHyQwRja8AMOq49bBLRlsHkFAPvhG9CGd9TM6PjkMH+6eYyOMXL 6eRc5RvhcPhp2f+qw4RD5xmTAa7/uqU9/PYzNz5XtuIVoWyYN8E6SOyENCnc4C5oXvLg ZqXQ== X-Gm-Message-State: APjAAAV/+uGiEe7CIZTsYHePnSRu7ovzZ562bZLFBvKxfrWMlZci7nDb qEWJd/rAwfiKV1f2+VO4hsZ7qoWD X-Google-Smtp-Source: APXvYqxm2aj/Cl17aLzjTIwZvWsxZeo+DSWOTtRRFn9+qSHuL1fIurF6oBbQavdjjPh09Lksf9WTxg== X-Received: by 2002:a7b:ca57:: with SMTP id m23mr2053376wml.65.1575498586742; Wed, 04 Dec 2019 14:29:46 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i127sm8890170wma.35.2019.12.04.14.29.46 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:46 -0800 (PST) Message-Id: <4fc49313f0d68a913ad19085ddb337ac4c18d0fe.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:35 +0000 Subject: [PATCH v4 09/11] git-p4: Add usability enhancements Fcc: Sent MIME-Version: 
1.0
To: git@vger.kernel.org
Cc: Ben Keene , Junio C Hamano , Ben Keene
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Ben Keene

Issue: when prompting the user with raw_input, the tests are not forgiving of user input. For example, the first query asks for a yes/no response. If the user enters the full word "yes" or "no", the test will fail.

Additionally, offer the suggestion of setting git-p4.attemptRCSCleanup when applying a commit fails because of RCS keywords. Both of these changes are usability enhancement suggestions.

Change the code prompting the user for input to sanitize the user input before checking the response by converting the response to a lower case string, trimming leading/trailing spaces, and taking the first character.

Change the applyCommit() method so that, when applying a commit fails because of the P4 RCS keywords, it suggests that the user consider setting git-p4.attemptRCSCleanup.

Signed-off-by: Ben Keene (cherry picked from commit 1fab571664f5b6ad4ef321199f52615a32a9f8c7) --- git-p4.py | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git a/git-p4.py b/git-p4.py index f7c0ef0c53..f13e4645a3 100755 --- a/git-p4.py +++ b/git-p4.py @@ -1909,7 +1909,8 @@ def edit_template(self, template_file): return True while True: - response = raw_input("Submit template unchanged. Submit anyway? [y]es, [n]o (skip this patch) ") + response = raw_input("Submit template unchanged. Submit anyway? [y]es, [n]o (skip this patch) ").lower() \ + .strip()[0] if response == 'y': return True if response == 'n': @@ -2069,8 +2070,23 @@ def applyCommit(self, id): # disable the read-only bit on windows.
if self.isWindows and file not in editedFiles: os.chmod(file, stat.S_IWRITE) - self.patchRCSKeywords(file, kwfiles[file]) - fixed_rcs_keywords = True + + try: + self.patchRCSKeywords(file, kwfiles[file]) + fixed_rcs_keywords = True + except: + # We are throwing an exception, undo all open edits + for f in editedFiles: + p4_revert(f) + raise + else: + # They do not have attemptRCSCleanup set, this might be the fail point + # Check to see if the file has RCS keywords and suggest setting the property. + for file in editedFiles | filesToDelete: + if p4_keywords_regexp_for_file(file) != None: + print("At least one file in this commit has RCS Keywords that may be causing problems. ") + print("Consider:\ngit config git-p4.attemptRCSCleanup true") + break if fixed_rcs_keywords: print("Retrying the patch with RCS keywords cleaned up") @@ -2481,7 +2497,7 @@ def run(self, args): if self.conflict_behavior == "ask": print("What do you want to do?") response = raw_input("[s]kip this commit but apply" - " the rest, or [q]uit? ") + " the rest, or [q]uit? 
").lower().strip()[0] if not response: continue elif self.conflict_behavior == "skip": @@ -4327,7 +4343,12 @@ def main(): description = cmd.description, formatter = HelpFormatter()) - (cmd, args) = parser.parse_args(sys.argv[2:], cmd); + try: + (cmd, args) = parser.parse_args(sys.argv[2:], cmd); + except: + parser.print_help() + raise + global verbose verbose = cmd.verbose if cmd.needsGit: From patchwork Wed Dec 4 22:29:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273693 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B899B13B6 for ; Wed, 4 Dec 2019 22:29:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 75277206DB for ; Wed, 4 Dec 2019 22:29:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VI+7/EZL" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728452AbfLDW34 (ORCPT ); Wed, 4 Dec 2019 17:29:56 -0500 Received: from mail-wr1-f42.google.com ([209.85.221.42]:43654 "EHLO mail-wr1-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728121AbfLDW3v (ORCPT ); Wed, 4 Dec 2019 17:29:51 -0500 Received: by mail-wr1-f42.google.com with SMTP id d16so1079996wre.10 for ; Wed, 04 Dec 2019 14:29:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=eOj7bjIGPqbI50TlPYX2df7olY1qCgwxw+3WomzswE0=; b=VI+7/EZLPYBRtIQg+BSzberqSoFlVFDZwSZTzi6Q98QxeRIL128HLBjN9o8dpmzzEu R9mpMTVuED6HFtaEus5sAcXj1ah+0d+yZEvJnaSGHAtErJGbEUYPjyZDZTQNvky4Aupl 
YNQdPjQeKztoo1qAXwmWv/p26X3fXOUOVhgA8ysVJuqrnV5ckAwC785T0EpOLBVdfHFZ k6wZnPTO8ZXsPtTd1AyXUecwGgk6D4fyclz9cgKVA/atEdyxz7OT6FSpbxW1WjEGrfhP SqlMj3WRZ2C9TTxOwzJDw6iTvdIBVAz0lTaaaEQ/LMuCF9ganmnupBfuxE6ggpO7LFyA 3IsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=eOj7bjIGPqbI50TlPYX2df7olY1qCgwxw+3WomzswE0=; b=dPitFA5bYrmBdMI/FUtV8xP49/BRla7ddhO0dX+TqN+k8A39H/jgt4A7PYVeKhj7K5 vij1UgVb1pVW1QZCQDb+m700+575XjPsetrdcoLMatRlZ1eDKcQSlGELdRQRPkka9PTr SpCmjtH4Bt21NiBEKgT582d09dadhYZIhtPU4e7gzkk3sQqmOxAMXWzY1S0iMPiBnUmA kYicG1kHX+vI1TaNbK4eQucEiiWA4ncAx8EUMyUoo7yVcQLsWf0xazNSDIR3c9/cw6dG 10Z8hZn+CEjVqYgYuFjY5EGHQWfIn0L+W3ltvRaCeVTXcxWaG20d7yQpPVDHBmV0haeQ Jyvg== X-Gm-Message-State: APjAAAWk43nUIxCus6dyowciOFB3HMtMZDKFq4jKQh42lGeZxwS1sPqn v+xI9QSl9tDkQcqCVX1x2ye/tUfO X-Google-Smtp-Source: APXvYqwgJZJgGu3jYzt6u4TLjHIWQus/mNWhwjzllDdJDGEJsYdE3oD1AhtDBUi6vXMtpnG1JfF1mw== X-Received: by 2002:a5d:5091:: with SMTP id a17mr6435075wrt.362.1575498587617; Wed, 04 Dec 2019 14:29:47 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t5sm9982068wrr.35.2019.12.04.14.29.46 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:47 -0800 (PST) Message-Id: <04a0aedbaa213f046c49f97b0cb47581962e282c.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:36 +0000 Subject: [PATCH v4 10/11] git-p4: Support python3 for basic P4 clone, sync, and submit Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene Issue: Python 3 is still not properly supported for any use with the git-p4 python code. Warning - this is a very large atomic commit. 
The commit text is also very large.

Change the code such that, with the exception of P4 depot paths and depot files, all text read by git-p4 is cast as a string as soon as possible and converted back to bytes as late as possible, following Python2 to Python3 conversion best practices.

Important: Do not cast the bytes that contain the p4 depot path or p4 depot file name. These should be left as bytes until used.

These two values should not be converted because their encoding is unknown. git-p4 supports a configuration value, git-p4.pathEncoding, that is used by encodeWithUTF8() to determine what a UTF8 version of the path and filename should be. However, since the depot path and depot filename need to be sent to P4 in their original encoding, they will be left as byte streams until they are actually used:

* When sent to P4, the bytes are literally passed to the p4 command
* When displayed in text for the user, they should be passed through the path_as_string() function
* When used by GIT they should be passed through the encodeWithUTF8() function

Change the rest of the system calls to cast output (stdin) as_bytes() and input (stdout) as_string(). This retains existing Python 2 support, and adds python 3 support for these functions:

* read_pipe_full
* read_pipe_lines
* p4_has_move_command (used internally)
* gitConfig
* branch_exists
* GitLFS.generatePointer
* applyCommit - template must be read and written to the temporary file as_bytes() since it is created in memory as a string.
* streamOneP4File(file, contents) - wrap calls to the depotFile in path_as_string() for display. The file contents must be retained as bytes, so update the RCS changes to be forced to bytes.
* streamP4Files
* importHeadRevision(revision) - encode the depotPaths for display separate from the text for processing.

Py23File usage - Change the P4Sync.openStreams() function to cast the gitOutput, gitStream, and gitError streams as Py23File() wrapper classes.
This facilitates taking strings in both python 2 and python 3 and casting them to bytes in the wrapper class instead of having to modify each method. Since the fast-import command also expects a raw byte stream for file content, add a new stream handle - gitStreamBytes which is an unwrapped version of gitStream.

Literal text - Depending on context, most literal text does not need casting to unicode or bytes as the text is Python dependent - in python 2, the string is implied as 'str' and in python 3 the string is implied as 'unicode'. Under these conditions, they match the rest of the operating text, following best practices. However, when a literal string is used in functions that are dealing with the raw input from and raw output to file streams, literal bytes may be required. Additionally, functions that are dealing with P4 depot paths or P4 depot file names are also dealing with bytes and will require the same casting as bytes.

The following functions cast text as byte strings:

* wildcard_decode(path) - the path parameter is a P4 depot path and is bytes. Cast all the literals to bytes.
* wildcard_encode(path) - the path parameter is a P4 depot path and is bytes. Cast all the literals to bytes.
* streamP4FilesCb(marshalled) - the marshalled data is in bytes. Cast the literals as bytes. When using this data to manipulate self.stream_file, encode all the marshalled data except for the 'depotFile' name.
* streamP4Files

Special behavior:

* p4_describe - encoding is disabled for the depotFile(x) and path elements since these are depot paths and depot filenames.
* p4PathStartsWith(path, prefix) - Since P4 depot paths can contain non-UTF-8 encoded strings, change this method to compare paths while supporting the optional encoding.
  - First, perform a byte-to-byte check to see if the path and prefix are both identical text. There is no need to perform encoding conversions if the text is identical.
  - If the byte check fails, pass both the path and prefix through encodeWithUTF8() to ensure both paths are using the same encoding. Then perform the test as originally written.
* patchRCSKeywords(file, pattern) - the parameters of file and pattern are both strings. However, this function changes the contents of the file identified by name "file". Treat the content of this file as binary to ensure that python does not accidentally change the original encoding. The regular expression is cast as_bytes() and run against the file as_bytes(). The P4 keywords are ASCII strings and cannot span lines so iterating over each line of the file is acceptable.
* writeToGitStream(gitMode, relPath, contents) - Since 'contents' is already bytes data, instead of using the self.gitStream, use the new self.gitStreamBytes - the unwrapped gitStream that does not cast as_bytes() the binary data.
* commit(details, files, branch, parent = "", allow_empty=False) - Changed the encoding for the commit message to the preferred format for fast-import. The number of bytes is sent in the data block instead of using the EOT marker.
* Change the code for handling the user cache to use binary files. Cast text as_bytes() when writing to the cache and as_string() when reading from the cache. This makes the reading and writing of the cache deterministic in its encoding. Unlike file paths, P4 encodes user names in UTF-8 so no additional string encoding is required.
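
For illustration only, the byte-count form of the fast-import "data" command described for commit() above can be sketched as follows. The helper name fast_import_data_block and its encoding parameter are hypothetical and are not part of git-p4; the sketch assumes the commit text is encoded as UTF-8 before it is written to the stream.

```python
# Hypothetical helper, not part of git-p4: it only illustrates the
# exact-byte-count form of fast-import's "data" command.
def fast_import_data_block(text, encoding="utf-8"):
    """Return the bytes of a 'data <count>' block for the given text."""
    payload = text.encode(encoding)
    # The count is the number of bytes in the payload, not the number
    # of characters, so it must be computed after encoding.
    header = ("data %d\n" % len(payload)).encode("ascii")
    # fast-import accepts an optional LF after the raw payload.
    return header + payload + b"\n"


if __name__ == "__main__":
    # A non-ASCII character occupies more than one byte in UTF-8,
    # which is why counting characters would give the wrong length.
    print(repr(fast_import_data_block(u"Fix caf\u00e9 menu\n")))
```

Counting bytes after encoding avoids the case where a delimiter such as EOT also appears on a line of its own inside the commit message.
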
Signed-off-by: Ben Keene (cherry picked from commit 65ff0c74ebe62a200b4385ecfd4aa618ce091f48) --- git-p4.py | 287 ++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 205 insertions(+), 82 deletions(-) diff --git a/git-p4.py b/git-p4.py index f13e4645a3..05db2ec657 100755 --- a/git-p4.py +++ b/git-p4.py @@ -268,6 +268,8 @@ def read_pipe_full(c): expand = not isinstance(c, list) p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=expand) (out, err) = p.communicate() + out = as_string(out) + err = as_string(err) return (p.returncode, out, err) def read_pipe(c, ignore_error=False): @@ -294,10 +296,17 @@ def read_pipe_text(c): return out.rstrip() def p4_read_pipe(c, ignore_error=False): + """ Read output from the P4 command 'c'. Returns the output text on + success. On failure, terminates execution, unless + ignore_error is True, when it returns an empty string. + """ real_cmd = p4_build_cmd(c) return read_pipe(real_cmd, ignore_error) def read_pipe_lines(c): + """ Returns a list of text from executing the command 'c'. + The program will die if the command fails to execute. + """ if verbose: sys.stderr.write('Reading pipe: %s\n' % str(c)) @@ -307,6 +316,11 @@ def read_pipe_lines(c): val = pipe.readlines() if pipe.close() or p.wait(): die('Command failed: %s' % str(c)) + # Unicode conversion from byte-string + # Iterate and fix in-place to avoid a second list in memory. 
+ if isunicode: + for i in range(len(val)): + val[i] = as_string(val[i]) return val @@ -335,6 +349,8 @@ def p4_has_move_command(): cmd = p4_build_cmd(["move", "-k", "@from", "@to"]) p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) (out, err) = p.communicate() + out=as_string(out) + err=as_string(err) # return code will be 1 in either case if err.find("Invalid option") >= 0: return False @@ -462,16 +478,20 @@ def p4_last_change(): return int(results[0]['change']) def p4_describe(change, shelved=False): - """Make sure it returns a valid result by checking for - the presence of field "time". Return a dict of the - results.""" + """ Returns information about the requested P4 change list. + + Data returned is not string encoded (returned as bytes) + """ + # Make sure it returns a valid result by checking for + # the presence of field "time". Return a dict of the + # results. cmd = ["describe", "-s"] if shelved: cmd += ["-S"] cmd += [str(change)] - ds = p4CmdList(cmd, skip_info=True) + ds = p4CmdList(cmd, skip_info=True, encode_data=False) if len(ds) != 1: die("p4 describe -s %d did not return 1 result: %s" % (change, str(ds))) @@ -481,12 +501,23 @@ def p4_describe(change, shelved=False): die("p4 describe -s %d exited with %d: %s" % (change, d["p4ExitCode"], str(d))) if "code" in d: - if d["code"] == "error": + if d["code"] == b"error": die("p4 describe -s %d returned error code: %s" % (change, str(d))) if "time" not in d: die("p4 describe -s %d returned no \"time\": %s" % (change, str(d))) + # Do not convert 'depotFile(X)' or 'path' to be UTF-8 encoded, however + # cast as_string() the rest of the text. 
+ keys=d.keys() + for key in keys: + if key.startswith('depotFile'): + d[key]=d[key] + elif key == 'path': + d[key]=d[key] + else: + d[key] = as_string(d[key]) + return d # @@ -800,6 +831,8 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False, return result def p4Cmd(cmd): + """ Executes a P4 command and returns the results in a dictionary + """ list = p4CmdList(cmd) result = {} for entry in list: @@ -908,13 +941,15 @@ def gitDeleteRef(ref): _gitConfig = {} def gitConfig(key, typeSpecifier=None): + """ Return a configuration setting from GIT + """ if key not in _gitConfig: cmd = [ "git", "config" ] if typeSpecifier: cmd += [ typeSpecifier ] cmd += [ key ] s = read_pipe(cmd, ignore_error=True) - _gitConfig[key] = s.strip() + _gitConfig[key] = as_string(s).strip() return _gitConfig[key] def gitConfigBool(key): @@ -988,6 +1023,7 @@ def branch_exists(branch): cmd = [ "git", "rev-parse", "--symbolic", "--verify", branch ] p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) out, _ = p.communicate() + out = as_string(out) if p.returncode: return False # expect exactly one line of output: the branch name @@ -1171,9 +1207,22 @@ def p4PathStartsWith(path, prefix): # # we may or may not have a problem. If you have core.ignorecase=true, # we treat DirA and dira as the same directory + + # Since we have to deal with mixed encodings for p4 file + # paths, first perform a simple startswith check, this covers + # the case that the formats and path are identical. + if as_bytes(path).startswith(as_bytes(prefix)): + return True + + # attempt to convert the prefix and path both to utf8 + path_utf8 = encodeWithUTF8(path) + prefix_utf8 = encodeWithUTF8(prefix) + if gitConfigBool("core.ignorecase"): - return path.lower().startswith(prefix.lower()) - return path.startswith(prefix) + # Check if we match byte-per-byte. 
+ + return path_utf8.lower().startswith(prefix_utf8.lower()) + return path_utf8.startswith(prefix_utf8) def getClientSpec(): """Look at the p4 client spec, create a View() object that contains @@ -1229,18 +1278,24 @@ def wildcard_decode(path): # Cannot have * in a filename in windows; untested as to # what p4 would do in such a case. if not platform.system() == "Windows": - path = path.replace("%2A", "*") - path = path.replace("%23", "#") \ - .replace("%40", "@") \ - .replace("%25", "%") + path = path.replace(b"%2A", b"*") + path = path.replace(b"%23", b"#") \ + .replace(b"%40", b"@") \ + .replace(b"%25", b"%") return path def wildcard_encode(path): # do % first to avoid double-encoding the %s introduced here - path = path.replace("%", "%25") \ - .replace("*", "%2A") \ - .replace("#", "%23") \ - .replace("@", "%40") + if isinstance(path, unicode): + path = path.replace("%", "%25") \ + .replace("*", "%2A") \ + .replace("#", "%23") \ + .replace("@", "%40") + else: + path = path.replace(b"%", b"%25") \ + .replace(b"*", b"%2A") \ + .replace(b"#", b"%23") \ + .replace(b"@", b"%40") return path def wildcard_present(path): @@ -1372,7 +1427,7 @@ def generatePointer(self, contentFile): ['git', 'lfs', 'pointer', '--file=' + contentFile], stdout=subprocess.PIPE ) - pointerFile = pointerProcess.stdout.read() + pointerFile = as_string(pointerProcess.stdout.read()) if pointerProcess.wait(): os.remove(contentFile) die('git-lfs pointer command failed. Did you install the extension?') @@ -1479,6 +1534,8 @@ def getUserCacheFilename(self): return os.path.join(home, ".gitp4-usercache.txt") def getUserMapFromPerforceServer(self): + """ Creates the usercache from the data in P4. 
+ """ if self.userMapFromPerforceServer: return self.users = {} @@ -1504,18 +1561,22 @@ def getUserMapFromPerforceServer(self): for (key, val) in list(self.users.items()): s += "%s\t%s\n" % (key.expandtabs(1), val.expandtabs(1)) - open(self.getUserCacheFilename(), "wb").write(s) + cache = io.open(self.getUserCacheFilename(), "wb") + cache.write(as_bytes(s)) + cache.close() self.userMapFromPerforceServer = True def loadUserMapFromCache(self): + """ Reads the P4 username to git email map + """ self.users = {} self.userMapFromPerforceServer = False try: - cache = open(self.getUserCacheFilename(), "rb") + cache = io.open(self.getUserCacheFilename(), "rb") lines = cache.readlines() cache.close() for line in lines: - entry = line.strip().split("\t") + entry = as_string(line).strip().split("\t") self.users[entry[0]] = entry[1] except IOError: self.getUserMapFromPerforceServer() @@ -1715,21 +1776,27 @@ def prepareLogMessage(self, template, message, jobs): return result def patchRCSKeywords(self, file, pattern): - # Attempt to zap the RCS keywords in a p4 controlled file matching the given pattern + """ Attempt to zap the RCS keywords in a p4 + controlled file matching the given pattern + """ + bSubLine = as_bytes(r'$\1$') (handle, outFileName) = tempfile.mkstemp(dir='.') try: - outFile = os.fdopen(handle, "w+") - inFile = open(file, "r") - regexp = re.compile(pattern, re.VERBOSE) + outFile = os.fdopen(handle, "w+b") + inFile = open(file, "rb") + regexp = re.compile(as_bytes(pattern), re.VERBOSE) for line in inFile.readlines(): - line = regexp.sub(r'$\1$', line) + line = regexp.sub(bSubLine, line) outFile.write(line) inFile.close() outFile.close() + outFile = None # Forcibly overwrite the original file os.unlink(file) shutil.move(outFileName, file) except: + if outFile != None: + outFile.close() # cleanup our temporary file os.unlink(outFileName) print("Failed to strip RCS keywords in %s" % file) @@ -2149,7 +2216,7 @@ def applyCommit(self, id): tmpFile = os.fdopen(handle, 
"w+b") if self.isWindows: submitTemplate = submitTemplate.replace("\n", "\r\n") - tmpFile.write(submitTemplate) + tmpFile.write(as_bytes(submitTemplate)) tmpFile.close() if self.prepare_p4_only: @@ -2199,8 +2266,8 @@ def applyCommit(self, id): message = tmpFile.read() tmpFile.close() if self.isWindows: - message = message.replace("\r\n", "\n") - submitTemplate = message[:message.index(separatorLine)] + message = message.replace(b"\r\n", b"\n") + submitTemplate = message[:message.index(as_bytes(separatorLine))] if update_shelve: p4_write_pipe(['shelve', '-r', '-i'], submitTemplate) @@ -2843,8 +2910,11 @@ def stripRepoPath(self, path, prefixes): return path def splitFilesIntoBranches(self, commit): - """Look at each depotFile in the commit to figure out to what - branch it belongs.""" + """ Look at each depotFile in the commit to figure out to what + branch it belongs. + + Data in the commit will NOT be encoded + """ if self.clientSpecDirs: files = self.extractFilesFromCommit(commit) @@ -2885,16 +2955,22 @@ def splitFilesIntoBranches(self, commit): return branches def writeToGitStream(self, gitMode, relPath, contents): - self.gitStream.write('M %s inline %s\n' % (gitMode, relPath)) + """ Writes the bytes[] 'contents' to the git fast-import + with the given 'gitMode' and 'relPath' as the relative + path. + """ + self.gitStream.write('M %s inline %s\n' % (gitMode, as_string(relPath))) self.gitStream.write('data %d\n' % sum(len(d) for d in contents)) for d in contents: - self.gitStream.write(d) + self.gitStreamBytes.write(d) self.gitStream.write('\n') - # output one file from the P4 stream - # - helper for streamP4Files - def streamOneP4File(self, file, contents): + """ output one file from the P4 stream to the git inbound stream. + helper for streamP4files. 
+ + contents should be a bytes (bytes) + """ relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes) relPath = encodeWithUTF8(relPath, self.verbose) if verbose: @@ -2902,7 +2978,7 @@ def streamOneP4File(self, file, contents): size = int(self.stream_file['fileSize']) else: size = 0 # deleted files don't get a fileSize apparently - sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size//1024//1024)) + sys.stdout.write('\r%s --> %s (%i MB)\n' % (path_as_string(file['depotFile']), as_string(relPath), size//1024//1024)) sys.stdout.flush() (type_base, type_mods) = split_p4_type(file["type"]) @@ -2920,7 +2996,7 @@ def streamOneP4File(self, file, contents): # to nothing. This causes p4 errors when checking out such # a change, and errors here too. Work around it by ignoring # the bad symlink; hopefully a future change fixes it. - print("\nIgnoring empty symlink in %s" % file['depotFile']) + print("\nIgnoring empty symlink in %s" % path_as_string(file['depotFile'])) return elif data[-1] == '\n': contents = [data[:-1]] @@ -2960,16 +3036,16 @@ def streamOneP4File(self, file, contents): # Ideally, someday, this script can learn how to generate # appledouble files directly and import those to git, but # non-mac machines can never find a use for apple filetype. - print("\nIgnoring apple filetype file %s" % file['depotFile']) + print("\nIgnoring apple filetype file %s" % path_as_string(file['depotFile'])) return # Note that we do not try to de-mangle keywords on utf16 files, # even though in theory somebody may want that. 
- pattern = p4_keywords_regexp_for_type(type_base, type_mods) + pattern = as_bytes(p4_keywords_regexp_for_type(type_base, type_mods)) if pattern: regexp = re.compile(pattern, re.VERBOSE) - text = ''.join(contents) - text = regexp.sub(r'$\1$', text) + text = b''.join(contents) + text = regexp.sub(as_bytes(r'$\1$'), text) contents = [ text ] if self.largeFileSystem: @@ -2988,15 +3064,19 @@ def streamOneP4Deletion(self, file): if self.largeFileSystem and self.largeFileSystem.isLargeFile(relPath): self.largeFileSystem.removeLargeFile(relPath) - # handle another chunk of streaming data def streamP4FilesCb(self, marshalled): + """ Callback function for recording P4 chunks of data for streaming + into GIT. + + marshalled data is bytes[] from the caller + """ # catch p4 errors and complain err = None - if "code" in marshalled: - if marshalled["code"] == "error": - if "data" in marshalled: - err = marshalled["data"].rstrip() + if b"code" in marshalled: + if marshalled[b"code"] == b"error": + if b"data" in marshalled: + err = marshalled[b"data"].rstrip() if not err and 'fileSize' in self.stream_file: required_bytes = int((4 * int(self.stream_file["fileSize"])) - calcDiskFree()) @@ -3018,11 +3098,11 @@ def streamP4FilesCb(self, marshalled): # ignore errors, but make sure it exits first self.importProcess.wait() if f: - die("Error from p4 print for %s: %s" % (f, err)) + die("Error from p4 print for %s: %s" % (path_as_string(f), err)) else: die("Error from p4 print: %s" % err) - if 'depotFile' in marshalled and self.stream_have_file_info: + if b'depotFile' in marshalled and self.stream_have_file_info: # start of a new file - output the old one first self.streamOneP4File(self.stream_file, self.stream_contents) self.stream_file = {} @@ -3032,13 +3112,16 @@ def streamP4FilesCb(self, marshalled): # pick up the new file information... 
for the # 'data' field we need to append to our array for k in list(marshalled.keys()): - if k == 'data': + if k == b'data': if 'streamContentSize' not in self.stream_file: self.stream_file['streamContentSize'] = 0 - self.stream_file['streamContentSize'] += len(marshalled['data']) - self.stream_contents.append(marshalled['data']) + self.stream_file['streamContentSize'] += len(marshalled[b'data']) + self.stream_contents.append(marshalled[b'data']) else: - self.stream_file[k] = marshalled[k] + if k == b'depotFile': + self.stream_file[as_string(k)] = marshalled[k] + else: + self.stream_file[as_string(k)] = as_string(marshalled[k]) if (verbose and 'streamContentSize' in self.stream_file and @@ -3047,13 +3130,14 @@ def streamP4FilesCb(self, marshalled): size = int(self.stream_file["fileSize"]) if size > 0: progress = 100.0*self.stream_file['streamContentSize']/size - sys.stdout.write('\r%s %4.1f%% (%i MB)' % (self.stream_file['depotFile'], progress, int(size//1024//1024))) + sys.stdout.write('\r%s %4.1f%% (%i MB)' % (path_as_string(self.stream_file['depotFile']), progress, int(size//1024//1024))) sys.stdout.flush() self.stream_have_file_info = True - # Stream directly from "p4 files" into "git fast-import" def streamP4Files(self, files): + """ Stream directly from "p4 files" into "git fast-import" + """ filesForCommit = [] filesToRead = [] filesToDelete = [] @@ -3074,7 +3158,7 @@ def streamP4Files(self, files): self.stream_contents = [] self.stream_have_file_info = False - # curry self argument + # Callback for P4 command to collect file content def streamP4FilesCbSelf(entry): self.streamP4FilesCb(entry) @@ -3083,9 +3167,9 @@ def streamP4FilesCbSelf(entry): if 'shelved_cl' in f: # Handle shelved CLs using the "p4 print file@=N" syntax to print # the contents - fileArg = '%s@=%d' % (f['path'], f['shelved_cl']) + fileArg = b'%s@=%d' % (f['path'], as_bytes(f['shelved_cl'])) else: - fileArg = '%s#%s' % (f['path'], f['rev']) + fileArg = b'%s#%s' % (f['path'], 
as_bytes(f['rev'])) fileArgs.append(fileArg) @@ -3105,7 +3189,7 @@ def make_email(self, userid): def streamTag(self, gitStream, labelName, labelDetails, commit, epoch): """ Stream a p4 tag. - commit is either a git commit, or a fast-import mark, ":" + commit is either a git commit, or a fast-import mark, ":" """ if verbose: @@ -3177,7 +3261,22 @@ def commit(self, details, files, branch, parent = "", allow_empty=False): .format(details['change'])) return + # fast-import: + #'commit' SP LF + #mark? + #original-oid? + #('author' (SP )? SP LT GT SP LF)? + #'committer' (SP )? SP LT GT SP LF + #('encoding' SP )? + #data + #('from' SP LF)? + #('merge' SP LF)* + #(filemodify | filedelete | filecopy | filerename | filedeleteall | notemodify)* + #LF? + + #'commit' - is the name of the branch to make the commit on self.gitStream.write("commit %s\n" % branch) + #'mark' SP : self.gitStream.write("mark :%s\n" % details["change"]) self.committedChanges.add(int(details["change"])) committer = "" @@ -3187,19 +3286,29 @@ def commit(self, details, files, branch, parent = "", allow_empty=False): self.gitStream.write("committer %s\n" % committer) - self.gitStream.write("data < 0: - self.gitStream.write("\nJobs: %s" % (' '.join(jobs))) - + commitText += "\nJobs: %s" % (' '.join(jobs)) if not self.suppress_meta_comment: - self.gitStream.write("\n[git-p4: depot-paths = \"%s\": change = %s" % - (','.join(self.branchPrefixes), details["change"])) - if len(details['options']) > 0: - self.gitStream.write(": options = %s" % details['options']) - self.gitStream.write("]\n") + # coherce the path to the correct formatting in the branch prefixes as well. 
+ dispPaths = [] + for p in self.branchPrefixes: + dispPaths += [path_as_string(p)] - self.gitStream.write("EOT\n\n") + commitText += ("\n[git-p4: depot-paths = \"%s\": change = %s" % + (','.join(dispPaths), details["change"])) + if len(details['options']) > 0: + commitText += (": options = %s" % details['options']) + commitText += "]" + commitText += "\n" + self.gitStream.write("data %s\n" % len(as_bytes(commitText))) + self.gitStream.write(commitText) + self.gitStream.write("\n") if len(parent) > 0: if self.verbose: @@ -3606,30 +3715,35 @@ def sync_origin_only(self): system("git fetch origin") def importHeadRevision(self, revision): - print("Doing initial import of %s from revision %s into %s" % (' '.join(self.depotPaths), revision, self.branch)) - + # Re-encode depot text + dispPaths = [] + utf8Paths = [] + for p in self.depotPaths: + dispPaths += [path_as_string(p)] + print("Doing initial import of %s from revision %s into %s" % (' '.join(dispPaths), revision, self.branch)) details = {} details["user"] = "git perforce import user" - details["desc"] = ("Initial import of %s from the state at revision %s\n" - % (' '.join(self.depotPaths), revision)) + details["desc"] = ("Initial import of %s from the state at revision %s\n" % + (' '.join(dispPaths), revision)) details["change"] = revision newestRevision = 0 + del dispPaths fileCnt = 0 fileArgs = ["%s...%s" % (p,revision) for p in self.depotPaths] - for info in p4CmdList(["files"] + fileArgs): + for info in p4CmdList(["files"] + fileArgs, encode_data = False): - if 'code' in info and info['code'] == 'error': + if 'code' in info and info['code'] == b'error': sys.stderr.write("p4 returned an error: %s\n" - % info['data']) - if info['data'].find("must refer to client") >= 0: + % as_string(info['data'])) + if info['data'].find(b"must refer to client") >= 0: sys.stderr.write("This particular p4 error is misleading.\n") sys.stderr.write("Perhaps the depot path was misspelled.\n"); sys.stderr.write("Depot path: %s\n" % " 
".join(self.depotPaths)) sys.exit(1) if 'p4ExitCode' in info: - sys.stderr.write("p4 exitcode: %s\n" % info['p4ExitCode']) + sys.stderr.write("p4 exitcode: %s\n" % as_string(info['p4ExitCode'])) sys.exit(1) @@ -3642,8 +3756,10 @@ def importHeadRevision(self, revision): #fileCnt = fileCnt + 1 continue + # Save all the file information, howerver do not translate the depotFile name at + # this time. Leave that as bytes since the encoding may vary. for prop in ["depotFile", "rev", "action", "type" ]: - details["%s%s" % (prop, fileCnt)] = info[prop] + details["%s%s" % (prop, fileCnt)] = (info[prop] if prop == "depotFile" else as_string(info[prop])) fileCnt = fileCnt + 1 @@ -3663,13 +3779,18 @@ def importHeadRevision(self, revision): print(self.gitError.read()) def openStreams(self): + """ Opens the fast import pipes. Note that the git* streams are wrapped + to expect Unicode text. To send a raw byte Array, use the importProcess + underlying port + """ self.importProcess = subprocess.Popen(["git", "fast-import"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE); - self.gitOutput = self.importProcess.stdout - self.gitStream = self.importProcess.stdin - self.gitError = self.importProcess.stderr + self.gitOutput = Py23File(self.importProcess.stdout, verbose = self.verbose) + self.gitStream = Py23File(self.importProcess.stdin, verbose = self.verbose) + self.gitError = Py23File(self.importProcess.stderr, verbose = self.verbose) + self.gitStreamBytes = self.importProcess.stdin def closeStreams(self): self.gitStream.close() @@ -4035,15 +4156,17 @@ def run(self, args): self.cloneDestination = depotPaths[-1] depotPaths = depotPaths[:-1] + dispPaths = [] for p in depotPaths: if not p.startswith("//"): sys.stderr.write('Depot paths must start with "//": %s\n' % p) return False + dispPaths += [path_as_string(p)] if not self.cloneDestination: self.cloneDestination = self.defaultDestination(args) - print("Importing from %s into %s" % (', '.join(depotPaths), 
self.cloneDestination)) + print("Importing from %s into %s" % (', '.join(dispPaths), path_as_string(self.cloneDestination))) if not os.path.exists(self.cloneDestination): os.makedirs(self.cloneDestination) From patchwork Wed Dec 4 22:29:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11273695 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2627B930 for ; Wed, 4 Dec 2019 22:29:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EEAFF206DB for ; Wed, 4 Dec 2019 22:29:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kNmxSfrI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728459AbfLDW35 (ORCPT ); Wed, 4 Dec 2019 17:29:57 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:37016 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728421AbfLDW3v (ORCPT ); Wed, 4 Dec 2019 17:29:51 -0500 Received: by mail-wr1-f68.google.com with SMTP id w15so1132945wru.4 for ; Wed, 04 Dec 2019 14:29:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=LG7UrrQD+Cb9wX5rcQaDY8ql7td6YtytjUpvKfIFZj4=; b=kNmxSfrIzMqLzv6hfmRsGa+SMb5Ng91pSQysQ6yggVgk0o4mFPUvkDQHjJiGNz9Lwv qqJtzMJ/4CrvrJ86FKplAyVxkc7JVNqfvJFurA0WnCvQcsBLK3hsqn3341dFhkf9k/ug GW/OcN4NcEgwnryZ7LeLWtTRonnPjwlLu+eRmxbNVnJsd0pxqPEWZEaGWQbkhjm8dmMr DqwbRT0wr11IWzvbJ2uqT8bXiWeTlo2jWaDumSfDriDrhxqojofKK1pmXk2kwABoZCi2 3Wjw2Av+WzfaUqVM7iIlMjNqbEtorVSc8H2tPFC1MvpQ89sCna7DISuM35j2cSFhoGXJ 4VZA== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=LG7UrrQD+Cb9wX5rcQaDY8ql7td6YtytjUpvKfIFZj4=; b=DjiK5bFh7ZgjzUx0CmYnDw5z375dDavLVBKb/QPYjj8gUl+JfKaqdDems4mXOqlNkQ WZIbdRweh0MdREN9l0eLMVT+cnKoxnOfHouqevhCgamXNDycj7NRjlhJwzD9+sejjx/U jXfe1+YiX6nOuKYLSm2fDiqdophdUq2EP074Ga1FhNNWkZZ93H74+ktl0PETsydSZw0x ivpokkNkQKpbspx0m4KqQr6wfPFh2FwiD1uqGUi8ZB4Ez7jgMf8WYGqZTj7TRvDolopR Yt/z0yP2aIU2plCcQD+1+EMdelVOrt32CBFVRmBHnjPAPWOf6izTHscfZiV3nDZk7d0d XOLA== X-Gm-Message-State: APjAAAXN8dyakfYZiCnEE/sNA5FaYuxWrAdYFbPmCWp3NYDwM+Jx1f3Y aTFEZiVlHzT5oyadyoKqnMNz6Ttr X-Google-Smtp-Source: APXvYqzTDbA0PUtNT7NX6v+SqbfAN8V8V48n/4VkIVPplbBEabkEelqj8mfJXW7mXyZbhqwJ6FORTQ== X-Received: by 2002:a5d:49c7:: with SMTP id t7mr6415637wrs.369.1575498588365; Wed, 04 Dec 2019 14:29:48 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h2sm9881290wrv.66.2019.12.04.14.29.47 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 04 Dec 2019 14:29:47 -0800 (PST) Message-Id: <883ef45ca5476c6ff412bf4f95b0e7b50c58338b.1575498578.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Ben Keene via GitGitGadget" Date: Wed, 04 Dec 2019 22:29:37 +0000 Subject: [PATCH v4 11/11] git-p4: Added --encoding parameter to p4 clone Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Ben Keene , Junio C Hamano , Ben Keene Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Ben Keene The test t9822 did not have any tests that had encoded a directory name in ISO8859-1. Additionally, to make it easier for the user to clone new repositories with a non-UTF-8 encoded path in P4, add a new parameter to p4clone "--encoding" that sets git-p4.pathEncoding. Add new tests that use ISO8859-1 encoded text in both the directory and file names.
Update the View class in the git-p4 code to convert text with as_string(), except for depot paths and filenames. Update the documentation to include the new command-line parameter for p4clone. Signed-off-by: Ben Keene (cherry picked from commit e26f6309d60c6c1615320d4a9071935e23efe6fb) --- Documentation/git-p4.txt | 5 ++ git-p4.py | 61 +++++++++++++------ t/t9822-git-p4-path-encoding.sh | 101 ++++++++++++++++++++++++++++++++ 3 files changed, 149 insertions(+), 18 deletions(-) diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index 3494a1db3e..f54af3c917 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -305,6 +305,11 @@ options described above. --bare:: Perform a bare clone. See linkgit:git-clone[1]. +--encoding :: + Optionally sets the git-p4.pathEncoding configuration value in + the newly created Git repository before files are synchronized + from P4. See git-p4.pathEncoding for more information. + Submit options ~~~~~~~~~~~~~~ These options can be used to modify 'git p4 submit' behavior. 
diff --git a/git-p4.py b/git-p4.py index 05db2ec657..1f2e43430a 100755 --- a/git-p4.py +++ b/git-p4.py @@ -1228,7 +1228,7 @@ def getClientSpec(): """Look at the p4 client spec, create a View() object that contains all the mappings, and return it.""" - specList = p4CmdList("client -o") + specList = p4CmdList("client -o", encode_data=False) if len(specList) != 1: die('Output from "client -o" is %d lines, expecting 1' % len(specList)) @@ -1237,7 +1237,7 @@ def getClientSpec(): entry = specList[0] # the //client/ name - client_name = entry["Client"] + client_name = as_string(entry["Client"]) # just the keys that start with "View" view_keys = [ k for k in list(entry.keys()) if k.startswith("View") ] @@ -2637,19 +2637,25 @@ def run(self, args): return True class View(object): - """Represent a p4 view ("p4 help views"), and map files in a - repo according to the view.""" + """ Represent a p4 view ("p4 help views"), and map files in a + repo according to the view. + """ def __init__(self, client_name): self.mappings = [] - self.client_prefix = "//%s/" % client_name + # the client prefix is saved in bytes as it is used for comparison + # against server data. + self.client_prefix = as_bytes("//%s/" % client_name) # cache results of "p4 where" to lookup client file locations self.client_spec_path_cache = {} def append(self, view_line): - """Parse a view line, splitting it into depot and client - sides. Append to self.mappings, preserving order. This - is only needed for tag creation.""" + """ Parse a view line, splitting it into depot and client + sides. Append to self.mappings, preserving order. This + is only needed for tag creation. + + view_line should be in bytes (depot path encoding) + """ # Split the view line into exactly two words. P4 enforces # structure on these lines that simplifies this quite a bit. @@ -2662,28 +2668,28 @@ def append(self, view_line): # The line is already white-space stripped. # The two words are separated by a single space. 
# - if view_line[0] == '"': + if view_line[0:1] == b'"': # First word is double quoted. Find its end. - close_quote_index = view_line.find('"', 1) + close_quote_index = view_line.find(b'"', 1) if close_quote_index <= 0: - die("No first-word closing quote found: %s" % view_line) + die("No first-word closing quote found: %s" % path_as_string(view_line)) depot_side = view_line[1:close_quote_index] # skip closing quote and space rhs_index = close_quote_index + 1 + 1 else: - space_index = view_line.find(" ") + space_index = view_line.find(b" ") if space_index <= 0: - die("No word-splitting space found: %s" % view_line) + die("No word-splitting space found: %s" % path_as_string(view_line)) depot_side = view_line[0:space_index] rhs_index = space_index + 1 # prefix + means overlay on previous mapping - if depot_side.startswith("+"): + if depot_side.startswith(b"+"): depot_side = depot_side[1:] # prefix - means exclude this path, leave out of mappings exclude = False - if depot_side.startswith("-"): + if depot_side.startswith(b"-"): exclude = True depot_side = depot_side[1:] @@ -2694,7 +2700,7 @@ def convert_client_path(self, clientFile): # chop off //client/ part to make it relative if not clientFile.startswith(self.client_prefix): die("No prefix '%s' on clientFile '%s'" % - (self.client_prefix, clientFile)) + (as_string(self.client_prefix), path_as_string(clientFile))) return clientFile[len(self.client_prefix):] def update_client_spec_path_cache(self, files): @@ -2706,9 +2712,9 @@ def update_client_spec_path_cache(self, files): if len(fileArgs) == 0: return # All files in cache - where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs) + where_result = p4CmdList(["-x", "-", "where"], stdin=fileArgs, encode_data=False) for res in where_result: - if "code" in res and res["code"] == "error": + if "code" in res and res["code"] == b"error": # assume error is "... 
file(s) not in client view" continue if "clientFile" not in res: @@ -4125,10 +4131,14 @@ def __init__(self): help="where to leave result of the clone"), optparse.make_option("--bare", dest="cloneBare", action="store_true", default=False), + optparse.make_option("--encoding", dest="setPathEncoding", + action="store", default=None, + help="Sets the path encoding for this depot") ] self.cloneDestination = None self.needsGit = False self.cloneBare = False + self.setPathEncoding = None def defaultDestination(self, args): """ Returns the last path component as the default git @@ -4152,6 +4162,14 @@ def run(self, args): depotPaths = args + # If an encoding was provided, ignore what may already exist + # in the configuration. This ensures that displayed values + # use the correct encoding. + if self.setPathEncoding: + gitConfigSet("git-p4.pathEncoding", self.setPathEncoding) + + # If more than 1 path element is supplied, the last element + # is the clone destination. if not self.cloneDestination and len(depotPaths) > 1: self.cloneDestination = depotPaths[-1] depotPaths = depotPaths[:-1] @@ -4179,6 +4197,13 @@ def run(self, args): if retcode: raise CalledProcessError(retcode, init_cmd) + # Set the encoding if it was provided on the command line + if self.setPathEncoding: + init_cmd = ["git", "config", "git-p4.pathEncoding", self.setPathEncoding] + retcode = subprocess.call(init_cmd) + if retcode: + raise CalledProcessError(retcode, init_cmd) + if not P4Sync.run(self, depotPaths): return False diff --git a/t/t9822-git-p4-path-encoding.sh b/t/t9822-git-p4-path-encoding.sh index 572d395498..cf8a15b2e4 100755 --- a/t/t9822-git-p4-path-encoding.sh +++ b/t/t9822-git-p4-path-encoding.sh @@ -4,9 +4,20 @@ test_description='Clone repositories with non ASCII paths' . 
./lib-git-p4.sh +# lowercase filename +# UTF8 - HEX: a-\xc3\xa4_o-\xc3\xb6_u-\xc3\xbc +# - octal: a-\303\244_o-\303\266_u-\303\274 +# ISO8859 - HEX: a-\xe4_o-\xf6_u-\xfc UTF8_ESCAPED="a-\303\244_o-\303\266_u-\303\274.txt" ISO8859_ESCAPED="a-\344_o-\366_u-\374.txt" +# lowercase directory +# UTF8 - HEX: dir_a-\xc3\xa4_o-\xc3\xb6_u-\xc3\xbc +# ISO8859 - HEX: dir_a-\xe4_o-\xf6_u-\xfc +DIR_UTF8_ESCAPED="dir_a-\303\244_o-\303\266_u-\303\274" +DIR_ISO8859_ESCAPED="dir_a-\344_o-\366_u-\374" + + ISO8859="$(printf "$ISO8859_ESCAPED")" && echo content123 >"$ISO8859" && rm "$ISO8859" || { @@ -58,6 +69,22 @@ test_expect_success 'Clone repo containing iso8859-1 encoded paths with git-p4.p ) ' +test_expect_success 'Clone repo containing iso8859-1 encoded paths using the --encoding parameter' ' + test_when_finished cleanup_git && + ( + git p4 clone --encoding iso8859 --destination="$git" //depot && + cd "$git" && + UTF8="$(printf "$UTF8_ESCAPED")" && + echo "$UTF8" >expect && + git -c core.quotepath=false ls-files >actual && + test_cmp expect actual && + + echo content123 >expect && + cat "$UTF8" >actual && + test_cmp expect actual + ) +' + test_expect_success 'Delete iso8859-1 encoded paths and clone' ' ( cd "$cli" && @@ -74,4 +101,78 @@ test_expect_success 'Delete iso8859-1 encoded paths and clone' ' ) ' +# These tests create a directory with ISO8859-1 characters in both the +# directory name and the file name. Since it is possible to clone a path instead of using +# the whole client-spec, 
check both versions: with the client-spec and with a direct +# path using --encoding. +test_expect_success 'Create a repo containing iso8859-1 encoded directory and filename' ' + ( + DIR_ISO8859="$(printf "$DIR_ISO8859_ESCAPED")" && + ISO8859="$(printf "$ISO8859_ESCAPED")" && + cd "$cli" && + mkdir "$DIR_ISO8859" && + cd "$DIR_ISO8859" && + echo content123 >"$ISO8859" && + p4 add "$ISO8859" && + p4 submit -d "test commit (encoded directory)" + ) +' + +test_expect_success 'Clone repo containing iso8859-1 encoded depot path and files with git-p4.pathEncoding' ' + test_when_finished cleanup_git && + ( + DIR_ISO8859="$(printf "$DIR_ISO8859_ESCAPED")" && + DIR_UTF8="$(printf "$DIR_UTF8_ESCAPED")" && + cd "$git" && + git init . && + git config git-p4.pathEncoding iso8859-1 && + git p4 clone --use-client-spec --destination="$git" "//depot/$DIR_ISO8859" && + cd "$DIR_UTF8" && + UTF8="$(printf "$UTF8_ESCAPED")" && + echo "$UTF8" >expect && + git -c core.quotepath=false ls-files >actual && + test_cmp expect actual && + + echo content123 >expect && + cat "$UTF8" >actual && + test_cmp expect actual + ) +' + +test_expect_success 'Clone repo containing iso8859-1 encoded depot path and files with git-p4.pathEncoding, without --use-client-spec' ' + test_when_finished cleanup_git && + ( + DIR_ISO8859="$(printf "$DIR_ISO8859_ESCAPED")" && + cd "$git" && + git init . 
&& + git config git-p4.pathEncoding iso8859-1 && + git p4 clone --destination="$git" "//depot/$DIR_ISO8859" && + UTF8="$(printf "$UTF8_ESCAPED")" && + echo "$UTF8" >expect && + git -c core.quotepath=false ls-files >actual && + test_cmp expect actual && + + echo content123 >expect && + cat "$UTF8" >actual && + test_cmp expect actual + ) +' + +test_expect_success 'Clone repo containing iso8859-1 encoded depot path and files using the --encoding parameter' ' + test_when_finished cleanup_git && + ( + DIR_ISO8859="$(printf "$DIR_ISO8859_ESCAPED")" && + git p4 clone --encoding iso8859 --destination="$git" "//depot/$DIR_ISO8859" && + cd "$git" && + UTF8="$(printf "$UTF8_ESCAPED")" && + echo "$UTF8" >expect && + git -c core.quotepath=false ls-files >actual && + test_cmp expect actual && + + echo content123 >expect && + cat "$UTF8" >actual && + test_cmp expect actual + ) +' + test_done
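For readers following the tests above, the relationship between the ISO8859-1 and UTF-8 byte sequences listed in the t9822 comments can be sketched in Python. This is only an illustration under stated assumptions: depot_path_to_utf8 is a hypothetical helper that is not part of git-p4.py, and the default of "iso8859-1" stands in for whatever git-p4.pathEncoding would be set to. It passes through bytes that already decode as UTF-8 and otherwise re-encodes them from the given encoding.

```python
# Hypothetical helper for illustration only; it is not part of git-p4.py.
def depot_path_to_utf8(raw_path, path_encoding="iso8859-1"):
    """Return raw_path (bytes in the depot's encoding) re-encoded as UTF-8."""
    try:
        # Bytes that already decode as UTF-8 are returned unchanged.
        raw_path.decode("utf-8")
        return raw_path
    except UnicodeDecodeError:
        # Otherwise interpret the bytes using the configured path encoding
        # (assumed here to mirror the value of git-p4.pathEncoding).
        return raw_path.decode(path_encoding).encode("utf-8")


# Byte sequences taken from the comments in t9822-git-p4-path-encoding.sh.
ISO8859_NAME = b"a-\xe4_o-\xf6_u-\xfc.txt"
UTF8_NAME = b"a-\xc3\xa4_o-\xc3\xb6_u-\xc3\xbc.txt"

converted = depot_path_to_utf8(ISO8859_NAME)
print(converted == UTF8_NAME)
print(depot_path_to_utf8(UTF8_NAME) == UTF8_NAME)
```

Both comparisons are expected to hold, which is the same property the tests check by comparing the output of "git ls-files" against the UTF-8 form of the name.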