From patchwork Fri Apr 12 08:03:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thalia Archibald
X-Patchwork-Id: 13627097
Date: Fri, 12 Apr 2024 08:03:30 +0000
To: git@vger.kernel.org
From: Thalia Archibald
Cc: Junio C Hamano, Patrick Steinhardt, Chris Torek, Elijah Newren,
 Thalia Archibald
Subject: [PATCH v4 5/8] fast-import: improve documentation for path quoting
References: <20240322000304.76810-1-thalia@archibald.dev>
X-Mailing-List: git@vger.kernel.org

The documentation describes which characters cannot appear in an
unquoted path, but not their semantics. Reframe it as a definition of
unquoted paths. From the parser's perspective, whether a path starts
with `"` determines whether it is parsed as quoted or unquoted.

The restrictions on characters in unquoted paths (a leading `"`, LF,
and spaces) are explained in the quoted paragraph. Move them to the
unquoted paragraph and reword.

The restriction that the source paths of filecopy and filerename cannot
contain SP unless quoted is only stated in their respective sections.
Restate it in the <path> section.

Signed-off-by: Thalia Archibald
---
 Documentation/git-fast-import.txt | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index b2607366b9..1882758b8a 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -630,18 +630,24 @@ in octal. Git only supports the following modes:
 In both formats `<path>` is the complete path of the file to be added
 (if not already existing) or modified (if already existing).
 
-A `<path>` string must use UNIX-style directory separators (forward
-slash `/`), may contain any byte other than `LF`, and must not
-start with double quote (`"`).
+A `<path>` can be written as unquoted bytes or a C-style quoted string.
 
-A path can use C-style string quoting; this is accepted in all cases
-and mandatory if the filename starts with double quote or contains
-`LF`. In C-style quoting, the complete name should be surrounded with
-double quotes, and any `LF`, backslash, or double quote characters
-must be escaped by preceding them with a backslash (e.g.,
-`"path/with\n, \\ and \" in it"`).
+When a `<path>` does not start with a double quote (`"`), it is an
+unquoted string and is parsed as literal bytes without any escape
+sequences. However, if the filename contains `LF` or starts with double
+quote, it cannot be represented as an unquoted string and must be
+quoted. Additionally, the source `<path>` in `filecopy` or `filerename`
+must be quoted if it contains SP.
 
-The value of `<path>` must be in canonical form. That is it must not:
+When a `<path>` starts with a double quote (`"`), it is a C-style quoted
+string, where the complete filename is enclosed in a pair of double
+quotes and escape sequences are used. Certain characters must be escaped
+by preceding them with a backslash: `LF` is written as `\n`, backslash
+as `\\`, and double quote as `\"`. All filenames can be represented as
+quoted strings.
+
+A `<path>` must use UNIX-style directory separators (forward slash `/`)
+and its value must be in canonical form. That is it must not:
 
 * contain an empty directory component (e.g. `foo//bar` is invalid),
 * end with a directory separator (e.g. `foo/` is invalid),
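
As an illustration of the quoting rules in the hunk above, here is a minimal sketch in Python of how a frontend that generates a fast-import stream might decide when to quote a path. The helper name `quote_path` and its `is_copy_or_rename_source` parameter are hypothetical and are not part of Git. The sketch covers only the three escapes the documentation names (`\n`, `\\`, `\"`) and does not check that the path is in canonical form.

```python
def quote_path(path: bytes, is_copy_or_rename_source: bool = False) -> bytes:
    """Return `path` as it could be written in a fast-import stream.

    Hypothetical helper: applies only the rules stated in the
    documentation text above, nothing more.
    """
    # An unquoted path is read as literal bytes, so it cannot start
    # with a double quote or contain LF. The source path of filecopy
    # and filerename additionally cannot contain SP unless quoted.
    needs_quoting = (
        path.startswith(b'"')
        or b"\n" in path
        or (is_copy_or_rename_source and b" " in path)
    )
    if not needs_quoting:
        return path

    # Escape backslashes first so the backslashes introduced by the
    # later replacements are not escaped a second time.
    escaped = (
        path.replace(b"\\", b"\\\\")
        .replace(b'"', b'\\"')
        .replace(b"\n", b"\\n")
    )
    return b'"' + escaped + b'"'
```

With this sketch, `quote_path(b"my file")` is left unquoted, while `quote_path(b"my file", is_copy_or_rename_source=True)` is wrapped in double quotes, which mirrors the extra restriction on `filecopy` and `filerename` sources.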