Message ID | 1c1b58e20cab6b4989b140282353073165f0067e.1721762306.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | git-prompt: support more shells | expand |
"Avi Halachmi (:avih) via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: "Avi Halachmi (:avih)" <avihpit@yahoo.com> > > $'...' is new in POSIX (2024), and some shells support it in recent > versions, while others have had it for decades (bash, zsh, ksh93). I will not look at this series futher during the current development cycle that is about to conclude, but ... > +__git_SOH=$'\1' __git_STX=$'\2' __git_ESC=$'\33' > +__git_LF=$'\n' __git_CRLF=$'\r\n' > + > +if [ $'\101' != A ]; then # fallback for shells without $'...' > + __git_CRLF=$(printf "\r\n\1\2\33") # CR LF SOH STX ESC > + __git_ESC=${__git_CRLF#????}; __git_CRLF=${__git_CRLF%?} > + __git_STX=${__git_CRLF#???}; __git_CRLF=${__git_CRLF%?} > + __git_SOH=${__git_CRLF#??}; __git_CRLF=${__git_CRLF%?} > + __git_LF=${__git_CRLF#?} > +fi ... given that these are not used literally in-place but are always referred to by their __git_BYTE names, if we are making this script portable across shells to the same degree as other shell scripts following our coding guidelines, I would prefer to see it done without any "fallback". $(printf '\r') would work with bash, zsh and ksh93, too, and one time assignment to these variables is not going to be performance critical. Just forbid use of $'\octal' notation in the coding guidelines document, and implement just one variant. Perhaps we should spell more things out that you wrote in some of your proposed log messages more explicitly. I think these have been rules we have followed (grep for them in *.sh files outside contrib/) but I did not find mention in the guidelines document. Thanks. Documentation/CodingGuidelines | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git c/Documentation/CodingGuidelines w/Documentation/CodingGuidelines index 1d92b2da03..bb058fcc87 100644 --- c/Documentation/CodingGuidelines +++ w/Documentation/CodingGuidelines @@ -107,6 +107,8 @@ For shell scripts specifically (not exhaustive): - We do not use Process Substitution <(list) or >(list). + - We do not use Dollar-Single-Quotes $'<octal>' notation. + - Do not write control structures on a single line with semicolon. "then" should be on the next line for if statements, and "do" should be on the next line for "while" and "for". @@ -140,7 +142,8 @@ For shell scripts specifically (not exhaustive): sort >actual && ... - - We prefer "test" over "[ ... ]". + - We prefer "test" over "[ ... ]". Never use "[[ ... ]]" unless in a + script only meant for bash. - We do not write the noiseword "function" in front of shell functions.
On Tuesday, July 23, 2024 at 11:25:29 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: > > $'...' is new in POSIX (2024), and some shells support it in recent > > versions, while others have had it for decades (bash, zsh, ksh93). > I will not look at this series futher during the current development > cycle that is about to conclude, but ... Thanks. I'm happy to continue whenever others have the bandwidth. > > +__git_SOH=$'\1' __git_STX=$'\2' __git_ESC=$'\33' > > +__git_LF=$'\n' __git_CRLF=$'\r\n' > > + > > +if [ $'\101' != A ]; then # fallback for shells without $'...' > > + __git_CRLF=$(printf "\r\n\1\2\33") # CR LF SOH STX ESC > > + __git_ESC=${__git_CRLF#????}; __git_CRLF=${__git_CRLF%?} > > + __git_STX=${__git_CRLF#???}; __git_CRLF=${__git_CRLF%?} > > + __git_SOH=${__git_CRLF#??}; __git_CRLF=${__git_CRLF%?} > > + __git_LF=${__git_CRLF#?} > > +fi > ... I would prefer to see it done without any "fallback". That's fair. I'll change it. > $(printf '\r') would work with bash, zsh and ksh93, too, and one > time assignment to these variables is not going to be performance > critical. Generally true, and also mostly non-generally as far as performace goes, though personally I'd prefer to avoid command substitution in this context if possible, as even a single one can have non-negligible impact in (very) hot scripts. But it would still be very small, and doesn't matter with this file. > Just forbid use of $'\octal' notation in the coding > guidelines document, and implement just one variant. Agreed, and the CodingGuidelines patch LGTM. However, off the top of my head I wouldn't know how this variant should look like. This one printf and splitting it later is a bit meh to be used in every script which needs control literals, but I also don't have anything cleaner off the top of my head. > Perhaps we should spell more things out that you wrote in some of > your proposed log messages more explicitly. I think these have been > rules we have followed (grep for them in *.sh files outside > contrib/) but I did not find mention in the guidelines document. If I can help with this, let me know how. > Thanks. My pleasure.
On Wednesday, July 24, 2024 at 05:08:54 AM GMT+3, avih <avihpit@yahoo.com> wrote: > On Tuesday, July 23, 2024 at 11:25:29 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: > > > > > > +__git_SOH=$'\1' __git_STX=$'\2' __git_ESC=$'\33' > > > +__git_LF=$'\n' __git_CRLF=$'\r\n' > > > + > > > +if [ $'\101' != A ]; then # fallback for shells without $'...' > > > + __git_CRLF=$(printf "\r\n\1\2\33") # CR LF SOH STX ESC > > > + __git_ESC=${__git_CRLF#????}; __git_CRLF=${__git_CRLF%?} > > > + __git_STX=${__git_CRLF#???}; __git_CRLF=${__git_CRLF%?} > > > + __git_SOH=${__git_CRLF#??}; __git_CRLF=${__git_CRLF%?} > > > + __git_LF=${__git_CRLF#?} > > > +fi > > > ... I would prefer to see it done without any "fallback". > > Just forbid use of $'\octal' notation in the coding > > guidelines document, and implement just one variant. > > ... off the top of my head I wouldn't know how this variant > should look like. This one printf and splitting it later is a bit > meh to be used in every script which needs control literals, but > I also don't have anything cleaner off the top of my head. I think I misinterpreted the scope of "variant". I thought it meant, across scripts which need to use $'...', but now I realize it meant between using $'...' and using the fallback in this patch. So mainly as a general solution, but also applicable to this patch, below is my best generalized solution so far, so that scripts don't have to reinvent the wheel with this "string strip dance" above, but I'm not too happy with it, mainly due to the gotcha that single quotes in the value break the world (escape the "eval"). So unless others prefer this solution or have other ideas, I think it's best to keep the existing "strip dance" which the patch already has, but make it the only variant instead of being a fallback. Generalized solution (without namespace-ification): # assign_as_fmt NAME=FMT ... --> NAME=$(printf FMT) ... # - works also if the output ends in newline # - NAME must be a valid var name # - FMT _must_ not include/output single quotes (escapes eval) # - best used like $'..': x=$'\r\n' -> assign_as_fmt x='\r\n' # (which also avoids accidental single quotes, but not '\047') assign_as_fmt () { # accumulate " NAME='FMT'" from each "NAME=FMT" fmt= # ignore non-locality for now while [ "${1+x}" ]; do fmt="$fmt ${1%%=*}='${1#*=}'" shift done eval "$(printf "$fmt")" # FMTs become values, eval'ed } assign_as_fmt \ __git_SOH='\1' __git_STX='\2' __git_ESC='\33' \ __git_LF='\n' __git_CRLF='\r\n'
On Thursday, July 25, 2024 at 02:27:29 PM GMT+3, avih <avihpit@yahoo.com> wrote: > > So mainly as a general solution, but also applicable to this patch, > below is my best generalized solution so far, so that scripts don't > have to reinvent the wheel with this "string strip dance" above, > but I'm not too happy with it, mainly due to the gotcha that single > quotes in the value break the world (escape the "eval"). Pardon the noise. To summarize the options to replace $'...' portably, like: __git_SOH=$'\1' __git_STX=$'\2' __git_ESC=$'\33' __git_LF=$'\n' __git_CRLF=$'\r\n' The current patch has this, which is not fun and not scalable: __git_CRLF=$(printf "\r\n\1\2\33") # CR LF SOH STX ESC __git_ESC=${__git_CRLF#????}; __git_CRLF=${__git_CRLF%?} __git_STX=${__git_CRLF#???}; __git_CRLF=${__git_CRLF%?} __git_SOH=${__git_CRLF#??}; __git_CRLF=${__git_CRLF%?} __git_LF=${__git_CRLF#?} If performance is not important, this works (with care about \n): __git_LF=$(printf "\nx"); __git_LF=${__git_LF%x} __git_STX=$(printf '\1') ... A previous message suggested this, which has a beautiful API, but it requires an additional non-tiny function, and it also hides a great risk of escaping "eval": assign_as_fmt () { # hides the usage of "eval" ... } assign_as_fmt \ __git_SOH='\1' __git_STX='\2' __git_ESC='\33' \ __git_LF='\n' __git_CRLF='\r\n' But then I figured there's another option, which is reasonably readable, scalable, small without additional functions, but still requires some care, though the risk is not hidden and easy to avoid: It's basically what the function does, but without a function: (double quotes required only if it ends in \n, or for uniformity) # doubel-check to ensure the printf output is valid shell input eval "$(printf ' __git_SOH="\1" __git_STX="\2" __git_ESC="\33" __git_LF="\n" __git_CRLF="\r\n" ')" I think it strikes the best balance between the options, both for this patch, and possibly also as a general recomendation. So unless there are objections or better suggestions, this is what I currently prefer for this patch.
avih <avihpit@yahoo.com> writes: > Generalized solution (without namespace-ification): I think you are over-engineering this. We do not have immediate need for such facility to be used by other scripts. On the other hand, we know exactly what git-prompt wants to see available, and you already implemented them. So just losing "make assignment asuming $'blah' works, and then reassign based on what printf gave us" and always using the printf thing is what we want to see here. > assign_as_fmt \ > __git_SOH='\1' __git_STX='\2' __git_ESC='\33' \ > __git_LF='\n' __git_CRLF='\r\n' Are you sure that everybody's implementation of printf(1) is happy with \d and \dd? I am an old timer who learnt in a distant past to always spell octals as \ddd without omitting any leading 0-digit, because some was unhappy. Thanks.
avih <avihpit@yahoo.com> writes: > eval "$(printf ' > __git_SOH="\1" __git_STX="\2" __git_ESC="\33" > __git_LF="\n" __git_CRLF="\r\n" > ')" > > I think it strikes the best balance between the options, both > for this patch, and possibly also as a general recomendation. > > So unless there are objections or better suggestions, this is > what I currently prefer for this patch. Modulo my superstition against \d and \dd, the above does look very readable and hard to break. Thanks.
On Thursday, July 25, 2024 at 07:19:56 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: > I think you are over-engineering this. We do not have immediate > need for such facility to be used by other scripts. On the other > hand, we know exactly what git-prompt wants to see available, and > you already implemented them. Yes. It was a misunderstanding on my part, and it was overengineered. But because I initially misinterpreted your statement as "disallow $'...' at the guidelines, and we need to find one variant (form) to use", it got me thinking, because I wasn't very happy with the existing form at the patch either. And it did eventually lead to a solution we both liked. I'm OK taking that road, but next time I should probably wait a bit more before going pulic with half-baked ideas. > So just losing "make assignment asuming $'blah' works, and then > reassign based on what printf gave us" and always using the printf > thing is what we want to see here. Yes. > Are you sure that everybody's implementation of printf(1) is happy > with \d and \dd? I am an old timer who learnt in a distant past to > always spell octals as \ddd without omitting any leading 0-digit, > because some was unhappy. If I knew who "everybody" is, then maybe. But "learned in the distant past" does carry weight, as do existing practices. However, there seem to be almost no cases in non - /t/... files, and most of them are in git-prompt.sh. In test scripts though, it's a mixed bag. I think in decreasing order of popularity: - Always use \ddd form. - Allow less than 3 if it begins with 0, like \01, and many \0 . - Yes, \1 or \4 are fine (there are not many of those). I'll use the \ddd form you because you prefer it and it does seem the most popular (I think also outside of git codebase), and even if only for being bullet-proof against following '0'-'7' chars at the string. But back to the question of how much I'm sure, then I'm sure there are exceptions, but I couldn't find one yet. It works in all the shell-builtin-printf I have access to (~ 20), as well as gnu /bin/printf. Obviously there are many common ancestors there (esp. ash and pdksh), but they are still different codebases, and others don't share code with those, like ksh, bash, Schily, yash. As a cute data point, this printf statement, copy-pasted into a 1981 BSD 2.11 running on PDP11, prints the exact output as it does today: printf 'a="\1" b="\2" c="\33" d="\n" e="\r\n"' | od -c https://skn.noip.me/pdp11/pdp11.html ("boot RP1", user root, no pass) On Thursday, July 25, 2024 at 07:24:08 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: > avih <avihpit@yahoo.com> writes: > > eval "$(printf ' > > __git_SOH="\1" __git_STX="\2" __git_ESC="\33" > > __git_LF="\n" __git_CRLF="\r\n" > > ')" > > Modulo my superstition against \d and \dd, the above does look > very readable and hard to break. Thanks.
On Tuesday, July 23, 2024 at 11:25:29 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: > > From: "Avi Halachmi (:avih)" <avihpit@yahoo.com> > > > > $'...' is new in POSIX (2024), and some shells support it in recent > > versions, while others have had it for decades (bash, zsh, ksh93). > > I will not look at this series futher during the current development > cycle that is about to conclude, but ... Ping Reminder: I'll update this part to not-use $'...' and without fallback, but I'm currently waiting for comments on the other parts as well before I update this patch. - avih
avih <avihpit@yahoo.com> writes: > On Tuesday, July 23, 2024 at 11:25:29 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: >> > From: "Avi Halachmi (:avih)" <avihpit@yahoo.com> >> > >> > $'...' is new in POSIX (2024), and some shells support it in recent >> > versions, while others have had it for decades (bash, zsh, ksh93). >> >> I will not look at this series futher during the current development >> cycle that is about to conclude, but ... > > Ping > > Reminder: I'll update this part to not-use $'...' and without > fallback, but I'm currently waiting for comments on the other parts > as well before I update this patch. Ping for others. I do not recall having much other things to say on the series. Thanks.
On Wednesday, August 14, 2024 at 10:32:19 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: >> avih <avihpit@yahoo.com> writes: >>> On Tuesday, July 23, 2024 at 11:25:29 PM GMT+3, Junio C Hamano <gitster@pobox.com> wrote: >>> I will not look at this series futher during the current development >>> cycle that is about to conclude, but ... >> >> Ping >> >> Reminder: I'll update this part to not-use $'...' and without >> fallback, but I'm currently waiting for comments on the other parts >> as well before I update this patch. > > Ping for others. I do not recall having much other things to say on > the series. Not sure I understand. Shall I ping the other 7 parts individually? Or shall I go ahead and post the updated part 6/8 (and rebased parts 7 and 8 trivially) Slightly off topic, git-prompt.sh was not modified in master since I submitted the series, so no need to rebase the series, right?
avih <avihpit@yahoo.com> writes: >>> Ping >>> >>> Reminder: I'll update this part to not-use $'...' and without >>> fallback, but I'm currently waiting for comments on the other parts >>> as well before I update this patch. >> >> Ping for others. I do not recall having much other things to say on >> the series. > > Not sure I understand. > > Shall I ping the other 7 parts individually? No, I pinged other folks, who are reading git@vger.kernel.org, to give their reviews, because the prompt script is not exactly my area of expertise. I commented on it only because nobody else did, and because I care about portability while keeping the complexity level down. Other aspects of the series are better reviewed by others, not by me. If you have updated series, resending the whole series with as v2 (e.g. [PATCH v2 1/8]..[PATCH v2 8/8], if there is no change in the number of patches) would be good. THanks.
diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index 5d7f236fe48..bbc16417ac9 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -111,6 +111,17 @@ __git_printf_supports_v= printf -v __git_printf_supports_v -- '%s' yes >/dev/null 2>&1 +__git_SOH=$'\1' __git_STX=$'\2' __git_ESC=$'\33' +__git_LF=$'\n' __git_CRLF=$'\r\n' + +if [ $'\101' != A ]; then # fallback for shells without $'...' + __git_CRLF=$(printf "\r\n\1\2\33") # CR LF SOH STX ESC + __git_ESC=${__git_CRLF#????}; __git_CRLF=${__git_CRLF%?} + __git_STX=${__git_CRLF#???}; __git_CRLF=${__git_CRLF%?} + __git_SOH=${__git_CRLF#??}; __git_CRLF=${__git_CRLF%?} + __git_LF=${__git_CRLF#?} +fi + # stores the divergence from upstream in $p # used by GIT_PS1_SHOWUPSTREAM __git_ps1_show_upstream () @@ -118,7 +129,7 @@ __git_ps1_show_upstream () local key value local svn_remotes="" svn_url_pattern="" count n local upstream_type=git legacy="" verbose="" name="" - local LF=$'\n' + local LF="$__git_LF" # get some config options from git-config local output="$(git config -z --get-regexp '^(svn-remote\..*\.url|bash\.showupstream)$' 2>/dev/null | tr '\0\n' '\n ')" @@ -271,12 +282,16 @@ __git_ps1_colorize_gitstring () local c_lblue='%F{blue}' local c_clear='%f' else - # Using \001 and \002 around colors is necessary to prevent - # issues with command line editing/browsing/completion! - local c_red=$'\001\e[31m\002' - local c_green=$'\001\e[32m\002' - local c_lblue=$'\001\e[1;34m\002' - local c_clear=$'\001\e[0m\002' + # \001 (SOH) and \002 (STX) are 0-width substring markers + # which bash/readline identify while calculating the prompt + # on-screen width - to exclude 0-screen-width esc sequences. + local c_pre="${__git_SOH}${__git_ESC}[" + local c_post="m${__git_STX}" + + local c_red="${c_pre}31${c_post}" + local c_green="${c_pre}32${c_post}" + local c_lblue="${c_pre}1;34${c_post}" + local c_clear="${c_pre}0${c_post}" fi local bad_color="$c_red" local ok_color="$c_green" @@ -312,7 +327,7 @@ __git_ps1_colorize_gitstring () # variable, in that order. __git_eread () { - test -r "$1" && IFS=$'\r\n' read -r "$2" <"$1" + test -r "$1" && IFS=$__git_CRLF read -r "$2" <"$1" } # see if a cherry-pick or revert is in progress, if the user has committed a @@ -430,19 +445,20 @@ __git_ps1 () return "$exit" fi + local LF="$__git_LF" local short_sha="" if [ "$rev_parse_exit_code" = "0" ]; then - short_sha="${repo_info##*$'\n'}" - repo_info="${repo_info%$'\n'*}" + short_sha="${repo_info##*$LF}" + repo_info="${repo_info%$LF*}" fi - local ref_format="${repo_info##*$'\n'}" - repo_info="${repo_info%$'\n'*}" - local inside_worktree="${repo_info##*$'\n'}" - repo_info="${repo_info%$'\n'*}" - local bare_repo="${repo_info##*$'\n'}" - repo_info="${repo_info%$'\n'*}" - local inside_gitdir="${repo_info##*$'\n'}" - local g="${repo_info%$'\n'*}" + local ref_format="${repo_info##*$LF}" + repo_info="${repo_info%$LF*}" + local inside_worktree="${repo_info##*$LF}" + repo_info="${repo_info%$LF*}" + local bare_repo="${repo_info##*$LF}" + repo_info="${repo_info%$LF*}" + local inside_gitdir="${repo_info##*$LF}" + local g="${repo_info%$LF*}" if [ "true" = "$inside_worktree" ] && [ -n "${GIT_PS1_HIDE_IF_PWD_IGNORED-}" ] &&