Message ID | patch-v3-7.7-7a82b1fd005-20220326T171200Z-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | various: remove dead code, drop i18n not used in-tree |
On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
> translation, 2016-06-17).
>
> These strings are no longer used in-tree, and we shouldn't be wasting
> translator time on them for the benefit of a hypothetical out-of-tree
> user of git-sh-setup.sh.

There is public documentation for these functions. Hence, you must
assume that they are used in scripts outside of Git. Castrating their
functionality like this patch does is unacceptable.

-- Hannes
On Sun, Mar 27 2022, Johannes Sixt wrote:

> On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
>> translation, 2016-06-17).
>>
>> These strings are no longer used in-tree, and we shouldn't be wasting
>> translator time on them for the benefit of a hypothetical out-of-tree
>> user of git-sh-setup.sh.
>
> There is public documentation for these functions. Hence, you must
> assume that they are used in scripts outside of Git. Castrating their
> functionality like this patch does is unacceptable.

For require_clean_work_tree() the public documentation for this function
states that it will emit a specific error message in English, and you're
expected to Lego-interpolate a string that we'll concatenate with it:

    [...]It emits an error message of the form `Cannot
    <action>: <reason>. <hint>`, and dies. Example:
    +
    ----------------
    require_clean_work_tree rebase "Please commit or stash them."

So I think that marking it for translation like this as d323c6b6410 was
always broken in that it broke that documented promise.

But that's just an argument for keeping the require_clean_work_tree()
part of this patch, not require_work_tree_exists().

For that one and others in git-sh-setup we've never said that we'd
provide these translated (and to the extent we've implied anything about
the rest, have strongly implied the opposite with
require_clean_work_tree()'s docs).

Nothing will break for out-of-tree users as a result of this
change. When we added these functions and their documentation their
output wouldn't be translated, then sometimes it was, now it's not
again.

We also need to be mindful of translator time, it's a *lot* of
strings to go through, and e.g. I've commented in the past on patches
that marked stuff in t/helper/ for translation. Some hypothetical
out-of-tree user is, I think, a much stronger candidate for skipping
translation than that.

Also keep in mind that we don't even translate in-tree contrib stuff
like contrib/subtree/ (the recent "not-really-contrib" scalar being an
exception).

So I really think this is fine as-is, don't you think that if someone
out-of-tree had such strong expectations about the human-readable
strings these emit that they'd have long since stopped using them and
provided their own replacements?
On 27.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Mar 27 2022, Johannes Sixt wrote:
>
>> On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
>>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
>>> translation, 2016-06-17).
>>>
>>> These strings are no longer used in-tree, and we shouldn't be wasting
>>> translator time on them for the benefit of a hypothetical out-of-tree
>>> user of git-sh-setup.sh.
>>
>> There is public documentation for these functions. Hence, you must
>> assume that they are used in scripts outside of Git. Castrating their
>> functionality like this patch does is unacceptable.
>
> For require_clean_work_tree() the public documentation for this function
> states that it will emit a specific error message in English, and you're
> expected to Lego-interpolate a string that we'll concatenate with it:
>
>     [...]It emits an error message of the form `Cannot
>     <action>: <reason>. <hint>`, and dies. Example:
>     +
>     ----------------
>     require_clean_work_tree rebase "Please commit or stash them."
>
> So I think that marking it for translation like this as d323c6b6410 was
> always broken in that it broke that documented promise.

I can buy this argument. But then this must be a separate commit with
this justification.

> But that's just an argument for keeping the require_clean_work_tree()
> part of this patch, not require_work_tree_exists().
>
> For that one and others in git-sh-setup we've never said that we'd
> provide these translated (and to the extent we've implied anything about
> the rest, have strongly implied the opposite with
> require_clean_work_tree()'s docs).
>
> Nothing will break for out-of-tree users as a result of this
> change. When we added these functions and their documentation their
> output wouldn't be translated, then sometimes it was, now it's not
> again.

This does not sound convincing at all, but rather like "I want the code
to be so, and here is some handwaving to justify it". What is wrong with
the status quo?

> We also need to be mindful of translator time, it's a *lot* of
> strings to go through, and e.g. I've commented in the past on patches
> that marked stuff in t/helper/ for translation.

Translator's time is your concern? No translator sifts through 5000
strings on every release. There are tools that show only new strings to
translate. A text is translated once and then it lies under the radar
until someone changes it. Don't tell me that is time-consuming.

On the other hand, there is a lot of *reviewer* time that you are
spending with changes like this. *That* should be your concern.

-- Hannes
On Mon, Mar 28 2022, Johannes Sixt wrote:

> On 27.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Sun, Mar 27 2022, Johannes Sixt wrote:
>>
>>> On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
>>>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
>>>> translation, 2016-06-17).
>>>>
>>>> These strings are no longer used in-tree, and we shouldn't be wasting
>>>> translator time on them for the benefit of a hypothetical out-of-tree
>>>> user of git-sh-setup.sh.
>>>
>>> There is public documentation for these functions. Hence, you must
>>> assume that they are used in scripts outside of Git. Castrating their
>>> functionality like this patch does is unacceptable.
>>
>> For require_clean_work_tree() the public documentation for this function
>> states that it will emit a specific error message in English, and you're
>> expected to Lego-interpolate a string that we'll concatenate with it:
>>
>>     [...]It emits an error message of the form `Cannot
>>     <action>: <reason>. <hint>`, and dies. Example:
>>     +
>>     ----------------
>>     require_clean_work_tree rebase "Please commit or stash them."
>>
>> So I think that marking it for translation like this as d323c6b6410 was
>> always broken in that it broke that documented promise.
>
> I can buy this argument. But then this must be a separate commit with
> this justification.

Sure, I can elaborate on that point & split it up.

>> But that's just an argument for keeping the require_clean_work_tree()
>> part of this patch, not require_work_tree_exists().
>>
>> For that one and others in git-sh-setup we've never said that we'd
>> provide these translated (and to the extent we've implied anything about
>> the rest, have strongly implied the opposite with
>> require_clean_work_tree()'s docs).
>>
>> Nothing will break for out-of-tree users as a result of this
>> change. When we added these functions and their documentation their
>> output wouldn't be translated, then sometimes it was, now it's not
>> again.
>
> This does not sound convincing at all, but rather like "I want the code
> to be so, and here is some handwaving to justify it". What is wrong with
> the status quo?

The larger context for why I was looking at this again is that I'm
trying to slowly get us to the point where we can remove the
i18n-in-shellscript entirely.

But I thought that was a rather large digression for the commit message,
and that these being both unused, and not something the "public" API
affected ever promised it would do was sufficient.

>> We also need to be mindful of translator time, it's a *lot* of
>> strings to go through, and e.g. I've commented in the past on patches
>> that marked stuff in t/helper/ for translation.
>
> Translator's time is your concern? No translator sifts through 5000
> strings on every release. There are tools that show only new strings to
> translate.

Yes, I'm the person who added this entire i18n infrastructure to git, I
know how it works :)

> A text is translated once and then it lies under the radar
> until someone changes it. Don't tell me that is time-consuming.

Yes, individual orphaned strings aren't, but they add up.

Just like having that "USE_PIC" comment in configure.ac isn't much of a
big deal, but it makes sense to clean up unused code, just as we're
adding new code.

I will say that your implicit proposal of keeping this forever instead
is assuming that we won't have more translations for git, and every new
translator will look at this.

Context is critical for translators, so even if it's one string it's a
string you'll quickly grep for and find ... no uses for, and then likely
go hunting around for where it's used only to (hopefully, in that case)
find this thread. Better not to have it.

> On the other hand, there is a lot of *reviewer* time that you are
> spending with changes like this. *That* should be your concern.

I'd think most of that time, if any, will be spent on this
sub-thread you started, so ... :)

Which isn't to say it shouldn't have been brought up, but from my
perspective I was (and still am) making a rather small change that I
think won't harm anyone in practice, and gives us some incremental
tidiness & contributes to an eventual large "git rm git-sh-i18n.sh" et
al.

But on reflection I don't think it's worth worrying about, and we can
just do this change.
On 28.03.22 at 14:16, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Mar 28 2022, Johannes Sixt wrote:
>> What is wrong with
>> the status quo?
>
> The larger context for why I was looking at this again is that I'm
> trying to slowly get us to the point where we can remove the
> i18n-in-shellscript entirely.

Why? Again: what is wrong with the status quo?

> Just like having that "USE_PIC" comment in configure.ac isn't much of a
> big deal, but it makes sense to clean up unused code, just as we're
> adding new code.

There is a difference between "clean up unused code" and "change
observable behavior".

-- Hannes
Hi Ævar

On 28/03/2022 13:16, Ævar Arnfjörð Bjarmason wrote:
>
> On Mon, Mar 28 2022, Johannes Sixt wrote:
>
>> On 27.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Sun, Mar 27 2022, Johannes Sixt wrote:
>>>
>>>> On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
>>>>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
>>>>> translation, 2016-06-17).
>>>>>
>>>>> These strings are no longer used in-tree, and we shouldn't be wasting
>>>>> translator time on them for the benefit of a hypothetical out-of-tree
>>>>> user of git-sh-setup.sh.

The out of tree users of git-sh-setup.sh are not hypothetical, they
exist and objected when you recently tried to remove these functions
entirely[1].

>>>> There is public documentation for these functions. Hence, you must
>>>> assume that they are used in scripts outside of Git. Castrating their
>>>> functionality like this patch does is unacceptable.
>>>
>>> For require_clean_work_tree() the public documentation for this function
>>> states that it will emit a specific error message in English, and you're
>>> expected to Lego-interpolate a string that we'll concatenate with it:

The documentation does not say whether the message is translated or not,
probably because it was not updated when the translations were added six
years ago.

>>>     [...]It emits an error message of the form `Cannot
>>>     <action>: <reason>. <hint>`, and dies. Example:

This is not promising a "specific error message in English".

>>>     +
>>>     ----------------
>>>     require_clean_work_tree rebase "Please commit or stash them."

This is an example message; you cannot use that to argue that we will
always show a message in English.

>>> So I think that marking it for translation like this as d323c6b6410 was
>>> always broken in that it broke that documented promise.
>>
>> I can buy this argument. But then this must be a separate commit with
>> this justification.
>
> Sure, I can elaborate on that point & split it up.
>
>>> But that's just an argument for keeping the require_clean_work_tree()
>>> part of this patch, not require_work_tree_exists().
>>>
>>> For that one and others in git-sh-setup we've never said that we'd
>>> provide these translated (and to the extent we've implied anything about
>>> the rest, have strongly implied the opposite with
>>> require_clean_work_tree()'s docs).
>>>
>>> Nothing will break for out-of-tree users as a result of this
>>> change.

The strings the user sees will change.

>>> When we added these functions and their documentation their
>>> output wouldn't be translated,

Where does the documentation say "the output will not be translated"?

>>> then sometimes it was, now it's not
>>> again.
>>
>> This does not sound convincing at all, but rather like "I want the code
>> to be so, and here is some handwaving to justify it". What is wrong with
>> the status quo?
>
> The larger context for why I was looking at this again is that I'm
> trying to slowly get us to the point where we can remove the
> i18n-in-shellscript entirely.
>
> But I thought that was a rather large digression for the commit message,
> and that these being both unused, and not something the "public" API
> affected ever promised it would do was sufficient.

I think if that is what you want to do then you should propose a series
that does just that and explains why it is desirable, rather than coming
up with other reasons to justify the changes you want.

>>> We also need to be mindful of translator time, it's a *lot* of
>>> strings to go through, and e.g. I've commented in the past on patches
>>> that marked stuff in t/helper/ for translation.
>>
>> Translator's time is your concern? No translator sifts through 5000
>> strings on every release. There are tools that show only new strings to
>> translate.
>
> Yes, I'm the person who added this entire i18n infrastructure to git, I
> know how it works :)
>
>> A text is translated once and then it lies under the radar
>> until someone changes it. Don't tell me that is time-consuming.
>
> Yes, individual orphaned strings aren't, but they add up.
>
> Just like having that "USE_PIC" comment in configure.ac isn't much of a
> big deal, but it makes sense to clean up unused code, just as we're
> adding new code.
>
> I will say that your implicit proposal of keeping this forever instead
> is assuming that we won't have more translations for git, and every new
> translator will look at this.
>
> Context is critical for translators, so even if it's one string it's a
> string you'll quickly grep for and find ... no uses for, and then likely
> go hunting around for where it's used only to (hopefully, in that case)
> find this thread. Better not to have it.
>
>> On the other hand, there is a lot of *reviewer* time that you are
>> spending with changes like this. *That* should be your concern.
>
> I'd think most of that time, if any, will be spent on this
> sub-thread you started, so ... :)

This sub-thread exists because you posted this patch to the mailing
list. Blaming reviewers for asking perfectly reasonable questions is
neither fair nor helpful.

This patch does not remove dead code as the rest of the series does but
instead changes user-facing messages in code that we recently
established is part of the public API[2]. Nothing has changed since that
recent discussion so I'm confused as to why you are proposing to modify
the API again so soon.

Best Wishes

Phillip

[1] https://lore.kernel.org/git/CAJm9OHfN9iXDt-rzu-wb=67y4PPpmCUgMfmZPy1JMBJkHG256g@mail.gmail.com/
[2] https://lore.kernel.org/git/xmqq5yvik8bc.fsf@gitster.g/

> Which isn't to say it shouldn't have been brought up, but from my
> perspective I was (and still am) making a rather small change that I
> think won't harm anyone in practice, and gives us some incremental
> tidiness & contributes to an eventual large "git rm git-sh-i18n.sh" et
> al.
>
> But on reflection I don't think it's worth worrying about, and we can
> just do this change.
>
On Thu, Mar 31 2022, Phillip Wood wrote:

[tl;dr: Reply below, but this whole thing should be addressed by the v4
I sent last night:
https://lore.kernel.org/git/cover-v4-0.6-00000000000-20220331T014349Z-avarab@gmail.com/
I.e. the controversial patch has been ejected].

> On 28/03/2022 13:16, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Mar 28 2022, Johannes Sixt wrote:
>>
>>> On 27.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>> On Sun, Mar 27 2022, Johannes Sixt wrote:
>>>>
>>>>> On 26.03.22 at 18:14, Ævar Arnfjörð Bjarmason wrote:
>>>>>> Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
>>>>>> translation, 2016-06-17).
>>>>>>
>>>>>> These strings are no longer used in-tree, and we shouldn't be wasting
>>>>>> translator time on them for the benefit of a hypothetical out-of-tree
>>>>>> user of git-sh-setup.sh.
>
> The out of tree users of git-sh-setup.sh are not hypothetical, they
> exist and objected when you recently tried to remove these functions
> entirely[1].

I see that what I wrote there is ambiguous, but I'm aware of that &
remember that thread. I meant to say the hypothetical user that cares
about the i18n these functions exposed.

>>>>> There is public documentation for these functions. Hence, you must
>>>>> assume that they are used in scripts outside of Git. Castrating their
>>>>> functionality like this patch does is unacceptable.
>>>>
>>>> For require_clean_work_tree() the public documentation for this function
>>>> states that it will emit a specific error message in English, and you're
>>>> expected to Lego-interpolate a string that we'll concatenate with it:
>
> The documentation does not say whether the message is translated or
> not, probably because it was not updated when the translations were
> added six years ago.

It does say it. It uses the word "Cannot" at the beginning, and promises
to emit that specific string. Yes we didn't update it at the time for
i18n, and probably should.

But to the extent that the Gordian knot in making any changes to these
whatsoever is because they've been publicly documented I don't think
anyone using these has been promised different behavior. So it's highly
relevant here.

>>>>     [...]It emits an error message of the form `Cannot
>>>>     <action>: <reason>. <hint>`, and dies. Example:
>
> This is not promising a "specific error message in English".

It really is. You cannot use this API to produce sensible output in any
other language. It was used like this:

    require_clean_work_tree "pull with rebase" "Please commit or stash them."

For which we'd emit:

    Cannot pull with rebase: You have unstaged changes. Please commit or stash them.

You can see e.g. in the Bulgarian translation that this was dealt with
by putting the interpolated string in double-quotes.

>>>>     +
>>>>     ----------------
>>>>     require_clean_work_tree rebase "Please commit or stash them."
>
> This is an example message; you cannot use that to argue that we will
> always show a message in English.

I'm saying that the documentation says it emits English, that it didn't
always do that, and now does so again.

And that to get it to emit anything sensible in cases where we're not
under LC_ALL=C would have required matching 1:1 the behavior of whatever
shellscript is using this to what git-sh-i18n does in picking the
locale.

I don't think it's plausible that there's an out-of-tree user
maintaining their own set of i18n'd po/ files which expect to interact
with our translations in this way.

Any out-of-tree user of this (if they're using this at all) will either
not care, or they'll see more sensible output again.

>>>> So I think that marking it for translation like this as d323c6b6410 was
>>>> always broken in that it broke that documented promise.
>>>
>>> I can buy this argument. But then this must be a separate commit with
>>> this justification.
>> Sure, I can elaborate on that point & split it up.
>>
>>>> But that's just an argument for keeping the require_clean_work_tree()
>>>> part of this patch, not require_work_tree_exists().
>>>>
>>>> For that one and others in git-sh-setup we've never said that we'd
>>>> provide these translated (and to the extent we've implied anything about
>>>> the rest, have strongly implied the opposite with
>>>> require_clean_work_tree()'s docs).
>>>>
>>>> Nothing will break for out-of-tree users as a result of this
>>>> change.
>
> The strings the user sees will change.

Yes, and I'll admit that "nothing will break here" on my part isn't the
same as saying "there will be no observable change whatsoever". Sorry
about being unclear there.

As a general matter we don't promise that such strings won't change,
even for die(), error() etc. messages emitted by plumbing commands.

Except in some rare cases where they've been known to be used out of
tree extensively, e.g. the human-readable "merge" messages where we
have/had no other API to expose the same information. Or, in the case of
plumbing output where such strings are part of the API contract.

But for these commands in the "Internal helper commands" category I
think this falls squarely in the category of changing a random error(),
die() etc. in the C code (which we do quite freely).

>>>> When we added these functions and their documentation their
>>>> output wouldn't be translated,
>
> Where does the documentation say "the output will not be translated"?

I think this was covered above, it's sufficient that it didn't promise
that it would be, and in the one case where we discuss it in passing
with an example we imply that it won't be.

>>>> then sometimes it was, now it's not
>>>> again.
>>>
>>> This does not sound convincing at all, but rather like "I want the code
>>> to be so, and here is some handwaving to justify it". What is wrong with
>>> the status quo?
>> The larger context for why I was looking at this again is that I'm
>> trying to slowly get us to the point where we can remove the
>> i18n-in-shellscript entirely.
>> But I thought that was a rather large digression for the commit
>> message, and that these being both unused, and not something the
>> "public" API affected ever promised it would do was sufficient.
>
> I think if that is what you want to do then you should propose a
> series that does just that and explains why it is desirable, rather
> than coming up with other reasons to justify the changes you want.

Just because I start looking at some code for reason X that doesn't mean
that submitting a patch with rationale Y isn't a sufficient reason to
make that change.

I still think that in this case that they're not used by our own i18n
effort is a perfectly sufficient reason to make the change, as we won't
waste translator time on it. I.e. I'll still stand behind the stated
rationale.

But aside from that most changes I make to git are with an eye to some
larger semi-related goal.

I do have some WIP changes to tear down most of the *.sh and *.perl i18n
infrastructure (the parts still in use would still have translations),
and IIRC it's at least a 2k line negative diffstat, and enables us to do
more interesting things in i18n (e.g. getting rid of the libintl
dependency).

But I also think that such a series is probably not possible in the
near term if we're going to insist that all shellscript output must
byte-for-byte be the same (for boring reasons I won't go into, but it's
mainly to do with sh-i18n--envsubst.c).

So it's also a bit of a chicken & egg problem. I wanted to send any such
UI changes in first, to see if it was even worth finishing up that work,
or if the whole thing would stall on not being able to change some
output someone somewhere might have relied on being byte-for-byte the
same.

>>>> We also need to be mindful of translator time, it's a *lot* of
>>>> strings to go through, and e.g. I've commented in the past on patches
>>>> that marked stuff in t/helper/ for translation.
>>>
>>> Translator's time is your concern? No translator sifts through 5000
>>> strings on every release. There are tools that show only new strings to
>>> translate.
>> Yes, I'm the person who added this entire i18n infrastructure to
>> git, I know how it works :)
>>
>>> A text is translated once and then it lies under the radar
>>> until someone changes it. Don't tell me that is time-consuming.
>> Yes, individual orphaned strings aren't, but they add up.
>> Just like having that "USE_PIC" comment in configure.ac isn't much
>> of a big deal, but it makes sense to clean up unused code, just as
>> we're adding new code.
>> I will say that your implicit proposal of keeping this forever
>> instead is assuming that we won't have more translations for git,
>> and every new translator will look at this.
>> Context is critical for translators, so even if it's one string it's
>> a string you'll quickly grep for and find ... no uses for, and then
>> likely go hunting around for where it's used only to (hopefully, in
>> that case) find this thread. Better not to have it.
>>
>>> On the other hand, there is a lot of *reviewer* time that you are
>>> spending with changes like this. *That* should be your concern.
>> I'd think most of that time, if any, will be spent on this
>> sub-thread you started, so ... :)
>
> This sub-thread exists because you posted this patch to the mailing
> list. Blaming reviewers for asking perfectly reasonable questions is
> neither fair nor helpful.

I didn't mean any offense there, but did mean to suggest (smiley and
all) that a mountain was being made out of a molehill in this case.

Yes translator time is my concern. I started the i18n effort in git, and
I think it's really important.

We currently have 18 translations of git in the po/ directory, 16 if you
leave out "dialects". Which if you compare it with
https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers
is quite bad.

For comparison I worked extensively on MediaWiki in a past life, which
at the time had at least 100 such translations. I looked again and it's
up to around 600 (many incomplete, to be fair).

Is that our fault as a project? No, but we could definitely help it
along. I value the scarcity of translator time (including future
translations) much more than concerns that there *may be* someone
somewhere who's got a reliance on this particular output.

> This patch does not remove dead code as the rest of the series does
> but instead changes user-facing messages in code that we recently
> established is part of the public API[2]. Nothing has changed since
> that recent discussion so I'm confused as to why you are proposing to
> modify the API again so soon.

As noted above I don't think that previous discussion applies to these
changes as you describe, but in any case, ~8 hours before you sent this
reply I sent a v4 re-roll which left out this change:
https://lore.kernel.org/git/cover-v4-0.6-00000000000-20220331T014349Z-avarab@gmail.com/

Which I hope will address your & Johannes Sixt's concerns here. Does the
rest of this series look good to you?
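
To make the `Cannot <action>: <reason>. <hint>` form debated above concrete, here is a minimal shell sketch of the "Lego" composition. It is not the code in git-sh-setup.sh: the function name format_clean_work_tree_error and the hard-coded reason are assumptions made for illustration, and the real function also inspects the work tree and may break its lines differently.

```shell
#!/bin/sh
# Hypothetical stand-in for illustration; NOT the code in git-sh-setup.sh.
# Builds a message of the documented form: Cannot <action>: <reason>. <hint>
format_clean_work_tree_error () {
	action=$1	# caller-supplied English fragment, e.g. "pull with rebase"
	hint=$2		# caller-supplied English sentence, may be empty
	reason="You have unstaged changes."	# fixed text owned by the library

	# The caller's fragment is spliced into a fixed English frame.
	if test -n "$hint"
	then
		printf 'Cannot %s: %s %s\n' "$action" "$reason" "$hint"
	else
		printf 'Cannot %s: %s\n' "$action" "$reason"
	fi
}

format_clean_work_tree_error "pull with rebase" "Please commit or stash them."
```

Because the caller's fragment and hint arrive as ready-made English, translating only the frame around them would produce mixed-language output, which is the concern raised in the thread.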
On 31.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
> I do have some WIP changes to tear down most of the *.sh and *.perl i18n
> infrastructure (the parts still in use would still have translations),
> and IIRC it's at least a 2k line negative diffstat, and enables us to do
> more interesting things in i18n (e.g. getting rid of the libintl
> dependency).

Why? Why? Why? Does the status quo have a problem somewhere? All this
sounds like a change for the sake of change.

> But I also think that such a series is probably not possible in
> the near term if we're going to insist that all shellscript output must
> byte-for-byte be the same (for boring reasons I won't go into, but it's
> mainly to do with sh-i18n--envsubst.c).

Such an insistence can easily be lifted if the change is justified
sufficiently. I haven't seen such a justification, yet.

-- Hannes
On Thu, Mar 31 2022, Johannes Sixt wrote: > Am 31.03.22 um 13:15 schrieb Ævar Arnfjörð Bjarmason: >> I do have some WIP changes to tear down most of the *.sh and *.perl i18n >> infrastructure (the parts still in use would still have translations), >> and IIRC it's at least a 2k line negative diffstat, and enables us to do >> more interesting things in i18n (e.g. getting rid of the libintl >> dependency). > > Why? Why? Why? Does the status quo have a problem somewhere? All this > sounds like a change for the sake of change. So this is quite the digression, but, hey, you asked for it. We don't have translations universally available because libintl is a rather heavy thing to ship. I don't personally mind linking against it for my own builds, but grep for NO_GETTEXT in our tree & history for some of the workarounds. We're also heading towards being able to build a stand-alone git binary for most things, which makes shipping in various setups much easier, but libintl is more of an "old-school" *nix library. You need to ferry around auxilliary *.mo files, and for the *.sh and *.perl translations we need gettext.sh, /usr/bin/gettext and Locale::Messages (and everything that brings in). I'd like translations for Git to Just Work, including if you're in some random docker image with someone's home-built git. Giving people fewer reasons to enable it improves accessibility. A lot of people who use git are not on their own personal laptop, but on some setup (corporate, CI etc.) that they don't fully control. The gettext model & libintl is also just bad at various use-cases I think would make sense to support. E.g. having a configurable option to emit output in two languages at the same time, either because you'd both like to understand the output & e.g. search errors online, or you'd understand more from a union of say German an English than from just one or the other. 
For libintl you need'd to juggle setlocale() in the middle of your underlying sprintf implementation to do that, or pull other shenanigans of bypassing its API (e.g. directly reading the *.mo files), which pretty much amounts to the same thing. So essentially I wanted to hack up something that would just post-process output like this: msgunfmt --strict -s -w 0 -i -E --color=always po/build/locale/de/LC_MESSAGES/git.mo And turn it into a lang-de.c file, for which we'd make a lang-de.o that we'd link in. And then either binary search through it, or just generate code we'd compile (one really big switch/case statement). Now, if you count the number of messages we translate in *.sh land on your digits you won't even need to use all of our toes, and for the *.perl it's similar, especially with add--interactive.perl going away any day now. There isn't any fundamental obstacle to making such a thing portable to *.sh and *.perl, but having gotten that particular interop working once in the past needing to do that again would bring this (I think worthwhile) project from a "maybe someday" to "nah". >> But I also don't think that such a series is probably not possible in >> the near term if we're going to insist that all shellscript output must >> byte-for-byte be the same (for boring reasons I won't go into, but it's >> mainly to do with sh-i18n--envsubst.c). > > Such an insistence can easily be lifted if the change is justified > sufficiently. I haven't seen such a justification, yet. Sure, but re the "chicken & egg" problem I might do all the work to do all that, and someone such as yourself might rightly point out that it would break someone's obscure use-case, scuttling the whole thing. Which isn't an exaggeration b.t.w., if you e.g. look through our remaining gettext.sh usage you'll find that we carry the entirety of sh-i18n--ensubst.c and everything around it (at least ~1k lines) for emitting a single word in a single message in git-sh-setup.sh, that's it. 
Because the whole reason eval_gettext exists, and everything to support
it, is to support the use-case of feeding *arbitrary input* into the
translation engine, i.e. not the string you yourself have in your source
code & trust (it avoids shell "eval").

But if you think that's of paramount importance (that word is "usage"
b.t.w., and the equivalent in usage.c isn't even translated) there
wouldn't be any way to make forward progress towards the next step of
making the remaining shellscript translations call some "git sh--i18n"
helper to get their output.

So, to the extent that I was going to pursue the above plan at all I
wanted to do it in small steps, especially now as git-submodule.sh et al
are going away.

So.

It would be nice to get some leeway in some areas, especially for
something like this where I implemented this entire i18n system to begin
with, so I'd think it would be clear that it's not some drive-by
contribution. I clearly care about the end-goal, and have been sticking
with this particular topic for more than a decade.

Not everything can always be a single atomically understood patch that
carries all possible reasons to make the change with it, some things are
more of a longer term incremental effort.

And since we all have limited time on this spinning ball of mud
sometimes it can make sense to trickle in initial changes to see if some
larger end-goal is even attainable, or will be blocked due to some
unforeseen (or underestimated) reasons.

Thanks.
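To illustrate the eval_gettext point made in the message above: the idea is to fill a named placeholder such as $program_name into a message without ever running that message through the shell's "eval". A minimal sketch in plain POSIX shell follows. The helper name subst_var is hypothetical; it is not part of git-sh-setup.sh, and it does not mirror how sh-i18n--envsubst.c is implemented.

```shell
# subst_var is a hypothetical helper for illustration only; it is not part
# of git-sh-setup.sh and does not mirror sh-i18n--envsubst.c.
# Usage: subst_var <template> <name> <value>
# Replaces every literal "$<name>" in <template> with <value>. Neither the
# template nor the value is ever evaluated by the shell.
subst_var () {
	sv_rest=$1
	sv_needle="\$$2"
	sv_value=$3
	sv_out=
	while :
	do
		case "$sv_rest" in
		*"$sv_needle"*)
			# Text before the first occurrence of the placeholder.
			sv_head=${sv_rest%%"$sv_needle"*}
			sv_out=$sv_out$sv_head$sv_value
			# Continue with whatever follows that occurrence.
			sv_rest=${sv_rest#*"$sv_needle"}
			;;
		*)
			break
			;;
		esac
	done
	printf '%s\n' "$sv_out$sv_rest"
}

program_name='git-foo'
subst_var 'fatal: $program_name cannot be used without a working tree.' \
	program_name "$program_name"
```

Because the value is only ever concatenated, a value such as '$(rm -rf x)' ends up in the output verbatim instead of being executed. The sketch does a plain substring match, so a placeholder that is a prefix of a longer name would also be replaced; a real implementation would need to handle that.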
On 02.04.22 at 12:44, Ævar Arnfjörð Bjarmason wrote:
>
> On Thu, Mar 31 2022, Johannes Sixt wrote:
>
>> On 31.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>>> I do have some WIP changes to tear down most of the *.sh and *.perl i18n
>>> infrastructure (the parts still in use would still have translations),
>>> and IIRC it's at least a 2k line negative diffstat, and enables us to do
>>> more interesting things in i18n (e.g. getting rid of the libintl
>>> dependency).
>>
>> Why? Why? Why? Does the status quo have a problem somewhere? All this
>> sounds like a change for the sake of change.
>
> So this is quite the digression, but, hey, you asked for it.

Oh, no, this is not a digression *at all*. What you write below is the
kind of text that is needed to judge the value of a change. Without it,
a change that does not have an otherwise obvious improvement[*] is just
for the change's sake.

[*] In my book, getting rid of a libintl dependency is not an obvious
improvement. I may be biased in this case, because that dependency was
never a problem for me. Might be because my personal builds all have
NO_GETTEXT set.

> We don't have translations universally available because libintl is a
> rather heavy thing to ship.
>
> I don't personally mind linking against it for my own builds, but grep
> for NO_GETTEXT in our tree & history for some of the workarounds.
>
> We're also heading towards being able to build a stand-alone git binary
> for most things, which makes shipping in various setups much easier, but
> libintl is more of an "old-school" *nix library.
>
> You need to ferry around auxiliary *.mo files, and for the *.sh and
> *.perl translations we need gettext.sh, /usr/bin/gettext and
> Locale::Messages (and everything that brings in).
>
> I'd like translations for Git to Just Work, including if you're in some
> random docker image with someone's home-built git. Giving people fewer
> reasons not to enable it improves accessibility.
> A lot of people who use git
> are not on their own personal laptop, but on some setup (corporate, CI
> etc.) that they don't fully control.
>
> The gettext model & libintl is also just bad at various use-cases I
> think would make sense to support.
>
> E.g. having a configurable option to emit output in two languages at the
> same time, either because you'd both like to understand the output &
> e.g. search errors online, or you'd understand more from a union of say
> German and English than from just one or the other.
>
> For libintl you'd need to juggle setlocale() in the middle of your
> underlying sprintf implementation to do that, or pull other shenanigans
> of bypassing its API (e.g. directly reading the *.mo files), which
> pretty much amounts to the same thing.
>
> So essentially I wanted to hack up something that would just
> post-process output like this:
>
>     msgunfmt --strict -s -w 0 -i -E --color=always po/build/locale/de/LC_MESSAGES/git.mo
>
> And turn it into a lang-de.c file, for which we'd make a lang-de.o that
> we'd link in. And then either binary search through it, or just generate
> code we'd compile (one really big switch/case statement).
>
> Now, if you count the number of messages we translate in *.sh land on
> your digits you won't even need to use all of your toes, and for the
> *.perl it's similar, especially with add--interactive.perl going away
> any day now.
>
> There isn't any fundamental obstacle to making such a thing portable to
> *.sh and *.perl, but having gotten that particular interop working once
> in the past, needing to do that again would bring this (I think
> worthwhile) project from a "maybe someday" to "nah".

Just to make it clear: I am totally neutral on your goal. It's on others
to tell whether this is worth doing.

>>> But I also don't think that such a series is probably not possible in
>>> the near term if we're going to insist that all shellscript output must
>>> byte-for-byte be the same (for boring reasons I won't go into, but it's
>>> mainly to do with sh-i18n--envsubst.c).
>>
>> Such an insistence can easily be lifted if the change is justified
>> sufficiently. I haven't seen such a justification, yet.
>
> Sure, but re the "chicken & egg" problem I might do all the work to do
> all that, and someone such as yourself might rightly point out that it
> would break someone's obscure use-case, scuttling the whole thing.
>
> Which isn't an exaggeration b.t.w., if you e.g. look through our
> remaining gettext.sh usage you'll find that we carry the entirety of
> sh-i18n--envsubst.c and everything around it (at least ~1k lines) for
> emitting a single word in a single message in git-sh-setup.sh, that's
> it.

See, someone thought it was a good idea to have i18n in shell scripts
and others agreed that it was worth having ~1k lines of code to do that.
So the code went in. From then on, these ~1k lines are *not a problem*
in themselves. From then on, the decision of having ~1k lines or not
having them can only be based on what effect they have, but no longer on
"oh, wow, that's 1k lines to write a single word; do we really want that"?

>
> Because the whole reason eval_gettext exists, and everything to support
> it, is to support the use-case of feeding *arbitrary input* into the
> translation engine, i.e. not the string you yourself have in your source
> code & trust (it avoids shell "eval").
>
> But if you think that's of paramount importance (that word is "usage"
> b.t.w., and the equivalent in usage.c isn't even translated) there
> wouldn't be any way to make forward progress towards the next step of
> making the remaining shellscript translations call some "git sh--i18n"
> helper to get their output.
>
> So, to the extent that I was going to pursue the above plan at all I
> wanted to do it in small steps, especially now as git-submodule.sh et al
> are going away.
>
> So.
>
> It would be nice to get some leeway in some areas, especially for
> something like this where I implemented this entire i18n system to begin
> with, so I'd think it would be clear that it's not some drive-by
> contribution. I clearly care about the end-goal, and have been sticking
> with this particular topic for more than a decade.
>
> Not everything can always be a single atomically understood patch that
> carries all possible reasons to make the change with it, some things are
> more of a longer term incremental effort.
>
> And since we all have limited time on this spinning ball of mud
> sometimes it can make sense to trickle in initial changes to see if some
> larger end-goal is even attainable, or will be blocked due to some
> unforeseen (or underestimated) reasons.

You can't have leeway for a change whose conclusion is "removes some
minuscule feature". But if you add "Here is the secret plan to Scrat's
golden nut; let's start with this change, even though it removes some
minuscule feature", things are vastly different.

-- Hannes
On Sat, Apr 02 2022, Johannes Sixt wrote:

> On 02.04.22 at 12:44, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Thu, Mar 31 2022, Johannes Sixt wrote:
>>
>>> On 31.03.22 at 13:15, Ævar Arnfjörð Bjarmason wrote:
>>>> I do have some WIP changes to tear down most of the *.sh and *.perl i18n
>>>> infrastructure (the parts still in use would still have translations),
>>>> and IIRC it's at least a 2k line negative diffstat, and enables us to do
>>>> more interesting things in i18n (e.g. getting rid of the libintl
>>>> dependency).
>>>
>>> Why? Why? Why? Does the status quo have a problem somewhere? All this
>>> sounds like a change for the sake of change.
>>
>> So this is quite the digression, but, hey, you asked for it.
>
> Oh, no, this is not a digression *at all*. What you write below is the
> kind of text that is needed to judge the value of a change. Without it,
> a change that does not have an otherwise obvious improvement[*] is just
> for the change's sake.

Well let's be clear here. It's been your claim that the proposed change
must not be worth doing because you don't place the same value on having
a 1=1 mapping between strings we ask translators to work on, and those
that we'll actually present as part of git's UI.

Which is fair enough, and something we can respectfully disagree on.

But that's not the same as claiming that the stated reason for the
upthread patch is incomplete or insufficient.

I can tell you, as the person who implemented this whole i18n facility,
that providing translations for someone's random shellscript was never
the point, at all.

It just so happens that because we implemented some bits of
functionality of the porcelain as shellscripts, and at the same time had
a shellscript library which (regrettably or not) seems to invite both
in-tree and out-of-tree users to use it, that the two went hand-in-hand.

But now that they don't anymore I don't see anything "handwaving" about
simply removing the translation markings.
I don't think they serve any purpose anymore.

> [*] In my book, getting rid of a libintl dependency is not an obvious
> improvement. I may be biased in this case, because that dependency was
> never a problem for me. Might be because my personal builds all have
> NO_GETTEXT set.

So not only don't you use a translated version of git, but you don't
even compile one with it? Yes, I can imagine that hasn't exposed you to
any of the problems with it :)

>>>> But I also don't think that such a series is probably not possible in
>>>> the near term if we're going to insist that all shellscript output must
>>>> byte-for-byte be the same (for boring reasons I won't go into, but it's
>>>> mainly to do with sh-i18n--envsubst.c).
>>>
>>> Such an insistence can easily be lifted if the change is justified
>>> sufficiently. I haven't seen such a justification, yet.
>>
>> Sure, but re the "chicken & egg" problem I might do all the work to do
>> all that, and someone such as yourself might rightly point out that it
>> would break someone's obscure use-case, scuttling the whole thing.
>>
>> Which isn't an exaggeration b.t.w., if you e.g. look through our
>> remaining gettext.sh usage you'll find that we carry the entirety of
>> sh-i18n--envsubst.c and everything around it (at least ~1k lines) for
>> emitting a single word in a single message in git-sh-setup.sh, that's
>> it.
>
> See, someone thought it was a good idea to have i18n in shell scripts
> and others agreed that it was worth having ~1k lines of code to do that.
> So the code went in. From then on, these ~1k lines are *not a problem*
> in themselves. From then on, the decision of having ~1k lines or not
> having them can only be based on what effect they have, but no longer on
> "oh, wow, that's 1k lines to write a single word; do we really want that"?

Aside from i18n, I don't agree with that in general.
Yes, code that's in-tree and working needs to be under less scrutiny
than a new addition, and refactoring something isn't always worth it.
We'll also need to review the removals.

But there's also a cost to keeping things around, as you can e.g. see
from various portability and correctness fixes to this code we've
perma-forked from the GNU GPLv2 version.

There's some tipping point where a refactoring isn't worth it, but
emitting the word "usage" with ~1k lines is a pretty clear candidate in
my mind for a "git rm".

>> Because the whole reason eval_gettext exists, and everything to support
>> it, is to support the use-case of feeding *arbitrary input* into the
>> translation engine, i.e. not the string you yourself have in your source
>> code & trust (it avoids shell "eval").
>>
>> But if you think that's of paramount importance (that word is "usage"
>> b.t.w., and the equivalent in usage.c isn't even translated) there
>> wouldn't be any way to make forward progress towards the next step of
>> making the remaining shellscript translations call some "git sh--i18n"
>> helper to get their output.
>>
>> So, to the extent that I was going to pursue the above plan at all I
>> wanted to do it in small steps, especially now as git-submodule.sh et al
>> are going away.
>>
>> So.
>>
>> It would be nice to get some leeway in some areas, especially for
>> something like this where I implemented this entire i18n system to begin
>> with, so I'd think it would be clear that it's not some drive-by
>> contribution. I clearly care about the end-goal, and have been sticking
>> with this particular topic for more than a decade.
>>
>> Not everything can always be a single atomically understood patch that
>> carries all possible reasons to make the change with it, some things are
>> more of a longer term incremental effort.
>>
>> And since we all have limited time on this spinning ball of mud
>> sometimes it can make sense to trickle in initial changes to see if some
>> larger end-goal is even attainable, or will be blocked due to some
>> unforeseen (or underestimated) reasons.
>
> You can't have leeway for a change whose conclusion is "removes some
> minuscule feature". But if you add "Here is the secret plan to Scrat's
> golden nut; let's start with this change, even though it removes some
> minuscule feature", things are vastly different.

I mean leeway on the topic that I probably have some idea of what I'm
talking about when it comes to git's i18n support, and whether it's
worth the effort to keep certain things around or not.

I.e. you started this thread by claiming that the removal of these
translations would be "castrating [out-of-tree] functionality, [which
is] unacceptable".

As noted above I don't think that assessment is correct, and if I'm
understanding you correctly you don't even use git's i18n mechanism at
all.

Which I think presents only two possible conclusions. One is that I, the
person who added the i18n mechanism in the first place, am so clueless
about how it works or what it's for, that I'm (intentionally or not)
submitting patches that "castrate" it.

The other is that you've understandably missed some of the nuance, such
as why we're even marking strings for translation, and who their
intended audience is.
Please accept my apologies for not replying to your arguments directly. I
think I have said all I can. Perhaps I am unable to express my concerns
sufficiently clearly.

-- Hannes
diff --git a/git-sh-setup.sh b/git-sh-setup.sh
index d92df37e992..1abceaac8d3 100644
--- a/git-sh-setup.sh
+++ b/git-sh-setup.sh
@@ -187,8 +187,7 @@ cd_to_toplevel () {
 require_work_tree_exists () {
 	if test "z$(git rev-parse --is-bare-repository)" != zfalse
 	then
-		program_name=$0
-		die "$(eval_gettext "fatal: \$program_name cannot be used without a working tree.")"
+		die "fatal: $0 cannot be used without a working tree."
 	fi
 }
 
@@ -206,13 +205,13 @@ require_clean_work_tree () {
 
 	if ! git diff-files --quiet --ignore-submodules
 	then
-		action=$1
-		case "$action" in
+		case "$1" in
 		"rewrite branches")
 			gettextln "Cannot rewrite branches: You have unstaged changes." >&2
 			;;
 		*)
-			eval_gettextln "Cannot \$action: You have unstaged changes." >&2
+			# Some out-of-tree user of require_clean_work_tree()
+			echo "Cannot $1: You have unstaged changes." >&2
 			;;
 		esac
 		err=1
@@ -222,8 +221,15 @@ require_clean_work_tree () {
 	then
 		if test $err = 0
 		then
-			action=$1
-			eval_gettextln "Cannot \$action: Your index contains uncommitted changes." >&2
+			case "$1" in
+			"rewrite branches")
+				gettextln "Cannot rewrite branches: You have unstaged changes." >&2
+				;;
+			*)
+				# Some out-of-tree user of require_clean_work_tree()
+				echo "Cannot $1: Your index contains uncommitted changes." >&2
+				;;
+			esac
 		else
 			gettextln "Additionally, your index contains uncommitted changes." >&2
 		fi
Partially revert d323c6b6410 (i18n: git-sh-setup.sh: mark strings for
translation, 2016-06-17).

These strings are no longer used in-tree, and we shouldn't be wasting
translator time on them for the benefit of a hypothetical out-of-tree
user of git-sh-setup.sh.

Since d03ebd411c6 (rebase: remove the rebase.useBuiltin setting,
2019-03-18) we've had no in-tree user of require_work_tree_exists(), and
since the more recent c1e10b2dce2 (git-sh-setup: remove messaging
supporting --preserve-merges, 2021-10-21) the only in-tree user of
require_clean_work_tree() is git-filter-branch.sh. Let's only translate
the message it uses, and revert the others to the pre-image of
d323c6b6410.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-sh-setup.sh | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)
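The message selection that the patch leaves in the unstaged-changes branch of require_clean_work_tree() can be tried outside of Git with a small stand-alone sketch. The function name report_unstaged and the gettextln stub are assumptions made for this illustration only; the real code lives in git-sh-setup.sh and gets gettextln from git-sh-i18n.

```shell
# Stand-in for gettextln from git-sh-i18n so that the sketch runs without
# Git: it simply prints its argument untranslated. (Assumption for
# illustration; the real function looks up a translation first.)
gettextln () {
	printf '%s\n' "$1"
}

# report_unstaged is a hypothetical name; its body mirrors the case
# statement the patch leaves in require_clean_work_tree().
report_unstaged () {
	case "$1" in
	"rewrite branches")
		# The only in-tree caller (git-filter-branch.sh): still
		# routed through gettextln, so still translatable.
		gettextln "Cannot rewrite branches: You have unstaged changes." >&2
		;;
	*)
		# Any other (out-of-tree) caller: plain English via echo.
		echo "Cannot $1: You have unstaged changes." >&2
		;;
	esac
}

report_unstaged "rewrite branches"
report_unstaged rebase
```

With the stub both calls print English text to stderr; in a real translated build only the first message would go through the translation lookup, while the second is always emitted as written.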