Message ID | cover.1681906948.git.ps@pks.im (mailing list archive) |
---|---|
Headers | show |
Series | fetch: introduce machine-parseable output | expand |
Patrick Steinhardt wrote: > Parsing the output of fetches is mostly impossible. It prettifies > reference names that are about to be updated, doesn't print the old and > new object IDs the refs are being updated from and to, and prints all of > that information in nice columns. In short, it is designed to be read by > humans rather than machines. > > This makes it hard to use in a script way though, e.g. to learn about > which references actually have been updated or which have not been > updated. This patch series intends to fix that by introducing a new > machine-parseable interface: > > ``` > $ git fetch --output-format=porcelain --no-progress > fff5a5e7f528b2ed2c335991399a766c2cf01103 af67688dca57999fd848f051eeea1d375ba546b2 refs/remotes/origin/master > * 0000000000000000000000000000000000000000 e046fe5a36a970bc14fbfbcb2074a48776f6b671 refs/remotes/origin/x86-rep-insns > * 0000000000000000000000000000000000000000 bb81ed6862b864c9eb99447f04d49a84ecb647e5 refs/tags/v6.3-rc4 > * 0000000000000000000000000000000000000000 83af7b1468c0dca86b4dc9e43e73bfa4f38d9637 refs/tags/v6.3-rc5 > * 0000000000000000000000000000000000000000 ab3affb8ed84f68638162fe7e6fd4055e15bff5b refs/tags/v6.3-rc6 > * 0000000000000000000000000000000000000000 1c8c28415e8743368a2b800520a6dd0b22ee6ec2 refs/tags/v6.3-rc7 > ``` Makes sense, my only question is what other format could `git fetch` have? I think `--format=porcelain` is clear enough.
On Mon, Apr 24, 2023 at 02:17:31PM -0600, Felipe Contreras wrote: > Patrick Steinhardt wrote: > > Parsing the output of fetches is mostly impossible. It prettifies > > reference names that are about to be updated, doesn't print the old and > > new object IDs the refs are being updated from and to, and prints all of > > that information in nice columns. In short, it is designed to be read by > > humans rather than machines. > > > > This makes it hard to use in a script way though, e.g. to learn about > > which references actually have been updated or which have not been > > updated. This patch series intends to fix that by introducing a new > > machine-parseable interface: > > > > ``` > > $ git fetch --output-format=porcelain --no-progress > > fff5a5e7f528b2ed2c335991399a766c2cf01103 af67688dca57999fd848f051eeea1d375ba546b2 refs/remotes/origin/master > > * 0000000000000000000000000000000000000000 e046fe5a36a970bc14fbfbcb2074a48776f6b671 refs/remotes/origin/x86-rep-insns > > * 0000000000000000000000000000000000000000 bb81ed6862b864c9eb99447f04d49a84ecb647e5 refs/tags/v6.3-rc4 > > * 0000000000000000000000000000000000000000 83af7b1468c0dca86b4dc9e43e73bfa4f38d9637 refs/tags/v6.3-rc5 > > * 0000000000000000000000000000000000000000 ab3affb8ed84f68638162fe7e6fd4055e15bff5b refs/tags/v6.3-rc6 > > * 0000000000000000000000000000000000000000 1c8c28415e8743368a2b800520a6dd0b22ee6ec2 refs/tags/v6.3-rc7 > > ``` > > Makes sense, my only question is what other format could `git fetch` have? I > think `--format=porcelain` is clear enough. Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. I'll wait for the Review Club that discusses this patch set tomorrow and will send a new version with that change afterwards if nobody disagrees. Patrick
Hi Patrick! Thanks for the pleasant read! I thought this was a great topic for Review Club. It's too bad that we missed you, but we post all relevant feedback here anyway. Nevertheless, if you'd like to see the meeting notes, you can find them at: https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit Patrick Steinhardt <ps@pks.im> writes: > Parsing the output of fetches is mostly impossible. It prettifies > reference names that are about to be updated, doesn't print the old and > new object IDs the refs are being updated from and to, and prints all of > that information in nice columns. In short, it is designed to be read by > humans rather than machines. > > This makes it hard to use in a script way though, e.g. to learn about > which references actually have been updated or which have not been > updated. This patch series intends to fix that by introducing a new > machine-parseable interface: > > ``` > $ git fetch --output-format=porcelain --no-progress > fff5a5e7f528b2ed2c335991399a766c2cf01103 af67688dca57999fd848f051eeea1d375ba546b2 refs/remotes/origin/master > * 0000000000000000000000000000000000000000 e046fe5a36a970bc14fbfbcb2074a48776f6b671 refs/remotes/origin/x86-rep-insns > * 0000000000000000000000000000000000000000 bb81ed6862b864c9eb99447f04d49a84ecb647e5 refs/tags/v6.3-rc4 > * 0000000000000000000000000000000000000000 83af7b1468c0dca86b4dc9e43e73bfa4f38d9637 refs/tags/v6.3-rc5 > * 0000000000000000000000000000000000000000 ab3affb8ed84f68638162fe7e6fd4055e15bff5b refs/tags/v6.3-rc6 > * 0000000000000000000000000000000000000000 1c8c28415e8743368a2b800520a6dd0b22ee6ec2 refs/tags/v6.3-rc7 > ``` Having machine-parseable output seems like an obviously good goal to me. The finer points of the interface and output format are worth discussing. I'll do so in the patches themselves. Overall, I found the series very well-structured and easy to follow along. Thanks.
On 4/25/2023 2:58 AM, Patrick Steinhardt wrote: > On Mon, Apr 24, 2023 at 02:17:31PM -0600, Felipe Contreras wrote: >> Patrick Steinhardt wrote: >>> Parsing the output of fetches is mostly impossible. It prettifies >>> reference names that are about to be updated, doesn't print the old and >>> new object IDs the refs are being updated from and to, and prints all of >>> that information in nice columns. In short, it is designed to be read by >>> humans rather than machines. >>> >>> This makes it hard to use in a script way though, e.g. to learn about >>> which references actually have been updated or which have not been >>> updated. This patch series intends to fix that by introducing a new >>> machine-parseable interface: >>> >>> ``` >>> $ git fetch --output-format=porcelain --no-progress >>> fff5a5e7f528b2ed2c335991399a766c2cf01103 af67688dca57999fd848f051eeea1d375ba546b2 refs/remotes/origin/master >>> * 0000000000000000000000000000000000000000 e046fe5a36a970bc14fbfbcb2074a48776f6b671 refs/remotes/origin/x86-rep-insns >>> * 0000000000000000000000000000000000000000 bb81ed6862b864c9eb99447f04d49a84ecb647e5 refs/tags/v6.3-rc4 >>> * 0000000000000000000000000000000000000000 83af7b1468c0dca86b4dc9e43e73bfa4f38d9637 refs/tags/v6.3-rc5 >>> * 0000000000000000000000000000000000000000 ab3affb8ed84f68638162fe7e6fd4055e15bff5b refs/tags/v6.3-rc6 >>> * 0000000000000000000000000000000000000000 1c8c28415e8743368a2b800520a6dd0b22ee6ec2 refs/tags/v6.3-rc7 >>> ``` >> >> Makes sense, my only question is what other format could `git fetch` have? I >> think `--format=porcelain` is clear enough. > > Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. > I'll wait for the Review Club that discusses this patch set tomorrow and > will send a new version with that change afterwards if nobody disagrees. > > Patrick We had some discussion during review club about this, where the idea of using "--porcelain" came up because many commands use that when switching into a machine readable format. In addition, this format not only changes the output but also moves it from being on stderr to stdout, which is a hint that the intended usage of the command is now a little different. I don't have strong opinion here but want to note that "--format" is often used by commands like log which changes how we structure the output of git objects. Obviously using "--porcelain" is a bit weird when dealing with the pre-existing compact and full outputs, and perhaps --format wouldn't be confusing. I didn't find any command which used --output-format today, and all the uses of --format I saw were for object formatting like git log. I'm ok with --format, but wanted to note the potential confusion. git status also uses --porcelain and has a -z option for NUL terminating instead of newline terminating. We thought -z might be useful to handle the potential of weird refnames that include characters. No one on the review could remember what rules were enforced on refnames to confirm if it was legal to have '\n' in a refname or not.
On 4/19/2023 5:31 AM, Patrick Steinhardt wrote: > > ``` > $ git fetch --output-format=porcelain --no-progress > fff5a5e7f528b2ed2c335991399a766c2cf01103 af67688dca57999fd848f051eeea1d375ba546b2 refs/remotes/origin/master > * 0000000000000000000000000000000000000000 e046fe5a36a970bc14fbfbcb2074a48776f6b671 refs/remotes/origin/x86-rep-insns > * 0000000000000000000000000000000000000000 bb81ed6862b864c9eb99447f04d49a84ecb647e5 refs/tags/v6.3-rc4 > * 0000000000000000000000000000000000000000 83af7b1468c0dca86b4dc9e43e73bfa4f38d9637 refs/tags/v6.3-rc5 > * 0000000000000000000000000000000000000000 ab3affb8ed84f68638162fe7e6fd4055e15bff5b refs/tags/v6.3-rc6 > * 0000000000000000000000000000000000000000 1c8c28415e8743368a2b800520a6dd0b22ee6ec2 refs/tags/v6.3-rc7 > ``` > One thing that the standard output (maybe with --progress?) shows to stderr is which remote is being fetched and what URL was fetched. This seems like useful data to include in a machine readable format. It wasn't clear if such output would still be made and whether it would go to stderr or stdout, nor whether that existing output was machine readable. Obviously you can somewhat infer remotes based on the refs/remotes, but that doesn't include other refs like tags. When fetching from multiple reports (--all or explicitly asking), I wonder if it makes sense to include some lines to differentiate where each block of updates for a given remote starts and ends? Thanks, Jake
Jacob Keller <jacob.e.keller@intel.com> writes: >> Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. >> I'll wait for the Review Club that discusses this patch set tomorrow and >> will send a new version with that change afterwards if nobody disagrees. >> >> Patrick > > We had some discussion during review club about this, where the idea of > using "--porcelain" came up because many commands use that when > switching into a machine readable format. > > In addition, this format not only changes the output but also moves it > from being on stderr to stdout, which is a hint that the intended usage > of the command is now a little different. A little different from what? I do not think the answer would be "other program's --porcelain mode", as sending them to stdout would be one of the things that make the output easier for programs to parse, so it does sound like very much in the same spirit as "git status --porcelain" where its output format gets tweaked to be more machine friendly. The output with "--porcelain" option enabled tend to be less human friendly and the distinction between Porcelain (for humans) and plumbing (for scripts) is reversed in the use of the word there---it started as "this is the option for those who write Porcelain commands to use", but still it is not a very good name for the option. I am perfectly OK if the plan is to uniformly use --output-format (or something equally more descriptive) and migrate and deprecate the "--porcelain" option away from existing commands. Thanks.
Jacob Keller <jacob.e.keller@intel.com> writes: >> Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. >> I'll wait for the Review Club that discusses this patch set tomorrow and >> will send a new version with that change afterwards if nobody disagrees. >> >> Patrick > > We had some discussion during review club about this, where the idea of > using "--porcelain" came up because many commands use that when > switching into a machine readable format. > > In addition, this format not only changes the output but also moves it > from being on stderr to stdout, which is a hint that the intended usage > of the command is now a little different. A little different from what? I do not think the answer would be "other program's --porcelain mode", as sending them to stdout would be one of the things that make the output easier for programs to parse, so it does sound like very much in the same spirit as "git status --porcelain" where its output format gets tweaked to be more machine friendly. The output with "--porcelain" option enabled tend to be less human friendly and the distinction between Porcelain (for humans) and plumbing (for scripts) is reversed in the use of the word there---it started as "this is the option for those who write Porcelain commands to use", but still it is not a very good name for the option. I am perfectly OK if the plan is to uniformly use --output-format (or something equally more descriptive) and migrate and deprecate the "--porcelain" option away from existing commands. Thanks.
On 4/26/2023 1:23 PM, Junio C Hamano wrote: > Jacob Keller <jacob.e.keller@intel.com> writes: > >>> Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. >>> I'll wait for the Review Club that discusses this patch set tomorrow and >>> will send a new version with that change afterwards if nobody disagrees. >>> >>> Patrick >> >> We had some discussion during review club about this, where the idea of >> using "--porcelain" came up because many commands use that when >> switching into a machine readable format. >> >> In addition, this format not only changes the output but also moves it >> from being on stderr to stdout, which is a hint that the intended usage >> of the command is now a little different. > > A little different from what? I do not think the answer would be > "other program's --porcelain mode", as sending them to stdout would > be one of the things that make the output easier for programs to > parse, so it does sound like very much in the same spirit as "git > status --porcelain" where its output format gets tweaked to be more > machine friendly. A little different from using git fetch normally where all output is stderr and is generally "this is what I did" but which was obviously not intended to be parsed by scripts but instead by the human who ran the command. > > The output with "--porcelain" option enabled tend to be less human > friendly and the distinction between Porcelain (for humans) and > plumbing (for scripts) is reversed in the use of the word there---it > started as "this is the option for those who write Porcelain > commands to use", but still it is not a very good name for the > option. Yea the option sort of means "to be used by those implementing porcelain" and its definitely a bit confusing. > > I am perfectly OK if the plan is to uniformly use --output-format > (or something equally more descriptive) and migrate and deprecate > the "--porcelain" option away from existing commands. > > Thanks. That makes sense to me as well.
Glen Choo <chooglen@google.com> writes: > Hi Patrick! > > Thanks for the pleasant read! I thought this was a great topic for > Review Club. It's too bad that we missed you, but we post all relevant > feedback here anyway. > > Nevertheless, if you'd like to see the meeting notes, you can find them > at: > > https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit It seems like Jacob Keller and I have mostly the same comments (since we did both go to the same Review Club, after all :)), so feel free to respond to only Jacob's messages and ignore mine (where the messages overlap) - I'll catch up on those conversations.
On Wed, Apr 26, 2023 at 01:23:12PM -0700, Junio C Hamano wrote: > Jacob Keller <jacob.e.keller@intel.com> writes: > > >> Yeah, I'd be perfectly happy to rename this to `--format=porcelain`. > >> I'll wait for the Review Club that discusses this patch set tomorrow and > >> will send a new version with that change afterwards if nobody disagrees. > >> > >> Patrick > > > > We had some discussion during review club about this, where the idea of > > using "--porcelain" came up because many commands use that when > > switching into a machine readable format. > > > > In addition, this format not only changes the output but also moves it > > from being on stderr to stdout, which is a hint that the intended usage > > of the command is now a little different. > > A little different from what? I do not think the answer would be > "other program's --porcelain mode", as sending them to stdout would > be one of the things that make the output easier for programs to > parse, so it does sound like very much in the same spirit as "git > status --porcelain" where its output format gets tweaked to be more > machine friendly. > > The output with "--porcelain" option enabled tend to be less human > friendly and the distinction between Porcelain (for humans) and > plumbing (for scripts) is reversed in the use of the word there---it > started as "this is the option for those who write Porcelain > commands to use", but still it is not a very good name for the > option. Ah, thanks for explaining where this reversed meaning actually comes from. I really wondered why we use "porcelain" in preexisting code to reflect machine-parseable interface, but that explanation does make sense. > I am perfectly OK if the plan is to uniformly use --output-format > (or something equally more descriptive) and migrate and deprecate > the "--porcelain" option away from existing commands. I'm still not quite clear on where the consensus lies now. Personally, I think that both `--format` and `--output-format` work well and are a bit more descriptive than simply `--porcelain`, and wouldn't mind also migrating other binaries to use either of them. Furthermore, I think that `--[output-]format` has the advantage that you don't need to handle priorities or mutual exclusion of different options that all apply to the reference format. To a user, it is not immediately obvious what `git fetch --format=compact --porcelain` would do, and which of both options ultimately get respected. But that's likely only true for future commands, because any migration would create the same kind of ambiguity for preexisting commands. If we were to also migrate preexisting code to use `--[output-]format` then I'd argue that `--output-format` is likely the better name, mostly because it is less likely to be ambiguous compared to `--format`. The latter could e.g. easily confused with `--object-format`. So I think I'll stick with `--output-format` for the time being. Patrick
On 4/27/2023 3:58 AM, Patrick Steinhardt wrote: > On Wed, Apr 26, 2023 at 01:23:12PM -0700, Junio C Hamano wrote: > Furthermore, I think that `--[output-]format` has the advantage that you > don't need to handle priorities or mutual exclusion of different options > that all apply to the reference format. To a user, it is not immediately > obvious what `git fetch --format=compact --porcelain` would do, and > which of both options ultimately get respected. But that's likely only > true for future commands, because any migration would create the same > kind of ambiguity for preexisting commands. > > If we were to also migrate preexisting code to use `--[output-]format` > then I'd argue that `--output-format` is likely the better name, mostly > because it is less likely to be ambiguous compared to `--format`. The > latter could e.g. easily confused with `--object-format`. > > So I think I'll stick with `--output-format` for the time being. > I agree. I think using --output-format and migrating existing commands that have --porcelain to use --output-format is good. I'm not sure whether to keep using --output-format=porcelain or whether to use a different term there. I definitely think avoiding confusion with --format is good, and I don't think its too much of a burden to have the longer --output-format as the option name. Thanks, Jake > Patrick
Junio C Hamano <gitster@pobox.com> writes: >> We had some discussion during review club about this, where the idea of >> using "--porcelain" came up because many commands use that when >> switching into a machine readable format. >> >> In addition, this format not only changes the output but also moves it >> from being on stderr to stdout, which is a hint that the intended usage >> of the command is now a little different. > > A little different from what? I do not think the answer would be > "other program's --porcelain mode", as sending them to stdout would > be one of the things that make the output easier for programs to > parse, so it does sound like very much in the same spirit as "git > status --porcelain" where its output format gets tweaked to be more > machine friendly. > > The output with "--porcelain" option enabled tend to be less human > friendly and the distinction between Porcelain (for humans) and > plumbing (for scripts) is reversed in the use of the word there---it > started as "this is the option for those who write Porcelain > commands to use", but still it is not a very good name for the > option. > > I am perfectly OK if the plan is to uniformly use --output-format > (or something equally more descriptive) and migrate and deprecate > the "--porcelain" option away from existing commands. I agree that --porcelain is a confusing name that would be nice to deprecate, but I don't think --output-format captures all of the intent of "operate in a machine-friendly mode instead of a human-friendly one". Unfortunately, if we had picked --plumbing from the beginning, I doubt we would be having this discussion today :/ E.g. machines (Unix ones at least?) like to have output on stdout and to be able to request NUL-terminated output. It's unfortunate that if we don't piggyback onto --output-format, we run into option precedence problems (like Patrick mentioned), but I'd find it more confusing that --output-format=[porcelain|full|compact] don't behave the same way. I don't think this puts us in a better spot with regards to option precedence either. Consider: git fetch --output-format=full -z <...> The only way to respect both options is to have -z affect the human-readable output, which isn't the end of the world, but it seems unnecessary. Perhaps something like --machine instead?