Message ID | aa6d73f3e526f416ee1e4e332e9ca3119efba0e8.1622126603.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | ref-filter: add %(raw) atom |
ZheNing Hu via GitGitGadget wrote:
> @@ -1372,6 +1389,15 @@ static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned
>  				      &bodypos, &bodylen, &nonsiglen,
>  				      &sigpos, &siglen);
>
> +		if (starts_with(name, "header")) {
> +			size_t header_len = subpos - (const char *)buf - 1;
> +			if (atom->u.header.option == H_BARE) {
> +				v->s = xmemdupz(buf, header_len);
> +			} else if (atom->u.header.option == H_LENGTH)

No need for braces in the if.

> +				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)header_len);
> +			continue;
> +		}
> +
>  		if (atom->u.contents.option == C_SUB)
>  			v->s = copy_subject(subpos, sublen);
>  		else if (atom->u.contents.option == C_SUB_SANITIZE) {
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Add new formatting option `%(header)`, which will print the
> structured header part of the raw object data.
>
> In the storage layout of an object: blobs and trees contain only
> raw data; commit and tag raw data contains two parts:
> header and contents. The header of a tag contains "object OOO",
> "type TTT", "tag AAA", "tagger GGG"; the header of a commit
> contains "tree RRR", "parent PPP", "author UUU", "committer CCC".
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-for-each-ref.txt |  7 +++++
>  ref-filter.c                       | 26 +++++++++++++++++
>  t/t6300-for-each-ref.sh            | 45 ++++++++++++++++++++++++++++++
>  3 files changed, 78 insertions(+)

While having this may not be wrong, I am not sure who needs it. Does
your "cat-file --batch" topic need this new atom?
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  		struct {
>  			enum { RAW_BARE, RAW_LENGTH } option;
>  		} raw_data;
> +		struct {
> +			enum { H_BARE, H_LENGTH } option;
> +		} header;

Raw does not use R_{BARE,LENGTH} and uses raw_data member. Header
should follow suit unless there is a compelling reason not to, no?

	struct {
		enum { HEADER_BARE, HEADER_LENGTH } option;
	} header_data;

perhaps?

> @@ -1372,6 +1389,15 @@ static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned
>  				      &bodypos, &bodylen, &nonsiglen,
>  				      &sigpos, &siglen);
>
> +		if (starts_with(name, "header")) {
> +			size_t header_len = subpos - (const char *)buf - 1;

Hmph, is this correct? I would expect that the "header" part of a
commit or a tag object excludes the blank line after the header
fields. In other words, the "header" would be separated by a blank
line from the "body", and that separating blank line is not part of
"header" or "body".

Otherwise, if there is a user of %(header), it needs to be coded to
ignore the last blank line but has to diagnose it as an error if
there is a blank line before that.

> +			if (atom->u.header.option == H_BARE) {
> +				v->s = xmemdupz(buf, header_len);
> +			} else if (atom->u.header.option == H_LENGTH)
> +				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)header_len);
> +			continue;
> +		}
> +
>  		if (atom->u.contents.option == C_SUB)
>  			v->s = copy_subject(subpos, sublen);
>  		else if (atom->u.contents.option == C_SUB_SANITIZE) {
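For illustration, the renaming suggested above could look roughly like the following. This is only a sketch under stated assumptions: `struct header_atom` and `parse_header_arg` are hypothetical stand-ins for the relevant part of `struct used_atom` and for `header_atom_parser`, and error reporting through `strbuf` is reduced to a plain -1 return so that the example compiles without Git's internal headers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant member of struct used_atom. */
struct header_atom {
	struct {
		enum { HEADER_BARE, HEADER_LENGTH } option;
	} header_data;
};

/*
 * Parse the optional argument of %(header), using the names
 * suggested in the review (HEADER_* and header_data).
 * Returns 0 on success, -1 if the argument is not recognized.
 */
static int parse_header_arg(struct header_atom *atom, const char *arg)
{
	if (!arg)
		atom->header_data.option = HEADER_BARE;
	else if (!strcmp(arg, "size"))
		atom->header_data.option = HEADER_LENGTH;
	else
		return -1;
	return 0;
}
```

With these names the member and its enumerators mirror the existing raw_data / RAW_BARE / RAW_LENGTH trio, which is the consistency the review asks for.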
Junio C Hamano <gitster@pobox.com> wrote on Fri, May 28, 2021 at 12:36 PM:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >  		struct {
> >  			enum { RAW_BARE, RAW_LENGTH } option;
> >  		} raw_data;
> > +		struct {
> > +			enum { H_BARE, H_LENGTH } option;
> > +		} header;
>
> Raw does not use R_{BARE,LENGTH} and uses raw_data member. Header
> should follow suit unless there is a compelling reason not to, no?
>
> 	struct {
> 		enum { HEADER_BARE, HEADER_LENGTH } option;
> 	} header_data;
>
> perhaps?
>

OK.

> > @@ -1372,6 +1389,15 @@ static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned
> >  				      &bodypos, &bodylen, &nonsiglen,
> >  				      &sigpos, &siglen);
> >
> > +		if (starts_with(name, "header")) {
> > +			size_t header_len = subpos - (const char *)buf - 1;
>
> Hmph, is this correct? I would expect that the "header" part of a
> commit or a tag object excludes the blank line after the header
> fields. In other words, the "header" would be separated by a blank
> line from the "body", and that separating blank line is not part of
> "header" or "body".
>
> Otherwise, if there is a user of %(header), it needs to be coded to
> ignore the last blank line but has to diagnose it as an error if
> there is a blank line before that.
>

I am a bit confused. Is there a problem with the way I did this?

> > +			size_t header_len = subpos - (const char *)buf - 1;

The "header" part starts at "buf", and header_len has 1 subtracted so
that the header part does not include the blank line. Likewise, the
"contents" part starts at subpos, so it does not include the blank
line either.

> While having this may not be wrong, I am not sure who needs it. Does
> your "cat-file --batch" topic need this new atom?

OK, I will remove it from this topic for now.

Thanks.
--
ZheNing Hu
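The arithmetic under discussion can be checked with a small standalone sketch. The helper names below are hypothetical and not part of ref-filter.c, and the sketch assumes that exactly one blank line separates the header from the message; the real code that computes subpos may behave differently when there are several blank lines.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first byte after the blank line that ends
 * the header block (the role played by subpos in the patch), or NULL
 * if the buffer has no blank line.
 */
static const char *message_start(const char *buf)
{
	const char *sep = strstr(buf, "\n\n");
	return sep ? sep + 2 : NULL;
}

/*
 * Length as computed in the patch: subpos - buf - 1. This drops the
 * blank separator line but keeps the newline that terminates the
 * last header line.
 */
static size_t header_len_with_newline(const char *buf)
{
	const char *subpos = message_start(buf);
	return subpos ? (size_t)(subpos - buf - 1) : 0;
}

/* Alternative: also drop the newline that ends the last header line. */
static size_t header_len_without_newline(const char *buf)
{
	const char *subpos = message_start(buf);
	return subpos ? (size_t)(subpos - buf - 2) : 0;
}
```

For the buffer "tree T\nauthor A\n\nsubject\n", header_len_with_newline() gives 16, covering "tree T\nauthor A\n", and header_len_without_newline() gives 15. The four tag header lines used in the test script, each counted with its trailing newline, add up to 131 bytes, which matches the %(header:size) value the test expects. Because the value already ends in a newline and for-each-ref terminates each formatted record with one more, the expected %(header) output in the tests ends with an empty line.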
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index f6ae751fd256..7827e48cde75 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -249,6 +249,13 @@ Note that `--format=%(raw)` should not combine with `--python`, `--shell`, `--tc
 `--perl` because if our binary raw data is passed to a variable in the host language,
 the host languages may cause escape errors.
 
+The structured header part of the raw data in a commit or a tag object is `header`,
+it composed of "tree XXX", "parent YYY", etc lines in commits, or composed of
+"object OOO", "type TTT", etc lines in tags.
+
+header:size::
+	The header size of the object.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index c2abf5da7006..2f426830f562 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -141,6 +141,9 @@ static struct used_atom {
 		struct {
 			enum { RAW_BARE, RAW_LENGTH } option;
 		} raw_data;
+		struct {
+			enum { H_BARE, H_LENGTH } option;
+		} header;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -385,6 +388,18 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
+static int header_atom_parser(const struct ref_format *format, struct used_atom *atom,
+			      const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.header.option = H_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.header.option = H_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(header) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -546,6 +561,7 @@ static struct {
 	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
 	{ "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
+	{ "header", SOURCE_OBJ, FIELD_STR, header_atom_parser },
 	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -1362,6 +1378,7 @@ static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned
 		if ((obj->type != OBJ_TAG &&
 		     obj->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
+		     !starts_with(name, "header") &&
 		     !starts_with(name, "subject") &&
 		     !starts_with(name, "trailers") &&
 		     !starts_with(name, "contents")))
@@ -1372,6 +1389,15 @@ static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned
 				      &bodypos, &bodylen, &nonsiglen,
 				      &sigpos, &siglen);
 
+		if (starts_with(name, "header")) {
+			size_t header_len = subpos - (const char *)buf - 1;
+			if (atom->u.header.option == H_BARE) {
+				v->s = xmemdupz(buf, header_len);
+			} else if (atom->u.header.option == H_LENGTH)
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)header_len);
+			continue;
+		}
+
 		if (atom->u.contents.option == C_SUB)
 			v->s = copy_subject(subpos, sublen);
 		else if (atom->u.contents.option == C_SUB_SANITIZE) {
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 07de4a84d70b..11fc8fc53649 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -232,6 +232,35 @@ test_expect_success 'basic atom: refs/tags/testtag *raw' '
 	test_cmp expected.clean actual.clean
 '
 
+test_expect_success 'basic atom: refs/tags/testtag header' '
+	cat >expected <<-EOF &&
+	object ea122842f48be4afb2d1fc6a4b96c05885ab7463
+	type commit
+	tag testtag
+	tagger C O Mitter <committer@example.com> 1151968725 +0200
+
+	EOF
+	git for-each-ref --format="%(header)" refs/tags/testtag >actual &&
+	test_cmp expected actual &&
+	echo "131" >expected &&
+	git for-each-ref --format="%(header:size)" refs/tags/testtag >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'basic atom: refs/heads/main header' '
+	cat >expected <<-EOF &&
+	tree 8039ce043250c402d62ca312e9596e42ce1c7bb0
+	author A U Thor <author@example.com> 1151968724 +0200
+	committer C O Mitter <committer@example.com> 1151968723 +0200
+
+	EOF
+	git for-each-ref --format="%(header)" refs/heads/main >actual &&
+	test_cmp expected actual &&
+	echo "162" >expected &&
+	git for-each-ref --format="%(header:size)" refs/heads/main >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -768,6 +797,14 @@ test_expect_success 'basic atom: refs/mytrees/first raw' '
 	test_cmp expected actual
 '
 
+test_expect_success 'basic atom: refs/mytrees/first header' '
+	echo "" >expected &&
+	git for-each-ref --format="%(header)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git for-each-ref --format="%(header:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -785,6 +822,14 @@ test_expect_success 'basic atom: refs/myblobs/first raw' '
 	test_cmp expected actual
 '
 
+test_expect_success 'basic atom: refs/myblobs/first header' '
+	echo "" >expected &&
+	git for-each-ref --format="%(header)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git for-each-ref --format="%(header:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up refs pointing to binary blob' '
 	printf "%b" "a\0b\0c" >blob1 &&
 	printf "%b" "a\0c\0b" >blob2 &&