mbox series

[v1,0/3] trace2 output for bitmap decision path

Message ID cover.1648119652.git.dyroneteng@gmail.com (mailing list archive)
Headers show
Series trace2 output for bitmap decision path | expand

Message

Teng Long March 24, 2022, 11:43 a.m. UTC
A Git repo can be chosen to use the normal bitmap (before MIDX) and MIDX bitmap.

I recently tried to understand this part of the MIDX implementation because I found
a bug which has been discovered and repaired in community [1].

I am grateful to Taylor Blau for his help and for introducing me to the testing
method according to the `git rev-list --test-bitmap <rev>`.

In the process of understanding and troubleshooting by using this command, I found
when the execution is failed it will output a single line of
"fatal: failed to load bitmap indexes", sometimes will be more informations like
if the bitmap file is broken, the outputs maybe contain
"error: Corrupted bitmap index file (wrong header)".), but most of time are single
line output I mentioned above. So, this brings a little obstacle for debugging and
troubleshooting I think, because "failed to load bitmap indexes" can represent
to much informations (many causes like: midx config turn off, bitmap inexistence, etc.)

Therefore, as a git repo can be chosen to use the normal bitmap (before MIDX) or
MIDX bitmap, or they can both exist and let git make the decision. I think why not add
some extra informations based on TRACE2 to help for showing the bitmap decision path
clearer and more plentiful, so when the failure occurs the user can use it to debug
around bitmap in a quicker way.

Thanks.

Links:
	1. https://public-inbox.org/git/cover.1638991570.git.me@ttaylorr.com/)

Teng Long (3):
  pack-bitmap.c: use "ret" in "open_midx_bitmap()"
  pack-bitmap.c: add "break" statement in "open_pack_bitmap()"
  bitmap: add trace outputs during open "bitmap" file

 midx.c        |  2 ++
 pack-bitmap.c | 17 +++++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

Range-diff against v0:
-:  ---------- > 1:  3048b4dd29 pack-bitmap.c: use "ret" in "open_midx_bitmap()"
-:  ---------- > 2:  70500b6343 pack-bitmap.c: add "break" statement in "open_pack_bitmap()"
-:  ---------- > 3:  9912450fc1 bitmap: add trace outputs during open "bitmap" file

Comments

Ævar Arnfjörð Bjarmason March 24, 2022, 1:22 p.m. UTC | #1
On Thu, Mar 24 2022, Teng Long wrote:

> A Git repo can be chosen to use the normal bitmap (before MIDX) and MIDX bitmap.
>
> I recently tried to understand this part of the MIDX implementation because I found
> a bug which has been discovered and repaired in community [1].
>
> I am grateful to Taylor Blau for his help and for introducing me to the testing
> method according to the `git rev-list --test-bitmap <rev>`.
>
> In the process of understanding and troubleshooting by using this command, I found
> when the execution is failed it will output a single line of
> "fatal: failed to load bitmap indexes", sometimes will be more informations like
> if the bitmap file is broken, the outputs maybe contain
> "error: Corrupted bitmap index file (wrong header)".), but most of time are single
> line output I mentioned above. So, this brings a little obstacle for debugging and
> troubleshooting I think, because "failed to load bitmap indexes" can represent
> to much informations (many causes like: midx config turn off, bitmap inexistence, etc.)
>
> Therefore, as a git repo can be chosen to use the normal bitmap (before MIDX) or
> MIDX bitmap, or they can both exist and let git make the decision. I think why not add
> some extra informations based on TRACE2 to help for showing the bitmap decision path
> clearer and more plentiful, so when the failure occurs the user can use it to debug
> around bitmap in a quicker way.
>
> Thanks.
>
> Links:
> 	1. https://public-inbox.org/git/cover.1638991570.git.me@ttaylorr.com/)
>
> Teng Long (3):
>   pack-bitmap.c: use "ret" in "open_midx_bitmap()"
>   pack-bitmap.c: add "break" statement in "open_pack_bitmap()"
>   bitmap: add trace outputs during open "bitmap" file
>
>  midx.c        |  2 ++
>  pack-bitmap.c | 17 +++++++++++++----
>  2 files changed, 15 insertions(+), 4 deletions(-)
>
> Range-diff against v0:
> -:  ---------- > 1:  3048b4dd29 pack-bitmap.c: use "ret" in "open_midx_bitmap()"
> -:  ---------- > 2:  70500b6343 pack-bitmap.c: add "break" statement in "open_pack_bitmap()"
> -:  ---------- > 3:  9912450fc1 bitmap: add trace outputs during open "bitmap" file

Was there an on-list v0 (RFC?) or is this a range-diff against nothing?
Best not to include it until a v2 then.

Comments:

Sometimes it's better to split up patches, but I think these 3x should
really be squashed together. We make incremental progress to nowhere in
1/3 and 2/3, and it all comes together in 3/3. The 1-2/3 are trivial
enough that we can squash them in.

We then end up with this, with my comments added:
	
	 midx.c        |  2 ++
	 pack-bitmap.c | 17 +++++++++++++----
	 2 files changed, 15 insertions(+), 4 deletions(-)
	
	diff --git a/midx.c b/midx.c
	index 865170bad05..fda96440287 100644
	--- a/midx.c
	+++ b/midx.c
	@@ -392,6 +392,8 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
	 	struct multi_pack_index *m_search;
	 
	 	prepare_repo_settings(r);
	+	trace2_data_string("midx", r, "core.multipackIndex",
	+					   r->settings.core_multi_pack_index ? "true" : "false");

Weird indentation here.

Also, if we think it's a good idea to log these shouldn't it be in
repo_cfg_bool() in repo-settings.c, why is core.multipackIndex out of
all in r->settings special?

	 	if (!r->settings.core_multi_pack_index)
	 		return 0;
	 
	diff --git a/pack-bitmap.c b/pack-bitmap.c
	index 97909d48da3..cac8d4a978f 100644
	--- a/pack-bitmap.c
	+++ b/pack-bitmap.c
	@@ -484,25 +484,34 @@ static int open_pack_bitmap(struct repository *r,
	 	assert(!bitmap_git->map);
	 
	 	for (p = get_all_packs(r); p; p = p->next) {
	-		if (open_pack_bitmap_1(bitmap_git, p) == 0)
	+		if (open_pack_bitmap_1(bitmap_git, p) == 0) {

Aside: If we end up changing this line anyway, it's OK to just change it
to "if (!open...".


	 			ret = 0;
	+			break;
	+		}
	 	}
	 
	+	trace2_data_string("bitmap", the_repository, "open bitmap (non-midx)",
	+					   ret ? "failed" : "ok");
	 	return ret;
	 }
	 
	 static int open_midx_bitmap(struct repository *r,
	 			    struct bitmap_index *bitmap_git)
	 {
	+	int ret = -1;
	 	struct multi_pack_index *midx;
	 
	 	assert(!bitmap_git->map);
	 
	 	for (midx = get_multi_pack_index(r); midx; midx = midx->next) {
	-		if (!open_midx_bitmap_1(bitmap_git, midx))
	-			return 0;
	+		if (!open_midx_bitmap_1(bitmap_git, midx)) {
	+			ret = 0;
	+			break;
	+		}
	 	}
	-	return -1;
	+	trace2_data_string("midx", the_repository, "open bitmap (midx)",
	+					   ret ? "failed" : "ok");
	+	return ret;
	 }
	 
	 static int open_bitmap(struct repository *r,

It seems odd not to use trace2 regions for this, and to not add add this
data logging open_bitmap(). I came up with this on top of this when
testing this:
	
	diff --git a/pack-bitmap.c b/pack-bitmap.c
	index cac8d4a978f..ba71a7ea5cd 100644
	--- a/pack-bitmap.c
	+++ b/pack-bitmap.c
	@@ -318,11 +318,14 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
	 
	 	free(idx_name);
	 
	-	if (fd < 0)
	+	if (fd < 0) {
	+		/* TODO: Log trace2_data_string() here, do we care? */
	 		return -1;
	+	}
	 
	 	if (fstat(fd, &st)) {
	 		close(fd);
	+		/* TODO: Log trace2_data_string() here, do we care? */
	 		return -1;
	 	}
	 
	@@ -330,6 +333,7 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
	 		struct strbuf buf = STRBUF_INIT;
	 		get_midx_filename(&buf, midx->object_dir);
	 		/* ignore extra bitmap file; we can only handle one */
	+		/* NOTE: You'll already get a warning (well, "error") event due to this, and it'll be in your region */
	 		warning("ignoring extra bitmap file: %s", buf.buf);
	 		close(fd);
	 		strbuf_release(&buf);
	@@ -344,9 +348,11 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
	 	close(fd);
	 
	 	if (load_bitmap_header(bitmap_git) < 0)
	+		/* TODO: Add trace2_data_string() or warning/error here? */
	 		goto cleanup;
	 
	 	if (!hasheq(get_midx_checksum(bitmap_git->midx), bitmap_git->checksum))
	+		/* TODO: Add trace2_data_string() or warning/error here? */
	 		goto cleanup;
	 
	 	if (load_midx_revindex(bitmap_git->midx) < 0) {
	@@ -479,49 +485,44 @@ static int open_pack_bitmap(struct repository *r,
	 			    struct bitmap_index *bitmap_git)
	 {
	 	struct packed_git *p;
	-	int ret = -1;
	-
	-	assert(!bitmap_git->map);
	 
	 	for (p = get_all_packs(r); p; p = p->next) {
	 		if (open_pack_bitmap_1(bitmap_git, p) == 0) {
	-			ret = 0;
	-			break;
	+			return 0;
	 		}
	 	}
	-
	-	trace2_data_string("bitmap", the_repository, "open bitmap (non-midx)",
	-					   ret ? "failed" : "ok");
	-	return ret;
	+	return -1;
	 }
	 
	 static int open_midx_bitmap(struct repository *r,
	 			    struct bitmap_index *bitmap_git)
	 {
	-	int ret = -1;
	 	struct multi_pack_index *midx;
	 
	-	assert(!bitmap_git->map);
	-
	 	for (midx = get_multi_pack_index(r); midx; midx = midx->next) {
	 		if (!open_midx_bitmap_1(bitmap_git, midx)) {
	-			ret = 0;
	-			break;
	+			return 0;
	 		}
	 	}
	-	trace2_data_string("midx", the_repository, "open bitmap (midx)",
	-					   ret ? "failed" : "ok");
	-	return ret;
	+	return -1;
	 }
	 
	 static int open_bitmap(struct repository *r,
	 		       struct bitmap_index *bitmap_git)
	 {
	+	int ret;
	+
	 	assert(!bitmap_git->map);
	 
	-	if (!open_midx_bitmap(r, bitmap_git))
	-		return 0;
	-	return open_pack_bitmap(r, bitmap_git);
	+	trace2_region_enter("pack-bitmap", "open_bitmap", r);
	+	if (!open_midx_bitmap(r, bitmap_git)) {
	+		ret = 0;
	+		goto done;
	+	}
	+	ret = open_pack_bitmap(r, bitmap_git);
	+done:
	+	trace2_region_leave("pack-bitmap", "open_bitmap", r);
	+	return ret;
	 }
	 
	 struct bitmap_index *prepare_bitmap_git(struct repository *r)

I.e. surely you just want to create a region, and if you care enough to
log failure shouldn't we log that in open_midx_bitmap_1() if we care,
and perhaps error() there instead of silently returning -1?
Teng Long March 29, 2022, 7:38 a.m. UTC | #2
> Was there an on-list v0 (RFC?) or is this a range-diff against nothing?
> Best not to include it until a v2 then.

My careless, there's no RFC, actually this patch should begin with v0 and
without range-diff.


> Sometimes it's better to split up patches, but I think these 3x should
> really be squashed together. We make incremental progress to nowhere in
> 1/3 and 2/3, and it all comes together in 3/3. The 1-2/3 are trivial
> enough that we can squash them in.

Sure.

>	+	trace2_data_string("midx", r, "core.multipackIndex",
>	+						   r->settings.core_multi_pack_index ? "true" : "false");
>
> Weird indentation here.

Will fix.

> I.e. surely you just want to create a region, and if you care enough to
> log failure shouldn't we log that in open_midx_bitmap_1() if we care,

Actually, I just want to use "trace2_data_string()" at first because which informations I
want to append is not so many, or does "create a region" a preferred principle for using
TRACE2 APIs?

> and perhaps error() there instead of silently returning -1?

I think so, after I checked function "error_builtin()" and there is a "trace2_cmd_error_va()"
usage in it which is already support to print some debug infos using TRACE2 envs.

Thanks.
Ævar Arnfjörð Bjarmason March 29, 2022, 8:54 a.m. UTC | #3
On Tue, Mar 29 2022, Teng Long wrote:

>> Was there an on-list v0 (RFC?) or is this a range-diff against nothing?
>> Best not to include it until a v2 then.
>
> My careless, there's no RFC, actually this patch should begin with v0 and
> without range-diff.

No worries.

>> Sometimes it's better to split up patches, but I think these 3x should
>> really be squashed together. We make incremental progress to nowhere in
>> 1/3 and 2/3, and it all comes together in 3/3. The 1-2/3 are trivial
>> enough that we can squash them in.
>
> Sure.
>
>>	+	trace2_data_string("midx", r, "core.multipackIndex",
>>	+						   r->settings.core_multi_pack_index ? "true" : "false");
>>
>> Weird indentation here.
>
> Will fix.

*Nod*

>> I.e. surely you just want to create a region, and if you care enough to
>> log failure shouldn't we log that in open_midx_bitmap_1() if we care,
>
> Actually, I just want to use "trace2_data_string()" at first because which informations I
> want to append is not so many, or does "create a region" a preferred principle for using
> TRACE2 APIs?

I think a begin/end region gives you everything you want here and more, i.e.:

 * We'll get start/end times on this (potentially expensive?) operation
   automatically.

 * You just log a "failed", but as the RFC-on-top shows we can just log
   the specific reason instead, either via error(), warning() or perhaps
   even trace2_data_string(). because it's within a region it'll be
   clear from the data what failed.

B.t.w. I noticed after sending that that this ad-hoc patch fails the
tests, it passed the one bitmap test I hacked it up against.

It was just there to make the point/demo that it looked easier to do
this one function call above where you were adding this.

>> and perhaps error() there instead of silently returning -1?
>
> I think so, after I checked function "error_builtin()" and there is a "trace2_cmd_error_va()"
> usage in it which is already support to print some debug infos using TRACE2 envs.

Exactly.