diff mbox series

[v2,1/1] pack-bitmap.c: avoid exposing absolute paths

Message ID 87a494e5ac0cc992689944ab13600d097c51e54a.1667393419.git.dyroneteng@gmail.com (mailing list archive)
State Superseded
Headers show
Series pack-bitmap.c: avoid exposing absolute paths | expand

Commit Message

Teng Long Nov. 2, 2022, 12:56 p.m. UTC
From: Teng Long <dyroneteng@gmail.com>

In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
there are multiple bitmaps, we will only open the first one and then
leave warnings about the remaining pack information, the information
will contain the absolute path of the repository, for example in a
alternates usage scenario. So let's hide this kind of potentially
sensitive information in this commit.

Found-by: XingXin <moweng.xx@antgroup.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
 pack-bitmap.c           | 12 ++++++++----
 t/t5310-pack-bitmaps.sh | 11 ++++++++---
 2 files changed, 16 insertions(+), 7 deletions(-)

Comments

Taylor Blau Nov. 3, 2022, 1:16 a.m. UTC | #1
On Wed, Nov 02, 2022 at 08:56:05PM +0800, Teng Long wrote:
> From: Teng Long <dyroneteng@gmail.com>
>
> In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
> there are multiple bitmaps, we will only open the first one and then
> leave warnings about the remaining pack information, the information
> will contain the absolute path of the repository, for example in a
> alternates usage scenario. So let's hide this kind of potentially
> sensitive information in this commit.
>
> Found-by: XingXin <moweng.xx@antgroup.com>
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>  pack-bitmap.c           | 12 ++++++++----
>  t/t5310-pack-bitmaps.sh | 11 ++++++++---
>  2 files changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 36134222d7a..a8c76056637 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -331,8 +331,9 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
>  	if (bitmap_git->pack || bitmap_git->midx) {
>  		struct strbuf buf = STRBUF_INIT;
>  		get_midx_filename(&buf, midx->object_dir);
> -		/* ignore extra bitmap file; we can only handle one */
> -		warning("ignoring extra bitmap file: %s", buf.buf);

OK... here we're getting rid of the user-visible warning, which makes
sense and is the point of this patch.

> +		/* ignore extra midx bitmap files; we can only handle one */
> +		trace2_data_string("bitmap", the_repository,
> +				   "ignoring extra midx bitmap file", basename(buf.buf));

But we replace it with a trace2_data_string() that only includes the
basename? I'd think that the point of moving this warning into
trace2-land is that we could emit the full path without worrying about
who sees it, since trace2 data is typically not plumbed through to
end-users.

IOW, I would have expected to see a patch something along the lines of:

> -		/* ignore extra bitmap file; we can only handle one */
> -		warning("ignoring extra bitmap file: %s", buf.buf);
> +		/* ignore extra midx bitmap files; we can only handle one */
> +		trace2_data_string("bitmap", the_repository,
> +				   "ignoring extra midx bitmap file", buf.buf);

> @@ -402,8 +403,9 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
>  	}
>
>  	if (bitmap_git->pack || bitmap_git->midx) {
> -		/* ignore extra bitmap file; we can only handle one */
> -		warning("ignoring extra bitmap file: %s", packfile->pack_name);
> +		/* ignore extra bitmap files; we can only handle one */
> +		trace2_data_string("bitmap", the_repository,
> +				   "ignoring extra bitmap file", basename(packfile->pack_name));

Same note here.

> +	trace2_data_string("bitmap", the_repository, "opened bitmap file",
> +			   basename(packfile->pack_name));

If we get later bitmap-related information in the trace2 stream, we know
that we opened a bitmap. And at the moment we read a bitmap, there is
only one such bitmap in the repository.

I suppose that this is race-proof in the sense that if the bitmaps
change after reading, we can still usefully debug the trace2 stream
after the fact.

So even though my first reaction was that this was unnecessary, on
balance I do find it useful.

> -test_expect_success 'complains about multiple pack bitmaps' '
> +test_expect_success 'complains about multiple pack bitmaps when debugging by trace2' '
>  	rm -fr repo &&
>  	git init repo &&
>  	test_when_finished "rm -fr repo" &&
> @@ -420,8 +420,13 @@ test_expect_success 'complains about multiple pack bitmaps' '
>  		test_line_count = 2 packs &&
>  		test_line_count = 2 bitmaps &&
>
> -		git rev-list --use-bitmap-index HEAD 2>err &&
> -		grep "ignoring extra bitmap file" err
> +		ls -tr .git/objects/pack | grep -e "^pack-.*\.pack$" > sorted-packs &&
> +		ignored_pack="$(cat sorted-packs | awk 'NR==1{print}')" &&
> +		open_pack="$(cat sorted-packs | awk 'NR==2{print}')" &&

Hmmph. This test only passes if 'ls -tr' gives us the packs in order
that they are read by readdir(), which doesn't seem all that portable to
me. At the very least, it is tightly coupled to the implementation of
open_pack_bitmap() and friends.

> +		GIT_TRACE2_PERF=1 git rev-list --use-bitmap-index HEAD 2>err &&
> +		grep "opened bitmap file:$opened_pack" err &&
> +		grep "ignoring extra bitmap file:$ignored_pack" err

Do we necessarily care about which .bitmap is read and which isn't? The
existing test doesn't look too closely, which I think is fine (though as
the author of that test, I might be biased ;-)).

I would be equally happy to write:

> +		GIT_TRACE2_PERF=$(pwd)/trace2.txt git rev-list --use-bitmap-index HEAD &&
> +		grep "opened bitmap" trace2.txt &&
> +		grep "ignoring extra bitmap" trace2.txt

Thanks,
Taylor
Teng Long Nov. 3, 2022, 9:35 a.m. UTC | #2
Taylor Blau <me@ttaylorr.com> writes:

> OK... here we're getting rid of the user-visible warning, which makes
> sense and is the point of this patch.

Yes, my point is to avoid exposing some sensitive informations to end- users.

> But we replace it with a trace2_data_string() that only includes the
> basename? I'd think that the point of moving this warning into
> trace2-land is that we could emit the full path without worrying about
> who sees it, since trace2 data is typically not plumbed through to
> end-users.

I'm not sure if trace2 data will be given to end-users, at least as I understand
it as you, it's not. If so, your opinion is very reasonable.

So... maybe we could add a new configuration like "core.hideSensitive", and
there are some supported values , "loose", "normal " and "strict":

    loose: plumbing full information in trace2, even warning.
    normal: plumbing full information in trace2, but not warning.
    strict: plumbing part of information in trace2, but not warning

But it looks like heavy, maybe not worthy... So, currently I will remove
basename and print the pack with path if there are no new inputs here.

Thanks.

> If we get later bitmap-related information in the trace2 stream, we know
> that we opened a bitmap. And at the moment we read a bitmap, there is
> only one such bitmap in the repository.
>
> I suppose that this is race-proof in the sense that if the bitmaps
> change after reading, we can still usefully debug the trace2 stream
> after the fact.
>
> So even though my first reaction was that this was unnecessary, on
> balance I do find it useful.

Yes... I think it's useful as least for me to do some bitmap tests and
I think print a bit more related information in trace2 data might be OK.

> > -test_expect_success 'complains about multiple pack bitmaps' '
> > +test_expect_success 'complains about multiple pack bitmaps when debugging by trace2' '
> >  	rm -fr repo &&
> >  	git init repo &&
> >  	test_when_finished "rm -fr repo" &&
> > @@ -420,8 +420,13 @@ test_expect_success 'complains about multiple pack bitmaps' '
> >  		test_line_count = 2 packs &&
> >  		test_line_count = 2 bitmaps &&
> >
> > -		git rev-list --use-bitmap-index HEAD 2>err &&
> > -		grep "ignoring extra bitmap file" err
> > +		ls -tr .git/objects/pack | grep -e "^pack-.*\.pack$" > sorted-packs &&
> > +		ignored_pack="$(cat sorted-packs | awk 'NR==1{print}')" &&
> > +		open_pack="$(cat sorted-packs | awk 'NR==2{print}')" &&
>
> Hmmph. This test only passes if 'ls -tr' gives us the packs in order
> that they are read by readdir(), which doesn't seem all that portable to
> me. At the very least, it is tightly coupled to the implementation of
> open_pack_bitmap() and friends.

Yes, because the _order_ matters here I think originally. May you explain a
little more about _portable_ problem please?

> Do we necessarily care about which .bitmap is read and which isn't? The
> existing test doesn't look too closely, which I think is fine (though as
> the author of that test, I might be biased ;-)).
>
> I would be equally happy to write:
>
> > +		GIT_TRACE2_PERF=$(pwd)/trace2.txt git rev-list --use-bitmap-index HEAD &&
> > +		grep "opened bitmap" trace2.txt &&
> > +		grep "ignoring extra bitmap" trace2.txt

Orz. Actually, I wrote it on the same way, but I think it maybe not so
sufficient for my patch. So...

But I think you are right afterI look other test cases, will fix.

Thanks.
Taylor Blau Nov. 5, 2022, 12:35 a.m. UTC | #3
On Thu, Nov 03, 2022 at 05:35:32PM +0800, Teng Long wrote:
> I'm not sure if trace2 data will be given to end-users, at least as I understand
> it as you, it's not. If so, your opinion is very reasonable.
>
> So... maybe we could add a new configuration like "core.hideSensitive", and
> there are some supported values , "loose", "normal " and "strict":
>
>     loose: plumbing full information in trace2, even warning.
>     normal: plumbing full information in trace2, but not warning.
>     strict: plumbing part of information in trace2, but not warning
>
> But it looks like heavy, maybe not worthy... So, currently I will remove
> basename and print the pack with path if there are no new inputs here.

trace2 data isn't sent to users. So having read this after I took a look
at the updated round, I am glad that you pursued the approach that you
did :-).

> > > -test_expect_success 'complains about multiple pack bitmaps' '
> > > +test_expect_success 'complains about multiple pack bitmaps when debugging by trace2' '
> > >  	rm -fr repo &&
> > >  	git init repo &&
> > >  	test_when_finished "rm -fr repo" &&
> > > @@ -420,8 +420,13 @@ test_expect_success 'complains about multiple pack bitmaps' '
> > >  		test_line_count = 2 packs &&
> > >  		test_line_count = 2 bitmaps &&
> > >
> > > -		git rev-list --use-bitmap-index HEAD 2>err &&
> > > -		grep "ignoring extra bitmap file" err
> > > +		ls -tr .git/objects/pack | grep -e "^pack-.*\.pack$" > sorted-packs &&
> > > +		ignored_pack="$(cat sorted-packs | awk 'NR==1{print}')" &&
> > > +		open_pack="$(cat sorted-packs | awk 'NR==2{print}')" &&
> >
> > Hmmph. This test only passes if 'ls -tr' gives us the packs in order
> > that they are read by readdir(), which doesn't seem all that portable to
> > me. At the very least, it is tightly coupled to the implementation of
> > open_pack_bitmap() and friends.
>
> Yes, because the _order_ matters here I think originally. May you explain a
> little more about _portable_ problem please?

We're depending on the loop in open_pack_bitmap() seeing packs in an
order compatible with the output of 'ls -tr' here. In other words, for
this test to pass, we care very much that the pack we collected as
"$ignored_pack" is seen *after* the pack that we designate as
"$open_pack".

Does that help?

Thanks,
Taylor
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 36134222d7a..a8c76056637 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -331,8 +331,9 @@  static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 	if (bitmap_git->pack || bitmap_git->midx) {
 		struct strbuf buf = STRBUF_INIT;
 		get_midx_filename(&buf, midx->object_dir);
-		/* ignore extra bitmap file; we can only handle one */
-		warning("ignoring extra bitmap file: %s", buf.buf);
+		/* ignore extra midx bitmap files; we can only handle one */
+		trace2_data_string("bitmap", the_repository,
+				   "ignoring extra midx bitmap file", basename(buf.buf));
 		close(fd);
 		strbuf_release(&buf);
 		return -1;
@@ -402,8 +403,9 @@  static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
 	}
 
 	if (bitmap_git->pack || bitmap_git->midx) {
-		/* ignore extra bitmap file; we can only handle one */
-		warning("ignoring extra bitmap file: %s", packfile->pack_name);
+		/* ignore extra bitmap files; we can only handle one */
+		trace2_data_string("bitmap", the_repository,
+				   "ignoring extra bitmap file", basename(packfile->pack_name));
 		close(fd);
 		return -1;
 	}
@@ -428,6 +430,8 @@  static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
 		return -1;
 	}
 
+	trace2_data_string("bitmap", the_repository, "opened bitmap file",
+			   basename(packfile->pack_name));
 	return 0;
 }
 
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f775fc1ce69..614586b0181 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -397,7 +397,7 @@  test_expect_success 'pack.preferBitmapTips' '
 	)
 '
 
-test_expect_success 'complains about multiple pack bitmaps' '
+test_expect_success 'complains about multiple pack bitmaps when debugging by trace2' '
 	rm -fr repo &&
 	git init repo &&
 	test_when_finished "rm -fr repo" &&
@@ -420,8 +420,13 @@  test_expect_success 'complains about multiple pack bitmaps' '
 		test_line_count = 2 packs &&
 		test_line_count = 2 bitmaps &&
 
-		git rev-list --use-bitmap-index HEAD 2>err &&
-		grep "ignoring extra bitmap file" err
+		ls -tr .git/objects/pack | grep -e "^pack-.*\.pack$" > sorted-packs &&
+		ignored_pack="$(cat sorted-packs | awk 'NR==1{print}')" &&
+		open_pack="$(cat sorted-packs | awk 'NR==2{print}')" &&
+
+		GIT_TRACE2_PERF=1 git rev-list --use-bitmap-index HEAD 2>err &&
+		grep "opened bitmap file:$opened_pack" err &&
+		grep "ignoring extra bitmap file:$ignored_pack" err
 	)
 '