rev-list: support `--human-readable` option when applied `disk-usage`

Message ID pull.1313.git.1659686097163.gitgitgadget@gmail.com (mailing list archive)
State New, archived

Commit Message

Li Linchao Aug. 5, 2022, 7:54 a.m. UTC
From: Li Linchao <lilinchao@oschina.cn>

The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
This is very useful for people inspecting their git repo's objects usage
information, but the result number is quite hard for human to read.

Teach git rev-list to output more human readable result when using
'--disk-usage' to calculate objects disk usage.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
---

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1313%2FCactusinhand%2Fllc%2Fadd-human-readable-option-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1313/Cactusinhand/llc/add-human-readable-option-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1313

 Documentation/rev-list-options.txt |  5 +++++
 builtin/rev-list.c                 | 31 ++++++++++++++++++++++++++----
 t/t6115-rev-list-du.sh             | 18 +++++++++++++++++
 3 files changed, 50 insertions(+), 4 deletions(-)


base-commit: 4af7188bc97f70277d0f10d56d5373022b1fa385

Comments

Ævar Arnfjörð Bjarmason Aug. 5, 2022, 10:03 a.m. UTC | #1
On Fri, Aug 05 2022, Li Linchao via GitGitGadget wrote:

> From: Li Linchao <lilinchao@oschina.cn>
>
> The '--disk-usage' option for git-rev-list was introduced in 16950f8384
> (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
> This is very useful for people inspecting their git repo's objects usage
> information, but the result number is quite hard for human to read.

s/the result number/the resulting number/
s/for human/for a human/

>
> Teach git rev-list to output more human readable result when using

s/to output more human/to output a human/

> '--disk-usage' to calculate objects disk usage.

For this I'd just s/ to calculate objects disk usage//. I.e. we already
discussed what --disk-usage does...

> +
> +-H::
> +--human-readable::
> +	Print on-disk objects size in human readable format. This option
> +	must be combined with `--disk-usage` together.
>  endif::git-rev-list[]

I'd really prefer if we didn't squat on -H, rev-list is overridden
enough, but how about:

	--disk-usage
	--disk-usage=human

Rather than introducing a new option?
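
A minimal sketch of how the two suggested spellings could be told apart, written as standalone C rather than against git's own helpers. The names `du_opts`, `du_parse`, `strip_prefix` and `parse_disk_usage_arg` are hypothetical and do not appear in the patch; the only accepted value assumed here is `human`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of looking at one command-line argument. */
enum du_parse {
	DU_NOT_MATCHED, /* not a --disk-usage form at all */
	DU_OK,          /* accepted, options updated */
	DU_BAD_VALUE    /* --disk-usage=<value> with an unknown value */
};

/* Hypothetical option state; the patch uses file-scope statics instead. */
struct du_opts {
	int show_disk_usage;
	int human_readable;
};

/*
 * Stand-in for a prefix-stripping helper: if str starts with prefix,
 * store the remainder in *out and return 1, otherwise return 0.
 */
static int strip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

static enum du_parse parse_disk_usage_arg(const char *arg, struct du_opts *opts)
{
	const char *rest;

	assert(arg && opts);
	if (!strip_prefix(arg, "--disk-usage", &rest))
		return DU_NOT_MATCHED;
	if (*rest == '\0') {
		opts->show_disk_usage = 1; /* plain form: raw byte count */
		return DU_OK;
	}
	if (*rest != '=')
		return DU_NOT_MATCHED;     /* e.g. "--disk-usagefoo" */
	if (strcmp(rest + 1, "human"))
		return DU_BAD_VALUE;       /* only "human" is assumed valid */
	opts->show_disk_usage = 1;
	opts->human_readable = 1;
	return DU_OK;
}
```

With this shape the human-readable flag cannot be set without `--disk-usage`, so a separate "must be used together" check would not be needed.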

>  	struct bitmap_index *bitmap_git;
> +	struct strbuf bitmap_size_buf = STRBUF_INIT;
> +	off_t size_from_bitmap;
>  
>  	if (!show_disk_usage)
>  		return -1;
> @@ -481,8 +484,13 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>  	if (!bitmap_git)
>  		return -1;
>  
> -	printf("%"PRIuMAX"\n",
> -	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
> +	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
> +	if (human_readable) {
> +		strbuf_humanise_bytes(&bitmap_size_buf, size_from_bitmap);
> +		printf("%s\n", bitmap_size_buf.buf);
> +	} else
> +		printf("%"PRIuMAX"\n", (uintmax_t)size_from_bitmap);
> +	strbuf_release(&bitmap_size_buf);

I think this would be better if we just use the strbuf unconditionally
(and a short &sb is conventional in such a short one-use function). So just:

	if (human_readable)
		strbuf_humanise_bytes(&sb, size_from_bitmap);
	else
		strbuf_addf(&sb, "%"PRIuMAX, (uintmax_t)size_from_bitmap);
	puts(sb.buf);

It gets rid of the need for {} braces, and I think makes for a nicer
read.
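
The same "fill one buffer in either branch, print once" shape can be sketched in standalone C. This is an illustration only: `humanise_bytes` and `format_disk_usage` are hypothetical names, a fixed `char` buffer stands in for the strbuf, and the units and rounding are assumptions that need not match what `strbuf_humanise_bytes()` prints.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Simplified stand-in for a byte humaniser. Thresholds and rounding are
 * illustrative, not a copy of git's implementation.
 */
static void humanise_bytes(char *buf, size_t len, uintmax_t bytes)
{
	if (bytes >= (uintmax_t)1 << 30)
		snprintf(buf, len, "%.2f GiB", (double)bytes / (1 << 30));
	else if (bytes >= (uintmax_t)1 << 20)
		snprintf(buf, len, "%.2f MiB", (double)bytes / (1 << 20));
	else if (bytes >= (uintmax_t)1 << 10)
		snprintf(buf, len, "%.2f KiB", (double)bytes / (1 << 10));
	else
		snprintf(buf, len, "%" PRIuMAX " bytes", bytes);
}

/* Fill buf in both branches so the caller needs only one output call. */
static void format_disk_usage(char *buf, size_t len, uintmax_t bytes,
			      int human_readable)
{
	assert(buf && len > 0);
	if (human_readable)
		humanise_bytes(buf, len, bytes);
	else
		snprintf(buf, len, "%" PRIuMAX, bytes);
}
```

A caller would then do something like `char buf[64]; format_disk_usage(buf, sizeof(buf), total, human_readable); puts(buf);`, which keeps the printing in one place for both the bitmap and the non-bitmap code paths.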

> -	if (show_disk_usage)
> -		printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
> +	if (show_disk_usage) {
> +		if (human_readable) {
> +			strbuf_humanise_bytes(&disk_buf, total_disk_usage);
> +			printf("%s\n", disk_buf.buf);
> +		} else
> +			printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
> +	}

Ditto, and we could make the &sb scoped to that "if (show_disk_usage)".

> +test_expect_success 'rev-list --disk-usage with --human-readable' '
> +	git rev-list --objects HEAD --disk-usage --human-readable >actual &&
> +	test_i18ngrep -e "446 bytes" actual

use grep, not test_i18ngrep (the latter should be going away entirely).

But actually we should use test_cmp here, isn't that the *entire*
output? I.e. won't this pass?

	echo 446 bytes >expect &&
	... >actual &&
	test_cmp expect actual

If so let's test what we really mean, i.e. we want *this* to be the
output, not to have output that has that sub-string on any arbitrary
amount of lines somewhere...

In this case it's unlikely to do the wrong thing, but it's a good habit
to get into...

> +test_expect_success 'rev-list --disk-usage with bitmap and --human-readable' '
> +	git rev-list --objects HEAD --use-bitmap-index --disk-usage -H >actual &&
> +	test_i18ngrep -e "446 bytes" actual

ditto.


> +'
> +
> +test_expect_success 'rev-list use --human-readable without --disk-usage' '
> +	test_must_fail git rev-list --objects HEAD --human-readable 2> err &&
> +	echo "fatal: option '\''--human-readable/-H'\'' should be used with" \
> +	"'\''--disk-usage'\'' together" >expect &&

You can make this a bit nicer by not using echo, use a here-doc instead:

	cat >expect <<-\EOF
        fatal: ...
	EOF

But you'll still need the '\'' quoting, but I think it'll be better, and
avoids the line-wrapping (which we try to avoid for this sort of thing).

Li Linchao Aug. 5, 2022, 11:01 a.m. UTC | #2
>
>On Fri, Aug 05 2022, Li Linchao via GitGitGadget wrote:
>
>> From: Li Linchao <lilinchao@oschina.cn>
>>
>> The '--disk-usage' option for git-rev-list was introduced in 16950f8384
>> (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
>> This is very useful for people inspecting their git repo's objects usage
>> information, but the result number is quite hard for human to read.
>
>s/the result number/the resulting number/
>s/for human/for a human/
>
>>
>> Teach git rev-list to output more human readable result when using
>
>s/to output more human/to output a human/
>
>> '--disk-usage' to calculate objects disk usage.
>
>For this I'd just s/ to calculate objects disk usage//. I.e. we already
>discussed what --disk-usage does... 
OK
>
>> +
>> +-H::
>> +--human-readable::
>> +	Print on-disk objects size in human readable format. This option
>> +	must be combined with `--disk-usage` together.
>>  endif::git-rev-list[]
>
>I'd really prefer if we didn't squat on -H, rev-list is overridden
>enough, but how about:
>
>	--disk-usage
>	--disk-usage=human
>
>Rather than introducing a new option? 
Yes, this makes sense.
>
>>  struct bitmap_index *bitmap_git;
>> +	struct strbuf bitmap_size_buf = STRBUF_INIT;
>> +	off_t size_from_bitmap;
>> 
>>  if (!show_disk_usage)
>>  return -1;
>> @@ -481,8 +484,13 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>>  if (!bitmap_git)
>>  return -1;
>> 
>> -	printf("%"PRIuMAX"\n",
>> -	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
>> +	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
>> +	if (human_readable) {
>> +	strbuf_humanise_bytes(&bitmap_size_buf, size_from_bitmap);
>> +	printf("%s\n", bitmap_size_buf.buf);
>> +	} else
>> +	printf("%"PRIuMAX"\n", (uintmax_t)size_from_bitmap);
>> +	strbuf_release(&bitmap_size_buf);
>
>I think this would be better if we just use the strbuf unconditionally
>(and a short &sb is conventional in such a short one-use function). So just:
>
>	if (human_readable)
>		strbuf_humanise_bytes(&sb, size_from_bitmap);
>	else
>		strbuf_addf(&sb, "%"PRIuMAX, (uintmax_t)size_from_bitmap);
>	puts(sb.buf);
>
>It gets rid of the need for {} braces, and I think makes for a nicer
>read.
Agree
>
>> -	if (show_disk_usage)
>> -	printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
>> +	if (show_disk_usage) {
>> +	if (human_readable) {
>> +	strbuf_humanise_bytes(&disk_buf, total_disk_usage);
>> +	printf("%s\n", disk_buf.buf);
>> +	} else
>> +	printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
>> +	}
>
>Ditto, and we could make the &sb scoped to that "if (show_disk_usage)".
>
>> +test_expect_success 'rev-list --disk-usage with --human-readable' '
>> +	git rev-list --objects HEAD --disk-usage --human-readable >actual &&
>> +	test_i18ngrep -e "446 bytes" actual
>
>use grep, not test_i18ngrep (the latter should be going away entirely). 
OK
>
>But actually we should use test_cmp here, isn't that the *entire*
>output? I.e. won't this pass?
>
>	echo 446 bytes >expect &&
>	... >actual &&
>	test_cmp expect actual
>
>If so let's test what we really mean, i.e. we want *this* to be the
>output, not to have output that has that sub-string on any arbitrary
>amount of lines somewhere...
>
>In this case it's unlikely to do the wrong thing, but it's a good habit
>to get into...
>
>> +test_expect_success 'rev-list --disk-usage with bitmap and --human-readable' '
>> +	git rev-list --objects HEAD --use-bitmap-index --disk-usage -H >actual &&
>> +	test_i18ngrep -e "446 bytes" actual
>
>ditto. 
The output here is just "446 bytes" when the '--disk-usage' option is used in this test repo.
But the GitHub CI linux-sha256 job reminded me that I made a mistake:
I should avoid hardcoding the actual size here.
>
>
>> +'
>> +
>> +test_expect_success 'rev-list use --human-readable without --disk-usage' '
>> +	test_must_fail git rev-list --objects HEAD --human-readable 2> err &&
>> +	echo "fatal: option '\''--human-readable/-H'\'' should be used with" \
>> +	"'\''--disk-usage'\'' together" >expect &&
>
>You can make this a bit nicer by not using echo, use a here-doc instead:
>
>	cat >expect <<-\EOF
>        fatal: ...
>	EOF
>
>But you'll still need the '\'' quoting, but I think it'll be better, and
>avoids the line-wrapping (which we try to avoid for this sort of thing). 
OK.

Many thanks for all your review comments :)

Patch

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 195e74eec63..d30301a9159 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -249,6 +249,11 @@  ifdef::git-rev-list[]
 	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
 	section in linkgit:git-cat-file[1] for the limitations of what
 	"on-disk storage" means.
+
+-H::
+--human-readable::
+	Print on-disk objects size in human readable format. This option
+	must be combined with `--disk-usage` together.
 endif::git-rev-list[]
 
 --cherry-mark::
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 30fd8e83eaf..be677f29070 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -81,6 +81,7 @@  static int arg_show_object_names = 1;
 
 static int show_disk_usage;
 static off_t total_disk_usage;
+static int human_readable;
 
 static off_t get_object_disk_usage(struct object *obj)
 {
@@ -473,6 +474,8 @@  static int try_bitmap_disk_usage(struct rev_info *revs,
 				 int filter_provided_objects)
 {
 	struct bitmap_index *bitmap_git;
+	struct strbuf bitmap_size_buf = STRBUF_INIT;
+	off_t size_from_bitmap;
 
 	if (!show_disk_usage)
 		return -1;
@@ -481,8 +484,13 @@  static int try_bitmap_disk_usage(struct rev_info *revs,
 	if (!bitmap_git)
 		return -1;
 
-	printf("%"PRIuMAX"\n",
-	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
+	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
+	if (human_readable) {
+		strbuf_humanise_bytes(&bitmap_size_buf, size_from_bitmap);
+		printf("%s\n", bitmap_size_buf.buf);
+	} else
+		printf("%"PRIuMAX"\n", (uintmax_t)size_from_bitmap);
+	strbuf_release(&bitmap_size_buf);
 	return 0;
 }
 
@@ -490,6 +498,7 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
 	struct rev_list_info info;
+	struct strbuf disk_buf = STRBUF_INIT;
 	struct setup_revision_opt s_r_opt = {
 		.allow_exclude_promisor_objects = 1,
 	};
@@ -630,9 +639,17 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
+		if (!strcmp(arg, "--human-readable") || !strcmp(arg, "-H")) {
+			human_readable = 1;
+			continue;
+		}
+
 		usage(rev_list_usage);
 
 	}
+
+	if (!show_disk_usage && human_readable)
+		die(_("option '%s' should be used with '%s' together"), "--human-readable/-H", "--disk-usage");
 	if (revs.commit_format != CMIT_FMT_USERFORMAT)
 		revs.include_header = 1;
 	if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
@@ -752,10 +769,16 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			printf("%d\n", revs.count_left + revs.count_right);
 	}
 
-	if (show_disk_usage)
-		printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
+	if (show_disk_usage) {
+		if (human_readable) {
+			strbuf_humanise_bytes(&disk_buf, total_disk_usage);
+			printf("%s\n", disk_buf.buf);
+		} else
+			printf("%"PRIuMAX"\n", (uintmax_t)total_disk_usage);
+	}
 
 cleanup:
 	release_revisions(&revs);
+	strbuf_release(&disk_buf);
 	return ret;
 }
diff --git a/t/t6115-rev-list-du.sh b/t/t6115-rev-list-du.sh
index b4aef32b713..614ebb72aaa 100755
--- a/t/t6115-rev-list-du.sh
+++ b/t/t6115-rev-list-du.sh
@@ -48,4 +48,22 @@  check_du HEAD
 check_du --objects HEAD
 check_du --objects HEAD^..HEAD
 
+
+test_expect_success 'rev-list --disk-usage with --human-readable' '
+	git rev-list --objects HEAD --disk-usage --human-readable >actual &&
+	test_i18ngrep -e "446 bytes" actual
+'
+
+test_expect_success 'rev-list --disk-usage with bitmap and --human-readable' '
+	git rev-list --objects HEAD --use-bitmap-index --disk-usage -H >actual &&
+	test_i18ngrep -e "446 bytes" actual
+'
+
+test_expect_success 'rev-list use --human-readable without --disk-usage' '
+	test_must_fail git rev-list --objects HEAD --human-readable 2> err &&
+	echo "fatal: option '\''--human-readable/-H'\'' should be used with" \
+	"'\''--disk-usage'\'' together" >expect &&
+	test_cmp err expect
+'
+
 test_done