Message ID | 20200513005424.81369-28-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | SHA-256 part 2/3: protocol functionality | expand |
"brian m. carlson" <sandals@crustytoothpaste.net> writes: > It's possible to use a variety of index formats with show-index, and we > need a way to indicate the hash algorithm which is in use for a > particular index we'd like to show. Default to using the value for the > repository we're in by calling setup_git_directory_gently, and allow > overriding it by using a --hash argument. I think you meant to say that "show-index" does not autodetect what hash algorithm is used from its input, and the new argument is a way for the user to help the command when the hash algorithm is different from what is used in the current repository? I ask because I found that your version can be read to say that "show-index" can show the contents of a given pack index using any hash algorithm we support, and the user can specify --hash=SHA-256 when running the command on a pack .idx that uses SHA-1 object names to auto-convert it, and readers wouldn't be able to guess which was meant with only the above five lines. > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> > --- > builtin/show-index.c | 29 ++++++++++++++++++++++++----- > git.c | 2 +- > 2 files changed, 25 insertions(+), 6 deletions(-) > > diff --git a/builtin/show-index.c b/builtin/show-index.c > index 0826f6a5a2..ebfa2e9abd 100644 > --- a/builtin/show-index.c > +++ b/builtin/show-index.c > @@ -1,9 +1,12 @@ > #include "builtin.h" > #include "cache.h" > #include "pack.h" > +#include "parse-options.h" > > -static const char show_index_usage[] = > -"git show-index"; > +static const char *const show_index_usage[] = { > + "git show-index [--hash=HASH]", > + NULL > +}; Do we say --hash=SHA-1 etc. or --hash-algo=SHA-256 in other places? Would the word "hash" alone clear enough that it does not refer to a specific "hash" value but the name of an algorithm? The generating side seems to use "index-pack --object-format=<algo>" and the transport seems to use a capability "object-format=<algo>", neither of which is directly visible to the end users, but I think they follow "git init --object-format=<algo>", so we are consistent there. Perhaps we should follow suit here, too? > int cmd_show_index(int argc, const char **argv, const char *prefix) > { > @@ -11,10 +14,26 @@ int cmd_show_index(int argc, const char **argv, const char *prefix) > unsigned nr; > unsigned int version; > static unsigned int top_index[256]; > - const unsigned hashsz = the_hash_algo->rawsz; > + unsigned hashsz; > + const char *hash_name = NULL; > + int hash_algo; > + const struct option show_index_options[] = { > + OPT_STRING(0, "hash", &hash_name, N_("hash"), > + N_("specify the hash algorithm to use")), init-db has an entry identical to this except for the second token given to the macro is "object-format" instead of "hash". Both may want to change what's inside N_() to "hash algorithm". > + OPT_END() > + }; > + > + argc = parse_options(argc, argv, prefix, show_index_options, show_index_usage, 0); > + > + if (hash_name) { > + hash_algo = hash_algo_by_name(hash_name); > + if (hash_algo == GIT_HASH_UNKNOWN) > + die(_("Unknown hash algorithm")); > + repo_set_hash_algo(the_repository, hash_algo); > + } > + > + hashsz = the_hash_algo->rawsz; > > - if (argc != 1) > - usage(show_index_usage); > if (fread(top_index, 2 * 4, 1, stdin) != 1) > die("unable to read header"); > if (top_index[0] == htonl(PACK_IDX_SIGNATURE)) { > diff --git a/git.c b/git.c > index 2e4efb4ff0..e53e8159a2 100644 > --- a/git.c > +++ b/git.c > @@ -573,7 +573,7 @@ static struct cmd_struct commands[] = { > { "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER }, > { "show", cmd_show, RUN_SETUP }, > { "show-branch", cmd_show_branch, RUN_SETUP }, > - { "show-index", cmd_show_index }, > + { "show-index", cmd_show_index, RUN_SETUP_GENTLY }, Hmph, this is not necessary to support peeking an .idx file in another repository that uses a different hash algorithm than ours (we do need the --hash=<algo> override to tell that the algo is different from what we read from our repository settings). Is this absolutely necessary? Ah, I am misreading the patch. We didn't even do setup but we now optionally do, in order to see if we are in a repository and what object format it uses to give the default value to --hash=<algo> when the argument is not given. The need for RUN_SETUP_GENTLY is understandable. As we do not take any path argument on the command line, the other side effect of setup_git_directory() that takes us up to the top level of the working tree does not hurt us, either, so this is a good change, I think. Thanks. > { "show-ref", cmd_show_ref, RUN_SETUP }, > { "sparse-checkout", cmd_sparse_checkout, RUN_SETUP | NEED_WORK_TREE }, > { "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
On 2020-05-18 at 16:20:22, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > > > It's possible to use a variety of index formats with show-index, and we > > need a way to indicate the hash algorithm which is in use for a > > particular index we'd like to show. Default to using the value for the > > repository we're in by calling setup_git_directory_gently, and allow > > overriding it by using a --hash argument. > > I think you meant to say that "show-index" does not autodetect what > hash algorithm is used from its input, and the new argument is a way > for the user to help the command when the hash algorithm is > different from what is used in the current repository? Correct. > I ask because I found that your version can be read to say that > "show-index" can show the contents of a given pack index using any > hash algorithm we support, and the user can specify --hash=SHA-256 > when running the command on a pack .idx that uses SHA-1 object names > to auto-convert it, and readers wouldn't be able to guess which was > meant with only the above five lines. No, that's definitely not what I meant. I'll adjust the commit message to make this clearer. > Do we say --hash=SHA-1 etc. or --hash-algo=SHA-256 in other places? > Would the word "hash" alone clear enough that it does not refer to > a specific "hash" value but the name of an algorithm? > > The generating side seems to use "index-pack --object-format=<algo>" > and the transport seems to use a capability "object-format=<algo>", > neither of which is directly visible to the end users, but I think > they follow "git init --object-format=<algo>", so we are consistent > there. > > Perhaps we should follow suit here, too? Yeah, as I mentioned to Martin elsewhere in the thread, I'm going to make this consistent and use --object-formta. > > diff --git a/git.c b/git.c > > index 2e4efb4ff0..e53e8159a2 100644 > > --- a/git.c > > +++ b/git.c > > @@ -573,7 +573,7 @@ static struct cmd_struct commands[] = { > > { "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER }, > > { "show", cmd_show, RUN_SETUP }, > > { "show-branch", cmd_show_branch, RUN_SETUP }, > > - { "show-index", cmd_show_index }, > > + { "show-index", cmd_show_index, RUN_SETUP_GENTLY }, > > Hmph, this is not necessary to support peeking an .idx file in > another repository that uses a different hash algorithm than ours > (we do need the --hash=<algo> override to tell that the algo is > different from what we read from our repository settings). Is this > absolutely necessary? > > Ah, I am misreading the patch. We didn't even do setup but we now > optionally do, in order to see if we are in a repository and what > object format it uses to give the default value to --hash=<algo> > when the argument is not given. The need for RUN_SETUP_GENTLY > is understandable. Yes, this is designed to make us do the right thing when we're in a repository (e.g., with --stdin) by autodetecting the algorithm in use but not fail when we're outside of a repository. I'll update the commit message to make this a lot clearer, since I clearly omitted a lot of things that were in my head when writing this.
diff --git a/builtin/show-index.c b/builtin/show-index.c index 0826f6a5a2..ebfa2e9abd 100644 --- a/builtin/show-index.c +++ b/builtin/show-index.c @@ -1,9 +1,12 @@ #include "builtin.h" #include "cache.h" #include "pack.h" +#include "parse-options.h" -static const char show_index_usage[] = -"git show-index"; +static const char *const show_index_usage[] = { + "git show-index [--hash=HASH]", + NULL +}; int cmd_show_index(int argc, const char **argv, const char *prefix) { @@ -11,10 +14,26 @@ int cmd_show_index(int argc, const char **argv, const char *prefix) unsigned nr; unsigned int version; static unsigned int top_index[256]; - const unsigned hashsz = the_hash_algo->rawsz; + unsigned hashsz; + const char *hash_name = NULL; + int hash_algo; + const struct option show_index_options[] = { + OPT_STRING(0, "hash", &hash_name, N_("hash"), + N_("specify the hash algorithm to use")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, show_index_options, show_index_usage, 0); + + if (hash_name) { + hash_algo = hash_algo_by_name(hash_name); + if (hash_algo == GIT_HASH_UNKNOWN) + die(_("Unknown hash algorithm")); + repo_set_hash_algo(the_repository, hash_algo); + } + + hashsz = the_hash_algo->rawsz; - if (argc != 1) - usage(show_index_usage); if (fread(top_index, 2 * 4, 1, stdin) != 1) die("unable to read header"); if (top_index[0] == htonl(PACK_IDX_SIGNATURE)) { diff --git a/git.c b/git.c index 2e4efb4ff0..e53e8159a2 100644 --- a/git.c +++ b/git.c @@ -573,7 +573,7 @@ static struct cmd_struct commands[] = { { "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER }, { "show", cmd_show, RUN_SETUP }, { "show-branch", cmd_show_branch, RUN_SETUP }, - { "show-index", cmd_show_index }, + { "show-index", cmd_show_index, RUN_SETUP_GENTLY }, { "show-ref", cmd_show_ref, RUN_SETUP }, { "sparse-checkout", cmd_sparse_checkout, RUN_SETUP | NEED_WORK_TREE }, { "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
It's possible to use a variety of index formats with show-index, and we need a way to indicate the hash algorithm which is in use for a particular index we'd like to show. Default to using the value for the repository we're in by calling setup_git_directory_gently, and allow overriding it by using a --hash argument. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- builtin/show-index.c | 29 ++++++++++++++++++++++++----- git.c | 2 +- 2 files changed, 25 insertions(+), 6 deletions(-)