Message ID | 73e139ee377f9c50e671b0d94a28b93c1db28a69.1659577543.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Generalize 'scalar diagnose' into 'git diagnose' and 'git bugreport --diagnose' | expand |
On Thu, Aug 04 2022, Victoria Dye via GitGitGadget wrote: > From: Victoria Dye <vdye@github.com> > > Create a 'git diagnose' builtin to generate a standalone zip archive of > repository diagnostics. It's good to have this as a built-in separate from "git bugreport", but... > +git-diagnose - Generate a zip archive of diagnostic information ...I'd really prefer for this not to squat on such a common name we might regret having reserved later for such very specific functionality. I'd think e.g. these would be better: git mk-diagnostics-zip Or maybe: git archive-interesting-for-report If I had to guess what a "git diagnose" did, I'd probably think: * It analyzes your config, and suggests redundancies/alternatives * It does some perf tests / heuritics, and e.g. suggests you turn on the commit-graph writing. etc., this (arguably even too generic then) made sense as "scalar diagnose" because scalar is all about being an opinionated interface targeted at performance", so there's an implied "my repo's performance" following a "scalar diagnose". But as a top-level command-name I think we should pick something more specific to what it does, which is (I haven't fully read ahead in the re-roll, but I'm assuming is) mostly/entirely to be a "helper" for "scalar diagnose" and/or "git bugreport". > + switch (safe_create_leading_directories(zip_path.buf)) { > + case SCLD_OK: > + case SCLD_EXISTS: > + break; > + default: > + die(_("could not create leading directories for '%s'"), > + zip_path.buf); This seems to be carrying forward a minor bug from bugreport.c we should probably fix: we should use die_errno() here (and maybe lead with a commit to fix bugreport.c's version). The strbuf*() before that also seems like a good candidate for a utility function in your new diagnose library, i.e. to have both bugreport.c and diagnose.c pass it the prefix/suffix/format, then try to create that directory, then replace the copy/pasting here with a one-line call to that now-shared function. The two codepaths only seem to differ in the prefix & suffix from a quick skim, the rest is all copy/pasted...
On 8/3/2022 9:45 PM, Victoria Dye via GitGitGadget wrote: > From: Victoria Dye <vdye@github.com> > > Create a 'git diagnose' builtin to generate a standalone zip archive of > repository diagnostics. > + * The contents of the `.git`, `.git/hooks`, `.git/info`, `.git/logs`, and > + `.git/objects/info` directories You remove these lines in the next patch, which is called "gate certain data behind '--all'" but maybe we shouldn't have this functionality now and instead add it in the future. The biggest reason for the --all option is that these contents will likely include private IP (path names and branch names, but not file contents) that the user would probably not want to share with the public mailing list, but might want to share with a trusted Git expert in order to resolve a problem. You mention earlier that The generated archive can then, for example, be shared with the Git mailing list to help debug an issue or serve as a reference for independent debugging. So, if you're sending a v3, then moving this out of this patch and into the next one would be a good way to be sure that this possibly-private data is not mentioned as something to share super publicly. (Of course, this requires making the change to create_diagnostics_archive() in advance of creating the builtin, so maybe this reorganization isn't worth it.) > @@ -0,0 +1,58 @@ > +#include "builtin.h" > +#include "parse-options.h" > +#include "diagnose.h" > + > + nit: double empty line > +++ b/t/t0092-diagnose.sh > @@ -0,0 +1,28 @@ > +#!/bin/sh > + > +test_description='git diagnose' > + > +TEST_PASSES_SANITIZE_LEAK=true > +. ./test-lib.sh > + > +test_expect_success UNZIP 'creates diagnostics zip archive' ' > + test_when_finished rm -rf report && > + > + git diagnose -o report -s test >out && > + > + zip_path=report/git-diagnostics-test.zip && > + grep "Available space" out && > + test_path_is_file "$zip_path" && nit: 'grep' the output immediately after the 'git diagnose' command and keep the zip_path use immediately after its definition. Thanks, -Stolee
On 8/4/2022 2:27 AM, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Aug 04 2022, Victoria Dye via GitGitGadget wrote: > >> From: Victoria Dye <vdye@github.com> >> >> Create a 'git diagnose' builtin to generate a standalone zip archive of >> repository diagnostics. > > It's good to have this as a built-in separate from "git bugreport", > but... > >> +git-diagnose - Generate a zip archive of diagnostic information > > ...I'd really prefer for this not to squat on such a common name we > might regret having reserved later for such very specific > functionality. I'd think e.g. these would be better: > > git mk-diagnostics-zip > > Or maybe: > > git archive-interesting-for-report These are not realistic replacements. > If I had to guess what a "git diagnose" did, I'd probably think: > > * It analyzes your config, and suggests redundancies/alternatives > * It does some perf tests / heuritics, and e.g. suggests you turn on > the commit-graph writing. These sound like great options to add in the future, such as: --perf-test: Run performance tests on your repository using different Git config options and recommend certain settings. (This --perf-test option would be a great way to get wider adoption of parallel checkout, since its optimal settings are so machine dependent.) The thing is, even if we did these other things, it would result in some kind of document that summarizes the repository shape and features. That kind of data is exactly what this version of 'git diagnose' does. For now, it leaves the human reader responsible for making decisions based on those documents, but they have been incredibly helpful when we are _diagnosing_ issues users are having with their repositories. Thanks, -Stolee
On Fri, Aug 05 2022, Derrick Stolee wrote: > On 8/4/2022 2:27 AM, Ævar Arnfjörð Bjarmason wrote: >> >> On Thu, Aug 04 2022, Victoria Dye via GitGitGadget wrote: >> >>> From: Victoria Dye <vdye@github.com> >>> >>> Create a 'git diagnose' builtin to generate a standalone zip archive of >>> repository diagnostics. >> >> It's good to have this as a built-in separate from "git bugreport", >> but... >> >>> +git-diagnose - Generate a zip archive of diagnostic information >> >> ...I'd really prefer for this not to squat on such a common name we >> might regret having reserved later for such very specific >> functionality. I'd think e.g. these would be better: >> >> git mk-diagnostics-zip >> >> Or maybe: >> >> git archive-interesting-for-report > > These are not realistic replacements. Maybe: git diagnose create-zip ? >> If I had to guess what a "git diagnose" did, I'd probably think: >> >> * It analyzes your config, and suggests redundancies/alternatives >> * It does some perf tests / heuritics, and e.g. suggests you turn on >> the commit-graph writing. > > These sound like great options to add in the future, such as: > > --perf-test: Run performance tests on your repository using different > Git config options and recommend certain settings. > > (This --perf-test option would be a great way to get wider adoption > of parallel checkout, since its optimal settings are so machine > dependent.) ... > The thing is, even if we did these other things, it would result in > some kind of document that summarizes the repository shape and features. > That kind of data is exactly what this version of 'git diagnose' does. I think a command like "git diagnose" that had options to do other unrelated stuff, but by default created a zip archive when given no options would be rather confusing. Yes, it makes sense to emit some human-readable summary, but to zip it up as well? That's something we just need for the "git bugreport" case... > For now, it leaves the human reader responsible for making decisions > based on those documents, but they have been incredibly helpful when we > are _diagnosing_ issues users are having with their repositories. This is orthagonal to what I'm pointing out. You're basically saying the user can just read the documentation to find out what this built-in does. That's true, what I'm pointing out is that it's unfortunate that such highly specific functionality is squatting on such a short & generic name, but just e.g. adding a "create-zip" sub-command would address that.
diff --git a/.gitignore b/.gitignore index 42fd7253b44..80b530bbed2 100644 --- a/.gitignore +++ b/.gitignore @@ -53,6 +53,7 @@ /git-cvsimport /git-cvsserver /git-daemon +/git-diagnose /git-diff /git-diff-files /git-diff-index diff --git a/Documentation/git-diagnose.txt b/Documentation/git-diagnose.txt new file mode 100644 index 00000000000..b12ef98f013 --- /dev/null +++ b/Documentation/git-diagnose.txt @@ -0,0 +1,52 @@ +git-diagnose(1) +================ + +NAME +---- +git-diagnose - Generate a zip archive of diagnostic information + +SYNOPSIS +-------- +[verse] +'git diagnose' [(-o | --output-directory) <path>] [(-s | --suffix) <format>] + +DESCRIPTION +----------- +Collects detailed information about the user's machine, Git client, and +repository state and packages that information into a zip archive. The +generated archive can then, for example, be shared with the Git mailing list to +help debug an issue or serve as a reference for independent debugging. + +The following information is captured in the archive: + + * 'git version --build-options' + * The path to the repository root + * The available disk space on the filesystem + * The name and size of each packfile, including those in alternate object + stores + * The total count of loose objects, as well as counts broken down by + `.git/objects` subdirectory + * The contents of the `.git`, `.git/hooks`, `.git/info`, `.git/logs`, and + `.git/objects/info` directories + +This tool differs from linkgit:git-bugreport[1] in that it collects much more +detailed information with a greater focus on reporting the size and data shape +of repository contents. + +OPTIONS +------- +-o <path>:: +--output-directory <path>:: + Place the resulting diagnostics archive in `<path>` instead of the + current directory. + +-s <format>:: +--suffix <format>:: + Specify an alternate suffix for the diagnostics archive name, to create + a file named 'git-diagnostics-<formatted suffix>'. This should take the + form of a strftime(3) format string; the current local time will be + used. + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Makefile b/Makefile index ad27a0bd70c..9ceaf55582a 100644 --- a/Makefile +++ b/Makefile @@ -1154,6 +1154,7 @@ BUILTIN_OBJS += builtin/credential-cache.o BUILTIN_OBJS += builtin/credential-store.o BUILTIN_OBJS += builtin/credential.o BUILTIN_OBJS += builtin/describe.o +BUILTIN_OBJS += builtin/diagnose.o BUILTIN_OBJS += builtin/diff-files.o BUILTIN_OBJS += builtin/diff-index.o BUILTIN_OBJS += builtin/diff-tree.o diff --git a/builtin.h b/builtin.h index 40e9ecc8485..8901a34d6bf 100644 --- a/builtin.h +++ b/builtin.h @@ -144,6 +144,7 @@ int cmd_credential_cache(int argc, const char **argv, const char *prefix); int cmd_credential_cache_daemon(int argc, const char **argv, const char *prefix); int cmd_credential_store(int argc, const char **argv, const char *prefix); int cmd_describe(int argc, const char **argv, const char *prefix); +int cmd_diagnose(int argc, const char **argv, const char *prefix); int cmd_diff_files(int argc, const char **argv, const char *prefix); int cmd_diff_index(int argc, const char **argv, const char *prefix); int cmd_diff(int argc, const char **argv, const char *prefix); diff --git a/builtin/diagnose.c b/builtin/diagnose.c new file mode 100644 index 00000000000..c545c6bae1d --- /dev/null +++ b/builtin/diagnose.c @@ -0,0 +1,58 @@ +#include "builtin.h" +#include "parse-options.h" +#include "diagnose.h" + + +static const char * const diagnose_usage[] = { + N_("git diagnose [-o|--output-directory <file>] [-s|--suffix <format>]"), + NULL +}; + +int cmd_diagnose(int argc, const char **argv, const char *prefix) +{ + struct strbuf zip_path = STRBUF_INIT; + time_t now = time(NULL); + struct tm tm; + char *option_output = NULL; + char *option_suffix = "%Y-%m-%d-%H%M"; + char *prefixed_filename; + + const struct option diagnose_options[] = { + OPT_STRING('o', "output-directory", &option_output, N_("path"), + N_("specify a destination for the diagnostics archive")), + OPT_STRING('s', "suffix", &option_suffix, N_("format"), + N_("specify a strftime format suffix for the filename")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, diagnose_options, + diagnose_usage, 0); + + /* Prepare the path to put the result */ + prefixed_filename = prefix_filename(prefix, + option_output ? option_output : ""); + strbuf_addstr(&zip_path, prefixed_filename); + strbuf_complete(&zip_path, '/'); + + strbuf_addstr(&zip_path, "git-diagnostics-"); + strbuf_addftime(&zip_path, option_suffix, localtime_r(&now, &tm), 0, 0); + strbuf_addstr(&zip_path, ".zip"); + + switch (safe_create_leading_directories(zip_path.buf)) { + case SCLD_OK: + case SCLD_EXISTS: + break; + default: + die(_("could not create leading directories for '%s'"), + zip_path.buf); + } + + /* Prepare diagnostics */ + if (create_diagnostics_archive(&zip_path)) + die_errno(_("unable to create diagnostics archive %s"), + zip_path.buf); + + free(prefixed_filename); + strbuf_release(&zip_path); + return 0; +} diff --git a/git.c b/git.c index e5d62fa5a92..0b9d8ef7677 100644 --- a/git.c +++ b/git.c @@ -522,6 +522,7 @@ static struct cmd_struct commands[] = { { "credential-cache--daemon", cmd_credential_cache_daemon }, { "credential-store", cmd_credential_store }, { "describe", cmd_describe, RUN_SETUP }, + { "diagnose", cmd_diagnose, RUN_SETUP_GENTLY }, { "diff", cmd_diff, NO_PARSEOPT }, { "diff-files", cmd_diff_files, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT }, { "diff-index", cmd_diff_index, RUN_SETUP | NO_PARSEOPT }, diff --git a/t/t0092-diagnose.sh b/t/t0092-diagnose.sh new file mode 100755 index 00000000000..fa05bf6046f --- /dev/null +++ b/t/t0092-diagnose.sh @@ -0,0 +1,28 @@ +#!/bin/sh + +test_description='git diagnose' + +TEST_PASSES_SANITIZE_LEAK=true +. ./test-lib.sh + +test_expect_success UNZIP 'creates diagnostics zip archive' ' + test_when_finished rm -rf report && + + git diagnose -o report -s test >out && + + zip_path=report/git-diagnostics-test.zip && + grep "Available space" out && + test_path_is_file "$zip_path" && + + # Check zipped archive content + "$GIT_UNZIP" -p "$zip_path" diagnostics.log >out && + test_file_not_empty out && + + "$GIT_UNZIP" -p "$zip_path" packs-local.txt >out && + grep ".git/objects" out && + + "$GIT_UNZIP" -p "$zip_path" objects-local.txt >out && + grep "^Total: [0-9][0-9]*" out +' + +test_done