Message ID | a3be4f39aa240e614a2e12756e1ea864c35137a2.1592225416.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Allow overriding the default name of the default branch | expand |
Hi dscho On 15/06/2020 13:50, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > When anonymizing commit history, we are careful to translate the main > branch name to `ref0`. > > When the main branch name is overridden via the config, we will want > `git fast-export` to use that overridden name instead. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > builtin/fast-export.c | 11 ++++++++++- > t/t9351-fast-export-anonymize.sh | 6 ++++++ > 2 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/builtin/fast-export.c b/builtin/fast-export.c > index 1072bbf041f..deeb01b6937 100644 > --- a/builtin/fast-export.c > +++ b/builtin/fast-export.c > @@ -515,14 +515,23 @@ static const char *anonymize_refname(const char *refname) > }; > static struct hashmap refs; > static struct strbuf anon = STRBUF_INIT; > + static char *main_branch; > int i; > > /* > * In certain circumstances, it might be interesting to be able to > * identify the main branch. For that reason, let's force its name to > * be anonymized to `ref0`. > + * > + * While the main branch name might often be `main` for new > + * repositories (and `master` for aged ones), and such well-known names > + * may not necessarily need anonymizing, it could be configured to use > + * a secret word that the user may not want to reveal. > */ > - if (!strcmp(refname, "refs/heads/master")) > + if (!main_branch) > + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); > + > + if (!strcmp(refname, main_branch)) > return "refs/heads/ref0"; This leaks main_branch if it came from git_main_branch_name() Best Wishes Phillip > strbuf_reset(&anon); > diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh > index 2415f0ec213..f42be635c2f 100755 > --- a/t/t9351-fast-export-anonymize.sh > +++ b/t/t9351-fast-export-anonymize.sh > @@ -31,6 +31,12 @@ test_expect_success 'stream translates master to ref0' ' > ! grep master stream > ' > > +test_expect_success 'respects configured main branch' ' > + git -c core.mainBranch=does-not-exist \ > + fast-export --anonymize --all >stream-without-ref0 && > + ! grep ref0 stream-without-ref0 > +' > + > test_expect_success 'stream omits other refnames' ' > ! grep other stream && > ! grep mytag stream >
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > + * While the main branch name might often be `main` for new > + * repositories (and `master` for aged ones), and such well-known names As I said, if you used a different word for the first 'main' in the sentence, it reads much better. > + * may not necessarily need anonymizing, it could be configured to use > + * a secret word that the user may not want to reveal. > */ > - if (!strcmp(refname, "refs/heads/master")) > + if (!main_branch) > + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); > + > + if (!strcmp(refname, main_branch)) > return "refs/heads/ref0"; The same comment as 02/12 applies here. If the helper function returns "" when the user says that no branch is more special than others in the repository, the code would automatically do the right thing. In any case, thanks for working on it. I am on "vacation" so will be commenting on the rest of the series later in the week.
On Mon, Jun 15, 2020 at 04:05:52PM +0100, Phillip Wood wrote: > > @@ -515,14 +515,23 @@ static const char *anonymize_refname(const char *refname) > > }; > > static struct hashmap refs; > > static struct strbuf anon = STRBUF_INIT; > > + static char *main_branch; > [...] > > - if (!strcmp(refname, "refs/heads/master")) > > + if (!main_branch) > > + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); > > + > > + if (!strcmp(refname, main_branch)) > > return "refs/heads/ref0"; > > This leaks main_branch if it came from git_main_branch_name() It's a static that's used over and over, so I think it's intentional to essentially memoize it for the life of the program (at which point we could free it, but don't bother to do so, letting the process exit take care of it, and trusting in leak detectors to be aware that it's still reachable, as we do for lots of other process-lifetime allocations). -Peff
Hi Peff On 16/06/2020 14:10, Jeff King wrote: > On Mon, Jun 15, 2020 at 04:05:52PM +0100, Phillip Wood wrote: > >>> @@ -515,14 +515,23 @@ static const char *anonymize_refname(const char *refname) >>> }; >>> static struct hashmap refs; >>> static struct strbuf anon = STRBUF_INIT; >>> + static char *main_branch; >> [...] >>> - if (!strcmp(refname, "refs/heads/master")) >>> + if (!main_branch) >>> + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); >>> + >>> + if (!strcmp(refname, main_branch)) >>> return "refs/heads/ref0"; >> >> This leaks main_branch if it came from git_main_branch_name() > > It's a static that's used over and over, so I think it's intentional to > essentially memoize it for the life of the program Oh you're right, I completely misread the patch Thanks Phillip > (at which point we > could free it, but don't bother to do so, letting the process exit take > care of it, and trusting in leak detectors to be aware that it's still > reachable, as we do for lots of other process-lifetime allocations). > > -Peff >
Hi Peff, On Tue, 16 Jun 2020, Jeff King wrote: > On Mon, Jun 15, 2020 at 04:05:52PM +0100, Phillip Wood wrote: > > > > @@ -515,14 +515,23 @@ static const char *anonymize_refname(const char *refname) > > > }; > > > static struct hashmap refs; > > > static struct strbuf anon = STRBUF_INIT; > > > + static char *main_branch; > > [...] > > > - if (!strcmp(refname, "refs/heads/master")) > > > + if (!main_branch) > > > + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); > > > + > > > + if (!strcmp(refname, main_branch)) > > > return "refs/heads/ref0"; > > > > This leaks main_branch if it came from git_main_branch_name() > > It's a static that's used over and over, so I think it's intentional to > essentially memoize it for the life of the program (at which point we > could free it, but don't bother to do so, letting the process exit take > care of it, and trusting in leak detectors to be aware that it's still > reachable, as we do for lots of other process-lifetime allocations). That is indeed the intention, and I will edit the commit message accordingly. Thanks, Dscho
Hi Junio, On Mon, 15 Jun 2020, Junio C Hamano wrote: > "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> > writes: > > > + * While the main branch name might often be `main` for new > > + * repositories (and `master` for aged ones), and such well-known names > > As I said, if you used a different word for the first 'main' in the > sentence, it reads much better. > > > + * may not necessarily need anonymizing, it could be configured to use > > + * a secret word that the user may not want to reveal. > > */ > > - if (!strcmp(refname, "refs/heads/master")) > > + if (!main_branch) > > + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); > > + > > + if (!strcmp(refname, main_branch)) > > return "refs/heads/ref0"; > > The same comment as 02/12 applies here. If the helper function > returns "" when the user says that no branch is more special than > others in the repository, the code would automatically do the right > thing. Seeing as the `fast-export` patches in this here patch series will be dropped from v3, in favor of Peff's patches, this does no longer need to be addressed. > In any case, thanks for working on it. I am on "vacation" so will > be commenting on the rest of the series later in the week. Welcome back! I hope you were able to rest. Ciao, Dscho
diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 1072bbf041f..deeb01b6937 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -515,14 +515,23 @@ static const char *anonymize_refname(const char *refname) }; static struct hashmap refs; static struct strbuf anon = STRBUF_INIT; + static char *main_branch; int i; /* * In certain circumstances, it might be interesting to be able to * identify the main branch. For that reason, let's force its name to * be anonymized to `ref0`. + * + * While the main branch name might often be `main` for new + * repositories (and `master` for aged ones), and such well-known names + * may not necessarily need anonymizing, it could be configured to use + * a secret word that the user may not want to reveal. */ - if (!strcmp(refname, "refs/heads/master")) + if (!main_branch) + main_branch = git_main_branch_name(MAIN_BRANCH_FULL_NAME); + + if (!strcmp(refname, main_branch)) return "refs/heads/ref0"; strbuf_reset(&anon); diff --git a/t/t9351-fast-export-anonymize.sh b/t/t9351-fast-export-anonymize.sh index 2415f0ec213..f42be635c2f 100755 --- a/t/t9351-fast-export-anonymize.sh +++ b/t/t9351-fast-export-anonymize.sh @@ -31,6 +31,12 @@ test_expect_success 'stream translates master to ref0' ' ! grep master stream ' +test_expect_success 'respects configured main branch' ' + git -c core.mainBranch=does-not-exist \ + fast-export --anonymize --all >stream-without-ref0 && + ! grep ref0 stream-without-ref0 +' + test_expect_success 'stream omits other refnames' ' ! grep other stream && ! grep mytag stream