[0/2] gpg-interface: cleanup + convert low hanging fruit to configset API

Message ID cover-0.2-00000000000-20230209T142225Z-avarab@gmail.com (mailing list archive)

Message

Ævar Arnfjörð Bjarmason Feb. 9, 2023, 2:35 p.m. UTC
On Thu, Feb 09 2023, Jeff King wrote:

> If the gpg code used git_config_get_string(), etc, then they could just
> access each key on demand (efficiently, from an internal hash table),
> which reduces the risk of "oops, we forgot to initialize the config
> here". It does probably mean restructuring the code a little, though
> (since you'd often have an accessor function to get "foo.bar" rather
> than assuming "foo.bar" was parsed into an enum already, etc). That may
> not be worth the effort (and risk of regression) to convert.

I'd already played around with that a bit as part of reviewing Junio's
change, this goes on top of that.

I found that continuing this conversion was getting harder, but these
3 cases really were trivial cases where we're just reading a variable
globally, and then proceeding to use it in one specific place.
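
To illustrate what such a trivial case looks like once converted, here
is a minimal sketch. It is a toy model and not git's code: the
toy_config table, toy_config_get_string() and toy_signing_format() are
made-up names, and the linear scan stands in for the hash table lookup
Jeff mentions. It only mirrors the accessor convention of returning 0
when the key is found and non-zero otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an already-parsed config set; NOT git's data structure. */
struct toy_config_entry {
	const char *key;
	const char *value;
};

/* Illustrative entries only; the repeated key shows "last one wins". */
static const struct toy_config_entry toy_config[] = {
	{ "gpg.format", "openpgp" },
	{ "gpg.format", "ssh" },
	{ "gpg.mintrustlevel", "marginal" },
};

/*
 * Hypothetical on-demand accessor: returns 0 and sets *dest when the
 * key exists, 1 (leaving *dest alone) when it does not.
 */
static int toy_config_get_string(const char *key, const char **dest)
{
	size_t i;
	int found = 0;

	assert(key && dest);
	for (i = 0; i < sizeof(toy_config) / sizeof(toy_config[0]); i++) {
		if (!strcmp(toy_config[i].key, key)) {
			*dest = toy_config[i].value; /* later entries override */
			found = 1;
		}
	}
	return found ? 0 : 1;
}

/* The one place that uses the value asks for it right there. */
static const char *toy_signing_format(void)
{
	const char *fmt = "openpgp"; /* default when the key is unset */

	toy_config_get_string("gpg.format", &fmt);
	return fmt;
}
```

With this shape there is no file-scoped variable to forget to
initialize: the caller that needs the value is also the one that looks
it up.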

Out of the remaining ones gpg.program et al. looked easiest, but I
didn't continue with it.

For anyone interested, I think it would be best to continue by converting
the remaining bits by having commit, tag etc. set up some "struct
gpg", so that they could directly instruct it to do its config
reading before parse_options(). The remaining complexity is mainly
with the file-global & having to juggle in what order we read & set
what.
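
A rough sketch of that ordering, under the assumption that a
command-line value should win over a configured one. None of these
names exist in git: "struct toy_gpg", toy_gpg_read_config() and
toy_gpg_apply_option() are hypothetical, and the config value is passed
in as a plain string to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-command state, standing in for the "struct gpg" idea. */
struct toy_gpg {
	const char *signing_key; /* from config, may be overridden later */
	int config_read;         /* set once the config defaults are loaded */
};

/* Step 1: load defaults from config (here just a string argument). */
static void toy_gpg_read_config(struct toy_gpg *gpg, const char *configured)
{
	gpg->signing_key = configured;
	gpg->config_read = 1;
}

/* Step 2: a command-line value, when given, overrides the config value. */
static void toy_gpg_apply_option(struct toy_gpg *gpg, const char *opt)
{
	assert(gpg->config_read); /* options must be applied after config */
	if (opt)
		gpg->signing_key = opt;
}
```

Making the caller do the config read explicitly, before option
parsing, is what keeps a late config read from clobbering a value the
user gave on the command line.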

FWIW when poking at this I found that we have fairly robust test
coverage for this area. It could be better, but it's good enough to
spot that if we stop reading these we'll fail tests.

E.g. for "gpg.program" we've got tests that'll fail if the "gpg"
program variable isn't read, but not for the "ssh" variable. As
they'll both share the same/similar reader code, any future
migration should spot any glaring bugs, just possibly not subtle ones.

Branch & passing[1] CI at:
https://github.com/avar/git/tree/avar/gpg-lazy-init-configset

1. Well, passing except for the general current Windows CI dumpster
   fire on topics based off current "master".

Ævar Arnfjörð Bjarmason (2):
  {am,commit-tree,verify-{commit,tag}}: refactor away config wrapper
  gpg-interface.c: lazily get GPG config variables on demand

 builtin/am.c            |  7 +----
 builtin/commit-tree.c   |  7 +----
 builtin/verify-commit.c |  7 +----
 builtin/verify-tag.c    |  7 +----
 gpg-interface.c         | 66 ++++++++++++++++-------------------------
 5 files changed, 29 insertions(+), 65 deletions(-)

Comments

Junio C Hamano Feb. 9, 2023, 9:27 p.m. UTC | #1
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> On Thu, Feb 09 2023, Jeff King wrote:
>
>> If the gpg code used git_config_get_string(), etc, then they could just
>> access each key on demand (efficiently, from an internal hash table),
>> which reduces the risk of "oops, we forgot to initialize the config
>> here". It does probably mean restructuring the code a little, though
>> (since you'd often have an accessor function to get "foo.bar" rather
>> than assuming "foo.bar" was parsed into an enum already, etc). That may
>> not be worth the effort (and risk of regression) to convert.
>
> I'd already played around with that a bit as part of reviewing Junio's
> change, this goes on top of that.

What's your intention of sending these?  I think we are already in
agreement that the churn may not be worth the risk, so if these are
"and here is what the churn would look like, not for application", I
would understand it and appreciate it.  But did you mean that these
patches are for application?  I am not sure...

Thanks.

Ævar Arnfjörð Bjarmason Feb. 10, 2023, 10:29 a.m. UTC | #2
On Thu, Feb 09 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> On Thu, Feb 09 2023, Jeff King wrote:
>>
>>> If the gpg code used git_config_get_string(), etc, then they could just
>>> access each key on demand (efficiently, from an internal hash table),
>>> which reduces the risk of "oops, we forgot to initialize the config
>>> here". It does probably mean restructuring the code a little, though
>>> (since you'd often have an accessor function to get "foo.bar" rather
>>> than assuming "foo.bar" was parsed into an enum already, etc). That may
>>> not be worth the effort (and risk of regression) to convert.
>>
>> I'd already played around with that a bit as part of reviewing Junio's
>> change, this goes on top of that.
>
> What's your intention of sending these?

For them to be picked up on top of your jc/gpg-lazy-init.

> I think we are already in
> agreement that the churn may not be worth the risk, so if these are
> "and here is what the churn would look like, not for application", I
> would understand it and appreciate it.  But did you mean that these
> patches are for application?  I am not sure...

I understood your "I specifically did not want anybody to start doing
this line of analysis" in [1] to mean that you didn't want to have the
sort of change that the last paragraph of 2/2 notes that we're
deliberately not doing.

I.e. that we'd like to keep the gpg_interface_lazy_init() boilerplate,
even though we might carefully reason that a specific API entry point
won't need to initialize the file-scoped config variables right now.

I then took your "it is vastly preferred not to do such a change in this
step" in [2] as a note that it was deliberate that the change in 1/2
here wasn't part of your jc/gpg-lazy-init, but not that we shouldn't
follow-up with such a clean-up.

The "on top once the dust settled" in [2] can then be addressed by
graduating your jc/gpg-lazy-init soon, and keeping this in "seen" for a
bit, although I think the changes here (and in particular 1/2) are
trivial enough to graduate soon thereafter.

Given that, I had mixed feelings about submitting this now, but Jeff's
[3] convinced me. I.e. the change in 2/2 'reduces the risk of "oops, we
forgot to initialize the config here"' in the future.

But obviously it's up to you whether you pick this up. You don't
seem especially keen on doing so, so if not I guess we'll just drop
this, but I'd be happy if you did.

I do think that the 2/2 here has the added benefit of making your change
easier to review, and that's why I wrote it initially. I was poking at
your patch to see what behavior changes, logic errors or bugs I could
find in it.

I.e. your end state is that we're reading 7 config variables (I'm
counting the *.program ones as one variable). The 2/2 here brings that
down to just 3. Thus the surface area of potential issues where we don't
call gpg_interface_lazy_init() before accessing the values is reduced.
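
The hazard being reduced can be sketched as follows. This is a toy
model of the guard pattern and not the code in gpg-interface.c: all
toy_* names and the hard-coded "config" value are assumptions made for
the example.

```c
#include <assert.h>

static int toy_initialized;     /* file-scoped guard flag */
static int toy_init_calls;      /* how often config was actually read */
static int toy_min_trust_level; /* file-scoped "config variable" */

/* Reads the config at most once, on first use. */
static void toy_lazy_init(void)
{
	if (toy_initialized)
		return;
	toy_initialized = 1;
	toy_init_calls++;
	toy_min_trust_level = 2; /* pretend this came from config */
}

/* Reading the variable without the guard having run is the bug. */
static int toy_get_min_trust_level(void)
{
	assert(toy_initialized);
	return toy_min_trust_level;
}

/* Every entry point that needs the variable must call the guard first. */
static int toy_check_trust(int level)
{
	toy_lazy_init();
	return level >= toy_get_min_trust_level();
}
```

Each file-scoped variable that stays behind the guard is one more
thing every entry point has to remember to initialize, so fewer such
variables means fewer places to get it wrong.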

Which is also why I opted to send this sooner rather than later; having
that as a review aid helps others now, and not in a few months.

1. https://lore.kernel.org/git/xmqq5ycbpp8a.fsf@gitster.g/
2. https://lore.kernel.org/git/xmqqpmaimvtd.fsf_-_@gitster.g/
3. https://lore.kernel.org/git/Y+TqEM21o+3TGx6D@coredump.intra.peff.net/

Junio C Hamano Feb. 10, 2023, 7:02 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> What's your intention of sending these?
>
> For them to be picked up on top of your jc/gpg-lazy-init.
>
>> I think we are already in
>> agreement that the churn may not be worth the risk, so if these are
>> "and here is what the churn would look like, not for application", I
>> would understand it and appreciate it.  But did you mean that these
>> patches are for application?  I am not sure...
>
> I understood your "I specifically did not want anybody to start doing
> this line of analysis" in [1] to mean that you didn't want to have the
> sort of change that the last paragraph of 2/2 notes that we're
> deliberately not doing.

I didn't want to see "oh you are calling lazy_init here but you can
delay it even further" kind of comments that are wrong and waste our
time.

> I.e. that we'd like to keep the gpg_interface_lazy_init() boilerplate,
> even though we might carefully reason that a specific API entry point
> won't need to initialize the file-scoped config variables right now.

It is the complete opposite of what I meant.

Changing

	git_am_config(...) {
		return git_default_config(...);
	}
	... 
		git_config(git_am_config);

to

	/* no git_am_config() */
	...
		git_config(git_default_config);

is perfectly fine as a clean-up post series.

If we are moving away from git_config() callback style, and move to
git_config_get_*() style, the discussion upthread already said it does
not have a good risk/benefit ratio, but if we were to do so, then we
should not leave some still using the callback style while others use
git_config_get_*(), which will lead to configuration being read in the
wrong order and easily break precedence rules.

And if we were to move away completely from the callback style, then
I do not see the point of building such a series on top of the lazy
init patch, which is about staying with the callback style.

So, that is exactly why I asked the question after seeing it was
marked to apply on top of the lazy init thing, which did not make
sense to me.