mbox series

[0/5] avoid peeking into `struct lock_file`

Message ID cover.1609874026.git.martin.agren@gmail.com (mailing list archive)
Headers show
Series avoid peeking into `struct lock_file` | expand

Message

Martin Ågren Jan. 5, 2021, 7:23 p.m. UTC
I made a comment in [1] about how we could avoid peeking into a `struct
lock_file` and instead use a helper function that we happen to have at
our disposal. I then grepped around a bit and found that we're pretty
good at avoiding such peeking at the moment, but that we could do
a bit better.

Here's a series to avoid such `lk.tempfile.foo` in favor of
`get_lock_file_foo(&lk)`.

[1] https://lore.kernel.org/git/CAN0heSrOKr--GenbowHP+iwkijbg5pCeJLq+wz6NXCXTsfcvGg@mail.gmail.com/


Martin Ågren (5):
  builtin/gc: don't peek into `struct lock_file`
  commit-graph: don't peek into `struct lock_file`
  midx: don't peek into `struct lock_file`
  refs/files-backend: don't peek into `struct lock_file`
  read-cache: try not to peek into `struct {lock_,temp}file`

 builtin/gc.c         |  6 +++---
 commit-graph.c       |  6 +++---
 midx.c               |  2 +-
 read-cache.c         | 12 ++++++------
 refs/files-backend.c |  4 ++--
 5 files changed, 15 insertions(+), 15 deletions(-)

Comments

Derrick Stolee Jan. 6, 2021, 2:16 a.m. UTC | #1
On 1/5/2021 2:23 PM, Martin Ågren wrote:
> I made a comment in [1] about how we could avoid peeking into a `struct
> lock_file` and instead use a helper function that we happen to have at
> our disposal. I then grepped around a bit and found that we're pretty
> good at avoiding such peeking at the moment, but that we could do
> a bit better.
> 
> Here's a series to avoid such `lk.tempfile.foo` in favor of
> `get_lock_file_foo(&lk)`.
> 
> [1] https://lore.kernel.org/git/CAN0heSrOKr--GenbowHP+iwkijbg5pCeJLq+wz6NXCXTsfcvGg@mail.gmail.com/

Thanks for being diligent and keeping the code clean.

This series is good-to-go.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Junio C Hamano Jan. 6, 2021, 11:55 a.m. UTC | #2
Derrick Stolee <stolee@gmail.com> writes:

> On 1/5/2021 2:23 PM, Martin Ågren wrote:
>> I made a comment in [1] about how we could avoid peeking into a `struct
>> lock_file` and instead use a helper function that we happen to have at
>> our disposal. I then grepped around a bit and found that we're pretty
>> good at avoiding such peeking at the moment, but that we could do
>> a bit better.
>> 
>> Here's a series to avoid such `lk.tempfile.foo` in favor of
>> `get_lock_file_foo(&lk)`.
>> 
>> [1] https://lore.kernel.org/git/CAN0heSrOKr--GenbowHP+iwkijbg5pCeJLq+wz6NXCXTsfcvGg@mail.gmail.com/
>
> Thanks for being diligent and keeping the code clean.
>
> This series is good-to-go.
>
> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

Thanks, both.
Junio C Hamano Jan. 6, 2021, 10:36 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Derrick Stolee <stolee@gmail.com> writes:
>
>> On 1/5/2021 2:23 PM, Martin Ågren wrote:
>>> I made a comment in [1] about how we could avoid peeking into a `struct
>>> lock_file` and instead use a helper function that we happen to have at
>>> our disposal. I then grepped around a bit and found that we're pretty
>>> good at avoiding such peeking at the moment, but that we could do
>>> a bit better.
>>> 
>>> Here's a series to avoid such `lk.tempfile.foo` in favor of
>>> `get_lock_file_foo(&lk)`.
>>> 
>>> [1] https://lore.kernel.org/git/CAN0heSrOKr--GenbowHP+iwkijbg5pCeJLq+wz6NXCXTsfcvGg@mail.gmail.com/
>>
>> Thanks for being diligent and keeping the code clean.
>>
>> This series is good-to-go.
>>
>> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
>
> Thanks, both.

I liked what I saw.  The code after these patches got certainly
clearer.

But it was not quite clear what I was *NOT* seeing in these patches.

IOW, how extensive is the coverage of these patches?  If we renamed
the .tempfile field to, say, .tmpfile in "struct lock_file" in
"lockfile.h", for example, would "lockfile.[ch]" be the *only* files
that need to be adjusted to make the code compile again?  The same
question for various fields in "struct tempfile".
Derrick Stolee Jan. 7, 2021, 2:08 a.m. UTC | #4
On 1/6/2021 5:36 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>> On 1/5/2021 2:23 PM, Martin Ågren wrote:
>>>> I made a comment in [1] about how we could avoid peeking into a `struct
>>>> lock_file` and instead use a helper function that we happen to have at
>>>> our disposal. I then grepped around a bit and found that we're pretty
>>>> good at avoiding such peeking at the moment, but that we could do
>>>> a bit better.
>>>>
>>>> Here's a series to avoid such `lk.tempfile.foo` in favor of
>>>> `get_lock_file_foo(&lk)`.
>>>>
>>>> [1] https://lore.kernel.org/git/CAN0heSrOKr--GenbowHP+iwkijbg5pCeJLq+wz6NXCXTsfcvGg@mail.gmail.com/
>>>
>>> Thanks for being diligent and keeping the code clean.
>>>
>>> This series is good-to-go.
>>>
>>> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
>>
>> Thanks, both.
> 
> I liked what I saw.  The code after these patches got certainly
> clearer.
> 
> But it was not quite clear what I was *NOT* seeing in these patches.
> 
> IOW, how extensive is the coverage of these patches?  If we renamed
> the .tempfile field to, say, .tmpfile in "struct lock_file" in
> "lockfile.h", for example, would "lockfile.[ch]" be the *only* files
> that need to be adjusted to make the code compile again?  The same
> question for various fields in "struct tempfile".
 
There was a note in patch 5 about how do_write_index() takes a
tempfile instead of a lockfile, because sometimes the index is
written without a lock.

I can't say that this is otherwise comprehensive. Definitely a
step in the right direction.

I think the only way to enforce this is to make 'struct lock_file'
anonymous in lockfile.h and actually defined in lockfile.c, assuming
that's possible. It seems like external callers would only be able
to declare a pointer to one, but without access to sizeof(struct
lock_file) these callers would be severely limited.

Thanks,
-Stolee
Martin Ågren Jan. 7, 2021, 7:55 a.m. UTC | #5
On Thu, 7 Jan 2021 at 03:08, Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/6/2021 5:36 PM, Junio C Hamano wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> > I liked what I saw.  The code after these patches got certainly
> > clearer.
> >
> > But it was not quite clear what I was *NOT* seeing in these patches.
> >
> > IOW, how extensive is the coverage of these patches?  If we renamed
> > the .tempfile field to, say, .tmpfile in "struct lock_file" in
> > "lockfile.h", for example, would "lockfile.[ch]" be the *only* files
> > that need to be adjusted to make the code compile again?  The same
> > question for various fields in "struct tempfile".

To be perfectly honest, I just grepped around. I just tried your
suggestion and it seems like I really did catch everyone who looks at
`->tempfile` or `.tempfile`.

I could add a remark in the commit message of the last patch along the
lines of "After this commit, renaming the `tempfile` field only triggers
compilation errors in lockfile.[ch] and this one instance that we're
intentionally leaving here.".

> There was a note in patch 5 about how do_write_index() takes a
> tempfile instead of a lockfile, because sometimes the index is
> written without a lock.

Right, so outside of lockfile.[ch], that's the one spot where I needed
to do s/tempfile/tmpfile/ to be able to build again after trying out
Junio's idea. (We could have `get_lock_file_tempfile()` but it feels
funny to me to have an API for leaking the implementation. Just having
one user who needs to access the internal tempfile also sort of argues
for punting on introducing such a function, to me at least.)

I did on purpose avoid doing the same "don't peek" for tempfiles,
because I sort of assumed it would be a much deeper rabbit hole. Except
for in the last patch: In read-cache.c, the use of tempfiles/lock files
is intertwined/entangled enough that I felt silly modifying the spots
where we use lock files without doing the exact same cleanups to
tempfiles.

I just tried quickly renaming `fd` of `struct tempfile`. There aren't
*too* many hits, but from a quick glance it seems like some of them
might want a more informed rewrite of the logic to express intent even
better. For example, rather than grabbing the `->fd` using the "proper"
function and handling it sort of as a boolean, it might express intent
even better to not grab it at all and instead explicitly ask "do we
hold the lock?".

I'm tempted to promise I'll try a similar round of cleanups for tempfile
users, rather than rerolling and turning ~5 patches in v1 into ~10 in
v2, especially when those might be slightly more invasive. Hmm?

> I think the only way to enforce this is to make 'struct lock_file'
> anonymous in lockfile.h and actually defined in lockfile.c, assuming
> that's possible. It seems like external callers would only be able
> to declare a pointer to one, but without access to sizeof(struct
> lock_file) these callers would be severely limited.

I've never used coccinelle, but I'm guessing it might be a decent fit
for the problem. Either by encoding all 10(?) "don't do this, do that"
or if it's somehow possible to more generally say "don't look at
lockfile{.,->}tempfile". We'll obviously need some allowlisting as well.

Thanks both for your comments.

Martin
Junio C Hamano Jan. 7, 2021, 8:19 a.m. UTC | #6
Martin Ågren <martin.agren@gmail.com> writes:

> To be perfectly honest, I just grepped around. I just tried your
> suggestion and it seems like I really did catch everyone who looks at
> `->tempfile` or `.tempfile`.

That's wonderful ;-)

> I could add a remark in the commit message of the last patch along the
> lines of "After this commit, renaming the `tempfile` field only triggers
> compilation errors in lockfile.[ch] and this one instance that we're
> intentionally leaving here.".

It would be a nice addition to the log message to help readers to
feel confident about the conversion.  It is OK if you want to add
one, and certainly a good trick to have in your toolbox for your
next conversion, but it may not be worth rerolling only to update
the log message with such a remark.

Thanks.
Martin Ågren Jan. 7, 2021, 6:17 p.m. UTC | #7
On Thu, 7 Jan 2021 at 09:19, Junio C Hamano <gitster@pobox.com> wrote:
>
> Martin Ågren <martin.agren@gmail.com> writes:
>
> > I could add a remark in the commit message of the last patch along the
> > lines of "After this commit, renaming the `tempfile` field only triggers
> > compilation errors in lockfile.[ch] and this one instance that we're
> > intentionally leaving here.".
>
> It would be a nice addition to the log message to help readers to
> feel confident about the conversion.  It is OK if you want to add
> one, and certainly a good trick to have in your toolbox for your
> next conversion, but it may not be worth rerolling only to update
> the log message with such a remark.

Ok, got it. So I'm not planning to reroll this series. Unless something
else shows up, obviously.

Thank you.

Martin