[v10,00/10] packfile: avoid using the 'the_repository' global variable

Message ID cover.1733236936.git.karthik.188@gmail.com (mailing list archive)

Message

Karthik Nayak Dec. 3, 2024, 2:43 p.m. UTC
The `packfile.c` file uses the global variable 'the_repository' extensively
throughout the code. Let's remove all uses of it by modifying the required
functions to accept a 'struct repository' instead. This cleans up usage of
global state.

The first 3 patches are mostly internal to `packfile.c`: we add a repository
field to the `packed_git` struct and use it to remove some usages of the
global variable.
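
To illustrate the idea, here is a minimal sketch of a pack struct that
records its owning repository at allocation time. It is not the actual Git
code: both structs are cut-down stand-ins, and `alloc_packed_git_sketch()`
and `pack_gitdir()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real structs in Git have many more fields. */
struct repository {
	const char *gitdir;
};

struct packed_git {
	struct repository *repo; /* repository this pack belongs to */
	const char *pack_name;
};

/* Record the owning repository when the pack struct is allocated. */
static struct packed_git *alloc_packed_git_sketch(struct repository *r,
						  const char *name)
{
	struct packed_git *p;

	assert(r != NULL);
	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;
	p->repo = r;
	p->pack_name = name;
	return p;
}

/* Helpers can now reach the repository through the pack itself. */
static const char *pack_gitdir(const struct packed_git *p)
{
	return p->repo->gitdir;
}
```

With the field in place, code that already holds a `packed_git` pointer no
longer needs to consult a global to find its repository.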

The next 3 patches are more disruptive: they modify the function definitions of
`odb_pack_name`, `has_object[_kept]_pack` and `for_each_packed_object` to receive
a repository, which helps remove other usages of the 'the_repository' variable.
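
The shape of that change can be sketched as a before/after pair. This is
only an illustration under assumptions: `pack_name_global()` and
`pack_name_for_repo()` are hypothetical functions, the struct is a stand-in,
and the path layout is invented for the example rather than taken from the
real `odb_pack_name`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the real repository struct. */
struct repository {
	const char *objects_dir;
};

/* Before: the helper silently depends on a process-wide global. */
static struct repository default_repo = { ".git/objects" };
static struct repository *the_repository_sketch = &default_repo;

static int pack_name_global(char *buf, size_t len, const char *hash)
{
	return snprintf(buf, len, "%s/pack/pack-%s.pack",
			the_repository_sketch->objects_dir, hash);
}

/* After: the caller states explicitly which repository is meant. */
static int pack_name_for_repo(struct repository *r, char *buf, size_t len,
			      const char *hash)
{
	return snprintf(buf, len, "%s/pack/pack-%s.pack",
			r->objects_dir, hash);
}
```

The second form is more disruptive because every caller has to be updated
to pass a repository, but it works for any repository, not just the global
one.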

Finally, the next two patches deal with global config values. These values are
localized. The last patch removes an unnecessary call to `prepare_packed_git()`.

From v5 onwards, I've rebased the series on top of master: 8f8d6eee53 (The
seventh batch, 2024-11-01), because 'jk/dumb-http-finalize', a dependency of
this series, was merged to master. I've found no conflicts while merging with
seen & next, but since this series touches multiple files, there could be
future conflicts.
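
As for the config values mentioned above, localizing them roughly means
keeping them in per-repository settings that are loaded lazily on first use.
The following is a sketch under assumptions, not the code from the series:
the struct layouts, `read_limit_from_config()`, `prepare_settings()`,
`get_delta_base_cache_limit()` and the placeholder default are all
hypothetical.

```c
#include <stddef.h>

/* Placeholder default for illustration; not necessarily Git's value. */
#define SKETCH_DELTA_BASE_CACHE_LIMIT ((size_t)96 * 1024 * 1024)

/* Hypothetical per-repository settings, replacing a global variable. */
struct repo_settings_sketch {
	int initialized;
	size_t delta_base_cache_limit;
};

struct repository {
	struct repo_settings_sketch settings;
	int config_reads; /* counts simulated config lookups */
};

/* Stand-in for an expensive config lookup. */
static size_t read_limit_from_config(struct repository *r)
{
	r->config_reads++;
	return SKETCH_DELTA_BASE_CACHE_LIMIT;
}

/* Load the settings once; later calls return immediately. */
static void prepare_settings(struct repository *r)
{
	if (r->settings.initialized)
		return;
	r->settings.delta_base_cache_limit = read_limit_from_config(r);
	r->settings.initialized = 1;
}

/* Accessor that hot paths can call without reparsing the config. */
static size_t get_delta_base_cache_limit(struct repository *r)
{
	prepare_settings(r);
	return r->settings.delta_base_cache_limit;
}
```

Because the accessor prepares the settings itself, it does not depend on
some other code path having loaded them first, and repeated calls do not
touch the config again.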

Changes in v10:
- Grammar corrections in the commit messages.

Changes in v9:
- Added a comment in gc_config to indicate that eventually the
  `delta_base_cache_limit` variable should be used through repo_settings. 

Changes in v8:
- Fix typos in comments.
- For packfile.c, use delta_base_cache_limit from the repository
  settings; this avoids loading the config in hot paths.
- Rename `longval` to `ulongval` to better signify the type.

Changes in v7:
- Cleanup stale commit message.
- Add missing space in `if` statement.
- Fix typo s/incase/in case/.

Changes in v6:
- Lazy load repository settings in packfile.c. This ensures that the settings
  are always available and that we do not rely on callees setting them.
- Use `size_t` for `delta_base_cache_limit`.

Changes in v5:
- Move packed_git* settings to repo_settings to ensure we don't keep reparsing the
settings in `use_pack`.

Changes in v4:
- Renamed the repository field within `packed_git` and `multi_pack_index` from
`r` to `repo`, while keeping function parameters to be `r`.
- Fixed bad braces.

Changes in v3:
- Improved commit messages: the first commit now discusses how a packed_git
  struct could also be part of the alternates of a repository, and the 7th
  commit explains the motive behind removing the global variable.
- Changed 'packed_git->repo' to 'packed_git->r' to keep it consistent with the
rest of the code base.
- Replaced 'the_repository' with locally available access to the repository
struct in multiple regions.
- Removed unnecessary inclusion of the 'repository.h' header file by forward
  declaring the 'repository' struct.
- Replace memcpy with hashcpy.
- Change the logic in the 7th patch to use if else statements.
- Added an extra commit to cleanup `pack-bitmap.c`.

Karthik Nayak (9):
  packfile: add repository to struct `packed_git`
  packfile: use `repository` from `packed_git` directly
  packfile: pass `repository` to static function in the file
  packfile: pass down repository to `odb_pack_name`
  packfile: pass down repository to `has_object[_kept]_pack`
  packfile: pass down repository to `for_each_packed_object`
  config: make `delta_base_cache_limit` a non-global variable
  config: make `packed_git_(limit|window_size)` non-global variables
  midx: add repository to `multi_pack_index` struct

Taylor Blau (1):
  packfile.c: remove unnecessary prepare_packed_git() call

 builtin/cat-file.c       |   7 +-
 builtin/count-objects.c  |   2 +-
 builtin/fast-import.c    |  15 ++--
 builtin/fsck.c           |  20 +++---
 builtin/gc.c             |  12 +++-
 builtin/index-pack.c     |  20 ++++--
 builtin/pack-objects.c   |  11 +--
 builtin/pack-redundant.c |   2 +-
 builtin/repack.c         |   2 +-
 builtin/rev-list.c       |   2 +-
 commit-graph.c           |   4 +-
 config.c                 |  22 ------
 connected.c              |   3 +-
 diff.c                   |   3 +-
 environment.c            |   3 -
 environment.h            |   1 -
 fsck.c                   |   2 +-
 http.c                   |   4 +-
 list-objects.c           |   7 +-
 midx-write.c             |   2 +-
 midx.c                   |   3 +-
 midx.h                   |   3 +
 object-store-ll.h        |   9 ++-
 pack-bitmap.c            |  90 ++++++++++++++----------
 pack-objects.h           |   3 +-
 pack-write.c             |   1 +
 pack.h                   |   2 +
 packfile.c               | 144 ++++++++++++++++++++++-----------------
 packfile.h               |  18 +++--
 promisor-remote.c        |   2 +-
 prune-packed.c           |   2 +-
 reachable.c              |   4 +-
 repo-settings.c          |  18 +++++
 repo-settings.h          |   7 ++
 revision.c               |  13 ++--
 tag.c                    |   2 +-
 36 files changed, 275 insertions(+), 190 deletions(-)

Range-diff against v9:
 1:  d1fdd6996a !  1:  d6d571c58e packfile: add repository to struct `packed_git`
    @@ Commit message
         on the global `the_repository` object in `packfile.c` by simply using
         repository information now readily available in the struct.
     
    -    We do need to consider that a pack file could be part of the alternates
    +    We do need to consider that a packfile could be part of the alternates
         of a repository, but considering that we only have one repository struct
    -    and also that we currently anyways use 'the_repository'. We should be
    +    and also that we currently anyways use 'the_repository', we should be
         OK with this change.
     
         We also modify `alloc_packed_git` to ensure that the repository is added
 2:  65c09858ce =  2:  fa69763468 packfile: use `repository` from `packed_git` directly
 3:  80632934d1 !  3:  c6acbece46 packfile: pass `repository` to static function in the file
    @@ Commit message
         packfile: pass `repository` to static function in the file
     
         Some of the static functions in the `packfile.c` access global
    -    variables, which can simply be avoiding by passing the `repository`
    +    variables, which can simply be avoided by passing the `repository`
         struct down to them. Let's do that.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
 4:  67d71eab83 =  4:  a8588d6086 packfile: pass down repository to `odb_pack_name`
 5:  ee210fa153 =  5:  b3fe20c8f1 packfile: pass down repository to `has_object[_kept]_pack`
 6:  8db7094f4e =  6:  ad46b339ea packfile: pass down repository to `for_each_packed_object`
 7:  a66494384d =  7:  342a26572d config: make `delta_base_cache_limit` a non-global variable
 8:  bce9196f6b =  8:  6e55daf5b3 config: make `packed_git_(limit|window_size)` non-global variables
 9:  c7fba8cf6a =  9:  6e0ec955e6 midx: add repository to `multi_pack_index` struct
10:  d7f475fbd0 = 10:  e33fa2ea0d packfile.c: remove unnecessary prepare_packed_git() call

Comments

Kristoffer Haugsbakk Dec. 3, 2024, 4:46 p.m. UTC | #1
On Tue, Dec 3, 2024, at 15:43, Karthik Nayak wrote:
> Range-diff against v9:
>  1:  d1fdd6996a !  1:  d6d571c58e packfile: add repository to struct `packed_git`
>     @@ Commit message
>          on the global `the_repository` object in `packfile.c` by simply using
>          repository information now readily available in the struct.
>
>     -    We do need to consider that a pack file could be part of the alternates
>     +    We do need to consider that a packfile could be part of the alternates
>          of a repository, but considering that we only have one repository struct
>     -    and also that we currently anyways use 'the_repository'. We should be
>     +    and also that we currently anyways use 'the_repository', we should be
>          OK with this change.
>
>          We also modify `alloc_packed_git` to ensure that the repository is added
>  2:  65c09858ce =  2:  fa69763468 packfile: use `repository` from `packed_git` directly
>  3:  80632934d1 !  3:  c6acbece46 packfile: pass `repository` to static function in the file
>     @@ Commit message
>          packfile: pass `repository` to static function in the file
>
>          Some of the static functions in the `packfile.c` access global
>     -    variables, which can simply be avoiding by passing the `repository`
>     +    variables, which can simply be avoided by passing the `repository`
>          struct down to them. Let's do that.

Nice, thank you.

Junio C Hamano Dec. 3, 2024, 11:24 p.m. UTC | #2
"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:

> Nice, thank you.

Yeah, together with a few responses from Toon and Patrick, this
topic seems to be very well done by now.  Let me mark it for 'next'.

Thanks, all.