Message ID | 20250226-pks-maintenance-reflog-expire-v1-0-a1204a814952@pks.im (mailing list archive) |
---|---|
Series | builtin/maintenance: introduce "reflog-expire" task |
On 26/02/2025 15:24, Patrick Steinhardt wrote:
> Hi,
>
> this patch series introduces a new "reflog-expire" task to
> git-maintenance(1). This task is designed to plug a gap when the "gc"
> task is disabled, as there is no way to expire reflog entries in that
> case.
>
> This patch series has been inspired by the discussion at [1]. I consider
> it to be another step into the direction of replacing git-gc(1) and
> allowing for more flexible maintenance strategies overall. Next steps

Hmm, I don't know what you have in mind, but just as a data-point, I have
never used, and have no inclination to use, git-maintenance. However, I do
use git-gc extensively: at least once (times the number of repos fetched
which have changes) per day, pretty much every day! :)

ATB,
Ramsay Jones

Ramsay Jones <ramsay@ramsayjones.plus.com> writes:

> Hmm, I don't know what you have in mind, but just as a data-point, I have
> never used, and have no inclination to use, git-maintenance. However, I do
> use git-gc extensively: at least once (times the number of repos fetched
> which have changes) per day, pretty much every day! :)

That makes two of us, but everybody knows that we are old fashioned ;-)

On 26/02/2025 18:40, Junio C Hamano wrote:
> Ramsay Jones <ramsay@ramsayjones.plus.com> writes:
>
>> Hmm, I don't know what you have in mind, but just as a data-point, I have
>> never used, and have no inclination to use, git-maintenance. However, I do
>> use git-gc extensively: at least once (times the number of repos fetched
>> which have changes) per day, pretty much every day! :)
>
> That makes two of us, but everybody knows that we are old fashioned ;-)

true, very true. :)

ATB,
Ramsay Jones

Patrick Steinhardt <ps@pks.im> writes:

> this patch series introduces a new "reflog-expire" task to
> git-maintenance(1). This task is designed to plug a gap when the "gc"
> task is disabled, as there is no way to expire reflog entries in that
> case.

I think in the longer run, "maintenance" users should be able to
treat the single ball of wax "gc" task as a mere short-hand to
invoke a set of often used maintenance tasks, and we would want to
break down the component tasks grouped in it and make them
independently available. This is a good step along that journey.

Are there other things that the "gc" task covers that are not
available elsewhere? "git gc --help" suggests there are things
related to pruning (unused?) worktrees and stale rerere database
entries.

Another thing, how much control do we want to cede to the end users
the choice of tasks and order of running them? When you are
expiring stale reflog entries and repacking the object database to
discard unreachable objects, it would only make sense to do them in
the order I just said. We could leave it up to the end users, but
that may be doing disservice to them.

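The ordering concern above (expire stale reflog entries first, then repack so that newly unreachable objects can be discarded) can be illustrated with a small sketch. This is not how git-maintenance(1) schedules tasks; the `MUST_PRECEDE` table, the `order_tasks` helper, and the task name "repack" are hypothetical and exist only for this illustration. "reflog-expire" is the task proposed by the series.

```python
# Hypothetical precedence rules: (a, b) means "a must run before b".
# "repack" is an illustrative name, not an existing maintenance task.
MUST_PRECEDE = [("reflog-expire", "repack")]


def order_tasks(requested):
    """Reorder the requested tasks so every rule in MUST_PRECEDE holds.

    Tasks not mentioned in any rule keep their relative order.
    """
    remaining = list(requested)
    ordered = []
    while remaining:
        for task in remaining:
            # A task is blocked while one of its prerequisites is still pending.
            blocked = any(
                after == task and before in remaining
                for before, after in MUST_PRECEDE
            )
            if not blocked:
                ordered.append(task)
                remaining.remove(task)
                break
        else:
            raise ValueError("ordering rules contain a cycle")
    return ordered


if __name__ == "__main__":
    print(order_tasks(["repack", "commit-graph", "reflog-expire"]))
```

A tool that applied rules like these would let users pick which tasks run while still guarding against orderings that make little sense, which is one possible middle ground between the two options raised above.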
On Wed, Feb 26, 2025 at 06:54:48PM +0000, Ramsay Jones wrote:
> On 26/02/2025 18:40, Junio C Hamano wrote:
> > Ramsay Jones <ramsay@ramsayjones.plus.com> writes:
> >
> >> Hmm, I don't know what you have in mind, but just as a data-point, I have
> >> never used, and have no inclination to use, git-maintenance. However, I do
> >> use git-gc extensively: at least once (times the number of repos fetched
> >> which have changes) per day, pretty much every day! :)
> >
> > That makes two of us, but everybody knows that we are old fashioned ;-)
>
> true, very true. :)

Well, it depends on what you mean by "use". In fact, both of you use it
implicitly assuming that you use a recent version of Git because that is
what Git nowadays spawns automatically: we don't use `git gc --auto`
anymore, but instead use `git maintenance run --auto`. It _does_ still
use git-gc(1) under the hood by default, but that is something we can
change going forward.

The opportunity here is to have a more fine-grained strategy to perform
maintenance, both when run explicitly but also when run automatically by
Git. git-maintenance(1) is written in a way that makes it significantly
more flexible overall, so we can iterate on how exactly it performs the
maintenance for the user. Different strategies may make sense in some
contexts, but not in others, and that is something we can account for
here.

It also allows us to bring newer features to the masses that have a
chance to improve performance or reduce the time spent maintaining
repositories for everyone: multi-pack indices, split commit graphs,
geometric repacking, incremental bitmaps. While we could move them into
git-gc(1), I think that this tool is just not well-suited for such
changes as it simply doesn't provide a good foundation for tweakable
behaviour.

Patrick

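To make the difference between the automatic and the fine-grained invocation concrete, here is a minimal sketch that only builds the command lines and does not run them. The helper `maintenance_argv` is hypothetical. `--auto` and the repeatable `--task=<task>` option are existing options of `git maintenance run`, whereas the "reflog-expire" task is the one proposed in this series and may not exist in a given Git version.

```python
def maintenance_argv(tasks=None, auto=False):
    """Build an argv list for `git maintenance run` (hypothetical helper).

    tasks: task names passed as repeated --task=<name> options, in order.
    auto:  add --auto, the mode Git itself uses when it spawns maintenance.
    """
    argv = ["git", "maintenance", "run"]
    if auto:
        argv.append("--auto")
    for task in tasks or []:
        argv.append(f"--task={task}")
    return argv


if __name__ == "__main__":
    # What Git spawns automatically, per the message above.
    print(" ".join(maintenance_argv(auto=True)))
    # An explicit, fine-grained run; "reflog-expire" assumes this series.
    print(" ".join(maintenance_argv(tasks=["reflog-expire", "commit-graph"])))
```

The list could be passed to `subprocess.run` inside a repository. Keeping command construction separate from execution makes it easy to inspect what would run.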
On Wed, Feb 26, 2025 at 05:23:10PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > this patch series introduces a new "reflog-expire" task to
> > git-maintenance(1). This task is designed to plug a gap when the "gc"
> > task is disabled, as there is no way to expire reflog entries in that
> > case.
>
> I think in the longer run, "maintenance" users should be able to
> treat the single ball of wax "gc" task as a mere short-hand to
> invoke a set of often used maintenance tasks, and we would want to
> break down the component tasks grouped in it and make them
> independently available. This is a good step along that journey.
>
> Are there other things that the "gc" task covers that are not
> available elsewhere? "git gc --help" suggests there are things
> related to pruning (unused?) worktrees and stale rerere database
> entries.

These are more gaps indeed. I'm happy to work on them once this patch
series has landed. I don't know about any other gaps.

> Another thing, how much control do we want to cede to the end users
> the choice of tasks and order of running them? When you are
> expiring stale reflog entries and repacking the object database to
> discard unreachable objects, it would only make sense to do them in
> the order I just said. We could leave it up to the end users, but
> that may be doing disservice to them.

This is a good question. From my perspective, there are three classes of
users here:

  - Those that don't care and don't have special needs. This class of
    users is unlikely to tweak things anyway.

  - Those that aren't deeply familiar with how Git works, but who do
    have special needs e.g. because they have huge repositories. This
    class of users may need to tweak configuration, but we should give
    them an _easy_ way to do so. Configuring individual tasks ain't that
    from my perspective.

  - Power users that are deeply familiar with how Git works. This class
    of users may even want to tweak the order in which specific tasks
    run.

"maintenance.strategy" exists to cater to the second class of users and
allows them to configure the high-level strategy used to maintain repos.
I don't know whether it's honored by `git maintenance run`, but I think
it is (and if it's not it should be).

That to me means that the configuration for individual tasks for power
users can be as flexible as possible, including configuring the order in
which tasks are run.

Patrick

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Feb 26, 2025 at 05:23:10PM -0800, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > this patch series introduces a new "reflog-expire" task to
>> > git-maintenance(1). This task is designed to plug a gap when the "gc"
>> > task is disabled, as there is no way to expire reflog entries in that
>> > case.
>>
>> I think in the longer run, "maintenance" users should be able to
>> treat the single ball of wax "gc" task as a mere short-hand to
>> invoke a set of often used maintenance tasks, and we would want to
>> break down the component tasks grouped in it and make them
>> independently available. This is a good step along that journey.
>>
>> Are there other things that the "gc" task covers that are not
>> available elsewhere? "git gc --help" suggests there are things
>> related to pruning (unused?) worktrees and stale rerere database
>> entries.
>
> These are more gaps indeed. I'm happy to work on them once this patch
> series has landed. I don't know about any other gaps.

Or maybe leave breadcrumbs and invite others to help advance the
cause? If we know we have achieved consensus that it is a good
direction to go in, that is (we already saw a mention that indicates
that there are populations of us who do not care too much about
extending maintenance but are familiar with gc).

On Thu, Feb 27, 2025 at 09:01:49AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > On Wed, Feb 26, 2025 at 05:23:10PM -0800, Junio C Hamano wrote:
> >> Patrick Steinhardt <ps@pks.im> writes:
> >>
> >> > this patch series introduces a new "reflog-expire" task to
> >> > git-maintenance(1). This task is designed to plug a gap when the "gc"
> >> > task is disabled, as there is no way to expire reflog entries in that
> >> > case.
> >>
> >> I think in the longer run, "maintenance" users should be able to
> >> treat the single ball of wax "gc" task as a mere short-hand to
> >> invoke a set of often used maintenance tasks, and we would want to
> >> break down the component tasks grouped in it and make them
> >> independently available. This is a good step along that journey.
> >>
> >> Are there other things that the "gc" task covers that are not
> >> available elsewhere? "git gc --help" suggests there are things
> >> related to pruning (unused?) worktrees and stale rerere database
> >> entries.
> >
> > These are more gaps indeed. I'm happy to work on them once this patch
> > series has landed. I don't know about any other gaps.
>
> Or maybe leave breadcrumbs and invite others to help advance the
> cause? If we know we have achieved consensus that it is a good
> direction to go in, that is (we already saw a mention that indicates
> that there are populations of us who do not care too much about
> extending maintenance but are familiar with gc).

Oh, sure, I wouldn't mind at all if somebody else picked this up. The
question to me is where to leave the breadcrumb, other than having it in
this thread.

Patrick