Message ID | pull.1490.git.1678136369387.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | object-file: reprepare alternates when necessary | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) > struct object_directory *odb; > > obj_read_lock(); > + reprepare_alt_odb(r); > for (odb = r->objects->odb; odb; odb = odb->next) > odb_clear_loose_cache(odb); Hmph, if there was an old alternate ODB from which we took some loose object from and cached, and if that ODB no longer is on the updated alternate list, would we now fail to clear the loose objects cache for the ODB? Or are we only prepared for seeing "more" alternates and assume no existing alternates go away? Other than that, looking quite well reasoned.
On Mon, Mar 06, 2023 at 02:54:00PM -0800, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) > > struct object_directory *odb; > > > > obj_read_lock(); > > + reprepare_alt_odb(r); > > for (odb = r->objects->odb; odb; odb = odb->next) > > odb_clear_loose_cache(odb); > > Hmph, if there was an old alternate ODB from which we took some > loose object from and cached, and if that ODB no longer is on the > updated alternate list, would we now fail to clear the loose objects > cache for the ODB? Or are we only prepared for seeing "more" > alternates and assume no existing alternates go away? Based on my understanding of the patch, we are only prepared to see "more" alternates, rather than some existing alternate going away. That being said, I am not certain that is how it works. Perhaps an alternate "goes away", but does not actually get removed from the list of alternate ODBs. If that's the case, any object lookup in that now-missing ODB would fail, but any subsequent ODBs which were added after calling reprepare_alt_odb() would succeed on that object lookup. So, I don't know. I don't have the implementation details of the alternates ODB mechanism paged in enough to say for sure. Hopefully Stolee can point us in the right direction. > Other than that, looking quite well reasoned. Agreed. Thanks, Taylor
On Mon, Mar 06 2023, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <derrickstolee@github.com> > > When an object is not found in a repository's object store, we sometimes > call reprepare_packed_git() to see if the object was temporarily moved > into a new pack-file (and its old pack-file or loose object was > deleted). This process does a scan of each pack directory within each > odb, but does not reevaluate if the odb list needs updating. > > Create a new reprepare_alt_odb() method that is a similar wrapper around > prepare_alt_odb(). Call it from reprepare_packed_git() under the object > read lock to avoid readers from interacting with a potentially > incomplete odb being added to the odb list. > > prepare_alt_odb() already avoids adding duplicate odbs to the list > during its progress, so it is safe to call it again from > reprepare_alt_odb() without worrying about duplicate odbs. > > This change is specifically for concurrent changes to the repository, so > it is difficult to create a test that guarantees this behavior is > correct. I manually verified by introducing a reprepare_packed_git() call > into get_revision() and stepped into that call in a debugger with a > parent 'git log' process. Multiple runs of reprepare_alt_odb() kept > the_repository->objects->odb as a single-item chain until I added a > .git/objects/info/alternates file in a different process. The next run > added the new odb to the chain and subsequent runs did not add to the > chain. I found this a bit hard to read, as one migh think from just this explanation that you're adding ODB locking as a new concept here, adding "existing" in front of "read lock" above might help. But in fact we've been doing the locking since 6c307626f1e (grep: protect packed_git [re-]initialization, 2020-01-15). So the only thing that really needs justification here is that putting the alternates re-reading under the same lock There is a really interesting potential caveat here which you're not discussing, which is... > Signed-off-by: Derrick Stolee <derrickstolee@github.com> > --- > object-file: reprepare alternates when necessary > > This subtlety was notice by Michael Haggerty due to how alternates are > used server-side at $DAYJOB. Moving pack-files from a repository to the > alternate occasionally causes failures because processes that start > before the alternate exists don't know how to find that alternate at > run-time. > > Thanks, > > * Stolee > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1490%2Fderrickstolee%2Fstolee%2Freprepare-alternates-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1490/derrickstolee/stolee/reprepare-alternates-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1490 > > object-file.c | 6 ++++++ > object-store.h | 1 + > packfile.c | 1 + > 3 files changed, 8 insertions(+) > > diff --git a/object-file.c b/object-file.c > index 939865c1ae0..22acc7fd8e9 100644 > --- a/object-file.c > +++ b/object-file.c > @@ -944,6 +944,12 @@ void prepare_alt_odb(struct repository *r) > r->objects->loaded_alternates = 1; > } > > +void reprepare_alt_odb(struct repository *r) > +{ > + r->objects->loaded_alternates = 0; > + prepare_alt_odb(r); > +} > + > /* Returns 1 if we have successfully freshened the file, 0 otherwise. */ > static int freshen_file(const char *fn) > { > diff --git a/object-store.h b/object-store.h > index 1a713d89d7c..750c29daa54 100644 > --- a/object-store.h > +++ b/object-store.h > @@ -56,6 +56,7 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, > struct object_directory *, 1, fspathhash, fspatheq) > > void prepare_alt_odb(struct repository *r); > +void reprepare_alt_odb(struct repository *r); > char *compute_alternate_path(const char *path, struct strbuf *err); > struct object_directory *find_odb(struct repository *r, const char *obj_dir); > typedef int alt_odb_fn(struct object_directory *, void *); > diff --git a/packfile.c b/packfile.c > index 79e21ab18e7..2b28918a05e 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) > struct object_directory *odb; > > obj_read_lock(); > + reprepare_alt_odb(r); > for (odb = r->objects->odb; odb; odb = odb->next) > odb_clear_loose_cache(odb); > > > base-commit: d15644fe0226af7ffc874572d968598564a230dd This seems reasonable, but wouldn't this do the same without introducing an API function just for this one use-case? That's of course a nit, and you seem to have been adding this for consistency with reprepare_packed_git(), but it already "owns" the "approximate_object_count_valid" and "packed_git_initialized" flags in "struct raw_object_store". So as we'll only need this from reprepare_packed_git() isn't it better to declare that "loaded_alternates" is another such flag? Perhaps not, but as the resulting patch is much shorter it seems worth considering. ...but to continue the above, the *really* important thing here (and correct me if I'm wrong) is that we really need to *first* prepare the alternates, and *then* do the rest, as our new alternates might point to new loose objects and packs. So with both of the above (the same could be done with your new helper) something like this would IMO make that much clearer: diff --git a/packfile.c b/packfile.c index 79e21ab18e7..50cb46ca4b7 100644 --- a/packfile.c +++ b/packfile.c @@ -1008,6 +1008,13 @@ void reprepare_packed_git(struct repository *r) struct object_directory *odb; obj_read_lock(); + /* + * New alternates might point to new loose & pack dirs, so we + * must first read those. + */ + r->objects->loaded_alternates = 0; + prepare_alt_odb(r); + for (odb = r->objects->odb; odb; odb = odb->next) odb_clear_loose_cache(odb); And, I think this is an exsiting edge case, but we're only locking the ODB of the "parent" repository in this case, so if we have alternates in play aren't we going to racily compute the rest here, the loose objects and packs of the alternates we're about to consult aren't under the same lock?
On 3/6/2023 7:28 PM, Taylor Blau wrote: > On Mon, Mar 06, 2023 at 02:54:00PM -0800, Junio C Hamano wrote: >> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: >> >>> @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) >>> struct object_directory *odb; >>> >>> obj_read_lock(); >>> + reprepare_alt_odb(r); >>> for (odb = r->objects->odb; odb; odb = odb->next) >>> odb_clear_loose_cache(odb); >> >> Hmph, if there was an old alternate ODB from which we took some >> loose object from and cached, and if that ODB no longer is on the >> updated alternate list, would we now fail to clear the loose objects >> cache for the ODB? Or are we only prepared for seeing "more" >> alternates and assume no existing alternates go away? > > Based on my understanding of the patch, we are only prepared to see > "more" alternates, rather than some existing alternate going away. > > That being said, I am not certain that is how it works. Perhaps an > alternate "goes away", but does not actually get removed from the list > of alternate ODBs. If that's the case, any object lookup in that > now-missing ODB would fail, but any subsequent ODBs which were added > after calling reprepare_alt_odb() would succeed on that object lookup. > > So, I don't know. I don't have the implementation details of the > alternates ODB mechanism paged in enough to say for sure. Hopefully > Stolee can point us in the right direction. The prepare_alt_odb() call only _adds_ to the linked odb list. It will not remove any existing ODBs. Adding this reprepare_*() method makes it such that we can use the union of the alternates available across the lifetime of the process. Thanks, -Stolee
Derrick Stolee <derrickstolee@github.com> writes:
> The prepare_alt_odb() call only _adds_ to the linked odb list.
Ah, thanks. That was what I missed and led to the question.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > But in fact we've been doing the locking since 6c307626f1e (grep: > protect packed_git [re-]initialization, 2020-01-15). So the only thing > that really needs justification here is that putting the alternates > re-reading under the same lock > > There is a really interesting potential caveat here which you're not > discussing, which is... >> ... >> +void reprepare_alt_odb(struct repository *r) >> +{ >> + r->objects->loaded_alternates = 0; >> + prepare_alt_odb(r); >> +} >> + >> ... > This seems reasonable, but wouldn't this do the same without introducing > an API function just for this one use-case? > > That's of course a nit, and you seem to have been adding this for > consistency with reprepare_packed_git(), but it already "owns" the > "approximate_object_count_valid" and "packed_git_initialized" flags in > "struct raw_object_store". > > So as we'll only need this from reprepare_packed_git() isn't it better > to declare that "loaded_alternates" is another such flag? I am not sure I got what you want to say 100%, but if you are saying that this "drop the 'loaded' flag and force prepare_*() function to redo its thing" must not be done only in reprepare_packed_git(), and that inlining the code there without introducing a helper function that anybody can casually call without thinking its consequenced through, then I tend to agree in principle. But it is just as easy to lift two lines of code from the rewritten/inlined code to a new place---to ensure people follow the obj_read_lock() rule, the comment before it may have to be a bit stronger, I wonder? > Perhaps not, but as the resulting patch is much shorter it seems worth > considering. > > ...but to continue the above, the *really* important thing here (and > correct me if I'm wrong) is that we really need to *first* prepare the > alternates, and *then* do the rest, as our new alternates might point to > new loose objects and packs. Yes, and as Derrick explained in another message, we only have to worry about new ones getting added, not existing ones going away. > So with both of the above (the same could be done with your new helper) > something like this would IMO make that much clearer: > > diff --git a/packfile.c b/packfile.c > index 79e21ab18e7..50cb46ca4b7 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1008,6 +1008,13 @@ void reprepare_packed_git(struct repository *r) > struct object_directory *odb; > > obj_read_lock(); > + /* > + * New alternates might point to new loose & pack dirs, so we > + * must first read those. > + */ > + r->objects->loaded_alternates = 0; > + prepare_alt_odb(r); > + > for (odb = r->objects->odb; odb; odb = odb->next) > odb_clear_loose_cache(odb); > > And, I think this is an exsiting edge case, but we're only locking the > ODB of the "parent" repository in this case, so if we have alternates in > play aren't we going to racily compute the rest here, the loose objects > and packs of the alternates we're about to consult aren't under the same > lock?
Junio C Hamano <gitster@pobox.com> writes:
> redo its thing" must not be done only in reprepare_packed_git(), and
Oops, "must be done only", of course. Sorry.
On 3/7/2023 12:29 PM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >> But in fact we've been doing the locking since 6c307626f1e (grep: >> protect packed_git [re-]initialization, 2020-01-15). So the only thing >> that really needs justification here is that putting the alternates >> re-reading under the same lock >> >> There is a really interesting potential caveat here which you're not >> discussing, which is... >>> ... >>> +void reprepare_alt_odb(struct repository *r) >>> +{ >>> + r->objects->loaded_alternates = 0; >>> + prepare_alt_odb(r); >>> +} >>> + >>> ... >> This seems reasonable, but wouldn't this do the same without introducing >> an API function just for this one use-case? >> >> That's of course a nit, and you seem to have been adding this for >> consistency with reprepare_packed_git(), but it already "owns" the >> "approximate_object_count_valid" and "packed_git_initialized" flags in >> "struct raw_object_store". >> >> So as we'll only need this from reprepare_packed_git() isn't it better >> to declare that "loaded_alternates" is another such flag? > > I am not sure I got what you want to say 100%, but if you are saying > that this "drop the 'loaded' flag and force prepare_*() function to > redo its thing" must not be done only in reprepare_packed_git(), and > that inlining the code there without introducing a helper function > that anybody can casually call without thinking its consequenced > through, then I tend to agree in principle. But it is just as easy > to lift two lines of code from the rewritten/inlined code to a new > place---to ensure people follow the obj_read_lock() rule, the > comment before it may have to be a bit stronger, I wonder? The fact that we do it in a lock in reprepare_packed_git() in the only current caller does raise suspicion that someone could call it later and not realize that the lock could be helpful. Inlining the change within reprepare_packed_git() makes the most sense here instead of mimicking the pattern. >> Perhaps not, but as the resulting patch is much shorter it seems worth >> considering. The shortness of the patch is metric of quality, to me. The other reason (we might introduce a footgun) is more interesting. >> ...but to continue the above, the *really* important thing here (and >> correct me if I'm wrong) is that we really need to *first* prepare the >> alternates, and *then* do the rest, as our new alternates might point to >> new loose objects and packs. > > Yes, and as Derrick explained in another message, we only have to > worry about new ones getting added, not existing ones going away. I'll be sure to clarify that in my next version. >> So with both of the above (the same could be done with your new helper) >> something like this would IMO make that much clearer: >> >> diff --git a/packfile.c b/packfile.c >> index 79e21ab18e7..50cb46ca4b7 100644 >> --- a/packfile.c >> +++ b/packfile.c >> @@ -1008,6 +1008,13 @@ void reprepare_packed_git(struct repository *r) >> struct object_directory *odb; >> >> obj_read_lock(); >> + /* >> + * New alternates might point to new loose & pack dirs, so we >> + * must first read those. >> + */ >> + r->objects->loaded_alternates = 0; >> + prepare_alt_odb(r); >> + >> for (odb = r->objects->odb; odb; odb = odb->next) >> odb_clear_loose_cache(odb); >> >> And, I think this is an exsiting edge case, but we're only locking the >> ODB of the "parent" repository in this case, so if we have alternates in >> play aren't we going to racily compute the rest here, the loose objects >> and packs of the alternates we're about to consult aren't under the same >> lock? I don't understand what you are saying here. odb_read_lock() does not specify a repository and is instead a global lock on reading from any object database. Here is its implementation: extern int obj_read_use_lock; extern pthread_mutex_t obj_read_mutex; static inline void obj_read_lock(void) { if(obj_read_use_lock) pthread_mutex_lock(&obj_read_mutex); } static inline void obj_read_unlock(void) { if(obj_read_use_lock) pthread_mutex_unlock(&obj_read_mutex); } So I don't believe that your proposed edge case exists. Thanks, -Stolee
On Tue, Mar 07, 2023 at 09:52:19AM -0500, Derrick Stolee wrote: > On 3/6/2023 7:28 PM, Taylor Blau wrote: > > On Mon, Mar 06, 2023 at 02:54:00PM -0800, Junio C Hamano wrote: > >> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> > >>> @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) > >>> struct object_directory *odb; > >>> > >>> obj_read_lock(); > >>> + reprepare_alt_odb(r); > >>> for (odb = r->objects->odb; odb; odb = odb->next) > >>> odb_clear_loose_cache(odb); > >> > >> Hmph, if there was an old alternate ODB from which we took some > >> loose object from and cached, and if that ODB no longer is on the > >> updated alternate list, would we now fail to clear the loose objects > >> cache for the ODB? Or are we only prepared for seeing "more" > >> alternates and assume no existing alternates go away? > > > > Based on my understanding of the patch, we are only prepared to see > > "more" alternates, rather than some existing alternate going away. > > > > That being said, I am not certain that is how it works. Perhaps an > > alternate "goes away", but does not actually get removed from the list > > of alternate ODBs. If that's the case, any object lookup in that > > now-missing ODB would fail, but any subsequent ODBs which were added > > after calling reprepare_alt_odb() would succeed on that object lookup. > > > > So, I don't know. I don't have the implementation details of the > > alternates ODB mechanism paged in enough to say for sure. Hopefully > > Stolee can point us in the right direction. > > The prepare_alt_odb() call only _adds_ to the linked odb list. It > will not remove any existing ODBs. Adding this reprepare_*() method > makes it such that we can use the union of the alternates available > across the lifetime of the process. Right, that matches my understanding. What I am asking is: since we only add ODBs to the list, what happens if we can no longer access an *existing* alternate at the time we call reprepare_alt_odb()? It's clear that that now-inaccessible alternate remains in our list of alternate ODBs, but do all object lookups hitting that ODB fail-over to the new ODB? I believe so, but it isn't totally clear to me. Thanks, Taylor
On 3/8/2023 10:55 AM, Taylor Blau wrote: > On Tue, Mar 07, 2023 at 09:52:19AM -0500, Derrick Stolee wrote: >> The prepare_alt_odb() call only _adds_ to the linked odb list. It >> will not remove any existing ODBs. Adding this reprepare_*() method >> makes it such that we can use the union of the alternates available >> across the lifetime of the process. > > Right, that matches my understanding. What I am asking is: since we only > add ODBs to the list, what happens if we can no longer access an > *existing* alternate at the time we call reprepare_alt_odb()? > > It's clear that that now-inaccessible alternate remains in our list of > alternate ODBs, but do all object lookups hitting that ODB fail-over to > the new ODB? I believe so, but it isn't totally clear to me. It's the same as the pack-file list: if we fail to load something from one, then we continue to the next one. If an alternate dir is completely removed during the process, then looking for pack- files again will fail to see any and continue without error. This is already possible by deleting an alternate directory while a Git process is running and might try to open files in it. Git already recovers from this scenario. If you're instead talking about the .git/objects/info/alternates file being modified to remove an alternate from the list, then Git's current behavior is to keep that alternate around for the life of the process, and I recommend continuing that behavior. There's nothing special that we are adding here that doesn't already exist as protections when files are removed beneath the Git process. Thanks, -Stolee
diff --git a/object-file.c b/object-file.c index 939865c1ae0..22acc7fd8e9 100644 --- a/object-file.c +++ b/object-file.c @@ -944,6 +944,12 @@ void prepare_alt_odb(struct repository *r) r->objects->loaded_alternates = 1; } +void reprepare_alt_odb(struct repository *r) +{ + r->objects->loaded_alternates = 0; + prepare_alt_odb(r); +} + /* Returns 1 if we have successfully freshened the file, 0 otherwise. */ static int freshen_file(const char *fn) { diff --git a/object-store.h b/object-store.h index 1a713d89d7c..750c29daa54 100644 --- a/object-store.h +++ b/object-store.h @@ -56,6 +56,7 @@ KHASH_INIT(odb_path_map, const char * /* key: odb_path */, struct object_directory *, 1, fspathhash, fspatheq) void prepare_alt_odb(struct repository *r); +void reprepare_alt_odb(struct repository *r); char *compute_alternate_path(const char *path, struct strbuf *err); struct object_directory *find_odb(struct repository *r, const char *obj_dir); typedef int alt_odb_fn(struct object_directory *, void *); diff --git a/packfile.c b/packfile.c index 79e21ab18e7..2b28918a05e 100644 --- a/packfile.c +++ b/packfile.c @@ -1008,6 +1008,7 @@ void reprepare_packed_git(struct repository *r) struct object_directory *odb; obj_read_lock(); + reprepare_alt_odb(r); for (odb = r->objects->odb; odb; odb = odb->next) odb_clear_loose_cache(odb);