Message ID | f1c07324f62cf4d087c41165cefed98f554cfd78.1612411124.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | repack: support repacking into a geometric sequence | expand |
On Wed, Feb 03, 2021 at 10:59:21PM -0500, Taylor Blau wrote: > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index fbd7b54d70..b2ba5aa14f 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -1225,9 +1225,9 @@ static int want_found_object(const struct object_id *oid, int exclude, > */ > unsigned flags = 0; > if (ignore_packed_keep_on_disk) > - flags |= ON_DISK_KEEP_PACKS; > + flags |= CACHE_ON_DISK_KEEP_PACKS; > if (ignore_packed_keep_in_core) > - flags |= IN_CORE_KEEP_PACKS; > + flags |= CACHE_IN_CORE_KEEP_PACKS; Why are we renaming the constants in this patch? I know I'm listed as the author, but I think this came out of some off-list back and forth between us. It seems like the existing constants would have been fine. > +static void maybe_invalidate_kept_pack_cache(struct repository *r, > + unsigned flags) > { > - return find_one_pack_entry(r, oid, e, 0); > + if (!r->objects->kept_pack_cache) > + return; > + if (r->objects->kept_pack_cache->flags == flags) > + return; > + free(r->objects->kept_pack_cache->packs); > + FREE_AND_NULL(r->objects->kept_pack_cache); > +} OK, so we keep a single cache based on the flags, and then if somebody ever asks for different flags, we throw it away. That's probably OK for our purposes, since we wouldn't expect multiple callers within a single process. I wondered if it would be simpler to just keep two lists, one for in-core keeps and one for on-disk keeps. And then just walk over each list separately based on the query flags. That makes things more robust _and_ I think would be less code. It does mean that a pack could appear in both lists, though, which means we might do a lookup in it twice. That doesn't seem all that likely, but it is working against our goal here. Another option is to keep 3 caches (two separate and one combined), rather than flipping between them. I'm not sure if that would be less code or not (it gets rid of the "invalidate" function, but you do have to pick the right cache depending on the query flags). Yet another option is to keep a cache of any that are marked as _either_ in core or on-disk keeps, and then decide to look up the object based on the query flags. Then you just pay the cost to iterate over the list and check the flags (which really is all this cache is helping with in the first place). I dunno. TBH, I kind of wonder if this whole patch is worth doing at all, giving the underwhelming performance benefit (3% on the pathological 1000-pack case). When I had timed this strategy initially, it was more like 15%. I'm not sure where the savings went in the interim, or if it was a timing fluke. > +static struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) > +{ > + maybe_invalidate_kept_pack_cache(r, flags); > + > + if (!r->objects->kept_pack_cache) { > + struct packed_git **packs = NULL; > + size_t nr = 0, alloc = 0; > + struct packed_git *p; > + > + /* > + * We want "all" packs here, because we need to cover ones that > + * are used by a midx, as well. We need to look in every one of > + * them (instead of the midx itself) to cover duplicates. It's > + * possible that an object is found in two packs that the midx > + * covers, one kept and one not kept, but the midx returns only > + * the non-kept version. > + */ > + for (p = get_all_packs(r); p; p = p->next) { > + if ((p->pack_keep && (flags & CACHE_ON_DISK_KEEP_PACKS)) || > + (p->pack_keep_in_core && (flags & CACHE_IN_CORE_KEEP_PACKS))) { > + ALLOC_GROW(packs, nr + 1, alloc); > + packs[nr++] = p; > + } > + } > + ALLOC_GROW(packs, nr + 1, alloc); > + packs[nr] = NULL; > + > + r->objects->kept_pack_cache = xmalloc(sizeof(*r->objects->kept_pack_cache)); > + r->objects->kept_pack_cache->packs = packs; > + r->objects->kept_pack_cache->flags = flags; > + } Is there any reason not to just embed the kept_pack_cache struct inside the object_store? It's one less pointer to deal with. I wonder if this is a holdover from an attempt to have multiple caches. (I also think it would be reasonable if we wanted to hide the definition of the cache struct from callers, but we don't seem do to that). > @@ -2109,7 +2123,8 @@ int has_object_pack(const struct object_id *oid) > return find_pack_entry(the_repository, oid, &e); > } > > -int has_object_kept_pack(const struct object_id *oid, unsigned flags) > +int has_object_kept_pack(const struct object_id *oid, > + unsigned flags) > { > struct pack_entry e; > return find_kept_pack_entry(the_repository, oid, flags, &e); This seems like a stray change. > diff --git a/packfile.h b/packfile.h > index 624327f64d..eb56db2a7b 100644 > --- a/packfile.h > +++ b/packfile.h > @@ -161,10 +161,6 @@ int packed_object_info(struct repository *r, > void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); > const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1); > > -#define ON_DISK_KEEP_PACKS 1 > -#define IN_CORE_KEEP_PACKS 2 > -#define ALL_KEEP_PACKS (ON_DISK_KEEP_PACKS | IN_CORE_KEEP_PACKS) I notice that when the constants moved, we didn't keep an equivalent of ALL_KEEP_PACKS. Maybe we didn't need it in the first place in patch 1? BTW, I absolutely hate the complication that all of this on-disk versus in-core keep distinction brings to this code. And I wondered what it was really doing for us and whether we could get rid of it. But I think we do need it: a common case may be to avoid using --honor-pack-keep (because you don't want to deal with racy .keep writes from incoming receive-pack processes), but use in-core ones for something like --stdin-packs. So we do need to respect one and not the other. I do wonder if things would be simpler if pack-objects simply kept its own list of "in core" packs in a separate array. But that is really just another form of the same problem, I guess. -Peff
On Wed, Feb 17, 2021 at 12:11:00PM -0500, Jeff King wrote: > On Wed, Feb 03, 2021 at 10:59:21PM -0500, Taylor Blau wrote: > > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > > index fbd7b54d70..b2ba5aa14f 100644 > > --- a/builtin/pack-objects.c > > +++ b/builtin/pack-objects.c > > @@ -1225,9 +1225,9 @@ static int want_found_object(const struct object_id *oid, int exclude, > > */ > > unsigned flags = 0; > > if (ignore_packed_keep_on_disk) > > - flags |= ON_DISK_KEEP_PACKS; > > + flags |= CACHE_ON_DISK_KEEP_PACKS; > > if (ignore_packed_keep_in_core) > > - flags |= IN_CORE_KEEP_PACKS; > > + flags |= CACHE_IN_CORE_KEEP_PACKS; > > Why are we renaming the constants in this patch? > > I know I'm listed as the author, but I think this came out of some > off-list back and forth between us. It seems like the existing constants > would have been fine. Yeah, they would have been fine. They were renamed because this patch makes them only used for the kept pack cache, but I agree the existing names are fine, too. In any case, they make an easier-to-read diff, so I'm perfectly happy to un-rename them ;). > > +static void maybe_invalidate_kept_pack_cache(struct repository *r, > > + unsigned flags) > > { > > - return find_one_pack_entry(r, oid, e, 0); > > + if (!r->objects->kept_pack_cache) > > + return; > > + if (r->objects->kept_pack_cache->flags == flags) > > + return; > > + free(r->objects->kept_pack_cache->packs); > > + FREE_AND_NULL(r->objects->kept_pack_cache); > > +} > > OK, so we keep a single cache based on the flags, and then if somebody > ever asks for different flags, we throw it away. That's probably OK for > our purposes, since we wouldn't expect multiple callers within a single > process. > > I wondered if it would be simpler to just keep two lists, one for > in-core keeps and one for on-disk keeps. And then just walk over each > list separately based on the query flags. That makes things more robust > _and_ I think would be less code. It does mean that a pack could appear > in both lists, though, which means we might do a lookup in it twice. > That doesn't seem all that likely, but it is working against our goal > here. > > Another option is to keep 3 caches (two separate and one combined), > rather than flipping between them. I'm not sure if that would be less > code or not (it gets rid of the "invalidate" function, but you do have > to pick the right cache depending on the query flags). > > Yet another option is to keep a cache of any that are marked as _either_ > in core or on-disk keeps, and then decide to look up the object based on > the query flags. Then you just pay the cost to iterate over the list and > check the flags (which really is all this cache is helping with in the > first place). All interesting ideas. In this patch (and by the end of the series) callers that use the kept pack cache never ask for the cache with a different set of flags. IOW, there isn't a situation where a caller would populate the in-core kept pack cache, and then suddenly ask for both in-core and on-disk packs to be kept. So all of this code is defensive in case that were to change, and suddenly we'd be returning subtly wrong results. I could imagine that being kind of a nasty bug to track down, so detecting and invalidating the cache would make it a non-issue. I'll note it in the commit message, though, since it's good for future readers to be aware, too. > I dunno. TBH, I kind of wonder if this whole patch is worth doing at > all, giving the underwhelming performance benefit (3% on the > pathological 1000-pack case). When I had timed this strategy initially, > it was more like 15%. I'm not sure where the savings went in the > interim, or if it was a timing fluke. Yeah, I dunno. It's certainly not hurting (I don't think the extra code is all that complex, and the savings is at least non-zero), so I'm inclined to keep it. > > +static struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) > > +{ > > + maybe_invalidate_kept_pack_cache(r, flags); > > + > > + if (!r->objects->kept_pack_cache) { > > + struct packed_git **packs = NULL; > > + size_t nr = 0, alloc = 0; > > + struct packed_git *p; > > + > > + /* > > + * We want "all" packs here, because we need to cover ones that > > + * are used by a midx, as well. We need to look in every one of > > + * them (instead of the midx itself) to cover duplicates. It's > > + * possible that an object is found in two packs that the midx > > + * covers, one kept and one not kept, but the midx returns only > > + * the non-kept version. > > + */ > > + for (p = get_all_packs(r); p; p = p->next) { > > + if ((p->pack_keep && (flags & CACHE_ON_DISK_KEEP_PACKS)) || > > + (p->pack_keep_in_core && (flags & CACHE_IN_CORE_KEEP_PACKS))) { > > + ALLOC_GROW(packs, nr + 1, alloc); > > + packs[nr++] = p; > > + } > > + } > > + ALLOC_GROW(packs, nr + 1, alloc); > > + packs[nr] = NULL; > > + > > + r->objects->kept_pack_cache = xmalloc(sizeof(*r->objects->kept_pack_cache)); > > + r->objects->kept_pack_cache->packs = packs; > > + r->objects->kept_pack_cache->flags = flags; > > + } > > Is there any reason not to just embed the kept_pack_cache struct inside > the object_store? It's one less pointer to deal with. I wonder if this > is a holdover from an attempt to have multiple caches. > > (I also think it would be reasonable if we wanted to hide the definition > of the cache struct from callers, but we don't seem do to that). Not a holdover, just designed to avoid adding too many extra fields to the object-store. I don't feel strongly, but I do think hiding the definition is a good idea, so I'll inline it. > > @@ -2109,7 +2123,8 @@ int has_object_pack(const struct object_id *oid) > > return find_pack_entry(the_repository, oid, &e); > > } > > > > -int has_object_kept_pack(const struct object_id *oid, unsigned flags) > > +int has_object_kept_pack(const struct object_id *oid, > > + unsigned flags) > > { > > struct pack_entry e; > > return find_kept_pack_entry(the_repository, oid, flags, &e); > > This seems like a stray change. Good eyes, thanks. > > > diff --git a/packfile.h b/packfile.h > > index 624327f64d..eb56db2a7b 100644 > > --- a/packfile.h > > +++ b/packfile.h > > @@ -161,10 +161,6 @@ int packed_object_info(struct repository *r, > > void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); > > const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1); > > > > -#define ON_DISK_KEEP_PACKS 1 > > -#define IN_CORE_KEEP_PACKS 2 > > -#define ALL_KEEP_PACKS (ON_DISK_KEEP_PACKS | IN_CORE_KEEP_PACKS) > > I notice that when the constants moved, we didn't keep an equivalent of > ALL_KEEP_PACKS. Maybe we didn't need it in the first place in patch 1? Yeah, we didn't need it to begin with. I'll drop it accordingly. > BTW, I absolutely hate the complication that all of this on-disk > versus in-core keep distinction brings to this code. And I wondered > what it was really doing for us and whether we could get rid of it. > But I think we do need it: a common case may be to avoid using > --honor-pack-keep (because you don't want to deal with racy .keep > writes from incoming receive-pack processes), but use in-core ones for > something like --stdin-packs. So we do need to respect one and not the > other. > > I do wonder if things would be simpler if pack-objects simply kept its > own list of "in core" packs in a separate array. But that is really > just another form of the same problem, I guess. Yeah, the complexity is awfully hard to reason about, but you're right that here it is necessary. > -Peff Thanks, Taylor
On Wed, Feb 17, 2021 at 02:54:33PM -0500, Taylor Blau wrote: > > OK, so we keep a single cache based on the flags, and then if somebody > > ever asks for different flags, we throw it away. That's probably OK for > > our purposes, since we wouldn't expect multiple callers within a single > > process. > > [...some alternatives] > > All interesting ideas. In this patch (and by the end of the series) > callers that use the kept pack cache never ask for the cache with a > different set of flags. IOW, there isn't a situation where a caller > would populate the in-core kept pack cache, and then suddenly ask for > both in-core and on-disk packs to be kept. > > So all of this code is defensive in case that were to change, and > suddenly we'd be returning subtly wrong results. I could imagine that > being kind of a nasty bug to track down, so detecting and invalidating > the cache would make it a non-issue. Yeah, I agree that the current crop of callers does not care. And I am glad we are not leaving a booby-trap for later programmers with respect to correctness (by virtue of the invalidation function). But it does feel like we are leaving one for performance, which they very well might not realize the cache is doing worse-than-nothing. Would just doing: if (cache.packs && cache.flags != flags) BUG("kept-pack-cache cannot handle multiple queries in a single process"); be a better solution? That is not helping anyone towards a world where we gracefully handle back-and-forth queries. But it makes it abundantly clear when such a thing would become necessary. > > Is there any reason not to just embed the kept_pack_cache struct inside > > the object_store? It's one less pointer to deal with. I wonder if this > > is a holdover from an attempt to have multiple caches. > > > > (I also think it would be reasonable if we wanted to hide the definition > > of the cache struct from callers, but we don't seem do to that). > > Not a holdover, just designed to avoid adding too many extra fields to > the object-store. I don't feel strongly, but I do think hiding the > definition is a good idea, so I'll inline it. This response confuses me a bit. Hiding the definition from callers would mean _keeping_ it as a pointer, but putting the definition into packfile.c, where nobody outside that file could see it (at least that is what I meant by hiding). But inlining it to me implies embedding the struct (not a pointer to it) in "struct object_store", defining the struct at the point we define the struct field which uses it. I am fine with either, to be clear. I'm just confused which you are proposing to do. :) -Peff
On Wed, Feb 17, 2021 at 03:25:17PM -0500, Jeff King wrote: > Would just doing: > > if (cache.packs && cache.flags != flags) > BUG("kept-pack-cache cannot handle multiple queries in a single process"); > > be a better solution? That is not helping anyone towards a world where > we gracefully handle back-and-forth queries. But it makes it abundantly > clear when such a thing would become necessary. I dunno. I can certainly see its merits, but I have to imagine that anybody who cares enough about the performance will be able to find our conversation here. Assuming that's the case, I would rather have the kept-pack cache handle multiple queries before BUG()-ing. > > > Is there any reason not to just embed the kept_pack_cache struct inside > > > the object_store? It's one less pointer to deal with. I wonder if this > > > is a holdover from an attempt to have multiple caches. > > > > > > (I also think it would be reasonable if we wanted to hide the definition > > > of the cache struct from callers, but we don't seem do to that). > > > > Not a holdover, just designed to avoid adding too many extra fields to > > the object-store. I don't feel strongly, but I do think hiding the > > definition is a good idea, so I'll inline it. > > This response confuses me a bit. Hiding the definition from callers > would mean _keeping_ it as a pointer, but putting the definition into > packfile.c, where nobody outside that file could see it (at least that > is what I meant by hiding). > > But inlining it to me implies embedding the struct (not a pointer to it) > in "struct object_store", defining the struct at the point we define the > struct field which uses it. > > I am fine with either, to be clear. I'm just confused which you are > proposing to do. :) Probably because I changed my mind in the middle of writing it ;). I'm proposing embedding the definition of the struct into the definition of object_store, and then operating on its fields (from within packfile.c). Thanks, Taylor
On Wed, Feb 17, 2021 at 03:29:50PM -0500, Taylor Blau wrote: > On Wed, Feb 17, 2021 at 03:25:17PM -0500, Jeff King wrote: > > Would just doing: > > > > if (cache.packs && cache.flags != flags) > > BUG("kept-pack-cache cannot handle multiple queries in a single process"); > > > > be a better solution? That is not helping anyone towards a world where > > we gracefully handle back-and-forth queries. But it makes it abundantly > > clear when such a thing would become necessary. > > I dunno. I can certainly see its merits, but I have to imagine that > anybody who cares enough about the performance will be able to find our > conversation here. Assuming that's the case, I would rather have the > kept-pack cache handle multiple queries before BUG()-ing. OK. I am on the fence, and you are the author, so I'm happy to go with your preference. I'm not quite as optimistic that somebody would find this conversation, if only because they have to know to look for it. I could easily see somebody adding a find_kept_in_pack() without thinking too hard about it. OTOH, I find it quite unlikely that anybody would use a different set of flags within the same process, so it would probably Just Work for them regardless. :) > > This response confuses me a bit. Hiding the definition from callers > > would mean _keeping_ it as a pointer, but putting the definition into > > packfile.c, where nobody outside that file could see it (at least that > > is what I meant by hiding). > > > > But inlining it to me implies embedding the struct (not a pointer to it) > > in "struct object_store", defining the struct at the point we define the > > struct field which uses it. > > > > I am fine with either, to be clear. I'm just confused which you are > > proposing to do. :) > > Probably because I changed my mind in the middle of writing it ;). I'm > proposing embedding the definition of the struct into the definition of > object_store, and then operating on its fields (from within packfile.c). OK, that sounds great to me (and arguably produces more efficient code, since we avoid a pointer dereference, though I doubt it matters in practice). Thanks for clarifying. -Peff
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index fbd7b54d70..b2ba5aa14f 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1225,9 +1225,9 @@ static int want_found_object(const struct object_id *oid, int exclude, */ unsigned flags = 0; if (ignore_packed_keep_on_disk) - flags |= ON_DISK_KEEP_PACKS; + flags |= CACHE_ON_DISK_KEEP_PACKS; if (ignore_packed_keep_in_core) - flags |= IN_CORE_KEEP_PACKS; + flags |= CACHE_IN_CORE_KEEP_PACKS; if (ignore_packed_keep_on_disk && p->pack_keep) return 0; @@ -3089,7 +3089,7 @@ static void read_packs_list_from_stdin(void) * an optimization during delta selection. */ revs.no_kept_objects = 1; - revs.keep_pack_cache_flags |= IN_CORE_KEEP_PACKS; + revs.keep_pack_cache_flags |= CACHE_IN_CORE_KEEP_PACKS; revs.blob_objects = 1; revs.tree_objects = 1; revs.tag_objects = 1; diff --git a/object-store.h b/object-store.h index c4fc9dd74e..4cbe8eae3c 100644 --- a/object-store.h +++ b/object-store.h @@ -105,6 +105,14 @@ static inline int pack_map_entry_cmp(const void *unused_cmp_data, return strcmp(pg1->pack_name, key ? key : pg2->pack_name); } +#define CACHE_ON_DISK_KEEP_PACKS 1 +#define CACHE_IN_CORE_KEEP_PACKS 2 + +struct kept_pack_cache { + struct packed_git **packs; + unsigned flags; +}; + struct raw_object_store { /* * Set of all object directories; the main directory is first (and @@ -150,6 +158,8 @@ struct raw_object_store { /* A most-recently-used ordered version of the packed_git list. */ struct list_head packed_git_mru; + struct kept_pack_cache *kept_pack_cache; + /* * A map of packfiles to packed_git structs for tracking which * packs have been loaded already. diff --git a/packfile.c b/packfile.c index 5f35cfe788..2a139c907b 100644 --- a/packfile.c +++ b/packfile.c @@ -2031,10 +2031,7 @@ static int fill_pack_entry(const struct object_id *oid, return 1; } -static int find_one_pack_entry(struct repository *r, - const struct object_id *oid, - struct pack_entry *e, - int kept_only) +int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e) { struct list_head *pos; struct multi_pack_index *m; @@ -2044,49 +2041,64 @@ static int find_one_pack_entry(struct repository *r, return 0; for (m = r->objects->multi_pack_index; m; m = m->next) { - if (!fill_midx_entry(r, oid, e, m)) - continue; - - if (!kept_only) - return 1; - - if (((kept_only & ON_DISK_KEEP_PACKS) && e->p->pack_keep) || - ((kept_only & IN_CORE_KEEP_PACKS) && e->p->pack_keep_in_core)) + if (fill_midx_entry(r, oid, e, m)) return 1; } list_for_each(pos, &r->objects->packed_git_mru) { struct packed_git *p = list_entry(pos, struct packed_git, mru); - if (p->multi_pack_index && !kept_only) { - /* - * If this pack is covered by the MIDX, we'd have found - * the object already in the loop above if it was here, - * so don't bother looking. - * - * The exception is if we are looking only at kept - * packs. An object can be present in two packs covered - * by the MIDX, one kept and one not-kept. And as the - * MIDX points to only one copy of each object, it might - * have returned only the non-kept version above. We - * have to check again to be thorough. - */ - continue; - } - if (!kept_only || - (((kept_only & ON_DISK_KEEP_PACKS) && p->pack_keep) || - ((kept_only & IN_CORE_KEEP_PACKS) && p->pack_keep_in_core))) { - if (fill_pack_entry(oid, e, p)) { - list_move(&p->mru, &r->objects->packed_git_mru); - return 1; - } + if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) { + list_move(&p->mru, &r->objects->packed_git_mru); + return 1; } } return 0; } -int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e) +static void maybe_invalidate_kept_pack_cache(struct repository *r, + unsigned flags) { - return find_one_pack_entry(r, oid, e, 0); + if (!r->objects->kept_pack_cache) + return; + if (r->objects->kept_pack_cache->flags == flags) + return; + free(r->objects->kept_pack_cache->packs); + FREE_AND_NULL(r->objects->kept_pack_cache); +} + +static struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) +{ + maybe_invalidate_kept_pack_cache(r, flags); + + if (!r->objects->kept_pack_cache) { + struct packed_git **packs = NULL; + size_t nr = 0, alloc = 0; + struct packed_git *p; + + /* + * We want "all" packs here, because we need to cover ones that + * are used by a midx, as well. We need to look in every one of + * them (instead of the midx itself) to cover duplicates. It's + * possible that an object is found in two packs that the midx + * covers, one kept and one not kept, but the midx returns only + * the non-kept version. + */ + for (p = get_all_packs(r); p; p = p->next) { + if ((p->pack_keep && (flags & CACHE_ON_DISK_KEEP_PACKS)) || + (p->pack_keep_in_core && (flags & CACHE_IN_CORE_KEEP_PACKS))) { + ALLOC_GROW(packs, nr + 1, alloc); + packs[nr++] = p; + } + } + ALLOC_GROW(packs, nr + 1, alloc); + packs[nr] = NULL; + + r->objects->kept_pack_cache = xmalloc(sizeof(*r->objects->kept_pack_cache)); + r->objects->kept_pack_cache->packs = packs; + r->objects->kept_pack_cache->flags = flags; + } + + return r->objects->kept_pack_cache->packs; } int find_kept_pack_entry(struct repository *r, @@ -2094,13 +2106,15 @@ int find_kept_pack_entry(struct repository *r, unsigned flags, struct pack_entry *e) { - /* - * Load all packs, including midx packs, since our "kept" strategy - * relies on that. We're relying on the side effect of it setting up - * r->objects->packed_git, which is a little ugly. - */ - get_all_packs(r); - return find_one_pack_entry(r, oid, e, flags); + struct packed_git **cache; + + for (cache = kept_pack_cache(r, flags); *cache; cache++) { + struct packed_git *p = *cache; + if (fill_pack_entry(oid, e, p)) + return 1; + } + + return 0; } int has_object_pack(const struct object_id *oid) @@ -2109,7 +2123,8 @@ int has_object_pack(const struct object_id *oid) return find_pack_entry(the_repository, oid, &e); } -int has_object_kept_pack(const struct object_id *oid, unsigned flags) +int has_object_kept_pack(const struct object_id *oid, + unsigned flags) { struct pack_entry e; return find_kept_pack_entry(the_repository, oid, flags, &e); diff --git a/packfile.h b/packfile.h index 624327f64d..eb56db2a7b 100644 --- a/packfile.h +++ b/packfile.h @@ -161,10 +161,6 @@ int packed_object_info(struct repository *r, void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1); -#define ON_DISK_KEEP_PACKS 1 -#define IN_CORE_KEEP_PACKS 2 -#define ALL_KEEP_PACKS (ON_DISK_KEEP_PACKS | IN_CORE_KEEP_PACKS) - /* * Iff a pack file in the given repository contains the object named by sha1, * return true and store its location to e. diff --git a/revision.c b/revision.c index 4c5adb90b1..41c0478705 100644 --- a/revision.c +++ b/revision.c @@ -2338,14 +2338,14 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg die(_("--unpacked=<packfile> no longer supported")); } else if (!strcmp(arg, "--no-kept-objects")) { revs->no_kept_objects = 1; - revs->keep_pack_cache_flags |= IN_CORE_KEEP_PACKS; - revs->keep_pack_cache_flags |= ON_DISK_KEEP_PACKS; + revs->keep_pack_cache_flags |= CACHE_IN_CORE_KEEP_PACKS; + revs->keep_pack_cache_flags |= CACHE_ON_DISK_KEEP_PACKS; } else if (skip_prefix(arg, "--no-kept-objects=", &optarg)) { revs->no_kept_objects = 1; if (!strcmp(optarg, "in-core")) - revs->keep_pack_cache_flags |= IN_CORE_KEEP_PACKS; + revs->keep_pack_cache_flags |= CACHE_IN_CORE_KEEP_PACKS; if (!strcmp(optarg, "on-disk")) - revs->keep_pack_cache_flags |= ON_DISK_KEEP_PACKS; + revs->keep_pack_cache_flags |= CACHE_ON_DISK_KEEP_PACKS; } else if (!strcmp(arg, "-r")) { revs->diff = 1; revs->diffopt.flags.recursive = 1;