Message ID | 70500b63434f6e454631b3a8fde07c62b5adfef0.1648119652.git.dyroneteng@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | trace2 output for bitmap decision path | expand |
Teng Long <dyroneteng@gmail.com> writes: > There will be only one operant ".bitmap" file in repo, so let's > add "break" statement in "open_pack_bitmap()" when looping all > the packs in repo. Googling "operant" only gives a psychology term X-<. : behavior (such as bar pressing by a rat to obtain food) that operates on the environment to produce rewarding and reinforcing effects. I do not quite get this. We expect that there may be more than one .pack in the object store, and that is why we iterate over the packs get_all_packs() gives us. Then for each of these packs, we try to open pack bitmap. Ah, are you referring to the current limitation that we can have only one .bitmap file, even when there are many packs in the object store? I wonder if that limitation is something we need to hardcode here. IOW, when we ever extend the feature to allow more than one, we may have to extend the "struct bitmap_index" so that we can have bitmap info from more than one on-disk files, and the implementation of open_pack_bitmap_1() (which is "read bitmap for this one pack" as opposed to "read bitmpa for this repository" which is this function) would certainly have to change, but would we need to revert this change to the function? If we do not omit this patch, i.e. even if we know there can be only one pack that may have a corresponding bitmap but if the code does not know it and has to try git_open() only to see it fail, how much wasted effort are we talking about? > Signed-off-by: Teng Long <dyroneteng@gmail.com> > --- > pack-bitmap.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 931219adf0..b1357137bf 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -484,8 +484,10 @@ static int open_pack_bitmap(struct repository *r, > assert(!bitmap_git->map); > > for (p = get_all_packs(r); p; p = p->next) { > - if (open_pack_bitmap_1(bitmap_git, p) == 0) > + if (open_pack_bitmap_1(bitmap_git, p) == 0) { > ret = 0; > + break; > + } > } > > return ret;
On Thu, Mar 24, 2022 at 07:44:00PM +0800, Teng Long wrote: > There will be only one operant ".bitmap" file in repo, so let's > add "break" statement in "open_pack_bitmap()" when looping all > the packs in repo. > > Signed-off-by: Teng Long <dyroneteng@gmail.com> > --- > pack-bitmap.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 931219adf0..b1357137bf 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -484,8 +484,10 @@ static int open_pack_bitmap(struct repository *r, > assert(!bitmap_git->map); > > for (p = get_all_packs(r); p; p = p->next) { > - if (open_pack_bitmap_1(bitmap_git, p) == 0) > + if (open_pack_bitmap_1(bitmap_git, p) == 0) { > ret = 0; > + break; The lack of a break here is intentional, I think, since having more than one bitmap of the same kind in a repository is an error. (This is behavior we inherited from the pre-MIDX bitmap days, when having more than one pack bitmap caused Git to signal an error, since it could only use the results from a single bitmap). You can see in pack-bitmap.c::open_pack_bitmap_1() that we have a condition saying: if (bitmap_git->pack || bitmap_git->midx) { /* ignore extra bitmap file; we can only handle one */ warning("...") close(fd;) return -1; } We do want to call that open_pack_bitmap_1() function on every pack we know about to make sure that one and only one of them corresponds to a .bitmap. Thanks, Taylor
On Thu, Mar 24, 2022 at 11:40:02AM -0700, Junio C Hamano wrote: > Teng Long <dyroneteng@gmail.com> writes: > > > There will be only one operant ".bitmap" file in repo, so let's > > add "break" statement in "open_pack_bitmap()" when looping all > > the packs in repo. > > Googling "operant" only gives a psychology term X-<. > > : behavior (such as bar pressing by a rat to obtain food) that > operates on the environment to produce rewarding and reinforcing > effects. I suspect that the author meant "operating", which I would probably have written as "Git will only use information from a single .bitmap file in the repository" or similar. But... > I do not quite get this. We expect that there may be more than one > .pack in the object store, and that is why we iterate over the packs > get_all_packs() gives us. Then for each of these packs, we try to > open pack bitmap. > > Ah, are you referring to the current limitation that we can have > only one .bitmap file, even when there are many packs in the object > store? > > I wonder if that limitation is something we need to hardcode here. ...you're exactly right: if we see more than one pack bitmap in the repository, we can and should warn there, since we'll only ever *use* information from a single bitmap at a time. That's the behavior before this patch which is guaranteed by t5310.74. This patch causes that test to break, and I would suggest that is telling us to drop this part of the series. Thanks, Taylor
On Thu, 24 Mar 2022 15:03:09 -0400, Taylor Blau wrote: > The lack of a break here is intentional, I think, since having more than > one bitmap of the same kind in a repository is an error. Yeah. I quite sure that it's intentional now. Thanks for your explanations. > (This is behavior we inherited from the pre-MIDX bitmap days, when > having more than one pack bitmap caused Git to signal an error, since it > could only use the results from a single bitmap). > > You can see in pack-bitmap.c::open_pack_bitmap_1() that we have a > condition saying: > > if (bitmap_git->pack || bitmap_git->midx) { > /* ignore extra bitmap file; we can only handle one */ > warning("...") > close(fd;) > return -1; > } > > We do want to call that open_pack_bitmap_1() function on every pack we > know about to make sure that one and only one of them corresponds to a > .bitmap. Yeah, the condition and "warning" make it clear to me which is if already exists a bitmap of the pack or MIDX is ready, we will give warnings and just let it fail (return -1 means a return of NULL in "prepare_bitmap_git()", and will then die() in usage cases I found). In addition of above, I had a question that why we need "bitmap_git->midx" in the condition? Because here in "open_pack_bitmap_1()" we intent to open the non-midx-bitmap and it's after we deal with "open_midx_bitmap()" in "open_bitmap()": static int open_bitmap(struct repository *r, struct bitmap_index *bitmap_git) { assert(!bitmap_git->map); if (!open_midx_bitmap(r, bitmap_git)) return 0; return open_pack_bitmap(r, bitmap_git); } So if I understood correct, maybe we can made condition of "bitmap_git->midx" a little earlier so that we can avoid to open every packfile, maybe it's like: diff --git a/pack-bitmap.c b/pack-bitmap.c index 9c666cdb8b..38f53b8f1c 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -483,11 +483,12 @@ static int open_pack_bitmap(struct repository *r, assert(!bitmap_git->map); - for (p = get_all_packs(r); p; p = p->next) { - if (open_pack_bitmap_1(bitmap_git, p) == 0) - ret = 0; + if (!bitmap_git->midx) { + for (p = get_all_packs(r); p; p = p->next) { + if (open_pack_bitmap_1(bitmap_git, p) == 0) + ret = 0; + } } - return ret; } Thanks.
On Tue, Mar 29, 2022 at 10:49:49AM +0800, Teng Long wrote: > Yeah, the condition and "warning" make it clear to me which is if > already exists a bitmap of the pack or MIDX is ready, we will give > warnings and just let it fail (return -1 means a return of NULL in > "prepare_bitmap_git()", and will then die() in usage cases I found). > > In addition of above, I had a question that why we need > "bitmap_git->midx" in the condition? Because here in > "open_pack_bitmap_1()" we intent to open the non-midx-bitmap and it's > after we deal with "open_midx_bitmap()" in "open_bitmap()": You're right; open_pack_bitmap_1() doesn't need to care about whether or not bitmap_git->midx is or isn't non-NULL, since: - if we did open a MIDX bitmap (which we will always attempt first before trying to open single-pack bitmaps), then we won't even bother to call open_pack_bitmap() at all. - if we _do_ end up within open_pack_bitmap_1(), then we _know_ that no MIDX bitmap could be found/opened, so there is no need to check in that case, either. So I think we realistically could do something like: --- 8< --- diff --git a/pack-bitmap.c b/pack-bitmap.c index 97909d48da..6e7c89826d 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -387,3 +387,3 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git - if (bitmap_git->pack || bitmap_git->midx) { + if (bitmap_git->pack) { /* ignore extra bitmap file; we can only handle one */ --- >8 --- ...but having the conditional there from the pre-image doesn't hurt, either, and it makes the error clearer in case of an accidental regression where we start looking for single-pack bitmaps after successfully opening a multi-pack one. > static int open_bitmap(struct repository *r, > struct bitmap_index *bitmap_git) > { > assert(!bitmap_git->map); > > if (!open_midx_bitmap(r, bitmap_git)) > return 0; > return open_pack_bitmap(r, bitmap_git); > } > > So if I understood correct, maybe we can made condition of "bitmap_git->midx" a little > earlier so that we can avoid to open every packfile, maybe it's like: > > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 9c666cdb8b..38f53b8f1c 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -483,11 +483,12 @@ static int open_pack_bitmap(struct repository *r, > > assert(!bitmap_git->map); > > - for (p = get_all_packs(r); p; p = p->next) { > - if (open_pack_bitmap_1(bitmap_git, p) == 0) > - ret = 0; > + if (!bitmap_git->midx) { > + for (p = get_all_packs(r); p; p = p->next) { > + if (open_pack_bitmap_1(bitmap_git, p) == 0) > + ret = 0; > + } > } > - > return ret; > } This shouldn't be necessary, since we don't bother calling open_pack_bitmap() at all if open_midx_bitmap() returns success. In other words, based on the way that open_bitmap() (which is the caller for both of these functions) is written, we know that once we're in open_pack_bitmap(), that `bitmap_git->midx` is definitely NULL, which makes this change a noop. Thanks, Taylor
diff --git a/pack-bitmap.c b/pack-bitmap.c index 931219adf0..b1357137bf 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -484,8 +484,10 @@ static int open_pack_bitmap(struct repository *r, assert(!bitmap_git->map); for (p = get_all_packs(r); p; p = p->next) { - if (open_pack_bitmap_1(bitmap_git, p) == 0) + if (open_pack_bitmap_1(bitmap_git, p) == 0) { ret = 0; + break; + } } return ret;
There will be only one operant ".bitmap" file in repo, so let's add "break" statement in "open_pack_bitmap()" when looping all the packs in repo. Signed-off-by: Teng Long <dyroneteng@gmail.com> --- pack-bitmap.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)