Message ID | cover.1721250704.git.me@ttaylorr.com (mailing list archive) |
---|---|
Headers | show |
Series | midx: incremental multi-pack indexes, part one | expand |
On Wed, Jul 17, 2024 at 05:11:54PM -0400, Taylor Blau wrote: > This series implements incremental MIDXs, which allow for storing > a MIDX across multiple layers, each with their own distinct set of > packs. > > This round is mostly unchanged from the previous since there has not yet > been substantial review. But it does rebase to current 'master' (which > is 04f5a52757 (Post 2.46-rc0 batch #2, 2024-07-16), at the time of > writing). > > Importantly, this rebase moves this topic to be based on an ancestor of > 0c5a62f14b (midx-write.c: do not read existing MIDX with > `packs_to_include`, 2024-06-11), which resulted in a non-trivial > conflict prior to this rebase. > > The rest of the topic is unchanged. I don't expect that we'll see much > review here for the next couple of weeks while we are in the -rc phase, > but I figured it would be useful to have it on the list for folks that > are interested in taking a look. > > Thanks in advance for any review! :-) I gave it a pretty thorough look. Everything looks good for the most part. I left a few comments, but mostly just thinking my way through things. The trickiest parts were: - the confusion between when we want local per-layer positions versus global positions within the whole chainfile, or whether functions are operating on a single layer versus the whole chain. I mused a bit on how we could do it differently, but ultimately I'm not sure there any good solutions. - the changes you did make look good, but it's hard to know if there's code lurking that still needs to be adjusted for chained midx's. For that I think I'd turn more towards testing than code review. I'm not sure how much interesting coverage we're getting from the GIT_TEST variable, just because the repositories made in most of the tests are so trivial. I'd love to see the results on a real workload (both a big repo, but also how things behave over days or weeks of repository maintenance done with incremental midxs). I know that can be hard to do, though. -Peff
On Thu, Aug 01, 2024 at 07:14:10AM -0400, Jeff King wrote: > On Wed, Jul 17, 2024 at 05:11:54PM -0400, Taylor Blau wrote: > > > This series implements incremental MIDXs, which allow for storing > > a MIDX across multiple layers, each with their own distinct set of > > packs. > > > > This round is mostly unchanged from the previous since there has not yet > > been substantial review. But it does rebase to current 'master' (which > > is 04f5a52757 (Post 2.46-rc0 batch #2, 2024-07-16), at the time of > > writing). > > > > Importantly, this rebase moves this topic to be based on an ancestor of > > 0c5a62f14b (midx-write.c: do not read existing MIDX with > > `packs_to_include`, 2024-06-11), which resulted in a non-trivial > > conflict prior to this rebase. > > > > The rest of the topic is unchanged. I don't expect that we'll see much > > review here for the next couple of weeks while we are in the -rc phase, > > but I figured it would be useful to have it on the list for folks that > > are interested in taking a look. > > > > Thanks in advance for any review! :-) > > I gave it a pretty thorough look. Everything looks good for the most > part. I left a few comments, but mostly just thinking my way through > things. Thanks very much. I squashed all of the feedback that I got from your review into a local copy, which I'll submit as "v3" (probably next week, as I am gone for a long weekend starting ~tomorrow and would like to leave others some time to review as well). > - the changes you did make look good, but it's hard to know if there's > code lurking that still needs to be adjusted for chained midx's. For > that I think I'd turn more towards testing than code review. I'm not > sure how much interesting coverage we're getting from the GIT_TEST > variable, just because the repositories made in most of the tests > are so trivial. > > I'd love to see the results on a real workload (both a big repo, but > also how things behave over days or weeks of repository maintenance > done with incremental midxs). I know that can be hard to do, though. Yeah, I agree that this is the biggest gap in this series and the overall plan right now. I have some more detailed comments in [1] that I think are useful to the overall approach. It basically boils down to declaring the feature "experimental" and letting users that are comfortable testing on the bleeding edge help us iron out any bugs (along with rolling it out at GitHub once all the dust has settled on this and subsequent parts). Thanks again for a detailed and helpful review :-). Thanks, Taylor [1]: https://lore.kernel.org/git/ZqvcAQABDIthFUPH@nand.local/