[0/9] caching loose objects

Message ID 20181112144627.GA2478@sigill.intra.peff.net

Jeff King Nov. 12, 2018, 2:46 p.m. UTC
Here's the series I mentioned earlier in the thread to cache loose
objects when answering has_object_file(..., OBJECT_INFO_QUICK). For
those just joining us, this makes operations that look up a lot of
missing objects (like "index-pack" looking for collisions) faster. This
is mostly targeted at systems where stat() is slow, like over NFS, but
it seems to give a 2% speedup indexing a full git.git packfile into an
empty repository (i.e., what you'd see on a clone).
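
As a rough illustration of the idea (not the code from this series), the
sketch below reads each objects/xx/ subdirectory at most once, keeps the
names it finds in a sorted array, and answers later existence checks from
memory instead of calling stat() per object. The names loose_cache,
load_subdir() and loose_cache_has() are hypothetical, and the sketch
assumes 40-character SHA-1 hex names.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; none of these names come from git itself. */
#define HEX_LEN 40 /* assumed: SHA-1 object names in hex */

struct loose_cache {
	char (*names)[HEX_LEN + 1]; /* sorted full hex object names */
	size_t nr, alloc;
	unsigned char loaded[256];  /* which xx/ subdirectories were read */
};

static int cmp_name(const void *a, const void *b)
{
	return strcmp(a, b);
}

/* Read objdir/xx/ once and remember every object name found there. */
static void load_subdir(struct loose_cache *c, const char *objdir,
			unsigned int nr)
{
	static const char hexdigits[] = "0123456789abcdef";
	char path[4096];
	struct dirent *de;
	DIR *dir;

	c->loaded[nr] = 1;
	snprintf(path, sizeof(path), "%s/%02x", objdir, nr);
	dir = opendir(path);
	if (!dir)
		return; /* no such directory: nothing loose with this prefix */

	while ((de = readdir(dir)) != NULL) {
		char *dst;

		/* skips ".", "..", and anything else that is not 38 chars */
		if (strlen(de->d_name) != HEX_LEN - 2)
			continue;
		if (c->nr == c->alloc) {
			c->alloc = c->alloc ? 2 * c->alloc : 64;
			c->names = realloc(c->names,
					   c->alloc * sizeof(*c->names));
			if (!c->names)
				abort();
		}
		dst = c->names[c->nr++];
		dst[0] = hexdigits[nr >> 4];
		dst[1] = hexdigits[nr & 15];
		memcpy(dst + 2, de->d_name, HEX_LEN - 1); /* 38 chars + NUL */
	}
	closedir(dir);
	if (c->nr)
		qsort(c->names, c->nr, sizeof(*c->names), cmp_name);
}

/* Returns 1 if the cache knows the object, 0 otherwise. */
int loose_cache_has(struct loose_cache *c, const char *objdir,
		    const char *hex)
{
	unsigned int nr;

	if (strlen(hex) != HEX_LEN || sscanf(hex, "%2x", &nr) != 1)
		return 0;
	if (!c->loaded[nr])
		load_subdir(c, objdir, nr); /* one readdir() pass per prefix */
	return c->nr && bsearch(hex, c->names, c->nr, sizeof(*c->names),
				cmp_name) != NULL;
}
```

A cache like this will not notice objects written after a subdirectory
was loaded, so it only suits callers that can tolerate a slightly stale
answer, which is presumably why the series limits it to the
OBJECT_INFO_QUICK lookups.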

I'm adding René Scharfe and Takuto Ikuta to the cc for their previous
work in loose-object caching.

The interesting bit is patch 8. The rest of it is cleanup to let us
treat alternates and the main object directory similarly.

  [1/9]: fsck: do not reuse child_process structs
  [2/9]: submodule--helper: prefer strip_suffix() to ends_with()
  [3/9]: rename "alternate_object_database" to "object_directory"
  [4/9]: sha1_file_name(): overwrite buffer instead of appending
  [5/9]: handle alternates paths the same as the main object dir
  [6/9]: sha1-file: use an object_directory for the main object dir
  [7/9]: object-store: provide helpers for loose_objects_cache
  [8/9]: sha1-file: use loose object cache for quick existence check
  [9/9]: fetch-pack: drop custom loose object cache

 builtin/count-objects.c     |   4 +-
 builtin/fsck.c              |  35 +++---
 builtin/grep.c              |   2 +-
 builtin/submodule--helper.c |   9 +-
 commit-graph.c              |  13 +--
 environment.c               |   4 +-
 fetch-pack.c                |  39 +------
 http-walker.c               |   2 +-
 http.c                      |   4 +-
 object-store.h              |  60 +++++------
 object.c                    |  26 ++---
 packfile.c                  |  20 ++--
 path.c                      |   2 +-
 repository.c                |   8 +-
 sha1-file.c                 | 210 ++++++++++++++++++------------------
 sha1-name.c                 |  42 ++------
 transport.c                 |   2 +-
 17 files changed, 209 insertions(+), 273 deletions(-)

-Peff

Comments

Derrick Stolee Nov. 12, 2018, 4:02 p.m. UTC | #1
On 11/12/2018 9:46 AM, Jeff King wrote:
> Here's the series I mentioned earlier in the thread to cache loose
> objects when answering has_object_file(..., OBJECT_INFO_QUICK). For
> those just joining us, this makes operations that look up a lot of
> missing objects (like "index-pack" looking for collisions) faster. This
> is mostly targeted at systems where stat() is slow, like over NFS, but
> it seems to give a 2% speedup indexing a full git.git packfile into an
> empty repository (i.e., what you'd see on a clone).
>
> I'm adding René Scharfe and Takuto Ikuta to the cc for their previous
> work in loose-object caching.
>
> The interesting bit is patch 8. The rest of it is cleanup to let us
> treat alternates and the main object directory similarly.

This cleanup is actually really valuable, and affects much more than 
this application.

I really think it is a good idea, and hope it doesn't cause too much 
trouble as the topic is cooking.

Thanks,
-Stolee

Stefan Beller Nov. 12, 2018, 7:10 p.m. UTC | #2
On Mon, Nov 12, 2018 at 8:02 AM Derrick Stolee <stolee@gmail.com> wrote:

> This cleanup is actually really valuable, and affects much more than
> this application.

I second this. I'd value this series more for the cleanup than its
application. ;-)