
[v3,5/5] unpack-objects: unpack_non_delta_entry() read data in a stream

Message ID 20211122033220.32883-6-chiyutianyi@gmail.com (mailing list archive)
State New, archived

Commit Message

Han Xin Nov. 22, 2021, 3:32 a.m. UTC
From: Han Xin <hanxin.hx@alibaba-inc.com>

We used to call "get_data()" in "unpack_non_delta_entry()" to read the
entire contents of a blob object, no matter how big it is. This
implementation may consume all the memory and cause OOM.

By implementing a zstream version of input_stream interface, we can use
a small fixed buffer for "unpack_non_delta_entry()".
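
For reference, the "input_stream" interface this builds on (added
earlier in this series) is roughly the following sketch; only the two
members used by this patch are shown:

    struct input_stream {
            /* Return the next chunk of data, or NULL at end of input. */
            const void *(*read)(struct input_stream *in_stream,
                                unsigned long *len);
            /* Implementation-specific state (here: a zstream and buffer). */
            void *data;
    };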

However, unpacking non-delta objects from a stream instead of from an
entire buffer carries about a 10% performance penalty. Therefore, only
objects larger than the "big_file_threshold" are unpacked in stream
mode. See the following benchmarks:

    $ hyperfine \
    --prepare 'rm -rf dest.git && git init --bare dest.git' \
    'git -C dest.git unpack-objects <binary_320M.pack'
    Benchmark 1: git -C dest.git unpack-objects <binary_320M.pack
      Time (mean ± σ):     10.029 s ±  0.270 s    [User: 8.265 s, System: 1.522 s]
      Range (min … max):    9.786 s … 10.603 s    10 runs

    $ hyperfine \
    --prepare 'rm -rf dest.git && git init --bare dest.git' \
    'git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack'
    Benchmark 1: git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack
      Time (mean ± σ):     10.859 s ±  0.774 s    [User: 8.813 s, System: 1.898 s]
      Range (min … max):    9.884 s … 12.192 s    10 runs

    $ hyperfine \
    --prepare 'rm -rf dest.git && git init --bare dest.git' \
    'git -C dest.git unpack-objects <binary_96M.pack'
    Benchmark 1: git -C dest.git unpack-objects <binary_96M.pack
      Time (mean ± σ):      2.678 s ±  0.037 s    [User: 2.205 s, System: 0.450 s]
      Range (min … max):    2.639 s …  2.743 s    10 runs

    $ hyperfine \
    --prepare 'rm -rf dest.git && git init --bare dest.git' \
    'git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_96M.pack'
    Benchmark 1: git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_96M.pack
      Time (mean ± σ):      2.819 s ±  0.124 s    [User: 2.216 s, System: 0.564 s]
      Range (min … max):    2.679 s …  3.125 s    10 runs

Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
---
 builtin/unpack-objects.c            | 76 ++++++++++++++++++++++++++++-
 object-file.c                       |  6 +--
 object-store.h                      |  4 ++
 t/t5590-unpack-non-delta-objects.sh | 76 +++++++++++++++++++++++++++++
 4 files changed, 158 insertions(+), 4 deletions(-)
 create mode 100755 t/t5590-unpack-non-delta-objects.sh

Comments

Derrick Stolee Nov. 29, 2021, 5:37 p.m. UTC | #1
On 11/21/2021 10:32 PM, Han Xin wrote:
> From: Han Xin <hanxin.hx@alibaba-inc.com>
> 
> We used to call "get_data()" in "unpack_non_delta_entry()" to read the
> entire contents of a blob object, no matter how big it is. This
> implementation may consume all the memory and cause OOM.
> 
> By implementing a zstream version of input_stream interface, we can use
> a small fixed buffer for "unpack_non_delta_entry()".
> 
> However, unpacking non-delta objects from a stream instead of from an
> entire buffer carries about a 10% performance penalty. Therefore, only
> objects larger than the "big_file_threshold" are unpacked in stream
> mode. See the following benchmarks:
> 
>     $ hyperfine \
>     --prepare 'rm -rf dest.git && git init --bare dest.git' \
>     'git -C dest.git unpack-objects <binary_320M.pack'
>     Benchmark 1: git -C dest.git unpack-objects <binary_320M.pack
>       Time (mean ± σ):     10.029 s ±  0.270 s    [User: 8.265 s, System: 1.522 s]
>       Range (min … max):    9.786 s … 10.603 s    10 runs
> 
>     $ hyperfine \
>     --prepare 'rm -rf dest.git && git init --bare dest.git' \
>     'git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack'
>     Benchmark 1: git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack
>       Time (mean ± σ):     10.859 s ±  0.774 s    [User: 8.813 s, System: 1.898 s]
>       Range (min … max):    9.884 s … 12.192 s    10 runs

It seems that you want us to compare this pair of results, and
hyperfine can assist with that by including multiple benchmarks
(with labels, using '-n') as follows:

$ hyperfine \
        --prepare 'rm -rf dest.git && git init --bare dest.git' \
        -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
        -n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
        -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'

Benchmark 1: old
  Time (mean ± σ):     20.835 s ±  0.058 s    [User: 14.510 s, System: 6.284 s]
  Range (min … max):   20.741 s … 20.909 s    10 runs
 
Benchmark 2: new
  Time (mean ± σ):     26.515 s ±  0.072 s    [User: 19.783 s, System: 6.696 s]
  Range (min … max):   26.419 s … 26.611 s    10 runs
 
Benchmark 3: new (small threshold)
  Time (mean ± σ):     26.523 s ±  0.101 s    [User: 19.805 s, System: 6.680 s]
  Range (min … max):   26.416 s … 26.739 s    10 runs
 
Summary
  'old' ran
    1.27 ± 0.00 times faster than 'new'
    1.27 ± 0.01 times faster than 'new (small threshold)'

(Here, 'old' is testing a compiled version of the latest 'master'
branch, while 'new' has your patches applied on top.)

Notice from this example that I used a pack with many small objects
(mostly commits and trees), and this change introduces significant
overhead in that case.

It would be nice to understand this overhead and fix it before taking
this change any further.

Thanks,
-Stolee
Han Xin Nov. 30, 2021, 1:49 p.m. UTC | #2
On Tue, Nov 30, 2021 at 1:37 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/21/2021 10:32 PM, Han Xin wrote:
> > From: Han Xin <hanxin.hx@alibaba-inc.com>
> >
> > We used to call "get_data()" in "unpack_non_delta_entry()" to read the
> > entire contents of a blob object, no matter how big it is. This
> > implementation may consume all the memory and cause OOM.
> >
> > By implementing a zstream version of input_stream interface, we can use
> > a small fixed buffer for "unpack_non_delta_entry()".
> >
> > However, unpacking non-delta objects from a stream instead of from an
> > entire buffer carries about a 10% performance penalty. Therefore, only
> > objects larger than the "big_file_threshold" are unpacked in stream
> > mode. See the following benchmarks:
> >
> >     $ hyperfine \
> >     --prepare 'rm -rf dest.git && git init --bare dest.git' \
> >     'git -C dest.git unpack-objects <binary_320M.pack'
> >     Benchmark 1: git -C dest.git unpack-objects <binary_320M.pack
> >       Time (mean ± σ):     10.029 s ±  0.270 s    [User: 8.265 s, System: 1.522 s]
> >       Range (min … max):    9.786 s … 10.603 s    10 runs
> >
> >     $ hyperfine \
> >     --prepare 'rm -rf dest.git && git init --bare dest.git' \
> >     'git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack'
> >     Benchmark 1: git -c core.bigFileThreshold=2m -C dest.git unpack-objects <binary_320M.pack
> >       Time (mean ± σ):     10.859 s ±  0.774 s    [User: 8.813 s, System: 1.898 s]
> >       Range (min … max):    9.884 s … 12.192 s    10 runs
>
> It seems that you want us to compare this pair of results, and
> hyperfine can assist with that by including multiple benchmarks
> (with labels, using '-n') as follows:
>
> $ hyperfine \
>         --prepare 'rm -rf dest.git && git init --bare dest.git' \
>         -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
>         -n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
>         -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'
>
> Benchmark 1: old
>   Time (mean ± σ):     20.835 s ±  0.058 s    [User: 14.510 s, System: 6.284 s]
>   Range (min … max):   20.741 s … 20.909 s    10 runs
>
> Benchmark 2: new
>   Time (mean ± σ):     26.515 s ±  0.072 s    [User: 19.783 s, System: 6.696 s]
>   Range (min … max):   26.419 s … 26.611 s    10 runs
>
> Benchmark 3: new (small threshold)
>   Time (mean ± σ):     26.523 s ±  0.101 s    [User: 19.805 s, System: 6.680 s]
>   Range (min … max):   26.416 s … 26.739 s    10 runs
>
> Summary
>   'old' ran
>     1.27 ± 0.00 times faster than 'new'
>     1.27 ± 0.01 times faster than 'new (small threshold)'
>
> (Here, 'old' is testing a compiled version of the latest 'master'
> branch, while 'new' has your patches applied on top.)
>
> Notice from this example that I used a pack with many small objects
> (mostly commits and trees), and this change introduces significant
> overhead in that case.
>
> It would be nice to understand this overhead and fix it before taking
> this change any further.
>
> Thanks,
> -Stolee

Can you show me the specific details of the repository you tested, so
that I can analyze it further?

I tested this repository, but did not run into the problem:

 Unpacking objects: 100% (18345/18345), 43.15 MiB

hyperfine \
        --prepare 'rm -rf dest.git && git init --bare dest.git' \
        -n 'old' 'git -C dest.git unpack-objects <big.pack' \
        -n 'new' 'new/git -C dest.git unpack-objects <big.pack' \
        -n 'new (small threshold)' 'new/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'
Benchmark 1: old
  Time (mean ± σ):     17.403 s ±  0.880 s    [User: 4.996 s, System: 11.803 s]
  Range (min … max):   15.911 s … 19.368 s    10 runs

Benchmark 2: new
  Time (mean ± σ):     17.788 s ±  0.199 s    [User: 5.054 s, System: 12.257 s]
  Range (min … max):   17.420 s … 18.195 s    10 runs

Benchmark 3: new (small threshold)
  Time (mean ± σ):     18.433 s ±  0.711 s    [User: 4.982 s, System: 12.338 s]
  Range (min … max):   17.518 s … 19.775 s    10 runs

Summary
  'old' ran
    1.02 ± 0.05 times faster than 'new'
    1.06 ± 0.07 times faster than 'new (small threshold)'

Thanks,
- Han Xin
Derrick Stolee Nov. 30, 2021, 6:38 p.m. UTC | #3
On 11/30/2021 8:49 AM, Han Xin wrote:
> On Tue, Nov 30, 2021 at 1:37 AM Derrick Stolee <stolee@gmail.com> wrote:
>> $ hyperfine \
>>         --prepare 'rm -rf dest.git && git init --bare dest.git' \
>>         -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
>>         -n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
>>         -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'
>>
>> Benchmark 1: old
>>   Time (mean ± σ):     20.835 s ±  0.058 s    [User: 14.510 s, System: 6.284 s]
>>   Range (min … max):   20.741 s … 20.909 s    10 runs
>>
>> Benchmark 2: new
>>   Time (mean ± σ):     26.515 s ±  0.072 s    [User: 19.783 s, System: 6.696 s]
>>   Range (min … max):   26.419 s … 26.611 s    10 runs
>>
>> Benchmark 3: new (small threshold)
>>   Time (mean ± σ):     26.523 s ±  0.101 s    [User: 19.805 s, System: 6.680 s]
>>   Range (min … max):   26.416 s … 26.739 s    10 runs
>>
>> Summary
>>   'old' ran
>>     1.27 ± 0.00 times faster than 'new'
>>     1.27 ± 0.01 times faster than 'new (small threshold)'
>>
>> (Here, 'old' is testing a compiled version of the latest 'master'
>> branch, while 'new' has your patches applied on top.)
>>
>> Notice from this example that I used a pack with many small objects
>> (mostly commits and trees), and this change introduces significant
>> overhead in that case.
>>
>> It would be nice to understand this overhead and fix it before taking
>> this change any further.
>>
>> Thanks,
>> -Stolee
> 
> Can you show me the specific details of the repository you tested, so
> that I can analyze it further?

I used a pack-file from an internal repo. It happened to be using
partial clone, so here is a repro with the git/git repository
after cloning this way:

$ git clone --no-checkout --filter=blob:none https://github.com/git/git

(copy the large .pack from git/.git/objects/pack/ to big.pack)
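
For example, assuming the partial clone produced a single large
pack-file, something like:

    $ cp git/.git/objects/pack/pack-*.pack big.pack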

$ hyperfine \
	--prepare 'rm -rf dest.git && git init --bare dest.git' \
	-n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
	-n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
	-n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'

Benchmark 1: old
  Time (mean ± σ):     82.748 s ±  0.445 s    [User: 50.512 s, System: 32.049 s]
  Range (min … max):   82.042 s … 83.587 s    10 runs
 
Benchmark 2: new
  Time (mean ± σ):     101.644 s ±  0.524 s    [User: 67.470 s, System: 34.047 s]
  Range (min … max):   100.866 s … 102.633 s    10 runs
 
Benchmark 3: new (small threshold)
  Time (mean ± σ):     101.093 s ±  0.269 s    [User: 67.404 s, System: 33.559 s]
  Range (min … max):   100.639 s … 101.375 s    10 runs
 
Summary
  'old' ran
    1.22 ± 0.01 times faster than 'new (small threshold)'
    1.23 ± 0.01 times faster than 'new'

I'm also able to repro this with a smaller repo (microsoft/scalar)
so the tests complete much faster:

$ hyperfine \
        --prepare 'rm -rf dest.git && git init --bare dest.git' \
        -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <small.pack' \
        -n 'new' '~/_git/git/git -C dest.git unpack-objects <small.pack' \
        -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <small.pack'

Benchmark 1: old
  Time (mean ± σ):      3.295 s ±  0.023 s    [User: 1.063 s, System: 2.228 s]
  Range (min … max):    3.269 s …  3.351 s    10 runs
 
Benchmark 2: new
  Time (mean ± σ):      3.592 s ±  0.105 s    [User: 1.261 s, System: 2.328 s]
  Range (min … max):    3.378 s …  3.679 s    10 runs
 
Benchmark 3: new (small threshold)
  Time (mean ± σ):      3.584 s ±  0.144 s    [User: 1.241 s, System: 2.339 s]
  Range (min … max):    3.359 s …  3.747 s    10 runs
 
Summary
  'old' ran
    1.09 ± 0.04 times faster than 'new (small threshold)'
    1.09 ± 0.03 times faster than 'new'

It's not the same relative overhead, but still significant.

These pack-files contain (mostly) small objects, no large blobs.
I know that's not the target of your efforts, but it would be
good to avoid a regression here.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason Dec. 1, 2021, 8:37 p.m. UTC | #4
I hadn't sent a shameless plug for my "git hyperfine" script to the
list; perhaps this is a good time. It's just a thin shellscript wrapper
around "hyperfine" that I wrote the other day, which...

On Tue, Nov 30 2021, Derrick Stolee wrote:

> [...]
> I used a pack-file from an internal repo. It happened to be using
> partial clone, so here is a repro with the git/git repository
> after cloning this way:
>
> $ git clone --no-checkout --filter=blob:none https://github.com/git/git
>
> (copy the large .pack from git/.git/objects/pack/ to big.pack)
>
> $ hyperfine \
> 	--prepare 'rm -rf dest.git && git init --bare dest.git' \
> 	-n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
> 	-n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
> 	-n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'
>
> Benchmark 1: old
>   Time (mean ± σ):     82.748 s ±  0.445 s    [User: 50.512 s, System: 32.049 s]
>   Range (min … max):   82.042 s … 83.587 s    10 runs
>  
> Benchmark 2: new
>   Time (mean ± σ):     101.644 s ±  0.524 s    [User: 67.470 s, System: 34.047 s]
>   Range (min … max):   100.866 s … 102.633 s    10 runs
>  
> Benchmark 3: new (small threshold)
>   Time (mean ± σ):     101.093 s ±  0.269 s    [User: 67.404 s, System: 33.559 s]
>   Range (min … max):   100.639 s … 101.375 s    10 runs
>  
> Summary
>   'old' ran
>     1.22 ± 0.01 times faster than 'new (small threshold)'
>     1.23 ± 0.01 times faster than 'new'

...adds enough sugar around "hyperfine" itself to do this as e.g. (the
"-s" is a feature I submitted to hyperfine itself; it's not in a release
yet[1], but in this case you could also use "-p"):

    git hyperfine -L rev v2.20.0,origin/master \
        -s 'if ! test -d redis.git; then git clone --bare --filter=blob:none https://github.com/redis/redis; fi && make' \
        -p 'rm -rf dest.git; git init --bare dest.git' \
        './git -C dest.git unpack-objects <$(echo redis.git/objects/pack/*.pack)'

The sugar is that for each named "rev" parameter it'll set up a "git
worktree" for you, so under the hood each of those runs is chdir-ed to
the respective revision:
    
    $ git worktree list
    [...]
    /run/user/1001/git-hyperfine/origin/master  abe6bb39053 (detached HEAD)
    /run/user/1001/git-hyperfine/v2.33.0        225bc32a989 (detached HEAD)

That they're named revisions and not git-rev-parse'd is intentional,
since you'll benefit from faster incremental "make" (even if using
"ccache"). I'm typically benchmarking HEAD~1,HEAD~0.

The output will then use those "rev" parameters, and be e.g.:
    
    Benchmark 1: ./git -C dest.git unpack-objects <$(echo redis.git/objects/pack/*.pack)' in 'v2.20.0
      Time (mean ± σ):      6.678 s ±  0.046 s    [User: 4.525 s, System: 2.117 s]
      Range (min … max):    6.619 s …  6.765 s    10 runs
     
    Benchmark 2: ./git -C dest.git unpack-objects <$(echo redis.git/objects/pack/*.pack)' in 'origin/master
      Time (mean ± σ):      6.756 s ±  0.074 s    [User: 4.586 s, System: 2.134 s]
      Range (min … max):    6.691 s …  6.941 s    10 runs
     
    Summary
      './git -C dest.git unpack-objects <$(echo redis.git/objects/pack/*.pack)' in 'v2.20.0' ran
        1.01 ± 0.01 times faster than './git -C dest.git unpack-objects <$(echo redis.git/objects/pack/*.pack)' in 'origin/master'

I think if you're routinely benchmarking N different git versions
you'll find it handy. It also has configurable hook support (using git
config), so e.g. it's easy to copy your config.mak in place in the
worktrees. E.g. my config is:

    $ git -P config --get-regexp '^hyperfine'
    hyperfine.run-dir $XDG_RUNTIME_DIR/git-hyperfine
    hyperfine.xargs-options -r
    hyperfine.hook.setup ~/g/git.meta/config.mak.sh

It's hosted at https://github.com/avar/git-hyperfine/ and
https://gitlab.com/avar/git-hyperfine/; It's implemented in (portable)
POSIX shell script.

There are surely some bugs in it; one known one is that, unlike
hyperfine, it doesn't accept spaces in the parameters to -L, because
I'm screwing up some quoting-within-quoting in the (shellscript)
implementation (suggestions for that particular one are most welcome).

I hacked it up after this suggestion from Jeff King[2] of moving t/perf
over to it.

I haven't done any of that legwork, but I think a wrapper like
"git-hyperfine" that prepares worktrees for the N revisions we're
benchmarking is a good direction to go in.

We don't use git-worktrees in t/perf, but probably could for most/all
tests. In any case it would be easy to have the script set up the revs
to be benchmarked in some hookable custom manner, to have it do exactly
what t/perf/run is doing now.

1. https://github.com/sharkdp/hyperfine/commit/017d55a
2. https://lore.kernel.org/git/YV+zFqi4VmBVJYex@coredump.intra.peff.net/
Han Xin Dec. 2, 2021, 7:33 a.m. UTC | #5
On Wed, Dec 1, 2021 at 2:38 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> I used a pack-file from an internal repo. It happened to be using
> partial clone, so here is a repro with the git/git repository
> after cloning this way:
>
> $ git clone --no-checkout --filter=blob:none https://github.com/git/git
>
> (copy the large .pack from git/.git/objects/pack/ to big.pack)
>
> $ hyperfine \
>         --prepare 'rm -rf dest.git && git init --bare dest.git' \
>         -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <big.pack' \
>         -n 'new' '~/_git/git/git -C dest.git unpack-objects <big.pack' \
>         -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <big.pack'
>
> Benchmark 1: old
>   Time (mean ± σ):     82.748 s ±  0.445 s    [User: 50.512 s, System: 32.049 s]
>   Range (min … max):   82.042 s … 83.587 s    10 runs
>
> Benchmark 2: new
>   Time (mean ± σ):     101.644 s ±  0.524 s    [User: 67.470 s, System: 34.047 s]
>   Range (min … max):   100.866 s … 102.633 s    10 runs
>
> Benchmark 3: new (small threshold)
>   Time (mean ± σ):     101.093 s ±  0.269 s    [User: 67.404 s, System: 33.559 s]
>   Range (min … max):   100.639 s … 101.375 s    10 runs
>
> Summary
>   'old' ran
>     1.22 ± 0.01 times faster than 'new (small threshold)'
>     1.23 ± 0.01 times faster than 'new'
>
> I'm also able to repro this with a smaller repo (microsoft/scalar)
> so the tests complete much faster:
>
> $ hyperfine \
>         --prepare 'rm -rf dest.git && git init --bare dest.git' \
>         -n 'old' '~/_git/git-upstream/git -C dest.git unpack-objects <small.pack' \
>         -n 'new' '~/_git/git/git -C dest.git unpack-objects <small.pack' \
>         -n 'new (small threshold)' '~/_git/git/git -c core.bigfilethreshold=64k -C dest.git unpack-objects <small.pack'
>
> Benchmark 1: old
>   Time (mean ± σ):      3.295 s ±  0.023 s    [User: 1.063 s, System: 2.228 s]
>   Range (min … max):    3.269 s …  3.351 s    10 runs
>
> Benchmark 2: new
>   Time (mean ± σ):      3.592 s ±  0.105 s    [User: 1.261 s, System: 2.328 s]
>   Range (min … max):    3.378 s …  3.679 s    10 runs
>
> Benchmark 3: new (small threshold)
>   Time (mean ± σ):      3.584 s ±  0.144 s    [User: 1.241 s, System: 2.339 s]
>   Range (min … max):    3.359 s …  3.747 s    10 runs
>
> Summary
>   'old' ran
>     1.09 ± 0.04 times faster than 'new (small threshold)'
>     1.09 ± 0.03 times faster than 'new'
>
> It's not the same relative overhead, but still significant.
>
> These pack-files contain (mostly) small objects, no large blobs.
> I know that's not the target of your efforts, but it would be
> good to avoid a regression here.
>
> Thanks,
> -Stolee

With your help, I did catch this performance problem, which was
introduced in this patch:
https://lore.kernel.org/git/20211122033220.32883-4-chiyutianyi@gmail.com/

That patch changes the original whole-buffer reading into stream
reading, but the problem is that even when the whole object data is
read at once, it still incurs an additional git_deflate() pass and the
associated data transfer.
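
A rough sketch of the shape of the problem (illustrative only, not the
actual code): before the refactoring the whole object buffer could be
handed to zlib in one go, while the stream version stages every chunk
through the read callback first:

    /* Before: deflate straight from the full object buffer. */
    stream.next_in = (unsigned char *)buf;
    stream.avail_in = len;
    while (git_deflate(&stream, Z_FINISH) == Z_OK)
            ; /* write out the compressed bytes */

    /* After: an extra callback and hand-off for every chunk. */
    while ((in = in_stream->read(in_stream, &chunk_len)) != NULL) {
            stream.next_in = (unsigned char *)in;
            stream.avail_in = chunk_len;
            /* deflate this chunk, write out the compressed bytes */
    }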

I will fix it in a follow-up patch.

Thanks,
-Han Xin
Derrick Stolee Dec. 2, 2021, 1:53 p.m. UTC | #6
On 12/2/2021 2:33 AM, Han Xin wrote:
> On Wed, Dec 1, 2021 at 2:38 AM Derrick Stolee <stolee@gmail.com> wrote:
>> These pack-files contain (mostly) small objects, no large blobs.
>> I know that's not the target of your efforts, but it would be
>> good to avoid a regression here.
>>
>> Thanks,
>> -Stolee
> 
> With your help, I did catch this performance problem, which was
> introduced in this patch:
> https://lore.kernel.org/git/20211122033220.32883-4-chiyutianyi@gmail.com/
> 
> That patch changes the original whole-buffer reading into stream
> reading, but the problem is that even when the whole object data is
> read at once, it still incurs an additional git_deflate() pass and the
> associated data transfer.

I'm glad you found it!

> I will fix it in a follow-up patch.

Looking forward to it.

Thanks,
-Stolee

Patch

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 8d68acd662..bfc254a236 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -326,11 +326,85 @@  static void added_object(unsigned nr, enum object_type type,
 	}
 }
 
+struct input_zstream_data {
+	git_zstream *zstream;
+	unsigned char buf[4096];
+	int status;
+};
+
+static const void *feed_input_zstream(struct input_stream *in_stream, unsigned long *readlen)
+{
+	struct input_zstream_data *data = in_stream->data;
+	git_zstream *zstream = data->zstream;
+	void *in = fill(1);
+
+	if (!len || data->status == Z_STREAM_END) {
+		*readlen = 0;
+		return NULL;
+	}
+
+	zstream->next_out = data->buf;
+	zstream->avail_out = sizeof(data->buf);
+	zstream->next_in = in;
+	zstream->avail_in = len;
+
+	data->status = git_inflate(zstream, 0);
+	use(len - zstream->avail_in);
+	*readlen = sizeof(data->buf) - zstream->avail_out;
+
+	return data->buf;
+}
+
+static void write_stream_blob(unsigned nr, unsigned long size)
+{
+	char hdr[32];
+	int hdrlen;
+	git_zstream zstream;
+	struct input_zstream_data data;
+	struct input_stream in_stream = {
+		.read = feed_input_zstream,
+		.data = &data,
+	};
+	struct object_id *oid = &obj_list[nr].oid;
+	int ret;
+
+	memset(&zstream, 0, sizeof(zstream));
+	memset(&data, 0, sizeof(data));
+	data.zstream = &zstream;
+	git_inflate_init(&zstream);
+
+	/* Generate the header */
+	hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %"PRIuMAX, type_name(OBJ_BLOB), (uintmax_t)size) + 1;
+
+	if ((ret = write_loose_object(oid, hdr, hdrlen, &in_stream, 0, 0)))
+		die(_("failed to write object in stream %d"), ret);
+
+	if (zstream.total_out != size || data.status != Z_STREAM_END)
+		die(_("inflate returned %d"), data.status);
+	git_inflate_end(&zstream);
+
+	if (strict && !dry_run) {
+		struct blob *blob = lookup_blob(the_repository, oid);
+		if (blob)
+			blob->object.flags |= FLAG_WRITTEN;
+		else
+			die("invalid blob object from stream");
+	}
+	obj_list[nr].obj = NULL;
+}
+
 static void unpack_non_delta_entry(enum object_type type, unsigned long size,
 				   unsigned nr)
 {
-	void *buf = get_data(size, dry_run);
+	void *buf;
+
+	/* Write large blob in stream without allocating full buffer. */
+	if (!dry_run && type == OBJ_BLOB && size > big_file_threshold) {
+		write_stream_blob(nr, size);
+		return;
+	}
 
+	buf = get_data(size, dry_run);
 	if (!dry_run && buf)
 		write_object(nr, type, buf, size);
 	else
diff --git a/object-file.c b/object-file.c
index 93bcfaca50..bd7631f7ef 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1878,9 +1878,9 @@  static const void *feed_simple_input_stream(struct input_stream *in_stream, unsi
 	return data->buf;
 }
 
-static int write_loose_object(const struct object_id *oid, char *hdr,
-			      int hdrlen, struct input_stream *in_stream,
-			      time_t mtime, unsigned flags)
+int write_loose_object(const struct object_id *oid, char *hdr,
+		       int hdrlen, struct input_stream *in_stream,
+		       time_t mtime, unsigned flags)
 {
 	int fd, ret;
 	unsigned char compressed[4096];
diff --git a/object-store.h b/object-store.h
index ccc1fc9c1a..cbd95c47e2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -228,6 +228,10 @@  int hash_object_file(const struct git_hash_algo *algo, const void *buf,
 		     unsigned long len, const char *type,
 		     struct object_id *oid);
 
+int write_loose_object(const struct object_id *oid, char *hdr,
+		       int hdrlen, struct input_stream *in_stream,
+		       time_t mtime, unsigned flags);
+
 int write_object_file_flags(const void *buf, unsigned long len,
 			    const char *type, struct object_id *oid,
 			    unsigned flags);
diff --git a/t/t5590-unpack-non-delta-objects.sh b/t/t5590-unpack-non-delta-objects.sh
new file mode 100755
index 0000000000..01d950d119
--- /dev/null
+++ b/t/t5590-unpack-non-delta-objects.sh
@@ -0,0 +1,76 @@ 
+#!/bin/sh
+#
+# Copyright (c) 2021 Han Xin
+#
+
+test_description='Test unpack-objects when receive pack'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success "create commit with big blobs (1.5 MB)" '
+	test-tool genrandom foo 1500000 >big-blob &&
+	test_commit --append foo big-blob &&
+	test-tool genrandom bar 1500000 >big-blob &&
+	test_commit --append bar big-blob &&
+	(
+		cd .git &&
+		find objects/?? -type f | sort
+	) >expect &&
+	PACK=$(echo main | git pack-objects --progress --revs test)
+'
+
+test_expect_success 'setup GIT_ALLOC_LIMIT to 1MB' '
+	GIT_ALLOC_LIMIT=1m &&
+	export GIT_ALLOC_LIMIT
+'
+
+test_expect_success 'prepare dest repository' '
+	git init --bare dest.git &&
+	git -C dest.git config core.bigFileThreshold 2m &&
+	git -C dest.git config receive.unpacklimit 100
+'
+
+test_expect_success 'fail to unpack-objects: cannot allocate' '
+	test_must_fail git -C dest.git unpack-objects <test-$PACK.pack 2>err &&
+	test_i18ngrep "fatal: attempting to allocate" err &&
+	(
+		cd dest.git &&
+		find objects/?? -type f | sort
+	) >actual &&
+	! test_cmp expect actual
+'
+
+test_expect_success 'set a lower bigfile threshold' '
+	git -C dest.git config core.bigFileThreshold 1m
+'
+
+test_expect_success 'unpack big object in stream' '
+	git -C dest.git unpack-objects <test-$PACK.pack &&
+	git -C dest.git fsck &&
+	(
+		cd dest.git &&
+		find objects/?? -type f | sort
+	) >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup for unpack-objects dry-run test' '
+	git init --bare unpack-test.git
+'
+
+test_expect_success 'unpack-objects dry-run' '
+	(
+		cd unpack-test.git &&
+		git unpack-objects -n <../test-$PACK.pack
+	) &&
+	(
+		cd unpack-test.git &&
+		find objects/ -type f
+	) >actual &&
+	test_must_be_empty actual
+'
+
+test_done