Message ID | pull.872.git.1612897624121.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | doc: mention bigFileThreshold for packing |
"Christian Walther via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Christian Walther <cwalther@gmx.ch>
>
> Knowing about the core.bigFileThreshold configuration variable is
> helpful when examining pack file size differences between repositories.
> Add a reference to it to the manpages a user is likely to read in this
> situation.

Thanks.

I doubt that the description of --window/--depth command line
options, for both repack and pack-objects, is the best place to add
this "Note". Even if we were to add it as an appendix to these
places, please do not break the flow of explanation by inserting it
before the description of the default values of these options.

> I recently spent a lot of time trying to figure out why git repack would
> create huge packs on some clones of my repository and small ones on
> others, until I found out about the existence of the
> core.bigFileThreshold configuration variable, which happened to be set
> on some and not on others. It would have saved me a lot of time if that
> variable had been mentioned in the relevant manpages that I was reading,
> git-repack and git-pack-objects. So this patch adds that.

Not related to the contents of the patch, but I am somewhat curious
to know what configuration resulted in the "huge" ones and "small"
ones. Documentation/config/core.txt::core.bigFileThreshold may be
helped by addition of a success story, and the configuration for the
"small" ones may be a good place to start.

Thanks
Junio C Hamano wrote:

> I doubt that the description of --window/--depth command line
> options, for both repack and pack-objects, is the best place to add
> this "Note". Even if we were to add it as an appendix to these
> places, please do not break the flow of explanation by inserting it
> before the description of the default values of these options.

OK. That was where I would have looked for it, because it explains why
--window wasn't effective in my attempts to get better compression, but
I don't insist on it - any place would have worked, as I read both
manpages back and forth several times.

In git-repack.txt, there is a "Configuration" section at the bottom, I
guess it would fit there? There is none in git-pack-objects.txt, but I
could add it. What do you think?

>> I recently spent a lot of time trying to figure out why git repack would
>> create huge packs on some clones of my repository and small ones on
>> others
>
> Not related to the contents of the patch, but I am somewhat curious
> to know what configuration resulted in the "huge" ones and "small"
> ones. Documentation/config/core.txt::core.bigFileThreshold may be
> helped by addition of a success story, and the configuration for the
> "small" ones may be a good place to start.

The "huge" repository had bigFileThreshold = 1m. That was set by SubGit
when converting from Subversion, for reasons unknown to me (see some
discussion at https://support.tmatesoft.com/t/reduce-repository-size/2551
and https://issues.tmatesoft.com/issue/SGT-604). The result is a pack
file of about 3 GB. The "small" repository has it unset, so the default
512m applies, resulting in a pack file of about 50 MB.

What causes the huge difference is that the repository contains a
"changelog" file that changes in almost every commit and has grown to
2.4 MB over 10000 commits. So it exists in about that many different
versions, of which about 6000 are larger than 1 MB, but they only differ
from each other by successive addition of small pieces.

I'm not sure if that makes for a good success story. 1m seems a rather
extreme value to me. If you think so, I can try to come up with
something.

Thanks

Christian
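The numbers in the message above can be sanity-checked with a toy model. The sketch below assumes the changelog grew linearly to 2,400,000 bytes over 10000 commits (an assumption, since the real growth curve is not given) and that "1m" means 1024 * 1024 bytes. The function name `versions_over_threshold` is hypothetical and only serves this illustration.

```python
# Toy model only: assumes linear growth of the changelog file, which the
# thread does not state. Sizes are in bytes.

def versions_over_threshold(n_commits, final_size, threshold):
    """Return (count, total_bytes) of file versions larger than threshold,
    assuming version i has size final_size * i / n_commits."""
    count = 0
    total = 0
    for i in range(1, n_commits + 1):
        size = final_size * i // n_commits
        if size > threshold:
            count += 1
            total += size
    return count, total


# Assumed inputs: 10000 commits, final size 2.4 MB, threshold "1m" = 1 MiB.
count, total = versions_over_threshold(10_000, 2_400_000, 1024 * 1024)
print(f"{count} versions above the threshold, {total / 1e9:.1f} GB uncompressed")
```

Under these assumptions, a bit more than half of the versions exceed the threshold, which is in the same range as the "about 6000" reported, and their combined uncompressed size is just under 10 GB. Since none of them would be stored as a delta against a neighbouring version, a pack measured in gigabytes is plausible, although the model says nothing about how well each version compresses on its own.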
Christian Walther <cwalther@gmx.ch> writes:

> Junio C Hamano wrote:
>
>> I doubt that the description of --window/--depth command line
>> options, for both repack and pack-objects, is the best place to add
>> this "Note". Even if we were to add it as an appendix to these
>> places, please do not break the flow of explanation by inserting it
>> before the description of the default values of these options.
>
> OK. That was where I would have looked for it, because it explains
> why --window wasn't effective in my attempts to get better
> compression, but I don't insist on it - any place would have
> worked, as I read both manpages back and forth several times.

The "pack-objects" command (and to some degree "repack", too) is
about packing throughout, and --depth/--window is not necessarily
the central piece of the puzzle, and that, together with disruption
of the flow of the original explanation, was the reason why I found
the initial location a bit odd.

> In git-repack.txt, there is a "Configuration" section at the
> bottom, I guess it would fit there? There is none in
> git-pack-objects.txt, but I could add it. What do you think?

You're right---if there is an existing CONFIGURATION section, that
may be a much better place.

There are configuration variables that affect how the packing works
other than the core.bigFileThreshold, and attributes like "delta"
would also affect the outcome. Describing all in one CONFIGURATION
section would be valuable.

What I queued is with the following ready to be squashed in,
primarily because I was lazy and didn't have time/inclination to
look for a better place myself ;-)

Thanks.

---- >8 ----
Subject: [PATCH] fixup! doc: mention bigFileThreshold for packing

---
 Documentation/git-pack-objects.txt | 7 +++----
 Documentation/git-repack.txt       | 7 +++----
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 59150ded4b..be0f953c35 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -97,12 +97,11 @@ base-name::
 	side, because delta data needs to be applied that many
 	times to get to the necessary object.
 +
-Note that delta compression is never used on objects larger than the
-`core.bigFileThreshold` configuration variable (see
-linkgit:git-config[1]).
-+
 The default value for --window is 10 and --depth is 50. The maximum
 depth is 4095.
++
+Note that delta compression is never used on objects larger than the
+`core.bigFileThreshold` configuration variable (see linkgit:git-config[1]).
 
 --window-memory=<n>::
 	This option provides an additional limit on top of `--window`;
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 0a7038ec4a..145fff6e01 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -96,12 +96,11 @@ to the new separate pack will be written.
 	affects the performance on the unpacker side, because delta data needs
 	to be applied that many times to get to the necessary object.
 +
-Note that delta compression is never used on objects larger than the
-`core.bigFileThreshold` configuration variable (see
-linkgit:git-config[1]).
-+
 The default value for --window is 10 and --depth is 50. The maximum
 depth is 4095.
++
+Note that delta compression is never used on objects larger than the
+`core.bigFileThreshold` configuration variable (see linkgit:git-config[1]).
 
 --threads=<n>::
 	This option is passed through to `git pack-objects`.
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 54d715ead137..59150ded4bef 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -97,6 +97,10 @@ base-name::
 	side, because delta data needs to be applied that many
 	times to get to the necessary object.
 +
+Note that delta compression is never used on objects larger than the
+`core.bigFileThreshold` configuration variable (see
+linkgit:git-config[1]).
++
 The default value for --window is 10 and --depth is 50. The maximum
 depth is 4095.
 
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 92f146d27dc3..0a7038ec4ad8 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -96,6 +96,10 @@ to the new separate pack will be written.
 	affects the performance on the unpacker side, because delta data needs
 	to be applied that many times to get to the necessary object.
 +
+Note that delta compression is never used on objects larger than the
+`core.bigFileThreshold` configuration variable (see
+linkgit:git-config[1]).
++
 The default value for --window is 10 and --depth is 50. The maximum
 depth is 4095.
 
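The note added by the patch can be read as a simple size check. The following is a minimal sketch of that reading, not git's actual implementation: `parse_size` and `delta_attempted` are hypothetical helper names, the `k`/`m`/`g` suffixes are taken to scale by powers of 1024, and the 512m default is the one quoted in the thread.

```python
# Sketch only: mirrors the documented rule ("never used on objects larger
# than core.bigFileThreshold"), not git's source code.

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

DEFAULT_BIG_FILE_THRESHOLD = "512m"  # default value quoted in the thread


def parse_size(value):
    """Parse a git-style size such as '1m' or '512m' into bytes."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)


def delta_attempted(object_size, threshold=DEFAULT_BIG_FILE_THRESHOLD):
    """Return True if an object of this size is not above the threshold,
    i.e. delta compression may still be considered for it."""
    return object_size <= parse_size(threshold)


# A 2.4 MB changelog version, as described in the thread:
print(delta_attempted(2_400_000, "1m"))  # threshold set by the conversion tool
print(delta_attempted(2_400_000))        # threshold left at the default
```

With the threshold at 1m, a 2.4 MB blob falls above the limit and is excluded from delta compression regardless of `--window` or `--depth`. With the default it stays eligible, which matches the difference between the two repositories described earlier in the thread.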