diff mbox series

doc: warn people against --max-pack-size

Message ID YL8bQMOAHxApjVVe@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit b272fddb225c4864ed2967188f5b83eb1662d9ad
Headers show
Series doc: warn people against --max-pack-size | expand

Commit Message

Jeff King June 8, 2021, 7:24 a.m. UTC
On Tue, Jun 08, 2021 at 04:04:23PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Mon, Jun 07, 2021 at 01:42:47PM +0700, Bagas Sanjaya wrote:
> >
> >> I would like to create packfiles with charm-numbered size (that is for
> >> example use 49.99M instead of 50M) with git-repack:
> >> 
> >> $ git repack --max-pack-size=49.99M -a -d
> >
> > The parser for numbers with units is shared by many options and config
> > variables. In general, I'm not really opposed to allowing floating point
> > values which get rounded to the nearest byte for any of them. So no
> > objection for the general feature if somebody wants to implement it.
> >
> > But I would note that --max-pack-size is almost never a good idea in
> > general. I don't know what you think it's accomplishing, but it is
> > probably making your repository larger than it needs to be, as well as
> > less performant.
> 
> Perhaps a doc update is in order?  It might have been cute and
> superficially useful to be able to cut packfiles in 650M chunks to
> fit on a CD-ROM, but I suspect that it would be more useful to feed
> a single large file to a generic multi-volume archive tool and let
> it split it to fit the physical volume the tool deals with.

Yeah. Let's do that while we're thinking about it. Here's what I came up
with.

-- >8 --
Subject: [PATCH] doc: warn people against --max-pack-size

This option is almost never a good idea, as the resulting repository is
larger and slower (see the new explanations in the docs).

I outlined the potential problems. We could go further and make the
option harder to find (or at least, make the command-line option
descriptions a much more terse "you probably don't want this; see
pack.packsizeLimit for details"). But this seems like a minimal change
that may prevent people from thinking it's more useful than it is.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/config/pack.txt      | 23 +++++++++++++++++------
 Documentation/git-pack-objects.txt |  6 +++---
 Documentation/git-repack.txt       |  4 +++-
 3 files changed, 23 insertions(+), 10 deletions(-)

Comments

Philip Oakley June 17, 2021, 5:02 p.m. UTC | #1
On 08/06/2021 08:24, Jeff King wrote:
> +The minimum size allowed is limited to 1 MiB. The default is unlimited.
> +Common unit suffixes of 'k', 'm', or 'g' are supported.

Do we want to include the workaround of scaling in kibibytes (as
originally mentioned by Bagas) for the default as 1024k? This also
avoids the easy mistake that the size is in multiples MiB.

Philip
Jeff King June 18, 2021, 1:26 p.m. UTC | #2
On Thu, Jun 17, 2021 at 06:02:27PM +0100, Philip Oakley wrote:

> On 08/06/2021 08:24, Jeff King wrote:
> > +The minimum size allowed is limited to 1 MiB. The default is unlimited.
> > +Common unit suffixes of 'k', 'm', or 'g' are supported.
> 
> Do we want to include the workaround of scaling in kibibytes (as
> originally mentioned by Bagas) for the default as 1024k? This also
> avoids the easy mistake that the size is in multiples MiB.

I'm not quite sure what you're asking. If you mean: should we tell
people that they can't use "4.9m" and should instead use "5017k"
instead, then I don't have a strong opinion.

It might help some people. But OTOH it's not clear to me that this is a
common question, so it might clutter up the documentation. Either way,
it's orthogonal to the patch in question, and should come on top if
somebody cares to work on it.

I'd also be fine with somebody actually implementing fractional unit
support (it would probably go into git_parse_signed() and
git_parse_unsigned()). It doesn't seem worth the effort to me, but if
somebody feels strongly enough to implement it cleanly, I wouldn't say
no. :)

-Peff
Philip Oakley June 18, 2021, 3:15 p.m. UTC | #3
On 18/06/2021 14:26, Jeff King wrote:
> On Thu, Jun 17, 2021 at 06:02:27PM +0100, Philip Oakley wrote:
>
>> On 08/06/2021 08:24, Jeff King wrote:
>>> +The minimum size allowed is limited to 1 MiB. The default is unlimited.
>>> +Common unit suffixes of 'k', 'm', or 'g' are supported.
>> Do we want to include the workaround of scaling in kibibytes (as
>> originally mentioned by Bagas) for the default as 1024k? This also
>> avoids the easy mistake that the size is in multiples MiB.
> I'm not quite sure what you're asking. If you mean: should we tell
> people that they can't use "4.9m" and should instead use "5017k"
> instead, then I don't have a strong opinion.

Sorry, I should have included an example based on the patch

> +The minimum size allowed is limited to 1 MiB. The default is unlimited.
> +Common unit suffixes of 'k', 'm', or 'g' are supported.

e.g.

+The default is unlimited. Common unit suffixes of 'k', 'm', or 'g' are
+supported. The minimum size allowed is limited to 1 MiB (`1024k`). 

I swapped the sentence order to allow the scaled example of the minimum
to be after the explanation of the suffixes
>
> It might help some people. But OTOH it's not clear to me that this is a
> common question, so it might clutter up the documentation. Either way,
> it's orthogonal to the patch in question, and should come on top if
> somebody cares to work on it.
>
> I'd also be fine with somebody actually implementing fractional unit
> support (it would probably go into git_parse_signed() and
> git_parse_unsigned()). It doesn't seem worth the effort to me, but if
> somebody feels strongly enough to implement it cleanly, I wouldn't say
> no. :)
I'd agree about not offering fractional values, but showing the use of
smaller units was an easy tweak.

Philip
Jeff King June 18, 2021, 3:18 p.m. UTC | #4
On Fri, Jun 18, 2021 at 04:15:23PM +0100, Philip Oakley wrote:

> > I'm not quite sure what you're asking. If you mean: should we tell
> > people that they can't use "4.9m" and should instead use "5017k"
> > instead, then I don't have a strong opinion.
> 
> Sorry, I should have included an example based on the patch
> 
> > +The minimum size allowed is limited to 1 MiB. The default is unlimited.
> > +Common unit suffixes of 'k', 'm', or 'g' are supported.
> 
> e.g.
> 
> +The default is unlimited. Common unit suffixes of 'k', 'm', or 'g' are
> +supported. The minimum size allowed is limited to 1 MiB (`1024k`). 
> 
> I swapped the sentence order to allow the scaled example of the minimum
> to be after the explanation of the suffixes

Ah, I see. I don't have a strong feeling either way if you want to
prepare a patch.

-Peff
diff mbox series

Patch

diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt
index c0844d8d8e..763f7af7c4 100644
--- a/Documentation/config/pack.txt
+++ b/Documentation/config/pack.txt
@@ -99,12 +99,23 @@  pack.packSizeLimit::
 	packing to a file when repacking, i.e. the git:// protocol
 	is unaffected.  It can be overridden by the `--max-pack-size`
 	option of linkgit:git-repack[1].  Reaching this limit results
-	in the creation of multiple packfiles; which in turn prevents
-	bitmaps from being created.
-	The minimum size allowed is limited to 1 MiB.
-	The default is unlimited.
-	Common unit suffixes of 'k', 'm', or 'g' are
-	supported.
+	in the creation of multiple packfiles.
++
+Note that this option is rarely useful, and may result in a larger total
+on-disk size (because Git will not store deltas between packs), as well
+as worse runtime performance (object lookup within multiple packs is
+slower than a single pack, and optimizations like reachability bitmaps
+cannot cope with multiple packs).
++
+If you need to actively run Git using smaller packfiles (e.g., because your
+filesystem does not support large files), this option may help. But if
+your goal is to transmit a packfile over a medium that supports limited
+sizes (e.g., removable media that cannot store the whole repository),
+you are likely better off creating a single large packfile and splitting
+it using a generic multi-volume archive tool (e.g., Unix `split`).
++
+The minimum size allowed is limited to 1 MiB. The default is unlimited.
+Common unit suffixes of 'k', 'm', or 'g' are supported.
 
 pack.useBitmaps::
 	When true, git will use pack bitmaps (if available) when packing
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 25d9fbe37a..dbfd1f9017 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -128,10 +128,10 @@  depth is 4095.
 	into multiple independent packfiles, each not larger than the
 	given size. The size can be suffixed with
 	"k", "m", or "g". The minimum size allowed is limited to 1 MiB.
-	This option
-	prevents the creation of a bitmap index.
 	The default is unlimited, unless the config variable
-	`pack.packSizeLimit` is set.
+	`pack.packSizeLimit` is set. Note that this option may result in
+	a larger and slower repository; see the discussion in
+	`pack.packSizeLimit`.
 
 --honor-pack-keep::
 	This flag causes an object already in a local pack that
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ef310f362e..24c00c9384 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -121,7 +121,9 @@  depth is 4095.
 	If specified, multiple packfiles may be created, which also
 	prevents the creation of a bitmap index.
 	The default is unlimited, unless the config variable
-	`pack.packSizeLimit` is set.
+	`pack.packSizeLimit` is set. Note that this option may result in
+	a larger and slower repository; see the discussion in
+	`pack.packSizeLimit`.
 
 -b::
 --write-bitmap-index::