Message ID | pull.254.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Create 'core.size=large' setting to update config defaults | expand |
On 6/3/2019 4:18 PM, Derrick Stolee via GitGitGadget wrote: > 1. (Patches 1-3) Introduce a new 'core.size' config setting that takes > 'large' as a value. I do want to point out that this "core.size=large" option is probably a terrible name and could easily be replaced with something better. Please consider alternatives that better describe the goals at hand (helping users get performance boosts on upgrade without needing to pay close attention). Thanks, -Stolee
Hi Stolee, On Mon, 3 Jun 2019, Derrick Stolee via GitGitGadget wrote: > 1. (Patches 1-3) Introduce a new 'core.size' config setting that takes > 'large' as a value. This enables several config values that are > beneficial for large repos. I find `core.size` a bit non-descriptive. Maybe `repository.size` instead? Ciao, Dscho
On 6/4/2019 10:43 AM, Johannes Schindelin wrote: > Hi Stolee, > > On Mon, 3 Jun 2019, Derrick Stolee via GitGitGadget wrote: > >> 1. (Patches 1-3) Introduce a new 'core.size' config setting that takes >> 'large' as a value. This enables several config values that are >> beneficial for large repos. > > I find `core.size` a bit non-descriptive. Maybe `repository.size` instead? Thanks for the suggestion! If the "repository." doesn't make sense as a top- level category, then maybe "core.repositorySize" would work? A thought I had overnight that may broaden our options would be to think of this as a tolerance for experimental features. Maybe "core.adoptionRing" with options for "slow" and "fast", where "slow" takes things that have been cooking a long while (index.version=4, core.commitGraph=gc.writeCommitGraph=1) and the "fast" option gets all of those values plus the more experimental options (status.aheadBehind=false, fetch.showForcedUpdates=false). Alternate names with this slow/fast idea could be: * core.experimentTolerance={none,low,high} * core.autoConfig={none,some,all} Hopefully these options can trigger some creativity to decide on a good name that an experienced Git user could understand. Thanks, -Stolee
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > This patch series includes a few new config options we created to speed up > certain critical commands in VFS for Git. On their own, they would > contribute little value as it is hard to discover new config variables. > Instead, I've created this RFC as a goal for probably three sequential patch > series: > > 1. (Patches 1-3) Introduce a new 'core.size' config setting that takes > 'large' as a value. This enables several config values that are > beneficial for large repos. We use a certain set in VFS for Git (see > [1]), and most of those are applicable to any repo. This 'core.size' > setting is intended for users to automatically receive performance > updates as soon as they are stable, but they must opt-in to the setting > and can always explicitly set their own config values. The settings to > include here are core.commitGraph=true, gc.writeCommitGraph=true, > index.version=4, pack.useSparse=true. ... and not the configuration introduced by the other two points in this list? "If you set this, these other configuration variables are set to these default values" is a very valuable usability feature. It looks a lot more "meta" or "macro", and certainly is not a good idea to call it as if it sits next to variables in any existing hierarchy. I also wonder if this is something we would want to support in general; random things that come to mind are: - should such a "macro" configuration be limited to boolean (e.g. the above core.size that takes 'large' is a boolean between 'large' and 'not large'), or can it be an enum (e.g. choose among 'large', 'medium' and 'small', and core.bigFileThreshold will be set to 1G, 512M and 128M respectively---this silly example is for illustration purposes only), and if so, can we express what these default values are for each choice without writing a lot of code? - if we were to have more than just this 'core.size' macro, can two otherwise orthogonal macros both control the same underlying variable, and if so, how do we express their interactions? "using these two at the same time is forbidden" is a perfectly acceptable answer for the first round until we figure out the desired semantics, of course. - perhaps we may eventually want to allow end users (via their ~/.gitconfig) and system administrators (via /etc/gitconfig) define such a macro setting (e.g. setting macro.largeRepoSetting sets pack.usebitmaps=true, pack.useSpars=true, etc.) *after* we figure out what we want to do to the other points in this list. - even if we do not allow end users and system administrators futz with custom macros, can we specify the macros we ship without casting them in code?
On 6/5/2019 4:39 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> This patch series includes a few new config options we created to speed up >> certain critical commands in VFS for Git. On their own, they would >> contribute little value as it is hard to discover new config variables. >> Instead, I've created this RFC as a goal for probably three sequential patch >> series: >> >> 1. (Patches 1-3) Introduce a new 'core.size' config setting that takes >> 'large' as a value. This enables several config values that are >> beneficial for large repos. We use a certain set in VFS for Git (see >> [1]), and most of those are applicable to any repo. This 'core.size' >> setting is intended for users to automatically receive performance >> updates as soon as they are stable, but they must opt-in to the setting >> and can always explicitly set their own config values. The settings to >> include here are core.commitGraph=true, gc.writeCommitGraph=true, >> index.version=4, pack.useSparse=true. > > ... and not the configuration introduced by the other two points in > this list? They are added to the config setting after they are introduced. See patches 7 (status.aheadBehind) and 11 (fetch.showForcedUpdates). > "If you set this, these other configuration variables are set to > these default values" is a very valuable usability feature. It > looks a lot more "meta" or "macro", and certainly is not a good idea > to call it as if it sits next to variables in any existing hierarchy. > > I also wonder if this is something we would want to support in > general; random things that come to mind are: > > - should such a "macro" configuration be limited to boolean > (e.g. the above core.size that takes 'large' is a boolean between > 'large' and 'not large'), or can it be an enum (e.g. choose among > 'large', 'medium' and 'small', and core.bigFileThreshold will be > set to 1G, 512M and 128M respectively---this silly example is for > illustration purposes only), and if so, can we express what these > default values are for each choice without writing a lot of code? That's a good point that we could include recommended values for other non-boolean variables if our "meta" config setting is also non-boolean. This fits in with the "ring" ideas discussed earlier [1]. Taking in a few ideas from your message, perhaps we create a new "meta" category for this setting and use an integer value for "how big do I think my repo is?" and we can apply different settings based on thresholds: 0: no config defaults changed 3: safe defaults (core.commitGraph, index.version=4) 6: behavior-modifying defaults (status.aheadBehind, fetch.noShowForcedUpdates) Using 3 and 6 here to allow for finer gradients at a later date. [1] https://public-inbox.org/git/xmqqftonsr6a.fsf@gitster-ct.c.googlers.com/T/#m8dbaedc016ce7301b9d80e5ceb6a82edfa7bafac > - if we were to have more than just this 'core.size' macro, can two > otherwise orthogonal macros both control the same underlying > variable, and if so, how do we express their interactions? > "using these two at the same time is forbidden" is a perfectly > acceptable answer for the first round until we figure out the > desired semantics, of course. To borrow from linear algebra, I would recommend that two orthogonal config settings have disjoint _bases_ (i.e. the set of config settings they use are disjoint). Of course, this can be discussed in more detail when someone suggests a second meta-config setting. Such a second setting would need justification for why it doesn't work with our first setting. > - perhaps we may eventually want to allow end users (via their > ~/.gitconfig) and system administrators (via /etc/gitconfig) > define such a macro setting (e.g. setting macro.largeRepoSetting > sets pack.usebitmaps=true, pack.useSpars=true, etc.) *after* we > figure out what we want to do to the other points in this list. > > - even if we do not allow end users and system administrators futz > with custom macros, can we specify the macros we ship without > casting them in code? Are you suggesting that we allow some config values to be pulled from the repo contents? If we could identify some config options as "safe" to include in the Git data, then a repo administrator could commit a "/.gitconfig" file _and_ some existing config option says "look at the config in the repo". I see value in making some "safe" settings available in the repo, but also see that it can be very tricky to get right. Further, I think it is independent of the current direction. In fact, I would imagine the meta-config setting be one of the "safe" settings that we could put in this committed config file. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: >> - perhaps we may eventually want to allow end users (via their >> ~/.gitconfig) and system administrators (via /etc/gitconfig) >> define such a macro setting (e.g. setting macro.largeRepoSetting >> sets pack.usebitmaps=true, pack.useSpars=true, etc.) *after* we >> figure out what we want to do to the other points in this list. >> >> - even if we do not allow end users and system administrators futz >> with custom macros, can we specify the macros we ship without >> casting them in code? > > Are you suggesting that we allow some config values to be pulled from > the repo contents? Not at all. As far as the configuration is concerned, what project ships is tainted data that should not be used blindly. What I had in mind is parallel to the idea of pushing "static struct userdiff_driver builtin_drivers[]" out of the compiled-in code and instead have a text file shipped in /usr/share/git/ somewhere. So, instead of having "core.size==large means these other four variables are set to these values" in the code, we invent a general mechanism to read such "macro" specification out of a text file, and that would be the only code change---the specific "core.size==large affects X, Y and Z" would not be in the code, but would be in the text file we ship and read by the mechanism. If the list of allowed "meta" configuration variables and the configuration variables whose default each of them affects can be expressed in our usual ".gitconfig" file format, then the system administrators can add their own in /etc/gitconfig, too, to help their users. That is what I meant by the last item. Note that I was "wondering if it makes sense" and what I wrote above in this message is merely clarifying what I meant---I am not making further/more arguments to claim it is a good idea (at least not yet). Thanks.