Message ID | 20181113003938.GC170017@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization) | expand |
> +index.recordOffsetTable:: > + Specifies whether the index file should include an "Index Entry > + Offset Table" section. This reduces index load time on > + multiprocessor machines but produces a message "ignoring IEOT > + extension" when reading the index using Git versions before 2.20. > + Defaults to 'false'. Probably worth adding a test that exercises this new config option - somehow create an index with index.recordOffsetTable=1, check that the index contains the appropriate string (a few ways to do this, but I'm not sure which are portable), and then run a Git command that reads the index to make sure it is valid; then do the same except index.recordOffsetTable=0. The code itself looks good to me. Same comment for patch 1.
Jonathan Nieder <jrnieder@gmail.com> writes: > As with EOIE, popular versions of Git do not support the new IEOT > extension yet. When accessing a Git repository written by a more > modern version of Git, they correctly ignore the unrecognized section, > but in the process they loudly warn > > ignoring IEOT extension > > resulting in confusion for users. Then removing the message is throwing it with bathwater. First think about which part of the message is confusiong and then make it less confusing. How about hint: ignoring an optional IEOT extension to make it clear that it is totally harmless? With that, we can add advise.unknownIndexExtension=false to turn all of them off with a single switch.
Junio C Hamano wrote: > How about > > hint: ignoring an optional IEOT extension > > to make it clear that it is totally harmless? > > With that, we can add advise.unknownIndexExtension=false to turn all > of them off with a single switch. I like it. Expect a patch soon (tonight or tomorrow) that does that. We'll have to find some appropriate place in the documentation to explain what the message is about, still. Thanks, Jonathan
On 11/12/2018 7:39 PM, Jonathan Nieder wrote: > As with EOIE, popular versions of Git do not support the new IEOT > extension yet. When accessing a Git repository written by a more > modern version of Git, they correctly ignore the unrecognized section, > but in the process they loudly warn > > ignoring IEOT extension > > resulting in confusion for users. Introduce the index extension more > gently by not writing it yet in this first version with support for > it. Soon, once sufficiently many users are running a modern version > of Git, we can flip the default so users benefit from this index > extension by default. > > Introduce a '[index] recordOffsetTable' configuration variable to > control whether the new index extension is written. > Why introduce a new setting to disable writing the IEOT extension instead of just using the existing index.threads setting? If index.threads=1 then the IEOT extension isn't written which (I believe) will accomplish the same goal. > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> > --- > Documentation/config.txt | 7 +++++++ > read-cache.c | 11 ++++++++++- > 2 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index d702379db4..cc66fb7de3 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -2195,6 +2195,13 @@ index.recordEndOfIndexEntries:: > reading the index using Git versions before 2.20. Defaults to > 'false'. > > +index.recordOffsetTable:: > + Specifies whether the index file should include an "Index Entry > + Offset Table" section. This reduces index load time on > + multiprocessor machines but produces a message "ignoring IEOT > + extension" when reading the index using Git versions before 2.20. > + Defaults to 'false'. > + > index.threads:: > Specifies the number of threads to spawn when loading the index. > This is meant to reduce index load time on multiprocessor machines. > diff --git a/read-cache.c b/read-cache.c > index 4bfe93c4c2..290bd54708 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -2707,6 +2707,15 @@ static int record_eoie(void) > return 0; > } > > +static int record_ieot(void) > +{ > + int val; > + > + if (!git_config_get_bool("index.recordoffsettable", &val)) > + return val; > + return 0; > +} > + > /* > * On success, `tempfile` is closed. If it is the temporary file > * of a `struct lock_file`, we will therefore effectively perform > @@ -2767,7 +2776,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, > > #ifndef NO_PTHREADS > nr_threads = git_config_get_index_threads(); > - if (nr_threads != 1) { > + if (nr_threads != 1 && record_ieot()) { > int ieot_blocks, cpus; > > /* >
On Tue, Nov 13, 2018 at 2:12 AM Jonathan Nieder <jrnieder@gmail.com> wrote: > > Junio C Hamano wrote: > > > How about > > > > hint: ignoring an optional IEOT extension > > > > to make it clear that it is totally harmless? > > > > With that, we can add advise.unknownIndexExtension=false to turn all > > of them off with a single switch. > > I like it. Expect a patch soon (tonight or tomorrow) that does that. > > We'll have to find some appropriate place in the documentation to > explain what the message is about, still. Also from the last discussion, if I remember correctly, this "ignoring" is considered harmless and could be suppressed most of the time. But commands like 'fsck' should always report it.
Hi again, Junio C Hamano wrote: > Then removing the message is throwing it with bathwater. First > think about which part of the message is confusiong and then make it > less confusing. > > How about > > hint: ignoring an optional IEOT extension > > to make it clear that it is totally harmless? > > With that, we can add advise.unknownIndexExtension=false to turn all > of them off with a single switch. After having slept on it, this doesn't seem like a good fit for the advice subsystem. The advice subsystem provides hints about suggested actions for new users to understand what to do about a condition. In this example, the message is not suggesting a particular user action --- instead, it's describing state, which would seem to be a better fit for tracing, as in the patch 3/3 I sent. Am I understanding correclty? Can you give an example of when a user would *want* to see this message and what they would do in response? Thanks, Jonathan
Hi, Ben Peart wrote: > On 11/12/2018 7:39 PM, Jonathan Nieder wrote: >> As with EOIE, popular versions of Git do not support the new IEOT >> extension yet. When accessing a Git repository written by a more >> modern version of Git, they correctly ignore the unrecognized section, >> but in the process they loudly warn >> >> ignoring IEOT extension >> >> resulting in confusion for users. Introduce the index extension more >> gently by not writing it yet in this first version with support for >> it. [...] >> Introduce a '[index] recordOffsetTable' configuration variable to >> control whether the new index extension is written. > > Why introduce a new setting to disable writing the IEOT extension instead of > just using the existing index.threads setting? If index.threads=1 then the > IEOT extension isn't written which (I believe) will accomplish the same > goal. Do you mean defaulting to index.threads=1? I don't think that would be a good default, but if you have a different change in mind then I'd be happy to hear it. Or do you mean that if the user has explicitly specified index.threads=true, then that should imply index.recordOffsetTable=true so users only have to set one setting to turn it on? I can imagine that working well. Thanks, Jonathan
On 11/13/2018 1:18 PM, Jonathan Nieder wrote: > Hi, > > Ben Peart wrote: >> On 11/12/2018 7:39 PM, Jonathan Nieder wrote: > >>> As with EOIE, popular versions of Git do not support the new IEOT >>> extension yet. When accessing a Git repository written by a more >>> modern version of Git, they correctly ignore the unrecognized section, >>> but in the process they loudly warn >>> >>> ignoring IEOT extension >>> >>> resulting in confusion for users. Introduce the index extension more >>> gently by not writing it yet in this first version with support for >>> it. > [...] >>> Introduce a '[index] recordOffsetTable' configuration variable to >>> control whether the new index extension is written. >> >> Why introduce a new setting to disable writing the IEOT extension instead of >> just using the existing index.threads setting? If index.threads=1 then the >> IEOT extension isn't written which (I believe) will accomplish the same >> goal. > > Do you mean defaulting to index.threads=1? I don't think that would > be a good default, but if you have a different change in mind then I'd > be happy to hear it. > > Or do you mean that if the user has explicitly specified index.threads=true, > then that should imply index.recordOffsetTable=true so users only have > to set one setting to turn it on? I can imagine that working well. > Reading the index with multiple threads requires the EOIE and IEOT extensions to exist in the index. If either extension doesn't exist, then the code falls back to the single threaded path. That means you can't have both 1) no warning for old versions of git and 2) multi-threaded reading for new versions of git. If you set index.threads=1, that will prevent the IEOT extension from being written and there will be no "ignoring IEOT extension" warning in older versions of git. With this patch 'as is' you would have to set both index.threads=true and index.recordOffsetTable=true to get multi-threaded index reads. If either is set to false, it will silently drop back to single threaded reads. > Thanks, > Jonathan >
Hi again, Ben Peart wrote: > On 11/13/2018 1:18 PM, Jonathan Nieder wrote: >> Ben Peart wrote: >>> Why introduce a new setting to disable writing the IEOT extension instead of >>> just using the existing index.threads setting? If index.threads=1 then the >>> IEOT extension isn't written which (I believe) will accomplish the same >>> goal. >> >> Do you mean defaulting to index.threads=1? I don't think that would >> be a good default, but if you have a different change in mind then I'd >> be happy to hear it. >> >> Or do you mean that if the user has explicitly specified index.threads=true, >> then that should imply index.recordOffsetTable=true so users only have >> to set one setting to turn it on? I can imagine that working well. > > Reading the index with multiple threads requires the EOIE and IEOT > extensions to exist in the index. If either extension doesn't exist, then > the code falls back to the single threaded path. That means you can't have > both 1) no warning for old versions of git and 2) multi-threaded reading for > new versions of git. > > If you set index.threads=1, that will prevent the IEOT extension from being > written and there will be no "ignoring IEOT extension" warning in older > versions of git. > > With this patch 'as is' you would have to set both index.threads=true and > index.recordOffsetTable=true to get multi-threaded index reads. If either > is set to false, it will silently drop back to single threaded reads. Sorry, I'm still not understanding what you're proposing. What would be - the default behavior - the mechanism for changing that behavior under your proposal? I consider index.threads=1 to be a bad default. I would understand if you are saying that that should be the default, and I tried to propose a different way to achieve what you're looking for in the quoted reply above (but I'm not understanding whether you like that proposal or not). Jonathan
Ben Peart <peartben@gmail.com> writes: > Why introduce a new setting to disable writing the IEOT extension > instead of just using the existing index.threads setting? But index.threads is about what the reader does, not about the writer who does not even know who will be reading the resulting index, no?
On 11/13/2018 4:08 PM, Jonathan Nieder wrote: > Hi again, > > Ben Peart wrote: >> On 11/13/2018 1:18 PM, Jonathan Nieder wrote: >>> Ben Peart wrote: > >>>> Why introduce a new setting to disable writing the IEOT extension instead of >>>> just using the existing index.threads setting? If index.threads=1 then the >>>> IEOT extension isn't written which (I believe) will accomplish the same >>>> goal. >>> >>> Do you mean defaulting to index.threads=1? I don't think that would >>> be a good default, but if you have a different change in mind then I'd >>> be happy to hear it. >>> >>> Or do you mean that if the user has explicitly specified index.threads=true, >>> then that should imply index.recordOffsetTable=true so users only have >>> to set one setting to turn it on? I can imagine that working well. >> >> Reading the index with multiple threads requires the EOIE and IEOT >> extensions to exist in the index. If either extension doesn't exist, then >> the code falls back to the single threaded path. That means you can't have >> both 1) no warning for old versions of git and 2) multi-threaded reading for >> new versions of git. >> >> If you set index.threads=1, that will prevent the IEOT extension from being >> written and there will be no "ignoring IEOT extension" warning in older >> versions of git. >> >> With this patch 'as is' you would have to set both index.threads=true and >> index.recordOffsetTable=true to get multi-threaded index reads. If either >> is set to false, it will silently drop back to single threaded reads. > > Sorry, I'm still not understanding what you're proposing. What would be > > - the default behavior > - the mechanism for changing that behavior > > under your proposal? > > I consider index.threads=1 to be a bad default. I would understand if > you are saying that that should be the default, and I tried to propose > a different way to achieve what you're looking for in the quoted reply > above (but I'm not understanding whether you like that proposal or > not). > Today, both the write logic (ie should we write out the IEOT extension) and the read logic (should I use the IEOT, if available, and do multi-threaded reading) are controlled by the single "index.threads" setting. I would like to keep the settings as simple as possible to prevent user confusion. If we have two different settings (index.threads and index.recordoffsettable) then the only combination that will result in the user actually getting multi-threaded reads is when they are both set to true. Any other combination will silently fail. I think it would be confusing if you set index.threads=true and got no error message but didn't get multi-threaded reads either (or vice versa). If you want to prevent any of the scary "ignoring IEOT extension" from ever happening then your only option is to turn off the IEOT writing by default. The downside is that people have to discover and turn it on if they want the improvements. This can be achieved by changing the default for index.threads from "true" to "false." diff --git a/config.c b/config.c index 2ee29f6f86..86f5c14294 100644 --- a/config.c +++ b/config.c @@ -2291,7 +2291,7 @@ int git_config_get_fsmonitor(void) int git_config_get_index_threads(void) { - int is_bool, val = 0; + int is_bool, val = 1; val = git_env_ulong("GIT_TEST_INDEX_THREADS", 0); if (val) If you want to provide a way for a concerned user to disable the message after the first time they have seen it, then they can be instructed to run 'git config --global index.threads false' There is no way to get multi-threaded reads and NOT get the scary message with older versions of git. Multi-threaded reads require the IEOT extension to be written into the index and the existence of the IEOT extension in the index will always generate the scary warning. > Jonathan >
Hi, Ben Peart wrote: > There is no way to get multi-threaded reads and NOT get the scary message > with older versions of git. Multi-threaded reads require the IEOT extension > to be written into the index and the existence of the IEOT extension in the > index will always generate the scary warning. This is where I think we differ. I want my local copy of Git to get multi-threaded reads as long as IEOT happens to be there, even if I am not ready to write IEOT myself yet. I understand that this differs from what you would prefer, so I'd like to find some compromise that makes us both happy. I've tried to suggest one: Make explicitly setting index.threads=true imply index.recordOffsetTable=true. That way, the default behavior is the behavior I prefer, and a client can simply set index.threads=true to get the behavior I think you are describing preferring. My preference is instead what I sent in patch 2/3 (for simplicity, especially since the default of index.recordOffsetTable=false would be only temporary), but this would work okay for me. I'll send this as a patch. If there is a reason it won't work for you, I would be very happy to learn more about why. Thanks, Jonathan
diff --git a/Documentation/config.txt b/Documentation/config.txt index d702379db4..cc66fb7de3 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2195,6 +2195,13 @@ index.recordEndOfIndexEntries:: reading the index using Git versions before 2.20. Defaults to 'false'. +index.recordOffsetTable:: + Specifies whether the index file should include an "Index Entry + Offset Table" section. This reduces index load time on + multiprocessor machines but produces a message "ignoring IEOT + extension" when reading the index using Git versions before 2.20. + Defaults to 'false'. + index.threads:: Specifies the number of threads to spawn when loading the index. This is meant to reduce index load time on multiprocessor machines. diff --git a/read-cache.c b/read-cache.c index 4bfe93c4c2..290bd54708 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2707,6 +2707,15 @@ static int record_eoie(void) return 0; } +static int record_ieot(void) +{ + int val; + + if (!git_config_get_bool("index.recordoffsettable", &val)) + return val; + return 0; +} + /* * On success, `tempfile` is closed. If it is the temporary file * of a `struct lock_file`, we will therefore effectively perform @@ -2767,7 +2776,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, #ifndef NO_PTHREADS nr_threads = git_config_get_index_threads(); - if (nr_threads != 1) { + if (nr_threads != 1 && record_ieot()) { int ieot_blocks, cpus; /*
As with EOIE, popular versions of Git do not support the new IEOT extension yet. When accessing a Git repository written by a more modern version of Git, they correctly ignore the unrecognized section, but in the process they loudly warn ignoring IEOT extension resulting in confusion for users. Introduce the index extension more gently by not writing it yet in this first version with support for it. Soon, once sufficiently many users are running a modern version of Git, we can flip the default so users benefit from this index extension by default. Introduce a '[index] recordOffsetTable' configuration variable to control whether the new index extension is written. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Documentation/config.txt | 7 +++++++ read-cache.c | 11 ++++++++++- 2 files changed, 17 insertions(+), 1 deletion(-)