Message ID | 20180926195442.1380-5-benpeart@microsoft.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | speed up index load through parallelization | expand |
On Wed, Sep 26, 2018 at 03:54:39PM -0400, Ben Peart wrote: > Add support for a new index.threads config setting which will be used to > control the threading code in do_read_index(). A value of 0 will tell the > index code to automatically determine the correct number of threads to use. > A value of 1 will make the code single threaded. A value greater than 1 > will set the maximum number of threads to use. > > For testing purposes, this setting can be overwritten by setting the > GIT_TEST_INDEX_THREADS=<n> environment variable to a value greater than 0. > > Signed-off-by: Ben Peart <Ben.Peart@microsoft.com> > --- > diff --git a/t/README b/t/README > index aa33ac4f26..0fcecf4500 100644 > --- a/t/README > +++ b/t/README > @@ -332,6 +332,11 @@ This is used to allow tests 1, 4-9 in t1700-split-index.sh to succeed > as they currently hard code SHA values for the index which are no longer > valid due to the addition of the EOIE extension. > > +GIT_TEST_INDEX_THREADS=<n> enables exercising the multi-threaded loading > +of the index for the whole test suite by bypassing the default number of > +cache entries and thread minimums. Settting this to 1 will make the s/ttt/tt/ > +index loading single threaded. > + > Naming Tests > ------------ > > diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh > index 1f168378c8..ab205954cf 100755 > --- a/t/t1700-split-index.sh > +++ b/t/t1700-split-index.sh > @@ -8,6 +8,7 @@ test_description='split index mode tests' > sane_unset GIT_TEST_SPLIT_INDEX > sane_unset GIT_FSMONITOR_TEST > GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE > +GIT_TEST_INDEX_THREADS=1; export GIT_TEST_INDEX_THREADS Why does multithreading have to be disabled in this test? > test_expect_success 'enable split index' ' > git config splitIndex.maxPercentChange 100 && > -- > 2.18.0.windows.1 >
On 9/27/2018 8:26 PM, SZEDER Gábor wrote: > On Wed, Sep 26, 2018 at 03:54:39PM -0400, Ben Peart wrote: >> Add support for a new index.threads config setting which will be used to >> control the threading code in do_read_index(). A value of 0 will tell the >> index code to automatically determine the correct number of threads to use. >> A value of 1 will make the code single threaded. A value greater than 1 >> will set the maximum number of threads to use. >> >> For testing purposes, this setting can be overwritten by setting the >> GIT_TEST_INDEX_THREADS=<n> environment variable to a value greater than 0. >> >> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com> >> --- > >> diff --git a/t/README b/t/README >> index aa33ac4f26..0fcecf4500 100644 >> --- a/t/README >> +++ b/t/README >> @@ -332,6 +332,11 @@ This is used to allow tests 1, 4-9 in t1700-split-index.sh to succeed >> as they currently hard code SHA values for the index which are no longer >> valid due to the addition of the EOIE extension. >> >> +GIT_TEST_INDEX_THREADS=<n> enables exercising the multi-threaded loading >> +of the index for the whole test suite by bypassing the default number of >> +cache entries and thread minimums. Settting this to 1 will make the > > s/ttt/tt/ > >> +index loading single threaded. >> + >> Naming Tests >> ------------ >> >> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh >> index 1f168378c8..ab205954cf 100755 >> --- a/t/t1700-split-index.sh >> +++ b/t/t1700-split-index.sh >> @@ -8,6 +8,7 @@ test_description='split index mode tests' >> sane_unset GIT_TEST_SPLIT_INDEX >> sane_unset GIT_FSMONITOR_TEST >> GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE >> +GIT_TEST_INDEX_THREADS=1; export GIT_TEST_INDEX_THREADS > > Why does multithreading have to be disabled in this test? > If multi-threading is enabled, it will write out the IEOT extension which changes the SHA and causes the test to fail. I will update the logic in this case to not write out the IEOT extension as it isn't needed. >> test_expect_success 'enable split index' ' >> git config splitIndex.maxPercentChange 100 && >> -- >> 2.18.0.windows.1 >>
Ben Peart <peartben@gmail.com> writes: >> Why does multithreading have to be disabled in this test? > > If multi-threading is enabled, it will write out the IEOT extension > which changes the SHA and causes the test to fail. I think it is a design mistake to let the writing processes's capability decide what is written in the file to be read later by a different process, which possibly may have different capability. If you are not writing with multiple threads, it should not matter if that writer process is capable of and configured to spawn 8 threads if the process were reading the file---as it is not reading the file it is writing right now. I can understand if the design is to write IEOT only if the resulting index is expected to become large enough (above an arbitrary threshold like 100k entries) to matter. I also can understand if IEOT is omitted when the repository configuration says that no process is allowed to read the index with multi-threaded codepath in that repository.
On 9/28/2018 1:07 PM, Junio C Hamano wrote: > Ben Peart <peartben@gmail.com> writes: > >>> Why does multithreading have to be disabled in this test? >> >> If multi-threading is enabled, it will write out the IEOT extension >> which changes the SHA and causes the test to fail. > > I think it is a design mistake to let the writing processes's > capability decide what is written in the file to be read later by a > different process, which possibly may have different capability. If > you are not writing with multiple threads, it should not matter if > that writer process is capable of and configured to spawn 8 threads > if the process were reading the file---as it is not reading the file > it is writing right now. > > I can understand if the design is to write IEOT only if the > resulting index is expected to become large enough (above an > arbitrary threshold like 100k entries) to matter. I also can > understand if IEOT is omitted when the repository configuration says > that no process is allowed to read the index with multi-threaded > codepath in that repository. > There are two different paths which determine how many blocks are written to the IEOT. The first is the default path. On this path, the number of blocks is determined by the number of cache entries divided by the THREAD_COST. If there are sufficient entries to make it faster to use threading, then it will automatically use enough blocks to optimize the performance of reading the entries across multiple threads. I currently cap the maximum number of blocks to be the number of cores that would be available to process them on that same machine purely as an optimization. The majority of the time, the index will be read from the same machine that it was written on so this works well. Before I added that logic, you would usually end up with more blocks than available threads which meant some threads had more to do than the other threads and resulted in worse performance. For example, 4 blocks across 3 threads results in the 1st thread having twice as much work to do as the other threads. If the index is copied to a machine with a different number of cores, it will still all work - it just may not be optimal for that machine. This is self correcting because as soon as the index is written out, it will be optimized for that machine. If the "automatically try to make it perform optimally" logic doesn't work for some reason, we have path #2. The second path is when the user specifies a specific number of blocks via the GIT_TEST_INDEX_THREADS=<n> environment variable or the index.threads=<n> config setting. If they ask for n blocks, they will get n blocks. This is the "I know what I'm doing and want to control the behavior" path. I just added one additional test (see patch below) to avoid a divide by zero bug and simplify things a bit. With this change, if there are fewer than two blocks, the IEOT extension is not written out as it isn't needed. The load would be single threaded anyway so there is no reason to write out a IEOT extensions that won't be used. diff --git a/read-cache.c b/read-cache.c index f5d766088d..a1006fa824 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfil e, */ if (!nr) { ieot_blocks = istate->cache_nr / THREAD_COST; - if (ieot_blocks < 1) - ieot_blocks = 1; cpus = online_cpus(); if (ieot_blocks > cpus - 1) ieot_blocks = cpus - 1; } else { ieot_blocks = nr; } - ieot = xcalloc(1, sizeof(struct index_entry_offset_table) - + (ieot_blocks * sizeof(struct index_entry_offset))); - ieot->nr = 0; - ieot_work = DIV_ROUND_UP(entries, ieot_blocks); + + /* + * no reason to write out the IEOT extension if we don't + * have enough blocks to utilize multi-threading + */ + if (ieot_blocks > 1) { + ieot = xcalloc(1, sizeof(struct index_entry_offset_table) + + (ieot_blocks * sizeof(struct index_entry_offset))); + ieot->nr = 0; + ieot_work = DIV_ROUND_UP(entries, ieot_blocks); + } } #endif
On 28/09/18 20:41, Ben Peart wrote: > > > On 9/28/2018 1:07 PM, Junio C Hamano wrote: >> Ben Peart <peartben@gmail.com> writes: >> >>>> Why does multithreading have to be disabled in this test? >>> >>> If multi-threading is enabled, it will write out the IEOT extension >>> which changes the SHA and causes the test to fail. >> >> I think it is a design mistake to let the writing processes's >> capability decide what is written in the file to be read later by a >> different process, which possibly may have different capability. If >> you are not writing with multiple threads, it should not matter if >> that writer process is capable of and configured to spawn 8 threads >> if the process were reading the file---as it is not reading the file >> it is writing right now. >> >> I can understand if the design is to write IEOT only if the >> resulting index is expected to become large enough (above an >> arbitrary threshold like 100k entries) to matter. I also can >> understand if IEOT is omitted when the repository configuration says >> that no process is allowed to read the index with multi-threaded >> codepath in that repository. >> > > There are two different paths which determine how many blocks are written to the IEOT. The first is the default path. On this path, the number of blocks is determined by the number of cache entries divided by the THREAD_COST. If there are sufficient entries to make it faster to use threading, then it will automatically use enough blocks to optimize the performance of reading the entries across multiple threads. > > I currently cap the maximum number of blocks to be the number of cores that would be available to process them on that same machine purely as an optimization. The majority of the time, the index will be read from the same machine that it was written on so this works well. Before I added that logic, you would usually end up with more blocks than available threads which meant some threads had more to do than the other threads and resulted in worse performance. For example, 4 blocks across 3 threads results in the 1st thread having twice as much work to do as the other threads. > > If the index is copied to a machine with a different number of cores, it will still all work - it just may not be optimal for that machine. This is self correcting because as soon as the index is written out, it will be optimized for that machine. > > If the "automatically try to make it perform optimally" logic doesn't work for some reason, we have path #2. > > The second path is when the user specifies a specific number of blocks via the GIT_TEST_INDEX_THREADS=<n> environment variable or the index.threads=<n> config setting. If they ask for n blocks, they will get n blocks. This is the "I know what I'm doing and want to control the behavior" path. > > I just added one additional test (see patch below) to avoid a divide by zero bug and simplify things a bit. With this change, if there are fewer than two blocks, the IEOT extension is not written out as it isn't needed. The load would be single threaded anyway so there is no reason to write out a IEOT extensions that won't be used. > > > > diff --git a/read-cache.c b/read-cache.c > index f5d766088d..a1006fa824 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfil > e, > */ > if (!nr) { > ieot_blocks = istate->cache_nr / THREAD_COST; > - if (ieot_blocks < 1) > - ieot_blocks = 1; > cpus = online_cpus(); > if (ieot_blocks > cpus - 1) > ieot_blocks = cpus - 1; So, am I reading this correctly - you need cpus > 2 before an IEOT extension block is written out? OK. ATB, Ramsay Jones > } else { > ieot_blocks = nr; > } > - ieot = xcalloc(1, sizeof(struct index_entry_offset_table) > - + (ieot_blocks * sizeof(struct index_entry_offset))); > - ieot->nr = 0; > - ieot_work = DIV_ROUND_UP(entries, ieot_blocks); > + > + /* > + * no reason to write out the IEOT extension if we don't > + * have enough blocks to utilize multi-threading > + */ > + if (ieot_blocks > 1) { > + ieot = xcalloc(1, sizeof(struct index_entry_offset_table) > + + (ieot_blocks * sizeof(struct index_entry_offset))); > + ieot->nr = 0; > + ieot_work = DIV_ROUND_UP(entries, ieot_blocks); > + } > } > #endif > >
Ramsay Jones <ramsay@ramsayjones.plus.com> writes: >> if (!nr) { >> ieot_blocks = istate->cache_nr / THREAD_COST; >> - if (ieot_blocks < 1) >> - ieot_blocks = 1; >> cpus = online_cpus(); >> if (ieot_blocks > cpus - 1) >> ieot_blocks = cpus - 1; > > So, am I reading this correctly - you need cpus > 2 before an > IEOT extension block is written out? > > OK. Why should we be even calling online_cpus() in this codepath to write the index in a single thread to begin with? The number of cpus that readers would use to read this index file has nothing to do with the number of cpus available to this particular writer process.
On 9/28/2018 6:15 PM, Junio C Hamano wrote: > Ramsay Jones <ramsay@ramsayjones.plus.com> writes: > >>> if (!nr) { >>> ieot_blocks = istate->cache_nr / THREAD_COST; >>> - if (ieot_blocks < 1) >>> - ieot_blocks = 1; >>> cpus = online_cpus(); >>> if (ieot_blocks > cpus - 1) >>> ieot_blocks = cpus - 1; >> >> So, am I reading this correctly - you need cpus > 2 before an >> IEOT extension block is written out? >> >> OK. > > Why should we be even calling online_cpus() in this codepath to > write the index in a single thread to begin with? > > The number of cpus that readers would use to read this index file > has nothing to do with the number of cpus available to this > particular writer process. > As I mentioned in my other reply, this is optimizing for the most common case where the index is read from the same machine that wrote it and the user is taking the default settings (ie index.threads=true). Aligning the number of blocks to the number of threads that will be processing them avoids situations where one thread may have up to double the work to do as the other threads (for example, if there were 3 blocks to be processed by 2 threads).
On Mon, Oct 01, 2018 at 09:17:53AM -0400, Ben Peart wrote: > > > On 9/28/2018 6:15 PM, Junio C Hamano wrote: > >Ramsay Jones <ramsay@ramsayjones.plus.com> writes: > > > >>> if (!nr) { > >>> ieot_blocks = istate->cache_nr / THREAD_COST; > >>>- if (ieot_blocks < 1) > >>>- ieot_blocks = 1; > >>> cpus = online_cpus(); > >>> if (ieot_blocks > cpus - 1) > >>> ieot_blocks = cpus - 1; > >> > >>So, am I reading this correctly - you need cpus > 2 before an > >>IEOT extension block is written out? > >> > >>OK. > > > >Why should we be even calling online_cpus() in this codepath to > >write the index in a single thread to begin with? > > > >The number of cpus that readers would use to read this index file > >has nothing to do with the number of cpus available to this > >particular writer process. > > > > As I mentioned in my other reply, this is optimizing for the most common > case where the index is read from the same machine that wrote it and the > user is taking the default settings (ie index.threads=true). I think this is a reasonable assumption to make, but it should be mentioned in the relevant commit message. Alas, as far as I can tell, not a single commit message has been updated in v7. > Aligning the number of blocks to the number of threads that will be > processing them avoids situations where one thread may have up to double the > work to do as the other threads (for example, if there were 3 blocks to be > processed by 2 threads).
diff --git a/Documentation/config.txt b/Documentation/config.txt index ad0f4510c3..8fd973b76b 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2413,6 +2413,13 @@ imap:: The configuration variables in the 'imap' section are described in linkgit:git-imap-send[1]. +index.threads:: + Specifies the number of threads to spawn when loading the index. + This is meant to reduce index load time on multiprocessor machines. + Specifying 0 or 'true' will cause Git to auto-detect the number of + CPU's and set the number of threads accordingly. Specifying 1 or + 'false' will disable multithreading. Defaults to 'true'. + index.version:: Specify the version with which new index files should be initialized. This does not affect existing repositories. diff --git a/config.c b/config.c index 3461993f0a..2ee29f6f86 100644 --- a/config.c +++ b/config.c @@ -2289,6 +2289,24 @@ int git_config_get_fsmonitor(void) return 0; } +int git_config_get_index_threads(void) +{ + int is_bool, val = 0; + + val = git_env_ulong("GIT_TEST_INDEX_THREADS", 0); + if (val) + return val; + + if (!git_config_get_bool_or_int("index.threads", &is_bool, &val)) { + if (is_bool) + return val ? 0 : 1; + else + return val; + } + + return 0; /* auto */ +} + NORETURN void git_die_config_linenr(const char *key, const char *filename, int linenr) { diff --git a/config.h b/config.h index ab46e0165d..a06027e69b 100644 --- a/config.h +++ b/config.h @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void); extern int git_config_get_split_index(void); extern int git_config_get_max_percent_split_change(void); extern int git_config_get_fsmonitor(void); +extern int git_config_get_index_threads(void); /* This dies if the configured or default date is in the future */ extern int git_config_get_expiry(const char *key, const char **output); diff --git a/t/README b/t/README index aa33ac4f26..0fcecf4500 100644 --- a/t/README +++ b/t/README @@ -332,6 +332,11 @@ This is used to allow tests 1, 4-9 in t1700-split-index.sh to succeed as they currently hard code SHA values for the index which are no longer valid due to the addition of the EOIE extension. +GIT_TEST_INDEX_THREADS=<n> enables exercising the multi-threaded loading +of the index for the whole test suite by bypassing the default number of +cache entries and thread minimums. Settting this to 1 will make the +index loading single threaded. + Naming Tests ------------ diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh index 1f168378c8..ab205954cf 100755 --- a/t/t1700-split-index.sh +++ b/t/t1700-split-index.sh @@ -8,6 +8,7 @@ test_description='split index mode tests' sane_unset GIT_TEST_SPLIT_INDEX sane_unset GIT_FSMONITOR_TEST GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE +GIT_TEST_INDEX_THREADS=1; export GIT_TEST_INDEX_THREADS test_expect_success 'enable split index' ' git config splitIndex.maxPercentChange 100 &&
Add support for a new index.threads config setting which will be used to control the threading code in do_read_index(). A value of 0 will tell the index code to automatically determine the correct number of threads to use. A value of 1 will make the code single threaded. A value greater than 1 will set the maximum number of threads to use. For testing purposes, this setting can be overwritten by setting the GIT_TEST_INDEX_THREADS=<n> environment variable to a value greater than 0. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com> --- Documentation/config.txt | 7 +++++++ config.c | 18 ++++++++++++++++++ config.h | 1 + t/README | 5 +++++ t/t1700-split-index.sh | 1 + 5 files changed, 32 insertions(+)