Message ID | 20191124211021.GB23183@szeder.dev (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | t3429: try to protect against a potential racy todo file problem | expand |
SZEDER Gábor <szeder.dev@gmail.com> writes: > On Sun, Nov 24, 2019 at 10:44:10AM +0000, Phillip Wood wrote: >> On 24/11/2019 04:49, Junio C Hamano wrote: >> ... >> >Hmph, that makes it sound as if the right fix is to re-read after >> >writing the first version of the todo file out, so that the stat >> >data matches reality and tells us that it has never been modified? >> >> I think we should update the stat data after we write the todo list. > > Well, yes and no. > > No, because we are dealing with regression in v2.24.0 here, so the > simpler the fix the better it is for maint. I don't think a fix can > get any simpler than my patch, with or without the suggestions from > Phillip. Of course, the simplest "fix" for regression is to revert the offending one, and anything else is a band-aid ;-). The question is which band-aid is the least risky and which one takes us the closest to the real solution. I tend to agree that forcing to skip checking no matter what the variable "check_todo" says unless is_rebase_i() qualifies as the band-aid that is the least risky. > Yes, we should definitely consider updating the stat data after the > sequencer writes the todo list, or any other options with which the > sequencer could notice a modified todo list file with less subtlety. > Alas, there is a big can of worms in that direction, see the patch > below, and we have to be very careful going that way, so I think it's > only viable in the long term, but less suitable as a regression fix > for maint. Yes, I agree that it is much less suitable than even reverting the offending one outright. > (Hrm, perhaps I spent too many words on the all zeroed out stat data, > and managed to sidetrack you a bit...) No, I do not think so. Thinking about what we need to do in the longer term, while coming up with a shorter term fix, is a necessary step of gaining confidence in the latter. Again, thanks both for thinking about this issue.
On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote: > To notice a changed todo file the sequencer stores the file's stat > data in its 'struct todo_list' instance, and compares it with the > file's current stat data after 'reword', 'squash' and 'exec' > instructions. If the two stat data doesn't match, it re-reads the > todo file. > > Sounds simple, but there are some subtleties going on here: > > - The 'struct todo_list' holds the stat data from the time when the > todo file was last read. > > - This stat data in 'struct todo_list' is not updated when the > sequencer itself writes the todo file. > > - Before executing each instruction during an interactive rebase, > the sequencer always updates the todo file by removing the > just-about-to-be-executed instruction. This changes the file's > size and inode [1]. > > Consequently, when the sequencer looks at the stat data after a > 'reword', 'squash' or 'exec' instruction, it most likely finds that > they differ, even when the user didn't modify the todo list at all! > This is not an issue in practice, it just wastes a few cycles on > re-reading the todo list that matches what the sequencer already has > in memory anyway. > > However, an unsuspecting Git developer might try to "fix" it by simply > updating the stat data each time the sequencer writes the todo list > for an interactive rebase. On first sight it looks quite sensible and > straightforward, but we have to be very careful when doing that, > because potential racy problems lie that way. > > It is possible to overwrite the todo list file without modifying > either its inode or size, and if such an overwrite were to happen in > the same second when the file was last read (our stat data has one > second granularity by default), then the actual stat data on the file > system would match the stat data that the sequencer has in memory. > Consequently, such a modification to the todo list file would go > unnoticed. > [1] The todo file is written using the lockfile API, i.e. first to the > lockfile, which is then moved to place, so the new file can't > possibly have the same inode as the file it replaces. Note, > however, that the file system might reuse inodes, so it is > possible that the new todo file ends up with the same inode as is > recorded in the 'struct todo_list' from the last time the file was > read. Unfortunately, we already do have an issue here when the sequencer can overlook a modified todo list, but triggering it needs the lucky "alignment" of inodes as well. I can trigger it fairly reliably with the test below, but forcing the inode match makes it kind of gross and Linux-only. https://travis-ci.org/szeder/git/jobs/616460522#L1470 --- >8 --- diff --git a/t/t9999-rebase-racy-todo-reread.sh b/t/t9999-rebase-racy-todo-reread.sh new file mode 100755 index 0000000000..437ebd55e0 --- /dev/null +++ b/t/t9999-rebase-racy-todo-reread.sh @@ -0,0 +1,87 @@ +#!/bin/sh + +test_description='racy edit todo reread problem' + +. ./test-lib.sh + +test_expect_success 'setup' ' + test_commit first_ && + test_commit second && + test_commit third_ && + test_commit fourth && + test_commit fifth_ && + test_commit sixth_ && + + write_script sequence-editor <<-\EOS && + todo=.git/rebase-merge/git-rebase-todo && + cat >"$todo" <<-EOF + reword $(git rev-parse second^0) second + reword $(git rev-parse third_^0) third_ + reword $(git rev-parse fourth^0) fourth + EOF + EOS + + write_script commit-editor <<-\EOS && + read first_line <"$1" && + echo "$first_line - edited" >"$1" && + + todo=.git/rebase-merge/git-rebase-todo && + + if test "$first_line" = second + then + stat --format=%i "$todo" >expected-ino + elif test "$first_line" = third_ + then + ino=$(cat expected-ino) && + file=$(find . -inum $ino) && + if test -n "$file" + then + echo && + echo "Trying to free inode $ino by moving \"$file\" out of the way" && + cp -av "$file" "$file".tmp && + rm -fv "$file" + fi && + + cat >"$todo".tmp <<-EOF && + reword $(git rev-parse fifth_^0) fifth_ + reword $(git rev-parse sixth_^0) sixth_ + EOF + mv -v "$todo".tmp "$todo" && + + if test "$ino" -eq $(stat --format=%i "$todo") + then + echo "Yay! The todo list did get inode $ino, just what the sequencer is expecting!" + fi && + + if test -n "$file" + then + mv -v "$file".tmp "$file" + fi + fi + EOS + + cat >expect <<-\EOF + sixth_ - edited + fifth_ - edited + third_ - edited + second - edited + first_ + EOF +' + +for trial in 0 1 2 3 4 +do + test_expect_success "demonstrate racy todo re-read problem #$trial" ' + git reset --hard fourth && + >expected-ino && # placeholder + + GIT_SEQUENCE_EDITOR=./sequence-editor \ + GIT_EDITOR=./commit-editor \ + git rebase -i HEAD^^^ && + + git log --format=%s >actual && + test_cmp expect actual + ' +done + +test_done
On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote: > To notice a changed todo file the sequencer stores the file's stat > data in its 'struct todo_list' instance, and compares it with the > file's current stat data after 'reword', 'squash' and 'exec' > instructions. If the two stat data doesn't match, it re-reads the > todo file. > > Sounds simple, but there are some subtleties going on here: > > - The 'struct todo_list' holds the stat data from the time when the > todo file was last read. > > - This stat data in 'struct todo_list' is not updated when the > sequencer itself writes the todo file. > > - Before executing each instruction during an interactive rebase, > the sequencer always updates the todo file by removing the > just-about-to-be-executed instruction. This changes the file's > size and inode [1]. > > Consequently, when the sequencer looks at the stat data after a > 'reword', 'squash' or 'exec' instruction, it most likely finds that > they differ, even when the user didn't modify the todo list at all! > This is not an issue in practice, it just wastes a few cycles on > re-reading the todo list that matches what the sequencer already has > in memory anyway. It can be much more than just a few cycles, because the total number of parsed instructions from all the todo file re-reads can go quadratic with the number of rebased commits. The simple test below runs 'git rebase -i -x' on 1000 commits, which takes over 14seconds to run. If it doesn't re-read the todo file at all (I simply deleted the whole condition block checking the stat data and re-reading) it runs for only ~2.5secs. Just another angle to consider... --- >8 --- test_expect_success 'test' ' num_commits=1000 && test_commit_bulk --filename=file $num_commits && /usr/bin/time -f "Elapsed time: %e" \ git rebase -i --root -x true 2>out && grep "Executing: true" out >actual && test_line_count = $num_commits actual && # show the elapsed time tail -n2 out ' --- >8 ---
On 25/11/2019 13:18, SZEDER Gábor wrote: > On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote: >> To notice a changed todo file the sequencer stores the file's stat >> data in its 'struct todo_list' instance, and compares it with the >> file's current stat data after 'reword', 'squash' and 'exec' >> instructions. If the two stat data doesn't match, it re-reads the >> todo file. >> >> Sounds simple, but there are some subtleties going on here: >> >> - The 'struct todo_list' holds the stat data from the time when the >> todo file was last read. >> >> - This stat data in 'struct todo_list' is not updated when the >> sequencer itself writes the todo file. >> >> - Before executing each instruction during an interactive rebase, >> the sequencer always updates the todo file by removing the >> just-about-to-be-executed instruction. This changes the file's >> size and inode [1]. >> >> Consequently, when the sequencer looks at the stat data after a >> 'reword', 'squash' or 'exec' instruction, it most likely finds that >> they differ, even when the user didn't modify the todo list at all! >> This is not an issue in practice, it just wastes a few cycles on >> re-reading the todo list that matches what the sequencer already has >> in memory anyway. > > It can be much more than just a few cycles, because the total number > of parsed instructions from all the todo file re-reads can go > quadratic with the number of rebased commits. > > The simple test below runs 'git rebase -i -x' on 1000 commits, which > takes over 14seconds to run. If it doesn't re-read the todo file at > all (I simply deleted the whole condition block checking the stat data > and re-reading) it runs for only ~2.5secs. > > Just another angle to consider... I know dscho was keen to avoid re-parsing the list all the time [1] presumably because of the quadratic behavior. (He also assumed most people were using ns stat times [2] but that appears not to be the case) Could we just compare the text of the todo list on disk to whats in todo->buf.buf (with an appropriate offset)? That would avoid parsing the text and looking up all the commits with get_oid() Best Wishes Phillip [1] https://public-inbox.org/git/alpine.DEB.2.20.1703021617510.3767@virtualbox/ [2] https://public-inbox.org/git/alpine.DEB.2.20.1704131526500.2135@virtualbox/ > > --- >8 --- > > test_expect_success 'test' ' > num_commits=1000 && > test_commit_bulk --filename=file $num_commits && > > /usr/bin/time -f "Elapsed time: %e" \ > git rebase -i --root -x true 2>out && > > grep "Executing: true" out >actual && > test_line_count = $num_commits actual && > > # show the elapsed time > tail -n2 out > ' > > --- >8 --- >
On Mon, Nov 25, 2019 at 02:43:07PM +0000, Phillip Wood wrote: > On 25/11/2019 13:18, SZEDER Gábor wrote: > >On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote: > >>To notice a changed todo file the sequencer stores the file's stat > >>data in its 'struct todo_list' instance, and compares it with the > >>file's current stat data after 'reword', 'squash' and 'exec' > >>instructions. If the two stat data doesn't match, it re-reads the > >>todo file. > >> > >>Sounds simple, but there are some subtleties going on here: > >> > >> - The 'struct todo_list' holds the stat data from the time when the > >> todo file was last read. > >> > >> - This stat data in 'struct todo_list' is not updated when the > >> sequencer itself writes the todo file. > >> > >> - Before executing each instruction during an interactive rebase, > >> the sequencer always updates the todo file by removing the > >> just-about-to-be-executed instruction. This changes the file's > >> size and inode [1]. > >> > >>Consequently, when the sequencer looks at the stat data after a > >>'reword', 'squash' or 'exec' instruction, it most likely finds that > >>they differ, even when the user didn't modify the todo list at all! > >>This is not an issue in practice, it just wastes a few cycles on > >>re-reading the todo list that matches what the sequencer already has > >>in memory anyway. > > > >It can be much more than just a few cycles, because the total number > >of parsed instructions from all the todo file re-reads can go > >quadratic with the number of rebased commits. > > > >The simple test below runs 'git rebase -i -x' on 1000 commits, which > >takes over 14seconds to run. If it doesn't re-read the todo file at > >all (I simply deleted the whole condition block checking the stat data > >and re-reading) it runs for only ~2.5secs. > > > >Just another angle to consider... > > I know dscho was keen to avoid re-parsing the list all the time [1] > presumably because of the quadratic behavior. (He also assumed most people > were using ns stat times [2] but that appears not to be the case) Nanosecond file timestamp comparisons are only enabled by the USE_NSEC macro, which is only defined if the USE_NSEC Makefile knob is enabled, but that is not enabled by default. Then there is the related NO_NSEC Makefile knob: # Define NO_NSEC if your "struct stat" does not have "st_ctim.tv_nsec" # available. This automatically turns USE_NSEC off. As Dscho mentioned in [2], we do disable nanosecond file timestamp comparisons in 'config.mak.uname' on a handful of platforms by setting NO_NSEC. This, however, does not mean that nanosec timestamps are enabled on other platforms by default. > Could we > just compare the text of the todo list on disk to whats in todo->buf.buf > (with an appropriate offset)? That would avoid parsing the text and looking > up all the commits with get_oid() Comparing the contents without parsing is still quadratic in the size of the todo list, though I suppose with a much lower constant factor than actually parsing it. > [1] > https://public-inbox.org/git/alpine.DEB.2.20.1703021617510.3767@virtualbox/ > [2] > https://public-inbox.org/git/alpine.DEB.2.20.1704131526500.2135@virtualbox/ > > > > > --- >8 --- > > > >test_expect_success 'test' ' > > num_commits=1000 && > > test_commit_bulk --filename=file $num_commits && > > > > /usr/bin/time -f "Elapsed time: %e" \ > > git rebase -i --root -x true 2>out && > > > > grep "Executing: true" out >actual && > > test_line_count = $num_commits actual && > > > > # show the elapsed time > > tail -n2 out > >' > > > > --- >8 --- > >
On 25/11/2019 15:15, SZEDER Gábor wrote: > On Mon, Nov 25, 2019 at 02:43:07PM +0000, Phillip Wood wrote: >> On 25/11/2019 13:18, SZEDER Gábor wrote: >>> On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote: >>>> To notice a changed todo file the sequencer stores the file's stat >>>> data in its 'struct todo_list' instance, and compares it with the >>>> file's current stat data after 'reword', 'squash' and 'exec' >>>> instructions. If the two stat data doesn't match, it re-reads the >>>> todo file. >>>> >>>> Sounds simple, but there are some subtleties going on here: >>>> >>>> - The 'struct todo_list' holds the stat data from the time when the >>>> todo file was last read. >>>> >>>> - This stat data in 'struct todo_list' is not updated when the >>>> sequencer itself writes the todo file. >>>> >>>> - Before executing each instruction during an interactive rebase, >>>> the sequencer always updates the todo file by removing the >>>> just-about-to-be-executed instruction. This changes the file's >>>> size and inode [1]. >>>> >>>> Consequently, when the sequencer looks at the stat data after a >>>> 'reword', 'squash' or 'exec' instruction, it most likely finds that >>>> they differ, even when the user didn't modify the todo list at all! >>>> This is not an issue in practice, it just wastes a few cycles on >>>> re-reading the todo list that matches what the sequencer already has >>>> in memory anyway. >>> >>> It can be much more than just a few cycles, because the total number >>> of parsed instructions from all the todo file re-reads can go >>> quadratic with the number of rebased commits. >>> >>> The simple test below runs 'git rebase -i -x' on 1000 commits, which >>> takes over 14seconds to run. If it doesn't re-read the todo file at >>> all (I simply deleted the whole condition block checking the stat data >>> and re-reading) it runs for only ~2.5secs. >>> >>> Just another angle to consider... >> >> I know dscho was keen to avoid re-parsing the list all the time [1] >> presumably because of the quadratic behavior. (He also assumed most people >> were using ns stat times [2] but that appears not to be the case) > > Nanosecond file timestamp comparisons are only enabled by the USE_NSEC > macro, which is only defined if the USE_NSEC Makefile knob is enabled, > but that is not enabled by default. > > Then there is the related NO_NSEC Makefile knob: > > # Define NO_NSEC if your "struct stat" does not have "st_ctim.tv_nsec" > # available. This automatically turns USE_NSEC off. > > As Dscho mentioned in [2], we do disable nanosecond file timestamp > comparisons in 'config.mak.uname' on a handful of platforms by setting > NO_NSEC. This, however, does not mean that nanosec timestamps are > enabled on other platforms by default. > >> Could we >> just compare the text of the todo list on disk to whats in todo->buf.buf >> (with an appropriate offset)? That would avoid parsing the text and looking >> up all the commits with get_oid() > > Comparing the contents without parsing is still quadratic in the size > of the todo list, though I suppose with a much lower constant factor > than actually parsing it. The patch below (assuming thunderbird doesn't mangle it) reduces the time to run your bulk commit test from 30s to 7s, if I delete the condition block which checks the stat data it takes 4.7s on my (somewhat ancient) laptop. So there is a cost to the string comparison approach but it's much less that the full todo list parsing. Best Wishes Phillip --- >8 --- diff --git a/sequencer.c b/sequencer.c index 8952cfa89b..a3efdae0a5 100644 --- a/sequencer.c +++ b/sequencer.c @@ -3909,12 +3909,17 @@ static int pick_commits(struct repository *r, arg, item->arg_len, opts, res, 0); } else if (check_todo && !res) { - struct stat st; - - if (stat(get_todo_path(opts), &st)) { - res = error_errno(_("could not stat '%s'"), + int offset; + struct strbuf buf = STRBUF_INIT; + if (strbuf_read_file(&buf, get_todo_path(opts), 8096) < 0) + res = error_errno(_("could not read '%s'"), get_todo_path(opts)); - } else if (match_stat_data(&todo_list->stat, &st)) { + + offset = get_item_line_offset(todo_list, + todo_list->current + 1); + if (buf.len != todo_list->buf.len - offset || + memcmp(buf.buf, todo_list->buf.buf + offset, buf.len)) { + fputs("re-reading todo list\n", stderr); /* Reread the todo file if it has changed. */ todo_list_release(todo_list); if (read_populate_todo(r, todo_list, opts))
diff --git a/t/t3429-rebase-edit-todo.sh b/t/t3429-rebase-edit-todo.sh index 8739cb60a7..c80c139b47 100755 --- a/t/t3429-rebase-edit-todo.sh +++ b/t/t3429-rebase-edit-todo.sh @@ -52,4 +52,63 @@ test_expect_success 'todo is re-read after reword and squash' ' test_cmp expected actual ' +test_expect_success 'setup for racy tests' ' + write_script sequence-editor <<-EOS && + cat >.git/rebase-merge/git-rebase-todo <<-\EOF + r $(git rev-parse second^0) second + r $(git rev-parse third^0) third + EOF + EOS + + write_script commit-editor <<-\EOS && + read first_line <"$1" && + echo "$first_line - edited" >"$1" && + + todo=.git/rebase-merge/git-rebase-todo && + + if test "$first_line" = second + then + old_size=$(wc -c <"$todo") && + # Replace the "reword <full-oid> third" line with + # a different instruction of the same line length. + # Overwrite the file in-place, so the inode stays + # the same as well. + cat >"$todo" <<-EOF && + exec echo 0123456789012345678901234 >exec-cmd-was-run + EOF + new_size=$(wc -c <"$todo") && + + if test $old_size -ne $new_size + then + echo >&2 "error: bug in the test script: the size of the todo list must not change" + exit 1 + fi + fi + EOS + + cat >expect <<-\EOF + second - edited + first + EOF +' + +# This test can give false success if your machine is sufficiently +# slow or all trials happened to happen on second boundaries. + +for trial in 0 1 2 3 4 +do + test_expect_success "racy todo re-read #$trial" ' + git reset --hard third && + rm -rf exec-cmd-was-run && + + GIT_SEQUENCE_EDITOR=./sequence-editor \ + GIT_EDITOR=./commit-editor \ + git rebase -i HEAD^^ && + + test_path_is_file exec-cmd-was-run && + git log --format=%s >actual && + test_cmp expect actual + ' +done + test_done