Message ID | 20240218194603.1210895-1-marcel@roethke.info (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] rerere: fix crash during clear | expand |
Marcel Röthke <marcel@roethke.info> writes: > When rerere_clear is called, for instance when aborting a rebase, and > the current conflict does not have a pre or postimage recorded git > crashes with a SEGFAULT in has_rerere_resolution when accessing the > status member of struct rerere_dir. I had to read this twice before realizing the reason why I found it hard to grok was because of a missing comma between "recorded" and "git". > This happens because scan_rerere_dir > only allocates the status field in struct rerere_dir when a post or > preimage was found. But that is not really the root cause, no? Readers following the above text are probably wondering why the preimage was not recorded, when a conflict resulted in stopping a mergy-command and invoking rerere machinery, before rerere_clear() got called. Is that something that usually happen? How? Do we have a reproduction sequence of such a state that we can make it into a new test in t4200 where we already have tests for "git rerere clear" and its friends? > In some cases a segfault may happen even if a post > or preimage was recorded if it was not for the variant of interest and > the number of the variant that is present is lower than the variant of > interest. Ditto. What sequence of events would lead to such a state? The answer *can* be ".git/rr-cache being a normal directory, the user can poke around, removing files randomly, which can create such a problematic situation", and the reproduction test *can* also be to simulate such an end-user action, but I am asking primarily because I want to make sure that we are *not* losing or failing to create necessary preimage files, causing this problem to users who do not muck with files in .git/rr-cache themselves. Thanks.
On 2024-02-19 17:22:43, Junio C Hamano wrote: > Marcel Röthke <marcel@roethke.info> writes: > > > When rerere_clear is called, for instance when aborting a rebase, and > > the current conflict does not have a pre or postimage recorded git > > crashes with a SEGFAULT in has_rerere_resolution when accessing the > > status member of struct rerere_dir. > > I had to read this twice before realizing the reason why I found it > hard to grok was because of a missing comma between "recorded" and > "git". fixed > > This happens because scan_rerere_dir > > only allocates the status field in struct rerere_dir when a post or > > preimage was found. > > But that is not really the root cause, no? Readers following the > above text are probably wondering why the preimage was not recorded, > when a conflict resulted in stopping a mergy-command and invoking > rerere machinery, before rerere_clear() got called. Is that > something that usually happen? How? Do we have a reproduction > sequence of such a state that we can make it into a new test in > t4200 where we already have tests for "git rerere clear" and its > friends? I'm unfortunately not sure how it happened. I do have the initial state of the repository and I think I know the commands that were executed, but I could not reproduce it. I will look into adding a test case though.
Marcel Röthke <marcel@roethke.info> writes: > On 2024-02-19 17:22:43, Junio C Hamano wrote: >> Marcel Röthke <marcel@roethke.info> writes: >> >> > When rerere_clear is called, for instance when aborting a rebase, and >> > the current conflict does not have a pre or postimage recorded git >> > crashes with a SEGFAULT in has_rerere_resolution when accessing the >> > status member of struct rerere_dir. >> >> I had to read this twice before realizing the reason why I found it >> hard to grok was because of a missing comma between "recorded" and >> "git". > > fixed > ... > I'm unfortunately not sure how it happened. I do have the initial state > of the repository and I think I know the commands that were executed, > but I could not reproduce it. > > I will look into adding a test case though. It has been about a month. Any new development on this topic? Thanks.
On 2024-03-24 14:51:17, Junio C Hamano wrote: > Marcel Röthke <marcel@roethke.info> writes: > > > On 2024-02-19 17:22:43, Junio C Hamano wrote: > >> Marcel Röthke <marcel@roethke.info> writes: > >> > >> > When rerere_clear is called, for instance when aborting a rebase, and > >> > the current conflict does not have a pre or postimage recorded git > >> > crashes with a SEGFAULT in has_rerere_resolution when accessing the > >> > status member of struct rerere_dir. > >> > >> I had to read this twice before realizing the reason why I found it > >> hard to grok was because of a missing comma between "recorded" and > >> "git". > > > > fixed > > ... > > I'm unfortunately not sure how it happened. I do have the initial state > > of the repository and I think I know the commands that were executed, > > but I could not reproduce it. > > > > I will look into adding a test case though. > > It has been about a month. Any new development on this topic? > > Thanks. No, I have been otherwise occupied. But I plan to get back to this soon. Probably this week, next week at the latest.
After reproducing this without manually corrupting the database, I noticed that the summary line is no longer accurate. Because it fixes two SEGFAULTs and one of them does not happen during clear. Can I change this in v3 or should I start a new thread?
Marcel Röthke <marcel@roethke.info> writes: > After reproducing this without manually corrupting the database, I > noticed that the summary line is no longer accurate. Because it fixes > two SEGFAULTs and one of them does not happen during clear. Can I change this > in v3 or should I start a new thread? Updating the title in the course of evolution of a topic/patch is perfectly normal. Please do so in v3 as continuation of the previous round. Thanks.
diff --git a/rerere.c b/rerere.c index ca7e77ba68..4683d6cbb1 100644 --- a/rerere.c +++ b/rerere.c @@ -219,6 +219,11 @@ static void read_rr(struct repository *r, struct string_list *rr) buf.buf[hexsz] = '\0'; id = new_rerere_id_hex(buf.buf); id->variant = variant; + /* + * make sure id->collection->status has enough space + * for the variant we are interested in + */ + fit_variant(id->collection, variant); string_list_insert(rr, path)->util = id; } strbuf_release(&buf);
When rerere_clear is called, for instance when aborting a rebase, and the current conflict does not have a pre or postimage recorded git crashes with a SEGFAULT in has_rerere_resolution when accessing the status member of struct rerere_dir. This happens because scan_rerere_dir only allocates the status field in struct rerere_dir when a post or preimage was found. In some cases a segfault may happen even if a post or preimage was recorded if it was not for the variant of interest and the number of the variant that is present is lower than the variant of interest. Solve this by making sure the status field is large enough to accommodate for the variant of interest so it can be accessed without checking if it is large enough. An alternative solution would be to always check before accessing the status field, but I think the chosen solution aligns better with the assumptions made elsewhere in the code. Signed-off-by: Marcel Röthke <marcel@roethke.info> --- Range-diff against v1: 1: 93f982d170 ! 1: 68178298fe rerere: fix crash in during clear @@ Metadata Author: Marcel Röthke <marcel@roethke.info> ## Commit message ## - rerere: fix crash in during clear + rerere: fix crash during clear When rerere_clear is called, for instance when aborting a rebase, and the current conflict does not have a pre or postimage recorded git @@ Commit message the number of the variant that is present is lower than the variant of interest. - This patch solves this by making sure the status field is large enough - to accommodate for the variant of interest so it can be accesses without + Solve this by making sure the status field is large enough to + accommodate for the variant of interest so it can be accessed without checking if it is large enough. An alternative solution would be to always check before accessing the @@ rerere.c: static void read_rr(struct repository *r, struct string_list *rr) buf.buf[hexsz] = '\0'; id = new_rerere_id_hex(buf.buf); id->variant = variant; -+ /* make sure id->collection->status has enough space -+ * for the variant we are interested in */ ++ /* ++ * make sure id->collection->status has enough space ++ * for the variant we are interested in ++ */ + fit_variant(id->collection, variant); string_list_insert(rr, path)->util = id; } rerere.c | 5 +++++ 1 file changed, 5 insertions(+)