mbox series

[0/4] midx: avoid potential rename-while-mapped on Windows

Message ID cover.1633729502.git.me@ttaylorr.com (mailing list archive)
Headers show
Series midx: avoid potential rename-while-mapped on Windows | expand

Message

Taylor Blau Oct. 8, 2021, 9:46 p.m. UTC
(This topic is based on tb/repack-write-midx).

While looking through some failures in t5319 when Git is compiled with
SANITIZE=leak, I noticed a curious pattern that basically looks like:

    struct multi_pack_index *m = load_multi_pack_index(object_dir, 0);
    /* ... */
    write_midx_internal();

This can cause problems on platforms where a memory mapped file cannot be
overridden via a rename, because load_multi_pack_index() builds a new MIDX from
scratch each time it is called without adding that MIDX to the object store. But
write_midx_internal() loads a MIDX from the object store, causing the same
region to be mapped twice.

The details are fully spelled out in the second patch, but the gist is that
while write_midx_internal() unmaps *its* copy of the MIDX, the caller's copy is
still mapped in memory.

Strangely, I could not get this to produce a failure on Windows, even though I
would have expected one. I asked around off-list, and it sounds like it's
possible this behavior has changed, or Windows' mmap-equivalent may not lock
files that fit within a single page.

In any case, we let's side step the issue entirely by only interacting with
MIDXs that can be found from the object store.

Taylor Blau (4):
  midx.c: extract MIDX lookup by object_dir
  midx.c: lookup MIDX by object directory during expire
  midx.c: lookup MIDX by object directory during repack
  midx.c: guard against commit_lock_file() failures

 midx.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

Comments

Derrick Stolee Oct. 13, 2021, 1:29 p.m. UTC | #1
On 10/8/2021 5:46 PM, Taylor Blau wrote:
> (This topic is based on tb/repack-write-midx).
> 
> While looking through some failures in t5319 when Git is compiled with
> SANITIZE=leak, I noticed a curious pattern that basically looks like:
> 
>     struct multi_pack_index *m = load_multi_pack_index(object_dir, 0);
>     /* ... */
>     write_midx_internal();
> 
> This can cause problems on platforms where a memory mapped file cannot be
> overridden via a rename, because load_multi_pack_index() builds a new MIDX from
> scratch each time it is called without adding that MIDX to the object store. But
> write_midx_internal() loads a MIDX from the object store, causing the same
> region to be mapped twice.
> 
> The details are fully spelled out in the second patch, but the gist is that
> while write_midx_internal() unmaps *its* copy of the MIDX, the caller's copy is
> still mapped in memory.
> 
> Strangely, I could not get this to produce a failure on Windows, even though I
> would have expected one. I asked around off-list, and it sounds like it's
> possible this behavior has changed, or Windows' mmap-equivalent may not lock
> files that fit within a single page.

That's a good possibility. Once everything is loaded into memory, there is no
need for a file handle.
 
> In any case, we let's side step the issue entirely by only interacting with
> MIDXs that can be found from the object store.

I think this is a good strategy, and the patch series goes above and beyond
this one failure by making it easier to get the MIDX from a specific alternate.

Thanks,
-Stolee