diff mbox series

[v2] core.fsync: new option to harden references

Message ID 47dd79106b93bb81750320d50ccaa74c24aacd28.1646992380.git.ps@pks.im (mailing list archive)
State Accepted
Commit bc22d845c4328f5bd896d019b3729f776ad4be4c
Headers show
Series [v2] core.fsync: new option to harden references | expand

Commit Message

Patrick Steinhardt March 11, 2022, 9:58 a.m. UTC
When writing both loose and packed references to disk we first create a
lockfile, write the updated values into that lockfile, and on commit we
rename the file into place. According to filesystem developers, this
behaviour is broken because applications should always sync data to disk
before doing the final rename to ensure data consistency [1][2][3]. If
applications fail to do this correctly, a hard crash of the machine can
easily result in corrupted on-disk data.

This kind of corruption can in fact be easily observed with Git when the
machine hard-resets shortly after writing references to disk. On
machines with ext4, this will likely lead to the "empty files" problem:
the file has been renamed, but its data has not been synced to disk. The
result is that the reference is corrupt, and in the worst case this can
lead to data loss.

Implement a new option to harden references so that users and admins can
avoid this scenario by syncing locked loose and packed references to
disk before we rename them into place.

[1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
[2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

Hi,

here's my updated patch series which implements syncing of refs. It
applies on top of Neeraj's v6 of "A design for future-proofing fsync()
configuration".

I've simplified these patches a bit:

    - I don't distinguishing between "loose" and "packed" refs anymore.
      I agree with Junio that it's probably not worth it, but we can
      still reintroduce the split at a later point without breaking
      backwards compatibility if the need comes up.

    - I've simplified the way loose refs are written to disk so that we
      now sync them when before we close their files. The previous
      implementation I had was broken because we tried to sync after
      closing.

Because this really only changes a few lines of code I've also decided
to squash together the patches into a single one.

Patrick

 Documentation/config/core.txt | 1 +
 cache.h                       | 7 +++++--
 config.c                      | 1 +
 refs/files-backend.c          | 1 +
 refs/packed-backend.c         | 3 ++-
 5 files changed, 10 insertions(+), 3 deletions(-)

Comments

SZEDER Gábor March 25, 2022, 6:11 a.m. UTC | #1
On Fri, Mar 11, 2022 at 10:58:59AM +0100, Patrick Steinhardt wrote:
> When writing both loose and packed references to disk we first create a
> lockfile, write the updated values into that lockfile, and on commit we
> rename the file into place. According to filesystem developers, this
> behaviour is broken because applications should always sync data to disk
> before doing the final rename to ensure data consistency [1][2][3]. If
> applications fail to do this correctly, a hard crash of the machine can
> easily result in corrupted on-disk data.
> 
> This kind of corruption can in fact be easily observed with Git when the
> machine hard-resets shortly after writing references to disk. On
> machines with ext4, this will likely lead to the "empty files" problem:
> the file has been renamed, but its data has not been synced to disk. The
> result is that the reference is corrupt, and in the worst case this can
> lead to data loss.
> 
> Implement a new option to harden references so that users and admins can
> avoid this scenario by syncing locked loose and packed references to
> disk before we rename them into place.

In 't5541-http-push-smart.sh' there is a test case called 'push 2000
tags over http', which does pretty much what it's title says.  This
patch makes that test case significantly slower.


diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index 8ca50f8b18..d7e94cb791 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -415,7 +415,7 @@ test_expect_success CMDLINE_LIMIT 'push 2000 tags over http' '
 	  sort |
 	  sed "s|.*|$sha1 refs/tags/really-long-tag-name-&|" \
 	  >.git/packed-refs &&
-	run_with_limited_cmdline git push --mirror
+	run_with_limited_cmdline /usr/bin/time git push --mirror
 '
 
 test_expect_success GPG 'push with post-receive to inspect certificate' '

Before this patch (bc22d845c4^) 'time' reports:

  3.62user 0.03system 0:03.83elapsed 95%CPU (0avgtext+0avgdata 11904maxresident)k
  0inputs+312outputs (0major+4597minor)pagefaults 0swaps

With this patch (bc22d845c4):

  3.56user 0.04system 0:33.60elapsed 10%CPU (0avgtext+0avgdata 11832maxresident)k
  0inputs+320outputs (0major+4578minor)pagefaults 0swaps

And the total runtime of the whole test script increases from 8-9s to
37-39s.


I wonder whether we should relax the fsync options for this test case.

> 
> [1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
> [2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> 
> Hi,
> 
> here's my updated patch series which implements syncing of refs. It
> applies on top of Neeraj's v6 of "A design for future-proofing fsync()
> configuration".
> 
> I've simplified these patches a bit:
> 
>     - I don't distinguishing between "loose" and "packed" refs anymore.
>       I agree with Junio that it's probably not worth it, but we can
>       still reintroduce the split at a later point without breaking
>       backwards compatibility if the need comes up.
> 
>     - I've simplified the way loose refs are written to disk so that we
>       now sync them when before we close their files. The previous
>       implementation I had was broken because we tried to sync after
>       closing.
> 
> Because this really only changes a few lines of code I've also decided
> to squash together the patches into a single one.
> 
> Patrick
> 
>  Documentation/config/core.txt | 1 +
>  cache.h                       | 7 +++++--
>  config.c                      | 1 +
>  refs/files-backend.c          | 1 +
>  refs/packed-backend.c         | 3 ++-
>  5 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index 37105a7be4..812cca7de7 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -575,6 +575,7 @@ but risks losing recent work in the event of an unclean system shutdown.
>  * `index` hardens the index when it is modified.
>  * `objects` is an aggregate option that is equivalent to
>    `loose-object,pack`.
> +* `reference` hardens references modified in the repo.
>  * `derived-metadata` is an aggregate option that is equivalent to
>    `pack-metadata,commit-graph`.
>  * `committed` is an aggregate option that is currently equivalent to
> diff --git a/cache.h b/cache.h
> index cde0900d05..033e5b0779 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1005,6 +1005,7 @@ enum fsync_component {
>  	FSYNC_COMPONENT_PACK_METADATA		= 1 << 2,
>  	FSYNC_COMPONENT_COMMIT_GRAPH		= 1 << 3,
>  	FSYNC_COMPONENT_INDEX			= 1 << 4,
> +	FSYNC_COMPONENT_REFERENCE		= 1 << 5,
>  };
>  
>  #define FSYNC_COMPONENTS_OBJECTS (FSYNC_COMPONENT_LOOSE_OBJECT | \
> @@ -1017,7 +1018,8 @@ enum fsync_component {
>  				  FSYNC_COMPONENTS_DERIVED_METADATA | \
>  				  ~FSYNC_COMPONENT_LOOSE_OBJECT)
>  
> -#define FSYNC_COMPONENTS_COMMITTED (FSYNC_COMPONENTS_OBJECTS)
> +#define FSYNC_COMPONENTS_COMMITTED (FSYNC_COMPONENTS_OBJECTS | \
> +				    FSYNC_COMPONENT_REFERENCE)
>  
>  #define FSYNC_COMPONENTS_ADDED (FSYNC_COMPONENTS_COMMITTED | \
>  				FSYNC_COMPONENT_INDEX)
> @@ -1026,7 +1028,8 @@ enum fsync_component {
>  			      FSYNC_COMPONENT_PACK | \
>  			      FSYNC_COMPONENT_PACK_METADATA | \
>  			      FSYNC_COMPONENT_COMMIT_GRAPH | \
> -			      FSYNC_COMPONENT_INDEX)
> +			      FSYNC_COMPONENT_INDEX | \
> +			      FSYNC_COMPONENT_REFERENCE)
>  
>  /*
>   * A bitmask indicating which components of the repo should be fsynced.
> diff --git a/config.c b/config.c
> index eb75f65338..3c9b6b589a 100644
> --- a/config.c
> +++ b/config.c
> @@ -1333,6 +1333,7 @@ static const struct fsync_component_name {
>  	{ "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
>  	{ "index", FSYNC_COMPONENT_INDEX },
>  	{ "objects", FSYNC_COMPONENTS_OBJECTS },
> +	{ "reference", FSYNC_COMPONENT_REFERENCE },
>  	{ "derived-metadata", FSYNC_COMPONENTS_DERIVED_METADATA },
>  	{ "committed", FSYNC_COMPONENTS_COMMITTED },
>  	{ "added", FSYNC_COMPONENTS_ADDED },
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index f59589d6cc..6521ee8af5 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1787,6 +1787,7 @@ static int write_ref_to_lockfile(struct ref_lock *lock,
>  	fd = get_lock_file_fd(&lock->lk);
>  	if (write_in_full(fd, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
>  	    write_in_full(fd, &term, 1) < 0 ||
> +	    fsync_component(FSYNC_COMPONENT_REFERENCE, get_lock_file_fd(&lock->lk)) < 0 ||
>  	    close_ref_gently(lock) < 0) {
>  		strbuf_addf(err,
>  			    "couldn't write '%s'", get_lock_file_path(&lock->lk));
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 27dd8c3922..9d704ccd3e 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1262,7 +1262,8 @@ static int write_with_updates(struct packed_ref_store *refs,
>  		goto error;
>  	}
>  
> -	if (close_tempfile_gently(refs->tempfile)) {
> +	if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
> +	    close_tempfile_gently(refs->tempfile)) {
>  		strbuf_addf(err, "error closing file %s: %s",
>  			    get_tempfile_path(refs->tempfile),
>  			    strerror(errno));
> -- 
> 2.35.1
>
diff mbox series

Patch

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 37105a7be4..812cca7de7 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -575,6 +575,7 @@  but risks losing recent work in the event of an unclean system shutdown.
 * `index` hardens the index when it is modified.
 * `objects` is an aggregate option that is equivalent to
   `loose-object,pack`.
+* `reference` hardens references modified in the repo.
 * `derived-metadata` is an aggregate option that is equivalent to
   `pack-metadata,commit-graph`.
 * `committed` is an aggregate option that is currently equivalent to
diff --git a/cache.h b/cache.h
index cde0900d05..033e5b0779 100644
--- a/cache.h
+++ b/cache.h
@@ -1005,6 +1005,7 @@  enum fsync_component {
 	FSYNC_COMPONENT_PACK_METADATA		= 1 << 2,
 	FSYNC_COMPONENT_COMMIT_GRAPH		= 1 << 3,
 	FSYNC_COMPONENT_INDEX			= 1 << 4,
+	FSYNC_COMPONENT_REFERENCE		= 1 << 5,
 };
 
 #define FSYNC_COMPONENTS_OBJECTS (FSYNC_COMPONENT_LOOSE_OBJECT | \
@@ -1017,7 +1018,8 @@  enum fsync_component {
 				  FSYNC_COMPONENTS_DERIVED_METADATA | \
 				  ~FSYNC_COMPONENT_LOOSE_OBJECT)
 
-#define FSYNC_COMPONENTS_COMMITTED (FSYNC_COMPONENTS_OBJECTS)
+#define FSYNC_COMPONENTS_COMMITTED (FSYNC_COMPONENTS_OBJECTS | \
+				    FSYNC_COMPONENT_REFERENCE)
 
 #define FSYNC_COMPONENTS_ADDED (FSYNC_COMPONENTS_COMMITTED | \
 				FSYNC_COMPONENT_INDEX)
@@ -1026,7 +1028,8 @@  enum fsync_component {
 			      FSYNC_COMPONENT_PACK | \
 			      FSYNC_COMPONENT_PACK_METADATA | \
 			      FSYNC_COMPONENT_COMMIT_GRAPH | \
-			      FSYNC_COMPONENT_INDEX)
+			      FSYNC_COMPONENT_INDEX | \
+			      FSYNC_COMPONENT_REFERENCE)
 
 /*
  * A bitmask indicating which components of the repo should be fsynced.
diff --git a/config.c b/config.c
index eb75f65338..3c9b6b589a 100644
--- a/config.c
+++ b/config.c
@@ -1333,6 +1333,7 @@  static const struct fsync_component_name {
 	{ "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
 	{ "index", FSYNC_COMPONENT_INDEX },
 	{ "objects", FSYNC_COMPONENTS_OBJECTS },
+	{ "reference", FSYNC_COMPONENT_REFERENCE },
 	{ "derived-metadata", FSYNC_COMPONENTS_DERIVED_METADATA },
 	{ "committed", FSYNC_COMPONENTS_COMMITTED },
 	{ "added", FSYNC_COMPONENTS_ADDED },
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f59589d6cc..6521ee8af5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1787,6 +1787,7 @@  static int write_ref_to_lockfile(struct ref_lock *lock,
 	fd = get_lock_file_fd(&lock->lk);
 	if (write_in_full(fd, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
 	    write_in_full(fd, &term, 1) < 0 ||
+	    fsync_component(FSYNC_COMPONENT_REFERENCE, get_lock_file_fd(&lock->lk)) < 0 ||
 	    close_ref_gently(lock) < 0) {
 		strbuf_addf(err,
 			    "couldn't write '%s'", get_lock_file_path(&lock->lk));
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 27dd8c3922..9d704ccd3e 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1262,7 +1262,8 @@  static int write_with_updates(struct packed_ref_store *refs,
 		goto error;
 	}
 
-	if (close_tempfile_gently(refs->tempfile)) {
+	if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
+	    close_tempfile_gently(refs->tempfile)) {
 		strbuf_addf(err, "error closing file %s: %s",
 			    get_tempfile_path(refs->tempfile),
 			    strerror(errno));