[05/16] fsmonitor--daemon: refactor cookie handling for readability

Message ID 84df95be620c76afed73d1679722459e2ff32018.1647033303.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Series Builtin FSMonitor Part 2.5

Commit Message

Jeff Hostetler March 11, 2022, 9:14 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

fixup! fsmonitor--daemon: use a cookie file to sync with file system

Use implicit definitions for FCIR_ enum values.

Remove const from cookie->name.

Reverse if then and else branches around open() to ease readability.

Document that we don't care about errors from close() and unlink().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 53 +++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 23 deletions(-)
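
Note on "implicit definitions": in C, an enumerator without an explicit value takes the previous enumerator's value plus one, so dropping "= 0" from FCIR_INIT should not change any value. A minimal sketch that checks this at compile time, assuming a C11 compiler; the helper cookie_is_pending() is hypothetical and not part of the patch:

```c
#include <assert.h>

/*
 * Same enumerators as in the patch. Only FCIR_ERROR has an explicit
 * value; each later enumerator is the previous value plus one.
 */
enum fsmonitor_cookie_item_result {
	FCIR_ERROR = -1, /* could not create cookie file ? */
	FCIR_INIT,       /* implicitly 0 */
	FCIR_SEEN,       /* implicitly 1 */
	FCIR_ABORT,      /* implicitly 2 */
};

/* Compile-time checks (C11 static_assert from <assert.h>). */
static_assert(FCIR_INIT == 0, "FCIR_INIT follows -1, so it is 0");
static_assert(FCIR_SEEN == 1, "FCIR_SEEN is FCIR_INIT + 1");
static_assert(FCIR_ABORT == 2, "FCIR_ABORT is FCIR_SEEN + 1");

/* Hypothetical helper: true while the cookie has not been resolved. */
static int cookie_is_pending(enum fsmonitor_cookie_item_result r)
{
	return r == FCIR_INIT;
}
```

The explicit and implicit spellings are equivalent here; the implicit form only removes a redundant initializer.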

Comments

Ævar Arnfjörð Bjarmason March 14, 2022, 8 a.m. UTC | #1
On Fri, Mar 11 2022, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> fixup! fsmonitor--daemon: use a cookie file to sync with file system
>
> Use implicit definitions for FCIR_ enum values.
>
> Remove const from cookie->name.
>
> Reverse if then and else branches around open() to ease readability.
>
> Document that we don't care about errors from close() and unlink().
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  builtin/fsmonitor--daemon.c | 53 +++++++++++++++++++++----------------
>  1 file changed, 30 insertions(+), 23 deletions(-)
>
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 97ca2a356e5..02a99ce98a2 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -109,14 +109,14 @@ static int do_as_client__status(void)
>  
>  enum fsmonitor_cookie_item_result {
>  	FCIR_ERROR = -1, /* could not create cookie file ? */
> -	FCIR_INIT = 0,
> +	FCIR_INIT,
>  	FCIR_SEEN,
>  	FCIR_ABORT,
>  };
>  
>  struct fsmonitor_cookie_item {
>  	struct hashmap_entry entry;
> -	const char *name;
> +	char *name;
>  	enum fsmonitor_cookie_item_result result;
>  };
>  
> @@ -166,37 +166,44 @@ static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
>  	 * that the listener thread has seen it.
>  	 */
>  	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
> -	if (fd >= 0) {
> -		close(fd);
> -		unlink(cookie_pathname.buf);
> -
> -		/*
> -		 * Technically, this is an infinite wait (well, unless another
> -		 * thread sends us an abort).  I'd like to change this to
> -		 * use `pthread_cond_timedwait()` and return an error/timeout
> -		 * and let the caller do the trivial response thing, but we
> -		 * don't have that routine in our thread-utils.
> -		 *
> -		 * After extensive beta testing I'm not really worried about
> -		 * this.  Also note that the above open() and unlink() calls
> -		 * will cause at least two FS events on that path, so the odds
> -		 * of getting stuck are pretty slim.
> -		 */
> -		while (cookie->result == FCIR_INIT)
> -			pthread_cond_wait(&state->cookies_cond,
> -					  &state->main_lock);
> -	} else {
> +	if (fd < 0) {
>  		error_errno(_("could not create fsmonitor cookie '%s'"),
>  			    cookie->name);
>  
>  		cookie->result = FCIR_ERROR;
> +		goto done;
>  	}
>  
> +	/*
> +	 * Technically, close() and unlink() can fail, but we don't
> +	 * care here.  We only created the file to trigger a watch
> +	 * event from the FS to know that when we're up to date.
> +	 */
> +	close(fd);

It still seems odd to explicitly want to ignore close() return values.

I realize that we do in (too many) existing places, but why wouldn't we
want to e.g. catch an I/O error here early?

Derrick Stolee March 14, 2022, 2:49 p.m. UTC | #2
On 3/14/2022 4:00 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Mar 11 2022, Jeff Hostetler via GitGitGadget wrote:

>> +	/*
>> +	 * Technically, close() and unlink() can fail, but we don't
>> +	 * care here.  We only created the file to trigger a watch
>> +	 * event from the FS to know that when we're up to date.
>> +	 */
>> +	close(fd);
> 
> It still seems odd to explicitly want to ignore close() return values.
> 
> I realize that we do in (too many) existing places, but why wouldn't we
> want to e.g. catch an I/O error here early?

What exactly do you propose we do here if there is an I/O error
during close()?

Thanks,
-Stolee

Junio C Hamano March 14, 2022, 5:47 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

> On 3/14/2022 4:00 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Fri, Mar 11 2022, Jeff Hostetler via GitGitGadget wrote:
>
>>> +	/*
>>> +	 * Technically, close() and unlink() can fail, but we don't
>>> +	 * care here.  We only created the file to trigger a watch
>>> +	 * event from the FS to know that when we're up to date.
>>> +	 */
>>> +	close(fd);
>> 
>> It still seems odd to explicitly want to ignore close() return values.
>> 
>> I realize that we do in (too many) existing places, but why wouldn't we
>> want to e.g. catch an I/O error here early?
>
> What exactly do you propose we do here if there is an I/O error
> during close()?

We created the file to trigger a watch event, but now we have a
reason to suspect that the wished-for watch event may not come.

We only did so to know that when we're up to date.  Now we may never
know?  We may go without realizing we are already up to date a bit
longer than the reality?

How much damage would it cause us to miss a watch event in this
case?  Very little?  Is it a thing that sysadmins may care if we see
too many of, but there is nothing the end user can immediately do
about?  If it is, perhaps a trace2 event to report it (and other "we
do not care here" syscalls that fail)?
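
One way to picture the "report it but carry on" idea is the sketch below. It is only an illustration under stated assumptions: note_ignored_failure() and touch_cookie() are hypothetical names, and the fprintf() call merely stands in for whatever reporting mechanism (such as a trace2 event) would really be used. A failed open() is still a hard error, while failures from close() and unlink() are counted and logged without changing the result.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical counter of syscall failures we chose not to act on. */
static unsigned long ignored_failures;

/* Hypothetical stand-in for a trace event: log and count, nothing more. */
static void note_ignored_failure(const char *what, const char *path)
{
	int saved_errno = errno; /* fprintf() may clobber errno */

	ignored_failures++;
	fprintf(stderr, "ignored: %s('%s') failed: %s\n",
		what, path, strerror(saved_errno));
}

/*
 * Create and immediately delete a cookie file so that the file system
 * emits events for it. Returns 0 if the file was created, -1 if open()
 * failed. Failures of close() and unlink() are reported but not fatal.
 */
static int touch_cookie(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;

	if (close(fd) < 0)
		note_ignored_failure("close", path);
	if (unlink(path) < 0)
		note_ignored_failure("unlink", path);

	return 0;
}
```

With this shape, a caller only sees an error when the cookie could not be created at all; the counter gives an administrator something to look at if the "don't care" failures ever become frequent.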

Jeff Hostetler March 21, 2022, 7:26 p.m. UTC | #4
On 3/14/22 1:47 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>> On 3/14/2022 4:00 AM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Fri, Mar 11 2022, Jeff Hostetler via GitGitGadget wrote:
>>
>>>> +	/*
>>>> +	 * Technically, close() and unlink() can fail, but we don't
>>>> +	 * care here.  We only created the file to trigger a watch
>>>> +	 * event from the FS to know that when we're up to date.
>>>> +	 */
>>>> +	close(fd);
>>>
>>> It still seems odd to explicitly want to ignore close() return values.
>>>
>>> I realize that we do in (too many) existing places, but why wouldn't we
>>> want to e.g. catch an I/O error here early?
>>
>> What exactly do you propose we do here if there is an I/O error
>> during close()?
> 
> We created the file to trigger a watch event, but now we have a
> reason to suspect that the wished-for watch event may not come.
> 
> We only did so to know that when we're up to date.  Now we may never
> know?  We may go without realizing we are already up to date a bit
> longer than the reality?
> 
> How much damage would it cause us to miss a watch event in this
> case?  Very little?  Is it a thing that sysadmins may care if we see
> too many of, but there is nothing the end user can immediately do
> about?  If it is, perhaps a trace2 event to report it (and other "we
> do not care here" syscalls that fail)?
> 
> 
> 

The open(... O_CREAT ...) succeeded, so we actually created a
new file and expect a FS event for it.  That FS event (when seen
by the FS listener thread) will cause our condition to be
signaled and allow this thread to wake up and respond to the client.

The odds of the close() failing on a plain file (after a successful
open()) are very slim.  And there's nothing that we can do about
the failure anyway.  (And we're not relying on an FS event from the
close() succeeding, so it really doesn't matter.)   Technically, it
is possible that the daemon could run out of fd's if this close()
fails often, so at some point the daemon might not be able to create
new cookie files.  But the daemon currently defaults to sending a
trivial response to the client -- if this turns out to be a real
issue, we could have the daemon restart or something, but I'm not
going to worry about that right now.

A failure in unlink() is a little more interesting.
This would mean that a stale cookie file would be left in the
cookie directory (and waste a little disk space).  But that is
not likely either (for a plain file that we just created).
Since we're not relying on the FS event for the unlink(), the
failure here won't block the current thread either.  Deleting
stale cookie files is something that we could try to address
in the future if it turns out to be a problem.

Jeff

Patch

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 97ca2a356e5..02a99ce98a2 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -109,14 +109,14 @@  static int do_as_client__status(void)
 
 enum fsmonitor_cookie_item_result {
 	FCIR_ERROR = -1, /* could not create cookie file ? */
-	FCIR_INIT = 0,
+	FCIR_INIT,
 	FCIR_SEEN,
 	FCIR_ABORT,
 };
 
 struct fsmonitor_cookie_item {
 	struct hashmap_entry entry;
-	const char *name;
+	char *name;
 	enum fsmonitor_cookie_item_result result;
 };
 
@@ -166,37 +166,44 @@  static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
 	 * that the listener thread has seen it.
 	 */
 	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
-	if (fd >= 0) {
-		close(fd);
-		unlink(cookie_pathname.buf);
-
-		/*
-		 * Technically, this is an infinite wait (well, unless another
-		 * thread sends us an abort).  I'd like to change this to
-		 * use `pthread_cond_timedwait()` and return an error/timeout
-		 * and let the caller do the trivial response thing, but we
-		 * don't have that routine in our thread-utils.
-		 *
-		 * After extensive beta testing I'm not really worried about
-		 * this.  Also note that the above open() and unlink() calls
-		 * will cause at least two FS events on that path, so the odds
-		 * of getting stuck are pretty slim.
-		 */
-		while (cookie->result == FCIR_INIT)
-			pthread_cond_wait(&state->cookies_cond,
-					  &state->main_lock);
-	} else {
+	if (fd < 0) {
 		error_errno(_("could not create fsmonitor cookie '%s'"),
 			    cookie->name);
 
 		cookie->result = FCIR_ERROR;
+		goto done;
 	}
 
+	/*
+	 * Technically, close() and unlink() can fail, but we don't
+	 * care here.  We only created the file to trigger a watch
+	 * event from the FS to know that when we're up to date.
+	 */
+	close(fd);
+	unlink(cookie_pathname.buf);
+
+	/*
+	 * Technically, this is an infinite wait (well, unless another
+	 * thread sends us an abort).  I'd like to change this to
+	 * use `pthread_cond_timedwait()` and return an error/timeout
+	 * and let the caller do the trivial response thing, but we
+	 * don't have that routine in our thread-utils.
+	 *
+	 * After extensive beta testing I'm not really worried about
+	 * this.  Also note that the above open() and unlink() calls
+	 * will cause at least two FS events on that path, so the odds
+	 * of getting stuck are pretty slim.
+	 */
+	while (cookie->result == FCIR_INIT)
+		pthread_cond_wait(&state->cookies_cond,
+				  &state->main_lock);
+
+done:
 	hashmap_remove(&state->cookies, &cookie->entry, NULL);
 
 	result = cookie->result;
 
-	free((char*)cookie->name);
+	free(cookie->name);
 	free(cookie);
 	strbuf_release(&cookie_pathname);