
Scrub racing with btrfs workers on umount

Message ID 50AB86D5.6040501@jan-o-sch.net (mailing list archive)
State New, archived

Commit Message

Jan Schmidt Nov. 20, 2012, 1:34 p.m. UTC
I found a race condition in the way scrub interacts with btrfs workers:

Please carefully distinguish "workers" as struct btrfs_workers from "worker" as
struct btrfs_worker_thread.

Process A: umount
	Process B: scrub-workers
		Process C: genwork-worker

A: close_ctree()
A: scrub_cancel()
	B: scrub_workers_put()
	B: btrfs_stop_workers()
	B: -> lock(workers->lock)
	B: -> splice idle worker list into worker list
	B: -> while (worker list)
	B: ->	worker = delete from worker list
	B: ->	unlock(workers->lock)
	B: ->	kthread_stop(worker)
		C: __btrfs_start_workers()
		C: -> kthread_run starts new scrub-worker
		C: -> lock(workers->lock) /* from scrub-workers */
		C: -> add to workers idle list
		C: -> unlock(workers->lock)
	B: ->	lock(workers->lock)
A: stop genwork-worker

In that situation, the scrub-workers have one idle worker but return from
btrfs_stop_workers(). There are several strategies to fix this:

1. Let close_ctree call scrub_cancel() after stopping the generic worker.

2. Let btrfs_stop_workers be more careful on insertions to the idle list, like:

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 58b7d14..2523781 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -431,6 +431,8 @@ void btrfs_stop_workers(struct btrfs_workers *workers)
                        kthread_stop(worker->task);
                spin_lock_irq(&workers->lock);
                put_worker(worker);
+               /* we dropped the lock above, check for new idle workers */
+               list_splice_init(&workers->idle_list, &workers->worker_list);
        }
        spin_unlock_irq(&workers->lock);
 }

3. Find out why __btrfs_start_workers() gets called in the above scenario at all.

Any thoughts? Otherwise, I'm going for option 3 and will call back.

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Chris Mason Nov. 21, 2012, 12:58 p.m. UTC | #1
On Tue, Nov 20, 2012 at 06:34:13AM -0700, Jan Schmidt wrote:
> I found a race condition in the way scrub interacts with btrfs workers:
> 
> Please carefully distinguish "workers" as struct btrfs_workers from "worker" as
> struct btrfs_worker_thread.
> 
> Process A: umount
> 	Process B: scrub-workers
> 		Process C: genwork-worker
> 
> A: close_ctree()
> A: scrub_cancel()
> 	B: scrub_workers_put()
> 	B: btrfs_stop_workers()
> 	B: -> lock(workers->lock)
> 	B: -> splice idle worker list into worker list
> 	B: -> while (worker list)
> 	B: ->	worker = delete from worker list
> 	B: ->	unlock(workers->lock)
> 	B: ->	kthread_stop(worker)
> 		C: __btrfs_start_workers()
> 		C: -> kthread_run starts new scrub-worker
> 		C: -> lock(workers->lock) /* from scrub-workers */
> 		C: -> add to workers idle list
> 		C: -> unlock(workers->lock)
> 	B: ->	lock(workers->lock)
> A: stop genwork-worker
> 
> In that situation, the scrub-workers have one idle worker but return from
> btrfs_stop_workers(). There are several strategies to fix this:
> 
> 1. Let close_ctree call scrub_cancel() after stopping the generic worker.
> 
> 2. Let btrfs_stop_workers be more careful on insertions to the idle list, like:
> 
> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
> index 58b7d14..2523781 100644
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -431,6 +431,8 @@ void btrfs_stop_workers(struct btrfs_workers *workers)
>                         kthread_stop(worker->task);
>                 spin_lock_irq(&workers->lock);
>                 put_worker(worker);
> +               /* we dropped the lock above, check for new idle workers */
> +               list_splice_init(&workers->idle_list, &workers->worker_list);
>         }
>         spin_unlock_irq(&workers->lock);
>  }
> 
> 3. Find out why __btrfs_start_workers() gets called in the above scenario at all.
> 
> Any thoughts? Otherwise, I'm going for option 3 and will call back.

I'd go for a mixture of 1 and 3.  My first guess is that we're getting
endio events for scrub reads and that is kicking the scrub state
machine.

-chris


Patch

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 58b7d14..2523781 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -431,6 +431,8 @@ void btrfs_stop_workers(struct btrfs_workers *workers)
                        kthread_stop(worker->task);
                spin_lock_irq(&workers->lock);
                put_worker(worker);
+               /* we dropped the lock above, check for new idle workers */
+               list_splice_init(&workers->idle_list, &workers->worker_list);
        }
        spin_unlock_irq(&workers->lock);
 }