From patchwork Fri Feb 6 03:26:33 2009
From: Mikulas Patocka
Date: Thu, 5 Feb 2009 22:26:33 -0500 (EST)
To: device-mapper development
Cc: Alasdair G Kergon, Jacky Kim, Milan Broz
Subject: Re: [dm-devel] Re: 2.6.28.2 & dm-snapshot or kcopyd Oops

> +		push(&job->kc->complete_jobs, job);
> +		wake(job->kc);

I found one more bug in my patch. When I push the job to the
complete_jobs queue, it may immediately be taken and cease to exist,
so the subsequent job->kc dereference is wrong. A new patch:

Under specific circumstances, the kcopyd callback could be called from
the thread that called dm_kcopyd_copy, not from the kcopyd workqueue:

	dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()

This code path is taken if the thread calling dm_kcopyd_copy is
delayed due to scheduling inside split_job/segment_complete and the
subjobs complete before the loop in split_job finishes (a standalone
sketch of this path follows below).

Snapshots depend on the fact that callbacks are called from the
single-threaded kcopyd workqueue and expect that there is no racing
between individual callbacks. Racing between callbacks can lead to
corruption of the exception store, and it can also cause exception
store callbacks to be called twice for the same exception --- a likely
reason for the crashes inside pending_complete() / remove_exception().
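To make the race concrete, here is a minimal standalone simulation
(illustrative only, not kernel code: the I/O and page handling are
stubbed out, and "remaining" and dispatch_subjob() are hypothetical
stand-ins for kcopyd's progress bookkeeping and subjob dispatch). It
models the worst case where every subjob's I/O finishes before
split_job's loop resumes, so the final atomic_dec_and_test -- and
therefore job->fn() -- runs in the submitting thread:

/*
 * Standalone illustration of the pre-patch race path (C11, userspace;
 * NOT kernel code).  Function and field names mirror dm-kcopyd; the
 * rest is simplified.  Compile with: cc -std=c11 race-sketch.c
 */
#include <stdatomic.h>
#include <stdio.h>

#define SPLIT_COUNT 8

struct kcopyd_job {
	atomic_int sub_jobs;
	int remaining;		/* stand-in for "data left to carve" */
	void (*fn)(int read_err, unsigned long write_err, void *context);
	void *context;
};

static void segment_complete(int read_err, unsigned long write_err,
			     void *context);

/* Worst case: the subjob's I/O completes (and its completion callback
 * runs) before split_job's submission loop gets to run again. */
static void dispatch_subjob(struct kcopyd_job *job)
{
	segment_complete(0, 0u, job);
}

static void segment_complete(int read_err, unsigned long write_err,
			     void *context)
{
	struct kcopyd_job *job = context;

	if (job->remaining > 0) {
		job->remaining--;	/* carve off one sub-range */
		dispatch_subjob(job);	/* its completion re-enters here */
	} else if (atomic_fetch_sub(&job->sub_jobs, 1) == 1) {
		/* Emulates the kernel's atomic_dec_and_test(): the last
		 * reference was just dropped.  Pre-patch, the notify
		 * callback ran right here, in whatever thread got here
		 * last -- below, that is split_job's own thread. */
		job->fn(read_err, write_err, job->context);
	}
}

static void split_job(struct kcopyd_job *job)
{
	int i;

	atomic_store(&job->sub_jobs, SPLIT_COUNT);
	for (i = 0; i < SPLIT_COUNT; i++)
		segment_complete(0, 0u, job);	/* submitter's thread! */
}

static void notify(int read_err, unsigned long write_err, void *context)
{
	printf("callback ran synchronously in %s (read_err=%d write_err=%lu)\n",
	       (const char *)context, read_err, write_err);
}

int main(void)
{
	struct kcopyd_job job = {
		.remaining = 3,
		.fn = notify,
		.context = "the dm_kcopyd_copy() caller",
	};

	split_job(&job);	/* the callback fires during this call */
	return 0;
}

In the real driver the subjob completions arrive from I/O completion
context, so which thread performs the final decrement is a genuine
race; the patch resolves it by always routing the final completion
through the kcopyd workqueue.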
When I reviewed kcopyd further, I found four problems in total:

1. job->fn may be called from the thread that submitted the job (see
above).

2. "job->fn(read_err, write_err, job->context);" in segment_complete
reports the error of the last subjob, not the union of all errors.

3. This comment is bogus (the jobs are not cancelable), and the code
is prone to deadlock because the callback may allocate another job
from the same mempool:

	/*
	 * To avoid a race we must keep the job around
	 * until after the notify function has completed.
	 * Otherwise the client may try and stop the job
	 * after we've completed.
	 */
	job->fn(read_err, write_err, job->context);
	mempool_free(job, job->kc->job_pool);

4. Master jobs and subjobs are allocated from the same mempool, and
completion and freeing of a master job depends on successful
allocation of all its subjobs -> possible deadlock.

This patch moves completion of master jobs to run_complete_job (which
is called from the kcopyd workqueue). It fixes problems 1, 2 and 3;
problem 4 will need more changes and another patch.

Signed-off-by: Mikulas Patocka

---
 drivers/md/dm-kcopyd.c |   18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

Index: linux-2.6.29-rc3-devel/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.29-rc3-devel.orig/drivers/md/dm-kcopyd.c	2009-02-05 23:01:50.000000000 +0100
+++ linux-2.6.29-rc3-devel/drivers/md/dm-kcopyd.c	2009-02-06 04:15:52.000000000 +0100
@@ -297,7 +297,8 @@ static int run_complete_job(struct kcopy
 	dm_kcopyd_notify_fn fn = job->fn;
 	struct dm_kcopyd_client *kc = job->kc;
 
-	kcopyd_put_pages(kc, job->pages);
+	if (job->pages)
+		kcopyd_put_pages(kc, job->pages);
 	mempool_free(job, kc->job_pool);
 	fn(read_err, write_err, context);
 
@@ -461,6 +462,7 @@ static void segment_complete(int read_er
 	sector_t progress = 0;
 	sector_t count = 0;
 	struct kcopyd_job *job = (struct kcopyd_job *) context;
+	struct dm_kcopyd_client *kc = job->kc;
 
 	mutex_lock(&job->lock);
 
@@ -490,7 +492,7 @@ static void segment_complete(int read_er
 
 	if (count) {
 		int i;
-		struct kcopyd_job *sub_job = mempool_alloc(job->kc->job_pool,
+		struct kcopyd_job *sub_job = mempool_alloc(kc->job_pool,
 							   GFP_NOIO);
 
 		*sub_job = *job;
@@ -508,14 +510,8 @@ static void segment_complete(int read_er
 
 	} else if (atomic_dec_and_test(&job->sub_jobs)) {
 
-		/*
-		 * To avoid a race we must keep the job around
-		 * until after the notify function has completed.
-		 * Otherwise the client may try and stop the job
-		 * after we've completed.
-		 */
-		job->fn(read_err, write_err, job->context);
-		mempool_free(job, job->kc->job_pool);
+		push(&kc->complete_jobs, job);
+		wake(kc);
 	}
 }
 
@@ -528,6 +524,8 @@ static void split_job(struct kcopyd_job
 {
 	int i;
 
+	atomic_inc(&job->kc->nr_jobs);
+
 	atomic_set(&job->sub_jobs, SPLIT_COUNT);
 	for (i = 0; i < SPLIT_COUNT; i++)
 		segment_complete(0, 0u, job);
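To see why routing master-job completion through the workqueue also
addresses problems 2 and 3, here is roughly what run_complete_job()
looks like with the patch applied. This is a paraphrase of 2.6.29-rc3
dm-kcopyd.c combined with the first hunk above, not a verbatim quote,
so treat the surrounding lines and comments as approximate:

/* Paraphrased from 2.6.29-rc3 drivers/md/dm-kcopyd.c with the patch
 * applied; not a verbatim copy. */
static int run_complete_job(struct kcopyd_job *job)
{
	void *context = job->context;
	int read_err = job->read_err;		/* accumulated over all subjobs */
	unsigned long write_err = job->write_err;
	dm_kcopyd_notify_fn fn = job->fn;
	struct dm_kcopyd_client *kc = job->kc;

	/* A split master job presumably never allocated pages (its
	 * subjobs did), hence the new "if (job->pages)" guard. */
	if (job->pages)
		kcopyd_put_pages(kc, job->pages);

	/* Freeing before calling fn() lets the callback allocate a new
	 * job from the same mempool without deadlocking (problem 3). */
	mempool_free(job, kc->job_pool);
	fn(read_err, write_err, context);

	/* Every job completed here drops a kc->nr_jobs reference, which
	 * is why split_job() now takes an extra one for the master job
	 * (the last hunk above). */
	if (atomic_dec_and_test(&kc->nr_jobs))
		wake_up(&kc->destroyq);

	return 0;
}

segment_complete() already accumulates subjob failures into
job->read_err and job->write_err under job->lock; by completing the
master job here, the callback now sees that union instead of the last
subjob's (read_err, write_err) values, which is the fix for problem 2.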