From patchwork Tue Mar 5 13:57:21 2013
From: Bodo Stroesser
Date: 05 Mar 2013 14:57:21 +0100
To: neilb@suse.de
Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org, bstroesser@ts.fujitsu.com
Subject: Re: [PATCH 1/2] sunrpc/cache: remove races with queuing an upcall.

On 04 Mar 2013 07:10:00 +0100 NeilBrown wrote:

> We currently queue an upcall after setting CACHE_PENDING,
> and dequeue after clearing CACHE_PENDING.
> So a request should only be present when CACHE_PENDING is set.
>
> However we don't combine the test and the enqueue/dequeue in
> a protected region, so it is possible (if unlikely) for a race
> to result in a request being queued without CACHE_PENDING set,
> or a request to be absent despite CACHE_PENDING.
>
> So: include a test for CACHE_PENDING inside the regions of
> enqueue and dequeue where queue_lock is held, and abort
> the operation if the value is not as expected.
>
> Also remove the 'return' from cache_dequeue() to ensure it always
> removes all entries: as there is no locking between setting
> CACHE_PENDING and calling sunrpc_cache_pipe_upcall, it is not
> inconceivable for some other thread to clear CACHE_PENDING and then
> for someone else to set it and call sunrpc_cache_pipe_upcall, both
> before the original thread completed the call.
>
> With this, it is perfectly safe and correct to:
>  - call cache_dequeue() if and only if we have just
>    cleared CACHE_PENDING
>  - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
>    if and only if we have just set CACHE_PENDING.
>
> Reported-by: Bodo Stroesser
> Signed-off-by: NeilBrown
> ---
>  net/sunrpc/cache.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 9afa439..0400a92 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -1022,6 +1022,9 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
>  			struct cache_request *cr = container_of(cq, struct cache_request, q);
>  			if (cr->item != ch)
>  				continue;
> +			if (test_bit(CACHE_PENDING, &ch->flags))
> +				/* Lost a race and it is pending again */
> +				break;
>  			if (cr->readers != 0)
>  				continue;
>  			list_del(&cr->q.list);
> @@ -1029,7 +1032,6 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
>  			cache_put(cr->item, detail);
>  			kfree(cr->buf);
>  			kfree(cr);
> -			return;

Just removing "return" is not enough. If the loop no longer stops after
dequeueing the first entry found, some other changes are also necessary:

1) use list_for_each_entry_safe() instead of list_for_each_entry()
2) don't call spin_unlock() in the loop

Thus, at the end of this mail I added a revised patch. With this patch
cache_dequeue() no longer frees the requests inside the loop, but moves
them to a temporary queue. After the loop it calls spin_unlock() and
does the kfree() and cache_put() in a second loop. (A standalone sketch
of this idiom follows below, just before the patch.)

The patch is not tested on a mainline kernel. Instead, I ported it to
SLES11 SP1 (2.6.32.59-0.7.1). On SLES 11 the test is still running fine.

Best Regards,
Bodo

>  		}
>  	spin_unlock(&queue_lock);
>  }
> @@ -1151,6 +1153,7 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
>  	struct cache_request *crq;
>  	char *bp;
>  	int len;
> +	int ret = 0;
>
>  	if (!cache_listeners_exist(detail)) {
>  		warn_no_listener(detail);
> @@ -1182,10 +1185,18 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
>  	crq->len = PAGE_SIZE - len;
>  	crq->readers = 0;
>  	spin_lock(&queue_lock);
> -	list_add_tail(&crq->q.list, &detail->queue);
> +	if (test_bit(CACHE_PENDING, &h->flags))
> +		list_add_tail(&crq->q.list, &detail->queue);
> +	else
> +		/* Lost a race, no longer PENDING, so don't enqueue */
> +		ret = -EAGAIN;
>  	spin_unlock(&queue_lock);
>  	wake_up(&queue_wait);
> -	return 0;
> +	if (ret == -EAGAIN) {
> +		kfree(buf);
> +		kfree(crq);
> +	}
> +	return ret;
> }
> EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
>
> .....
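To make the collect-then-free idiom easier to follow before reading the
patch itself, here is a minimal, self-contained userspace C sketch of the
same pattern. All names here (struct req, dequeue_matching, ...) are made
up for illustration; this is not kernel code, and a pthread mutex stands
in for queue_lock:

#include <pthread.h>
#include <stdlib.h>

/* Hypothetical request type; stands in for struct cache_request. */
struct req {
	struct req *prev, *next;	/* circular list with a dummy head */
	int matches;			/* stands in for the cr->item == ch test */
	char *buf;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void unlink_req(struct req *r)
{
	r->prev->next = r->next;
	r->next->prev = r->prev;
}

static void head_add_req(struct req *head, struct req *r)
{
	r->prev = head;
	r->next = head->next;
	head->next->prev = r;
	head->next = r;
}

/* Dequeue every matching request: unlink under the lock, free after it. */
static void dequeue_matching(struct req *queue)
{
	struct req dequeued = { &dequeued, &dequeued };	/* empty private list */
	struct req *r, *next;

	pthread_mutex_lock(&queue_lock);
	/* "safe" iteration: fetch the successor before r can be unlinked,
	 * which is what list_for_each_entry_safe() does in the kernel. */
	for (r = queue->next; r != queue; r = next) {
		next = r->next;
		if (!r->matches)
			continue;
		unlink_req(r);
		head_add_req(&dequeued, r);	/* the two steps of list_move() */
	}
	pthread_mutex_unlock(&queue_lock);

	/* The lock is dropped, so releasing the entries is now safe and
	 * no longer serializes against other users of the queue. */
	while (dequeued.next != &dequeued) {
		r = dequeued.next;
		unlink_req(r);
		free(r->buf);
		free(r);
	}
}

Because nothing is freed while queue_lock is held, the loop never has to
drop and retake the lock, and the iteration stays valid even though it
removes entries as it goes. Now the revised patch: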
Reported-by: Bodo Stroesser
Signed-off-by: NeilBrown
Signed-off-by: Bodo Stroesser

--- a/net/sunrpc/cache.c	2013-02-20 14:05:08.000000000 +0100
+++ b/net/sunrpc/cache.c	2013-03-05 13:30:13.000000000 +0100
@@ -1015,23 +1015,31 @@
 static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
 {
-	struct cache_queue *cq;
+	struct cache_request *cr, *tmp;
+	struct list_head dequeued;
+
+	INIT_LIST_HEAD(&dequeued);
 	spin_lock(&queue_lock);
-	list_for_each_entry(cq, &detail->queue, list)
-		if (!cq->reader) {
-			struct cache_request *cr = container_of(cq, struct cache_request, q);
+	list_for_each_entry_safe(cr, tmp, &detail->queue, q.list)
+		if (!cr->q.reader) {
 			if (cr->item != ch)
 				continue;
+			if (test_bit(CACHE_PENDING, &ch->flags))
+				/* Lost a race and it is pending again */
+				break;
 			if (cr->readers != 0)
 				continue;
-			list_del(&cr->q.list);
-			spin_unlock(&queue_lock);
-			cache_put(cr->item, detail);
-			kfree(cr->buf);
-			kfree(cr);
-			return;
+			list_move(&cr->q.list, &dequeued);
 		}
 	spin_unlock(&queue_lock);
+
+	while (!list_empty(&dequeued)) {
+		cr = list_entry(dequeued.next, struct cache_request, q.list);
+		list_del(&cr->q.list);
+		cache_put(cr->item, detail);
+		kfree(cr->buf);
+		kfree(cr);
+	}
 }

 /*
@@ -1151,6 +1159,7 @@
 	struct cache_request *crq;
 	char *bp;
 	int len;
+	int ret = 0;

 	if (!cache_listeners_exist(detail)) {
 		warn_no_listener(detail);
@@ -1182,10 +1191,18 @@
 	crq->len = PAGE_SIZE - len;
 	crq->readers = 0;
 	spin_lock(&queue_lock);
-	list_add_tail(&crq->q.list, &detail->queue);
+	if (test_bit(CACHE_PENDING, &h->flags))
+		list_add_tail(&crq->q.list, &detail->queue);
+	else
+		/* Lost a race, no longer PENDING, so don't enqueue */
+		ret = -EAGAIN;
 	spin_unlock(&queue_lock);
 	wake_up(&queue_wait);
-	return 0;
+	if (ret == -EAGAIN) {
+		kfree(buf);
+		kfree(crq);
+	}
+	return ret;
 }
 EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
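Returning to the "if and only if" summary in Neil's changelog: with both
races closed, the caller-side discipline looks roughly like the sketch
below. It is illustrative only; the real call sites in net/sunrpc/cache.c
carry more logic than shown here, and 'status' is a made-up local:

	/* Upcall side: only the thread that makes the 0 -> 1 transition
	 * of CACHE_PENDING may queue a request. */
	if (!test_and_set_bit(CACHE_PENDING, &h->flags)) {
		/* A -EAGAIN return means CACHE_PENDING was cleared again
		 * before the request could be added under queue_lock, and
		 * nothing was queued or leaked. */
		status = cache_make_upcall(detail, h);
	}

	/* Teardown side: only the thread that makes the 1 -> 0 transition
	 * may dequeue, so each request is removed exactly once. */
	if (test_and_clear_bit(CACHE_PENDING, &h->flags))
		cache_dequeue(detail, h);

Paired with the test_bit() checks the patch performs under queue_lock,
exactly one winner of each bit transition touches the queue, which is
what makes the two "if and only if" rules safe.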