From patchwork Mon Feb 15 22:32:54 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Brandenburg
X-Patchwork-Id: 8319781
In-Reply-To: <20160215184554.GY17997@ZenIV.linux.org.uk>
References: <20160212042757.GP17997@ZenIV.linux.org.uk>
 <20160213174738.GR17997@ZenIV.linux.org.uk>
 <20160214025615.GU17997@ZenIV.linux.org.uk>
 <20160214234312.GX17997@ZenIV.linux.org.uk>
 <20160215184554.GY17997@ZenIV.linux.org.uk>
Date: Mon, 15 Feb 2016 17:32:54 -0500
Subject: Re: Orangefs ABI documentation
From: Martin Brandenburg
To: Al Viro
Cc: Mike Marshall, Linus Torvalds, linux-fsdevel, Stephen Rothwell
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 2/15/16, Al Viro wrote:
> On Mon, Feb 15, 2016 at 12:46:51PM -0500, Mike Marshall wrote:
>> I pushed the list_del up to the kernel.org for-next branch...
>>
>> And I've been running tests with the CRUDE bandaid... weird
>> results...
>>
>> No oopses, no WARN_ONs...
>> I was running dbench and ls -R
>> or find and kill-minus-nining different ones of them with no
>> perceived resulting problems, so I moved on to signalling
>> the client-core to abort... it restarted numerous times,
>> and then stuff wedged up differently than I've seen before.
>
> There are other problems with that thing (starting with the fact that
> retrying readdir/wait_for_direct_io can try to grab a slot despite the
> bufmap winding down). OK, at that point I think we should try to see
> if bufmap rewrite works - I've rebased on top of your branch and pushed
> (head at 8c3bc9a). Bufmap rewrite is really completely untested -
> it's done pretty much blindly and I'd be surprised as hell if it has no
> brainos at the first try.
>

There's at least one major issue aside from a small typo.

Something that used a slot, such as a reader, would call
service_operation while holding a bufmap slot. Then the client-core
would crash, and the kernel would end up in run_down waiting for the
slots to be given up. But the slots are not given up until someone
wakes up all the processes waiting in service_operation, and that only
happens after all the slots are given up. Then the client-core hangs
until someone sends a deadly signal to all the processes waiting in
service_operation or, presumably, the timeout expires.

This splits finalize and run_down so that orangefs_devreq_release can
mark the slot map as killed, then purge waiting ops, then wait for all
the slots to be released. Meanwhile, processes which were waiting will
get into orangefs_bufmap_get, which will see that the slot map is
shutting down and wait for the client-core to come back.

This is all at https://www.github.com/martinbrandenburg/linux.git
branch slots.
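
To make the ordering problem concrete, here is a rough single-threaded
userspace model of it. This is only a sketch, not kernel code: every
toy_* name and both structs are made up for illustration and do not
exist in the orangefs tree, and it assumes that an op woken by the
purge fails and drops its slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a slot map; not the kernel's struct slot_map. */
struct toy_slot_map {
	int total;	/* number of slots in the map */
	int in_use;	/* slots currently held by operations */
	bool killed;	/* true once shutdown has started */
};

/* Hypothetical stand-in for an op sleeping in service_operation. */
struct toy_op {
	bool waiting;		/* still asleep, waiting on the client-core */
	bool holds_slot;	/* owns one slot of the map */
};

/* Take a slot; refused when the map is full or shutting down. */
bool toy_get(struct toy_slot_map *m, struct toy_op *op)
{
	if (m->killed || m->in_use >= m->total)
		return false;
	m->in_use++;
	op->holds_slot = true;
	return true;
}

/* Give a slot back, if this op holds one. */
void toy_put(struct toy_slot_map *m, struct toy_op *op)
{
	if (!op->holds_slot)
		return;
	assert(m->in_use > 0);
	op->holds_slot = false;
	m->in_use--;
}

void toy_mark_killed(struct toy_slot_map *m)
{
	m->killed = true;
}

/*
 * Wake every sleeping op. Assumption: a woken op gives up and releases
 * its slot. Returns the number of ops woken.
 */
int toy_purge(struct toy_slot_map *m, struct toy_op *ops, size_t n)
{
	int woken = 0;

	for (size_t i = 0; i < n; i++) {
		if (!ops[i].waiting)
			continue;
		ops[i].waiting = false;
		toy_put(m, &ops[i]);
		woken++;
	}
	return woken;
}

/* Running down can only finish once the map is killed and no slot is held. */
bool toy_run_down_can_finish(const struct toy_slot_map *m)
{
	return m->killed && m->in_use == 0;
}

/*
 * Model the release path with two sleeping ops that each hold a slot.
 * Returns true if the run-down step could complete under the chosen order.
 */
bool toy_release(bool purge_before_run_down)
{
	struct toy_slot_map map = { .total = 2, .in_use = 0, .killed = false };
	struct toy_op ops[2] = { { true, false }, { true, false } };

	toy_get(&map, &ops[0]);
	toy_get(&map, &ops[1]);

	toy_mark_killed(&map);
	if (purge_before_run_down)
		toy_purge(&map, ops, 2);
	/* In the other order the purge would come after this wait. */
	return toy_run_down_can_finish(&map);
}
```

In this model toy_release(false) stands for waiting on the slots before
anyone has woken the sleepers, so the wait can never finish, while
toy_release(true) stands for the mark killed, purge, then run down order
described above.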

-- Martin

---
diff --git a/fs/orangefs/devorangefs-req.c b/fs/orangefs/devorangefs-req.c
index d96bcf10..b27ed1c 100644
--- a/fs/orangefs/devorangefs-req.c
+++ b/fs/orangefs/devorangefs-req.c
@@ -513,6 +513,9 @@ static int orangefs_devreq_release(struct inode *inode, struct file *file)
 	 * them as purged and wake them up
 	 */
 	purge_inprogress_ops();
+
+	orangefs_bufmap_run_down();
+
 	gossip_debug(GOSSIP_DEV_DEBUG,
 		     "pvfs2-client-core: device close complete\n");
 	open_access_count = 0;
diff --git a/fs/orangefs/orangefs-bufmap.c b/fs/orangefs/orangefs-bufmap.c
index 3c6e07c..c544710 100644
--- a/fs/orangefs/orangefs-bufmap.c
+++ b/fs/orangefs/orangefs-bufmap.c
@@ -20,7 +20,7 @@ static struct slot_map rw_map = {
 };
 static struct slot_map readdir_map = {
 	.c = -1,
-	.q = __WAIT_QUEUE_HEAD_INITIALIZER(rw_map.q)
+	.q = __WAIT_QUEUE_HEAD_INITIALIZER(readdir_map.q)
 };
 
 
@@ -430,6 +430,15 @@ void orangefs_bufmap_finalize(void)
 	gossip_debug(GOSSIP_BUFMAP_DEBUG, "orangefs_bufmap_finalize: called\n");
 	mark_killed(&rw_map);
 	mark_killed(&readdir_map);
+	gossip_debug(GOSSIP_BUFMAP_DEBUG,
+		     "orangefs_bufmap_finalize: exiting normally\n");
+}
+
+void orangefs_bufmap_run_down(void)
+{
+	struct orangefs_bufmap *bufmap = __orangefs_bufmap;
+	if (!bufmap)
+		return;
 	run_down(&rw_map);
 	run_down(&readdir_map);
 	spin_lock(&orangefs_bufmap_lock);
@@ -437,8 +446,6 @@ void orangefs_bufmap_finalize(void)
 	spin_unlock(&orangefs_bufmap_lock);
 	orangefs_bufmap_unmap(bufmap);
 	orangefs_bufmap_free(bufmap);
-	gossip_debug(GOSSIP_BUFMAP_DEBUG,
-		     "orangefs_bufmap_finalize: exiting normally\n");
 }
 
 /*
diff --git a/fs/orangefs/orangefs-bufmap.h b/fs/orangefs/orangefs-bufmap.h
index ad8d82a..0be62be 100644
--- a/fs/orangefs/orangefs-bufmap.h
+++ b/fs/orangefs/orangefs-bufmap.h
@@ -17,6 +17,8 @@ int orangefs_bufmap_initialize(struct ORANGEFS_dev_map_desc *user_desc);
 
 void orangefs_bufmap_finalize(void);
 
+void orangefs_bufmap_run_down(void);
+
 int orangefs_bufmap_get(struct orangefs_bufmap **mapp, int *buffer_index);
 
 void orangefs_bufmap_put(int buffer_index);