From patchwork Wed Sep 17 09:58:05 2014
X-Patchwork-Submitter: Xue jiufei
X-Patchwork-Id: 4923611
Message-ID: <54195B2D.9040405@huawei.com>
Date: Wed, 17 Sep 2014 17:58:05 +0800
From: Xue jiufei
To: Andrew Morton, "ocfs2-devel@oss.oracle.com", Junxiao Bi
Subject: [Ocfs2-devel] [PATCH] ocfs2: fix a deadlock while o2net_wq doing direct memory reclaim
Reply-To: xuejiufei@huawei.com

This patch fixes a deadlock caused by direct memory reclaim in o2net_wq.
The situation is as follows:

1) On receiving a connect message from another node, the node queues
   the work_struct o2net_listen_work.

2) o2net_wq processes this work and calls the following functions:

   o2net_wq
   -> o2net_accept_one
   -> sock_create_lite
   -> sock_alloc()
   -> kmem_cache_alloc with GFP_KERNEL
   -> ____cache_alloc_node
   -> __alloc_pages_nodemask
   -> do_try_to_free_pages
   -> shrink_slab
   -> evict
   -> ocfs2_evict_inode
   -> ocfs2_drop_lock
   -> dlmunlock
   -> o2net_send_message_vec

   o2net_wq then waits for the unlock reply from the master.

3) The TCP layer receives the reply, calls o2net_data_ready() and queues
   sc_rx_work, which waits for o2net_wq to process it.

4) o2net_wq is a single-threaded workqueue that processes work items one
   at a time. It is still running o2net_listen_work and cannot handle
   sc_rx_work, so we deadlock.

Junxiao Bi's patch
(http://ozlabs.org/~akpm/mmots/broken-out/mm-clear-__gfp_fs-when-pf_memalloc_noio-is-set.patch)
clears __GFP_FS in memalloc_noio_flags() in addition to __GFP_IO. We use
memalloc_noio_save() to set the process flag PF_MEMALLOC_NOIO so that all
allocations done by this process are done as if GFP_NOIO was specified.
As a result, we do not reenter the filesystem during memory reclaim.

Signed-off-by: joyce.xue
---
 fs/ocfs2/cluster/tcp.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
index ea34952..e607937 100644
--- a/fs/ocfs2/cluster/tcp.c
+++ b/fs/ocfs2/cluster/tcp.c
@@ -1601,7 +1601,15 @@ static void o2net_start_connect(struct work_struct *work)
 	struct sockaddr_in myaddr = {0, }, remoteaddr = {0, };
 	int ret = 0, stop;
 	unsigned int timeout;
+	unsigned int noio_flag;
 
+	/*
+	 * sock_create allocates the sock with GFP_KERNEL. We must set
+	 * per-process flag PF_MEMALLOC_NOIO so that all allocations done
+	 * by this process are done as if GFP_NOIO was specified. So we
+	 * are not reentering filesystem while doing memory reclaim.
+	 */
+	noio_flag = memalloc_noio_save();
 	/* if we're greater we initiate tx, otherwise we accept */
 	if (o2nm_this_node() <= o2net_num_from_nn(nn))
 		goto out;
@@ -1710,6 +1718,7 @@ out:
 	if (mynode)
 		o2nm_node_put(mynode);
 
+	memalloc_noio_restore(noio_flag);
 	return;
 }
 
@@ -1835,6 +1844,15 @@ static int o2net_accept_one(struct socket *sock, int *more)
 	struct o2nm_node *local_node = NULL;
 	struct o2net_sock_container *sc = NULL;
 	struct o2net_node *nn;
+	unsigned int noio_flag;
+
+	/*
+	 * sock_create_lite allocates the sock with GFP_KERNEL. We must set
+	 * per-process flag PF_MEMALLOC_NOIO so that all allocations done
+	 * by this process are done as if GFP_NOIO was specified. So we
+	 * are not reentering filesystem while doing memory reclaim.
+	 */
+	noio_flag = memalloc_noio_save();
 
 	BUG_ON(sock == NULL);
 	*more = 0;
@@ -1951,6 +1969,8 @@ out:
 		o2nm_node_put(local_node);
 	if (sc)
 		sc_put(sc);
+
+	memalloc_noio_restore(noio_flag);
 	return ret;
 }
 
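
For readers unfamiliar with the scoped NOIO interface, the save/restore
pairing used in both functions above can be modelled in user space. This
is only a rough sketch, not kernel code: every model_* and MODEL_* name
is hypothetical, the bit values are made up, and the masking of the FS
bit assumes Junxiao Bi's patch referenced in the changelog is applied.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel's real constants differ. */
#define MODEL_GFP_IO            0x1u
#define MODEL_GFP_FS            0x2u
#define MODEL_GFP_WAIT          0x4u
#define MODEL_GFP_KERNEL        (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_MEMALLOC_NOIO  0x1u

/* Hypothetical stand-in for the worker thread's current->flags. */
static unsigned int model_task_flags;

/* Set the NOIO bit and return its previous state (models memalloc_noio_save()). */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

	model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

/* Put the NOIO bit back to the saved state (models memalloc_noio_restore()). */
static void model_noio_restore(unsigned int old)
{
	/* Only the NOIO bit may be carried in the saved value. */
	assert((old & ~MODEL_PF_MEMALLOC_NOIO) == 0);
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* Mask applied to each allocation: inside a NOIO scope, drop IO and FS. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Reclaim may call back into the filesystem only if the FS bit survives. */
static int model_reclaim_may_enter_fs(unsigned int gfp)
{
	return (model_effective_gfp(gfp) & MODEL_GFP_FS) != 0;
}
```

In this model, model_reclaim_may_enter_fs(MODEL_GFP_KERNEL) is nonzero
outside any scope and zero between model_noio_save() and the matching
model_noio_restore(). Because the previous state is saved and restored
rather than unconditionally cleared, a nested scope does not end the
outer one early. This is why the patch threads noio_flag from the top of
each function to its out: label.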