From patchwork Wed Apr 28 16:50:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vivek Goyal X-Patchwork-Id: 12229737 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D7EDBC433ED for ; Wed, 28 Apr 2021 16:51:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A983D61417 for ; Wed, 28 Apr 2021 16:51:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241292AbhD1Qvs (ORCPT ); Wed, 28 Apr 2021 12:51:48 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36769 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240477AbhD1Qvr (ORCPT ); Wed, 28 Apr 2021 12:51:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619628661; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TVg4TK3ftOVnWUIDPLvRsoA0GP0bFUZQWBEdvpHV2VI=; b=A4TMFfGG1Wkfxo+JsG9qd+aQAUeguzv1n49qW2uvNVpcszv1Br5gJNvRFFOQN1jmXXm1jP WInV585U0kF/X4UqeHqqYlnWzsrlmBw3Wc2YgNuYwSg5QevarhIUAtTLcgQlv3Zg33F5nY 6MTf/HiiMf90fM9aTvoGf28C0yJch4U= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-501-Ioxs4MaVNfOj94iVShIaHQ-1; Wed, 28 Apr 2021 12:50:57 -0400 X-MC-Unique: Ioxs4MaVNfOj94iVShIaHQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 611DF1006CAA; Wed, 28 Apr 2021 16:50:50 +0000 (UTC) Received: from horse.redhat.com (ovpn-116-128.rdu2.redhat.com [10.10.116.128]) by smtp.corp.redhat.com (Postfix) with ESMTP id C09C710027A5; Wed, 28 Apr 2021 16:50:49 +0000 (UTC) Received: by horse.redhat.com (Postfix, from userid 10451) id 558DF223D99; Wed, 28 Apr 2021 12:50:49 -0400 (EDT) From: Vivek Goyal To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, virtio-fs@redhat.com, dan.j.williams@intel.com Cc: vgoyal@redhat.com, miklos@szeredi.hu, jack@suse.cz, willy@infradead.org, slp@redhat.com, Greg Kurz Subject: [PATCH v5 1/3] dax: Add an enum for specifying dax wakup mode Date: Wed, 28 Apr 2021 12:50:38 -0400 Message-Id: <20210428165040.1856202-2-vgoyal@redhat.com> In-Reply-To: <20210428165040.1856202-1-vgoyal@redhat.com> References: <20210428165040.1856202-1-vgoyal@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Dan mentioned that he is not very fond of passing around a boolean true/false to specify if only next waiter should be woken up or all waiters should be woken up. He instead prefers that we introduce an enum and make it very explicity at the callsite itself. Easier to read code. This patch should not introduce any change of behavior. Reviewed-by: Greg Kurz Reviewed-by: Jan Kara Suggested-by: Dan Williams Signed-off-by: Vivek Goyal --- fs/dax.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index b3d27fdc6775..c8cd2ae4440b 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -144,6 +144,16 @@ struct wait_exceptional_entry_queue { struct exceptional_entry_key key; }; +/** + * enum dax_wake_mode: waitqueue wakeup behaviour + * @WAKE_NEXT: wake only the first waiter in the waitqueue + * @WAKE_ALL: wake all waiters in the waitqueue + */ +enum dax_wake_mode { + WAKE_ALL, + WAKE_NEXT, +}; + static wait_queue_head_t *dax_entry_waitqueue(struct xa_state *xas, void *entry, struct exceptional_entry_key *key) { @@ -182,7 +192,8 @@ static int wake_exceptional_entry_func(wait_queue_entry_t *wait, * The important information it's conveying is whether the entry at * this index used to be a PMD entry. */ -static void dax_wake_entry(struct xa_state *xas, void *entry, bool wake_all) +static void dax_wake_entry(struct xa_state *xas, void *entry, + enum dax_wake_mode mode) { struct exceptional_entry_key key; wait_queue_head_t *wq; @@ -196,7 +207,7 @@ static void dax_wake_entry(struct xa_state *xas, void *entry, bool wake_all) * must be in the waitqueue and the following check will see them. */ if (waitqueue_active(wq)) - __wake_up(wq, TASK_NORMAL, wake_all ? 0 : 1, &key); + __wake_up(wq, TASK_NORMAL, mode == WAKE_ALL ? 0 : 1, &key); } /* @@ -268,7 +279,7 @@ static void put_unlocked_entry(struct xa_state *xas, void *entry) { /* If we were the only waiter woken, wake the next one */ if (entry && !dax_is_conflict(entry)) - dax_wake_entry(xas, entry, false); + dax_wake_entry(xas, entry, WAKE_NEXT); } /* @@ -286,7 +297,7 @@ static void dax_unlock_entry(struct xa_state *xas, void *entry) old = xas_store(xas, entry); xas_unlock_irq(xas); BUG_ON(!dax_is_locked(old)); - dax_wake_entry(xas, entry, false); + dax_wake_entry(xas, entry, WAKE_NEXT); } /* @@ -524,7 +535,7 @@ static void *grab_mapping_entry(struct xa_state *xas, dax_disassociate_entry(entry, mapping, false); xas_store(xas, NULL); /* undo the PMD join */ - dax_wake_entry(xas, entry, true); + dax_wake_entry(xas, entry, WAKE_ALL); mapping->nrexceptional--; entry = NULL; xas_set(xas, index); @@ -937,7 +948,7 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev, xas_lock_irq(xas); xas_store(xas, entry); xas_clear_mark(xas, PAGECACHE_TAG_DIRTY); - dax_wake_entry(xas, entry, false); + dax_wake_entry(xas, entry, WAKE_NEXT); trace_dax_writeback_one(mapping->host, index, count); return ret; From patchwork Wed Apr 28 16:50:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vivek Goyal X-Patchwork-Id: 12229741 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44792C433B4 for ; Wed, 28 Apr 2021 16:51:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 115DF61448 for ; Wed, 28 Apr 2021 16:51:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241324AbhD1Qvw (ORCPT ); Wed, 28 Apr 2021 12:51:52 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:23715 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241307AbhD1Qvt (ORCPT ); Wed, 28 Apr 2021 12:51:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619628663; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lhUk8o/dNVG2b59dZ/ynjwyLgOva03xJ5PDCqgW+204=; b=jAuucBVuUv+xQS+f/k+72f0+k+6gpKA9CyjhF4F4H2VIXDYhAjVPYXkMOMkgGsH7G8FRJU ln51fEzot6FmcurbbAfgAYrg76id5jNXtPT4RegvhNa44p6iYjR0ki9xyt/41rI1xW5exD 6Tuva6OQhLHge9Do8huhDlseWNW0xPo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-420-whEXGlG1O2qSYfAGUt9zSQ-1; Wed, 28 Apr 2021 12:50:59 -0400 X-MC-Unique: whEXGlG1O2qSYfAGUt9zSQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D3CF31085937; Wed, 28 Apr 2021 16:50:50 +0000 (UTC) Received: from horse.redhat.com (ovpn-116-128.rdu2.redhat.com [10.10.116.128]) by smtp.corp.redhat.com (Postfix) with ESMTP id CD23419C45; Wed, 28 Apr 2021 16:50:49 +0000 (UTC) Received: by horse.redhat.com (Postfix, from userid 10451) id 59854225FCE; Wed, 28 Apr 2021 12:50:49 -0400 (EDT) From: Vivek Goyal To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, virtio-fs@redhat.com, dan.j.williams@intel.com Cc: vgoyal@redhat.com, miklos@szeredi.hu, jack@suse.cz, willy@infradead.org, slp@redhat.com, Greg Kurz Subject: [PATCH v5 2/3] dax: Add a wakeup mode parameter to put_unlocked_entry() Date: Wed, 28 Apr 2021 12:50:39 -0400 Message-Id: <20210428165040.1856202-3-vgoyal@redhat.com> In-Reply-To: <20210428165040.1856202-1-vgoyal@redhat.com> References: <20210428165040.1856202-1-vgoyal@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org As of now put_unlocked_entry() always wakes up next waiter. In next patches we want to wake up all waiters at one callsite. Hence, add a parameter to the function. This patch does not introduce any change of behavior. Reviewed-by: Greg Kurz Reviewed-by: Jan Kara Suggested-by: Dan Williams Signed-off-by: Vivek Goyal --- fs/dax.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index c8cd2ae4440b..e84dd240c35c 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -275,11 +275,11 @@ static void wait_entry_unlocked(struct xa_state *xas, void *entry) finish_wait(wq, &ewait.wait); } -static void put_unlocked_entry(struct xa_state *xas, void *entry) +static void put_unlocked_entry(struct xa_state *xas, void *entry, + enum dax_wake_mode mode) { - /* If we were the only waiter woken, wake the next one */ if (entry && !dax_is_conflict(entry)) - dax_wake_entry(xas, entry, WAKE_NEXT); + dax_wake_entry(xas, entry, mode); } /* @@ -633,7 +633,7 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping, entry = get_unlocked_entry(&xas, 0); if (entry) page = dax_busy_page(entry); - put_unlocked_entry(&xas, entry); + put_unlocked_entry(&xas, entry, WAKE_NEXT); if (page) break; if (++scanned % XA_CHECK_SCHED) @@ -675,7 +675,7 @@ static int __dax_invalidate_entry(struct address_space *mapping, mapping->nrexceptional--; ret = 1; out: - put_unlocked_entry(&xas, entry); + put_unlocked_entry(&xas, entry, WAKE_NEXT); xas_unlock_irq(&xas); return ret; } @@ -954,7 +954,7 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev, return ret; put_unlocked: - put_unlocked_entry(xas, entry); + put_unlocked_entry(xas, entry, WAKE_NEXT); return ret; } @@ -1695,7 +1695,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) /* Did we race with someone splitting entry or so? */ if (!entry || dax_is_conflict(entry) || (order == 0 && !dax_is_pte_entry(entry))) { - put_unlocked_entry(&xas, entry); + put_unlocked_entry(&xas, entry, WAKE_NEXT); xas_unlock_irq(&xas); trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf, VM_FAULT_NOPAGE); From patchwork Wed Apr 28 16:50:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vivek Goyal X-Patchwork-Id: 12229739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEDA7C43460 for ; Wed, 28 Apr 2021 16:51:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8071C6141F for ; Wed, 28 Apr 2021 16:51:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241305AbhD1Qvt (ORCPT ); Wed, 28 Apr 2021 12:51:49 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:46502 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241291AbhD1Qvs (ORCPT ); Wed, 28 Apr 2021 12:51:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619628662; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BHdHg0RFU1qOPaKtIS99H89bvmSSqduaW6j8IHmwfQ8=; b=OIE1G3e+qCQ0TSL3ObQCCxSZ23IN1uGtucB6RxSuxLY8QX+E8dipAG1rbEV8IqqsfAMEGt F/hagiEx0Kbg15BEMzfPJX1QftmyTeXpEhZQaOu5DWZTRr4RwK+8P5RM8gIz4MazNM2R+o w6auO/aeufrNHSfSblM2VkSwSom+GtY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-516-r3ZHLjFLMBmFPbWHRK9lQQ-1; Wed, 28 Apr 2021 12:51:00 -0400 X-MC-Unique: r3ZHLjFLMBmFPbWHRK9lQQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5EDC1192CC5D; Wed, 28 Apr 2021 16:50:50 +0000 (UTC) Received: from horse.redhat.com (ovpn-116-128.rdu2.redhat.com [10.10.116.128]) by smtp.corp.redhat.com (Postfix) with ESMTP id D142C60E3A; Wed, 28 Apr 2021 16:50:49 +0000 (UTC) Received: by horse.redhat.com (Postfix, from userid 10451) id 5F75F225FCF; Wed, 28 Apr 2021 12:50:49 -0400 (EDT) From: Vivek Goyal To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, virtio-fs@redhat.com, dan.j.williams@intel.com Cc: vgoyal@redhat.com, miklos@szeredi.hu, jack@suse.cz, willy@infradead.org, slp@redhat.com Subject: [PATCH v5 3/3] dax: Wake up all waiters after invalidating dax entry Date: Wed, 28 Apr 2021 12:50:40 -0400 Message-Id: <20210428165040.1856202-4-vgoyal@redhat.com> In-Reply-To: <20210428165040.1856202-1-vgoyal@redhat.com> References: <20210428165040.1856202-1-vgoyal@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org I am seeing missed wakeups which ultimately lead to a deadlock when I am using virtiofs with DAX enabled and running "make -j". I had to mount virtiofs as rootfs and also reduce to dax window size to 256M to reproduce the problem consistently. So here is the problem. put_unlocked_entry() wakes up waiters only if entry is not null as well as !dax_is_conflict(entry). But if I call multiple instances of invalidate_inode_pages2() in parallel, then I can run into a situation where there are waiters on this index but nobody will wake these waiters. invalidate_inode_pages2() invalidate_inode_pages2_range() invalidate_exceptional_entry2() dax_invalidate_mapping_entry_sync() __dax_invalidate_entry() { xas_lock_irq(&xas); entry = get_unlocked_entry(&xas, 0); ... ... dax_disassociate_entry(entry, mapping, trunc); xas_store(&xas, NULL); ... ... put_unlocked_entry(&xas, entry); xas_unlock_irq(&xas); } Say a fault in in progress and it has locked entry at offset say "0x1c". Now say three instances of invalidate_inode_pages2() are in progress (A, B, C) and they all try to invalidate entry at offset "0x1c". Given dax entry is locked, all tree instances A, B, C will wait in wait queue. When dax fault finishes, say A is woken up. It will store NULL entry at index "0x1c" and wake up B. When B comes along it will find "entry=0" at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And this means put_unlocked_entry() will not wake up next waiter, given the current code. And that means C continues to wait and is not woken up. This patch fixes the issue by waking up all waiters when a dax entry has been invalidated. This seems to fix the deadlock I am facing and I can make forward progress. Reported-by: Sergio Lopez Fixes: ac401cc78242 ("dax: New fault locking") Reviewed-by: Jan Kara Suggested-by: Dan Williams Signed-off-by: Vivek Goyal --- fs/dax.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dax.c b/fs/dax.c index e84dd240c35c..42d7406b2cea 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -675,7 +675,7 @@ static int __dax_invalidate_entry(struct address_space *mapping, mapping->nrexceptional--; ret = 1; out: - put_unlocked_entry(&xas, entry, WAKE_NEXT); + put_unlocked_entry(&xas, entry, WAKE_ALL); xas_unlock_irq(&xas); return ret; }