From patchwork Fri Jan 10 19:03:04 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 11328287
From: Joao Martins
To: linux-nvdimm@lists.01.org
Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Alex Williamson,
 Cornelia Huck, kvm@vger.kernel.org, Andrew Morton, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 "H. Peter Anvin", x86@kernel.org, Liran Alon, Nikita Leshenko, Barret Rhoden,
 Boris Ostrovsky, Matthew Wilcox, Konrad Rzeszutek Wilk
Subject: [PATCH RFC 01/10] mm: Add pmd support for _PAGE_SPECIAL
Date: Fri, 10 Jan 2020 19:03:04 +0000
Message-Id: <20200110190313.17144-2-joao.m.martins@oracle.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Currently vmf_insert_pfn_pmd() only works with devmap and hits a BUG_ON()
otherwise. Add support for handling page special when the pfn_t has it
marked with PFN_SPECIAL. Users of page special aren't expected to do GUP,
hence return no pages in gup_huge_pmd(), much like what is done for ptes
in gup_pte_range().

This allows a DAX driver to handle 2M hugepages without struct pages.
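As a userspace illustration only (not part of this patch), the patched
pmd_trans_huge() test can be modelled with plain bit masks. The TOY_* bit
positions and the toy_* helper names are hypothetical stand-ins, not the
real x86 definitions. The sketch only aims to show that an entry with
either the devmap or the special bit set no longer classifies as a
transparent huge page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define TOY_PAGE_PSE     (UINT64_C(1) << 7)
#define TOY_PAGE_SPECIAL (UINT64_C(1) << 9)
#define TOY_PAGE_DEVMAP  (UINT64_C(1) << 58)

/* Models the patched predicate: huge, and neither devmap nor special. */
static bool toy_pmd_trans_huge(uint64_t pmd_val)
{
	uint64_t mask = TOY_PAGE_PSE | TOY_PAGE_DEVMAP | TOY_PAGE_SPECIAL;

	return (pmd_val & mask) == TOY_PAGE_PSE;
}

/* Models pmd_special(): true whenever the special bit is set. */
static bool toy_pmd_special(uint64_t pmd_val)
{
	return !!(pmd_val & TOY_PAGE_SPECIAL);
}
```

With these helpers, toy_pmd_trans_huge(TOY_PAGE_PSE | TOY_PAGE_SPECIAL) is
false while toy_pmd_special() of the same value is true, which is why the
lock and zap paths need the extra pmd_special() check.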
Signed-off-by: Joao Martins
---
 arch/x86/include/asm/pgtable.h | 16 +++++++++++++++-
 mm/gup.c                       |  3 +++
 mm/huge_memory.c               |  7 ++++---
 mm/memory.c                    |  3 ++-
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index ad97dc155195..60351c0c15fe 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -255,7 +255,7 @@ static inline int pmd_large(pmd_t pte)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline int pmd_trans_huge(pmd_t pmd)
 {
-	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
+	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP|_PAGE_SPECIAL)) == _PAGE_PSE;
 }
 
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
@@ -293,6 +293,15 @@ static inline int pgd_devmap(pgd_t pgd)
 {
 	return 0;
 }
+#endif
+
+#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
+static inline int pmd_special(pmd_t pmd)
+{
+	return !!(pmd_flags(pmd) & _PAGE_SPECIAL);
+}
+#endif
+
 #endif
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
@@ -414,6 +423,11 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
 	return pmd_set_flags(pmd, _PAGE_DEVMAP);
 }
 
+static inline pmd_t pmd_mkspecial(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_SPECIAL);
+}
+
 static inline pmd_t pmd_mkhuge(pmd_t pmd)
 {
 	return pmd_set_flags(pmd, _PAGE_PSE);
diff --git a/mm/gup.c b/mm/gup.c
index 7646bf993b25..ba5f10535392 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2079,6 +2079,9 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
 	}
 
+	if (pmd_special(orig))
+		return 0;
+
 	refs = 0;
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
 	do {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 41a0fbddc96b..06ad4d6f7477 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -791,6 +791,8 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
 	if (pfn_t_devmap(pfn))
 		entry = pmd_mkdevmap(entry);
+	else if (pfn_t_special(pfn))
+		entry = pmd_mkspecial(entry);
 	if (write) {
 		entry = pmd_mkyoung(pmd_mkdirty(entry));
 		entry = maybe_pmd_mkwrite(entry, vma);
@@ -823,8 +825,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
 	 * but we need to be consistent with PTEs and architectures that
 	 * can't support a 'special' bit.
 	 */
-	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
-			!pfn_t_devmap(pfn));
+	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
 	BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
 						(VM_PFNMAP|VM_MIXEDMAP));
 	BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
@@ -2013,7 +2014,7 @@ spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma)
 	spinlock_t *ptl;
 	ptl = pmd_lock(vma->vm_mm, pmd);
 	if (likely(is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) ||
-			pmd_devmap(*pmd)))
+			pmd_devmap(*pmd) || pmd_special(*pmd)))
 		return ptl;
 	spin_unlock(ptl);
 	return NULL;
diff --git a/mm/memory.c b/mm/memory.c
index 45442d9a4f52..cfc3668bddeb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1165,7 +1165,8 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
 	pmd = pmd_offset(pud, addr);
 	do {
 		next = pmd_addr_end(addr, end);
-		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
+		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) ||
+		    pmd_devmap(*pmd) || pmd_special(*pmd)) {
 			if (next - addr != HPAGE_PMD_SIZE)
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
 			else if (zap_huge_pmd(tlb, vma, pmd, addr))

From patchwork Fri Jan 10 19:03:05 2020
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 11328301
From: Joao Martins
Subject: [PATCH RFC 02/10] mm: Handle pmd entries in follow_pfn()
Date: Fri, 10 Jan 2020 19:03:05 +0000
Message-Id: <20200110190313.17144-3-joao.m.martins@oracle.com>
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>

When follow_pfn() hits a pmd_huge() entry, it won't return a valid PFN
given its use of follow_pte().
Fix that up by passing a @pmdpp, thus allowing callers to get the pmd
pointer. If we encounter such a huge page, we calculate the pfn offset
into the PMD accordingly.

This allows KVM to handle 2M hugepage pfns on VM_PFNMAP vmas.

Signed-off-by: Joao Martins
---
 mm/memory.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index cfc3668bddeb..db99684d2cb3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4366,6 +4366,7 @@ EXPORT_SYMBOL(follow_pte_pmd);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn)
 {
+	pmd_t *pmdpp = NULL;
 	int ret = -EINVAL;
 	spinlock_t *ptl;
 	pte_t *ptep;
@@ -4373,10 +4374,14 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		return ret;
 
-	ret = follow_pte(vma->vm_mm, address, &ptep, &ptl);
+	ret = follow_pte_pmd(vma->vm_mm, address, NULL,
+			     &ptep, &pmdpp, &ptl);
 	if (ret)
 		return ret;
-	*pfn = pte_pfn(*ptep);
+	if (pmdpp)
+		*pfn = pmd_pfn(*pmdpp) + ((address & ~PMD_MASK) >> PAGE_SHIFT);
+	else
+		*pfn = pte_pfn(*ptep);
 	pte_unmap_unlock(ptep, ptl);
 	return 0;
 }

From patchwork Fri Jan 10 19:03:06 2020
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 11328291
From: Joao Martins
Subject: [PATCH RFC 03/10] mm: Add pud support for _PAGE_SPECIAL
Date: Fri, 10 Jan 2020 19:03:06 +0000
Message-Id: <20200110190313.17144-4-joao.m.martins@oracle.com>
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>

Currently vmf_insert_pfn_pud() only works with devmap and hits a BUG_ON()
otherwise. Add support for handling page special when the pfn_t has it
marked with PFN_SPECIAL. Users of this type of page aren't expected to do
GUP, hence return no pages in gup_huge_pud(), much like what is done for
ptes in gup_pte_range() and for pmds in gup_huge_pmd().

This allows device-dax to handle 1G hugepages without struct pages.
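For illustration only (not part of this patch), the order of checks that
gup_huge_pud() ends up with can be modelled in userspace. All toy_* and
TOY_* names are hypothetical, and the real function performs further
checks that this sketch leaves out. It only aims to show that a special
entry bails out before any struct page is looked up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch. */
enum toy_gup_result {
	TOY_GUP_DEVICE,   /* handed to the devmap helper */
	TOY_GUP_NO_PAGES, /* fast path gives up and reports 0 pages */
	TOY_GUP_PAGES,    /* struct pages exist and can be referenced */
};

/* Only the two flags that matter for the decision are modelled. */
struct toy_pud {
	bool devmap;
	bool special;
};

/* Same check order as the patched function: devmap, then special. */
static enum toy_gup_result toy_gup_huge_pud(struct toy_pud pud)
{
	if (pud.devmap)
		return TOY_GUP_DEVICE;
	if (pud.special)
		return TOY_GUP_NO_PAGES; /* no struct page to take a reference on */
	return TOY_GUP_PAGES;
}
```

A special entry therefore behaves like a special pte in gup_pte_range():
the fast path returns nothing rather than dereferencing a struct page
that does not exist.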
Signed-off-by: Joao Martins
---
 arch/x86/include/asm/pgtable.h | 18 +++++++++++++++++-
 mm/gup.c                       |  3 +++
 mm/huge_memory.c               |  8 +++++---
 mm/memory.c                    |  3 ++-
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 60351c0c15fe..2027c063fa16 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -261,7 +261,7 @@ static inline int pmd_trans_huge(pmd_t pmd)
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 static inline int pud_trans_huge(pud_t pud)
 {
-	return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
+	return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP|_PAGE_SPECIAL)) == _PAGE_PSE;
 }
 #endif
 
@@ -300,6 +300,17 @@ static inline int pmd_special(pmd_t pmd)
 {
 	return !!(pmd_flags(pmd) & _PAGE_SPECIAL);
 }
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static inline int pud_special(pud_t pud)
+{
+	return !!(pud_flags(pud) & _PAGE_SPECIAL);
+}
+#else
+static inline int pud_special(pud_t pud)
+{
+	return 0;
+}
 #endif
 
 #endif
@@ -487,6 +498,11 @@ static inline pud_t pud_mkhuge(pud_t pud)
 	return pud_set_flags(pud, _PAGE_PSE);
 }
 
+static inline pud_t pud_mkspecial(pud_t pud)
+{
+	return pud_set_flags(pud, _PAGE_SPECIAL);
+}
+
 static inline pud_t pud_mkyoung(pud_t pud)
 {
 	return pud_set_flags(pud, _PAGE_ACCESSED);
diff --git a/mm/gup.c b/mm/gup.c
index ba5f10535392..ae4abe5878ad 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2123,6 +2123,9 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
 	}
 
+	if (pud_special(orig))
+		return 0;
+
 	refs = 0;
 	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
 	do {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 06ad4d6f7477..cff707163bc1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -879,6 +879,8 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
 	if (pfn_t_devmap(pfn))
 		entry = pud_mkdevmap(entry);
+	else if (pfn_t_special(pfn))
+		entry = pud_mkspecial(entry);
 	if (write) {
 		entry = pud_mkyoung(pud_mkdirty(entry));
 		entry = maybe_pud_mkwrite(entry, vma);
@@ -901,8 +903,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
 	 * but we need to be consistent with PTEs and architectures that
 	 * can't support a 'special' bit.
 	 */
-	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
-			!pfn_t_devmap(pfn));
+	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
 	BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
 						(VM_PFNMAP|VM_MIXEDMAP));
 	BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
@@ -2031,7 +2032,8 @@ spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma)
 	spinlock_t *ptl;
 
 	ptl = pud_lock(vma->vm_mm, pud);
-	if (likely(pud_trans_huge(*pud) || pud_devmap(*pud)))
+	if (likely(pud_trans_huge(*pud) || pud_devmap(*pud) ||
+		   pud_special(*pud)))
 		return ptl;
 	spin_unlock(ptl);
 	return NULL;
diff --git a/mm/memory.c b/mm/memory.c
index db99684d2cb3..109643219e1b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1201,7 +1201,8 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb,
 	pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
-		if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
+		if (pud_trans_huge(*pud) || pud_devmap(*pud) ||
+		    pud_special(*pud)) {
 			if (next - addr != HPAGE_PUD_SIZE) {
 				VM_BUG_ON_VMA(!rwsem_is_locked(&tlb->mm->mmap_sem), vma);
 				split_huge_pud(vma, pud, addr);

From patchwork Fri Jan 10 19:03:07 2020
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 11328305
From: Joao Martins
Subject: [PATCH RFC 04/10] mm: Handle pud entries in follow_pfn()
Date: Fri, 10 Jan 2020 19:03:07 +0000
Message-Id: <20200110190313.17144-5-joao.m.martins@oracle.com>
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>

When follow_pfn() hits a pud_huge() entry, it won't return a valid PFN
for the PUD. Fix it by adding @pudpp and thus allow callers to get the
pud pointer.
If we encounter such a huge page, we calculate the pfn offset into the
PUD accordingly.

This allows KVM to handle 1G hugepage pfns on VM_PFNMAP vmas.

Co-developed-by: Nikita Leshenko
Signed-off-by: Nikita Leshenko
Signed-off-by: Joao Martins
---
 mm/memory.c | 58 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 109643219e1b..f46646630497 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4261,9 +4261,10 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+static int __follow_pte_pud(struct mm_struct *mm, unsigned long address,
 			    struct mmu_notifier_range *range,
-			    pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
+			    pte_t **ptepp, pmd_t **pmdpp, pud_t **pudpp,
+			    spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4280,6 +4281,28 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		goto out;
 
 	pud = pud_offset(p4d, address);
+	VM_BUG_ON(pud_trans_huge(*pud));
+
+	if (pud_huge(*pud)) {
+		if (!pudpp)
+			goto out;
+
+		if (range) {
+			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
+						NULL, mm, address & PUD_MASK,
+						(address & PUD_MASK) + PUD_SIZE);
+			mmu_notifier_invalidate_range_start(range);
+		}
+		*ptlp = pud_lock(mm, pud);
+		if (pud_huge(*pud)) {
+			*pudpp = pud;
+			return 0;
+		}
+		spin_unlock(*ptlp);
+		if (range)
+			mmu_notifier_invalidate_range_end(range);
+	}
+
 	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
 		goto out;
 
@@ -4335,8 +4358,8 @@ static inline int follow_pte(struct mm_struct *mm, unsigned long address,
 
 	/* (void) is needed to make gcc happy */
 	(void) __cond_lock(*ptlp,
-			   !(res = __follow_pte_pmd(mm, address, NULL,
-						    ptepp, NULL, ptlp)));
+			   !(res = __follow_pte_pud(mm, address, NULL,
+						    ptepp, NULL, NULL, ptlp)));
 	return res;
 }
 
@@ -4348,12 +4371,26 @@ int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 
 	/* (void) is needed to make gcc happy */
 	(void) __cond_lock(*ptlp,
-			   !(res = __follow_pte_pmd(mm, address, range,
-						    ptepp, pmdpp, ptlp)));
+			   !(res = __follow_pte_pud(mm, address, range,
+						    ptepp, pmdpp, NULL, ptlp)));
 	return res;
 }
 EXPORT_SYMBOL(follow_pte_pmd);
 
+static int follow_pte_pud(struct mm_struct *mm, unsigned long address,
+			  struct mmu_notifier_range *range,
+			  pte_t **ptepp, pmd_t **pmdpp, pud_t **pudpp,
+			  spinlock_t **ptlp)
+{
+	int res;
+
+	/* (void) is needed to make gcc happy */
+	(void) __cond_lock(*ptlp,
+			   !(res = __follow_pte_pud(mm, address, range,
+						    ptepp, pmdpp, pudpp, ptlp)));
+	return res;
+}
+
 /**
  * follow_pfn - look up PFN at a user virtual address
  * @vma: memory mapping
@@ -4368,6 +4405,7 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn)
 {
 	pmd_t *pmdpp = NULL;
+	pud_t *pudpp = NULL;
 	int ret = -EINVAL;
 	spinlock_t *ptl;
 	pte_t *ptep;
@@ -4375,11 +4413,13 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		return ret;
 
-	ret = follow_pte_pmd(vma->vm_mm, address, NULL,
-			     &ptep, &pmdpp, &ptl);
+	ret = follow_pte_pud(vma->vm_mm, address, NULL,
+			     &ptep, &pmdpp, &pudpp, &ptl);
 	if (ret)
 		return ret;
-	if (pmdpp)
+	if (pudpp)
+		*pfn = pud_pfn(*pudpp) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
+	else if (pmdpp)
 		*pfn = pmd_pfn(*pmdpp) + ((address & ~PMD_MASK) >> PAGE_SHIFT);
 	else
 		*pfn = pte_pfn(*ptep);

From patchwork Fri Jan 10 19:03:08 2020
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 11328285
From: Joao Martins
Subject: [PATCH RFC 05/10] device-dax: Do not enforce MADV_DONTFORK on mmap()
Date: Fri, 10 Jan 2020 19:03:08 +0000
Message-Id: <20200110190313.17144-6-joao.m.martins@oracle.com>
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>

Currently check_vma() checks for VM_DONTCOPY on a device-dax mapping when
the dax region is not backed by devmap (i.e. PFN_MAP is not set).
VM_DONTCOPY is set through madvise(MADV_DONTFORK), which can only be applied to an address range that mmap() has already returned. check_vma() is called from the devdax mmap handler, so checking VM_DONTCOPY there prevents a process from ever mmap-ing the device. Let's not enforce MADV_DONTFORK at mmap(), but rather when the mapping actually gets used (on fault). The assumptions don't change: the process is still expected to madvise(MADV_DONTFORK) after mmap(). Signed-off-by: Joao Martins --- drivers/dax/device.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 1af823b2fe6b..c6a7f5e12c54 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -14,7 +14,7 @@ #include "dax-private.h" #include "bus.h" -static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma, +static int check_vma_mmap(struct dev_dax *dev_dax, struct vm_area_struct *vma, const char *func) { struct dax_region *dax_region = dev_dax->region; @@ -41,17 +41,29 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma, return -EINVAL; } - if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV - && (vma->vm_flags & VM_DONTCOPY) == 0) { + if (!vma_is_dax(vma)) { dev_info_ratelimited(dev, - "%s: %s: fail, dax range requires MADV_DONTFORK\n", + "%s: %s: fail, vma is not DAX capable\n", current->comm, func); return -EINVAL; } - if (!vma_is_dax(vma)) { - dev_info_ratelimited(dev, - "%s: %s: fail, vma is not DAX capable\n", + return 0; +} + +static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma, + const char *func) +{ + int rc; + + rc = check_vma_mmap(dev_dax, vma, func); + if (rc < 0) + return rc; + + if ((dev_dax->region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV + && (vma->vm_flags & VM_DONTCOPY) == 0) { + dev_info_ratelimited(&dev_dax->dev, + "%s: %s: fail, dax range requires MADV_DONTFORK\n", current->comm, func); return -EINVAL; } @@ -315,7 +327,7 @@ static int dax_mmap(struct file *filp, struct 
vm_area_struct *vma) * fault time. */ id = dax_read_lock(); - rc = check_vma(dev_dax, vma, __func__); + rc = check_vma_mmap(dev_dax, vma, __func__); dax_read_unlock(id); if (rc) return rc; From patchwork Fri Jan 10 19:03:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joao Martins X-Patchwork-Id: 11328267 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C2A4892A for ; Fri, 10 Jan 2020 19:05:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A1CFA208E4 for ; Fri, 10 Jan 2020 19:05:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="akuQOJ4o" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728646AbgAJTFl (ORCPT ); Fri, 10 Jan 2020 14:05:41 -0500 Received: from aserp2120.oracle.com ([141.146.126.78]:53252 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728556AbgAJTFl (ORCPT ); Fri, 10 Jan 2020 14:05:41 -0500 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ4GGE101455; Fri, 10 Jan 2020 19:05:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2019-08-05; bh=MHgz05oZOaycJTUzvHe+/2r6D+V+CAPUNuF1+l0RUqk=; b=akuQOJ4oyrHAWCsm1Qeez7Nv1vtW+9n8sxvrA44yFKWjM07BtIuB9c9UCV2ceeiIfhD5 fcXALB3yW5Ym7HofN0nVLwDzwkAhDMLQa2lH+WiYpMQxdXMIiPgKDgKF2nt8IwqmyXfn jx6DbzVey39Sm9YdfSUM77U3azh8MRz7Ru89NkhyOw92HIKtlfX4s48NMcSAFrERCkJQ BzbvVZuCAeFo6IuoNGxV20zPc7LJ5AIxAvRZRuIfyx+8gJASy4gPQ2BL8oaYunc1dQ8J svR8Gbn6/CRGg3c5y0py2nYZycHMISPY5mlPhfDCDB/p3Sx0QvtaowAlxYgcZi9kUGDr mw== Received: from aserp3030.oracle.com 
(aserp3030.oracle.com [141.146.126.71]) by aserp2120.oracle.com with ESMTP id 2xajnqm1d0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:10 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ3t1e183573; Fri, 10 Jan 2020 19:05:09 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3030.oracle.com with ESMTP id 2xedhypum9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:09 +0000 Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 00AJ59wD014700; Fri, 10 Jan 2020 19:05:09 GMT Received: from paddy.uk.oracle.com (/10.175.192.165) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 10 Jan 2020 11:05:08 -0800 From: Joao Martins To: linux-nvdimm@lists.01.org Cc: Dan Williams , Vishal Verma , Dave Jiang , Ira Weiny , Alex Williamson , Cornelia Huck , kvm@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . 
Peter Anvin" , x86@kernel.org, Liran Alon , Nikita Leshenko , Barret Rhoden , Boris Ostrovsky , Matthew Wilcox , Konrad Rzeszutek Wilk Subject: [PATCH RFC 06/10] device-dax: Introduce pfn_flags helper Date: Fri, 10 Jan 2020 19:03:09 +0000 Message-Id: <20200110190313.17144-7-joao.m.martins@oracle.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> References: <20200110190313.17144-1-joao.m.martins@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=800 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=865 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace PFN_DEV|PFN_MAP check call sites with two helper functions dax_is_pfn_dev() and dax_is_pfn_map(). 
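To illustrate what the two helpers distinguish, here is a minimal userspace sketch. The flag bit values and the mock_region/mock_dev_dax types are stand-ins invented for this example, not the kernel definitions; only the masking logic is meant to mirror the helpers named above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; the kernel defines its own. */
#define PFN_DEV (UINT64_C(1) << 0)
#define PFN_MAP (UINT64_C(1) << 1)

/* Simplified stand-ins for struct dax_region and struct dev_dax. */
struct mock_region { uint64_t pfn_flags; };
struct mock_dev_dax { struct mock_region *region; };

/* Device memory without a devmap: PFN_DEV set, PFN_MAP clear. */
static bool dax_is_pfn_dev(const struct mock_dev_dax *dev_dax)
{
	return (dev_dax->region->pfn_flags & (PFN_DEV | PFN_MAP)) == PFN_DEV;
}

/* Device memory backed by struct pages: PFN_DEV and PFN_MAP both set. */
static bool dax_is_pfn_map(const struct mock_dev_dax *dev_dax)
{
	return (dev_dax->region->pfn_flags & (PFN_DEV | PFN_MAP)) ==
	       (PFN_DEV | PFN_MAP);
}
```

Note that the two predicates are mutually exclusive, and both are false for a region that lacks PFN_DEV altogether.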
Signed-off-by: Joao Martins --- drivers/dax/device.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index c6a7f5e12c54..113a554de3ee 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -14,6 +14,17 @@ #include "dax-private.h" #include "bus.h" +static int dax_is_pfn_dev(struct dev_dax *dev_dax) +{ + return (dev_dax->region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV; +} + +static int dax_is_pfn_map(struct dev_dax *dev_dax) +{ + return (dev_dax->region->pfn_flags & + (PFN_DEV|PFN_MAP)) == (PFN_DEV|PFN_MAP); +} + static int check_vma_mmap(struct dev_dax *dev_dax, struct vm_area_struct *vma, const char *func) { @@ -60,8 +71,7 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma, if (rc < 0) return rc; - if ((dev_dax->region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV - && (vma->vm_flags & VM_DONTCOPY) == 0) { + if (dax_is_pfn_dev(dev_dax) && (vma->vm_flags & VM_DONTCOPY) == 0) { dev_info_ratelimited(&dev_dax->dev, "%s: %s: fail, dax range requires MADV_DONTFORK\n", current->comm, func); @@ -140,7 +150,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, } /* dax pmd mappings require pfn_t_devmap() */ - if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) { + if (!dax_is_pfn_map(dev_dax)) { dev_dbg(dev, "region lacks devmap flags\n"); return VM_FAULT_SIGBUS; } @@ -190,7 +200,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, } /* dax pud mappings require pfn_t_devmap() */ - if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) { + if (!dax_is_pfn_map(dev_dax)) { dev_dbg(dev, "region lacks devmap flags\n"); return VM_FAULT_SIGBUS; } From patchwork Fri Jan 10 19:03:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joao Martins X-Patchwork-Id: 11328257 Return-Path: Received: from mail.kernel.org 
(pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5C84C930 for ; Fri, 10 Jan 2020 19:05:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 31EDC208E4 for ; Fri, 10 Jan 2020 19:05:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Bzn/J7UT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728724AbgAJTFl (ORCPT ); Fri, 10 Jan 2020 14:05:41 -0500 Received: from userp2120.oracle.com ([156.151.31.85]:55314 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728562AbgAJTFl (ORCPT ); Fri, 10 Jan 2020 14:05:41 -0500 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ44Ed111312; Fri, 10 Jan 2020 19:05:16 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2019-08-05; bh=AOtZE2Ew8hycb3SG+EnQmNQT+PxozxfzEjsUQMh/A/s=; b=Bzn/J7UT1T61gmZkCk5vbEUQl1TX7zikcJc7drBa47visSSlDYUrXXYCwC2IDFAR4i7t 58SVag0bRztRmzfWIYStozEQPsR/ZS2dbeSOMtWcyvB9NIjDDqiqquffapV9/J0QYRTO KaO7RCvWsQGtypBGMmqXw+OZA8QLWNfxxqxyzDWJmGFWQABgp/2Y20Zeu7HMImLxfn1B Zmuw+GnvpjZlKfSP9b0cI/t76wg/m/Thl2aktuoRCDM2Vjt2IptsxQ7XNh3yX0HtRf/o k9dcm9Cn1AEgHDg6QuoYJsLCakcfFqfzmdYObBpBzFRYlix+mW8pQrQwe5XkDfBCsnpQ jw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by userp2120.oracle.com with ESMTP id 2xakbrbyrr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:16 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ4RxF038537; Fri, 10 Jan 2020 19:05:16 GMT Received: from userv0121.oracle.com (userv0121.oracle.com 
[156.151.31.72]) by userp3020.oracle.com with ESMTP id 2xekkvjm94-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:15 +0000 Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 00AJ5Ec4006978; Fri, 10 Jan 2020 19:05:14 GMT Received: from paddy.uk.oracle.com (/10.175.192.165) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 10 Jan 2020 11:05:14 -0800 From: Joao Martins To: linux-nvdimm@lists.01.org Cc: Dan Williams , Vishal Verma , Dave Jiang , Ira Weiny , Alex Williamson , Cornelia Huck , kvm@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" , x86@kernel.org, Liran Alon , Nikita Leshenko , Barret Rhoden , Boris Ostrovsky , Matthew Wilcox , Konrad Rzeszutek Wilk Subject: [PATCH RFC 07/10] device-dax: Add support for PFN_SPECIAL flags Date: Fri, 10 Jan 2020 19:03:10 +0000 Message-Id: <20200110190313.17144-8-joao.m.martins@oracle.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> References: <20200110190313.17144-1-joao.m.martins@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 
definitions=main-2001100154 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Right now we assume the region is PFN_DEV|PFN_MAP, which means a struct page backs each PFN even though it is not placed in the normal system RAM zones. Add support for regions that are PFN_DEV|PFN_SPECIAL only, in which case the underlying vma won't have a struct page. For device-dax, this means not assuming that callers pass a dev_pagemap, not returning SIGBUS when the region lacks the PFN_MAP pfn flag, and not setting the struct page index/mapping on fault. Signed-off-by: Joao Martins --- drivers/dax/bus.c | 3 ++- drivers/dax/device.c | 40 ++++++++++++++++++++++------------------ 2 files changed, 24 insertions(+), 19 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 46e46047a1f7..96ca3ac85278 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -414,7 +414,8 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id, if (!dev_dax) return ERR_PTR(-ENOMEM); - memcpy(&dev_dax->pgmap, pgmap, sizeof(*pgmap)); + if (pgmap) + memcpy(&dev_dax->pgmap, pgmap, sizeof(*pgmap)); /* * No 'host' or dax_operations since there is no access to this diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 113a554de3ee..aa38f5ff180a 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -14,6 +14,12 @@ #include "dax-private.h" #include "bus.h" +static int dax_is_pfn_special(struct dev_dax *dev_dax) +{ + return (dev_dax->region->pfn_flags & + (PFN_DEV|PFN_SPECIAL)) == (PFN_DEV|PFN_SPECIAL); +} + static int dax_is_pfn_dev(struct dev_dax *dev_dax) { return (dev_dax->region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV; @@ -104,6 +110,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, struct dax_region *dax_region; phys_addr_t phys; unsigned int fault_size = PAGE_SIZE; + int rc; if (check_vma(dev_dax, vmf->vma, __func__)) return VM_FAULT_SIGBUS; @@ -126,7 +133,12 @@ static vm_fault_t 
__dev_dax_pte_fault(struct dev_dax *dev_dax, *pfn = phys_to_pfn_t(phys, dax_region->pfn_flags); - return vmf_insert_mixed(vmf->vma, vmf->address, *pfn); + if (dax_is_pfn_special(dev_dax)) + rc = vmf_insert_pfn(vmf->vma, vmf->address, pfn_t_to_pfn(*pfn)); + else + rc = vmf_insert_mixed(vmf->vma, vmf->address, *pfn); + + return rc; } static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, @@ -149,12 +161,6 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - /* dax pmd mappings require pfn_t_devmap() */ - if (!dax_is_pfn_map(dev_dax)) { - dev_dbg(dev, "region lacks devmap flags\n"); - return VM_FAULT_SIGBUS; - } - if (fault_size < dax_region->align) return VM_FAULT_SIGBUS; else if (fault_size > dax_region->align) @@ -199,12 +205,6 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - /* dax pud mappings require pfn_t_devmap() */ - if (!dax_is_pfn_map(dev_dax)) { - dev_dbg(dev, "region lacks devmap flags\n"); - return VM_FAULT_SIGBUS; - } - if (fault_size < dax_region->align) return VM_FAULT_SIGBUS; else if (fault_size > dax_region->align) @@ -266,7 +266,7 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf, rc = VM_FAULT_SIGBUS; } - if (rc == VM_FAULT_NOPAGE) { + if (dax_is_pfn_map(dev_dax) && (rc == VM_FAULT_NOPAGE)) { unsigned long i; pgoff_t pgoff; @@ -344,6 +344,8 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma) vma->vm_ops = &dax_vm_ops; vma->vm_flags |= VM_HUGEPAGE; + if (dax_is_pfn_special(dev_dax)) + vma->vm_flags |= VM_PFNMAP; return 0; } @@ -450,10 +452,12 @@ int dev_dax_probe(struct device *dev) return -EBUSY; } - dev_dax->pgmap.type = MEMORY_DEVICE_DEVDAX; - addr = devm_memremap_pages(dev, &dev_dax->pgmap); - if (IS_ERR(addr)) - return PTR_ERR(addr); + if (dax_is_pfn_map(dev_dax)) { + dev_dax->pgmap.type = MEMORY_DEVICE_DEVDAX; + addr = devm_memremap_pages(dev, &dev_dax->pgmap); + if (IS_ERR(addr)) + return PTR_ERR(addr); + } inode = 
dax_inode(dax_dev); cdev = inode->i_cdev; From patchwork Fri Jan 10 19:03:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joao Martins X-Patchwork-Id: 11328259 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7AA4F92A for ; Fri, 10 Jan 2020 19:05:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4FA252087F for ; Fri, 10 Jan 2020 19:05:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Xar53uV/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728819AbgAJTFo (ORCPT ); Fri, 10 Jan 2020 14:05:44 -0500 Received: from userp2130.oracle.com ([156.151.31.86]:47622 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728562AbgAJTFo (ORCPT ); Fri, 10 Jan 2020 14:05:44 -0500 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ3C56131553; Fri, 10 Jan 2020 19:05:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2019-08-05; bh=IAxWsXZJeFUgO3RbSR4igugMzki4kDIyDXTs6F36IxU=; b=Xar53uV/V6ArJjBjrO2U/jRSqy+lBdT0sIOibNZPQJ0sSIRTE8E/vkVf/k3ISNfMnM03 nzzlelBFvEF6mqCIDAy4qHKZvpKvvpUirI6tepOp9Z5VnjB570p2hLfGK6f0TGncDgAE vVps6a1rWKDMOLKafYwhtbYk4LBk/keEpMxkrfmUdnHsk1dnONEY1Pc5ZXBs+P+xkhGB RAzOnve4U8GM3iIInJXABFDYiDTVJfTd0zqbk7QkW0C5hTOSqP3zcJghsbOD1r/kV9P9 TXBV56IioMWKsWEgf0TQA7/gzYhXcSG2tLmJ6dGIx3RJfy5ENgg0Gn9R7dy9UDt+6Z+j 9Q== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2130.oracle.com with ESMTP id 2xaj4um8hw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 
verify=OK); Fri, 10 Jan 2020 19:05:21 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ482S069233; Fri, 10 Jan 2020 19:05:20 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3020.oracle.com with ESMTP id 2xevfec03t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:20 +0000 Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 00AJ5JPx021552; Fri, 10 Jan 2020 19:05:19 GMT Received: from paddy.uk.oracle.com (/10.175.192.165) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 10 Jan 2020 11:05:19 -0800 From: Joao Martins To: linux-nvdimm@lists.01.org Cc: Dan Williams , Vishal Verma , Dave Jiang , Ira Weiny , Alex Williamson , Cornelia Huck , kvm@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . 
Peter Anvin" , x86@kernel.org, Liran Alon , Nikita Leshenko , Barret Rhoden , Boris Ostrovsky , Matthew Wilcox , Konrad Rzeszutek Wilk Subject: [PATCH RFC 08/10] dax/pmem: Add device-dax support for PFN_MODE_NONE Date: Fri, 10 Jan 2020 19:03:11 +0000 Message-Id: <20200110190313.17144-9-joao.m.martins@oracle.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> References: <20200110190313.17144-1-joao.m.martins@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To let the dax pmem driver work without struct pages, the user requests that no PFN metadata be created by setting the seed device's mode to PFN_MODE_NONE. When the underlying nd_pfn->mode is PFN_MODE_NONE, most dax_pmem initialization steps can be skipped because we neither have nor need a pfn superblock for the pagemap/struct pages. We only allocate an opaque zeroed object carrying the requested alignment, and finally add PFN_SPECIAL to the region pfn_flags. 
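The flag selection described above can be sketched in userspace as follows. This is a simplified illustration only: the flag bit values, the mock_pfn_mode enum and mock_region_pfn_flags() are hypothetical names made up for this example, and the sketch leaves out the is_nd_dax() check and the superblock handling that the real probe path performs.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; the kernel defines its own. */
#define PFN_DEV     (UINT64_C(1) << 0)
#define PFN_MAP     (UINT64_C(1) << 1)
#define PFN_SPECIAL (UINT64_C(1) << 2)

/* Simplified stand-in for the nd_pfn mode setting. */
enum mock_pfn_mode {
	MOCK_PFN_MODE_NONE,
	MOCK_PFN_MODE_RAM,
	MOCK_PFN_MODE_PMEM,
};

/*
 * Every region starts as PFN_DEV. A mode of "none" skips the pfn
 * superblock and marks the region PFN_SPECIAL (no struct pages);
 * any other mode keeps the devmap and marks the region PFN_MAP.
 */
static uint64_t mock_region_pfn_flags(enum mock_pfn_mode mode)
{
	uint64_t pfn_flags = PFN_DEV;

	if (mode == MOCK_PFN_MODE_NONE)
		pfn_flags |= PFN_SPECIAL;
	else
		pfn_flags |= PFN_MAP;

	return pfn_flags;
}
```

In this sketch a region never carries PFN_MAP and PFN_SPECIAL at the same time, which matches the two branches of the probe path.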
Signed-off-by: Joao Martins --- drivers/dax/pmem/core.c | 36 ++++++++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c index 2bedf8414fff..67f5604a8291 100644 --- a/drivers/dax/pmem/core.c +++ b/drivers/dax/pmem/core.c @@ -17,15 +17,38 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys) struct nd_namespace_io *nsio; struct dax_region *dax_region; struct dev_pagemap pgmap = { }; + struct dev_pagemap *devmap = NULL; struct nd_namespace_common *ndns; struct nd_dax *nd_dax = to_nd_dax(dev); struct nd_pfn *nd_pfn = &nd_dax->nd_pfn; struct nd_region *nd_region = to_nd_region(dev->parent); + unsigned long long pfn_flags = PFN_DEV; ndns = nvdimm_namespace_common_probe(dev); if (IS_ERR(ndns)) return ERR_CAST(ndns); + rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id); + if (rc != 2) + return ERR_PTR(-EINVAL); + + if (is_nd_dax(&nd_pfn->dev) && nd_pfn->mode == PFN_MODE_NONE) { + /* allocate a dummy super block */ + pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), + GFP_KERNEL); + if (!pfn_sb) + return ERR_PTR(-ENOMEM); + + memset(pfn_sb, 0, sizeof(*pfn_sb)); + pfn_sb->align = nd_pfn->align; + nd_pfn->pfn_sb = pfn_sb; + pfn_flags |= PFN_SPECIAL; + + nsio = to_nd_namespace_io(&ndns->dev); + memcpy(&res, &nsio->res, sizeof(res)); + goto no_pfn_sb; + } + /* parse the 'pfn' info block via ->rw_bytes */ rc = devm_namespace_enable(dev, ndns, nd_info_block_reserve()); if (rc) @@ -45,20 +68,21 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys) return ERR_PTR(-EBUSY); } - rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id); - if (rc != 2) - return ERR_PTR(-EINVAL); - /* adjust the dax_region resource to the start of data */ memcpy(&res, &pgmap.res, sizeof(res)); res.start += offset; + devmap = &pgmap; + pfn_flags |= PFN_MAP; + +no_pfn_sb: dax_region = alloc_dax_region(dev, region_id, &res, 
nd_region->target_node, le32_to_cpu(pfn_sb->align), - PFN_DEV|PFN_MAP); + pfn_flags); if (!dax_region) return ERR_PTR(-ENOMEM); - dev_dax = __devm_create_dev_dax(dax_region, id, &pgmap, subsys); + + dev_dax = __devm_create_dev_dax(dax_region, id, devmap, subsys); /* child dev_dax instances now own the lifetime of the dax_region */ dax_region_put(dax_region); From patchwork Fri Jan 10 19:03:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joao Martins X-Patchwork-Id: 11328273 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9C7892A for ; Fri, 10 Jan 2020 19:05:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B8F442051A for ; Fri, 10 Jan 2020 19:05:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="dxO0KB0S" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728871AbgAJTFx (ORCPT ); Fri, 10 Jan 2020 14:05:53 -0500 Received: from aserp2120.oracle.com ([141.146.126.78]:53436 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728853AbgAJTFw (ORCPT ); Fri, 10 Jan 2020 14:05:52 -0500 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ3rr5101264; Fri, 10 Jan 2020 19:05:27 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2019-08-05; bh=wn9KcLolGATFIJ85MhQjHSQWXqsU1jEhASuaJIiaAfM=; b=dxO0KB0SDg6HkWerdmIGSxGHOlTPRZSiNToovSwdX2j1Wns32wgYGJopODhJFOD5rSBn puqZvYSK0bJOyTs3l2hiJo4Cva72swDScgam4nCkBGfEH0npV1q3Dz0CNk7uyqRfY7FP mzMT06ji8VS9wWPwZ9YuLo7iB61f1WJv+tqv3QCkYbLlHsUm4czNCS/d5x3gXk+Ehzjk 
Gbplaa1hJCHQYOg153zpccEUoxTfPctfoAqwCj46FWlQXVm8Wa/LIxvGEx8DRAo4VMtj SfzhCbDg63K6GPa+zzAbmswgAJ2e68gj1Cz2gpVJRcYBA9/KUzhr7D6WXqkJyPLgsgKt Fg== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2120.oracle.com with ESMTP id 2xajnqm1e5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:27 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ4QTA038469; Fri, 10 Jan 2020 19:05:26 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userp3020.oracle.com with ESMTP id 2xekkvjmhk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:26 +0000 Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 00AJ5PCf014949; Fri, 10 Jan 2020 19:05:25 GMT Received: from paddy.uk.oracle.com (/10.175.192.165) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 10 Jan 2020 11:05:24 -0800 From: Joao Martins To: linux-nvdimm@lists.01.org Cc: Dan Williams , Vishal Verma , Dave Jiang , Ira Weiny , Alex Williamson , Cornelia Huck , kvm@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H . 
Peter Anvin" , x86@kernel.org, Liran Alon , Nikita Leshenko , Barret Rhoden , Boris Ostrovsky , Matthew Wilcox , Konrad Rzeszutek Wilk Subject: [PATCH RFC 09/10] vfio/type1: Use follow_pfn for VM_PFNMAP VMAs Date: Fri, 10 Jan 2020 19:03:12 +0000 Message-Id: <20200110190313.17144-10-joao.m.martins@oracle.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> References: <20200110190313.17144-1-joao.m.martins@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=848 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=904 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Nikita Leshenko Unconditionally interpreting vm_pgoff as a PFN is incorrect. VMAs created by /dev/mem do this, but in general VM_PFNMAP just means that the VMA doesn't have an associated struct page and is being managed directly by something other than the core mmu. Use follow_pfn like KVM does to find the PFN. 
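The difference between the two approaches can be illustrated with a small userspace model. Everything here is hypothetical and made up for the example (mock_vma, the per-page PFN table, mock_follow_pfn()); it is not the kernel's follow_pfn() and only shows why linear vm_pgoff arithmetic can disagree with what is actually mapped.

```c
#include <stddef.h>

#define MOCK_PAGE_SHIFT 12

/* Simplified stand-in for a VM_PFNMAP VMA plus its page-table contents. */
struct mock_vma {
	unsigned long vm_start;    /* first virtual address of the mapping */
	unsigned long vm_pgoff;    /* offset recorded at mmap() time */
	const unsigned long *pfns; /* PFN actually installed for each page */
	size_t nr_pages;
};

/* Arithmetic approach: assume the mapping is linear from vm_pgoff. */
static unsigned long pfn_from_pgoff(const struct mock_vma *vma,
				    unsigned long vaddr)
{
	return ((vaddr - vma->vm_start) >> MOCK_PAGE_SHIFT) + vma->vm_pgoff;
}

/*
 * Lookup approach: ask the (mock) page table what is mapped at vaddr.
 * Returns 0 and stores the PFN on success, -1 if vaddr is not mapped.
 */
static int mock_follow_pfn(const struct mock_vma *vma, unsigned long vaddr,
			   unsigned long *pfn)
{
	size_t idx;

	if (vaddr < vma->vm_start)
		return -1;

	idx = (vaddr - vma->vm_start) >> MOCK_PAGE_SHIFT;
	if (idx >= vma->nr_pages)
		return -1;

	*pfn = vma->pfns[idx];
	return 0;
}
```

When the driver behind the VMA installs PFNs that are unrelated to vm_pgoff, only the lookup returns the page that is really mapped; the arithmetic result is correct only for linear mappings such as those created by /dev/mem.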
Signed-off-by: Nikita Leshenko --- drivers/vfio/vfio_iommu_type1.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 2ada8e6cdb88..1e43581f95ea 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -362,9 +362,9 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, vma = find_vma_intersection(mm, vaddr, vaddr + 1); if (vma && vma->vm_flags & VM_PFNMAP) { - *pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff; - if (is_invalid_reserved_pfn(*pfn)) - ret = 0; + ret = follow_pfn(vma, vaddr, pfn); + if (!ret && !is_invalid_reserved_pfn(*pfn)) + ret = -EOPNOTSUPP; } up_read(&mm->mmap_sem); From patchwork Fri Jan 10 19:03:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joao Martins X-Patchwork-Id: 11328293 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03CFA930 for ; Fri, 10 Jan 2020 19:06:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CD42E205F4 for ; Fri, 10 Jan 2020 19:06:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="JrPCmplD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728898AbgAJTF6 (ORCPT ); Fri, 10 Jan 2020 14:05:58 -0500 Received: from userp2130.oracle.com ([156.151.31.86]:47810 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728853AbgAJTF5 (ORCPT ); Fri, 10 Jan 2020 14:05:57 -0500 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ3xrK131913; Fri, 10 Jan 2020 19:05:32 GMT DKIM-Signature: v=1; a=rsa-sha256; 
 c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2019-08-05; bh=KCRuAAvQAwl0rfqZGNTEjq3Xbk+5StxMV3R+zKC6+g8=; b=JrPCmplD9EMEh51IfNBg19dXdZpwX0YSrH0fuLXZbPtzBggR1OlVflWX3ojalVgEQ16k YfQESJzZDYlOWPMN2sTcqvz9ZXxYTDy32sFop2GrM41KDU3xYuc6wJ6yiPKXXPBApWZT xOsCwsJEfeWi1ChoY5o/CkkaxUK7gFzYpGvcUIkqT3p183iiqFmsrAFCTAYZ8Sd5r1IZ JgsiPMiCL0pQM/GeOjZQeEhh5n/rinDIpAy8EFAivHI1yoeAOjZ7M6YzFO0woSNnYz/A 8qPEjY4km8oE0MYe0GaQeAF7E4a/XDEj3UHH1SX8Ln3csinf83vsCblTS8thQ1MHXt3c /w==
Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2130.oracle.com with ESMTP id 2xaj4um8jn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:32 +0000
Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id 00AJ3tj7183577; Fri, 10 Jan 2020 19:05:31 GMT
Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3030.oracle.com with ESMTP id 2xedhypv58-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jan 2020 19:05:31 +0000
Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 00AJ5UBM014974; Fri, 10 Jan 2020 19:05:30 GMT
Received: from paddy.uk.oracle.com (/10.175.192.165) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 10 Jan 2020 11:05:29 -0800
From: Joao Martins
To: linux-nvdimm@lists.01.org
Cc: Dan Williams , Vishal Verma , Dave Jiang , Ira Weiny , Alex Williamson , Cornelia Huck , kvm@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H .
 Peter Anvin" , x86@kernel.org, Liran Alon , Nikita Leshenko , Barret Rhoden , Boris Ostrovsky , Matthew Wilcox , Konrad Rzeszutek Wilk
Subject: [PATCH RFC 10/10] nvdimm/e820: add multiple namespaces support
Date: Fri, 10 Jan 2020 19:03:13 +0000
Message-Id: <20200110190313.17144-11-joao.m.martins@oracle.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9496 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1911140001 definitions=main-2001100154
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

Users can define regions with 'memmap=size!offset', which in turn
creates legacy PMEM devices. But because it is a label-less NVDIMM
device, we only have one namespace for the whole device.

Add support for multiple namespaces by adding ndctl control support and
exposing a minimal set of features (ND_CMD_GET_CONFIG_SIZE,
ND_CMD_GET_CONFIG_DATA, ND_CMD_SET_CONFIG_DATA) alongside NDD_ALIASING,
because we can store labels.

Initialization is a little different: we allocate and register an
nvdimm bus with an @nvdimm_descriptor which we use to locate where we
are keeping our label storage area.
The config data get/set/size operations then simply memcpy to and from
this area.

An equivalent approach can also be found in the NFIT tests, which
emulate the same thing.

Signed-off-by: Joao Martins
---
 drivers/nvdimm/e820.c | 212 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 191 insertions(+), 21 deletions(-)

diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
index e02f60ad6c99..36fbff3d7110 100644
--- a/drivers/nvdimm/e820.c
+++ b/drivers/nvdimm/e820.c
@@ -7,14 +7,21 @@
 #include
 #include
 #include
+#include
+#include
+#include
 
-static int e820_pmem_remove(struct platform_device *pdev)
-{
-	struct nvdimm_bus *nvdimm_bus = platform_get_drvdata(pdev);
+#define LABEL_SIZE SZ_128K
 
-	nvdimm_bus_unregister(nvdimm_bus);
-	return 0;
-}
+struct e820_descriptor {
+	struct nd_interleave_set nd_set;
+	struct nvdimm_bus_descriptor nd_desc;
+	void *label;
+	unsigned char cookie1[16];
+	unsigned char cookie2[16];
+	struct nvdimm_bus *nvdimm_bus;
+	struct nvdimm *nvdimm;
+};
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 static int e820_range_to_nid(resource_size_t addr)
@@ -28,43 +35,206 @@ static int e820_range_to_nid(resource_size_t addr)
 }
 #endif
 
+static int e820_get_config_size(struct nd_cmd_get_config_size *nd_cmd,
+		unsigned int buf_len)
+{
+	if (buf_len < sizeof(*nd_cmd))
+		return -EINVAL;
+
+	nd_cmd->status = 0;
+	nd_cmd->config_size = LABEL_SIZE;
+	nd_cmd->max_xfer = SZ_4K;
+
+	return 0;
+}
+
+static int e820_get_config_data(struct nd_cmd_get_config_data_hdr
+		*nd_cmd, unsigned int buf_len, void *label)
+{
+	unsigned int len, offset = nd_cmd->in_offset;
+	int rc;
+
+	if (buf_len < sizeof(*nd_cmd))
+		return -EINVAL;
+	if (offset >= LABEL_SIZE)
+		return -EINVAL;
+	if (nd_cmd->in_length + sizeof(*nd_cmd) > buf_len)
+		return -EINVAL;
+
+	nd_cmd->status = 0;
+	len = min(nd_cmd->in_length, LABEL_SIZE - offset);
+	memcpy(nd_cmd->out_buf, label + offset, len);
+	rc = buf_len - sizeof(*nd_cmd) - len;
+
+	return rc;
+}
+
+static int e820_set_config_data(struct nd_cmd_set_config_hdr *nd_cmd,
+		unsigned int buf_len, void *label)
+{
+	unsigned int len, offset = nd_cmd->in_offset;
+	u32 *status;
+	int rc;
+
+	if (buf_len < sizeof(*nd_cmd))
+		return -EINVAL;
+	if (offset >= LABEL_SIZE)
+		return -EINVAL;
+	if (nd_cmd->in_length + sizeof(*nd_cmd) + 4 > buf_len)
+		return -EINVAL;
+
+	status = (void *)nd_cmd + nd_cmd->in_length + sizeof(*nd_cmd);
+	*status = 0;
+	len = min(nd_cmd->in_length, LABEL_SIZE - offset);
+	memcpy(label + offset, nd_cmd->in_buf, len);
+	rc = buf_len - sizeof(*nd_cmd) - (len + 4);
+
+	return rc;
+}
+
+static struct e820_descriptor *to_e820_desc(struct nvdimm_bus_descriptor *desc)
+{
+	return container_of(desc, struct e820_descriptor, nd_desc);
+}
+
+static int e820_ndctl(struct nvdimm_bus_descriptor *nd_desc,
+		struct nvdimm *nvdimm, unsigned int cmd, void *buf,
+		unsigned int buf_len, int *cmd_rc)
+{
+	struct e820_descriptor *t = to_e820_desc(nd_desc);
+	int rc = -EINVAL;
+
+	switch (cmd) {
+	case ND_CMD_GET_CONFIG_SIZE:
+		rc = e820_get_config_size(buf, buf_len);
+		break;
+	case ND_CMD_GET_CONFIG_DATA:
+		rc = e820_get_config_data(buf, buf_len, t->label);
+		break;
+	case ND_CMD_SET_CONFIG_DATA:
+		rc = e820_set_config_data(buf, buf_len, t->label);
+		break;
+	default:
+		return rc;
+	}
+
+	return rc;
+}
+
+static void e820_desc_free(struct e820_descriptor *desc)
+{
+	if (!desc)
+		return;
+
+	nvdimm_bus_unregister(desc->nvdimm_bus);
+	kfree(desc->label);
+	kfree(desc);
+}
+
+static struct e820_descriptor *e820_desc_alloc(struct platform_device *pdev)
+{
+	struct nvdimm_bus_descriptor *nd_desc;
+	unsigned int cmd_mask, dimm_flags;
+	struct device *dev = &pdev->dev;
+	struct nvdimm_bus *nvdimm_bus;
+	struct e820_descriptor *desc;
+	struct nvdimm *nvdimm;
+
+	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
+	if (!desc)
+		goto err;
+
+	desc->label = kzalloc(LABEL_SIZE, GFP_KERNEL);
+	if (!desc->label)
+		goto err;
+
+	nd_desc = &desc->nd_desc;
+	nd_desc->provider_name = "e820";
+	nd_desc->module = THIS_MODULE;
+	nd_desc->ndctl = e820_ndctl;
+	nvdimm_bus = nvdimm_bus_register(&pdev->dev, nd_desc);
+	if (!nvdimm_bus) {
+		dev_err(dev, "nvdimm bus registration failure\n");
+		goto err;
+	}
+	desc->nvdimm_bus = nvdimm_bus;
+
+	cmd_mask = (1UL << ND_CMD_GET_CONFIG_SIZE |
+			1UL << ND_CMD_GET_CONFIG_DATA |
+			1UL << ND_CMD_SET_CONFIG_DATA);
+	dimm_flags = (1UL << NDD_ALIASING);
+	nvdimm = nvdimm_create(nvdimm_bus, pdev, NULL,
+				dimm_flags, cmd_mask, 0, NULL);
+	if (!nvdimm) {
+		dev_err(dev, "nvdimm creation failure\n");
+		goto err;
+	}
+	desc->nvdimm = nvdimm;
+	return desc;
+
+err:
+	e820_desc_free(desc);
+	return NULL;
+}
+
 static int e820_register_one(struct resource *res, void *data)
 {
+	struct platform_device *pdev = data;
 	struct nd_region_desc ndr_desc;
-	struct nvdimm_bus *nvdimm_bus = data;
+	struct nd_mapping_desc mapping;
+	struct e820_descriptor *desc;
+
+	desc = e820_desc_alloc(pdev);
+	if (!desc)
+		return -ENOMEM;
+
+	mapping.nvdimm = desc->nvdimm;
+	mapping.start = res->start;
+	mapping.size = resource_size(res);
+	mapping.position = 0;
+
+	generate_random_uuid(desc->cookie1);
+	desc->nd_set.cookie1 = (u64) desc->cookie1;
+	generate_random_uuid(desc->cookie2);
+	desc->nd_set.cookie2 = (u64) desc->cookie2;
 
 	memset(&ndr_desc, 0, sizeof(ndr_desc));
 	ndr_desc.res = res;
 	ndr_desc.numa_node = e820_range_to_nid(res->start);
 	ndr_desc.target_node = ndr_desc.numa_node;
+	ndr_desc.mapping = &mapping;
+	ndr_desc.num_mappings = 1;
+	ndr_desc.nd_set = &desc->nd_set;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
-	if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
+	if (!nvdimm_pmem_region_create(desc->nvdimm_bus, &ndr_desc)) {
+		e820_desc_free(desc);
+		dev_err(&pdev->dev, "nvdimm region creation failure\n");
 		return -ENXIO;
+	}
+
+	platform_set_drvdata(pdev, desc);
+	return 0;
+}
+
+static int e820_pmem_remove(struct platform_device *pdev)
+{
+	struct e820_descriptor *desc = platform_get_drvdata(pdev);
+
+	e820_desc_free(desc);
 	return 0;
 }
 
 static int e820_pmem_probe(struct platform_device *pdev)
 {
-	static struct nvdimm_bus_descriptor nd_desc;
-	struct device *dev = &pdev->dev;
-	struct nvdimm_bus *nvdimm_bus;
 	int rc = -ENXIO;
 
-	nd_desc.provider_name = "e820";
-	nd_desc.module = THIS_MODULE;
-	nvdimm_bus = nvdimm_bus_register(dev, &nd_desc);
-	if (!nvdimm_bus)
-		goto err;
-	platform_set_drvdata(pdev, nvdimm_bus);
-
 	rc = walk_iomem_res_desc(IORES_DESC_PERSISTENT_MEMORY_LEGACY,
-			IORESOURCE_MEM, 0, -1, nvdimm_bus, e820_register_one);
+			IORESOURCE_MEM, 0, -1, pdev, e820_register_one);
 	if (rc)
 		goto err;
 	return 0;
 err:
-	nvdimm_bus_unregister(nvdimm_bus);
-	dev_err(dev, "failed to register legacy persistent memory ranges\n");
+	dev_err(&pdev->dev, "failed to register legacy persistent memory ranges\n");
 	return rc;
 }
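
For readers who want to reason about the bounds handling in e820_get_config_data() and e820_set_config_data() outside the kernel, the following is a minimal userspace sketch in C. It is not the driver code: struct label_area, label_read() and label_write() are hypothetical names made up for this illustration, and the ndctl command headers and status words are left out. Only the 128K label size and the "reject an out-of-range offset, then clamp the copy to the end of the area" behaviour are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same size the patch picks for the label area (SZ_128K). */
#define LABEL_SIZE (128u * 1024u)

/* Hypothetical userspace stand-in for the in-memory label storage. */
struct label_area {
	unsigned char data[LABEL_SIZE];
};

/* Clamp a requested length so the copy never runs past the area. */
static uint32_t clamp_len(uint32_t offset, uint32_t length)
{
	uint32_t room = LABEL_SIZE - offset;

	return length < room ? length : room;
}

/*
 * Read up to `length` bytes starting at `offset`.
 * Returns the number of bytes copied, or -1 if the offset is out of range.
 */
static long label_read(const struct label_area *area, uint32_t offset,
		       void *out, uint32_t length)
{
	uint32_t len;

	if (offset >= LABEL_SIZE)
		return -1;

	len = clamp_len(offset, length);
	memcpy(out, area->data + offset, len);
	return (long)len;
}

/*
 * Write up to `length` bytes starting at `offset`.
 * Returns the number of bytes stored, or -1 if the offset is out of range.
 */
static long label_write(struct label_area *area, uint32_t offset,
			const void *in, uint32_t length)
{
	uint32_t len;

	if (offset >= LABEL_SIZE)
		return -1;

	len = clamp_len(offset, length);
	memcpy(area->data + offset, in, len);
	return (long)len;
}
```

A caller would write a few bytes with label_write(), read them back with label_read(), and compare the return value against the requested length to detect a request that was clamped at the end of the area. The driver reports the same condition differently, through the residual byte count it returns to the ndctl core.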