From patchwork Tue Jul 13 08:46:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC33EC11F69 for ; Tue, 13 Jul 2021 08:47:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A7833613A9 for ; Tue, 13 Jul 2021 08:47:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234869AbhGMIuS (ORCPT ); Tue, 13 Jul 2021 04:50:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234849AbhGMIuQ (ORCPT ); Tue, 13 Jul 2021 04:50:16 -0400 Received: from mail-pf1-x431.google.com (mail-pf1-x431.google.com [IPv6:2607:f8b0:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 039BAC0613F0 for ; Tue, 13 Jul 2021 01:47:27 -0700 (PDT) Received: by mail-pf1-x431.google.com with SMTP id b12so18896500pfv.6 for ; Tue, 13 Jul 2021 01:47:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=omwDqSVrMraB8zPCmbagpNybYhuwUgLxPSaQSPVKs0I=; b=NFS9sf7oZs6OfC3ePyWxWOFYPRWYuHJEmSuowU4+A/AN9SkO5oklzsdY2KXu8XLaK/ RtjCDzBNyMsdIgaq+gdMB7X3iSEzGHSn43PT3yYRI1O50eXo3lcMJohxW+FjS73Fgzhz cByWfSfvdlPZnHSM2paax8FVHByTKAKS59DKXws2obexLNMYBSz4JwJpnlbAFJbXemnl muonmUyFpQcCa6tNLtEtmhQrzdudI5qv9lDhSdlcAC7ix0U02RmFBKcz7hR6i8/sr/Il miY9BiEVBFNOzq9HdHWN3Ltqrr/qolEO7tzr92l33+PTeqr/Mh3jyfO+z8sDGyszPXwg lRGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=omwDqSVrMraB8zPCmbagpNybYhuwUgLxPSaQSPVKs0I=; b=S15e7CoiY4bpz55EJvVdazdY8LBbZABBJ6l/AK56yeZi3k5xJxJBqFbVQSpO4efvtK mLy2IRt3Ell0QKNzJFQ99DD0qQU/n/qOxG3nku3RgIAPgaqHFCRZaQ31OR0HUir3eU7l IXzYzGXDQoxeUxTpXdrbpqzuxljaKBzCskHlhO4xCpaJjGmXfcy+OGw7P9Tgokg1PL1+ G5xLK0IONgTl/2Gtq4A9jhhJqt3xlDm5h/grieNDKJQNr8c+350T3RESxqOwZ+BpPzzP r4EXHpJmeDceejsiR8gyDOlo5p66oOJnHjKaebuuSsL7XcOkOPC7O7ch1gABoBdlGdve CmlQ== X-Gm-Message-State: AOAM533m/hcTo08tEVm3fii5SFvOUrL9yU4zIgmemOeXBDz2pphn05lN PYEOPzHYSMO61q4bP2dbPHpW X-Google-Smtp-Source: ABdhPJy1XC0KnBeCN6bWa8CB2iE99dOa7TYAMnl/f4lHXuMSNKXTJU6n9g1wg0BApWWsX8PXSpPEXw== X-Received: by 2002:a63:1266:: with SMTP id 38mr3331831pgs.154.1626166046577; Tue, 13 Jul 2021 01:47:26 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id j21sm18231301pfn.35.2021.07.13.01.47.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:26 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 01/17] iova: Export alloc_iova_fast() and free_iova_fast() Date: Tue, 13 Jul 2021 16:46:40 +0800 Message-Id: <20210713084656.232-2-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Export alloc_iova_fast() and free_iova_fast() so that some modules can use it to improve iova allocation efficiency. Signed-off-by: Xie Yongji --- drivers/iommu/iova.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index b6cf5f16123b..3941ed6bb99b 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -521,6 +521,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size, return new_iova->pfn_lo; } +EXPORT_SYMBOL_GPL(alloc_iova_fast); /** * free_iova_fast - free iova pfn range into rcache @@ -538,6 +539,7 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size) free_iova(iovad, pfn); } +EXPORT_SYMBOL_GPL(free_iova_fast); #define fq_ring_for_each(i, fq) \ for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE) From patchwork Tue Jul 13 08:46:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E37DC11F68 for ; Tue, 13 Jul 2021 08:47:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3360F60725 for ; Tue, 13 Jul 2021 08:47:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234942AbhGMIu1 (ORCPT ); Tue, 13 Jul 2021 04:50:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50138 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234910AbhGMIuW (ORCPT ); Tue, 13 Jul 2021 04:50:22 -0400 Received: from mail-pg1-x52e.google.com (mail-pg1-x52e.google.com [IPv6:2607:f8b0:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBAB3C061788 for ; Tue, 13 Jul 2021 01:47:30 -0700 (PDT) Received: by mail-pg1-x52e.google.com with SMTP id y4so18511511pgl.10 for ; Tue, 13 Jul 2021 01:47:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=x7lCOA+2uHcoDlHpnAPiyr08qHLnKknrRwhD42bW14c=; b=drxf3oJhygPSsqt71Ess67rxf+ndIQUhAGvJBQ7FfJc0EFN26/FzOtRTtC7PVm1UpN kQbh6cvZeXqEDwoNx0CPkFun1o95KcC72J/BrRHyWrL8j1ckCdBhsQrNaMArojnrxTDL EmSaEel9rQLb9Sl4sGu/b4+LJvaPa8chJvARj9AOwsZWRs4OZF+vghBIiQsIsYm4M9Ea tM/aLM4LHw9LpWwiWBNz+lg0l70CKpTeZGSiDnx8VQaiPHKV/RYUAN0tySvpbSXIWg/n CsQiMon3tJMRQHkqUdGOISGCwJp1P5f62r4sA52w48rVbKC6WaqVP0H4fqQAmdZ4e0TS niRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=x7lCOA+2uHcoDlHpnAPiyr08qHLnKknrRwhD42bW14c=; b=iGWDrVH46a5LzXKTOSg3UCFql5oWQjt/YHi4xSUJd05aIZsBZ+G5pFgdEbC0JskF4K laNEzDp6tsQnwmD9RqMJyHiKbQeZ7mG+N4NcO1MyZFgqRT0GoB5B3d9A9Gg1RNT2BOPT 4UHcKr9H3t8xib16umbM1uCcpf3VD3F+wCIzFwy29PyoqrJqevXKhvD56NFzXEHHg2gy +XKx7sYU04tIE9HYDFF1gKbFv2JsMzHysHJnbg1VksgJ53Ft8j6J3MY84zkwBtkcfFVM 84+na3n8xScnlEeilT51aoYeCwz0xW+358eVSezRYChMm4ErlB3Fy5C+T/Smf/QyORzQ +Lzw== X-Gm-Message-State: AOAM5332zpjcDM1dVRK1yT5q073Ew9az3HI+H1KFrbH00Vbl2mv+8bku wAyLMzGOgpvVEUgQfeFarXA5 X-Google-Smtp-Source: ABdhPJwdRaHRXTGGcq8TKcIstM4ONkFGUJKEp5Fy9XmDOpn7i0jIsKMl/HA/2cyBqJh1EcfH1oJiyQ== X-Received: by 2002:a63:5802:: with SMTP id m2mr3270429pgb.171.1626166050321; Tue, 13 Jul 2021 01:47:30 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id z12sm15702430pjd.39.2021.07.13.01.47.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:29 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 02/17] file: Export receive_fd() to modules Date: Tue, 13 Jul 2021 16:46:41 +0800 Message-Id: <20210713084656.232-3-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Export receive_fd() so that some modules can use it to pass file descriptor between processes without missing any security stuffs. Signed-off-by: Xie Yongji --- fs/file.c | 6 ++++++ include/linux/file.h | 7 +++---- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/file.c b/fs/file.c index 86dc9956af32..210e540672aa 100644 --- a/fs/file.c +++ b/fs/file.c @@ -1134,6 +1134,12 @@ int receive_fd_replace(int new_fd, struct file *file, unsigned int o_flags) return new_fd; } +int receive_fd(struct file *file, unsigned int o_flags) +{ + return __receive_fd(file, NULL, o_flags); +} +EXPORT_SYMBOL_GPL(receive_fd); + static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags) { int err = -EBADF; diff --git a/include/linux/file.h b/include/linux/file.h index 2de2e4613d7b..51e830b4fe3a 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -94,6 +94,9 @@ extern void fd_install(unsigned int fd, struct file *file); extern int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags); + +extern int receive_fd(struct file *file, unsigned int o_flags); + static inline int receive_fd_user(struct file *file, int __user *ufd, unsigned int o_flags) { @@ -101,10 +104,6 @@ static inline int receive_fd_user(struct file *file, int __user *ufd, return -EFAULT; return __receive_fd(file, ufd, o_flags); } -static inline int receive_fd(struct file *file, unsigned int o_flags) -{ - return __receive_fd(file, NULL, o_flags); -} int receive_fd_replace(int new_fd, struct file *file, unsigned int o_flags); extern void flush_delayed_fput(void); From patchwork Tue Jul 13 08:46:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B8A8C07E95 for ; Tue, 13 Jul 2021 08:47:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 36144613AB for ; Tue, 13 Jul 2021 08:47:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234854AbhGMIue (ORCPT ); Tue, 13 Jul 2021 04:50:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234898AbhGMIu0 (ORCPT ); Tue, 13 Jul 2021 04:50:26 -0400 Received: from mail-pg1-x52b.google.com (mail-pg1-x52b.google.com [IPv6:2607:f8b0:4864:20::52b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7581CC0617A7 for ; Tue, 13 Jul 2021 01:47:34 -0700 (PDT) Received: by mail-pg1-x52b.google.com with SMTP id s18so5707237pgq.3 for ; Tue, 13 Jul 2021 01:47:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fyDF3ItsKMeO1MjqIrqgJilMt7mYKDPvv5TK6RowgtE=; b=p1KNnydZcTiuyWle/nJtM7wFkNnRwelSBfv+CpsCxOYkJn5kWph/w1ljqkJz9zrEg5 L2pkcNggfl3Q9RjS5qhSOgAu/suoOWBFYsfmw4VoJ3cVZCGQWXC/t3LZ5InMM8XR5LRl xHGxHb49eUsoBIewF2f127m9Zr6vQ1nNrE7ANg6C6KNKlJI6o8pJl/1YyAaWFPkOV3/r EsJv5vTsWLrxteRhSjRM/MuUUYUEpmJLfFEjWeF53QSvgsRiNQPzhkLCqrfes/TWccji zCW/mpIwd16XGKv0iCcj7quuKncnXS6vKRI09sMPi97p5ey6v9LWBy67FlYZ5GtM1LSk RZlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fyDF3ItsKMeO1MjqIrqgJilMt7mYKDPvv5TK6RowgtE=; b=Mm8PFBVAPYt+PREGM1OQNwl73hrFsJpEUtbVRshFGG+CjxL0YC3XX6eBwfCzie1f3z ZU16x3ELDbxdX9SwrdnU0KMGA5A3gq2Dk9oUBp4ymhhkySGR3TLxhKQTcI0ZxqaJ3kMG W52mJt0s1BYJoLyxy1AzL+ebpAHYcMPRReGF2Yuh/WD7ZSipccqfLDU71M/GU+kd5kmA WBMhMCls0rXK2kroRXIOHbY3mFWGLqltvSUbEcofyy6xo9MQBEGMN4PrRksuezswA7MC ZHYzSWWJ8Fv4rBBIEhwKTzB7nHe3h1vbgjOV2jmt519MInriV8R5oIzvJavkNvZtMMd+ vC0A== X-Gm-Message-State: AOAM531WouSMdR8MrsTFR+kh0RimjvwpNenN8iR/+ddvxukclXjKC5az z/8ex7+o9Iqff4tWepgAvP15 X-Google-Smtp-Source: ABdhPJx6Qc5AYiRax5JttuV19B9UsF3u1wzbj3hvEI32kFSkxmH/vgaDI62s1sThuuzVSR58tJkLXA== X-Received: by 2002:a62:f947:0:b029:2e9:c502:7939 with SMTP id g7-20020a62f9470000b02902e9c5027939mr3616720pfm.34.1626166053934; Tue, 13 Jul 2021 01:47:33 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id n6sm12746734pgb.60.2021.07.13.01.47.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:33 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 03/17] vdpa: Fix code indentation Date: Tue, 13 Jul 2021 16:46:42 +0800 Message-Id: <20210713084656.232-4-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Use tabs to indent the code instead of spaces. Signed-off-by: Xie Yongji --- include/linux/vdpa.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 7c49bc5a2b71..f822490db584 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -342,25 +342,25 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev) static inline void vdpa_reset(struct vdpa_device *vdev) { - const struct vdpa_config_ops *ops = vdev->config; + const struct vdpa_config_ops *ops = vdev->config; vdev->features_valid = false; - ops->set_status(vdev, 0); + ops->set_status(vdev, 0); } static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features) { - const struct vdpa_config_ops *ops = vdev->config; + const struct vdpa_config_ops *ops = vdev->config; vdev->features_valid = true; - return ops->set_features(vdev, features); + return ops->set_features(vdev, features); } static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset, void *buf, unsigned int len) { - const struct vdpa_config_ops *ops = vdev->config; + const struct vdpa_config_ops *ops = vdev->config; /* * Config accesses aren't supposed to trigger before features are set. From patchwork Tue Jul 13 08:46:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373147 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD860C07E96 for ; Tue, 13 Jul 2021 08:47:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BADC460725 for ; Tue, 13 Jul 2021 08:47:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234949AbhGMIu3 (ORCPT ); Tue, 13 Jul 2021 04:50:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234910AbhGMIu1 (ORCPT ); Tue, 13 Jul 2021 04:50:27 -0400 Received: from mail-pf1-x42a.google.com (mail-pf1-x42a.google.com [IPv6:2607:f8b0:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B8A9C0613F0 for ; Tue, 13 Jul 2021 01:47:38 -0700 (PDT) Received: by mail-pf1-x42a.google.com with SMTP id j199so18932011pfd.7 for ; Tue, 13 Jul 2021 01:47:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tdSO+YXsywfBhdOnOREu62bNqEWlUHvrtNpQOS7MuZo=; b=hiimUl4DgO1Of4XGPXpdICPYnPm7+QvGfh85ne+O9YqLV4HwNaz1/3GVCexKIIR9bK 79qC2iFIA2pLQ1nFnXPut8k1vrxiwBlzw8SE6x4BYAMbkEs5Gxh0ltQDnZC1DuhOWaXy Y2aq3utptKdeOlIEcc73zmkwyC0gwKsapg8k19yhxOLFoyCxe7DHgO2gP5VdDTqtHx6A mJhAVdE2NVFsWzib8FCLBIYmov6tjYXcaVAbydRQY1/65AASv7O0PG98TXYj+Gu12TPn VjG1m9HmS0VsQdytCx6vQRcAJNoY4lM6llfjwo4MI5ukWs2uwYJovIc1U/KU7Zz4NCOC WkTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tdSO+YXsywfBhdOnOREu62bNqEWlUHvrtNpQOS7MuZo=; b=VegjM2YR17JEIIWOVi4LoJv5794vgk3jMydIL5Ow3FXZITb4z5pBHyaBgZq0K/jpvB tlU/HGPLuByt7L8KJwWeQoyzwe02B5+o8oRMTAsw2FVpJQRqg1GjrQA/aIrVbF1cc6W9 pAZ/HhRORKdU67/zDMJ4Gb5Ciky8G0NvXmMmInxTCmqV979cCTDlmEC0hwbZrFolONBg C09OHqHhBIP2BmXTnkT4yRpQsG6wl68NQGvlKgRjTtYvkNPgTiS1htM3EkG0HYmXg4og R6xcRE5/F06Em6ujRGPUsyGTxypbZxPlEKrWOWDdrWwedeeMWOPFxPFMXQvRQyeqYf2L UDXA== X-Gm-Message-State: AOAM533BjhTqDrIJPp3uHJfMuFT/al6JrmDfIgdRE164XUzzVN9s+JQu Kp/PjYCNeqsa3vKTYfHT0+dT X-Google-Smtp-Source: ABdhPJzmFohN6vUcKF8KLIan4JZVjJ6TYDpkl96NawoSdWhjfLM6wpCarQvAfJbCfQIvaaP5xnDdBw== X-Received: by 2002:a05:6a00:2:b029:32e:3ef0:770a with SMTP id h2-20020a056a000002b029032e3ef0770amr810552pfk.8.1626166057836; Tue, 13 Jul 2021 01:47:37 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id u7sm21390626pgl.30.2021.07.13.01.47.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:37 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 04/17] vdpa: Fail the vdpa_reset() if fail to set device status to zero Date: Tue, 13 Jul 2021 16:46:43 +0800 Message-Id: <20210713084656.232-5-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Re-read the device status to ensure it's set to zero during resetting. Otherwise, fail the vdpa_reset() after timeout. Signed-off-by: Xie Yongji --- include/linux/vdpa.h | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index f822490db584..198c30e84b5d 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -6,6 +6,7 @@ #include #include #include +#include /** * struct vdpa_calllback - vDPA callback definition. @@ -340,12 +341,24 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev) return vdev->dma_dev; } -static inline void vdpa_reset(struct vdpa_device *vdev) +#define VDPA_RESET_TIMEOUT_MS 1000 + +static inline int vdpa_reset(struct vdpa_device *vdev) { const struct vdpa_config_ops *ops = vdev->config; + int timeout = 0; vdev->features_valid = false; ops->set_status(vdev, 0); + while (ops->get_status(vdev)) { + timeout += 20; + if (timeout > VDPA_RESET_TIMEOUT_MS) + return -EIO; + + msleep(20); + } + + return 0; } static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features) From patchwork Tue Jul 13 08:46:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44369C07E95 for ; Tue, 13 Jul 2021 08:47:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 286B5613AB for ; Tue, 13 Jul 2021 08:47:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234967AbhGMIuh (ORCPT ); Tue, 13 Jul 2021 04:50:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50216 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234924AbhGMIud (ORCPT ); Tue, 13 Jul 2021 04:50:33 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11AD3C061787 for ; Tue, 13 Jul 2021 01:47:42 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id c15so10724941pls.13 for ; Tue, 13 Jul 2021 01:47:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FdoZ+4EM7JCQgZ86bDT9RbGlh7q+AWgllGjoIYPWL9Q=; b=gA1/QgaUqMMGrBGGpEVYK8fMzHiF70tOx013tB9CxkOMsIAz6K9CyWbLukiYsceeeK WT3omC6T/UV02GbqJntqFcUEJy/CsH+2v38lltH8hTUVi5bDMirNFM2m8jsgCd53U/1v Lc/Oygn7Gsw282JeKIqoiMuvIbDZt/am73WeLkzOY2/CkX4nQAuUu3ul4s5GPTxJkyx3 dWRyzileES+3pFwFh509M/IKBfUS06G83m9tPJd2yiq3RvUawNvAiQ6SGLm5AoV3JD3Q Yf50mdZiBV6HqGDPHp59XIyHUg9RniZUw56YWfFOmyWOXjXu+QsXkJ6TpitVDuk7cFO9 CT5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FdoZ+4EM7JCQgZ86bDT9RbGlh7q+AWgllGjoIYPWL9Q=; b=Citrxpi6Bfsw5M51J2WqlRUF/YtKLnNRuZjf6EF2aoJB/rKGeBZu4yQ5kBjjUvIUyu Ttv0AZxpjn4qLCj+B7YDHDH7ell8/Nw7zR0ahP4qOg1WUbn5u/0G0M3tzzS84kJynuQD dbdehq1JxEaT/7nOLjzWyptuVYzzV9rSiRDjruUJbTySxFij/hhjqA4p8MYYX4Tl7TSq WOHIHMFSBbhenRjFVd0SEczrPynT6vIoy2TEuG/JP6Jsv1IWFbkSFf5ppbYaHvojDsSw QQJThw0hsMS0uNJfgesn/IVUQejtvZMrl7m5DYxczJoyzlD5UTxD+a73sTUcfYE0NrgN g87w== X-Gm-Message-State: AOAM531PL8d+e8eUf0W0FZEXSsuMa8zNuf5Y8B6roGvP0NWmD61sfxA9 IWd7AZpjMZMsVRqlxGdEUNhM X-Google-Smtp-Source: ABdhPJy/YXaKCppxOExioeRxTLQv1sj8lcUHDptTXIY/sm3klehqPKR0QkOZNRfSckm4SM+jax8f+w== X-Received: by 2002:a17:90b:1b4d:: with SMTP id nv13mr1935868pjb.216.1626166061630; Tue, 13 Jul 2021 01:47:41 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id w2sm15858457pjq.5.2021.07.13.01.47.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:41 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 05/17] vhost-vdpa: Fail the vhost_vdpa_set_status() on reset failure Date: Tue, 13 Jul 2021 16:46:44 +0800 Message-Id: <20210713084656.232-6-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Re-read the device status to ensure it's set to zero during resetting. Otherwise, fail the vhost_vdpa_set_status() after timeout. Signed-off-by: Xie Yongji --- drivers/vhost/vdpa.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index bb374a801bda..62b6d911c57d 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -157,7 +157,7 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp) struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; u8 status, status_old; - int nvqs = v->nvqs; + int timeout = 0, nvqs = v->nvqs; u16 i; if (copy_from_user(&status, statusp, sizeof(status))) @@ -173,6 +173,15 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp) return -EINVAL; ops->set_status(vdpa, status); + if (status == 0) { + while (ops->get_status(vdpa)) { + timeout += 20; + if (timeout > VDPA_RESET_TIMEOUT_MS) + return -EIO; + + msleep(20); + } + } if ((status & VIRTIO_CONFIG_S_DRIVER_OK) && !(status_old & VIRTIO_CONFIG_S_DRIVER_OK)) for (i = 0; i < nvqs; i++) From patchwork Tue Jul 13 08:46:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEB83C11F68 for ; Tue, 13 Jul 2021 08:47:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DA573613A9 for ; Tue, 13 Jul 2021 08:47:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235032AbhGMIuj (ORCPT ); Tue, 13 Jul 2021 04:50:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234998AbhGMIuf (ORCPT ); Tue, 13 Jul 2021 04:50:35 -0400 Received: from mail-pf1-x435.google.com (mail-pf1-x435.google.com [IPv6:2607:f8b0:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 900D0C0613E9 for ; Tue, 13 Jul 2021 01:47:45 -0700 (PDT) Received: by mail-pf1-x435.google.com with SMTP id o201so13862188pfd.1 for ; Tue, 13 Jul 2021 01:47:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=duCG4IP24crSQcX/Hc+dmtkejh6EvJA1HRc0A4ISBjM=; b=jhEklg55ncTgryEBNVp2R8QmbT0mSSZ9Jbx/vRWE7VCj3bWE2z2dIOjal3v7V87F+w KXfQDIBXqy11zkoXvndvUbGlVPDEiXSl+fAQR2u1kh3sE66g/ycHI3ytjUjkQxSBJH/g AZxHQhTacxVYFFdIyyjgze0ilCXCS/jDuumV2MlbMyoRI0UfccSBvPIQYWOGLs4/8MSa /LKjYT2XeUKg2HCwHW37u9JbUA9UywMRQY5Oh21L5F5yKxW5N31AISgPj+ldmzvcwtCe E1Q/Vl13UyWPfsgrfUc5jX89w8ETwn8bjbTAF+M6GLaU7IxUmlm0UT8g4Vzjxqr6Yp/e wF9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=duCG4IP24crSQcX/Hc+dmtkejh6EvJA1HRc0A4ISBjM=; b=lJEo7coJfZvm/utzPaS/YT8HKdi+2nOPVdnIAaGVUKMJp1EcGIDBwLLear8npERKdf irH3pFpHGqyMoa0F+Tw5Z5P+porgrobbRvKF2lcOfkqeawIsY5CVuE5xKWZ6V1xpE+eC BRkw1Ede3bzmZFGOp5LbqPhq8YgYXIzPjzFidIsBdN2EYUnngJGpFTj8D2TVf6s6EaOw MAiuXddJpTpd4kiREYJZCk2he1DBii2Hz0c9n7QCuDXEgVC1TO4A49waU/YEmeFmJZ+p XchF5SG20fY2zUhc3m9Q4OHmK5Ap86jfyzuXIKG9NN9HJY9QaySHsPplk88rJ/Xzlqtk Pg0g== X-Gm-Message-State: AOAM532b55Xr8nA70pwZDa3/tX3ECs7rIzwr+12G6lksUimQm6q1QIm9 sydTFtnRgsCCJBlKYpSrpx62 X-Google-Smtp-Source: ABdhPJzUKtRs1AySsyiLpLF85eJbibtFpTyQ7lV4V3lGutdi9feZAXM7KOxrD2p11DbuRgxCB1AMmQ== X-Received: by 2002:a62:ee16:0:b029:2fe:ffcf:775a with SMTP id e22-20020a62ee160000b02902feffcf775amr3339041pfi.59.1626166065181; Tue, 13 Jul 2021 01:47:45 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id b22sm17658102pfi.181.2021.07.13.01.47.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:44 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 06/17] vhost-vdpa: Handle the failure of vdpa_reset() Date: Tue, 13 Jul 2021 16:46:45 +0800 Message-Id: <20210713084656.232-7-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The vdpa_reset() may fail now. This adds check to its return value and fail the vhost_vdpa_open(). Signed-off-by: Xie Yongji --- drivers/vhost/vdpa.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 62b6d911c57d..8615756306ec 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -116,12 +116,13 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid) irq_bypass_unregister_producer(&vq->call_ctx.producer); } -static void vhost_vdpa_reset(struct vhost_vdpa *v) +static int vhost_vdpa_reset(struct vhost_vdpa *v) { struct vdpa_device *vdpa = v->vdpa; - vdpa_reset(vdpa); v->in_batch = 0; + + return vdpa_reset(vdpa); } static long vhost_vdpa_get_device_id(struct vhost_vdpa *v, u8 __user *argp) @@ -871,7 +872,9 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep) return -EBUSY; nvqs = v->nvqs; - vhost_vdpa_reset(v); + r = vhost_vdpa_reset(v); + if (r) + goto err; vqs = kmalloc_array(nvqs, sizeof(*vqs), GFP_KERNEL); if (!vqs) { From patchwork Tue Jul 13 08:46:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C8B7C11F68 for ; Tue, 13 Jul 2021 08:47:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 00D9B613A9 for ; Tue, 13 Jul 2021 08:47:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235040AbhGMIuo (ORCPT ); Tue, 13 Jul 2021 04:50:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50254 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235022AbhGMIui (ORCPT ); Tue, 13 Jul 2021 04:50:38 -0400 Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C8F8C0613DD for ; Tue, 13 Jul 2021 01:47:49 -0700 (PDT) Received: by mail-pj1-x1036.google.com with SMTP id i16-20020a17090acf90b02901736d9d2218so1655385pju.1 for ; Tue, 13 Jul 2021 01:47:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=EjR09xl/s+Zlx+uuKDPZmG38VH07hTVA8QsJdDWGoMM=; b=0VNyBgTY7yxOchYvhwFlmk9k2udQOWCUUMpM8nhTR7Bk4VZtxT4ZI8f98TzkKevJk5 dYLlzzFa5wsVMHn13aVEoDurgoQ/B97Q+JhhlkS6OR7p2ukAzqbqaIA6heoBr2mvaW2u LWKNdvE1D06+3O4nambhTYRGp9OSBjhVsvc+sS1fRJwI0tKIi1LcEHcgiIroKNvO+Tk9 Y9lhU4YFRflDnAO1ytSChBfqW/0Xf2PueeNxk3B7oGv7K2ZJPXdQOsXRxJSUBz/FAPuC yq81H3J4EOgBB4At7Bf6X83H6Bjq4AOEWbBPDSkXpZtUwXWHYKXE5SKnGfz6IjR2oj55 XrbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=EjR09xl/s+Zlx+uuKDPZmG38VH07hTVA8QsJdDWGoMM=; b=AjuvbRUaLuiuMZ+iKw1oR3wV0fJBzI57oa6eRAb15+8DcYyc2U09Bk1J2Szr7/HodL 0E9wgmTD0klnmH2IUPCfsrmvyvfzEMJuQKqGzVT8jNui1tVaiCGCZmPtMeoBGxuQCn72 t3gw7Z1+yf6/k2fpUp5orOXTDfim1x+PY2FdnWm2f/x5bpq8oJ/fPwz6H/m7/JpRKbib Qicfwv1gdNTUMpxNX0i4SY6S0/wvaBbyVySr6lXFH9uELBzvNDQnlu/MPCsoY8mjkGRH J+YhtreAdKg55C3UgU6fchr+EYznZ3Myyk/MbkNl0bTU1dOO+PHPWOzV4FM6aiCYfbyt ij1w== X-Gm-Message-State: AOAM533TjB9kB/nkqTwzHDhZYv3t5sK9lLbZwb52jqyvGmlunlrl+Aor dLhkWh9brNRjtXZfI8JrpCJJ X-Google-Smtp-Source: ABdhPJw5LSlu0ju1WyptJVUqPCDp3E+vXWICCBRXlUH6oKTubBRS+RfF2LpFzJEBJMPewe5KBLVzVw== X-Received: by 2002:a17:902:7610:b029:12b:f9f:727 with SMTP id k16-20020a1709027610b029012b0f9f0727mr2734371pll.65.1626166068835; Tue, 13 Jul 2021 01:47:48 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id i12sm9336715pjj.9.2021.07.13.01.47.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:48 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 07/17] virtio: Don't set FAILED status bit on device index allocation failure Date: Tue, 13 Jul 2021 16:46:46 +0800 Message-Id: <20210713084656.232-8-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We don't need to set FAILED status bit on device index allocation failure since the device initialization hasn't been started yet. Signed-off-by: Xie Yongji --- drivers/virtio/virtio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 4b15c00c0a0a..a15beb6b593b 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -338,7 +338,7 @@ int register_virtio_device(struct virtio_device *dev) /* Assign a unique device index and hence name. */ err = ida_simple_get(&virtio_index_ida, 0, 0, GFP_KERNEL); if (err < 0) - goto out; + return err; dev->index = err; dev_set_name(&dev->dev, "virtio%u", dev->index); From patchwork Tue Jul 13 08:46:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 671EDC11F66 for ; Tue, 13 Jul 2021 08:48:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4F611613A9 for ; Tue, 13 Jul 2021 08:48:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235108AbhGMIuu (ORCPT ); Tue, 13 Jul 2021 04:50:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235109AbhGMIum (ORCPT ); Tue, 13 Jul 2021 04:50:42 -0400 Received: from mail-pf1-x433.google.com (mail-pf1-x433.google.com [IPv6:2607:f8b0:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B39CC0613EF for ; Tue, 13 Jul 2021 01:47:53 -0700 (PDT) Received: by mail-pf1-x433.google.com with SMTP id 17so18944437pfz.4 for ; Tue, 13 Jul 2021 01:47:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2fP/MAmI7gUka5e/DKUQ2mjQlxJxe0/QRKPw5yBwyTk=; b=rpVEQUV1X70OmbrrQTJJKvk5vCPqC4uyyQYqM8mgOg9WDN/eKRObLhREIgC9AhlQtG eUZsoe1ZtDcx9fRxn5K5cTmakEDk81uPA+xScHZGg3Tqu81c80BChtmTqJDJYMUUT2FX I8vD0sFKqshhBkFve+LFV2d7iGakKJ6UO++oA6/QFcNuQGOsSjYmiPWVYrZl3U10Hqmm N6egKhnknsyBTg02r1Hz+54X/yG9BUJz2HuqTsK2D2UErp3uOrJUQlLV7QFYJqAnHhic z3Jy2j8PSswSLNBDJg5a+jUolEiDfEm0gqiykb9N4UFR//SYSeYkWWUQ83/UgrjRRwug POjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2fP/MAmI7gUka5e/DKUQ2mjQlxJxe0/QRKPw5yBwyTk=; b=DqKSzARAwUA3yBE4KOsHWqb9WgDlcyrM0vDJJ/LJxEe/Qc9KURVY4XeRsyXALVtsVs hdkQt4BV/xylf7PzwjPighb2o7JAkHnufKo4IQSXKohfgWxL526j1WxOYddASEveGzN5 E1cG866zogX8RuXufPo1e9+9hQxARtfAYcmdSKyqnm7WLBBbMJ/MGy0MpqW4GEnP2ap+ S9jon5jIfAW6IDusLB57TwiOYl1874khD0AVgMj05QPnhQnNXiBNh3Zj53d80a8PGdWk OM7ZUEtINGwiLorB/Jc5P62QpIwhP6AeszJ0cJGmqYv1GnQbFgjUs4hQlqtPyYKxrK4j HjZA== X-Gm-Message-State: AOAM532XidZrsQGM4ijEH97Ns/c/+0crxAvO/1D/GvEDf2qeKvvriQw3 YToySXtElt38Jf9SYUGiTADQ X-Google-Smtp-Source: ABdhPJyBkUB9QNAx3I8j5SvXv3BBPI0eQfjjVmRO0YZO2mJNlMqzj7TJp+Nb22Xnek5hLerTt0F38A== X-Received: by 2002:a65:5c89:: with SMTP id a9mr3285855pgt.207.1626166072648; Tue, 13 Jul 2021 01:47:52 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id e2sm21544494pgh.5.2021.07.13.01.47.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:52 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 08/17] virtio_config: Add a return value to reset function Date: Tue, 13 Jul 2021 16:46:47 +0800 Message-Id: <20210713084656.232-9-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This adds a return value to reset function so that we can handle the reset failure later. No functional changes. Signed-off-by: Xie Yongji --- arch/um/drivers/virtio_uml.c | 4 +++- drivers/platform/mellanox/mlxbf-tmfifo.c | 4 +++- drivers/remoteproc/remoteproc_virtio.c | 4 +++- drivers/s390/virtio/virtio_ccw.c | 6 ++++-- drivers/virtio/virtio_pci_legacy.c | 4 +++- drivers/virtio/virtio_pci_modern.c | 4 +++- drivers/virtio/virtio_vdpa.c | 4 +++- include/linux/virtio_config.h | 3 ++- 8 files changed, 24 insertions(+), 9 deletions(-) diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c index 4412d6febade..ca02deaf9b32 100644 --- a/arch/um/drivers/virtio_uml.c +++ b/arch/um/drivers/virtio_uml.c @@ -828,11 +828,13 @@ static void vu_set_status(struct virtio_device *vdev, u8 status) vu_dev->status = status; } -static void vu_reset(struct virtio_device *vdev) +static int vu_reset(struct virtio_device *vdev) { struct virtio_uml_device *vu_dev = to_virtio_uml_device(vdev); vu_dev->status = 0; + + return 0; } static void vu_del_vq(struct virtqueue *vq) diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c index 38800e86ed8a..e3c513c2d4fa 100644 --- a/drivers/platform/mellanox/mlxbf-tmfifo.c +++ b/drivers/platform/mellanox/mlxbf-tmfifo.c @@ -989,11 +989,13 @@ static void mlxbf_tmfifo_virtio_set_status(struct virtio_device *vdev, } /* Reset the device. Not much here for now. */ -static void mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev) +static int mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev) { struct mlxbf_tmfifo_vdev *tm_vdev = mlxbf_vdev_to_tmfifo(vdev); tm_vdev->status = 0; + + return 0; } /* Read the value of a configuration field. */ diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c index cf4d54e98e6a..975c845b3187 100644 --- a/drivers/remoteproc/remoteproc_virtio.c +++ b/drivers/remoteproc/remoteproc_virtio.c @@ -191,7 +191,7 @@ static void rproc_virtio_set_status(struct virtio_device *vdev, u8 status) dev_dbg(&vdev->dev, "status: %d\n", status); } -static void rproc_virtio_reset(struct virtio_device *vdev) +static int rproc_virtio_reset(struct virtio_device *vdev) { struct rproc_vdev *rvdev = vdev_to_rvdev(vdev); struct fw_rsc_vdev *rsc; @@ -200,6 +200,8 @@ static void rproc_virtio_reset(struct virtio_device *vdev) rsc->status = 0; dev_dbg(&vdev->dev, "reset !\n"); + + return 0; } /* provide the vdev features as retrieved from the firmware */ diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c index d35e7a3f7067..5221cdad531d 100644 --- a/drivers/s390/virtio/virtio_ccw.c +++ b/drivers/s390/virtio/virtio_ccw.c @@ -710,14 +710,14 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, return ret; } -static void virtio_ccw_reset(struct virtio_device *vdev) +static int virtio_ccw_reset(struct virtio_device *vdev) { struct virtio_ccw_device *vcdev = to_vc_device(vdev); struct ccw1 *ccw; ccw = ccw_device_dma_zalloc(vcdev->cdev, sizeof(*ccw)); if (!ccw) - return; + return -ENOMEM; /* Zero status bits. */ vcdev->dma_area->status = 0; @@ -729,6 +729,8 @@ static void virtio_ccw_reset(struct virtio_device *vdev) ccw->cda = 0; ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET); ccw_device_dma_free(vcdev->cdev, ccw, sizeof(*ccw)); + + return 0; } static u64 virtio_ccw_get_features(struct virtio_device *vdev) diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c index d62e9835aeec..0b5d95e3efa1 100644 --- a/drivers/virtio/virtio_pci_legacy.c +++ b/drivers/virtio/virtio_pci_legacy.c @@ -89,7 +89,7 @@ static void vp_set_status(struct virtio_device *vdev, u8 status) iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS); } -static void vp_reset(struct virtio_device *vdev) +static int vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); /* 0 status means a reset. */ @@ -99,6 +99,8 @@ static void vp_reset(struct virtio_device *vdev) ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS); /* Flush pending VQ/configuration callbacks. */ vp_synchronize_vectors(vdev); + + return 0; } static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector) diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index 30654d3a0b41..b0cde3b2f0ff 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -158,7 +158,7 @@ static void vp_set_status(struct virtio_device *vdev, u8 status) vp_modern_set_status(&vp_dev->mdev, status); } -static void vp_reset(struct virtio_device *vdev) +static int vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtio_pci_modern_device *mdev = &vp_dev->mdev; @@ -174,6 +174,8 @@ static void vp_reset(struct virtio_device *vdev) msleep(1); /* Flush pending VQ/configuration callbacks. */ vp_synchronize_vectors(vdev); + + return 0; } static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector) diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index ff43f9b62b2f..3e666f70e829 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -97,11 +97,13 @@ static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status) return ops->set_status(vdpa, status); } -static void virtio_vdpa_reset(struct virtio_device *vdev) +static int virtio_vdpa_reset(struct virtio_device *vdev) { struct vdpa_device *vdpa = vd_get_vdpa(vdev); vdpa_reset(vdpa); + + return 0; } static bool virtio_vdpa_notify(struct virtqueue *vq) diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 8519b3ae5d52..203407992c30 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -47,6 +47,7 @@ struct virtio_shm_region { * After this, status and feature negotiation must be done again * Device must not be reset from its vq/config callbacks, or in * parallel with being added/removed. + * Returns 0 on success or error status * @find_vqs: find virtqueues and instantiate them. * vdev: the virtio_device * nvqs: the number of virtqueues to find @@ -82,7 +83,7 @@ struct virtio_config_ops { u32 (*generation)(struct virtio_device *vdev); u8 (*get_status)(struct virtio_device *vdev); void (*set_status)(struct virtio_device *vdev, u8 status); - void (*reset)(struct virtio_device *vdev); + int (*reset)(struct virtio_device *vdev); int (*find_vqs)(struct virtio_device *, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], const char * const names[], const bool *ctx, From patchwork Tue Jul 13 08:46:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373185 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CA9FC07E95 for ; Tue, 13 Jul 2021 08:48:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8805613AB for ; Tue, 13 Jul 2021 08:48:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235167AbhGMIvB (ORCPT ); Tue, 13 Jul 2021 04:51:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235159AbhGMIut (ORCPT ); Tue, 13 Jul 2021 04:50:49 -0400 Received: from mail-pf1-x431.google.com (mail-pf1-x431.google.com [IPv6:2607:f8b0:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E43EC0613AB for ; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) Received: by mail-pf1-x431.google.com with SMTP id d12so18953495pfj.2 for ; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=79+lwfgqPqTMCDJYOtelsxZhwWsCxwv7dGGvPGe+0Sw=; b=XPYV6AqgUbZp6+vdppARBpAYCPqTsziLqNwvNsrjLuNKQsh6GS3q19GuTFueLqfI3W 2mpioxtWWYo5lHGzcp+Jutobs4GmHGUTQiIvyRJOrhRaSYEd2g13fYjJdx8IqQbpVtg+ U5M3F2NmfDR26UxFi92Yzg5HX/fYV6AYnZdE6d7ugOH+Y9vAV4e5UbOMWrmsG3AqiEzX NUPQt8B7lzQDca2Wt8iUkhnTLPaw8kJJmv71Rv4l+iNrRzjIvV51afp0ikEMovNJIa3w l3GoRQ485RyPX6RthpBauL0J2/vlNzokMcWZNI71JmewCwtjndkIJDGd4VS80mquizHp zreQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=79+lwfgqPqTMCDJYOtelsxZhwWsCxwv7dGGvPGe+0Sw=; b=VDZO9eWsXi1nhpAlQAFLMOJ7vzlDMEAe8ACmEveAEHp5HfUoLNJrmTSYiWF8refx1W ElPt/x5/Yzy0BLzI+ayXhn9MSECnjS4CYtnHp4ZsRcB1j1stbikszDVqRvb1dWDAC71q zlkYo3kfPY67Y1chO8kR4Db9bYpee1oXKlnChbbQoBsTjGxLcMkWhpMW2+SUOM00XX2o gvTCOeYFM1x+G9M8I4NdklWP9LuhCA5BVhotCiKeUBA+JEs7op2C3qlW5D5Xc0+ABKO6 bZHsCiI2d5IsxQhEvl1ir3DlYK+DcLRh6uW+q74sS+Y7m9I5EiHafprLRlvNqxKTALUq Fu8g== X-Gm-Message-State: AOAM532qYrkw+WQRpMvG6SojoUCdO+so+Vl8ASCfDZrfc/BRDkFiMbk9 Y+KbJzMNuqaSkjD7rkhKdsHu X-Google-Smtp-Source: ABdhPJya/3d83k8xmEEz7mxN/KfDFTYEWfRDpJqmURKX4t0jIlbAdO7JalzXWyaKhiHPR7hvAQh/zQ== X-Received: by 2002:aa7:808b:0:b029:2ef:cdd4:8297 with SMTP id v11-20020aa7808b0000b02902efcdd48297mr3574498pff.27.1626166076154; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id q21sm11607775pff.55.2021.07.13.01.47.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:55 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 09/17] virtio-vdpa: Handle the failure of vdpa_reset() Date: Tue, 13 Jul 2021 16:46:48 +0800 Message-Id: <20210713084656.232-10-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The vpda_reset() may fail now. This adds check to its return value and fail the virtio_vdpa_reset(). Signed-off-by: Xie Yongji --- drivers/virtio/virtio_vdpa.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index 3e666f70e829..ebbd8471bbee 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -101,9 +101,7 @@ static int virtio_vdpa_reset(struct virtio_device *vdev) { struct vdpa_device *vdpa = vd_get_vdpa(vdev); - vdpa_reset(vdpa); - - return 0; + return vdpa_reset(vdpa); } static bool virtio_vdpa_notify(struct virtqueue *vq) From patchwork Tue Jul 13 08:46:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373189 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C973C11F67 for ; Tue, 13 Jul 2021 08:48:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 571DA613B0 for ; Tue, 13 Jul 2021 08:48:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235276AbhGMIvM (ORCPT ); Tue, 13 Jul 2021 04:51:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235072AbhGMIvA (ORCPT ); Tue, 13 Jul 2021 04:51:00 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7B0BAC0613EF for ; Tue, 13 Jul 2021 01:48:00 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id o8so1307530plg.11 for ; Tue, 13 Jul 2021 01:48:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LsGekY4BvYvIO5Z8AgcJapYB7A/KQEKxuOgJFJnV+F0=; b=Et7FbWZVS2cx+fpTA8oSkjyi22zfg/JVBDFbigj20lktndc6w8VizCfnQbd8DLvnBV VNEhNSeBTNpJFAegw09Rg0mEfMyli4GMRdum0q4qn3U74w57SPrdejSMPmNtu9yMiPCF TiTyNrOHaIEsr8Bx2ZBmhMhJIuXrBF5pSlk5FJueUhyp4hfHn5krzzUtF6Q9Pl85RJaq SKkCpcUcgTDMKntqvaiRWItj7upm0Y9Xhp3HloIdnzr5/LU16LGuAsmTpAo3w7IY0Gn5 K4vArTZXMxMG7swExsxAVe8b+arAc/H+tSgN2OIWI2JOdcmaiAlQmmKNmF3cbmgNWuUL EKrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LsGekY4BvYvIO5Z8AgcJapYB7A/KQEKxuOgJFJnV+F0=; b=nEOtmVbqlcuheReTm7EGcyX3wZHHTEzqGNoIU3kMmqvIdKmdn4gagfYyKIoTD7xogp 8nhnpk2pdr2BzOV500TGDOXm7o8k1qzaPTbIMge+l3T1PkNbCk+NpzDWq7AZ4yXASel0 QnzgX5FSEFohMwQB/dtu/XEbqKUV6en4w4FpuOL7snPKF8DsOeokOYLmFgGbZYlI4KvT QHvC/nnZ5svOfXbmMhgjoVXJri6ZB/brOrMDcA4BJs4UCIaYQf6uI2E1WF34haQIKI0B 9DKQ0WJC0E6fRHzy2h0cd2A/s3J3rDKvxlWP8ZMLZH+ygS36zLEgkwxHuubC6FogcRlA +irg== X-Gm-Message-State: AOAM532Rlz38/x4IcrJN5lLQJ7u8Zy/iPwMG9qNu5DyV1TVtrvKFMUWs eJJt84IoZUI3mpffzThW+/UE X-Google-Smtp-Source: ABdhPJzFJWinFNSX5sjhEi7EUxUYGzqDtd7WH488DhV7Pgm2CsQABcNDQNUVm8LAg6MjG7Qz0MbUVw== X-Received: by 2002:a17:90a:5b07:: with SMTP id o7mr3372998pji.35.1626166080091; Tue, 13 Jul 2021 01:48:00 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id x10sm2437739pgj.73.2021.07.13.01.47.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:59 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 10/17] virtio: Handle device reset failure in register_virtio_device() Date: Tue, 13 Jul 2021 16:46:49 +0800 Message-Id: <20210713084656.232-11-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The device reset may fail in virtio-vdpa case now, so add checks to its return value and fail the register_virtio_device(). Signed-off-by: Xie Yongji --- drivers/virtio/virtio.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index a15beb6b593b..8df75425fb43 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -349,7 +349,9 @@ int register_virtio_device(struct virtio_device *dev) /* We always start by resetting the device, in case a previous * driver messed it up. This also tests that code path a little. */ - dev->config->reset(dev); + err = dev->config->reset(dev); + if (err) + goto err_reset; /* Acknowledge that we've seen the device. */ virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); @@ -362,10 +364,13 @@ int register_virtio_device(struct virtio_device *dev) */ err = device_add(&dev->dev); if (err) - ida_simple_remove(&virtio_index_ida, dev->index); -out: - if (err) - virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); + goto err_add; + + return 0; +err_add: + virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); +err_reset: + ida_simple_remove(&virtio_index_ida, dev->index); return err; } EXPORT_SYMBOL_GPL(register_virtio_device); From patchwork Tue Jul 13 08:46:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373187 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C616EC07E95 for ; Tue, 13 Jul 2021 08:48:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AD02B613AB for ; Tue, 13 Jul 2021 08:48:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235143AbhGMIvL (ORCPT ); Tue, 13 Jul 2021 04:51:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235231AbhGMIvA (ORCPT ); Tue, 13 Jul 2021 04:51:00 -0400 Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 284B8C0613B5 for ; Tue, 13 Jul 2021 01:48:04 -0700 (PDT) Received: by mail-pj1-x1029.google.com with SMTP id i16-20020a17090acf90b02901736d9d2218so1655767pju.1 for ; Tue, 13 Jul 2021 01:48:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=CGl/RjOOzzXlpDeA69Ku950YCkBGnktZEIYihkRI6vQ=; b=f2iY73K9/iJbrFo2MgGFRZQ2tnpbs3ImXGvgYWrH8jbRwZtYrG0H49iFKE/gsgwKLy wVtyRbuwPA0zRTedHVys3XUYskroarJowEQpt+ZqI5TnDw7Zp23rmxpefDYE2fUT31OK Yx8EzxFOTu/Y54tuXZUybHIRyzZazhpojXPhI62Q8vKEpNQgs633F50M2v6JbT1SWtju /ZtqBhhqTVZGT9Le/4u/OKCsHRigEKlFME1X9DTpvhSS8l1I7+BAFOhoRdXOT1yNydWd YxmuE2ov3/MNlpLC8cpCwId08XpgeT4hgIX6doOocMtwfbp46o2SIwhFi6wDRfTBj+Ar hnBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CGl/RjOOzzXlpDeA69Ku950YCkBGnktZEIYihkRI6vQ=; b=stX6qccSLjcZxVi/4jvbFniQYUzuUe7lUr2w0EOh1xMAjfzLMjsqoc1ueGbODSKVLz fcXx5ea+7Ira56aMb1wkvH34/HvMj0myNOuOwnJTwEk2HgMYmO16F+3esKMPxb69A1HR d8gQ+hG0bKwG4XVDslwqNLhLvqAXvmiebZ3GsxkDM/OXgtnzzj1Tl75e8/x9KRSuxVUa 4YSj1rp3FbL73TrKjb5dCa8L+cfIE2AsZsZ6uFw2LF2Uj0x92Tqy4BOTORwpotSYdQTm T7EbI+4Oup3zzsK25F0Z/tabsr6O7qmMA9c9fooVr/hBrZR+309EVTP/2fzgmUj647mG Dl+Q== X-Gm-Message-State: AOAM5318olqKpxbQi/Fcb05QOEEFqJphn0aFYEof6hqIB9viY96BvqOm 7cxjZL9dSztX8O6ZWqzdDVNP X-Google-Smtp-Source: ABdhPJxyTVjwlLv6HFUW/HmnbuC6yzJVkXeQ5coeCepXZBwWX13HH6cyjAijtoaEpSnWsq9zQv6N4Q== X-Received: by 2002:a17:903:4051:b029:12a:181c:9305 with SMTP id n17-20020a1709034051b029012a181c9305mr2713300pla.25.1626166083725; Tue, 13 Jul 2021 01:48:03 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id s15sm18640281pfw.207.2021.07.13.01.48.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:03 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 11/17] vhost-iotlb: Add an opaque pointer for vhost IOTLB Date: Tue, 13 Jul 2021 16:46:50 +0800 Message-Id: <20210713084656.232-12-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add an opaque pointer for vhost IOTLB. And introduce vhost_iotlb_add_range_ctx() to accept it. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vhost/iotlb.c | 20 ++++++++++++++++---- include/linux/vhost_iotlb.h | 3 +++ 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/vhost/iotlb.c b/drivers/vhost/iotlb.c index 0582079e4bcc..670d56c879e5 100644 --- a/drivers/vhost/iotlb.c +++ b/drivers/vhost/iotlb.c @@ -36,19 +36,21 @@ void vhost_iotlb_map_free(struct vhost_iotlb *iotlb, EXPORT_SYMBOL_GPL(vhost_iotlb_map_free); /** - * vhost_iotlb_add_range - add a new range to vhost IOTLB + * vhost_iotlb_add_range_ctx - add a new range to vhost IOTLB * @iotlb: the IOTLB * @start: start of the IOVA range * @last: last of IOVA range * @addr: the address that is mapped to @start * @perm: access permission of this range + * @opaque: the opaque pointer for the new mapping * * Returns an error last is smaller than start or memory allocation * fails */ -int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, - u64 start, u64 last, - u64 addr, unsigned int perm) +int vhost_iotlb_add_range_ctx(struct vhost_iotlb *iotlb, + u64 start, u64 last, + u64 addr, unsigned int perm, + void *opaque) { struct vhost_iotlb_map *map; @@ -71,6 +73,7 @@ int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, map->last = last; map->addr = addr; map->perm = perm; + map->opaque = opaque; iotlb->nmaps++; vhost_iotlb_itree_insert(map, &iotlb->root); @@ -80,6 +83,15 @@ int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, return 0; } +EXPORT_SYMBOL_GPL(vhost_iotlb_add_range_ctx); + +int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, + u64 start, u64 last, + u64 addr, unsigned int perm) +{ + return vhost_iotlb_add_range_ctx(iotlb, start, last, + addr, perm, NULL); +} EXPORT_SYMBOL_GPL(vhost_iotlb_add_range); /** diff --git a/include/linux/vhost_iotlb.h b/include/linux/vhost_iotlb.h index 6b09b786a762..2d0e2f52f938 100644 --- a/include/linux/vhost_iotlb.h +++ b/include/linux/vhost_iotlb.h @@ -17,6 +17,7 @@ struct vhost_iotlb_map { u32 perm; u32 flags_padding; u64 __subtree_last; + void *opaque; }; #define VHOST_IOTLB_FLAG_RETIRE 0x1 @@ -29,6 +30,8 @@ struct vhost_iotlb { unsigned int flags; }; +int vhost_iotlb_add_range_ctx(struct vhost_iotlb *iotlb, u64 start, u64 last, + u64 addr, unsigned int perm, void *opaque); int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, u64 start, u64 last, u64 addr, unsigned int perm); void vhost_iotlb_del_range(struct vhost_iotlb *iotlb, u64 start, u64 last); From patchwork Tue Jul 13 08:46:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373191 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6179C07E96 for ; Tue, 13 Jul 2021 08:48:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D441F613A9 for ; Tue, 13 Jul 2021 08:48:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235159AbhGMIvS (ORCPT ); Tue, 13 Jul 2021 04:51:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235158AbhGMIvB (ORCPT ); Tue, 13 Jul 2021 04:51:01 -0400 Received: from mail-pg1-x52b.google.com (mail-pg1-x52b.google.com [IPv6:2607:f8b0:4864:20::52b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B293C0613B8 for ; Tue, 13 Jul 2021 01:48:08 -0700 (PDT) Received: by mail-pg1-x52b.google.com with SMTP id s18so5708463pgq.3 for ; Tue, 13 Jul 2021 01:48:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tkBBc1IawD+Ffhgxr5n+Z15KbqmeeJxiazdLqHh1xrc=; b=yszZshd37Q1r8rl2w4Lwrj6YFf82n3z6RFM2/AVb64c2vPdKXY1wme59FfefIDnkgr ZeHW2AajsEfK2rxQLc6OuYZkgjbUBXJA+gcejIcjQVScRmmzU//ovNfFgpCztM6evjZb /n+WP+ydd3YlkZ8SQbNriexfCms41b+6n5JyrSnt1kqA40n300sAvayGKIsgDO660Dag xPMXs0Ybb8+DJM4LihYMuZRid7CateX2g6ffuDfJjjN9xcWY2pbOZU1RjT72UvyZtG7E Iu7G8pek5Vi+CTDYAPuzeMfjRbD3yyHp8cITHLj2FoMlmhTbDWiVkQDUq3fyefNyyPld e0XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tkBBc1IawD+Ffhgxr5n+Z15KbqmeeJxiazdLqHh1xrc=; b=h/mz2nEt1I4OLmk9oBZrIX0A3lm6N9csT0LMVnHxULhkS/nMYCJxNGBxTgr33mnvTj zuPGCxYYmIkJGrfgeb5qe5dTaXbUW50h/ftrE/vpXRpvdbuuCwIRRSiJArY93ye4R1mH xYSIEt3+YV6X9WxgtbTy3955xt2SQ1A15Uo6KovJE3ttcQNw5OsmCA+Ykudha2/BI16X WVp7EJxeq7sc88/NGf4Ef0T+Vy9GRM0Jn5rO4AME/+O+GxMvq7OPWlv0EPrdTzmXqz3O Ae4o2vTwACyZdcW9JnHcNnNoHscnHvuLqy1E3/f82HkUCnwVB0BvIKfsfGRKdaLzqYr/ KyBg== X-Gm-Message-State: AOAM532U7RUP3CasAtyNW1mqeAcfuNgzQWaO4Tuzzp/muu5RfzbMcAXX attPa0ppS7DlN34GjmTfVH9P X-Google-Smtp-Source: ABdhPJzZdDWqJmTNf93gwkTO/YfKFCx5ihMhJT+DOlHvR2RbC+0OzE+HpikVZnSzaRW2shba6Z0k9A== X-Received: by 2002:aa7:8704:0:b029:328:c7ca:fe33 with SMTP id b4-20020aa787040000b0290328c7cafe33mr3425253pfo.12.1626166087820; Tue, 13 Jul 2021 01:48:07 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id bf18sm4475167pjb.46.2021.07.13.01.48.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:07 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 12/17] vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() Date: Tue, 13 Jul 2021 16:46:51 +0800 Message-Id: <20210713084656.232-13-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add an opaque pointer for DMA mapping. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/vdpa_sim/vdpa_sim.c | 6 +++--- drivers/vhost/vdpa.c | 2 +- include/linux/vdpa.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index 72167963421b..f456f4baf86d 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -542,14 +542,14 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, } static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size, - u64 pa, u32 perm) + u64 pa, u32 perm, void *opaque) { struct vdpasim *vdpasim = vdpa_to_sim(vdpa); int ret; spin_lock(&vdpasim->iommu_lock); - ret = vhost_iotlb_add_range(vdpasim->iommu, iova, iova + size - 1, pa, - perm); + ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1, + pa, perm, opaque); spin_unlock(&vdpasim->iommu_lock); return ret; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 8615756306ec..f60a513dac7c 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -578,7 +578,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, return r; if (ops->dma_map) { - r = ops->dma_map(vdpa, iova, size, pa, perm); + r = ops->dma_map(vdpa, iova, size, pa, perm, NULL); } else if (ops->set_map) { if (!v->in_batch) r = ops->set_map(vdpa, dev->iotlb); diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 198c30e84b5d..4903e67c690b 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -268,7 +268,7 @@ struct vdpa_config_ops { /* DMA ops */ int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb); int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size, - u64 pa, u32 perm); + u64 pa, u32 perm, void *opaque); int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size); /* Free device resources */ From patchwork Tue Jul 13 08:46:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373193 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 895CEC11F67 for ; Tue, 13 Jul 2021 08:48:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7064C613A9 for ; Tue, 13 Jul 2021 08:48:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235149AbhGMIv1 (ORCPT ); Tue, 13 Jul 2021 04:51:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235139AbhGMIvL (ORCPT ); Tue, 13 Jul 2021 04:51:11 -0400 Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D822FC0613A7 for ; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) Received: by mail-pf1-x42f.google.com with SMTP id j199so18933365pfd.7 for ; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QQuejJgNoEGyjo9tRdqGEQx2lQUe8kg/2ppmDnsJJAA=; b=unjSdw/881HgzjyZtUmmNHKre9MjzPnNfw0dZmJe0wi8SfaN2BXtbXsbdgycORb6qy CDTL+Qr9i5T3/weikPBzLhR+DSS6XfPrFZ0iQ/BSm6Rk7qiH5M7kSgYp2MJleNwlyfol vZCl9PV+5+N9XKukRMjOBXgYIcnk22iQiPRO2/DTv6okYz6rw+zj7W7lSU6+476LJVcn 2BPhqvP2u7lIK+Qdy/jXOonjh+hOdl+q/9CQqHqZHN7RjQFYmwWcjl0kk12fG2fS4C/Z Yzb0lI7wdxWXuWtJKO5acsPEWr0+bSE1X1DfnvQd4xh+Mv/SMXS70DwhUJnZZqsTODMW LhnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QQuejJgNoEGyjo9tRdqGEQx2lQUe8kg/2ppmDnsJJAA=; b=F1sMdo09cfXCknGWbtQS+uqq+1bJw4O71wkhXxSBzMl8FqaZW6qp3Hnef4Oy6DzIZU AOfAXlJN/4FWCBO+5Y15VLslMMZTyjGyBmawlkkKHoQggFBgnN9/a4hQv54R5p399zJK m5AivRmiFFNEX75xXOj4dI3QrNA/iLfv1mpQYVOhf+n0xLCktuvviHKNYm4F9FDIHT8W 7Wp3SwJwyLh476dGqQBt7gj3//gmeGX4jtjNxpEQThkMJEtKszB9Bc1DLJ2NCJ5oHZ2F 4Y+flKLYida44iKqvdZ1+8TB75MN26IV9v6Y2Z+fDucJ5mOyorDHEs7VUCT3XS+qJIpT 0HdA== X-Gm-Message-State: AOAM532FFguH9/56xzcFE/ZXDR+HvDketECGeDCI0936RyD8FUrIWNkq NK7V8zrdTikxjbOnnqtBL8Wo X-Google-Smtp-Source: ABdhPJz3XYFBOb+mfR3r9wlTER8M1X/HEO0osRKjG8afubA8hE7WiCvA7RRSBu5qwBv/Xtfq2SZxsg== X-Received: by 2002:a63:8b41:: with SMTP id j62mr3285495pge.435.1626166091443; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id 2sm21343999pgz.26.2021.07.13.01.48.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:11 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 13/17] vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() Date: Tue, 13 Jul 2021 16:46:52 +0800 Message-Id: <20210713084656.232-14-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The upcoming patch is going to support VA mapping/unmapping. So let's factor out the logic of PA mapping/unmapping firstly to make the code more readable. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vhost/vdpa.c | 53 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index f60a513dac7c..3e260038beba 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -511,7 +511,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, return r; } -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) { struct vhost_dev *dev = &v->vdev; struct vhost_iotlb *iotlb = dev->iotlb; @@ -533,6 +533,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + return vhost_vdpa_pa_unmap(v, start, last); +} + static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v) { struct vhost_dev *dev = &v->vdev; @@ -613,37 +618,28 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } -static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, - struct vhost_iotlb_msg *msg) +static int vhost_vdpa_pa_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) { struct vhost_dev *dev = &v->vdev; - struct vhost_iotlb *iotlb = dev->iotlb; struct page **page_list; unsigned long list_size = PAGE_SIZE / sizeof(struct page *); unsigned int gup_flags = FOLL_LONGTERM; unsigned long npages, cur_base, map_pfn, last_pfn = 0; unsigned long lock_limit, sz2pin, nchunks, i; - u64 iova = msg->iova; + u64 start = iova; long pinned; int ret = 0; - if (msg->iova < v->range.first || - msg->iova + msg->size - 1 > v->range.last) - return -EINVAL; - - if (vhost_iotlb_itree_first(iotlb, msg->iova, - msg->iova + msg->size - 1)) - return -EEXIST; - /* Limit the use of memory for bookkeeping */ page_list = (struct page **) __get_free_page(GFP_KERNEL); if (!page_list) return -ENOMEM; - if (msg->perm & VHOST_ACCESS_WO) + if (perm & VHOST_ACCESS_WO) gup_flags |= FOLL_WRITE; - npages = PAGE_ALIGN(msg->size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; + npages = PAGE_ALIGN(size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; if (!npages) { ret = -EINVAL; goto free; @@ -657,7 +653,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, goto unlock; } - cur_base = msg->uaddr & PAGE_MASK; + cur_base = uaddr & PAGE_MASK; iova &= PAGE_MASK; nchunks = 0; @@ -688,7 +684,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - msg->perm); + perm); if (ret) { /* * Unpin the pages that are left unmapped @@ -717,7 +713,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, msg->perm); + map_pfn << PAGE_SHIFT, perm); out: if (ret) { if (nchunks) { @@ -736,13 +732,32 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, for (pfn = map_pfn; pfn <= last_pfn; pfn++) unpin_user_page(pfn_to_page(pfn)); } - vhost_vdpa_unmap(v, msg->iova, msg->size); + vhost_vdpa_unmap(v, start, size); } unlock: mmap_read_unlock(dev->mm); free: free_page((unsigned long)page_list); return ret; + +} + +static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, + struct vhost_iotlb_msg *msg) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + + if (msg->iova < v->range.first || + msg->iova + msg->size - 1 > v->range.last) + return -EINVAL; + + if (vhost_iotlb_itree_first(iotlb, msg->iova, + msg->iova + msg->size - 1)) + return -EEXIST; + + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, + msg->perm); } static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, From patchwork Tue Jul 13 08:46:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373195 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15C4DC07E95 for ; Tue, 13 Jul 2021 08:48:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F3D7C6101D for ; Tue, 13 Jul 2021 08:48:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234690AbhGMIve (ORCPT ); Tue, 13 Jul 2021 04:51:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50300 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235269AbhGMIvM (ORCPT ); Tue, 13 Jul 2021 04:51:12 -0400 Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8998DC0613AD for ; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) Received: by mail-pj1-x1036.google.com with SMTP id bt15so6633550pjb.2 for ; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=y4ufJe1pqoQtqMP3ixyl76pbDy1hGYJP4XFdOY1YVyY=; b=WQuU/gG7ZfTNI2qZpsTs3nFbw3nGQz58SeTex5OzuWpcg0kawm2pwWU3SS9wmThtD7 8SuQKUzHbZdL+eLavT5+YRBAi0zmAoF3t2E3SoNDTO3c4D0gyq5q5vOJ64II6vfAd4aT 71QQPF9/laUlyzOfQoVzNja3HoTyWs8WAGyHW2tFN9KbReARF+Jsge6Z3XnKKTf4ks8U utVVPAsVmo7WGuvvgX6H5ZATzkWQ/yddy7LlXwLGRWxnXXGAB0PPn7iJMOENuO0eZ444 qQrj04dP1ATna4I3q9+iU/DxS0eJuKeELHTy5OsXbFmUU0q1Ngqw9zUHXdbVI6/h072e BWMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=y4ufJe1pqoQtqMP3ixyl76pbDy1hGYJP4XFdOY1YVyY=; b=YIYdal8ZHZl6adFs54RcNad1YkGp0caDF3psbpkoQgdG8vx9y3xO57WqWiRB7kT3U4 s6tBdXobkso5GMpgZIY1be6p5spiCXZt4ujmjaB4Bxiu8L43nhDa99RgLQqlA/ykd8MQ Qgyvr329UqhQsBzC8/f/pFzz7bYoxZ92diyZaMZSBPQ3JW6i3vD/t0fRQjtQuKLONzr9 NLvykT2WYGs22qg65NLA692Xj9tbcd3+UXCqCZbZBLNioOXefCwtVt1TDIymLGE0ZFHH wLOEJyqhg9tj0aQfIMrW0yO8Eedm6ACjJ1IXr1T5cJ4PZi0KQZ3xrWSelKVIDVae8UTh 1ajQ== X-Gm-Message-State: AOAM530w3wTqoZaF9DhY08YQR4dLGDQ/BB0ZCuJrDxtJZ/LVO8la5n+D ubHbvxtlSLXpnSSGSui0Misr X-Google-Smtp-Source: ABdhPJwqa943zq1Ntq/jM0YGq1xzB0bh9LWs6ajK6UMzkrpe/h3ODorWGfeeWAcPP8rS7//MZIB7gQ== X-Received: by 2002:a17:902:b188:b029:11b:1549:da31 with SMTP id s8-20020a170902b188b029011b1549da31mr2646108plr.7.1626166095039; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id y6sm20418758pgk.79.2021.07.13.01.48.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:14 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 14/17] vdpa: Support transferring virtual addressing during DMA mapping Date: Tue, 13 Jul 2021 16:46:53 +0800 Message-Id: <20210713084656.232-15-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch introduces an attribute for vDPA device to indicate whether virtual address can be used. If vDPA device driver set it, vhost-vdpa bus driver will not pin user page and transfer userspace virtual address instead of physical address during DMA mapping. And corresponding vma->vm_file and offset will be also passed as an opaque pointer. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 2 +- drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +- drivers/vdpa/vdpa.c | 9 +++- drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +- drivers/vdpa/virtio_pci/vp_vdpa.c | 2 +- drivers/vhost/vdpa.c | 99 ++++++++++++++++++++++++++++++++++----- include/linux/vdpa.h | 19 ++++++-- 7 files changed, 116 insertions(+), 19 deletions(-) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index b4b89a66607f..fc848aafe5a9 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -492,7 +492,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) } adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa, - dev, &ifc_vdpa_ops, NULL); + dev, &ifc_vdpa_ops, NULL, false); if (adapter == NULL) { IFCVF_ERR(pdev, "Failed to allocate vDPA structure"); return -ENOMEM; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 683452c41a36..79cb2f130bf0 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -2035,7 +2035,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name) max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS); ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops, - name); + name, false); if (IS_ERR(ndev)) return PTR_ERR(ndev); diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index d77d59811389..41377df674d5 100644 --- a/drivers/vdpa/vdpa.c +++ b/drivers/vdpa/vdpa.c @@ -71,6 +71,7 @@ static void vdpa_release_dev(struct device *d) * @config: the bus operations that is supported by this device * @size: size of the parent structure that contains private data * @name: name of the vdpa device; optional. + * @use_va: indicate whether virtual address must be used by this device * * Driver should use vdpa_alloc_device() wrapper macro instead of * using this directly. @@ -80,7 +81,8 @@ static void vdpa_release_dev(struct device *d) */ struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name) + size_t size, const char *name, + bool use_va) { struct vdpa_device *vdev; int err = -EINVAL; @@ -91,6 +93,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, if (!!config->dma_map != !!config->dma_unmap) goto err; + /* It should only work for the device that use on-chip IOMMU */ + if (use_va && !(config->dma_map || config->set_map)) + goto err; + err = -ENOMEM; vdev = kzalloc(size, GFP_KERNEL); if (!vdev) @@ -106,6 +112,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, vdev->index = err; vdev->config = config; vdev->features_valid = false; + vdev->use_va = use_va; if (name) err = dev_set_name(&vdev->dev, "%s", name); diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index f456f4baf86d..472d573d755c 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -250,7 +250,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr) ops = &vdpasim_config_ops; vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, - dev_attr->name); + dev_attr->name, false); if (!vdpasim) goto err_alloc; diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c index ee723c645aec..c828b9f9d92a 100644 --- a/drivers/vdpa/virtio_pci/vp_vdpa.c +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c @@ -435,7 +435,7 @@ static int vp_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id) return ret; vp_vdpa = vdpa_alloc_device(struct vp_vdpa, vdpa, - dev, &vp_vdpa_ops, NULL); + dev, &vp_vdpa_ops, NULL, false); if (vp_vdpa == NULL) { dev_err(dev, "vp_vdpa: Failed to allocate vDPA structure\n"); return -ENOMEM; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 3e260038beba..0db701fa935f 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -533,8 +533,28 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + + while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) { + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + vhost_iotlb_map_free(iotlb, map); + } +} + static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) { + struct vdpa_device *vdpa = v->vdpa; + + if (vdpa->use_va) + return vhost_vdpa_va_unmap(v, start, last); + return vhost_vdpa_pa_unmap(v, start, last); } @@ -569,21 +589,21 @@ static int perm_to_iommu_flags(u32 perm) return flags | IOMMU_CACHE; } -static int vhost_vdpa_map(struct vhost_vdpa *v, - u64 iova, u64 size, u64 pa, u32 perm) +static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova, + u64 size, u64 pa, u32 perm, void *opaque) { struct vhost_dev *dev = &v->vdev; struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; int r = 0; - r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1, - pa, perm); + r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1, + pa, perm, opaque); if (r) return r; if (ops->dma_map) { - r = ops->dma_map(vdpa, iova, size, pa, perm, NULL); + r = ops->dma_map(vdpa, iova, size, pa, perm, opaque); } else if (ops->set_map) { if (!v->in_batch) r = ops->set_map(vdpa, dev->iotlb); @@ -591,13 +611,15 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, r = iommu_map(v->domain, iova, pa, size, perm_to_iommu_flags(perm)); } - - if (r) + if (r) { vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1); - else + return r; + } + + if (!vdpa->use_va) atomic64_add(size >> PAGE_SHIFT, &dev->mm->pinned_vm); - return r; + return 0; } static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) @@ -618,6 +640,56 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } +static int vhost_vdpa_va_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) +{ + struct vhost_dev *dev = &v->vdev; + u64 offset, map_size, map_iova = iova; + struct vdpa_map_file *map_file; + struct vm_area_struct *vma; + int ret; + + mmap_read_lock(dev->mm); + + while (size) { + vma = find_vma(dev->mm, uaddr); + if (!vma) { + ret = -EINVAL; + break; + } + map_size = min(size, vma->vm_end - uaddr); + if (!(vma->vm_file && (vma->vm_flags & VM_SHARED) && + !(vma->vm_flags & (VM_IO | VM_PFNMAP)))) + goto next; + + map_file = kzalloc(sizeof(*map_file), GFP_KERNEL); + if (!map_file) { + ret = -ENOMEM; + break; + } + offset = (vma->vm_pgoff << PAGE_SHIFT) + uaddr - vma->vm_start; + map_file->offset = offset; + map_file->file = get_file(vma->vm_file); + ret = vhost_vdpa_map(v, map_iova, map_size, uaddr, + perm, map_file); + if (ret) { + fput(map_file->file); + kfree(map_file); + break; + } +next: + size -= map_size; + uaddr += map_size; + map_iova += map_size; + } + if (ret) + vhost_vdpa_unmap(v, iova, map_iova - iova); + + mmap_read_unlock(dev->mm); + + return ret; +} + static int vhost_vdpa_pa_map(struct vhost_vdpa *v, u64 iova, u64 size, u64 uaddr, u32 perm) { @@ -684,7 +756,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - perm); + perm, NULL); if (ret) { /* * Unpin the pages that are left unmapped @@ -713,7 +785,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, perm); + map_pfn << PAGE_SHIFT, perm, NULL); out: if (ret) { if (nchunks) { @@ -746,6 +818,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, struct vhost_iotlb_msg *msg) { struct vhost_dev *dev = &v->vdev; + struct vdpa_device *vdpa = v->vdpa; struct vhost_iotlb *iotlb = dev->iotlb; if (msg->iova < v->range.first || @@ -756,6 +829,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, msg->iova + msg->size - 1)) return -EEXIST; + if (vdpa->use_va) + return vhost_vdpa_va_map(v, msg->iova, msg->size, + msg->uaddr, msg->perm); + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, msg->perm); } diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 4903e67c690b..b605ce09a5bd 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -66,6 +66,7 @@ struct vdpa_mgmt_dev; * @config: the configuration ops for this device. * @index: device index * @features_valid: were features initialized? for legacy guests + * @use_va: indicate whether virtual address must be used by this device * @nvqs: maximum number of supported virtqueues * @mdev: management device pointer; caller must setup when registering device as part * of dev_add() mgmtdev ops callback before invoking _vdpa_register_device(). @@ -76,6 +77,7 @@ struct vdpa_device { const struct vdpa_config_ops *config; unsigned int index; bool features_valid; + bool use_va; int nvqs; struct vdpa_mgmt_dev *mdev; }; @@ -91,6 +93,16 @@ struct vdpa_iova_range { }; /** + * Corresponding file area for device memory mapping + * @file: vma->vm_file for the mapping + * @offset: mapping offset in the vm_file + */ +struct vdpa_map_file { + struct file *file; + u64 offset; +}; + +/** * struct vdpa_config_ops - operations for configuring a vDPA device. * Note: vDPA device drivers are required to implement all of the * operations unless it is mentioned to be optional in the following @@ -277,14 +289,15 @@ struct vdpa_config_ops { struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name); + size_t size, const char *name, + bool use_va); -#define vdpa_alloc_device(dev_struct, member, parent, config, name) \ +#define vdpa_alloc_device(dev_struct, member, parent, config, name, use_va) \ container_of(__vdpa_alloc_device( \ parent, config, \ sizeof(dev_struct) + \ BUILD_BUG_ON_ZERO(offsetof( \ - dev_struct, member)), name), \ + dev_struct, member)), name, use_va), \ dev_struct, member) int vdpa_register_device(struct vdpa_device *vdev, int nvqs); From patchwork Tue Jul 13 08:46:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373197 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4608CC11F68 for ; Tue, 13 Jul 2021 08:48:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 30E04613AF for ; Tue, 13 Jul 2021 08:48:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235427AbhGMIvp (ORCPT ); Tue, 13 Jul 2021 04:51:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235125AbhGMIv2 (ORCPT ); Tue, 13 Jul 2021 04:51:28 -0400 Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 587BFC0611C4 for ; Tue, 13 Jul 2021 01:48:19 -0700 (PDT) Received: by mail-pl1-x635.google.com with SMTP id z2so7835555plg.8 for ; Tue, 13 Jul 2021 01:48:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DmuibZSASz5HW9QiqnoPNYlf/iv8ZNP8HE9857Kg6Fw=; b=XFYPiitRN4+RByOxDim7QxWXrOIW6wy7raapT+osXdqpI/uA8WXCu8aCkb7Nzl7iTT crxA725BDpJ0Rtee7BnKrbt0YYnKlWAojWmIfXnMHK5Uh1rnBYXxzHVrJNqTsQN1rVfb zOxP2taHjLAMH4PdTQ1bVuovD2u4yLDvQ129fLwEMR/o+GbK3rkTgP2JZxR71aY3E8H4 W6nXzSjPl3BsHG/R9uVgU1+0uT9Ash4iexZ184aX7R4puclZz1DUrpVrwaA4qF8QIOU5 mkMLHDM1zsxSpydA+xKM68QqnZuTWd1E4o3us+1929DbtCpx04r8nZNZ80bgM8oe1hk9 o6Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DmuibZSASz5HW9QiqnoPNYlf/iv8ZNP8HE9857Kg6Fw=; b=AQhQ3aliej1NrvC5vtPO9o+A2SAByVxPmH2sAy22J0DrlwO4evWaFome8JnMdR1ray 1eL0V0c+Gf1hqhv4S2nXh5mx+vDUdylAikh9o6QHzBc61cziW5WnYlCmOubGBKE/PRyY 8wib5fn1Af8AF6sc9CgF17PMJvCpL0rKZjZtRgjG+25O0+asTFoB7tR+eJvBUn3sr/eu x1obhbcd9u5nfZNE/SuMi3DqEPPfur14cf7nuChKEz/QfmrqlgjBl65p8k4D4XBX+Pe8 /vb97Ist/j6vGNKeBQdjC1Kr+YaF/j7L3rQ2V+IuTPIFNpXtmLMR0D3QFAE22XLGf9sK numA== X-Gm-Message-State: AOAM532FVlpDaevKmbcJGrkj69LmPL3nHh8q1lRsPS3nSuwn3JbTetyO R4ra0p1f69sigQtpLg4jHPga X-Google-Smtp-Source: ABdhPJzMFFL+LqmYw0EqE9kKoTKOfG4JqdLi0VjPIskKPn746N91NPZc3/W+s7UTqY4A8Q2BsvNCMQ== X-Received: by 2002:a17:90a:4bce:: with SMTP id u14mr3307609pjl.188.1626166098789; Tue, 13 Jul 2021 01:48:18 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id s15sm19293283pfu.97.2021.07.13.01.48.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:18 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 15/17] vduse: Implement an MMU-based IOMMU driver Date: Tue, 13 Jul 2021 16:46:54 +0800 Message-Id: <20210713084656.232-16-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This implements an MMU-based IOMMU driver to support mapping kernel dma buffer into userspace. The basic idea behind it is treating MMU (VA->PA) as IOMMU (IOVA->PA). The driver will set up MMU mapping instead of IOMMU mapping for the DMA transfer so that the userspace process is able to use its virtual address to access the dma buffer in kernel. And to avoid security issue, a bounce-buffering mechanism is introduced to prevent userspace accessing the original buffer directly. Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/vdpa_user/iova_domain.c | 545 +++++++++++++++++++++++++++++++++++ drivers/vdpa/vdpa_user/iova_domain.h | 73 +++++ 2 files changed, 618 insertions(+) create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c new file mode 100644 index 000000000000..ad45026f5423 --- /dev/null +++ b/drivers/vdpa/vdpa_user/iova_domain.c @@ -0,0 +1,545 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * MMU-based IOMMU implementation + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. + * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +static int vduse_iotlb_add_range(struct vduse_iova_domain *domain, + u64 start, u64 last, + u64 addr, unsigned int perm, + struct file *file, u64 offset) +{ + struct vdpa_map_file *map_file; + int ret; + + map_file = kmalloc(sizeof(*map_file), GFP_ATOMIC); + if (!map_file) + return -ENOMEM; + + map_file->file = get_file(file); + map_file->offset = offset; + + ret = vhost_iotlb_add_range_ctx(domain->iotlb, start, last, + addr, perm, map_file); + if (ret) { + fput(map_file->file); + kfree(map_file); + return ret; + } + return 0; +} + +static void vduse_iotlb_del_range(struct vduse_iova_domain *domain, + u64 start, u64 last) +{ + struct vdpa_map_file *map_file; + struct vhost_iotlb_map *map; + + while ((map = vhost_iotlb_itree_first(domain->iotlb, start, last))) { + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + vhost_iotlb_map_free(domain->iotlb, map); + } +} + +int vduse_domain_set_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb) +{ + struct vdpa_map_file *map_file; + struct vhost_iotlb_map *map; + u64 start = 0ULL, last = ULLONG_MAX; + int ret; + + spin_lock(&domain->iotlb_lock); + vduse_iotlb_del_range(domain, start, last); + + for (map = vhost_iotlb_itree_first(iotlb, start, last); map; + map = vhost_iotlb_itree_next(map, start, last)) { + map_file = (struct vdpa_map_file *)map->opaque; + ret = vduse_iotlb_add_range(domain, map->start, map->last, + map->addr, map->perm, + map_file->file, + map_file->offset); + if (ret) + goto err; + } + spin_unlock(&domain->iotlb_lock); + + return 0; +err: + vduse_iotlb_del_range(domain, start, last); + spin_unlock(&domain->iotlb_lock); + return ret; +} + +void vduse_domain_clear_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb) +{ + struct vhost_iotlb_map *map; + u64 start = 0ULL, last = ULLONG_MAX; + + spin_lock(&domain->iotlb_lock); + for (map = vhost_iotlb_itree_first(iotlb, start, last); map; + map = vhost_iotlb_itree_next(map, start, last)) { + vduse_iotlb_del_range(domain, map->start, map->last); + } + spin_unlock(&domain->iotlb_lock); +} + +static int vduse_domain_map_bounce_page(struct vduse_iova_domain *domain, + u64 iova, u64 size, u64 paddr) +{ + struct vduse_bounce_map *map; + u64 last = iova + size - 1; + + while (iova <= last) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + if (!map->bounce_page) { + map->bounce_page = alloc_page(GFP_ATOMIC); + if (!map->bounce_page) + return -ENOMEM; + } + map->orig_phys = paddr; + paddr += PAGE_SIZE; + iova += PAGE_SIZE; + } + return 0; +} + +static void vduse_domain_unmap_bounce_page(struct vduse_iova_domain *domain, + u64 iova, u64 size) +{ + struct vduse_bounce_map *map; + u64 last = iova + size - 1; + + while (iova <= last) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + map->orig_phys = INVALID_PHYS_ADDR; + iova += PAGE_SIZE; + } +} + +static void do_bounce(phys_addr_t orig, void *addr, size_t size, + enum dma_data_direction dir) +{ + unsigned long pfn = PFN_DOWN(orig); + unsigned int offset = offset_in_page(orig); + char *buffer; + unsigned int sz = 0; + + while (size) { + sz = min_t(size_t, PAGE_SIZE - offset, size); + + buffer = kmap_atomic(pfn_to_page(pfn)); + if (dir == DMA_TO_DEVICE) + memcpy(addr, buffer + offset, sz); + else + memcpy(buffer + offset, addr, sz); + kunmap_atomic(buffer); + + size -= sz; + pfn++; + addr += sz; + offset = 0; + } +} + +static void vduse_domain_bounce(struct vduse_iova_domain *domain, + dma_addr_t iova, size_t size, + enum dma_data_direction dir) +{ + struct vduse_bounce_map *map; + unsigned int offset; + void *addr; + size_t sz; + + if (iova >= domain->bounce_size) + return; + + while (size) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + offset = offset_in_page(iova); + sz = min_t(size_t, PAGE_SIZE - offset, size); + + if (WARN_ON(!map->bounce_page || + map->orig_phys == INVALID_PHYS_ADDR)) + return; + + addr = page_address(map->bounce_page) + offset; + do_bounce(map->orig_phys + offset, addr, sz, dir); + size -= sz; + iova += sz; + } +} + +static struct page * +vduse_domain_get_coherent_page(struct vduse_iova_domain *domain, u64 iova) +{ + u64 start = iova & PAGE_MASK; + u64 last = start + PAGE_SIZE - 1; + struct vhost_iotlb_map *map; + struct page *page = NULL; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, start, last); + if (!map) + goto out; + + page = pfn_to_page((map->addr + iova - map->start) >> PAGE_SHIFT); + get_page(page); +out: + spin_unlock(&domain->iotlb_lock); + + return page; +} + +static struct page * +vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova) +{ + struct vduse_bounce_map *map; + struct page *page = NULL; + + spin_lock(&domain->iotlb_lock); + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + if (!map->bounce_page) + goto out; + + page = map->bounce_page; + get_page(page); +out: + spin_unlock(&domain->iotlb_lock); + + return page; +} + +static void +vduse_domain_free_bounce_pages(struct vduse_iova_domain *domain) +{ + struct vduse_bounce_map *map; + unsigned long pfn, bounce_pfns; + + bounce_pfns = domain->bounce_size >> PAGE_SHIFT; + + for (pfn = 0; pfn < bounce_pfns; pfn++) { + map = &domain->bounce_maps[pfn]; + if (WARN_ON(map->orig_phys != INVALID_PHYS_ADDR)) + continue; + + if (!map->bounce_page) + continue; + + __free_page(map->bounce_page); + map->bounce_page = NULL; + } +} + +void vduse_domain_reset_bounce_map(struct vduse_iova_domain *domain) +{ + if (!domain->bounce_map) + return; + + spin_lock(&domain->iotlb_lock); + if (!domain->bounce_map) + goto unlock; + + vduse_iotlb_del_range(domain, 0, domain->bounce_size - 1); + domain->bounce_map = 0; +unlock: + spin_unlock(&domain->iotlb_lock); +} + +static int vduse_domain_init_bounce_map(struct vduse_iova_domain *domain) +{ + int ret = 0; + + if (domain->bounce_map) + return 0; + + spin_lock(&domain->iotlb_lock); + if (domain->bounce_map) + goto unlock; + + ret = vduse_iotlb_add_range(domain, 0, domain->bounce_size - 1, + 0, VHOST_MAP_RW, domain->file, 0); + if (ret) + goto unlock; + + domain->bounce_map = 1; +unlock: + spin_unlock(&domain->iotlb_lock); + return ret; +} + +static dma_addr_t +vduse_domain_alloc_iova(struct iova_domain *iovad, + unsigned long size, unsigned long limit) +{ + unsigned long shift = iova_shift(iovad); + unsigned long iova_len = iova_align(iovad, size) >> shift; + unsigned long iova_pfn; + + /* + * Freeing non-power-of-two-sized allocations back into the IOVA caches + * will come back to bite us badly, so we have to waste a bit of space + * rounding up anything cacheable to make sure that can't happen. The + * order of the unadjusted size will still match upon freeing. + */ + if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) + iova_len = roundup_pow_of_two(iova_len); + iova_pfn = alloc_iova_fast(iovad, iova_len, limit >> shift, true); + + return iova_pfn << shift; +} + +static void vduse_domain_free_iova(struct iova_domain *iovad, + dma_addr_t iova, size_t size) +{ + unsigned long shift = iova_shift(iovad); + unsigned long iova_len = iova_align(iovad, size) >> shift; + + free_iova_fast(iovad, iova >> shift, iova_len); +} + +dma_addr_t vduse_domain_map_page(struct vduse_iova_domain *domain, + struct page *page, unsigned long offset, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct iova_domain *iovad = &domain->stream_iovad; + unsigned long limit = domain->bounce_size - 1; + phys_addr_t pa = page_to_phys(page) + offset; + dma_addr_t iova = vduse_domain_alloc_iova(iovad, size, limit); + + if (!iova) + return DMA_MAPPING_ERROR; + + if (vduse_domain_init_bounce_map(domain)) + goto err; + + if (vduse_domain_map_bounce_page(domain, (u64)iova, (u64)size, pa)) + goto err; + + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) + vduse_domain_bounce(domain, iova, size, DMA_TO_DEVICE); + + return iova; +err: + vduse_domain_free_iova(iovad, iova, size); + return DMA_MAPPING_ERROR; +} + +void vduse_domain_unmap_page(struct vduse_iova_domain *domain, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, unsigned long attrs) +{ + struct iova_domain *iovad = &domain->stream_iovad; + + if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) + vduse_domain_bounce(domain, dma_addr, size, DMA_FROM_DEVICE); + + vduse_domain_unmap_bounce_page(domain, (u64)dma_addr, (u64)size); + vduse_domain_free_iova(iovad, dma_addr, size); +} + +void *vduse_domain_alloc_coherent(struct vduse_iova_domain *domain, + size_t size, dma_addr_t *dma_addr, + gfp_t flag, unsigned long attrs) +{ + struct iova_domain *iovad = &domain->consistent_iovad; + unsigned long limit = domain->iova_limit; + dma_addr_t iova = vduse_domain_alloc_iova(iovad, size, limit); + void *orig = alloc_pages_exact(size, flag); + + if (!iova || !orig) + goto err; + + spin_lock(&domain->iotlb_lock); + if (vduse_iotlb_add_range(domain, (u64)iova, (u64)iova + size - 1, + virt_to_phys(orig), VHOST_MAP_RW, + domain->file, (u64)iova)) { + spin_unlock(&domain->iotlb_lock); + goto err; + } + spin_unlock(&domain->iotlb_lock); + + *dma_addr = iova; + + return orig; +err: + *dma_addr = DMA_MAPPING_ERROR; + if (orig) + free_pages_exact(orig, size); + if (iova) + vduse_domain_free_iova(iovad, iova, size); + + return NULL; +} + +void vduse_domain_free_coherent(struct vduse_iova_domain *domain, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct iova_domain *iovad = &domain->consistent_iovad; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + phys_addr_t pa; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, (u64)dma_addr, + (u64)dma_addr + size - 1); + if (WARN_ON(!map)) { + spin_unlock(&domain->iotlb_lock); + return; + } + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + pa = map->addr; + vhost_iotlb_map_free(domain->iotlb, map); + spin_unlock(&domain->iotlb_lock); + + vduse_domain_free_iova(iovad, dma_addr, size); + free_pages_exact(phys_to_virt(pa), size); +} + +static vm_fault_t vduse_domain_mmap_fault(struct vm_fault *vmf) +{ + struct vduse_iova_domain *domain = vmf->vma->vm_private_data; + unsigned long iova = vmf->pgoff << PAGE_SHIFT; + struct page *page; + + if (!domain) + return VM_FAULT_SIGBUS; + + if (iova < domain->bounce_size) + page = vduse_domain_get_bounce_page(domain, iova); + else + page = vduse_domain_get_coherent_page(domain, iova); + + if (!page) + return VM_FAULT_SIGBUS; + + vmf->page = page; + + return 0; +} + +static const struct vm_operations_struct vduse_domain_mmap_ops = { + .fault = vduse_domain_mmap_fault, +}; + +static int vduse_domain_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct vduse_iova_domain *domain = file->private_data; + + vma->vm_flags |= VM_DONTDUMP | VM_DONTEXPAND; + vma->vm_private_data = domain; + vma->vm_ops = &vduse_domain_mmap_ops; + + return 0; +} + +static int vduse_domain_release(struct inode *inode, struct file *file) +{ + struct vduse_iova_domain *domain = file->private_data; + + spin_lock(&domain->iotlb_lock); + vduse_iotlb_del_range(domain, 0, ULLONG_MAX); + vduse_domain_free_bounce_pages(domain); + spin_unlock(&domain->iotlb_lock); + put_iova_domain(&domain->stream_iovad); + put_iova_domain(&domain->consistent_iovad); + vhost_iotlb_free(domain->iotlb); + vfree(domain->bounce_maps); + kfree(domain); + + return 0; +} + +static const struct file_operations vduse_domain_fops = { + .owner = THIS_MODULE, + .mmap = vduse_domain_mmap, + .release = vduse_domain_release, +}; + +void vduse_domain_destroy(struct vduse_iova_domain *domain) +{ + fput(domain->file); +} + +struct vduse_iova_domain * +vduse_domain_create(unsigned long iova_limit, size_t bounce_size) +{ + struct vduse_iova_domain *domain; + struct file *file; + struct vduse_bounce_map *map; + unsigned long pfn, bounce_pfns; + + bounce_pfns = PAGE_ALIGN(bounce_size) >> PAGE_SHIFT; + if (iova_limit <= bounce_size) + return NULL; + + domain = kzalloc(sizeof(*domain), GFP_KERNEL); + if (!domain) + return NULL; + + domain->iotlb = vhost_iotlb_alloc(0, 0); + if (!domain->iotlb) + goto err_iotlb; + + domain->iova_limit = iova_limit; + domain->bounce_size = PAGE_ALIGN(bounce_size); + domain->bounce_maps = vzalloc(bounce_pfns * + sizeof(struct vduse_bounce_map)); + if (!domain->bounce_maps) + goto err_map; + + for (pfn = 0; pfn < bounce_pfns; pfn++) { + map = &domain->bounce_maps[pfn]; + map->orig_phys = INVALID_PHYS_ADDR; + } + file = anon_inode_getfile("[vduse-domain]", &vduse_domain_fops, + domain, O_RDWR); + if (IS_ERR(file)) + goto err_file; + + domain->file = file; + spin_lock_init(&domain->iotlb_lock); + init_iova_domain(&domain->stream_iovad, + PAGE_SIZE, IOVA_START_PFN); + init_iova_domain(&domain->consistent_iovad, + PAGE_SIZE, bounce_pfns); + + return domain; +err_file: + vfree(domain->bounce_maps); +err_map: + vhost_iotlb_free(domain->iotlb); +err_iotlb: + kfree(domain); + return NULL; +} + +int vduse_domain_init(void) +{ + return iova_cache_get(); +} + +void vduse_domain_exit(void) +{ + iova_cache_put(); +} diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/iova_domain.h new file mode 100644 index 000000000000..5fc41d01c412 --- /dev/null +++ b/drivers/vdpa/vdpa_user/iova_domain.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * MMU-based IOMMU implementation + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. + * + * Author: Xie Yongji + * + */ + +#ifndef _VDUSE_IOVA_DOMAIN_H +#define _VDUSE_IOVA_DOMAIN_H + +#include +#include +#include + +#define IOVA_START_PFN 1 + +#define INVALID_PHYS_ADDR (~(phys_addr_t)0) + +struct vduse_bounce_map { + struct page *bounce_page; + u64 orig_phys; +}; + +struct vduse_iova_domain { + struct iova_domain stream_iovad; + struct iova_domain consistent_iovad; + struct vduse_bounce_map *bounce_maps; + size_t bounce_size; + unsigned long iova_limit; + int bounce_map; + struct vhost_iotlb *iotlb; + spinlock_t iotlb_lock; + struct file *file; +}; + +int vduse_domain_set_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb); + +void vduse_domain_clear_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb); + +dma_addr_t vduse_domain_map_page(struct vduse_iova_domain *domain, + struct page *page, unsigned long offset, + size_t size, enum dma_data_direction dir, + unsigned long attrs); + +void vduse_domain_unmap_page(struct vduse_iova_domain *domain, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, unsigned long attrs); + +void *vduse_domain_alloc_coherent(struct vduse_iova_domain *domain, + size_t size, dma_addr_t *dma_addr, + gfp_t flag, unsigned long attrs); + +void vduse_domain_free_coherent(struct vduse_iova_domain *domain, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs); + +void vduse_domain_reset_bounce_map(struct vduse_iova_domain *domain); + +void vduse_domain_destroy(struct vduse_iova_domain *domain); + +struct vduse_iova_domain *vduse_domain_create(unsigned long iova_limit, + size_t bounce_size); + +int vduse_domain_init(void); + +void vduse_domain_exit(void); + +#endif /* _VDUSE_IOVA_DOMAIN_H */ From patchwork Tue Jul 13 08:46:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373199 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B76F9C11F67 for ; Tue, 13 Jul 2021 08:49:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C0F1613AB for ; Tue, 13 Jul 2021 08:49:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235382AbhGMIvu (ORCPT ); Tue, 13 Jul 2021 04:51:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50298 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235392AbhGMIvd (ORCPT ); Tue, 13 Jul 2021 04:51:33 -0400 Received: from mail-pj1-x102d.google.com (mail-pj1-x102d.google.com [IPv6:2607:f8b0:4864:20::102d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10F2CC0613EE for ; Tue, 13 Jul 2021 01:48:24 -0700 (PDT) Received: by mail-pj1-x102d.google.com with SMTP id i16-20020a17090acf90b02901736d9d2218so1656243pju.1 for ; Tue, 13 Jul 2021 01:48:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=yIPl04ivFClsafahu2qJ8PDmZ+pEZYepUapxtPnL8wQ=; b=rHbYu4SqxnQv72AGurJYrHpFqxamSELUZxnZbA+PwwFUCxh2BL800GgzhiDBDn7snw vPRjqvElYzHPZv86xBJvVf+NtiJJ5PWHvv8sr5CHd8aGhiuiX6JszOPctOpH13Rt9boK srkuLaEtXxszoUDTEuJjcOCbl9AIy7SEyIJ+So/Ky/MRDfT2HQKFdxakzVnJPPmCvFtd /e07SPXMHCFGuFJGaVo1Ip9QGJLz+cwdqz6Kbdp4VNErrr4uJxe+tInK4b/5BxJYf5Tq e/Zor9BQZswkVJVEK0V5Jbxayo7DmNe8DMY8omE9N6mZYNSVK4I4n4b4PtAKBvVjobTZ LASw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yIPl04ivFClsafahu2qJ8PDmZ+pEZYepUapxtPnL8wQ=; b=bgVm+zlucB1Cb5dYAty/Im19c+PioZ2rbrpftTo+rHWJx/ds4nCByDwrVCvAt4uGSd mOjrgkF7iaugWtHfyNFQkLIrVCKnMR2wZSfimBzHU12vyzwtIi0t9BxTZsPrtPMoCcMj mDt3u+7RUooXI3OWZiUJI2czNs2y/hghsUDeO/cVkyGHcy1DwEbS/H7SwT4Dt4MIkQj7 8fcjeLJMVmjUjACdEVCntJke39oyLXw7EsN72mYB1G5wo2TrMNH8HgsuECpvxQnXyPIP fpfKSwutXs4c7+4ftyb2x74rKGzScDdvbE1QooflESRjLhK5zxKSAgj1xXfuKsDD3WUy sFOw== X-Gm-Message-State: AOAM53321NgGfo++BrRZJuyRbueL1EPDNJZGSoqPqekjiviGn2iJKqbk 9QpxaWZ+GQrg4+dacl4j5OAKr/FMWA4M X-Google-Smtp-Source: ABdhPJyNI4SCnAF/FaXz1vQQiNPlViVk5DHDMIWtKGo3FJ3O6HfqyjYMjhQwxUPFK2+xvmwucJvTnQ== X-Received: by 2002:a17:902:8488:b029:129:97e8:16e7 with SMTP id c8-20020a1709028488b029012997e816e7mr2683936plo.39.1626166103242; Tue, 13 Jul 2021 01:48:23 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id c141sm19151424pfc.13.2021.07.13.01.48.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:22 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 16/17] vduse: Introduce VDUSE - vDPA Device in Userspace Date: Tue, 13 Jul 2021 16:46:55 +0800 Message-Id: <20210713084656.232-17-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This VDUSE driver enables implementing software-emulated vDPA devices in userspace. The vDPA device is created by ioctl(VDUSE_CREATE_DEV) on /dev/vduse/control. Then a char device interface (/dev/vduse/$NAME) is exported to userspace for device emulation. In order to make the device emulation more secure, the device's control path is handled in kernel. A message mechnism is introduced to forward some dataplane related control messages to userspace. And in the data path, the DMA buffer will be mapped into userspace address space through different ways depending on the vDPA bus to which the vDPA device is attached. In virtio-vdpa case, the MMU-based IOMMU driver is used to achieve that. And in vhost-vdpa case, the DMA buffer is reside in a userspace memory region which can be shared to the VDUSE userspace processs via transferring the shmfd. For more details on VDUSE design and usage, please see the follow-on Documentation commit. Signed-off-by: Xie Yongji --- Documentation/userspace-api/ioctl/ioctl-number.rst | 1 + drivers/vdpa/Kconfig | 10 + drivers/vdpa/Makefile | 1 + drivers/vdpa/vdpa_user/Makefile | 5 + drivers/vdpa/vdpa_user/vduse_dev.c | 1502 ++++++++++++++++++++ include/uapi/linux/vduse.h | 221 +++ 6 files changed, 1740 insertions(+) create mode 100644 drivers/vdpa/vdpa_user/Makefile create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c create mode 100644 include/uapi/linux/vduse.h diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 1409e40e6345..293ca3aef358 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -300,6 +300,7 @@ Code Seq# Include File Comments 'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! '|' 00-7F linux/media.h 0x80 00-1F linux/fb.h +0x81 00-1F linux/vduse.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig index a503c1b2bfd9..6e23bce6433a 100644 --- a/drivers/vdpa/Kconfig +++ b/drivers/vdpa/Kconfig @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK vDPA block device simulator which terminates IO request in a memory buffer. +config VDPA_USER + tristate "VDUSE (vDPA Device in Userspace) support" + depends on EVENTFD && MMU && HAS_DMA + select DMA_OPS + select VHOST_IOTLB + select IOMMU_IOVA + help + With VDUSE it is possible to emulate a vDPA Device + in a userspace program. + config IFCVF tristate "Intel IFC VF vDPA driver" depends on PCI_MSI diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index 67fe7f3d6943..f02ebed33f19 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VDPA) += vdpa.o obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ +obj-$(CONFIG_VDPA_USER) += vdpa_user/ obj-$(CONFIG_IFCVF) += ifcvf/ obj-$(CONFIG_MLX5_VDPA) += mlx5/ obj-$(CONFIG_VP_VDPA) += virtio_pci/ diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile new file mode 100644 index 000000000000..260e0b26af99 --- /dev/null +++ b/drivers/vdpa/vdpa_user/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + +vduse-y := vduse_dev.o iova_domain.o + +obj-$(CONFIG_VDPA_USER) += vduse.o diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c new file mode 100644 index 000000000000..c994a4a4660c --- /dev/null +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -0,0 +1,1502 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * VDUSE: vDPA Device in Userspace + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. + * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +#define DRV_AUTHOR "Yongji Xie " +#define DRV_DESC "vDPA Device in Userspace" +#define DRV_LICENSE "GPL v2" + +#define VDUSE_DEV_MAX (1U << MINORBITS) +#define VDUSE_MAX_BOUNCE_SIZE (64 * 1024 * 1024) +#define VDUSE_IOVA_SIZE (128 * 1024 * 1024) +#define VDUSE_REQUEST_TIMEOUT 30 + +struct vduse_virtqueue { + u16 index; + u16 num_max; + u32 num; + u64 desc_addr; + u64 driver_addr; + u64 device_addr; + struct vdpa_vq_state state; + bool ready; + bool kicked; + spinlock_t kick_lock; + spinlock_t irq_lock; + struct eventfd_ctx *kickfd; + struct vdpa_callback cb; + struct work_struct inject; +}; + +struct vduse_dev; + +struct vduse_vdpa { + struct vdpa_device vdpa; + struct vduse_dev *dev; +}; + +struct vduse_dev { + struct vduse_vdpa *vdev; + struct device *dev; + struct vduse_virtqueue *vqs; + struct vduse_iova_domain *domain; + char *name; + struct mutex lock; + spinlock_t msg_lock; + u64 msg_unique; + wait_queue_head_t waitq; + struct list_head send_list; + struct list_head recv_list; + struct vdpa_callback config_cb; + struct work_struct inject; + spinlock_t irq_lock; + int minor; + bool connected; + u64 api_version; + u64 device_features; + u64 driver_features; + u32 device_id; + u32 vendor_id; + u32 generation; + u32 config_size; + void *config; + u8 status; + u32 vq_num; + u32 vq_align; +}; + +struct vduse_dev_msg { + struct vduse_dev_request req; + struct vduse_dev_response resp; + struct list_head list; + wait_queue_head_t waitq; + bool completed; +}; + +struct vduse_control { + u64 api_version; +}; + +static DEFINE_MUTEX(vduse_lock); +static DEFINE_IDR(vduse_idr); + +static dev_t vduse_major; +static struct class *vduse_class; +static struct cdev vduse_ctrl_cdev; +static struct cdev vduse_cdev; +static struct workqueue_struct *vduse_irq_wq; + +static u32 allowed_device_id[] = { + VIRTIO_ID_BLOCK, +}; + +static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa) +{ + struct vduse_vdpa *vdev = container_of(vdpa, struct vduse_vdpa, vdpa); + + return vdev->dev; +} + +static inline struct vduse_dev *dev_to_vduse(struct device *dev) +{ + struct vdpa_device *vdpa = dev_to_vdpa(dev); + + return vdpa_to_vduse(vdpa); +} + +static struct vduse_dev_msg *vduse_find_msg(struct list_head *head, + uint32_t request_id) +{ + struct vduse_dev_msg *msg; + + list_for_each_entry(msg, head, list) { + if (msg->req.request_id == request_id) { + list_del(&msg->list); + return msg; + } + } + + return NULL; +} + +static struct vduse_dev_msg *vduse_dequeue_msg(struct list_head *head) +{ + struct vduse_dev_msg *msg = NULL; + + if (!list_empty(head)) { + msg = list_first_entry(head, struct vduse_dev_msg, list); + list_del(&msg->list); + } + + return msg; +} + +static void vduse_enqueue_msg(struct list_head *head, + struct vduse_dev_msg *msg) +{ + list_add_tail(&msg->list, head); +} + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + int ret; + + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + msg->req.request_id = dev->msg_unique++; + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + + wait_event_killable_timeout(msg->waitq, msg->completed, + VDUSE_REQUEST_TIMEOUT * HZ); + spin_lock(&dev->msg_lock); + if (!msg->completed) { + list_del(&msg->list); + msg->resp.result = VDUSE_REQ_RESULT_FAILED; + } + ret = (msg->resp.result == VDUSE_REQ_RESULT_OK) ? 0 : -EIO; + spin_unlock(&dev->msg_lock); + + return ret; +} + +static int vduse_dev_get_vq_state_packed(struct vduse_dev *dev, + struct vduse_virtqueue *vq, + struct vdpa_vq_state_packed *packed) +{ + struct vduse_dev_msg msg = { 0 }; + int ret; + + msg.req.type = VDUSE_GET_VQ_STATE; + msg.req.vq_state.index = vq->index; + + ret = vduse_dev_msg_sync(dev, &msg); + if (ret) + return ret; + + packed->last_avail_counter = + msg.resp.vq_state.packed.last_avail_counter; + packed->last_avail_idx = msg.resp.vq_state.packed.last_avail_idx; + packed->last_used_counter = msg.resp.vq_state.packed.last_used_counter; + packed->last_used_idx = msg.resp.vq_state.packed.last_used_idx; + + return 0; +} + +static int vduse_dev_get_vq_state_split(struct vduse_dev *dev, + struct vduse_virtqueue *vq, + struct vdpa_vq_state_split *split) +{ + struct vduse_dev_msg msg = { 0 }; + int ret; + + msg.req.type = VDUSE_GET_VQ_STATE; + msg.req.vq_state.index = vq->index; + + ret = vduse_dev_msg_sync(dev, &msg); + if (ret) + return ret; + + split->avail_index = msg.resp.vq_state.split.avail_index; + + return 0; +} + +static int vduse_dev_set_status(struct vduse_dev *dev, u8 status) +{ + struct vduse_dev_msg msg = { 0 }; + + msg.req.type = VDUSE_SET_STATUS; + msg.req.s.status = status; + + return vduse_dev_msg_sync(dev, &msg); +} + +static int vduse_dev_update_iotlb(struct vduse_dev *dev, + u64 start, u64 last) +{ + struct vduse_dev_msg msg = { 0 }; + + if (last < start) + return -EINVAL; + + msg.req.type = VDUSE_UPDATE_IOTLB; + msg.req.iova.start = start; + msg.req.iova.last = last; + + return vduse_dev_msg_sync(dev, &msg); +} + +static ssize_t vduse_dev_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + struct file *file = iocb->ki_filp; + struct vduse_dev *dev = file->private_data; + struct vduse_dev_msg *msg; + int size = sizeof(struct vduse_dev_request); + ssize_t ret; + + if (iov_iter_count(to) < size) + return -EINVAL; + + spin_lock(&dev->msg_lock); + while (1) { + msg = vduse_dequeue_msg(&dev->send_list); + if (msg) + break; + + ret = -EAGAIN; + if (file->f_flags & O_NONBLOCK) + goto unlock; + + spin_unlock(&dev->msg_lock); + ret = wait_event_interruptible_exclusive(dev->waitq, + !list_empty(&dev->send_list)); + if (ret) + return ret; + + spin_lock(&dev->msg_lock); + } + spin_unlock(&dev->msg_lock); + ret = copy_to_iter(&msg->req, size, to); + spin_lock(&dev->msg_lock); + if (ret != size) { + ret = -EFAULT; + vduse_enqueue_msg(&dev->send_list, msg); + goto unlock; + } + vduse_enqueue_msg(&dev->recv_list, msg); +unlock: + spin_unlock(&dev->msg_lock); + + return ret; +} + +static ssize_t vduse_dev_write_iter(struct kiocb *iocb, struct iov_iter *from) +{ + struct file *file = iocb->ki_filp; + struct vduse_dev *dev = file->private_data; + struct vduse_dev_response resp; + struct vduse_dev_msg *msg; + size_t ret; + + ret = copy_from_iter(&resp, sizeof(resp), from); + if (ret != sizeof(resp)) + return -EINVAL; + + spin_lock(&dev->msg_lock); + msg = vduse_find_msg(&dev->recv_list, resp.request_id); + if (!msg) { + ret = -ENOENT; + goto unlock; + } + + memcpy(&msg->resp, &resp, sizeof(resp)); + msg->completed = 1; + wake_up(&msg->waitq); +unlock: + spin_unlock(&dev->msg_lock); + + return ret; +} + +static __poll_t vduse_dev_poll(struct file *file, poll_table *wait) +{ + struct vduse_dev *dev = file->private_data; + __poll_t mask = 0; + + poll_wait(file, &dev->waitq, wait); + + if (!list_empty(&dev->send_list)) + mask |= EPOLLIN | EPOLLRDNORM; + if (!list_empty(&dev->recv_list)) + mask |= EPOLLOUT | EPOLLWRNORM; + + return mask; +} + +static void vduse_dev_reset(struct vduse_dev *dev) +{ + int i; + struct vduse_iova_domain *domain = dev->domain; + + /* The coherent mappings are handled in vduse_dev_free_coherent() */ + if (domain->bounce_map) + vduse_domain_reset_bounce_map(domain); + + dev->driver_features = 0; + dev->generation++; + spin_lock(&dev->irq_lock); + dev->config_cb.callback = NULL; + dev->config_cb.private = NULL; + spin_unlock(&dev->irq_lock); + flush_work(&dev->inject); + + for (i = 0; i < dev->vq_num; i++) { + struct vduse_virtqueue *vq = &dev->vqs[i]; + + vq->ready = false; + vq->desc_addr = 0; + vq->driver_addr = 0; + vq->device_addr = 0; + vq->num = 0; + memset(&vq->state, 0, sizeof(vq->state)); + + spin_lock(&vq->kick_lock); + vq->kicked = false; + if (vq->kickfd) + eventfd_ctx_put(vq->kickfd); + vq->kickfd = NULL; + spin_unlock(&vq->kick_lock); + + spin_lock(&vq->irq_lock); + vq->cb.callback = NULL; + vq->cb.private = NULL; + spin_unlock(&vq->irq_lock); + flush_work(&vq->inject); + } +} + +static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx, + u64 desc_area, u64 driver_area, + u64 device_area) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + vq->desc_addr = desc_area; + vq->driver_addr = driver_area; + vq->device_addr = device_area; + + return 0; +} + +static void vduse_vdpa_kick_vq(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + spin_lock(&vq->kick_lock); + if (!vq->ready) + goto unlock; + + if (vq->kickfd) + eventfd_signal(vq->kickfd, 1); + else + vq->kicked = true; +unlock: + spin_unlock(&vq->kick_lock); +} + +static void vduse_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 idx, + struct vdpa_callback *cb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + spin_lock(&vq->irq_lock); + vq->cb.callback = cb->callback; + vq->cb.private = cb->private; + spin_unlock(&vq->irq_lock); +} + +static void vduse_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 idx, u32 num) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + vq->num = num; +} + +static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa, + u16 idx, bool ready) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + vq->ready = ready; +} + +static bool vduse_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + return vq->ready; +} + +static int vduse_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 idx, + const struct vdpa_vq_state *state) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) { + vq->state.packed.last_avail_counter = + state->packed.last_avail_counter; + vq->state.packed.last_avail_idx = state->packed.last_avail_idx; + vq->state.packed.last_used_counter = + state->packed.last_used_counter; + vq->state.packed.last_used_idx = state->packed.last_used_idx; + } else + vq->state.split.avail_index = state->split.avail_index; + + return 0; +} + +static int vduse_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 idx, + struct vdpa_vq_state *state) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) + return vduse_dev_get_vq_state_packed(dev, vq, &state->packed); + + return vduse_dev_get_vq_state_split(dev, vq, &state->split); +} + +static u32 vduse_vdpa_get_vq_align(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vq_align; +} + +static u64 vduse_vdpa_get_features(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->device_features; +} + +static int vduse_vdpa_set_features(struct vdpa_device *vdpa, u64 features) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + dev->driver_features = features; + return 0; +} + +static void vduse_vdpa_set_config_cb(struct vdpa_device *vdpa, + struct vdpa_callback *cb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + spin_lock(&dev->irq_lock); + dev->config_cb.callback = cb->callback; + dev->config_cb.private = cb->private; + spin_unlock(&dev->irq_lock); +} + +static u16 vduse_vdpa_get_vq_num_max(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vqs[idx].num_max; +} + +static u32 vduse_vdpa_get_device_id(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->device_id; +} + +static u32 vduse_vdpa_get_vendor_id(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vendor_id; +} + +static u8 vduse_vdpa_get_status(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->status; +} + +static void vduse_vdpa_set_status(struct vdpa_device *vdpa, u8 status) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + if (vduse_dev_set_status(dev, status)) + return; + + dev->status = status; + if (status == 0) + vduse_dev_reset(dev); +} + +static size_t vduse_vdpa_get_config_size(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->config_size; +} + +static void vduse_vdpa_get_config(struct vdpa_device *vdpa, unsigned int offset, + void *buf, unsigned int len) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + if (len > dev->config_size - offset) + return; + + memcpy(buf, dev->config + offset, len); +} + +static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset, + const void *buf, unsigned int len) +{ + /* Now we only support read-only configuration space */ +} + +static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->generation; +} + +static int vduse_vdpa_set_map(struct vdpa_device *vdpa, + struct vhost_iotlb *iotlb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + int ret; + + ret = vduse_domain_set_map(dev->domain, iotlb); + if (ret) + return ret; + + ret = vduse_dev_update_iotlb(dev, 0ULL, ULLONG_MAX); + if (ret) { + vduse_domain_clear_map(dev->domain, iotlb); + return ret; + } + + return 0; +} + +static void vduse_vdpa_free(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + dev->vdev = NULL; +} + +static const struct vdpa_config_ops vduse_vdpa_config_ops = { + .set_vq_address = vduse_vdpa_set_vq_address, + .kick_vq = vduse_vdpa_kick_vq, + .set_vq_cb = vduse_vdpa_set_vq_cb, + .set_vq_num = vduse_vdpa_set_vq_num, + .set_vq_ready = vduse_vdpa_set_vq_ready, + .get_vq_ready = vduse_vdpa_get_vq_ready, + .set_vq_state = vduse_vdpa_set_vq_state, + .get_vq_state = vduse_vdpa_get_vq_state, + .get_vq_align = vduse_vdpa_get_vq_align, + .get_features = vduse_vdpa_get_features, + .set_features = vduse_vdpa_set_features, + .set_config_cb = vduse_vdpa_set_config_cb, + .get_vq_num_max = vduse_vdpa_get_vq_num_max, + .get_device_id = vduse_vdpa_get_device_id, + .get_vendor_id = vduse_vdpa_get_vendor_id, + .get_status = vduse_vdpa_get_status, + .set_status = vduse_vdpa_set_status, + .get_config_size = vduse_vdpa_get_config_size, + .get_config = vduse_vdpa_get_config, + .set_config = vduse_vdpa_set_config, + .get_generation = vduse_vdpa_get_generation, + .set_map = vduse_vdpa_set_map, + .free = vduse_vdpa_free, +}; + +static dma_addr_t vduse_dev_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_map_page(domain, page, offset, size, dir, attrs); +} + +static void vduse_dev_unmap_page(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); +} + +static void *vduse_dev_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_addr, gfp_t flag, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + unsigned long iova; + void *addr; + + *dma_addr = DMA_MAPPING_ERROR; + addr = vduse_domain_alloc_coherent(domain, size, + (dma_addr_t *)&iova, flag, attrs); + if (!addr) + return NULL; + + *dma_addr = (dma_addr_t)iova; + + return addr; +} + +static void vduse_dev_free_coherent(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); +} + +static size_t vduse_dev_max_mapping_size(struct device *dev) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return domain->bounce_size; +} + +static const struct dma_map_ops vduse_dev_dma_ops = { + .map_page = vduse_dev_map_page, + .unmap_page = vduse_dev_unmap_page, + .alloc = vduse_dev_alloc_coherent, + .free = vduse_dev_free_coherent, + .max_mapping_size = vduse_dev_max_mapping_size, +}; + +static unsigned int perm_to_file_flags(u8 perm) +{ + unsigned int flags = 0; + + switch (perm) { + case VDUSE_ACCESS_WO: + flags |= O_WRONLY; + break; + case VDUSE_ACCESS_RO: + flags |= O_RDONLY; + break; + case VDUSE_ACCESS_RW: + flags |= O_RDWR; + break; + default: + WARN(1, "invalidate vhost IOTLB permission\n"); + break; + } + + return flags; +} + +static int vduse_kickfd_setup(struct vduse_dev *dev, + struct vduse_vq_eventfd *eventfd) +{ + struct eventfd_ctx *ctx = NULL; + struct vduse_virtqueue *vq; + u32 index; + + if (eventfd->index >= dev->vq_num) + return -EINVAL; + + index = array_index_nospec(eventfd->index, dev->vq_num); + vq = &dev->vqs[index]; + if (eventfd->fd >= 0) { + ctx = eventfd_ctx_fdget(eventfd->fd); + if (IS_ERR(ctx)) + return PTR_ERR(ctx); + } else if (eventfd->fd != VDUSE_EVENTFD_DEASSIGN) + return 0; + + spin_lock(&vq->kick_lock); + if (vq->kickfd) + eventfd_ctx_put(vq->kickfd); + vq->kickfd = ctx; + if (vq->ready && vq->kicked && vq->kickfd) { + eventfd_signal(vq->kickfd, 1); + vq->kicked = false; + } + spin_unlock(&vq->kick_lock); + + return 0; +} + +static bool vduse_dev_is_ready(struct vduse_dev *dev) +{ + int i; + + for (i = 0; i < dev->vq_num; i++) + if (!dev->vqs[i].num_max) + return false; + + return true; +} + +static void vduse_dev_irq_inject(struct work_struct *work) +{ + struct vduse_dev *dev = container_of(work, struct vduse_dev, inject); + + spin_lock_irq(&dev->irq_lock); + if (dev->config_cb.callback) + dev->config_cb.callback(dev->config_cb.private); + spin_unlock_irq(&dev->irq_lock); +} + +static void vduse_vq_irq_inject(struct work_struct *work) +{ + struct vduse_virtqueue *vq = container_of(work, + struct vduse_virtqueue, inject); + + spin_lock_irq(&vq->irq_lock); + if (vq->ready && vq->cb.callback) + vq->cb.callback(vq->cb.private); + spin_unlock_irq(&vq->irq_lock); +} + +static long vduse_dev_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + struct vduse_dev *dev = file->private_data; + void __user *argp = (void __user *)arg; + int ret; + + switch (cmd) { + case VDUSE_IOTLB_GET_FD: { + struct vduse_iotlb_entry entry; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + struct vduse_iova_domain *domain = dev->domain; + struct file *f = NULL; + + ret = -EFAULT; + if (copy_from_user(&entry, argp, sizeof(entry))) + break; + + ret = -EINVAL; + if (entry.start > entry.last) + break; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, + entry.start, entry.last); + if (map) { + map_file = (struct vdpa_map_file *)map->opaque; + f = get_file(map_file->file); + entry.offset = map_file->offset; + entry.start = map->start; + entry.last = map->last; + entry.perm = map->perm; + } + spin_unlock(&domain->iotlb_lock); + ret = -EINVAL; + if (!f) + break; + + ret = -EFAULT; + if (copy_to_user(argp, &entry, sizeof(entry))) { + fput(f); + break; + } + ret = receive_fd(f, perm_to_file_flags(entry.perm)); + fput(f); + break; + } + case VDUSE_DEV_GET_FEATURES: + ret = put_user(dev->driver_features, (u64 __user *)argp); + break; + case VDUSE_DEV_SET_CONFIG: { + struct vduse_config_data config; + unsigned long size = offsetof(struct vduse_config_data, + buffer); + + ret = -EFAULT; + if (copy_from_user(&config, argp, size)) + break; + + ret = -EINVAL; + if (config.length == 0 || + config.length > dev->config_size - config.offset) + break; + + ret = -EFAULT; + if (copy_from_user(dev->config + config.offset, argp + size, + config.length)) + break; + + ret = 0; + break; + } + case VDUSE_DEV_INJECT_IRQ: + ret = 0; + queue_work(vduse_irq_wq, &dev->inject); + break; + case VDUSE_VQ_SETUP: { + struct vduse_vq_config config; + u32 index; + + ret = -EFAULT; + if (copy_from_user(&config, argp, sizeof(config))) + break; + + ret = -EINVAL; + if (config.index >= dev->vq_num) + break; + + index = array_index_nospec(config.index, dev->vq_num); + dev->vqs[index].num_max = config.max_size; + ret = 0; + break; + } + case VDUSE_VQ_GET_INFO: { + struct vduse_vq_info vq_info; + struct vduse_virtqueue *vq; + u32 index; + + ret = -EFAULT; + if (copy_from_user(&vq_info, argp, sizeof(vq_info))) + break; + + ret = -EINVAL; + if (vq_info.index >= dev->vq_num) + break; + + index = array_index_nospec(vq_info.index, dev->vq_num); + vq = &dev->vqs[index]; + vq_info.desc_addr = vq->desc_addr; + vq_info.driver_addr = vq->driver_addr; + vq_info.device_addr = vq->device_addr; + vq_info.num = vq->num; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) { + vq_info.packed.last_avail_counter = + vq->state.packed.last_avail_counter; + vq_info.packed.last_avail_idx = + vq->state.packed.last_avail_idx; + vq_info.packed.last_used_counter = + vq->state.packed.last_used_counter; + vq_info.packed.last_used_idx = + vq->state.packed.last_used_idx; + } else + vq_info.split.avail_index = + vq->state.split.avail_index; + + vq_info.ready = vq->ready; + + ret = -EFAULT; + if (copy_to_user(argp, &vq_info, sizeof(vq_info))) + break; + + ret = 0; + break; + } + case VDUSE_VQ_SETUP_KICKFD: { + struct vduse_vq_eventfd eventfd; + + ret = -EFAULT; + if (copy_from_user(&eventfd, argp, sizeof(eventfd))) + break; + + ret = vduse_kickfd_setup(dev, &eventfd); + break; + } + case VDUSE_VQ_INJECT_IRQ: { + u32 index; + + ret = -EFAULT; + if (get_user(index, (u32 __user *)argp)) + break; + + ret = -EINVAL; + if (index >= dev->vq_num) + break; + + ret = 0; + index = array_index_nospec(index, dev->vq_num); + queue_work(vduse_irq_wq, &dev->vqs[index].inject); + break; + } + default: + ret = -ENOIOCTLCMD; + break; + } + + return ret; +} + +static int vduse_dev_release(struct inode *inode, struct file *file) +{ + struct vduse_dev *dev = file->private_data; + + spin_lock(&dev->msg_lock); + /* Make sure the inflight messages can processed after reconncection */ + list_splice_init(&dev->recv_list, &dev->send_list); + spin_unlock(&dev->msg_lock); + dev->connected = false; + + return 0; +} + +static struct vduse_dev *vduse_dev_get_from_minor(int minor) +{ + struct vduse_dev *dev; + + mutex_lock(&vduse_lock); + dev = idr_find(&vduse_idr, minor); + mutex_unlock(&vduse_lock); + + return dev; +} + +static int vduse_dev_open(struct inode *inode, struct file *file) +{ + int ret; + struct vduse_dev *dev = vduse_dev_get_from_minor(iminor(inode)); + + if (!dev) + return -ENODEV; + + ret = -EBUSY; + mutex_lock(&dev->lock); + if (dev->connected) + goto unlock; + + ret = 0; + dev->connected = true; + file->private_data = dev; +unlock: + mutex_unlock(&dev->lock); + + return ret; +} + +static const struct file_operations vduse_dev_fops = { + .owner = THIS_MODULE, + .open = vduse_dev_open, + .release = vduse_dev_release, + .read_iter = vduse_dev_read_iter, + .write_iter = vduse_dev_write_iter, + .poll = vduse_dev_poll, + .unlocked_ioctl = vduse_dev_ioctl, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, +}; + +static struct vduse_dev *vduse_dev_create(void) +{ + struct vduse_dev *dev = kzalloc(sizeof(*dev), GFP_KERNEL); + + if (!dev) + return NULL; + + mutex_init(&dev->lock); + spin_lock_init(&dev->msg_lock); + INIT_LIST_HEAD(&dev->send_list); + INIT_LIST_HEAD(&dev->recv_list); + spin_lock_init(&dev->irq_lock); + + INIT_WORK(&dev->inject, vduse_dev_irq_inject); + init_waitqueue_head(&dev->waitq); + + return dev; +} + +static void vduse_dev_destroy(struct vduse_dev *dev) +{ + kfree(dev); +} + +static struct vduse_dev *vduse_find_dev(const char *name) +{ + struct vduse_dev *dev; + int id; + + idr_for_each_entry(&vduse_idr, dev, id) + if (!strcmp(dev->name, name)) + return dev; + + return NULL; +} + +static int vduse_destroy_dev(char *name) +{ + struct vduse_dev *dev = vduse_find_dev(name); + + if (!dev) + return -EINVAL; + + mutex_lock(&dev->lock); + if (dev->vdev || dev->connected) { + mutex_unlock(&dev->lock); + return -EBUSY; + } + dev->connected = true; + mutex_unlock(&dev->lock); + + vduse_dev_reset(dev); + device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor)); + idr_remove(&vduse_idr, dev->minor); + kvfree(dev->config); + kfree(dev->vqs); + vduse_domain_destroy(dev->domain); + kfree(dev->name); + vduse_dev_destroy(dev); + module_put(THIS_MODULE); + + return 0; +} + +static bool device_is_allowed(u32 device_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(allowed_device_id); i++) + if (allowed_device_id[i] == device_id) + return true; + + return false; +} + +static bool features_is_valid(u64 features) +{ + if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) + return false; + + /* Now we only support read-only configuration space */ + if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)) + return false; + + return true; +} + +static bool vduse_validate_config(struct vduse_dev_config *config) +{ + if (config->bounce_size > VDUSE_MAX_BOUNCE_SIZE) + return false; + + if (config->vq_align > PAGE_SIZE) + return false; + + if (config->config_size > PAGE_SIZE) + return false; + + if (!device_is_allowed(config->device_id)) + return false; + + if (!features_is_valid(config->features)) + return false; + + return true; +} + +static int vduse_create_dev(struct vduse_dev_config *config, + void *config_buf, u64 api_version) +{ + int i, ret; + struct vduse_dev *dev; + + ret = -EEXIST; + if (vduse_find_dev(config->name)) + goto err; + + ret = -ENOMEM; + dev = vduse_dev_create(); + if (!dev) + goto err; + + dev->api_version = api_version; + dev->device_features = config->features; + dev->device_id = config->device_id; + dev->vendor_id = config->vendor_id; + dev->name = kstrdup(config->name, GFP_KERNEL); + if (!dev->name) + goto err_str; + + dev->domain = vduse_domain_create(VDUSE_IOVA_SIZE - 1, + config->bounce_size); + if (!dev->domain) + goto err_domain; + + dev->config = config_buf; + dev->config_size = config->config_size; + dev->vq_align = config->vq_align; + dev->vq_num = config->vq_num; + dev->vqs = kcalloc(dev->vq_num, sizeof(*dev->vqs), GFP_KERNEL); + if (!dev->vqs) + goto err_vqs; + + for (i = 0; i < dev->vq_num; i++) { + dev->vqs[i].index = i; + INIT_WORK(&dev->vqs[i].inject, vduse_vq_irq_inject); + spin_lock_init(&dev->vqs[i].kick_lock); + spin_lock_init(&dev->vqs[i].irq_lock); + } + + ret = idr_alloc(&vduse_idr, dev, 1, VDUSE_DEV_MAX, GFP_KERNEL); + if (ret < 0) + goto err_idr; + + dev->minor = ret; + dev->dev = device_create(vduse_class, NULL, + MKDEV(MAJOR(vduse_major), dev->minor), + NULL, "%s", config->name); + if (IS_ERR(dev->dev)) { + ret = PTR_ERR(dev->dev); + goto err_dev; + } + __module_get(THIS_MODULE); + + return 0; +err_dev: + idr_remove(&vduse_idr, dev->minor); +err_idr: + kfree(dev->vqs); +err_vqs: + vduse_domain_destroy(dev->domain); +err_domain: + kfree(dev->name); +err_str: + vduse_dev_destroy(dev); +err: + kvfree(config_buf); + return ret; +} + +static long vduse_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + int ret; + void __user *argp = (void __user *)arg; + struct vduse_control *control = file->private_data; + + mutex_lock(&vduse_lock); + switch (cmd) { + case VDUSE_GET_API_VERSION: + ret = put_user(control->api_version, (u64 __user *)argp); + break; + case VDUSE_SET_API_VERSION: { + u64 api_version; + + ret = -EFAULT; + if (get_user(api_version, (u64 __user *)argp)) + break; + + ret = -EINVAL; + if (api_version > VDUSE_API_VERSION) + break; + + ret = 0; + control->api_version = api_version; + break; + } + case VDUSE_CREATE_DEV: { + struct vduse_dev_config config; + unsigned long size = offsetof(struct vduse_dev_config, config); + void *buf; + + ret = -EFAULT; + if (copy_from_user(&config, argp, size)) + break; + + ret = -EINVAL; + if (vduse_validate_config(&config) == false) + break; + + buf = vmemdup_user(argp + size, config.config_size); + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); + break; + } + config.name[VDUSE_NAME_MAX - 1] = '\0'; + ret = vduse_create_dev(&config, buf, control->api_version); + break; + } + case VDUSE_DESTROY_DEV: { + char name[VDUSE_NAME_MAX]; + + ret = -EFAULT; + if (copy_from_user(name, argp, VDUSE_NAME_MAX)) + break; + + name[VDUSE_NAME_MAX - 1] = '\0'; + ret = vduse_destroy_dev(name); + break; + } + default: + ret = -EINVAL; + break; + } + mutex_unlock(&vduse_lock); + + return ret; +} + +static int vduse_release(struct inode *inode, struct file *file) +{ + struct vduse_control *control = file->private_data; + + kfree(control); + return 0; +} + +static int vduse_open(struct inode *inode, struct file *file) +{ + struct vduse_control *control; + + control = kmalloc(sizeof(struct vduse_control), GFP_KERNEL); + if (!control) + return -ENOMEM; + + control->api_version = VDUSE_API_VERSION; + file->private_data = control; + + return 0; +} + +static const struct file_operations vduse_ctrl_fops = { + .owner = THIS_MODULE, + .open = vduse_open, + .release = vduse_release, + .unlocked_ioctl = vduse_ioctl, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, +}; + +static char *vduse_devnode(struct device *dev, umode_t *mode) +{ + return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev)); +} + +static void vduse_mgmtdev_release(struct device *dev) +{ +} + +static struct device vduse_mgmtdev = { + .init_name = "vduse", + .release = vduse_mgmtdev_release, +}; + +static struct vdpa_mgmt_dev mgmt_dev; + +static int vduse_dev_init_vdpa(struct vduse_dev *dev, const char *name) +{ + struct vduse_vdpa *vdev; + int ret; + + if (dev->vdev) + return -EEXIST; + + vdev = vdpa_alloc_device(struct vduse_vdpa, vdpa, dev->dev, + &vduse_vdpa_config_ops, name, true); + if (!vdev) + return -ENOMEM; + + dev->vdev = vdev; + vdev->dev = dev; + vdev->vdpa.dev.dma_mask = &vdev->vdpa.dev.coherent_dma_mask; + ret = dma_set_mask_and_coherent(&vdev->vdpa.dev, DMA_BIT_MASK(64)); + if (ret) { + put_device(&vdev->vdpa.dev); + return ret; + } + set_dma_ops(&vdev->vdpa.dev, &vduse_dev_dma_ops); + vdev->vdpa.dma_dev = &vdev->vdpa.dev; + vdev->vdpa.mdev = &mgmt_dev; + + return 0; +} + +static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name) +{ + struct vduse_dev *dev; + int ret; + + mutex_lock(&vduse_lock); + dev = vduse_find_dev(name); + if (!dev || !vduse_dev_is_ready(dev)) { + mutex_unlock(&vduse_lock); + return -EINVAL; + } + ret = vduse_dev_init_vdpa(dev, name); + mutex_unlock(&vduse_lock); + if (ret) + return ret; + + ret = _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num); + if (ret) { + put_device(&dev->vdev->vdpa.dev); + return ret; + } + + return 0; +} + +static void vdpa_dev_del(struct vdpa_mgmt_dev *mdev, struct vdpa_device *dev) +{ + _vdpa_unregister_device(dev); +} + +static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops = { + .dev_add = vdpa_dev_add, + .dev_del = vdpa_dev_del, +}; + +static struct virtio_device_id id_table[] = { + { VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID }, + { 0 }, +}; + +static struct vdpa_mgmt_dev mgmt_dev = { + .device = &vduse_mgmtdev, + .id_table = id_table, + .ops = &vdpa_dev_mgmtdev_ops, +}; + +static int vduse_mgmtdev_init(void) +{ + int ret; + + ret = device_register(&vduse_mgmtdev); + if (ret) + return ret; + + ret = vdpa_mgmtdev_register(&mgmt_dev); + if (ret) + goto err; + + return 0; +err: + device_unregister(&vduse_mgmtdev); + return ret; +} + +static void vduse_mgmtdev_exit(void) +{ + vdpa_mgmtdev_unregister(&mgmt_dev); + device_unregister(&vduse_mgmtdev); +} + +static int vduse_init(void) +{ + int ret; + struct device *dev; + + vduse_class = class_create(THIS_MODULE, "vduse"); + if (IS_ERR(vduse_class)) + return PTR_ERR(vduse_class); + + vduse_class->devnode = vduse_devnode; + + ret = alloc_chrdev_region(&vduse_major, 0, VDUSE_DEV_MAX, "vduse"); + if (ret) + goto err_chardev_region; + + /* /dev/vduse/control */ + cdev_init(&vduse_ctrl_cdev, &vduse_ctrl_fops); + vduse_ctrl_cdev.owner = THIS_MODULE; + ret = cdev_add(&vduse_ctrl_cdev, vduse_major, 1); + if (ret) + goto err_ctrl_cdev; + + dev = device_create(vduse_class, NULL, vduse_major, NULL, "control"); + if (IS_ERR(dev)) { + ret = PTR_ERR(dev); + goto err_device; + } + + /* /dev/vduse/$DEVICE */ + cdev_init(&vduse_cdev, &vduse_dev_fops); + vduse_cdev.owner = THIS_MODULE; + ret = cdev_add(&vduse_cdev, MKDEV(MAJOR(vduse_major), 1), + VDUSE_DEV_MAX - 1); + if (ret) + goto err_cdev; + + vduse_irq_wq = alloc_workqueue("vduse-irq", + WQ_HIGHPRI | WQ_SYSFS | WQ_UNBOUND, 0); + if (!vduse_irq_wq) + goto err_wq; + + ret = vduse_domain_init(); + if (ret) + goto err_domain; + + ret = vduse_mgmtdev_init(); + if (ret) + goto err_mgmtdev; + + return 0; +err_mgmtdev: + vduse_domain_exit(); +err_domain: + destroy_workqueue(vduse_irq_wq); +err_wq: + cdev_del(&vduse_cdev); +err_cdev: + device_destroy(vduse_class, vduse_major); +err_device: + cdev_del(&vduse_ctrl_cdev); +err_ctrl_cdev: + unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX); +err_chardev_region: + class_destroy(vduse_class); + return ret; +} +module_init(vduse_init); + +static void vduse_exit(void) +{ + vduse_mgmtdev_exit(); + vduse_domain_exit(); + destroy_workqueue(vduse_irq_wq); + cdev_del(&vduse_cdev); + device_destroy(vduse_class, vduse_major); + cdev_del(&vduse_ctrl_cdev); + unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX); + class_destroy(vduse_class); +} +module_exit(vduse_exit); + +MODULE_LICENSE(DRV_LICENSE); +MODULE_AUTHOR(DRV_AUTHOR); +MODULE_DESCRIPTION(DRV_DESC); diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h new file mode 100644 index 000000000000..585fcce398af --- /dev/null +++ b/include/uapi/linux/vduse.h @@ -0,0 +1,221 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_VDUSE_H_ +#define _UAPI_VDUSE_H_ + +#include + +#define VDUSE_BASE 0x81 + +/* The ioctls for control device (/dev/vduse/control) */ + +#define VDUSE_API_VERSION 0 + +/* + * Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION). + * This is used for future extension. + */ +#define VDUSE_GET_API_VERSION _IOR(VDUSE_BASE, 0x00, __u64) + +/* Set the version of VDUSE API that userspace supported. */ +#define VDUSE_SET_API_VERSION _IOW(VDUSE_BASE, 0x01, __u64) + +/* + * The basic configuration of a VDUSE device, which is used by + * VDUSE_CREATE_DEV ioctl to create a VDUSE device. + */ +struct vduse_dev_config { +#define VDUSE_NAME_MAX 256 + char name[VDUSE_NAME_MAX]; /* vduse device name, needs to be NUL terminated */ + __u32 vendor_id; /* virtio vendor id */ + __u32 device_id; /* virtio device id */ + __u64 features; /* virtio features */ + __u64 bounce_size; /* the size of bounce buffer for data transfer */ + __u32 vq_num; /* the number of virtqueues */ + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */ + __u32 reserved[15]; /* for future use */ + __u32 config_size; /* the size of the configuration space */ + __u8 config[0]; /* the buffer of the configuration space */ +}; + +/* Create a VDUSE device which is represented by a char device (/dev/vduse/$NAME) */ +#define VDUSE_CREATE_DEV _IOW(VDUSE_BASE, 0x02, struct vduse_dev_config) + +/* + * Destroy a VDUSE device. Make sure there are no more references + * to the char device (/dev/vduse/$NAME). + */ +#define VDUSE_DESTROY_DEV _IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX]) + +/* The ioctls for VDUSE device (/dev/vduse/$NAME) */ + +/* + * The information of one IOVA region, which is retrieved from + * VDUSE_IOTLB_GET_FD ioctl. + */ +struct vduse_iotlb_entry { + __u64 offset; /* the mmap offset on returned file descriptor */ + __u64 start; /* start of the IOVA range: [start, last] */ + __u64 last; /* last of the IOVA range: [start, last] */ +#define VDUSE_ACCESS_RO 0x1 +#define VDUSE_ACCESS_WO 0x2 +#define VDUSE_ACCESS_RW 0x3 + __u8 perm; /* access permission of this region */ +}; + +/* + * Find the first IOVA region that overlaps with the range [start, last] + * and return the corresponding file descriptor. Return -EINVAL means the + * IOVA region doesn't exist. Caller should set start and last fields. + */ +#define VDUSE_IOTLB_GET_FD _IOWR(VDUSE_BASE, 0x10, struct vduse_iotlb_entry) + +/* + * Get the negotiated virtio features. It's a subset of the features in + * struct vduse_dev_config which can be accepted by virtio driver. It's + * only valid after FEATURES_OK status bit is set. + */ +#define VDUSE_DEV_GET_FEATURES _IOR(VDUSE_BASE, 0x11, __u64) + +/* + * The information that is used by VDUSE_DEV_SET_CONFIG ioctl to update + * device configuration space. + */ +struct vduse_config_data { + __u32 offset; /* offset from the beginning of configuration space */ + __u32 length; /* the length to write to configuration space */ + __u8 buffer[0]; /* buffer used to write from */ +}; + +/* Set device configuration space */ +#define VDUSE_DEV_SET_CONFIG _IOW(VDUSE_BASE, 0x12, struct vduse_config_data) + +/* + * Inject a config interrupt. It's usually used to notify virtio driver + * that device configuration space has changed. + */ +#define VDUSE_DEV_INJECT_IRQ _IO(VDUSE_BASE, 0x13) + +/* + * The basic configuration of a virtqueue, which is used by + * VDUSE_VQ_SETUP ioctl to setup a virtqueue. + */ +struct vduse_vq_config { + __u32 index; /* virtqueue index */ + __u16 max_size; /* the max size of virtqueue */ +}; + +/* + * Setup the specified virtqueue. Make sure all virtqueues have been + * configured before the device is attached to vDPA bus. + */ +#define VDUSE_VQ_SETUP _IOW(VDUSE_BASE, 0x14, struct vduse_vq_config) + +struct vduse_vq_state_split { + __u16 avail_index; /* available index */ +}; + +struct vduse_vq_state_packed { + __u16 last_avail_counter:1; /* last driver ring wrap counter observed by device */ + __u16 last_avail_idx:15; /* device available index */ + __u16 last_used_counter:1; /* device ring wrap counter */ + __u16 last_used_idx:15; /* used index */ +}; + +/* + * The information of a virtqueue, which is retrieved from + * VDUSE_VQ_GET_INFO ioctl. + */ +struct vduse_vq_info { + __u32 index; /* virtqueue index */ + __u32 num; /* the size of virtqueue */ + __u64 desc_addr; /* address of desc area */ + __u64 driver_addr; /* address of driver area */ + __u64 device_addr; /* address of device area */ + union { + struct vduse_vq_state_split split; /* split virtqueue state */ + struct vduse_vq_state_packed packed; /* packed virtqueue state */ + }; + __u8 ready; /* ready status of virtqueue */ +}; + +/* Get the specified virtqueue's information. Caller should set index field. */ +#define VDUSE_VQ_GET_INFO _IOWR(VDUSE_BASE, 0x15, struct vduse_vq_info) + +/* + * The eventfd configuration for the specified virtqueue. It's used by + * VDUSE_VQ_SETUP_KICKFD ioctl to setup kick eventfd. + */ +struct vduse_vq_eventfd { + __u32 index; /* virtqueue index */ +#define VDUSE_EVENTFD_DEASSIGN -1 + int fd; /* eventfd, -1 means de-assigning the eventfd */ +}; + +/* + * Setup kick eventfd for specified virtqueue. The kick eventfd is used + * by VDUSE kernel module to notify userspace to consume the avail vring. + */ +#define VDUSE_VQ_SETUP_KICKFD _IOW(VDUSE_BASE, 0x16, struct vduse_vq_eventfd) + +/* + * Inject an interrupt for specific virtqueue. It's used to notify virtio driver + * to consume the used vring. + */ +#define VDUSE_VQ_INJECT_IRQ _IOW(VDUSE_BASE, 0x17, __u32) + +/* The control messages definition for read/write on /dev/vduse/$NAME */ + +enum vduse_req_type { + /* Get the state for specified virtqueue from userspace */ + VDUSE_GET_VQ_STATE, + /* Set the device status */ + VDUSE_SET_STATUS, + /* + * Notify userspace to update the memory mapping for specified + * IOVA range via VDUSE_IOTLB_GET_FD ioctl + */ + VDUSE_UPDATE_IOTLB, +}; + +struct vduse_vq_state { + __u32 index; /* virtqueue index */ + union { + struct vduse_vq_state_split split; /* split virtqueue state */ + struct vduse_vq_state_packed packed; /* packed virtqueue state */ + }; +}; + +struct vduse_dev_status { + __u8 status; /* device status */ +}; + +struct vduse_iova_range { + __u64 start; /* start of the IOVA range: [start, end] */ + __u64 last; /* last of the IOVA range: [start, end] */ +}; + +struct vduse_dev_request { + __u32 type; /* request type */ + __u32 request_id; /* request id */ + __u32 reserved[2]; /* for future use */ + union { + struct vduse_vq_state vq_state; /* virtqueue state, only use index */ + struct vduse_dev_status s; /* device status */ + struct vduse_iova_range iova; /* IOVA range for updating */ + __u32 padding[16]; /* padding */ + }; +}; + +struct vduse_dev_response { + __u32 request_id; /* corresponding request id */ +#define VDUSE_REQ_RESULT_OK 0x00 +#define VDUSE_REQ_RESULT_FAILED 0x01 + __u32 result; /* the result of request */ + __u32 reserved[2]; /* for future use */ + union { + struct vduse_vq_state vq_state; /* virtqueue state */ + __u32 padding[16]; /* padding */ + }; +}; + +#endif /* _UAPI_VDUSE_H_ */ From patchwork Tue Jul 13 08:46:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 12373201 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B357C07E96 for ; Tue, 13 Jul 2021 08:49:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 362E661374 for ; Tue, 13 Jul 2021 08:49:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235532AbhGMIwN (ORCPT ); Tue, 13 Jul 2021 04:52:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234839AbhGMIvo (ORCPT ); Tue, 13 Jul 2021 04:51:44 -0400 Received: from mail-pj1-x102e.google.com (mail-pj1-x102e.google.com [IPv6:2607:f8b0:4864:20::102e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7A99CC061794 for ; Tue, 13 Jul 2021 01:48:27 -0700 (PDT) Received: by mail-pj1-x102e.google.com with SMTP id p14-20020a17090ad30eb02901731c776526so973986pju.4 for ; Tue, 13 Jul 2021 01:48:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uMm6a9p0EzX9DUYqGCBIk5+JBvK8mtKaW/5v/siepz8=; b=FwrQiD+0uRPLUN+6Ju88xP5C02Bvk9OqVShw9WcsJdiLxu+YCLfDld1XCGfjLU1nSl 918yKE4A5ColszxQDG/4pM1sXcDtN4ghfirvX+RsqtkNJyYamqqNi2VjbvDtJuJMdMEH DBnVAKFBCd1A3Mu8H5eyl/2LzIPpp/+RJW7nXs1sv+9wD9NX2G2ZW51nr6oAKS/uh/Re no24BT5m/sJN3evL+WSKCld+Bzy52NIw84s+0QKBgTE+n5jtfL7elaULaVgn89viBuwz oYoSGCLRyfky8JDQWJQ44g6XrECuYo7lecJLBNwO2UB7NYtQMZoIpWP32eSw3XP6Zz0y tt/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uMm6a9p0EzX9DUYqGCBIk5+JBvK8mtKaW/5v/siepz8=; b=QYln1KI6b/HTd/M4VdJcS2KQ1ntIIlQdCezYbdW8d1jnxseKK1Y3dc3nMkvnJt6X8N +3qqrFeSCgpLN+l9I83UC0/YXVz3hJwgLpznh3FHwbI6aaCK+B6A6wakG/kKJaEhh0Er 0zP0PH2SMwPvkIMguSj9EzDdSFSrXtz8B1PR0s/kIEYM8MLEEddCVbmOTNHmBSvqfxq8 le55+s7XRs8TgW5MyxSVYLMSzNSFqSIaS2SnQcLMHCrCrBzH9V7FQy30DkM4sHuoOCkd 4Md2S5Edmp23JAb6yxYncuSizAf3xkMpbPnxEUaYEGabju08ng5DHZ+g5pNlPMtUJ1C9 d9ow== X-Gm-Message-State: AOAM530+SUgdAegdpCm5xBr9FjooPfi/TW8jz90B5FW5Oj12hiNcRQOU p+hHNxSb6+X60gXdm3CsJpx5 X-Google-Smtp-Source: ABdhPJy41kKAoDHW9SvVVXjbdWKoO4Hb5f5MeAoUzDaUJYwvEJY6gTCT1v+c76dyqcdtmYhkN5UdhA== X-Received: by 2002:a17:90a:8585:: with SMTP id m5mr3319949pjn.224.1626166106915; Tue, 13 Jul 2021 01:48:26 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id a23sm17961927pff.43.2021.07.13.01.48.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:26 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 17/17] Documentation: Add documentation for VDUSE Date: Tue, 13 Jul 2021 16:46:56 +0800 Message-Id: <20210713084656.232-18-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org VDUSE (vDPA Device in Userspace) is a framework to support implementing software-emulated vDPA devices in userspace. This document is intended to clarify the VDUSE design and usage. Signed-off-by: Xie Yongji --- Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/vduse.rst | 248 ++++++++++++++++++++++++++++++++++ 2 files changed, 249 insertions(+) create mode 100644 Documentation/userspace-api/vduse.rst diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst index 0b5eefed027e..c432be070f67 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -27,6 +27,7 @@ place where this information is gathered. iommu media/index sysfs-platform_profile + vduse .. only:: subproject and html diff --git a/Documentation/userspace-api/vduse.rst b/Documentation/userspace-api/vduse.rst new file mode 100644 index 000000000000..2c0d56d4b2da --- /dev/null +++ b/Documentation/userspace-api/vduse.rst @@ -0,0 +1,248 @@ +================================== +VDUSE - "vDPA Device in Userspace" +================================== + +vDPA (virtio data path acceleration) device is a device that uses a +datapath which complies with the virtio specifications with vendor +specific control path. vDPA devices can be both physically located on +the hardware or emulated by software. VDUSE is a framework that makes it +possible to implement software-emulated vDPA devices in userspace. And +to make the device emulation more secure, the emulated vDPA device's +control path is handled in the kernel and only the data path is +implemented in the userspace. + +Note that only virtio block device is supported by VDUSE framework now, +which can reduce security risks when the userspace process that implements +the data path is run by an unprivileged user. The support for other device +types can be added after the security issue of corresponding device driver +is clarified or fixed in the future. + +Start/Stop VDUSE devices +------------------------ + +VDUSE devices are started as follows: + +1. Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on + /dev/vduse/control. + +2. Setup each virtqueue with ioctl(VDUSE_VQ_SETUP) on /dev/vduse/$NAME. + +3. Begin processing VDUSE messages from /dev/vduse/$NAME. The first + messages will arrive while attaching the VDUSE instance to vDPA bus. + +4. Send the VDPA_CMD_DEV_NEW netlink message to attach the VDUSE + instance to vDPA bus. + +VDUSE devices are stopped as follows: + +1. Send the VDPA_CMD_DEV_DEL netlink message to detach the VDUSE + instance from vDPA bus. + +2. Close the file descriptor referring to /dev/vduse/$NAME. + +3. Destroy the VDUSE instance with ioctl(VDUSE_DESTROY_DEV) on + /dev/vduse/control. + +The netlink messages can be sent via vdpa tool in iproute2 or use the +below sample codes: + +.. code-block:: c + + static int netlink_add_vduse(const char *name, enum vdpa_command cmd) + { + struct nl_sock *nlsock; + struct nl_msg *msg; + int famid; + + nlsock = nl_socket_alloc(); + if (!nlsock) + return -ENOMEM; + + if (genl_connect(nlsock)) + goto free_sock; + + famid = genl_ctrl_resolve(nlsock, VDPA_GENL_NAME); + if (famid < 0) + goto close_sock; + + msg = nlmsg_alloc(); + if (!msg) + goto close_sock; + + if (!genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, famid, 0, 0, cmd, 0)) + goto nla_put_failure; + + NLA_PUT_STRING(msg, VDPA_ATTR_DEV_NAME, name); + if (cmd == VDPA_CMD_DEV_NEW) + NLA_PUT_STRING(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, "vduse"); + + if (nl_send_sync(nlsock, msg)) + goto close_sock; + + nl_close(nlsock); + nl_socket_free(nlsock); + + return 0; + nla_put_failure: + nlmsg_free(msg); + close_sock: + nl_close(nlsock); + free_sock: + nl_socket_free(nlsock); + return -1; + } + +How VDUSE works +--------------- + +As mentioned above, a VDUSE device is created by ioctl(VDUSE_CREATE_DEV) on +/dev/vduse/control. With this ioctl, userspace can specify some basic configuration +such as device name (uniquely identify a VDUSE device), virtio features, virtio +configuration space, bounce buffer size and so on for this emulated device. Then +a char device interface (/dev/vduse/$NAME) is exported to userspace for device +emulation. Userspace can use the VDUSE_VQ_SETUP ioctl on /dev/vduse/$NAME to +add per-virtqueue configuration such as the max size of virtqueue to the device. + +After the initialization, the VDUSE device can be attached to vDPA bus via +the VDPA_CMD_DEV_NEW netlink message. Userspace needs to read()/write() on +/dev/vduse/$NAME to receive/reply some control messages from/to VDUSE kernel +module as follows: + +.. code-block:: c + + static int vduse_message_handler(int dev_fd) + { + int len; + struct vduse_dev_request req; + struct vduse_dev_response resp; + + len = read(dev_fd, &req, sizeof(req)); + if (len != sizeof(req)) + return -1; + + resp.request_id = req.request_id; + + switch (req.type) { + + /* handle different types of message */ + + } + + len = write(dev_fd, &resp, sizeof(resp)); + if (len != sizeof(resp)) + return -1; + + return 0; + } + +There are now three types of messages introduced by VDUSE framework: + +- VDUSE_GET_VQ_STATE: Get the state for virtqueue, userspace should return + avail index for split virtqueue or the device/driver ring wrap counters and + the avail and used index for packed virtqueue. + +- VDUSE_SET_STATUS: Set the device status, userspace should follow + the virtio spec: https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html + to process this message. For example, fail to set the FEATURES_OK device + status bit if the device can not accept the negotiated virtio features + get from the VDUSE_GET_FEATURES ioctl. + +- VDUSE_UPDATE_IOTLB: Notify userspace to update the memory mapping for specified + IOVA range, userspace should firstly remove the old mapping, then setup the new + mapping via the VDUSE_IOTLB_GET_FD ioctl. + +After DRIVER_OK status bit is set via the VDUSE_SET_STATUS message, userspace is +able to start the dataplane processing with the help of below ioctls: + +- VDUSE_IOTLB_GET_FD: Find the first IOVA region that overlaps with the specified + range [start, last] and return the corresponding file descriptor. In vhost-vdpa + cases, it might be a full chunk of guest RAM. And in virtio-vdpa cases, it should + be the whole bounce buffer or the memory region that stores one virtqueue's + metadata (descriptor table, available ring and used ring). Userspace can access + this IOVA region by passing fd and corresponding size, offset, perm to mmap(). + For example: + +.. code-block:: c + + static int perm_to_prot(uint8_t perm) + { + int prot = 0; + + switch (perm) { + case VDUSE_ACCESS_WO: + prot |= PROT_WRITE; + break; + case VDUSE_ACCESS_RO: + prot |= PROT_READ; + break; + case VDUSE_ACCESS_RW: + prot |= PROT_READ | PROT_WRITE; + break; + } + + return prot; + } + + static void *iova_to_va(int dev_fd, uint64_t iova, uint64_t *len) + { + int fd; + void *addr; + size_t size; + struct vduse_iotlb_entry entry; + + entry.start = iova; + entry.last = iova; + fd = ioctl(dev_fd, VDUSE_IOTLB_GET_FD, &entry); + if (fd < 0) + return NULL; + + size = entry.last - entry.start + 1; + *len = entry.last - iova + 1; + addr = mmap(0, size, perm_to_prot(entry.perm), MAP_SHARED, + fd, entry.offset); + close(fd); + if (addr == MAP_FAILED) + return NULL; + + /* + * Using some data structures such as linked list to store + * the iotlb mapping. The munmap(2) should be called for the + * cached mapping when the corresponding VDUSE_UPDATE_IOTLB + * message is received or the device is reset. + */ + + return addr + iova - entry.start; + } + +- VDUSE_VQ_GET_INFO: Get the specified virtqueue's information including the size, + the IOVAs of descriptor table, available ring and used ring, the state + and the ready status. The IOVAs should be passed to the VDUSE_IOTLB_GET_FD ioctl + so that userspace can access the descriptor table, available ring and used ring. + +- VDUSE_VQ_SETUP_KICKFD: Setup the kick eventfd for the specified virtqueues. + The kick eventfd is used by VDUSE kernel module to notify userspace to consume + the available ring. + +- VDUSE_INJECT_VQ_IRQ: Inject an interrupt for specific virtqueue. It's used to + notify virtio driver to consume the used ring. + +More details on the uAPI can be found in include/uapi/linux/vduse.h. + +MMU-based IOMMU Driver +---------------------- + +VDUSE framework implements an MMU-based on-chip IOMMU driver to support +mapping the kernel DMA buffer into the userspace IOVA region dynamically. +This is mainly designed for virtio-vdpa case (kernel virtio drivers). + +The basic idea behind this driver is treating MMU (VA->PA) as IOMMU (IOVA->PA). +The driver will set up MMU mapping instead of IOMMU mapping for the DMA transfer +so that the userspace process is able to use its virtual address to access +the DMA buffer in kernel. + +And to avoid security issue, a bounce-buffering mechanism is introduced to +prevent userspace accessing the original buffer directly which may contain other +kernel data. During the mapping, unmapping, the driver will copy the data from +the original buffer to the bounce buffer and back, depending on the direction of +the transfer. And the bounce-buffer addresses will be mapped into the user address +space instead of the original one.