From patchwork Fri Feb 15 21:46:28 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 2150241
Subject: [PATCH v2] vfio-pci: Manage user power state transitions
To: alex.williamson@redhat.com
From: Alex Williamson
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Date: Fri, 15 Feb 2013 14:46:28 -0700
Message-ID: <20130215214510.7117.97846.stgit@bling.home>
In-Reply-To: <20130215191228.27383.1844.stgit@bling.home>
References: <20130215191228.27383.1844.stgit@bling.home>
User-Agent: StGit/0.16
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

We give the user access to change the power state of the device, but
certain transitions result in an uninitialized state which the user
cannot resolve.  To fix this we need to mark the PowerState field of
the PMCSR register read-only and effect the requested change on behalf
of the user.  This has the added benefit that pdev->current_state
remains accurate while controlled by the user.

The primary example of this bug is a QEMU guest doing a reboot, where
the device is put into D3 on shutdown and becomes unusable on the next
boot because the device did a soft reset on D3->D0 (NoSoftRst-).

Signed-off-by: Alex Williamson
---

v2: sparse found a type error when calling pci_set_power_state().
    Fixed here.

 drivers/vfio/pci/vfio_pci_config.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index f1dde2c..1fd1895 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -587,12 +587,30 @@ static int __init init_pci_cap_basic_perm(struct perm_bits *perm)
 	return 0;
 }
 
+static int vfio_pm_config_write(struct vfio_pci_device *vdev, int pos,
+				int count, struct perm_bits *perm,
+				int offset, __le32 val)
+{
+	count = vfio_default_config_write(vdev, pos, count, perm, offset, val);
+	if (count < 0)
+		return count;
+
+	if (offset == PCI_PM_CTRL) {
+		pci_power_t state = le32_to_cpu(val) & PCI_PM_CTRL_STATE_MASK;
+		pci_set_power_state(vdev->pdev, state);
+	}
+
+	return count;
+}
+
 /* Permissions for the Power Management capability */
 static int __init init_pci_cap_pm_perm(struct perm_bits *perm)
 {
 	if (alloc_perm_bits(perm, pci_cap_length[PCI_CAP_ID_PM]))
 		return -ENOMEM;
 
+	perm->writefn = vfio_pm_config_write;
+
 	/*
 	 * We always virtualize the next field so we can remove
 	 * capabilities from the chain if we want to.
@@ -600,10 +618,11 @@ static int __init init_pci_cap_pm_perm(struct perm_bits *perm)
 	p_setb(perm, PCI_CAP_LIST_NEXT, (u8)ALL_VIRT, NO_WRITE);
 
 	/*
-	 * Power management is defined *per function*,
-	 * so we let the user write this
+	 * Power management is defined *per function*, so we can let
+	 * the user change power state, but we trap and initiate the
+	 * change ourselves, so the state bits are read-only.
 	 */
-	p_setd(perm, PCI_PM_CTRL, NO_VIRT, ALL_WRITE);
+	p_setd(perm, PCI_PM_CTRL, NO_VIRT, ~PCI_PM_CTRL_STATE_MASK);
 
 	return 0;
 }
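
For readers who want to see the masking idea outside the kernel, the following is a minimal user-space sketch of what the patch does, not the vfio code itself. Everything here (`struct model_dev`, `model_set_power_state()`, `model_pm_ctrl_write()`) is hypothetical and exists only for illustration. The two constants are assumed to mirror the kernel's `PCI_PM_CTRL_STATE_MASK` and `PCI_PM_CTRL_PME_ENABLE` values. The user's write only lands on bits outside the PowerState field, and the host then carries out the requested transition itself so its cached state stays in sync.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, assumed to match the PCI PM capability layout. */
#define PM_CTRL_STATE_MASK 0x0003u /* PowerState field, bits 1:0 of PMCSR */
#define PM_CTRL_PME_ENABLE 0x0100u /* example of a bit the user may still write */

/* Hypothetical model of a device; not a kernel structure. */
struct model_dev {
	uint16_t pmcsr;         /* stand-in for the physical PMCSR register */
	unsigned current_state; /* stand-in for pdev->current_state, 0 = D0 .. 3 = D3hot */
	unsigned transitions;   /* how many state changes the host performed */
};

/* Stand-in for pci_set_power_state(): the host performs and records the change. */
static void model_set_power_state(struct model_dev *dev, unsigned state)
{
	assert(state <= 3);
	if (state == dev->current_state)
		return;
	dev->pmcsr = (uint16_t)((dev->pmcsr & ~PM_CTRL_STATE_MASK) | state);
	dev->current_state = state;
	dev->transitions++;
}

/*
 * Write handler: apply only the bits the write mask allows (everything
 * except PowerState), then service the requested state on the user's behalf.
 */
static void model_pm_ctrl_write(struct model_dev *dev, uint16_t val)
{
	const uint16_t write_mask = (uint16_t)~PM_CTRL_STATE_MASK;

	dev->pmcsr = (uint16_t)((dev->pmcsr & ~write_mask) | (val & write_mask));
	model_set_power_state(dev, val & PM_CTRL_STATE_MASK);
}
```

In this model, a write of `PM_CTRL_PME_ENABLE | 3` sets the PME enable bit through the masked write, while the D3 request is applied only by `model_set_power_state()`, so `current_state` never drifts from the register contents.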