From patchwork Fri Jun 8 19:23:00 2018
X-Patchwork-Submitter: Eduardo Habkost
X-Patchwork-Id: 10455069
Date: Fri, 8 Jun 2018 16:23:00 -0300
From: Eduardo Habkost
To: "Moger, Babu"
Cc: "mst@redhat.com", "marcel.apfelbaum@gmail.com", "pbonzini@redhat.com",
 "rth@twiddle.net", "mtosatti@redhat.com", "qemu-devel@nongnu.org",
 "kvm@vger.kernel.org", "kash@tripleback.net", "geoff@hostfission.com",
 Juan Quintela, xiaoguangrong@tencent.com
Subject: Re: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC CPU
Message-ID: <20180608192300.GN7451@localhost.localdomain>
References: <1528295806-90593-1-git-send-email-babu.moger@amd.com>
 <1528295806-90593-4-git-send-email-babu.moger@amd.com>
 <20180606223956.GD7451@localhost.localdomain>
X-Mailing-List: kvm@vger.kernel.org

On Fri, Jun 08, 2018 at 06:40:16PM +0000, Moger, Babu wrote:
> Hi Eduardo,
> Sorry for the late response. Got pulled into something else.
>
> > -----Original Message-----
> > From: Eduardo Habkost [mailto:ehabkost@redhat.com]
> > Sent: Wednesday, June 6, 2018 5:40 PM
> > To: Moger, Babu
> > Cc: mst@redhat.com; marcel.apfelbaum@gmail.com; pbonzini@redhat.com;
> > rth@twiddle.net; mtosatti@redhat.com; qemu-devel@nongnu.org;
> > kvm@vger.kernel.org; kash@tripleback.net; geoff@hostfission.com
> > Subject: Re: [PATCH v12 3/4] i386: Enable TOPOEXT feature on AMD EPYC
> > CPU
> >
> > On Wed, Jun 06, 2018 at 10:36:45AM -0400, Babu Moger wrote:
> > > Enable TOPOEXT feature on EPYC CPU. This is required to support
> > > hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
> > >
> > > Disable TOPOEXT feature for legacy machines.
> > >
> > > Signed-off-by: Babu Moger
> >
> > Now, I just noticed we have a problem here:
> >
> > "-machine pc -cpu EPYC -smp 64" works today
> >
> > This patch makes it stop working, but it shouldn't.
>
> No. It works fine. I have tested it.

This doesn't sound right.  The code in this series will error out
if TOPOEXT is enabled and you have more than 64 VCPUs.

But I just noticed we have a bug introduced by:

commit f548222c24342ca74689de7794f9006b43f86a54
Author: Xiao Guangrong
Date:   Thu May 3 16:06:11 2018 +0800

    migration: introduce decompress-error-check

    QEMU 3.0 enables strict check for compression & decompression to
    make the migration more robust, that depends on the source to fix
    the internal design which triggers the unexpected error conditions

    To make it work for migrating old version QEMU to 2.13 QEMU, we
    introduce this parameter to disable the error check on the
    destination which is the default behavior of the machine type
    which is older than 2.13, alternately, the strict check can be
    enabled explicitly as followings:

          -M pc-q35-2.11 -global migration.decompress-error-check=true

    Signed-off-by: Xiao Guangrong
    Reviewed-by: Juan Quintela
    Signed-off-by: Juan Quintela

This commit added PC_COMPAT_2_12 to the 3.0 machine-types.
Because of this bug, TOPOEXT is being unconditionally disabled on
all machine-types, unless I apply the fix below (the diff at the
end of this message).

>
> >
> > On the other hand, I believe you expect:
> > * "-machine pc -cpu EPYC -smp 8" to automatically enable topoext.
> Yes. Only on new machines-types
> > * "-machine pc -cpu Opteron_G1 -smp 8" to not enable topoext.
> Yes.
> > * What about "-machine -cpu Opteron_G1 -smp 8,threads=2"?
> No. This should not enable topoext. Topoext is not supported by Opteron_G1.
> This should warn about hyperthreading and continue.

OK, makes sense to me.

> >
> >
> > We also have other requirements, I will try to enumerate all of
> > them below:
> >
> > 0) "-topoext" explicitly configured (any machine-type):
> >   * Must never enable topoext.
> Yes.
> >
> > 1) "+topoext" explicitly configured (any machine-type):
> >   * Must validate topology and refuse to start if unsupported.
>
> Yes.
>
> >
> > 2) Older machine-types:
> >   * Must never enable topoext automatically, even if using "EPYC"
> >     or "threads=2"
> >
> Yes.
>
> > 3) "EPYC" CPU model (on new machine-types):
> >   * Should enable topoext automatically, but only if topology is
> >     supported.
> >   * Must not error out if topology is not supported.
> In new machine types we will enable topoext for "EPYC" CPU model.
> Right now(old machine type) we can disable for all the CPU models.
> So, we don't need two bits(topoext and auto-topoext)

Right, so you agree that in this case we must _not_ error out if
topology is unsupported, correct?  Otherwise we will break this
existing use case: "-machine pc -cpu EPYC -smp 64".

>
> I thought we should error out if topology cannot be supported. But we can warn(disable topoext) and continue that is another option.
>
> >   * Should this enable topoext automatically even if threads=1?
>
> Yes. We should enable even with threads=1.
>
> >
> > 4) Other AMD CPU models with "threads=2" (on new machine-types):
> >   * We might want to make this enable topoext automatically, too.
> >     What do you think?
>
> No. We should not enable topoext here. We should depend on CPU model table here.
>
> >
> > Is the above description accurate?  Do you agree with these
> > requirements?
>
> With these requirements in mind, I will send that patches. We can start our discussion.
> We don't need one more bits. That is my opinion.

Thanks for confirming the requirements above.  But it doesn't
seem to be possible to represent these requirements with just one
bit.  Otherwise you can't differentiate an explicit "+topoext" (1
above), which must refuse to start on an unsupported topology,
from topoext being implicitly enabled by "-cpu EPYC" (3 above),
which must not error out.

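To make the "one bit is not enough" argument concrete, here is a
minimal sketch of requirements 0-4 as a decision function.  It is
not code from this series: the type names, the resolve_topoext()
helper and the "auto_topoext" field are hypothetical, and
"topology_supported" simply stands for whatever validation the
series performs on the -smp topology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: what the user asked for explicitly on the command line. */
typedef enum {
    TOPOEXT_USER_UNSET,  /* neither "+topoext" nor "-topoext" given */
    TOPOEXT_USER_ON,     /* explicit "+topoext" (requirement 1) */
    TOPOEXT_USER_OFF     /* explicit "-topoext" (requirement 0) */
} TopoextUserSetting;

typedef enum {
    TOPOEXT_RESULT_OFF,    /* start the guest without topoext */
    TOPOEXT_RESULT_ON,     /* start the guest with topoext */
    TOPOEXT_RESULT_ERROR   /* refuse to start */
} TopoextResult;

typedef struct {
    TopoextUserSetting user;
    /* Hypothetical second bit: set only by the CPU model table (EPYC)
     * on new machine-types; false on older machine-types (requirement 2)
     * and for other AMD models (requirement 4). */
    bool auto_topoext;
    /* Assumed to be false when the requested topology cannot be
     * described through TOPOEXT. */
    bool topology_supported;
} TopoextConfig;

TopoextResult resolve_topoext(const TopoextConfig *cfg)
{
    assert(cfg != NULL);

    switch (cfg->user) {
    case TOPOEXT_USER_OFF:
        /* Requirement 0: never enable. */
        return TOPOEXT_RESULT_OFF;
    case TOPOEXT_USER_ON:
        /* Requirement 1: validate and refuse to start if unsupported. */
        return cfg->topology_supported ? TOPOEXT_RESULT_ON
                                       : TOPOEXT_RESULT_ERROR;
    case TOPOEXT_USER_UNSET:
        break;
    }

    /* Requirement 3: enable automatically, but silently fall back to
     * "off" (a real implementation might warn) instead of erroring out. */
    if (cfg->auto_topoext && cfg->topology_supported) {
        return TOPOEXT_RESULT_ON;
    }
    return TOPOEXT_RESULT_OFF;
}
```

With an unsupported topology, { TOPOEXT_USER_ON, false, false }
resolves to an error while { TOPOEXT_USER_UNSET, true, false }
resolves to "off".  Both cases would look identical if only a
single "topoext" bit were stored.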
Another problem is the query-cpu-model-expansion QMP command: this
patch makes "topoext" appear in the output of
"query-cpu-model-expansion model=EPYC", meaning that management
software will assume everybody using the "EPYC" CPU model will
require +topoext.  A separate "auto-topoext" property would avoid
this issue.

(Yeah, this is tricky.  I want to eventually encode these subtle
rules in automated test cases, so these issues could be detected
by software instead of code inspection.)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 3d81136065..b4c5b03274 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -430,7 +430,6 @@ static void pc_i440fx_3_0_machine_options(MachineClass *m)
     pc_i440fx_machine_options(m);
     m->alias = "pc";
     m->is_default = 1;
-    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
 }
 
 DEFINE_I440FX_MACHINE(v3_0, "pc-i440fx-3.0", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b60cbb9266..83d6d75efa 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -312,7 +312,6 @@ static void pc_q35_3_0_machine_options(MachineClass *m)
 {
     pc_q35_machine_options(m);
     m->alias = "q35";
-    SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
 }
 
 DEFINE_Q35_MACHINE(v3_0, "pc-q35-3.0", NULL,
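
For readers unfamiliar with compat properties, the effect of the
SET_MACHINE_COMPAT line removed above can be illustrated with a
toy model.  This is a simplified sketch, not QEMU's actual data
structures: the CompatProp and MachineType types and the
machine_forces_off() helper are hypothetical, and it assumes that
this series adds a "topoext=off" entry to PC_COMPAT_2_12.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one compat property entry. */
typedef struct {
    const char *property;
    const char *value;
} CompatProp;

/* Assumption: the series disables topoext for legacy machine-types
 * by adding an entry like this to the 2.12 compat list. */
static const CompatProp compat_2_12[] = {
    { "topoext", "off" },
};

/* Hypothetical machine-type: a name plus the compat list applied to it. */
typedef struct {
    const char *name;
    const CompatProp *compat;
    size_t n_compat;
} MachineType;

/* Legacy machine-type: the 2.12 compat list is expected here. */
const MachineType legacy_2_12 = {
    "pc-i440fx-2.12", compat_2_12,
    sizeof(compat_2_12) / sizeof(compat_2_12[0])
};

/* 3.0 machine-type with the stray compat list still attached. */
const MachineType buggy_3_0 = {
    "pc-i440fx-3.0", compat_2_12,
    sizeof(compat_2_12) / sizeof(compat_2_12[0])
};

/* 3.0 machine-type after the fix: no legacy compat list. */
const MachineType fixed_3_0 = { "pc-i440fx-3.0", NULL, 0 };

/* Returns true if the machine-type's compat list forces the given
 * property to "off". */
bool machine_forces_off(const MachineType *m, const char *property)
{
    for (size_t i = 0; i < m->n_compat; i++) {
        if (strcmp(m->compat[i].property, property) == 0 &&
            strcmp(m->compat[i].value, "off") == 0) {
            return true;
        }
    }
    return false;
}
```

In this model machine_forces_off(&buggy_3_0, "topoext") is true,
just as it is for legacy_2_12, which matches the symptom described
above: topoext ends up disabled on every machine-type.  Only
fixed_3_0 leaves the decision to the CPU model.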