Message ID | 20200513221557.14366-3-walling@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Use DIAG318 to set Control Program Name & Version Codes | expand |
Hi Collin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on kvms390/next] [also build test WARNING on s390/features v5.7-rc5] [cannot apply to kvm/linux-next next-20200512] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Collin-Walling/Use-DIAG318-to-set-Control-Program-Name-Version-Codes/20200514-061856 base: https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git next config: s390-allyesconfig (attached as .config) compiler: s390-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=s390 If you fix the issue, kindly add following tag as appropriate Reported-by: kbuild test robot <lkp@intel.com> All warnings (new ones prefixed by >>): In file included from arch/s390/include/asm/kvm_host.h:22, from include/linux/kvm_host.h:36, from arch/s390/kvm/kvm-s390.c:23: arch/s390/kvm/kvm-s390.c: In function 'kvm_s390_set_diag318_info': >> arch/s390/kvm/kvm-s390.c:1253:19: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type 'long unsigned int:56' [-Wformat=] 1253 | VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1254 | kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | long unsigned int:56 arch/s390/include/asm/debug.h:247:12: note: in definition of macro 'debug_sprintf_event' 247 | _fmt, ## __VA_ARGS__); | ^~~~ >> arch/s390/kvm/kvm-s390.c:1253:2: note: in expansion of macro 'VM_EVENT' 1253 | VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", | ^~~~~~~~ arch/s390/kvm/kvm-s390.c:1253:47: note: format string is defined here 1253 | VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", | ~~~^ | | | long long unsigned int vim +1253 arch/s390/kvm/kvm-s390.c 1245 1246 void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) 1247 { 1248 struct kvm_vcpu *vcpu; 1249 int i; 1250 1251 kvm->arch.diag318_info.val = info; 1252 > 1253 VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", 1254 kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); 1255 1256 if (sclp.has_diag318) { 1257 kvm_for_each_vcpu(i, vcpu, kvm) { 1258 vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; 1259 } 1260 } 1261 } 1262 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 14/05/2020 00.15, Collin Walling wrote: > DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must > be intercepted by SIE and handled via KVM. Let's introduce some > functions to communicate between userspace and KVM via ioctls. These > will be used to get/set the diag318 related information, as well as > check the system if KVM supports handling this instruction. > > This information can help with diagnosing the environment the VM is > running in (Linux, z/VM, etc) if the OS calls this instruction. > > By default, this feature is disabled and can only be enabled if a > user space program (such as QEMU) explicitly requests it. > > The Control Program Name Code (CPNC) is stored in the SIE block > and a copy is retained in each VCPU. The Control Program Version > Code (CPVC) is not designed to be stored in the SIE block, so we > retain a copy in each VCPU next to the CPNC. > > Signed-off-by: Collin Walling <walling@linux.ibm.com> > --- > Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ > arch/s390/include/asm/kvm_host.h | 6 +- > arch/s390/include/uapi/asm/kvm.h | 5 ++ > arch/s390/kvm/diag.c | 20 ++++++ > arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ > arch/s390/kvm/kvm-s390.h | 1 + > arch/s390/kvm/vsie.c | 2 + > 7 files changed, 151 insertions(+), 1 deletion(-) [...] > diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c > index 563429dece03..3caed4b880c8 100644 > --- a/arch/s390/kvm/diag.c > +++ b/arch/s390/kvm/diag.c > @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) > return ret < 0 ? ret : 0; > } > > +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) > +{ > + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; > + u64 info = vcpu->run->s.regs.gprs[reg]; > + > + if (!vcpu->kvm->arch.use_diag318) > + return -EOPNOTSUPP; > + > + vcpu->stat.diagnose_318++; > + kvm_s390_set_diag318_info(vcpu->kvm, info); > + > + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", > + vcpu->kvm->arch.diag318_info.cpnc, > + (u64)vcpu->kvm->arch.diag318_info.cpvc); > + > + return 0; > +} > + > int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) > { > int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; > @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) > return __diag_page_ref_service(vcpu); > case 0x308: > return __diag_ipl_functions(vcpu); > + case 0x318: > + return __diag_set_diag318_info(vcpu); > case 0x500: > return __diag_virtio_hypercall(vcpu); I wonder whether it would make more sense to simply drop to userspace and handle the diag 318 call there? That way the userspace would always be up-to-date, and as we've seen in the past (e.g. with the various SIGP handling), it's better if the userspace is in control... e.g. userspace could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the guest just executed the diag 318 instruction. And you need the kvm_s390_vm_get/set_misc functions anyway, so these could also be simply used by the diag 318 handler in userspace? > default: > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > index d05bb040fd42..c3eee468815f 100644 > --- a/arch/s390/kvm/kvm-s390.c > +++ b/arch/s390/kvm/kvm-s390.c > @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { > { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, > { "instruction_diag_258", VCPU_STAT(diagnose_258) }, > { "instruction_diag_308", VCPU_STAT(diagnose_308) }, > + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, > { "instruction_diag_500", VCPU_STAT(diagnose_500) }, > { "instruction_diag_other", VCPU_STAT(diagnose_other) }, > { NULL } > @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) > return ret; > } > > +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) > +{ > + struct kvm_vcpu *vcpu; > + int i; > + > + kvm->arch.diag318_info.val = info; > + > + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", > + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); > + > + if (sclp.has_diag318) { > + kvm_for_each_vcpu(i, vcpu, kvm) { > + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; > + } > + } > +} > + > +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) > +{ > + int ret; > + u64 diag318_info; > + > + switch (attr->attr) { > + case KVM_S390_VM_MISC_ENABLE_DIAG318: > + kvm->arch.use_diag318 = 1; > + ret = 0; > + break; Would it make sense to set kvm->arch.use_diag318 = 1 during the first execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? > + case KVM_S390_VM_MISC_DIAG318: > + ret = -EFAULT; > + if (!kvm->arch.use_diag318) > + return -EOPNOTSUPP; > + if (get_user(diag318_info, (u64 __user *)attr->addr)) > + break; > + kvm_s390_set_diag318_info(kvm, diag318_info); > + ret = 0; > + break; > + default: > + ret = -ENXIO; > + break; > + } > + return ret; > +} What about a reset of the guest VM? If a user first boots into a Linux kernel that supports diag 318, then reboots and selects a Linux kernel that does not support diag 318? I'd expect that the cpnc / cpnv values need to be cleared here somewhere? Otherwise the information might not be accurate anymore? Thomas
On 5/14/20 9:53 AM, Thomas Huth wrote: > On 14/05/2020 00.15, Collin Walling wrote: >> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >> be intercepted by SIE and handled via KVM. Let's introduce some >> functions to communicate between userspace and KVM via ioctls. These >> will be used to get/set the diag318 related information, as well as >> check the system if KVM supports handling this instruction. >> >> This information can help with diagnosing the environment the VM is >> running in (Linux, z/VM, etc) if the OS calls this instruction. >> >> By default, this feature is disabled and can only be enabled if a >> user space program (such as QEMU) explicitly requests it. >> >> The Control Program Name Code (CPNC) is stored in the SIE block >> and a copy is retained in each VCPU. The Control Program Version >> Code (CPVC) is not designed to be stored in the SIE block, so we >> retain a copy in each VCPU next to the CPNC. >> >> Signed-off-by: Collin Walling <walling@linux.ibm.com> >> --- >> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >> arch/s390/include/asm/kvm_host.h | 6 +- >> arch/s390/include/uapi/asm/kvm.h | 5 ++ >> arch/s390/kvm/diag.c | 20 ++++++ >> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >> arch/s390/kvm/kvm-s390.h | 1 + >> arch/s390/kvm/vsie.c | 2 + >> 7 files changed, 151 insertions(+), 1 deletion(-) > [...] >> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >> index 563429dece03..3caed4b880c8 100644 >> --- a/arch/s390/kvm/diag.c >> +++ b/arch/s390/kvm/diag.c >> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >> return ret < 0 ? ret : 0; >> } >> >> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >> +{ >> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >> + u64 info = vcpu->run->s.regs.gprs[reg]; >> + >> + if (!vcpu->kvm->arch.use_diag318) >> + return -EOPNOTSUPP; >> + >> + vcpu->stat.diagnose_318++; >> + kvm_s390_set_diag318_info(vcpu->kvm, info); >> + >> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >> + vcpu->kvm->arch.diag318_info.cpnc, >> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >> + >> + return 0; >> +} >> + >> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >> { >> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >> return __diag_page_ref_service(vcpu); >> case 0x308: >> return __diag_ipl_functions(vcpu); >> + case 0x318: >> + return __diag_set_diag318_info(vcpu); >> case 0x500: >> return __diag_virtio_hypercall(vcpu); > > I wonder whether it would make more sense to simply drop to userspace > and handle the diag 318 call there? That way the userspace would always > be up-to-date, and as we've seen in the past (e.g. with the various SIGP > handling), it's better if the userspace is in control... e.g. userspace > could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the > guest just executed the diag 318 instruction. > > And you need the kvm_s390_vm_get/set_misc functions anyway, so these > could also be simply used by the diag 318 handler in userspace? > >> default: >> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c >> index d05bb040fd42..c3eee468815f 100644 >> --- a/arch/s390/kvm/kvm-s390.c >> +++ b/arch/s390/kvm/kvm-s390.c >> @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { >> { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, >> { "instruction_diag_258", VCPU_STAT(diagnose_258) }, >> { "instruction_diag_308", VCPU_STAT(diagnose_308) }, >> + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, >> { "instruction_diag_500", VCPU_STAT(diagnose_500) }, >> { "instruction_diag_other", VCPU_STAT(diagnose_other) }, >> { NULL } >> @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) >> return ret; >> } >> >> +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) >> +{ >> + struct kvm_vcpu *vcpu; >> + int i; >> + >> + kvm->arch.diag318_info.val = info; >> + >> + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", >> + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); >> + >> + if (sclp.has_diag318) { >> + kvm_for_each_vcpu(i, vcpu, kvm) { >> + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; >> + } >> + } >> +} >> + >> +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) >> +{ >> + int ret; >> + u64 diag318_info; >> + >> + switch (attr->attr) { >> + case KVM_S390_VM_MISC_ENABLE_DIAG318: >> + kvm->arch.use_diag318 = 1; >> + ret = 0; >> + break; > > Would it make sense to set kvm->arch.use_diag318 = 1 during the first > execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get > along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? I'm not an expert in feature negotiation, but why isn't this a cpu feature like sief2 instead of a attribute? @David? > >> + case KVM_S390_VM_MISC_DIAG318: >> + ret = -EFAULT; >> + if (!kvm->arch.use_diag318) >> + return -EOPNOTSUPP; >> + if (get_user(diag318_info, (u64 __user *)attr->addr)) >> + break; >> + kvm_s390_set_diag318_info(kvm, diag318_info); >> + ret = 0; >> + break; >> + default: >> + ret = -ENXIO; >> + break; >> + } >> + return ret; >> +} > > What about a reset of the guest VM? If a user first boots into a Linux > kernel that supports diag 318, then reboots and selects a Linux kernel > that does not support diag 318? I'd expect that the cpnc / cpnv values > need to be cleared here somewhere? Otherwise the information might not > be accurate anymore? He resets via QEMU on a machine reset. > > Thomas >
On Wed, 13 May 2020 18:15:57 -0400 Collin Walling <walling@linux.ibm.com> wrote: > DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must > be intercepted by SIE and handled via KVM. Let's introduce some > functions to communicate between userspace and KVM via ioctls. These > will be used to get/set the diag318 related information, as well as > check the system if KVM supports handling this instruction. > > This information can help with diagnosing the environment the VM is > running in (Linux, z/VM, etc) if the OS calls this instruction. > > By default, this feature is disabled and can only be enabled if a > user space program (such as QEMU) explicitly requests it. > > The Control Program Name Code (CPNC) is stored in the SIE block > and a copy is retained in each VCPU. The Control Program Version > Code (CPVC) is not designed to be stored in the SIE block, so we > retain a copy in each VCPU next to the CPNC. > > Signed-off-by: Collin Walling <walling@linux.ibm.com> > --- > Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ > arch/s390/include/asm/kvm_host.h | 6 +- > arch/s390/include/uapi/asm/kvm.h | 5 ++ > arch/s390/kvm/diag.c | 20 ++++++ > arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ > arch/s390/kvm/kvm-s390.h | 1 + > arch/s390/kvm/vsie.c | 2 + > 7 files changed, 151 insertions(+), 1 deletion(-) > > diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst > index 0aa5b1cfd700..9344d45ace6d 100644 > --- a/Documentation/virt/kvm/devices/vm.rst > +++ b/Documentation/virt/kvm/devices/vm.rst > @@ -314,3 +314,32 @@ Allows userspace to query the status of migration mode. > if it is enabled > :Returns: -EFAULT if the given address is not accessible from kernel space; > 0 in case of success. > + > +6. GROUP: KVM_S390_VM_MISC This needs to be rstyfied, matching the remainder of the file. > +Architectures: s390 > + > + 6.1. KVM_S390_VM_MISC_ENABLE_DIAG318 > + > + Allows userspace to enable the DIAGNOSE 0x318 instruction call for a > + guest OS. By default, KVM will not allow this instruction to be executed > + by a guest, even if support is in place. Userspace must explicitly enable > + the instruction handling for DIAGNOSE 0x318 via this call. > + > + Parameters: none > + Returns: 0 after setting a flag telling KVM to enable this feature > + > + 6.2. KVM_S390_VM_MISC_DIAG318 (r/w) > + > + Allows userspace to retrieve and set the DIAGNOSE 0x318 information, > + which consists of a 1-byte "Control Program Name Code" and a 7-byte > + "Control Program Version Code" (a 64 bit value all in all). This > + information is set by the guest (usually during IPL). This interface is > + intended to allow retrieving and setting it during migration; while no > + real harm is done if the information is changed outside of migration, > + it is strongly discouraged. (Sorry if we discussed that already, but that was some time ago and the info has dropped out of my cache...) Had we considered doing this in userspace only? If QEMU wanted to emulate diag 318 in tcg, it would basically need to mirror what KVM does; diag 318 does not seem like something where we want to optimize for performance, so dropping to userspace seems feasible? We'd just need an interface for userspace to forward anything set by the guest. > + > + Parameters: address of a buffer in user space (u64), where the > + information is read from or stored into > + Returns: -EFAULT if the given address is not accessible from kernel space; > + -EOPNOTSUPP if feature has not been requested to be enabled first; > + 0 in case of success
On 14.05.20 10:52, Janosch Frank wrote: > On 5/14/20 9:53 AM, Thomas Huth wrote: >> On 14/05/2020 00.15, Collin Walling wrote: >>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>> be intercepted by SIE and handled via KVM. Let's introduce some >>> functions to communicate between userspace and KVM via ioctls. These >>> will be used to get/set the diag318 related information, as well as >>> check the system if KVM supports handling this instruction. >>> >>> This information can help with diagnosing the environment the VM is >>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>> >>> By default, this feature is disabled and can only be enabled if a >>> user space program (such as QEMU) explicitly requests it. >>> >>> The Control Program Name Code (CPNC) is stored in the SIE block >>> and a copy is retained in each VCPU. The Control Program Version >>> Code (CPVC) is not designed to be stored in the SIE block, so we >>> retain a copy in each VCPU next to the CPNC. >>> >>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>> --- >>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>> arch/s390/include/asm/kvm_host.h | 6 +- >>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>> arch/s390/kvm/diag.c | 20 ++++++ >>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>> arch/s390/kvm/kvm-s390.h | 1 + >>> arch/s390/kvm/vsie.c | 2 + >>> 7 files changed, 151 insertions(+), 1 deletion(-) >> [...] >>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>> index 563429dece03..3caed4b880c8 100644 >>> --- a/arch/s390/kvm/diag.c >>> +++ b/arch/s390/kvm/diag.c >>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>> return ret < 0 ? ret : 0; >>> } >>> >>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>> +{ >>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>> + >>> + if (!vcpu->kvm->arch.use_diag318) >>> + return -EOPNOTSUPP; >>> + >>> + vcpu->stat.diagnose_318++; >>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>> + >>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>> + vcpu->kvm->arch.diag318_info.cpnc, >>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >>> + >>> + return 0; >>> +} >>> + >>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>> { >>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>> return __diag_page_ref_service(vcpu); >>> case 0x308: >>> return __diag_ipl_functions(vcpu); >>> + case 0x318: >>> + return __diag_set_diag318_info(vcpu); >>> case 0x500: >>> return __diag_virtio_hypercall(vcpu); >> >> I wonder whether it would make more sense to simply drop to userspace >> and handle the diag 318 call there? That way the userspace would always >> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >> handling), it's better if the userspace is in control... e.g. userspace >> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >> guest just executed the diag 318 instruction. >> >> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >> could also be simply used by the diag 318 handler in userspace? >> >>> default: >>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c >>> index d05bb040fd42..c3eee468815f 100644 >>> --- a/arch/s390/kvm/kvm-s390.c >>> +++ b/arch/s390/kvm/kvm-s390.c >>> @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { >>> { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, >>> { "instruction_diag_258", VCPU_STAT(diagnose_258) }, >>> { "instruction_diag_308", VCPU_STAT(diagnose_308) }, >>> + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, >>> { "instruction_diag_500", VCPU_STAT(diagnose_500) }, >>> { "instruction_diag_other", VCPU_STAT(diagnose_other) }, >>> { NULL } >>> @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) >>> return ret; >>> } >>> >>> +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) >>> +{ >>> + struct kvm_vcpu *vcpu; >>> + int i; >>> + >>> + kvm->arch.diag318_info.val = info; >>> + >>> + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", >>> + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); >>> + >>> + if (sclp.has_diag318) { >>> + kvm_for_each_vcpu(i, vcpu, kvm) { >>> + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; >>> + } >>> + } >>> +} >>> + >>> +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) >>> +{ >>> + int ret; >>> + u64 diag318_info; >>> + >>> + switch (attr->attr) { >>> + case KVM_S390_VM_MISC_ENABLE_DIAG318: >>> + kvm->arch.use_diag318 = 1; >>> + ret = 0; >>> + break; >> >> Would it make sense to set kvm->arch.use_diag318 = 1 during the first >> execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get >> along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? > > I'm not an expert in feature negotiation, but why isn't this a cpu > feature like sief2 instead of a attribute? > > @David? In the end you want to have it somehow in the CPU model I guess. You cannot glue it to QEMU machines, because availability depends on HW+KVM support. How does the guest detect that it can use diag318? I assume/hope via a a STFLE feature.
On 5/14/20 11:37 AM, David Hildenbrand wrote: > On 14.05.20 10:52, Janosch Frank wrote: >> On 5/14/20 9:53 AM, Thomas Huth wrote: >>> On 14/05/2020 00.15, Collin Walling wrote: >>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>>> be intercepted by SIE and handled via KVM. Let's introduce some >>>> functions to communicate between userspace and KVM via ioctls. These >>>> will be used to get/set the diag318 related information, as well as >>>> check the system if KVM supports handling this instruction. >>>> >>>> This information can help with diagnosing the environment the VM is >>>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>>> >>>> By default, this feature is disabled and can only be enabled if a >>>> user space program (such as QEMU) explicitly requests it. >>>> >>>> The Control Program Name Code (CPNC) is stored in the SIE block >>>> and a copy is retained in each VCPU. The Control Program Version >>>> Code (CPVC) is not designed to be stored in the SIE block, so we >>>> retain a copy in each VCPU next to the CPNC. >>>> >>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>>> --- >>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>>> arch/s390/include/asm/kvm_host.h | 6 +- >>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>>> arch/s390/kvm/diag.c | 20 ++++++ >>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>>> arch/s390/kvm/kvm-s390.h | 1 + >>>> arch/s390/kvm/vsie.c | 2 + >>>> 7 files changed, 151 insertions(+), 1 deletion(-) >>> [...] >>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>>> index 563429dece03..3caed4b880c8 100644 >>>> --- a/arch/s390/kvm/diag.c >>>> +++ b/arch/s390/kvm/diag.c >>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>>> return ret < 0 ? ret : 0; >>>> } >>>> >>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>>> +{ >>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>>> + >>>> + if (!vcpu->kvm->arch.use_diag318) >>>> + return -EOPNOTSUPP; >>>> + >>>> + vcpu->stat.diagnose_318++; >>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>>> + >>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>>> + vcpu->kvm->arch.diag318_info.cpnc, >>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >>>> + >>>> + return 0; >>>> +} >>>> + >>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>> { >>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>> return __diag_page_ref_service(vcpu); >>>> case 0x308: >>>> return __diag_ipl_functions(vcpu); >>>> + case 0x318: >>>> + return __diag_set_diag318_info(vcpu); >>>> case 0x500: >>>> return __diag_virtio_hypercall(vcpu); >>> >>> I wonder whether it would make more sense to simply drop to userspace >>> and handle the diag 318 call there? That way the userspace would always >>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >>> handling), it's better if the userspace is in control... e.g. userspace >>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >>> guest just executed the diag 318 instruction. >>> >>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >>> could also be simply used by the diag 318 handler in userspace? >>> >>>> default: >>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c >>>> index d05bb040fd42..c3eee468815f 100644 >>>> --- a/arch/s390/kvm/kvm-s390.c >>>> +++ b/arch/s390/kvm/kvm-s390.c >>>> @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { >>>> { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, >>>> { "instruction_diag_258", VCPU_STAT(diagnose_258) }, >>>> { "instruction_diag_308", VCPU_STAT(diagnose_308) }, >>>> + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, >>>> { "instruction_diag_500", VCPU_STAT(diagnose_500) }, >>>> { "instruction_diag_other", VCPU_STAT(diagnose_other) }, >>>> { NULL } >>>> @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) >>>> return ret; >>>> } >>>> >>>> +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) >>>> +{ >>>> + struct kvm_vcpu *vcpu; >>>> + int i; >>>> + >>>> + kvm->arch.diag318_info.val = info; >>>> + >>>> + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", >>>> + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); >>>> + >>>> + if (sclp.has_diag318) { >>>> + kvm_for_each_vcpu(i, vcpu, kvm) { >>>> + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; >>>> + } >>>> + } >>>> +} >>>> + >>>> +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) >>>> +{ >>>> + int ret; >>>> + u64 diag318_info; >>>> + >>>> + switch (attr->attr) { >>>> + case KVM_S390_VM_MISC_ENABLE_DIAG318: >>>> + kvm->arch.use_diag318 = 1; >>>> + ret = 0; >>>> + break; >>> >>> Would it make sense to set kvm->arch.use_diag318 = 1 during the first >>> execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get >>> along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? >> >> I'm not an expert in feature negotiation, but why isn't this a cpu >> feature like sief2 instead of a attribute? >> >> @David? > > In the end you want to have it somehow in the CPU model I guess. You > cannot glue it to QEMU machines, because availability depends on HW+KVM > support. > > How does the guest detect that it can use diag318? I assume/hope via a a > STFLE feature. > SCLP
On 14.05.20 11:49, Janosch Frank wrote: > On 5/14/20 11:37 AM, David Hildenbrand wrote: >> On 14.05.20 10:52, Janosch Frank wrote: >>> On 5/14/20 9:53 AM, Thomas Huth wrote: >>>> On 14/05/2020 00.15, Collin Walling wrote: >>>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>>>> be intercepted by SIE and handled via KVM. Let's introduce some >>>>> functions to communicate between userspace and KVM via ioctls. These >>>>> will be used to get/set the diag318 related information, as well as >>>>> check the system if KVM supports handling this instruction. >>>>> >>>>> This information can help with diagnosing the environment the VM is >>>>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>>>> >>>>> By default, this feature is disabled and can only be enabled if a >>>>> user space program (such as QEMU) explicitly requests it. >>>>> >>>>> The Control Program Name Code (CPNC) is stored in the SIE block >>>>> and a copy is retained in each VCPU. The Control Program Version >>>>> Code (CPVC) is not designed to be stored in the SIE block, so we >>>>> retain a copy in each VCPU next to the CPNC. >>>>> >>>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>>>> --- >>>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>>>> arch/s390/include/asm/kvm_host.h | 6 +- >>>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>>>> arch/s390/kvm/diag.c | 20 ++++++ >>>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>>>> arch/s390/kvm/kvm-s390.h | 1 + >>>>> arch/s390/kvm/vsie.c | 2 + >>>>> 7 files changed, 151 insertions(+), 1 deletion(-) >>>> [...] >>>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>>>> index 563429dece03..3caed4b880c8 100644 >>>>> --- a/arch/s390/kvm/diag.c >>>>> +++ b/arch/s390/kvm/diag.c >>>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>>>> return ret < 0 ? ret : 0; >>>>> } >>>>> >>>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>>>> +{ >>>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>>>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>>>> + >>>>> + if (!vcpu->kvm->arch.use_diag318) >>>>> + return -EOPNOTSUPP; >>>>> + >>>>> + vcpu->stat.diagnose_318++; >>>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>>>> + >>>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>>>> + vcpu->kvm->arch.diag318_info.cpnc, >>>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >>>>> + >>>>> + return 0; >>>>> +} >>>>> + >>>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>> { >>>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>> return __diag_page_ref_service(vcpu); >>>>> case 0x308: >>>>> return __diag_ipl_functions(vcpu); >>>>> + case 0x318: >>>>> + return __diag_set_diag318_info(vcpu); >>>>> case 0x500: >>>>> return __diag_virtio_hypercall(vcpu); >>>> >>>> I wonder whether it would make more sense to simply drop to userspace >>>> and handle the diag 318 call there? That way the userspace would always >>>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >>>> handling), it's better if the userspace is in control... e.g. userspace >>>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >>>> guest just executed the diag 318 instruction. >>>> >>>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >>>> could also be simply used by the diag 318 handler in userspace? >>>> >>>>> default: >>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c >>>>> index d05bb040fd42..c3eee468815f 100644 >>>>> --- a/arch/s390/kvm/kvm-s390.c >>>>> +++ b/arch/s390/kvm/kvm-s390.c >>>>> @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { >>>>> { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, >>>>> { "instruction_diag_258", VCPU_STAT(diagnose_258) }, >>>>> { "instruction_diag_308", VCPU_STAT(diagnose_308) }, >>>>> + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, >>>>> { "instruction_diag_500", VCPU_STAT(diagnose_500) }, >>>>> { "instruction_diag_other", VCPU_STAT(diagnose_other) }, >>>>> { NULL } >>>>> @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) >>>>> return ret; >>>>> } >>>>> >>>>> +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) >>>>> +{ >>>>> + struct kvm_vcpu *vcpu; >>>>> + int i; >>>>> + >>>>> + kvm->arch.diag318_info.val = info; >>>>> + >>>>> + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", >>>>> + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); >>>>> + >>>>> + if (sclp.has_diag318) { >>>>> + kvm_for_each_vcpu(i, vcpu, kvm) { >>>>> + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; >>>>> + } >>>>> + } >>>>> +} >>>>> + >>>>> +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) >>>>> +{ >>>>> + int ret; >>>>> + u64 diag318_info; >>>>> + >>>>> + switch (attr->attr) { >>>>> + case KVM_S390_VM_MISC_ENABLE_DIAG318: >>>>> + kvm->arch.use_diag318 = 1; >>>>> + ret = 0; >>>>> + break; >>>> >>>> Would it make sense to set kvm->arch.use_diag318 = 1 during the first >>>> execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get >>>> along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? >>> >>> I'm not an expert in feature negotiation, but why isn't this a cpu >>> feature like sief2 instead of a attribute? >>> >>> @David? >> >> In the end you want to have it somehow in the CPU model I guess. You >> cannot glue it to QEMU machines, because availability depends on HW+KVM >> support. >> >> How does the guest detect that it can use diag318? I assume/hope via a a >> STFLE feature. >> > SCLP Okay, so just another feature flag, which implies it belongs into the CPU model. How we communicate support from kvm to QEMU can be done via features, bot also via attributes. Important part is that we can sense/enable/disable.
On 5/14/20 5:05 AM, Cornelia Huck wrote: > On Wed, 13 May 2020 18:15:57 -0400 > Collin Walling <walling@linux.ibm.com> wrote: > >> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >> be intercepted by SIE and handled via KVM. Let's introduce some >> functions to communicate between userspace and KVM via ioctls. These >> will be used to get/set the diag318 related information, as well as >> check the system if KVM supports handling this instruction. >> >> This information can help with diagnosing the environment the VM is >> running in (Linux, z/VM, etc) if the OS calls this instruction. >> >> By default, this feature is disabled and can only be enabled if a >> user space program (such as QEMU) explicitly requests it. >> >> The Control Program Name Code (CPNC) is stored in the SIE block >> and a copy is retained in each VCPU. The Control Program Version >> Code (CPVC) is not designed to be stored in the SIE block, so we >> retain a copy in each VCPU next to the CPNC. >> >> Signed-off-by: Collin Walling <walling@linux.ibm.com> >> --- >> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >> arch/s390/include/asm/kvm_host.h | 6 +- >> arch/s390/include/uapi/asm/kvm.h | 5 ++ >> arch/s390/kvm/diag.c | 20 ++++++ >> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >> arch/s390/kvm/kvm-s390.h | 1 + >> arch/s390/kvm/vsie.c | 2 + >> 7 files changed, 151 insertions(+), 1 deletion(-) >> >> diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst >> index 0aa5b1cfd700..9344d45ace6d 100644 >> --- a/Documentation/virt/kvm/devices/vm.rst >> +++ b/Documentation/virt/kvm/devices/vm.rst >> @@ -314,3 +314,32 @@ Allows userspace to query the status of migration mode. >> if it is enabled >> :Returns: -EFAULT if the given address is not accessible from kernel space; >> 0 in case of success. >> + >> +6. GROUP: KVM_S390_VM_MISC > > This needs to be rstyfied, matching the remainder of the file. > >> +Architectures: s390 >> + >> + 6.1. KVM_S390_VM_MISC_ENABLE_DIAG318 >> + >> + Allows userspace to enable the DIAGNOSE 0x318 instruction call for a >> + guest OS. By default, KVM will not allow this instruction to be executed >> + by a guest, even if support is in place. Userspace must explicitly enable >> + the instruction handling for DIAGNOSE 0x318 via this call. >> + >> + Parameters: none >> + Returns: 0 after setting a flag telling KVM to enable this feature >> + >> + 6.2. KVM_S390_VM_MISC_DIAG318 (r/w) >> + >> + Allows userspace to retrieve and set the DIAGNOSE 0x318 information, >> + which consists of a 1-byte "Control Program Name Code" and a 7-byte >> + "Control Program Version Code" (a 64 bit value all in all). This >> + information is set by the guest (usually during IPL). This interface is >> + intended to allow retrieving and setting it during migration; while no >> + real harm is done if the information is changed outside of migration, >> + it is strongly discouraged. > > (Sorry if we discussed that already, but that was some time ago and the > info has dropped out of my cache...) > > Had we considered doing this in userspace only? If QEMU wanted to > emulate diag 318 in tcg, it would basically need to mirror what KVM > does; diag 318 does not seem like something where we want to optimize > for performance, so dropping to userspace seems feasible? We'd just > need an interface for userspace to forward anything set by the guest. > My reservation with respect to handling this in userspace only is that the data set by the instruction call is relevant to both host-level and guest-level kernels. That data is set during kernel setup. Right now, the instruction call is used to set a hard-coded "name code" value, but later we want to use this instruction to also set some sort of unique version code. The format of the version code is not yet determined. If guest support is handled in userspace only, then we'll have to update the version codes in both the Linux kernel /and/ QEMU, which might be a bit messy if things go out of sync. >> + >> + Parameters: address of a buffer in user space (u64), where the >> + information is read from or stored into >> + Returns: -EFAULT if the given address is not accessible from kernel space; >> + -EOPNOTSUPP if feature has not been requested to be enabled first; >> + 0 in case of success >
On 5/14/20 11:24 AM, Collin Walling wrote: > On 5/14/20 5:05 AM, Cornelia Huck wrote: >> On Wed, 13 May 2020 18:15:57 -0400 >> Collin Walling <walling@linux.ibm.com> wrote: >> >>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>> be intercepted by SIE and handled via KVM. Let's introduce some >>> functions to communicate between userspace and KVM via ioctls. These >>> will be used to get/set the diag318 related information, as well as >>> check the system if KVM supports handling this instruction. >>> >>> This information can help with diagnosing the environment the VM is >>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>> >>> By default, this feature is disabled and can only be enabled if a >>> user space program (such as QEMU) explicitly requests it. >>> >>> The Control Program Name Code (CPNC) is stored in the SIE block >>> and a copy is retained in each VCPU. The Control Program Version >>> Code (CPVC) is not designed to be stored in the SIE block, so we >>> retain a copy in each VCPU next to the CPNC. >>> >>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>> --- >>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>> arch/s390/include/asm/kvm_host.h | 6 +- >>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>> arch/s390/kvm/diag.c | 20 ++++++ >>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>> arch/s390/kvm/kvm-s390.h | 1 + >>> arch/s390/kvm/vsie.c | 2 + >>> 7 files changed, 151 insertions(+), 1 deletion(-) >>> >>> diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst >>> index 0aa5b1cfd700..9344d45ace6d 100644 >>> --- a/Documentation/virt/kvm/devices/vm.rst >>> +++ b/Documentation/virt/kvm/devices/vm.rst >>> @@ -314,3 +314,32 @@ Allows userspace to query the status of migration mode. >>> if it is enabled >>> :Returns: -EFAULT if the given address is not accessible from kernel space; >>> 0 in case of success. >>> + >>> +6. GROUP: KVM_S390_VM_MISC >> >> This needs to be rstyfied, matching the remainder of the file. >> >>> +Architectures: s390 >>> + >>> + 6.1. KVM_S390_VM_MISC_ENABLE_DIAG318 >>> + >>> + Allows userspace to enable the DIAGNOSE 0x318 instruction call for a >>> + guest OS. By default, KVM will not allow this instruction to be executed >>> + by a guest, even if support is in place. Userspace must explicitly enable >>> + the instruction handling for DIAGNOSE 0x318 via this call. >>> + >>> + Parameters: none >>> + Returns: 0 after setting a flag telling KVM to enable this feature >>> + >>> + 6.2. KVM_S390_VM_MISC_DIAG318 (r/w) >>> + >>> + Allows userspace to retrieve and set the DIAGNOSE 0x318 information, >>> + which consists of a 1-byte "Control Program Name Code" and a 7-byte >>> + "Control Program Version Code" (a 64 bit value all in all). This >>> + information is set by the guest (usually during IPL). This interface is >>> + intended to allow retrieving and setting it during migration; while no >>> + real harm is done if the information is changed outside of migration, >>> + it is strongly discouraged. >> >> (Sorry if we discussed that already, but that was some time ago and the >> info has dropped out of my cache...) >> >> Had we considered doing this in userspace only? If QEMU wanted to >> emulate diag 318 in tcg, it would basically need to mirror what KVM >> does; diag 318 does not seem like something where we want to optimize >> for performance, so dropping to userspace seems feasible? We'd just >> need an interface for userspace to forward anything set by the guest. >> > > My reservation with respect to handling this in userspace only is that > the data set by the instruction call is relevant to both host-level and > guest-level kernels. That data is set during kernel setup. > > Right now, the instruction call is used to set a hard-coded "name code" > value, but later we want to use this instruction to also set some sort > of unique version code. The format of the version code is not yet > determined. > > If guest support is handled in userspace only, then we'll have to update > the version codes in both the Linux kernel /and/ QEMU, which might be a > bit messy if things go out of sync. > In an attempt to clear up some fogginess with respect to "what" the version code may entail, we're thinking of some sort of 7-byte combination that denotes both the kernel version and a value that denotes the distro + release. We're not 100% solid on exactly what that format will look like just yet, but all of the discussions have revolved around that theme. >>> + >>> + Parameters: address of a buffer in user space (u64), where the >>> + information is read from or stored into >>> + Returns: -EFAULT if the given address is not accessible from kernel space; >>> + -EOPNOTSUPP if feature has not been requested to be enabled first; >>> + 0 in case of success >> > >
On 5/14/20 5:53 AM, David Hildenbrand wrote: > On 14.05.20 11:49, Janosch Frank wrote: >> On 5/14/20 11:37 AM, David Hildenbrand wrote: >>> On 14.05.20 10:52, Janosch Frank wrote: >>>> On 5/14/20 9:53 AM, Thomas Huth wrote: >>>>> On 14/05/2020 00.15, Collin Walling wrote: >>>>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>>>>> be intercepted by SIE and handled via KVM. Let's introduce some >>>>>> functions to communicate between userspace and KVM via ioctls. These >>>>>> will be used to get/set the diag318 related information, as well as >>>>>> check the system if KVM supports handling this instruction. >>>>>> >>>>>> This information can help with diagnosing the environment the VM is >>>>>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>>>>> >>>>>> By default, this feature is disabled and can only be enabled if a >>>>>> user space program (such as QEMU) explicitly requests it. >>>>>> >>>>>> The Control Program Name Code (CPNC) is stored in the SIE block >>>>>> and a copy is retained in each VCPU. The Control Program Version >>>>>> Code (CPVC) is not designed to be stored in the SIE block, so we >>>>>> retain a copy in each VCPU next to the CPNC. >>>>>> >>>>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>>>>> --- >>>>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>>>>> arch/s390/include/asm/kvm_host.h | 6 +- >>>>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>>>>> arch/s390/kvm/diag.c | 20 ++++++ >>>>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>>>>> arch/s390/kvm/kvm-s390.h | 1 + >>>>>> arch/s390/kvm/vsie.c | 2 + >>>>>> 7 files changed, 151 insertions(+), 1 deletion(-) >>>>> [...] >>>>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>>>>> index 563429dece03..3caed4b880c8 100644 >>>>>> --- a/arch/s390/kvm/diag.c >>>>>> +++ b/arch/s390/kvm/diag.c >>>>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>>>>> return ret < 0 ? ret : 0; >>>>>> } >>>>>> >>>>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>>>>> +{ >>>>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>>>>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>>>>> + >>>>>> + if (!vcpu->kvm->arch.use_diag318) >>>>>> + return -EOPNOTSUPP; >>>>>> + >>>>>> + vcpu->stat.diagnose_318++; >>>>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>>>>> + >>>>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>>>>> + vcpu->kvm->arch.diag318_info.cpnc, >>>>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); Errr.. thought I dropped this message. We favored just using the VM_EVENT from last time. >>>>>> + >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>> { >>>>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>>>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>> return __diag_page_ref_service(vcpu); >>>>>> case 0x308: >>>>>> return __diag_ipl_functions(vcpu); >>>>>> + case 0x318: >>>>>> + return __diag_set_diag318_info(vcpu); >>>>>> case 0x500: >>>>>> return __diag_virtio_hypercall(vcpu); >>>>> >>>>> I wonder whether it would make more sense to simply drop to userspace >>>>> and handle the diag 318 call there? That way the userspace would always >>>>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >>>>> handling), it's better if the userspace is in control... e.g. userspace >>>>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >>>>> guest just executed the diag 318 instruction. >>>>> >>>>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >>>>> could also be simply used by the diag 318 handler in userspace? Pardon my ignorance, but I do not think I fully understand what exactly should be dropped in favor of doing things in userspace. My assumption: if a diag handler is not found in KVM, then we fallthrough to userspace handling? >>>>> >>>>>> default: >>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c >>>>>> index d05bb040fd42..c3eee468815f 100644 >>>>>> --- a/arch/s390/kvm/kvm-s390.c >>>>>> +++ b/arch/s390/kvm/kvm-s390.c >>>>>> @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { >>>>>> { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, >>>>>> { "instruction_diag_258", VCPU_STAT(diagnose_258) }, >>>>>> { "instruction_diag_308", VCPU_STAT(diagnose_308) }, >>>>>> + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, >>>>>> { "instruction_diag_500", VCPU_STAT(diagnose_500) }, >>>>>> { "instruction_diag_other", VCPU_STAT(diagnose_other) }, >>>>>> { NULL } >>>>>> @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) >>>>>> return ret; >>>>>> } >>>>>> >>>>>> +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) >>>>>> +{ >>>>>> + struct kvm_vcpu *vcpu; >>>>>> + int i; >>>>>> + >>>>>> + kvm->arch.diag318_info.val = info; >>>>>> + >>>>>> + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", >>>>>> + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); >>>>>> + >>>>>> + if (sclp.has_diag318) { >>>>>> + kvm_for_each_vcpu(i, vcpu, kvm) { >>>>>> + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; >>>>>> + } >>>>>> + } >>>>>> +} >>>>>> + >>>>>> +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) >>>>>> +{ >>>>>> + int ret; >>>>>> + u64 diag318_info; >>>>>> + >>>>>> + switch (attr->attr) { >>>>>> + case KVM_S390_VM_MISC_ENABLE_DIAG318: >>>>>> + kvm->arch.use_diag318 = 1; >>>>>> + ret = 0; >>>>>> + break; >>>>> >>>>> Would it make sense to set kvm->arch.use_diag318 = 1 during the first >>>>> execution of KVM_S390_VM_MISC_DIAG318 instead, so that we could get >>>>> along without the KVM_S390_VM_MISC_ENABLE_DIAG318 ? >>>> Hmmm... is your thought set the flag in the diag handler in KVM? That way the get/set functions are only enabled if the instruction was actually called? I like that, actually. I think that makes more sense. >>>> I'm not an expert in feature negotiation, but why isn't this a cpu >>>> feature like sief2 instead of a attribute? >>>> >>>> @David? >>> >>> In the end you want to have it somehow in the CPU model I guess. You >>> cannot glue it to QEMU machines, because availability depends on HW+KVM >>> support. >>> >>> How does the guest detect that it can use diag318? I assume/hope via a a >>> STFLE feature. >>> >> SCLP > > Okay, so just another feature flag, which implies it belongs into the > CPU model. How we communicate support from kvm to QEMU can be done via > features, bot also via attributes. Important part is that we can > sense/enable/disable. > > Right. KVM for instruction handling and for get/setting (setting also handles setting the name code in the SIE block), and QEMU for migration / resetting. There are a lot of moving parts for a simple 8-byte string of data, but its very useful for giving us more information regarding the OS during firmware / service events.
On 14/05/2020 19.20, Collin Walling wrote: > On 5/14/20 5:53 AM, David Hildenbrand wrote: >> On 14.05.20 11:49, Janosch Frank wrote: >>> On 5/14/20 11:37 AM, David Hildenbrand wrote: >>>> On 14.05.20 10:52, Janosch Frank wrote: >>>>> On 5/14/20 9:53 AM, Thomas Huth wrote: >>>>>> On 14/05/2020 00.15, Collin Walling wrote: >>>>>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>>>>>> be intercepted by SIE and handled via KVM. Let's introduce some >>>>>>> functions to communicate between userspace and KVM via ioctls. These >>>>>>> will be used to get/set the diag318 related information, as well as >>>>>>> check the system if KVM supports handling this instruction. >>>>>>> >>>>>>> This information can help with diagnosing the environment the VM is >>>>>>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>>>>>> >>>>>>> By default, this feature is disabled and can only be enabled if a >>>>>>> user space program (such as QEMU) explicitly requests it. >>>>>>> >>>>>>> The Control Program Name Code (CPNC) is stored in the SIE block >>>>>>> and a copy is retained in each VCPU. The Control Program Version >>>>>>> Code (CPVC) is not designed to be stored in the SIE block, so we >>>>>>> retain a copy in each VCPU next to the CPNC. >>>>>>> >>>>>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>>>>>> --- >>>>>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>>>>>> arch/s390/include/asm/kvm_host.h | 6 +- >>>>>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>>>>>> arch/s390/kvm/diag.c | 20 ++++++ >>>>>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>>>>>> arch/s390/kvm/kvm-s390.h | 1 + >>>>>>> arch/s390/kvm/vsie.c | 2 + >>>>>>> 7 files changed, 151 insertions(+), 1 deletion(-) >>>>>> [...] >>>>>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>>>>>> index 563429dece03..3caed4b880c8 100644 >>>>>>> --- a/arch/s390/kvm/diag.c >>>>>>> +++ b/arch/s390/kvm/diag.c >>>>>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>>>>>> return ret < 0 ? ret : 0; >>>>>>> } >>>>>>> >>>>>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>>>>>> +{ >>>>>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>>>>>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>>>>>> + >>>>>>> + if (!vcpu->kvm->arch.use_diag318) >>>>>>> + return -EOPNOTSUPP; >>>>>>> + >>>>>>> + vcpu->stat.diagnose_318++; >>>>>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>>>>>> + >>>>>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>>>>>> + vcpu->kvm->arch.diag318_info.cpnc, >>>>>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); > > Errr.. thought I dropped this message. We favored just using the > VM_EVENT from last time. > >>>>>>> + >>>>>>> + return 0; >>>>>>> +} >>>>>>> + >>>>>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>>> { >>>>>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>>>>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>>> return __diag_page_ref_service(vcpu); >>>>>>> case 0x308: >>>>>>> return __diag_ipl_functions(vcpu); >>>>>>> + case 0x318: >>>>>>> + return __diag_set_diag318_info(vcpu); >>>>>>> case 0x500: >>>>>>> return __diag_virtio_hypercall(vcpu); >>>>>> >>>>>> I wonder whether it would make more sense to simply drop to userspace >>>>>> and handle the diag 318 call there? That way the userspace would always >>>>>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >>>>>> handling), it's better if the userspace is in control... e.g. userspace >>>>>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >>>>>> guest just executed the diag 318 instruction. >>>>>> >>>>>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >>>>>> could also be simply used by the diag 318 handler in userspace? > > Pardon my ignorance, but I do not think I fully understand what exactly > should be dropped in favor of doing things in userspace. > > My assumption: if a diag handler is not found in KVM, then we > fallthrough to userspace handling? Right, if you simply omit this change to diag.c, the default case returns -EOPNOTSUPP which then should cause an exit to userspace. You can then add the code in QEMU to handle_diag() in target/s390x/kvm.c instead. Thomas
On 14/05/2020 10.52, Janosch Frank wrote: > On 5/14/20 9:53 AM, Thomas Huth wrote: >> On 14/05/2020 00.15, Collin Walling wrote: >>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>> be intercepted by SIE and handled via KVM. Let's introduce some >>> functions to communicate between userspace and KVM via ioctls. These >>> will be used to get/set the diag318 related information, as well as >>> check the system if KVM supports handling this instruction. >>> >>> This information can help with diagnosing the environment the VM is >>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>> >>> By default, this feature is disabled and can only be enabled if a >>> user space program (such as QEMU) explicitly requests it. >>> >>> The Control Program Name Code (CPNC) is stored in the SIE block >>> and a copy is retained in each VCPU. The Control Program Version >>> Code (CPVC) is not designed to be stored in the SIE block, so we >>> retain a copy in each VCPU next to the CPNC. >>> >>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>> --- >>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>> arch/s390/include/asm/kvm_host.h | 6 +- >>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>> arch/s390/kvm/diag.c | 20 ++++++ >>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>> arch/s390/kvm/kvm-s390.h | 1 + >>> arch/s390/kvm/vsie.c | 2 + >>> 7 files changed, 151 insertions(+), 1 deletion(-) >> [...] >>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>> index 563429dece03..3caed4b880c8 100644 >>> --- a/arch/s390/kvm/diag.c >>> +++ b/arch/s390/kvm/diag.c >>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>> return ret < 0 ? ret : 0; >>> } >>> >>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>> +{ >>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>> + >>> + if (!vcpu->kvm->arch.use_diag318) >>> + return -EOPNOTSUPP; >>> + >>> + vcpu->stat.diagnose_318++; >>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>> + >>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>> + vcpu->kvm->arch.diag318_info.cpnc, >>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >>> + >>> + return 0; >>> +} >>> + >>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>> { >>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>> return __diag_page_ref_service(vcpu); >>> case 0x308: >>> return __diag_ipl_functions(vcpu); >>> + case 0x318: >>> + return __diag_set_diag318_info(vcpu); >>> case 0x500: >>> return __diag_virtio_hypercall(vcpu); >> >> I wonder whether it would make more sense to simply drop to userspace >> and handle the diag 318 call there? That way the userspace would always >> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >> handling), it's better if the userspace is in control... e.g. userspace >> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >> guest just executed the diag 318 instruction. >> >> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >> could also be simply used by the diag 318 handler in userspace? [...] >> What about a reset of the guest VM? If a user first boots into a Linux >> kernel that supports diag 318, then reboots and selects a Linux kernel >> that does not support diag 318? I'd expect that the cpnc / cpnv values >> need to be cleared here somewhere? Otherwise the information might not >> be accurate anymore? > > He resets via QEMU on a machine reset. Ah, thanks for the pointer, makes sense! ... and actually, I think that is another indication that QEMU should rather be in control, thus the kernel code should drop to userspace instead of handling the diag 318 call in diag.c in the kernel above. Thomas
On 5/14/20 2:37 PM, Thomas Huth wrote: > On 14/05/2020 19.20, Collin Walling wrote: >> On 5/14/20 5:53 AM, David Hildenbrand wrote: >>> On 14.05.20 11:49, Janosch Frank wrote: >>>> On 5/14/20 11:37 AM, David Hildenbrand wrote: >>>>> On 14.05.20 10:52, Janosch Frank wrote: >>>>>> On 5/14/20 9:53 AM, Thomas Huth wrote: >>>>>>> On 14/05/2020 00.15, Collin Walling wrote: >>>>>>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must >>>>>>>> be intercepted by SIE and handled via KVM. Let's introduce some >>>>>>>> functions to communicate between userspace and KVM via ioctls. These >>>>>>>> will be used to get/set the diag318 related information, as well as >>>>>>>> check the system if KVM supports handling this instruction. >>>>>>>> >>>>>>>> This information can help with diagnosing the environment the VM is >>>>>>>> running in (Linux, z/VM, etc) if the OS calls this instruction. >>>>>>>> >>>>>>>> By default, this feature is disabled and can only be enabled if a >>>>>>>> user space program (such as QEMU) explicitly requests it. >>>>>>>> >>>>>>>> The Control Program Name Code (CPNC) is stored in the SIE block >>>>>>>> and a copy is retained in each VCPU. The Control Program Version >>>>>>>> Code (CPVC) is not designed to be stored in the SIE block, so we >>>>>>>> retain a copy in each VCPU next to the CPNC. >>>>>>>> >>>>>>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> >>>>>>>> --- >>>>>>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ >>>>>>>> arch/s390/include/asm/kvm_host.h | 6 +- >>>>>>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ >>>>>>>> arch/s390/kvm/diag.c | 20 ++++++ >>>>>>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ >>>>>>>> arch/s390/kvm/kvm-s390.h | 1 + >>>>>>>> arch/s390/kvm/vsie.c | 2 + >>>>>>>> 7 files changed, 151 insertions(+), 1 deletion(-) >>>>>>> [...] >>>>>>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c >>>>>>>> index 563429dece03..3caed4b880c8 100644 >>>>>>>> --- a/arch/s390/kvm/diag.c >>>>>>>> +++ b/arch/s390/kvm/diag.c >>>>>>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) >>>>>>>> return ret < 0 ? ret : 0; >>>>>>>> } >>>>>>>> >>>>>>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) >>>>>>>> +{ >>>>>>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; >>>>>>>> + u64 info = vcpu->run->s.regs.gprs[reg]; >>>>>>>> + >>>>>>>> + if (!vcpu->kvm->arch.use_diag318) >>>>>>>> + return -EOPNOTSUPP; >>>>>>>> + >>>>>>>> + vcpu->stat.diagnose_318++; >>>>>>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); >>>>>>>> + >>>>>>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", >>>>>>>> + vcpu->kvm->arch.diag318_info.cpnc, >>>>>>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); >> >> Errr.. thought I dropped this message. We favored just using the >> VM_EVENT from last time. >> >>>>>>>> + >>>>>>>> + return 0; >>>>>>>> +} >>>>>>>> + >>>>>>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>>>> { >>>>>>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; >>>>>>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) >>>>>>>> return __diag_page_ref_service(vcpu); >>>>>>>> case 0x308: >>>>>>>> return __diag_ipl_functions(vcpu); >>>>>>>> + case 0x318: >>>>>>>> + return __diag_set_diag318_info(vcpu); >>>>>>>> case 0x500: >>>>>>>> return __diag_virtio_hypercall(vcpu); >>>>>>> >>>>>>> I wonder whether it would make more sense to simply drop to userspace >>>>>>> and handle the diag 318 call there? That way the userspace would always >>>>>>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP >>>>>>> handling), it's better if the userspace is in control... e.g. userspace >>>>>>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the >>>>>>> guest just executed the diag 318 instruction. >>>>>>> >>>>>>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these >>>>>>> could also be simply used by the diag 318 handler in userspace? >> >> Pardon my ignorance, but I do not think I fully understand what exactly >> should be dropped in favor of doing things in userspace. >> >> My assumption: if a diag handler is not found in KVM, then we >> fallthrough to userspace handling? > > Right, if you simply omit this change to diag.c, the default case > returns -EOPNOTSUPP which then should cause an exit to userspace. You > can then add the code in QEMU to handle_diag() in target/s390x/kvm.c > instead. > > Thomas > Very cool! Okay, I think this makes sense, then. I'll look into this. Thanks for the tip. @Conny, I assume this is what you meant as well? If so, ignore my response I sent earlier :)
On Thu, 14 May 2020 14:53:24 -0400 Collin Walling <walling@linux.ibm.com> wrote: > On 5/14/20 2:37 PM, Thomas Huth wrote: > > On 14/05/2020 19.20, Collin Walling wrote: > >> On 5/14/20 5:53 AM, David Hildenbrand wrote: > >>> On 14.05.20 11:49, Janosch Frank wrote: > >>>> On 5/14/20 11:37 AM, David Hildenbrand wrote: > >>>>> On 14.05.20 10:52, Janosch Frank wrote: > >>>>>> On 5/14/20 9:53 AM, Thomas Huth wrote: > >>>>>>> On 14/05/2020 00.15, Collin Walling wrote: > >>>>>>>> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must > >>>>>>>> be intercepted by SIE and handled via KVM. Let's introduce some > >>>>>>>> functions to communicate between userspace and KVM via ioctls. These > >>>>>>>> will be used to get/set the diag318 related information, as well as > >>>>>>>> check the system if KVM supports handling this instruction. > >>>>>>>> > >>>>>>>> This information can help with diagnosing the environment the VM is > >>>>>>>> running in (Linux, z/VM, etc) if the OS calls this instruction. > >>>>>>>> > >>>>>>>> By default, this feature is disabled and can only be enabled if a > >>>>>>>> user space program (such as QEMU) explicitly requests it. > >>>>>>>> > >>>>>>>> The Control Program Name Code (CPNC) is stored in the SIE block > >>>>>>>> and a copy is retained in each VCPU. The Control Program Version > >>>>>>>> Code (CPVC) is not designed to be stored in the SIE block, so we > >>>>>>>> retain a copy in each VCPU next to the CPNC. > >>>>>>>> > >>>>>>>> Signed-off-by: Collin Walling <walling@linux.ibm.com> > >>>>>>>> --- > >>>>>>>> Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ > >>>>>>>> arch/s390/include/asm/kvm_host.h | 6 +- > >>>>>>>> arch/s390/include/uapi/asm/kvm.h | 5 ++ > >>>>>>>> arch/s390/kvm/diag.c | 20 ++++++ > >>>>>>>> arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ > >>>>>>>> arch/s390/kvm/kvm-s390.h | 1 + > >>>>>>>> arch/s390/kvm/vsie.c | 2 + > >>>>>>>> 7 files changed, 151 insertions(+), 1 deletion(-) > >>>>>>> [...] > >>>>>>>> diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c > >>>>>>>> index 563429dece03..3caed4b880c8 100644 > >>>>>>>> --- a/arch/s390/kvm/diag.c > >>>>>>>> +++ b/arch/s390/kvm/diag.c > >>>>>>>> @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) > >>>>>>>> return ret < 0 ? ret : 0; > >>>>>>>> } > >>>>>>>> > >>>>>>>> +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) > >>>>>>>> +{ > >>>>>>>> + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; > >>>>>>>> + u64 info = vcpu->run->s.regs.gprs[reg]; > >>>>>>>> + > >>>>>>>> + if (!vcpu->kvm->arch.use_diag318) > >>>>>>>> + return -EOPNOTSUPP; > >>>>>>>> + > >>>>>>>> + vcpu->stat.diagnose_318++; > >>>>>>>> + kvm_s390_set_diag318_info(vcpu->kvm, info); > >>>>>>>> + > >>>>>>>> + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", > >>>>>>>> + vcpu->kvm->arch.diag318_info.cpnc, > >>>>>>>> + (u64)vcpu->kvm->arch.diag318_info.cpvc); > >> > >> Errr.. thought I dropped this message. We favored just using the > >> VM_EVENT from last time. > >> > >>>>>>>> + > >>>>>>>> + return 0; > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) > >>>>>>>> { > >>>>>>>> int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; > >>>>>>>> @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) > >>>>>>>> return __diag_page_ref_service(vcpu); > >>>>>>>> case 0x308: > >>>>>>>> return __diag_ipl_functions(vcpu); > >>>>>>>> + case 0x318: > >>>>>>>> + return __diag_set_diag318_info(vcpu); > >>>>>>>> case 0x500: > >>>>>>>> return __diag_virtio_hypercall(vcpu); > >>>>>>> > >>>>>>> I wonder whether it would make more sense to simply drop to userspace > >>>>>>> and handle the diag 318 call there? That way the userspace would always > >>>>>>> be up-to-date, and as we've seen in the past (e.g. with the various SIGP > >>>>>>> handling), it's better if the userspace is in control... e.g. userspace > >>>>>>> could also decide to only use KVM_S390_VM_MISC_ENABLE_DIAG318 if the > >>>>>>> guest just executed the diag 318 instruction. > >>>>>>> > >>>>>>> And you need the kvm_s390_vm_get/set_misc functions anyway, so these > >>>>>>> could also be simply used by the diag 318 handler in userspace? > >> > >> Pardon my ignorance, but I do not think I fully understand what exactly > >> should be dropped in favor of doing things in userspace. > >> > >> My assumption: if a diag handler is not found in KVM, then we > >> fallthrough to userspace handling? > > > > Right, if you simply omit this change to diag.c, the default case > > returns -EOPNOTSUPP which then should cause an exit to userspace. You > > can then add the code in QEMU to handle_diag() in target/s390x/kvm.c > > instead. > > > > Thomas > > > > Very cool! Okay, I think this makes sense, then. I'll look into this. > Thanks for the tip. > > @Conny, I assume this is what you meant as well? If so, ignore my > response I sent earlier :) > Yes; if done correctly, it should be easy to hack something up for tcg as well, if we want it.
diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst index 0aa5b1cfd700..9344d45ace6d 100644 --- a/Documentation/virt/kvm/devices/vm.rst +++ b/Documentation/virt/kvm/devices/vm.rst @@ -314,3 +314,32 @@ Allows userspace to query the status of migration mode. if it is enabled :Returns: -EFAULT if the given address is not accessible from kernel space; 0 in case of success. + +6. GROUP: KVM_S390_VM_MISC +Architectures: s390 + + 6.1. KVM_S390_VM_MISC_ENABLE_DIAG318 + + Allows userspace to enable the DIAGNOSE 0x318 instruction call for a + guest OS. By default, KVM will not allow this instruction to be executed + by a guest, even if support is in place. Userspace must explicitly enable + the instruction handling for DIAGNOSE 0x318 via this call. + + Parameters: none + Returns: 0 after setting a flag telling KVM to enable this feature + + 6.2. KVM_S390_VM_MISC_DIAG318 (r/w) + + Allows userspace to retrieve and set the DIAGNOSE 0x318 information, + which consists of a 1-byte "Control Program Name Code" and a 7-byte + "Control Program Version Code" (a 64 bit value all in all). This + information is set by the guest (usually during IPL). This interface is + intended to allow retrieving and setting it during migration; while no + real harm is done if the information is changed outside of migration, + it is strongly discouraged. + + Parameters: address of a buffer in user space (u64), where the + information is read from or stored into + Returns: -EFAULT if the given address is not accessible from kernel space; + -EOPNOTSUPP if feature has not been requested to be enabled first; + 0 in case of success diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index d6bcd34f3ec3..77a46416bd62 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -260,7 +260,8 @@ struct kvm_s390_sie_block { __u32 scaol; /* 0x0064 */ __u8 sdf; /* 0x0068 */ __u8 epdx; /* 0x0069 */ - __u8 reserved6a[2]; /* 0x006a */ + __u8 cpnc; /* 0x006a */ + __u8 reserved6b; /* 0x006b */ __u32 todpr; /* 0x006c */ #define GISA_FORMAT1 0x00000001 __u32 gd; /* 0x0070 */ @@ -454,6 +455,7 @@ struct kvm_vcpu_stat { u64 diagnose_9c_ignored; u64 diagnose_258; u64 diagnose_308; + u64 diagnose_318; u64 diagnose_500; u64 diagnose_other; }; @@ -938,6 +940,7 @@ struct kvm_arch{ int user_sigp; int user_stsi; int user_instr0; + int use_diag318; struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS]; wait_queue_head_t ipte_wq; int ipte_lock_count; @@ -956,6 +959,7 @@ struct kvm_arch{ DECLARE_BITMAP(idle_mask, KVM_MAX_VCPUS); struct kvm_s390_gisa_interrupt gisa_int; struct kvm_s390_pv pv; + union diag318_info diag318_info; }; #define KVM_HVA_ERR_BAD (-1UL) diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h index 436ec7636927..92cfe14ba2e1 100644 --- a/arch/s390/include/uapi/asm/kvm.h +++ b/arch/s390/include/uapi/asm/kvm.h @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req { #define KVM_S390_VM_CRYPTO 2 #define KVM_S390_VM_CPU_MODEL 3 #define KVM_S390_VM_MIGRATION 4 +#define KVM_S390_VM_MISC 5 /* kvm attributes for mem_ctrl */ #define KVM_S390_VM_MEM_ENABLE_CMMA 0 @@ -171,6 +172,10 @@ struct kvm_s390_vm_cpu_subfunc { #define KVM_S390_VM_MIGRATION_START 1 #define KVM_S390_VM_MIGRATION_STATUS 2 +/* kvm attributes for KVM_S390_VM_MISC */ +#define KVM_S390_VM_MISC_ENABLE_DIAG318 0 +#define KVM_S390_VM_MISC_DIAG318 1 + /* for KVM_GET_REGS and KVM_SET_REGS */ struct kvm_regs { /* general purpose regs for s390 */ diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c index 563429dece03..3caed4b880c8 100644 --- a/arch/s390/kvm/diag.c +++ b/arch/s390/kvm/diag.c @@ -253,6 +253,24 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu) return ret < 0 ? ret : 0; } +static int __diag_set_diag318_info(struct kvm_vcpu *vcpu) +{ + unsigned int reg = (vcpu->arch.sie_block->ipa & 0xf0) >> 4; + u64 info = vcpu->run->s.regs.gprs[reg]; + + if (!vcpu->kvm->arch.use_diag318) + return -EOPNOTSUPP; + + vcpu->stat.diagnose_318++; + kvm_s390_set_diag318_info(vcpu->kvm, info); + + VCPU_EVENT(vcpu, 3, "diag 0x318 cpnc: 0x%x cpvc: 0x%llx", + vcpu->kvm->arch.diag318_info.cpnc, + (u64)vcpu->kvm->arch.diag318_info.cpvc); + + return 0; +} + int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) { int code = kvm_s390_get_base_disp_rs(vcpu, NULL) & 0xffff; @@ -272,6 +290,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) return __diag_page_ref_service(vcpu); case 0x308: return __diag_ipl_functions(vcpu); + case 0x318: + return __diag_set_diag318_info(vcpu); case 0x500: return __diag_virtio_hypercall(vcpu); default: diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index d05bb040fd42..c3eee468815f 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -159,6 +159,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { { "diag_9c_ignored", VCPU_STAT(diagnose_9c_ignored) }, { "instruction_diag_258", VCPU_STAT(diagnose_258) }, { "instruction_diag_308", VCPU_STAT(diagnose_308) }, + { "instruction_diag_318", VCPU_STAT(diagnose_318) }, { "instruction_diag_500", VCPU_STAT(diagnose_500) }, { "instruction_diag_other", VCPU_STAT(diagnose_other) }, { NULL } @@ -1243,6 +1244,76 @@ static int kvm_s390_get_tod(struct kvm *kvm, struct kvm_device_attr *attr) return ret; } +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info) +{ + struct kvm_vcpu *vcpu; + int i; + + kvm->arch.diag318_info.val = info; + + VM_EVENT(kvm, 3, "SET: CPNC: 0x%x CPVC: 0x%llx", + kvm->arch.diag318_info.cpnc, kvm->arch.diag318_info.cpvc); + + if (sclp.has_diag318) { + kvm_for_each_vcpu(i, vcpu, kvm) { + vcpu->arch.sie_block->cpnc = kvm->arch.diag318_info.cpnc; + } + } +} + +static int kvm_s390_vm_set_misc(struct kvm *kvm, struct kvm_device_attr *attr) +{ + int ret; + u64 diag318_info; + + switch (attr->attr) { + case KVM_S390_VM_MISC_ENABLE_DIAG318: + kvm->arch.use_diag318 = 1; + ret = 0; + break; + case KVM_S390_VM_MISC_DIAG318: + ret = -EFAULT; + if (!kvm->arch.use_diag318) + return -EOPNOTSUPP; + if (get_user(diag318_info, (u64 __user *)attr->addr)) + break; + kvm_s390_set_diag318_info(kvm, diag318_info); + ret = 0; + break; + default: + ret = -ENXIO; + break; + } + return ret; +} + +static int kvm_s390_get_diag318_info(struct kvm *kvm, struct kvm_device_attr *attr) +{ + if (put_user(kvm->arch.diag318_info.val, (u64 __user *)attr->addr)) + return -EFAULT; + + VM_EVENT(kvm, 3, "QUERY: CPNC: 0x%x, CPVC: 0x%llx", + kvm->arch.diag318_info.cpnc, (u64)kvm->arch.diag318_info.cpvc); + return 0; +} + +static int kvm_s390_get_misc(struct kvm *kvm, struct kvm_device_attr *attr) +{ + int ret; + + switch (attr->attr) { + case KVM_S390_VM_MISC_DIAG318: + if (!kvm->arch.use_diag318) + return -ENXIO; + ret = kvm_s390_get_diag318_info(kvm, attr); + break; + default: + ret = -ENXIO; + break; + } + return ret; +} + static int kvm_s390_set_processor(struct kvm *kvm, struct kvm_device_attr *attr) { struct kvm_s390_vm_cpu_processor *proc; @@ -1689,6 +1760,9 @@ static int kvm_s390_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr) case KVM_S390_VM_MIGRATION: ret = kvm_s390_vm_set_migration(kvm, attr); break; + case KVM_S390_VM_MISC: + ret = kvm_s390_vm_set_misc(kvm, attr); + break; default: ret = -ENXIO; break; @@ -1714,6 +1788,9 @@ static int kvm_s390_vm_get_attr(struct kvm *kvm, struct kvm_device_attr *attr) case KVM_S390_VM_MIGRATION: ret = kvm_s390_vm_get_migration(kvm, attr); break; + case KVM_S390_VM_MISC: + ret = kvm_s390_get_misc(kvm, attr); + break; default: ret = -ENXIO; break; @@ -1787,6 +1864,16 @@ static int kvm_s390_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr) case KVM_S390_VM_MIGRATION: ret = 0; break; + case KVM_S390_VM_MISC: + switch (attr->attr) { + case KVM_S390_VM_MISC_DIAG318: + ret = 0; + break; + default: + ret = -ENXIO; + break; + } + break; default: ret = -ENXIO; break; @@ -3075,6 +3162,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) vcpu->arch.sie_block->ictl |= ICTL_OPEREXC; /* make vcpu_load load the right gmap on the first trigger */ vcpu->arch.enabled_gmap = vcpu->arch.gmap; + if (sclp.has_diag318) + vcpu->arch.sie_block->cpnc = vcpu->kvm->arch.diag318_info.cpnc; } static bool kvm_has_pckmo_subfunc(struct kvm *kvm, unsigned long nr) diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 79dcd647b378..59195447737e 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -326,6 +326,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu); int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu); /* implemented in kvm-s390.c */ +void kvm_s390_set_diag318_info(struct kvm *kvm, u64 info); void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod); long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable); diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c index 4f6c22d72072..3a63ad5ee8d8 100644 --- a/arch/s390/kvm/vsie.c +++ b/arch/s390/kvm/vsie.c @@ -548,6 +548,8 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) scb_s->ecd |= scb_o->ecd & ECD_ETOKENF; scb_s->hpid = HPID_VSIE; + if (sclp.has_diag318) + scb_s->cpnc = scb_o->cpnc; prepare_ibc(vcpu, vsie_page); rc = shadow_crycb(vcpu, vsie_page);
DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must be intercepted by SIE and handled via KVM. Let's introduce some functions to communicate between userspace and KVM via ioctls. These will be used to get/set the diag318 related information, as well as check the system if KVM supports handling this instruction. This information can help with diagnosing the environment the VM is running in (Linux, z/VM, etc) if the OS calls this instruction. By default, this feature is disabled and can only be enabled if a user space program (such as QEMU) explicitly requests it. The Control Program Name Code (CPNC) is stored in the SIE block and a copy is retained in each VCPU. The Control Program Version Code (CPVC) is not designed to be stored in the SIE block, so we retain a copy in each VCPU next to the CPNC. Signed-off-by: Collin Walling <walling@linux.ibm.com> --- Documentation/virt/kvm/devices/vm.rst | 29 +++++++++ arch/s390/include/asm/kvm_host.h | 6 +- arch/s390/include/uapi/asm/kvm.h | 5 ++ arch/s390/kvm/diag.c | 20 ++++++ arch/s390/kvm/kvm-s390.c | 89 +++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.h | 1 + arch/s390/kvm/vsie.c | 2 + 7 files changed, 151 insertions(+), 1 deletion(-)