From patchwork Thu Nov 22 11:28:16 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Borntraeger X-Patchwork-Id: 1792921 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork1.kernel.org (Postfix) with ESMTP id 171653FC64 for ; Thu, 22 Nov 2012 23:24:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752800Ab2KVXYr (ORCPT ); Thu, 22 Nov 2012 18:24:47 -0500 Received: from e06smtp16.uk.ibm.com ([195.75.94.112]:33569 "EHLO e06smtp16.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752165Ab2KVXYq (ORCPT ); Thu, 22 Nov 2012 18:24:46 -0500 Received: from /spool/local by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 22 Nov 2012 11:28:29 -0000 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp16.uk.ibm.com (192.168.101.146) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 22 Nov 2012 11:28:21 -0000 Received: from d06av02.portsmouth.uk.ibm.com (d06av02.portsmouth.uk.ibm.com [9.149.37.228]) by b06cxnps3074.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id qAMBSA3J52887668; Thu, 22 Nov 2012 11:28:10 GMT Received: from d06av02.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av02.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id qAMBSHGZ007314; Thu, 22 Nov 2012 04:28:18 -0700 Received: from tuxmaker.boeblingen.de.ibm.com (tuxmaker.boeblingen.de.ibm.com [9.152.85.9]) by d06av02.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id qAMBSHB0007302; Thu, 22 Nov 2012 04:28:17 -0700 Received: by tuxmaker.boeblingen.de.ibm.com (Postfix, from userid 25651) id 5A538122438E; Thu, 22 Nov 2012 12:28:17 +0100 (CET) From: Christian Borntraeger To: Marcelo Tossati Cc: Alexander Graf , Jens Freimann , Cornelia Huck , Heiko Carstens , Martin Schwidefsky , KVM , stable@vger.kernel.org, Christian Borntraeger Subject: [PATCH 1/1] Subject: [PATCH] s390/kvm: Fix address space mixup Date: Thu, 22 Nov 2012 12:28:16 +0100 Message-Id: <1353583696-10276-2-git-send-email-borntraeger@de.ibm.com> X-Mailer: git-send-email 1.7.12.4 In-Reply-To: <1353583696-10276-1-git-send-email-borntraeger@de.ibm.com> References: <1353583696-10276-1-git-send-email-borntraeger@de.ibm.com> x-cbid: 12112211-3548-0000-0000-000003C40054 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org I was chasing down a bug of random validity intercepts on s390. (guest prefix page not mapped in the host virtual aspace). Turns out that the problem was a wrong address space control element. The cause was quite complex: During paging activity a DAT protection during SIE caused a program interrupt. Normally, the sie retry loop tries to catch all interrupts during and shortly before sie to rerun the setup. The problem is now that protection causes a suppressing program interrupt, causing the PSW to point to the instruction AFTER SIE in case of DAT protection. This confused the logic of the retry loop to not trigger, instead we jumped directly back to SIE after return from the program interrupt. (the protection fault handler itself did a rewind of the psw). This usually works quite well, but: If now the protection fault handler has to wait, another program might be scheduled in. Later on the sie process will be schedules in again. In that case the content of CR1 (primary address space) will be wrong because switch_to will put the user space ASCE into CR1 and not the guest ASCE. In addition the program parameter is also wrong for every protection fault of a guest, since we dont issue the SPP instruction. So lets also check for PSW == instruction after SIE in the program check handler. Instead of expensively checking all program interruption codes that might be suppressing we assume that a program interrupt pointing after SIE was always a program interrupt in SIE. (Otherwise we have a kernel bug anyway). We also have to compensate the rewinding, since the C-level handlers will do that. Therefore we need to add a nop with the same length as SIE before the sie_loop. Signed-off-by: Christian Borntraeger CC: stable@vger.kernel.org CC: Martin Schwidefsky CC: Heiko Carstens --- arch/s390/kernel/entry64.S | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/arch/s390/kernel/entry64.S b/arch/s390/kernel/entry64.S index 07d8de3..19b6080 100644 --- a/arch/s390/kernel/entry64.S +++ b/arch/s390/kernel/entry64.S @@ -80,14 +80,21 @@ _TIF_EXIT_SIE = (_TIF_SIGPENDING | _TIF_NEED_RESCHED | _TIF_MCCK_PENDING) #endif .endm - .macro HANDLE_SIE_INTERCEPT scratch + .macro HANDLE_SIE_INTERCEPT scratch,pgmcheck #if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE) tmhh %r8,0x0001 # interrupting from user ? jnz .+42 lgr \scratch,%r9 slg \scratch,BASED(.Lsie_loop) clg \scratch,BASED(.Lsie_length) + .if \pgmcheck + # Some program interrupts are suppressing (e.g. protection). + # We must also check the instruction after SIE in that case. + # do_protection_exception will rewind to rewind_pad + jh .+22 + .else jhe .+22 + .endif lg %r9,BASED(.Lsie_loop) SPP BASED(.Lhost_id) # set host id #endif @@ -391,7 +398,7 @@ ENTRY(pgm_check_handler) lg %r12,__LC_THREAD_INFO larl %r13,system_call lmg %r8,%r9,__LC_PGM_OLD_PSW - HANDLE_SIE_INTERCEPT %r14 + HANDLE_SIE_INTERCEPT %r14,1 tmhh %r8,0x0001 # test problem state bit jnz 1f # -> fault in user space tmhh %r8,0x4000 # PER bit set in old PSW ? @@ -467,7 +474,7 @@ ENTRY(io_int_handler) lg %r12,__LC_THREAD_INFO larl %r13,system_call lmg %r8,%r9,__LC_IO_OLD_PSW - HANDLE_SIE_INTERCEPT %r14 + HANDLE_SIE_INTERCEPT %r14,0 SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_STACK,STACK_SHIFT tmhh %r8,0x0001 # interrupting from user? jz io_skip @@ -613,7 +620,7 @@ ENTRY(ext_int_handler) lg %r12,__LC_THREAD_INFO larl %r13,system_call lmg %r8,%r9,__LC_EXT_OLD_PSW - HANDLE_SIE_INTERCEPT %r14 + HANDLE_SIE_INTERCEPT %r14,0 SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_STACK,STACK_SHIFT tmhh %r8,0x0001 # interrupting from user ? jz ext_skip @@ -661,7 +668,7 @@ ENTRY(mcck_int_handler) lg %r12,__LC_THREAD_INFO larl %r13,system_call lmg %r8,%r9,__LC_MCK_OLD_PSW - HANDLE_SIE_INTERCEPT %r14 + HANDLE_SIE_INTERCEPT %r14,0 tm __LC_MCCK_CODE,0x80 # system damage? jo mcck_panic # yes -> rest of mcck code invalid lghi %r14,__LC_CPU_TIMER_SAVE_AREA @@ -960,6 +967,13 @@ ENTRY(sie64a) stg %r3,__SF_EMPTY+8(%r15) # save guest register save area xc __SF_EMPTY+16(8,%r15),__SF_EMPTY+16(%r15) # host id == 0 lmg %r0,%r13,0(%r3) # load guest gprs 0-13 +# some program checks are suppressing. C code (e.g. do_protection_exception) +# will rewind the PSW by the ILC, which is 4 bytes in case of SIE. Other +# instructions in the sie_loop should not cause program interrupts. So +# lets use a nop (47 00 00 00) as a landing pad. +# See also HANDLE_SIE_INTERCEPT +rewind_pad: + nop 0 sie_loop: lg %r14,__LC_THREAD_INFO # pointer thread_info struct tm __TI_flags+7(%r14),_TIF_EXIT_SIE @@ -999,6 +1013,7 @@ sie_fault: .Lhost_id: .quad 0 + EX_TABLE(rewind_pad,sie_fault) EX_TABLE(sie_loop,sie_fault) #endif