From patchwork Wed Apr 10 13:57:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 10893917 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4105317E6 for ; Wed, 10 Apr 2019 13:59:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2895C2865F for ; Wed, 10 Apr 2019 13:59:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1CADF2869B; Wed, 10 Apr 2019 13:59:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 441CB28675 for ; Wed, 10 Apr 2019 13:59:53 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1hEDjK-0003an-CO; Wed, 10 Apr 2019 13:57:34 +0000 Received: from us1-rack-dfw2.inumbo.com ([104.130.134.6]) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1hEDjI-0003ai-CM for xen-devel@lists.xenproject.org; Wed, 10 Apr 2019 13:57:32 +0000 X-Inumbo-ID: 946d1b24-5b98-11e9-92d7-bc764e045a96 Received: from prv1-mh.provo.novell.com (unknown [137.65.248.33]) by us1-rack-dfw2.inumbo.com (Halon) with ESMTPS id 946d1b24-5b98-11e9-92d7-bc764e045a96; Wed, 10 Apr 2019 13:57:29 +0000 (UTC) Received: from INET-PRV1-MTA by prv1-mh.provo.novell.com with Novell_GroupWise; Wed, 10 Apr 2019 07:57:29 -0600 Message-Id: <5CADF646020000780022642B@prv1-mh.provo.novell.com> X-Mailer: Novell GroupWise Internet Agent 18.1.0 Date: Wed, 10 Apr 2019 07:57:26 -0600 From: "Jan Beulich" To: "xen-devel" Mime-Version: 1.0 Content-Disposition: inline Subject: [Xen-devel] x86 guest EOI timer issues / questions X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Andrew Cooper , Wei Liu , Roger Pau Monne Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP All, Andrew's observation mentioned in "[PATCH v2] x86/smt: Support for enabling/disabling SMT at runtime" and some discussions we've already had with him made me look into how exactly the EOI timer works. I think there are a number of quirks. I'll enumerate all questions I've run into below. At the end of the mail is a patch carrying out some of the adjustments implied by a subset of these questions, plus the addition of some debugging code. It is perhaps worthwhile to note that the printk()-s irq_guest_eoi_timer_fn() both were observed to trigger prior to making the non-debugging code changes. With the patch in its current shape, I've seen trigger (every once in a while) only the printk() added to __do_IRQ_guest(). - Why does the timer not get stopped in desc_guest_eoi() when ->in_flight is zero? - Why does irq_guest_eoi_timer_fn() not re-arm the timer when ->in_flight is still non-zero after all the decrements? Can this case happen at all, considering that the only increment in __do_IRQ_guest() happens under the same desc lock? (If it can happen, I think it would be against the purpose of 37eb6d05fe, as this would mean the IRQ can then remain un-acked for an indefinite period of time, unless something else re-armed the timer.) - Why does irq_guest_eoi_timer_fn() issue ->end() / set_eoi_ready() even when no decrement of any ->in_flight was done at all? Neither for ACKTYPE_UNMASK nor for ACKTYPE_EOI this looks to be fully race free. - Wouldn't __do_IRQ_guest() better stop the timer early on? - Wouldn't __do_IRQ_guest() better avoid re-programming the timer if no ->in_flight increment was done (which I would have thought shouldn't happen anyway, but which I've observed in practice)? Of course this would mean the timer may only be stopped when the first increment occurs. - What about the timer triggering while __do_IRQ_guest() is active (holding the desc lock)? This looks to result in immediate expiry of the EOI deferral for the current interrupt instance (and I've observed this in practice, luckily with no other visible bad effects). - Is it okay for fixup_eoi() to fiddle with action->cpu_eoi_map without holding the desc lock? (I guess being in stop-machine / SMP-boot context makes this acceptable.) Commits of interest: f3922f4084 x86: Support CPU hotplug offline 37eb6d05fe x86: Automatically EOI guest-bound interrupts if guest takes too long Other notes / observations: - irq_guest_eoi_timer_fn() should not need to re-acquire the lock after on_selected_cpus() - irq_guest_eoi_timer_fn() should not need to use irqsave/irqrestore spin lock/unlock (plain irq ones should suffice in a timer handler) - irq_guest_eoi_timer_fn() could ASSERT(action->ack_type != ACKTYPE_NONE) instead of having such a conditional - fixup_eoi() may better avoid setting all peoi[].ready and instead pass a "force" boolean to flush_ready_eoi() Thoughts / opinions appreciated, Jan --- unstable.orig/xen/arch/x86/irq.c +++ unstable/xen/arch/x86/irq.c @@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data); static void irq_guest_eoi_timer_fn(void *data) { struct irq_desc *desc = data; - unsigned int irq = desc - irq_desc; + unsigned int irq = desc - irq_desc, done = 0; irq_guest_action_t *action; cpumask_t cpu_eoi_map; unsigned long flags; @@ -1115,6 +1115,16 @@ static void irq_guest_eoi_timer_fn(void action = (irq_guest_action_t *)desc->action; + if ( action->eoi_timer.status >= TIMER_STATUS_in_heap ) +{//temp + static unsigned long cnt, thr; + if(++cnt > thr) { + thr |= cnt; + printk("IRQ%u: i=%u a=%d n=%u\n", irq, action->in_flight, action->ack_type, action->nr_guests); + } + goto out; +} + if ( action->ack_type != ACKTYPE_NONE ) { unsigned int i; @@ -1122,11 +1132,29 @@ static void irq_guest_eoi_timer_fn(void { struct domain *d = action->guest[i]; unsigned int pirq = domain_irq_to_pirq(d, irq); + if ( test_and_clear_bool(pirq_info(d, pirq)->masked) ) - action->in_flight--; + ++done; } + action->in_flight -= done; } +{//temp + static unsigned long done_cnt, done_thr; + static unsigned long infl_cnt, infl_thr; + bool log = false; + if(action->in_flight && ++infl_cnt > infl_thr) { + infl_thr |= infl_cnt; + log = true; + } else if(!done && ++done_cnt > done_thr) { + done_thr |= done_cnt; + log = true; + } + if(log) + printk("IRQ%u: d=%u i=%u a=%d n=%u (%06lx,%06lx)\n", irq, done, action->in_flight, + action->ack_type, action->nr_guests, done_cnt, infl_cnt); +} +//todo Instead of the below, assert that ->in_flight is zero and bail if done is zero? if ( action->in_flight != 0 ) goto out; @@ -1156,6 +1184,7 @@ static void __do_IRQ_guest(int irq) int i, sp; struct pending_eoi *peoi = this_cpu(pending_eoi); unsigned int vector = (u8)get_irq_regs()->entry_vector; +unsigned done = 0;//temp if ( unlikely(action->nr_guests == 0) ) { @@ -1167,6 +1196,9 @@ static void __do_IRQ_guest(int irq) return; } + if ( action->ack_type != ACKTYPE_NONE ) + stop_timer(&action->eoi_timer); + if ( action->ack_type == ACKTYPE_EOI ) { sp = pending_eoi_sp(peoi); @@ -1187,6 +1219,7 @@ static void __do_IRQ_guest(int irq) pirq = pirq_info(d, domain_irq_to_pirq(d, irq)); if ( (action->ack_type != ACKTYPE_NONE) && !test_and_set_bool(pirq->masked) ) +++done,//temp action->in_flight++; if ( !is_hvm_domain(d) || !hvm_do_IRQ_dpci(d, pirq) ) send_guest_pirq(d, pirq); @@ -1194,10 +1227,16 @@ static void __do_IRQ_guest(int irq) if ( action->ack_type != ACKTYPE_NONE ) { - stop_timer(&action->eoi_timer); migrate_timer(&action->eoi_timer, smp_processor_id()); set_timer(&action->eoi_timer, NOW() + MILLISECS(1)); } +if(!done) {//temp + static unsigned long cnt, thr; + if(++cnt > thr) { + thr |= cnt; + printk("IRQ%d: d=0 n=%u i=%u a=%d\n", irq, action->nr_guests, action->in_flight, action->ack_type); + } +} } /* @@ -1457,6 +1496,8 @@ void desc_guest_eoi(struct irq_desc *des return; } + stop_timer(&action->eoi_timer); + if ( action->ack_type == ACKTYPE_UNMASK ) { ASSERT(cpumask_empty(action->cpu_eoi_map));