From patchwork Wed Oct 31 08:10:32 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jerry Lee
X-Patchwork-Id: 10662131
From: Jerry Lee
Date: Wed, 31 Oct 2018 16:10:32 +0800
Subject: ceph-mgr: requests to restful api get blocked sometimes
To: ceph-devel, branto@redhat.com
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

Hi,

We set up a Ceph cluster (v12.2.2) with the restful API plugin running,
but sometimes requests are blocked forever and never get a response.

While it was stuck in this state, we checked the netstat output, which
showed a non-empty Recv-Q (129) on the listening socket:

  [~] netstat -tupln
  Active Internet connections (only servers)
  Proto Recv-Q Send-Q Local Address      Foreign Address  State   PID/Program name
  tcp      129      0 192.168.2.1:8003   0.0.0.0:*        LISTEN  1885/ceph-mgr

We also captured a log entry that may be related to the issue:

  2018-10-29 13:43:00.058319 7fcd1891b700  1 mgr[restful] Unknown request '140518797573648:0'

After digging into the code, we wonder whether the requests list should
be protected by requests_lock, as in the following patch. The sequence
we suspect is that a request completes and the restful plugin is
notified, but the request has not yet been appended to the requests
list. That produces the "Unknown request" log entry, and the
submit_request() function then waits forever without accepting new
requests.

Any ideas and feedback are appreciated, thanks.

- Jerry

diff --git a/src/pybind/mgr/restful/module.py b/src/pybind/mgr/restful/module.py
index 6ce610b..bbe88ab 100644
--- a/src/pybind/mgr/restful/module.py
+++ b/src/pybind/mgr/restful/module.py
@@ -363,9 +363,10 @@ class Module(MgrModule):
             if tag == 'seq':
                 return
 
-            request = filter(
-                lambda x: x.is_running(tag),
-                self.requests)
+            with self.requests_lock:
+                request = filter(
+                    lambda x: x.is_running(tag),
+                    self.requests)
 
             if len(request) != 1:
                 self.log.warn("Unknown request '%s'" % str(tag))
@@ -596,8 +597,8 @@ class Module(MgrModule):
 
 
     def submit_request(self, _request, **kwargs):
-        request = CommandsRequest(_request)
         with self.requests_lock:
+            request = CommandsRequest(_request)
             self.requests.append(request)
         if kwargs.get('wait', 0):
             while not request.is_finished():
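
For anyone who wants to see the suspected ordering problem in isolation,
here is a minimal standalone sketch. It is not the restful module code:
FakeRequest, Registry and run() are hypothetical names, it is written for
Python 3 rather than the Python 2 used by v12.2.2, and it assumes that
constructing a request can start work whose completion is reported from
another thread before the request reaches the list. The wait() call only
widens the window so that the behaviour is reproducible.

```python
import threading


class FakeRequest:
    """Hypothetical stand-in for CommandsRequest: starts work when built."""

    def __init__(self, tag, on_done):
        self.tag = tag
        self.finished = False
        # Assumption: completion is reported from another thread,
        # possibly before the caller has stored this object anywhere.
        self.thread = threading.Thread(target=on_done, args=(tag,))
        self.thread.start()


class Registry:
    """Hypothetical stand-in for the module's requests list and lock."""

    def __init__(self, locked_create):
        self.requests = []
        self.requests_lock = threading.Lock()
        self.notified = threading.Event()
        self.unknown = []
        self.locked_create = locked_create

    def notify(self, tag):
        # Look the request up under the lock, as in the first hunk.
        with self.requests_lock:
            matches = [r for r in self.requests if r.tag == tag]
        if len(matches) == 1:
            matches[0].finished = True
        else:
            self.unknown.append(tag)  # the "Unknown request" case
        self.notified.set()

    def submit(self, tag):
        if self.locked_create:
            # Patched order: create and append under one lock hold.
            with self.requests_lock:
                request = FakeRequest(tag, self.notify)
                self.notified.wait(timeout=0.2)  # widen the window
                self.requests.append(request)
        else:
            # Original order: create first, then lock and append.
            request = FakeRequest(tag, self.notify)
            self.notified.wait(timeout=0.2)  # widen the window
            with self.requests_lock:
                self.requests.append(request)
        return request


def run(locked_create):
    registry = Registry(locked_create)
    request = registry.submit('t1')
    request.thread.join()
    return request.finished, registry.unknown
```

With locked_create=False the notification arrives while the list is
still empty, so the tag is recorded as unknown and the request is never
marked finished, which matches a submit_request() caller spinning in
"while not request.is_finished()". With locked_create=True the
notification has to wait for the lock, so it always finds the request.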