From patchwork Wed Sep 3 17:47:21 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paulo Zanoni X-Patchwork-Id: 4836191 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 537F0C0338 for ; Wed, 3 Sep 2014 17:47:38 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4CBDD20127 for ; Wed, 3 Sep 2014 17:47:37 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 4650220120 for ; Wed, 3 Sep 2014 17:47:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C354F6E59F; Wed, 3 Sep 2014 10:47:35 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-yh0-f45.google.com (mail-yh0-f45.google.com [209.85.213.45]) by gabe.freedesktop.org (Postfix) with ESMTP id DBC126E59F for ; Wed, 3 Sep 2014 10:47:34 -0700 (PDT) Received: by mail-yh0-f45.google.com with SMTP id i57so5534698yha.18 for ; Wed, 03 Sep 2014 10:47:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=FeByifpZeqlSRGse9fXitgtkQpxoSaoiMjc56pNaJoU=; b=waZDg4QpFmT5v4GOPqMmZJ7UHGJkom8s9VguaJtK42FkybdE+hK0rHfsorC/wHKXHU IfyEih/P/+xPfRYwmJP/nvhv4OgF/L4DbcCqMEExWcx1iQvJXwEbzKkMlgTItsFWRC1q P/EpttzZDyYf10CTpHTbyfMLBuAhTgT8qcT8fdU7aX1MH+M6mSvrZwusMrx3rQNntOX1 SKqkXJ/knuA5YCAUCPo/NxdVG325yi9HjGF6FLO4w3v6xLYimvAEYqB034Yme+l3WGZK grIMpovl6wSlwXihHlCKSxWVU3anXC/Mc+o5iYfEbXiVDslHAPj+cMMP22Lo/pZWQOvg GWuA== X-Received: by 10.236.126.169 with SMTP id b29mr2841070yhi.191.1409766454220; Wed, 03 Sep 2014 10:47:34 -0700 (PDT) Received: from localhost.localdomain ([177.132.105.192]) by mx.google.com with ESMTPSA id p27sm7527554yhd.1.2014.09.03.10.47.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 03 Sep 2014 10:47:33 -0700 (PDT) From: Paulo Zanoni To: intel-gfx@lists.freedesktop.org Date: Wed, 3 Sep 2014 14:47:21 -0300 Message-Id: <1409766441-3886-1-git-send-email-przanoni@gmail.com> X-Mailer: git-send-email 2.1.0 Cc: Paulo Zanoni , Thomas Wood Subject: [Intel-gfx] [PATCH] igt_core: zero exit_handler_count before forking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-5.8 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Paulo Zanoni If we don't reset exit_handler_count before forking, we may have a case where the forked process is killed before it even does "exit_handler_count = 0": in that case, it is still finishing forking. When that happens, we may end up calling our exit handlers. On the specific bug I'm investigating, we call igt_reset_connnectors(), which ends up in a deadlock inside malloc_atfork. If we attach gdb to the forked process and get a backtrace, we have: (gdb) bt 0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95 1 0x00007f15634d36bf in _L_lock_10524 () from /lib/x86_64-linux-gnu/libc.so.6 2 0x00007f15634d12ef in malloc_atfork (sz=139729840351352, caller=) at arena.c:181 3 0x00007f15640466a1 in drmMalloc () from /usr/lib/x86_64-linux-gnu/libdrm.so.2 4 0x00007f1564049ad7 in drmModeGetResources () from /usr/lib/x86_64-linux-gnu/libdrm.so.2 5 0x0000000000408f84 in igt_reset_connectors () at igt_kms.c:1656 6 0x00000000004092dc in call_exit_handlers (sig=15) at igt_core.c:1130 7 fatal_sig_handler (sig=15) at igt_core.c:1154 8 9 0x00007f15634cce60 in ptmalloc_unlock_all2 () at arena.c:298 10 0x00007f156350ca3f in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:188 11 0x000000000040a029 in __igt_fork_helper (proc=proc@entry=0x610fc4 ) at igt_core.c:910 12 0x000000000040459d in igt_fork_signal_helper () at igt_aux.c:110 13 0x0000000000402ab7 in __real_main63 () at bug.c:76 14 0x000000000040296e in main (argc=, argv=) at bug.c:63 After doing some searches for "stuck at malloc_atfork", it seems to me we probably shouldn't be doing any malloc calls at this point of the code, so the best way to do that is to make sure we can't really run the exit handlers. So on this patch, instead of resetting the exit handlers after forking, we reset them before forking, and then restore the original value on the parent process. I can reproduce this problem by running "./kms_flip --run-subtest 2x-flip-vs-modeset" under an infinite loop. Usually after a few hundred calls, we end up stuck on the deadlock mentioned above. QA says this problem happens every time, but I'm not sure what is the difference between our environments that makes the race condition so much easier for them. The kms_flip.c problem can be considered a regression introduced by: commit eef768f283466b6d7cb3f08381f72ccf3951dc99 Author: Thomas Wood Date: Wed Jun 18 14:28:43 2014 +0100 tests: enable extra connectors in kms_flip and kms_pipe_crc_basic even though this commit is not the one that introduced the real problem. It is also possible to reproduce this problem with a few modifications to template.c: - Add a call to igt_enable_connectors() inside the first fixture. - Add igt_fork_signal_helper() and igt_stop_signal_helper() calls around subtest B. Cc: Thomas Wood Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81367 Signed-off-by: Paulo Zanoni --- lib/igt_core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/igt_core.c b/lib/igt_core.c index 7f14b17..e9a27e3 100644 --- a/lib/igt_core.c +++ b/lib/igt_core.c @@ -895,6 +895,7 @@ bool __igt_fork_helper(struct igt_helper_process *proc) { pid_t pid; int id; + int tmp_count; assert(!proc->running); assert(helper_process_count < ARRAY_SIZE(helper_process_pids)); @@ -904,16 +905,19 @@ bool __igt_fork_helper(struct igt_helper_process *proc) igt_install_exit_handler(fork_helper_exit_handler); + tmp_count = exit_handler_count; + exit_handler_count = 0; switch (pid = fork()) { case -1: + exit_handler_count = tmp_count; igt_assert(0); case 0: - exit_handler_count = 0; reset_helper_process_list(); oom_adjust_for_doom(); return true; default: + exit_handler_count = tmp_count; proc->running = true; proc->pid = pid; proc->id = id;