From patchwork Fri Oct 9 22:05:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 11829721
Date: Fri, 9 Oct 2020 15:05:23 -0700
In-Reply-To: <20201009220524.485102-1-axelrasmussen@google.com>
Message-Id: <20201009220524.485102-2-axelrasmussen@google.com>
References: <20201009220524.485102-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog
Subject: [PATCH v3 1/2] tracing: support "bool" type in synthetic trace events
From: Axel Rasmussen
To: Steven Rostedt, Ingo Molnar, Andrew Morton, Michel Lespinasse,
 Vlastimil Babka, Daniel Jordan, Laurent Dufour, Axel Rasmussen,
 Jann Horn, Chinwen Chang
Cc: Yafang Shao, linux-kernel@vger.kernel.org, linux-mm@kvack.org

It's common [1] to define tracepoint fields as "bool" when they contain a true / false
value. Currently, defining a synthetic event with a "bool" field yields
EINVAL. It's possible to work around this by using e.g. u8 (assuming
sizeof(bool) is 1, and bool is unsigned; if either of these properties
doesn't match, you get EINVAL [2]).

Supporting "bool" explicitly makes hooking this up easier and more
portable for userspace.

[1]: grep -r "bool" include/trace/events/
[2]: check_synth_field() in kernel/trace/trace_events_hist.c

Acked-by: Michel Lespinasse
Signed-off-by: Axel Rasmussen
Acked-by: Tom Zanussi
Acked-by: David Rientjes
---
 kernel/trace/trace_events_synth.c | 4 ++++
 1 file changed, 4 insertions(+)

--
2.28.0.1011.ga647a8990f-goog

diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 8e1974fbad0e..8f94c84349a6 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -234,6 +234,8 @@ static int synth_field_size(char *type)
 		size = sizeof(long);
 	else if (strcmp(type, "unsigned long") == 0)
 		size = sizeof(unsigned long);
+	else if (strcmp(type, "bool") == 0)
+		size = sizeof(bool);
 	else if (strcmp(type, "pid_t") == 0)
 		size = sizeof(pid_t);
 	else if (strcmp(type, "gfp_t") == 0)
@@ -276,6 +278,8 @@ static const char *synth_field_fmt(char *type)
 		fmt = "%ld";
 	else if (strcmp(type, "unsigned long") == 0)
 		fmt = "%lu";
+	else if (strcmp(type, "bool") == 0)
+		fmt = "%d";
 	else if (strcmp(type, "pid_t") == 0)
 		fmt = "%d";
 	else if (strcmp(type, "gfp_t") == 0)

From patchwork Fri Oct 9 22:05:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 11829723
Date: Fri, 9 Oct 2020 15:05:24 -0700
In-Reply-To: <20201009220524.485102-1-axelrasmussen@google.com>
Message-Id: <20201009220524.485102-3-axelrasmussen@google.com>
References: <20201009220524.485102-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog
Subject: [PATCH v3 2/2] mmap_lock: add tracepoints around lock acquisition
From: Axel Rasmussen
To: Steven Rostedt, Ingo Molnar, Andrew Morton, Michel Lespinasse,
 Vlastimil Babka, Daniel Jordan, Laurent Dufour, Axel Rasmussen,
 Jann Horn, Chinwen Chang
Cc: Yafang Shao, linux-kernel@vger.kernel.org, linux-mm@kvack.org

The goal of these tracepoints is to be able to debug lock contention
issues. This lock is acquired on most (all?) mmap / munmap / page fault
operations, so a multi-threaded process which does a lot of these can
experience significant contention.
We trace just before we start acquisition, when the acquisition returns
(whether it succeeded or not), and when the lock is released (or
downgraded). The events are broken out by lock type (read / write).

The events are also broken out by memcg path. For container-based
workloads, users often think of several processes in a memcg as a single
logical "task", so collecting statistics at this level is useful.

The end goal is to get latency information. This isn't directly included
in the trace events. Instead, users are expected to compute the time
between "start locking" and "acquire returned", using e.g. synthetic
events or BPF. The benefit we get from this is simpler code.

Because we use tracepoint_enabled() to decide whether or not to trace,
this patch has effectively no overhead unless tracepoints are enabled at
runtime. If tracepoints are enabled, there is a performance impact, but
how much depends on exactly what e.g. the BPF program does.

Signed-off-by: Axel Rasmussen
Reviewed-by: Michel Lespinasse
Acked-by: Yafang Shao
Acked-by: David Rientjes
---
 include/linux/mmap_lock.h        | 93 ++++++++++++++++++++++++++++++--
 include/trace/events/mmap_lock.h | 70 ++++++++++++++++++++++++
 mm/Makefile                      |  2 +-
 mm/mmap_lock.c                   | 87 ++++++++++++++++++++++++++++++
 4 files changed, 246 insertions(+), 6 deletions(-)
 create mode 100644 include/trace/events/mmap_lock.h
 create mode 100644 mm/mmap_lock.c

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 0707671851a8..6586b42b4faa 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -1,11 +1,63 @@
 #ifndef _LINUX_MMAP_LOCK_H
 #define _LINUX_MMAP_LOCK_H
 
+#include
+#include
 #include
+#include
+#include
+#include
 
 #define MMAP_LOCK_INITIALIZER(name) \
 	.mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
 
+DECLARE_TRACEPOINT(mmap_lock_start_locking);
+DECLARE_TRACEPOINT(mmap_lock_acquire_returned);
+DECLARE_TRACEPOINT(mmap_lock_released);
+
+#ifdef CONFIG_TRACING
+
+void __mmap_lock_do_trace_start_locking(struct mm_struct *mm, bool write);
+void __mmap_lock_do_trace_acquire_returned(struct mm_struct *mm, bool write,
+					   bool success);
+void __mmap_lock_do_trace_released(struct mm_struct *mm, bool write);
+
+static inline void __mmap_lock_trace_start_locking(struct mm_struct *mm,
+						   bool write)
+{
+	if (tracepoint_enabled(mmap_lock_start_locking))
+		__mmap_lock_do_trace_start_locking(mm, write);
+}
+
+static inline void __mmap_lock_trace_acquire_returned(struct mm_struct *mm,
+						      bool write, bool success)
+{
+	if (tracepoint_enabled(mmap_lock_acquire_returned))
+		__mmap_lock_do_trace_acquire_returned(mm, write, success);
+}
+
+static inline void __mmap_lock_trace_released(struct mm_struct *mm, bool write)
+{
+	if (tracepoint_enabled(mmap_lock_released))
+		__mmap_lock_do_trace_released(mm, write);
+}
+
+#else /* !CONFIG_TRACING */
+
+static inline void __mmap_lock_trace_start_locking(struct mm_struct *mm,
+						   bool write)
+{
+}
+static inline void __mmap_lock_trace_acquire_returned(struct mm_struct *mm,
+						      bool write, bool success)
+{
+}
+static inline void __mmap_lock_trace_released(struct mm_struct *mm, bool write)
+{
+}
+
+#endif /* CONFIG_TRACING */
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
@@ -13,58 +65,88 @@ static inline void mmap_init_lock(struct mm_struct *mm)
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
+	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
 static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
+	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
 static inline int mmap_write_lock_killable(struct mm_struct *mm)
 {
-	return down_write_killable(&mm->mmap_lock);
+	int ret;
+
+	__mmap_lock_trace_start_locking(mm, true);
+	ret = down_write_killable(&mm->mmap_lock);
+	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
+	return ret;
 }
 
 static inline bool mmap_write_trylock(struct mm_struct *mm)
 {
-	return down_write_trylock(&mm->mmap_lock) != 0;
+	bool ret;
+
+	__mmap_lock_trace_start_locking(mm, true);
+	ret = down_write_trylock(&mm->mmap_lock) != 0;
+	__mmap_lock_trace_acquire_returned(mm, true, ret);
+	return ret;
 }
 
 static inline void mmap_write_unlock(struct mm_struct *mm)
 {
 	up_write(&mm->mmap_lock);
+	__mmap_lock_trace_released(mm, true);
 }
 
 static inline void mmap_write_downgrade(struct mm_struct *mm)
 {
 	downgrade_write(&mm->mmap_lock);
+	__mmap_lock_trace_acquire_returned(mm, false, true);
 }
 
 static inline void mmap_read_lock(struct mm_struct *mm)
 {
+	__mmap_lock_trace_start_locking(mm, false);
 	down_read(&mm->mmap_lock);
+	__mmap_lock_trace_acquire_returned(mm, false, true);
 }
 
 static inline int mmap_read_lock_killable(struct mm_struct *mm)
 {
-	return down_read_killable(&mm->mmap_lock);
+	int ret;
+
+	__mmap_lock_trace_start_locking(mm, false);
+	ret = down_read_killable(&mm->mmap_lock);
+	__mmap_lock_trace_acquire_returned(mm, false, ret == 0);
+	return ret;
 }
 
 static inline bool mmap_read_trylock(struct mm_struct *mm)
 {
-	return down_read_trylock(&mm->mmap_lock) != 0;
+	bool ret;
+
+	__mmap_lock_trace_start_locking(mm, false);
+	ret = down_read_trylock(&mm->mmap_lock) != 0;
+	__mmap_lock_trace_acquire_returned(mm, false, ret);
+	return ret;
 }
 
 static inline void mmap_read_unlock(struct mm_struct *mm)
 {
 	up_read(&mm->mmap_lock);
+	__mmap_lock_trace_released(mm, false);
 }
 
 static inline bool mmap_read_trylock_non_owner(struct mm_struct *mm)
 {
-	if (down_read_trylock(&mm->mmap_lock)) {
+	if (mmap_read_trylock(mm)) {
 		rwsem_release(&mm->mmap_lock.dep_map, _RET_IP_);
+		__mmap_lock_trace_released(mm, false);
 		return true;
 	}
 	return false;
@@ -73,6 +155,7 @@ static inline bool mmap_read_trylock_non_owner(struct mm_struct *mm)
 static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
 {
 	up_read_non_owner(&mm->mmap_lock);
+	__mmap_lock_trace_released(mm, false);
 }
 
 static inline void mmap_assert_locked(struct mm_struct *mm)
diff --git a/include/trace/events/mmap_lock.h b/include/trace/events/mmap_lock.h
new file mode 100644
index 000000000000..ca652b52510e
--- /dev/null
+++ b/include/trace/events/mmap_lock.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mmap_lock
+
+#if !defined(_TRACE_MMAP_LOCK_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MMAP_LOCK_H
+
+#include
+#include
+
+struct mm_struct;
+
+DECLARE_EVENT_CLASS(
+	mmap_lock_template,
+
+	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write,
+		bool success),
+
+	TP_ARGS(mm, memcg_path, write, success),
+
+	TP_STRUCT__entry(
+		__field(struct mm_struct *, mm)
+		__string(memcg_path, memcg_path)
+		__field(bool, write)
+		__field(bool, success)
+	),
+
+	TP_fast_assign(
+		__entry->mm = mm;
+		__assign_str(memcg_path, memcg_path);
+		__entry->write = write;
+		__entry->success = success;
+	),
+
+	TP_printk(
+		"mm=%p memcg_path=%s write=%s success=%s\n",
+		__entry->mm,
+		__get_str(memcg_path),
+		__entry->write ? "true" : "false",
+		__entry->success ? "true" : "false")
+	);
+
+DEFINE_EVENT(mmap_lock_template, mmap_lock_start_locking,
+
+	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write,
+		bool success),
+
+	TP_ARGS(mm, memcg_path, write, success)
+);
+
+DEFINE_EVENT(mmap_lock_template, mmap_lock_acquire_returned,
+
+	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write,
+		bool success),
+
+	TP_ARGS(mm, memcg_path, write, success)
+);
+
+DEFINE_EVENT(mmap_lock_template, mmap_lock_released,
+
+	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write,
+		bool success),
+
+	TP_ARGS(mm, memcg_path, write, success)
+);
+
+#endif /* _TRACE_MMAP_LOCK_H */
+
+/* This part must be outside protection */
+#include
diff --git a/mm/Makefile b/mm/Makefile
index d5649f1c12c0..1a7ea212fd8b 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -52,7 +52,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   mm_init.o percpu.o slab_common.o \
 			   compaction.o vmacache.o \
 			   interval_tree.o list_lru.o workingset.o \
-			   debug.o gup.o $(mmu-y)
+			   debug.o gup.o mmap_lock.o $(mmu-y)
 
 # Give 'page_alloc' its own module-parameter namespace
 page-alloc-y := page_alloc.o
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
new file mode 100644
index 000000000000..b849287bd12a
--- /dev/null
+++ b/mm/mmap_lock.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+#define CREATE_TRACE_POINTS
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * We have to export these, as drivers use mmap_lock, and our inline functions
+ * in the header check if the tracepoint is enabled. They can't be GPL, as e.g.
+ * the nvidia driver is an existing caller of this code.
+ */
+EXPORT_SYMBOL(__tracepoint_mmap_lock_start_locking);
+EXPORT_SYMBOL(__tracepoint_mmap_lock_acquire_returned);
+EXPORT_SYMBOL(__tracepoint_mmap_lock_released);
+
+#ifdef CONFIG_MEMCG
+
+DEFINE_PER_CPU(char[MAX_FILTER_STR_VAL], trace_memcg_path);
+
+/*
+ * Write the given mm_struct's memcg path to a percpu buffer, and return a
+ * pointer to it. If the path cannot be determined, the buffer will contain the
+ * empty string.
+ *
+ * Note: buffers are allocated per-cpu to avoid locking, so preemption must be
+ * disabled by the caller before calling us, and re-enabled only after the
+ * caller is done with the pointer.
+ */
+static const char *get_mm_memcg_path(struct mm_struct *mm)
+{
+	struct mem_cgroup *memcg = get_mem_cgroup_from_mm(mm);
+
+	if (memcg != NULL && likely(memcg->css.cgroup != NULL)) {
+		char *buf = this_cpu_ptr(trace_memcg_path);
+
+		cgroup_path(memcg->css.cgroup, buf, MAX_FILTER_STR_VAL);
+		return buf;
+	}
+	return "";
+}
+
+#define TRACE_MMAP_LOCK_EVENT(type, mm, ...) \
+	do { \
+		if (trace_mmap_lock_##type##_enabled()) { \
+			get_cpu(); \
+			trace_mmap_lock_##type(mm, get_mm_memcg_path(mm), \
+					       ##__VA_ARGS__); \
+			put_cpu(); \
+		} \
+	} while (0)
+
+#else /* !CONFIG_MEMCG */
+
+#define TRACE_MMAP_LOCK_EVENT(type, mm, ...) \
+	trace_mmap_lock_##type(mm, "", ##__VA_ARGS__)
+
+#endif /* CONFIG_MEMCG */
+
+/*
+ * Trace calls must be in a separate file, as otherwise there's a circular
+ * dependency between linux/mmap_lock.h and trace/events/mmap_lock.h.
+ */
+
+void __mmap_lock_do_trace_start_locking(struct mm_struct *mm, bool write)
+{
+	TRACE_MMAP_LOCK_EVENT(start_locking, mm, write, true);
+}
+EXPORT_SYMBOL(__mmap_lock_do_trace_start_locking);
+
+void __mmap_lock_do_trace_acquire_returned(struct mm_struct *mm, bool write,
+					   bool success)
+{
+	TRACE_MMAP_LOCK_EVENT(acquire_returned, mm, write, success);
+}
+EXPORT_SYMBOL(__mmap_lock_do_trace_acquire_returned);
+
+void __mmap_lock_do_trace_released(struct mm_struct *mm, bool write)
+{
+	TRACE_MMAP_LOCK_EVENT(released, mm, write, true);
+}
+EXPORT_SYMBOL(__mmap_lock_do_trace_released);
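
The commit message for patch 2/2 leaves latency computation to the
consumer: pair each "start locking" event with the following "acquire
returned" event and subtract the timestamps. The userspace sketch below
illustrates that pairing only; it is not part of the patch. It assumes the
trace records have already been parsed into a hypothetical `struct
lock_event`, and the type, the field names and `acquire_latencies()` are
invented for illustration. Events are matched on (pid, mm, write). An
"acquire returned" with no pending start is skipped, which is what
mmap_write_downgrade() produces in this patch, since it emits
acquire_returned(write=false) without a preceding start_locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-parsed form of one trace record (not a kernel type). */
enum ev_kind { EV_START_LOCKING, EV_ACQUIRE_RETURNED, EV_RELEASED };

struct lock_event {
	uint64_t ts_ns;    /* event timestamp, nanoseconds */
	int pid;           /* task that emitted the event */
	uint64_t mm;       /* value of the mm= field, used only as a key */
	enum ev_kind kind;
	bool write;        /* write= field */
	bool success;      /* success= field (not needed for the latency) */
};

#define MAX_PENDING 64     /* arbitrary bound on in-flight acquisitions */

struct pending {
	bool used;
	int pid;
	uint64_t mm;
	bool write;
	uint64_t start_ns;
};

/*
 * Pair each acquire_returned with the most recent unmatched start_locking
 * that has the same (pid, mm, write) and store the elapsed time in out[].
 * Failed acquisitions are counted too, since they also spent time waiting.
 * Returns the number of latencies written (at most out_cap).
 */
static size_t acquire_latencies(const struct lock_event *ev, size_t n,
				uint64_t *out, size_t out_cap)
{
	struct pending tab[MAX_PENDING] = { 0 };
	size_t count = 0;

	assert(ev != NULL || n == 0);
	assert(out != NULL || out_cap == 0);

	for (size_t i = 0; i < n; i++) {
		const struct lock_event *e = &ev[i];
		struct pending *slot = NULL, *free_slot = NULL;

		if (e->kind == EV_RELEASED)
			continue;  /* hold time is not measured here */

		/* Find a pending start with the same key, or a free slot. */
		for (size_t j = 0; j < MAX_PENDING; j++) {
			struct pending *p = &tab[j];

			if (!p->used) {
				if (free_slot == NULL)
					free_slot = p;
			} else if (p->pid == e->pid && p->mm == e->mm &&
				   p->write == e->write) {
				slot = p;
				break;
			}
		}

		if (e->kind == EV_START_LOCKING) {
			if (slot == NULL)
				slot = free_slot;
			if (slot == NULL)
				continue;  /* table full: drop this sample */
			slot->used = true;
			slot->pid = e->pid;
			slot->mm = e->mm;
			slot->write = e->write;
			slot->start_ns = e->ts_ns;
		} else if (slot != NULL) {
			/* acquire_returned with a matching start */
			if (count < out_cap)
				out[count++] = e->ts_ns - slot->start_ns;
			slot->used = false;
		}
		/* else: unmatched acquire_returned (e.g. a downgrade), skipped */
	}
	return count;
}
```

A caller would fill an array of `struct lock_event` in timestamp order
and pass it to `acquire_latencies()`. Grouping the results by memcg_path
instead of by task would follow the same pattern with a different key.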