From patchwork Tue Dec 11 14:29:41 2018
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 10723907
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes, Andrea Arcangeli, Mel Gorman
Cc: Michal Hocko, Linus Torvalds, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka
Subject: [RFC 3/3] mm, compaction: introduce deferred async compaction
Date: Tue, 11 Dec 2018 15:29:41 +0100
Message-Id: <20181211142941.20500-4-vbabka@suse.cz>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181211142941.20500-1-vbabka@suse.cz>
References: <20181211142941.20500-1-vbabka@suse.cz>

Deferring compaction happens when compaction fails to fulfill an allocation
request at the given order; a number of the following direct compaction
attempts for the same or higher orders is then skipped. With further
failures, that number grows exponentially up to 64. The deferral is reset
e.g. when compaction succeeds.

Until now, deferring compaction is only performed after a sync compaction
fails, and it then also blocks async compaction attempts. The rationale is
that only a failed sync compaction is expected to fully exhaust all
compaction potential of a zone. However, for THP page faults that use
__GFP_NORETRY, this means only async compaction is attempted and thus it
is never deferred, potentially resulting in pointless reclaim/compaction
attempts in a badly fragmented node.

This patch therefore tracks and checks async compaction deferred status in
addition, and mostly separately from sync compaction. This allows deferring
THP fault compaction without affecting any sync pageblock-order compaction.
Deferring for sync compaction however implies deferring for async
compaction as well. When deferred status is reset, it is reset for both
modes.
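[Editor's illustration, not part of the patch: the standalone C sketch below
models the per-mode deferral bookkeeping described above, i.e. exponential
backoff capped at 1 << 6 = 64 skipped attempts and sync deferral implying
async deferral. The struct zone_defer type, its initial values and the
main() driver are invented for the example; in the patch itself the same
counters live in struct zone as two-element arrays indexed by the sync flag.]

#include <stdbool.h>
#include <stdio.h>

#define COMPACT_MAX_DEFER_SHIFT 6       /* back off by at most 1 << 6 = 64 attempts */

struct zone_defer {
        unsigned int considered[2];     /* attempts seen since last deferral, [async, sync] */
        unsigned int defer_shift[2];    /* log2 of the number of attempts to skip */
        int order_failed[2];            /* lowest order that failed so far */
};

static void defer_compaction(struct zone_defer *z, int order, bool sync)
{
        z->considered[sync] = 0;
        z->defer_shift[sync]++;

        if (order < z->order_failed[sync])
                z->order_failed[sync] = order;

        if (z->defer_shift[sync] > COMPACT_MAX_DEFER_SHIFT)
                z->defer_shift[sync] = COMPACT_MAX_DEFER_SHIFT;

        /* the patch's key point: deferring sync compaction also defers async */
        if (sync)
                defer_compaction(z, order, false);
}

/* Returns true if a compaction attempt at this order should be skipped. */
static bool compaction_deferred(struct zone_defer *z, int order, bool sync)
{
        unsigned long limit = 1UL << z->defer_shift[sync];

        if (order < z->order_failed[sync])
                return false;

        if (++z->considered[sync] > limit)      /* avoid overflow, as the kernel does */
                z->considered[sync] = limit;

        return z->considered[sync] < limit;
}

int main(void)
{
        /* start as "never failed"; 11 stands in for a larger-than-THP order */
        struct zone_defer z = { .order_failed = { 11, 11 } };

        defer_compaction(&z, 9, true);  /* a failed sync compaction at order 9 */

        /* async attempts at order 9 are now deferred too... */
        printf("order 9 async deferred: %d\n", compaction_deferred(&z, 9, false));
        /* ...but lower orders may still be tried */
        printf("order 2 async deferred: %d\n", compaction_deferred(&z, 2, false));
        return 0;
}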
The expected outcome is less compaction/reclaim activity for failing THP
faults, likely at some expense to the THP fault success rate.

Signed-off-by: Vlastimil Babka
Cc: Andrea Arcangeli
Cc: David Rientjes
Cc: Mel Gorman
Cc: Michal Hocko
---
 include/linux/compaction.h        | 10 ++--
 include/linux/mmzone.h            |  6 +--
 include/trace/events/compaction.h | 29 ++++++-----
 mm/compaction.c                   | 80 ++++++++++++++++++-------------
 4 files changed, 71 insertions(+), 54 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 68250a57aace..f1d4dc1deec9 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -100,11 +100,11 @@ extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern enum compact_result compaction_suitable(struct zone *zone, int order,
                unsigned int alloc_flags, int classzone_idx);
 
-extern void defer_compaction(struct zone *zone, int order);
-extern bool compaction_deferred(struct zone *zone, int order);
+extern void defer_compaction(struct zone *zone, int order, bool sync);
+extern bool compaction_deferred(struct zone *zone, int order, bool sync);
 extern void compaction_defer_reset(struct zone *zone, int order,
                bool alloc_success);
-extern bool compaction_restarting(struct zone *zone, int order);
+extern bool compaction_restarting(struct zone *zone, int order, bool sync);
 
 /* Compaction has made some progress and retrying makes sense */
 static inline bool compaction_made_progress(enum compact_result result)
@@ -189,11 +189,11 @@ static inline enum compact_result compaction_suitable(struct zone *zone, int ord
        return COMPACT_SKIPPED;
 }
 
-static inline void defer_compaction(struct zone *zone, int order)
+static inline void defer_compaction(struct zone *zone, int order, bool sync)
 {
 }
 
-static inline bool compaction_deferred(struct zone *zone, int order)
+static inline bool compaction_deferred(struct zone *zone, int order, bool sync)
 {
        return true;
 }
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 847705a6d0ec..4c59996dd4f9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -492,9 +492,9 @@ struct zone {
         * are skipped before trying again. The number attempted since
         * last failure is tracked with compact_considered.
         */
-       unsigned int compact_considered;
-       unsigned int compact_defer_shift;
-       int compact_order_failed;
+       unsigned int compact_considered[2];
+       unsigned int compact_defer_shift[2];
+       int compact_order_failed[2];
 #endif
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 6074eff3d766..7ef40c76bfed 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -245,9 +245,9 @@ DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
 
 DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
-       TP_PROTO(struct zone *zone, int order),
+       TP_PROTO(struct zone *zone, int order, bool sync),
 
-       TP_ARGS(zone, order),
+       TP_ARGS(zone, order, sync),
 
        TP_STRUCT__entry(
                __field(int, nid)
@@ -256,45 +256,48 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
                __field(unsigned int, considered)
                __field(unsigned int, defer_shift)
                __field(int, order_failed)
+               __field(bool, sync)
        ),
 
        TP_fast_assign(
                __entry->nid = zone_to_nid(zone);
                __entry->idx = zone_idx(zone);
                __entry->order = order;
-               __entry->considered = zone->compact_considered;
-               __entry->defer_shift = zone->compact_defer_shift;
-               __entry->order_failed = zone->compact_order_failed;
+               __entry->considered = zone->compact_considered[sync];
+               __entry->defer_shift = zone->compact_defer_shift[sync];
+               __entry->order_failed = zone->compact_order_failed[sync];
+               __entry->sync = sync;
        ),
 
-       TP_printk("node=%d zone=%-8s order=%d order_failed=%d consider=%u limit=%lu",
+       TP_printk("node=%d zone=%-8s order=%d order_failed=%d consider=%u limit=%lu sync=%d",
                __entry->nid,
                __print_symbolic(__entry->idx, ZONE_TYPE),
                __entry->order,
                __entry->order_failed,
                __entry->considered,
-               1UL << __entry->defer_shift)
+               1UL << __entry->defer_shift,
+               __entry->sync)
 );
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_deferred,
 
-       TP_PROTO(struct zone *zone, int order),
+       TP_PROTO(struct zone *zone, int order, bool sync),
 
-       TP_ARGS(zone, order)
+       TP_ARGS(zone, order, sync)
 );
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_compaction,
 
-       TP_PROTO(struct zone *zone, int order),
+       TP_PROTO(struct zone *zone, int order, bool sync),
 
-       TP_ARGS(zone, order)
+       TP_ARGS(zone, order, sync)
 );
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_reset,
 
-       TP_PROTO(struct zone *zone, int order),
+       TP_PROTO(struct zone *zone, int order, bool sync),
 
-       TP_ARGS(zone, order)
+       TP_ARGS(zone, order, sync)
 );
 
 #endif
diff --git a/mm/compaction.c b/mm/compaction.c
index 7c607479de4a..cb139b63a754 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -139,36 +139,40 @@ EXPORT_SYMBOL(__ClearPageMovable);
  * allocation success. 1 << compact_defer_limit compactions are skipped up
  * to a limit of 1 << COMPACT_MAX_DEFER_SHIFT
  */
-void defer_compaction(struct zone *zone, int order)
+void defer_compaction(struct zone *zone, int order, bool sync)
 {
-       zone->compact_considered = 0;
-       zone->compact_defer_shift++;
+       zone->compact_considered[sync] = 0;
+       zone->compact_defer_shift[sync]++;
 
-       if (order < zone->compact_order_failed)
-               zone->compact_order_failed = order;
+       if (order < zone->compact_order_failed[sync])
+               zone->compact_order_failed[sync] = order;
 
-       if (zone->compact_defer_shift > COMPACT_MAX_DEFER_SHIFT)
-               zone->compact_defer_shift = COMPACT_MAX_DEFER_SHIFT;
+       if (zone->compact_defer_shift[sync] > COMPACT_MAX_DEFER_SHIFT)
+               zone->compact_defer_shift[sync] = COMPACT_MAX_DEFER_SHIFT;
 
-       trace_mm_compaction_defer_compaction(zone, order);
+       trace_mm_compaction_defer_compaction(zone, order, sync);
+
+       /* deferred sync compaction implies deferred async compaction */
+       if (sync)
+               defer_compaction(zone, order, false);
 }
 
 /* Returns true if compaction should be skipped this time */
-bool compaction_deferred(struct zone *zone, int order)
+bool compaction_deferred(struct zone *zone, int order, bool sync)
 {
-       unsigned long defer_limit = 1UL << zone->compact_defer_shift;
+       unsigned long defer_limit = 1UL << zone->compact_defer_shift[sync];
 
-       if (order < zone->compact_order_failed)
+       if (order < zone->compact_order_failed[sync])
                return false;
 
        /* Avoid possible overflow */
-       if (++zone->compact_considered > defer_limit)
-               zone->compact_considered = defer_limit;
+       if (++zone->compact_considered[sync] > defer_limit)
+               zone->compact_considered[sync] = defer_limit;
 
-       if (zone->compact_considered >= defer_limit)
+       if (zone->compact_considered[sync] >= defer_limit)
                return false;
 
-       trace_mm_compaction_deferred(zone, order);
+       trace_mm_compaction_deferred(zone, order, sync);
 
        return true;
 }
@@ -181,24 +185,32 @@ bool compaction_deferred(struct zone *zone, int order)
 void compaction_defer_reset(struct zone *zone, int order,
                bool alloc_success)
 {
-       if (alloc_success) {
-               zone->compact_considered = 0;
-               zone->compact_defer_shift = 0;
-       }
-       if (order >= zone->compact_order_failed)
-               zone->compact_order_failed = order + 1;
+       int sync;
+
+       for (sync = 0; sync <= 1; sync++) {
+               if (alloc_success) {
+                       zone->compact_considered[sync] = 0;
+                       zone->compact_defer_shift[sync] = 0;
+               }
+               if (order >= zone->compact_order_failed[sync])
+                       zone->compact_order_failed[sync] = order + 1;
 
-       trace_mm_compaction_defer_reset(zone, order);
+               trace_mm_compaction_defer_reset(zone, order, sync);
+       }
 }
 
 /* Returns true if restarting compaction after many failures */
-bool compaction_restarting(struct zone *zone, int order)
+bool compaction_restarting(struct zone *zone, int order, bool sync)
 {
-       if (order < zone->compact_order_failed)
+       int defer_shift;
+
+       if (order < zone->compact_order_failed[sync])
                return false;
 
-       return zone->compact_defer_shift == COMPACT_MAX_DEFER_SHIFT &&
-               zone->compact_considered >= 1UL << zone->compact_defer_shift;
+       defer_shift = zone->compact_defer_shift[sync];
+
+       return defer_shift == COMPACT_MAX_DEFER_SHIFT &&
+               zone->compact_considered[sync] >= 1UL << defer_shift;
 }
 
 /* Returns true if the pageblock should be scanned for pages to isolate. */
@@ -1555,7 +1567,7 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
         * Clear pageblock skip if there were failures recently and compaction
         * is about to be retried after being deferred.
         */
-       if (compaction_restarting(zone, cc->order))
+       if (compaction_restarting(zone, cc->order, sync))
                __reset_isolation_suitable(zone);
 
        /*
@@ -1767,7 +1779,8 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
                enum compact_result status;
 
                if (prio > MIN_COMPACT_PRIORITY
-                               && compaction_deferred(zone, order)) {
+                               && compaction_deferred(zone, order,
+                                       prio != COMPACT_PRIO_ASYNC)) {
                        rc = max_t(enum compact_result, COMPACT_DEFERRED, rc);
                        continue;
                }
@@ -1789,14 +1802,15 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
                        break;
                }
 
-               if (prio != COMPACT_PRIO_ASYNC && (status == COMPACT_COMPLETE ||
-                               status == COMPACT_PARTIAL_SKIPPED))
+               if (status == COMPACT_COMPLETE ||
+                               status == COMPACT_PARTIAL_SKIPPED)
                        /*
                         * We think that allocation won't succeed in this zone
                         * so we defer compaction there. If it ends up
                         * succeeding after all, it will be reset.
                         */
-                       defer_compaction(zone, order);
+                       defer_compaction(zone, order,
+                                       prio != COMPACT_PRIO_ASYNC);
 
                /*
                 * We might have stopped compacting due to need_resched() in
@@ -1966,7 +1980,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
                if (!populated_zone(zone))
                        continue;
 
-               if (compaction_deferred(zone, cc.order))
+               if (compaction_deferred(zone, cc.order, true))
                        continue;
 
                if (compaction_suitable(zone, cc.order, 0, zoneid) !=
@@ -2000,7 +2014,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
                 * We use sync migration mode here, so we defer like
                 * sync direct compaction does.
                 */
-               defer_compaction(zone, cc.order);
+               defer_compaction(zone, cc.order, true);
        }
 
        count_compact_events(KCOMPACTD_MIGRATE_SCANNED,