From patchwork Thu May 5 08:14:52 2016
X-Patchwork-Submitter: Jan Kara
X-Patchwork-Id: 9021671
From: Jan Kara
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, mhocko@suse.cz,
    Tetsuo Handa, tj@kernel.org, Jan Kara
Subject: [PATCH] writeback: Avoid exhausting allocation reserves under memory pressure
Date: Thu, 5 May 2016 10:14:52 +0200
Message-Id: <1462436092-32665-1-git-send-email-jack@suse.cz>
X-Mailer: git-send-email 2.6.6

When the system is under memory pressure, memory management frequently
calls wakeup_flusher_threads() to write back pages so that they can be
freed. This was observed to exhaust the reserves for atomic allocations,
since wakeup_flusher_threads() allocates one writeback work item with
GFP_ATOMIC for each device with dirty data. However, it is pointless to
allocate a new work item when the requested work is identical to one
already pending. Instead, we can merge the new request into the pending
work item and thus save the memory allocation.

Reported-by: Tetsuo Handa
Signed-off-by: Jan Kara
---
 fs/fs-writeback.c                | 37 +++++++++++++++++++++++++++++++++++++
 include/trace/events/writeback.h |  1 +
 2 files changed, 38 insertions(+)

This patch should (and in my basic testing does) address the issue with
many atomic allocations that Tetsuo reported. What do people think?
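For readers outside the writeback code, here is a minimal userspace
sketch of the merge-before-allocate pattern the patch applies: before
allocating a new work item, scan the pending list for an equivalent
request and fold the new page count into it. The types and names here
(work_item, merge_request, queue_writeback) are simplified stand-ins,
not the kernel API; the real wb_merge_request() below additionally
matches on sb, auto_free and for_sync, and holds wb->work_lock while
scanning.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct work_item {
	long nr_pages;
	int reason;
	bool range_cyclic;
	struct work_item *next;
};

static struct work_item *pending;	/* stand-in for wb->work_list */

/* Return true if an equivalent pending request absorbed this one. */
static bool merge_request(long nr_pages, int reason, bool range_cyclic)
{
	struct work_item *w;

	for (w = pending; w; w = w->next) {
		if (w->reason == reason && w->range_cyclic == range_cyclic) {
			w->nr_pages += nr_pages;	/* merge, no alloc */
			return true;
		}
	}
	return false;
}

static void queue_writeback(long nr_pages, int reason, bool range_cyclic)
{
	struct work_item *w;

	if (merge_request(nr_pages, reason, range_cyclic))
		return;

	w = malloc(sizeof(*w));		/* the allocation we want to avoid */
	if (!w)
		return;			/* WB_SYNC_NONE: dropping is tolerable */
	w->nr_pages = nr_pages;
	w->reason = reason;
	w->range_cyclic = range_cyclic;
	w->next = pending;
	pending = w;
}

int main(void)
{
	/* Two identical wakeups: the second merges into the first item. */
	queue_writeback(1024, 0, true);
	queue_writeback(512, 0, true);
	printf("pending items: %d, nr_pages: %ld\n",
	       (pending && !pending->next) ? 1 : 2,
	       pending ? pending->nr_pages : 0);
	return 0;
}

Under repeated identical wakeups this keeps the pending list at one
item per distinct request instead of growing by one GFP_ATOMIC
allocation per call, which is the point of the patch below.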
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index fee81e8768c9..bb6725f5b1ba 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -189,6 +189,35 @@ out_unlock:
 	spin_unlock_bh(&wb->work_lock);
 }
 
+/*
+ * Check whether the request to write back some pages can be merged with
+ * another request which is already pending. If yes, merge it and return
+ * true. If no, return false.
+ */
+static bool wb_merge_request(struct bdi_writeback *wb, long nr_pages,
+			     struct super_block *sb, bool range_cyclic,
+			     enum wb_reason reason)
+{
+	struct wb_writeback_work *work;
+	bool merged = false;
+
+	spin_lock_bh(&wb->work_lock);
+	list_for_each_entry(work, &wb->work_list, list) {
+		if (work->reason == reason &&
+		    work->range_cyclic == range_cyclic &&
+		    work->auto_free == 1 && work->sb == sb &&
+		    work->for_sync == 0) {
+			work->nr_pages += nr_pages;
+			merged = true;
+			trace_writeback_merged(wb, work);
+			break;
+		}
+	}
+	spin_unlock_bh(&wb->work_lock);
+
+	return merged;
+}
+
 /**
  * wb_wait_for_completion - wait for completion of bdi_writeback_works
  * @bdi: bdi work items were issued to
@@ -928,6 +957,14 @@ void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 		return;
 
 	/*
+	 * Can we merge the current request with another pending one? It saves
+	 * us an atomic allocation, which can be significant e.g. when MM is
+	 * under pressure and calls wakeup_flusher_threads() a lot.
+	 */
+	if (wb_merge_request(wb, nr_pages, NULL, range_cyclic, reason))
+		return;
+
+	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
 	 * wakeup the thread for old dirty data writeback
 	 */
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 73614ce1d204..84ad9fac475b 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -252,6 +252,7 @@ DEFINE_WRITEBACK_WORK_EVENT(writeback_exec);
 DEFINE_WRITEBACK_WORK_EVENT(writeback_start);
 DEFINE_WRITEBACK_WORK_EVENT(writeback_written);
 DEFINE_WRITEBACK_WORK_EVENT(writeback_wait);
+DEFINE_WRITEBACK_WORK_EVENT(writeback_merged);
 
 TRACE_EVENT(writeback_pages_written,
 	TP_PROTO(long pages_written),
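As a usage note, the new writeback_merged tracepoint can be observed
like the other writeback events via tracefs. A small sketch, assuming
tracefs is mounted at /sys/kernel/tracing (older setups typically have
it under /sys/kernel/debug/tracing) and the program runs as root:

#include <stdio.h>
#include <string.h>

int main(void)
{
	const char *enable =
		"/sys/kernel/tracing/events/writeback/writeback_merged/enable";
	char line[1024];
	FILE *f = fopen(enable, "w");

	if (!f) {
		perror("enable writeback_merged (root? tracefs mounted?)");
		return 1;
	}
	fputs("1", f);
	fclose(f);

	/* trace_pipe blocks until events arrive; stream merge events. */
	f = fopen("/sys/kernel/tracing/trace_pipe", "r");
	if (!f) {
		perror("open trace_pipe");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		if (strstr(line, "writeback_merged"))
			fputs(line, stdout);
	fclose(f);
	return 0;
}

Generating memory pressure while this runs should show merges being
hit; if the event never fires, requests are not being coalesced.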