From patchwork Mon Dec 5 09:17:11 2016
X-Patchwork-Submitter: Huang Shijie
X-Patchwork-Id: 9460687
From: Huang Shijie
Subject: [PATCH v3 4/4] mm: hugetlb: support gigantic surplus pages
Date: Mon, 5 Dec 2016 17:17:11 +0800
Message-ID: <1480929431-22348-5-git-send-email-shijie.huang@arm.com>
In-Reply-To: <1480929431-22348-1-git-send-email-shijie.huang@arm.com>
References: <1480929431-22348-1-git-send-email-shijie.huang@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, kaly.xin@arm.com, mhocko@suse.com,
 kirill.shutemov@linux.intel.com, steve.capper@arm.com, will.deacon@arm.com,
 linux-mm@kvack.org, vbabka@suze.cz, aneesh.kumar@linux.vnet.ibm.com,
 Huang Shijie, n-horiguchi@ah.jp.nec.com, nd@arm.com,
 gerald.schaefer@de.ibm.com, mike.kravetz@oracle.com

When testing a gigantic page whose order is too large for the buddy
allocator, the libhugetlbfs test case "counters.sh" fails.
counters.sh is just a wrapper for counters.c; both can be found at:
  https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.c
  https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.sh

Please see the error log below:
 ............................................
    ........
    quota.sh (32M: 64):	PASS
    counters.sh (32M: 64):	FAIL mmap failed: Invalid argument
    ********** TEST SUMMARY
    *                      32M
    *                      32-bit 64-bit
    *     Total testcases:     0     87
    *             Skipped:     0      0
    *                PASS:     0     86
    *                FAIL:     0      1
    *    Killed by signal:     0      0
    *   Bad configuration:     0      0
    *       Expected FAIL:     0      0
    *     Unexpected PASS:     0      0
    * Strange test result:     0      0
    **********
 ............................................

The failure is caused by:
 1) The kernel fails to allocate a gigantic page for the surplus case,
    so gather_surplus_pages() will return NULL in the end.

 2) The condition checks for "over-commit" are wrong.

This patch does the following things:

 1) It changes the condition checks in:
     return_unused_surplus_pages()
     nr_overcommit_hugepages_store()
     hugetlb_overcommit_handler()

 2) It introduces two helper functions:
    huge_nodemask() and __hugetlb_alloc_gigantic_page().
    Please see the descriptions in the two functions.

 3) It uses __hugetlb_alloc_gigantic_page() to allocate the
    gigantic page in __alloc_huge_page().

After this patch, gather_surplus_pages() can return a gigantic page
for the surplus case, and counters.sh passes for the gigantic page.

Signed-off-by: Huang Shijie
---
 include/linux/mempolicy.h |  8 +++++
 mm/hugetlb.c              | 77 +++++++++++++++++++++++++++++++++++++++++++----
 mm/mempolicy.c            | 44 +++++++++++++++++++++++++++
 3 files changed, 123 insertions(+), 6 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 5f4d828..6539fbb 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -146,6 +146,8 @@ extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new,
 				enum mpol_rebind_step step);
 extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
 
+extern bool huge_nodemask(struct vm_area_struct *vma,
+				unsigned long addr, nodemask_t *mask);
 extern struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
 				struct mempolicy **mpol, nodemask_t **nodemask);
@@ -269,6 +271,12 @@ static inline void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 {
 }
 
+static inline bool huge_nodemask(struct vm_area_struct *vma,
+				unsigned long addr, nodemask_t *mask)
+{
+	return false;
+}
+
 static inline struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
 				struct mempolicy **mpol, nodemask_t **nodemask)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1395bef..04440b8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1506,6 +1506,69 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 
 /*
  * There are 3 ways this can get called:
+ *
+ * 1. When the NUMA is not enabled, use alloc_gigantic_page() to get
+ *    the gigantic page.
+ *
+ * 2. The NUMA is enabled, but the vma is NULL.
+ *    Initialize the @mask, and use alloc_fresh_gigantic_page() to get
+ *    the gigantic page.
+ *
+ * 3. The NUMA is enabled, and the vma is valid.
+ *    Use the @vma's memory policy.
+ *    Get @mask by huge_nodemask(), and use alloc_fresh_gigantic_page()
+ *    to get the gigantic page.
+ */
+static struct page *__hugetlb_alloc_gigantic_page(struct hstate *h,
+		struct vm_area_struct *vma, unsigned long addr, int nid)
+{
+	NODEMASK_ALLOC(nodemask_t, mask, GFP_KERNEL | __GFP_NORETRY);
+	struct page *page = NULL;
+
+	/* Not NUMA */
+	if (!IS_ENABLED(CONFIG_NUMA)) {
+		if (nid == NUMA_NO_NODE)
+			nid = numa_mem_id();
+
+		page = alloc_gigantic_page(nid, huge_page_order(h));
+		if (page)
+			prep_compound_gigantic_page(page, huge_page_order(h));
+		goto got_page;
+	}
+
+	/* NUMA && !vma */
+	if (!vma) {
+		/* First, check the mask */
+		if (!mask) {
+			mask = &node_states[N_MEMORY];
+		} else {
+			if (nid == NUMA_NO_NODE) {
+				if (!init_nodemask_of_mempolicy(mask)) {
+					NODEMASK_FREE(mask);
+					mask = &node_states[N_MEMORY];
+				}
+			} else {
+				init_nodemask_of_node(mask, nid);
+			}
+		}
+
+		page = alloc_fresh_gigantic_page(h, mask, false);
+		goto got_page;
+	}
+
+	/* NUMA && vma */
+	if (mask && huge_nodemask(vma, addr, mask))
+		page = alloc_fresh_gigantic_page(h, mask, false);
+
+got_page:
+	if (mask != &node_states[N_MEMORY])
+		NODEMASK_FREE(mask);
+
+	return page;
+}
+
+/*
+ * There are 3 ways this can get called:
  * 1. With vma+addr: we use the VMA's memory policy
  * 2. With !vma, but nid=NUMA_NO_NODE: We try to allocate a huge
  *    page from any node, and let the buddy allocator itself figure
@@ -1584,7 +1647,7 @@ static struct page *__alloc_huge_page(struct hstate *h,
 	struct page *page;
 	unsigned int r_nid;
 
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return NULL;
 
 	/*
@@ -1629,7 +1692,10 @@ static struct page *__alloc_huge_page(struct hstate *h,
 	}
 	spin_unlock(&hugetlb_lock);
 
-	page = __hugetlb_alloc_buddy_huge_page(h, vma, addr, nid);
+	if (hstate_is_gigantic(h))
+		page = __hugetlb_alloc_gigantic_page(h, vma, addr, nid);
+	else
+		page = __hugetlb_alloc_buddy_huge_page(h, vma, addr, nid);
 
 	spin_lock(&hugetlb_lock);
 	if (page) {
@@ -1796,8 +1862,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	/* Uncommit the reservation */
 	h->resv_huge_pages -= unused_resv_pages;
 
-	/* Cannot return gigantic pages currently */
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return;
 
 	nr_pages = min(unused_resv_pages, h->surplus_huge_pages);
@@ -2514,7 +2579,7 @@ static ssize_t nr_overcommit_hugepages_store(struct kobject *kobj,
 	unsigned long input;
 	struct hstate *h = kobj_to_hstate(kobj, NULL);
 
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, &input);
@@ -2966,7 +3031,7 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 
 	tmp = h->nr_overcommit_huge_pages;
 
-	if (write && hstate_is_gigantic(h))
+	if (write && hstate_is_gigantic(h) && !gigantic_page_supported())
 		return -EINVAL;
 
 	table->data = &tmp;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 6d3639e..3550a29 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1800,6 +1800,50 @@ static inline unsigned interleave_nid(struct mempolicy *pol,
 
 #ifdef CONFIG_HUGETLBFS
 /*
+ * huge_nodemask(@vma, @addr, @mask)
+ * @vma: virtual memory area whose policy is sought
+ * @addr: address in @vma
+ * @mask: should be a valid nodemask pointer, not NULL
+ *
+ * Return true if we can succeed in extracting the policy nodemask
+ * for 'bind' or 'interleave' policy into the argument @mask, or
+ * initializing the argument @mask to contain the single node for
+ * 'preferred' or 'local' policy.
+ */
+bool huge_nodemask(struct vm_area_struct *vma, unsigned long addr,
+			nodemask_t *mask)
+{
+	struct mempolicy *mpol;
+	bool ret = true;
+	int nid;
+
+	mpol = get_vma_policy(vma, addr);
+
+	switch (mpol->mode) {
+	case MPOL_PREFERRED:
+		if (mpol->flags & MPOL_F_LOCAL)
+			nid = numa_node_id();
+		else
+			nid = mpol->v.preferred_node;
+		init_nodemask_of_node(mask, nid);
+		break;
+
+	case MPOL_BIND:
+		/* Fall through */
+	case MPOL_INTERLEAVE:
+		*mask = mpol->v.nodes;
+		break;
+
+	default:
+		ret = false;
+		break;
+	}
+	mpol_cond_put(mpol);
+
+	return ret;
+}
+
+/*
  * huge_zonelist(@vma, @addr, @gfp_flags, @mpol)
  * @vma: virtual memory area whose policy is sought
  * @addr: address in @vma for shared policy lookup and interleave policy
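
Note for readers (not part of the patch): the policy-to-nodemask mapping
that huge_nodemask() performs can be tried out in userspace with a small
model. The sketch below is only an illustration under stated assumptions:
every model_* name is hypothetical, a nodemask is reduced to one unsigned
long (so node numbers must be smaller than its bit width), and the
current_node parameter stands in for what numa_node_id() would return.
It mirrors the switch statement in the patch and nothing else; reference
counting via mpol_cond_put() is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
typedef unsigned long model_nodemask_t;	/* bit n set => node n allowed */

enum model_mode {
	MODEL_DEFAULT,
	MODEL_PREFERRED,
	MODEL_BIND,
	MODEL_INTERLEAVE,
};

struct model_policy {
	enum model_mode mode;
	bool local;			/* models the MPOL_F_LOCAL flag */
	int preferred_node;		/* used by MODEL_PREFERRED */
	model_nodemask_t nodes;		/* used by MODEL_BIND / MODEL_INTERLEAVE */
};

/*
 * Model of the switch in huge_nodemask(): fill *mask and return true for
 * preferred/local/bind/interleave; return false and leave *mask untouched
 * for any other mode.
 */
bool model_huge_nodemask(const struct model_policy *pol, int current_node,
			 model_nodemask_t *mask)
{
	int nid;

	/* The patch documents @mask as "a valid nodemask pointer, not NULL". */
	assert(pol != NULL && mask != NULL);

	switch (pol->mode) {
	case MODEL_PREFERRED:
		nid = pol->local ? current_node : pol->preferred_node;
		*mask = 1UL << nid;	/* single-node mask */
		return true;
	case MODEL_BIND:
	case MODEL_INTERLEAVE:
		*mask = pol->nodes;	/* copy the policy's node set */
		return true;
	default:
		return false;		/* caller must not use *mask */
	}
}
```

In this model, a false return corresponds to the "NUMA && vma" branch of
__hugetlb_alloc_gigantic_page() skipping the allocation and returning a
NULL page.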