From patchwork Fri Jan 31 09:06:14 2025
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13955141
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Andrew Morton
Cc: Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Sergey Senozhatsky, Yosry Ahmed
Subject: [PATCHv4 15/17] zsmalloc: introduce new object mapping API
Date: Fri, 31 Jan 2025 18:06:14 +0900
Message-ID: <20250131090658.3386285-16-senozhatsky@chromium.org>
X-Mailer: git-send-email 2.48.1.362.g079036d154-goog
In-Reply-To: <20250131090658.3386285-1-senozhatsky@chromium.org>
References: <20250131090658.3386285-1-senozhatsky@chromium.org>
MIME-Version: 1.0

The current object mapping API is a little cumbersome. First, it is
inconsistent: it sometimes returns with page-faults disabled and
sometimes with page-faults enabled. Second, and most importantly, it
enforces atomicity restrictions on its users.
zs_map_object() has to return a linear object address, which is not
always possible because some objects span multiple physical
(non-contiguous) pages. For such objects zsmalloc uses a per-CPU
buffer to which the object's data is copied before a pointer to that
per-CPU buffer is returned to the caller. This leads to another,
final, issue: an extra memcpy(). Since the caller gets a pointer to
the per-CPU buffer, it can memcpy() data only to that buffer, and
during zs_unmap_object() zsmalloc memcpy()-s from that per-CPU buffer
to the physical pages that the object in question spans across.

The new API splits functions by access mode:

- zs_obj_read_begin(handle, local_copy)
  Returns a pointer to handle memory. For objects that span two
  physical pages a local_copy buffer is used to store the object's
  data before the address is returned to the caller. Otherwise the
  object's page is kmap_local mapped directly.

- zs_obj_read_end(handle, buf)
  Unmaps the page if it was kmap_local mapped by zs_obj_read_begin().

- zs_obj_write(handle, buf, len)
  Copies len bytes from the compression buffer to handle memory
  (taking care of objects that span two pages). This does not need
  any additional (e.g. per-CPU) buffers and writes the data directly
  to zsmalloc pool pages.

The old API will stay around until the remaining users switch to the
new one. After that we'll also remove the zsmalloc per-CPU buffer and
CPU hotplug handling.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Yosry Ahmed
---
 include/linux/zsmalloc.h |   8 +++
 mm/zsmalloc.c            | 129 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index a48cd0ffe57d..7d70983cf398 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -58,4 +58,12 @@ unsigned long zs_compact(struct zs_pool *pool);
 unsigned int zs_lookup_class_index(struct zs_pool *pool, unsigned int size);
 
 void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
+
+void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
+			void *local_copy);
+void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
+			void *handle_mem);
+void zs_obj_write(struct zs_pool *pool, unsigned long handle,
+			void *handle_mem, size_t mem_len);
+
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f5b5fe732e50..f9d840f77b18 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1365,6 +1365,135 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
 
+void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
+			void *local_copy)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+	void *addr;
+
+	WARN_ON(in_interrupt());
+
+	/* Guarantee we can get zspage from handle safely */
+	pool_read_lock(pool);
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+
+	/* Make sure migration doesn't move any pages in this zspage */
+	zspage_read_lock(zspage);
+	pool_read_unlock(pool);
+
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		/* this object is contained entirely within a page */
+		addr = kmap_local_zpdesc(zpdesc);
+		addr += off;
+	} else {
+		size_t sizes[2];
+
+		/* this object spans two pages */
+		sizes[0] = PAGE_SIZE - off;
+		sizes[1] = class->size - sizes[0];
+		addr = local_copy;
+
+		memcpy_from_page(addr, zpdesc_page(zpdesc),
+				 off, sizes[0]);
+		zpdesc = get_next_zpdesc(zpdesc);
+		memcpy_from_page(addr + sizes[0],
+				 zpdesc_page(zpdesc),
+				 0, sizes[1]);
+	}
+
+	if (!ZsHugePage(zspage))
+		addr += ZS_HANDLE_SIZE;
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(zs_obj_read_begin);
+
+void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
+		     void *handle_mem)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		if (!ZsHugePage(zspage))
+			off += ZS_HANDLE_SIZE;
+		handle_mem -= off;
+		kunmap_local(handle_mem);
+	}
+
+	zspage_read_unlock(zspage);
+}
+EXPORT_SYMBOL_GPL(zs_obj_read_end);
+
+void zs_obj_write(struct zs_pool *pool, unsigned long handle,
+		  void *handle_mem, size_t mem_len)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+
+	WARN_ON(in_interrupt());
+
+	/* Guarantee we can get zspage from handle safely */
+	pool_read_lock(pool);
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+
+	/* Make sure migration doesn't move any pages in this zspage */
+	zspage_read_lock(zspage);
+	pool_read_unlock(pool);
+
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		/* this object is contained entirely within a page */
+		void *dst = kmap_local_zpdesc(zpdesc);
+
+		if (!ZsHugePage(zspage))
+			off += ZS_HANDLE_SIZE;
+		memcpy(dst + off, handle_mem, mem_len);
+		kunmap_local(dst);
+	} else {
+		/* this object spans two pages */
+		size_t sizes[2];
+
+		off += ZS_HANDLE_SIZE;
+		sizes[0] = PAGE_SIZE - off;
+		sizes[1] = mem_len - sizes[0];
+
+		memcpy_to_page(zpdesc_page(zpdesc), off,
+			       handle_mem, sizes[0]);
+		zpdesc = get_next_zpdesc(zpdesc);
+		memcpy_to_page(zpdesc_page(zpdesc), 0,
+			       handle_mem + sizes[0], sizes[1]);
+	}
+
+	zspage_read_unlock(zspage);
+}
+EXPORT_SYMBOL_GPL(zs_obj_write);
+
 /**
  * zs_huge_class_size() - Returns the size (in bytes) of the first huge
  * zsmalloc &size_class.