From patchwork Wed Jan 29 06:43:50 2025
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13953468
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Andrew Morton, Minchan Kim, Johannes Weiner, Yosry Ahmed, Nhat Pham
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCHv1 4/6] zsmalloc: introduce new object mapping API
Date: Wed, 29 Jan 2025 15:43:50 +0900
Message-ID: <20250129064853.2210753-5-senozhatsky@chromium.org>
In-Reply-To: <20250129064853.2210753-1-senozhatsky@chromium.org>
References: <20250129064853.2210753-1-senozhatsky@chromium.org>
The current object mapping API is a little cumbersome. First, it's inconsistent: sometimes it returns with page-faults disabled and sometimes with page-faults enabled. Second, and most importantly, it enforces atomicity restrictions on its users. zs_map_object() has to return a linear object address, which is not always possible because some objects span multiple physical (non-contiguous) pages.
For such objects zsmalloc uses a per-CPU buffer to which the object's data is copied before a pointer to that per-CPU buffer is returned to the caller. This leads to another, final, issue: an extra memcpy(). Since the caller gets a pointer to the per-CPU buffer, it can memcpy() data only to that buffer; during zs_unmap_object(), zsmalloc then memcpy()s from that per-CPU buffer to the physical pages the object in question spans.

The new API splits functions by access mode:

- zs_obj_read_begin(handle, local_copy)
  Returns a pointer to handle memory. For objects that span two physical
  pages, the local_copy buffer is used to store the object's data before
  the address is returned to the caller. Otherwise the object's page is
  kmap_local mapped directly.

- zs_obj_read_end(handle, buf)
  Unmaps the page if it was kmap_local mapped by zs_obj_read_begin().

- zs_obj_write(handle, buf, len)
  Copies len bytes from the compression buffer to handle memory (taking
  care of objects that span two pages). This does not need any additional
  (e.g. per-CPU) buffers and writes the data directly to zsmalloc pool
  pages.

The old API will stay around until the remaining users switch to the new one. After that we'll also remove the zsmalloc per-CPU buffer and CPU hotplug handling.
Signed-off-by: Sergey Senozhatsky
Reviewed-by: Yosry Ahmed
---
 include/linux/zsmalloc.h |   8 +++
 mm/zsmalloc.c            | 129 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index a48cd0ffe57d..625adae8e547 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -58,4 +58,12 @@ unsigned long zs_compact(struct zs_pool *pool);
 unsigned int zs_lookup_class_index(struct zs_pool *pool, unsigned int size);
 
 void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
+
+void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
+		     void *handle_mem);
+void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
+			void *local_copy);
+void zs_obj_write(struct zs_pool *pool, unsigned long handle,
+		  void *handle_mem, size_t mem_len);
+
 #endif
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 8f4011713bc8..0e21bc57470b 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1371,6 +1371,135 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
 
+void zs_obj_write(struct zs_pool *pool, unsigned long handle,
+		  void *handle_mem, size_t mem_len)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+
+	WARN_ON(in_interrupt());
+
+	/* Guarantee we can get zspage from handle safely */
+	pool_read_lock(pool);
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+
+	/* Make sure migration doesn't move any pages in this zspage */
+	zspage_read_lock(zspage);
+	pool_read_unlock(pool);
+
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		/* this object is contained entirely within a page */
+		void *dst = kmap_local_zpdesc(zpdesc);
+
+		if (!ZsHugePage(zspage))
+			off += ZS_HANDLE_SIZE;
+		memcpy(dst + off, handle_mem, mem_len);
+		kunmap_local(dst);
+	} else {
+		size_t sizes[2];
+
+		/* this object spans two pages */
+		off += ZS_HANDLE_SIZE;
+		sizes[0] = PAGE_SIZE - off;
+		sizes[1] = mem_len - sizes[0];
+
+		memcpy_to_page(zpdesc_page(zpdesc), off,
+			       handle_mem, sizes[0]);
+		zpdesc = get_next_zpdesc(zpdesc);
+		memcpy_to_page(zpdesc_page(zpdesc), 0,
+			       handle_mem + sizes[0], sizes[1]);
+	}
+
+	zspage_read_unlock(zspage);
+}
+EXPORT_SYMBOL_GPL(zs_obj_write);
+
+void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
+		     void *handle_mem)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		if (!ZsHugePage(zspage))
+			off += ZS_HANDLE_SIZE;
+		handle_mem -= off;
+		kunmap_local(handle_mem);
+	}
+
+	zspage_read_unlock(zspage);
+}
+EXPORT_SYMBOL_GPL(zs_obj_read_end);
+
+void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
+			void *local_copy)
+{
+	struct zspage *zspage;
+	struct zpdesc *zpdesc;
+	unsigned long obj, off;
+	unsigned int obj_idx;
+	struct size_class *class;
+	void *addr;
+
+	WARN_ON(in_interrupt());
+
+	/* Guarantee we can get zspage from handle safely */
+	pool_read_lock(pool);
+	obj = handle_to_obj(handle);
+	obj_to_location(obj, &zpdesc, &obj_idx);
+	zspage = get_zspage(zpdesc);
+
+	/* Make sure migration doesn't move any pages in this zspage */
+	zspage_read_lock(zspage);
+	pool_read_unlock(pool);
+
+	class = zspage_class(pool, zspage);
+	off = offset_in_page(class->size * obj_idx);
+
+	if (off + class->size <= PAGE_SIZE) {
+		/* this object is contained entirely within a page */
+		addr = kmap_local_zpdesc(zpdesc);
+		addr += off;
+	} else {
+		size_t sizes[2];
+
+		/* this object spans two pages */
+		sizes[0] = PAGE_SIZE - off;
+		sizes[1] = class->size - sizes[0];
+		addr = local_copy;
+
+		memcpy_from_page(addr, zpdesc_page(zpdesc),
+				 off, sizes[0]);
+		zpdesc = get_next_zpdesc(zpdesc);
+		memcpy_from_page(addr + sizes[0],
+				 zpdesc_page(zpdesc),
+				 0, sizes[1]);
+	}
+
+	if (!ZsHugePage(zspage))
+		addr += ZS_HANDLE_SIZE;
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(zs_obj_read_begin);
+
 /**
  * zs_huge_class_size() - Returns the size (in bytes) of the first huge
  * zsmalloc &size_class.