From patchwork Tue Feb 20 19:26:13 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13564423
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, torvalds@linux-foundation.org,
    brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com,
    akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next] mm: Introduce vm_area_[un]map_pages().
Date: Tue, 20 Feb 2024 11:26:13 -0800
Message-Id: <20240220192613.8840-1-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)

From: Alexei Starovoitov

The vmap() API is used to map a set of pages into contiguous kernel
virtual space. BPF would like to extend the vmap API to implement a
lazily populated, contiguous kernel virtual area whose size and start
address are fixed early.

The vmap API has functions to request and release areas of kernel
address space: get_vm_area() and free_vm_area().

Introduce vm_area_map_pages(area, start_addr, count, pages) to map a
set of pages within a given area. It performs the same sanity checks
as vmap() does. In addition it checks that the area was created by
get_vm_area() with the VM_MAP flag (as all users of vmap() should be
doing).

Also add vm_area_unmap_pages(), which is a safer alternative to the
existing vunmap_range() API.

The next commits will introduce bpf_arena, a sparsely populated shared
memory region between a BPF program and a user space process. It will
map privately managed pages into an existing vm area with the
following steps:

  area = get_vm_area(area_size, VM_MAP | VM_USERMAP); // at bpf prog verification time
  vm_area_map_pages(area, kaddr, 1, page);            // on demand
  vm_area_unmap_pages(area, kaddr, 1);
  free_vm_area(area);                                 // after bpf prog is unloaded

For the BPF use case the area_size will be 4 Gbyte plus 64 Kbyte of
guard pages, and area->addr will be known and fixed at program
verification time.
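For illustration only (not part of this patch), the same lifecycle could
be driven from kernel code roughly as follows. The arena_demo_* names
are hypothetical, and the real bpf_arena code in the follow-up commits
is more involved:

/*
 * Hypothetical sketch: drive the new API end to end, per the steps
 * above. All arena_demo_* identifiers are illustrative only.
 */
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>

#define ARENA_DEMO_SIZE	(SZ_4G + SZ_64K)	/* 4G of space plus guard pages */

static struct vm_struct *arena_demo_area;

/* Reserve the kernel virtual range once, e.g. at verification time. */
static int arena_demo_create(void)
{
	arena_demo_area = get_vm_area(ARENA_DEMO_SIZE, VM_MAP | VM_USERMAP);
	return arena_demo_area ? 0 : -ENOMEM;
}

/* Populate one page on demand at a fixed offset inside the area. */
static int arena_demo_fault_in(unsigned long offset)
{
	unsigned long kaddr = (unsigned long)arena_demo_area->addr + offset;
	struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);
	int err;

	if (!page)
		return -ENOMEM;
	err = vm_area_map_pages(arena_demo_area, kaddr, 1, &page);
	if (err)
		__free_page(page);
	return err;
}

/* Drop one mapping; freeing the page itself stays with the caller. */
static void arena_demo_zap(unsigned long offset)
{
	unsigned long kaddr = (unsigned long)arena_demo_area->addr + offset;

	vm_area_unmap_pages(arena_demo_area, kaddr, 1);
}

/* Tear everything down, e.g. when the BPF program is unloaded. */
static void arena_demo_destroy(void)
{
	free_vm_area(arena_demo_area);
}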
Signed-off-by: Alexei Starovoitov
---
 include/linux/vmalloc.h |  3 +++
 mm/vmalloc.c            | 46 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c720be70c8dd..7d112cc5f2a3 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -232,6 +232,9 @@ static inline bool is_vm_area_hugepages(const void *addr)
 }
 
 #ifdef CONFIG_MMU
+int vm_area_map_pages(struct vm_struct *area, unsigned long addr, unsigned int count,
+		      struct page **pages);
+int vm_area_unmap_pages(struct vm_struct *area, unsigned long addr, unsigned int count);
 void vunmap_range(unsigned long addr, unsigned long end);
 static inline void set_vm_flush_reset_perms(void *addr)
 {
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..d6337d46f1d8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -635,6 +635,52 @@ static int vmap_pages_range(unsigned long addr, unsigned long end,
 	return err;
 }
 
+/**
+ * vm_area_map_pages - map pages inside given vm_area
+ * @area: vm_area
+ * @addr: start address inside vm_area
+ * @count: number of pages
+ * @pages: pages to map (always PAGE_SIZE pages)
+ */
+int vm_area_map_pages(struct vm_struct *area, unsigned long addr, unsigned int count,
+		      struct page **pages)
+{
+	unsigned long size = ((unsigned long)count) * PAGE_SIZE;
+	unsigned long end = addr + size;
+
+	might_sleep();
+	if (WARN_ON_ONCE(area->flags & VM_FLUSH_RESET_PERMS))
+		return -EINVAL;
+	if (WARN_ON_ONCE(area->flags & VM_NO_GUARD))
+		return -EINVAL;
+	if (WARN_ON_ONCE(!(area->flags & VM_MAP)))
+		return -EINVAL;
+	if (count > totalram_pages())
+		return -E2BIG;
+	if (addr < (unsigned long)area->addr || (void *)end > area->addr + area->size)
+		return -ERANGE;
+
+	return vmap_pages_range(addr, end, PAGE_KERNEL, pages, PAGE_SHIFT);
+}
+
+/**
+ * vm_area_unmap_pages - unmap pages inside given vm_area
+ * @area: vm_area
+ * @addr: start address inside vm_area
+ * @count: number of pages to unmap
+ */
+int vm_area_unmap_pages(struct vm_struct *area, unsigned long addr, unsigned int count)
+{
+	unsigned long size = ((unsigned long)count) * PAGE_SIZE;
+	unsigned long end = addr + size;
+
+	if (addr < (unsigned long)area->addr || (void *)end > area->addr + area->size)
+		return -ERANGE;
+
+	vunmap_range(addr, end);
+	return 0;
+}
+
 int is_vmalloc_or_module_addr(const void *x)
 {
 	/*
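As an aside (not part of the patch): the precondition checks above mean
misuse fails cleanly before any page tables are touched. A hypothetical
test-module-style snippet, with check_demo() purely illustrative:

#include <linux/bug.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>

/* Hypothetical: exercise the precondition checks of the new API. */
static void check_demo(struct page **pages)
{
	/* VM_ALLOC instead of VM_MAP, so mapping requests must be refused */
	struct vm_struct *area = get_vm_area(SZ_64K, VM_ALLOC);
	unsigned long addr;

	if (!area)
		return;
	addr = (unsigned long)area->addr;

	/* -EINVAL: area lacks VM_MAP (this also trips the WARN_ON_ONCE) */
	WARN_ON(vm_area_map_pages(area, addr, 1, pages) != -EINVAL);

	/* -ERANGE: the requested page lies outside the reserved area */
	WARN_ON(vm_area_unmap_pages(area, addr + 2 * SZ_64K, 1) != -ERANGE);

	free_vm_area(area);
}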