Hugetlb pages should not be reserved by shmat() if SHM_NORESERVE

Message ID 1705713472-3537-1-git-send-email-prakash.sangappa@oracle.com (mailing list archive)
State New

Commit Message

Prakash Sangappa Jan. 20, 2024, 1:17 a.m. UTC
For shared memory of type SHM_HUGETLB, hugetlb pages are reserved in
the shmget() call. If the SHM_NORESERVE flag is specified, the hugetlb
pages are not reserved. However, when the shared memory is attached
with the shmat() call, hugetlb pages are incorrectly reserved
for SHM_HUGETLB shared memory created with SHM_NORESERVE.

Ensure that the hugetlb pages are not reserved for SHM_HUGETLB shared
memory in the shmat() call.

Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
---
 fs/hugetlbfs/inode.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Andrew Morton Jan. 21, 2024, 10:32 p.m. UTC | #1
On Fri, 19 Jan 2024 17:17:52 -0800 Prakash Sangappa <prakash.sangappa@oracle.com> wrote:

> For shared memory of type SHM_HUGETLB, hugetlb pages are reserved in
> shmget() call. If SHM_NORESERVE flag is specified then the hugetlb
> pages are not reserved. However when the shared memory is attached
> with the shmat() call the hugetlb pages are getting reserved incorrectly
> for SHM_HUGETLB shared memory created with SHM_NORESERVE.
> 
> Ensure that the hugetlb pages are not reserved for SHM_HUGETLB shared
> memory in the shmat() call.

Thanks.

What are the userspace-visible effects of this change?

Based on that, is a -stable backport desirable?

And can we please identify a suitable Fixes: target for this?

Prakash Sangappa Jan. 23, 2024, 2 a.m. UTC | #2
> On Jan 21, 2024, at 2:32 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Fri, 19 Jan 2024 17:17:52 -0800 Prakash Sangappa <prakash.sangappa@oracle.com> wrote:
> 
>> For shared memory of type SHM_HUGETLB, hugetlb pages are reserved in
>> shmget() call. If SHM_NORESERVE flag is specified then the hugetlb
>> pages are not reserved. However when the shared memory is attached
>> with the shmat() call the hugetlb pages are getting reserved incorrectly
>> for SHM_HUGETLB shared memory created with SHM_NORESERVE.
>> 
>> Ensure that the hugetlb pages are not reserved for SHM_HUGETLB shared
>> memory in the shmat() call.
> 
> Thanks.

I sent a v2 patch with a slightly modified fix.

> 
> What are the userspace-visible effects of this change?

This is a bug. The following test shows the issue:

$ cat shmhtb.c
#include <stdlib.h>
#include <sys/mman.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>
#include <errno.h>

#define SHMSZ (10*1024*1024)
#define SKEY 41321234

int main(void)
{
	int shmflags = 0660 | IPC_CREAT | SHM_HUGETLB | SHM_NORESERVE;
	int shmid;

	/* Create a hugetlb segment without reserving pages. */
	shmid = shmget(SKEY, SHMSZ, shmflags);
	if (shmid < 0) {
		printf("shmget() failed, %d\n", errno);
		return 1;
	}

	printf("After shmget\n");
	system("cat /proc/meminfo | grep -i hugepages_");

	/* Attaching should not reserve pages either. */
	if (shmat(shmid, NULL, 0) == (void *)-1)
		printf("shmat() failed, %d\n", errno);

	printf("After shmat\n");
	system("cat /proc/meminfo | grep -i hugepages_");

	/* Mark the segment for removal. */
	shmctl(shmid, IPC_RMID, NULL);

	return 0;
}


# sysctl -w vm.nr_hugepages=20
# ./shmhtb
After shmget
HugePages_Total:      20
HugePages_Free:       20
HugePages_Rsvd:        0
HugePages_Surp:        0
After shmat
HugePages_Total:      20
HugePages_Free:       20
HugePages_Rsvd:        5 <--
HugePages_Surp:        0
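
The HugePages_Rsvd value of 5 after shmat() is consistent with the 10 MB segment being fully reserved, assuming a 2 MB huge page size (the page size is not stated in this thread). As a rough sketch, the check could also be done programmatically instead of through grep. The helper names below (hugepages_needed, parse_hugepages_rsvd, read_hugepages_rsvd) are hypothetical and are not part of the test program above or of the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Number of huge pages needed to back seg_bytes, rounded up. */
unsigned long hugepages_needed(unsigned long seg_bytes,
			       unsigned long hugepage_bytes)
{
	assert(hugepage_bytes > 0);
	return (seg_bytes + hugepage_bytes - 1) / hugepage_bytes;
}

/*
 * Parse one /proc/meminfo line of the form "HugePages_Rsvd:   5".
 * Returns the count, or -1 if this is not a HugePages_Rsvd line.
 */
long parse_hugepages_rsvd(const char *line)
{
	long val;

	if (sscanf(line, "HugePages_Rsvd: %ld", &val) == 1)
		return val;
	return -1;
}

/* Read HugePages_Rsvd from /proc/meminfo; returns -1 on failure. */
long read_hugepages_rsvd(void)
{
	char line[256];
	long val = -1;
	FILE *fp = fopen("/proc/meminfo", "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		val = parse_hugepages_rsvd(line);
		if (val >= 0)
			break;
	}
	fclose(fp);
	return val;
}
```

Calling read_hugepages_rsvd() before and after shmat() and comparing the difference against hugepages_needed(SHMSZ, 2UL * 1024 * 1024) would flag the unexpected reservation; with SHM_NORESERVE the difference is expected to be 0.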

> 
> Based on that, is a -stable backport desirable?

I think so. The issue is reproducible on older kernel versions; I reproduced it on v4.18.

> 
> And can we please identify a suitable Fixes: target for this?

Should it be mentioned in the patch?

-Prakash

>

Patch

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f757d4f..93cafd2 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -141,7 +141,13 @@  static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	file_accessed(file);
 
 	ret = -ENOMEM;
-	if (!hugetlb_reserve_pages(inode,
+
+	/*
+	 * for SHM_HUGETLB, the pages are reserved in the shmget() call so skip
+	 * reserving here. Note only for SHM hugetlbfs file, the inode
+	 * flag S_PRIVATE is set.
+	 */
+	if (!(inode->i_flags & S_PRIVATE) && !hugetlb_reserve_pages(inode,
 				vma->vm_pgoff >> huge_page_order(h),
 				len >> huge_page_shift(h), vma,
 				vma->vm_flags))