Message ID | 20240522074658.2420468-1-Sukrit.Bhatnagar@sony.com (mailing list archive) |
---|---|
Series | Improve dmesg output for swapfile+hibernation |

Hi!

> While trying to use a swapfile for hibernation, I noticed that the suspend
> process was failing when it tried to search for the swap to use for snapshot.
> I had created the swapfile on ext4 and got the starting physical block offset
> using the filefrag command.

How is swapfile for hibernation supposed to work? I'm afraid that
can't work, and we should just not allow hibernation if there's
anything else than just one swap partition.

Best regards,
Pavel

Hi Pavel!

On 2024-05-24 04:45, Pavel Machek wrote:
> Hi!
>
>> While trying to use a swapfile for hibernation, I noticed that the suspend
>> process was failing when it tried to search for the swap to use for snapshot.
>> I had created the swapfile on ext4 and got the starting physical block offset
>> using the filefrag command.
>
> How is swapfile for hibernation supposed to work? I'm afraid that
> can't work, and we should just not allow hibernation if there's
> anything else than just one swap partition.

I am not sure what you mean.
We can pass the starting physical block offset of a swapfile into
/sys/power/resume_offset, and hibernate can directly read/write
into it using the swap extents information created by iomap during
swapon. On resume, the kernel would read this offset value from
the commandline parameters, and then access the swapfile.

I find having a swapfile option for hibernate useful, in the scenarios
where it is hard to modify the partitioning scheme, or to have a dedicated
swap partition.

Are there any plans to remove swapfile support from hibernation?

--
Sukrit
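
The offset lookup described in the message above (take the starting physical block of the swapfile from `filefrag`) can be sketched as follows. This is only an illustration, not part of the patch series: the function names and the sample output are hypothetical, the regular expressions assume the usual `filefrag -v` table layout, and the conversion assumes `resume_offset` is expressed in page-size units, which should be checked against the kernel documentation for the running kernel.

```python
import re

# Hypothetical sample of `filefrag -v /swapfile` output, for illustration only.
SAMPLE = """\
Filesystem type is: ef53
File size of /swapfile is 4294967296 (1048576 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       0:      38912..     38912:      1:
   1:        1..   22527:      38913..     61439:  22527:             unwritten
"""

# Row for extent 0: "0: <log_start>..<log_end>: <phys_start>..<phys_end>:"
_FIRST_EXTENT = re.compile(
    r"^[ \t]*0:[ \t]*(\d+)\.\.[ \t]*(\d+):[ \t]*(\d+)\.\.[ \t]*(\d+):",
    re.MULTILINE,
)
# Block size as reported in the "File size of ..." header line.
_BLOCK_SIZE = re.compile(r"blocks? of (\d+) bytes")


def first_physical_block(filefrag_output: str) -> int:
    """Return the physical start block of extent 0 (in filesystem blocks)."""
    match = _FIRST_EXTENT.search(filefrag_output)
    if match is None:
        raise ValueError("no row for extent 0 found")
    if int(match.group(1)) != 0:
        raise ValueError("extent 0 does not start at logical block 0")
    return int(match.group(3))


def resume_offset_pages(filefrag_output: str, page_size: int = 4096) -> int:
    """Convert the first physical block into page-size units.

    Assumption: the block size comes from the filefrag header; if the header
    is missing, the block size is assumed to equal the page size.
    """
    block = first_physical_block(filefrag_output)
    match = _BLOCK_SIZE.search(filefrag_output)
    block_size = int(match.group(1)) if match else page_size
    byte_offset = block * block_size
    if byte_offset % page_size:
        raise ValueError("swapfile start is not page aligned")
    return byte_offset // page_size


if __name__ == "__main__":
    print(resume_offset_pages(SAMPLE))
```

The value printed for a real swapfile is what would be written to /sys/power/resume_offset and passed as `resume_offset=` on the kernel command line.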

On Mon, May 27, 2024 at 11:06:11AM +0000, Sukrit.Bhatnagar@sony.com wrote:
> We can pass the starting physical block offset of a swapfile into
> /sys/power/resume_offset, and hibernate can directly read/write
> into it using the swap extents information created by iomap during
> swapon. On resume, the kernel would read this offset value from
> the commandline parameters, and then access the swapfile.

Reading a physical address from userspace is not a proper interface.
What is this code even trying to do with it?

Hi Christoph,

On 2024-05-27 20:20, Christoph Hellwig wrote:
> On Mon, May 27, 2024 at 11:06:11AM +0000, Sukrit.Bhatnagar@sony.com wrote:
>> We can pass the starting physical block offset of a swapfile into
>> /sys/power/resume_offset, and hibernate can directly read/write
>> into it using the swap extents information created by iomap during
>> swapon. On resume, the kernel would read this offset value from
>> the commandline parameters, and then access the swapfile.
>
> Reading a physical address from userspace is not a proper interface.
> What is this code even trying to do with it?

I understand your point. Ideally, the low-level stuff such as finding
the physical block offset should not be handled in the userspace.

In my understanding, the resume offset in hibernate is used as follows.

Suspend
- Hibernate looks up the swap/swapfile using the details we pass in the
  sysfs entries, in the function swsusp_swap_check():
  * /sys/power/resume - path/uuid/major:minor of the swap partition (or
    non-swap partition for swapfile)
  * /sys/power/resume_offset - physical offset of the swapfile in that
    partition
  * If no resume device is specified, it just uses the first available swap!
- It then proceeds to write the image to the specified swap.
  (The allocation of swap pages is done by the swapfile code internally.)
- When writing is finished, the swap header needs to be updated with some
  metadata, in the function mark_swapfiles().
  * Hibernate creates bio requests to read/write the header (which is the
    first page of swap) using that physical block offset.

Resume
- Hibernate gets the partition and offset values from kernel command-line
  parameters "resume" and "resume_offset" (which must be set from
  userspace, not ideal).
- It checks for valid hibernate swap signature by reading the swap header.
  * Hibernate creates bio requests again, using the physical block offset,
    but the one from kernel command-line this time.
- Then it restores image and resumes into the previously saved kernel.

--
Sukrit

On Mon, May 27, 2024 at 12:51:07PM +0000, Sukrit.Bhatnagar@sony.com wrote:
> In my understanding, the resume offset in hibernate is used as follows.
>
> Suspend
> - Hibernate looks up the swap/swapfile using the details we pass in the
>   sysfs entries, in the function swsusp_swap_check():
>   * /sys/power/resume - path/uuid/major:minor of the swap partition (or
>     non-swap partition for swapfile)
>   * /sys/power/resume_offset - physical offset of the swapfile in that
>     partition
>   * If no resume device is specified, it just uses the first available swap!
> - It then proceeds to write the image to the specified swap.
>   (The allocation of swap pages is done by the swapfile code internally.)

Where "it" is userspace code? If so, that already seems unsafe for
a swap device, but definitely is a no-go for a swapfile.

> - Hibernate gets the partition and offset values from kernel command-line
>   parameters "resume" and "resume_offset" (which must be set from
>   userspace, not ideal).

Or is it just for these parameters? In which case we "only" need to
specify the swap file, which would then need code in the file system
driver to resolve the logical to physical mapping as swap files don't
need to be contiguous.
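
The point about non-contiguous swap files can be made concrete with a small sketch of logical-to-physical resolution over an extent list. This is a toy model for illustration, not kernel or filesystem code: the `Extent` type, the function name, and the sample extents are all hypothetical.

```python
from typing import NamedTuple, Sequence


class Extent(NamedTuple):
    """One contiguous run of a file on disk (all fields in blocks)."""
    logical_start: int   # first block of the run, relative to the file
    physical_start: int  # first block of the run, relative to the device
    length: int          # number of blocks in the run


def logical_to_physical(extents: Sequence[Extent], logical_block: int) -> int:
    """Map a block index within the file to a block index on the device."""
    for ext in extents:
        if ext.logical_start <= logical_block < ext.logical_start + ext.length:
            return ext.physical_start + (logical_block - ext.logical_start)
    raise LookupError(f"logical block {logical_block} is not mapped")


# Hypothetical swapfile made of two runs that are far apart on disk.
EXTENTS = [
    Extent(logical_start=0, physical_start=38912, length=22528),
    Extent(logical_start=22528, physical_start=899072, length=30720),
]

if __name__ == "__main__":
    for block in (0, 22527, 22528):
        print(block, "->", logical_to_physical(EXTENTS, block))
```

In this model a single starting offset identifies only the first run; every block past the end of that run needs the extent list to be located.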

On 2024-05-27 21:58, Christoph Hellwig wrote:
> On Mon, May 27, 2024 at 12:51:07PM +0000, Sukrit.Bhatnagar@sony.com wrote:
>> In my understanding, the resume offset in hibernate is used as follows.
>>
>> Suspend
>> - Hibernate looks up the swap/swapfile using the details we pass in the
>>   sysfs entries, in the function swsusp_swap_check():
>>   * /sys/power/resume - path/uuid/major:minor of the swap partition (or
>>     non-swap partition for swapfile)
>>   * /sys/power/resume_offset - physical offset of the swapfile in that
>>     partition
>>   * If no resume device is specified, it just uses the first available
>>     swap!
>> - It then proceeds to write the image to the specified swap.
>>   (The allocation of swap pages is done by the swapfile code
>>   internally.)
>
> Where "it" is userspace code? If so, that already seems unsafe for
> a swap device, but definitely is a no-go for a swapfile.

By "it", I meant the hibernate code running in kernel space.
Once userspace triggers hibernation by `echo disk > /sys/power/state`
or a systemd wrapper program etc., and userspace tasks are frozen,
everything happens within kernel context.

>> - Hibernate gets the partition and offset values from kernel command-line
>>   parameters "resume" and "resume_offset" (which must be set from
>>   userspace, not ideal).
>
> Or is it just for these parameters? In which case we "only" need to
> specify the swap file, which would then need code in the file system
> driver to resolve the logical to physical mapping as swap files don't
> need to be contiguous.

Yes, it is just for setting these parameters in sysfs entries and in
kernel commandline.

I think specifying the swapfile path *may* not work because when we
resume from hibernation, the filesystems are not yet mounted (except
for the case when someone is resuming from initramfs stage).
Using the block device + physical offset, this procedure becomes
independent of the filesystem and the mounted status.

And since the system swap information is lost on reboot/shutdown, the
kernel which loads the hibernation image will not know what swaps were
enabled when the image was created.

--
Sukrit
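
The two places where the same device and offset have to be supplied (sysfs before suspend, kernel command line for resume) can be summarised in a short sketch. It only builds the strings and writes nothing; the function name, the dictionary layout, and the example device `8:3` are hypothetical, and the offset is whatever value was determined for the swapfile beforehand.

```python
def resume_settings(resume_device: str, resume_offset: int) -> dict:
    """Return the values to configure for swapfile hibernation.

    resume_device: block device holding the swapfile's filesystem, in a form
                   the kernel accepts (a major:minor pair in the example).
    resume_offset: starting offset of the swapfile on that device.
    """
    if resume_offset < 0:
        raise ValueError("resume_offset must be non-negative")
    return {
        # Written by userspace before triggering hibernation.
        "sysfs": {
            "/sys/power/resume": resume_device,
            "/sys/power/resume_offset": str(resume_offset),
        },
        # Appended to the command line of the kernel that will resume.
        "cmdline": f"resume={resume_device} resume_offset={resume_offset}",
    }


if __name__ == "__main__":
    settings = resume_settings("8:3", 38912)  # hypothetical device and offset
    for path, value in settings["sysfs"].items():
        print(f"echo {value} > {path}")
    print(settings["cmdline"])
```

Both values must refer to the same swapfile; the resuming kernel has only the command line to go on, since the swap configuration of the hibernated system is not available to it.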