Message ID | 1448559168-8363-10-git-send-email-james.morse@arm.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Nov 26, 2015 at 05:32:47PM +0000, James Morse wrote:

[...]

> @@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
>  	if (PageHighMem(page))
>  		return get_highmem_page_buffer(page, ca);

I know it is not a problem for arm64, but you should export the
"restored" highmem pages list too, because arch code may need to use it
for the same reasons this patch was implemented.

> -	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page))
> -		/* We have allocated the "original" page frame and we can
> -		 * use it directly to store the loaded page.
> -		 */
> -		return page_address(page);
> -
> -	/* The "original" page frame has not been allocated and we have to
> -	 * use a "safe" page frame to store the loaded page.
> -	 */
>  	pbe = chain_alloc(ca, sizeof(struct pbe));
>  	if (!pbe) {
>  		swsusp_free();
>  		return ERR_PTR(-ENOMEM);
>  	}
> -	pbe->orig_address = page_address(page);
> -	pbe->address = safe_pages_list;
> -	safe_pages_list = safe_pages_list->next;
> -	pbe->next = restore_pblist;
> -	restore_pblist = pbe;
> +
> +	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page)) {
> +		/* We have allocated the "original" page frame and we can
> +		 * use it directly to store the loaded page.
> +		 */
> +		pbe->orig_address = NULL;
> +		pbe->address = page_address(page);
> +		pbe->next = restored_inplace_pblist;
> +		restored_inplace_pblist = pbe;
> +	} else {
> +		/* The "original" page frame has not been allocated and we
> +		 * have to use a "safe" page frame to store the loaded page.
> +		 */
> +		pbe->orig_address = page_address(page);
> +		pbe->address = safe_pages_list;
> +		safe_pages_list = safe_pages_list->next;
> +		pbe->next = restore_pblist;
> +		restore_pblist = pbe;
> +	}

This makes sense to me, more so given that the pbe data is pre-allocated
in prepare_image() regardless (i.e. it should not change the current
behaviour apart from calling chain_alloc() for every page we are
restoring); you are just adding a pointer to stash that information.
Hence, if it is OK with Pavel and Rafael, I think this patch can be
considered for merging.

Feedback appreciated.

Thanks,
Lorenzo

> +
> +	return pbe->address;
>  }
>
> @@ -2513,6 +2526,7 @@ int snapshot_write_next(struct snapshot_handle *handle)
>  		chain_init(&ca, GFP_ATOMIC, PG_SAFE);
>  		memory_bm_position_reset(&orig_bm);
>  		restore_pblist = NULL;
> +		restored_inplace_pblist = NULL;
>  		handle->buffer = get_buffer(&orig_bm, &ca);
>  		handle->sync_read = 0;
>  		if (IS_ERR(handle->buffer))
> --
> 2.6.2
>
Hi Lorenzo,

On 03/12/15 12:09, Lorenzo Pieralisi wrote:
> On Thu, Nov 26, 2015 at 05:32:47PM +0000, James Morse wrote:
>> @@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
>>  	if (PageHighMem(page))
>>  		return get_highmem_page_buffer(page, ca);
>
> I know it is not a problem for arm64, but you should export the
> "restored" highmem pages list too, because arch code may need to use it
> for the same reasons this patch was implemented.

I'm not sure it can be for the same reasons:
kernel/power/snapshot.c:swap_two_pages_data() does the copy for highmem
pages, and then kunmap-s them... if the page is to be used for
execution, it needs to be re-mapped first, possibly at a different
address. The place to do cache maintenance is in the kunmap/kmap calls.

Thanks,

James
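(To illustrate the pattern James is describing, here is a minimal sketch
of a highmem page copy. It is simplified from, not copied from,
snapshot.c; the function name is hypothetical.)

```c
#include <linux/highmem.h>	/* kmap_atomic(), kunmap_atomic() */
#include <asm/page.h>		/* copy_page() */

/* Sketch: highmem pages are only mapped transiently while they are
 * copied, so once kunmap_atomic() runs there is no kernel virtual
 * address left for a later list walk to clean. Any cache maintenance
 * has to happen around these calls.
 */
static void copy_highmem_page_sketch(struct page *dst, struct page *src)
{
	void *kaddr_src = kmap_atomic(src);
	void *kaddr_dst = kmap_atomic(dst);

	copy_page(kaddr_dst, kaddr_src);

	kunmap_atomic(kaddr_dst);	/* mapping torn down here... */
	kunmap_atomic(kaddr_src);	/* ...so this is where maintenance belongs */
}
```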
On Thu 2015-11-26 17:32:47, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches before the processor can fetch them as new
> instructions.
>
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Create a new list of pages that were restored in place, so
> that the arch code can perform this maintenance when necessary.

Umm. Could the copy function be modified to do the necessary flushing,
instead?

Alternatively, can you just clean the whole cache before jumping to the
new kernel?

Pavel
Hi Pavel,

On 05/12/15 09:35, Pavel Machek wrote:
> On Thu 2015-11-26 17:32:47, James Morse wrote:
>> Some architectures require code written to memory as if it were data to be
>> 'cleaned' from any data caches before the processor can fetch them as new
>> instructions.
>>
>> During resume from hibernate, the snapshot code copies some pages directly,
>> meaning these architectures do not get a chance to perform their cache
>> maintenance. Create a new list of pages that were restored in place, so
>> that the arch code can perform this maintenance when necessary.
>
> Umm. Could the copy function be modified to do the necessary
> flushing, instead?

The copying is done by load_image_lzo() using memcpy() if you have
compression enabled, and by load_image() using swap_read_page() if you
don't.

I didn't do it here as it would clean every page copied, which was the
worrying part of the previous approach. If there is an architecture
where this cache-clean operation is expensive, it would slow down
restore. I was trying to benchmark the impact of this on 32bit arm when
I spotted it was broken.

This allocated-same-page code path doesn't happen very often, so we
don't want this to have an impact on the 'normal' code path. On 32bit
arm I saw ~20 of these allocations out of ~60,000 pages.

This new way allocates a few extra pages during restore, and doesn't
assume that flush_cache_range() needs calling. It should have no impact
on architectures that aren't using the new list.

> Alternatively, can you just clean the whole cache before jumping to
> the new kernel?

On arm64, cleaning the whole cache means cleaning all of memory by
virtual address, which would be a high price to pay when we only need to
clean the pages we copied. The current implementation does clean all the
pages it copies; the problem is the ~0.03% that are copied behind its
back. This patch publishes where those pages are.

Thanks!

James
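(For context, a sketch of how arch code might consume the new list during
resume. This is illustrative only, not taken from the arm64 series: the
function name clean_restored_inplace_pages() is hypothetical, and
flush_icache_range() stands in for whatever maintenance an architecture
actually needs.)

```c
#include <linux/suspend.h>	/* struct pbe, restored_inplace_pblist */
#include <asm/cacheflush.h>	/* flush_icache_range() */

/* Hypothetical arch-side consumer of the new list: clean only the
 * handful of pages the core code restored in place, rather than
 * every page in the image.
 */
static void clean_restored_inplace_pages(void)
{
	struct pbe *pbe;

	for (pbe = restored_inplace_pblist; pbe; pbe = pbe->next) {
		unsigned long addr = (unsigned long)pbe->address;

		/* make the D-cache contents visible to instruction fetch */
		flush_icache_range(addr, addr + PAGE_SIZE);
	}
}
```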
Hi!

> > Umm. Could the copy function be modified to do the necessary
> > flushing, instead?
>
> The copying is done by load_image_lzo() using memcpy() if you have
> compression enabled, and by load_image() using swap_read_page() if you
> don't.
>
> I didn't do it here as it would clean every page copied, which was the
> worrying part of the previous approach. If there is an architecture
> where this cache-clean operation is expensive, it would slow down
> restore. I was trying to benchmark the impact of this on 32bit arm when
> I spotted it was broken.

You have just loaded the page from slow storage (hard drive, MMC).
Cleaning a page should be pretty fast compared to that.

> This allocated-same-page code path doesn't happen very often, so we
> don't want this to have an impact on the 'normal' code path. On 32bit
> arm I saw ~20 of these allocations out of ~60,000 pages.
>
> This new way allocates a few extra pages during restore, and doesn't
> assume that flush_cache_range() needs calling. It should have no impact
> on architectures that aren't using the new list.

It is also complex.

> > Alternatively, can you just clean the whole cache before jumping to
> > the new kernel?
>
> On arm64, cleaning the whole cache means cleaning all of memory by
> virtual address, which would be a high price to pay when we only need to
> clean the pages we copied. The current implementation does clean all

How high a price? I mean, hibernation/restore takes _seconds_. Paying
milliseconds to have cleaner code is an acceptable price.

Pavel
Hi Pavel,

On 08/12/15 08:19, Pavel Machek wrote:
>> I didn't do it here as it would clean every page copied, which was the
>> worrying part of the previous approach. If there is an architecture
>> where this cache-clean operation is expensive, it would slow down
>> restore. I was trying to benchmark the impact of this on 32bit arm when
>> I spotted it was broken.
>
> You have just loaded the page from slow storage (hard drive,
> MMC). Cleaning a page should be pretty fast compared to that.

(One day I hope to own a laptop that hibernates to almost-memory-speed
nvram!)

Speed is one issue; another is that I don't think it's correct to assume
that any architecture with a flush_icache_range() function can/should
have it called on any page of data.

There is also the possibility that an architecture needs to do something
other than flush_icache_range() on the pages that were copied. (I can
see lots of s390 hooks for 'page keys'; Intel's memory protection keys
may want something similar...)

This patch is the general-purpose fix, matching the existing list of
'these pages need copying' with a 'these pages were already copied'.

>> This allocated-same-page code path doesn't happen very often, so we
>> don't want this to have an impact on the 'normal' code path. On 32bit
>> arm I saw ~20 of these allocations out of ~60,000 pages.
>>
>> This new way allocates a few extra pages during restore, and doesn't
>> assume that flush_cache_range() needs calling. It should have no impact
>> on architectures that aren't using the new list.
>
> It is also complex.

It's symmetric with the existing restore_pblist code; I think this is
the simplest way of doing it.

>>> Alternatively, can you just clean the whole cache before jumping to
>>> the new kernel?
>>
>> On arm64, cleaning the whole cache means cleaning all of memory by
>> virtual address, which would be a high price to pay when we only need to
>> clean the pages we copied. The current implementation does clean all
>
> How high a price? I mean, hibernation/restore takes _seconds_. Paying
> milliseconds to have cleaner code is an acceptable price.

I agree, but the code to clean all 8GB of memory on Juno takes ~3
seconds, and this will probably scale linearly. We only need to clean
the ~250MB that was copied by hibernate (and of that, only the
executable pages). The sticking point is the few pages it copies but
doesn't tell us about.

I will put together the flush_icache_range()-during-decompression
version of this patch... From the comments, it looks like powerpc will
suffer the most from this: its flush_icache_range() code pushes data all
the way out to memory...

Thanks,

James
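(For reference, the flush_icache_range()-during-decompression alternative
James mentions would amount to something like the helper below, called on
each page buffer right after load_image()/load_image_lzo() fills it. This
is a sketch under that assumption, not code from any posted patch.)

```c
#include <asm/cacheflush.h>	/* flush_icache_range() */
#include <asm/page.h>		/* PAGE_SIZE */

/* Hypothetical helper: clean every page as it is loaded, trading one
 * flush per restored page for not needing the restored-in-place list.
 * On powerpc this is the expensive option, since its
 * flush_icache_range() pushes data all the way out to memory.
 */
static inline void clean_loaded_page(void *buf)
{
	unsigned long addr = (unsigned long)buf;

	flush_icache_range(addr, addr + PAGE_SIZE);
}
```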
```diff
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 8b6ec7ef0854..b17cf6081bca 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -384,6 +384,7 @@ extern bool system_entering_hibernation(void);
 extern bool hibernation_available(void);
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
+extern struct pbe *restored_inplace_pblist;
 #else /* CONFIG_HIBERNATION */
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 3a970604308f..f251f5af49fb 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -74,6 +74,11 @@ void __init hibernate_image_size_init(void)
  */
 struct pbe *restore_pblist;
 
+/* List of PBEs that were restored in place. modified-harvard architectures
+ * need to 'clean' these pages before they can be executed.
+ */
+struct pbe *restored_inplace_pblist;
+
 /* Pointer to an auxiliary buffer (1 page) */
 static void *buffer;
 
@@ -1359,6 +1364,7 @@ out:
 	nr_copy_pages = 0;
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
+	restored_inplace_pblist = NULL;
 	buffer = NULL;
 	alloc_normal = 0;
 	alloc_highmem = 0;
@@ -2072,6 +2078,7 @@ load_header(struct swsusp_info *info)
 	int error;
 
 	restore_pblist = NULL;
+	restored_inplace_pblist = NULL;
 	error = check_header(info);
 	if (!error) {
 		nr_copy_pages = info->image_pages;
@@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
 	if (PageHighMem(page))
 		return get_highmem_page_buffer(page, ca);
 
-	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page))
-		/* We have allocated the "original" page frame and we can
-		 * use it directly to store the loaded page.
-		 */
-		return page_address(page);
-
-	/* The "original" page frame has not been allocated and we have to
-	 * use a "safe" page frame to store the loaded page.
-	 */
 	pbe = chain_alloc(ca, sizeof(struct pbe));
 	if (!pbe) {
 		swsusp_free();
 		return ERR_PTR(-ENOMEM);
 	}
-	pbe->orig_address = page_address(page);
-	pbe->address = safe_pages_list;
-	safe_pages_list = safe_pages_list->next;
-	pbe->next = restore_pblist;
-	restore_pblist = pbe;
+
+	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page)) {
+		/* We have allocated the "original" page frame and we can
+		 * use it directly to store the loaded page.
+		 */
+		pbe->orig_address = NULL;
+		pbe->address = page_address(page);
+		pbe->next = restored_inplace_pblist;
+		restored_inplace_pblist = pbe;
+	} else {
+		/* The "original" page frame has not been allocated and we
+		 * have to use a "safe" page frame to store the loaded page.
+		 */
+		pbe->orig_address = page_address(page);
+		pbe->address = safe_pages_list;
+		safe_pages_list = safe_pages_list->next;
+		pbe->next = restore_pblist;
+		restore_pblist = pbe;
+	}
+
 	return pbe->address;
 }
 
@@ -2513,6 +2526,7 @@ int snapshot_write_next(struct snapshot_handle *handle)
 		chain_init(&ca, GFP_ATOMIC, PG_SAFE);
 		memory_bm_position_reset(&orig_bm);
 		restore_pblist = NULL;
+		restored_inplace_pblist = NULL;
 		handle->buffer = get_buffer(&orig_bm, &ca);
 		handle->sync_read = 0;
 		if (IS_ERR(handle->buffer))
```
Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Create a new list of pages that were restored in place, so
that the arch code can perform this maintenance when necessary.

Signed-off-by: James Morse <james.morse@arm.com>
---
 include/linux/suspend.h |  1 +
 kernel/power/snapshot.c | 42 ++++++++++++++++++++++++++++--------------
 2 files changed, 29 insertions(+), 14 deletions(-)