diff mbox series

[v7,08/14] xen/page_alloc: introduce preserved page flags macro

Message ID 20240315105902.160047-9-carlo.nonato@minervasys.tech (mailing list archive)
State New
Headers show
Series Arm cache coloring | expand

Commit Message

Carlo Nonato March 15, 2024, 10:58 a.m. UTC
PGC_static and PGC_extra needs to be preserved when assigning a page.
Define a new macro that groups those flags and use it instead of or'ing
every time.

To make preserved flags even more meaningful, they are kept also when
switching state in mark_page_free().

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- PGC_preserved used also in mark_page_free()
v6:
- preserved_flags renamed to PGC_preserved
- PGC_preserved is used only in assign_pages()
v5:
- new patch
---
 xen/common/page_alloc.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Jan Beulich March 19, 2024, 3:47 p.m. UTC | #1
On 15.03.2024 11:58, Carlo Nonato wrote:
> PGC_static and PGC_extra needs to be preserved when assigning a page.
> Define a new macro that groups those flags and use it instead of or'ing
> every time.
> 
> To make preserved flags even more meaningful, they are kept also when
> switching state in mark_page_free().
> 
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Julien Grall March 21, 2024, 4:07 p.m. UTC | #2
(+ Roger)

Hi Carlo,

On 15/03/2024 10:58, Carlo Nonato wrote:
> PGC_static and PGC_extra needs to be preserved when assigning a page.
> Define a new macro that groups those flags and use it instead of or'ing
> every time.
> 
> To make preserved flags even more meaningful, they are kept also when
> switching state in mark_page_free().
> 
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>

This patch is introducing a regression in OSStest (and possibly gitlab?):

Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0 
v=0xe40000010007ffff t=0x24
Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y 
Not tainted ]----
Mar 21 12:00:42.829857 (XEN) CPU:    12
Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>] 
common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT: 
hypervisor (d0v8)
Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx: 
000000000007ffff   rcx: 0000000000000028
Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi: 
ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp: 
ffff83047bec7b98   r8:  0000000000000000
Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10: 
000000000000000c   r11: 0000000000000010
Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13: 
0000000000000000   r14: ffff82e0044238a0
Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0: 
0000000080050033   cr4: 0000000000372660
Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb: 
ffff88801f200000   gss: 0000000000000000
Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000 
ss: e010   cs: e008
Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f> 
(common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9 
79 36 0f 0b 41 89 cd 48 c7 47 f0
Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20 
0000000000000001 ffff83046ccda000
Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016 
0000000000000000 0000000000000000
Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000 
0000000000000028 0000000000000021
Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000 
00007d2000000000 ffff83047bec7c48
Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000 
0000000000000100 0000000000000000
Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000 
ffff83047bec7c80 ffff82d0402f626c
Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640 
0000000000000000 0000000000000000
Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0 
ffff82d0402f65a0 ffff83046ccda000
Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000 
0000000000000000 ffff83047bec7cc0
Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0 
ffff82d0402bd543 ffff83046ccda000
Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08 
ffff82d04032c524 ffff83046ccda000
Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002 
ffff83047bec7d58 ffff82d040206750
Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5 
ffff83047bec7d48 0000000000000000
Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010 
ffff82d0405e9120 0000000000000001
Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728 
0000000000000007 ffff83023e3b3000
Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90 
ffff83023e38e000 ffff83023e2efb40
Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80 
0000000000000206 ffff83047bec7dc0
Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff 
e75aaa8d0000000c ac0d6d864e487f62
Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000 
ffffffff000003ff 00000002ffffffff
Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff 
0000000000000000 0000000000000000
Mar 21 12:00:43.093646 (XEN) Xen call trace:
Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R 
common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F 
alloc_domheap_pages+0x17d/0x1e4
Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F 
hap_set_allocation+0x73/0x23c
Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F 
hap_enable+0x138/0x33c
Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F 
paging_enable+0x2d/0x45
Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F 
hvm_domain_initialise+0x185/0x428
Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F 
arch_domain_create+0x3e7/0x4c1
Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F 
domain_create+0x4cc/0x7e2
Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F 
do_domctl+0x1850/0x192d
Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F 
pv_hypercall+0x617/0x6b5
Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F 
lstar_enter+0x13a/0x140
Mar 21 12:00:43.153689 (XEN)
Mar 21 12:00:43.153711 (XEN)
Mar 21 12:00:43.153731 (XEN) ****************************************
Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
Mar 21 12:00:43.165703 (XEN) ****************************************
Mar 21 12:00:43.177633 (XEN)
Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)

The code around the BUG is:

         /* Reference count must continuously be zero for free pages. */
         if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
         {
             printk(XENLOG_ERR
                    "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
                    i, mfn_x(page_to_mfn(pg + i)),
                    pg[i].count_info, pg[i].v.free.order,
                    pg[i].u.free.val, pg[i].tlbflush_timestamp);
             BUG();
         }

Now that you are preserving some flags, you also want to modify the 
condition. I haven't checked the rest of the code, so there might be 
some adjustments necessary.

For now I have reverted the patch to unblock the CI.

Cheers,
Julien Grall March 21, 2024, 4:10 p.m. UTC | #3
On 21/03/2024 16:07, Julien Grall wrote:
> (+ Roger)
> 
> Hi Carlo,
> 
> On 15/03/2024 10:58, Carlo Nonato wrote:
>> PGC_static and PGC_extra needs to be preserved when assigning a page.
>> Define a new macro that groups those flags and use it instead of or'ing
>> every time.
>>
>> To make preserved flags even more meaningful, they are kept also when
>> switching state in mark_page_free().
>>
>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> 
> This patch is introducing a regression in OSStest (and possibly gitlab?):
> 
> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0 
> v=0xe40000010007ffff t=0x24
> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y 
> Not tainted ]----
> Mar 21 12:00:42.829857 (XEN) CPU:    12
> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>] 
> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT: 
> hypervisor (d0v8)
> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx: 
> 000000000007ffff   rcx: 0000000000000028
> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi: 
> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp: 
> ffff83047bec7b98   r8:  0000000000000000
> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10: 
> 000000000000000c   r11: 0000000000000010
> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13: 
> 0000000000000000   r14: ffff82e0044238a0
> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0: 
> 0000000080050033   cr4: 0000000000372660
> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb: 
> ffff88801f200000   gss: 0000000000000000
> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000 
> ss: e010   cs: e008
> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f> 
> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9 
> 79 36 0f 0b 41 89 cd 48 c7 47 f0
> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20 
> 0000000000000001 ffff83046ccda000
> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016 
> 0000000000000000 0000000000000000
> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000 
> 0000000000000028 0000000000000021
> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000 
> 00007d2000000000 ffff83047bec7c48
> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000 
> 0000000000000100 0000000000000000
> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000 
> ffff83047bec7c80 ffff82d0402f626c
> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640 
> 0000000000000000 0000000000000000
> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0 
> ffff82d0402f65a0 ffff83046ccda000
> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000 
> 0000000000000000 ffff83047bec7cc0
> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0 
> ffff82d0402bd543 ffff83046ccda000
> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08 
> ffff82d04032c524 ffff83046ccda000
> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002 
> ffff83047bec7d58 ffff82d040206750
> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5 
> ffff83047bec7d48 0000000000000000
> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010 
> ffff82d0405e9120 0000000000000001
> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728 
> 0000000000000007 ffff83023e3b3000
> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90 
> ffff83023e38e000 ffff83023e2efb40
> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80 
> 0000000000000206 ffff83047bec7dc0
> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff 
> e75aaa8d0000000c ac0d6d864e487f62
> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000 
> ffffffff000003ff 00000002ffffffff
> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff 
> 0000000000000000 0000000000000000
> Mar 21 12:00:43.093646 (XEN) Xen call trace:
> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R 
> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F 
> alloc_domheap_pages+0x17d/0x1e4
> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F 
> hap_set_allocation+0x73/0x23c
> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F 
> hap_enable+0x138/0x33c
> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F 
> paging_enable+0x2d/0x45
> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F 
> hvm_domain_initialise+0x185/0x428
> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F 
> arch_domain_create+0x3e7/0x4c1
> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F 
> domain_create+0x4cc/0x7e2
> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F 
> do_domctl+0x1850/0x192d
> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F 
> pv_hypercall+0x617/0x6b5
> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F 
> lstar_enter+0x13a/0x140
> Mar 21 12:00:43.153689 (XEN)
> Mar 21 12:00:43.153711 (XEN)
> Mar 21 12:00:43.153731 (XEN) ****************************************
> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
> Mar 21 12:00:43.165703 (XEN) ****************************************
> Mar 21 12:00:43.177633 (XEN)
> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
> 
> The code around the BUG is:
> 
>          /* Reference count must continuously be zero for free pages. */
>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>          {
>              printk(XENLOG_ERR
>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>                     i, mfn_x(page_to_mfn(pg + i)),
>                     pg[i].count_info, pg[i].v.free.order,
>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>              BUG();
>          }
> 
> Now that you are preserving some flags, you also want to modify the 
> condition. I haven't checked the rest of the code, so there might be 
> some adjustments necessary.

Actually maybe the condition should not be adjusted. I think it would be 
wrong if a free pages has the flag PGC_extra set. Any thoughts?

Cheers,
Jan Beulich March 21, 2024, 4:22 p.m. UTC | #4
On 21.03.2024 17:10, Julien Grall wrote:
> On 21/03/2024 16:07, Julien Grall wrote:
>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
>>> Define a new macro that groups those flags and use it instead of or'ing
>>> every time.
>>>
>>> To make preserved flags even more meaningful, they are kept also when
>>> switching state in mark_page_free().
>>>
>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>
>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>
>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0 
>> v=0xe40000010007ffff t=0x24
>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y 
>> Not tainted ]----
>> Mar 21 12:00:42.829857 (XEN) CPU:    12
>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>] 
>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT: 
>> hypervisor (d0v8)
>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx: 
>> 000000000007ffff   rcx: 0000000000000028
>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi: 
>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp: 
>> ffff83047bec7b98   r8:  0000000000000000
>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10: 
>> 000000000000000c   r11: 0000000000000010
>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13: 
>> 0000000000000000   r14: ffff82e0044238a0
>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0: 
>> 0000000080050033   cr4: 0000000000372660
>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb: 
>> ffff88801f200000   gss: 0000000000000000
>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000 
>> ss: e010   cs: e008
>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f> 
>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9 
>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20 
>> 0000000000000001 ffff83046ccda000
>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000 
>> 0000000000000028 0000000000000021
>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000 
>> 00007d2000000000 ffff83047bec7c48
>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000 
>> 0000000000000100 0000000000000000
>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000 
>> ffff83047bec7c80 ffff82d0402f626c
>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0 
>> ffff82d0402f65a0 ffff83046ccda000
>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000 
>> 0000000000000000 ffff83047bec7cc0
>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0 
>> ffff82d0402bd543 ffff83046ccda000
>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08 
>> ffff82d04032c524 ffff83046ccda000
>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002 
>> ffff83047bec7d58 ffff82d040206750
>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5 
>> ffff83047bec7d48 0000000000000000
>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010 
>> ffff82d0405e9120 0000000000000001
>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728 
>> 0000000000000007 ffff83023e3b3000
>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90 
>> ffff83023e38e000 ffff83023e2efb40
>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80 
>> 0000000000000206 ffff83047bec7dc0
>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff 
>> e75aaa8d0000000c ac0d6d864e487f62
>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000 
>> ffffffff000003ff 00000002ffffffff
>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R 
>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F 
>> alloc_domheap_pages+0x17d/0x1e4
>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F 
>> hap_set_allocation+0x73/0x23c
>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F 
>> hap_enable+0x138/0x33c
>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F 
>> paging_enable+0x2d/0x45
>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F 
>> hvm_domain_initialise+0x185/0x428
>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F 
>> arch_domain_create+0x3e7/0x4c1
>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F 
>> domain_create+0x4cc/0x7e2
>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F 
>> do_domctl+0x1850/0x192d
>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F 
>> pv_hypercall+0x617/0x6b5
>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F 
>> lstar_enter+0x13a/0x140
>> Mar 21 12:00:43.153689 (XEN)
>> Mar 21 12:00:43.153711 (XEN)
>> Mar 21 12:00:43.153731 (XEN) ****************************************
>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
>> Mar 21 12:00:43.165703 (XEN) ****************************************
>> Mar 21 12:00:43.177633 (XEN)
>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
>>
>> The code around the BUG is:
>>
>>          /* Reference count must continuously be zero for free pages. */
>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>          {
>>              printk(XENLOG_ERR
>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>                     i, mfn_x(page_to_mfn(pg + i)),
>>                     pg[i].count_info, pg[i].v.free.order,
>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>              BUG();
>>          }
>>
>> Now that you are preserving some flags, you also want to modify the 
>> condition. I haven't checked the rest of the code, so there might be 
>> some adjustments necessary.
> 
> Actually maybe the condition should not be adjusted. I think it would be 
> wrong if a free pages has the flag PGC_extra set. Any thoughts?

I agree, yet I'm inclined to say PGC_extra should have been cleared
before trying to free the page.

Jan
Carlo Nonato March 22, 2024, 3:07 p.m. UTC | #5
Hi guys,

On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 17:10, Julien Grall wrote:
> > On 21/03/2024 16:07, Julien Grall wrote:
> >> On 15/03/2024 10:58, Carlo Nonato wrote:
> >>> PGC_static and PGC_extra needs to be preserved when assigning a page.
> >>> Define a new macro that groups those flags and use it instead of or'ing
> >>> every time.
> >>>
> >>> To make preserved flags even more meaningful, they are kept also when
> >>> switching state in mark_page_free().
> >>>
> >>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>
> >> This patch is introducing a regression in OSStest (and possibly gitlab?):
> >>
> >> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
> >> v=0xe40000010007ffff t=0x24
> >> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
> >> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
> >> Not tainted ]----
> >> Mar 21 12:00:42.829857 (XEN) CPU:    12
> >> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
> >> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
> >> hypervisor (d0v8)
> >> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
> >> 000000000007ffff   rcx: 0000000000000028
> >> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
> >> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
> >> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
> >> ffff83047bec7b98   r8:  0000000000000000
> >> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
> >> 000000000000000c   r11: 0000000000000010
> >> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
> >> 0000000000000000   r14: ffff82e0044238a0
> >> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
> >> 0000000080050033   cr4: 0000000000372660
> >> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
> >> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
> >> ffff88801f200000   gss: 0000000000000000
> >> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
> >> ss: e010   cs: e008
> >> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
> >> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
> >> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
> >> 79 36 0f 0b 41 89 cd 48 c7 47 f0
> >> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
> >> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
> >> 0000000000000001 ffff83046ccda000
> >> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
> >> 0000000000000028 0000000000000021
> >> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
> >> 00007d2000000000 ffff83047bec7c48
> >> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
> >> 0000000000000100 0000000000000000
> >> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
> >> ffff83047bec7c80 ffff82d0402f626c
> >> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
> >> ffff82d0402f65a0 ffff83046ccda000
> >> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
> >> 0000000000000000 ffff83047bec7cc0
> >> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
> >> ffff82d0402bd543 ffff83046ccda000
> >> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
> >> ffff82d04032c524 ffff83046ccda000
> >> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
> >> ffff83047bec7d58 ffff82d040206750
> >> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
> >> ffff83047bec7d48 0000000000000000
> >> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
> >> ffff82d0405e9120 0000000000000001
> >> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
> >> 0000000000000007 ffff83023e3b3000
> >> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
> >> ffff83023e38e000 ffff83023e2efb40
> >> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
> >> 0000000000000206 ffff83047bec7dc0
> >> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
> >> e75aaa8d0000000c ac0d6d864e487f62
> >> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
> >> ffffffff000003ff 00000002ffffffff
> >> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:43.093646 (XEN) Xen call trace:
> >> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
> >> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
> >> alloc_domheap_pages+0x17d/0x1e4
> >> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
> >> hap_set_allocation+0x73/0x23c
> >> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
> >> hap_enable+0x138/0x33c
> >> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
> >> paging_enable+0x2d/0x45
> >> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
> >> hvm_domain_initialise+0x185/0x428
> >> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
> >> arch_domain_create+0x3e7/0x4c1
> >> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
> >> domain_create+0x4cc/0x7e2
> >> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
> >> do_domctl+0x1850/0x192d
> >> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
> >> pv_hypercall+0x617/0x6b5
> >> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
> >> lstar_enter+0x13a/0x140
> >> Mar 21 12:00:43.153689 (XEN)
> >> Mar 21 12:00:43.153711 (XEN)
> >> Mar 21 12:00:43.153731 (XEN) ****************************************
> >> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
> >> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
> >> Mar 21 12:00:43.165703 (XEN) ****************************************
> >> Mar 21 12:00:43.177633 (XEN)
> >> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
> >>
> >> The code around the BUG is:
> >>
> >>          /* Reference count must continuously be zero for free pages. */
> >>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
> >>          {
> >>              printk(XENLOG_ERR
> >>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
> >>                     i, mfn_x(page_to_mfn(pg + i)),
> >>                     pg[i].count_info, pg[i].v.free.order,
> >>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
> >>              BUG();
> >>          }
> >>
> >> Now that you are preserving some flags, you also want to modify the
> >> condition. I haven't checked the rest of the code, so there might be
> >> some adjustments necessary.
> >
> > Actually maybe the condition should not be adjusted. I think it would be
> > wrong if a free pages has the flag PGC_extra set. Any thoughts?
>
> I agree, yet I'm inclined to say PGC_extra should have been cleared
> before trying to free the page.

So what to do now? Should I drop this commit?

Thanks.
Jan Beulich March 25, 2024, 7:19 a.m. UTC | #6
On 22.03.2024 16:07, Carlo Nonato wrote:
> Hi guys,
> 
> On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 17:10, Julien Grall wrote:
>>> On 21/03/2024 16:07, Julien Grall wrote:
>>>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
>>>>> Define a new macro that groups those flags and use it instead of or'ing
>>>>> every time.
>>>>>
>>>>> To make preserved flags even more meaningful, they are kept also when
>>>>> switching state in mark_page_free().
>>>>>
>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>
>>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>>>
>>>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
>>>> v=0xe40000010007ffff t=0x24
>>>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
>>>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
>>>> Not tainted ]----
>>>> Mar 21 12:00:42.829857 (XEN) CPU:    12
>>>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
>>>> hypervisor (d0v8)
>>>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
>>>> 000000000007ffff   rcx: 0000000000000028
>>>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
>>>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
>>>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
>>>> ffff83047bec7b98   r8:  0000000000000000
>>>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
>>>> 000000000000000c   r11: 0000000000000010
>>>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
>>>> 0000000000000000   r14: ffff82e0044238a0
>>>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
>>>> 0000000080050033   cr4: 0000000000372660
>>>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
>>>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
>>>> ffff88801f200000   gss: 0000000000000000
>>>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
>>>> ss: e010   cs: e008
>>>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
>>>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
>>>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
>>>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
>>>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
>>>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
>>>> 0000000000000001 ffff83046ccda000
>>>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
>>>> 0000000000000028 0000000000000021
>>>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
>>>> 00007d2000000000 ffff83047bec7c48
>>>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
>>>> 0000000000000100 0000000000000000
>>>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
>>>> ffff83047bec7c80 ffff82d0402f626c
>>>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
>>>> ffff82d0402f65a0 ffff83046ccda000
>>>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
>>>> 0000000000000000 ffff83047bec7cc0
>>>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
>>>> ffff82d0402bd543 ffff83046ccda000
>>>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
>>>> ffff82d04032c524 ffff83046ccda000
>>>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
>>>> ffff83047bec7d58 ffff82d040206750
>>>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
>>>> ffff83047bec7d48 0000000000000000
>>>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
>>>> ffff82d0405e9120 0000000000000001
>>>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
>>>> 0000000000000007 ffff83023e3b3000
>>>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
>>>> ffff83023e38e000 ffff83023e2efb40
>>>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
>>>> 0000000000000206 ffff83047bec7dc0
>>>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
>>>> e75aaa8d0000000c ac0d6d864e487f62
>>>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
>>>> ffffffff000003ff 00000002ffffffff
>>>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
>>>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
>>>> alloc_domheap_pages+0x17d/0x1e4
>>>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
>>>> hap_set_allocation+0x73/0x23c
>>>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
>>>> hap_enable+0x138/0x33c
>>>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
>>>> paging_enable+0x2d/0x45
>>>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
>>>> hvm_domain_initialise+0x185/0x428
>>>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
>>>> arch_domain_create+0x3e7/0x4c1
>>>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
>>>> domain_create+0x4cc/0x7e2
>>>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
>>>> do_domctl+0x1850/0x192d
>>>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
>>>> pv_hypercall+0x617/0x6b5
>>>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
>>>> lstar_enter+0x13a/0x140
>>>> Mar 21 12:00:43.153689 (XEN)
>>>> Mar 21 12:00:43.153711 (XEN)
>>>> Mar 21 12:00:43.153731 (XEN) ****************************************
>>>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
>>>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
>>>> Mar 21 12:00:43.165703 (XEN) ****************************************
>>>> Mar 21 12:00:43.177633 (XEN)
>>>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
>>>>
>>>> The code around the BUG is:
>>>>
>>>>          /* Reference count must continuously be zero for free pages. */
>>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>>>          {
>>>>              printk(XENLOG_ERR
>>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>>>                     i, mfn_x(page_to_mfn(pg + i)),
>>>>                     pg[i].count_info, pg[i].v.free.order,
>>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>>>              BUG();
>>>>          }
>>>>
>>>> Now that you are preserving some flags, you also want to modify the
>>>> condition. I haven't checked the rest of the code, so there might be
>>>> some adjustments necessary.
>>>
>>> Actually maybe the condition should not be adjusted. I think it would be
>>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
>>
>> I agree, yet I'm inclined to say PGC_extra should have been cleared
>> before trying to free the page.
> 
> So what to do now? Should I drop this commit?

No, we need to get to the root of the issue. Since osstest has hit it quite
easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
In any event, as per my earlier reply, my present guess is that your change
has merely uncovered a previously latent issue elsewhere.

Jan
Carlo Nonato March 26, 2024, 4:39 p.m. UTC | #7
Hi Jan,

On Mon, Mar 25, 2024 at 8:19 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 22.03.2024 16:07, Carlo Nonato wrote:
> > Hi guys,
> >
> > On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 21.03.2024 17:10, Julien Grall wrote:
> >>> On 21/03/2024 16:07, Julien Grall wrote:
> >>>> On 15/03/2024 10:58, Carlo Nonato wrote:
> >>>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
> >>>>> Define a new macro that groups those flags and use it instead of or'ing
> >>>>> every time.
> >>>>>
> >>>>> To make preserved flags even more meaningful, they are kept also when
> >>>>> switching state in mark_page_free().
> >>>>>
> >>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>>>
> >>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
> >>>>
> >>>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
> >>>> v=0xe40000010007ffff t=0x24
> >>>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
> >>>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
> >>>> Not tainted ]----
> >>>> Mar 21 12:00:42.829857 (XEN) CPU:    12
> >>>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
> >>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >>>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
> >>>> hypervisor (d0v8)
> >>>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
> >>>> 000000000007ffff   rcx: 0000000000000028
> >>>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
> >>>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
> >>>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
> >>>> ffff83047bec7b98   r8:  0000000000000000
> >>>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
> >>>> 000000000000000c   r11: 0000000000000010
> >>>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
> >>>> 0000000000000000   r14: ffff82e0044238a0
> >>>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
> >>>> 0000000080050033   cr4: 0000000000372660
> >>>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
> >>>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
> >>>> ffff88801f200000   gss: 0000000000000000
> >>>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
> >>>> ss: e010   cs: e008
> >>>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
> >>>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
> >>>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
> >>>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
> >>>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
> >>>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
> >>>> 0000000000000001 ffff83046ccda000
> >>>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
> >>>> 0000000000000028 0000000000000021
> >>>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
> >>>> 00007d2000000000 ffff83047bec7c48
> >>>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
> >>>> 0000000000000100 0000000000000000
> >>>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
> >>>> ffff83047bec7c80 ffff82d0402f626c
> >>>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
> >>>> ffff82d0402f65a0 ffff83046ccda000
> >>>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
> >>>> 0000000000000000 ffff83047bec7cc0
> >>>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
> >>>> ffff82d0402bd543 ffff83046ccda000
> >>>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
> >>>> ffff82d04032c524 ffff83046ccda000
> >>>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
> >>>> ffff83047bec7d58 ffff82d040206750
> >>>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
> >>>> ffff83047bec7d48 0000000000000000
> >>>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
> >>>> ffff82d0405e9120 0000000000000001
> >>>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
> >>>> 0000000000000007 ffff83023e3b3000
> >>>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
> >>>> ffff83023e38e000 ffff83023e2efb40
> >>>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
> >>>> 0000000000000206 ffff83047bec7dc0
> >>>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
> >>>> e75aaa8d0000000c ac0d6d864e487f62
> >>>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
> >>>> ffffffff000003ff 00000002ffffffff
> >>>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
> >>>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
> >>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >>>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
> >>>> alloc_domheap_pages+0x17d/0x1e4
> >>>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
> >>>> hap_set_allocation+0x73/0x23c
> >>>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
> >>>> hap_enable+0x138/0x33c
> >>>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
> >>>> paging_enable+0x2d/0x45
> >>>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
> >>>> hvm_domain_initialise+0x185/0x428
> >>>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
> >>>> arch_domain_create+0x3e7/0x4c1
> >>>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
> >>>> domain_create+0x4cc/0x7e2
> >>>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
> >>>> do_domctl+0x1850/0x192d
> >>>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
> >>>> pv_hypercall+0x617/0x6b5
> >>>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
> >>>> lstar_enter+0x13a/0x140
> >>>> Mar 21 12:00:43.153689 (XEN)
> >>>> Mar 21 12:00:43.153711 (XEN)
> >>>> Mar 21 12:00:43.153731 (XEN) ****************************************
> >>>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
> >>>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
> >>>> Mar 21 12:00:43.165703 (XEN) ****************************************
> >>>> Mar 21 12:00:43.177633 (XEN)
> >>>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
> >>>>
> >>>> The code around the BUG is:
> >>>>
> >>>>          /* Reference count must continuously be zero for free pages. */
> >>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
> >>>>          {
> >>>>              printk(XENLOG_ERR
> >>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
> >>>>                     i, mfn_x(page_to_mfn(pg + i)),
> >>>>                     pg[i].count_info, pg[i].v.free.order,
> >>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
> >>>>              BUG();
> >>>>          }
> >>>>
> >>>> Now that you are preserving some flags, you also want to modify the
> >>>> condition. I haven't checked the rest of the code, so there might be
> >>>> some adjustments necessary.
> >>>
> >>> Actually maybe the condition should not be adjusted. I think it would be
> >>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
> >>
> >> I agree, yet I'm inclined to say PGC_extra should have been cleared
> >> before trying to free the page.
> >
> > So what to do now? Should I drop this commit?
>
> No, we need to get to the root of the issue. Since osstest has hit it quite
> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
> In any event, as per my earlier reply, my present guess is that your change
> has merely uncovered a previously latent issue elsewhere.

Ok, what about removing PGC_extra in free_heap_pages() before the
mark_page_free() call?

Thanks.

> Jan
Jan Beulich March 26, 2024, 5:04 p.m. UTC | #8
On 26.03.2024 17:39, Carlo Nonato wrote:
> On Mon, Mar 25, 2024 at 8:19 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 22.03.2024 16:07, Carlo Nonato wrote:
>>> On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 21.03.2024 17:10, Julien Grall wrote:
>>>>> On 21/03/2024 16:07, Julien Grall wrote:
>>>>>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>>>>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
>>>>>>> Define a new macro that groups those flags and use it instead of or'ing
>>>>>>> every time.
>>>>>>>
>>>>>>> To make preserved flags even more meaningful, they are kept also when
>>>>>>> switching state in mark_page_free().
>>>>>>>
>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>>>
>>>>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>>>>>
>>>>>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
>>>>>> v=0xe40000010007ffff t=0x24
>>>>>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
>>>>>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
>>>>>> Not tainted ]----
>>>>>> Mar 21 12:00:42.829857 (XEN) CPU:    12
>>>>>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
>>>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>>>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
>>>>>> hypervisor (d0v8)
>>>>>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
>>>>>> 000000000007ffff   rcx: 0000000000000028
>>>>>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
>>>>>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
>>>>>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
>>>>>> ffff83047bec7b98   r8:  0000000000000000
>>>>>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
>>>>>> 000000000000000c   r11: 0000000000000010
>>>>>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
>>>>>> 0000000000000000   r14: ffff82e0044238a0
>>>>>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
>>>>>> 0000000080050033   cr4: 0000000000372660
>>>>>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
>>>>>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
>>>>>> ffff88801f200000   gss: 0000000000000000
>>>>>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
>>>>>> ss: e010   cs: e008
>>>>>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
>>>>>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
>>>>>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
>>>>>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
>>>>>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
>>>>>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
>>>>>> 0000000000000001 ffff83046ccda000
>>>>>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
>>>>>> 0000000000000000 0000000000000000
>>>>>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
>>>>>> 0000000000000028 0000000000000021
>>>>>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
>>>>>> 00007d2000000000 ffff83047bec7c48
>>>>>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
>>>>>> 0000000000000100 0000000000000000
>>>>>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
>>>>>> ffff83047bec7c80 ffff82d0402f626c
>>>>>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
>>>>>> 0000000000000000 0000000000000000
>>>>>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
>>>>>> ffff82d0402f65a0 ffff83046ccda000
>>>>>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
>>>>>> 0000000000000000 ffff83047bec7cc0
>>>>>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
>>>>>> ffff82d0402bd543 ffff83046ccda000
>>>>>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
>>>>>> ffff82d04032c524 ffff83046ccda000
>>>>>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
>>>>>> ffff83047bec7d58 ffff82d040206750
>>>>>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
>>>>>> ffff83047bec7d48 0000000000000000
>>>>>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
>>>>>> ffff82d0405e9120 0000000000000001
>>>>>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
>>>>>> 0000000000000007 ffff83023e3b3000
>>>>>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
>>>>>> ffff83023e38e000 ffff83023e2efb40
>>>>>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
>>>>>> 0000000000000206 ffff83047bec7dc0
>>>>>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
>>>>>> e75aaa8d0000000c ac0d6d864e487f62
>>>>>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
>>>>>> ffffffff000003ff 00000002ffffffff
>>>>>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
>>>>>> 0000000000000000 0000000000000000
>>>>>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
>>>>>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
>>>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>>>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
>>>>>> alloc_domheap_pages+0x17d/0x1e4
>>>>>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
>>>>>> hap_set_allocation+0x73/0x23c
>>>>>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
>>>>>> hap_enable+0x138/0x33c
>>>>>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
>>>>>> paging_enable+0x2d/0x45
>>>>>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
>>>>>> hvm_domain_initialise+0x185/0x428
>>>>>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
>>>>>> arch_domain_create+0x3e7/0x4c1
>>>>>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
>>>>>> domain_create+0x4cc/0x7e2
>>>>>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
>>>>>> do_domctl+0x1850/0x192d
>>>>>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
>>>>>> pv_hypercall+0x617/0x6b5
>>>>>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
>>>>>> lstar_enter+0x13a/0x140
>>>>>> Mar 21 12:00:43.153689 (XEN)
>>>>>> Mar 21 12:00:43.153711 (XEN)
>>>>>> Mar 21 12:00:43.153731 (XEN) ****************************************
>>>>>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
>>>>>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
>>>>>> Mar 21 12:00:43.165703 (XEN) ****************************************
>>>>>> Mar 21 12:00:43.177633 (XEN)
>>>>>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
>>>>>>
>>>>>> The code around the BUG is:
>>>>>>
>>>>>>          /* Reference count must continuously be zero for free pages. */
>>>>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>>>>>          {
>>>>>>              printk(XENLOG_ERR
>>>>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>>>>>                     i, mfn_x(page_to_mfn(pg + i)),
>>>>>>                     pg[i].count_info, pg[i].v.free.order,
>>>>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>>>>>              BUG();
>>>>>>          }
>>>>>>
>>>>>> Now that you are preserving some flags, you also want to modify the
>>>>>> condition. I haven't checked the rest of the code, so there might be
>>>>>> some adjustments necessary.
>>>>>
>>>>> Actually maybe the condition should not be adjusted. I think it would be
>>>>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
>>>>
>>>> I agree, yet I'm inclined to say PGC_extra should have been cleared
>>>> before trying to free the page.
>>>
>>> So what to do now? Should I drop this commit?
>>
>> No, we need to get to the root of the issue. Since osstest has hit it quite
>> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
>> In any event, as per my earlier reply, my present guess is that your change
>> has merely uncovered a previously latent issue elsewhere.
> 
> Ok, what about removing PGC_extra in free_heap_pages() before the
> mark_page_free() call?

Question is: How would you justify such a change? IOW I'm not convinced
(yet) this wants doing there.

Jan
Julien Grall March 26, 2024, 9:55 p.m. UTC | #9
Hi Carlo & Jan,

On 26/03/2024 17:04, Jan Beulich wrote:
>>> No, we need to get to the root of the issue. Since osstest has hit it quite
>>> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
>>> In any event, as per my earlier reply, my present guess is that your change
>>> has merely uncovered a previously latent issue elsewhere.
>>
>> Ok, what about removing PGC_extra in free_heap_pages() before the
>> mark_page_free() call?
> 
> Question is: How would you justify such a change? IOW I'm not convinced
> (yet) this wants doing there.

Looking at the code, the flag is originally set in 
alloc_domheap_pages(). So I guess it would make sense to do it in 
free_domheap_pages().

For PGC_static, I don't think we can reach free_domheap_pages() with the 
flag set (the pages should live in a separate pool). So I believe there 
is nothing to do for them.

Cheers,
Carlo Nonato March 27, 2024, 11:10 a.m. UTC | #10
Hi guys,

> Question is: How would you justify such a change? IOW I'm not convinced
> (yet) this wants doing there.

You mean in this series?

> Looking at the code, the flag is originally set in
> alloc_domheap_pages(). So I guess it would make sense to do it in
> free_domheap_pages().

We don't hold the heap_lock there. Is it safe to change count_info without it?

Thanks.

> For PGC_static, I don't think we can reach free_domheap_pages() with the
> flag set (the pages should live in a separate pool). So I believe there
> is nothing to do for them.
>
> Cheers,
>
> --
> Julien Grall
Julien Grall March 27, 2024, 1:28 p.m. UTC | #11
Hi Carlo,

On 27/03/2024 11:10, Carlo Nonato wrote:
> Hi guys,
> 
>> Question is: How would you justify such a change? IOW I'm not convinced
>> (yet) this wants doing there.
> 
> You mean in this series?
> 
>> Looking at the code, the flag is originally set in
>> alloc_domheap_pages(). So I guess it would make sense to do it in
>> free_domheap_pages().
> 
> We don't hold the heap_lock there.
Regardless of the safety question (I will answer below), count_info is 
not meant to be protected by heap_lock. The lock is protecting the heap 
and ensure we are not corrupting them when there are concurrent call to 
free_heap_pages().

 > Is it safe to change count_info without it?

count_info is meant to be accessed locklessly. The flag PGC_extra cannot 
be set/clear concurrently because you can't allocate a page that has not 
yet been freed.

So it would be fine to use clear_bit(..., ...);

Cheers,
Jan Beulich March 27, 2024, 1:38 p.m. UTC | #12
On 27.03.2024 14:28, Julien Grall wrote:
> Hi Carlo,
> 
> On 27/03/2024 11:10, Carlo Nonato wrote:
>> Hi guys,
>>
>>> Question is: How would you justify such a change? IOW I'm not convinced
>>> (yet) this wants doing there.
>>
>> You mean in this series?
>>
>>> Looking at the code, the flag is originally set in
>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>> free_domheap_pages().
>>
>> We don't hold the heap_lock there.
> Regardless of the safety question (I will answer below), count_info is 
> not meant to be protected by heap_lock. The lock is protecting the heap 
> and ensure we are not corrupting them when there are concurrent call to 
> free_heap_pages().
> 
>  > Is it safe to change count_info without it?
> 
> count_info is meant to be accessed locklessly. The flag PGC_extra cannot 
> be set/clear concurrently because you can't allocate a page that has not 
> yet been freed.
> 
> So it would be fine to use clear_bit(..., ...);

Actually we hardly ever use clear_bit() on count_info. Normally we use
ordinary C operators. Atomic (and otherwise lockless) updates are useful
only if done like this everywhere.

Jan
Julien Grall March 27, 2024, 1:48 p.m. UTC | #13
On 27/03/2024 13:38, Jan Beulich wrote:
> On 27.03.2024 14:28, Julien Grall wrote:
>> Hi Carlo,
>>
>> On 27/03/2024 11:10, Carlo Nonato wrote:
>>> Hi guys,
>>>
>>>> Question is: How would you justify such a change? IOW I'm not convinced
>>>> (yet) this wants doing there.
>>>
>>> You mean in this series?
>>>
>>>> Looking at the code, the flag is originally set in
>>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>>> free_domheap_pages().
>>>
>>> We don't hold the heap_lock there.
>> Regardless of the safety question (I will answer below), count_info is
>> not meant to be protected by heap_lock. The lock is protecting the heap
>> and ensure we are not corrupting them when there are concurrent call to
>> free_heap_pages().
>>
>>   > Is it safe to change count_info without it?
>>
>> count_info is meant to be accessed locklessly. The flag PGC_extra cannot
>> be set/clear concurrently because you can't allocate a page that has not
>> yet been freed.
>>
>> So it would be fine to use clear_bit(..., ...);
> 
> Actually we hardly ever use clear_bit() on count_info. Normally we use
> ordinary C operators.

I knew you would say that. I am not convince it is safe to always using 
count_info without any atomic operations. But I never got around to 
check all them.

> Atomic (and otherwise lockless) updates are useful
> only if done like this everywhere.

You are right. But starting to use the bitops is not going to hurt 
anyone (other than maybe performance, but once we convert all of them, 
then this will become moot). In fact, it helps start to slowly move 
towards the aim to have count_info safe.

Cheers,
Julien Grall March 28, 2024, 6:21 p.m. UTC | #14
Hi,

Replying to myself.

On 27/03/2024 13:48, Julien Grall wrote:
> On 27/03/2024 13:38, Jan Beulich wrote:
>> On 27.03.2024 14:28, Julien Grall wrote:
>>> Hi Carlo,
>>>
>>> On 27/03/2024 11:10, Carlo Nonato wrote:
>>>> Hi guys,
>>>>
>>>>> Question is: How would you justify such a change? IOW I'm not 
>>>>> convinced
>>>>> (yet) this wants doing there.
>>>>
>>>> You mean in this series?
>>>>
>>>>> Looking at the code, the flag is originally set in
>>>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>>>> free_domheap_pages().
>>>>
>>>> We don't hold the heap_lock there.
>>> Regardless of the safety question (I will answer below), count_info is
>>> not meant to be protected by heap_lock. The lock is protecting the heap
>>> and ensure we are not corrupting them when there are concurrent call to
>>> free_heap_pages().
>>>
>>>   > Is it safe to change count_info without it?
>>>
>>> count_info is meant to be accessed locklessly. The flag PGC_extra cannot
>>> be set/clear concurrently because you can't allocate a page that has not
>>> yet been freed.
>>>
>>> So it would be fine to use clear_bit(..., ...);
>>
>> Actually we hardly ever use clear_bit() on count_info. Normally we use
>> ordinary C operators.
> 
> I knew you would say that. I am not convince it is safe to always using 
> count_info without any atomic operations. But I never got around to 
> check all them.
> 
>> Atomic (and otherwise lockless) updates are useful
>> only if done like this everywhere.
> 
> You are right. But starting to use the bitops is not going to hurt 
> anyone (other than maybe performance, but once we convert all of them, 
> then this will become moot). In fact, it helps start to slowly move 
> towards the aim to have count_info safe.

I think I managed to convince myself that, count_info |= ... is fine in 
this case and no locking is necessary.

Cheers,
diff mbox series

Patch

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index c38edb9a58..6a98d9013f 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -158,6 +158,8 @@ 
 #define PGC_static 0
 #endif
 
+#define PGC_preserved (PGC_extra | PGC_static)
+
 #ifndef PGT_TYPE_INFO_INITIALIZER
 #define PGT_TYPE_INFO_INITIALIZER 0
 #endif
@@ -1424,11 +1426,11 @@  static bool mark_page_free(struct page_info *pg, mfn_t mfn)
     {
     case PGC_state_inuse:
         BUG_ON(pg->count_info & PGC_broken);
-        pg->count_info = PGC_state_free;
+        pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved);
         break;
 
     case PGC_state_offlining:
-        pg->count_info = (pg->count_info & PGC_broken) |
+        pg->count_info = (pg->count_info & (PGC_broken | PGC_preserved)) |
                          PGC_state_offlined;
         pg_offlined = true;
         break;
@@ -2363,7 +2365,7 @@  int assign_pages(
 
         for ( i = 0; i < nr; i++ )
         {
-            ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+            ASSERT(!(pg[i].count_info & ~PGC_preserved));
             if ( pg[i].count_info & PGC_extra )
                 extra_pages++;
         }
@@ -2423,7 +2425,7 @@  int assign_pages(
         page_set_owner(&pg[i], d);
         smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
         pg[i].count_info =
-            (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+            (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
 
         page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
     }