diff mbox series

[v4,14/21] x86: introduce helper for recording degree of contiguity in page tables

Message ID fedf7224-8023-275a-843c-1a5753c20ded@suse.com (mailing list archive)
State Superseded
Headers show
Series IOMMU: superpage support when not sharing pagetables | expand

Commit Message

Jan Beulich April 25, 2022, 8:41 a.m. UTC
This is a re-usable helper (kind of a template) which gets introduced
without users so that the individual subsequent patches introducing such
users can get committed independently of one another.

See the comment at the top of the new file. To demonstrate the effect,
if a page table had just 16 entries, this would be the set of markers
for a page table with fully contiguous mappings:

index  0 1 2 3 4 5 6 7 8 9 A B C D E F
marker 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0

"Contiguous" here means not only present entries with successively
increasing MFNs, each one suitably aligned for its slot, but also a
respective number of all non-present entries.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Rename function and header. Introduce IS_CONTIG().
v2: New.

Comments

Roger Pau Monne May 6, 2022, 1:25 p.m. UTC | #1
On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
> This is a re-usable helper (kind of a template) which gets introduced
> without users so that the individual subsequent patches introducing such
> users can get committed independently of one another.
> 
> See the comment at the top of the new file. To demonstrate the effect,
> if a page table had just 16 entries, this would be the set of markers
> for a page table with fully contiguous mappings:
> 
> index  0 1 2 3 4 5 6 7 8 9 A B C D E F
> marker 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0
> 
> "Contiguous" here means not only present entries with successively
> increasing MFNs, each one suitably aligned for its slot, but also a
> respective number of all non-present entries.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v3: Rename function and header. Introduce IS_CONTIG().
> v2: New.
> 
> --- /dev/null
> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
> @@ -0,0 +1,105 @@
> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
> +#define __ASM_X86_PT_CONTIG_MARKERS_H
> +
> +/*
> + * Short of having function templates in C, the function defined below is
> + * intended to be used by multiple parties interested in recording the
> + * degree of contiguity in mappings by a single page table.
> + *
> + * Scheme: Every entry records the order of contiguous successive entries,
> + * up to the maximum order covered by that entry (which is the number of
> + * clear low bits in its index, with entry 0 being the exception using
> + * the base-2 logarithm of the number of entries in a single page table).
> + * While a few entries need touching upon update, knowing whether the
> + * table is fully contiguous (and can hence be replaced by a higher level
> + * leaf entry) is then possible by simply looking at entry 0's marker.
> + *
> + * Prereqs:
> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
> + *   contiguous bits (ignored by hardware), before including this file,
> + * - page tables to be passed here need to be initialized with correct
> + *   markers.

Not sure it's very relevant, but might we worth adding that:

- Null entries must have the PTE zeroed except for the CONTIG_MASK
  region in order to be considered as inactive.

> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/lib.h>
> +#include <xen/page-size.h>
> +
> +/* This is the same for all anticipated users, so doesn't need passing in. */
> +#define CONTIG_LEVEL_SHIFT 9
> +#define CONTIG_NR          (1 << CONTIG_LEVEL_SHIFT)
> +
> +#define GET_MARKER(e) MASK_EXTR(e, CONTIG_MASK)
> +#define SET_MARKER(e, m) \
> +    ((void)((e) = ((e) & ~CONTIG_MASK) | MASK_INSR(m, CONTIG_MASK)))
> +
> +#define IS_CONTIG(kind, pt, i, idx, shift, b) \
> +    ((kind) == PTE_kind_leaf \
> +     ? (((pt)[i] ^ (pt)[idx]) & ~CONTIG_MASK) == (1ULL << ((b) + (shift))) \
> +     : !((pt)[i] & ~CONTIG_MASK))
> +
> +enum PTE_kind {
> +    PTE_kind_null,
> +    PTE_kind_leaf,
> +    PTE_kind_table,
> +};
> +
> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
> +                                     unsigned int level, enum PTE_kind kind)
> +{
> +    unsigned int b, i = idx;
> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
> +
> +    ASSERT(idx < CONTIG_NR);
> +    ASSERT(!(pt[idx] & CONTIG_MASK));
> +
> +    /* Step 1: Reduce markers in lower numbered entries. */
> +    while ( i )
> +    {
> +        b = find_first_set_bit(i);
> +        i &= ~(1U << b);
> +        if ( GET_MARKER(pt[i]) > b )
> +            SET_MARKER(pt[i], b);

Can't you exit early when you find an entry that already has the
to-be-set contiguous marker <= b, as lower numbered entries will then
also be <= b'?

Ie:

if ( GET_MARKER(pt[i]) <= b )
    break;
else
    SET_MARKER(pt[i], b);

Thanks, Roger.
Jan Beulich May 18, 2022, 10:06 a.m. UTC | #2
On 06.05.2022 15:25, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
>> --- /dev/null
>> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
>> @@ -0,0 +1,105 @@
>> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
>> +#define __ASM_X86_PT_CONTIG_MARKERS_H
>> +
>> +/*
>> + * Short of having function templates in C, the function defined below is
>> + * intended to be used by multiple parties interested in recording the
>> + * degree of contiguity in mappings by a single page table.
>> + *
>> + * Scheme: Every entry records the order of contiguous successive entries,
>> + * up to the maximum order covered by that entry (which is the number of
>> + * clear low bits in its index, with entry 0 being the exception using
>> + * the base-2 logarithm of the number of entries in a single page table).
>> + * While a few entries need touching upon update, knowing whether the
>> + * table is fully contiguous (and can hence be replaced by a higher level
>> + * leaf entry) is then possible by simply looking at entry 0's marker.
>> + *
>> + * Prereqs:
>> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
>> + *   contiguous bits (ignored by hardware), before including this file,
>> + * - page tables to be passed here need to be initialized with correct
>> + *   markers.
> 
> Not sure it's very relevant, but might we worth adding that:
> 
> - Null entries must have the PTE zeroed except for the CONTIG_MASK
>   region in order to be considered as inactive.

NP, I've added an item along these lines.

>> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
>> +                                     unsigned int level, enum PTE_kind kind)
>> +{
>> +    unsigned int b, i = idx;
>> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
>> +
>> +    ASSERT(idx < CONTIG_NR);
>> +    ASSERT(!(pt[idx] & CONTIG_MASK));
>> +
>> +    /* Step 1: Reduce markers in lower numbered entries. */
>> +    while ( i )
>> +    {
>> +        b = find_first_set_bit(i);
>> +        i &= ~(1U << b);
>> +        if ( GET_MARKER(pt[i]) > b )
>> +            SET_MARKER(pt[i], b);
> 
> Can't you exit early when you find an entry that already has the
> to-be-set contiguous marker <= b, as lower numbered entries will then
> also be <= b'?
> 
> Ie:
> 
> if ( GET_MARKER(pt[i]) <= b )
>     break;
> else
>     SET_MARKER(pt[i], b);

Almost - I think it would need to be 

        if ( GET_MARKER(pt[i]) < b )
            break;
        if ( GET_MARKER(pt[i]) > b )
            SET_MARKER(pt[i], b);

or, accepting redundant updates, 

        if ( GET_MARKER(pt[i]) < b )
            break;
        SET_MARKER(pt[i], b);

. Neither the redundant updates nor the extra (easily mis-predicted)
conditional looked very appealing to me, but I guess I could change
this if you are convinced that's better than continuing a loop with
at most 9 (typically less) iterations.

Jan
Roger Pau Monne May 20, 2022, 10:22 a.m. UTC | #3
On Wed, May 18, 2022 at 12:06:29PM +0200, Jan Beulich wrote:
> On 06.05.2022 15:25, Roger Pau Monné wrote:
> > On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
> >> --- /dev/null
> >> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
> >> @@ -0,0 +1,105 @@
> >> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
> >> +#define __ASM_X86_PT_CONTIG_MARKERS_H
> >> +
> >> +/*
> >> + * Short of having function templates in C, the function defined below is
> >> + * intended to be used by multiple parties interested in recording the
> >> + * degree of contiguity in mappings by a single page table.
> >> + *
> >> + * Scheme: Every entry records the order of contiguous successive entries,
> >> + * up to the maximum order covered by that entry (which is the number of
> >> + * clear low bits in its index, with entry 0 being the exception using
> >> + * the base-2 logarithm of the number of entries in a single page table).
> >> + * While a few entries need touching upon update, knowing whether the
> >> + * table is fully contiguous (and can hence be replaced by a higher level
> >> + * leaf entry) is then possible by simply looking at entry 0's marker.
> >> + *
> >> + * Prereqs:
> >> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
> >> + *   contiguous bits (ignored by hardware), before including this file,
> >> + * - page tables to be passed here need to be initialized with correct
> >> + *   markers.
> > 
> > Not sure it's very relevant, but might we worth adding that:
> > 
> > - Null entries must have the PTE zeroed except for the CONTIG_MASK
> >   region in order to be considered as inactive.
> 
> NP, I've added an item along these lines.
> 
> >> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
> >> +                                     unsigned int level, enum PTE_kind kind)
> >> +{
> >> +    unsigned int b, i = idx;
> >> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
> >> +
> >> +    ASSERT(idx < CONTIG_NR);
> >> +    ASSERT(!(pt[idx] & CONTIG_MASK));
> >> +
> >> +    /* Step 1: Reduce markers in lower numbered entries. */
> >> +    while ( i )
> >> +    {
> >> +        b = find_first_set_bit(i);
> >> +        i &= ~(1U << b);
> >> +        if ( GET_MARKER(pt[i]) > b )
> >> +            SET_MARKER(pt[i], b);
> > 
> > Can't you exit early when you find an entry that already has the
> > to-be-set contiguous marker <= b, as lower numbered entries will then
> > also be <= b'?
> > 
> > Ie:
> > 
> > if ( GET_MARKER(pt[i]) <= b )
> >     break;
> > else
> >     SET_MARKER(pt[i], b);
> 
> Almost - I think it would need to be 
> 
>         if ( GET_MARKER(pt[i]) < b )
>             break;
>         if ( GET_MARKER(pt[i]) > b )
>             SET_MARKER(pt[i], b);

I guess I'm slightly confused, but if marker at i is <= b, then all
following markers will also be <=, and hence could be skipped?

Not sure why we need to keep iterating if GET_MARKER(pt[i]) == b.

FWIW, you could even do:

if ( GET_MARKER(pt[i]) <= b )
    break;
SET_MARKER(pt[i], b);

Which would keep the conditionals to 1 like it currently is.

> 
> or, accepting redundant updates, 
> 
>         if ( GET_MARKER(pt[i]) < b )
>             break;
>         SET_MARKER(pt[i], b);
> 
> . Neither the redundant updates nor the extra (easily mis-predicted)
> conditional looked very appealing to me, but I guess I could change
> this if you are convinced that's better than continuing a loop with
> at most 9 (typically less) iterations.

Well, I think I at least partly understood the logic.  Not sure
whether it's worth adding the conditional or just assuming that
continuing the loop is going to be cheaper.  Might be worth adding a
comment that we choose to explicitly not add an extra conditional to
check for early exit, because we assume that to be more expensive than
just continuing.

Thanks, Roger.
Jan Beulich May 20, 2022, 10:59 a.m. UTC | #4
On 20.05.2022 12:22, Roger Pau Monné wrote:
> On Wed, May 18, 2022 at 12:06:29PM +0200, Jan Beulich wrote:
>> On 06.05.2022 15:25, Roger Pau Monné wrote:
>>> On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
>>>> @@ -0,0 +1,105 @@
>>>> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
>>>> +#define __ASM_X86_PT_CONTIG_MARKERS_H
>>>> +
>>>> +/*
>>>> + * Short of having function templates in C, the function defined below is
>>>> + * intended to be used by multiple parties interested in recording the
>>>> + * degree of contiguity in mappings by a single page table.
>>>> + *
>>>> + * Scheme: Every entry records the order of contiguous successive entries,
>>>> + * up to the maximum order covered by that entry (which is the number of
>>>> + * clear low bits in its index, with entry 0 being the exception using
>>>> + * the base-2 logarithm of the number of entries in a single page table).
>>>> + * While a few entries need touching upon update, knowing whether the
>>>> + * table is fully contiguous (and can hence be replaced by a higher level
>>>> + * leaf entry) is then possible by simply looking at entry 0's marker.
>>>> + *
>>>> + * Prereqs:
>>>> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
>>>> + *   contiguous bits (ignored by hardware), before including this file,
>>>> + * - page tables to be passed here need to be initialized with correct
>>>> + *   markers.
>>>
>>> Not sure it's very relevant, but might we worth adding that:
>>>
>>> - Null entries must have the PTE zeroed except for the CONTIG_MASK
>>>   region in order to be considered as inactive.
>>
>> NP, I've added an item along these lines.
>>
>>>> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
>>>> +                                     unsigned int level, enum PTE_kind kind)
>>>> +{
>>>> +    unsigned int b, i = idx;
>>>> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
>>>> +
>>>> +    ASSERT(idx < CONTIG_NR);
>>>> +    ASSERT(!(pt[idx] & CONTIG_MASK));
>>>> +
>>>> +    /* Step 1: Reduce markers in lower numbered entries. */
>>>> +    while ( i )
>>>> +    {
>>>> +        b = find_first_set_bit(i);
>>>> +        i &= ~(1U << b);
>>>> +        if ( GET_MARKER(pt[i]) > b )
>>>> +            SET_MARKER(pt[i], b);
>>>
>>> Can't you exit early when you find an entry that already has the
>>> to-be-set contiguous marker <= b, as lower numbered entries will then
>>> also be <= b'?
>>>
>>> Ie:
>>>
>>> if ( GET_MARKER(pt[i]) <= b )
>>>     break;
>>> else
>>>     SET_MARKER(pt[i], b);
>>
>> Almost - I think it would need to be 
>>
>>         if ( GET_MARKER(pt[i]) < b )
>>             break;
>>         if ( GET_MARKER(pt[i]) > b )
>>             SET_MARKER(pt[i], b);
> 
> I guess I'm slightly confused, but if marker at i is <= b, then all
> following markers will also be <=, and hence could be skipped?

Your use of "following" is ambiguous here, because the iteration
moves downwards as far as PTEs inspected are concerned (and it's
b which grows from one iteration to the next). But yes, I think I
agree now that ...

> Not sure why we need to keep iterating if GET_MARKER(pt[i]) == b.

... this isn't needed. At which point ...

> FWIW, you could even do:
> 
> if ( GET_MARKER(pt[i]) <= b )
>     break;
> SET_MARKER(pt[i], b);
> 
> Which would keep the conditionals to 1 like it currently is.
> 
>>
>> or, accepting redundant updates, 
>>
>>         if ( GET_MARKER(pt[i]) < b )
>>             break;
>>         SET_MARKER(pt[i], b);
>>
>> . Neither the redundant updates nor the extra (easily mis-predicted)
>> conditional looked very appealing to me, but I guess I could change
>> this if you are convinced that's better than continuing a loop with
>> at most 9 (typically less) iterations.
> 
> Well, I think I at least partly understood the logic.  Not sure
> whether it's worth adding the conditional or just assuming that
> continuing the loop is going to be cheaper.  Might be worth adding a
> comment that we choose to explicitly not add an extra conditional to
> check for early exit, because we assume that to be more expensive than
> just continuing.

... this resolves without further action.

Jan
Roger Pau Monne May 20, 2022, 11:27 a.m. UTC | #5
On Fri, May 20, 2022 at 12:59:55PM +0200, Jan Beulich wrote:
> On 20.05.2022 12:22, Roger Pau Monné wrote:
> > On Wed, May 18, 2022 at 12:06:29PM +0200, Jan Beulich wrote:
> >> On 06.05.2022 15:25, Roger Pau Monné wrote:
> >>> On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
> >>>> --- /dev/null
> >>>> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
> >>>> @@ -0,0 +1,105 @@
> >>>> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
> >>>> +#define __ASM_X86_PT_CONTIG_MARKERS_H
> >>>> +
> >>>> +/*
> >>>> + * Short of having function templates in C, the function defined below is
> >>>> + * intended to be used by multiple parties interested in recording the
> >>>> + * degree of contiguity in mappings by a single page table.
> >>>> + *
> >>>> + * Scheme: Every entry records the order of contiguous successive entries,
> >>>> + * up to the maximum order covered by that entry (which is the number of
> >>>> + * clear low bits in its index, with entry 0 being the exception using
> >>>> + * the base-2 logarithm of the number of entries in a single page table).
> >>>> + * While a few entries need touching upon update, knowing whether the
> >>>> + * table is fully contiguous (and can hence be replaced by a higher level
> >>>> + * leaf entry) is then possible by simply looking at entry 0's marker.
> >>>> + *
> >>>> + * Prereqs:
> >>>> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
> >>>> + *   contiguous bits (ignored by hardware), before including this file,
> >>>> + * - page tables to be passed here need to be initialized with correct
> >>>> + *   markers.
> >>>
> >>> Not sure it's very relevant, but might we worth adding that:
> >>>
> >>> - Null entries must have the PTE zeroed except for the CONTIG_MASK
> >>>   region in order to be considered as inactive.
> >>
> >> NP, I've added an item along these lines.
> >>
> >>>> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
> >>>> +                                     unsigned int level, enum PTE_kind kind)
> >>>> +{
> >>>> +    unsigned int b, i = idx;
> >>>> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
> >>>> +
> >>>> +    ASSERT(idx < CONTIG_NR);
> >>>> +    ASSERT(!(pt[idx] & CONTIG_MASK));
> >>>> +
> >>>> +    /* Step 1: Reduce markers in lower numbered entries. */
> >>>> +    while ( i )
> >>>> +    {
> >>>> +        b = find_first_set_bit(i);
> >>>> +        i &= ~(1U << b);
> >>>> +        if ( GET_MARKER(pt[i]) > b )
> >>>> +            SET_MARKER(pt[i], b);
> >>>
> >>> Can't you exit early when you find an entry that already has the
> >>> to-be-set contiguous marker <= b, as lower numbered entries will then
> >>> also be <= b'?
> >>>
> >>> Ie:
> >>>
> >>> if ( GET_MARKER(pt[i]) <= b )
> >>>     break;
> >>> else
> >>>     SET_MARKER(pt[i], b);
> >>
> >> Almost - I think it would need to be 
> >>
> >>         if ( GET_MARKER(pt[i]) < b )
> >>             break;
> >>         if ( GET_MARKER(pt[i]) > b )
> >>             SET_MARKER(pt[i], b);
> > 
> > I guess I'm slightly confused, but if marker at i is <= b, then all
> > following markers will also be <=, and hence could be skipped?
> 
> Your use of "following" is ambiguous here, because the iteration
> moves downwards as far as PTEs inspected are concerned (and it's
> b which grows from one iteration to the next). But yes, I think I
> agree now that ...

Right, 'following' here would be the next item processed by the loop.

> > Not sure why we need to keep iterating if GET_MARKER(pt[i]) == b.
> 
> ... this isn't needed. At which point ...
> 
> > FWIW, you could even do:
> > 
> > if ( GET_MARKER(pt[i]) <= b )
> >     break;
> > SET_MARKER(pt[i], b);
> > 
> > Which would keep the conditionals to 1 like it currently is.
> > 
> >>
> >> or, accepting redundant updates, 
> >>
> >>         if ( GET_MARKER(pt[i]) < b )
> >>             break;
> >>         SET_MARKER(pt[i], b);
> >>
> >> . Neither the redundant updates nor the extra (easily mis-predicted)
> >> conditional looked very appealing to me, but I guess I could change
> >> this if you are convinced that's better than continuing a loop with
> >> at most 9 (typically less) iterations.
> > 
> > Well, I think I at least partly understood the logic.  Not sure
> > whether it's worth adding the conditional or just assuming that
> > continuing the loop is going to be cheaper.  Might be worth adding a
> > comment that we choose to explicitly not add an extra conditional to
> > check for early exit, because we assume that to be more expensive than
> > just continuing.
> 
> ... this resolves without further action.

OK, since we agree, and that was the only comment I had, you can add:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
diff mbox series

Patch

--- /dev/null
+++ b/xen/arch/x86/include/asm/pt-contig-markers.h
@@ -0,0 +1,105 @@ 
+#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
+#define __ASM_X86_PT_CONTIG_MARKERS_H
+
+/*
+ * Short of having function templates in C, the function defined below is
+ * intended to be used by multiple parties interested in recording the
+ * degree of contiguity in mappings by a single page table.
+ *
+ * Scheme: Every entry records the order of contiguous successive entries,
+ * up to the maximum order covered by that entry (which is the number of
+ * clear low bits in its index, with entry 0 being the exception using
+ * the base-2 logarithm of the number of entries in a single page table).
+ * While a few entries need touching upon update, knowing whether the
+ * table is fully contiguous (and can hence be replaced by a higher level
+ * leaf entry) is then possible by simply looking at entry 0's marker.
+ *
+ * Prereqs:
+ * - CONTIG_MASK needs to be #define-d, to a value having at least 4
+ *   contiguous bits (ignored by hardware), before including this file,
+ * - page tables to be passed here need to be initialized with correct
+ *   markers.
+ */
+
+#include <xen/bitops.h>
+#include <xen/lib.h>
+#include <xen/page-size.h>
+
+/* This is the same for all anticipated users, so doesn't need passing in. */
+#define CONTIG_LEVEL_SHIFT 9
+#define CONTIG_NR          (1 << CONTIG_LEVEL_SHIFT)
+
+#define GET_MARKER(e) MASK_EXTR(e, CONTIG_MASK)
+#define SET_MARKER(e, m) \
+    ((void)((e) = ((e) & ~CONTIG_MASK) | MASK_INSR(m, CONTIG_MASK)))
+
+#define IS_CONTIG(kind, pt, i, idx, shift, b) \
+    ((kind) == PTE_kind_leaf \
+     ? (((pt)[i] ^ (pt)[idx]) & ~CONTIG_MASK) == (1ULL << ((b) + (shift))) \
+     : !((pt)[i] & ~CONTIG_MASK))
+
+enum PTE_kind {
+    PTE_kind_null,
+    PTE_kind_leaf,
+    PTE_kind_table,
+};
+
+static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
+                                     unsigned int level, enum PTE_kind kind)
+{
+    unsigned int b, i = idx;
+    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
+
+    ASSERT(idx < CONTIG_NR);
+    ASSERT(!(pt[idx] & CONTIG_MASK));
+
+    /* Step 1: Reduce markers in lower numbered entries. */
+    while ( i )
+    {
+        b = find_first_set_bit(i);
+        i &= ~(1U << b);
+        if ( GET_MARKER(pt[i]) > b )
+            SET_MARKER(pt[i], b);
+    }
+
+    /* An intermediate table is never contiguous with anything. */
+    if ( kind == PTE_kind_table )
+        return false;
+
+    /*
+     * Present entries need in-sync index and address to be a candidate
+     * for being contiguous: What we're after is whether ultimately the
+     * intermediate table can be replaced by a superpage.
+     */
+    if ( kind != PTE_kind_null &&
+         idx != ((pt[idx] >> shift) & (CONTIG_NR - 1)) )
+        return false;
+
+    /* Step 2: Check higher numbered entries for contiguity. */
+    for ( b = 0; b < CONTIG_LEVEL_SHIFT && !(idx & (1U << b)); ++b )
+    {
+        i = idx | (1U << b);
+        if ( !IS_CONTIG(kind, pt, i, idx, shift, b) || GET_MARKER(pt[i]) != b )
+            break;
+    }
+
+    /* Step 3: Update markers in this and lower numbered entries. */
+    for ( ; SET_MARKER(pt[idx], b), b < CONTIG_LEVEL_SHIFT; ++b )
+    {
+        i = idx ^ (1U << b);
+        if ( !IS_CONTIG(kind, pt, i, idx, shift, b) || GET_MARKER(pt[i]) != b )
+            break;
+        idx &= ~(1U << b);
+    }
+
+    return b == CONTIG_LEVEL_SHIFT;
+}
+
+#undef IS_CONTIG
+#undef SET_MARKER
+#undef GET_MARKER
+#undef CONTIG_NR
+#undef CONTIG_LEVEL_SHIFT
+#undef CONTIG_MASK
+
+#endif /* __ASM_X86_PT_CONTIG_MARKERS_H */