diff mbox series

[6/9] XArray: internal node is a xa_node when it is bigger than XA_ZERO_ENTRY

Message ID 20200330123643.17120-7-richard.weiyang@gmail.com (mailing list archive)
State New, archived
Headers show
Series XArray: several cleanups | expand

Commit Message

Wei Yang March 30, 2020, 12:36 p.m. UTC
As the comment mentioned, we reserved several ranges of internal node
for tree maintenance, 0-62, 256, 257. This means a node bigger than
XA_ZERO_ENTRY is a normal node.

The checked on XA_ZERO_ENTRY seems to be more meaningful.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/xarray.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Matthew Wilcox (Oracle) March 30, 2020, 12:50 p.m. UTC | #1
On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
> As the comment mentioned, we reserved several ranges of internal node
> for tree maintenance, 0-62, 256, 257. This means a node bigger than
> XA_ZERO_ENTRY is a normal node.
> 
> The checked on XA_ZERO_ENTRY seems to be more meaningful.

257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
is not guaranteed to be the largest reserved entry.
Wei Yang March 30, 2020, 1:45 p.m. UTC | #2
On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
>On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
>> As the comment mentioned, we reserved several ranges of internal node
>> for tree maintenance, 0-62, 256, 257. This means a node bigger than
>> XA_ZERO_ENTRY is a normal node.
>> 
>> The checked on XA_ZERO_ENTRY seems to be more meaningful.
>
>257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
>is not guaranteed to be the largest reserved entry.

Then why we choose 4096?
Matthew Wilcox (Oracle) March 30, 2020, 1:49 p.m. UTC | #3
On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
> >> As the comment mentioned, we reserved several ranges of internal node
> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
> >> XA_ZERO_ENTRY is a normal node.
> >> 
> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
> >
> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
> >is not guaranteed to be the largest reserved entry.
> 
> Then why we choose 4096?

Because 4096 is the smallest page size supported by Linux, so we're
guaranteed that anything less than 4096 is not a valid pointer.
Wei Yang March 30, 2020, 2:13 p.m. UTC | #4
On Mon, Mar 30, 2020 at 06:49:03AM -0700, Matthew Wilcox wrote:
>On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
>> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
>> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
>> >> As the comment mentioned, we reserved several ranges of internal node
>> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
>> >> XA_ZERO_ENTRY is a normal node.
>> >> 
>> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
>> >
>> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
>> >is not guaranteed to be the largest reserved entry.
>> 
>> Then why we choose 4096?
>
>Because 4096 is the smallest page size supported by Linux, so we're
>guaranteed that anything less than 4096 is not a valid pointer.

I found this in xarray.rst:

  Normal pointers may be stored in the XArray directly.  They must be 4-byte
  aligned, which is true for any pointer returned from kmalloc() and
  alloc_page().  It isn't true for arbitrary user-space pointers,
  nor for function pointers.  You can store pointers to statically allocated
  objects, as long as those objects have an alignment of at least 4.

So the document here is not correct?
Matthew Wilcox (Oracle) March 30, 2020, 2:27 p.m. UTC | #5
On Mon, Mar 30, 2020 at 02:13:50PM +0000, Wei Yang wrote:
> On Mon, Mar 30, 2020 at 06:49:03AM -0700, Matthew Wilcox wrote:
> >On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
> >> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
> >> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
> >> >> As the comment mentioned, we reserved several ranges of internal node
> >> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
> >> >> XA_ZERO_ENTRY is a normal node.
> >> >> 
> >> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
> >> >
> >> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
> >> >is not guaranteed to be the largest reserved entry.
> >> 
> >> Then why we choose 4096?
> >
> >Because 4096 is the smallest page size supported by Linux, so we're
> >guaranteed that anything less than 4096 is not a valid pointer.
> 
> I found this in xarray.rst:
> 
>   Normal pointers may be stored in the XArray directly.  They must be 4-byte
>   aligned, which is true for any pointer returned from kmalloc() and
>   alloc_page().  It isn't true for arbitrary user-space pointers,
>   nor for function pointers.  You can store pointers to statically allocated
>   objects, as long as those objects have an alignment of at least 4.
> 
> So the document here is not correct?

Why do you say that?

(it is slightly out of date; the XArray actually supports storing unaligned
pointers now, but that's not relevant to this discussion)
Wei Yang March 30, 2020, 10:20 p.m. UTC | #6
On Mon, Mar 30, 2020 at 07:27:08AM -0700, Matthew Wilcox wrote:
>On Mon, Mar 30, 2020 at 02:13:50PM +0000, Wei Yang wrote:
>> On Mon, Mar 30, 2020 at 06:49:03AM -0700, Matthew Wilcox wrote:
>> >On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
>> >> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
>> >> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
>> >> >> As the comment mentioned, we reserved several ranges of internal node
>> >> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
>> >> >> XA_ZERO_ENTRY is a normal node.
>> >> >> 
>> >> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
>> >> >
>> >> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
>> >> >is not guaranteed to be the largest reserved entry.
>> >> 
>> >> Then why we choose 4096?
>> >
>> >Because 4096 is the smallest page size supported by Linux, so we're
>> >guaranteed that anything less than 4096 is not a valid pointer.
>> 

So you want to say, the 4096 makes sure XArray will not store an address in
first page? If this is the case, I have two suggestions:

  * use PAGE_SIZE would be more verbose?
  * a node is an internal entry, do we need to compare with xa_mk_internal()
    instead?

And also suggest to add a comment on this, otherwise it seems a little magic.

>> I found this in xarray.rst:
>> 
>>   Normal pointers may be stored in the XArray directly.  They must be 4-byte
>>   aligned, which is true for any pointer returned from kmalloc() and
>>   alloc_page().  It isn't true for arbitrary user-space pointers,
>>   nor for function pointers.  You can store pointers to statically allocated
>>   objects, as long as those objects have an alignment of at least 4.
>> 
>> So the document here is not correct?
>
>Why do you say that?
>
>(it is slightly out of date; the XArray actually supports storing unaligned
>pointers now, but that's not relevant to this discussion)

Ok, maybe this document need to update.
Matthew Wilcox (Oracle) March 31, 2020, 12:06 a.m. UTC | #7
On Mon, Mar 30, 2020 at 10:20:13PM +0000, Wei Yang wrote:
> On Mon, Mar 30, 2020 at 07:27:08AM -0700, Matthew Wilcox wrote:
> >On Mon, Mar 30, 2020 at 02:13:50PM +0000, Wei Yang wrote:
> >> On Mon, Mar 30, 2020 at 06:49:03AM -0700, Matthew Wilcox wrote:
> >> >On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
> >> >> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
> >> >> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
> >> >> >> As the comment mentioned, we reserved several ranges of internal node
> >> >> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
> >> >> >> XA_ZERO_ENTRY is a normal node.
> >> >> >> 
> >> >> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
> >> >> >
> >> >> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
> >> >> >is not guaranteed to be the largest reserved entry.
> >> >> 
> >> >> Then why we choose 4096?
> >> >
> >> >Because 4096 is the smallest page size supported by Linux, so we're
> >> >guaranteed that anything less than 4096 is not a valid pointer.
> >> 
> 
> So you want to say, the 4096 makes sure XArray will not store an address in
> first page? If this is the case, I have two suggestions:
> 
>   * use PAGE_SIZE would be more verbose?

But also incorrect, because it'll be different on different architectures.
It's 4096.  That's all.

>   * a node is an internal entry, do we need to compare with xa_mk_internal()
>     instead?

No.  4096 is better because it's a number which is easily expressible in
many CPU instruction sets.  4094 is much less likely to be an easy number
to encode.

> >(it is slightly out of date; the XArray actually supports storing unaligned
> >pointers now, but that's not relevant to this discussion)
> 
> Ok, maybe this document need to update.

Did you want to send a patch?
Wei Yang March 31, 2020, 1:40 p.m. UTC | #8
On Mon, Mar 30, 2020 at 05:06:49PM -0700, Matthew Wilcox wrote:
>On Mon, Mar 30, 2020 at 10:20:13PM +0000, Wei Yang wrote:
>> On Mon, Mar 30, 2020 at 07:27:08AM -0700, Matthew Wilcox wrote:
>> >On Mon, Mar 30, 2020 at 02:13:50PM +0000, Wei Yang wrote:
>> >> On Mon, Mar 30, 2020 at 06:49:03AM -0700, Matthew Wilcox wrote:
>> >> >On Mon, Mar 30, 2020 at 01:45:19PM +0000, Wei Yang wrote:
>> >> >> On Mon, Mar 30, 2020 at 05:50:06AM -0700, Matthew Wilcox wrote:
>> >> >> >On Mon, Mar 30, 2020 at 12:36:40PM +0000, Wei Yang wrote:
>> >> >> >> As the comment mentioned, we reserved several ranges of internal node
>> >> >> >> for tree maintenance, 0-62, 256, 257. This means a node bigger than
>> >> >> >> XA_ZERO_ENTRY is a normal node.
>> >> >> >> 
>> >> >> >> The checked on XA_ZERO_ENTRY seems to be more meaningful.
>> >> >> >
>> >> >> >257-1023 are also reserved, they just aren't used yet.  XA_ZERO_ENTRY
>> >> >> >is not guaranteed to be the largest reserved entry.
>> >> >> 
>> >> >> Then why we choose 4096?
>> >> >
>> >> >Because 4096 is the smallest page size supported by Linux, so we're
>> >> >guaranteed that anything less than 4096 is not a valid pointer.
>> >> 
>> 
>> So you want to say, the 4096 makes sure XArray will not store an address in
>> first page? If this is the case, I have two suggestions:
>> 
>>   * use PAGE_SIZE would be more verbose?
>
>But also incorrect, because it'll be different on different architectures.
>It's 4096.  That's all.
>
>>   * a node is an internal entry, do we need to compare with xa_mk_internal()
>>     instead?
>
>No.  4096 is better because it's a number which is easily expressible in
>many CPU instruction sets.  4094 is much less likely to be an easy number
>to encode.
>
>> >(it is slightly out of date; the XArray actually supports storing unaligned
>> >pointers now, but that's not relevant to this discussion)
>> 
>> Ok, maybe this document need to update.
>
>Did you want to send a patch?

I am not clear how it supports unaligned pointers. So maybe not now.

Actually, I still not get the point between page size and valid pointer. Why a
valid pointer couldn't be less than 4096? The first page in address space is
handled differently? Maybe I miss some point. I'd appreciate it if you'd share
some light.

Thanks
diff mbox series

Patch

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index a491653d8882..e9d07483af64 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1221,7 +1221,7 @@  static inline struct xa_node *xa_to_node(const void *entry)
 /* Private */
 static inline bool xa_is_node(const void *entry)
 {
-	return xa_is_internal(entry) && (unsigned long)entry > 4096;
+	return xa_is_internal(entry) && entry > XA_ZERO_ENTRY;
 }
 
 /* Private */