diff mbox series

[v3,12/24] erofs: introduce tagged pointer

Message ID 20190722025043.166344-13-gaoxiang25@huawei.com (mailing list archive)
State New, archived
Series erofs: promote erofs from staging

Commit Message

Gao Xiang July 22, 2019, 2:50 a.m. UTC
Currently the kernel has scattered tagged pointer usages
hacked by hand in plain code, without a unified and
portable set of functions to highlight the tagged pointer
itself and wrap this hand-written code in order to clean up
the meaningless magic masks all over.

This patch introduces simple generic methods to fold
tags into a pointer integer. Currently it supports using
the last n bits of the pointer for tags, where n can be
selected by users.

In addition, it will also be used for the upcoming EROFS
filesystem, which heavily uses the tagged pointer approach
to reduce extra memory allocation.

Link: https://en.wikipedia.org/wiki/Tagged_pointer

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 fs/erofs/tagptr.h | 110 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)
 create mode 100644 fs/erofs/tagptr.h

Comments

Amir Goldstein July 22, 2019, 4:39 a.m. UTC | #1
On Mon, Jul 22, 2019 at 5:54 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
>
> Currently kernel has scattered tagged pointer usages
> hacked by hand in plain code, without a unique and
> portable functionset to highlight the tagged pointer
> itself and wrap these hacked code in order to clean up
> all over meaningless magic masks.
>
> This patch introduces simple generic methods to fold
> tags into a pointer integer. Currently it supports
> the last n bits of the pointer for tags, which can be
> selected by users.
>
> In addition, it will also be used for the upcoming EROFS
> filesystem, which heavily uses tagged pointer approach
> to reduce extra memory allocation.
>
> Link: https://en.wikipedia.org/wiki/Tagged_pointer

Well, it won't do much good for other kernel users in fs/erofs/ ;-)

I think now would be a right time to promote this facility to
include/linux as you initially proposed.
I don't recall you got any objections. No ACKs either, but I think
that was the good kind of silence (?)

You might want to post the __fdget conversion patch [1] as a
bonus patch on top of your series.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/1530543233-65279-2-git-send-email-gaoxiang25@huawei.com/
Gao Xiang July 22, 2019, 5:01 a.m. UTC | #2
Hi Amir,

On 2019/7/22 12:39, Amir Goldstein wrote:
> On Mon, Jul 22, 2019 at 5:54 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
>>
>> Currently kernel has scattered tagged pointer usages
>> hacked by hand in plain code, without a unique and
>> portable functionset to highlight the tagged pointer
>> itself and wrap these hacked code in order to clean up
>> all over meaningless magic masks.
>>
>> This patch introduces simple generic methods to fold
>> tags into a pointer integer. Currently it supports
>> the last n bits of the pointer for tags, which can be
>> selected by users.
>>
>> In addition, it will also be used for the upcoming EROFS
>> filesystem, which heavily uses tagged pointer approach
>> to reduce extra memory allocation.
>>
>> Link: https://en.wikipedia.org/wiki/Tagged_pointer
> 
> Well, it won't do much good for other kernel users in fs/erofs/ ;-)

Thanks for your reply and interest in this patch... :)

Sigh... I wasn't sure whether kernel folks would be interested in this stuff.

Actually, when I was coding EROFS I found that tagged pointers have 2 main advantages:
1) they save an extra field;
2) they allow the pointer and its tags to be updated atomically as a whole...
I also observed that the current kernel uses tagged pointers all around but w/o a proper wrapper...
and EROFS uses tagged pointers heavily... So I made a simple tagged pointer wrapper
to avoid meaningless magic masks and type casts in the code...

> 
> I think now would be a right time to promote this facility to
> include/linux as you initially proposed.
> I don't recall you got any objections. No ACKs either, but I think
> that was the good kind of silence (?)

Yes, no NAK, no ACK... (that seems to be the ordinary state for all EROFS stuff... :'( sigh...)
Therefore I decided to leave it in fs/erofs/ in this series...

> 
> You might want to post the __fdget conversion patch [1] as a
> bonus patch on top of your series.

I am not sure whether other potential users would be happy with my ("sane?" or not)
implementation... (Are there any use scenarios in overlayfs and fanotify?...)

and I'm not sure Al would accept the __fdget conversion (I just wanted to give an example back then...)

Therefore, I tend to keep silent and just promote EROFS... any better ideas?...

Thanks,
Gao Xiang
Amir Goldstein July 22, 2019, 6:16 a.m. UTC | #3
On Mon, Jul 22, 2019 at 8:02 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
>
> Hi Amir,
>
> On 2019/7/22 12:39, Amir Goldstein wrote:
> > On Mon, Jul 22, 2019 at 5:54 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
> >>
> >> Currently kernel has scattered tagged pointer usages
> >> hacked by hand in plain code, without a unique and
> >> portable functionset to highlight the tagged pointer
> >> itself and wrap these hacked code in order to clean up
> >> all over meaningless magic masks.
> >>
> >> This patch introduces simple generic methods to fold
> >> tags into a pointer integer. Currently it supports
> >> the last n bits of the pointer for tags, which can be
> >> selected by users.
> >>
> >> In addition, it will also be used for the upcoming EROFS
> >> filesystem, which heavily uses tagged pointer approach
> >> to reduce extra memory allocation.
> >>
> >> Link: https://en.wikipedia.org/wiki/Tagged_pointer
> >
> > Well, it won't do much good for other kernel users in fs/erofs/ ;-)
>
> Thanks for your reply and interest in this patch.... :)
>
> Sigh... since I'm not sure kernel folks could have some interests in that stuffs.
>
> Actually at the time once I coded EROFS I found tagged pointer had 2 main advantages:
> 1) it saves an extra field;
> 2) it can keep the whole stuff atomically...
> And I observed the current kernel uses tagged pointer all around but w/o a proper wrapper...
> and EROFS heavily uses tagged pointer... So I made a simple tagged pointer wrapper
> to avoid meaningless magic masks and type casts in the code...
>
> >
> > I think now would be a right time to promote this facility to
> > include/linux as you initially proposed.
> > I don't recall you got any objections. No ACKs either, but I think
> > that was the good kind of silence (?)
>
> Yes, no NAK no ACK...(it seems the ordinary state for all EROFS stuffs... :'( sigh...)
> Therefore I decided to leave it in fs/erofs/ in this series...
>
> >
> > You might want to post the __fdget conversion patch [1] as a
> > bonus patch on top of your series.
>
> I am not sure if another potential users could be quite happy with my ("sane?" or not)
> implementation...

Well, let's ask potential users then.

CC kernel/trace maintainers for RB_PAGE_HEAD/RB_PAGE_UPDATE
and kernel/locking maintainers for RT_MUTEX_HAS_WAITERS

> (Is there some use scenarios in overlayfs and fanotify?...)

We had one in overlayfs once. It is gone now.

>
> and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
>
> Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
>

Writing example conversion patches to demonstrate cleaner code
and perhaps reduce LOC seems the best way.

Also pointing out that fixing potential bugs in one implementation is preferred
to having to patch all copied implementations.

I wonder if tagptr_unfold_tags() doesn't need READ_ONCE() as per:
1be5d4fa0af3 locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()

rb_list_head() doesn't have READ_ONCE()
Nor does hlist_bl_first() and BPF_MAP_PTR().

Are those all safe due to safe call sites? or potentially broken?

Thanks,
Amir.
Gao Xiang July 22, 2019, 6:31 a.m. UTC | #4
On 2019/7/22 14:16, Amir Goldstein wrote:
> On Mon, Jul 22, 2019 at 8:02 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
>>
>> Hi Amir,
>>
>> On 2019/7/22 12:39, Amir Goldstein wrote:
>>> On Mon, Jul 22, 2019 at 5:54 AM Gao Xiang <gaoxiang25@huawei.com> wrote:
>>>>
>>>> Currently kernel has scattered tagged pointer usages
>>>> hacked by hand in plain code, without a unique and
>>>> portable functionset to highlight the tagged pointer
>>>> itself and wrap these hacked code in order to clean up
>>>> all over meaningless magic masks.
>>>>
>>>> This patch introduces simple generic methods to fold
>>>> tags into a pointer integer. Currently it supports
>>>> the last n bits of the pointer for tags, which can be
>>>> selected by users.
>>>>
>>>> In addition, it will also be used for the upcoming EROFS
>>>> filesystem, which heavily uses tagged pointer approach
>>>> to reduce extra memory allocation.
>>>>
>>>> Link: https://en.wikipedia.org/wiki/Tagged_pointer
>>>
>>> Well, it won't do much good for other kernel users in fs/erofs/ ;-)
>>
>> Thanks for your reply and interest in this patch.... :)
>>
>> Sigh... since I'm not sure kernel folks could have some interests in that stuffs.
>>
>> Actually at the time once I coded EROFS I found tagged pointer had 2 main advantages:
>> 1) it saves an extra field;
>> 2) it can keep the whole stuff atomically...
>> And I observed the current kernel uses tagged pointer all around but w/o a proper wrapper...
>> and EROFS heavily uses tagged pointer... So I made a simple tagged pointer wrapper
>> to avoid meaningless magic masks and type casts in the code...
>>
>>>
>>> I think now would be a right time to promote this facility to
>>> include/linux as you initially proposed.
>>> I don't recall you got any objections. No ACKs either, but I think
>>> that was the good kind of silence (?)
>>
>> Yes, no NAK no ACK...(it seems the ordinary state for all EROFS stuffs... :'( sigh...)
>> Therefore I decided to leave it in fs/erofs/ in this series...
>>
>>>
>>> You might want to post the __fdget conversion patch [1] as a
>>> bonus patch on top of your series.
>>
>> I am not sure if another potential users could be quite happy with my ("sane?" or not)
>> implementation...
> 
> Well, let's ask potential users then.
> 
> CC kernel/trace maintainers for RB_PAGE_HEAD/RB_PAGE_UPDATE
> and kernel/locking maintainers for RT_MUTEX_HAS_WAITERS
> 
>> (Is there some use scenarios in overlayfs and fanotify?...)
> 
> We had one in overlayfs once. It is gone now.
> 
>>
>> and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
>>
>> Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
>>
> 
> Writing example conversion patches to demonstrate cleaner code
> and perhaps reduce LOC seems the best way.
> 
> Also pointing out that fixing potential bugs in one implementation is preferred
> to having to patch all copied implementations.
> 
> I wonder if tagptr_unfold_tags() doesn't need READ_ONCE() as per:
> 1be5d4fa0af3 locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
> 
> rb_list_head() doesn't have READ_ONCE()
> Nor does hlist_bl_first() and BPF_MAP_PTR().
> 
> Are those all safe due to safe call sites? or potentially broken?

...One more word (maybe not too related to this topic): I heard from
compiler folks before that compilers keep accesses to pointer types
atomic, and I personally think that makes sense for pointer types.

However, in the EROFS implementation (not in this patch) I tend to use
WRITE_ONCE / READ_ONCE in order to access the value once, and as a hint to tell
the compiler that it should be accessed only once, in case of rare broken
generated code...

I cannot trust the compiler all the time because of code optimization, since
1) I have no idea whether it will generate atomic accesses in all cases...
2) I have no idea whether the value will be accessed more than once somewhere...
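
As a rough user-space illustration of the "access once" concern (not the EROFS code): the sketch below stores and loads the whole tagged word exactly once and derives both the pointer and the tag from that one snapshot. C11 relaxed atomics are used here only as a stand-in for the kernel's WRITE_ONCE / READ_ONCE, which are not identical to them, and every name (sketch_store_once, sketch_load_once, SKETCH_TAG_MASK) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_TAG_MASK ((uintptr_t)1)	/* hypothetical 1-bit tag */

struct sketch_snapshot {
	void *ptr;
	unsigned int tag;
};

/* Publish pointer and tag together as one word, with a single store. */
static inline void sketch_store_once(_Atomic uintptr_t *slot, void *ptr,
				     unsigned int tag)
{
	/* the pointer's low bit must be free and the tag must fit in it */
	assert(((uintptr_t)ptr & SKETCH_TAG_MASK) == 0);
	assert((tag & ~SKETCH_TAG_MASK) == 0);
	atomic_store_explicit(slot, (uintptr_t)ptr | tag, memory_order_relaxed);
}

/* Load the word once; both fields are decoded from the same value. */
static inline struct sketch_snapshot sketch_load_once(_Atomic uintptr_t *slot)
{
	uintptr_t v = atomic_load_explicit(slot, memory_order_relaxed);

	return (struct sketch_snapshot){
		.ptr = (void *)(v & ~SKETCH_TAG_MASK),
		.tag = (unsigned int)(v & SKETCH_TAG_MASK),
	};
}
```

Decoding from a local copy means the pointer and the tag can never come from two different values of the slot, whatever the compiler does with the surrounding code.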

Thanks,
Gao Xiang

> 
> Thanks,
> Amir.
>
Steven Rostedt July 22, 2019, 2:40 p.m. UTC | #5
On Mon, 22 Jul 2019 09:16:22 +0300
Amir Goldstein <amir73il@gmail.com> wrote:

> CC kernel/trace maintainers for RB_PAGE_HEAD/RB_PAGE_UPDATE
> and kernel/locking maintainers for RT_MUTEX_HAS_WAITERS

Interesting.

> 
> > (Is there some use scenarios in overlayfs and fanotify?...)
> 
> We had one in overlayfs once. It is gone now.
> 
> >
> > and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
> >
> > Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
> >  
> 
> Writing example conversion patches to demonstrate cleaner code
> and perhaps reduce LOC seems the best way.

Yes, I would be more interested in seeing patches that clean up the
code than just talking about it.

> 
> Also pointing out that fixing potential bugs in one implementation is preferred
> to having to patch all copied implementations.
> 
> I wonder if tagptr_unfold_tags() doesn't need READ_ONCE() as per:
> 1be5d4fa0af3 locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
> 
> rb_list_head() doesn't have READ_ONCE()

Hmm, even if the compiler decided to reread the data, it would still
need to clear the extra bits wouldn't it? Or am I missing something?

-- Steve

> Nor does hlist_bl_first() and BPF_MAP_PTR().
> 
> Are those all safe due to safe call sites? or potentially broken?
Gao Xiang July 22, 2019, 3:33 p.m. UTC | #6
Hi Steven,

On 2019/7/22 10:40 PM, Steven Rostedt wrote:
>>> and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
>>>
>>> Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
>>>  
>> Writing example conversion patches to demonstrate cleaner code
>> and perhaps reduce LOC seems the best way.
> Yes, I would be more interested in seeing patches that clean up the
> code than just talking about it.
> 

I guess that is addressed to me, though I didn't plan to promote
a generic tagged pointer implementation in this series...

Let me try to describe what erofs ran into and my own implementation.
Assume that we have 3 tagged pointers, a, b, c, and only one
potential user (so no need for ACCESS_ONCE).

One way is

#define A_MASK		1
#define B_MASK		1
#define C_MASK		3

/* now we have 3 masks here; A, B, C are simple,
   the real names could be long... */

void *a;
void *b;
void *c;		/* and some pointers */

In order to decode the tag, we have to
	((unsigned long)a & A_MASK)

to decode the ptr, we have to
	((unsigned long)a & ~A_MASK)

In order to fold the tagged pointer...
	(void *)((unsigned long)a | tag)

You can see that the only meaning of these masks is the bit length of the tags,
but there are many masks (or we have to open-code a & 3, and
if the bit length changes, we have to fix them all)...

therefore my approach is

typedef tagptr1_t ta;	/* tagptr type a with 1-bit tag */
typedef tagptr1_t tb;	/* tagptr type b with 1-bit tag */
typedef tagptr2_t tc;	/* tagptr type c with 2-bit tag */

and ta a; tb b; tc c;

the type represents the bit length of its tags, so we can use ta, tb, tc
to avoid masks or open-coded bit lengths.

In order to decode the tag, we can
	tagptr_unfold_tags(a)

In order to decode the ptr, we can
	tagptr_unfold_ptr(a)

In order to fold the tagged pointer...
	a = tagptr_fold(ta, ptr, tag)
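
To make the idea concrete outside the kernel, here is a minimal user-space sketch, assuming a 2-bit tag. It is not the tagptr.h from this patch: the names (tagptr2_sketch_t and the tagptr2_sketch_* helpers) are hypothetical, and plain inline functions replace the type-dispatching macros. It only shows how a dedicated type keeps the mask in one place.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 2-bit tagged pointer type: the tag width
 * belongs to the type, so callers never spell out a mask. */
typedef struct { uintptr_t v; } tagptr2_sketch_t;

#define TAGPTR2_SKETCH_MASK ((uintptr_t)3)

/* Encode: the pointer must be at least 4-byte aligned, the tag must fit. */
static inline tagptr2_sketch_t tagptr2_sketch_fold(void *ptr, unsigned int tags)
{
	assert(((uintptr_t)ptr & TAGPTR2_SKETCH_MASK) == 0);
	assert((tags & ~TAGPTR2_SKETCH_MASK) == 0);
	return (tagptr2_sketch_t){ .v = (uintptr_t)ptr | tags };
}

/* Decode the pointer part by masking off the tag bits. */
static inline void *tagptr2_sketch_unfold_ptr(tagptr2_sketch_t t)
{
	return (void *)(t.v & ~TAGPTR2_SKETCH_MASK);
}

/* Decode the tag part. */
static inline unsigned int tagptr2_sketch_unfold_tags(tagptr2_sketch_t t)
{
	return (unsigned int)(t.v & TAGPTR2_SKETCH_MASK);
}
```

If the tag width ever changes, only the type definition and its mask change; call sites stay as they are.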


ACCESS_ONCE stuff is another thing... If my approach seems cleaner,
we could move it to include/linux later after the EROFS stuff is done...
Or I could switch to a better tagptr approach later, if there is one...

Thanks,
Gao Xiang
Steven Rostedt July 22, 2019, 4:35 p.m. UTC | #7
On Mon, 22 Jul 2019 23:33:53 +0800
Gao Xiang <hsiangkao@aol.com> wrote:

> Hi Steven,
> 
> On 2019/7/22 10:40 PM, Steven Rostedt wrote:
> >>> and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
> >>>
> >>> Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
> >>>    
> >> Writing example conversion patches to demonstrate cleaner code
> >> and perhaps reduce LOC seems the best way.  
> > Yes, I would be more interested in seeing patches that clean up the
> > code than just talking about it.
> >   
> 
> I guess that is related to me, though I didn't plan to promote
> a generic tagged pointer implementation in this series...

I don't expect you to either.

> 
> I try to describe what erofs met and my own implementation,
> assume that we have 3 tagged pointers, a, b, c, and one
> potential user only (no need to ACCESS_ONCE).
> 
> One way is
> 
> #define A_MASK		1
> #define B_MASK		1
> #define C_MASK		3
> 
> /* now we have 3 mask there, A, B, C is simple,
>    the real name could be long... */
> 
> void *a;
> void *b;
> void *c;		/* and some pointers */
> 
> In order to decode the tag, we have to
> 	((unsigned long)a & A_MASK)
> 
> to decode the ptr, we have to
> 	((unsigned long)a & ~A_MASK)
> 
> In order to fold the tagged pointer...
> 	(void *)((unsigned long)a | tag)

And you need a way to clear the flag.

> 
> You can see the only meaning of these masks is the bitlength of tags,
> but there are many masks (or we have to do open-coded a & 3,
> if bitlength is changed, we have to fix them all)...
> 
> therefore my approach is
> 
> typedef tagptr1_t ta;	/* tagptr type a with 1-bit tag */
> typedef tagptr1_t tb;	/* tagptr type b with 1-bit tag */
> typedef tagptr2_t tc;	/* tagptr type c with 2-bit tag */
> 
> and ta a; tb b; tc c;
> 
> the type will represent its bitlength of tags and we can use ta, tb, tc
> to avoid masks or open-coded bitlength.
> 
> In order to decode the tag, we can
> 	tagptr_unfold_tags(a)
> 
> In order to decode the ptr, we can
> 	tagptr_unfold_ptr(a)
> 
> In order to fold the tagged pointer...
> 	a = tagptr_fold(ta, ptr, tag)
> 
> 
> ACCESS_ONCE stuff is another thing... If my approach seems cleaner,
> we could move to include/linux later after EROFS stuffs is done...
> Or I could use a better tagptr approach later if any...

Looking at the ring buffer code, it may be a bit too complex to try to
use a generic infrastructure. Look at rb_head_page_set(), where it does
a cmpxchg to set or clear the flags and then tests the previous flags
to know what actions need to be done.

The ring buffer tag code was added in 2009, the rtmutex tag code was
added in 2006. It's been 10 years before we needed another tag
operation. I'm not sure we benefit from making this generic.

-- Steve
Gao Xiang July 22, 2019, 4:52 p.m. UTC | #8
On 2019/7/23 12:35 AM, Steven Rostedt wrote:
> On Mon, 22 Jul 2019 23:33:53 +0800
> Gao Xiang <hsiangkao@aol.com> wrote:
> 
>> Hi Steven,
>>
>> On 2019/7/22 10:40 PM, Steven Rostedt wrote:
>>>>> and I'm not sure Al could accept __fdget conversion (I just wanted to give a example then...)
>>>>>
>>>>> Therefore, I tend to keep silence and just promote EROFS... some better ideas?...
>>>>>    
>>>> Writing example conversion patches to demonstrate cleaner code
>>>> and perhaps reduce LOC seems the best way.  
>>> Yes, I would be more interested in seeing patches that clean up the
>>> code than just talking about it.
>>>   
>>
>> I guess that is related to me, though I didn't plan to promote
>> a generic tagged pointer implementation in this series...
> 
> I don't expect you to either.

Beyond my expectation, I think I will (could) learn something new
from this topic, thanks to you and Amir :)

> 
>>
>> I try to describe what erofs met and my own implementation,
>> assume that we have 3 tagged pointers, a, b, c, and one
>> potential user only (no need to ACCESS_ONCE).
>>
>> One way is
>>
>> #define A_MASK		1
>> #define B_MASK		1
>> #define C_MASK		3
>>
>> /* now we have 3 mask there, A, B, C is simple,
>>    the real name could be long... */
>>
>> void *a;
>> void *b;
>> void *c;		/* and some pointers */
>>
>> In order to decode the tag, we have to
>> 	((unsigned long)a & A_MASK)
>>
>> to decode the ptr, we have to
>> 	((unsigned long)a & ~A_MASK)
>>
>> In order to fold the tagged pointer...
>> 	(void *)((unsigned long)a | tag)
> 
> And you need a way to clear the flag.

Considering a single potential user, we could refold the tagged pointer,
or we could refold the tagged pointer and update the value atomically
(like atomic_t does).

a = tagptr_fold(ta, tagptr_unfold_ptr(a), tag);
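
Clearing or changing a flag by refolding amounts to keeping the pointer bits and installing a new tag value. A minimal non-atomic user-space sketch, assuming a 1-bit tag and a single user; tagptr1_sketch_t and tagptr1_sketch_replace_tags are hypothetical names, not part of this patch:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uintptr_t v; } tagptr1_sketch_t;	/* hypothetical 1-bit type */

#define TAGPTR1_SKETCH_MASK ((uintptr_t)1)

/* Keep the pointer bits, drop the old tag, install the new one. */
static inline tagptr1_sketch_t tagptr1_sketch_replace_tags(tagptr1_sketch_t t,
							    unsigned int tags)
{
	assert((tags & ~TAGPTR1_SKETCH_MASK) == 0);
	return (tagptr1_sketch_t){ .v = (t.v & ~TAGPTR1_SKETCH_MASK) | tags };
}
```

Passing 0 as the new tag clears the flag, and passing 1 sets it again. With more than one user the read-modify-write would have to become an atomic update instead.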

> 
>>
>> You can see the only meaning of these masks is the bitlength of tags,
>> but there are many masks (or we have to do open-coded a & 3,
>> if bitlength is changed, we have to fix them all)...
>>
>> therefore my approach is
>>
>> typedef tagptr1_t ta;	/* tagptr type a with 1-bit tag */
>> typedef tagptr1_t tb;	/* tagptr type b with 1-bit tag */
>> typedef tagptr2_t tc;	/* tagptr type c with 2-bit tag */
>>
>> and ta a; tb b; tc c;
>>
>> the type will represent its bitlength of tags and we can use ta, tb, tc
>> to avoid masks or open-coded bitlength.
>>
>> In order to decode the tag, we can
>> 	tagptr_unfold_tags(a)
>>
>> In order to decode the ptr, we can
>> 	tagptr_unfold_ptr(a)
>>
>> In order to fold the tagged pointer...
>> 	a = tagptr_fold(ta, ptr, tag)
>>
>>
>> ACCESS_ONCE stuff is another thing... If my approach seems cleaner,
>> we could move to include/linux later after EROFS stuffs is done...
>> Or I could use a better tagptr approach later if any...
> 
> Looking at the ring buffer code, it may be a bit too complex to try to
> use a generic infrastructure. Look at rb_head_page_set(), where it does
> a cmpxchg to set or clear the flags and then tests the previous flags
> to know what actions need to be done.

The current code supports cmpxchg as well, but I haven't looked into
rb_head_page_set... (although I think that is not the critical point if we
decide on some generic tagged pointer approach...)
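
For reference, the general pattern being discussed (a compare-and-swap that changes the flags and then inspects the previous flags) can be sketched in user space as below. This is not the ring buffer's rb_head_page_set() and not tagptr_cmpxchg() from this patch: it uses C11 atomics instead of the kernel's cmpxchg(), and sketch_tag_cmpxchg and SKETCH_FLAG_MASK are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_FLAG_MASK ((uintptr_t)3)	/* hypothetical 2-bit flags */

/*
 * Try to switch the flags from old_tag to new_tag while the slot still
 * points at ptr. Returns the flags found in the slot before the attempt
 * (equal to old_tag on success), or -1 if the slot held another pointer.
 */
static inline int sketch_tag_cmpxchg(_Atomic uintptr_t *slot, void *ptr,
				     unsigned int old_tag, unsigned int new_tag)
{
	uintptr_t base = (uintptr_t)ptr;
	uintptr_t expected = base | old_tag;

	assert((base & SKETCH_FLAG_MASK) == 0);
	assert(((old_tag | new_tag) & ~SKETCH_FLAG_MASK) == 0);

	/* on failure, expected is overwritten with the value actually found */
	atomic_compare_exchange_strong(slot, &expected, base | new_tag);

	if ((expected & ~SKETCH_FLAG_MASK) != base)
		return -1;
	return (int)(expected & SKETCH_FLAG_MASK);
}
```

The caller compares the return value with old_tag to learn whether the swap happened, and otherwise decides what to do based on the flags it found.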

> 
> The ring buffer tag code was added in 2009, the rtmutex tag code was
> added in 2006. It's been 10 years before we needed another tag
> operation. I'm not sure we benefit from making this generic.

Okay, that is up to you folks, actually...


Thanks,
Gao Xiang

> 
> -- Steve
>

Patch

diff --git a/fs/erofs/tagptr.h b/fs/erofs/tagptr.h
new file mode 100644
index 000000000000..121403cff2a3
--- /dev/null
+++ b/fs/erofs/tagptr.h
@@ -0,0 +1,110 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Tagged pointer implementation
+ *
+ * Copyright (C) 2018 Gao Xiang <gaoxiang25@huawei.com>
+ */
+#ifndef __EROFS_TAGPTR_H
+#define __EROFS_TAGPTR_H
+
+#include <linux/types.h>
+#include <linux/build_bug.h>
+
+/*
+ * the names of tagged pointer types are tagptr{1, 2, 3...}_t
+ * avoid directly using the internal structs __tagptr{1, 2, 3...}
+ */
+#define __MAKE_TAGPTR(n) \
+typedef struct __tagptr##n {	\
+	uintptr_t v;	\
+} tagptr##n##_t;
+
+__MAKE_TAGPTR(1)
+__MAKE_TAGPTR(2)
+__MAKE_TAGPTR(3)
+__MAKE_TAGPTR(4)
+
+#undef __MAKE_TAGPTR
+
+extern void __compiletime_error("bad tagptr tags")
+	__bad_tagptr_tags(void);
+
+extern void __compiletime_error("bad tagptr type")
+	__bad_tagptr_type(void);
+
+/* fix the broken usage of "#define tagptr2_t tagptr3_t" by users */
+#define __tagptr_mask_1(ptr, n)	\
+	__builtin_types_compatible_p(typeof(ptr), struct __tagptr##n) ? \
+		(1UL << (n)) - 1 :
+
+#define __tagptr_mask(ptr)	(\
+	__tagptr_mask_1(ptr, 1) ( \
+	__tagptr_mask_1(ptr, 2) ( \
+	__tagptr_mask_1(ptr, 3) ( \
+	__tagptr_mask_1(ptr, 4) ( \
+	__bad_tagptr_type(), 0)))))
+
+/* generate a tagged pointer from a raw value */
+#define tagptr_init(type, val) \
+	((typeof(type)){ .v = (uintptr_t)(val) })
+
+/*
+ * directly cast a tagged pointer to the native pointer type, which
+ * could be used for backward compatibility of existing code.
+ */
+#define tagptr_cast_ptr(tptr) ((void *)(tptr).v)
+
+/* encode tagged pointers */
+#define tagptr_fold(type, ptr, _tags) ({ \
+	const typeof(_tags) tags = (_tags); \
+	if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(type))) \
+		__bad_tagptr_tags(); \
+tagptr_init(type, (uintptr_t)(ptr) | tags); })
+
+/* decode tagged pointers */
+#define tagptr_unfold_ptr(tptr) \
+	((void *)((tptr).v & ~__tagptr_mask(tptr)))
+
+#define tagptr_unfold_tags(tptr) \
+	((tptr).v & __tagptr_mask(tptr))
+
+/* operations for the tagged pointer */
+#define tagptr_eq(_tptr1, _tptr2) ({ \
+	typeof(_tptr1) tptr1 = (_tptr1); \
+	typeof(_tptr2) tptr2 = (_tptr2); \
+	(void)(&tptr1 == &tptr2); \
+(tptr1).v == (tptr2).v; })
+
+/* lock-free CAS operation */
+#define tagptr_cmpxchg(_ptptr, _o, _n) ({ \
+	typeof(_ptptr) ptptr = (_ptptr); \
+	typeof(_o) o = (_o); \
+	typeof(_n) n = (_n); \
+	(void)(&o == &n); \
+	(void)(&o == ptptr); \
+tagptr_init(o, cmpxchg(&ptptr->v, o.v, n.v)); })
+
+/* wrap WRITE_ONCE if atomic update is needed */
+#define tagptr_replace_tags(_ptptr, tags) ({ \
+	typeof(_ptptr) ptptr = (_ptptr); \
+	*ptptr = tagptr_fold(*ptptr, tagptr_unfold_ptr(*ptptr), tags); \
+*ptptr; })
+
+#define tagptr_set_tags(_ptptr, _tags) ({ \
+	typeof(_ptptr) ptptr = (_ptptr); \
+	const typeof(_tags) tags = (_tags); \
+	if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(*ptptr))) \
+		__bad_tagptr_tags(); \
+	ptptr->v |= tags; \
+*ptptr; })
+
+#define tagptr_clear_tags(_ptptr, _tags) ({ \
+	typeof(_ptptr) ptptr = (_ptptr); \
+	const typeof(_tags) tags = (_tags); \
+	if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(*ptptr))) \
+		__bad_tagptr_tags(); \
+	ptptr->v &= ~tags; \
+*ptptr; })
+
+#endif
+