diff mbox series

[1/2] pipe: change pipe_write() to never add a zero-sized buffer

Message ID 20250209150749.GA16999@redhat.com (mailing list archive)
State New
Headers show
Series pipe: change pipe_write() to never add a zero-sized buffer | expand

Commit Message

Oleg Nesterov Feb. 9, 2025, 3:07 p.m. UTC
a194dfe6e6f6 ("pipe: Rearrange sequence in pipe_write() to preallocate slot")
changed pipe_write() to increment pipe->head in advance.  IIUC to avoid the
race with the post_one_notification()-like code which can add another buffer
under pipe->rd_wait.lock without pipe->mutex.

This is no longer necessary after c73be61cede5 ("pipe: Add general notification
queue support"), pipe_write() checks pipe_has_watch_queue() and returns -EXDEV
at the start. And can't help in any case, pipe_write() no longer takes this
rd_wait.lock spinlock.

Change pipe_write() to call copy_page_from_iter() first and do nothing if it
fails. This way pipe_write() can't add a zero-sized buffer and we can simplify
pipe_read() which currently has to take care of this very unlikely case.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/pipe.c                 | 47 +++++++++------------------------------
 include/linux/pipe_fs_i.h |  6 +++++
 2 files changed, 16 insertions(+), 37 deletions(-)

Comments

Linus Torvalds Feb. 9, 2025, 5:26 p.m. UTC | #1
On Sun, 9 Feb 2025 at 07:08, Oleg Nesterov <oleg@redhat.com> wrote:
>
> This is no longer necessary after c73be61cede5 ("pipe: Add general notification
> queue support"), pipe_write() checks pipe_has_watch_queue() and returns -EXDEV
> at the start. And can't help in any case, pipe_write() no longer takes this
> rd_wait.lock spinlock.

Ack. This code all goes back to the two-level locking thing with
notifications using just the spinlock side.

The locking was removed from this code in commit dfaabf916b1c
("fs/pipe: remove unnecessary spinlock from pipe_write()"), but the
"pre-allocate" logic remained.

This patch seems to be the right thing to do and removes the vestiges
of the old model.

But I don't think you need that pipe_buf_assert_len() thing.  And if
you do, please don't make it a pointless inline helper that only hides
what it does.

             Linus
Oleg Nesterov Feb. 9, 2025, 6:02 p.m. UTC | #2
On 02/09, Linus Torvalds wrote:
>
> This patch seems to be the right thing to do and removes the vestiges
> of the old model.

OK, thanks.

> But I don't think you need that pipe_buf_assert_len() thing.

Well, I'd prefer to keep this WARN_ON_ONCE() for some time... If
nobody hits this warning we can kill eat_empty_buffer() and more
hopefully dead checks, for example

	/* zero-length bvecs are not supported, skip them */
	if (!this_len)
		continue;

in iter_file_splice_write().

> And if
> you do, please don't make it a pointless inline helper that only hides
> what it does.

Could you explain what do you think should I do if I keep this check?
make pipe_buf_assert_len() return void? or just replace it with
WARN_ON_ONCE(!buf->len) in its callers?

Oleg.
Linus Torvalds Feb. 9, 2025, 6:17 p.m. UTC | #3
On Sun, 9 Feb 2025 at 10:02, Oleg Nesterov <oleg@redhat.com> wrote:
>
> Could you explain what do you think should I do if I keep this check?
> make pipe_buf_assert_len() return void? or just replace it with
> WARN_ON_ONCE(!buf->len) in its callers?

Just replace it with WARN_ON_ONCE() in any place where you really
think it's needed.

And honestly, if you worry so much about it, just allow the zero-sized
case. I don't see why you want to special-case it in the first place.

Yes, the zero sized buffer *used* to be a special case, but it was a
special case for writes.

And yes, splice wants to actually wait for "do we have real data" and
has that eat_empty_buffer() case, but just *keep* it.

Keeping that check is *better* and clearer than adding some pointless warning.

IOW, why warn for a case that isn't a problem, and you're only making
it a problem by thinking it is?

              Linus
Oleg Nesterov Feb. 9, 2025, 6:44 p.m. UTC | #4
On 02/09, Linus Torvalds wrote:
>
> On Sun, 9 Feb 2025 at 10:02, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Could you explain what do you think should I do if I keep this check?
> > make pipe_buf_assert_len() return void? or just replace it with
> > WARN_ON_ONCE(!buf->len) in its callers?
>
> Just replace it with WARN_ON_ONCE() in any place where you really
> think it's needed.

OK, will do.

> IOW, why warn for a case that isn't a problem, and you're only making
> it a problem by thinking it is?

Again, lets look eat_empty_buffer().

The comment says "maybe it's empty" but how/why can this happen ?

The changelog for d1a819a2ec2d3 ("splice: teach splice pipe reading
about empty pipe buffers") says "you can trigger it by doing a write
to a pipe that fails" but if someone looks at anon_pipe_write() after
1/2 this case is not possible.

And if eat_empty_buffer() flushes the buffer and updates pipe->tail,
why doesn't it wake the writers?

WARN_ON_ONCE() makes it clear that we do not expect !buf->len == 0,
and the kernel will complain if it does happen.

So unless you have a strong opinion, I'd prefer to keep it for now.

Oleg.
Linus Torvalds Feb. 9, 2025, 6:52 p.m. UTC | #5
On Sun, 9 Feb 2025 at 10:45, Oleg Nesterov <oleg@redhat.com> wrote:
>
> Again, lets look eat_empty_buffer().
>
> The comment says "maybe it's empty" but how/why can this happen ?

WHY DO YOU CARE?

Even if it's just defensive programming, it's just plain good code,
when the rule is "the caller is waiting for data".

And if it means that you don't need to add random WARN_ON_ONCE() statements.

So here's the deal: either you

 (a) convince yourself that the length really can never be true, and
you write a comment about why that is the case.

or

 (b) you DON'T convince yourself that that is true, and you leave
eat_empty_buffer() alone.

and both of those situations are fine.

But adding a random sprinkling of WARN_ON_ONCE() statements is worse
than *either* of those two cases.

See what my argument is? Adding a WARN_ON is just making the code
worse. Don't do it. It's the worst of both worlds: it adds code to
generate, and it adds confusion to readers ("why are we warning?" or
"when can this happen").

In contrast, the "eat_empty_buffer()" case just saying "if it's an
empty buffer, it doesn't satisfy my needs, so I'll just release the
empty buffer and go on".

That's simple and straightforward. Easy to maintain, and more
efficient than your WARN_ONs.

And if you can convince yourself it's not needed, you remove it.
EXACTLY LIKE THE WARN_ON.

So there's simply never any upside to the WARN_ON.

               Linus
Oleg Nesterov Feb. 9, 2025, 7:15 p.m. UTC | #6
On 02/09, Linus Torvalds wrote:
>
> On Sun, 9 Feb 2025 at 10:45, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Again, lets look eat_empty_buffer().
> >
> > The comment says "maybe it's empty" but how/why can this happen ?
>
> WHY DO YOU CARE?

Because it looks unclear/confusing, and I think it can confuse other
readers of this code. Especially after 1/2.

> So here's the deal: either you
...
>  (b) you DON'T convince yourself that that is true, and you leave
> eat_empty_buffer() alone.

Yes, I failed to convince myself that fs/splice.c can never add an
empty bufer. Although it seems to me it should not.

> In contrast, the "eat_empty_buffer()" case just saying "if it's an
> empty buffer, it doesn't satisfy my needs, so I'll just release the
> empty buffer and go on".

... without wakeup_pipe_writers().

OK, nevermind, I see your point even if I do not 100% agree.

I'll send v2 without WARN_ON() and without 2/2.

Oleg.
K Prateek Nayak Feb. 9, 2025, 11:39 p.m. UTC | #7
Hello Oleg,

On 2/10/2025 12:45 AM, Oleg Nesterov wrote:
> On 02/09, Linus Torvalds wrote:
>>
>> On Sun, 9 Feb 2025 at 10:45, Oleg Nesterov <oleg@redhat.com> wrote:
>>>
>>> Again, lets look eat_empty_buffer().
>>>
>>> The comment says "maybe it's empty" but how/why can this happen ?
>>
>> WHY DO YOU CARE?
> 
> Because it looks unclear/confusing, and I think it can confuse other
> readers of this code. Especially after 1/2.
> 
>> So here's the deal: either you
> ...
>>   (b) you DON'T convince yourself that that is true, and you leave
>> eat_empty_buffer() alone.
> 
> Yes, I failed to convince myself that fs/splice.c can never add an
> empty bufer. Although it seems to me it should not.
> 
>> In contrast, the "eat_empty_buffer()" case just saying "if it's an
>> empty buffer, it doesn't satisfy my needs, so I'll just release the
>> empty buffer and go on".
> 
> ... without wakeup_pipe_writers().
> 
> OK, nevermind, I see your point even if I do not 100% agree.
> 
> I'll send v2 without WARN_ON() and without 2/2.

Went ahead and tested that version on top of mainline with your
previous change to skip updating {a,c,m}time, here are the results:

==================================================================
Test          : sched-messaging
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:       mainline + no-acm_time[pct imp](CV)   patched[pct imp](CV)
  1-groups     1.00 [ -0.00]( 7.19)                0.95 [  4.90](12.39)
  2-groups     1.00 [ -0.00]( 3.54)                1.02 [ -1.92]( 6.55)
  4-groups     1.00 [ -0.00]( 2.78)                1.01 [ -0.85]( 2.18)
  8-groups     1.00 [ -0.00]( 1.04)                0.99 [  0.63]( 0.77)
16-groups     1.00 [ -0.00]( 1.02)                1.00 [ -0.26]( 0.98)

I don't see any regression / improvements from a performance standpoint
on my 3rd Generation EPYC system (2 x 64C/128T. boost on, C2 disabled)
Feel free to include:

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

> 
> Oleg.
>
Oleg Nesterov Feb. 10, 2025, 5:22 p.m. UTC | #8
Hi Prateek,

On 02/10, K Prateek Nayak wrote:
>
>  1-groups     1.00 [ -0.00]( 7.19)                0.95 [  4.90](12.39)
>  2-groups     1.00 [ -0.00]( 3.54)                1.02 [ -1.92]( 6.55)
>  4-groups     1.00 [ -0.00]( 2.78)                1.01 [ -0.85]( 2.18)
>  8-groups     1.00 [ -0.00]( 1.04)                0.99 [  0.63]( 0.77)
> 16-groups     1.00 [ -0.00]( 1.02)                1.00 [ -0.26]( 0.98)
>
> I don't see any regression / improvements from a performance standpoint

Yes, this patch shouldn't make any difference performance-wise, at least
in this case. Although I was thinking the same thing when I sent "pipe_read:
don't wake up the writer if the pipe is still full" ;)

> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

Thanks! Please see v2, I've included you tag.

Any chance you can also test the patch below?

To me it looks like a cleanup which makes the "merge small writes" logic
more understandable. And note that "page-align the rest of the writes"
doesn't work anyway if "total_len & (PAGE_SIZE-1)" can't fit in the last
buffer.

However, in this particular case with DATASIZE=100 this patch can increase
the number of copy_page_from_iter()'s in pipe_write(). And with this change
receiver() can certainly get the short reads, so this can increase the
number of sys_read() calls.

So I am just curious if this change can cause any noticeable regression on
your machine.

Thank you!

Oleg.

--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -459,16 +459,16 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
 	was_empty = pipe_empty(head, pipe->tail);
 	chars = total_len & (PAGE_SIZE-1);
 	if (chars && !was_empty) {
-		unsigned int mask = pipe->ring_size - 1;
-		struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask];
+		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
 		int offset = buf->offset + buf->len;
+		int avail = PAGE_SIZE - offset;
 
-		if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
-		    offset + chars <= PAGE_SIZE) {
+		if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
 			ret = pipe_buf_confirm(pipe, buf);
 			if (ret)
 				goto out;
 
+			chars = min_t(ssize_t, chars, avail);
 			ret = copy_page_from_iter(buf->page, offset, chars, from);
 			if (unlikely(ret < chars)) {
 				ret = -EFAULT;
Linus Torvalds Feb. 10, 2025, 5:37 p.m. UTC | #9
On Mon, 10 Feb 2025 at 09:22, Oleg Nesterov <oleg@redhat.com> wrote:
>
> +               int avail = PAGE_SIZE - offset;
>
> -               if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
> -                   offset + chars <= PAGE_SIZE) {
> +               if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
>                         ret = pipe_buf_confirm(pipe, buf);
>                         if (ret)
>                                 goto out;
>
> +                       chars = min_t(ssize_t, chars, avail);

If I read this correctly, this patch is horribly broken.

You can't do partial writes. Pipes have one very core atomicity
guarantee: from the man-pages:

   PIPE_BUF
       POSIX.1 says that writes of less than PIPE_BUF bytes must be
       atomic: the output data is written to the pipe as a contiguous
       sequence.  Writes of more than PIPE_BUF bytes may be nonatomic:
       the kernel may interleave the data with data written by other
       processes.  POSIX.1 requires PIPE_BUF to be at least 512 bytes.
       (On Linux, PIPE_BUF is 4096 bytes.)

IOW, that whole "try to write as many chars as there is room" is very
very broken. You have to write all or nothing.

So you can't (say) first write just 50 bytes of a 100-byte pipe write
because it fits in the last buffer, and then wait for another buffer
to become free to write the rest. Not just because another writer
might come in and start mixing in data, but because *readers* may well
expect to get 100 bytes or nothing.

And to make matters worse, you'll never notice the bug until something
breaks very subtly (unless we happen to have a good test-case for this
somewhere - there might be a test for this in LTP).

And yes, this is actually something I know for a fact that people
depend on. Lots of traditional UNIX "send packet commands over pipes"
programs around, which expect the packets to be atomic.

So things *will* break, but it might take a while before you hit just
the right race condition for things to go south, and the errors migth
end up being very non-obvious indeed.

Note that the initial

        chars = total_len & (PAGE_SIZE-1);

before the whole test for "can we merge" is fine, because if total_len
is larger than a page, it's no longer a write we need to worry about
atomicity with.

Maybe we should add a comment somewhere about this.

          Linus
Oleg Nesterov Feb. 10, 2025, 6:36 p.m. UTC | #10
On 02/10, Linus Torvalds wrote:
>
> On Mon, 10 Feb 2025 at 09:22, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > +               int avail = PAGE_SIZE - offset;
> >
> > -               if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
> > -                   offset + chars <= PAGE_SIZE) {
> > +               if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
> >                         ret = pipe_buf_confirm(pipe, buf);
> >                         if (ret)
> >                                 goto out;
> >
> > +                       chars = min_t(ssize_t, chars, avail);
>
> If I read this correctly, this patch is horribly broken.
>
> You can't do partial writes. Pipes have one very core atomicity
> guarantee: from the man-pages:
>
>    PIPE_BUF
>        POSIX.1 says that writes of less than PIPE_BUF bytes must be
>        atomic:

Ah, I didn't know!

Thanks for your explanation and the quick NACK!

> Maybe we should add a comment somewhere about this.

Agreed. It would certainly help the ignorant readers like me ;)

So I guess that the "goto again" logic in sender/receiver in
tools/perf/bench/sched-messaging.c is currently pointless,
DATASIZE == 100 < PIPE_BUF.

Thanks.

Oleg.
K Prateek Nayak Feb. 11, 2025, 3:59 a.m. UTC | #11
Hello Oleg,

On 2/10/2025 10:52 PM, Oleg Nesterov wrote:
> Hi Prateek,
> 
> On 02/10, K Prateek Nayak wrote:
>>
>>   1-groups     1.00 [ -0.00]( 7.19)                0.95 [  4.90](12.39)
>>   2-groups     1.00 [ -0.00]( 3.54)                1.02 [ -1.92]( 6.55)
>>   4-groups     1.00 [ -0.00]( 2.78)                1.01 [ -0.85]( 2.18)
>>   8-groups     1.00 [ -0.00]( 1.04)                0.99 [  0.63]( 0.77)
>> 16-groups     1.00 [ -0.00]( 1.02)                1.00 [ -0.26]( 0.98)
>>
>> I don't see any regression / improvements from a performance standpoint
> 
> Yes, this patch shouldn't make any difference performance-wise, at least
> in this case. Although I was thinking the same thing when I sent "pipe_read:
> don't wake up the writer if the pipe is still full" ;)
> 
>> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> 
> Thanks! Please see v2, I've included you tag.

Thank you. I can confirm it is same as the variant I tested.

> 
> Any chance you can also test the patch below?
> 
> To me it looks like a cleanup which makes the "merge small writes" logic
> more understandable. And note that "page-align the rest of the writes"
> doesn't work anyway if "total_len & (PAGE_SIZE-1)" can't fit in the last
> buffer.
> 
> However, in this particular case with DATASIZE=100 this patch can increase
> the number of copy_page_from_iter()'s in pipe_write(). And with this change
> receiver() can certainly get the short reads, so this can increase the
> number of sys_read() calls.
> 
> So I am just curious if this change can cause any noticeable regression on
> your machine.

For the sake of science:

==================================================================
Test          : sched-messaging
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:         baseline[pct imp](CV)  merge_writes[pct imp](CV)
  1-groups     1.00 [ -0.00](12.39)     1.08 [ -7.62](11.73)
  2-groups     1.00 [ -0.00]( 6.55)     0.97 [  2.52]( 3.01)
  4-groups     1.00 [ -0.00]( 2.18)     1.00 [  0.42]( 1.97)
  8-groups     1.00 [ -0.00]( 0.77)     1.03 [ -3.35]( 5.07)
16-groups     1.00 [ -0.00]( 0.98)     1.01 [ -1.37]( 2.20)

I see some improvements up until 4 groups (160 tasks) but beyond that it
goes into a slight regression territory but the variance is large to
draw any conclusions.

Science experiment concluded.

> 
> Thank you!
> 
> Oleg.
> 
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -459,16 +459,16 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
>   	was_empty = pipe_empty(head, pipe->tail);
>   	chars = total_len & (PAGE_SIZE-1);
>   	if (chars && !was_empty) {
> -		unsigned int mask = pipe->ring_size - 1;
> -		struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask];
> +		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
>   		int offset = buf->offset + buf->len;
> +		int avail = PAGE_SIZE - offset;
>   
> -		if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
> -		    offset + chars <= PAGE_SIZE) {
> +		if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
>   			ret = pipe_buf_confirm(pipe, buf);
>   			if (ret)
>   				goto out;
>   
> +			chars = min_t(ssize_t, chars, avail);
>   			ret = copy_page_from_iter(buf->page, offset, chars, from);
>   			if (unlikely(ret < chars)) {
>   				ret = -EFAULT;
>
diff mbox series

Patch

diff --git a/fs/pipe.c b/fs/pipe.c
index 2ae75adfba64..7f24d707f6a1 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -303,7 +303,7 @@  anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
 
 		if (!pipe_empty(head, tail)) {
 			struct pipe_buffer *buf = &pipe->bufs[tail & mask];
-			size_t chars = buf->len;
+			size_t chars = pipe_buf_assert_len(buf);
 			size_t written;
 			int error;
 
@@ -360,29 +360,9 @@  anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
 			break;
 		}
 		mutex_unlock(&pipe->mutex);
-
 		/*
 		 * We only get here if we didn't actually read anything.
 		 *
-		 * However, we could have seen (and removed) a zero-sized
-		 * pipe buffer, and might have made space in the buffers
-		 * that way.
-		 *
-		 * You can't make zero-sized pipe buffers by doing an empty
-		 * write (not even in packet mode), but they can happen if
-		 * the writer gets an EFAULT when trying to fill a buffer
-		 * that already got allocated and inserted in the buffer
-		 * array.
-		 *
-		 * So we still need to wake up any pending writers in the
-		 * _very_ unlikely case that the pipe was full, but we got
-		 * no data.
-		 */
-		if (unlikely(wake_writer))
-			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
-		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
-
-		/*
 		 * But because we didn't read anything, at this point we can
 		 * just return directly with -ERESTARTSYS if we're interrupted,
 		 * since we've done any required wakeups and there's no need
@@ -391,7 +371,6 @@  anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
 		if (wait_event_interruptible_exclusive(pipe->rd_wait, pipe_readable(pipe)) < 0)
 			return -ERESTARTSYS;
 
-		wake_writer = false;
 		wake_next_reader = true;
 		mutex_lock(&pipe->mutex);
 	}
@@ -526,33 +505,27 @@  anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
 				pipe->tmp_page = page;
 			}
 
-			/* Allocate a slot in the ring in advance and attach an
-			 * empty buffer.  If we fault or otherwise fail to use
-			 * it, either the reader will consume it or it'll still
-			 * be there for the next write.
-			 */
-			pipe->head = head + 1;
+			copied = copy_page_from_iter(page, 0, PAGE_SIZE, from);
+			if (unlikely(copied < PAGE_SIZE && iov_iter_count(from))) {
+				if (!ret)
+					ret = -EFAULT;
+				break;
+			}
 
+			pipe->head = head + 1;
+			pipe->tmp_page = NULL;
 			/* Insert it into the buffer array */
 			buf = &pipe->bufs[head & mask];
 			buf->page = page;
 			buf->ops = &anon_pipe_buf_ops;
 			buf->offset = 0;
-			buf->len = 0;
 			if (is_packetized(filp))
 				buf->flags = PIPE_BUF_FLAG_PACKET;
 			else
 				buf->flags = PIPE_BUF_FLAG_CAN_MERGE;
-			pipe->tmp_page = NULL;
 
-			copied = copy_page_from_iter(page, 0, PAGE_SIZE, from);
-			if (unlikely(copied < PAGE_SIZE && iov_iter_count(from))) {
-				if (!ret)
-					ret = -EFAULT;
-				break;
-			}
-			ret += copied;
 			buf->len = copied;
+			ret += copied;
 
 			if (!iov_iter_count(from))
 				break;
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 8ff23bf5a819..4174429a3e0e 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -31,6 +31,12 @@  struct pipe_buffer {
 	unsigned long private;
 };
 
+static inline unsigned int pipe_buf_assert_len(struct pipe_buffer *buf)
+{
+	WARN_ON_ONCE(!buf->len);
+	return buf->len;
+}
+
 /**
  *	struct pipe_inode_info - a linux kernel pipe
  *	@mutex: mutex protecting the whole thing