
eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0

Message ID tencent_7588DFD1F365950A757310D764517A14B306@qq.com (mailing list archive)
State New, archived
Series eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0

Commit Message

Wen Yang July 9, 2023, 6:54 a.m. UTC
From: Wen Yang <wenyang.linux@foxmail.com>

For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.

Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dylan Yudaken <dylany@fb.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 fs/eventfd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox (Oracle) July 9, 2023, 8:05 p.m. UTC | #1
On Sun, Jul 09, 2023 at 09:13:28PM +0200, Markus Elfring wrote:
> > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> Please choose an imperative change suggestion.

Markus, stop this nitpicking.  It puts off contributors.  The changelog
is perfectly clear.
Christian Brauner July 10, 2023, 2:12 p.m. UTC | #2
On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> From: Wen Yang <wenyang.linux@foxmail.com>
> 
> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dylan Yudaken <dylany@fb.com>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---

So this looks ok but I would like to see an analysis of how the overflow
can happen. I'm looking at the callers and it seems that once ctx->count
hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
there a caller that can directly or indirectly call
eventfd_ctx_do_read() on a ctx->count == 0?

I'm just slightly skeptical about patches that fix issues without an
analysis of how this can happen.

>  fs/eventfd.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index 8aa36cd37351..10a101df19cd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>  {
>  	lockdep_assert_held(&ctx->wqh.lock);
>  
> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>  	ctx->count -= *cnt;
>  }
>  EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>  		return -EFAULT;
>  	if (ucnt == ULLONG_MAX)
>  		return -EINVAL;
> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
> +		return -EINVAL;
>  	spin_lock_irq(&ctx->wqh.lock);
>  	res = -EAGAIN;
>  	if (ULLONG_MAX - ctx->count > ucnt)
> -- 
> 2.25.1
>
Wen Yang July 10, 2023, 3:02 p.m. UTC | #3
On 2023/7/10 22:12, Christian Brauner wrote:
> On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
>> From: Wen Yang <wenyang.linux@foxmail.com>
>>
>> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
>> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
>>
>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Christian Brauner <brauner@kernel.org>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dylan Yudaken <dylany@fb.com>
>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: linux-fsdevel@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> ---
> So this looks ok but I would like to see an analysis how the overflow
> can happen. I'm looking at the callers and it seems that once ctx->count
> hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> there a caller that can call directly or indirectly
> eventfd_ctx_do_read() on a ctx->count == 0?
eventfd_read() ensures that ctx->count is not 0 before calling
eventfd_ctx_do_read(), and that is correct.

But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
eventfd_ctx_do_read() unconditionally, as it may not only cause
ctx->count to overflow, but also unnecessarily call wake_up_locked_poll().


I am sorry for only adding the following tag to the patch without that
analysis:
Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")


Looking forward to your suggestions.

--

Best wishes,

Wen


> I'm just slightly skeptical about patches that fix issues without an
> analysis how this can happen.
>
>>   fs/eventfd.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>> index 8aa36cd37351..10a101df19cd 100644
>> --- a/fs/eventfd.c
>> +++ b/fs/eventfd.c
>> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>>   {
>>   	lockdep_assert_held(&ctx->wqh.lock);
>>   
>> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>>   	ctx->count -= *cnt;
>>   }
>>   EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
>> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>>   		return -EFAULT;
>>   	if (ucnt == ULLONG_MAX)
>>   		return -EINVAL;
>> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
>> +		return -EINVAL;
>>   	spin_lock_irq(&ctx->wqh.lock);
>>   	res = -EAGAIN;
>>   	if (ULLONG_MAX - ctx->count > ucnt)
>> -- 
>> 2.25.1
>>
Christian Brauner July 11, 2023, 9:07 a.m. UTC | #4
On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
> 
> On 2023/7/10 22:12, Christian Brauner wrote:
> > On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> > > From: Wen Yang <wenyang.linux@foxmail.com>
> > > 
> > > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> > > 
> > > Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> > > Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Jens Axboe <axboe@kernel.dk>
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Dylan Yudaken <dylany@fb.com>
> > > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > So this looks ok but I would like to see an analysis how the overflow
> > can happen. I'm looking at the callers and it seems that once ctx->count
> > hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> > there a caller that can call directly or indirectly
> > eventfd_ctx_do_read() on a ctx->count == 0?
> eventfd_read() ensures that ctx->count is not 0 before calling
> eventfd_ctx_do_read() and it is correct.
> 
> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
> eventfd_ctx_do_read() unconditionally,
> 
> as it may not only causes ctx->count to overflow, but also unnecessarily
> calls wake_up_locked_poll().

Hm, so I think you're right and an underflow can be triggered for at
least three subsystems:

(1) virt/kvm/eventfd.c
(2) drivers/vfio/virqfd.c
(3) drivers/virt/acrn/irqfd.c

where (2) and (3) are just modeled after (1). The eventfd must've been
set to EFD_SEMAPHORE and ctx->count must be zero or have been
decremented to zero. The only way I can see the _underflow_ happening is
if the irqfd is shut down through an ioctl() like KVM_IRQFD with
KVM_IRQFD_FLAG_DEASSIGN raised while ctx->count is zero:

kvm_vm_ioctl()
-> kvm_irqfd()
   -> kvm_irqfd_deassign()
      -> irqfd_deactivate()
         -> irqfd_shutdown()
            -> eventfd_ctx_remove_wait_queue(&cnt)

which would underflow @cnt and cause a spurious wakeup. Userspace would
still read one because of EFD_SEMAPHORE semantics and wouldn't notice
the underflow.

I think it's probably not that bad because afaict, this really can only
happen when (1)-(3) are shutdown. But we should still fix it ofc.

> 
> 
> I am sorry for just adding the following string in the patch:
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> 
> 
> Looking forward to your suggestions.

What I usually look for is some callchain/analysis that explain under
what circumstance what this is fixing can happen. That makes life for
reviewers a lot easier because they don't have to dig out that work
themselves which takes time.
Christian Brauner July 11, 2023, 9:39 a.m. UTC | #5
On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
> 
> On 2023/7/10 22:12, Christian Brauner wrote:
> > On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> > > From: Wen Yang <wenyang.linux@foxmail.com>
> > > 
> > > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> > > 
> > > Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> > > Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Jens Axboe <axboe@kernel.dk>
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Dylan Yudaken <dylany@fb.com>
> > > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > So this looks ok but I would like to see an analysis how the overflow
> > can happen. I'm looking at the callers and it seems that once ctx->count
> > hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> > there a caller that can call directly or indirectly
> > eventfd_ctx_do_read() on a ctx->count == 0?
> eventfd_read() ensures that ctx->count is not 0 before calling
> eventfd_ctx_do_read() and it is correct.
> 
> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
> eventfd_ctx_do_read() unconditionally,
> 
> as it may not only causes ctx->count to overflow, but also unnecessarily
> calls wake_up_locked_poll().
> 
> 
> I am sorry for just adding the following string in the patch:
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> 
> 
> Looking forward to your suggestions.
> 
> --
> 
> Best wishes,
> 
> Wen
> 
> 
> > I'm just slightly skeptical about patches that fix issues without an
> > analysis how this can happen.
> > 
> > >   fs/eventfd.c | 4 +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/eventfd.c b/fs/eventfd.c
> > > index 8aa36cd37351..10a101df19cd 100644
> > > --- a/fs/eventfd.c
> > > +++ b/fs/eventfd.c
> > > @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
> > >   {
> > >   	lockdep_assert_held(&ctx->wqh.lock);
> > > -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
> > > +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
> > >   	ctx->count -= *cnt;
> > >   }
> > >   EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
> > > @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
> > >   		return -EFAULT;
> > >   	if (ucnt == ULLONG_MAX)
> > >   		return -EINVAL;
> > > +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
> > > +		return -EINVAL;

Hm, why is this bit necessary though? What's wrong with specifying ucnt == 0
with EFD_SEMAPHORE? This also looks like a (very low potential) uapi
break.
Christian Brauner July 11, 2023, 9:46 a.m. UTC | #6
On Sun, 09 Jul 2023 14:54:51 +0800, wenyang.linux@foxmail.com wrote:
> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> 

I've tweaked the commit message to explain how an underflow can happen.
And for now I dropped that bit:

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 10a101df19cd..33a918f9566c 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -269,8 +269,6 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
                return -EFAULT;
        if (ucnt == ULLONG_MAX)
                return -EINVAL;
-       if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
-               return -EINVAL;
        spin_lock_irq(&ctx->wqh.lock);
        res = -EAGAIN;
        if (ULLONG_MAX - ctx->count > ucnt)

because I don't yet understand why that should be forbidden. Please
explain the reason for wanting that check in there and I can add it back.
I might just be missing the obvious.

---

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] eventfd: prevent underflow for eventfd semaphores 
      https://git.kernel.org/vfs/vfs/c/7b2edd278691
Wen Yang July 11, 2023, 6:14 p.m. UTC | #7
On 2023/7/11 17:39, Christian Brauner wrote:
> On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
>> On 2023/7/10 22:12, Christian Brauner wrote:
>>> On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
>>>> From: Wen Yang <wenyang.linux@foxmail.com>
>>>>
>>>> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
>>>> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
>>>>
>>>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>>> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
>>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>>> Cc: Jens Axboe <axboe@kernel.dk>
>>>> Cc: Christian Brauner <brauner@kernel.org>
>>>> Cc: Christoph Hellwig <hch@lst.de>
>>>> Cc: Dylan Yudaken <dylany@fb.com>
>>>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>>>> Cc: Matthew Wilcox <willy@infradead.org>
>>>> Cc: linux-fsdevel@vger.kernel.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> ---
>>> So this looks ok but I would like to see an analysis how the overflow
>>> can happen. I'm looking at the callers and it seems that once ctx->count
>>> hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
>>> there a caller that can call directly or indirectly
>>> eventfd_ctx_do_read() on a ctx->count == 0?
>> eventfd_read() ensures that ctx->count is not 0 before calling
>> eventfd_ctx_do_read() and it is correct.
>>
>> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
>> eventfd_ctx_do_read() unconditionally,
>>
>> as it may not only causes ctx->count to overflow, but also unnecessarily
>> calls wake_up_locked_poll().
>>
>>
>> I am sorry for just adding the following string in the patch:
>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>
>>
>> Looking forward to your suggestions.
>>
>> --
>>
>> Best wishes,
>>
>> Wen
>>
>>
>>> I'm just slightly skeptical about patches that fix issues without an
>>> analysis how this can happen.
>>>
>>>>    fs/eventfd.c | 4 +++-
>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>>>> index 8aa36cd37351..10a101df19cd 100644
>>>> --- a/fs/eventfd.c
>>>> +++ b/fs/eventfd.c
>>>> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>>>>    {
>>>>    	lockdep_assert_held(&ctx->wqh.lock);
>>>> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>>>> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>>>>    	ctx->count -= *cnt;
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
>>>> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>>>>    		return -EFAULT;
>>>>    	if (ucnt == ULLONG_MAX)
>>>>    		return -EINVAL;
>>>> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
>>>> +		return -EINVAL;
> Hm, why is this bit necessary though? What's wrong with specifying ucnt == 0
> with EFD_SEMAPHORE? This also looks like a (very low potential) uapi
> break.


Thank you for your careful review.

As you pointed out, this may break the uapi. Although the manual
(man 2 eventfd) has the following description:

*  The file descriptor is readable (the select(2) readfds argument; the
   poll(2) POLLIN flag) if the counter has a value greater than 0.
*  The file descriptor is writable (the select(2) writefds argument; the
   poll(2) POLLOUT flag) if it is possible to write a value of at least
   "1" without blocking.

it does not specify that ucnt cannot be zero, so we can only delete
the two lines of code above.

Could we propose another patch specifically to address the issue you
have identified?

Since there are indeed some corner cases when ucnt is 0 and ctx->count
is also 0:


static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t count,
                             loff_t *ppos)
{
...
	if (ULLONG_MAX - ctx->count > ucnt)
		res = sizeof(ucnt);                       ---> always > 0
...
	if (likely(res > 0)) {
		ctx->count += ucnt;                       ---> unnecessary addition of 0
		current->in_eventfd = 1;                  ---> may affect eventfd_signal()
		if (waitqueue_active(&ctx->wqh))
			wake_up_locked_poll(&ctx->wqh, EPOLLIN);  ---> heavyweight wakeup
		current->in_eventfd = 0;
	}
...
}


static __poll_t eventfd_poll(struct file *file, poll_table *wait)
{
...
	count = READ_ONCE(ctx->count);

	if (count > 0)
		events |= EPOLLIN;    ---> if count is 0, all previous operations are wasted

	if (count == ULLONG_MAX)
		events |= EPOLLERR;
	if (ULLONG_MAX - 1 > count)
		events |= EPOLLOUT;

	return events;
}

Could we optimize it like this?

static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t count,
                             loff_t *ppos)
{
...
	if (likely(res > 0)) {
		ctx->count += ucnt;
		if (ctx->count) {         ---> avoiding unnecessary heavyweight operations
			current->in_eventfd = 1;
			if (waitqueue_active(&ctx->wqh))
				wake_up_locked_poll(&ctx->wqh, EPOLLIN);
			current->in_eventfd = 0;
		}
	}
...
}


--

Best wishes,

Wen

Patch

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 8aa36cd37351..10a101df19cd 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -189,7 +189,7 @@  void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
 {
 	lockdep_assert_held(&ctx->wqh.lock);
 
-	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
+	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
 	ctx->count -= *cnt;
 }
 EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
@@ -269,6 +269,8 @@  static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
 		return -EFAULT;
 	if (ucnt == ULLONG_MAX)
 		return -EINVAL;
+	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
+		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
 	res = -EAGAIN;
 	if (ULLONG_MAX - ctx->count > ucnt)