
[v6,1/5] random: Blocking API for accessing nonblocking_pool

Message ID 3672239.xqA2RrDaZY@tachyon.chronox.de (mailing list archive)
State RFC
Delegated to: Herbert Xu
Headers show

Commit Message

Stephan Mueller May 18, 2015, 5:32 a.m. UTC
On Friday, May 15, 2015, 14:46:26, Herbert Xu wrote:

Hi Herbert,

> On Wed, May 13, 2015 at 09:54:41PM +0200, Stephan Mueller wrote:
> >  /*
> > 
> > + * Equivalent function to get_random_bytes with the difference that this
> > + * function blocks the request until the nonblocking_pool is initialized.
> > + */
> > +void get_blocking_random_bytes(void *buf, int nbytes)
> > +{
> > +	if (unlikely(nonblocking_pool.initialized == 0))
> > +		wait_event_interruptible(urandom_init_wait,
> > +					 nonblocking_pool.initialized);
> > +	extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
> 
> So what if the wait was interrupted? You are going to extract
> entropy from an empty pool.
> 
> Anyway, you still haven't addressed my primary concern with this
> model which is the potential for dead-lock.  Sleeping for an open
> period of time like this in a work queue is bad form.  It may also
> lead to dead-locks if whatever you're waiting for happened to use
> the same work thread.
> 
> That's why I think you should simply provide a function and data
> pointer which random.c can then stash onto a list to call when
> the pool is ready.

Thanks for the hint about the list. Before handing in another formal patch, may I ask you to check the following approach? I would think that it should cover your concerns.

Comments

Herbert Xu May 18, 2015, 9:21 a.m. UTC | #1
On Mon, May 18, 2015 at 07:32:01AM +0200, Stephan Mueller wrote:
> 
> Thanks for the hint about the list. Before handing in another formal patch, may I ask you to check the following approach? I would think that it should cover your concerns.

Yes this is definitely going in the right direction.

> +/*
> + * Equivalent function to get_random_bytes with the difference that this
> + * function blocks the request until the nonblocking_pool is initialized.
> + */
> +int get_blocking_random_bytes_cb(void *buf, int nbytes, void *private,
> +				 void (*cb)(void *buf, int buflen,
> +					    void *private))

You can simplify this further and get rid of buf/nbytes.  All
we need to know is whether the pool is ready.  Everything else
can come from private.

> +	struct random_work *rw = NULL;
> +	int ret = 0;

I think this function should return 0 if the pool is ready now,
-EINPROGRESS if it's not (indicating that the callback will be
called when it is ready) and otherwise an error.

Cheers,
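Herbert's suggested calling convention can be sketched in plain userspace C. This is an illustrative model, not the kernel code: `random_ready_cb`, `pool_became_ready`, and the single pending slot are hypothetical names standing in for the real wait list in random.c.

```c
#include <errno.h>
#include <stddef.h>

typedef void (*seed_cb_t)(void *private);

static int pool_initialized;	/* stands in for nonblocking_pool.initialized */
static seed_cb_t pending_cb;	/* stands in for the wait list */
static void *pending_private;

/* Returns 0 if the pool is ready now, -EINPROGRESS if the callback
 * was queued to run once the pool becomes ready. */
int random_ready_cb(seed_cb_t cb, void *private)
{
	if (pool_initialized)
		return 0;	/* caller may seed synchronously */
	pending_cb = cb;
	pending_private = private;
	return -EINPROGRESS;	/* callback fires later */
}

/* Called when the pool becomes ready (models process_random_waiters()). */
void pool_became_ready(void)
{
	pool_initialized = 1;
	if (pending_cb) {
		pending_cb(pending_private);
		pending_cb = NULL;
	}
}

/* Trivial callback for demonstration purposes. */
int cb_fired;
void demo_cb(void *private)
{
	(void)private;
	cb_fired = 1;
}
```

A caller would treat 0 as "seed now", -EINPROGRESS as "wait for the callback", and any other negative value as an error, exactly the three-way split Herbert describes.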
Stephan Mueller May 18, 2015, 1:07 p.m. UTC | #2
On Monday, May 18, 2015, 17:21:31, Herbert Xu wrote:

Hi Herbert,

>> +/*
>> + * Equivalent function to get_random_bytes with the difference that this
>> + * function blocks the request until the nonblocking_pool is initialized.
>> + */
>> +int get_blocking_random_bytes_cb(void *buf, int nbytes, void *private,
>> +				 void (*cb)(void *buf, int buflen,
>> +					    void *private))
>
>You can simplify this further and get rid of buf/nbytes.  All
>we need to know is whether the pool is ready.  Everything else
>can come from private.
>
So, the async function is now just a notification to the caller. Sounds good 
to me.

>> +	struct random_work *rw = NULL;
>> +	int ret = 0;
>
>I think this function should return 0 if the pool is ready now,
>-EINPROGRESS if it's not (indicating that the callback will be
>called when it is ready) and otherwise an error.

Ok, will come in the next patch.

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephan Mueller May 18, 2015, 1:26 p.m. UTC | #3
On Monday, May 18, 2015, 15:07:10, Stephan Mueller wrote:

Hi Herbert,

>>
>>You can simplify this further and get rid of buf/nbytes.  All
>>we need to know is whether the pool is ready.  Everything else
>>can come from private.
>
>So, the async function is now just a notification to the caller. Sounds good
>to me.
>

I am just running into an interesting problem with a missing cancel operation: 
a caller instantiates a DRBG handle and invokes the seeding operation. The 
nonblocking_pool is not initialized. Therefore, the callback is put onto the 
list for being processed later.

Now, the caller releases the DRBG handle *before* the callback is triggered.

The callback is triggered with a pointer that is invalid, but the pointer is 
non-NULL. Therefore, I am not sure how to validate the pointer in the callback 
function.

I would think that there is no other way than to add a cancel API call that 
allows removing a list entry from the wait list.


Ciao
Stephan
Theodore Ts'o May 18, 2015, 3:02 p.m. UTC | #4
On Mon, May 18, 2015 at 03:26:13PM +0200, Stephan Mueller wrote:
> 
> I am just running into an interesting problem with a missing cancel operation: 
> a caller instantiates a DRBG handle and invokes the seeding operation. The 
> nonblocking_pool is not initialized. Therefore, the callback is put onto the 
> list for being processed later.
> 
> Now, the caller releases the DRBG handle *before* the callback is triggered.
> 
> The callback is triggered with a pointer that is invalid, but the pointer is 
> non-NULL. Therefore, I am not sure how to validate the pointer in the callback 
> function.

The simplest thing to do is to put a refcount inside the DRBG
handle structure.  The caller instantiates the DRBG handle, and
invokes the DRBG.  The DRBG, since it is kicking off an
asynchronous operation, increments the refcount.

Both the caller and the callback function, before they exit, drop the
refcount, and if they see the refcount is zero, they free the DRBG
handle and the memory where the random seed is to be (or has been)
deposited.

This is the same pattern that the block I/O layer uses with a bio
struct.  In the BIO case, it's important since the callback function
could have been called and returned before the caller gets control
back from the submit_bio() call.  Or the struct bio may contain an
EOPNOTSUPP error, in which case there will be no callback function
dispatched.  So long as everyone handles the refcount rules, it all
works out.

Regards,

					- Ted
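Ted's refcount pattern can be sketched as follows. This is a hedged userspace illustration, not the kernel implementation: `drbg_alloc`, `drbg_get`, and `drbg_put` are hypothetical names, and C11 atomics stand in for the kernel's `atomic_t`.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Both the submitter and the async callback hold a reference; whoever
 * drops the count to zero frees the handle, so it does not matter
 * which path finishes last. */
struct drbg_handle {
	atomic_int refcnt;
	unsigned char seed[32];	/* where the random seed would be deposited */
};

struct drbg_handle *drbg_alloc(void)
{
	struct drbg_handle *h = calloc(1, sizeof(*h));

	if (h)
		atomic_init(&h->refcnt, 1);	/* caller's reference */
	return h;
}

void drbg_get(struct drbg_handle *h)
{
	atomic_fetch_add(&h->refcnt, 1);	/* taken before kicking off async work */
}

/* Returns 1 if this put freed the handle, 0 otherwise. */
int drbg_put(struct drbg_handle *h)
{
	if (atomic_fetch_sub(&h->refcnt, 1) == 1) {
		free(h);
		return 1;
	}
	return 0;
}
```

This mirrors the bio pattern Ted mentions: the submit path takes a reference before kicking off the async work, and both the caller and the callback drop theirs when done.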
Stephan Mueller May 19, 2015, 5:58 a.m. UTC | #5
On Monday, May 18, 2015, 11:02:34, Theodore Ts'o wrote:

Hi Theodore, Herbert,
> 
> The simplest thing to do is to put a refcount inside the DRBG
> handle structure.  The caller instantiates the DRBG handle, and
> invokes the DRBG.  The DRBG, since it is kicking off an
> asynchronous operation, increments the refcount.

That is a good idea. After experimenting with the refcount, I see that the 
kernel crypto API release function, crypto_destroy_tfm, unconditionally 
destroys the crypto handle by freeing it.

So, if a caller releases the DRBG handle, the DRBG code cannot prevent the 
destruction of its context with a refcount.

Herbert, do you have any ideas?
Herbert Xu May 19, 2015, 7:22 a.m. UTC | #6
On Tue, May 19, 2015 at 07:58:25AM +0200, Stephan Mueller wrote:
> 
> Herbert, do you have any ideas?

On the /dev/random side,

1) Add a struct module argument in addition to func/data.
2) Grab module ref count when func/data is registered.
3) Drop module ref count after func returns.

On the drbg side,

1) Allocate data pointer before func/data registration, it should
contain a flag indicating whether drbg is still alive.
2) In cra_exit, zap the flag in allocated data.
3) In func immediately return if flag indicates drbg is dead.
4) Free allocated data pointer when func is done.

Obviously you need to add some locking so that func doesn't race
against cra_exit or any other drbg paths that it intersects.

Cheers,
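The drbg-side scheme Herbert outlines might look roughly like the following userspace sketch. All names here are hypothetical, and a pthread mutex stands in for the kernel locking he mentions; the point is only the lifetime rule: the registration data outlives the DRBG, `cra_exit` zaps the flag, and the callback checks it under the lock so a late callback never touches a freed DRBG.

```c
#include <pthread.h>
#include <stdlib.h>

struct seed_ctx {
	pthread_mutex_t lock;
	int drbg_alive;		/* zapped by cra_exit (step 2) */
	int seeded;		/* set by the callback if the DRBG still exists */
};

/* Step 1: allocated before func/data registration. */
struct seed_ctx *seed_ctx_alloc(void)
{
	struct seed_ctx *ctx = calloc(1, sizeof(*ctx));

	if (ctx) {
		pthread_mutex_init(&ctx->lock, NULL);
		ctx->drbg_alive = 1;
	}
	return ctx;
}

/* Step 2: the DRBG is being torn down. */
void seed_ctx_drbg_exit(struct seed_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->drbg_alive = 0;
	pthread_mutex_unlock(&ctx->lock);
}

/* Steps 3-4: the async callback ("func").  Returns 1 if it seeded the
 * DRBG, 0 if the DRBG was already gone.  Frees ctx either way. */
int seed_ctx_callback(struct seed_ctx *ctx)
{
	int alive;

	pthread_mutex_lock(&ctx->lock);
	alive = ctx->drbg_alive;
	if (alive)
		ctx->seeded = 1;	/* would reseed the DRBG here */
	pthread_mutex_unlock(&ctx->lock);

	pthread_mutex_destroy(&ctx->lock);
	free(ctx);
	return alive;
}
```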
Stephan Mueller May 19, 2015, 7:35 a.m. UTC | #7
On Tuesday, May 19, 2015, 15:22:27, Herbert Xu wrote:

Hi Herbert,

> On Tue, May 19, 2015 at 07:58:25AM +0200, Stephan Mueller wrote:
> > Herbert, do you have any ideas?
> 
> On the /dev/random side,
> 
> 1) Add a struct module argument in addition to func/data.
> 2) Grab module ref count when func/data is registered.
> 3) Drop module ref count after func returns.
> 
> On the drbg side,
> 
> 1) Allocate data pointer before func/data registration, it should
> contain a flag indicating whether drbg is still alive.
> 2) In cra_exit, zap the flag in allocated data.
> 3) In func immediately return if flag indicates drbg is dead.
> 4) Free allocated data pointer when func is done.
> 
> Obviously you need to add some locking so that func doesn't race
> against cra_exit or any other drbg paths that it intersects.

Thank you for the hints. I will follow your guidance.

Just for my edification: why is this (rather complex sounding) approach 
preferred over a simple cancel API? Other async APIs (e.g. the AIO syscalls 
with io_cancel) have such cancel operations.

Such cancel function would be as simple as:

void get_blocking_random_bytes_cancel(void *private)
{
        struct random_work *rw, *tmp;

        mutex_lock(&random_wait_list_mutex);
        list_for_each_entry_safe(rw, tmp, &random_wait_list, list) {
                if (private == rw->rw_private) {
                        list_del(&rw->list);
                        kfree(rw);
                        break;
                }
        }
        mutex_unlock(&random_wait_list_mutex);
}
Herbert Xu May 19, 2015, 7:51 a.m. UTC | #8
On Tue, May 19, 2015 at 09:35:15AM +0200, Stephan Mueller wrote:
> 
> Thank you for the hints. I will follow your guidance.
> 
> Just for my edification: why is this (rather complex sounding) approach 
> preferred over a simple cancel API? Other async APIs (e.g. the AIO syscalls 
> with io_cancel) have such cancel operations.
> 
> Such cancel function would be as simple as:

You're right.  The cancel function is indeed simpler.  I can
certainly live with that.

Thanks,
Stephan Mueller May 19, 2015, 7:56 a.m. UTC | #9
On Tuesday, May 19, 2015, 15:51:55, Herbert Xu wrote:

Hi Herbert,

>You're right.  The cancel function is indeed simpler.  I can
>certainly live with that.

Thank you. I will test my patch a bit more and then release it with the 
discussed changes.

Ciao
Stephan
Theodore Ts'o May 19, 2015, 1:50 p.m. UTC | #10
On Tue, May 19, 2015 at 03:51:55PM +0800, Herbert Xu wrote:
> On Tue, May 19, 2015 at 09:35:15AM +0200, Stephan Mueller wrote:
> > 
> > Thank you for the hints. I will follow your guidance.
> > 
> > Just for my edification: why is this (rather complex sounding) approach 
> > preferred over a simple cancel API? Other async APIs (e.g. the AIO syscalls 
> > with io_cancel) have such cancel operations.
> > 
> > Such cancel function would be as simple as:
> 
> You're right.  The cancel function is indeed simpler.  I can
> certainly live with that.

I'm not at all convinced it's simpler.  I see it adding a huge amount of
hair to drivers/char/random.c that is used by only a single caller.

I'm also not sure I understand Herbert's suggestion of adding a struct
module to the /dev/random code.  Why is this helpful, and what is the
race that you are trying to protect against?

Finally, this is only going to block *once*, when the system is
initially booting up.  Why is it so important that we get the
asynchronous nature of this right, and why can't we solve it simply by
doing the work in a workqueue, with a completion barrier getting
triggered once /dev/random initializes itself, and blocking the
module unload until /dev/random is initialized?

	     	    	   	 	     - Ted
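Ted's completion-barrier idea can be modeled in userspace with a pthread condition variable. The names deliberately mirror the kernel's completion API (`init_completion`, `wait_for_completion`, `complete_all`), but this is only an illustrative sketch, not kernel code: the seeding work blocks on the completion, which /dev/random would fire exactly once when the nonblocking pool initializes.

```c
#include <pthread.h>

struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;
};

void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

/* The seeding work would call this and sleep until the pool is ready. */
void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Fired once when the nonblocking pool initializes. */
void complete_all(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = 1;	/* latches: later waiters return immediately */
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}
```

Because `done` latches, any seeding work queued after the pool initializes returns immediately, while earlier waiters are woken by the broadcast.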
Herbert Xu May 19, 2015, 2:18 p.m. UTC | #11
On Tue, May 19, 2015 at 09:50:28AM -0400, Theodore Ts'o wrote:
>
> Finally, this is only going to block *once*, when the system is
> initially booting up.  Why is it so important that we get the
> asynchronous nature of this right, and why can't we solve it simply by
> doing the work in a workqueue, with a completion barrier getting
> triggered once /dev/random initializes itself, and blocking the
> module unload until /dev/random is initialized?

I guess I'm still thinking of the old work queue code before
Tejun's cmwq work.  Yes, blocking in a work queue should be fine,
as there is usually just one DRBG instance.

Cheers,
Stephan Mueller May 19, 2015, 2:27 p.m. UTC | #12
On Tuesday, May 19, 2015, 22:18:05, Herbert Xu wrote:

Hi Herbert,

>On Tue, May 19, 2015 at 09:50:28AM -0400, Theodore Ts'o wrote:
>> Finally, this is only going to block *once*, when the system is
>> initially booting up.  Why is it so important that we get the
>> asynchronous nature of this right, and why can't we solve it simply by
>> doing the work in a workqueue, with a completion barrier getting
>> triggered once /dev/random initializes itself, and blocking the
>> module unload until /dev/random is initialized?
>
>I guess I'm still thinking of the old work queue code before
>Tejun's cmwq work.  Yes, blocking in a work queue should be fine,
>as there is usually just one DRBG instance.

The current modification with patch 1 to random.c is the smallest change to 
date. Is that then appropriate?

Herbert, based on your comment now, would the currently discussed patch, with 
waiting in the work queue in patch 3, be appropriate? Or what would you like 
to see changed?

Ciao
Stephan
Herbert Xu May 19, 2015, 2:30 p.m. UTC | #13
On Tue, May 19, 2015 at 04:27:54PM +0200, Stephan Mueller wrote:
>
> The current modification with patch 1 to random.c is the smallest change to 
> date. Is that then appropriate?
> 
> Herbert, based on your comment now, would the currently discussed patch, with 
> waiting in the work queue in patch 3, be appropriate? Or what would you like 
> to see changed?

Sorry but I have lost track as to which patch is which.  Please post
the changes to random.c and we can all take a look.

Thanks,
Herbert Xu June 5, 2015, 5:28 a.m. UTC | #14
On Tue, May 19, 2015 at 10:18:05PM +0800, Herbert Xu wrote:
> On Tue, May 19, 2015 at 09:50:28AM -0400, Theodore Ts'o wrote:
> >
> > Finally, this is only going to block *once*, when the system is
> > initially booting up.  Why is it so important that we get the
> > asynchronous nature of this right, and why can't we solve it simply by
> > doing the work in a workqueue, with a completion barrier getting
> > triggered once /dev/random initializes itself, and blocking the
> > module unload until /dev/random is initialized?
> 
> I guess I'm still thinking of the old work queue code before
> Tejun's cmwq work.  Yes, blocking in a work queue should be fine,
> as there is usually just one DRBG instance.

It looks like waiting for it in a workqueue isn't good enough
after all.  I just got this in a KVM machine:

INFO: task kworker/0:1:121 blocked for more than 120 seconds.
      Tainted: G           O    4.1.0-rc1+ #34
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1     D ffff88001eb47d18     0   121      2 0x00000000
Workqueue: events drbg_async_seed [drbg]
 ffff88001eb47d18 ffff88001e1bcec0 ffff88001eb84010 0000000000000246
 ffff88001eb48000 0000000000000020 ffff88001d54ea50 0000000000000000
 ffff88001f613080 ffff88001eb47d38 ffffffff813fe692 0000000000000020
Call Trace:
 [<ffffffff813fe692>] schedule+0x32/0x80
 [<ffffffff812c1415>] get_blocking_random_bytes+0x65/0xa0
 [<ffffffff810825e0>] ? add_wait_queue+0x60/0x60
 [<ffffffffa03dd14c>] drbg_async_seed+0x2c/0xc0 [drbg]
 [<ffffffff81063369>] process_one_work+0x129/0x310
 [<ffffffff810636a9>] worker_thread+0x119/0x430
 [<ffffffff813fe51b>] ? __schedule+0x7fb/0x85e
 [<ffffffff81063590>] ? process_scheduled_works+0x40/0x40
 [<ffffffff81068ed4>] kthread+0xc4/0xe0
 [<ffffffff81060000>] ? proc_cap_handler+0x180/0x1b0
 [<ffffffff81068e10>] ? kthread_freezable_should_stop+0x60/0x60
 [<ffffffff81401ee2>] ret_from_fork+0x42/0x70
 [<ffffffff81068e10>] ? kthread_freezable_should_stop+0x60/0x60

Stephan, I think we need to revisit the idea of having a list
of callbacks.

Cheers,
Stephan Mueller June 5, 2015, 9:50 a.m. UTC | #15
On Friday, June 5, 2015, 13:28:06, Herbert Xu wrote:

Hi Herbert,
> 
> Stephan, I think we need to revisit the idea of having a list
> of callbacks.

Ok, I will reactivate my patch with the list.
> 
> Cheers,

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 9cd6968..9bc2a57 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -409,6 +409,19 @@  static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
 static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait);
 static struct fasync_struct *fasync;
 
+static LIST_HEAD(random_wait_list);
+static DEFINE_MUTEX(random_wait_list_mutex);
+struct random_work {
+	struct list_head	list;
+	struct work_struct	rw_work;
+	void			*rw_buf;
+	int			rw_len;
+	void			*rw_private;
+	void			(*rw_cb)(void *buf, int buflen,
+					 void *private);
+};
+static void process_random_waiters(void);
+
 /**********************************************************************
  *
  * OS independent entropy store.   Here are the functions which handle
@@ -660,6 +673,7 @@  retry:
 		r->entropy_total = 0;
 		if (r == &nonblocking_pool) {
 			prandom_reseed_late();
+			process_random_waiters();
 			wake_up_interruptible(&urandom_init_wait);
 			pr_notice("random: %s pool is initialized\n", r->name);
 		}
@@ -1778,3 +1792,64 @@  void add_hwgenerator_randomness(const char *buffer, size_t count,
 	credit_entropy_bits(poolp, entropy);
 }
 EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+static void process_random_waiters(void)
+{
+	struct random_work *rw = NULL;
+
+	mutex_lock(&random_wait_list_mutex);
+	while (!list_empty(&random_wait_list)) {
+		rw = list_first_entry(&random_wait_list, struct random_work,
+				      list);
+		list_del(&rw->list);
+		schedule_work(&rw->rw_work);
+	}
+	mutex_unlock(&random_wait_list_mutex);
+}
+
+static void get_blocking_random_bytes_work(struct work_struct *work)
+{
+	struct random_work *rw = container_of(work, struct random_work,
+					      rw_work);
+
+	get_random_bytes(rw->rw_buf, rw->rw_len);
+	rw->rw_cb(rw->rw_buf, rw->rw_len, rw->rw_private);
+	kfree(rw);
+}
+
+/*
+ * Equivalent function to get_random_bytes with the difference that this
+ * function blocks the request until the nonblocking_pool is initialized.
+ */
+int get_blocking_random_bytes_cb(void *buf, int nbytes, void *private,
+				 void (*cb)(void *buf, int buflen,
+					    void *private))
+{
+	struct random_work *rw = NULL;
+	int ret = 0;
+
+	mutex_lock(&random_wait_list_mutex);
+	list_for_each_entry(rw, &random_wait_list, list)
+		if (buf == rw->rw_buf)
+			goto out;
+
+	rw = kmalloc(sizeof(struct random_work), GFP_KERNEL);
+	if (!rw) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	INIT_WORK(&rw->rw_work, get_blocking_random_bytes_work);
+	rw->rw_buf = buf;
+	rw->rw_len = nbytes;
+	rw->rw_private = private;
+	rw->rw_cb = cb;
+	list_add_tail(&rw->list, &random_wait_list);
+
+out:
+	mutex_unlock(&random_wait_list_mutex);
+	if (nonblocking_pool.initialized)
+		process_random_waiters();
+
+	return ret;
+}
+EXPORT_SYMBOL(get_blocking_random_bytes_cb);
diff --git a/include/linux/random.h b/include/linux/random.h
index b05856e..b57525f 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -15,6 +15,9 @@  extern void add_interrupt_randomness(int irq, int irq_flags);
 
 extern void get_random_bytes(void *buf, int nbytes);
 extern void get_random_bytes_arch(void *buf, int nbytes);
+extern int get_blocking_random_bytes_cb(void *buf, int nbytes, void *private,
+					void (*cb)(void *buf, int buflen,
+						   void *private));
 void generate_random_uuid(unsigned char uuid_out[16]);
 extern int random_int_secret_init(void);