Message ID | 20221030094209.65916-1-eli.billauer@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 282a4b71816b6076029017a7bab3a9dcee12a920 |
Headers | show |
Series | [v2] char: xillybus: Prevent use-after-free due to race condition | expand |
On Sun, Oct 30, 2022 at 11:42:09AM +0200, Eli Billauer wrote: > The driver for XillyUSB devices maintains a kref reference count on each > xillyusb_dev structure, which represents a physical device. This reference > count reaches zero when the device has been disconnected and there are no > open file descriptors that are related to the device. When this occurs, > kref_put() calls cleanup_dev(), which clears up the device's data, > including the structure itself. > > However, when xillyusb_open() is called, this reference count becomes > tricky: This function needs to obtain the xillyusb_dev structure that > relates to the inode's major and minor (as there can be several such). > xillybus_find_inode() (which is defined in xillybus_class.c) is called > for this purpose. xillybus_find_inode() holds a mutex that is global in > xillybus_class.c to protect the list of devices, and releases this > mutex before returning. As a result, nothing protects the xillyusb_dev's > reference counter from being decremented to zero before xillyusb_open() > increments it on its own behalf. Hence the structure can be freed > due to a rare race condition. > > To solve this, a mutex is added. It is locked by xillyusb_open() before > the call to xillybus_find_inode() and is released only after the kref > counter has been incremented on behalf of the newly opened inode. This > protects the kref reference counters of all xillyusb_dev structs from > being decremented by xillyusb_disconnect() during this time segment, as > the call to kref_put() in this function is done with the same lock held. > > There is no need to hold the lock on other calls to kref_put(), because > if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not > made the call to remove it, and hence not made its call to kref_put(), > which takes place afterwards. Hence preventing xillyusb_disconnect's > call to kref_put() is enough to ensure that the reference doesn't reach > zero before it's incremented by xillyusb_open(). > > It would have been more natural to increment the reference count in > xillybus_find_inode() of course, however this function is also called by > Xillybus' driver for PCIe / OF, which registers a completely different > structure. Therefore, xillybus_find_inode() treats these structures as > void pointers, and accordingly can't make any changes. > > Reported-by: Hyunwoo Kim <imv4bel@gmail.com> > Suggested-by: Alan Stern <stern@rowland.harvard.edu> > Signed-off-by: Eli Billauer <eli.billauer@gmail.com> It looks like the xillybus driver already has a private mutex that would have been very well suited for this task: unit_mutex defined in xillybus_class.c. Of course, there's nothing wrong with using a new mutex instead -- just make sure there aren't any ABBA locking order problems. Alan Stern
On 30/10/2022 18:23, Alan Stern wrote: > It looks like the xillybus driver already has a private mutex that would > have been very well suited for this task: unit_mutex defined in > xillybus_class.c. Indeed so. The problem is that unit_mutex is global to xillybus_class.c, and I need the mutex to be released in xillyusb.c: The kref counter, which needs to be incremented with a mutex held, is inside a structure that only the XillyUSB driver knows about, so xillybus_class can't do that. Which is why xillybus_find_inode() passed a pointer to unit_mutex to its caller in the v1 version of this patch. But that turned out to be too odd. > Of course, there's nothing wrong with using a new > mutex instead -- just make sure there aren't any ABBA locking order > problems. The kref_mutex is always taken with no other (Xillybus-related) mutex taken. So the locking order is assured. But thanks for the reminder. Never hurts checking again. Regards, Eli
Dear, Sorry for the late review. This patch cannot prevent the UAF scenario I presented: ``` cpu0 cpu1 1. xillyusb_open() mutex_lock(&kref_mutex); // unaffected lock xillybus_find_inode() mutex_lock(&unit_mutex); unit = iter; mutex_unlock(&unit_mutex); 2. xillyusb_disconnect() xillybus_cleanup_chrdev() mutex_lock(&unit_mutex); kfree(unit); mutex_unlock(&unit_mutex); 3. *private_data = unit->private_data; // UAF ``` This is a UAF for 'unit', not a UAF for 'xdev'. So, the added 'kref_mutex' has no effect. Regards, Hyunwoo Kim
On 13/11/2022 10:05, Hyunwoo Kim wrote: > Dear, > > Sorry for the late review. > > This patch cannot prevent the UAF scenario I presented: > ``` > cpu0 cpu1 > 1. xillyusb_open() > mutex_lock(&kref_mutex); // unaffected lock > xillybus_find_inode() > mutex_lock(&unit_mutex); > unit = iter; > mutex_unlock(&unit_mutex); > 2. xillyusb_disconnect() > xillybus_cleanup_chrdev() > mutex_lock(&unit_mutex); > kfree(unit); > mutex_unlock(&unit_mutex); > 3. *private_data = unit->private_data; // UAF > > ``` > > This is a UAF for 'unit', not a UAF for 'xdev'. > So, the added 'kref_mutex' has no effect. > You're correct. This submitted patch solves only one problem out of two. It prevents the content of @private_data to be freed, but you're right that @unit itself isn't protected well enough. The problem you're pointing at is that @unit can be freed before its attempted use, because the mutex is released too early. This is easily solved by moving down the mutex_unlock() call to after @unit has been used. Do you want the pleasure to submit this patch, or should I? Thanks, Eli
Dear, On Sun, Nov 13, 2022 at 10:30:16AM +0200, Eli Billauer wrote: > On 13/11/2022 10:05, Hyunwoo Kim wrote: > > Dear, > > > > Sorry for the late review. > > > > This patch cannot prevent the UAF scenario I presented: > > ``` > > cpu0 cpu1 > > 1. xillyusb_open() > > mutex_lock(&kref_mutex); // unaffected lock > > xillybus_find_inode() > > mutex_lock(&unit_mutex); > > unit = iter; > > mutex_unlock(&unit_mutex); > > 2. xillyusb_disconnect() > > xillybus_cleanup_chrdev() > > mutex_lock(&unit_mutex); > > kfree(unit); > > mutex_unlock(&unit_mutex); > > 3. *private_data = unit->private_data; // UAF > > > > ``` > > > > This is a UAF for 'unit', not a UAF for 'xdev'. > > So, the added 'kref_mutex' has no effect. > > > > You're correct. This submitted patch solves only one problem out of two. It > prevents the content of @private_data to be freed, but you're right that > @unit itself isn't protected well enough. > > The problem you're pointing at is that @unit can be freed before its > attempted use, because the mutex is released too early. > > This is easily solved by moving down the mutex_unlock() call to after @unit > has been used. Do you want the pleasure to submit this patch, or should I? I hope you work as before. And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode() is finally moved, xdev may be released before kref_get() is executed if xillyusb_disconnect() ends just before the function returns. (Of course, this is an extremely rare case.) So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode(). Regards, Hyunwoo Kim
On 13/11/2022 10:47, Hyunwoo Kim wrote: > And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode() > is finally moved, xdev may be released before kref_get() is executed > if xillyusb_disconnect() ends just before the function returns. > (Of course, this is an extremely rare case.) > > So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode(). First of all, that's impossible. kref_get() is called on a member of a specific @xdev's struct, and it's xillybus_find_inode()'s job to find it. So before the call to xillybus_find_inode(), we don't know which @xdev it is. That's the tricky part of all this. The solution of this submitted patch was a lock that briefly prevents the kref_put() of all @xdevs. The way it works is that if an @xdev is found by xillybus_find_inode(), it necessarily means that xillyusb_disconnect()'s call to xillybus_cleanup_chrdev() hasn't returned (yet). Therefore, holding @kref_mutex guarantees that the kref_put() call, which is later on, isn't reached for the @xdev that has been found. If you've found a flaw in this mechanism, please be more specific about it. Regards, Eli
On Sun, Nov 13, 2022 at 11:03:20AM +0200, Eli Billauer wrote: > On 13/11/2022 10:47, Hyunwoo Kim wrote: > > And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode() > > is finally moved, xdev may be released before kref_get() is executed > > if xillyusb_disconnect() ends just before the function returns. > > (Of course, this is an extremely rare case.) > > > > So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode(). > > First of all, that's impossible. kref_get() is called on a member of a > specific @xdev's struct, and it's xillybus_find_inode()'s job to find it. So > before the call to xillybus_find_inode(), we don't know which @xdev it is. > That's the tricky part of all this. > > The solution of this submitted patch was a lock that briefly prevents the > kref_put() of all @xdevs. The way it works is that if an @xdev is found by > xillybus_find_inode(), it necessarily means that xillyusb_disconnect()'s > call to xillybus_cleanup_chrdev() hasn't returned (yet). Therefore, holding > @kref_mutex guarantees that the kref_put() call, which is later on, isn't > reached for the @xdev that has been found. > > If you've found a flaw in this mechanism, please be more specific about it. you're right. It seems that you only need to move the location of unit_mutex in xillybus_find_inode(). Regards, Hyunwoo Kim
On 13/11/2022 11:14, Hyunwoo Kim wrote: > > It seems that you only need to move the location of unit_mutex > in xillybus_find_inode(). > Agreed. I'll prepare a follow-up patch to fix this issue as well. Thanks a lot for drawing my attention to this. Eli
diff --git a/drivers/char/xillybus/xillyusb.c b/drivers/char/xillybus/xillyusb.c index 39bcbfd908b4..5a5afa14ca8c 100644 --- a/drivers/char/xillybus/xillyusb.c +++ b/drivers/char/xillybus/xillyusb.c @@ -184,6 +184,14 @@ struct xillyusb_dev { struct mutex process_in_mutex; /* synchronize wakeup_all() */ }; +/* + * kref_mutex is used in xillyusb_open() to prevent the xillyusb_dev + * struct from being freed during the gap between being found by + * xillybus_find_inode() and having its reference count incremented. + */ + +static DEFINE_MUTEX(kref_mutex); + /* FPGA to host opcodes */ enum { OPCODE_DATA = 0, @@ -1237,9 +1245,16 @@ static int xillyusb_open(struct inode *inode, struct file *filp) int rc; int index; + mutex_lock(&kref_mutex); + rc = xillybus_find_inode(inode, (void **)&xdev, &index); - if (rc) + if (rc) { + mutex_unlock(&kref_mutex); return rc; + } + + kref_get(&xdev->kref); + mutex_unlock(&kref_mutex); chan = &xdev->channels[index]; filp->private_data = chan; @@ -1275,8 +1290,6 @@ static int xillyusb_open(struct inode *inode, struct file *filp) ((filp->f_mode & FMODE_WRITE) && chan->open_for_write)) goto unmutex_fail; - kref_get(&xdev->kref); - if (filp->f_mode & FMODE_READ) chan->open_for_read = 1; @@ -1413,6 +1426,7 @@ static int xillyusb_open(struct inode *inode, struct file *filp) return rc; unmutex_fail: + kref_put(&xdev->kref, cleanup_dev); mutex_unlock(&chan->lock); return rc; } @@ -2227,7 +2241,9 @@ static void xillyusb_disconnect(struct usb_interface *interface) xdev->dev = NULL; + mutex_lock(&kref_mutex); kref_put(&xdev->kref, cleanup_dev); + mutex_unlock(&kref_mutex); } static struct usb_driver xillyusb_driver = {
The driver for XillyUSB devices maintains a kref reference count on each xillyusb_dev structure, which represents a physical device. This reference count reaches zero when the device has been disconnected and there are no open file descriptors that are related to the device. When this occurs, kref_put() calls cleanup_dev(), which clears up the device's data, including the structure itself. However, when xillyusb_open() is called, this reference count becomes tricky: This function needs to obtain the xillyusb_dev structure that relates to the inode's major and minor (as there can be several such). xillybus_find_inode() (which is defined in xillybus_class.c) is called for this purpose. xillybus_find_inode() holds a mutex that is global in xillybus_class.c to protect the list of devices, and releases this mutex before returning. As a result, nothing protects the xillyusb_dev's reference counter from being decremented to zero before xillyusb_open() increments it on its own behalf. Hence the structure can be freed due to a rare race condition. To solve this, a mutex is added. It is locked by xillyusb_open() before the call to xillybus_find_inode() and is released only after the kref counter has been incremented on behalf of the newly opened inode. This protects the kref reference counters of all xillyusb_dev structs from being decremented by xillyusb_disconnect() during this time segment, as the call to kref_put() in this function is done with the same lock held. There is no need to hold the lock on other calls to kref_put(), because if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not made the call to remove it, and hence not made its call to kref_put(), which takes place afterwards. Hence preventing xillyusb_disconnect's call to kref_put() is enough to ensure that the reference doesn't reach zero before it's incremented by xillyusb_open(). It would have been more natural to increment the reference count in xillybus_find_inode() of course, however this function is also called by Xillybus' driver for PCIe / OF, which registers a completely different structure. Therefore, xillybus_find_inode() treats these structures as void pointers, and accordingly can't make any changes. Reported-by: Hyunwoo Kim <imv4bel@gmail.com> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Eli Billauer <eli.billauer@gmail.com> --- Notes: Changelog: v2 -- Rewrite completely: Instead of juggling with a mutex, add a new one. drivers/char/xillybus/xillyusb.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-)