mbox series

[00/26] Media device lifetime management

Message ID 20230201214535.347075-1-sakari.ailus@linux.intel.com (mailing list archive)
Headers show
Series Media device lifetime management | expand

Message

Sakari Ailus Feb. 1, 2023, 9:45 p.m. UTC
Hi folks,

This is a refresh of my 2016 RFC patchset to start addressing object
lifetime issues in Media controller. It further allows continuing work to
address lifetime management of media entities.

The underlying problem is described in detail in v4 of the previous RFC:
<URL:https://lore.kernel.org/linux-media/20161108135438.GO3217@valkosipuli.retiisi.org.uk/>.
In brief, there is currently no connection between releasing media device
(and related) memory and IOCTL calls, meaning that there is a time window
during which released kernel memory can be accessed, and that access can be
triggered from the user space. The only reason why this is not a grave
security issue is that it is not triggerable by the user alone but requires
unbinding a device. That is still not an excuse for not fixing it.

This set differs from the earlier RFC to address the issue in the
following respects:

- Make changes for ipu3-cio2 driver, too.

- Continue to provide best effort attempt to keep the window between device
  removal and user space being able to access released memory as small as
  possible. This means the problem won't become worse for drivers for which
  Media device lifetime management has not been implemented.

The latter is achieved by adding a new object, Media devnode compat
reference, which is allocated, refcounted and eventually released by the
Media controller framework itself, and where the information on registration
and open filehandles is maintained. This is only done if the driver does not
manage the lifetime of the media device itself, i.e. its release operation
is NULL.

Due to this, Media device file handles will also be introduced by this
patchset. I thought the first user of this would be Media device events but
it seems we already need them here.

Both ipu3-cio2 and omap3isp drivers are relieved of devm_request_irq() use,
as device_release() releases the resources before calling the driver's
remove function. While further work will be required also on these drivers
to safely stop he hardware at unbind time, I don't see a reason not to merge
these patches now.

Some patches are temporarily reverted in order to make reworks easier, then
applied later on.

I've tested this on ipu3-cio2 with and without the refcounting patch (media:
ipu3-cio2: Release the cio2 device context by media device callback),
including failures in a few parts of the driver initialisation process in
the MC framework.

Questions and comments are welcome.


Daniel Axtens (1):
  media: uvcvideo: Refactor teardown of uvc on USB disconnect

Laurent Pinchart (1):
  media: Add per-file-handle data support

Logan Gunthorpe (1):
  media: utilize new cdev_device_add helper function

Sakari Ailus (23):
  Revert "[media] media: fix media devnode ioctl/syscall and unregister
    race"
  Revert "media: utilize new cdev_device_add helper function"
  Revert "[media] media: fix use-after-free in cdev_put() when app exits
    after driver unbind"
  Revert "media: uvcvideo: Refactor teardown of uvc on USB disconnect"
  Revert "[media] media-device: dynamically allocate struct
    media_devnode"
  media device: Drop nop release callback
  media: Do not call cdev_device_del() if cdev_device_add() fails
  media-device: Delete character device early
  media: Split initialising and adding media devnode
  media: Shuffle functions around
  media device: Initialise media devnode in media_device_init()
  media device: Refcount the media device
  v4l: Acquire a reference to the media device for every video device
  media-device: Postpone graph object removal until free
  omap3isp: Release the isp device struct by media device callback
  omap3isp: Don't use devm_request_irq()
  media: Add nop implementations of media_device_{init,cleanup}
  media: ipu3-cio2: Call v4l2_device_unregister() earlier
  media: ipu3-cio2: Don't use devm_request_irq()
  media: ipu3-cio2: Release the cio2 device context by media device
    callback
  media: Maintain a list of open file handles in a media device
  media: Implement best effort media device removal safety sans
    refcounting
  media: Document how Media device resources are released

 Documentation/driver-api/media/mc-core.rst    |  12 +-
 drivers/media/cec/core/cec-core.c             |   2 +-
 drivers/media/mc/mc-device.c                  | 279 +++++++++++-------
 drivers/media/mc/mc-devnode.c                 |  94 +++---
 drivers/media/pci/intel/ipu3/ipu3-cio2-main.c |  75 +++--
 drivers/media/platform/ti/omap3isp/isp.c      |  33 ++-
 drivers/media/usb/au0828/au0828-core.c        |   4 +-
 drivers/media/usb/uvc/uvc_driver.c            |   2 +-
 drivers/media/v4l2-core/v4l2-dev.c            |  13 +-
 drivers/staging/media/sunxi/cedrus/cedrus.c   |   2 +-
 include/media/media-device.h                  |  56 +++-
 include/media/media-devnode.h                 |  99 ++++---
 include/media/media-fh.h                      |  32 ++
 13 files changed, 476 insertions(+), 227 deletions(-)
 create mode 100644 include/media/media-fh.h

Comments

Hans Verkuil March 3, 2023, 9:07 a.m. UTC | #1
Hi Sakari,

On 01/02/2023 22:45, Sakari Ailus wrote:
> Hi folks,
> 
> This is a refresh of my 2016 RFC patchset to start addressing object
> lifetime issues in Media controller. It further allows continuing work to
> address lifetime management of media entities.
> 
> The underlying problem is described in detail in v4 of the previous RFC:
> <URL:https://lore.kernel.org/linux-media/20161108135438.GO3217@valkosipuli.retiisi.org.uk/>.
> In brief, there is currently no connection between releasing media device
> (and related) memory and IOCTL calls, meaning that there is a time window
> during which released kernel memory can be accessed, and that access can be
> triggered from the user space. The only reason why this is not a grave
> security issue is that it is not triggerable by the user alone but requires
> unbinding a device. That is still not an excuse for not fixing it.
> 
> This set differs from the earlier RFC to address the issue in the
> following respects:
> 
> - Make changes for ipu3-cio2 driver, too.
> 
> - Continue to provide best effort attempt to keep the window between device
>   removal and user space being able to access released memory as small as
>   possible. This means the problem won't become worse for drivers for which
>   Media device lifetime management has not been implemented.
> 
> The latter is achieved by adding a new object, Media devnode compat
> reference, which is allocated, refcounted and eventually released by the
> Media controller framework itself, and where the information on registration
> and open filehandles is maintained. This is only done if the driver does not
> manage the lifetime of the media device itself, i.e. its release operation
> is NULL.
> 
> Due to this, Media device file handles will also be introduced by this
> patchset. I thought the first user of this would be Media device events but
> it seems we already need them here.
> 
> Both ipu3-cio2 and omap3isp drivers are relieved of devm_request_irq() use,
> as device_release() releases the resources before calling the driver's
> remove function. While further work will be required also on these drivers
> to safely stop he hardware at unbind time, I don't see a reason not to merge
> these patches now.
> 
> Some patches are temporarily reverted in order to make reworks easier, then
> applied later on.
> 
> I've tested this on ipu3-cio2 with and without the refcounting patch (media:
> ipu3-cio2: Release the cio2 device context by media device callback),
> including failures in a few parts of the driver initialisation process in
> the MC framework.
> 
> Questions and comments are welcome.

Most of this series looks good.

You can add my:

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>

for patches 1-17, 19, 20 and 22.

As I mentioned in my review of patch 21, I don't think 18 and 21 belong to
this series.

I am also not keen on patch 25 (and thus 23 and 24 that it needs). Perhaps
take that out for now for more discussion later?

I would like to see some more drivers to be converted: specifically uvc
and the test drivers (vidtv, vim2m, vimc, visl, vivid). uvc because it is
widely used, the test drivers because they function as reference code.

Finally, I don't think this series addresses another life-cycle problem:
unbinding subdevices when they are still being used, either by userspace
or a bridge driver.

I have patches for that here:

https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=subdev-kref

I think these are pretty much independent from this patch series, but
I'll wait with posting them until this is merged.

Background: we have an fpga that implements subdevices and also
(depending on the configuration) complete v4l2 platform drivers.

When the fpga is unloaded when going to standby, subdevices and/or
platform drivers just disappear, and without correct life-time management
you get crashes. Basically exactly the same problem as you have with the
media device.

Regards,

	Hans

> 
> 
> Daniel Axtens (1):
>   media: uvcvideo: Refactor teardown of uvc on USB disconnect
> 
> Laurent Pinchart (1):
>   media: Add per-file-handle data support
> 
> Logan Gunthorpe (1):
>   media: utilize new cdev_device_add helper function
> 
> Sakari Ailus (23):
>   Revert "[media] media: fix media devnode ioctl/syscall and unregister
>     race"
>   Revert "media: utilize new cdev_device_add helper function"
>   Revert "[media] media: fix use-after-free in cdev_put() when app exits
>     after driver unbind"
>   Revert "media: uvcvideo: Refactor teardown of uvc on USB disconnect"
>   Revert "[media] media-device: dynamically allocate struct
>     media_devnode"
>   media device: Drop nop release callback
>   media: Do not call cdev_device_del() if cdev_device_add() fails
>   media-device: Delete character device early
>   media: Split initialising and adding media devnode
>   media: Shuffle functions around
>   media device: Initialise media devnode in media_device_init()
>   media device: Refcount the media device
>   v4l: Acquire a reference to the media device for every video device
>   media-device: Postpone graph object removal until free
>   omap3isp: Release the isp device struct by media device callback
>   omap3isp: Don't use devm_request_irq()
>   media: Add nop implementations of media_device_{init,cleanup}
>   media: ipu3-cio2: Call v4l2_device_unregister() earlier
>   media: ipu3-cio2: Don't use devm_request_irq()
>   media: ipu3-cio2: Release the cio2 device context by media device
>     callback
>   media: Maintain a list of open file handles in a media device
>   media: Implement best effort media device removal safety sans
>     refcounting
>   media: Document how Media device resources are released
> 
>  Documentation/driver-api/media/mc-core.rst    |  12 +-
>  drivers/media/cec/core/cec-core.c             |   2 +-
>  drivers/media/mc/mc-device.c                  | 279 +++++++++++-------
>  drivers/media/mc/mc-devnode.c                 |  94 +++---
>  drivers/media/pci/intel/ipu3/ipu3-cio2-main.c |  75 +++--
>  drivers/media/platform/ti/omap3isp/isp.c      |  33 ++-
>  drivers/media/usb/au0828/au0828-core.c        |   4 +-
>  drivers/media/usb/uvc/uvc_driver.c            |   2 +-
>  drivers/media/v4l2-core/v4l2-dev.c            |  13 +-
>  drivers/staging/media/sunxi/cedrus/cedrus.c   |   2 +-
>  include/media/media-device.h                  |  56 +++-
>  include/media/media-devnode.h                 |  99 ++++---
>  include/media/media-fh.h                      |  32 ++
>  13 files changed, 476 insertions(+), 227 deletions(-)
>  create mode 100644 include/media/media-fh.h
>
Sakari Ailus March 3, 2023, 11:23 a.m. UTC | #2
Hi Hans,

Many thanks for the review.

On Fri, Mar 03, 2023 at 10:07:38AM +0100, Hans Verkuil wrote:
> Hi Sakari,
> 
> On 01/02/2023 22:45, Sakari Ailus wrote:
> > Hi folks,
> > 
> > This is a refresh of my 2016 RFC patchset to start addressing object
> > lifetime issues in Media controller. It further allows continuing work to
> > address lifetime management of media entities.
> > 
> > The underlying problem is described in detail in v4 of the previous RFC:
> > <URL:https://lore.kernel.org/linux-media/20161108135438.GO3217@valkosipuli.retiisi.org.uk/>.
> > In brief, there is currently no connection between releasing media device
> > (and related) memory and IOCTL calls, meaning that there is a time window
> > during which released kernel memory can be accessed, and that access can be
> > triggered from the user space. The only reason why this is not a grave
> > security issue is that it is not triggerable by the user alone but requires
> > unbinding a device. That is still not an excuse for not fixing it.
> > 
> > This set differs from the earlier RFC to address the issue in the
> > following respects:
> > 
> > - Make changes for ipu3-cio2 driver, too.
> > 
> > - Continue to provide best effort attempt to keep the window between device
> >   removal and user space being able to access released memory as small as
> >   possible. This means the problem won't become worse for drivers for which
> >   Media device lifetime management has not been implemented.
> > 
> > The latter is achieved by adding a new object, Media devnode compat
> > reference, which is allocated, refcounted and eventually released by the
> > Media controller framework itself, and where the information on registration
> > and open filehandles is maintained. This is only done if the driver does not
> > manage the lifetime of the media device itself, i.e. its release operation
> > is NULL.
> > 
> > Due to this, Media device file handles will also be introduced by this
> > patchset. I thought the first user of this would be Media device events but
> > it seems we already need them here.
> > 
> > Both ipu3-cio2 and omap3isp drivers are relieved of devm_request_irq() use,
> > as device_release() releases the resources before calling the driver's
> > remove function. While further work will be required also on these drivers
> > to safely stop he hardware at unbind time, I don't see a reason not to merge
> > these patches now.
> > 
> > Some patches are temporarily reverted in order to make reworks easier, then
> > applied later on.
> > 
> > I've tested this on ipu3-cio2 with and without the refcounting patch (media:
> > ipu3-cio2: Release the cio2 device context by media device callback),
> > including failures in a few parts of the driver initialisation process in
> > the MC framework.
> > 
> > Questions and comments are welcome.
> 
> Most of this series looks good.
> 
> You can add my:
> 
> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> 
> for patches 1-17, 19, 20 and 22.

Thank you!

> 
> As I mentioned in my review of patch 21, I don't think 18 and 21 belong to
> this series.

Yes, some patches I wrote as part of this should be merged earlier. I think
I'll just pick them to my master branch once we have rc1.

> 
> I am also not keen on patch 25 (and thus 23 and 24 that it needs). Perhaps
> take that out for now for more discussion later?
> 
> I would like to see some more drivers to be converted: specifically uvc
> and the test drivers (vidtv, vim2m, vimc, visl, vivid). uvc because it is
> widely used, the test drivers because they function as reference code.

Sounds reasonable. Uvc especially should benefit from this. The conversion
isn't even difficult at all, but still requires testing.

> 
> Finally, I don't think this series addresses another life-cycle problem:
> unbinding subdevices when they are still being used, either by userspace
> or a bridge driver.

That is correct. I wanted to address this for the media device first and
tackle other problems once we have these patches in.

> 
> I have patches for that here:
> 
> https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=subdev-kref
> 
> I think these are pretty much independent from this patch series, but
> I'll wait with posting them until this is merged.

Interesting. I thought in practice addressing the problem for sub-devices
would require addressing media device lifetime management first. In
practice most drivers have one big allocation for everything and that can
be released only once all users are gone.

> 
> Background: we have an fpga that implements subdevices and also
> (depending on the configuration) complete v4l2 platform drivers.
> 
> When the fpga is unloaded when going to standby, subdevices and/or
> platform drivers just disappear, and without correct life-time management
> you get crashes. Basically exactly the same problem as you have with the
> media device.

Yes.

Have you posted the subdev-kref patches to the list yet?
Hans Verkuil March 3, 2023, 11:27 a.m. UTC | #3
On 03/03/2023 12:23, Sakari Ailus wrote:
> Hi Hans,
> 
> Many thanks for the review.
> 
> On Fri, Mar 03, 2023 at 10:07:38AM +0100, Hans Verkuil wrote:
>> Hi Sakari,
>>
>> On 01/02/2023 22:45, Sakari Ailus wrote:
>>> Hi folks,
>>>
>>> This is a refresh of my 2016 RFC patchset to start addressing object
>>> lifetime issues in Media controller. It further allows continuing work to
>>> address lifetime management of media entities.
>>>
>>> The underlying problem is described in detail in v4 of the previous RFC:
>>> <URL:https://lore.kernel.org/linux-media/20161108135438.GO3217@valkosipuli.retiisi.org.uk/>.
>>> In brief, there is currently no connection between releasing media device
>>> (and related) memory and IOCTL calls, meaning that there is a time window
>>> during which released kernel memory can be accessed, and that access can be
>>> triggered from the user space. The only reason why this is not a grave
>>> security issue is that it is not triggerable by the user alone but requires
>>> unbinding a device. That is still not an excuse for not fixing it.
>>>
>>> This set differs from the earlier RFC to address the issue in the
>>> following respects:
>>>
>>> - Make changes for ipu3-cio2 driver, too.
>>>
>>> - Continue to provide best effort attempt to keep the window between device
>>>   removal and user space being able to access released memory as small as
>>>   possible. This means the problem won't become worse for drivers for which
>>>   Media device lifetime management has not been implemented.
>>>
>>> The latter is achieved by adding a new object, Media devnode compat
>>> reference, which is allocated, refcounted and eventually released by the
>>> Media controller framework itself, and where the information on registration
>>> and open filehandles is maintained. This is only done if the driver does not
>>> manage the lifetime of the media device itself, i.e. its release operation
>>> is NULL.
>>>
>>> Due to this, Media device file handles will also be introduced by this
>>> patchset. I thought the first user of this would be Media device events but
>>> it seems we already need them here.
>>>
>>> Both ipu3-cio2 and omap3isp drivers are relieved of devm_request_irq() use,
>>> as device_release() releases the resources before calling the driver's
>>> remove function. While further work will be required also on these drivers
>>> to safely stop he hardware at unbind time, I don't see a reason not to merge
>>> these patches now.
>>>
>>> Some patches are temporarily reverted in order to make reworks easier, then
>>> applied later on.
>>>
>>> I've tested this on ipu3-cio2 with and without the refcounting patch (media:
>>> ipu3-cio2: Release the cio2 device context by media device callback),
>>> including failures in a few parts of the driver initialisation process in
>>> the MC framework.
>>>
>>> Questions and comments are welcome.
>>
>> Most of this series looks good.
>>
>> You can add my:
>>
>> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>
>> for patches 1-17, 19, 20 and 22.
> 
> Thank you!
> 
>>
>> As I mentioned in my review of patch 21, I don't think 18 and 21 belong to
>> this series.
> 
> Yes, some patches I wrote as part of this should be merged earlier. I think
> I'll just pick them to my master branch once we have rc1.
> 
>>
>> I am also not keen on patch 25 (and thus 23 and 24 that it needs). Perhaps
>> take that out for now for more discussion later?
>>
>> I would like to see some more drivers to be converted: specifically uvc
>> and the test drivers (vidtv, vim2m, vimc, visl, vivid). uvc because it is
>> widely used, the test drivers because they function as reference code.
> 
> Sounds reasonable. Uvc especially should benefit from this. The conversion
> isn't even difficult at all, but still requires testing.
> 
>>
>> Finally, I don't think this series addresses another life-cycle problem:
>> unbinding subdevices when they are still being used, either by userspace
>> or a bridge driver.
> 
> That is correct. I wanted to address this for the media device first and
> tackle other problems once we have these patches in.
> 
>>
>> I have patches for that here:
>>
>> https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=subdev-kref
>>
>> I think these are pretty much independent from this patch series, but
>> I'll wait with posting them until this is merged.
> 
> Interesting. I thought in practice addressing the problem for sub-devices
> would require addressing media device lifetime management first. In
> practice most drivers have one big allocation for everything and that can
> be released only once all users are gone.
> 
>>
>> Background: we have an fpga that implements subdevices and also
>> (depending on the configuration) complete v4l2 platform drivers.
>>
>> When the fpga is unloaded when going to standby, subdevices and/or
>> platform drivers just disappear, and without correct life-time management
>> you get crashes. Basically exactly the same problem as you have with the
>> media device.
> 
> Yes.
> 
> Have you posted the subdev-kref patches to the list yet?
> 

No, I don't believe I did. I've been sitting on them waiting for this series,
basically.

But we (Cisco) have been using these patches for some time now. But that's
on a really old kernel :-(

I also need to double-check vimc as well: I have a memory that I had to make
changes there, but I will have to retest it.

Regards,

	Hans
Sakari Ailus March 3, 2023, 4:54 p.m. UTC | #4
Hi Hans,

On Fri, Mar 03, 2023 at 01:23:52PM +0200, Sakari Ailus wrote:
> > As I mentioned in my review of patch 21, I don't think 18 and 21 belong to
> > this series.
> 
> Yes, some patches I wrote as part of this should be merged earlier. I think
> I'll just pick them to my master branch once we have rc1.

I originally thought that you meant "media: Add nop implementations of
media_device_{init,cleanup}". The omap3isp devm_request_irq() patch can be
dropped and the ipu3-cio2 requivalent changed as I discussed in the other
e-mail.