mbox series

[0/4] mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()

Message ID 20200519002124.2025955-1-jhubbard@nvidia.com (mailing list archive)
Headers show
Series mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages() | expand

Message

John Hubbard May 19, 2020, 12:21 a.m. UTC
This needs to go through Andrew's -mm tree, due to adding a new gup.c
routine. However, I would really love to have some testing from the
drm/i915 folks, because I haven't been able to run-time test that part
of it.

Otherwise, though, the series has passed my basic run time testing:
some LTP tests, some xfs and etx4 non-destructive xfstests, and an
assortment of other smaller ones: vm selftests, io_uring_register, a
few more. But that's only on one particular machine. Also, cross-compile
tests for half a dozen arches all pass.

Details:

In order to convert the drm/i915 driver from get_user_pages() to
pin_user_pages(), a FOLL_PIN equivalent of __get_user_pages_fast() was
required. That led to refactoring __get_user_pages_fast(), with the
following goals:

1) As above: provide a pin_user_pages*() routine for drm/i915 to call,
   in place of __get_user_pages_fast(),

2) Get rid of the gup.c duplicate code for walking page tables with
   interrupts disabled. This duplicate code is a minor maintenance
   problem anyway.

3) Make it easy for an upcoming patch from Souptick, which aims to
   convert __get_user_pages_fast() to use a gup_flags argument, instead
   of a bool writeable arg.  Also, if this series looks good, we can
   ask Souptick to change the name as well, to whatever the consensus
   is. My initial recommendation is: get_user_pages_fast_only(), to
   match the new pin_user_pages_only().

John Hubbard (4):
  mm/gup: move __get_user_pages_fast() down a few lines in gup.c
  mm/gup: refactor and de-duplicate gup_fast() code
  mm/gup: introduce pin_user_pages_fast_only()
  drm/i915: convert get_user_pages() --> pin_user_pages()

 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  22 +--
 include/linux/mm.h                          |   3 +
 mm/gup.c                                    | 150 ++++++++++++--------
 3 files changed, 107 insertions(+), 68 deletions(-)


base-commit: 642b151f45dd54809ea00ecd3976a56c1ec9b53d

Comments

Chris Wilson May 21, 2020, 6:57 p.m. UTC | #1
Quoting John Hubbard (2020-05-19 01:21:20)
> This needs to go through Andrew's -mm tree, due to adding a new gup.c
> routine. However, I would really love to have some testing from the
> drm/i915 folks, because I haven't been able to run-time test that part
> of it.

CI hit

<4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699 internal_get_user_pages_fast+0x63a/0xac0
<4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei intel_lpss_pci prime_numbers
<4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G     U            5.7.0-rc5-CI-Patchwork_17704+ #1
<4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
<4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
<4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
<4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
<4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
<4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
<4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
<4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
<4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
<4> [185.667806] FS:  00007f419e875e40(0000) GS:ffff88849fe00000(0000) knlGS:0000000000000000
<4> [185.667808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
<4> [185.667814] PKRU: 55555554
<4> [185.667816] Call Trace:
<4> [185.667912]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
<4> [185.667918]  ? mark_held_locks+0x49/0x70
<4> [185.667998]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
<4> [185.668073]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]

and then panicked, across a range of systems.
-Chris
John Hubbard May 21, 2020, 7:11 p.m. UTC | #2
On 2020-05-21 11:57, Chris Wilson wrote:
> Quoting John Hubbard (2020-05-19 01:21:20)
>> This needs to go through Andrew's -mm tree, due to adding a new gup.c
>> routine. However, I would really love to have some testing from the
>> drm/i915 folks, because I haven't been able to run-time test that part
>> of it.
> 
> CI hit
> 
> <4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699 internal_get_user_pages_fast+0x63a/0xac0
> <4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei intel_lpss_pci prime_numbers
> <4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G     U            5.7.0-rc5-CI-Patchwork_17704+ #1
> <4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
> <4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
> <4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
> <4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
> <4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
> <4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
> <4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
> <4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
> <4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
> <4> [185.667806] FS:  00007f419e875e40(0000) GS:ffff88849fe00000(0000) knlGS:0000000000000000
> <4> [185.667808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
> <4> [185.667814] PKRU: 55555554
> <4> [185.667816] Call Trace:
> <4> [185.667912]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
> <4> [185.667918]  ? mark_held_locks+0x49/0x70
> <4> [185.667998]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
> <4> [185.668073]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
> 
> and then panicked, across a range of systems.
> -Chris
> 

Thanks for this report! I'm looking into it now.

thanks,
Souptick Joarder May 22, 2020, 11:40 a.m. UTC | #3
Hi John,


On Tue, May 19, 2020 at 5:51 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> This needs to go through Andrew's -mm tree, due to adding a new gup.c
> routine. However, I would really love to have some testing from the
> drm/i915 folks, because I haven't been able to run-time test that part
> of it.
>
> Otherwise, though, the series has passed my basic run time testing:
> some LTP tests, some xfs and etx4 non-destructive xfstests, and an
> assortment of other smaller ones: vm selftests, io_uring_register, a
> few more. But that's only on one particular machine. Also, cross-compile
> tests for half a dozen arches all pass.
>
> Details:
>
> In order to convert the drm/i915 driver from get_user_pages() to
> pin_user_pages(), a FOLL_PIN equivalent of __get_user_pages_fast() was
> required. That led to refactoring __get_user_pages_fast(), with the
> following goals:
>
> 1) As above: provide a pin_user_pages*() routine for drm/i915 to call,
>    in place of __get_user_pages_fast(),
>
> 2) Get rid of the gup.c duplicate code for walking page tables with
>    interrupts disabled. This duplicate code is a minor maintenance
>    problem anyway.
>
> 3) Make it easy for an upcoming patch from Souptick, which aims to
>    convert __get_user_pages_fast() to use a gup_flags argument, instead
>    of a bool writeable arg.  Also, if this series looks good, we can
>    ask Souptick to change the name as well, to whatever the consensus
>    is. My initial recommendation is: get_user_pages_fast_only(), to
>    match the new pin_user_pages_only().

Shall I hold my changes till 5.8-rc1 , when this series will appear upstream ?
>
> John Hubbard (4):
>   mm/gup: move __get_user_pages_fast() down a few lines in gup.c
>   mm/gup: refactor and de-duplicate gup_fast() code
>   mm/gup: introduce pin_user_pages_fast_only()
>   drm/i915: convert get_user_pages() --> pin_user_pages()
>
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  22 +--
>  include/linux/mm.h                          |   3 +
>  mm/gup.c                                    | 150 ++++++++++++--------
>  3 files changed, 107 insertions(+), 68 deletions(-)
>
>
> base-commit: 642b151f45dd54809ea00ecd3976a56c1ec9b53d
> --
> 2.26.2
>
John Hubbard May 22, 2020, 10:22 p.m. UTC | #4
On 2020-05-22 04:40, Souptick Joarder wrote:
...
>> 3) Make it easy for an upcoming patch from Souptick, which aims to
>>     convert __get_user_pages_fast() to use a gup_flags argument, instead
>>     of a bool writeable arg.  Also, if this series looks good, we can
>>     ask Souptick to change the name as well, to whatever the consensus
>>     is. My initial recommendation is: get_user_pages_fast_only(), to
>>     match the new pin_user_pages_only().
> 
> Shall I hold my changes till 5.8-rc1 , when this series will appear upstream ?

I don't really see any problem with your posting something that is based on
the latest linux-next (which has my changes now). Should be fine. And in
fact it would be nice to get that done in this round, so that the pin* and
get* APIs look the same.


thanks,