mbox series

[0/1] pmem: allow user to set QUEUE_FLAG_NOWAIT

Message ID 20230512104302.8527-1-kch@nvidia.com (mailing list archive)
Headers show
Series pmem: allow user to set QUEUE_FLAG_NOWAIT | expand

Message

Chaitanya Kulkarni May 12, 2023, 10:43 a.m. UTC
Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
parameter to retain the default behaviour. Also, update respective
allocation flags in the write path. Following are the performance
numbers with io_uring fio engine for random read, note that device has
been populated fully with randwrite workload before taking these
numbers :-

* linux-block (for-next) # grep IOPS  pmem*fio | column -t

default-nowait-off-1.fio:  read:  IOPS=3968k,  BW=15.1GiB/s
default-nowait-off-2.fio:  read:  IOPS=4084k,  BW=15.6GiB/s
default-nowait-off-3.fio:  read:  IOPS=3995k,  BW=15.2GiB/s

nowait-on-1.fio:           read:  IOPS=5909k,  BW=22.5GiB/s
nowait-on-2.fio:           read:  IOPS=5997k,  BW=22.9GiB/s
nowait-on-3.fio:           read:  IOPS=6006k,  BW=22.9GiB/s

* linux-block (for-next) # grep cpu  pmem*fio | column -t

default-nowait-off-1.fio:  cpu  :  usr=6.38%,   sys=31.37%,  ctx=220427659
default-nowait-off-2.fio:  cpu  :  usr=6.19%,   sys=31.45%,  ctx=229825635
default-nowait-off-3.fio:  cpu  :  usr=6.17%,   sys=31.22%,  ctx=221896158

nowait-on-1.fio:           cpu  :  usr=10.56%,  sys=87.82%,  ctx=24730   
nowait-on-2.fio:           cpu  :  usr=9.92%,   sys=88.36%,  ctx=23427   
nowait-on-3.fio:           cpu  :  usr=9.85%,   sys=89.04%,  ctx=23237   

* linux-block (for-next) # grep slat  pmem*fio | column -t
default-nowait-off-1.fio:  slat  (nsec):  min=431,   max=50423k,  avg=9424.06
default-nowait-off-2.fio:  slat  (nsec):  min=420,   max=35992k,  avg=9193.94
default-nowait-off-3.fio:  slat  (nsec):  min=430,   max=40737k,  avg=9244.24

nowait-on-1.fio:           slat  (nsec):  min=1232,  max=40098k,  avg=7518.60
nowait-on-2.fio:           slat  (nsec):  min=1303,  max=52107k,  avg=7423.37
nowait-on-3.fio:           slat  (nsec):  min=1123,  max=40193k,  avg=7409.08

Please let me know if further testing is needed I've ran fio
verification job in order to make verify these changes.

Chaitanya Kulkarni (1):
  pmem: allow user to set QUEUE_FLAG_NOWAIT

 drivers/nvdimm/pmem.c | 6 ++++++
 1 file changed, 6 insertions(+)

linux-block (for-next) # sh test-pmem.sh
+ git log -1
commit 6df7042a11e06465b1b8f275170cb5593d8d7dcc (HEAD -> for-next)
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date:   Fri May 12 03:24:54 2023 -0700

    pmem: allow user to set QUEUE_FLAG_NOWAIT
    
    Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
    parameter to retain the default behaviour. Also, update respective
    allocation flags in the write path. Following are the performance
    numbers with io_uring fio engine for random read, note that device has
    been populated fully with randwrite workload before taking these
    numbers :-
+ rmmod nd_pmem
rmmod: ERROR: Module nd_pmem is not currently loaded
+ makej M=drivers/nvdimm
+ insmod drivers/nvdimm/nd_pmem.ko
+ sleep 1
+ test_pmem default-nowait-off
+ sleep 1
+ fio fio/verify.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0
write-and-verify: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=16
fio-3.34
Starting 1 process
Jobs: 1 (f=1)
write-and-verify: (groupid=0, jobs=1): err= 0: pid=5358: Fri May 12 03:25:49 2023
  read: IOPS=266k, BW=1039MiB/s (1089MB/s)(566MiB/545msec)
    slat (nsec): min=511, max=49865, avg=2732.59, stdev=1231.82
    clat (nsec): min=1703, max=134486, avg=56488.86, stdev=7471.91
     lat (usec): min=6, max=138, avg=59.22, stdev= 7.76
    clat percentiles (usec):
     |  1.00th=[   43],  5.00th=[   47], 10.00th=[   49], 20.00th=[   51],
     | 30.00th=[   53], 40.00th=[   55], 50.00th=[   57], 60.00th=[   58],
     | 70.00th=[   60], 80.00th=[   62], 90.00th=[   65], 95.00th=[   70],
     | 99.00th=[   83], 99.50th=[   90], 99.90th=[  103], 99.95th=[  111],
     | 99.99th=[  124]
  write: IOPS=214k, BW=835MiB/s (876MB/s)(896MiB/1073msec); 0 zone resets
    slat (nsec): min=1473, max=92145, avg=4049.04, stdev=1645.45
    clat (usec): min=29, max=232, avg=70.52, stdev=12.86
     lat (usec): min=33, max=234, avg=74.57, stdev=13.54
    clat percentiles (usec):
     |  1.00th=[   44],  5.00th=[   53], 10.00th=[   56], 20.00th=[   61],
     | 30.00th=[   65], 40.00th=[   68], 50.00th=[   71], 60.00th=[   73],
     | 70.00th=[   76], 80.00th=[   79], 90.00th=[   85], 95.00th=[   92],
     | 99.00th=[  112], 99.50th=[  121], 99.90th=[  151], 99.95th=[  165],
     | 99.99th=[  188]
   bw (  KiB/s): min=115224, max=909344, per=71.53%, avg=611669.33, stdev=432768.96, samples=3
   iops        : min=28806, max=227336, avg=152917.33, stdev=108192.24, samples=3
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=7.92%, 100=90.44%
  lat (usec)   : 250=1.64%
  cpu          : usr=41.68%, sys=55.78%, ctx=6691, majf=0, minf=3975
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=144933,229376,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=1039MiB/s (1089MB/s), 1039MiB/s-1039MiB/s (1089MB/s-1089MB/s), io=566MiB (594MB), run=545-545msec
  WRITE: bw=835MiB/s (876MB/s), 835MiB/s-835MiB/s (876MB/s-876MB/s), io=896MiB (940MB), run=1073-1073msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
+ fio fio/randwrite.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0
RANDWRITE: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=2
...
fio-3.34
Starting 48 processes
Jobs: 43 (f=43): [w(16),_(1),w(1),_(1),w(1),_(1),w(11),_(1),w(14),_(1)][80.0%][w=9744MiB/s][w=2494k IOPS][eta 00m:01s]
RANDWRITE: (groupid=0, jobs=48): err= 0: pid=5377: Fri May 12 03:25:54 2023
  write: IOPS=2400k, BW=9374MiB/s (9829MB/s)(42.0GiB/4588msec); 0 zone resets
    slat (nsec): min=411, max=9672.1k, avg=6856.45, stdev=13756.39
    clat (nsec): min=70, max=12541k, avg=28576.27, stdev=32832.06
     lat (nsec): min=1583, max=12543k, avg=35432.72, stdev=34424.05
    clat percentiles (nsec):
     |  1.00th=[   916],  5.00th=[  2288], 10.00th=[  4896], 20.00th=[ 10560],
     | 30.00th=[ 17792], 40.00th=[ 22144], 50.00th=[ 25984], 60.00th=[ 29824],
     | 70.00th=[ 34048], 80.00th=[ 39680], 90.00th=[ 51456], 95.00th=[ 65280],
     | 99.00th=[102912], 99.50th=[122368], 99.90th=[199680], 99.95th=[276480],
     | 99.99th=[888832]
   bw (  MiB/s): min= 8098, max=13617, per=100.00%, avg=10176.24, stdev=45.01, samples=366
   iops        : min=2073313, max=3486177, avg=2605112.23, stdev=11521.54, samples=366
  lat (nsec)   : 100=0.01%, 250=0.47%, 500=0.36%, 750=0.08%, 1000=0.14%
  lat (usec)   : 2=2.52%, 4=5.05%, 10=10.81%, 20=15.78%, 50=53.95%
  lat (usec)   : 100=9.72%, 250=1.06%, 500=0.04%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=6.89%, sys=29.96%, ctx=6989842, majf=0, minf=584
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,11010048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=9374MiB/s (9829MB/s), 9374MiB/s-9374MiB/s (9829MB/s-9829MB/s), io=42.0GiB (45.1GB), run=4588-4588msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
+ for i in 1 2 3
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-default-nowait-off-1.fio
+ for i in 1 2 3 [r(48)][100.0%][r=16.6GiB/s][r=4348k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-default-nowait-off-2.fio
+ for i in 1 2 3 [r(48)][100.0%][r=15.8GiB/s][r=4138k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-default-nowait-off-3.fio
+ rmmod nd_pmem: [r(48)][100.0%][r=16.6GiB/s][r=4346k IOPS][eta 00m:00s]
+ insmod drivers/nvdimm/nd_pmem.ko nowait=1
+ sleep 1
+ test_pmem nowait-on
+ sleep 1
+ fio fio/verify.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0
write-and-verify: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=16
fio-3.34
Starting 1 process
Jobs: 1 (f=1)
write-and-verify: (groupid=0, jobs=1): err= 0: pid=6062: Fri May 12 03:28:59 2023
  read: IOPS=492k, BW=1923MiB/s (2016MB/s)(567MiB/295msec)
    slat (nsec): min=1021, max=45136, avg=1220.20, stdev=473.55
    clat (nsec): min=812, max=79261, avg=30452.86, stdev=3469.19
     lat (nsec): min=1944, max=81274, avg=31673.05, stdev=3575.81
    clat percentiles (nsec):
     |  1.00th=[28288],  5.00th=[28800], 10.00th=[29056], 20.00th=[29056],
     | 30.00th=[29312], 40.00th=[29568], 50.00th=[29568], 60.00th=[29824],
     | 70.00th=[30080], 80.00th=[30336], 90.00th=[30848], 95.00th=[37120],
     | 99.00th=[48384], 99.50th=[49408], 99.90th=[58112], 99.95th=[59648],
     | 99.99th=[78336]
  write: IOPS=215k, BW=839MiB/s (880MB/s)(896MiB/1068msec); 0 zone resets
    slat (usec): min=2, max=122, avg= 4.30, stdev= 1.52
    clat (nsec): min=401, max=190492, avg=69931.70, stdev=9390.70
     lat (usec): min=3, max=289, avg=74.23, stdev= 9.87
    clat percentiles (usec):
     |  1.00th=[   54],  5.00th=[   58], 10.00th=[   61], 20.00th=[   64],
     | 30.00th=[   67], 40.00th=[   69], 50.00th=[   70], 60.00th=[   72],
     | 70.00th=[   74], 80.00th=[   76], 90.00th=[   79], 95.00th=[   83],
     | 99.00th=[   96], 99.50th=[  122], 99.90th=[  161], 99.95th=[  165],
     | 99.99th=[  176]
   bw (  KiB/s): min=811952, max=899120, per=99.59%, avg=855536.00, stdev=61637.08, samples=2
   iops        : min=202988, max=224780, avg=213884.00, stdev=15409.27, samples=2
  lat (nsec)   : 500=0.01%, 1000=0.01%
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=38.69%, 100=60.80%
  lat (usec)   : 250=0.51%
  cpu          : usr=38.33%, sys=61.60%, ctx=1, majf=0, minf=3984
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=145223,229376,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=1923MiB/s (2016MB/s), 1923MiB/s-1923MiB/s (2016MB/s-2016MB/s), io=567MiB (595MB), run=295-295msec
  WRITE: bw=839MiB/s (880MB/s), 839MiB/s-839MiB/s (880MB/s-880MB/s), io=896MiB (940MB), run=1068-1068msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
+ fio fio/randwrite.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0
RANDWRITE: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=2
...
fio-3.34
Starting 48 processes
Jobs: 48 (f=48)
RANDWRITE: (groupid=0, jobs=48): err= 0: pid=6065: Fri May 12 03:29:00 2023
  write: IOPS=10.6M, BW=40.3GiB/s (43.2GB/s)(42.0GiB/1043msec); 0 zone resets
    slat (nsec): min=1162, max=10395k, avg=3946.17, stdev=6436.85
    clat (nsec): min=70, max=10396k, avg=4608.73, stdev=6810.12
     lat (nsec): min=1282, max=10403k, avg=8554.90, stdev=9532.53
    clat percentiles (nsec):
     |  1.00th=[ 2224],  5.00th=[ 2544], 10.00th=[ 2800], 20.00th=[ 3184],
     | 30.00th=[ 3472], 40.00th=[ 3760], 50.00th=[ 4080], 60.00th=[ 4448],
     | 70.00th=[ 4896], 80.00th=[ 5408], 90.00th=[ 6304], 95.00th=[ 7200],
     | 99.00th=[14016], 99.50th=[27776], 99.90th=[42752], 99.95th=[46848],
     | 99.99th=[80384]
   bw (  MiB/s): min=40342, max=42969, per=100.00%, avg=41656.06, stdev=29.12, samples=93
   iops        : min=10327717, max=11000181, avg=10663949.00, stdev=7454.42, samples=93
  lat (nsec)   : 100=0.01%, 250=0.03%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.28%, 4=47.32%, 10=50.83%, 20=0.91%, 50=0.58%
  lat (usec)   : 100=0.02%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=15.39%, sys=83.72%, ctx=1002, majf=0, minf=580
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,11010048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=40.3GiB/s (43.2GB/s), 40.3GiB/s-40.3GiB/s (43.2GB/s-43.2GB/s), io=42.0GiB (45.1GB), run=1043-1043msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
+ for i in 1 2 3
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-nowait-on-1.fio
+ for i in 1 2 3 [r(48)][100.0%][r=22.8GiB/s][r=5987k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-nowait-on-2.fio
+ for i in 1 2 3 [r(48)][100.0%][r=22.8GiB/s][r=5990k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=896M --filename=/dev/pmem0 --output=pmem-nowait-on-3.fio
+ rmmod nd_pmem: [r(48)][100.0%][r=23.0GiB/s][r=6016k IOPS][eta 00m:00s]
linux-block (for-next) # for i in IOPS slat cpu; do grep $i bc-*fio | column -t ; done
linux-block (for-next) # for i in IOPS slat cpu; do grep $i pmem-*fio | column -t ; done
pmem-default-nowait-off-1.fio:  read:  IOPS=3968k,  BW=15.1GiB/s  (16.3GB/s)(908GiB/60002msec)
pmem-default-nowait-off-2.fio:  read:  IOPS=4084k,  BW=15.6GiB/s  (16.7GB/s)(935GiB/60001msec)
pmem-default-nowait-off-3.fio:  read:  IOPS=3995k,  BW=15.2GiB/s  (16.4GB/s)(914GiB/60002msec)
pmem-nowait-on-1.fio:           read:  IOPS=5909k,  BW=22.5GiB/s  (24.2GB/s)(1352GiB/60003msec)
pmem-nowait-on-2.fio:           read:  IOPS=5997k,  BW=22.9GiB/s  (24.6GB/s)(1373GiB/60002msec)
pmem-nowait-on-3.fio:           read:  IOPS=6006k,  BW=22.9GiB/s  (24.6GB/s)(1375GiB/60002msec)
pmem-default-nowait-off-1.fio:  slat  (nsec):  min=431,   max=50423k,  avg=9424.06,  stdev=19769.73
pmem-default-nowait-off-2.fio:  slat  (nsec):  min=420,   max=35992k,  avg=9193.94,  stdev=19814.91
pmem-default-nowait-off-3.fio:  slat  (nsec):  min=430,   max=40737k,  avg=9244.24,  stdev=22646.40
pmem-nowait-on-1.fio:           slat  (nsec):  min=1232,  max=40098k,  avg=7518.60,  stdev=26037.75
pmem-nowait-on-2.fio:           slat  (nsec):  min=1303,  max=52107k,  avg=7423.37,  stdev=24122.06
pmem-nowait-on-3.fio:           slat  (nsec):  min=1123,  max=40193k,  avg=7409.08,  stdev=17630.05
pmem-default-nowait-off-1.fio:  cpu  :  usr=6.38%,   sys=31.37%,  ctx=220427659,  majf=0,  minf=641
pmem-default-nowait-off-2.fio:  cpu  :  usr=6.19%,   sys=31.45%,  ctx=229825635,  majf=0,  minf=639
pmem-default-nowait-off-3.fio:  cpu  :  usr=6.17%,   sys=31.22%,  ctx=221896158,  majf=0,  minf=650
pmem-nowait-on-1.fio:           cpu  :  usr=10.56%,  sys=87.82%,  ctx=24730,      majf=0,  minf=784
pmem-nowait-on-2.fio:           cpu  :  usr=9.92%,   sys=88.36%,  ctx=23427,      majf=0,  minf=720
pmem-nowait-on-3.fio:           cpu  :  usr=9.85%,   sys=89.04%,  ctx=23237,      majf=0,  minf=724
linux-block (for-next) #

Comments

Christoph Hellwig May 12, 2023, 1:29 p.m. UTC | #1
On Fri, May 12, 2023 at 03:43:01AM -0700, Chaitanya Kulkarni wrote:
> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
> parameter to retain the default behaviour.

Why?
Chaitanya Kulkarni May 13, 2023, 12:54 a.m. UTC | #2
On 5/12/23 06:29, Christoph Hellwig wrote:
> On Fri, May 12, 2023 at 03:43:01AM -0700, Chaitanya Kulkarni wrote:
>> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
>> parameter to retain the default behaviour.
> Why?

not needed, sending v2 without module parameter.

-ck