Message ID | 20180530135421.GA81788@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, 30 May 2018, Mike Snitzer wrote: > On Wed, May 30 2018 at 9:33am -0400, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > On Wed, May 30 2018 at 9:21am -0400, > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > > > That is really great news, can you submit an incremental patch that > > > > > layers ontop of the linux-dm.git 'dm-4.18' branch? > > > > > > > > > > Thanks, > > > > > Mike > > > > > > > > I've sent the current version that I have. I fixed the bugs that were > > > > reported here (missing DAX, dm_bufio_client_create, __branch_check__ > > > > long->int truncation). > > > > > > OK, but a monolithic dm-writecache.c is no longer useful to me. I can > > > drop Arnd's gcc warning fix (with the idea that Ingo or Steve will take > > > your __branch_check__ patch). Not sure what the dm_bufio_client_create > > > fix is... must've missed a report about that. > > > > > > ANyway, point is we're on too a different phase of dm-writecache.c's > > > development. I've picked it up and am trying to get it ready for the > > > 4.18 merge window (likely opening Sunday). Therefore it needs to be in > > > a git tree, and incremental changes overlayed. I cannot be rebasing at > > > this late stage in the 4.18 development window. > > > > > > Thanks, > > > Mike > > > > I downloaded dm-writecache from your git repository some times ago - but > > you changed a lot of useless things (i.e. reordering the fields in the > > structure) since that time - so, you'll have to merge the changes. > > Fine I'll deal with it. reordering the fields eliminated holes in the > structure and reduced struct members spanning cache lines. And what about this? #define WC_MODE_PMEM(wc) ((wc)->pmem_mode) The code that I had just allowed the compiler to optimize out persistent-memory code if we have DM_WRITECACHE_ONLY_SSD defined - and you deleted it. Most architectures don't have persistent memory and the dm-writecache driver could work in ssd-only mode on them. On these architectures, I define #define WC_MODE_PMEM(wc) false - and the compiler will just automatically remove the tests for that condition and the unused branch. It does also eliminate unused static functions. Mikulas
On Wed, May 30 2018 at 10:09am -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > On Wed, May 30 2018 at 9:33am -0400, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > On Wed, May 30 2018 at 9:21am -0400, > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > > > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > > > > > That is really great news, can you submit an incremental patch that > > > > > > layers ontop of the linux-dm.git 'dm-4.18' branch? > > > > > > > > > > > > Thanks, > > > > > > Mike > > > > > > > > > > I've sent the current version that I have. I fixed the bugs that were > > > > > reported here (missing DAX, dm_bufio_client_create, __branch_check__ > > > > > long->int truncation). > > > > > > > > OK, but a monolithic dm-writecache.c is no longer useful to me. I can > > > > drop Arnd's gcc warning fix (with the idea that Ingo or Steve will take > > > > your __branch_check__ patch). Not sure what the dm_bufio_client_create > > > > fix is... must've missed a report about that. > > > > > > > > ANyway, point is we're on too a different phase of dm-writecache.c's > > > > development. I've picked it up and am trying to get it ready for the > > > > 4.18 merge window (likely opening Sunday). Therefore it needs to be in > > > > a git tree, and incremental changes overlayed. I cannot be rebasing at > > > > this late stage in the 4.18 development window. > > > > > > > > Thanks, > > > > Mike > > > > > > I downloaded dm-writecache from your git repository some times ago - but > > > you changed a lot of useless things (i.e. reordering the fields in the > > > structure) since that time - so, you'll have to merge the changes. > > > > Fine I'll deal with it. reordering the fields eliminated holes in the > > structure and reduced struct members spanning cache lines. > > And what about this? > #define WC_MODE_PMEM(wc) ((wc)->pmem_mode) > > The code that I had just allowed the compiler to optimize out > persistent-memory code if we have DM_WRITECACHE_ONLY_SSD defined - and you > deleted it. > > Most architectures don't have persistent memory and the dm-writecache > driver could work in ssd-only mode on them. On these architectures, I > define > #define WC_MODE_PMEM(wc) false > - and the compiler will just automatically remove the tests for that > condition and the unused branch. It does also eliminate unused static > functions. This level of microoptimization can be backfilled. But as it was, there were too many #defines. And I'm really not concerned with eliminating unused static functions for this case.
On Wed, 30 May 2018, Mike Snitzer wrote: > On Wed, May 30 2018 at 10:09am -0400, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > On Wed, May 30 2018 at 9:33am -0400, > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > > > On Wed, May 30 2018 at 9:21am -0400, > > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > > > > > > > That is really great news, can you submit an incremental patch that > > > > > > > layers ontop of the linux-dm.git 'dm-4.18' branch? > > > > > > > > > > > > > > Thanks, > > > > > > > Mike > > > > > > > > > > > > I've sent the current version that I have. I fixed the bugs that were > > > > > > reported here (missing DAX, dm_bufio_client_create, __branch_check__ > > > > > > long->int truncation). > > > > > > > > > > OK, but a monolithic dm-writecache.c is no longer useful to me. I can > > > > > drop Arnd's gcc warning fix (with the idea that Ingo or Steve will take > > > > > your __branch_check__ patch). Not sure what the dm_bufio_client_create > > > > > fix is... must've missed a report about that. > > > > > > > > > > ANyway, point is we're on too a different phase of dm-writecache.c's > > > > > development. I've picked it up and am trying to get it ready for the > > > > > 4.18 merge window (likely opening Sunday). Therefore it needs to be in > > > > > a git tree, and incremental changes overlayed. I cannot be rebasing at > > > > > this late stage in the 4.18 development window. > > > > > > > > > > Thanks, > > > > > Mike > > > > > > > > I downloaded dm-writecache from your git repository some times ago - but > > > > you changed a lot of useless things (i.e. reordering the fields in the > > > > structure) since that time - so, you'll have to merge the changes. > > > > > > Fine I'll deal with it. reordering the fields eliminated holes in the > > > structure and reduced struct members spanning cache lines. > > > > And what about this? > > #define WC_MODE_PMEM(wc) ((wc)->pmem_mode) > > > > The code that I had just allowed the compiler to optimize out > > persistent-memory code if we have DM_WRITECACHE_ONLY_SSD defined - and you > > deleted it. > > > > Most architectures don't have persistent memory and the dm-writecache > > driver could work in ssd-only mode on them. On these architectures, I > > define > > #define WC_MODE_PMEM(wc) false > > - and the compiler will just automatically remove the tests for that > > condition and the unused branch. It does also eliminate unused static > > functions. > > This level of microoptimization can be backfilled. But as it was, there > were too many #defines. And I'm really not concerned with eliminating > unused static functions for this case. I don't see why "too many defines" would be a problem. If I compile it with and without pmem support, the difference is 15kB-vs-12kB. If we look at just one function (writecache_map), the difference is 1595 bytes - vs - 1280 bytes. So, it produces real savings in code size. The problem with performance is not caused a condition that always jumps the same way (that is predicted by the CPU and it causes no delays in the pipeline) - the problem is that a bigger function consumes more i-cache. There is no reason to include code that can't be executed. Note that we should also redefine pmem_assign on architectures that don't support persistent memory: #ifndef DM_WRITECACHE_ONLY_SSD #define pmem_assign(dest, src) \ do { \ typeof(dest) uniq = (src); \ memcpy_flushcache(&(dest), &uniq, sizeof(dest)); \ } while (0) #else #define pmem_assign(dest, src) ((dest) = (src)) #endif I.e. we should not call memcpy_flushcache if we can't have persistent memory. Cache flushing is slow and we should not do it if we don't have to. Mikulas
On Wed, May 30 2018 at 10:46P -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > Fine I'll deal with it. reordering the fields eliminated holes in the > > > > structure and reduced struct members spanning cache lines. > > > > > > And what about this? > > > #define WC_MODE_PMEM(wc) ((wc)->pmem_mode) > > > > > > The code that I had just allowed the compiler to optimize out > > > persistent-memory code if we have DM_WRITECACHE_ONLY_SSD defined - and you > > > deleted it. > > > > > > Most architectures don't have persistent memory and the dm-writecache > > > driver could work in ssd-only mode on them. On these architectures, I > > > define > > > #define WC_MODE_PMEM(wc) false > > > - and the compiler will just automatically remove the tests for that > > > condition and the unused branch. It does also eliminate unused static > > > functions. > > > > This level of microoptimization can be backfilled. But as it was, there > > were too many #defines. And I'm really not concerned with eliminating > > unused static functions for this case. > > I don't see why "too many defines" would be a problem. > > If I compile it with and without pmem support, the difference is > 15kB-vs-12kB. If we look at just one function (writecache_map), the > difference is 1595 bytes - vs - 1280 bytes. So, it produces real savings > in code size. > > The problem with performance is not caused a condition that always jumps > the same way (that is predicted by the CPU and it causes no delays in the > pipeline) - the problem is that a bigger function consumes more i-cache. > There is no reason to include code that can't be executed. Please double check you see the reduced code size you're expecting using the latest dm-writecache.c in linux-dm.git's dm-4.18 branch. Thanks, Mike
On Wed, 30 May 2018, Mike Snitzer wrote: > On Wed, May 30 2018 at 10:46P -0400, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Wed, 30 May 2018, Mike Snitzer wrote: > > > > > > > Fine I'll deal with it. reordering the fields eliminated holes in the > > > > > structure and reduced struct members spanning cache lines. > > > > > > > > And what about this? > > > > #define WC_MODE_PMEM(wc) ((wc)->pmem_mode) > > > > > > > > The code that I had just allowed the compiler to optimize out > > > > persistent-memory code if we have DM_WRITECACHE_ONLY_SSD defined - and you > > > > deleted it. > > > > > > > > Most architectures don't have persistent memory and the dm-writecache > > > > driver could work in ssd-only mode on them. On these architectures, I > > > > define > > > > #define WC_MODE_PMEM(wc) false > > > > - and the compiler will just automatically remove the tests for that > > > > condition and the unused branch. It does also eliminate unused static > > > > functions. > > > > > > This level of microoptimization can be backfilled. But as it was, there > > > were too many #defines. And I'm really not concerned with eliminating > > > unused static functions for this case. > > > > I don't see why "too many defines" would be a problem. > > > > If I compile it with and without pmem support, the difference is > > 15kB-vs-12kB. If we look at just one function (writecache_map), the > > difference is 1595 bytes - vs - 1280 bytes. So, it produces real savings > > in code size. > > > > The problem with performance is not caused a condition that always jumps > > the same way (that is predicted by the CPU and it causes no delays in the > > pipeline) - the problem is that a bigger function consumes more i-cache. > > There is no reason to include code that can't be executed. > > Please double check you see the reduced code size you're expecting using > the latest dm-writecache.c in linux-dm.git's dm-4.18 branch. > > Thanks, > Mike I checked that - it's OK. Mikulas
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 844c4fb2fcfc..e733a14faf8f 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -28,6 +28,8 @@ #define AUTOCOMMIT_BLOCKS_PMEM 64 #define AUTOCOMMIT_MSEC 1000 +//#define WC_MEASURE_LATENCY + #define BITMAP_GRANULARITY 65536 #if BITMAP_GRANULARITY < PAGE_SIZE #undef BITMAP_GRANULARITY @@ -217,6 +219,15 @@ struct dm_writecache { struct dm_kcopyd_client *dm_kcopyd; unsigned long *dirty_bitmap; unsigned dirty_bitmap_size; + +#ifdef WC_MEASURE_LATENCY + ktime_t lock_acquired_time; + ktime_t max_lock_held; + ktime_t max_lock_wait; + ktime_t max_freelist_wait; + ktime_t measure_latency_time; + ktime_t max_measure_latency; +#endif }; #define WB_LIST_INLINE 16 @@ -243,15 +254,60 @@ struct copy_struct { DECLARE_DM_KCOPYD_THROTTLE_WITH_MODULE_PARM(dm_writecache_throttle, "A percentage of time allocated for data copying"); -static void wc_lock(struct dm_writecache *wc) +static inline void measure_latency_start(struct dm_writecache *wc) +{ +#ifdef WC_MEASURE_LATENCY + wc->measure_latency_time = ktime_get(); +#endif +} + +static inline void measure_latency_end(struct dm_writecache *wc, unsigned long n) { +#ifdef WC_MEASURE_LATENCY + ktime_t now = ktime_get(); + if (now - wc->measure_latency_time > wc->max_measure_latency) { + wc->max_measure_latency = now - wc->measure_latency_time; + printk(KERN_DEBUG "dm-writecache: measured latency %lld.%03lldus, %lu steps\n", + wc->max_measure_latency / 1000, wc->max_measure_latency % 1000, n); + } +#endif +} + +static void __wc_lock(struct dm_writecache *wc, int line) +{ +#ifdef WC_MEASURE_LATENCY + ktime_t before, after; + before = ktime_get(); +#endif mutex_lock(&wc->lock); +#ifdef WC_MEASURE_LATENCY + after = ktime_get(); + if (unlikely(after - before > wc->max_lock_wait)) { + wc->max_lock_wait = after - before; + printk(KERN_DEBUG "dm-writecache: waiting for lock for %lld.%03lldus at %d\n", + wc->max_lock_wait / 1000, wc->max_lock_wait % 1000, line); + after = ktime_get(); + } + wc->lock_acquired_time = after; +#endif } +#define wc_lock(wc) __wc_lock(wc, __LINE__) -static void wc_unlock(struct dm_writecache *wc) +static void __wc_unlock(struct dm_writecache *wc, int line) { +#ifdef WC_MEASURE_LATENCY + ktime_t now = ktime_get(); + if (now - wc->lock_acquired_time > wc->max_lock_held) { + wc->max_lock_held = now - wc->lock_acquired_time; + printk(KERN_DEBUG "dm-writecache: lock held for %lld.%03lldus at %d\n", + wc->max_lock_held / 1000, wc->max_lock_held % 1000, line); + } +#endif mutex_unlock(&wc->lock); } +#define wc_unlock(wc) __wc_unlock(wc, __LINE__) + +#define wc_unlock_long(wc) mutex_unlock(&wc->lock) #if IS_ENABLED(CONFIG_DAX_DRIVER) static int persistent_memory_claim(struct dm_writecache *wc) @@ -293,6 +349,10 @@ static int persistent_memory_claim(struct dm_writecache *wc) r = -EOPNOTSUPP; goto err2; } +#ifdef WC_MEASURE_LATENCY + printk(KERN_DEBUG "dm-writecache: device %s, pfn %016llx\n", + wc->ssd_dev->name, pfn.val); +#endif if (da != p) { long i; wc->memory_map = NULL; @@ -701,16 +761,35 @@ static void writecache_free_entry(struct dm_writecache *wc, struct wc_entry *e) swake_up(&wc->freelist_wait); } -static void writecache_wait_on_freelist(struct dm_writecache *wc) +static void __writecache_wait_on_freelist(struct dm_writecache *wc, bool measure, int line) { DECLARE_SWAITQUEUE(wait); +#ifdef WC_MEASURE_LATENCY + ktime_t before, after; +#endif prepare_to_swait(&wc->freelist_wait, &wait, TASK_UNINTERRUPTIBLE); wc_unlock(wc); +#ifdef WC_MEASURE_LATENCY + if (measure) + before = ktime_get(); +#endif io_schedule(); finish_swait(&wc->freelist_wait, &wait); +#ifdef WC_MEASURE_LATENCY + if (measure) { + after = ktime_get(); + if (unlikely(after - before > wc->max_freelist_wait)) { + wc->max_freelist_wait = after - before; + printk(KERN_DEBUG "dm-writecache: waiting on freelist for %lld.%03lldus at %d\n", + wc->max_freelist_wait / 1000, wc->max_freelist_wait % 1000, line); + } + } +#endif wc_lock(wc); } +#define writecache_wait_on_freelist(wc) __writecache_wait_on_freelist(wc, true, __LINE__) +#define writecache_wait_on_freelist_long(wc) __writecache_wait_on_freelist(wc, false, __LINE__) static void writecache_poison_lists(struct dm_writecache *wc) { @@ -890,7 +969,7 @@ static void writecache_suspend(struct dm_target *ti) writecache_poison_lists(wc); - wc_unlock(wc); + wc_unlock_long(wc); } static int writecache_alloc_entries(struct dm_writecache *wc) @@ -1001,7 +1080,7 @@ static void writecache_resume(struct dm_target *ti) writecache_commit_flushed(wc); } - wc_unlock(wc); + wc_unlock_long(wc); } static int process_flush_mesg(unsigned argc, char **argv, struct dm_writecache *wc) @@ -1472,8 +1551,9 @@ static void __writeback_throttle(struct dm_writecache *wc, struct writeback_list if (unlikely(wc->max_writeback_jobs)) { if (READ_ONCE(wc->writeback_size) - wbl->size >= wc->max_writeback_jobs) { wc_lock(wc); - while (wc->writeback_size - wbl->size >= wc->max_writeback_jobs) - writecache_wait_on_freelist(wc); + while (wc->writeback_size - wbl->size >= wc->max_writeback_jobs) { + writecache_wait_on_freelist_long(wc); + } wc_unlock(wc); } }