Message ID | 20200106061343.18772-1-wqu@suse.com (mailing list archive) |
---|---|
Series | Introduce per-profile available space array to avoid over-confident can_overcommit() |
On Mon, Jan 06, 2020 at 02:13:40PM +0800, Qu Wenruo wrote:
> The execution time of this per-profile calculation is a little below
> 20 us per 5 iterations in my test VM.
> Although all such calculation will need to acquire chunk mutex, the
> impact should be small enough.

The problem is not only the execution time of statfs, but what happens
when the mutex is contended. This was the problem with the block group
mutex in the past that had to be converted to RCU.

If the chunk mutex gets locked because a new chunk is allocated, statfs
will block until the allocation finishes. The time can vary a lot
depending on the workload, and a delay of seconds can trigger system
monitor alerts.
On 2020/1/6 10:06 PM, David Sterba wrote:
> On Mon, Jan 06, 2020 at 02:13:40PM +0800, Qu Wenruo wrote:
>> The execution time of this per-profile calculation is a little below
>> 20 us per 5 iterations in my test VM.
>> Although all such calculation will need to acquire chunk mutex, the
>> impact should be small enough.
>
> The problem is not only the execution time of statfs, but what happens
> when the mutex is contended. This was the problem with the block group
> mutex in the past that had to be converted to RCU.
>
> If the chunk mutex gets locked because a new chunk is allocated, until
> it finishes then statfs will block. The time can vary a lot depending on
> the workload and delay in seconds can trigger system monitors alert.
>

Yes, that's exactly the same concern I have.

But I'm not sure how safe the old RCU implementation is when
device->virtual_allocated is modified during the RCU critical section.

That is to say, if a virtual chunk is being allocated during statfs(),
then we get an incorrect result.

So I tend to keep it safe by protecting it with chunk_mutex, even if it
means chunk_mutex can block statfs().

Another solution is to completely forget the whole metadata part and
just grab the spinlock and the pre-calculated result, but that may
report more available space than we really have.

If the delay is really a blocker, I can go the pre-allocated way,
making the result a little less accurate.

Thanks,
Qu
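
For readers following the thread, the trade-off discussed above can be sketched in a small userspace model. This is only an illustration under stated assumptions, not btrfs code: `struct fs_sketch`, `chunk_alloc_begin()`, `chunk_alloc_finish()` and `statfs_avail_cached()` are hypothetical names, the "chunk mutex" is modelled by a plain flag, and the spinlock is a C11 `atomic_flag`. It shows the "grab the spinlock and the pre-calculated result" alternative: the statfs path never waits for a running chunk allocation, at the cost of returning a value that may be stale (and so possibly larger than what is really available).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are btrfs code. */
struct fs_sketch {
    atomic_flag avail_lock;    /* tiny spinlock guarding cached_avail */
    uint64_t cached_avail;     /* pre-calculated available bytes */
    bool chunk_alloc_running;  /* models "chunk_mutex is held" (single-threaded model) */
};

static void avail_lock(struct fs_sketch *fs)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&fs->avail_lock,
                                             memory_order_acquire))
        ;
}

static void avail_unlock(struct fs_sketch *fs)
{
    atomic_flag_clear_explicit(&fs->avail_lock, memory_order_release);
}

static void fs_sketch_init(struct fs_sketch *fs, uint64_t avail)
{
    atomic_flag_clear(&fs->avail_lock);
    fs->cached_avail = avail;
    fs->chunk_alloc_running = false;
}

/* A chunk allocation starts; in the kernel this would take chunk_mutex. */
static void chunk_alloc_begin(struct fs_sketch *fs)
{
    fs->chunk_alloc_running = true;
}

/* The allocation ends: publish the recalculated value under the short lock. */
static void chunk_alloc_finish(struct fs_sketch *fs, uint64_t new_avail)
{
    assert(fs->chunk_alloc_running);
    avail_lock(fs);
    fs->cached_avail = new_avail;
    avail_unlock(fs);
    fs->chunk_alloc_running = false;
}

/* statfs path: never waits for the allocation, but may return a stale value. */
static uint64_t statfs_avail_cached(struct fs_sketch *fs)
{
    uint64_t v;

    avail_lock(fs);
    v = fs->cached_avail;
    avail_unlock(fs);
    return v;
}
```

In this model, a call to `statfs_avail_cached()` made between `chunk_alloc_begin()` and `chunk_alloc_finish()` returns the old value immediately instead of blocking, which is the "a little less accurate" behaviour mentioned above. The spinlock is held only for a single load or store, so contention on it stays short regardless of how long the allocation takes.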