[0/1] cachefiles: Fix race between read_waiter and read_copier

Message ID 1588855822-5532-1-git-send-email-dwysocha@redhat.com (mailing list archive)

David Wysochanski May 7, 2020, 12:50 p.m. UTC
This patch was originally posted by Lei Xue in 2018:
https://lore.kernel.org/patchwork/patch/889373/

The responses on the above thread led to a fix for a separate but related
problem in this code path, and the last portion of that fix's commit
message indicated the original problem was thought to have been fixed as well:
    commit 934140ab028713a61de8bca58c05332416d037d1
    Author: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
    Date: 2018-07-25 15:04:25 +0100
    
        cachefiles: Fix refcounting bug in backing-file read monitoring

However, the original problem reported by Lei still remains and is fairly
easy to reproduce.  My reproducer details are below; with them I could
reproduce the problem within a few minutes.  After tracing the problem and
proving that the race still exists, I arrived at the same patch, and then
tested with the patch applied.  The work is detailed in the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1829662

I am re-submitting this with Lei as author, as I have only rebased the
patch on dhowells' fscache-fixes branch, changed the subject line, and
cleaned up the patch header.

Reproducer
==========

# NFS server setup / config
# uname -r
3.10.0-1127.el7.x86_64
# cat /etc/exports
/data *(rw,sec=sys:krb5)
# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb         16G   16G  340M  98% /data
# time for i in $(seq 1 4000); do dd if=/dev/zero of=/data/file$i.bin bs=1M count=4; done &

# NFS client config and test
# uname -r
3.10.0-1062.23.1.el7.x86_64
# _or_
# From https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=fscache-fixes&id=1836833f67ab49363d221aeb120956448ca5be4f
# uname -r
5.7.0-rc3-1836833f67ab
# df -h /var/cache/fscache
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        7.8G   36M  7.3G   1% /var/cache/fscache
# make sure cachefilesd is running
systemctl status cachefilesd
# To run the test on the client:
# mount -overs=3,fsc nfs-server:/data /mnt/nfs
# cat dd-ioload.sh 
NFS_MNT=/mnt/nfs
echo 3 > /proc/sys/vm/drop_caches
for i in $(seq 1 2000); do
        dd if=$NFS_MNT/file$i.bin of=/dev/null bs=28k >/dev/null 2>&1 &
done
wait
# while true; do date; time ./dd-ioload.sh; done &
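
For anyone who wants to try the load pattern without the exact setup above,
the client side of the reproducer can be sketched as a small reusable
function.  This is only an illustration under stated assumptions: the name
run_ioload, its two parameters, the DROP_CACHES variable, and the summary
line it prints are all hypothetical and not part of the original
dd-ioload.sh.  It keeps the same 28k read size and the same fileN.bin
naming as the script above.

```shell
#!/bin/sh
# Sketch of the dd-ioload.sh read load as a function.
# run_ioload, DROP_CACHES, and the summary line are hypothetical names
# chosen for this example; they do not appear in the original script.

# run_ioload DIR COUNT
# Start COUNT parallel dd readers over DIR/file1.bin .. DIR/fileCOUNT.bin,
# wait for all of them, then print a one-line summary.
run_ioload() {
    dir=$1
    count=$2

    # The original script always drops the page cache first (needs root).
    # Here it is opt-in so the function can also be run unprivileged.
    if [ "${DROP_CACHES:-0}" = "1" ]; then
        echo 3 > /proc/sys/vm/drop_caches
    fi

    i=1
    while [ "$i" -le "$count" ]; do
        # Same read pattern as the reproducer: 28k blocks, output discarded.
        dd if="$dir/file$i.bin" of=/dev/null bs=28k >/dev/null 2>&1 &
        i=$((i + 1))
    done

    # Block until every background reader has exited.
    wait
    echo "finished $count readers on $dir"
}
```

To mirror the test above, one would run something like
"DROP_CACHES=1 run_ioload /mnt/nfs 2000" as root inside the same
"while true" loop.  Dropping the page cache matters for the real
reproducer, since it forces the reads back through the cache code path
rather than being served from memory.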