Re: IO scheduler based IO Controller V2

On Thu, May 07, 2009 at 01:36:08PM +0800, Li Zefan wrote:
> Vivek Goyal wrote:
> > On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote:
> >> Vivek Goyal wrote:
> >>> Hi All,
> >>>
> >>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
> >>> First version of the patches was posted here.
> >> Hi Vivek,
> >>
> >> I did some simple test for V2, and triggered an kernel panic.
> >> The following script can reproduce this bug. It seems that the cgroup
> >> is already removed, but IO Controller still try to access into it.
> >>
> > 
> > Hi Gui,
> > 
> > Thanks for the report. I use cgroup_path() for debugging. I guess that
> > cgroup_path() was passed null cgrp pointer that's why it crashed.
> > 
> > If yes, then it is strange though. I call cgroup_path() only after
> > grabbing a refenrece to css object. (I am assuming that if I have a valid
> > reference to css object then css->cgrp can't be null).
> > 
> 
> Yes, css->cgrp shouldn't be NULL.. I doubt we hit a bug in cgroup here.
> The code dealing with css refcnt and cgroup rmdir has changed quite a lot,
> and is much more complex than it was.
> 
> > Anyway, can you please try out following patch and see if it fixes your
> > crash.
> ...
> > BTW, I tried following equivalent script and I can't see the crash on 
> > my system. Are you able to hit it regularly?
> > 
> 
> I modified the script like this:
> 
> ======================
> #!/bin/sh
> echo 1 > /proc/sys/vm/drop_caches
> mkdir /cgroup 2> /dev/null
> mount -t cgroup -o io,blkio io /cgroup
> mkdir /cgroup/test1
> mkdir /cgroup/test2
> echo 100 > /cgroup/test1/io.weight
> echo 500 > /cgroup/test2/io.weight
> 
> dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
> pid1=$!
> echo $pid1 > /cgroup/test1/tasks
> 
> dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
> pid2=$!
> echo $pid2 > /cgroup/test2/tasks
> 
> sleep 5
> kill -9 $pid1
> kill -9 $pid2
> 
> for ((;count != 2;))
> {
>         rmdir /cgroup/test1 > /dev/null 2>&1
>         if [ $? -eq 0 ]; then
>                 count=$(( $count + 1 ))
>         fi
> 
>         rmdir /cgroup/test2 > /dev/null 2>&1
>         if [ $? -eq 0 ]; then
>                 count=$(( $count + 1 ))
>         fi
> }
> 
> umount /cgroup
> rmdir /cgroup
> ======================
> 
> I ran this script and got lockdep BUG. Full log and my config are attached.
> 
> Actually this can be triggered with the following steps on my box:
> # mount -t cgroup -o blkio,io xxx /mnt
> # mkdir /mnt/0
> # echo $$ > /mnt/0/tasks
> # echo 3 > /proc/sys/vm/drop_cache
> # echo $$ > /mnt/tasks
> # rmdir /mnt/0
> 
> And when I ran the script for the second time, my box was freezed
> and I had to reset it.
> 

Thanks Li and Gui for pointing out the problem. With you script, I could
also produce lock validator warning as well as system freeze. I could
identify at least two trouble spots. With following patch things seems
to be fine on my system. Can you please give it a try.

---
 block/elevator-fq.c |   20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: IO scheduler based IO Controller V2

Commit Message

Comments

Patch