Message ID | 20170712121224.18522-2-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
On Wed, Jul 12, 2017 at 02:12:24PM +0200, Daniel Vetter wrote:
> The problem is that we have a distributed cache - every committer has
> a copy. Which means even just a slight clock skew will make sure that
> a naive gc algorithm results in lots of thrashing around.
>
> To fix this add a huge hysteresis: Only add files newer than 1 day,
> and only remove them when older than 60 days. As long as people have
> reasonably accurate clocks on their machines this should work.
>
> A different problem is that we can't use filesystem timestamps (and
> hence can't use git rerere gc): When someone comes back from vacation
> and updates git rerere, all the files will have current timestamps,
> even when they've been pushed out weeks ago. To fix that, use the git
> log to judge old files to remove. Also, remove old files before adding
> new ones, to avoid confusion.
>
> Also, we need to teach the cp -r to preserve timestamps, otherwise
> this won't work.
>
> v2: Use git log to remove old files.
>
> v3: Remove the debug uncommenting (Sean).
>
> v4: Split out code movement and explain better what's going on (Jani).

Yeah, much easier to digest with the split.

Reviewed-by: Sean Paul <seanpaul@chromium.org>

>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  dim | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/dim b/dim
> index b788edd29653..79d616cbf354 100755
> --- a/dim
> +++ b/dim
> @@ -513,9 +513,15 @@ function commit_rerere_cache
>
>  	git pull >& /dev/null
>  	rm $(rr_cache_dir)/rr-cache -Rf &> /dev/null || true
> -	cp $(rr_cache_dir)/* rr-cache -r
> +	cp $(rr_cache_dir)/* rr-cache -r --preserve=timestamps
>  	git add ./*.patch >& /dev/null || true
> -	git add rr-cache/* > /dev/null
> +	for file in $(git ls-files); do
> +		if ! git log --since="60 days ago" --name-only -- $file | grep $file &> /dev/null; then
> +			git rm $file &> /dev/null
> +			echo deleting $file
> +		fi
> +	done
> +	find rr-cache/ -ctime -1 -type f -print0 | xargs -0 git add > /dev/null
>  	git rm rr-cache/rr-cache &> /dev/null || true
>  	if git commit -m "$time: $integration_branch rerere cache update" >& /dev/null; then
>  		echo -n "New commit. "
> --
> 2.13.2
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, 13 Jul 2017, Sean Paul <seanpaul@chromium.org> wrote:
> On Wed, Jul 12, 2017 at 02:12:24PM +0200, Daniel Vetter wrote:
>> v4: Split out code movement and explain better what's going on (Jani).
>
> Yeah, much easier to digest with the split.
>
> Reviewed-by: Sean Paul <seanpaul@chromium.org>

I'll trust that and hope for the best. ;)

BR,
Jani.
On Fri, Jul 14, 2017 at 12:57:23PM +0300, Jani Nikula wrote:
> On Thu, 13 Jul 2017, Sean Paul <seanpaul@chromium.org> wrote:
> > Yeah, much easier to digest with the split.
> >
> > Reviewed-by: Sean Paul <seanpaul@chromium.org>
>
> I'll trust that and hope for the best. ;)

Pushed and asked everyone I managed to ping over irc to upgrade. Let's
see how often we have to push out a revert until the old stuff is dead
for good (I already screwed up twice locally ...).
-Daniel

>
> BR,
> Jani.
>
> --
> Jani Nikula, Intel Open Source Technology Center
diff --git a/dim b/dim
index b788edd29653..79d616cbf354 100755
--- a/dim
+++ b/dim
@@ -513,9 +513,15 @@ function commit_rerere_cache

 	git pull >& /dev/null
 	rm $(rr_cache_dir)/rr-cache -Rf &> /dev/null || true
-	cp $(rr_cache_dir)/* rr-cache -r
+	cp $(rr_cache_dir)/* rr-cache -r --preserve=timestamps
 	git add ./*.patch >& /dev/null || true
-	git add rr-cache/* > /dev/null
+	for file in $(git ls-files); do
+		if ! git log --since="60 days ago" --name-only -- $file | grep $file &> /dev/null; then
+			git rm $file &> /dev/null
+			echo deleting $file
+		fi
+	done
+	find rr-cache/ -ctime -1 -type f -print0 | xargs -0 git add > /dev/null
 	git rm rr-cache/rr-cache &> /dev/null || true
 	if git commit -m "$time: $integration_branch rerere cache update" >& /dev/null; then
 		echo -n "New commit. "
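The diff above pairs `cp --preserve=timestamps` with `find -ctime -1`. As far as GNU cp is documented, `--preserve=timestamps` carries over access and modification times; the inode change time (ctime) is set by the kernel when the copy is created. The following is a minimal sketch for checking how the two `find` tests behave after such a copy. It assumes GNU coreutils and findutils, and the directory and file names are made up for illustration; it is not taken from dim.

```shell
#!/bin/bash
# Sketch only: compare -mtime and -ctime selection after a
# timestamp-preserving copy. All paths here are hypothetical.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"

# Pretend one resolution was recorded 10 days ago and one just now.
echo resolution > "$work/src/old"
touch -d "10 days ago" "$work/src/old"
echo resolution > "$work/src/new"

# Same style of copy as in the patch, into an empty destination.
cp -r --preserve=timestamps "$work/src/." "$work/dst/"

# List the files each test considers "newer than 1 day".
by_mtime=$(find "$work/dst" -type f -mtime -1 -printf '%f\n' | sort | xargs)
by_ctime=$(find "$work/dst" -type f -ctime -1 -printf '%f\n' | sort | xargs)

echo "mtime -1: $by_mtime"
echo "ctime -1: $by_ctime"
rm -rf "$work"
```

On a GNU system the mtime query should list only `new`, while the ctime query should list both files, because both copies were created moments ago. If that holds on the committers' machines, `-mtime -1` may be closer to the "only add files newer than 1 day" intent; treat this as something to verify locally, not as a statement about how dim behaves in practice.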
The problem is that we have a distributed cache - every committer has
a copy. Which means even just a slight clock skew will make sure that
a naive gc algorithm results in lots of thrashing around.

To fix this add a huge hysteresis: Only add files newer than 1 day,
and only remove them when older than 60 days. As long as people have
reasonably accurate clocks on their machines this should work.

A different problem is that we can't use filesystem timestamps (and
hence can't use git rerere gc): When someone comes back from vacation
and updates git rerere, all the files will have current timestamps,
even when they've been pushed out weeks ago. To fix that, use the git
log to judge old files to remove. Also, remove old files before adding
new ones, to avoid confusion.

Also, we need to teach the cp -r to preserve timestamps, otherwise
this won't work.

v2: Use git log to remove old files.

v3: Remove the debug uncommenting (Sean).

v4: Split out code movement and explain better what's going on (Jani).

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 dim | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
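
The hysteresis described in the commit message can be written down as a small decision function. This is only an illustrative sketch: `gc_action`, `ADD_MAX_AGE_DAYS` and `REMOVE_MIN_AGE_DAYS` are hypothetical names that do not exist in dim, and the age is assumed to be given in whole days (from the git log for tracked files, from the filesystem for untracked ones).

```shell
#!/bin/bash
# Hypothetical helper, not part of dim: classify one cache entry.
ADD_MAX_AGE_DAYS=1      # only add files newer than this
REMOVE_MIN_AGE_DAYS=60  # only remove files at least this old

# usage: gc_action <age_in_whole_days> <tracked: yes|no>
gc_action() {
	local age_days=$1 tracked=$2

	if [ "$tracked" = yes ]; then
		# Tracked entries survive until they are really old.
		if [ "$age_days" -ge "$REMOVE_MIN_AGE_DAYS" ]; then
			echo remove
		else
			echo keep
		fi
	elif [ "$age_days" -lt "$ADD_MAX_AGE_DAYS" ]; then
		# Untracked entries are only added while they are fresh.
		echo add
	else
		echo skip
	fi
}

gc_action 0 no    # fresh local resolution
gc_action 5 no    # stale local copy of something already removed
gc_action 5 yes   # recently committed entry
gc_action 60 yes  # entry nobody has touched for 60 days
```

The wide gap between the two thresholds is the point of the design: an entry aged between 1 and 60 days is neither added nor removed, so two committers whose clocks disagree by minutes or hours reach the same decision and do not keep undoing each other's cleanup.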