| Message ID | 1386241253-5781-2-git-send-email-jlayton@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote:
> Currently when we are processing a request, we try to scrape an expired
> or over-limit entry off the list in preference to allocating a new one
> from the slab.
>
> This is unnecessarily complicated. Just use the slab layer.

I really don't think pruning caches from lookup functions is a good
idea. Too many chances for locking inversions, unbounded runtime and
other issues. So I'd prefer my earlier version of the patch to
remove it entirely if we want to change anything beyond your minimal
fix.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 5 Dec 2013 05:29:19 -0800
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote:
> > Currently when we are processing a request, we try to scrape an expired
> > or over-limit entry off the list in preference to allocating a new one
> > from the slab.
> >
> > This is unnecessarily complicated. Just use the slab layer.
>
> I really don't think pruning caches from lookup functions is a good
> idea. Too many chances for locking inversions, unbounded runtime and
> other issues. So I'd prefer my earlier version of the patch to
> remove it entirely if we want to change anything beyond your minimal
> fix.

It's not likely to hit a locking inversion here since we don't have a
lot of different locks, but point taken...

If we take that out, then that does mean that the cache may grow larger
than the max...possibly much larger since the pruner workqueue job
only runs every 120s and you can put a lot more entries into the cache
in that period.

That said, I don't feel too strongly about it, so if you think leaving
it to the workqueue job and the shrinker is the right thing to do, then
so be it.
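Jeff's growth concern above can be sketched with a toy userspace model of the time-and-limit pruning behavior. Everything below is illustrative (the names, the linked list, and the constants are stand-ins, not the kernel's `prune_cache_entries()` code), but it shows why a pruner that only runs periodically lets the entry count overshoot the max in between runs:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model (not the kernel code): cached entries sit
 * on an LRU list, oldest first, and a periodic job frees anything that is
 * expired or over the entry limit. */

struct entry {
	long timestamp;		/* when the entry was cached */
	struct entry *next;	/* singly linked LRU, oldest at the head */
};

static struct entry *lru_head;
static unsigned int num_entries;

#define MAX_ENTRIES	4
#define EXPIRY		120	/* seconds, mirroring the 120s pruner interval */

/* Walk from the oldest entry, freeing anything expired or over-limit.
 * The kernel analogue runs from a workqueue job and the shrinker, so
 * between runs num_entries can exceed MAX_ENTRIES. */
static void prune_entries(long now)
{
	while (lru_head &&
	       (num_entries > MAX_ENTRIES ||
		now - lru_head->timestamp > EXPIRY)) {
		struct entry *victim = lru_head;

		lru_head = victim->next;
		free(victim);
		num_entries--;
	}
}

/* Append a new entry at the tail (most recently used end). */
static void add_entry(long now)
{
	struct entry *e = malloc(sizeof(*e));
	struct entry **tail = &lru_head;

	e->timestamp = now;
	e->next = NULL;
	while (*tail)
		tail = &(*tail)->next;
	*tail = e;
	num_entries++;
}
```

If requests arrive faster than the pruner fires, `num_entries` sits above `MAX_ENTRIES` until the next run, which is the overshoot Jeff describes.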
On Thu, Dec 05, 2013 at 08:41:56AM -0500, Jeff Layton wrote:
> On Thu, 5 Dec 2013 05:29:19 -0800
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote:
> > > Currently when we are processing a request, we try to scrape an expired
> > > or over-limit entry off the list in preference to allocating a new one
> > > from the slab.
> > >
> > > This is unnecessarily complicated. Just use the slab layer.
> >
> > I really don't think pruning caches from lookup functions is a good
> > idea. Too many chances for locking inversions, unbounded runtime and
> > other issues. So I'd prefer my earlier version of the patch to
> > remove it entirely if we want to change anything beyond your minimal
> > fix.
>
> It's not likely to hit a locking inversion here since we don't have a
> lot of different locks, but point taken...
>
> If we take that out, then that does mean that the cache may grow larger
> than the max...possibly much larger since the pruner workqueue job
> only runs every 120s and you can put a lot more entries into the cache
> in that period.

Yeah, this scares me.

I looked through my old mail but couldn't find your previous results:
how long did you find the hash buckets needed to be before the
difference was noticeable on your setup?

We typically do a failed-lookup-and-insert on every cached operation so
I wouldn't be surprised if a workload of small random writes, for
example, could produce an unreasonably large cache in that time.

--b.

> That said, I don't feel too strongly about it, so if you think leaving
> it to the workqueue job and the shrinker is the right thing to do, then
> so be it.
>
> --
> Jeff Layton <jlayton@redhat.com>
On Thu, 5 Dec 2013 10:50:24 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Thu, Dec 05, 2013 at 08:41:56AM -0500, Jeff Layton wrote:
> > On Thu, 5 Dec 2013 05:29:19 -0800
> > Christoph Hellwig <hch@infradead.org> wrote:
> >
> > > On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote:
> > > > Currently when we are processing a request, we try to scrape an expired
> > > > or over-limit entry off the list in preference to allocating a new one
> > > > from the slab.
> > > >
> > > > This is unnecessarily complicated. Just use the slab layer.
> > >
> > > I really don't think pruning caches from lookup functions is a good
> > > idea. Too many chances for locking inversions, unbounded runtime and
> > > other issues. So I'd prefer my earlier version of the patch to
> > > remove it entirely if we want to change anything beyond your minimal
> > > fix.
> >
> > It's not likely to hit a locking inversion here since we don't have a
> > lot of different locks, but point taken...
> >
> > If we take that out, then that does mean that the cache may grow larger
> > than the max...possibly much larger since the pruner workqueue job
> > only runs every 120s and you can put a lot more entries into the cache
> > in that period.
>
> Yeah, this scares me.
>
> I looked through my old mail but couldn't find your previous results:
> how long did you find the hash buckets needed to be before the
> difference was noticeable on your setup?
>
> We typically do a failed-lookup-and-insert on every cached operation so
> I wouldn't be surprised if a workload of small random writes, for
> example, could produce an unreasonably large cache in that time.

I don't think I ever measured that. I did some measurement of how long
it took to generate checksums, but that's a different phase of this
stuff. I just sort of went with "too long a chain == bad, so keep them
as short as possible".

> >
> > That said, I don't feel too strongly about it, so if you think leaving
> > it to the workqueue job and the shrinker is the right thing to do, then
> > so be it.
> >
> > --
> > Jeff Layton <jlayton@redhat.com>
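The "failed-lookup-and-insert on every cached operation" pattern Bruce describes, and the chain-length cost Jeff alludes to, can be modeled with a toy hash cache. This is a hypothetical userspace sketch, not the nfsd code: the names, the bucket count, and the checksum handling are all illustrative. The point is that when nearly every lookup misses and inserts, chain length (and thus lookup cost) grows linearly with the number of cached operations between prunes:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the lookup path's common case: every request
 * hashes its checksum, searches the bucket chain (usually a miss), then
 * inserts a new entry. */

#define NUM_BUCKETS 64

struct cache_entry {
	unsigned int csum;		/* stand-in for the request checksum */
	struct cache_entry *next;	/* hash chain link */
};

static struct cache_entry *buckets[NUM_BUCKETS];

/* Returns the existing entry on a hit, or NULL after inserting a new
 * entry on a miss (the failed-lookup-and-insert case). */
static struct cache_entry *cache_lookup_insert(unsigned int csum)
{
	unsigned int b = csum % NUM_BUCKETS;
	struct cache_entry *e;

	for (e = buckets[b]; e; e = e->next)
		if (e->csum == csum)
			return e;	/* cache hit: reply can be replayed */

	e = malloc(sizeof(*e));		/* miss: insert, chain grows by one */
	e->csum = csum;
	e->next = buckets[b];
	buckets[b] = e;
	return NULL;
}

/* Longest bucket chain; per-lookup search cost is proportional to this. */
static unsigned int max_chain_len(void)
{
	unsigned int b, len, max = 0;
	struct cache_entry *e;

	for (b = 0; b < NUM_BUCKETS; b++) {
		len = 0;
		for (e = buckets[b]; e; e = e->next)
			len++;
		if (len > max)
			max = len;
	}
	return max;
}
```

With no cap enforced at insertion time, a burst of distinct small writes between pruner runs drives `max_chain_len()` up without bound, which is exactly the "unreasonably large cache" scenario Bruce worries about.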
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index b6af150..f8f060f 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -132,13 +132,6 @@ nfsd_reply_cache_alloc(void)
 }
 
 static void
-nfsd_reply_cache_unhash(struct svc_cacherep *rp)
-{
-	hlist_del_init(&rp->c_hash);
-	list_del_init(&rp->c_lru);
-}
-
-static void
 nfsd_reply_cache_free_locked(struct svc_cacherep *rp)
 {
 	if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) {
@@ -416,22 +409,8 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
 
 	/*
 	 * Since the common case is a cache miss followed by an insert,
-	 * preallocate an entry. First, try to reuse the first entry on the LRU
-	 * if it works, then go ahead and prune the LRU list.
+	 * preallocate an entry.
 	 */
-	spin_lock(&cache_lock);
-	if (!list_empty(&lru_head)) {
-		rp = list_first_entry(&lru_head, struct svc_cacherep, c_lru);
-		if (nfsd_cache_entry_expired(rp) ||
-		    num_drc_entries >= max_drc_entries) {
-			nfsd_reply_cache_unhash(rp);
-			prune_cache_entries();
-			goto search_cache;
-		}
-	}
-
-	/* No expired ones available, allocate a new one. */
-	spin_unlock(&cache_lock);
 	rp = nfsd_reply_cache_alloc();
 	spin_lock(&cache_lock);
 	if (likely(rp)) {
@@ -439,7 +418,9 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
 		drc_mem_usage += sizeof(*rp);
 	}
 
-search_cache:
+	/* go ahead and prune the cache */
+	prune_cache_entries();
+
 	found = nfsd_cache_search(rqstp, csum);
 	if (found) {
 		if (likely(rp))
@@ -453,15 +434,6 @@ search_cache:
 		goto out;
 	}
 
-	/*
-	 * We're keeping the one we just allocated. Are we now over the
-	 * limit? Prune one off the tip of the LRU in trade for the one we
-	 * just allocated if so.
-	 */
-	if (num_drc_entries >= max_drc_entries)
-		nfsd_reply_cache_free_locked(list_first_entry(&lru_head,
-				struct svc_cacherep, c_lru));
-
 	nfsdstats.rcmisses++;
 	rqstp->rq_cacherep = rp;
 	rp->c_state = RC_INPROG;
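The control flow the patch leaves behind (preallocate unconditionally, prune opportunistically, search, then either discard the spare on a hit or keep it on a miss) can be modeled in a standalone userspace sketch. All names here are hypothetical stand-ins for the kernel's `nfsd_cache_lookup()` pieces, and locking is omitted:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of the patched lookup flow: the scrape-from-LRU
 * path is gone, so we always allocate a fresh entry, prune, search, and
 * either free the spare (hit) or keep it (miss). */

enum lookup_result { CACHE_MISS, CACHE_HIT };

struct cacherep { int dummy; };

static int entries_allocated;	/* tracks live preallocated entries */

static struct cacherep *cache_alloc(void)
{
	entries_allocated++;
	return malloc(sizeof(struct cacherep));
}

static void cache_free(struct cacherep *rp)
{
	entries_allocated--;
	free(rp);
}

/* Stand-in for prune_cache_entries(): opportunistic, not required for
 * correctness of the lookup itself. */
static void prune_cache(void)
{
}

/* search_fn models the cache search: returns non-NULL on a hit. */
static enum lookup_result
cache_lookup(struct cacherep *(*search_fn)(void))
{
	struct cacherep *rp = cache_alloc();	/* common case: miss + insert */
	struct cacherep *found;

	prune_cache();
	found = search_fn();
	if (found) {
		cache_free(rp);			/* hit: spare entry unneeded */
		return CACHE_HIT;
	}
	/* miss: rp stays as the in-progress entry for this request */
	return CACHE_MISS;
}
```

Note the trade-off discussed in the thread: with no over-limit check here, nothing on this path keeps the entry count at or below the max; that job falls entirely to the periodic pruner and the shrinker.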
Currently when we are processing a request, we try to scrape an expired
or over-limit entry off the list in preference to allocating a new one
from the slab.

This is unnecessarily complicated. Just use the slab layer.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfscache.c | 36 ++++--------------------------------
 1 file changed, 4 insertions(+), 32 deletions(-)