From patchwork Wed Nov 13 00:23:46 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 3176631
Date: Wed, 13 Nov 2013 11:23:46 +1100
From: NeilBrown
To: "J. Bruce Fields"
Cc: "Myklebust, Trond", Charles Edward Lever, Steve Dickson,
 Linux NFS Mailing List
Subject: Re: [PATCH] Adding the nfs4_secure_mounts bool
Message-ID: <20131113112346.3f5f3bd0@notabene.brown>
In-Reply-To: <20131112161634.GC15060@fieldses.org>
References: <1384037221-7224-1-git-send-email-steved@redhat.com>
 <52811CBB.3070204@RedHat.com>
 <5281290B.6000201@RedHat.com>
 <20131112161135.25a487da@notabene.brown>
 <20131112161634.GC15060@fieldses.org>
X-Mailing-List: linux-nfs@vger.kernel.org

On Tue, 12 Nov 2013 11:16:34 -0500 "J. Bruce Fields" wrote:

> On Tue, Nov 12, 2013 at 05:29:46AM +0000, Myklebust, Trond wrote:
> >
> > On Nov 12, 2013, at 0:11, NeilBrown wrote:
> >
> > > On Mon, 11 Nov 2013 15:33:14 -0500 Chuck Lever wrote:
> > >
> > >>
> > >> On Nov 11, 2013, at 1:59 PM, Steve Dickson wrote:
> > >>
> > >>> On 11/11/13 13:30, Chuck Lever wrote:
> > >>>>
> > >>>> On Nov 11, 2013, at 1:06 PM, Steve Dickson wrote:
> > >>>>
> > >>>>>
> > >>>>>
> > >>>>> On 09/11/13 18:12, Myklebust, Trond wrote:
> > >>>>>> One alternative to the above scheme, which I believe that I’ve
> > >>>>>> suggested before, is to have a permanent entry in rpc_pipefs
> > >>>>>> that rpc.gssd can open and that the kernel can use to detect
> > >>>>>> that it is running. If we make it /var/lib/nfs/rpc_pipefs/gssd/clnt00/gssd,
> > >>>>>> then AFAICS we don’t need to change nfs-utils at all, since all newer
> > >>>>>> versions of rpc.gssd will try to open for read anything of the form
> > >>>>>> /var/lib/nfs/rpc_pipefs/*/clntXX/gssd...
> > >>>>>
> > >>>>> After further review I am going to have to disagree with you on this.
> > >>>>> Since all the context is cached on the initial mount the kernel
> > >>>>> should be using the call_usermodehelper() to call up to rpc.gssd
> > >>>>> to get the context, which means we could put this upcall noise
> > >>>>> to bed... forever! :-)
> > >>>>
> > >>>> Ask Al Viro for his comments on whether the kernel should start
> > >>>> gssd (either a daemon or a script). Hint: wear your kevlar underpants.
> > >>> I was thinking gssd would become the gssd-cmd command... Al does not
> > >>> like the call_usermodehelper() interface?
> > >>
> > >> He doesn't have a problem with call_usermodehelper() in general. However, the kernel cannot guarantee security if it has to run a fixed command line. Go ask him to explain.
> > >>
> > >>
> > >>>
> > >>>>
> > >>>> Have you tried Trond's approach yet?
> > >>> Looking into it... But nothing is trivial in that code...
> > >>>
> > >>>>
> > >>>>> I realize this is not going to happen overnight, so I would still
> > >>>>> like to propose my nfs4_secure_mounts bool patch as a bridge
> > >>>>> to the new call_usermodehelper() since it's the cleanest
> > >>>>> solution so far...
> > >>>>>
> > >>>>> Thoughts?
> > >>>>
> > >>>> We have workarounds already that work on every kernel since 3.8.
> > >>>>
> > >>> The one that logs 5 to 20 lines (depending on whether things are set up or not)
> > >>> per mount? That does work in some environments but not all. ;-)
> > >>
> > >> When does running rpc.gssd not work?
> > >
> > > Oohh ooh.. Pick me. Pick me!! I can answer that one.
> > >
> > > Running rpc.gssd does not work if you are mounting a filesystem using the IP
> > > address of the server and that IP address doesn't have a matching hostname
> > > anywhere that can be found:
> > >
> > > In a newly created minimal kvm install without rpc.gssd running,
> > >    mount 10.0.2.2:/home /mnt
> > >
> > > sleeps for 15 seconds then succeeds.
> > > If I start rpc.gssd, then the same command takes forever.
> > >
> > > strace of rpc.gssd shows that it complains about not being able to resolve
> > > the host name and "ERROR: failed to read service info". Then it keeps the
> > > pipes open but never sends any message on them, so the kernel just keeps on
> > > waiting.
> > >
> > > If I change "fail_keep_client" to "fail_destroy_client", then it closes the
> > > pipe and we get the 15 second timeout back.
> > > If I change NI_NAMEREQD to 0, then the mount completes instantly. (of course
> > > that seriously compromises security so it was just for testing).
> > > (Adding an entry to /etc/hosts also gives instant success).
> > >
> > > I'm hoping that someone who understands this code will suggest something
> > > clever so I don't have to dig through all of it ;-)
> >
> > rpc.gssd is supposed to do a downcall with a zero-length window and an error message in any situation where it cannot establish a GSS context. Normally, I’d expect an EACCES for the above scenario.
> >
> > IOW: that’s a blatant rpc.gssd bug. One that will also affect you when you're doing NFSv3 and add ‘sec=krb5’ to the mount options.
>
> Also why is gssd trying to do a DNS lookup in this case? This sounds
> similar to what f9f5450f8f94 "Avoid DNS reverse resolution for server
> names (take 3)" was trying to fix?

It is quite possible that I misunderstand something. But this is my
understanding.

1/ "mount" allows you to use either an IP address or a host name to mount
   a filesystem.
2/ gss requires a hostname to identify the server and find its key (IP not
   sufficient).
3/ If you use a host name to mount a filesystem, then that exact same host
   name should be used by gssd to identify the server and its key.

   The above mentioned patch was trying to enforce this. The idea was to
   collect the name given to the 'mount', see if it looked like an IP
   address or a server name. If the latter, just use it.
   If the former, do a reverse lookup because an IP address is no use by
   itself for gss.
   Previously it would always do a reverse DNS lookup from the IP address
   that was determined from the server-name-or-IP-address.
   Unfortunately this patch was broken - got the test backwards. A
   follow-up patch fixed the test: c93e8d8eeafec3e32

4/ So the above patch was not intended to address the case of mount-by-IP
   address at all - and this is the case that is causing me problems.

But back to my problem: Following Trond's suggestion I've come up with the
following patch. Does it look right?

The "fd = -1" is just to stop us trying to close a non-open fd in an error
path.

The change from testing ->servicename to ->prog stops us from repeating the
failed DNS lookup on every request, now that the failure isn't fatal.

The last stanza makes sure we always reply to an upcall, with EINVAL if
nothing else seems appropriate.

The patch seems to work for my particular case but a more general review
would be appreciated.
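
To make the name-or-address test from point 3 concrete, here is a minimal standalone sketch. It is not the nfs-utils code: the names `gss_name_source`, `looks_like_ip_address`, `choose_gss_name_source` and `reverse_lookup_name` are hypothetical and exist only for illustration. The one real API detail it relies on is that getnameinfo() with NI_NAMEREQD fails rather than handing back a numeric string when no name can be found, which is why an address without a matching hostname cannot be used for gss.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical names throughout; none of these exist in nfs-utils. */
enum gss_name_source {
	USE_NAME_AS_GIVEN,	/* mount was given a host name: use it verbatim */
	NEED_REVERSE_LOOKUP	/* mount was given an address: a name must be found */
};

/* True if `name` parses as an IPv4 or IPv6 address literal. */
bool looks_like_ip_address(const char *name)
{
	struct in_addr v4;
	struct in6_addr v6;

	assert(name != NULL);
	return inet_pton(AF_INET, name, &v4) == 1 ||
	       inet_pton(AF_INET6, name, &v6) == 1;
}

/* Decide where the gss server name should come from. */
enum gss_name_source choose_gss_name_source(const char *mount_name)
{
	return looks_like_ip_address(mount_name) ? NEED_REVERSE_LOOKUP
						 : USE_NAME_AS_GIVEN;
}

/*
 * Reverse lookup that insists on a real name.  With NI_NAMEREQD,
 * getnameinfo() returns a non-zero EAI_* error when the address has no
 * name, instead of falling back to the numeric form.  Returns 0 on success.
 */
int reverse_lookup_name(const struct sockaddr *sa, socklen_t salen,
			char *host, socklen_t hostlen)
{
	return getnameinfo(sa, salen, host, hostlen, NULL, 0, NI_NAMEREQD);
}
```

With this split, "mount 10.0.2.2:/home /mnt" lands in the NEED_REVERSE_LOOKUP branch, so whatever calls the lookup has to cope with it failing.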

Thanks,
NeilBrown


diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
index b48d1637cd36..00b4bc779b7c 100644
--- a/utils/gssd/gssd_proc.c
+++ b/utils/gssd/gssd_proc.c
@@ -256,6 +256,7 @@ read_service_info(char *info_file_name, char **servicename, char **servername,
 	if ((nbytes = read(fd, buf, INFOBUFLEN)) == -1)
 		goto fail;
 	close(fd);
+	fd = -1;
 	buf[nbytes] = '\0';
 
 	numfields = sscanf(buf,"RPC server: %127s\n"
@@ -403,11 +404,10 @@ process_clnt_dir_files(struct clnt_info * clp)
 		return -1;
 	snprintf(info_file_name, sizeof(info_file_name), "%s/info",
 			clp->dirname);
-	if ((clp->servicename == NULL) &&
-	     read_service_info(info_file_name, &clp->servicename,
-				&clp->servername, &clp->prog, &clp->vers,
-				&clp->protocol, (struct sockaddr *) &clp->addr))
-		return -1;
+	if (clp->prog == 0)
+		read_service_info(info_file_name, &clp->servicename,
+				  &clp->servername, &clp->prog, &clp->vers,
+				  &clp->protocol, (struct sockaddr *) &clp->addr);
 	return 0;
 }
 
@@ -1320,11 +1320,14 @@ handle_gssd_upcall(struct clnt_info *clp)
 		}
 	}
 
-	if (strcmp(mech, "krb5") == 0)
+	if (strcmp(mech, "krb5") == 0 && clp->servername)
 		process_krb5_upcall(clp, uid, clp->gssd_fd, target, service);
-	else
-		printerr(0, "WARNING: handle_gssd_upcall: "
-			 "received unknown gss mech '%s'\n", mech);
+	else {
+		if (clp->servername)
+			printerr(0, "WARNING: handle_gssd_upcall: "
+				 "received unknown gss mech '%s'\n", mech);
+		do_error_downcall(clp->gssd_fd, uid, -EINVAL);
+	}
 
 out:
 	free(lbuf);
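
For illustration only, the invariant the last hunk is aiming for (every upcall gets exactly one reply, so the kernel never waits forever) can be sketched outside of gssd. Everything here is a hypothetical stand-in: `struct upcall_ctx`, `send_context_reply`, `send_error_reply` and `handle_upcall` are not nfs-utils names, and the real process_krb5_upcall() does its own success or error downcall rather than the trivial stub used below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-client state kept by the daemon. */
struct upcall_ctx {
	const char *servername;	/* NULL if the service info could not be read */
	int replies_sent;	/* how many downcalls were written */
	int last_error;		/* 0 for a context reply, negative errno otherwise */
};

/* Stub for a successful context downcall. */
static void send_context_reply(struct upcall_ctx *ctx)
{
	ctx->replies_sent++;
	ctx->last_error = 0;
}

/* Stub for an error downcall carrying a negative errno value. */
static void send_error_reply(struct upcall_ctx *ctx, int err)
{
	ctx->replies_sent++;
	ctx->last_error = err;
}

/*
 * Every path through this function sends exactly one reply: a context
 * reply when the mechanism is known and a server name is available,
 * and -EINVAL in all other cases.
 */
void handle_upcall(struct upcall_ctx *ctx, const char *mech)
{
	assert(ctx != NULL && mech != NULL);

	if (strcmp(mech, "krb5") == 0 && ctx->servername != NULL) {
		send_context_reply(ctx);
		return;
	}
	send_error_reply(ctx, -EINVAL);
}
```

The point of structuring it this way is that the "no server name" case and the "unknown mechanism" case share one fallback, so neither can fall through silently.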