Re: [PATCH v2 0/3] fuse: Add support for mounts from pid/user namespaces

From: Seth Forshee
Date: Tue Sep 23 2014 - 13:33:14 EST


On Tue, Sep 23, 2014 at 07:03:47PM +0200, Miklos Szeredi wrote:
> On Tue, Sep 23, 2014 at 6:26 PM, Seth Forshee
> <seth.forshee@xxxxxxxxxxxxx> wrote:
> > On Tue, Sep 23, 2014 at 06:07:35PM +0200, Miklos Szeredi wrote:
> >> On Tue, Sep 2, 2014 at 5:44 PM, Seth Forshee <seth.forshee@xxxxxxxxxxxxx> wrote:
> >> > Here's an updated set of patches for allowing fuse mounts from pid and
> >> > user namespaces. I discussed some of the issues we debated with the last
> >> > patch set (and a few others) with Eric at LinuxCon, and the updates here
> >> > mainly reflect the outcome of those discussions.
> >> >
> >> > The stickiest issue in the v1 patches was the question of where to get
> >> > the user and pid namespaces from that are used for translating ids for
> >> > communication with userspace. Eric told me that for user namespaces at
> >> > least we need to grab a namespace at open or mount time and use only
> >> > that namespace to prevent certain types of attacks.
> >>
> >> I'm not convinced. Let us have the gory details, please.
> >
> > I'll do my best, and hopefully Eric will fill in any details I miss.
> >
> > I think there may have been more than one possible scenario that Eric
> > was describing to me, but this is the one I remember. A user could
> > create a namespace and mount a fuse filesystem without nosuid. It could
> > then pass the /dev/fuse fd to a process in init_user_ns, which could
> > expose a suid file owned by root (or any other user) and use it to gain
> > elevated privileges.
> >
> > On the other hand, if file ownership is always interpreted in the
> > context of the namespace from which the filesystem is mounted then suid
> > can only be used to become another uid already under that user's
> > control.
>
> Much simpler solution: don't allow SUID in unprivileged namespaces.
> SUID is a really ugly hack with many problems, just get rid of it.

Yeah, that's an option, but it would require some vfs support similar to
what exists for nodev, so I wouldn't call it simpler. The implementation of using a
single namespace for all uid/gid translation is actually quite a bit
simpler than trying to use the context of the /dev/fuse reads and
writes. The reason is that much of the translation happens in the
context of the process doing fs operations on the fuse mount, not in the
context of the /dev/fuse reads and writes, so translating per request
involves a lot of grabbing references to namespaces and passing them
around. The alternative is to move that code into the read/write path,
which would also require some substantial changes.
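
To make the single-namespace idea concrete, here is a rough userspace
model in Python. It is not code from the patches. Extent, UserNamespace,
FuseConn and owner_from_daemon are hypothetical names made up for this
sketch, and the 100000/65536 mapping is only an assumed example. The
translation helpers play roughly the role that make_kuid() and
from_kuid() play in the kernel.

```python
from dataclasses import dataclass
from typing import Tuple

INVALID_UID = -1  # stand-in for the kernel's INVALID_UID (illustrative value)


@dataclass(frozen=True)
class Extent:
    """One line of a uid_map: ns_first host_first count."""
    ns_first: int
    host_first: int
    count: int


@dataclass(frozen=True)
class UserNamespace:
    """Hypothetical model of a user namespace's uid mapping."""
    extents: Tuple[Extent, ...]

    def to_host(self, ns_uid: int) -> int:
        # Namespace-relative uid -> host uid, or INVALID_UID if unmapped.
        for e in self.extents:
            if e.ns_first <= ns_uid < e.ns_first + e.count:
                return e.host_first + (ns_uid - e.ns_first)
        return INVALID_UID

    def from_host(self, host_uid: int) -> int:
        # Host uid -> namespace-relative uid, or INVALID_UID if unmapped.
        for e in self.extents:
            if e.host_first <= host_uid < e.host_first + e.count:
                return e.ns_first + (host_uid - e.host_first)
        return INVALID_UID


# The initial namespace maps every uid to itself.
INIT_USER_NS = UserNamespace((Extent(0, 0, 2**32 - 1),))


@dataclass(frozen=True)
class FuseConn:
    """Hypothetical model of a fuse connection.

    user_ns is captured once at mount time and never replaced, so it does
    not matter which process later reads or writes /dev/fuse.
    """
    user_ns: UserNamespace

    def owner_from_daemon(self, reported_uid: int) -> int:
        # The uid reported by the fuse daemon is always interpreted in the
        # mount-time namespace, regardless of the daemon's current namespace.
        return self.user_ns.to_host(reported_uid)


if __name__ == "__main__":
    # Assumed example mapping: uids 0..65535 in the namespace are
    # host uids 100000..165535.
    container_ns = UserNamespace((Extent(0, 100000, 65536),))
    conn = FuseConn(user_ns=container_ns)

    # A suid file the daemon claims is owned by uid 0 resolves to a host
    # uid inside the mounter's range, not to host root.
    print(conn.owner_from_daemon(0))
    # A uid outside the mapping is not representable at all.
    print(conn.owner_from_daemon(70000))
```

In this model, handing the /dev/fuse fd to a process in init_user_ns
changes nothing, because owner_from_daemon() never consults the caller's
namespace. A suid file can then only resolve to a host uid that the
mounting user's namespace already maps.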

Thanks,
Seth
