Re: [PATCH v1] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING`

From: Jeff Xu
Date: Thu May 23 2024 - 12:21:47 EST


On Thu, May 23, 2024 at 1:24 AM David Rheinsberg <david@xxxxxxxxxxxx> wrote:
>
> Hi
>
> On Thu, May 23, 2024, at 4:25 AM, Barnabás Pőcze wrote:
> > On Thursday, May 23, 2024 at 1:23, Andrew Morton
> > <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >> It's a change to a userspace API, yes? Please let's have a detailed
> >> description of why this is OK. Why it won't affect any existing users.
> >
> > Yes, it is a uAPI change. To trigger a user-visible change, a program has to
> >
> > - create a memfd
> > - with MFD_NOEXEC_SEAL,
> > - without MFD_ALLOW_SEALING;
> > - try to add seals / check the seals.
> >
> > This change in essence reverts the kernel's behaviour to that of Linux <6.3,
> > where only `MFD_ALLOW_SEALING` enabled sealing. If a program works correctly
> > on those kernels, it will likely work correctly after this change.
> >
> > I have looked through Debian Code Search and GitHub, searching for
> > `MFD_NOEXEC_SEAL`, and I could find only a single breakage that this change
> > would cause: dbus-broker has its own memfd_create() wrapper that is aware of
> > this implicit `MFD_ALLOW_SEALING` behaviour[0], and tries to work around it.
> > This workaround will break. Luckily, however, as far as I could tell this
> > only affects the test suite of dbus-broker, not its normal operations, so I
> > believe it should be fine. I have prepared a PR with a fix[1].
>
> We asked for exactly this fix before, so I very much support this. Our test-suite in `dbus-broker` merely verifies what the current kernel behavior is (just like the kernel selftests). I am certainly ok if the kernel breaks it. I will gladly adapt the test-suite.
>
> Previous discussion was in:
>
> [PATCH] memfd: support MFD_NOEXEC alongside MFD_EXEC
> https://lore.kernel.org/lkml/20230714114753.170814-1-david@xxxxxxxxxxxx/
>
> Note that this fix is particularly important in combination with `vm.memfd_noexec=2`, since this breaks existing user-space by enabling sealing on all memfds unconditionally. I also encourage backporting to stable kernels.
>
The same applies with vm.memfd_noexec=1.
I think that problem must be addressed, either with this patch or with
a new flag.
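
For anyone who wants to observe this on a given kernel, here is a minimal
probe sketch in C. It assumes glibc's memfd_create() wrapper; the helper names
`seals_permit_sealing()` and `memfd_sealing_enabled()` are made up for
illustration. It relies on the fact that a memfd which does not allow sealing
starts out with F_SEAL_SEAL set, so sealing is still possible exactly when
F_GET_SEALS does not report F_SEAL_SEAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for memfd_create() and F_GET_SEALS */
#endif
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older libc headers; value taken from <linux/memfd.h>. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif

/*
 * Hypothetical helper: given the result of fcntl(fd, F_GET_SEALS),
 * return 1 if further seals can be added, 0 otherwise.
 */
int seals_permit_sealing(int seals)
{
    return seals >= 0 && !(seals & F_SEAL_SEAL);
}

/*
 * Hypothetical helper: create a memfd with the given flags and report
 * whether sealing is enabled on it.
 * Returns 1 (enabled), 0 (not enabled), or -1 if the memfd could not be
 * created or queried (e.g. the kernel does not know one of the flags).
 */
int memfd_sealing_enabled(unsigned int flags)
{
    int fd = memfd_create("seal-probe", flags);
    if (fd < 0)
        return -1;

    int seals = fcntl(fd, F_GET_SEALS);
    close(fd);
    if (seals < 0)
        return -1;

    return seals_permit_sealing(seals);
}
```

Calling memfd_sealing_enabled(0) under different vm.memfd_noexec settings, or
memfd_sealing_enabled(MFD_NOEXEC_SEAL) before and after the patch, should show
whether sealing got enabled without MFD_ALLOW_SEALING being requested.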

On a separate topic regarding vm.memfd_noexec:
I think that, in addition to vm.memfd_noexec = 1 and 2, there could
still be another state, 3:

=0: Do nothing.
=1: Add MFD_NOEXEC_SEAL if the application set neither MFD_EXEC nor
MFD_NOEXEC_SEAL (to help with the migration).
=2: Reject all calls without MFD_NOEXEC_SEAL (the whole
system doesn't allow executable memfds).
=3: The application must set MFD_EXEC or MFD_NOEXEC_SEAL explicitly, or
else the call is rejected.

3 is useful because it lets applications choose what to use, and
forces applications to migrate to the new semantics (this is what 2 did
before 9876cfe8).
The caveat is that 3 is less restrictive than 2, so this must be documented
clearly.
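
To make the four states concrete, here is a rough model in C of the flag
handling as listed above. It is only a sketch of the proposal, not kernel
code: the function name `apply_memfd_noexec()`, its return convention, and
the use of -EACCES for the rejection in state 3 are assumptions made for
illustration. It follows the list literally, so state 0 leaves the flags
untouched.

```c
#include <assert.h>
#include <errno.h>

/* Fallbacks for older libc headers; values taken from <linux/memfd.h>. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC        0x0010U
#endif

static_assert((MFD_EXEC & MFD_NOEXEC_SEAL) == 0, "flags must be distinct bits");

/*
 * Hypothetical model of vm.memfd_noexec = 0..3 as described in the list.
 * On success returns 0 and stores the effective flags in *out.
 * Returns -EACCES when the call would be rejected by the sysctl, and
 * -EINVAL for an unknown level or when both flags are set at once.
 */
int apply_memfd_noexec(int level, unsigned int flags, unsigned int *out)
{
    unsigned int mode = flags & (MFD_EXEC | MFD_NOEXEC_SEAL);

    /* The two flags are mutually exclusive. */
    if (mode == (MFD_EXEC | MFD_NOEXEC_SEAL))
        return -EINVAL;

    switch (level) {
    case 0:                     /* do nothing */
        break;
    case 1:                     /* default to non-executable */
        if (mode == 0)
            flags |= MFD_NOEXEC_SEAL;
        break;
    case 2:                     /* no executable memfds at all */
        if (!(flags & MFD_NOEXEC_SEAL))
            return -EACCES;
        break;
    case 3:                     /* proposed: caller must choose explicitly */
        if (mode == 0)
            return -EACCES;
        break;
    default:
        return -EINVAL;
    }

    *out = flags;
    return 0;
}
```

In this model a call with MFD_EXEC is rejected at level 2 but accepted at
level 3, which is the "less restrictive" caveat.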

-Jeff

> Reviewed-by: David Rheinsberg <david@xxxxxxxxxxxx>
>
> Thanks
> David