Re: [PATCH 2/3] seccomp: release task filters when the task exits
From: Andrei Vagin
Date: Wed May 22 2024 - 02:49:59 EST
On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> On 05/16, Oleg Nesterov wrote:
> >
> > On 05/15, Andrei Vagin wrote:
> > >
> > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> >
> > Yes. But we only need to consider them in the multi-thread case, right?
> > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Ah, no. seccomp_filter_release() does need to take ->siglock even if we
> forget about proc_pid_seccomp_cache().
>
> Without siglock
>
> 	orig = tsk->seccomp.filter;
>
> can leak into the critical section in exit_signals() (spin_unlock is the
> one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
>
> Hmm. I thought we have something smp_mb__after_unlock(), but it seems we
> don't. So we can't add a fast-path

We have smp_mb__after_unlock_lock() in include/linux/rcupdate.h.

>
> 	if (!tsk->seccomp.filter)
> 		return;
>
> check at the start of seccomp_filter_release().
>
>
> Cough... Now that I look at seccomp_can_sync_threads() I think it too
> doesn't need the PF_EXITING check.
>
> If it is called before seccomp_filter_release(), this doesn't really
> differ from the case when it is called before do_exit/exit_signals.
>
> If it is called after seccomp_filter_release(), then is_ancestor()
> must be true.
>
> But perhaps I missed something, I won't insist, up to you.
>
> > > If we check PF_EXITING in all of them, we don't need to take ->siglock in
> > > seccomp_filter_release. Does it sound right?
> >
> > The problem is a single-threaded exiting task. In this case exit_signals()
> > sets PF_EXITING lockless. This means that in this case
> >
> > - proc_pid_seccomp_cache() can't rely on the PF_EXITING check
> > but it can be safely removed.
> >
> > - seccomp_filter_release() needs to take ->siglock to avoid the
> > race with proc_pid_seccomp_cache().
> >
> > And this chunk from your patch
> >
> > static void __seccomp_filter_orphan(struct seccomp_filter *orig)
> > {
> > + lockdep_assert_held(&current->sighand->siglock);
> > +
> >
> > looks unnecessary too, seccomp_filter_release() can just do
> >
> > 	spin_lock_irq(siglock);
> > 	orig = tsk->seccomp.filter;
> > 	tsk->seccomp.filter = NULL;
> > 	spin_unlock_irq(siglock);
> >
> > 	__seccomp_filter_release(orig);
> >
> > Right?
> >
> > Oleg.
>
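
For illustration only, here is a minimal userspace sketch of the pattern quoted above: detach the pointer under the lock, then drop the reference after unlocking. It is not kernel code. A pthread mutex stands in for ->siglock, and struct task, struct filter, filter_put(), task_filter_release(), task_has_filter() and filters_freed are all hypothetical names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct filter {
	int refs;			/* simplified, non-atomic reference count */
};

struct task {
	pthread_mutex_t siglock;	/* plays the role of ->sighand->siglock */
	struct filter *filter;		/* plays the role of ->seccomp.filter */
};

static int filters_freed;		/* counts frees so the effect is observable */

/* Drop one reference and free on the last one. Called without the lock held. */
static void filter_put(struct filter *f)
{
	if (!f)
		return;
	assert(f->refs > 0);
	if (--f->refs == 0) {
		free(f);
		filters_freed++;
	}
}

/* Detach the filter under the lock, release it only after unlocking. */
static void task_filter_release(struct task *t)
{
	struct filter *orig;

	pthread_mutex_lock(&t->siglock);
	orig = t->filter;
	t->filter = NULL;
	pthread_mutex_unlock(&t->siglock);

	filter_put(orig);
}

/* Reader side: look at the pointer only while holding the same lock. */
static int task_has_filter(struct task *t)
{
	int ret;

	pthread_mutex_lock(&t->siglock);
	ret = t->filter != NULL;
	pthread_mutex_unlock(&t->siglock);
	return ret;
}
```

Because both the release path and the reader take the same lock, a reader sees either the old pointer or NULL, and the potentially expensive release runs outside the critical section. Calling task_filter_release() a second time is harmless in this sketch, since the pointer is already NULL.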