Re:Re: [PATCH v2] smp: avoid generic_exec_single cause system lockup
From: Thomas Gleixner
Date: Thu Jul 18 2019 - 05:58:57 EST
On Thu, 18 Jul 2019, luferry wrote:
> At 2019-07-18 16:07:58, "Thomas Gleixner" <tglx@xxxxxxxxxxxxx> wrote:
> >On Thu, 18 Jul 2019, luferry@xxxxxxx wrote:
> >
> >> From: luferry <luferry@xxxxxxx>
> >>
> >> The race can reproduced by sending wait enabled IPI in softirq/irq env
> >
> >Which code path is doing that?
>
> I checked kernel and found no code path can run into this.
For a good reason.
> Actually , i encounter with this problem by my own code.
> I need to do some specific urgent work periodicity and these
> work may run for quite a while. So i can't disable irq during these work
> which stops me from using hrtimer to do this. So i did add an extra
> sofitrq action which may invoke smp_call.
Well, from softirq handling context the only allowed interface is
smp_call_function_single_async().
The code is actually missing a warning to that effect. See below.
Vs. your proposed change. It's broken in various ways and no, we are not
going to support that and definitely we are not going to disable interrupts
around a loop over all cpus in a mask.
Thanks,
tglx
8<--------------
Subject: smp: Warn on function calls from softirq context
From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Thu, 18 Jul 2019 11:20:09 +0200
It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.
A single function call can be invoked from softirq context only via
smp_call_function_single_async().
Reported-by: luferry <luferry@xxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
kernel/smp.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -291,6 +291,15 @@ int smp_call_function_single(int cpu, sm
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress);
+ /*
+ * Can deadlock when the softirq is executed on return from
+ * interrupt and the interrupt hit between llist_add() and
+ * arch_send_call_function_single_ipi() because then this
+ * invocation sees the list non-empty, skips the IPI send
+ * and waits forever.
+ */
+ WARN_ON_ONCE(is_serving_softirq() && wait);
+
csd = &csd_stack;
if (!wait) {
csd = this_cpu_ptr(&csd_data);
@@ -416,6 +425,13 @@ void smp_call_function_many(const struct
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress && !early_boot_irqs_disabled);
+ /*
+ * Bottom half handlers are not allowed to call this as they might
+ * corrupt cfd_data when the interrupt which triggered softirq
+ * processing hit this function.
+ */
+ WARN_ON_ONCE(is_serving_softirq());
+
/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
cpu = cpumask_first_and(mask, cpu_online_mask);
if (cpu == this_cpu)