Re: [PATCH v2 3/4] arm64: Make debug exception handlers visible from RCU
From: James Morse
Date: Tue Jul 23 2019 - 13:08:03 EST
Hi,
On 22/07/2019 08:48, Masami Hiramatsu wrote:
> Make debug exceptions visible from RCU so that synchronize_rcu()
> correctly tracks the debug exception handler.
>
> This also introduces sanity checks for user-mode exceptions, the same
> as x86's ist_enter()/ist_exit().
>
> A debug exception can interrupt the idle task. For example, the kernel
> warns if we put a kprobe on a function called from the idle task, as below.
> The warning message shows that rcu_read_lock() triggered the
> problem. But what it actually means is that RCU has lost track of the
> context, which is already in NMI/IRQ.
> So making debug exceptions visible to RCU fixes this warning.
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9568c116ac7f..a6b244240db6 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -777,6 +777,42 @@ void __init hook_debug_fault_code(int nr,
> 	debug_fault_info[nr].name = name;
> }
>
> +/*
> + * In debug exception context, we explicitly disable preemption.
> + * This serves two purposes: it makes it much less likely that we would
> + * accidentally schedule in exception context and it will force a warning
> + * if we somehow manage to schedule by accident.
> + */
> +static void debug_exception_enter(struct pt_regs *regs)
> +{
> +	if (user_mode(regs)) {
> +		RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
Would moving entry.S's context_tracking_user_exit() call to before do_debug_exception()
also fix this?
I don't know why it's done 'after' debug exception handling. It's always been like
this: commit 6c81fe7925cc4c42 ("arm64: enable context tracking").
> +	} else {
> +		/*
> +		 * We might have interrupted pretty much anything. In
> +		 * fact, if we're a debug exception, we can even interrupt
> +		 * NMI processing.
> +		 * We don't want in_nmi() to return true,
> +		 * but we need to notify RCU.
How come? If you interrupted an SError or pseudo-NMI, in_nmi() already returns true. Those
paths should all be painted no-kprobe, but I'm sure there are gaps. The hw-breakpoints can
almost certainly hook them.
> +		 */
> +		rcu_nmi_enter();
Can we interrupt printk()? Do we need printk_nmi_enter()? ... What about ftrace?
Because SError and pseudo-NMI can interrupt interrupt-masked code, we describe them as
NMI. The only difference here is that these exceptions are synchronous.
I suspect we should treat these debug exceptions as NMI for EL1. We can then use this for
the kprobe re-entrance stuff, so the pre/post hooks don't get run if they interrupted
something also described as NMI.
> +	}
> +
> +	preempt_disable();
> +
> +	/* This code is a bit fragile. Test it. */
> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "exception_enter didn't work");
> +}
> +NOKPROBE_SYMBOL(debug_exception_enter);
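To make the "treat EL1 debug exceptions as NMI" idea concrete, here is a rough userspace model, not kernel code. Every name in it (struct cpu_model, model_nmi_enter(), model_debug_exception() and friends) is made up for illustration, and it assumes NMI-ness can be reduced to a simple per-CPU nesting count. It only shows the intended behaviour: entering the exception makes RCU watch even from idle, and the kprobe pre/post hooks are skipped when the interrupted context was already described as NMI.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; a simplification, not the kernel's layout. */
struct cpu_model {
	bool idle;	/* CPU is in the idle loop, so RCU is not watching it */
	int nmi_depth;	/* nesting count of NMI-like contexts */
};

/* Models rcu_is_watching(): RCU watches non-idle CPUs and any NMI context. */
static inline bool model_rcu_is_watching(const struct cpu_model *c)
{
	return !c->idle || c->nmi_depth > 0;
}

/* Models in_nmi(): true while at least one NMI-like context is active. */
static inline bool model_in_nmi(const struct cpu_model *c)
{
	return c->nmi_depth > 0;
}

/* Models nmi_enter(): bumps the nesting count, which also wakes RCU here. */
static inline void model_nmi_enter(struct cpu_model *c)
{
	c->nmi_depth++;
}

/* Models nmi_exit(): must pair with an earlier model_nmi_enter(). */
static inline void model_nmi_exit(struct cpu_model *c)
{
	assert(c->nmi_depth > 0);
	c->nmi_depth--;
}

/*
 * Models an EL1 debug exception that is itself treated as NMI.
 * Returns true if the kprobe pre/post hooks would be run, i.e. if the
 * interrupted context was not already an NMI-like context.
 */
static inline bool model_debug_exception(struct cpu_model *c)
{
	bool interrupted_nmi = model_in_nmi(c);	/* sample before entering */
	bool run_hooks;

	model_nmi_enter(c);
	assert(model_rcu_is_watching(c));	/* RCU sees us, even from idle */
	run_hooks = !interrupted_nmi;
	model_nmi_exit(c);

	return run_hooks;
}
```

With an idle CPU (idle = true, nmi_depth = 0), model_debug_exception() returns true and RCU is watching for the duration. Wrapped in a model_nmi_enter()/model_nmi_exit() pair, standing in for an SError or pseudo-NMI, it returns false, which is the re-entrance behaviour I have in mind.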
Thanks,
James