Re: 8b275b3754 ("x86/irq/64: Remap the IRQ stack with guard pages"): BUG: unable to handle kernel paging request at ffffb659000a1000
From: Andy Lutomirski
Date: Sat Apr 06 2019 - 09:55:22 EST
On Fri, Apr 5, 2019 at 11:38 PM kernel test robot <lkp@xxxxxxxxx> wrote:
>
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/stackguards
>
> commit 8b275b3754465d502d393f8ae8dd355b7067e73f
> Author: Andy Lutomirski <luto@xxxxxxxxxx>
> AuthorDate: Fri Jul 13 19:01:23 2018 -0700
> Commit: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CommitDate: Fri Apr 5 17:04:10 2019 +0200
>
> x86/irq/64: Remap the IRQ stack with guard pages
>
> The IRQ stack lives in percpu space, so an IRQ handler that overflows it
> will overwrite other data structures.
>
> Use vmap() to remap the IRQ stack so that it will have the usual guard
> pages that vmap/vmalloc allocations have. With this the kernel will panic
> immediately on an IRQ stack overflow.
>
> [ tglx: Move the map code to a proper place and invoke it only when a CPU
> is about to be brought online. No point in installing the map at
> early boot for all possible CPUs. Fail the CPU bringup if the vmap
> fails as done for all other preparatory stages in cpu hotplug. ]
>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
I haven't spotted the actual bug yet, but the faulting instruction is:
  2a: 65 8b 35 09 ca 75 63    mov %gs:*0x6375ca09(%rip),%esi    # 0x6375ca3a <-- trapping instruction
The faulting address appears to be just above the top of the stack
(the value in RSP), so I suspect that some path is loading the
remapped address into GSBASE, which is wrong.
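
For readers decoding the trapping instruction, here is a minimal sketch of how a %gs-prefixed, RIP-relative operand resolves. The helper names rip_rel_target() and gs_effective() are made up for illustration and are not kernel functions. The 0x2a offset is relative to the start of the code dump, so the result reproduces the disassembler's "# 0x6375ca3a" annotation rather than the real percpu offset, which depends on the actual RIP.

```c
#include <stdint.h>

/*
 * Target of a RIP-relative operand: the address of the *next* instruction
 * plus the sign-extended 32-bit displacement. Unsigned 64-bit arithmetic
 * wraps, which is how kernel text addresses plus a displacement can land
 * on a small percpu offset.
 */
static inline uint64_t rip_rel_target(uint64_t insn_addr, unsigned insn_len,
                                      int32_t disp32)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp32;
}

/*
 * With a %gs segment override, the CPU adds GSBASE to that target.
 * If GSBASE holds the wrong value, the access lands at the wrong
 * linear address and can fault.
 */
static inline uint64_t gs_effective(uint64_t gsbase, uint64_t target)
{
    return gsbase + target;
}
```

The instruction bytes 65 8b 35 09 ca 75 63 are 7 bytes long with a little-endian displacement of 0x6375ca09, so rip_rel_target(0x2a, 7, 0x6375ca09) gives 0x6375ca3a, matching the annotation above. The linear address actually accessed is then gs_effective(GSBASE, target), which is why a bad GSBASE would explain the fault.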
Also, FWIW, there was some reason that I initialized all the virtual
mappings for all possible CPUs early. I don't remember what it was,
and it may not have been a good reason, but I put at least some
nonzero amount of thought into it :)
--Andy