KASAN KUnit test '65 vmalloc_oob' fails on qemu ppc and PowerMac G4 DP when the machine has more than 867 MB of RAM (kernel 6.9.1)

From: Erhard Furtner
Date: Mon May 20 2024 - 19:04:46 EST


Greetings!

With https://lore.kernel.org/linux-mm/20240517130118.759301-1-andrey.konovalov@xxxxxxxxx/T/#u applied, the KASAN KUnit test suite now finishes on my PowerMac G4 DP and on qemu.

On ppc, however, one test fails (65 vmalloc_oob) that does not fail on my x86_64 machine:
[...]
BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x1d0/0x3cc
Read of size 1 at addr f10457f3 by task kunit_try_catch/190

CPU: 0 PID: 190 Comm: kunit_try_catch Tainted: G B N 6.9.1-PMacG4-dirty #1
Hardware name: PowerMac3,1 7450 0x80000201 PowerMac
Call Trace:
[f197bd60] [c15f48ac] dump_stack_lvl+0x80/0xac (unreliable)
[f197bd80] [c04c3f14] print_report+0xd4/0x4fc
[f197bdd0] [c04c456c] kasan_report+0xf8/0x10c
[f197be50] [c04c723c] vmalloc_oob+0x1d0/0x3cc
[f197bed0] [c0c29e98] kunit_try_run_case+0x3bc/0x5d8
[f197bfa0] [c0c2f1c8] kunit_generic_run_threadfn_adapter+0xa4/0xf8
[f197bfc0] [c00facf8] kthread+0x384/0x394
[f197bff0] [c002e304] start_kernel_thread+0x10/0x14

The buggy address belongs to the virtual mapping at
[f1045000, f1047000) created by:
vmalloc_oob+0x70/0x3cc

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x79f8b
flags: 0x80000000(zone=2)
page_type: 0xffffffff()
raw: 80000000 00000000 00000122 00000000 00000000 00000000 ffffffff 00000001
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
f1045680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f1045700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>f1045780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 f8
^
f1045800: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
f1045880: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
==================================================================
BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x294/0x3cc
Read of size 1 at addr f10457f8 by task kunit_try_catch/190

CPU: 0 PID: 190 Comm: kunit_try_catch Tainted: G B N 6.9.1-PMacG4-dirty #1
Hardware name: PowerMac3,1 7450 0x80000201 PowerMac
Call Trace:
[f197bd60] [c15f48ac] dump_stack_lvl+0x80/0xac (unreliable)
[f197bd80] [c04c3f14] print_report+0xd4/0x4fc
[f197bdd0] [c04c456c] kasan_report+0xf8/0x10c
[f197be50] [c04c7300] vmalloc_oob+0x294/0x3cc
[f197bed0] [c0c29e98] kunit_try_run_case+0x3bc/0x5d8
[f197bfa0] [c0c2f1c8] kunit_generic_run_threadfn_adapter+0xa4/0xf8
[f197bfc0] [c00facf8] kthread+0x384/0x394
[f197bff0] [c002e304] start_kernel_thread+0x10/0x14

The buggy address belongs to the virtual mapping at
[f1045000, f1047000) created by:
vmalloc_oob+0x70/0x3cc

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x79f8b
flags: 0x80000000(zone=2)
page_type: 0xffffffff()
raw: 80000000 00000000 00000122 00000000 00000000 00000000 ffffffff 00000001
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
f1045680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f1045700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>f1045780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 f8
^
f1045800: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
f1045880: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
# vmalloc_oob: ASSERTION FAILED at mm/kasan/kasan_test.c:1680
Expected p_ptr is not null, but is
not ok 65 vmalloc_oob
[...]

I get this regardless of whether I build with GCC 13.2 or Clang 18 (https://github.com/ClangBuiltLinux/linux/issues/2020). Nathan also found that the test fails when the machine has more than 867 MB of RAM.

In the original upstream discussion (https://lore.kernel.org/linux-mm/CA+fCnZeeJub5iCwwwGM2pDt9wzX=T4+wpZbbGhKQ7Qbtb+tFeA@xxxxxxxxxxxxxx/#t) Andrey suggested opening a separate thread, pointing out that the issue could originate in vmalloc:

> [...]
> Yeah, I suspect this is something ppc-specific and might not even be
> KASAN-related: somehow vmalloc_to_page + page_address return NULL. A
> separate thread with ppc maintainers makes sense.

The issue can also be seen on qemu. I run a kernel built with my attached .config via:
qemu-system-ppc -machine mac99,via=pmu -cpu 7450 -m 2G -nographic -append console=ttyS0 -kernel /var/cache/distfiles/vmlinux-6.9.1-PMacG4-dirty -hda Debian-VM_g4.img

Regards,
Erhard

Attachment: config_691_g4+
Description: Binary data