[RESEND PATCH 0/3] XICS emulation optimizations in KVM for PPC
From: Gautam Menghani
Date: Mon May 20 2024 - 04:20:59 EST
Optimize the XICS emulation code in KVM as per the 'performance todos'
in the comments of book3s_xics.c.
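One of those TODOs, going by the first patch title in this series, is to keep a bitmap of sources that need a resend. The following is only a minimal userspace sketch of that general idea, not the code from the patch: `struct ics_sketch`, `NR_IRQS_SKETCH`, `mark_resend()` and `collect_resend()` are hypothetical names, and `__builtin_ctzll` assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IRQS_SKETCH 1024u            /* hypothetical number of sources */
#define BITS_PER_WORD  64u
#define NR_WORDS       (NR_IRQS_SKETCH / BITS_PER_WORD)

/* Hypothetical stand-in for an ICS: one "resend" bit per source. */
struct ics_sketch {
	uint64_t resend_map[NR_WORDS];
};

/* Flag source 'src' as needing a resend. */
static void mark_resend(struct ics_sketch *ics, unsigned int src)
{
	assert(src < NR_IRQS_SKETCH);
	ics->resend_map[src / BITS_PER_WORD] |= (uint64_t)1 << (src % BITS_PER_WORD);
}

/*
 * Collect up to 'max' flagged sources into 'out', clearing each bit as it
 * is collected. Only set bits are visited, so the cost scales with the
 * number of pending resends plus the number of bitmap words, rather than
 * with the total number of sources.
 */
static unsigned int collect_resend(struct ics_sketch *ics,
				   unsigned int *out, unsigned int max)
{
	unsigned int n = 0;

	for (unsigned int w = 0; w < NR_WORDS && n < max; w++) {
		while (ics->resend_map[w] && n < max) {
			unsigned int bit =
				(unsigned int)__builtin_ctzll(ics->resend_map[w]);

			/* Clear the lowest set bit. */
			ics->resend_map[w] &= ics->resend_map[w] - 1;
			out[n++] = w * BITS_PER_WORD + bit;
		}
	}
	return n;
}
```

In this sketch a caller would mark sources as they are rejected and later drain the map in ascending source order. Locking and the actual delivery path are deliberately left out.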
Performance numbers:
1. Test case: pgbench run in a KVM on PowerVM guest for 120 seconds
2. Time taken by arch_send_call_function_single_ipi() without this
series, measured with funclatency [1]:
$ ./funclatency.py -u arch_send_call_function_single_ipi
     usecs               : count     distribution
         0 -> 1          : 7        |                                        |
         2 -> 3          : 16       |                                        |
         4 -> 7          : 141      |                                        |
         8 -> 15         : 4455631  |****************************************|
        16 -> 31         : 437981   |***                                     |
        32 -> 63         : 5036     |                                        |
        64 -> 127        : 92       |                                        |
avg = 12 usecs, total: 60,532,481 usecs, count: 4,898,904
3. Time taken by arch_send_call_function_single_ipi() with this series
applied:
$ ./funclatency.py -u arch_send_call_function_single_ipi
     usecs               : count     distribution
         0 -> 1          : 15       |                                        |
         2 -> 3          : 7        |                                        |
         4 -> 7          : 3798     |                                        |
         8 -> 15         : 4569610  |****************************************|
        16 -> 31         : 339284   |**                                      |
        32 -> 63         : 4542     |                                        |
        64 -> 127        : 68       |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |
avg = 11 usecs, total: 57,720,612 usecs, count: 4,917,325
4. This patch series has also been tested with KVM on a Power8 CPU.
[1]: https://github.com/iovisor/bcc/blob/master/tools/funclatency.py
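As a cross-check of the summary lines above, the per-call averages can be recomputed from the reported totals and counts. This is a small sketch with hypothetical helper names. It assumes the printed "avg" is the truncated integer quotient of total over count. Under that assumption the unrounded means are about 12.36 and 11.74 usecs, a reduction of roughly 5% per call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bucket counts copied from the two histograms in this mail. */
static const uint64_t before_buckets[] = { 7, 16, 141, 4455631, 437981, 5036, 92 };
static const uint64_t after_buckets[]  = { 15, 7, 3798, 4569610, 339284, 4542, 68, 0, 1 };

/* Sum all bucket counts; this should match the reported "count". */
static uint64_t sum_buckets(const uint64_t *buckets, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += buckets[i];
	return sum;
}

/* Mean latency per call in usecs, without rounding. */
static double mean_usecs(uint64_t total_usecs, uint64_t count)
{
	assert(count != 0);
	return (double)total_usecs / (double)count;
}

/* Relative reduction from 'before' to 'after', in percent. */
static double reduction_percent(double before, double after)
{
	assert(before > 0.0);
	return 100.0 * (before - after) / before;
}
```

Calling `mean_usecs(60532481, 4898904)` and `mean_usecs(57720612, 4917325)` and passing both results to `reduction_percent()` reproduces the comparison. The bucket sums equal the reported call counts in both runs.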
Changes v1 -> v1 resend
1. Add Cedric to CC
Gautam Menghani (3):
  arch/powerpc/kvm: Use bitmap to speed up resend of irqs in ICS
  arch/powerpc/kvm: Optimize the server number -> ICP lookup
  arch/powerpc/kvm: Reduce lock contention by moving spinlock from ics
    to irq_state

 arch/powerpc/kvm/book3s_hv_rm_xics.c |  8 ++--
 arch/powerpc/kvm/book3s_xics.c       | 70 ++++++++++++----------------
 arch/powerpc/kvm/book3s_xics.h       | 13 ++----
 3 files changed, 39 insertions(+), 52 deletions(-)
--
2.44.0