Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
From: Peter Zijlstra
Date: Wed Apr 03 2019 - 12:01:40 EST
On Wed, Apr 03, 2019 at 11:39:09AM -0400, Alex Kogan wrote:
> >> The patch that I am looking for is to have a separate
> >> numa_queued_spinlock_slowpath() that coexists with
> >> native_queued_spinlock_slowpath() and
> >> paravirt_queued_spinlock_slowpath(). At boot time, we select the most
> >> appropriate one for the system at hand.
> Is this how this selection works today for paravirt?
> I see a PARAVIRT_SPINLOCKS config option, but IIUC you are talking about a different mechanism here.
> Can you, please, elaborate or give me a link to a page that explains that?
Oh man, you ask us to explain how paravirt patching works... that's
magic :-)
Basically, the compiler will emit a bunch of indirect calls to the
various pv_ops.*.* functions.
Then, at alternative_instructions() <- apply_paravirt() it will rewrite
all these indirect calls to direct calls to the function pointers that
are in the pv_ops structure at that time (+- more magic).
So we initialize the pv_ops.lock.* methods to the normal
native_queued_spin*() stuff; if the KVM/Xen/whatever setup code detects
pv spinlock support, it changes the methods to the paravirt_queued_*() stuff.
If you want more details, you'll just have to read
arch/x86/include/asm/paravirt*.h and arch/x86/kernel/paravirt*.c, I
don't think there's a coherent writeup of all that.
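As a very rough userspace model of the above (this is not kernel code; every
name ending in _model is made up for illustration, and the real thing rewrites
call instructions in place rather than caching a pointer), the ordering looks
something like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum slowpath_kind { SLOWPATH_NATIVE, SLOWPATH_PARAVIRT };

typedef enum slowpath_kind (*slowpath_fn)(void);

/* Hypothetical stand-in for the lock part of the pv_ops structure. */
struct lock_ops_model {
	slowpath_fn queued_spin_lock_slowpath;
};

struct pv_ops_model_t {
	struct lock_ops_model lock;
};

static enum slowpath_kind native_slowpath_model(void)
{
	return SLOWPATH_NATIVE;
}

static enum slowpath_kind paravirt_slowpath_model(void)
{
	return SLOWPATH_PARAVIRT;
}

/* The methods start out pointing at the native implementation. */
static struct pv_ops_model_t pv_ops_model = {
	.lock = { .queued_spin_lock_slowpath = native_slowpath_model },
};

/* Models a patched call site: NULL until "patching" has happened. */
static slowpath_fn patched_target_model;

/* Models hypervisor setup code switching the methods when pv spinlocks exist. */
static void hypervisor_setup_model(bool pv_spinlock_support)
{
	if (pv_spinlock_support)
		pv_ops_model.lock.queued_spin_lock_slowpath = paravirt_slowpath_model;
}

/* Models the patching step: freeze whatever is in the table right now. */
static void apply_paravirt_model(void)
{
	assert(pv_ops_model.lock.queued_spin_lock_slowpath != NULL);
	patched_target_model = pv_ops_model.lock.queued_spin_lock_slowpath;
}

/* Before patching this is an indirect call; afterwards a fixed target. */
static enum slowpath_kind call_site_model(void)
{
	if (patched_target_model)
		return patched_target_model();
	return pv_ops_model.lock.queued_spin_lock_slowpath();
}
```

The point of the model is only the ordering: whatever wants to change a
method has to do so before the patching step, because later writes to the
table no longer affect the already-patched call sites.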
> > Agreed; and until we have static_call, I think we can abuse the paravirt
> > stuff for this.
> >
> > By the time we patch the paravirt stuff:
> >
> >   check_bugs()
> >     alternative_instructions()
> >       apply_paravirt()
> >
> > we should already have enumerated the NODE topology and so nr_node_ids
> > should be set.
> >
> > So if we frob pv_ops.lock.queued_spin_lock_slowpath to
> > numa_queued_spin_lock_slowpath before that, it should all get patched
> > just right.
> >
> > That of course means the whole NUMA_AWARE_SPINLOCKS thing depends on
> > PARAVIRT_SPINLOCKS, which is a bit awkward...
> Just to mention here, the patch so far does not address paravirt, but
> our goal is to add this support once we address all the concerns for
> the native version. So we will end up with four variants for the
> queued_spinlock_slowpath(), one for each combination of
> native/paravirt and NUMA/non-NUMA. Or perhaps we do not need a
> NUMA/paravirt variant?
I wouldn't bother with a pv version of the numa aware code at all. If
you have overcommitted guests, topology is likely irrelevant anyway. If
you have 1:1 pinned guests, they'll not use pv spinlocks anyway.
So keep it to a three-way choice:
- native
- native/numa
- paravirt
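
The three options listed above could be picked roughly as follows. This is
only a sketch: struct boot_info_model and select_slowpath_model() are
hypothetical names, and letting paravirt take precedence over NUMA is an
assumption that follows from not having a pv version of the NUMA-aware code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum slowpath_choice {
	CHOICE_NATIVE,
	CHOICE_NATIVE_NUMA,
	CHOICE_PARAVIRT,
};

/* Hypothetical summary of what is known by the time patching happens. */
struct boot_info_model {
	bool pv_spinlocks;        /* hypervisor setup enabled pv spinlocks */
	unsigned int nr_node_ids; /* number of possible NUMA nodes */
};

static enum slowpath_choice select_slowpath_model(const struct boot_info_model *bi)
{
	assert(bi != NULL);

	/* Assumption: no NUMA/paravirt combination, so paravirt wins outright. */
	if (bi->pv_spinlocks)
		return CHOICE_PARAVIRT;

	/* More than one node: the NUMA-aware native slow path is worthwhile. */
	if (bi->nr_node_ids > 1)
		return CHOICE_NATIVE_NUMA;

	return CHOICE_NATIVE;
}
```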