On Thu, May 16, 2024 at 04:36:48PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> wrote:
On Thu, 2024-05-16 at 13:04 +0000, Huang, Kai wrote:
On Thu, 2024-05-16 at 02:57 +0000, Edgecombe, Rick P wrote:
On Thu, 2024-05-16 at 14:07 +1200, Huang, Kai wrote:
I meant it seems we should just strip shared bit away from the GPA in
handle_ept_violation() and pass it as 'cr2_or_gpa' here, so fault->addr
won't have the shared bit.
Do you see any problem of doing so?
We would need to add it back in "raw_gfn" in kvm_tdp_mmu_map().
I don't see any big difference?
Now in this patch the raw_gfn is directly from fault->addr:
raw_gfn = gpa_to_gfn(fault->addr);
tdp_mmu_for_each_pte(iter, mmu, is_private, raw_gfn, raw_gfn+1) {
...
}
But there's nothing wrong with getting the raw_gfn from fault->gfn. In
fact, the zapping code just does this:
/*
* start and end doesn't have GFN shared bit. This function zaps
* a region including alias. Adjust shared bit of [start, end) if
* the root is shared.
*/
start = kvm_gfn_for_root(kvm, root, start);
end = kvm_gfn_for_root(kvm, root, end);
So there's nothing wrong with doing the same thing in both functions.
The point is fault->gfn has shared bit stripped away at the beginning, and
AFAICT there's no useful reason to keep shared bit in fault->addr. The
entire @fault is a temporary structure on the stack during fault handling
anyway.
I would like to avoid code churn at this point if there is not a real clear
benefit.
One small benefit of keeping the shared bit in the fault->addr is that it is
sort of consistent with how that field is used in other scenarios in KVM. In
shadow paging it's not even the GPA. So it is simply the "fault address" and has
to be interpreted in different ways in the fault handler. For TDX the fault
address *does* include the shared bit. And the EPT needs to be faulted in at
that address.
If we strip the shared bit when setting fault->addr we have to reconstruct it
when we do the actual shared mapping. There is no way around that. Which helper
does it isn't important, I think. Doing the reconstruction inside
tdp_mmu_for_each_pte() could be neat, except that it doesn't know about the
shared bit position.
The zapping code's use of kvm_gfn_for_root() is different because the gfn comes
without the shared bit. It's not stripped and then added back. Those are
operations that target GFNs really.
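To make the strip-then-reconstruct point concrete, here is a minimal userspace sketch, not the patch code. The shared bit position (GPA bit 51) is only an assumed example, and example_gfn_strip_shared(), example_gfn_is_shared() and example_gfn_for_root() are hypothetical stand-ins for whatever helpers the series ends up with. It shows that once the shared bit is stripped from the fault address, some helper has to add it back before walking the shared root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

/* Assumption for illustration: the shared bit is GPA bit 51 (GFN bit 39). */
#define EXAMPLE_GFN_SHARED_MASK ((gfn_t)1 << (51 - PAGE_SHIFT))

static inline gfn_t gpa_to_gfn(gpa_t gpa)
{
	return gpa >> PAGE_SHIFT;
}

/* Hypothetical helper: drop the shared bit from a raw GFN. */
static inline gfn_t example_gfn_strip_shared(gfn_t gfn, gfn_t shared_mask)
{
	return gfn & ~shared_mask;
}

/* Hypothetical helper: does the raw GFN carry the shared bit? */
static inline bool example_gfn_is_shared(gfn_t gfn, gfn_t shared_mask)
{
	return (gfn & shared_mask) != 0;
}

/*
 * Hypothetical helper in the spirit of kvm_gfn_for_root(): takes a GFN
 * without the shared bit and adds the bit back only for a shared root.
 */
static inline gfn_t example_gfn_for_root(gfn_t gfn, gfn_t shared_mask,
					 bool root_is_private)
{
	assert(!(gfn & shared_mask));
	return root_is_private ? gfn : (gfn | shared_mask);
}
```

With these, a raw fault address that has the shared bit set round-trips: stripping gives the plain GFN, and example_gfn_for_root() on a shared root gives back the raw GFN used for the walk.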
I think the real problem is that we are gleaning whether the fault is to private
or shared memory from different things. Sometimes from fault->is_private,
sometimes the presence of the shared bits, and sometimes the role bit. I think
this is confusing, doubly so because we are using some of these things to infer
unrelated things (mirrored vs private).
It's confusing that we don't check it in a uniform way.
My guess is that you have noticed this and somehow zeroed in on the shared_mask.
I think we should straighten out the mirrored/private semantics and see what the
results look like. How does that sound to you?
I had a closer look at the related code. I think we can (mostly) uniformly use
gpa/gfn without the shared mask. Here is the proposal. We need a real patch to see
what the outcome looks like anyway. I think this is like what Kai is thinking
about.
- rename role.is_private => role.is_mirrored_pt
- sp->gfn: gfn without shared bit.
- fault->addr: without gfn_shared_mask
Actually it doesn't matter much. We can use gpa with gfn_shared_mask.
- Update struct tdp_iter
  struct tdp_iter
    gfn: gfn without shared bit

    /* Add new members */

    /* Indicates which PT to walk. */
    bool mirrored_pt;

    // This is used by tdp_iter_refresh_sptep()
    // shared gfn_mask if mirrored_pt
    // 0 if !mirrored_pt
    gfn_shared_mask
- Pass mirrored_pt and gfn_shared_mask to
tdp_iter_start(..., mirrored_pt, gfn_shared_mask)
and update tdp_iter_refresh_sptep()
  static void tdp_iter_refresh_sptep(struct tdp_iter *iter)
    ...
    iter->sptep = iter->pt_path[iter->level - 1] +
      SPTE_INDEX((iter->gfn | iter->gfn_shared_mask) << PAGE_SHIFT, iter->level);
  Change for_each_tdp_pte_min_level() accordingly.
  Also update the iterators that call this.
#define for_each_tdp_pte_min_level(kvm, iter, root, min_level, start, end) \
for (tdp_iter_start(&iter, root, min_level, start, \
mirrored_root, mirrored_root ? kvm_gfn_shared_mask(kvm) : 0); \
iter.valid && iter.gfn < kvm_gfn_for_root(kvm, root, end); \
tdp_iter_next(&iter))
- trace point: update to include mirrored_pt. Or leave it as is for now.
- pr_err() that logs gfn in handle_changed_spte()
  Update to include mirrored_pt. Or leave it as is for now.
- Update spte handlers (handle_changed_spte(), handle_removed_pt()...),
  use iter->mirrored_pt or pass down mirrored_pt.
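As a rough illustration of the tdp_iter part of the proposal, here is a small userspace sketch, not kernel code. struct example_tdp_iter, example_iter_start() and example_iter_index() are hypothetical names, the 9-bit-per-level index layout is the usual x86-64 one, and I'm assuming gfn_shared_mask is expressed in GFN units, so it is ORed into the gfn before shifting by PAGE_SHIFT. The iterator keeps the gfn without the shared bit and only folds the mask in when it computes the index into the page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT		12
#define EXAMPLE_LEVEL_BITS	9	/* 512 entries per page table page */

typedef uint64_t gfn_t;

/* Index of the entry covering @addr in a page table at @level (1 = leaf). */
static inline unsigned int example_spte_index(uint64_t addr, int level)
{
	int shift = PAGE_SHIFT + (level - 1) * EXAMPLE_LEVEL_BITS;

	return (addr >> shift) & ((1u << EXAMPLE_LEVEL_BITS) - 1);
}

/* Hypothetical cut-down iterator with the two proposed members. */
struct example_tdp_iter {
	gfn_t gfn;		/* gfn without shared bit */
	int level;
	bool mirrored_pt;	/* indicates which PT to walk */
	gfn_t gfn_shared_mask;	/* mask to fold in on the walk, or 0 */
};

static inline void example_iter_start(struct example_tdp_iter *iter,
				      gfn_t gfn, int level, bool mirrored_pt,
				      gfn_t gfn_shared_mask)
{
	assert(!(gfn & gfn_shared_mask));
	iter->gfn = gfn;
	iter->level = level;
	iter->mirrored_pt = mirrored_pt;
	iter->gfn_shared_mask = gfn_shared_mask;
}

/* The index computation tdp_iter_refresh_sptep() would need. */
static inline unsigned int example_iter_index(const struct example_tdp_iter *iter)
{
	uint64_t addr = (iter->gfn | iter->gfn_shared_mask) << PAGE_SHIFT;

	return example_spte_index(addr, iter->level);
}
```

The caller passes whichever mask is appropriate for the page table being walked. If the shared bit were GPA bit 47 (an assumed value), a non-zero mask would change only the level-4 index (by 256), and the lower-level indices would be the same as with a mask of 0.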