Re: [PATCH] KVM: SEV: Fix guest memory leak when handling guest requests

From: Michael Roth
Date: Mon May 20 2024 - 22:01:46 EST


On Mon, May 20, 2024 at 04:32:04PM -0700, Sean Christopherson wrote:
> On Mon, May 20, 2024, Michael Roth wrote:
> > On Mon, May 20, 2024 at 07:17:13AM -0700, Sean Christopherson wrote:
> > > On Sat, May 18, 2024, Michael Roth wrote:
> > > > Before forwarding guest requests to firmware, KVM takes a reference on
> > > > the 2 pages the guest uses for its request/response buffers. Make sure
> > > > to release these when cleaning up after the request is completed.
> > > >
> > > > Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
> > > > ---
> > >
> > > ...
> > >
> > > > @@ -3970,14 +3980,11 @@ static int __snp_handle_guest_req(struct kvm *kvm, gpa_t req_gpa, gpa_t resp_gpa
> > > > return ret;
> > > >
> > > > ret = sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &data, fw_err);
> > > > - if (ret)
> > > > - return ret;
> > > >
> > > > - ret = snp_cleanup_guest_buf(&data);
> > > > - if (ret)
> > > > - return ret;
> > > > + if (snp_cleanup_guest_buf(&data))
> > > > + return -EINVAL;
> > >
> > > EINVAL feels wrong. The input was completely valid. Also, forwarding the error
> >
> > Yah, EIO seems more suitable here.
> >
> > > to the guest doesn't seem like the right thing to do if KVM can't reclaim the
> > > response PFN. Shouldn't that be fatal to the VM?
> >
> > The thinking here is that pretty much all guest request failures will be
> > fatal to the guest being able to continue. At least, that's definitely
> > true for attestation. So reporting the error to the guest would allow that
> > failure to be propagated along by handling in the guest where it would
> > presumably be reported a little more clearly to the guest owner, at
> > which point the guest would most likely terminate itself anyway.
>
> But failure to convert a pfn back to shared is a _host_ issue, not a guest issue.
> E.g. it most likely indicates a bug in the host software stack, or perhaps a bad
> CPU or firmware bug.

No disagreement there, I think it's more correct to not propagate
any errors resulting from reclaim failure. Was just explaining why the
original code had propensity for propagating errors to guest, and why it
still needs to be done for firmware errors.

>
> > But there is a possibility that the guest will attempt access the response
> > PFN before/during that reporting and spin on an #NPF instead though. So
> > maybe the safer more repeatable approach is to handle the error directly
> > from KVM and propagate it to userspace.
>
> I was thinking more along the lines of KVM marking the VM as dead/bugged.

In practice userspace will get an unhandled exit and kill the vcpu/guest,
but we could additionally flag the guest as dead. Is there a existing
mechanism for this?

>
> > But the GHCB spec does require that the firmware response code for
> > SNP_GUEST_REQUEST be passed directly to the guest via lower 32-bits of
> > SW_EXITINFO2, so we'd still want handling to pass that error on to the
> > guest, so I made some changes to retain that behavior.
>
> If and only the hypervisor completes the event.
>
> The hypervisor must save the SNP_GUEST_REQUEST return code in the lower 32-bits
> of the SW_EXITINFO2 field before completing the Guest Request NAE event.
>
> If KVM terminates the VM, there's obviously no need to fill SW_EXITINFO2.

Yah, the v2 patch will only propagate the firmware error if reclaim was
successful.

>
> Side topic, is there a plan to ratelimit Guest Requests?
>
> To avoid the possibility of a guest creating a denial of service attack against
> the SNP firmware, it is recommended that some form of rate limiting be implemented
> should it be detected that a high number of Guest Request NAE events are being
> issued.

The guest side is upstream, and Dionna submitted HV patches last year. I think
these are the latest ones:

https://www.spinics.net/lists/kvm/msg301438.html

I think it probably makes sense to try to get the throttling support in
for 6.11

-Mike