Re: [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in non-coherent domains

From: Alex Williamson
Date: Tue May 21 2024 - 14:20:06 EST


On Tue, 21 May 2024 13:34:00 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, May 21, 2024 at 10:21:23AM -0600, Alex Williamson wrote:
>
> > > Intel GPU weirdness should not leak into making other devices
> > > insecure/slow. If necessary Intel GPU only should get some variant
> > > override to keep no snoop working.
> > >
> > > It would make alot of good sense if VFIO made the default to disable
> > > no-snoop via the config space.
> >
> > We can certainly virtualize the config space no-snoop enable bit, but
> > I'm not sure what it actually accomplishes. We'd then be relying on
> > the device to honor the bit and not have any backdoors to twiddle the
> > bit otherwise (where we know that GPUs often have multiple paths to get
> > to config space).
>
> I'm OK with this. If devices are insecure then they need quirks in
> vfio to disclose their problems, we shouldn't punish everyone who
> followed the spec because of some bad actors.
>
> But more broadly in a security engineered environment we can trust the
> no-snoop bit to work properly.

The spec has an interesting requirement on devices sending no-snoop
transactions anyway (regarding PCI_EXP_DEVCTL_NOSNOOP_EN):

"Even when this bit is Set, a Function is only permitted to Set the No
Snoop attribute on a transaction when it can guarantee that the
address of the transaction is not stored in any cache in the system."

I wouldn't think the function itself has such visibility and it would
leave the problem of reestablishing coherency to the driver, but am I
overlooking something that implicitly makes this safe? ie. if the
function isn't permitted to perform no-snoop to an address stored in
cache, there's nothing we need to do here.

> > We also then have the question of does the device function
> > correctly if we disable no-snoop.
>
> Other than the GPU BW issue the no-snoop is not a functional behavior.

As with some other config space bits though, I think we're kind of
hoping for sloppy driver behavior to virtualize this. The spec does
allow the bit to be hardwired to zero:

"This bit is permitted to be hardwired to 0b if a Function would never
Set the No Snoop attribute in transactions it initiates."

But there's no capability bit that allows us to report whether the
device supports no-snoop, we're just hoping that a driver writing to
the bit doesn't generate a fault if the bit doesn't stick. For example
the no-snoop bit in the TLP itself may only be a bandwidth issue, but
if the driver thinks no-snoop support is enabled it may request the
device use the attribute for a specific transaction and the device
could fault if it cannot comply.

> > The more secure approach might be that we need to do these cache
> > flushes for any IOMMU that doesn't maintain coherency, even for
> > no-snoop transactions. Thanks,
>
> Did you mean 'even for snoop transactions'?

I was referring to IOMMUs that maintain coherency regardless of
no-snoop transactions, ie domain->enforce_cache_coherency (ex. snoop
control/SNP on Intel), so I meant as typed, the IOMMU maintaining
coherency even for no-snoop transactions.

That's essentially the case we expect and we don't need to virtualize
no-snoop enable on the device.

> That is where this series is, it assumes a no-snoop transaction took
> place even if that is impossible, because of config space, and then
> does pessimistic flushes.

So are you proposing that we can trust devices to honor the
PCI_EXP_DEVCTL_NOSNOOP_EN bit and virtualize it to be hardwired to zero
on IOMMUs that do not enforce coherency as the entire solution?

Or maybe we trap on setting the bit to make the flushing less
pessimistic?

Intel folks might be able to comment on the performance hit relative to
iGPU assignment of denying the device the ability to use no-snoop
transactions (assuming the device control bit is actually honored).
The latency of flushing caches on touching no-snoop enable might be
prohibitive in the latter case. Thanks,

Alex