Re: [PATCH v5 0/4] drm: add explicit fencing
From: Daniel Vetter
Date: Fri Oct 21 2016 - 09:04:19 EST
On Thu, Oct 20, 2016 at 12:50:01PM -0200, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@xxxxxxxxxxxxxxx>
>
> Hi,
>
> Currently the Linux kernel only has an implicit fencing mechanism, through
> reservation objects, in which the fences are attached directly to buffer
> operations and userspace is unaware of what is happening. On the other hand,
> explicit fencing exposes fences to userspace so that it can handle fencing
> between producer and consumer explicitly.
>
> To support explicit fencing in the mainline kernel we de-staged the needed
> parts of the Android Sync Framework[1] to be able to send and receive fences
> to/from userspace. It has the concept of sync_file, which exposes the driver's
> fences to userspace as file descriptors.
>
> With explicit fencing we have a global mechanism that optimizes the flow of
> buffers between consumers and producers, avoiding a lot of waiting and some
> potential desktop freezes. So instead of waiting for a buffer to be processed
> by the GPU before sending it to DRM in an Atomic IOCTL we can get a sync_file
> fd from the GPU driver at the moment we submit the buffer processing. The
> compositor then passes these fds to DRM in an Atomic IOCTL, and the buffers
> will not be displayed until the fences signal, i.e., the GPU has finished
> processing the buffer and it is ready to display. In DRM the fences we wait on
> before displaying a buffer are called in-fences and they are a per-plane deal.
>
> Vice-versa, we have out-fences to synchronize the return of buffers to the GPU
> (producer) to be processed again. When DRM receives an atomic request with
> the OUT_FENCE_PTR property it creates a fence and attaches it to a per-crtc
> sync_file. It then returns the sync_file fds to userspace. With the fences
> available userspace can forward these fences to the GPU, which will wait for
> the fence to signal before starting to process the buffer again. Out-fences
> are per-crtc.
>
> While these are out-fences in DRM (the consumer) they become in-fences once
> they get to the GPU (the producer).
>
> DRM explicit fences are opt-in, as the default will still be implicit fencing.
>
> In-fences
> ---------
>
> In the first discussions on #dri-devel on IRC we decided to hide the Sync
> Framework from DRM drivers to reduce complexity, so as soon as we get the fd
> via the IN_FENCE_FD plane property we convert the sync_file fd to a struct
> fence. However a sync_file might contain more than one fence, so we created
> the fence_array concept. struct fence_array is a subclass of struct fence and
> stores a group of fences that need to be waited on together.
>
> Then we just use the already in place fence waiting support to wait on those
> fences. Once the producer calls fence_signal() for all fences we are waiting
> on, we can proceed with the atomic commit and display the framebuffers.
>
> Out-fences
> ----------
>
> Passing a pointer to the OUT_FENCE_PTR property in an atomic request enables
> out-fences. The kernel then creates a fence, attaches it to a sync_file and
> installs this file on an unused fd for each crtc. The kernel writes the fd to
> the memory pointed to by the out_fence_ptr provided. In case of error it
> writes -1.
>
> DRM core uses the already in place drm_event infrastructure to help signal
> fences; we've added a fence pointer to struct drm_pending_event. We signal
> fences when all the buffers related to that CRTC are *on* the screen.
>
> Kernel tree
> -----------
>
> For those who want them, all patches in this RFC are in my tree:
>
> https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
>
>
> v5 changes
> ----------
>
> The changes from v4 to v5 are in the patch descriptions.
>
>
> Userspace
> ---------
>
> Fence support in drm_hwcomposer is currently a work in progress.

Where are the igts? There's some fairly tricky error recovery and handling
code in there, I think we should have testcases for all of them. E.g.
out_fence property set, but then some invalid property later on to
exercise the error handling paths. Another one would be an atomic update
(maybe of a primary plane) which should work, except the fb is way too
small. That's checked by core code, but only at ->atomic_check phase, and
so could be used to exercise the even later error code.

Other things are mixing up in_fences, out_fences and events in different
ways, and making sure it all works out. And then maybe also mix in
nonblocking commit vs. blocking commit.

If you need something to generate fences: vgem has them. Chris Wilson can
point you at the code he's done in igt for testing the implicit fence
support in i915.
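
To make the first case above concrete (out_fence property set, followed by
an invalid property), here is a rough sketch written as a self-contained toy
model. Nothing in it is igt, libdrm or kernel API: the mock_* names, the
property ids, the int32_t slot type and the fd numbering are all assumptions
made up for illustration. It only models the behaviour the cover letter
describes, namely that the fd is written through the out-fence pointer on
success and -1 is written on error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property ids for this toy model; not real DRM property ids. */
enum mock_prop {
    MOCK_PROP_FB_ID,
    MOCK_PROP_OUT_FENCE_PTR,
    MOCK_PROP_INVALID,
};

struct mock_prop_set {
    enum mock_prop prop;
    uint64_t value; /* for OUT_FENCE_PTR: address of an int32_t slot */
};

#define MOCK_MAX_OUT_FENCES 8

/*
 * Toy stand-in for an atomic commit. It walks the properties in order and
 * remembers every out-fence pointer it sees. If any property is rejected,
 * it writes -1 through each remembered pointer; otherwise it writes a fake
 * fd number. Returns 0 or a negative errno value.
 */
static int mock_atomic_commit(const struct mock_prop_set *props, size_t n)
{
    int32_t *out_ptrs[MOCK_MAX_OUT_FENCES];
    size_t n_out = 0;
    int32_t next_fd = 3; /* pretend fds 0-2 are already in use */
    int ret = 0;

    for (size_t i = 0; i < n && ret == 0; i++) {
        switch (props[i].prop) {
        case MOCK_PROP_FB_ID:
            break;
        case MOCK_PROP_OUT_FENCE_PTR:
            if (n_out == MOCK_MAX_OUT_FENCES) {
                ret = -ENOMEM;
                break;
            }
            out_ptrs[n_out++] = (int32_t *)(uintptr_t)props[i].value;
            break;
        default:
            /* Unknown property: fail after the out-fence was already set up. */
            ret = -EINVAL;
            break;
        }
    }

    /* Error path: every out-fence slot seen so far must read back as -1. */
    for (size_t i = 0; i < n_out; i++)
        *out_ptrs[i] = ret ? -1 : next_fd++;

    return ret;
}
```

A real igt test would presumably do the equivalent against the atomic ioctl:
pre-fill the out-fence slot with a sentinel, submit the request with a bogus
property after OUT_FENCE_PTR, then check both the error return and that the
slot reads back as -1.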
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch