Re: [PATCH v4 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt
From: Eric Anholt
Date: Thu Apr 04 2019 - 16:09:45 EST
Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> writes:
> Hey,
>
> On Wednesday, 03 April 2019 at 11:58 -0700, Eric Anholt wrote:
>> Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> writes:
>>
>> > Since the OOM interrupt directly deals with the binner bo, it doesn't
>> > make sense to try and handle it without a binner buffer registered.
>> > The interrupt will kick again in due time, so we can safely ignore it
>> > without a binner bo allocated.
>> >
>> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx>
>> > ---
>> > drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
>> > 1 file changed, 3 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
>> > index ffd0a4388752..723dc86b4511 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_irq.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
>> > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
>> > struct vc4_exec_info *exec;
>> > unsigned long irqflags;
>>
>> Since OOM handling is tricky, could we add a comment to help the next
>> person try to understand it:
>>
>> /* The OOM IRQ is level-triggered, so we'll see one at power-on before
>> * any jobs are submitted. The OOM IRQ is masked when this work is
>> * scheduled, so we can safely return if there's no binner memory
>> * (because no client is currently using 3D). When a bin job is
>> * later submitted, its tile memory allocation will end up bringing us
>> * back to a non-OOM state so the OOM can be triggered again.
>> */
>>
>> But, actually, I don't see how the OOM IRQ will ever get re-enabled.
>
> Okay so I investigated that to try and understand what's going on.
> We are definitely writing the OUTOMEM bit to V3D_INTDIS just before
> scheduling the workqueue, and never re-enable the IRQ when leaving
> early in the workqueue because !vc4->bin_bo.
>
> It turns out that what saves us here is vc4_irq_postinstall being
> called from runtime resume at "the right time". Obviously this is more
> than fragile, so we should really be re-enabling the IRQ as soon as we
> have the binner bo allocated.
>
> Since we're now allocating at the first non-dumb bo alloc, I think we
> need to make sure that we did in fact get the irq and registered the
> allocated BO with the workqueue before submitting the rcl. Or does the
> hardware provide any mechanism to take that off our hands somehow?

Maybe just enable the OOM interrupt using INTENA in the bin BO
allocation's success case? That feels race-free, since it's a level
interrupt and even if we were racing the !bin_bo check in the work, we'd
end up re-scheduling the work?
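
To make the reasoning above concrete, here is a rough userspace model of
the sequence being discussed. It is only a sketch under stated
assumptions, not driver code: every sim_* name is hypothetical, the
"registers" are plain booleans standing in for the OUTOMEM bit in
INTENA/INTDIS, and everything runs synchronously on one thread, so it
illustrates the level-triggered argument without proving the real code
race-free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names are the real vc4 API. */
struct sim_v3d {
	bool oom_level;    /* level-triggered OOM condition is asserted */
	bool oom_enabled;  /* OUTOMEM unmasked (INTENA) or masked (INTDIS) */
	bool bin_bo;       /* binner BO has been allocated */
	bool work_pending; /* overflow-mem work has been scheduled */
	int work_runs;     /* times the work actually had a binner BO */
};

/* Deliver the interrupt only if the level is asserted and unmasked. */
static void sim_irq(struct sim_v3d *v)
{
	if (!v->oom_level || !v->oom_enabled)
		return;
	v->oom_enabled = false; /* models writing OUTOMEM to INTDIS */
	v->work_pending = true; /* models scheduling the work */
}

/* Models the work item, including the early return without a binner BO. */
static void sim_overflow_mem_work(struct sim_v3d *v)
{
	if (!v->work_pending)
		return;
	v->work_pending = false;
	if (!v->bin_bo)
		return; /* nothing re-enables the IRQ on this path */
	v->oom_level = false;  /* assumption: supplying memory clears OOM */
	v->work_runs++;
	v->oom_enabled = true; /* models re-enabling via INTENA */
	sim_irq(v);
}

/* Models the suggestion: INTENA in the bin BO allocation success case. */
static void sim_bin_bo_alloc(struct sim_v3d *v)
{
	v->bin_bo = true;
	v->oom_enabled = true;
	sim_irq(v); /* level still asserted, so the IRQ fires again */
}

/* Walks through the sequence; returns how often the work did real work. */
static int sim_demo(void)
{
	struct sim_v3d v = { .oom_level = true, .oom_enabled = true };

	sim_irq(&v);               /* power-on OOM: masked, work scheduled */
	sim_overflow_mem_work(&v); /* no binner BO yet: returns early */
	assert(!v.oom_enabled && !v.work_pending); /* IRQ left masked */

	sim_bin_bo_alloc(&v);      /* success path unmasks the IRQ */
	assert(v.work_pending);    /* work got re-scheduled */
	sim_overflow_mem_work(&v);
	return v.work_runs;
}
```

In this model the early return leaves the interrupt masked with no work
pending, and the INTENA write in the allocation path is what gets the
work scheduled again, because the OOM level is still asserted at that
point.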