Re: [PATCH] drm/i915: Use drm_dev_unplug()

From: Chris Wilson
Date: Fri Apr 05 2019 - 04:25:08 EST


Quoting Janusz Krzysztofik (2019-04-05 09:11:54)
> On Fri, 2019-04-05 at 08:41 +0100, Chris Wilson wrote:
> > Quoting Janusz Krzysztofik (2019-04-05 08:26:57)
> > > From: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxx>
> > >
> > > The driver does not currently support unbinding from a device which is
> > > in use. Since open file descriptors may still be pointing into kernel
> > > memory where the device structures used to be, entirely correct kernel
> > > panics protect the driver from being unbound as we should not be
> > > unbinding it before those dangling pointers have been made safe.
> > >
> > > According to the documentation found inside drivers/gpu/drm/drm_drv.c,
> > > drm_dev_unplug() should be used instead of drm_dev_unregister() in
> > > order to make a device inaccessible to users as soon as it is unplugged.
> > > Follow that advice to make those possibly dangling pointers safe,
> > > protected by the DRM layer from a user who is otherwise left pointing
> > > into possibly reused kernel memory after the driver has been unbound
> > > from the device.
> > >
> > > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxx>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 9df65d386d11..66163378c481 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
> > >  	i915_pmu_unregister(dev_priv);
> > >
> > >  	i915_teardown_sysfs(dev_priv);
> > > -	drm_dev_unregister(&dev_priv->drm);
> > > +	drm_dev_unplug(&dev_priv->drm);
> >
> > I think we may have our onion inverted here. We want to stop the users
> > as the first step, then start removing the entries. (That will also
> > nicely invert the order from register, which is what we typically
> > expect).
> >
> > After calling i915_driver_unregister(), call i915_gem_set_wedged() to
> > immediately (give or take external fences) cancel inflight operations.
>
> OK, thanks. Do you prefer them squashed or as separate patches?

Quite happy to do the s/unregister/unplug/ and move in one go. Have a
pre-emptive
Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
on that as that seems to be the right thing to do.

And there should be no issues in placing an i915_gem_set_wedged() call
immediately after the call to i915_driver_unregister(), so if you include
a line of commentary about why, for example

	/*
	 * After unregistering the device to prevent any new users, cancel
	 * all in-flight requests so that we can quickly unbind the active
	 * resources.
	 */
	i915_gem_set_wedged(dev_priv);

Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
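
To spell out the ordering, here is a rough user-space sketch, not i915
code. Every mock_* name is hypothetical and only stands in for the real
call named in its comment; the assert in mock_set_wedged() is the
sketch's own check, not something the driver does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Teardown steps, recorded in the order they run. */
enum mock_step { MOCK_UNPLUG, MOCK_TEARDOWN_SYSFS, MOCK_WEDGE };

/* Hypothetical stand-in for the device, not struct drm_i915_private. */
struct mock_device {
	bool unplugged;			/* new users are refused */
	bool wedged;			/* in-flight requests were cancelled */
	int inflight;			/* requests still outstanding */
	enum mock_step steps[3];
	size_t nsteps;
};

void mock_record(struct mock_device *dev, enum mock_step step)
{
	if (dev->nsteps < 3)
		dev->steps[dev->nsteps++] = step;
}

/* A user submitting work; refused once the device is unplugged. */
bool mock_submit(struct mock_device *dev)
{
	if (dev->unplugged)
		return false;
	dev->inflight++;
	return true;
}

/* Models the reordered i915_driver_unregister(): stop users first. */
void mock_driver_unregister(struct mock_device *dev)
{
	dev->unplugged = true;		/* stands in for drm_dev_unplug() */
	mock_record(dev, MOCK_UNPLUG);
	mock_record(dev, MOCK_TEARDOWN_SYSFS);	/* then remove the entries */
}

/* Models i915_gem_set_wedged(): cancel whatever is still in flight. */
void mock_set_wedged(struct mock_device *dev)
{
	/* Sketch-only check: no new work may arrive while we cancel. */
	assert(dev->unplugged);
	dev->inflight = 0;
	dev->wedged = true;
	mock_record(dev, MOCK_WEDGE);
}

/* Models the relevant part of i915_driver_unload(). */
void mock_driver_unload(struct mock_device *dev)
{
	mock_driver_unregister(dev);
	mock_set_wedged(dev);
}
```

Once mock_driver_unload() returns, mock_submit() fails and nothing is
left in flight, which is the state we want before unbinding resources.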

I think overall though, we need to go through i915_driver_unload() and
push the module cleanup operations to i915_driver_release -- that will
take a bit of surgery to separate the different phases that are
currently smashed together.
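
As a rough picture of that split, again a user-space sketch with
hypothetical mock_* names and not the actual surgery: hardware teardown
happens at unbind, while the software state is only freed from the
release step, once the last open file has dropped its reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted DRM device. */
struct mock_drm_device {
	int refcount;		/* one for the driver binding, one per open file */
	bool unplugged;		/* set at unbind, refuses new opens */
	bool hw_torn_down;	/* unload phase: done at unbind */
	bool released;		/* release phase: done on the last put */
};

void mock_dev_init(struct mock_drm_device *dev)
{
	*dev = (struct mock_drm_device){ .refcount = 1 };
}

/* Drop a reference; the last one triggers the release phase. */
void mock_dev_put(struct mock_drm_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->released = true;	/* stands in for i915_driver_release() */
}

/* Opening the device takes a reference, unless already unplugged. */
bool mock_open(struct mock_drm_device *dev)
{
	if (dev->unplugged)
		return false;
	dev->refcount++;
	return true;
}

void mock_close(struct mock_drm_device *dev)
{
	mock_dev_put(dev);
}

/* Unbind: stop users, tear down hardware, drop the binding's reference. */
void mock_unbind(struct mock_drm_device *dev)
{
	dev->unplugged = true;
	dev->hw_torn_down = true;	/* stands in for i915_driver_unload() */
	mock_dev_put(dev);
}
```

With a file still open across mock_unbind(), released stays false until
the matching mock_close(), so the memory behind that fd is never reused
underneath it.
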
-Chris