Re: [Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind

From: Chris Wilson
Date: Thu Apr 04 2019 - 06:54:02 EST


Quoting Janusz Krzysztofik (2019-04-04 11:50:14)
> On Thu, 2019-04-04 at 11:43 +0100, Chris Wilson wrote:
> > Quoting Janusz Krzysztofik (2019-04-04 11:40:24)
> > > On Thu, 2019-04-04 at 11:28 +0100, Chris Wilson wrote:
> > > > Quoting Janusz Krzysztofik (2019-04-04 11:24:45)
> > > > > From: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxx>
> > > > >
> > > > > In case the driver gets unbound while a device is open, kernel panic
> > > > > may be forced if a list of allocated context IDs is not empty.
> > > > >
> > > > > When a device is open, the list may happen to be not empty because a
> > > > > context ID, once allocated by a context ID allocator to a context
> > > > > associated with that open file descriptor, is released as late as
> > > > > on device close.
> > > > >
> > > > > On the other hand, there is a need to release all allocated context IDs
> > > > > and destroy the context ID allocator on driver unbind, even if a device
> > > > > is open, in order to free memory resources consumed and prevent from
> > > > > memory leaks. The purpose of the forced kernel panic was to protect
> > > > > the context ID allocator from being silently destroyed if not all
> > > > > allocated IDs had been released.
> > > >
> > > > Those open fd are still pointing into kernel memory where the driver
> > > > used to be. The panic is entirely correct, we should not be unloading
> > > > the module before those dangling pointers have been made safe.
> > > >
> > > > This is papering over the symptom. How is the module being unloaded
> > > > with open fd?
> > >
> > > A user can play with the driver unbind or device remove sysfs interface.
> >
> > Sure, but we must still follow all the steps before _unloading_ the
> > module or else the user is left pointing into reused kernel memory.
>
> I'm not talking about unloading the module, that is prevented by open
> fds. The driver still exists after being unbound from a device and may
> just respond with -ENODEV.

i915_gem_contexts_fini() *is* module unload.
-Chris