Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

From: Saravana Kannan
Date: Thu Aug 15 2019 - 21:51:42 EST


On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <frowand.list@xxxxxxxxx> wrote:
>
> On 8/9/19 10:00 PM, Saravana Kannan wrote:
> > On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <frowand.list@xxxxxxxxx> wrote:
> >>
> >> Hi Saravana,
> >>
> >> On 7/31/19 3:17 PM, Saravana Kannan wrote:
> >>> Add device-links to track functional dependencies between devices
> >>> after they are created (but before they are probed) by looking at
> >>> their common DT bindings like clocks, interconnects, etc.
> >>>
> >>> Having functional dependencies automatically added before the devices
> >>> are probed, provides the following benefits:
> >>>
> >>> - Optimizes device probe order and avoids the useless work of
> >>> attempting probes of devices that will not probe successfully
> >>> (because their suppliers aren't present or haven't probed yet).
> >>>
> >>> For example, in a commonly available mobile SoC, registering just
> >>> one consumer device's driver at an initcall level earlier than the
> >>> supplier device's driver causes 11 failed probe attempts before the
> >>> consumer device probes successfully. This was with a kernel with all
> >>> the drivers statically compiled in. This problem gets a lot worse if
> >>> all the drivers are loaded as modules without direct symbol
> >>> dependencies.
> >>>
> >>> - Supplier devices like clock providers, interconnect providers, etc
> >>> need to keep the resources they provide active and at a particular
> >>> state(s) during boot up even if their current set of consumers don't
> >>> request the resource to be active. This is because the rest of the
> >>> consumers might not have probed yet and turning off the resource
> >>> before all the consumers have probed could lead to a hang or
> >>> undesired user experience.
> >>>
> >>> Some frameworks (Eg: regulator) handle this today by turning off
> >>> "unused" resources at late_initcall_sync and hoping all the devices
> >>> have probed by then. This is not a valid assumption for systems with
> >>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>> this due to the lack of a clear signal for when they can turn off
> >>> resources. This leads to downstream hacks to handle cases like this
> >>> that can easily be solved in the upstream kernel.
> >>>
> >>> By linking devices before they are probed, we give suppliers a clear
> >>> count of the number of dependent consumers. Once all of the
> >>> consumers are active, the suppliers can turn off the unused
> >>> resources without making assumptions about the number of consumers.
> >>>
> >>> By default we just add device-links to track "driver presence" (probe
> >>> succeeded) of the supplier device. If any other functionality provided
> >>> by device-links are needed, it is left to the consumer/supplier
> >>> devices to change the link when they probe.
> >>>
> >>> v1 -> v2:
> >>> - Drop patch to speed up of_find_device_by_node()
> >>> - Drop depends-on property and use existing bindings
> >>>
> >>> v2 -> v3:
> >>> - Refactor the code to have driver core initiate the linking of devs
> >>> - Have driver core link consumers to supplier before it's probed
> >>> - Add support for drivers to edit the device links before probing
> >>>
> >>> v3 -> v4:
> >>> - Tested edit_links() on system with cyclic dependency. Works.
> >>> - Added some checks to make sure device link isn't attempted from
> >>> parent device node to child device node.
> >>> - Added way to pause/resume sync_state callbacks across
> >>> of_platform_populate().
> >>> - Recursively parse DT node to create device links from parent to
> >>> suppliers of parent and all child nodes.
> >>>
> >>> v4 -> v5:
> >>> - Fixed copy-pasta bugs with linked list handling
> >>> - Walk up the phandle reference till I find an actual device (needed
> >>> for regulators to work)
> >>> - Added support for linking devices from regulator DT bindings
> >>> - Tested the whole series again to make sure cyclic dependencies are
> >>> broken with edit_links() and regulator links are created properly.
> >>>
> >>> v5 -> v6:
> >>> - Split, squashed and reordered some of the patches.
> >>> - Refactored the device linking code to follow the same code pattern for
> >>> any property.
> >>>
> >>> v6 -> v7:
> >>> - No functional changes.
> >>> - Renamed i to index
> >>> - Added comment to clarify not having to check property name for every
> >>> index
> >>> - Added "matched" variable to clarify code. No functional change.
> >>> - Added comments to include/linux/device.h for add_links()
> >>>
> >>> v7 -> v8:
> >>> - Rebased on top of linux-next to handle device link changes in [1]
> >>>
> >>
> >>
> >>> v8 -> v9:
> >>> - Fixed kbuild test bot reported errors (docs and const)
> >>
> >> Some maintainers have strong opinions about whether change logs should be:
> >>
> >> (1) only in patch 0
> >> (2) only in the specific patches that are changed
> >> (3) both in patch 0 and in the specific patches that are changed.
> >>
> >> I can adapt to any of the three styles. But for style "(1)" please
> >> list which specific patch has changed for each item in the change list.
> >>
> >
> > Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
> > with (1) for this series. Didn't realize there were options (2) and
> > (3). Since you started reviewing from v7, I'll do that in the future
> > updates? Also, I haven't forgotten your emails. Just tied up with
> > something else for a few days. I'll get to your emails next week.
>
> Yes, starting with future updates is fine, no need to redo the v9
> change logs.
>
> No problem on the timing. I figured you were busy or away from the
> internet.

I'm replying to your comments on the other 3 patches. Okay with a
majority of them. I'll wait for your reply to see where we settle for
some of the points before I send out any patches though.

For now I'm thinking of sending them as separate clean up patches so
that Greg doesn't have to deal with reverts in his "next" branch. We
can squash them later if we really need to rip out what's in there and
push it again.

-Saravana