Regulator Potential Deadlock

From: Charles Keepax
Date: Wed Apr 03 2019 - 09:55:44 EST


Hi Guys,

Was testing some of my hardware and hit this potential lockup:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&(&rdev->disable_work)->work));
                               lock(regulator_list_mutex);
                               lock((work_completion)(&(&rdev->disable_work)->work));
  lock(regulator_list_mutex);

 *** DEADLOCK ***

Looks like it comes from this patch:

commit f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking")

The basic problem appears to be that regulator_unregister takes
the regulator_list_mutex and then calls flush_work on
disable_work. But regulator_disable_work calls
regulator_lock_dependent, which will also take the
regulator_list_mutex, resulting in a deadlock if the flush_work
call actually needs to flush the work.
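
To make the cycle concrete, here is a small user-space sketch of the
two dependencies. It is only an illustration: the lock_id enum, the
dep table and the model_* helpers are made-up names, and this is not
how lockdep itself is implemented. Each path records "this was held
while that was acquired", and a path reports a problem if the reverse
dependency has already been seen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two lock classes in the report. */
enum lock_id { LIST_MUTEX, DISABLE_WORK, NR_LOCKS };

/* dep[a][b] is true once b has been acquired while a was held. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Record "held -> wanted"; return true if the reverse edge already exists. */
bool record_dep(enum lock_id held, enum lock_id wanted)
{
	assert(held != wanted);	/* a lock never depends on itself here */
	dep[held][wanted] = true;
	return dep[wanted][held];
}

/* regulator_unregister: holds the list mutex, then flush_work waits on the work. */
bool model_unregister_path(void)
{
	return record_dep(LIST_MUTEX, DISABLE_WORK);
}

/* regulator_disable_work: the work is running, then it takes the list mutex. */
bool model_disable_work_path(void)
{
	return record_dep(DISABLE_WORK, LIST_MUTEX);
}
```

Calling model_unregister_path() first and model_disable_work_path()
second makes the second call report the inversion, which matches the
two orderings shown in the lockdep scenario.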

The locking appears to have got quite complex since the last time I
looked at it and I am having a little difficulty working out
exactly what is protecting what.

I am wondering if the flush_work can just be moved outside the
regulator_list_mutex in regulator_unregister, since that mutex
doesn't seem to protect against the work being queued anyway?
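
As a rough illustration of the reordering idea, the sketch below
compares the two orderings in a single-threaded toy model. Everything
in it (toy_mutex, toy_rdev, the toy_* functions) is hypothetical and
is not the regulator core code; a failed trylock stands in for the
point where a real mutex_lock would sleep forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive mutex: only tracks whether it is held. */
struct toy_mutex { bool held; };

/* Hypothetical miniature of a regulator device with pending disable work. */
struct toy_rdev {
	struct toy_mutex *list_mutex;
	bool work_pending;
	bool work_blocked;	/* set if the work found the mutex already held */
};

bool toy_trylock(struct toy_mutex *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Models regulator_disable_work: it needs the list mutex to make progress. */
void toy_disable_work(struct toy_rdev *rdev)
{
	if (!toy_trylock(rdev->list_mutex)) {
		rdev->work_blocked = true;	/* a real mutex_lock would sleep here */
		return;
	}
	rdev->work_pending = false;
	toy_unlock(rdev->list_mutex);
}

/* Models flush_work: runs the work to completion if it is pending. */
void toy_flush_work(struct toy_rdev *rdev)
{
	if (rdev->work_pending)
		toy_disable_work(rdev);
}

/* Current ordering: flush while holding the list mutex. */
bool toy_unregister_flush_inside(struct toy_rdev *rdev)
{
	if (!toy_trylock(rdev->list_mutex))
		return false;
	toy_flush_work(rdev);
	toy_unlock(rdev->list_mutex);
	return !rdev->work_blocked;
}

/* Proposed ordering: flush first, then take the list mutex. */
bool toy_unregister_flush_outside(struct toy_rdev *rdev)
{
	toy_flush_work(rdev);
	if (!toy_trylock(rdev->list_mutex))
		return false;
	toy_unlock(rdev->list_mutex);
	return !rdev->work_blocked;
}
```

With work pending, toy_unregister_flush_inside() reports that the work
got stuck, while toy_unregister_flush_outside() lets it finish. The
model says nothing about whether the work can be queued again after
the flush, which is the part that would still need checking in the
real code.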

I will keep looking into this over the next couple of days;
any pointers or ideas anyone has would be greatly appreciated.

Finally, since it looks like it might take a couple of days to
figure out the locking: I am leaving on holiday on Saturday,
so if you don't see a fix from me by then it might be a couple
of weeks before I can look at it again.

Thanks,
Charles