Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem

From: Jonathan Cameron
Date: Fri May 17 2024 - 07:44:37 EST


On Fri, 17 May 2024 12:15:54 +0100
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

> Focusing on just one bit.
>
> > > Now, the question of how many legacy scrub interfaces should be
> > > considered in this design out of the gate is a worthwhile discussion. I
> > > am encouraged that this ABI is at least trying to handle more than 1
> > > backend, which makes me feel better that adding a 3rd and 4th might not
> > > be prohibitive.
> >
> > See above.
> >
> > I'm perfectly fine with: "hey, we have a new scrub API interfacing to
> > RAS scrub capability and it is *the* thing to use and all other hw scrub
> > functionality should be shoehorned into it.
> >
> > So this thing's design should at least try to anticipate supporting
> > other scrub hw.
> >
> > Because there's EDAC too. Why isn't this scrub thing part of EDAC? Why
> > isn't this scrub API part of edac_core? I mean, this is all RAS so why
> > design a whole new thing when the required glue is already there?
> >
> > We can just as well have a
> >
> > /sys/devices/system/edac/scrub/
> >
> > node hierarchy and have everything there.

Sorry - finger fumble, wasn't meant to send yet :(

>
> A few questions about this. It seems an unusual use of fake devices and a bus
> so I'm trying to understand how we might do something that looks more standard
> but perhaps also fit within the existing scheme. I appreciate this stuff
> has evolved over a long time, so lots of backwards compatibility concerns.
>
> If I follow this right the current situation is:
>
> /sys/devices/system/edac is the 'virtual' device registered on the edac bus.

Actually that's wrong it's not on the edac bus as that is the bus registered
via subsys_system_register() (which does create a fake device as per the docs
telling us not to use it any more - fair enough, legacy).

The mc below it is a bare device - I think just to provide a directory?
The comment on the release function seems to say that. This gives.

/sys/devices/system/edac/mc
/sys/bus/edac/devices/mc

Under that we have individual mc0/mc1 etc for the instances of that
accessible via
/sys/devices/system/edac/mc/mc0
/sys/bus/edac/device/mc/mc0
Those are registered a children of mc. I'd have expected them to be
children of the device that registered them - so for our case, a CXL mc0
node would be child of the CXL device rather than here but again
I'm guessing legacy that had to be maintained.

In general this nesting seems unusual, as I'd have expected the
registration directly on the edac bus with
/sys/bus/edac/device/mc0
/sys/bus/edac/device/pci0

Given we are talking about something new, maybe this is an opportunity
to not perpetuate this?

If we add scrub in here I'd prefer to just use the normal bus registration
handling rather than creating a nest of additional nodes. So perhaps we
could consider
/sys/bus/edac/device/scrub0 (or whatever name makes sense, as per the
earlier discussion of cxl_scrub0 or similar).
Could consider moving the bus location of mc0 etc in future to there with
symlinks to /sys/bus/edac/device/mc/* for backwards compatibility either
via setting their parents or more explicit link creation.

These scrub0 would have their dev->parent set to who ever actually registered
them providing that reference cleanly and letting all the normal device
model stuff work more simply.

If we did that with the scrub nodes, the only substantial change from
a separate subsystem as seen in this patch set would be to register
them on the edac bus rather than a separate class.

As you pointed out, there is a simple scrub interface in the existing
edac memory controller code. How would you suggest handling that?
Have them all register an additional device on the bus (as a child
of the mcX devices) perhaps? Seems an easy step forwards and should
be no backwards compatibility concerns.

>
> >
> > Why does it have to be yet another thing?

It absolutely doesn't as long as we can do it fairly cleanly within
existing code. I wasn't sure that was possible, but you know edac
a lot better than me and so I'll defer to you on that!

> >
> > And if it needs to be separate, who's going to maintain it?

Several options for that, but fair question - bringing (at least some of)
the RAS mess together will focus reviewer bandwidth etc better.

> >
> > > Which matches what I reacted to on the last posting:
> > >
> > > "Maybe it is self evident to others, but for me there is little in these
> > > changelogs besides 'mechanism exists, enable it'"
> > >
> > > ...and to me that feedback was taken to heart with much improved
> > > changelogs in this new posting.
> >
> > Ok.
> >
> > > This init time feature probing discussion feels like it was born from a
> > > micommunication / misunderstanding.
> >
> > Yes, it seems so, thanks for clarifying things.
> >
> > I still am unclear on the usecases and how this is supposed to be used
> > and also, as mentioned above, we have a *lot* of RAS functionality
> > spread around the kernel. Perhaps we should start unifying it instead of
> > adding more...

I'm definitely keen on unifying things as I agree, this mixture of different
RAS functionality is a ever worsening mess.

Jonathan


> >
> > So the big picture and where we're headed to, needs to be clarified first.
> >
> > Thx.
> >
>