Re: [PATCH v2] x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
From: H. Peter Anvin
Date: Fri May 17 2024 - 13:36:49 EST
On May 17, 2024 7:43:12 AM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote:
>On Thu, May 16, 2024 at 09:29:25AM -0700, Tony Luck wrote:
>> -#define X86_VENDOR_INTEL 0
>> #define X86_VENDOR_CYRIX 1
>> #define X86_VENDOR_AMD 2
>> #define X86_VENDOR_UMC 3
>> +#define X86_VENDOR_INTEL 4
>> #define X86_VENDOR_CENTAUR 5
>> #define X86_VENDOR_TRANSMETA 7
>> #define X86_VENDOR_NSC 8
>> --
>
>From my last review:
>
>> But the Intel vendor has been 0 for what, 30 years?
>
>> Are you sure no other code in the tree relies on that? Audited?
>
>But nope, apparently that's too much to ask. :-(
>
>modinfo ./arch/x86/events/intel/intel-uncore.ko
>filename: ./arch/x86/events/intel/intel-uncore.ko
>license: GPL
>srcversion: ECE38449B18DD83223B93FD
>alias: cpu:type:x86,ven0000fam0006mod00B6:feature:*
>alias: cpu:type:x86,ven0000fam0006mod00AF:feature:*
>alias: cpu:type:x86,ven0000fam0006mod00BE:feature:*
> ^^^^^^^^^
>
>Would everything still work if it said "ven0004" now?
>
>So tglx and I just did some poking and we think the best solution would
>be to add a __u16 flags field to struct x86_cpu_id right...
>
>struct x86_cpu_id {
> __u16 vendor; /* 0 2 */
> __u16 family; /* 2 2 */
> __u16 model; /* 4 2 */
> __u16 steppings; /* 6 2 */
> __u16 feature; /* 8 2 */
>
> /* XXX 6 bytes hole, try to pack */
>
><--- HERE
>
> kernel_ulong_t driver_data; /* 16 8 */
>
> /* size: 24, cachelines: 1, members: 6 */
> /* sum members: 18, holes: 1, sum holes: 6 */
> /* last cacheline: 24 bytes */
>};
>
>and the 32-bit version has the same hole:
>
>struct x86_cpu_id {
> __u16 vendor; /* 0 2 */
> __u16 family; /* 2 2 */
> __u16 model; /* 4 2 */
> __u16 steppings; /* 6 2 */
> __u16 feature; /* 8 2 */
>
> /* XXX 2 bytes hole, try to pack */
>
><--- HERE
>
> kernel_ulong_t driver_data; /* 12 4 */
>
> /* size: 16, cachelines: 1, members: 6 */
> /* sum members: 14, holes: 1, sum holes: 2 */
> /* last cacheline: 16 bytes */
>};
>
>And then do:
>
>struct x86_cpu_id {
> __u16 vendor;
> __u16 family;
> __u16 model;
> __u16 steppings;
> __u16 feature; /* bit index */
> __u16 flags;
> kernel_ulong_t driver_data;
>};
>
>#define X86_CPU_ID_FLAG_VENDOR_VALID BIT(0)
>
>and then have the macros in arch/x86/include/asm/cpu_device_id.h set
>that valid flag and then have x86_match_cpu() check it.
>
>Then you don't risk a userspace breakage and that x86_match_cpu() crap
>thing is fixed.
>
>Thx.
>
I'm confused. Why not simply use say -1 for wildcard vendor match, -2 for no vendor ID (no CPUID or other known probing mechanism) and -3 for unrecognized vendor (vendor detectable but not known.)
If we have a match/valid mask, I would suggest that we have *separate* match and valid masks, so that we can explicitly encode the condition of "no valid vendor", but in the specific case of vendor, there are two conditions (see above.)
I *hate* these strings with the passion of a thousand suns: they are a classic case of how just blindly converting binary information to hex adds absolutely no value, and often makes the result worse than what one started with. And yes, I complained about that when they first went in as a classic case of exposing what was always simply intended as a kernel internal interface to user space.
This is particularly pathetic as there already is a canonical string representation of the vendor ID!
Similarly, kernel internal CPUID feature numbers shouldn't be an API, we once again have canonical strings for the API level.
But of course, now the module utilities depend on them, and although they are relatively tightly coupled to the kernel, it still would require a staged transition where the legacy strings are included for some time.