Re: [PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump
From: Athira Rajeev
Date: Wed May 22 2024 - 09:59:59 EST
> On 10 May 2024, at 7:56 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Thu, May 09, 2024 at 10:56:23PM +0530, Athira Rajeev wrote:
>>
>>
>>> On 7 May 2024, at 3:05 PM, Christophe Leroy <christophe.leroy@xxxxxxxxxx> wrote:
>>>
>>>
>>>
>>> Le 06/05/2024 à 14:19, Athira Rajeev a écrit :
>>>> Add support to capture and parse raw instruction in objdump.
>>>
>>> What's the purpose of using 'objdump' for reading raw instructions ?
>>> Can't they be read directly without invoking 'objdump' ? It looks odd to
>>> me to use objdump to provide readable text and then parse it back.
>>
>> Hi Christophe,
>>
>> Thanks for your review comments.
>>
>> The current implementation of data type profiling on x86 uses the "objdump" tool to get the disassembled code.
>
> commit 6d17edc113de1e21fc66afa76be475a4f7c91826
> Author: Namhyung Kim <namhyung@xxxxxxxxxx>
> Date: Fri Mar 29 14:58:11 2024 -0700
>
> perf annotate: Use libcapstone to disassemble
>
> Now it can use the capstone library to disassemble the instructions.
> Let's use that (if available) for perf annotate to speed up. Currently
> it only supports x86 architecture. With this change I can see ~3x speed
> up in data type profiling.
>
> But note that capstone cannot give the source file and line number info.
> For now, users should use the external objdump for that by specifying
> the --objdump option explicitly.
>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> Tested-by: Ian Rogers <irogers@xxxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Changbin Du <changbin.du@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@xxxxxxxxxx
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> From a quick look at http://www.capstone-engine.org/compile.html it
> seems PowerPC is supported.
>
> But since we did it first with objdump output parsing, it's good to have
> it as an alternative and sometimes a fallback:
Hi Arnaldo, Namhyung
Thanks for the suggestions. libcapstone is a good option and it is faster too.
I will address these changes in V3.
Thanks
Athira
>
> commit f35847de2a65137e011e559f38a3de5902a5463f
> Author: Namhyung Kim <namhyung@xxxxxxxxxx>
> Date: Wed Apr 24 17:51:56 2024 -0700
>
> perf annotate: Fallback disassemble to objdump when capstone fails
>
> I found some cases that capstone failed to disassemble. Probably my
> capstone is an old version but anyway there's a chance it can fail. And
> then it silently stopped in the middle. In my case, it didn't
> understand "RDPKRU" instruction.
>
> Let's check if the capstone disassemble reached the end of the function
> and fallback to objdump if not.
>
> ---------------
>
> - Arnaldo
>
>> And then the objdump result lines are parsed to get the instruction
>> name and register fields. The initial patchset I posted to enable the
>> data type profiling feature in powerpc was using the same way by
>> getting disassembled code from objdump and parsing the disassembled
>> lines. But in V2, we are introducing a change for powerpc to use the "raw
>> instruction" and fetch the opcode and register fields from it.
>
>> I tried to explain below that the current objdump invocation uses the
>> "--no-show-raw-insn" option, which doesn't capture the raw instruction.
>> So, to capture it, the V2 patchset changes this to use the default
>> "--show-raw-insn" option and get the raw instruction [ for powerpc ]
>> along with the human readable annotation [ which is used by other
>> archs ]. Since the perf tool already has an objdump implementation in
>> place, I went in the direction of enhancing it to use
>> "--show-raw-insn" for powerpc.
>
>> But as you mentioned, we can read the raw instruction directly without
>> using the "objdump" tool. perf already has support for reading object
>> code: the dso open/read utilities and helper functions are present in
>> "util/dso.c", and the "dso__data_read_offset" function reads data from
>> a dso file offset. We can use these functions, and I can make changes
>> to read the binary instruction directly without using objdump.
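
For illustration, reading a raw instruction straight from a file offset could look roughly like the sketch below. It does not use perf's dso helpers: POSIX pread() stands in for "dso__data_read_offset", the function name read_insn_at is hypothetical, and the bytes are assumed to be stored in little-endian order (as on a little-endian powerpc binary).

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of perf): read one 4-byte instruction
 * at 'offset' in the already opened file 'fd'.
 * Assumption: the instruction bytes are stored little-endian.
 * Returns 0 on success, -1 if fewer than 4 bytes could be read.
 */
static int read_insn_at(int fd, off_t offset, uint32_t *insn)
{
	uint8_t buf[4];
	ssize_t n = pread(fd, buf, sizeof(buf), offset);

	if (n != (ssize_t)sizeof(buf))
		return -1;

	/* Assemble the 32-bit word from little-endian bytes. */
	*insn = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
		((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
	return 0;
}
```

A caller would translate the instruction address to a file offset first; that mapping is left out here.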
>
>> Namhyung, Arnaldo, Christophe
>> I am looking for your valuable feedback on this approach. Please let me know if it looks fine.
>>
>>
>> Thanks
>> Athira
>>>
>>>> Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option
>>>> with "objdump" when disassembling. An example from powerpc with this option
>>>> for an instruction address is:
>>>
>>> Yes and that makes sense because the purpose of objdump is to provide
>>> human readable annotations, not to perform automated analysis. Am I
>>> missing something ?
>>>
>>>>
>>>> Snippet from:
>>>> objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
>>>>
>>>> c0000000010224b4: lwz r10,0(r9)
>>>>
>>>> This line "lwz r10,0(r9)" is parsed to extract the instruction name,
>>>> register names and offset. Also, to find whether there is a memory
>>>> reference in the operands, the "memory_ref_char" field of objdump is used.
>>>> For x86, "(" is used as memory_ref_char to tackle instructions of the
>>>> form "mov (%rax), %rcx".
>>>>
>>>> In the case of powerpc, instructions using "(" are not the only memory
>>>> instructions. For example, the above instruction can also be of the extended
>>>> form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
>>>> category and extract the source/target registers, the patch adds support to
>>>> use the raw instruction. With the raw instruction, macros are added to
>>>> extract the opcode and register fields.
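
As a rough sketch of the difference between the two approaches: a text check for "(" misses the X-form load, while a check on the raw instruction word catches both. The function names are hypothetical and not taken from the patch. The encodings used are lwz (primary opcode 32) and lwzx (primary opcode 31 with extended opcode 23).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Text-based check (hypothetical helper): an operand string counts as a
 * memory reference only if it contains memory_ref_char, e.g. '('.
 */
static bool operands_have_mem_ref(const char *operands, char memory_ref_char)
{
	return strchr(operands, memory_ref_char) != NULL;
}

/*
 * Raw-instruction check (hypothetical helper) for the two loads discussed:
 * lwz  -> primary opcode 32 (D form)
 * lwzx -> primary opcode 31, extended opcode 23 (X form)
 * The primary opcode is the top 6 bits; the X-form extended opcode is the
 * 10-bit field just above the lowest bit.
 */
static bool insn_is_lwz_or_lwzx(uint32_t insn)
{
	unsigned int opcode = insn >> 26;
	unsigned int xo = (insn >> 1) & 0x3ff;

	return opcode == 32 || (opcode == 31 && xo == 23);
}
```

With these, "r10,0(r9)" is reported as a memory reference by the text check but "r10,0,r19" is not, whereas the words 0x81490000 (lwz r10,0(r9)) and 0x7d40982e (lwzx r10,0,r19) are both recognised by the opcode check.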
>>>>
>>>> "struct ins_operands" and "struct ins" are updated to carry the opcode and
>>>> the raw instruction binary code (raw_insn). Function "disasm_line__parse"
>>>> is updated to fill the raw instruction hex value and opcode in the newly
>>>> added fields. There are no changes in existing code paths, which parse
>>>> the disassembled code. Architectures using the instruction name and the
>>>> present approach are not altered. Since this approach targets powerpc,
>>>> the macro implementation is added for powerpc as of now.
>>>>
>>>> Example:
>>>> representation using --show-raw-insn in objdump gives result:
>>>>
>>>> 38 01 81 e8 ld r4,312(r1)
>>>>
>>>> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
>>>> this translates to instruction form: "ld RT,DS(RA)" and binary code
>>>> as:
>>>> _________________________________________
>>>> |  58  |  RT  |  RA  |     DS      |    |
>>>> -----------------------------------------
>>>> 0      6      11     16            30   31
>>>>
>>>> Function "disasm_line__parse" is updated to capture:
>>>>
>>>> line: 38 01 81 e8 ld r4,312(r1)
>>>> opcode and raw instruction "38 01 81 e8"
>>>> Raw instruction is used later to extract the reg/offset fields.
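
As a worked example of the field extraction, the bytes "38 01 81 e8" can be decoded as sketched below. The helper names are hypothetical and are not the macros from the patch; the sketch assumes objdump printed the bytes in little-endian memory order, so the instruction word is 0xe8810138.

```c
#include <stdint.h>

/* Hypothetical helpers; bit 0 is the most significant bit of the word. */

/* Build the 32-bit instruction word from four bytes in little-endian order. */
static uint32_t insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Primary opcode: bits 0-5. */
static unsigned int ppc_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* Target register RT: bits 6-10. */
static unsigned int ppc_rt(uint32_t insn)
{
	return (insn >> 21) & 0x1f;
}

/* Base register RA: bits 11-15. */
static unsigned int ppc_ra(uint32_t insn)
{
	return (insn >> 16) & 0x1f;
}

/*
 * DS-form byte offset: bits 16-29 hold DS and the offset is DS << 2,
 * which equals the low 16 bits with the bottom two bits cleared,
 * sign-extended to 32 bits.
 */
static int32_t ppc_ds_offset(uint32_t insn)
{
	int32_t v = (int32_t)(insn & 0xfffc);

	if (v & 0x8000)
		v -= 0x10000;
	return v;
}
```

For 0xe8810138 this gives opcode 58, RT 4, RA 1 and offset 312, matching "ld r4,312(r1)".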
>>>>
>>>> Signed-off-by: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx>
>>>> ---