I've been studying LLVM recently. I have a C source file named cal.c. I applied two different optimization passes to it and generated two different IR files, cal1.ll and cal2.ll.
How can I compare their performance?
I tried comparing instruction counts and instruction costs, but neither turned out to be a good metric. As far as I can tell, there is no direct relationship between instruction count and performance, or between instruction cost and performance.
So, how can I compare the performance of the two IR files?
I don't need to know the exact run time; I just want to know which one is faster.
CodePudding user response:
This is a hard problem, often much harder than just running your code on the actual hardware and measuring which version is faster. It more or less boils down to simulating the hardware with pen and paper: you need a detailed model of the target system, including things like pipeline and cache behavior, and then you use that model to calculate the cost of each executed instruction.
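For comparison, the measurement route is usually straightforward. A minimal sketch, assuming cal.c is self-contained with a main function (clang accepts .ll files directly; -O0 is used here so the backend does not re-optimize and mask the difference between the two IR files):
$ clang -O0 cal1.ll -o cal1
$ clang -O0 cal2.ll -o cal2
$ time ./cal1
$ time ./cal2
Repeat the runs a few times, since a single measurement is noisy.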
CodePudding user response:
llvm-mca tries to statically estimate the performance of assembly code by reusing the LLVM compiler's CPU pipeline models:
$ llvm-mca -mcpu=skylake foo.s
Iterations:      300
Instructions:    900
Total Cycles:    610
Total uOps:      900
As mentioned by others, the estimates will be imprecise (often very imprecise) due to the lack of cache and branch-predictor models, the imprecision of the CPU pipeline model, etc.
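Since you have .ll files rather than assembly, one possible workflow (a sketch, assuming an x86 host; substitute your actual CPU for skylake) is to lower each IR file to assembly with llc first, then feed the result to llvm-mca:
$ llc -mcpu=skylake cal1.ll -o cal1.s
$ llvm-mca -mcpu=skylake cal1.s
$ llc -mcpu=skylake cal2.ll -o cal2.s
$ llvm-mca -mcpu=skylake cal2.s
Comparing the Total Cycles lines of the two reports then gives a rough, model-based indication of which version should be faster, with the caveats above.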