I am using Arcanist with a large number of linters, both built-in and custom. As we add more, it's becoming increasingly slow.
For a beefy change, maybe with an eslint exception, time arc lint shows it can take up to 30 minutes, like so:
$ time arc lint
<...>
real 8m31.771s
user 17m53.159s
sys 4m52.329s
But on a clean repo with no changes, it's fast:
$ time arc lint
OKAY No lint warnings.
real 0m7.961s
user 0m6.763s
sys 0m1.363s
To figure out which linters are running slowly and should be optimized, I'd like to get more granular information about the runtime of each individual linter.
Currently:
$ arc lint
Linting....
No Errors
Ideal state:
$ arc lint
Linting with eslint...
Elapsed time 18m38.311s
Linting with pylint...
Elapsed time 1m35.334s
Linting with local/
<...>
No Errors
So, how can I get more granular information from each individual arcanist linter? (And otherwise, any tips and tricks for improving the run speed of arc lint?)
CodePudding user response:
An accurate answer as to why it takes up to 30 minutes would need the following information, which the question lacks:
- How many lines is a "beefy change"?
- How large / how many source files?
- List of linters
- PHP version
- Is arcanist running locally or in CI?
In general, linting can get very expensive to execute when these parameters are all high (or, in the case of versions, low). It's definitely possible to reach 30 minutes in extreme cases even without other performance issues.
PHP is also not the best-optimized language for this kind of workload, although the gap has narrowed a lot in recent versions. Still, it's a language tailor-made for processing HTTP requests, and linting large amounts of source code is quite a different workload.
For most languages, you should get much better performance by using the most common linter for that specific language.
Looking at some of the source code of the linting engine, most lines date from 8 to 12 years back. You can imagine that many better-performing linters have been written in that time.
In fact, if I follow the link on the repo's main README, the Phacility site displays a red warning saying
Effective June 1, 2021: Phabricator is no longer actively maintained.
They explain in more detail here (linked from that warning). If your current setup heavily depends on it, you may want to consider reducing that dependency.
That probably explains at least part of the performance problems.
Finding the slow part
I couldn't find any command option that produces the output you want. On this Debian manpage for arc, I did find that you can specify a path, so you can loop over each directory in your source root and time it separately.
# The trailing slash makes the glob match directories only
for dir in /source/*/; do
  echo "== $dir"
  time arc lint "$dir"
done
You can then further pin down the directory that is the slowest.
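To make the per-directory numbers easier to compare, the loop above can be wrapped in a small function that prints one line per directory, slowest first. This is only a sketch: the function name time_dirs and the LINT_CMD variable are made up for illustration (LINT_CMD lets you try the loop with a stand-in command), and the timing has whole-second resolution.

```shell
# Hypothetical helper: time a lint command once per subdirectory.
# LINT_CMD is an illustrative override; it defaults to "arc lint".
LINT_CMD=${LINT_CMD:-"arc lint"}

time_dirs() {
  root=$1
  for dir in "$root"/*/; do
    # Skip the literal pattern if the glob matched nothing
    [ -d "$dir" ] || continue
    start=$(date +%s)
    # LINT_CMD is left unquoted on purpose so "arc lint" splits into two words.
    # A non-zero exit (lint problems found) should not stop the loop.
    $LINT_CMD "$dir" </dev/null >/dev/null 2>&1 || true
    end=$(date +%s)
    # Output format: elapsed seconds, a tab, then the directory
    printf '%s\t%s\n' "$((end - start))" "$dir"
  done | sort -rn
}
```

Calling time_dirs /source would then list every top-level directory with its elapsed seconds, and you can call it again on the slowest entry to narrow things down.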
Another thing you can try is disabling all rulesets, and re-enabling them one by one, keeping track of the time. At first it will take almost no time, so it should be relatively fast to locate the first rule that starts adding a substantial amount of time.
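One way to sketch the re-enable-one-at-a-time approach: arcanist reads its linter configuration from the .arclint file in the project root, so you can swap in a series of trimmed-down configs and time each run. Everything named here is an assumption for illustration: the directory arclint.d/ holding one hand-written JSON config per linter (for example arclint.d/eslint.json), the function name time_configs, and the LINT_CMD override. Run it from the project root, ideally on a scratch checkout.

```shell
# Hypothetical helper: time a lint run once per candidate .arclint config.
# LINT_CMD is an illustrative override; it defaults to "arc lint".
LINT_CMD=${LINT_CMD:-"arc lint"}

time_configs() {
  confdir=$1
  # Keep a copy of the real config so it can be restored afterwards
  cp .arclint .arclint.orig || return 1
  for conf in "$confdir"/*.json; do
    [ -f "$conf" ] || continue
    cp "$conf" .arclint
    start=$(date +%s)
    # A non-zero exit (lint problems found) should not stop the loop
    $LINT_CMD </dev/null >/dev/null 2>&1 || true
    end=$(date +%s)
    # Output format: elapsed seconds, a tab, then the config name
    printf '%s\t%s\n' "$((end - start))" "$(basename "$conf" .json)"
  done
  # Restore the original configuration
  mv .arclint.orig .arclint
}
```

Running time_configs arclint.d prints one line per config, so the linter whose config takes longest stands out. Note that the swapped .arclint itself shows up as a modified file in the working copy while the loop runs.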