How to aggregate total code line changes across codebases in git globally

Time:01-01

I saw from the magnanimous machine learning researcher and podcast interviewer Lex Fridman that one of his resolutions for 2023 was to "write, edit, or delete 20,000 lines of code".

This got me thinking: is there a place in GitHub/GitLab where line changes in code are tracked?

How would one actually verify they meet this requirement at the end of the year?

CodePudding user response:

Short answer:

I doubt any online service will report this directly. You're better off using a local tool and looping through all repositories on your machine (or doing the same via the GitHub / GitLab APIs).
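As a rough sketch of the local approach, the snippet below walks a directory, finds each repository, and sums the output of `git log --numstat` for one author and date range. The helper names (`find_repos`, `parse_numstat`, `year_totals`) are made up for illustration, and it assumes `git` is on your PATH and that only the currently checked-out branch of each repo matters.

```python
import os
import subprocess

def find_repos(root):
    """Yield each directory under root that contains a .git entry."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames or ".git" in filenames:
            dirnames[:] = []  # assumption: do not descend into nested repos
            yield dirpath

def parse_numstat(text):
    """Sum 'added<TAB>deleted<TAB>path' lines from git log --numstat.

    Binary files are reported as '-' instead of numbers and are skipped.
    """
    added = deleted = 0
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        added += int(parts[0])
        deleted += int(parts[1])
    return added, deleted

def repo_totals(repo, author, since, until):
    """Return (added, deleted) for one author in one repository."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--author=" + author,
         "--since=" + since, "--until=" + until,
         "--numstat", "--pretty=tformat:"],
        capture_output=True, text=True, check=False,
    )
    # A repo with no commits makes git exit non-zero; its stdout is empty,
    # so it simply contributes (0, 0).
    return parse_numstat(result.stdout)

def year_totals(root, author, since, until):
    """Add up (added, deleted) over every repository found under root."""
    total_added = total_deleted = 0
    for repo in find_repos(root):
        added, deleted = repo_totals(repo, author, since, until)
        total_added += added
        total_deleted += deleted
    return total_added, total_deleted
```

Something like `year_totals(os.path.expanduser("~/code"), "you@example.com", "2023-01-01", "2024-01-01")` would then give one pair of numbers for the year; the path and email here are placeholders.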

Longer, rambly answer:

GitHub / GitLab are unlikely to provide this information (or be very helpful in gathering it), primarily because your account and the line changes aren't actually directly linked (setting aside verified commits & obfuscated emails for now). For example, I could make a repo and use the email linked to one of your public commits to edit 100k lines of code, and this would show up in any report. There wouldn't be any protection against this, since you obviously don't control my arbitrary repository. Although they do show commit counts, so perhaps this isn't a real risk. However...

The closest solution in my opinion is to just look at handpicked repo insights, then use those totals. There are plenty of other issues with that metric, and ultimately I don't see how any solution could be fully automated.

For example, according to GitHub I've made ~24k additions & 5k removals to my blog this year. However, I know > 99% of those are actually Markdown / adding images, so presumably .md would have to be excluded. But then what about .md files that include HTML, do they count? What about if I revert a 5k code PR because I merged it accidentally, would that be 0k, 5k, 10k lines changed?
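To make those questions concrete, here is a small sketch of how the answer changes with the counting rule. The excluded suffixes and the `tally` helper are my own assumptions, not anything GitHub defines; the sample data mirrors the accidental 5k merge plus its revert, alongside some Markdown changes.

```python
from pathlib import PurePosixPath

# Assumption: these file types do not count as "code".
EXCLUDED_SUFFIXES = {".md", ".png", ".jpg"}

def counts_as_code(path, excluded=EXCLUDED_SUFFIXES):
    """True if the file's extension is not in the excluded set."""
    return PurePosixPath(path).suffix.lower() not in excluded

def tally(changes, excluded=EXCLUDED_SUFFIXES):
    """changes: iterable of (added, deleted, path) tuples."""
    kept = [(a, d) for a, d, path in changes if counts_as_code(path, excluded)]
    added = sum(a for a, _ in kept)
    deleted = sum(d for _, d in kept)
    return {
        "added": added,             # lines written
        "deleted": deleted,         # lines removed
        "gross": added + deleted,   # every touched line counts
        "net": added - deleted,     # reverts cancel out
    }

changes = [
    (5000, 0, "src/app.py"),         # the PR merged by accident
    (0, 5000, "src/app.py"),         # the revert
    (24000, 5000, "posts/hello.md"), # blog content, excluded
]
print(tally(changes))
```

With these rules the same history reads as 0 lines (net), 5,000 lines (added only), or 10,000 lines (gross), which is exactly the ambiguity above.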

You could of course also use GitHub's API to get all the commits attributed to you, loop through each of them adding up the lines changed, and apply any exclusion rules. Or if you were feeling particularly masochistic, do it manually via your own GitHub profile!
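A minimal sketch of the API route, using only the standard library. It assumes GitHub's REST commit search (`/search/commits`) and the per-commit endpoint, whose response lists `files` with `filename`, `additions`, and `deletions`. The function names, the `.md` exclusion, and the personal access token are assumptions on my part. As far as I know, search results are capped at 1,000 items, and very large commits may have their file list truncated, so treat the totals as approximate.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com"

def get_json(url, token):
    """GET a GitHub API URL and decode the JSON body."""
    req = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": "Bearer " + token,
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def commit_urls(user, start, end, token):
    """Yield the API URL of each commit the search attributes to user."""
    query = urllib.parse.quote(f"author:{user} author-date:{start}..{end}")
    for page in range(1, 11):  # assumption: at most 1,000 results are served
        data = get_json(
            f"{API}/search/commits?q={query}&per_page=100&page={page}", token)
        items = data.get("items", [])
        for item in items:
            yield item["url"]
        if len(items) < 100:
            break

def sum_commit(commit, excluded=(".md",)):
    """Return (added, deleted) for one commit, skipping excluded suffixes."""
    added = deleted = 0
    for f in commit.get("files", []):
        if f["filename"].lower().endswith(tuple(excluded)):
            continue
        added += f["additions"]
        deleted += f["deletions"]
    return added, deleted

def yearly_totals(user, start, end, token):
    """Loop over every matching commit and add up the lines changed."""
    total_added = total_deleted = 0
    for url in commit_urls(user, start, end, token):
        added, deleted = sum_commit(get_json(url, token))
        total_added += added
        total_deleted += deleted
    return total_added, total_deleted
```

Calling `yearly_totals("your-username", "2023-01-01", "2023-12-31", token)` makes one request per commit, so rate limits will matter for a busy year. It also inherits the attribution caveat from earlier: the search goes by whatever GitHub links to your account.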

Ultimately I think it's a vague metric, so any solution will require some cherrypicking of data. It's a lofty goal though, good luck to him.
