I'm trying to get all timestamped repo contributions (with contributor names) for a specific repo.
Let's say it's this one: https://github.com/morucci/repoxplorer
I've been playing around with the GH archive at BigQuery, but this is new to me, and it's not working.
This is the closest I've gotten:
/* count of issues opened, closed, and reopened on 2019/01/01 */
SELECT event as issue_status, COUNT(*) as cnt FROM (
SELECT type, repo.name, actor.login,
JSON_EXTRACT(payload, '$.action') as event,
FROM `githubarchive.day.20190101`
WHERE type = 'IssuesEvent'
AND repository_name = 'repoxplorer')
CodePudding user response:
The alternative is to:
- clone the repository
- use a static tool like
if James committed three times, I'd like to know the timestamp for all of his three commits.
git log --author="James" --format="%h %as %s" --all --branches
CodePudding user response:
Following VonC's answer, I figured out how to use mergestat
to get a list of a specified repo's commits, with committer name and timestamp of every commit.
I used this query code:
-- MergeStat is a tool for running SQL queries on data in git repos:
-- https://github.com/mergestat/mergestat
SELECT author_name, author_when, committer_when, count(*)
FROM commits('https://github.com/mergestat/mergestat')
WHERE parents < 2 -- ignore merge commits
GROUP BY author_when ORDER BY author_when DESC
I suspect that there are better ways of getting the desired output, but adding this answer until a better solution is found.