Home > Blockchain >  Getting BigQuery results for a specific GH repository
Getting BigQuery results for a specific GH repository

Time:06-27

I'm trying to get all timestamped repo contributions (with contributor names) for a specific repo.

Let's say it's this one: https://github.com/morucci/repoxplorer

I've been playing around with the GH archive at BigQuery, but this is new to me, and it's not working.

This is the closest I've gotten:

/* count of issues opened, closed, and reopened on 2019/01/01 */
SELECT event as issue_status, COUNT(*) as cnt FROM (
  SELECT type, repo.name, actor.login,
    JSON_EXTRACT(payload, '$.action') as event, 
  FROM `githubarchive.day.20190101`
  WHERE type = 'IssuesEvent'
    AND repository_name = 'repoxplorer')

CodePudding user response:

The alternative is to:

if James committed three times, I'd like to know the timestamp for all of his three commits.

git log --author="James" --format="%h %as %s" --all --branches

CodePudding user response:

Following VonC's answer, I figured out how to use mergestat to get a list of a specified repo's commits, with committer name and timestamp of every commit.

I used this query code:

-- MergeStat is a tool for running SQL queries on data in git repos:
-- https://github.com/mergestat/mergestat
    
SELECT author_name, author_when, committer_when, count(*)
FROM commits('https://github.com/mergestat/mergestat')
WHERE parents < 2 -- ignore merge commits
GROUP BY author_when ORDER BY author_when DESC

I suspect that there are better ways of getting the desired output, but adding this answer until a better solution is found.

  • Related