Git: remove commits older than 1 year

Time:10-09

I have a web application and use git not only for source control but also to deploy changes. I push the changes to the remote repo on GitHub, and my web server has a webhook that then updates the site from those changes.

Now I noticed that my local git repository is around 9GB. I cloned the repo from GitHub and noticed that even a fresh clone is roughly 1.5GB.

I am pretty sure most of this is unnecessary bloat from the initial development phase. I would like to get rid of it to free up disk space. I have googled a bit, but only found relatively complicated solutions. My scenario is one branch, one developer, lots of tiny commits.

Is there a simple way to get rid of changes that are older than, say, 12 months, so that space is freed up both locally and remotely?

Thanks

CodePudding user response:

If you can pick a commit to start from (and forget everything before that single commit), I can offer you a script that can, let's call it, "regrow" all the commits after that commit, and it does so quite fast. Of course, you would be rewriting history; I just want to make that clear.

https://github.com/eantoranz/git/blob/replay/contrib/replay.

The way to use it would be:

  • Pick the commit you would like your rewritten history to start from; everything before it will be forgotten. Create a branch on it:
git branch oldbase <some-commit-id>
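Since the question is about a 12-month cut-off, the commit ID can be looked up by date instead of picked by hand. The following is only a sketch in a throwaway repository: `find_cutoff_commit` is a hypothetical helper name, and the backdated commits and the fixed cut-off date exist purely to make the demo reproducible.

```shell
#!/bin/sh
set -eu
# Demo identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: newest commit reachable from a branch (default HEAD)
# whose committer date is at or before the given cut-off.
find_cutoff_commit() {
    git rev-list -1 --before="$1" "${2:-HEAD}"
}

# Throwaway repository with three backdated commits.
cd "$(mktemp -d)"
git init -q
for year in 2020 2021 2022; do
    echo "$year" >> file.txt
    git add file.txt
    GIT_AUTHOR_DATE="$year-01-01 12:00:00 +0000" \
    GIT_COMMITTER_DATE="$year-01-01 12:00:00 +0000" \
        git commit -q -m "work from $year"
done

# Fixed date for the demo; in a real repo this would be "12 months ago".
cutoff=$(find_cutoff_commit "2021-06-01")
git log -1 --format='%h %s' "$cutoff"
```

In a real repository the equivalent would be something like `git branch oldbase "$(git rev-list -1 --before='12 months ago' master)"`. Note that `--before` filters on the committer date, not the author date.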

Then create an orphan branch from that commit, so that all previous history is dropped:

git checkout --orphan newbase oldbase
git commit -C oldbase # create a commit reusing the message and authorship of oldbase

Now the script comes into play:

the-replay-script --new-base newbase --old-base oldbase --tip master

That will replay all commits in the oldbase..master range on top of the newbase commit. It will print a single commit ID at the end. Take a look at that commit (check it out, log it, etc.). When you are certain that's what you would like to have as your new master:

git branch -f master <the-commit-written-by-the-script>
git checkout master

And feel free to force-push it wherever you would like.
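For anyone who wants to try the overall flow without the external script, here is a minimal sketch in a throwaway repository. It swaps in stock `git rebase --onto` for the replay script (this is not the script above, and it is probably slower on a large history), checks that the final tree is unchanged, and then expires the reflog and runs `git gc` so the old objects can actually be dropped locally. The five-commit history is made up for the demo.

```shell
#!/bin/sh
set -eu
# Demo identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with five small commits on master.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5; do
    echo "change $i" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
old_tree=$(git rev-parse "master^{tree}")

# Same preparation as described above: commit 3 becomes the new root.
git branch oldbase master~2
git checkout -q --orphan newbase oldbase
git commit -q -C oldbase

# Stand-in for the replay script: replay oldbase..master onto newbase.
# Unlike the script, this moves master directly.
git rebase -q --onto newbase oldbase master

new_tree=$(git rev-parse "master^{tree}")
new_count=$(git rev-list --count master)
echo "commits on master: $new_count"

# Drop the helper branches, then let git discard the unreachable history.
git branch -q -D oldbase newbase
git reflog expire --expire=now --all
git gc -q --prune=now
git count-objects -vH
```

The content of the latest commit should be identical before and after (`old_tree` equals `new_tree`); only the history behind it is shorter. Without the reflog expiry and `gc` step, the old commits typically stay on disk even though no branch points at them any more.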

CodePudding user response:

To remove history older than X, you need to rewrite the history of your repo, and perhaps the most efficient way to rewrite a large repo is with git-filter-repo. Note that git-filter-repo is a Python script, so you'll need Python too if you don't have it installed already.

Once you have git-filter-repo ready to go, the steps to answer your question are rather simple, and a similar scenario is even described in the Git manual for git-replace.

Basically, the steps are:

  1. Make a new parentless commit (a.k.a. root commit) that is equivalent to the state of the first commit you wish to keep.
  2. Replace that commit in your current history with the new root commit.
  3. Make it permanent using git-filter-repo.

For example, suppose the first commit you wish to keep has a commit hash of X:

  1. echo 'Truncate history to single commit' | git commit-tree X^{tree}

    The output of the above command will be a new commit hash, let's call it Y.

  2. git replace X Y

  3. git filter-repo --force

Note: git-filter-repo only rewrites your local repo, and it removes the origin remote as a safeguard. If you're happy with your newly rewritten repo, you can re-add your remote and force-push to it.
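Because `git replace` is reversible, the truncated history can be previewed before anything is made permanent. The sketch below runs steps 1 and 2 in a throwaway repository with a made-up five-commit history and compares what git shows with what is actually stored; step 3 and the push are left commented out, since they need git-filter-repo and a real remote.

```shell
#!/bin/sh
set -eu
# Demo identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with five small commits.
cd "$(mktemp -d)"
git init -q
for i in 1 2 3 4 5; do
    echo "change $i" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Step 1: X is the first commit to keep; Y is a parentless copy of its tree.
X=$(git rev-parse HEAD~2)
Y=$(echo 'Truncate history to single commit' | git commit-tree "$X^{tree}")

# Step 2: make git treat X as if it were Y.
git replace "$X" "$Y"

# Replacement refs are honored by default, so the history looks truncated,
# while the original commits are still stored underneath.
visible=$(git rev-list --count HEAD)
stored=$(git --no-replace-objects rev-list --count HEAD)
echo "visible: $visible, stored: $stored"   # visible: 3, stored: 5

# To back out instead of continuing:
#   git replace -d "$X"
# Step 3, once the preview looks right (requires git-filter-repo):
#   git filter-repo --force
# Then re-add the remote and force-push (URL is a placeholder):
#   git remote add origin <url>
#   git push --force origin master
```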
