How to git submodule update ONLY on submodules that need it?

Time:10-23

Here is my situation:

  • I'm working in a repo with 40 submodules
  • git status and git submodule update take a LONG time (submodule update is several minutes)
  • If I checkout a different commit and only a couple submodules have been changed, I can see the submodules that need updating using git status, then skip the long wait of a full git submodule update by doing
git submodule update <submodule path> <submodule path>

This updates only the submodules listed and takes just a few seconds.

Is there a way to have git submodule update only update the modules that actually need it, instead of every one? I don't mind listing a couple of submodules manually, but when there are 6, it would be nice for git to somehow use the git status result to run git submodule update only on the ones that need it.

Does anyone know of any git command tricks I can do to achieve this and speed up my submodule updates? If not, is there a trick I can use to make a bash script to extract the necessary information from git status and build & run a git submodule update <> <> <> command for me?

Bonus: is there a way to achieve a similar result on submodules that have had their content modified? That is, the submodule needs to be git reset --hard HEAD, not checked out to a new commit. But doing this without entering EVERY submodule, such as git submodule foreach git reset --hard HEAD, only ones that need it?

Example git status --porcelain

$ git status --porcelain
 M ABC
 M XYZ
 M XXX
 M YYY
 M FOO/BAR

CodePudding user response:

TL;DR

You can write your own Git script to do what you want.

My high level answer

The logic you are looking for does not exist as a built-in Git command, but it is easy to implement with your own Git script.

Create a script that uses git status to get the submodules to update and then calls git submodule update with the list of submodules.

Let's call this script git-subm-update. Make sure it is executable and on your PATH; then typing git subm-update will invoke it and run the operation the way you have optimized it.
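To make the "executable and on your PATH" step concrete, here is a minimal sketch of how Git picks up such a script. The name git-hello-demo and the temporary directory are hypothetical stand-ins chosen for illustration; in practice you would use a permanent directory such as ~/bin.

```shell
# Assumption: a throwaway directory stands in for a real location such as ~/bin.
demo_dir=$(mktemp -d)

# Git runs any executable named git-<name> found on PATH as "git <name>".
cat > "$demo_dir/git-hello-demo" <<'EOF'
#!/bin/sh
echo "hello from git-hello-demo"
EOF
chmod +x "$demo_dir/git-hello-demo"

# Put the directory on PATH, then call the script through git.
PATH="$demo_dir:$PATH"
export PATH
git hello-demo
```

The same steps apply to git-subm-update: once the file is executable and its directory is on PATH, git subm-update finds it.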

An actual script prototype

First, kudos to @torek for the --porcelain=v2 option, since that gives us what we need.

For example, I have a repo with submodule foo and modified file bar:

$ git status --porcelain=v2
1 .M N... 100644 100644 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 bar
1 .M S..U 160000 160000 160000 bb765d402fb5c361a998d80363eaac41b3e27886 bb765d402fb5c361a998d80363eaac41b3e27886 foo

This is ugly but easy to parse by machine, e.g., with grep and sed (or awk or choose your favourite text manipulation tool).

Search for "Porcelain format version 2" on https://git-scm.com/docs/git-status for the full explanation. Here, I note that a dirty submodule has an M in column 4 and an S in column 6, followed by additional codes giving more details, which I don't think you need for your use case. The submodule path is the last token on the line.

So: git status --porcelain=v2 | grep 'M S' | sed 's/.* //' will give me foo, or the full list of dirty submodules if you have more, and I'll use that in my script.
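If the grep and sed pair feels too loose (the pattern 'M S' could in principle match elsewhere on a line), the same filter can be written against the individual fields. This is only a sketch: the function name list_dirty_submodules is made up here, and it assumes ordinary "1" entries of the form `1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>`, ignoring renamed and unmerged entries.

```shell
# Hypothetical helper: reads `git status --porcelain=v2` output on stdin and
# prints the path of every submodule that is modified in the working tree.
list_dirty_submodules() {
    awk '
        # Field 2 is XY (second character = worktree state),
        # field 3 is the submodule field, which starts with S for submodules.
        $1 == "1" && substr($2, 2, 1) == "M" && substr($3, 1, 1) == "S" {
            # Fields 1-8 are fixed; everything from field 9 on is the path.
            # Rejoining with single spaces is an approximation for paths that
            # contain whitespace; use -z output if that matters to you.
            path = $9
            for (i = 10; i <= NF; i++) path = path " " $i
            print path
        }
    '
}
```

Usage would be git status --porcelain=v2 | list_dirty_submodules, which prints foo for the example output above.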

I'm being a bit lazy here, and you could certainly make this command a lot more robust, but here's a prototype script based on it which should work for you:

Create file git-subm-update, make it executable and place it somewhere on your PATH (see https://stackoverflow.com/a/54115621/3216427), with these contents, for example:

#!/bin/bash

# Run "git subm-update -n" to see what command would get run but not run it.
if [[ "$1" == "-n" ]]; then
    NOT_REALLY=1
else
    NOT_REALLY=
fi

# List dirty submodules
dirty_submodules=$(
    git status --porcelain=v2 |
    grep "M S" |
    sed 's/.* //'
)

if [[ -n "$dirty_submodules" ]]; then
    echo Updating these dirty submodules: $dirty_submodules
    if [[ $NOT_REALLY ]]; then
        echo Would execute: git submodule update $dirty_submodules
    else
        git submodule update $dirty_submodules
    fi
else
    echo No submodules to update
fi

Then you can run git subm-update to update the dirty submodules or git subm-update -n to see what command this would run without running it.

Possible tweak: you might want grep "M SC" instead of grep "M S" to list only dirty submodules whose checked-out commit differs from the recorded one, thus excluding submodules where you've only made local changes.
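The letters after the S can also be tested one at a time, which separates submodules that need a new checkout from those that only have local edits (the bonus part of the question). The sketch below assumes the documented layout of the four-character submodule field, S followed by C (commit changed), M (tracked changes) and U (untracked files), with a dot for each unset flag. The function name submodules_with_flag is hypothetical.

```shell
# Hypothetical helper: reads `git status --porcelain=v2` output on stdin and
# prints submodule paths whose submodule field has the requested flag set.
# $1 is the 1-based position inside that field:
#   2 = C (commit changed), 3 = M (tracked changes), 4 = U (untracked files)
submodules_with_flag() {
    awk -v pos="$1" '
        $1 == "1" && substr($3, 1, 1) == "S" && substr($3, pos, 1) != "." {
            print $9   # assumes submodule paths without spaces
        }
    '
}
```

For instance, git status --porcelain=v2 | submodules_with_flag 2 would list the candidates for git submodule update, while position 3 would list submodules with modified tracked content, which are the ones a git reset --hard would affect.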

CodePudding user response:

I don't know what's taking the time, and I've never encountered symptoms like the ones you're reporting, but you could try

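git submodule foreach -q '
        # Commit recorded for this submodule in the superproject index
        now=$(git -C "$toplevel" rev-parse ":$sm_path")
        # Check out only when the submodule HEAD differs from it
        test "$now" = "$(git rev-parse @)" || git checkout "$now"
    '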

if you know you're not going to need to fetch (say, because you're updating for an older checkout), or use git submodule update instead of the checkout for the general case.
