Git Diff and Copy

Time:05-25

Good day,

** Updated Description

I want to run a static code analysis (PMD) report against the files that have been added or modified as part of a pull request on Bitbucket. The modified files are available locally within the pipeline image, but I need a git diff to identify ONLY the changes between the source branch (pulling from) and the target branch (to be merged into). I will then run the PMD CLI (with rulesets etc.) against a directory containing only the "changed files", to highlight any issues with those files specifically as part of the change.

I basically want to copy out the files indicated in the git diff result. I hope this provides some more context.

I have tried finding some examples and done some testing, but I am just not getting it right due to my lack of understanding of these crazy Linux commands :)

So far I have the command below, but it results in an empty folder.

git diff --name-only --pretty $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH | xargs -i {} cp {} -t ~/branch-diff/

Any guidance or assistance would be appreciated!

Peter

CodePudding user response:

xargs might have problems with a large number of files, as the argument list could become too long. I propose something like

git diff --name-only --pretty "$BITBUCKET_PR_DESTINATION_BRANCH" "$BITBUCKET_BRANCH" | while IFS= read -r name; do cp "$name" ~/branch-diff/; done

As a result you will have all these files in one directory (without the directory tree). Whether that is really what you need is another question.

CodePudding user response:

Firstly, the issues with your current solution:

  1. xargs doesn't play nicely with filenames that contain spaces. You may not have that problem now, and you can work around it, but it's better to just avoid it if possible.
  2. cp does not build a directory tree - which you can trivially verify - so it wouldn't do what you asked anyway.
  3. git does not produce pathnames relative to the current path, but to the working tree base.
  4. The filenames produced by git diff $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH don't even have to exist in your working tree, but only in (at least one of) the branches.
  5. If there's a diff between the two versions of a file, you haven't said which one you want copied!
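Points 4 and 5 can be narrowed down before any copying happens. The following is only a sketch: `changed_files` is a hypothetical helper name, and it assumes both refs exist in the local clone. It uses the three-dot form so that only changes made on the source branch since it diverged from the target are listed, and `--diff-filter=ACMR` so that deleted files (which cannot be copied) are left out.

```shell
# Hypothetical helper: list files added, copied, modified or renamed on
# head_ref since it diverged from base_ref, one path per line.
# Paths are relative to the top of the working tree, not the current directory.
changed_files() {
    base_ref=$1
    head_ref=$2
    # A...B compares B against the merge base of A and B.
    # --diff-filter=ACMR drops deleted files from the list.
    git diff --name-only --diff-filter=ACMR "$base_ref...$head_ref"
}
```

It could be called as `changed_files "$BITBUCKET_PR_DESTINATION_BRANCH" "$BITBUCKET_BRANCH"`. Every path it prints exists in the source branch, so that is the version a later copy step would want.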

A functional script using standard file tools would look something like:

#!/bin/bash
# diffcopy.sh
#
DESTDIR="$1"
BRANCH1="$2"
BRANCH2="$3"

# so relative paths match git output
SRCDIR="$(git rev-parse --show-toplevel)"

# choose the branch whose files we want to copy
git checkout "$BRANCH2"

# make the output directory
mkdir -p "$DESTDIR"

# sync the changed files
rsync -a --files-from=<(git diff --name-only "${BRANCH1}".."${BRANCH2}") "$SRCDIR" "$DESTDIR"

# restore working copy
git checkout -

There may be a better way to do this purely in git, but I don't know it.
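One possible git-only variant, offered as a sketch rather than a tested replacement for the script above: read each changed file straight out of the branch with `git show <rev>:<path>`, so no checkout is needed and the working copy is left alone. `export_changed_files` is a hypothetical name, the argument order mirrors diffcopy.sh, and `xargs -0` is assumed to be available (it is in GNU and BSD xargs). Symlinks and submodules are not handled.

```shell
# Hypothetical helper: write head_ref's version of every changed file into
# dest, recreating the directory tree, without touching the working tree.
export_changed_files() {
    dest=$1
    base_ref=$2
    head_ref=$3
    mkdir -p "$dest" || return 1
    # -z emits NUL-separated paths; xargs -0 hands them to the inner
    # script as positional parameters after dest and head_ref.
    git diff -z --name-only --diff-filter=ACMR "$base_ref...$head_ref" |
        xargs -0 sh -c '
            dest=$1; head_ref=$2; shift 2
            for path in "$@"; do
                mkdir -p "$dest/$(dirname -- "$path")" || exit 1
                # <rev>:<path> names the file content stored in that revision
                git show "$head_ref:$path" > "$dest/$path" || exit 1
            done
        ' sh "$dest" "$head_ref"
}
```

For example, `export_changed_files ~/branch-diff "$BITBUCKET_PR_DESTINATION_BRANCH" "$BITBUCKET_BRANCH"` would fill ~/branch-diff with the source-branch versions only, which answers the "which one do you want copied" question explicitly.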

CodePudding user response:

If you have the GNU variant of cp and xargs, you can do this:

git diff --name-only -z $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH |
   xargs -0 cp --target-directory="$HOME/branch-diff/" --parents

This does not spawn a cp per file, but copies as many files as possible with one cp process. By specifying --target-directory, the destination can come first on the cp command line, and xargs can append as many source file names to the end of the cp command as it likes. --parents keeps the directory names of the source files.

The -z in git diff separates file names by a NUL character instead of line breaks, and the -0 of xargs knows how to take the NUL separated path list apart without stumbling over whitespace characters in file names.
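Because git prints paths relative to the top of the working tree, the command above only finds the files when it is run from there. A hedged wrapper, assuming GNU cp and xargs and assuming the source branch is the one currently checked out, might look like this (`copy_changed_from_worktree` is a hypothetical name):

```shell
# Hypothetical wrapper: copy files changed on the checked-out branch since
# it diverged from base_ref into dest, keeping their directories.
# Requires GNU cp (--parents, --target-directory) and xargs (-0, -r).
copy_changed_from_worktree() {
    dest=$1
    base_ref=$2
    mkdir -p "$dest" || return 1
    # Make dest absolute before changing directory, in case it is relative.
    dest=$(cd "$dest" && pwd) || return 1
    (
        # git diff paths are relative to the repository root, so run cp there.
        cd "$(git rev-parse --show-toplevel)" || exit 1
        # --diff-filter=ACMR skips deleted files, which cp could not find;
        # -r stops xargs from running cp at all when nothing changed.
        git diff -z --name-only --diff-filter=ACMR "$base_ref...HEAD" |
            xargs -0 -r cp --parents --target-directory="$dest" --
    )
}
```

A call such as `copy_changed_from_worktree "$HOME/branch-diff" "$BITBUCKET_PR_DESTINATION_BRANCH"` then works from any subdirectory of the repository.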
