Can you create a "diff" patch that does not include the data of the lines to be removed?

Time:08-15

I want to create a patch file that does not include any of the original data from the file being patched.

A normal unified diff includes the removed lines. I want to avoid that so I can redistribute patches to files without sharing the original data.

I understand that this makes it a one-way patch, but that is the intent.

Here is my current diff command:

diff -u <(xxd "originalFile") <(xxd "updatedFile") > "originalFile.patch"

Is there an option that lets me keep just the insertions in the .patch file?

Thanks.

CodePudding user response:

Try this instead:

comm -13 <(xxd "originalFile") <(xxd "updatedFile") > "originalFile.patch"

comm -13 prints only the lines that are unique to the second input, i.e. the rows of the updated file's hex dump that differ from the original. The patch looks like this:

0047b080: 9506 d708 0195 07ed 1209 2400 00a0 410a  ..........$...A.

To apply the patch:

xxd -r "originalFile.patch" "originalFile"
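
As a rough end-to-end check of the steps above, here is a minimal sketch using two small sample files. The file contents, the `workdir` variable, and the `target` copy are made up for illustration. It writes the hex dumps to temporary files instead of using process substitution so that it also runs under plain sh, and it sets `LC_ALL=C` because comm expects its inputs in sorted order.

```shell
set -eu
workdir=$(mktemp -d)

# Two sample files of equal length that differ in three bytes ("fox" vs "cat").
printf 'The quick brown fox jumps over the lazy dog.\n' > "$workdir/originalFile"
printf 'The quick brown cat jumps over the lazy dog.\n' > "$workdir/updatedFile"

# Dump both files; xxd prints ascending offsets, so the dumps are already sorted.
xxd "$workdir/originalFile" > "$workdir/orig.hex"
xxd "$workdir/updatedFile"  > "$workdir/upd.hex"

# Keep only the rows that appear in the updated dump but not in the original.
LC_ALL=C comm -13 "$workdir/orig.hex" "$workdir/upd.hex" > "$workdir/originalFile.patch"
cat "$workdir/originalFile.patch"

# Apply the patch to a copy of the original; xxd -r overwrites bytes in place
# at the listed offsets and does not truncate the output file.
cp "$workdir/originalFile" "$workdir/target"
xxd -r "$workdir/originalFile.patch" "$workdir/target"
cmp "$workdir/target" "$workdir/updatedFile" && echo "patched copy matches updatedFile"
```

Note that the single row in this patch still carries the 13 unchanged bytes that share a row with the changed ones, so some original data can leak this way. Also, since xxd -r only overwrites bytes, this approach fits files whose bytes change in place rather than files where data is inserted or removed.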

If you only want to include the changed bytes and don't mind the patch generation being much slower (and patches potentially being larger), you can do this instead:

comm -13 <(xxd -c1 "originalFile") <(xxd -c1 "updatedFile") > "originalFile.patch"

To apply the patch:

xxd -c1 -r "originalFile.patch" "originalFile"

xxd -c1 puts each byte on its own line, so the patch includes only the changed bytes and none of the surrounding bytes. The patch looks like this:

0047b087: cf  .
0047b088: 15  .
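
To see that the per-byte variant really records only the bytes that differ, the following sketch compares the number of patch lines with the number of differing bytes reported by `cmp -l`. The sample file contents and the `workdir`, `patch_lines`, and `changed_bytes` names are assumptions made for this illustration.

```shell
set -eu
workdir=$(mktemp -d)

# Same-length sample files that differ in exactly three bytes.
printf 'The quick brown fox jumps over the lazy dog.\n' > "$workdir/originalFile"
printf 'The quick brown cat jumps over the lazy dog.\n' > "$workdir/updatedFile"

# One byte per line, so each patch line describes exactly one byte.
xxd -c1 "$workdir/originalFile" > "$workdir/orig.hex"
xxd -c1 "$workdir/updatedFile"  > "$workdir/upd.hex"
LC_ALL=C comm -13 "$workdir/orig.hex" "$workdir/upd.hex" > "$workdir/originalFile.patch"

# Count patch lines and compare with the number of differing bytes.
# cmp -l prints one line per differing byte.
patch_lines=$(wc -l < "$workdir/originalFile.patch" | tr -d ' ')
changed_bytes=$(cmp -l "$workdir/originalFile" "$workdir/updatedFile" | wc -l | tr -d ' ')
echo "patch lines: $patch_lines, differing bytes: $changed_bytes"

# Apply with the same column setting that was used to create the patch.
cp "$workdir/originalFile" "$workdir/target"
xxd -c1 -r "$workdir/originalFile.patch" "$workdir/target"
cmp "$workdir/target" "$workdir/updatedFile" && echo "patched copy matches updatedFile"
```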

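Lastly, if you don't want to use comm, this is the diff equivalent of comm -13. Note that it uses '%>', which prints the lines from the second (updated) file; '%<' would print the lines from the original file instead:

diff --changed-group-format='%>' --unchanged-group-format=''

e.g.

diff --changed-group-format='%>' --unchanged-group-format='' <(xxd "originalFile") <(xxd "updatedFile") > "originalFile.patch"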
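
A quick way to check that the diff form matches comm -13 is to generate the patch both ways and compare the results. This is only a sketch: it assumes GNU diff (the group-format options are GNU extensions), it uses the `'%>'` format so that rows come from the second (updated) file, and the sample files and the `workdir` variable are invented for the example.

```shell
set -eu
workdir=$(mktemp -d)

printf 'The quick brown fox jumps over the lazy dog.\n' > "$workdir/originalFile"
printf 'The quick brown cat jumps over the lazy dog.\n' > "$workdir/updatedFile"
xxd "$workdir/originalFile" > "$workdir/orig.hex"
xxd "$workdir/updatedFile"  > "$workdir/upd.hex"

# Patch produced with comm: rows unique to the updated dump.
LC_ALL=C comm -13 "$workdir/orig.hex" "$workdir/upd.hex" > "$workdir/comm.patch"

# Patch produced with GNU diff: print only the second file's side of each
# changed group. diff exits with status 1 when the inputs differ, which is
# expected here, so only a status above 1 is treated as an error.
diff --changed-group-format='%>' --unchanged-group-format='' \
    "$workdir/orig.hex" "$workdir/upd.hex" > "$workdir/diff.patch" || [ "$?" -eq 1 ]

cmp "$workdir/comm.patch" "$workdir/diff.patch" && echo "both commands produce the same patch"
```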

CodePudding user response:

Some versions of diff and patch accept -e to write and read ed scripts. An ed script refers to removed lines only by line number, so it does not contain their content.

For multi-file patches, you need to add Index: lines to the header so that patch can tell which files to change.

Note that there seems to be a bug in GNU patch when used this way that sometimes makes it apply hunks to the wrong file. This seems to happen when only d commands appear and no c commands.

For example:

$ mkdir old new
$ seq 10 >old/a
$ (seq 3;seq 7 10) >new/a
$ seq 10 >old/b
$ (seq 2 4; seq 6 11) >new/b
$ diff -er old new |
  sed s',^diff -er .* new/,----\nIndex: new/,' >difs
$ cat difs
----
Index: new/a
4,6d
----
Index: new/b
10a
11
.
5d
1d
$ mkdir test
$ cp old/* test
$ ( cd test; patch -p1 -e <../difs )
?
patch: *** ed FAILED
$
$ (seq 3; echo ok; seq 7 10) >new/a
$ diff -er old new |
  sed s',^diff -er .* new/,----\nIndex: new/,' >difs2
$ cat difs2
----
Index: new/a
4,6c
ok
.
----
Index: new/b
10a
11
.
5d
1d
$ cp old/* test
$ ( cd test; patch -p1 -e <../difs2 )
$ diff -ur new test
$
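
For a single file, the ed script can also be fed to ed directly, which sidesteps patch and its Index: headers. The sketch below is an illustration under a few assumptions: `ed` is installed (the apply step is skipped otherwise), and the `workdir`, `script.ed`, and `target` names are hypothetical. It reuses the `seq`-generated sample data from the transcript above.

```shell
set -eu
workdir=$(mktemp -d)

seq 10 > "$workdir/old"
{ seq 3; echo ok; seq 7 10; } > "$workdir/new"

# Write the ed script. diff exits with status 1 when the files differ,
# which is expected here.
diff -e "$workdir/old" "$workdir/new" > "$workdir/script.ed" || [ "$?" -eq 1 ]

# The script names the replaced range (4,6c) and the new text, but not the
# old lines themselves.
cat "$workdir/script.ed"

cp "$workdir/old" "$workdir/target"
if command -v ed >/dev/null 2>&1; then
    # diff -e emits only edit commands, so append "w" (write) and "q" (quit).
    { cat "$workdir/script.ed"; echo w; echo q; } | ed -s "$workdir/target"
    cmp "$workdir/target" "$workdir/new" && echo "target now matches new"
else
    echo "ed is not installed; skipping the apply step"
fi
```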