Possible to extract only new words between 2 text files using git diff?

Time:02-16

I have two files, fileA and fileB. These are text files, each containing essentially one long string (a series of paragraphs).

I want to show only the 'words' in fileB that are newly added, i.e. not contained in fileA. Browsing this forum, I saw git diff suggested as an approach: its --word-diff option nicely highlights the new words in fileB in green.

I was wondering if there's a way to take this output and extract only the new additions in fileB, so that I could place them in a separate text file. To make this clearer, what I'm envisioning is something like the diff command given here as an answer: diff -U $(wc -l < fileA) fileA fileB | sed -n 's/^-//p' > fileC

CodePudding user response:

You're looking for --word-diff=porcelain, which makes extracting the individual changes easy to script. Each changed line is printed as a series of runs of unchanged, removed, and added words, one run per output line (prefixed with a space, -, or + respectively), with line boundaries in the input shown as a separate ~ line.
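As a rough sketch of how that output could be filtered, the helper below keeps only the added runs. The function name extract_added is made up for illustration, and the approach assumes everything before the first @@ hunk header is file header (so the "+++ b/fileB" line is not mistaken for an added run).

```shell
# extract_added (hypothetical helper name): read `git diff --word-diff=porcelain`
# output on stdin and print only the added runs, one per line.
extract_added() {
  awk '
    /^diff --git / { in_hunk = 0 }          # a new file header starts: stop matching
    /^@@/          { in_hunk = 1; next }    # hunk header: real diff content follows
    in_hunk && /^[+]/ { print substr($0, 2) }   # added run: drop the leading "+"
  '
}

# Usage with the file names from the question:
#   git diff --no-index --word-diff=porcelain fileA fileB | extract_added > fileC
```

The --no-index option lets git diff compare two files that are not tracked in a repository. Each added run lands on its own line in fileC; if a single stream of words is wanted instead, the output can be piped through something like tr '\n' ' '.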
