Premise:
I have five files in two directories.
folder/
├─ old/
│ ├─ a.json
│ ├─ b.xml
├─ new/
│ ├─ a.json
│ ├─ b.xml
│ ├─ c.html
old/a.json
{
"hello": {
},
"world": ""
}
new/a.json
{
"hello": {},
"world": ""
}
old/b.xml and new/b.xml are the same.
When I run diff I get:
2,3c2
< "hello": {
< },
---
> "hello": {},
As well as the new file, c.html.
Solution:
I want to only see that c.html is the new file added. I want to ignore the newline/spaces in the two a.json files.
Ideally I'd like to do diff -I '${REGEX_HERE}' folder/old folder/new to accomplish this. Is this possible? I also have other bash utilities at my disposal. This is meant to run in a Dockerfile.
CodePudding user response:
Run jq on each JSON file to produce a common output format, then diff THAT:
diff <(jq . folder/old/a.json) <(jq . folder/new/a.json)
For example:
$ head *.json
==> x.json <==
{
"hello": {
},
"world": ""
}
==> y.json <==
{
"hello": {},
"world": ""
}
$ jq . x.json
{
"hello": {},
"world": ""
}
$ jq . y.json
{
"hello": {},
"world": ""
}
$ diff <(jq . x.json) <(jq . y.json)
$
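If only a yes/no answer is needed for a pair of JSON files, the same idea can be wrapped in a small function. This is a minimal sketch, not part of the original answer: json_same is a hypothetical helper name, and it assumes jq is installed. It also passes -S (jq's --sort-keys option) so that object key order is ignored as well as whitespace, which may or may not be what you want.

```shell
#!/usr/bin/env bash
# json_same (hypothetical helper): succeed when two JSON files hold the
# same data, ignoring layout and object key order.
# Returns 0 if equal, 1 if different, 2 if jq could not parse a file.
json_same() {
    local left right
    left=$(jq -S . "$1") || return 2
    right=$(jq -S . "$2") || return 2
    [ "$left" = "$right" ]
}
```

Usage would look like `json_same folder/old/a.json folder/new/a.json && echo same`. Unlike the plain diff call it prints no line-by-line differences, so it suits a check that only needs the exit status.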
To do literally what you asked for, "I want to ignore the newline/spaces in the two a.json", would be:
$ diff <(tr -d '[:space:]' < x.json) <(tr -d '[:space:]' < y.json)
$
but that assumes your version of diff works on input files that don't have a terminating newline and so aren't valid text files per POSIX, and that you're OK with ALL white space being removed, even inside quotes, and that you don't care about other layout differences between the 2 files.
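The "even inside quotes" caveat is easy to underestimate, so here is a small illustration. It is only a sketch: strip_ws is a hypothetical helper name and the two JSON snippets are made-up inputs, not files from the question.

```shell
# strip_ws (hypothetical helper): delete every whitespace character.
strip_ws() { tr -d '[:space:]'; }

# Two documents whose string VALUES differ by an inner space.
a=$(printf '{ "msg": "hello world" }\n' | strip_ws)
b=$(printf '{ "msg": "helloworld" }\n' | strip_ws)

# Both collapse to the same text, so a real data difference is hidden.
if [ "$a" = "$b" ]; then
    echo "treated as identical: $a"
fi
```

A JSON-aware normalizer such as jq keeps the space inside "hello world" and would still report these two inputs as different.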
I expect you'll run into the same problem of wanting to ignore some of the white space and/or other formatting differences in XML and other files, so you'd have to write a tool something like this to diff the 2 directories as you appear to want (untested):
#!/usr/bin/env bash

# Collect the unique set of file paths relative to old/ and new/.
# %P strips the starting point, so both trees yield names like "a.json".
readarray -d '' files < <(find folder/old folder/new -type f -printf '%P\0' | sort -zu)

# Compare two files after normalizing the formats we know how to normalize.
# The exit status is that of the diff that ran.
diffByType() {
    case $1 in
        *.json ) diff <(jq . "$1") <(jq . "$2") >&2 ;;
        *.xml  ) diff <(xmlstarlet fo "$1") <(xmlstarlet fo "$2") >&2 ;;
        *      ) diff "$1" "$2" >&2 ;;
    esac
}

for file in "${files[@]}"; do
    if [[ -f "folder/old/$file" ]]; then
        if [[ -f "folder/new/$file" ]]; then
            if ! diffByType "folder/old/$file" "folder/new/$file"; then
                printf '%s is different\n' "$file" >&2
            fi
        else
            printf '%s is only in old\n' "$file" >&2
        fi
    else
        printf '%s is only in new\n' "$file" >&2
    fi
done
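Since the script is untested, it helps to have a throwaway copy of the question's layout to run it against. The sketch below builds one in a temporary directory and prints the union of relative file names that the loop is meant to walk. make_fixture is a hypothetical helper, the contents of b.xml and c.html are placeholders, and the listing assumes file names without newlines.

```shell
#!/usr/bin/env bash
# make_fixture (hypothetical helper): recreate folder/old and folder/new
# from the question under a fresh temporary directory and print its path.
make_fixture() {
    local root
    root=$(mktemp -d) || return 1
    mkdir -p "$root/folder/old" "$root/folder/new"
    printf '{\n"hello": {\n},\n"world": ""\n}\n' > "$root/folder/old/a.json"
    printf '{\n"hello": {},\n"world": ""\n}\n'   > "$root/folder/new/a.json"
    printf '<b/>\n' > "$root/folder/old/b.xml"      # placeholder content
    printf '<b/>\n' > "$root/folder/new/b.xml"      # placeholder content
    printf '<html></html>\n' > "$root/folder/new/c.html"
    printf '%s\n' "$root"
}

root=$(make_fixture)

# Union of relative file names across both trees, one per line.
names=$( { (cd "$root/folder/old" && find . -type f)
           (cd "$root/folder/new" && find . -type f); } | sort -u )
printf '%s\n' "$names"
```

Running the directory-diff script from inside "$root" should then report only c.html, as "only in new", provided jq and xmlstarlet are installed.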