EDIT - Reproducible Code, Output, and Updated Question
#!/bin/bash
# Input Directory
inpd="/home/space user/space test/space dir1"
# Line To Parse
line="/home/space user/space test/space dir1: space file 1.txt"
# Split Line awk -F[:]
echo ""
var1=$(echo "$line" | awk -F[:] '{print $1}')
echo " echo var1"
echo " $var1"
echo ""
echo " printf var1"
printf "%-2s%s" "" $var1
# var1 == inpd
echo ""
echo ""
echo " var1 == inpd"
if [ var1 == inpd ]; then
printf " Match."
else
printf " No match."
fi
echo ""
$ scriptname
echo var1
/home/space user/space test/space dir1
printf var1
/home/spaceuser/spacetest/spacedir1
var1 == inpd
No match.
Updated Question - How do I define, cast, or properly compare var1 to inpd so that it produces a match when the input has spaces? A better way to find the match without calling awk would also solve my problem.
I found the clue to solve my question here: How can I remove all text after a character in bash?
$ script - this gives a Match!
#!/bin/bash
# Input Directory
inpd="/home/space user/space test/space dir1"
# Line To Parse
line="/home/space user/space test/space dir1: space file 1.txt"
# var1 keeps everything in 'line' before :
var1=${line%:*}
echo ""
echo "$line"
echo "$var1"
printf '%s' "$var1"
# "$var1" == "$inpd"
echo ""
if [ "$var1" == "$inpd" ]; then
    printf " Match."
else
    printf " No match."
fi
echo ""
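For what it's worth, the first script seems to fail for two reasons that have nothing to do with awk: `[ var1 == inpd ]` compares the literal words var1 and inpd rather than the variables' contents, and the unquoted $var1 passed to printf is split on spaces into several arguments. A minimal sketch of the difference, reusing the names from the scripts above (is_match is a hypothetical helper, not part of the original script):

```bash
#!/bin/bash
inpd="/home/space user/space test/space dir1"
line="/home/space user/space test/space dir1: space file 1.txt"

# Keep everything before the last ':' (same expansion as the working script).
var1=${line%:*}

# Hypothetical helper: succeeds only when both strings are identical.
is_match() {
    [ "$1" = "$2" ]
}

# Literal words: this is what [ var1 == inpd ] actually tested.
if is_match var1 inpd; then echo "literal: Match."; else echo "literal: No match."; fi

# Quoted expansions compare the variables' contents, spaces included.
if is_match "$var1" "$inpd"; then echo "quoted: Match."; else echo "quoted: No match."; fi

# Quoting also stops printf from reusing its format once per word.
printf '%s\n' "$var1"
```

With the quotes in place the awk version of var1 should compare equal as well; the parameter expansion simply avoids starting a second process.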
EDIT - Why the Long Post?
I made a long post to show my script development effort, but the question now reduces to matching any path such as /path with/ or without spaces/dir1 to the same path string or variable extracted from the output lines of the diff command. I am using awk with -F[:] as the separator, but there may be an alternative way to do it. I have embedded reproducible code above and below under headings containing "Reproducible Code". The updated question should be read against the edit above; the rest of the long post preserves the new context and the original post.
For my use cases the custom script is non-recursive and handles spaces in paths and filenames, but as of now it generates errors for any path or filename containing a colon (:) and for any filename containing a slash (/). I am not sure what other characters or sequences would produce an error, and I don't need a more robust script for my present purposes.
Any input path containing spaces must be enclosed in quotes: dirt "/path with spaces/dir1".
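One possible way around the colon limitation, sketched here under the assumption that diff echoes each directory argument exactly as it was passed (as it does in the output shown in this post): strip the known prefix "Only in <dir>: " instead of splitting the line on a colon. The helper name only_name is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: if "$2" is an "Only in <dir>: <name>" line for the
# directory "$1", print <name>; otherwise return non-zero.
only_name() {
    local prefix="Only in $1: "
    local line=$2
    if [[ $line == "$prefix"* ]]; then
        # The quoted pattern is removed literally, so colons are harmless.
        printf '%s\n' "${line#"$prefix"}"
    else
        return 1
    fi
}

only_name "/home/joe/test dirdiff/dir1" "Only in /home/joe/test dirdiff/dir1: space 1.txt"
only_name "/tmp/a:b" "Only in /tmp/a:b: c: d.txt"
```

Because the directory is matched as a whole string, neither spaces nor colons in the path or the file name need special handling.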
So far I think that if subdirectories appear in only one directory, as shown in my test directory structure, then in the absence of file extensions there is no way to tell from the diff output whether a name refers to a file or a subdirectory. I intend to use tree to list directories with color to show files and subdirectories, and to use the new script dirt to compare files that are the same or different. This will probably work best for directories with few files and not many subdirectories, which is my intended use case.
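The diff output alone does not say whether a name is a file or a subdirectory, but since the script already knows the directory and the name, it could ask the filesystem instead. A small sketch, assuming the directories still exist when the script runs (entry_kind is a hypothetical helper):

```bash
#!/bin/bash
# Hypothetical helper: classify <dir>/<name> by testing the path itself.
entry_kind() {
    local path="$1/$2"
    if [[ -d $path ]]; then
        echo "dir"
    elif [[ -f $path ]]; then
        echo "file"
    else
        echo "other"
    fi
}

# Demo in a throwaway directory whose name contains a space.
tmp=$(mktemp -d)
mkdir -p "$tmp/test dir/subdir1"
touch "$tmp/test dir/only1"
entry_kind "$tmp/test dir" subdir1
entry_kind "$tmp/test dir" only1
rm -rf "$tmp"
```

The result could be used to append a trailing slash or a marker to directory names in the formatted output.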
EDIT - Desired Output Format (Script Name dirt Using Test Directories Below)
$ dirt "/home/joe/test dirdiff/dir1" "/home/joe/test dirdiff/dir2"
 BOTH    /home/joe/test dirdiff/dir1             /home/joe/test dirdiff/dir2
 diff    diff.txt                                diff.txt
         diffout.txt
         only1.txt
                                                 only2.txt
 same    same space.txt                          same space.txt
 same    same.txt                                same.txt
         space 1.txt
                                                 space 2.txt
         subdir1
                                                 subdir2
 comd    subdirC                                 subdirC
EDIT - Directory Structure With Spaces (Without :) To Test Script
/home/joe/test dirdiff
├── dir1
│   ├── diff.txt
│   ├── diffout.txt
│   ├── only1.txt
│   ├── same space.txt
│   ├── same.txt
│   ├── space 1.txt
│   ├── subdir1
│   └── subdirC
└── dir2
    ├── diff.txt
    ├── only2.txt
    ├── same space.txt
    ├── same.txt
    ├── space 2.txt
    ├── subdir2
    └── subdirC
EDIT - Output from running diff
$ diff -qs "/home/joe/test dirdiff/dir1" "/home/joe/test dirdiff/dir2"
Files /home/joe/test dirdiff/dir1/diff.txt and /home/joe/test dirdiff/dir2/diff.txt differ
Only in /home/joe/test dirdiff/dir1: diffout.txt
Only in /home/joe/test dirdiff/dir1: only1.txt
Only in /home/joe/test dirdiff/dir2: only2.txt
Files /home/joe/test dirdiff/dir1/same space.txt and /home/joe/test dirdiff/dir2/same space.txt are identical
Files /home/joe/test dirdiff/dir1/same.txt and /home/joe/test dirdiff/dir2/same.txt are identical
Only in /home/joe/test dirdiff/dir1: space 1.txt
Only in /home/joe/test dirdiff/dir2: space 2.txt
Only in /home/joe/test dirdiff/dir1: subdir1
Only in /home/joe/test dirdiff/dir2: subdir2
Common subdirectories: /home/joe/test dirdiff/dir1/subdirC and /home/joe/test dirdiff/dir2/subdirC
EDIT - Script Fragment dirt00 Stores diff Output in $diffout
#!/bin/bash
if [[ -z "$1" || -z "$2" ]]; then
    printf "\n Type $ dirt00 Dir1 Dir2\n"
else
    input1="$1"
    input2="$2"
    diffout=$(diff -qs "$1" "$2")
    # printf '%s\n' "$var" is necessary: with printf '%s' "$var", a variable
    # that doesn't end with a newline makes the while loop miss its last
    # line completely.
    while IFS= read -r line
    do
        echo "$line"
    done < <(printf '%s\n' "$diffout")
fi
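The comment about the missing last line can be checked in isolation. The sketch below counts how many lines a plain while read loop sees, and also shows a common alternative idiom that accepts a final line without a trailing newline. The function names are hypothetical and only serve the demonstration.

```bash
#!/bin/bash
# Count the lines a plain "while read" loop sees on standard input.
count_lines() {
    local line n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done
    echo "$n"
}

# Same loop, but also accepting a final line without a trailing newline:
# read fails at end of input yet still leaves the partial line in $line.
count_lines_tolerant() {
    local line n=0
    while IFS= read -r line || [[ -n $line ]]; do
        n=$((n + 1))
    done
    echo "$n"
}

text=$'first\nlast'    # two lines, no trailing newline

printf '%s'   "$text" | count_lines            # the last line is missed
printf '%s\n' "$text" | count_lines            # both lines are seen
printf '%s'   "$text" | count_lines_tolerant   # both lines are seen
```

Either fix should work here; the printf '%s\n' form used in the script keeps the loop condition simpler.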
EDIT - Output from running dirt00
$ dirt00 "/home/joe/test dirdiff/dir1" "/home/joe/test dirdiff/dir2"
Files /home/joe/test dirdiff/dir1/diff.txt and /home/joe/test dirdiff/dir2/diff.txt differ
Only in /home/joe/test dirdiff/dir1: diffout.txt
Only in /home/joe/test dirdiff/dir1: only1.txt
Only in /home/joe/test dirdiff/dir2: only2.txt
Files /home/joe/test dirdiff/dir1/same space.txt and /home/joe/test dirdiff/dir2/same space.txt are identical
Files /home/joe/test dirdiff/dir1/same.txt and /home/joe/test dirdiff/dir2/same.txt are identical
Only in /home/joe/test dirdiff/dir1: space 1.txt
Only in /home/joe/test dirdiff/dir2: space 2.txt
Only in /home/joe/test dirdiff/dir1: subdir1
Only in /home/joe/test dirdiff/dir2: subdir2
Common subdirectories: /home/joe/test dirdiff/dir1/subdirC and /home/joe/test dirdiff/dir2/subdirC
EDIT - Reproducible Code Script dirt01
#!/bin/bash
input1="/home/joe/test dirdiff/dir1"
input2="/home/joe/test dirdiff/dir2"
diffout="Files /home/joe/test dirdiff/dir1/diff.txt and /home/joe/test dirdiff/dir2/diff.txt differ
Only in /home/joe/test dirdiff/dir1: diffout.txt
Only in /home/joe/test dirdiff/dir1: only1.txt
Only in /home/joe/test dirdiff/dir2: only2.txt
Files /home/joe/test dirdiff/dir1/same space.txt and /home/joe/test dirdiff/dir2/same space.txt are identical
Files /home/joe/test dirdiff/dir1/same.txt and /home/joe/test dirdiff/dir2/same.txt are identical
Only in /home/joe/test dirdiff/dir1: space 1.txt
Only in /home/joe/test dirdiff/dir2: space 2.txt
Only in /home/joe/test dirdiff/dir1: subdir1
Only in /home/joe/test dirdiff/dir2: subdir2
Common subdirectories: /home/joe/test dirdiff/dir1/subdirC and /home/joe/test dirdiff/dir2/subdirC"
# printf '%s\n' "$var" is necessary: with printf '%s' "$var", a variable
# that doesn't end with a newline makes the while loop miss its last
# line completely.
printf "\n %-8s%-40s%-40s\n" "BOTH" "$input1" "$input2"
while IFS= read -r line
do
    #echo $line
    firstword=$(echo "$line" | awk '{print $1}')
    finalword=$(echo "$line" | awk '{print $NF}')
    # Quote the variables so that an empty value cannot break the test.
    if [ "$finalword" == "differ" ]; then
        snip=${line%" differ"}
        echo "$snip" | awk -F[/] '{printf " %-8s%-40s%-40s\n","diff",$NF,$NF}'
    elif [ "$finalword" == "identical" ]; then
        snip=${line%" are identical"}
        echo "$snip" | awk -F[/] '{printf " %-8s%-40s%-40s\n","same",$NF,$NF}'
    elif [ "$firstword" == "Common" ]; then
        echo "$line" | awk -F[/] '{printf " %-8s%-40s%-40s\n","comd",$NF,$NF}'
    else
        echo ""
    fi
done < <(printf '%s\n' "$diffout")
EDIT - Output from running dirt01
$ dirt01
 BOTH    /home/joe/test dirdiff/dir1             /home/joe/test dirdiff/dir2
 diff    diff.txt                                diff.txt
 same    same space.txt                          same space.txt
 same    same.txt                                same.txt
 comd    subdirC                                 subdirC
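The awk -F[/] calls in dirt01 only pick the text after the last slash, which bash can do by itself with the ${var##*/} expansion. A sketch of that idea, assuming the line has the "Files A and B differ" or "Files A and B are identical" shape shown above (files_name is a hypothetical helper):

```bash
#!/bin/bash
# Hypothetical helper: print the file name from a "Files A and B differ"
# or "Files A and B are identical" line without starting awk.
files_name() {
    local snip=$1
    snip=${snip%" differ"}          # drop the trailing verdict, if present
    snip=${snip%" are identical"}
    printf '%s\n' "${snip##*/}"     # keep only the text after the last '/'
}

files_name "Files /home/joe/test dirdiff/dir1/diff.txt and /home/joe/test dirdiff/dir2/diff.txt differ"
files_name "Files /home/joe/test dirdiff/dir1/same space.txt and /home/joe/test dirdiff/dir2/same space.txt are identical"
```

Spaces in the name survive because nothing is ever split into words.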
I cannot write dirt02, to complete the script, without an answer to the updated question at the top of this post.
I left the original question and post below to preserve the context for the existing answer and comments which are greatly appreciated!
NOTE - Original Question and Post Below
In the two lines starting with $NF=="differ" and $NF=="identical":
(1) How do I split the file name and extension from the directory, using either of the equivalent awk variables shown below as $2 or $4, and then output filename.ext in the printf command?
dirdiff - bash script
#!/bin/bash
if [[ -z $1 || -z $2 ]]; then
printf "\n Type $ dirdiff Dir1 Dir2\n"
else
LEFT=$1
LEFT+=:
RGHT=$2
RGHT+=:
printf "\n %-8s%-40s%-40s\n" "" "$1" "$2"
printf " %-8s%-40s%-40s\n\n" "" "$LEFT" "$RGHT"
diff -qs $1 $2
echo ""
printf "\n%-8s%-40s%-40s\n" "INFO" "$1" "$2"
diff -qs $1 $2 | awk -v L=$LEFT -v R=$RGHT \
'$NF=="differ" {printf "%-8s%-40s%-40s\n","diff", $2, $4} \
$NF=="identical" {printf "%-8s%-40s%-40s\n","same", $2, $4} \
$3==L {printf "%-8s%-40s\n","", $4} \
$3==R {printf "%-8s%-40s%-40s\n","", "", $4}'
fi
This is the debug-and-develop script, which runs the command $ diff -qs $1 $2 twice. The first run shows the raw output. The second run pipes the output to awk, where I am trying to parse the lines and format the output on the command line. My questions relate to the final five lines of the script. EDIT: I solved the printf syntax problem in awk, as shown in the code.
Running dirdiff on the command line gives the following output:
$ dirdiff /usr/local/adm/sys /mnt/ssdroot/home/joe/admin/sys
/usr/local/adm/sys /mnt/ssdroot/home/joe/admin/sys
/usr/local/adm/sys: /mnt/ssdroot/home/joe/admin/sys:
Only in /mnt/ssdroot/home/joe/admin/sys: bashrc.txt
Only in /usr/local/adm/sys: debpkgs.txt
Files /usr/local/adm/sys/direnv.txt and /mnt/ssdroot/home/joe/admin/sys/direnv.txt differ
Only in /usr/local/adm/sys: dpiDec2022.txt
Only in /mnt/ssdroot/home/joe/admin/sys: mypkgs.txt
Only in /mnt/ssdroot/home/joe/admin/sys: pyenv.txt
Files /usr/local/adm/sys/ssh.txt and /mnt/ssdroot/home/joe/admin/sys/ssh.txt are identical
Files /usr/local/adm/sys/usbquirks.txt and /mnt/ssdroot/home/joe/admin/sys/usbquirks.txt differ
INFO /usr/local/adm/sys /mnt/ssdroot/home/joe/admin/sys
bashrc.txt
debpkgs.txt
diff /usr/local/adm/sys/direnv.txt /mnt/ssdroot/home/joe/admin/sys/direnv.txt
dpiDec2022.txt
mypkgs.txt
pyenv.txt
same /usr/local/adm/sys/ssh.txt /mnt/ssdroot/home/joe/admin/sys/ssh.txt
diff /usr/local/adm/sys/usbquirks.txt /mnt/ssdroot/home/joe/admin/sys/usbquirks.txt
Desired Command Line Output Format (Duplicated at Top)
$ dirdiff /usr/local/adm/sys /mnt/ssdroot/home/joe/admin/sys
INFO /usr/local/adm/sys /mnt/ssdroot/home/joe/admin/sys
bashrc.txt
debpkgs.txt
diff direnv.txt direnv.txt
dpiDec2022.txt
mypkgs.txt
pyenv.txt
same ssh.txt ssh.txt
diff usbquirks.txt usbquirks.txt
CodePudding user response:
Hope this helps. I think the sub function is what you are asking about, in place of a basename function.
Good luck!
diff -qs $1 $2 | gawk -v L=$1 -v R=$2 \
'BEGIN { printf "\n%-8s%-40s%-40s\n", "INFO", L, R } \
$NF=="differ" { sub( /.*\//,"",$4) ; printf "%-8s%-40s%-40s\n", "diff", $4, $4 } \
$NF=="identical" { sub( /.*\//,"",$4) ; printf "%-8s%-40s%-40s\n", "same", $4, $4 } \
$3==L":" { sub( /.*\//,"",$4) ; printf "%-8s%-40s%-40s\n", "only", $4, "" } \
$3==R":" { sub( /.*\//,"",$4) ; printf "%-8s%-40s%-40s\n", "only", "", $4 } '
INFO dir1 dir2
only bashrc.txt
only debpkgs.txt
diff direnv.txt direnv.txt
only dpiDec2022.txt
only mypkgs.txt
only pyenv.txt
same ssh.txt ssh.txt
diff usbquirks.txt usbquirks.txt
CodePudding user response:
Directory Structure With Spaces (Without :) To Test Script
/home/joe/test dirdiff
├── dir1
│   ├── diff.txt
│   ├── diffout.txt
│   ├── only1.txt
│   ├── same space.txt
│   ├── same.txt
│   ├── space 1.txt
│   ├── subdir1
│   └── subdirC
└── dir2
    ├── diff.txt
    ├── only2.txt
    ├── same space.txt
    ├── same.txt
    ├── space 2.txt
    ├── subdir2
    └── subdirC
Reproducible Script Works for Paths & Names Containing Spaces but Not Colons
#!/bin/bash
input1="/home/joe/test dirdiff/dir1"
input2="/home/joe/test dirdiff/dir2"
diffout="Files /home/joe/test dirdiff/dir1/diff.txt and /home/joe/test dirdiff/dir2/diff.txt differ
Only in /home/joe/test dirdiff/dir1: diffout.txt
Only in /home/joe/test dirdiff/dir1: only1.txt
Only in /home/joe/test dirdiff/dir2: only2.txt
Files /home/joe/test dirdiff/dir1/same space.txt and /home/joe/test dirdiff/dir2/same space.txt are identical
Files /home/joe/test dirdiff/dir1/same.txt and /home/joe/test dirdiff/dir2/same.txt are identical
Only in /home/joe/test dirdiff/dir1: space 1.txt
Only in /home/joe/test dirdiff/dir2: space 2.txt
Only in /home/joe/test dirdiff/dir1: subdir1
Only in /home/joe/test dirdiff/dir2: subdir2
Common subdirectories: /home/joe/test dirdiff/dir1/subdirC and /home/joe/test dirdiff/dir2/subdirC"
printf "\n %-8s%-40s%-40s\n" "BOTH" "$input1" "$input2"
# printf '%s\n' "$var" is necessary: with printf '%s' "$var", a variable
# that doesn't end with a newline makes the while loop miss its last
# line completely.
while IFS= read -r line
do
    #echo $line
    firstword=$(echo "$line" | awk '{print $1}')
    finalword=$(echo "$line" | awk '{print $NF}')
    if [[ "$finalword" == "differ" ]]; then
        snip=${line%" differ"}
        echo "$snip" | awk -F[/] '{printf " %-8s%-40s%-40s\n","diff",$NF,$NF}'
    elif [[ "$finalword" == "identical" ]]; then
        snip=${line%" are identical"}
        echo "$snip" | awk -F[/] '{printf " %-8s%-40s%-40s\n","same",$NF,$NF}'
    elif [[ "$firstword" == "Common" ]]; then
        echo "$line" | awk -F[/] '{printf " %-8s%-40s%-40s\n","comd",$NF,$NF}'
    elif [[ "$firstword" == "Only" ]]; then
        snip=${line#"Only in "}
        # Directory is the text before the colon, name the text after it.
        mdir=${snip%:*}
        name=${snip#*:}
        # Drop the single space that follows the colon.
        name=${name# *}
        if [[ "$mdir" == "$input1" ]]; then
            printf " %-8s%-40s\n" "" "$name"
        else
            printf " %-8s%-40s%-40s\n" "" "" "$name"
        fi
    else
        echo ""
    fi
done < <(printf '%s\n' "$diffout")
$ scriptname
 BOTH    /home/joe/test dirdiff/dir1             /home/joe/test dirdiff/dir2
 diff    diff.txt                                diff.txt
         diffout.txt
         only1.txt
                                                 only2.txt
 same    same space.txt                          same space.txt
 same    same.txt                                same.txt
         space 1.txt
                                                 space 2.txt
         subdir1
                                                 subdir2
 comd    subdirC                                 subdirC
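If starting two awk processes per line ever becomes a concern, the same classification could probably be done with a case statement reading diff's output directly. The following is only a sketch under stated assumptions: bash, GNU diff, and English messages (forced here with LC_ALL=C). The function name dirt and its two positional arguments are taken from the usage described in the question; everything else is illustrative.

```bash
#!/bin/bash
# Sketch only: assumes bash and GNU diff with English (LC_ALL=C) messages.
dirt() {
    local dir1=$1 dir2=$2 line snip mdir name
    printf '\n %-8s%-40s%-40s\n' "BOTH" "$dir1" "$dir2"
    while IFS= read -r line; do
        case $line in
            "Files "*" differ")
                snip=${line%" differ"}
                name=${snip##*/}
                printf ' %-8s%-40s%-40s\n' "diff" "$name" "$name" ;;
            "Files "*" are identical")
                snip=${line%" are identical"}
                name=${snip##*/}
                printf ' %-8s%-40s%-40s\n' "same" "$name" "$name" ;;
            "Common subdirectories: "*)
                name=${line##*/}
                printf ' %-8s%-40s%-40s\n' "comd" "$name" "$name" ;;
            "Only in "*)
                snip=${line#"Only in "}
                mdir=${snip%: *}          # text before the last ": "
                name=${snip#"$mdir: "}    # text after it
                if [[ $mdir == "$dir1" ]]; then
                    printf ' %-8s%-40s\n' "" "$name"
                else
                    printf ' %-8s%-40s%-40s\n' "" "" "$name"
                fi ;;
        esac
    done < <(LC_ALL=C diff -qs "$dir1" "$dir2")
}

# Run only when both directory arguments are supplied.
if [[ -n $1 && -n $2 ]]; then
    dirt "$1" "$2"
fi
```

Like the script above, this still misreads a name that itself contains a colon followed by a space, so it is no more robust in that respect; it only removes the intermediate variable and the awk calls.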