I have two files with almost identical filenames:
/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz
/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz
How can I extract, in bash, ONLY the characters that differ?
Desired output:
1 3
Edit:
- The filenames are always the same length
- Only differences in the _R[0-9] part should be taken into account
Answer:
Comparing Only An Interesting Subset
(Answering the question as-edited)
#!/usr/bin/env bash
s1='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz'
s2='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz'
# Capture the digits that follow "_R" and precede a "." or "_".
revision_re='_R([[:digit:]]+)[._]'
rev1=; rev2=;
[[ $s1 =~ $revision_re ]] && rev1=${BASH_REMATCH[1]}
[[ $s2 =~ $revision_re ]] && rev2=${BASH_REMATCH[1]}
if [[ $rev1 && $rev2 ]] && [[ $rev1 != "$rev2" ]]; then
  printf '%s %s\n' "$rev1" "$rev2"
fi
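If this comparison is needed for many file pairs, the same regex approach can be wrapped in functions. The following is only a sketch: the names read_number and diff_read_numbers are hypothetical, and the shortened filenames are made up for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the digits that follow "_R" in a filename.
# Returns non-zero when the name has no "_R<digits>" followed by "." or "_".
read_number() {
  local re='_R([[:digit:]]+)[._]'
  [[ $1 =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[1]}"
}

# Hypothetical helper: print both read numbers, but only when they differ.
diff_read_numbers() {
  local a b
  a=$(read_number "$1") || return 1
  b=$(read_number "$2") || return 1
  [[ $a != "$b" ]] && printf '%s %s\n' "$a" "$b"
}

# Illustrative (made-up) filenames; prints "1 3".
diff_read_numbers 'sample_L002_R1.extracted.fastq.gz' 'sample_L002_R3.extracted.fastq.gz'
```

The regex is kept in a variable rather than written inline because quoting the right-hand side of =~ would make bash treat it as a literal string.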
Comparing The Whole String
(Answering the question as originally asked)
#!/usr/bin/env bash
s1='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz'
s2='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz'
max_len=$(( ${#s1} > ${#s2} ? ${#s1} : ${#s2} ))
# Walk both strings one index at a time and print each differing pair.
for (( idx=0; idx<max_len; idx++ )); do
  if [[ ${s1:idx:1} != "${s2:idx:1}" ]]; then
    printf '%s ' "${s1:idx:1}" "${s2:idx:1}"
  fi
done
printf '\n'
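The loop above leaves a trailing space before the newline. If that matters, one possible variation collects the differing characters in an array and joins them when printing. This is a sketch only: diff_chars is a hypothetical name, and it assumes IFS still has its default value, so the join character is a space.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print every pair of characters that differs
# at the same index in two strings, separated by single spaces.
diff_chars() {
  local s1=$1 s2=$2 idx max_len
  local -a out=()
  max_len=$(( ${#s1} > ${#s2} ? ${#s1} : ${#s2} ))
  for (( idx = 0; idx < max_len; idx++ )); do
    if [[ ${s1:idx:1} != "${s2:idx:1}" ]]; then
      out+=("${s1:idx:1}" "${s2:idx:1}")
    fi
  done
  # "${out[*]}" joins the elements on the first character of IFS.
  printf '%s\n' "${out[*]}"
}

# Illustrative (made-up) strings; prints "1 3".
diff_chars 'L002_R1.extracted' 'L002_R3.extracted'
```

When the strings have different lengths, the shorter one contributes empty strings past its end, so this form is best suited to the equal-length case described in the question.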