Renaming a .tsv file with bash if a .json file with the same name contains a certain string


For each subject I have a folder with two files (.json and .tsv) per task (gram, plaus, and sem), for a total of 6 files per subject. Each .tsv/.json pair has the same name apart from the file extension. For example, one subject's folder might contain: xxx.tsv, xxx.json, yyy.tsv, yyy.json, zzz.tsv, zzz.json.

I want to look through each .json file, see whether it contains the string "Gram", "Plaus", or "Sem", and rename the corresponding .tsv file to contain _Gram, _Plaus, or _Sem before the file extension based on which is found. Right now, my code (after changing to my subject folder) looks like this:

find -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' filename
do
    if [[grep -q 'Sem' "$filename"]]; then
    mv ${sem_name}.tsv ${sem_name}_sem.tsv
    fi
    if [[grep -q 'Plaus' "$filename"]]; then
    mv ${plaus_name}.tsv ${plaus_name}_plaus.tsv
    fi
    if [[grep -q 'Gram' "$filename"]]; then
    mv ${gram_name}.tsv ${gram_name}_gram.tsv
    fi
done

I'm wondering if an awk command might work better? I'm new to scripting with bash and unix in general, so any ideas are much appreciated!

CodePudding user response:

It does make sense to use awk instead of grep in this case:


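find . -type f -name "*_regressors.json" -print0 |
while IFS= read -r -d '' filename
do
    suffix=$(
        awk '
            match($0,/Sem|Plaus|Gram/) {
                print tolower(substr($0,RSTART,RLENGTH))
                exit    # stop at the first match
            }
        ' "$filename"
    )
    # rename the matching .tsv, but only if a task name was found
    [[ -n $suffix ]] && mv -- "${filename%.*}.tsv" "${filename%.*}_$suffix.tsv"
done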

However, matching a literal string inside a JSON file without parsing it might yield unexpected results, for example when the string also appears inside a longer word or in an unrelated field.
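
To illustrate the kind of surprise meant here, below is a minimal sketch that compares a plain substring match with a whole-word match. The helper name task_label, the demo file, and its "Description"/"TaskName" keys are made up for the example and are not taken from the question. The -o and -w options are widely supported by grep (GNU and BSD), but this is still text matching, not JSON parsing.

```shell
#!/usr/bin/env bash
# task_label is a hypothetical helper, not part of the answer above.
# It prints the first whole-word task label found in a file, lower-cased.
task_label() {
    grep -owE 'Sem|Plaus|Gram' "$1" | head -n 1 | tr '[:upper:]' '[:lower:]'
}

# Made-up demo file: "Seminar" contains "Sem" only as a substring.
demo=$(mktemp)
printf '%s\n' '{"Description": "Seminar notes", "TaskName": "Plaus"}' > "$demo"

grep -oE 'Sem|Plaus|Gram' "$demo" | head -n 1   # substring match: reports "Sem" first
task_label "$demo"                               # whole-word match: reports "plaus"

rm -f "$demo"
```

If the label is known to live under a specific key, a real JSON parser would be more robust than either form of text matching.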

CodePudding user response:

Would you please try the following:


find . -type f -name "*_regressors.json" -print0 | while IFS= read -r -d '' f; do
    str=$(grep -oE "\b(Sem|Plaus|Gram)\b" "$f")                 # search the json file for the strings
    if (( $? == 0 )); then                                      # $? is 0 if grep found a match
        str=$(head -n 1 <<< "$str" | tr '[:upper:]' '[:lower:]')  # pick the 1st match and lower the case
        base=${f%.json}                                         # remove the extension
        echo mv -- "${base}.tsv" "${base}_${str}.tsv"           # print the rename command (dry run)
    fi
done

  • The head command picks the first matched line in case there are multiple matches. (This may be overly cautious.)
  • If the printed commands look good, remove echo before mv and run the loop again.
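
The dry-run-then-run workflow described in the notes above could also be wrapped in a function so that nothing has to be edited between the two passes. This is only a sketch under a few assumptions: the function name rename_by_task and its --run flag are invented for this example, and the extra check for a missing .tsv file is an addition that the answer itself does not make.

```shell
#!/usr/bin/env bash
# rename_by_task is a hypothetical wrapper around the steps in this answer.
# Usage: rename_by_task DIR [--run]
# Without --run it only prints the mv commands (dry run).
rename_by_task() {
    local dir=$1 mode=${2:-} f str base
    while IFS= read -r -d '' f; do
        # all whole-word matches; skip the file when grep finds none
        str=$(grep -owE 'Sem|Plaus|Gram' "$f") || continue
        # keep the first match and lower-case it
        str=$(head -n 1 <<< "$str" | tr '[:upper:]' '[:lower:]')
        base=${f%.json}
        # assumption: silently skip .json files that have no .tsv partner
        [[ -e $base.tsv ]] || continue
        if [[ $mode == --run ]]; then
            mv -- "$base.tsv" "${base}_${str}.tsv"
        else
            echo mv -- "$base.tsv" "${base}_${str}.tsv"
        fi
    done < <(find "$dir" -type f -name '*_regressors.json' -print0)
}
```

Calling rename_by_task . from the subject folder prints the planned commands, and rename_by_task . --run performs them.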