How do I rename files if they match with names within a csv column?

Time:05-10

I'm working in a directory that contains about 100 BAM files, each with a specific name. The same directory also contains a csv file with 4 columns: FileID, File.name, Donor, Type

Let's say that the bam files in the directory are: Donor.1234-xyz.bam, Donor.5678-abc.bam, Donor.1011-def.bam, Donor.1213-ghi.bam (these match the names in column 2, File.name, of the csv file).

I am not really familiar with coding, so I will try to explain what I would like to do. I would like a script that renames the bam files in the folder using the Donor and Type values (columns 3 and 4). So if Donor.1234-xyz.bam is found in the File.name column of the csv file, it should be renamed to the strings in columns 3 and 4 of that row (basically, I want to replace the bam name with the corresponding Donor and Type).

    BAM_FILE="*.bam"
    BAM=$BAM_FILE
    NAME="cat kich.csv | cut -f2 -s"
    NM=$NAME
    DONOR="cat kich.csv | cut -f3 -s"
    DO=$DONOR
    TYPE="cat kich.csv | cut -f4 -s"
    TY=$TYPE    
    for file_name in "$BAM"; 
    do
    if [[ "$file_name" == "$NM" ]] then 
    mv ${file_name} ${DO}_${TY} ;    
    done

But it doesn't really work; as I said, I'm still new to this. Could you please help me fix it?

CodePudding user response:

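#!/bin/bash

# Dry run: prints the mv commands. Remove "echo" to actually rename.
while IFS=, read -r id name donor type; do
    # Assumes column 2 holds the name without the .bam extension;
    # use in="$name" if the csv already includes it.
    in="${name}.bam"
    [ -f "$in" ] || continue

    out="${donor}_${type}.bam"
    # Never overwrite an existing file
    if [ -f "$out" ]; then
        echo "output exists: $out"
        continue
    fi

    echo mv -iv "$in" "$out"
done <kich.csv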

CodePudding user response:

Let's suppose the directory contains

Donor.1011-def.bam
Donor.1234-xyz.bam
Donor.5678-abc.bam
kich.csv

and kich.csv contains

1,Donor.1234-xyz.bam,DO1,TY1
2,Donor.5678-abc.bam,DO2,TY2
3,Donor.1011-def.bam,DO3,TY3
4,Donor.1213-ghi.bam,DO4,TY4

(note that the csv file contains more than what's in the directory)

I would not loop over the files, I'd loop over the contents of the csv file:

while IFS=, read -r id filename donor type; do
    if [[ -f "$filename" ]]; then
        echo mv "$filename" "${donor}_${type}"
    fi
done < kich.csv

which outputs

mv Donor.1234-xyz.bam DO1_TY1
mv Donor.5678-abc.bam DO2_TY2
mv Donor.1011-def.bam DO3_TY3

That loop is:

  • reading lines from the csv file (< kich.csv);
  • splitting them into comma-separated fields (IFS=, read -r id filename donor type);
  • checking to see if the filename listed in the csv file exists ([[ -f "$filename" ]]);
  • then emitting the mv command.

If you're happy with that output, remove the echo to actually rename the files.
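If you want something a bit more defensive, here is a minimal sketch of the same loop wrapped in a function. The names rename_from_csv and DRY_RUN are made up for this example, and it rests on a few assumptions that the question does not confirm: that the new names should keep the .bam extension, that the csv may have Windows line endings, and that an existing target should never be overwritten. The header row needs no special handling because no file is called File.name.

```shell
# Sketch only: rename_from_csv and DRY_RUN are hypothetical names for this example.
# Usage: rename_from_csv kich.csv            (dry run, prints the mv commands)
#        DRY_RUN=0 rename_from_csv kich.csv  (actually renames)
rename_from_csv() {
    cr=$(printf '\r')
    # "|| [ -n "$id" ]" also processes a last line that lacks a trailing newline
    while IFS=, read -r id filename donor type || [ -n "$id" ]; do
        type=${type%"$cr"}              # drop a trailing CR from Windows line endings
        [ -f "$filename" ] || continue  # skips the header row and files not on disk
        target="${donor}_${type}.bam"   # assumption: the .bam extension should be kept
        if [ -e "$target" ]; then
            echo "skipping $filename: $target already exists" >&2
            continue
        fi
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo mv -- "$filename" "$target"
        else
            mv -- "$filename" "$target"
        fi
    done < "$1"
}
```

It defaults to a dry run, so nothing is renamed until you set DRY_RUN=0 after checking the printed commands.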

CodePudding user response:

Assuming your file looks something like this:

$ cat file.csv     
01;file01;part1;part2
02;file02;part3;part4

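you can try this:

while read -r line;
do
   # Split each line on ";" and pick fields 2, 3 and 4
   file=$(echo "$line" | cut -d";" -f2)
   newName=$(echo "$line" | cut -d";" -f3)_$(echo "$line" | cut -d";" -f4)

   # Quote the variables so names containing spaces are handled correctly
   mv -- "$file" "$newName"

done < file.csv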

CodePudding user response:

awk -F, '{print "[ -f \""$2"\" ] && mv \""$2"\" \""$3"_"$4"\""}' kich.csv | sh