I'm working in a directory that contains a lot (about 100) of BAM files with specific names. The same directory also contains a csv file with 4 columns: FileID, File.name, Donor, Type
Let's say that the bam files in the directory are: Donor.1234-xyz.bam, Donor.5678-abc.bam, Donor.1011-def.bam, Donor.1213-ghi.bam
(which match the names in column 2, called File.name, of the csv file).
I am not very familiar with coding, so I will try to explain what I would like to do. I would like a script that renames the bam files in the folder using Donor and Type (columns 3 and 4). So if Donor.1234-xyz.bam is found in the File.name column of the csv file, I would like it to be renamed to the strings in columns 3 and 4 of the same row (basically, I would like to replace the bam name with the corresponding Donor and Type values). This is what I tried:
BAM_FILE="*.bam"
BAM=$BAM_FILE
NAME="cat kich.csv | cut -f2 -s"
NM=$NAME
DONOR="cat kich.csv | cut -f3 -s"
DO=$DONOR
TYPE="cat kich.csv | cut -f4 -s"
TY=$TYPE
for file_name in "$BAM";
do
if [[ "$file_name" == "$NM" ]] then
mv ${file_name} ${DO}_${TY} ;
done
But it doesn't really work; as I said, I'm still new to this. Could you please help me fix this problem?
CodePudding user response:
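#!/bin/bash
# Read kich.csv line by line, splitting each line on commas into four fields.
while IFS=, read -r id name donor type; do
    # Assumes File.name is stored without the .bam extension;
    # use in="$name" instead if the column already includes it.
    in="${name}.bam"
    # Skip rows whose input file is not in the current directory.
    ! [ -f "$in" ] && continue
    out="${donor}_${type}.bam"
    # Never overwrite an existing output file.
    if [ -f "$out" ]; then
        echo "output exists: $out"
        continue
    fi
    # Dry run: prints the command; remove "echo" to actually rename.
    echo mv -iv "$in" "$out"
done <kich.csv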
CodePudding user response:
Let's suppose the directory contains
Donor.1011-def.bam
Donor.1234-xyz.bam
Donor.5678-abc.bam
kich.csv
and kich.csv contains
1,Donor.1234-xyz.bam,DO1,TY1
2,Donor.5678-abc.bam,DO2,TY2
3,Donor.1011-def.bam,DO3,TY3
4,Donor.1213-ghi.bam,DO4,TY4
(note that the csv file contains more than what's in the directory)
I would not loop over the files, I'd loop over the contents of the csv file:
while IFS=, read -r id filename donor type; do
    if [[ -f "$filename" ]]; then
        echo mv "$filename" "${donor}_${type}"
    fi
done < kich.csv
which outputs
mv Donor.1234-xyz.bam DO1_TY1
mv Donor.5678-abc.bam DO2_TY2
mv Donor.1011-def.bam DO3_TY3
That loop is:
- reading lines from the csv file (< kich.csv);
- splitting them into comma-separated fields (IFS=, read -r id filename donor type);
- checking to see if the filename listed in the csv file exists ([[ -f "$filename" ]]);
- then emitting the mv command.
If you're happy with that output, remove the echo to actually rename the files.
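As a variation on the loop above, here is a minimal sketch that keeps the .bam extension on the new name, refuses to overwrite an existing file, and tolerates Windows line endings and a missing final newline in the csv. The function name rename_from_csv and its optional apply argument are made up for this example, and the assumption that the new names should end in .bam is mine, not something stated in the question.

```shell
# Sketch only: rename_from_csv and its optional "apply" argument are
# hypothetical names, not something from the question.
rename_from_csv() (
    csv=$1
    mode=${2:-dry-run}
    cr=$(printf '\r')
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS=, read -r id filename donor type || [ -n "$filename" ]; do
        type=${type%"$cr"}                 # drop a trailing CR (Windows line endings)
        [ -f "$filename" ] || continue     # skip rows whose file is absent (e.g. a header row)
        target="${donor}_${type}.bam"      # assumption: keep the .bam extension
        if [ -e "$target" ]; then
            echo "skipping $filename: $target already exists" >&2
            continue
        fi
        if [ "$mode" = apply ]; then
            mv -- "$filename" "$target"
        else
            echo mv "$filename" "$target"  # dry run: only print the command
        fi
    done < "$csv"
)
```

Calling rename_from_csv kich.csv only prints the mv commands; rename_from_csv kich.csv apply performs the renames.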
CodePudding user response:
Assuming your file looks something like this:
$ cat file.csv
01;file01;part1;part2
02;file02;part3;part4
You can try this:
while read -r line;
do
    file=$(echo "$line" | cut -d";" -f2)
    newName=$(echo "$line" | cut -d";" -f3)_$(echo "$line" | cut -d";" -f4)
    mv "$file" "$newName"
done < file.csv
CodePudding user response:
awk -F, '{print "[ -f \""$2"\" ] && mv \""$2"\" \""$3"_"$4"\""}' kich.csv | sh
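A cautious way to use this one-liner, as a sketch: wrap the awk part in a function so the generated commands can be inspected before they are piped to sh. The name gen_mv_commands is hypothetical. Note that the generated text is interpreted by the shell, so file names containing double quotes, $ or backticks would not be handled safely this way.

```shell
# Sketch only: gen_mv_commands is a hypothetical wrapper around the awk call above.
# It prints one guarded mv command per csv row and changes nothing by itself.
gen_mv_commands() {
    awk -F, '{print "[ -f \""$2"\" ] && mv \""$2"\" \""$3"_"$4"\""}' "$1"
}
```

Run gen_mv_commands kich.csv to review the commands, then gen_mv_commands kich.csv | sh to execute them.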