From a table, I am trying to pass the filenames (column 2) that belong to each unique group (column 1) as input to a Slurm job array, with one array task per unique value in column 1. I am doing something like this:
Sample table:
$ cat table.txt
group1 a.txt
group1 b.txt
group2 c.txt
group2 d.txt
group3 e.txt
group3 f.txt
================================
#!/bin/bash
#SBATCH --array=0-2
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1-
GRP=`awk '{print $1}' table.txt | uniq`
echo $GRP
XYZ=${GRP[$SLURM_ARRAY_TASK_ID]}
echo $XYZ
INPUT=`awk -v x="$XYZ" '$1 == x {print $2}' table.txt`
echo $INPUT
=================================
The desired output is the list of two files for each job in the array. For example:
a.txt
b.txt
Problem: using the variable $XYZ as the pattern does not return any value for $INPUT, even though $GRP and $XYZ themselves print fine. If I use $GRP instead of $XYZ as the variable in $INPUT, and the table has only one unique value in column 1, as follows, it works.
$ cat table.txt
group1 a.txt
group1 b.txt
The output here is a.txt b.txt.
I would appreciate it if anyone could help me get similar output for each group in the individual jobs of the array.
CodePudding user response:
To make GRP an array, wrap the right-hand side of the assignment in parentheses:
GRP=(`awk '{print $1}' table.txt | uniq`)
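What you wrote creates GRP as an ordinary string variable, so ${GRP[0]} is the whole string (all group names together) and every other index expands to an empty string. Neither of those equals a single value in column 1, so the awk comparison matches nothing unless the table has only one group.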
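Putting the fix together, here is a minimal sketch of the lookup, assuming the group names and filenames contain no whitespace. The function name `files_for_task` and the temporary copy of the sample table are hypothetical and only there to make the example self-contained. It also collects the group names with the awk `!seen[$1]++` idiom rather than `uniq`, which keeps first-appearance order even when a group's rows are not adjacent.

```shell
#!/bin/bash
# Hypothetical helper: print the column-2 filenames that belong to the
# INDEX-th unique group (0-based) in TABLE.
files_for_task() {
    local table=$1 idx=$2
    local -a groups
    # The parentheses make this an array, one element per unique group name.
    groups=( $(awk '!seen[$1]++ {print $1}' "$table") )
    local grp=${groups[$idx]}
    awk -v x="$grp" '$1 == x {print $2}' "$table"
}

# Demo with the sample table from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'group1 a.txt' 'group1 b.txt' 'group2 c.txt' \
              'group2 d.txt' 'group3 e.txt' 'group3 f.txt' > "$tmp"

# Slurm sets SLURM_ARRAY_TASK_ID in each array task; fall back to 0 for a local run.
INPUT=$(files_for_task "$tmp" "${SLURM_ARRAY_TASK_ID:-0}")
echo "$INPUT"   # quoted, so each filename stays on its own line
rm -f "$tmp"
```

In the real job script you would keep the `#SBATCH` lines from the question and pass `table.txt` instead of the temporary file. Note that `--array=0-2` still has to match the number of unique groups.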