How to create a Slurm array from the unique values of column 1 and use the corresponding values

I have a table and want to run a Slurm job array with one task per unique group (column 1), passing each task the filenames (column 2) that belong to its group. I am doing something like this:

Sample table:

$ cat table.txt
group1  a.txt
group1  b.txt
group2  c.txt
group2  d.txt
group3  e.txt
group3  f.txt

================================

#!/bin/bash
#SBATCH --array=0-2
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1-

GRP=`awk '{print $1}' table.txt | uniq`
echo $GRP

XYZ=${GRP[$SLURM_ARRAY_TASK_ID]}
echo $XYZ

INPUT=`awk -v x="$XYZ" '$1 == x {print $2}' table.txt`
echo $INPUT

=================================

The desired output is the list of the two files for that job's group, in each job of the array. Example:

a.txt
b.txt

Problem: using the variable $XYZ as the pattern returns nothing for $INPUT, although $GRP and $XYZ themselves work. If I use $GRP instead of $XYZ in the $INPUT command and the table has only one unique value in column 1, as follows, it works.

$ cat table.txt
group1  a.txt
group1  b.txt

The output here is a.txt b.txt

I would appreciate it if anyone could help me get similar output for each group in the individual jobs of the array.

CodePudding user response:

To make GRP an array, wrap the right-hand side of the assignment in parentheses:

GRP=(`awk '{print $1}' table.txt | uniq`)

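What you wrote creates GRP as an ordinary string variable, so ${GRP[0]} expands to the whole string and every other index expands to nothing. That is why the single-group table appeared to work: with only one group, the whole string happens to be exactly that group name, whereas with several groups it matches no value in column 1.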
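
For reference, here is a minimal sketch of the whole job script with that fix applied. It is only one way to write it and makes a few assumptions that are not in the question: the helper function files_for_task is a hypothetical name, the sample table is recreated in a temporary file so the sketch runs on its own, sort -u is swapped in for uniq so that a table whose groups are not adjacent still works, and the task ID falls back to 0 when the script is run outside Slurm.

```bash
#!/bin/bash
#SBATCH --array=0-2
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1-

# Hypothetical helper (not from the question): print the column-2 files
# of the group selected by a zero-based task index.
files_for_task() {
    local table=$1 task_id=$2
    local -a groups
    # The parentheses make this an array with one element per unique group.
    # Assumption: sort -u replaces uniq so unsorted tables also work.
    groups=( $(awk '{print $1}' "$table" | sort -u) )
    local group=${groups[$task_id]}
    # Print the second column of every row whose first column is that group.
    awk -v x="$group" '$1 == x {print $2}' "$table"
}

# Recreate the sample table so the sketch is self-contained;
# in a real job, point TABLE at the existing table.txt instead.
TABLE=$(mktemp)
cat > "$TABLE" <<'EOF'
group1  a.txt
group1  b.txt
group2  c.txt
group2  d.txt
group3  e.txt
group3  f.txt
EOF

# Outside Slurm, SLURM_ARRAY_TASK_ID is unset, so fall back to 0 for a dry run.
INPUT=$(files_for_task "$TABLE" "${SLURM_ARRAY_TASK_ID:-0}")
echo "$INPUT"   # for task 0 this prints a.txt and b.txt, one per line

rm -f "$TABLE"
```

Quoting "$INPUT" in the echo keeps the filenames on separate lines; unquoted, as in the question, they are joined onto one line. The --array=0-2 range still has to match the number of unique groups by hand.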