How to create a Slurm array with each unique value of column 1 and use the corresponding values in

Time:01-22

From a table, I am trying to pass the filenames (column 2) belonging to each unique group (column 1) as input to a Slurm job array, with one array task per unique value in column 1. I am doing something like this:

Sample table:

$ cat table.txt
group1  a.txt
group1  b.txt
group2  c.txt
group2  d.txt
group3  e.txt
group3  f.txt

================================

#!/bin/bash
#SBATCH --array=0-2
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1-

GRP=`awk '{print $1}' table.txt | uniq`
echo $GRP

XYZ=${GRP[$SLURM_ARRAY_TASK_ID]}
echo $XYZ

INPUT=`awk -v x="$XYZ" '$1 == x {print $2}' table.txt`
echo $INPUT

=================================

Desired output here is a list of two files in each job of the array. Example:

a.txt
b.txt

Problem: using the variable $XYZ as the match value does not return anything for $INPUT, while $GRP and $XYZ work. If I use $GRP instead of $XYZ as the variable for $INPUT, and the table has only one unique value in column 1, as follows, it works.

$ cat table.txt
group1  a.txt
group1  b.txt

The output here is a.txt b.txt

I would appreciate it if anyone can help me get similar output for each group in the individual jobs of the array.

CodePudding user response:

To make GRP an array, you should wrap the right-hand side of the assignment in parentheses, so:

GRP=(`awk '{print $1}' table.txt | uniq`)

What you wrote created GRP as a plain string variable, so ${GRP[0]} was the whole string (all group names at once) and every other index expanded to an empty value. That is why awk found no matching rows for $XYZ.
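Putting the fix together, a minimal sketch of the whole job script might look like the following. The function name files_for_task is hypothetical and not from the original post. The sketch assumes table.txt sits in the directory the job is submitted from and that group names contain no whitespace or glob characters. It also swaps uniq for an awk first-occurrence filter, because uniq only collapses adjacent duplicates; this keeps the index-to-group mapping stable even if the table is not sorted by group.

```shell
#!/bin/bash
# SBATCH directives copied unchanged from the question.
#SBATCH --array=0-2
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1-

# files_for_task TABLE TASK_ID  (hypothetical helper, for illustration only)
# Prints the column-2 filenames of the TASK_ID-th unique group in TABLE.
files_for_task() {
    local table=$1 task_id=$2
    local -a groups

    # The parentheses make this an array: one element per unique group,
    # in order of first appearance (awk filter used instead of uniq).
    groups=( $(awk '!seen[$1]++ {print $1}' "$table") )

    # Guard against an --array range larger than the number of groups.
    if [ "$task_id" -ge "${#groups[@]}" ]; then
        echo "no group for task id $task_id" >&2
        return 1
    fi

    # Exact match on column 1, print the matching filenames from column 2.
    awk -v x="${groups[$task_id]}" '$1 == x {print $2}' "$table"
}

# Only run the lookup inside a Slurm array task, where the ID is set.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    INPUT=$(files_for_task table.txt "$SLURM_ARRAY_TASK_ID")
    echo "$INPUT"
fi
```

With the sample table from the question, task 0 would print a.txt and b.txt, task 1 c.txt and d.txt, and task 2 e.txt and f.txt. The --array range has to match the number of unique groups in column 1.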
