Given an array of strings, I want to use IFS to split each string on underscores, join the first three fields back together with the same delimiter, and append the result to an initially empty array. For instance, if I have the array input=("L2Control_S3_L001_R1_001.fastq", "L2Control_S3_L001_R2_001.fastq"),
IFS would take the first string and split it on the underscore, so the output would be:
L2Control
S3
L001
R1
001.fastq
Afterward, it would take the first three separated values and join them together with an underscore: "L2Control_S3_L001". Lastly, this value would be appended onto a new array: output=("L2Control_S3_L001").
This process would continue until all values in the array have been processed. I have tried the implementation below, but it seems to run infinitely.
#!/bin/bash
str=("L2Control_S3_L001_R1_001.fastq", "L2Control_S3_L001_R1_001.fastq")
IFS='_'
final=()
for (( c = 0; c = 2; c++ )); do
read -ra SEPA <<< "${str[$c]}"
final+=("${SEPA[0]}_${SEPA[1]}_${SEPA[2]}")
done
Can someone help me with this, please?
CodePudding user response:
The "${array[*]}" expansion, when double-quoted, joins the elements using the first character of IFS. You can combine this with the ${var:offset:length} expansion, which for an array selects a range of elements.
# assumes IFS=_ is already set, as in the question
output=()
for str in "${input[@]}"; do
read -ra fields <<< "$str"
output+=("${fields[*]:0:3}")
done
I find declare -p varname ... handy for inspecting the contents of variables.
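As a self-contained sketch of the loop above, the version below wraps it in a function so that the IFS change stays local to the function. The function name first_three_fields is hypothetical, and the input array is the one from the question with the commas removed.

```shell
#!/bin/bash
# Hypothetical helper: stores the first three "_"-separated fields of each
# argument, rejoined with "_", in the global array "output".
first_three_fields() {
    local IFS=_          # local, so the caller's IFS is left untouched
    local str fields
    output=()
    for str in "$@"; do
        read -ra fields <<< "$str"        # split on "_"
        output+=("${fields[*]:0:3}")      # join fields 0..2 with "_"
    done
}

input=("L2Control_S3_L001_R1_001.fastq" "L2Control_S3_L001_R2_001.fastq")
first_three_fields "${input[@]}"
declare -p output
```

Keeping IFS local avoids surprising word splitting later in the script, which is a risk when IFS='_' is set globally as in the question.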
This can also be done with bash parameter expansion:
str="L2Control_S3_L001_R1_001.fastq"
IFS=_
suffix=${str#*"$IFS"*"$IFS"*"$IFS"}
first3=${str%"$IFS$suffix"}
declare -p str suffix first3
declare -- str="L2Control_S3_L001_R1_001.fastq"
declare -- suffix="R1_001.fastq"
declare -- first3="L2Control_S3_L001"
Can also do that in one line, but it's hairy:
first3="${str%"$IFS${str#*"$IFS"*"$IFS"*"$IFS"}"}"
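To apply the same idea to every element of the array from the question, a loop along these lines should work. This is a sketch that uses a literal _ instead of "$IFS", so IFS does not need to be changed at all; the variable names are only illustrative.

```shell
#!/bin/bash
input=("L2Control_S3_L001_R1_001.fastq" "L2Control_S3_L001_R2_001.fastq")
output=()
for str in "${input[@]}"; do
    # Remove the shortest leading match of "*_*_*_", leaving field 4 onward.
    suffix=${str#*_*_*_}
    # Remove "_<suffix>" from the end, leaving the first three fields.
    output+=("${str%"_$suffix"}")
done
declare -p output
```

A string with fewer than three underscores matches neither pattern and is appended unchanged.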
CodePudding user response:
Setup (notice no comma needed to separate array entries):
str=("L2Control_S3_L001_R1_001.fastq" "L2Control_S3_L001_R1_001.fastq")
One idea is to use a while/read loop to split each input string into 4 parts based on a delimiter (IFS=_):
final=()
while IFS=_ read -r f1 f2 f3 ignore
do
final+=("${f1}_${f2}_${f3}")
done < <(printf "%s\n" "${str[@]}")
typeset -p final
The variable ignore is assigned the rest of the line (fields #4-#n, still joined by underscores).
This generates:
declare -a final=([0]="L2Control_S3_L001" [1]="L2Control_S3_L001")
CodePudding user response:
That's a lengthy description of taking the first 3 fields separated by _.
printf "%s\n" "${str[@]}" | cut -d_ -f-3
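If the results are needed in an array rather than on standard output, the pipeline can be read back with mapfile. This is a sketch that assumes bash 4 or later (for mapfile) and that no array element contains a newline.

```shell
#!/bin/bash
str=("L2Control_S3_L001_R1_001.fastq" "L2Control_S3_L001_R1_001.fastq")
# cut keeps fields 1-3 of each line; mapfile -t stores one line per element.
mapfile -t final < <(printf '%s\n' "${str[@]}" | cut -d_ -f-3)
declare -p final
```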