I have many strings like this
i=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt
I need to extract the M1 code and the FCT code, but I am unable to do so, probably because of my regular expressions. I can get FCT with echo ${i:30:3}, but for M1 nothing seems to work. My last try was grep -oP '.*\K(?<=.\/)\w+(?=\/Cus)' $i ;
The length of the string can vary (but the FCT part always starts with /F), and /M1/ is always in the same position.
Hope somebody can help. Thanks!
CodePudding user response:
You could try the following awk programs.
To get FCT-like strings: since the position of the string is not fixed and only the leading /F is fixed, I match from /F up to the next occurrence of /. This captures whatever follows /F and precedes the next /.
echo "$i" | awk 'match($0,/\/F[^/]*/){print substr($0,RSTART+1,RLENGTH-1)}'
To get M1, try the following awk program. Since the position of M1 is always fixed (as the OP states in the question), I use two sub() calls here: the first removes the leading ./ by replacing it with an empty string, and the second removes everything from the first remaining / to the end of the line. Printing the line then gives the M1 part.
echo "$i" | awk '{sub(/^\.\//,"");sub(/\/.*/,"")} 1'
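To make the effect of the two sub() calls easier to follow, here is a small sketch that runs them one at a time and keeps the intermediate result. The variable names step1 and step2 are only illustrative and are not part of the answer above.

```shell
# Sample path from the question.
i=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt

# First sub(): strip the leading "./".
step1=$(printf '%s\n' "$i" | awk '{sub(/^\.\//,"")} 1')

# Second sub(): strip everything from the first remaining "/" to the end of the line.
step2=$(printf '%s\n' "$step1" | awk '{sub(/\/.*/,"")} 1')

printf '%s\n%s\n' "$step1" "$step2"
```

The first line printed is the path without its leading ./ and the second line is just M1.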
CodePudding user response:
Bash allows you to split a string into an array.
# starting value
str=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt
# split string on / delimiter into the split array
IFS=/ read -ra split <<<"$str"
# get M1 and FCT elements at their respective indexes
M1=${split[1]}
FCT=${split[5]}
# dump M1 and FCT variables for demo purpose
declare -p M1 FCT
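Since the question mentions many such strings, the same split can be applied in a loop. This is a minimal sketch that assumes the paths are already in a bash array; the second path is made up for illustration, and the names paths, parts and results are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical list of paths; only the first one comes from the question.
paths=(
  ./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt
  ./M2/CustomersList/HTP/Denver/FXY/output_GetCaseList_abs.txt
)

results=()
for p in "${paths[@]}"; do
  # Split on "/": element 0 is ".", element 1 is the M code, element 5 is the F code.
  IFS=/ read -ra parts <<<"$p"
  results+=("${parts[1]} ${parts[5]}")
done

# One "M-code F-code" pair per line.
printf '%s\n' "${results[@]}"
```

Setting IFS only as a prefix to read keeps the change local to that command, so the rest of the script still splits on whitespace.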
CodePudding user response:
Another option with awk is split(), which splits the path components into an array. The command below fills the array a[] and prints its 2nd and 6th elements ("M1" and "FCT"):
awk '{split($1,a,"/"); print a[2]", "a[6]}'
Example Use/Output
$ i=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt; echo "$i" |
awk '{split($1,a,"/"); print a[2]", "a[6]}'
M1, FCT
CodePudding user response:
If the positions of the strings are always after the same number of forward slashes, you can print the 2nd and the 6th field, setting the field separator to /
echo "$i" | awk -F"/" '{print $2, $6}'
Output
M1 FCT
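If the two values are needed in shell variables rather than on standard output, one possible approach is to let read split the single line that awk prints. This is a sketch, assuming neither field contains whitespace; m1 and fct are illustrative variable names.

```shell
# Sample path from the question.
i=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt

# awk prints "field2 field6"; read assigns the two words to m1 and fct.
read -r m1 fct <<EOF
$(printf '%s\n' "$i" | awk -F/ '{print $2, $6}')
EOF

printf 'm1=%s fct=%s\n' "$m1" "$fct"
```

A here-document is used instead of a pipe into read so that the variables are set in the current shell rather than in a subshell.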
You might also use GNU awk (the three-argument form of match() is a gawk extension) and a pattern with 2 capture groups: the first group must be followed by /Cus, and the second group starts with F.
The negated character class [^\/]* matches 0 or more characters other than /.
echo "$i" | awk 'match($0, /[^\/]*\/([^\/]*)\/Cus.*\/(F[^\/]*)/, a) {print a[1], a[2]}'
CodePudding user response:
You have your awk answers, but I felt like contributing a bash idea just for fun.
[[ "$i" =~ ^\./([[:alnum:]]+)(/[[:alnum:]]+){3}/([[:alnum:]]+)/.* ]] \
&& echo "${BASH_REMATCH[1]} ${BASH_REMATCH[3]}"
The BASH_REMATCH array holds the capture groups of the match; index 0 is the entire matched portion of the string.
A slightly shorter version yielding the same output:
[[ "$i" =~ ^\./([[:alnum:]]+)(/[[:alnum:]]+){4}/.* ]] \
&& echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]:1}"
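For repeated use, the regex test could be wrapped in a function that reports failure when a path does not have the expected shape. This is only a sketch: extract_codes is a hypothetical helper name, and the pattern is the first one above with the trailing .* dropped, which does not change what is captured.

```shell
#!/usr/bin/env bash
# extract_codes is a hypothetical helper, not part of the original answer.
# Prints "<first component> <fifth component>" or returns 1 if the path does not match.
extract_codes() {
  # Keeping the pattern in a variable avoids quoting problems inside [[ ... ]].
  local re='^\./([[:alnum:]]+)(/[[:alnum:]]+){3}/([[:alnum:]]+)/'
  if [[ $1 =~ $re ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}"
  else
    return 1
  fi
}

i=./M1/CustomersList/HTP/Boston/FCT/output_GetCaseList_abs.txt
out=$(extract_codes "$i")
echo "$out"
```

Because the function returns a non-zero status on a non-matching path, callers can write extract_codes "$p" || echo "skipped: $p" to flag unexpected input.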