Using the regex (?<=_)(.*)(?=\.) on the test string 23353_test.txt returns nothing with grep and the -p option. It doesn’t show errors either. I expect it to return test. When the same regex is tried on regex101.com, it runs correctly.
CodePudding user response:
The following GNU grep command extracts the right substring:
grep -oP '(?<=_).*(?=\.)' file
Note that .* matches greedily. If you want to make sure you match the substring between the closest _ and . you need to use a negated character class:
grep -oP '(?<=_)[^._]*(?=\.)' file
where [^._]* matches zero or more chars other than . and _.
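To make the difference between the two patterns visible, here is a small sketch. The input a_b_c.tar.gz and the function names are made up for illustration (they are not from the question), and it assumes GNU grep built with PCRE support.

```shell
# Hypothetical input with more than one _ and more than one .
s='a_b_c.tar.gz'

# Greedy: starts right after the first _ and runs up to the last .
greedy() { printf '%s\n' "$1" | grep -oP '(?<=_).*(?=\.)'; }

# Negated character class: only the run between the closest _ and .
closest() { printf '%s\n' "$1" | grep -oP '(?<=_)[^._]*(?=\.)'; }

greedy "$s"    # prints: b_c.tar
closest "$s"   # prints: c
```

For the question's 23353_test.txt both patterns print test, since that string has a single _ and a single dot.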
If you cannot rely on your grep, you can use sed here:
sed -n 's/.*_\(.*\)\..*/\1/p' file
See the online demo:
#!/bin/bash
s='23353_test.txt'
grep -oP '(?<=_)(.*)(?=\.)' <<< "$s"
# => test
sed -n 's/.*_\(.*\)\..*/\1/p' <<< "$s"
# => test
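If a script has to run on machines where grep may or may not support -P, one option is to probe for it and fall back to sed. This is only a sketch: extract_name is a hypothetical helper name, and the probe simply checks whether a trivial PCRE pattern is accepted.

```shell
# extract_name is a hypothetical helper, not part of the answer above.
# It reads lines on stdin and prints the text between _ and the next dot.
extract_name() {
  # Probe whether this grep accepts -P (PCRE); error output is silenced.
  if printf 'a_b.c\n' | grep -qP '(?<=_)b' 2>/dev/null; then
    grep -oP '(?<=_)[^._]*(?=\.)'
  else
    # Fallback that needs only basic regular expressions.
    sed -n 's/.*_\([^._]*\)\..*/\1/p'
  fi
}

printf '%s\n' '23353_test.txt' | extract_name   # prints: test
```

The two branches agree on inputs like the one in the question. They can differ on lines that contain several _..._. pairs, because grep -o prints every match while the sed command prints at most one result per line.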
CodePudding user response:
1st solution: You can use awk for this requirement; please try the following with your shown samples. It sets the field separator to _ or . and prints the 2nd field only if the number of fields is 3.
s='23353_test.txt'
echo "$s" | awk -F'[_.]' 'NF==3{print $2}'
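The NF==3 condition acts as a guard: lines that do not split into exactly three fields produce no output. A quick sketch with two extra made-up inputs (a_b_c.txt and plain are not from the question, and second_field is a hypothetical wrapper name):

```shell
# second_field is a hypothetical wrapper around the awk command above.
second_field() { awk -F'[_.]' 'NF==3{print $2}'; }

# Only the first line splits into exactly 3 fields; the other two are skipped
# (a_b_c.txt has 4 fields, plain has 1).
printf '%s\n' '23353_test.txt' 'a_b_c.txt' 'plain' | second_field   # prints: test
```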
2nd solution: Using sed with its capturing group capability. The -E option enables ERE in sed. The main program uses the regex ^[^_]*_([^.]*)\..* which matches from the start of the line to the 1st occurrence of _, captures everything between that _ and the next . in the 1st and only capturing group, and then matches the literal . and the rest of the line. The substitution replaces the whole line with the value of the 1st capturing group.
s='23353_test.txt'
echo "$s" | sed -E 's/^[^_]*_([^.]*)\..*/\1/'
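One thing to be aware of: when the regex does not match, the substitution leaves the line unchanged and sed still prints it. If you would rather get no output for such lines, a possible variant adds -n and the p flag. The helper names and the README input below are made up for the comparison.

```shell
# Hypothetical helpers for comparison; both read lines on stdin.
plain_sub()    { sed -E 's/^[^_]*_([^.]*)\..*/\1/'; }
matched_only() { sed -nE 's/^[^_]*_([^.]*)\..*/\1/p'; }

# README contains no _ so the regex does not match it.
printf '%s\n' '23353_test.txt' 'README' | plain_sub      # prints: test, then README
printf '%s\n' '23353_test.txt' 'README' | matched_only   # prints: test
```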
3rd solution: Using GNU awk's match function. The regex inside match matches between the 1st occurrence of _ and the next . and keeps that part in a capturing group. The array named arr (the third argument to match, a GNU awk extension) stores the captured values, so arr[1] prints the value of the 1st capturing group.
echo "$s" | awk 'match($0,/^[^_]*_([^.]*)\..*$/,arr){print arr[1]}'
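The three-argument form of match is specific to GNU awk. If gawk is not available, a rough portable equivalent can be sketched with the two-argument match and the RSTART and RLENGTH variables that awk sets on a successful match. The function name between_awk is hypothetical.

```shell
between_awk() {
  # /_[^.]*\./ matches from the first _ through the next dot;
  # substr then drops the leading _ and the trailing dot.
  awk 'match($0, /_[^.]*\./) { print substr($0, RSTART + 1, RLENGTH - 2) }'
}

printf '%s\n' '23353_test.txt' | between_awk   # prints: test
```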
4th solution: Using GNU grep with its -o and -P options. The -o option prints only the matched part, and the -P flag enables PCRE regex. Here is an online demo for the following regex.
echo "$s" | grep -oP '^.*?_\K([^.]*)(?=\.\S+$)'