Original line in file sed.txt:
outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string
I only need to replace PATTERN with pattern where it appears inside brackets. This is not just lowercasing; the replacement could be any other word.
Expected result:
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
I can use the pattern ([^)]*) to find the substrings in which some words should be replaced. But I can't use this pattern to limit the replacement to those substrings' positions; it replaces every PATTERN on the whole line with pattern.
:/tmp$ sed 's/([^)]*)/---/g' sed.txt
outer_string_PATTERN_string---PATTERN_outer_string---_outer_string
:/tmp$ sed '/([^)]*)/s/PATTERN/pattern/g' sed.txt
outer_string_pattern_string(pattern_And_pattern_pattern_i)pattern_outer_string(i_pattern_inner)_outer_string
I also tried to use regex groups in sed to capture and replace the words, but I can't figure out the command.
Can sed implement that? And how can it be achieved? Thanks.
CodePudding user response:
Can sed implement that?
Yes. But you do not want to do it in sed. Use another programming language, like Python, Perl, or awk.
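As an illustration of the "use another language" advice, here is a minimal Python sketch. It assumes the parentheses are balanced and not nested; the function name replace_in_parens and its parameters are made up for this example.

```python
import re

def replace_in_parens(line, old="PATTERN", new="pattern"):
    # Match each "(...)" group (no nesting assumed) and rewrite
    # occurrences of `old` only inside the matched text.
    return re.sub(r"\([^)]*\)", lambda m: m.group(0).replace(old, new), line)

line = ("outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)"
        "PATTERN_outer_string(i_PATTERN_inner)_outer_string")
print(replace_in_parens(line))
```

Because the replacement is computed by a callback, the new word does not have to be a lowercased copy of the old one.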
How to achieve that?
Implementing a non-greedy regex is not simple in sed. In general, it consists of:
- take a chunk of the input
- process the chunk
- put it in hold space
- shuffle hold space with pattern space to separate what has already been processed from what has not
- repeat
- shuffle with hold space
- output
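The steps above can be sketched in Python to make the control flow easier to follow. This is only an analogy, not a translation of any sed script: the variable done stands in for the hold space, rest for the pattern space, and the function name lowercase_groups is hypothetical. It lowercases the whole bracketed chunk and assumes non-nested parentheses.

```python
import re

# Leading text without "(", then one "(...)" group, then everything else.
GROUP = re.compile(r"([^(]*)(\([^)]*\))(.*)", re.S)

def lowercase_groups(line):
    done = ""    # text already processed (the "hold space" role)
    rest = line  # text still to be processed (the "pattern space" role)
    while True:
        m = GROUP.match(rest)
        if m is None:
            break                        # no further "(...)" chunk
        before, group, after = m.groups()
        done += before + group.lower()   # process the chunk and store it
        rest = after                     # repeat on the remainder
    return done + rest                   # append trailing text and output

print(lowercase_groups("a_PATTERN(PATTERN_i)PATTERN(i_PATTERN)_z"))
```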
Anyway, the following script:
#!/bin/bash
sed <<<'outer_string_PATTERN_string(PATTERN_i_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string' '
:loop;
/\([^(]*\)\(([^)]*)\)\(.*\)/{
# Lowercase the second part.
s//\1\L\2\E\n\3/;
# Mix with hold space.
G;
s/\(.*\)\n\(.*\)\n\(.*\)/\3\1\n\2/;
# Put processed stuff into hold space.
h; s/\n.*//; x;
# Process the other stuff again.
s/.*\n//;
bloop;
};
# Is hold space empty?
x; /^$/!{
# Pattern space has trailing stuff - add it.
G; s/\n//;
# We will print it.
h;
# Clear hold space
s/.*//
};x;
'
outputs:
PATTERN_outer_string(i_pattern_inner)outer_string_PATTERN_string(pattern_i_pattern_pattern_i)_outer_string
CodePudding user response:
As an alternative, it is easier to do this in GNU awk with an RS that matches a (...) substring:
awk -v RS='\\([^)]+\\)' '{gsub(/PATTERN/, "pattern", RT); ORS=RT} 1' file
outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
Steps:
- RS='\\([^)]+\\)' captures a (...) string as the record separator
- the gsub function then replaces PATTERN with pattern in the matched text, i.e. RT
- ORS=RT sets ORS to the newly modified RT
- 1 prints each record to stdout
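The same separator-based idea can be sketched in Python for comparison. Here re.split with a capturing group plays the role of RS and RT; the function name replace_via_split is hypothetical, and non-nested parentheses are assumed.

```python
import re

def replace_via_split(line, old="PATTERN", new="pattern"):
    # With a capturing group, re.split keeps the separators, so the list
    # alternates: outside text at even indexes, "(...)" groups at odd ones.
    parts = re.split(r"(\([^)]+\))", line)
    for i in range(1, len(parts), 2):
        parts[i] = parts[i].replace(old, new)  # edit only the separators
    return "".join(parts)

line = ("outer_string_PATTERN_string(PATTERN_i_PATTERN_PATTERN_i)"
        "PATTERN_outer_string(i_PATTERN_inner)_outer_string")
print(replace_via_split(line))
```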
Another alternative solution uses a lookahead assertion in a Perl regex:
perl -pe 's/PATTERN(?=[^()]*\))/pattern/g' file
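The lookahead carries over unchanged to other regex engines that support it. A minimal Python sketch follows (the function name replace_with_lookahead is made up here): PATTERN is replaced only when a ")" follows with no other parenthesis in between, which assumes balanced, non-nested parentheses.

```python
import re

def replace_with_lookahead(line):
    # (?=[^()]*\)) succeeds only if the next parenthesis after PATTERN
    # is a closing one, i.e. we are currently inside "(...)".
    return re.sub(r"PATTERN(?=[^()]*\))", "pattern", line)

line = ("outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)"
        "PATTERN_outer_string(i_PATTERN_inner)_outer_string")
print(replace_with_lookahead(line))
```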
CodePudding user response:
Solved by this:
:/tmp$ sed 's/(/\n(/g' sed.txt | sed 's/)/)\n/g' | sed '/([^)]*)/s/PATTERN/pattern/g' | sed ':a;N;$!ba;s/\n//g'
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
- put each () group on its own line
- find the () lines and replace PATTERN with pattern
- merge the multiple lines back into one line
Thanks to "How can I replace a newline (\n) using sed?"
CodePudding user response:
Can sed implement that?
It can be done using GNU sed and basic regular expressions (BRE):
sed '
s/)/)\n/g
:1
s/\(([^)]*\)PATTERN\([^)]*)\n\)/\1pattern\2/
t1
s/\n//g
' < file
where
- 1st s inserts a newline after each )
- 2nd s replaces the last (* is greedy) PATTERN inside ()s with pattern
- t loops back if a substitution was made
- 3rd s strips all inserted newlines
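The loop driven by t (substitute once, jump back while something changed) can be sketched in Python like this. It is a simplified analogy: it skips the newline markers, the name replace_by_looping is hypothetical, and balanced, non-nested parentheses are assumed.

```python
import re

# One PATTERN inside "(...)"; the greedy first group picks the last one.
INNER = re.compile(r"(\([^)]*)PATTERN([^)]*\))")

def replace_by_looping(line):
    while True:
        # Replace a single occurrence, then check whether anything changed,
        # much like "s/.../.../" followed by "t".
        line, count = INNER.subn(r"\1pattern\2", line, count=1)
        if count == 0:
            return line

line = ("outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)"
        "PATTERN_outer_string(i_PATTERN_inner)_outer_string")
print(replace_by_looping(line))
```

The loop ends because every pass removes one uppercase PATTERN from inside a group and the lowercase replacement can never match again.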
EDIT
The 2nd substitute command was edited according to OP's suggestion, since there is no need to match \n inside ().
CodePudding user response:
You can try this sed command:
sed -E 's/\(.?PATTERN.?[^)]*\)/\L&/g'
Here, we are looking to match the word PATTERN only if it resides within brackets. Note that \L& (a GNU sed extension) lowercases the entire matched text, not only the word PATTERN.
Output
outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
New Example Output
echo "outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string" | sed -E 's/\(.?PATTERN.?[^)]*\)/\L&/g'
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string