How can I replace the third repetition of kate with diane in each line, but only if william appears in the sentence?
kate must be a standalone word; for example, "kate's" is not a valid repetition.
For example:
- kate prince william's wife is the second kate after his mother kate
will be replaced with:
- kate prince william's wife is the second kate after his mother diana
but the following one will not:
- kate’s prince william's wife is the second kate after his mother kate
CodePudding user response:
So given this input:
kate prince william's wife is the second kate after his mother kate
kate's prince william's wife is the second kate after his mother kate
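This mask-and-restore approach works with GNU sed (\b is a GNU extension): hide every kate's behind a placeholder, replace the third remaining kate, then restore the placeholder.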
parse.sed
/william/ {
s/\bkate's\b/XXXXXX/g
s/\bkate\b/diana/3
s/XXXXXX/kate's/g
}
Run it like this:
sed -Ef parse.sed infile
Output:
kate prince william's wife is the second kate after his mother diana
kate's prince william's wife is the second kate after his mother kate
CodePudding user response:
Using sed
$ sed -E '/william/s/(^| )kate( |$)/\1diane\2/3' input_file
kate prince william's wife is the second kate after his mother diane
kate’s prince william's wife is the second kate after his mother kate
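One caveat with the pattern above: (^| )kate( |$) consumes the surrounding spaces, so two adjacent kate tokens share a space and the second one is not counted. A minimal sketch of a field-based alternative, assuming a valid kate is a whitespace-delimited token exactly equal to kate (the function name replace_third_kate is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: on lines containing "william", replace the third
# whitespace-delimited field that is exactly "kate" with "diane".
# Note: assigning to a field makes awk rebuild the line with single
# spaces, so runs of blanks on a changed line are collapsed.
replace_third_kate() {
  awk '
    /william/ {
      n = 0
      for (i = 1; i <= NF; i++)
        if ($i == "kate" && ++n == 3) { $i = "diane"; break }
    }
    { print }
  ' "$@"
}

# Demo: the third line has adjacent tokens, which the space-delimited
# sed pattern would leave unchanged.
printf '%s\n' \
  "kate prince william's wife is the second kate after his mother kate" \
  "kate's prince william's wife is the second kate after his mother kate" \
  "kate kate kate william" |
  replace_third_kate
```

Under this definition, tokens with attached punctuation (for example "kate,") do not count as a repetition, which may or may not be what you want.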
CodePudding user response:
The most difficult part of this problem is trying to match sentences instead of lines. If you allow a "sentence" to mean "a string delimited by '.'" (which is a terrible definition, and fails to capture many common instances), you might get away with:
perl -0056 -pe 's{\bkate\b}{ ++$c == 3 ? "Diana" : $&}ige; $c=0' input-file
The -0056 causes perl to treat . (056 is the octal representation of .) as the record separator, so each "sentence" is a single record. Again, this abuses the definition of "sentence" and misses common cases. The i flag causes the matches to be case insensitive. Normally, one would use \b to match word boundaries, but since you don't want kate's to count as a match, you will need to modify this slightly. You haven't precisely defined what you want to consider a valid match, so perhaps:
$ b="(?:(?<=\w)(?=\W|\z|')|(?:(?<=\W)|(?<=\A)|')(?=\w))"
$ perl -0056 -pe 's{'"$b"'kate'"$b"'}{ ++$c == 3 ? "Diana" : $&}ige; $c=0' input