How to replace third repetition of kate with diane in each line

Time:08-18

How do I replace the third repetition of kate with diane in each line, but only if william appears in the sentence?

kate must appear as a standalone word - for example, "kate's" is not a valid repetition.

For example:

  • kate prince william's wife is the second kate after his mother kate

will be replaced with:

  • kate prince william's wife is the second kate after his mother diana

but the following one will not:

  • kate’s prince william's wife is the second kate after his mother kate

CodePudding user response:

So given this input:

kate prince william's wife is the second kate after his mother kate
kate's prince william's wife is the second kate after his mother kate

This mask, replace and restore solution works:

parse.sed

/william/ {
  s/\bkate's\b/XXXXXX/g
  s/\bkate\b/diana/3
  s/XXXXXX/kate's/g
}

Run it like this:

sed -Ef parse.sed infile

Output:

kate prince william's wife is the second kate after his mother diana
kate's prince william's wife is the second kate after his mother kate

CodePudding user response:

Using sed

$ sed -E '/william/s/(^| )kate( |$)/\1diane\2/3' input_file
kate prince william's wife is the second kate after his mother diane
kate’s prince william's wife is the second kate after his mother kate
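
One caveat worth checking: each match of this pattern consumes the spaces around it, so when two occurrences of kate sit next to each other they share a space and the second one is skipped. The snippet below is only a sketch on a made-up test line (not from the question), and it assumes GNU sed for \b:

```shell
# Hypothetical input: three adjacent occurrences of kate after "william".
line='william kate kate kate'

# Space-delimited pattern: the middle kate is never matched, because the
# space before it was already consumed by the first match.
spaced=$(printf '%s\n' "$line" | sed -E '/william/s/(^| )kate( |$)/\1diane\2/3')

# Word-boundary pattern (\b is a GNU sed extension): all three are counted.
bounded=$(printf '%s\n' "$line" | sed -E '/william/s/\bkate\b/diane/3')

printf '%s\n%s\n' "$spaced" "$bounded"
```

The first line printed is unchanged because only two matches are found. Note that \b on its own would also count kate's, so it would still need something like the masking step from the other answer.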

CodePudding user response:

The most difficult part of this problem is trying to match sentences instead of lines. If you allow a "sentence" to mean "a string delimited by '.'" (which is a terrible definition, and fails to capture many common instances), you might get away with:

perl -0056 -pe 's{\bkate\b}{ ++$c == 3 ? "Diana" : $& }ige; $c=0' input-file

The -0056 causes perl to treat . (056 is the octal representation of .) as the record separator, so each "sentence" is a single record. Again, this abuses the definition of "sentence" and misses common cases. The i flag causes the matches to be case insensitive. Normally, one would use \b to match word boundaries, but since you don't want kate's to count as a match, you will need to modify this slightly. You haven't precisely defined what you want to consider a valid match, so perhaps:

$ b='(?:(?<![\w\x27])(?=\w)|(?<=\w)(?![\w\x27]))'
$ perl -0056 -pe 's{'"$b"'kate'"$b"'}{ ++$c == 3 ? "Diana" : $& }ige; $c=0' input