Error finding matches and creating new lines in a text file using sed

Time:12-29

I have this text in a file

random text 1.- random text 2. random text 3.- random 22 text 4. random text

I want to create a new line before each number followed by a dot. This is my code:

for number in {1..4}
do
var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)
echo $var
sed -i "s/$var/\n$var/g" file 
done

This is the result I get:

random text
 1.- random text
 2. random text
 3.- random
 2. text
 4. random text

I do not understand why it creates a new line before the number 22 if there is no dot. The expected result would be this:

random text
1.- random text
2. random text
3.- random 22 text
4. random text

Could someone help me and explain where my mistake is? Thank you very much

CodePudding user response:

When number=2 you get var=' 2. ' (the last character is whatever matched the final [^0-9], here a space).

This gets fed into the last sed command as s/ 2. /\n 2. /g. In the pattern, the 2 is a literal 2 but the unescaped . matches any single character, which is why the pattern also matches ' 22 '. To confuse things a bit more, the . in the replacement \n 2. is a literal dot, so the 22 is replaced with a literal "2.".

Consider:

$ echo '22' | sed 's/2./2./'
2.
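
As a small sketch of the same effect, the two commands below differ only in whether the dot is escaped. The sample string and the variable names unescaped and escaped are made up for illustration:

```shell
# Unescaped: "." matches any single character, so "22" matches the pattern "2."
unescaped=$(echo 'random 22 text' | sed 's/2./2./')

# Escaped: "\." matches only a literal dot, so "22" is left untouched
escaped=$(echo 'random 22 text' | sed 's/2\./2./')

echo "$unescaped"   # random 2. text
echo "$escaped"     # random 22 text
```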

One quick-fix for the current code is to use parameter substitution to add a backslash to escape the . in $var:

for number in {1..4}
do
    var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)
    var="${var/./\\.}"      # replace "." with "\."
    echo "$var"             # quoted so any leading/trailing space in $var stays visible
    sed -i "s/$var/\n$var/g" file
done

$ cat file
random text
 1.- random text
 2. random text
 3.- random 22 text
 4. random text

The leading space on each line is matched by the first [^0-9] in the first sed command. Since that [^0-9] is inside the parens, it is part of the capture group and so gets included in the \1 reference. Try moving the left paren to the right by one character, e.g.:

for number in {1..4}
do
    # replace this:
    #var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)

    # with this:
    var=$(sed -n "s/.*[^0-9]\($number\.[^0-9]\).*/\1/p" file)

    var="${var/./\\.}"
    echo "$var"
    sed -i "s/[[:space:]]*$var/\n$var/g" file
done

NOTE: I've added the [[:space:]]* to match on any white space before $var; the replacement (\n$var) will effectively remove said white space from the end of what will now be the line-before $var.

The results:

$ cat file
random text
1.- random text
2. random text
3.- random 22 text
4. random text
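
To see what moving the paren changes, here is a minimal sketch that runs both capture patterns against a shortened sample line. The names line, with_space and without_space are only for this illustration, and the number 2 is hard-coded in place of $number:

```shell
line='random text 2. random text'

# Original pattern: the leading [^0-9] sits inside the capture group
with_space=$(printf '%s\n' "$line" | sed -n 's/.*\([^0-9]2\.[^0-9]\).*/\1/p')

# Adjusted pattern: the leading [^0-9] is matched but not captured
without_space=$(printf '%s\n' "$line" | sed -n 's/.*[^0-9]\(2\.[^0-9]\).*/\1/p')

# Brackets make the captured whitespace visible
printf '[%s]\n' "$with_space" "$without_space"
```

The first capture keeps the space in front of the 2, the second does not; both still include the character after the dot.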

CodePudding user response:

I'd write it using a single sed command:

sed 's/ \([0-9][0-9]*\.\)/\
\1/g' file
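
A quick way to try this command without touching the file is to pipe the sample line from the question through it. The variables input and result are just for this sketch:

```shell
input='random text 1.- random text 2. random text 3.- random 22 text 4. random text'

# Replace "space + digits + dot" with "newline + digits + dot".
# The backslash followed by a real line break is the portable way
# to put a newline into a sed replacement.
result=$(printf '%s\n' "$input" | sed 's/ \([0-9][0-9]*\.\)/\
\1/g')

printf '%s\n' "$result"
```

Because the pattern requires a dot right after the digits, the 22 is not matched, and the space before each number is consumed by the match so no stray whitespace is left behind.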

CodePudding user response:

Using sed in a single pass (GNU sed, which understands \n in the replacement):

$ sed -i.bak 's/[0-9]*\.[^0-9]*/\n&/g' file
$ cat file
random text
1.- random text
2. random text
3.- random 22 text
4. random text