Remove specific character that repeats more than once, and leave only one

Time:03-16

I've spent the last 20 minutes looking at answers on how to remove duplicate words and trying to adapt them to my use case, without success.

I have a small command that deletes special characters and turns spaces into dashes:

echo "[[[h[el]lo - w{o}rld%^& -text" | tr -d '?$#@;:/\="<>%{}|^~[]&`' | tr ' ' '-'

and it produces the following output:

hello---world--text

That works, but I also want to add something to the command, maybe another pipe stage, that collapses repeated dashes into a single dash.

For example, I want the output above to become:

hello-world-text

How can I do this in the most POSIX-compliant way possible?

PS: Please tell me if there's a more efficient way to do what the command above already does.

CodePudding user response:

You can use the -s flag for that:

echo "[[[h[el]lo - w{o}rld%^& -text" | tr -d '?$#@;:/\="<>%{}|^~[]&`' | tr -s ' ' '-'

-s, --squeeze-repeats
              replace each sequence of a repeated character that is listed in the last
              specified SET, with a single occurrence of that character
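If you'd rather leave the original pipeline untouched and just append one more stage, tr -s with a single set only squeezes and translates nothing. A minimal sketch, assuming a POSIX sh; squeeze_dashes is a made-up helper name, not something from the answer above:

```shell
# squeeze_dashes is a hypothetical helper name used only for this sketch.
squeeze_dashes() {
    # With -s and a single SET, tr only squeezes: every run of '-' read
    # from stdin collapses to one '-'; all other characters pass through.
    tr -s '-'
}

# Feed it the output the original command produced.
printf '%s\n' 'hello---world--text' | squeeze_dashes
# prints: hello-world-text
```

The same stage can be appended to the command from the question as `| tr -s '-'`.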

CodePudding user response:

For this particular input, we can have tr delete everything except - and letters/digits ([:alnum:]):

$ echo "[[[h[el]lo - w{o}rld%^& -text" | tr -dc -- '-[:alnum:]'
hello-world-text

### or

$ tr -dc -- '-[:alnum:]' <<< "[[[h[el]lo - w{o}rld%^& -text"
hello-world-text

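The key is the -c flag, which takes the complement (i.e., everything except) of the given set, so -d deletes every character that is not in it. Note that the <<< here-string in the second form is not available in every sh (dash, for example, does not support it); the echo pipeline is the portable variant.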
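One thing to keep in mind with this approach: spaces are deleted, not converted, and the trailing newline goes too, so it only gives the wanted result when a dash already sits between the words. A small sketch to show both cases, assuming a POSIX sh; keep_slug_chars is a hypothetical name picked for the example:

```shell
# keep_slug_chars is a hypothetical helper name used only for this sketch.
keep_slug_chars() {
    # -c complements the set and -d deletes, so everything that is not '-'
    # or alphanumeric is removed, spaces included.
    printf '%s' "$1" | tr -dc -- '-[:alnum:]'
    # tr would have deleted a newline as well, so print one afterwards.
    printf '\n'
}

keep_slug_chars "[[[h[el]lo - w{o}rld%^& -text"   # prints: hello-world-text
keep_slug_chars "hello world"                     # prints: helloworld
```

If the input can contain words separated only by spaces, the tr -s variant is probably the safer choice.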

CodePudding user response:

You can also use sed:

$ echo "[[[h[el]lo - w{o}rld%^& -text" | sed -E 's/([^-[:alnum:]]*)//g'
hello-world-text
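Like the tr -dc answer, the sed command above drops spaces and relies on the dashes already present in the input. If you want sed to do the whole job (strip, convert spaces, squeeze), something along these lines should work with plain basic regular expressions and no -E. This is only a sketch, and slugify_sed is a made-up name:

```shell
# slugify_sed is a hypothetical helper name used only for this sketch.
slugify_sed() {
    # 1. delete everything that is not '-', alphanumeric, or a space
    # 2. turn each remaining space into a dash
    # 3. collapse every run of dashes into a single dash
    printf '%s\n' "$1" | sed \
        -e 's/[^-[:alnum:] ]//g' \
        -e 's/ /-/g' \
        -e 's/--*/-/g'
}

slugify_sed "[[[h[el]lo - w{o}rld%^& -text"   # prints: hello-world-text
slugify_sed "hello   world"                   # prints: hello-world
```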