I have the following sed command:
echo "abcd_2222222233333333_jdkj" | sed -e 's/^\(.*\)_\(.*\)_\(.*\)$/\2_\1_\3/'
that returns
2222222233333333_abcd_jdkj
That's great, but I really want
22222222-33333333_abcd_jdkj
Is this possible with an easy tweak, or do I need some non-sed solution? Basically, I know the number is 16 digits long, but I need to break it into two 8-digit numbers.
CodePudding user response:
Instead of .* to match any number of characters, you can use .{8} to match exactly eight characters.
The below also uses sed -r to allow ERE syntax, which requires fewer backslashes and is generally easier to read than the default BRE. (On systems with BSD-style tools, this might be sed -E instead.)
sed -re 's/^(.*)_(.{8})(.*)_(.*)$/\2-\3_\1_\4/' <<<"abcd_2222222233333333_jdkj"
By the way, I would strongly suggest using [^_]* instead of .* so your regex can't match underscores where you don't want it to. (. means "any character"; [^_] means "any character except _".) That's not just a correctness enhancement; it can also make your regex faster to evaluate by avoiding backtracking (where the regex engine realizes it's matched too much content and needs to undo some of its prior matches).
Also consider bash's built-in regex support:
string='abcd_2222222233333333_jdkj'
re='([^_]+)_([[:digit:]]{8})([[:digit:]]+)_(.*)'
if [[ $string =~ $re ]]; then
  result=${BASH_REMATCH[2]}-${BASH_REMATCH[3]}_${BASH_REMATCH[1]}_${BASH_REMATCH[4]}
  echo "Result is: $result"
else
  echo "No match found"
fi
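If you need this in more than one place, the same bash regex approach can be wrapped in a function. This is only a sketch: swap_fields is a made-up name, and it assumes the middle field is always exactly 16 digits, so the pattern is anchored and both halves are fixed at eight digits. It needs bash (not plain sh), since [[ =~ ]] and BASH_REMATCH are bash features.

```shell
#!/usr/bin/env bash
# swap_fields is a hypothetical helper name, not part of the original answer.
swap_fields() {
  local string=$1
  # Anchored pattern: first field, two 8-digit halves, trailing field.
  local re='^([^_]+)_([[:digit:]]{8})([[:digit:]]{8})_(.*)$'
  if [[ $string =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}-${BASH_REMATCH[3]}_${BASH_REMATCH[1]}_${BASH_REMATCH[4]}"
  else
    return 1  # no match: print nothing and report failure to the caller
  fi
}

swap_fields 'abcd_2222222233333333_jdkj'                 # prints 22222222-33333333_abcd_jdkj
swap_fields 'abcd_1234_jdkj' || echo "No match found"    # middle field too short
```

Returning a non-zero status on a failed match lets the caller decide what to do, instead of hard-coding the error message inside the function.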
CodePudding user response:
The solution based on the tip in the answer above works:
echo "abcd_2222222233333333_jdkj" | sed -e 's/^\(.*\)_\(.\{8\}\)\(.\{8\}\)_\(.*\)$/\2-\3_\1_\4/'
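For what it's worth, the [^_]* suggestion from the first answer also fits this BRE form. Below is a rough sketch of that, assuming the middle field is always 16 digits; reorder is just a name picked for illustration. It sticks to plain BRE syntax, so it should not need -r or -E.

```shell
# reorder is a hypothetical helper name. It applies the same BRE substitution
# as the command above, but with [^_]* for the fields and [0-9] for the digits.
reorder() {
  printf '%s\n' "$1" |
    sed -e 's/^\([^_]*\)_\([0-9]\{8\}\)\([0-9]\{8\}\)_\([^_]*\)$/\2-\3_\1_\4/'
}

reorder 'abcd_2222222233333333_jdkj'   # prints 22222222-33333333_abcd_jdkj
reorder 'abcd_1234_jdkj'               # no match, so sed prints the line unchanged
```

Because the pattern is stricter, input that doesn't have the expected shape passes through untouched rather than being rearranged in a surprising way.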