With a bash script, I extracted a .conllu file into a three-column .txt file containing the lemma, POS and meaning, so it is a kind of dictionary. Now I am trying to make it prettier by putting the second column (POS) in brackets.
The file currently looks like this:
ami NOUN mother
amo VERB sleep
asima NOUN younger_sister
ati NOUN older_sister
The columns are separated by tabs.
I want it to look like this:
ami (NOUN) mother
amo (VERB) sleep
asima (NOUN) younger_sister
ati (NOUN) older_sister
and ideally:
ami (NOUN) - mother
amo (VERB) - sleep
asima (NOUN) - younger_sister
ati (NOUN) - older_sister
I tried sed with a regular expression:
sed -e 's/[a-zA-Z] /(/g' -e 's [a-zA-Z] =[a-zA-Z] /)/g' dictjaa.txt > test.txt
but unfortunately it did not work.
CodePudding user response:
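Using sed, matching the first two whitespace-separated columns (this works for tabs as well as spaces):
sed -E 's/^([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+/\1 (\2) - /' input_file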
ami (NOUN) - mother
amo (VERB) - sleep
asima (NOUN) - younger_sister
ati (NOUN) - older_sister
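As an alternative, the same reformatting can be sketched with awk, which splits on the tab separator directly instead of matching it with a regex. This is only a minimal sketch: it assumes every line has exactly three tab-separated fields, and the variable names `sample` and `pretty` are made up for the illustration.

```shell
# Hypothetical two-line sample standing in for the real tab-separated file.
sample=$(printf 'ami\tNOUN\tmother\nasima\tNOUN\tyounger_sister\n')

# -F'\t' makes awk split each line on tabs; printf then rebuilds the line
# as "lemma (POS) - meaning".
pretty=$(printf '%s\n' "$sample" | awk -F'\t' '{ printf "%s (%s) - %s\n", $1, $2, $3 }')

printf '%s\n' "$pretty"
```

On the real file this would be run as `awk -F'\t' '{ printf "%s (%s) - %s\n", $1, $2, $3 }' dictjaa.txt > test.txt`. Because awk addresses the fields by number, a meaning that contains spaces would still come through intact as long as the separators are tabs.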