Splitting text into columns

Time:09-21

I have a case where I need to split text like this sample:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0
AE|B1|CC|DE| |EX|FF|0
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G|

I need the text to look like this:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4

I already tried:

awk 'BEGIN{FS=OFS="|"} {split($5,a,/;/); for (i in a) {if (a[i]) $9=a[i]; else next; gsub(/#/,"|",$9); print}}' file

However, if $5 contains only a space, it won't add the three columns.

CodePudding user response:

1st solution: With your shown samples, please try the following awk code.

awk '
match($0,/(([0-9]+#)+);[^|]*/){
  num=split(substr($0,RSTART,RLENGTH),arr,";")
  for(i=1;i<num;i++){
    sub(/#$/,"",arr[i])
    gsub(/#/,"|",arr[i])
    print $0"|"arr[i]
  }
  next
}
{
  print $0 "|||"
}
'   Input_file
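
To see what the match() call in the 1st solution captures before the split, here is a minimal sketch. The helper name show_match is hypothetical and used only for this illustration; it prints the substring located by RSTART and RLENGTH for one of the sample lines.

```shell
# Print the part of each line that the 1st solution's regex matches.
# show_match is a hypothetical helper name, not part of the answer above.
show_match() {
  awk 'match($0, /(([0-9]+#)+);[^|]*/) { print substr($0, RSTART, RLENGTH) }'
}

# A sample line from the question; the whitespace-only row prints nothing.
printf '%s\n' 'A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0' | show_match
# prints: 3#1#1#;1#4#4;5#1#4;
```

Splitting that substring on ";" leaves one empty trailing element, which is why the loop in the solution stops at i<num.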


2nd solution: Using a function in awk. With your shown samples, please try the following awk code. We pass the function the numbers of the fields we want to inspect and take values from; in this case I am passing the 2nd, 3rd and 4th field numbers. If you have too many fields, I suggest using my 1st solution shown above instead.

awk -F' |;' '
function getValues(fields){
  num=split(fields,arr,",")
  for(i=1;i<=num;i++){
    if($arr[i]~/^([0-9]+#)+[0-9]*$/){
      val=$arr[i]
      sub(/#$/,"",val)
      gsub(/#/,"|",val)
      print $0"|"val
    }
  }
}
/([0-9]+#)+;/{
  getValues("2,3,4")
  next
}
{
  print $0 "|||"
}
'   Input_file
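
To show why the 2nd solution passes field numbers 2, 3 and 4, here is a small sketch of how -F' |;' splits one sample line. The helper name show_fields is hypothetical and only for illustration.

```shell
# List the fields awk sees when the separator is a space or a semicolon.
# show_fields is a hypothetical helper name, not part of the answer above.
show_fields() {
  awk -F' |;' '{ for (i = 1; i <= NF; i++) print "field " i ": " $i }'
}

printf '%s\n' 'A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0' | show_fields
# field 1: A|B|C|D|
# field 2: 3#1#1#
# field 3: 1#4#4
# field 4: 5#1#4
# field 5: |E|F|0
```

The "#" groups land in fields 2 to 4 here, so a line with more than three groups would need more field numbers passed to getValues.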

Both solutions produce the following output:

A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4

CodePudding user response:

Using any awk:

$ cat tst.awk
{
    tgt = ( /;/ ? $2 : "###;")
    gsub(/#/,"|",tgt)
    n = split(tgt,a,/\|?;/)
    for ( i=1; i<n; i++ ) {
        print $0 "|" a[i]
    }
}

$ awk -f tst.awk file
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|3|1|1
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|1|4|4
A|B|C|D| 3#1#1#;1#4#4;5#1#4;|E|F|0|5|1|4
AE|B1|CC|DE| |EX|FF|0|||
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||5|6|3
AR|BE|CA|D1| 5#6#3#;4#3#4;|ED|G||4|3|4
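
To illustrate why the "###;" fallback in tst.awk yields three empty columns, here is a minimal sketch of the gsub() and split() steps on their own. The helper name show_pieces is hypothetical and only for illustration; it takes the target string as its input line.

```shell
# Show the pieces split() produces once every '#' has become '|'.
# show_pieces is a hypothetical helper name, not part of the answer above.
show_pieces() {
  awk '{
    tgt = $0
    gsub(/#/, "|", tgt)
    n = split(tgt, a, /\|?;/)                    # separator: optional "|" then ";"
    for (i = 1; i < n; i++) print "[" a[i] "]"   # last piece is skipped, as in tst.awk
  }'
}

printf '%s\n' '###;' | show_pieces
# [||]
printf '%s\n' '3#1#1#;1#4#4;5#1#4;|E|F|0' | show_pieces
# [3|1|1]
# [1|4|4]
# [5|1|4]
```

Prepending "|" to the "||" piece gives the "|||" suffix seen on the whitespace-only row.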