Splitting file based on pattern '\r\n00' in korn shell

Time:03-23

My file temp.txt looks like this:

00ABC
PQR123400
00XYZ001234
012345
0012233

I want to split the file based on the pattern '\r\n00'. In this case temp.txt should be split into 3 files:

first.txt:
00ABC
PQR123400

second.txt:
00XYZ001234
012345

third.txt:
0012233

I am trying to use csplit to match the pattern '\r\n00', but the debug output reports an invalid pattern. Can someone please help me match this exact pattern using csplit?

CodePudding user response:

With your shown samples, please try the following awk code, written and tested in GNU awk.

This code will create files named 1.txt, 2.txt, and so on. It also closes each output file right after writing to it, so we don't run into the infamous "too many open files" error.

awk -v RS='\r?\n00' -v count="1" '
{
  outputFile=(count++".txt")
  rt=RT
  sub(/\r?\n/,"",rt)
  if(!rt){
    sub(/\n+/,"")
    rt=prevRT
  }
  printf("%s%s\n",(count>2?rt:""),$0) > outputFile
  close(outputFile)
  prevRT=rt
}
'  Input_file

Explanation: a detailed, line-by-line explanation of the above code.

awk -v RS='\r?\n00' -v count="1" '      ##Starting the awk program, setting RS to \r?\n00 and count to 1.
{
  outputFile=(count++".txt")            ##Building the output file name from count (incremented on every record) followed by .txt.
  rt=RT                                 ##Copying RT (the text that matched RS) into rt.
  sub(/\r?\n/,"",rt)                    ##Removing \r?\n from rt, which leaves 00.
  if(!rt){                              ##If rt is empty (last record, nothing matched RS) then do the following.
    sub(/\n+/,"")                       ##Removing a run of one or more newlines from the current record.
    rt=prevRT                           ##Setting rt to the value saved in prevRT.
  }
  printf("%s%s\n",(count>2?rt:""),$0) > outputFile   ##Printing rt (from the second file onward) and the current record into outputFile.
  close(outputFile)                     ##Closing outputFile so that open files do not pile up.
  prevRT=rt                             ##Saving rt in prevRT for the next record.
}
'  Input_file                           ##Input_file is the name of the file to split.
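
As a line-oriented alternative (this is not the approach above, just a minimal sketch), the same split can probably be done without RT, so it should also run in a plain POSIX awk. It assumes that every piece starts on a line beginning with 00, which is what a match of '\r\n00' amounts to when the file has CRLF line endings. The sample file is recreated here with printf purely for the demo, and the variable names out and n are made up for illustration.

```shell
# Recreate the sample input with CRLF line endings (assumption: the real file uses CRLF).
printf '00ABC\r\nPQR123400\r\n00XYZ001234\r\n012345\r\n0012233\r\n' > temp.txt

awk '
/^00/ {                        # a line starting with 00 begins a new piece
  if (out != "") close(out)    # close the previous piece before opening the next one
  out = (++n) ".txt"           # 1.txt, 2.txt, 3.txt, ...
}
out != "" { print > out }      # lines before the first 00 line are skipped
' temp.txt
```

Each line is written back exactly as it was read, so any carriage returns stay in the output files. Only a 00 at the start of a line opens a new file; the 00 at the end of PQR123400 or in the middle of 00XYZ001234 does not.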