Home > Software engineering >  How to use AWK command output and turn into CSV File
How to use AWK command output and turn into CSV File

Time:12-16

I am trying to create a script that searches certain fields from the output of autorep -j $i -q and print out what is value of that field mentioned in the output. So basically autorep -j $i -q, when the script is executed will ask the user to input a JOBNAME or %SEARCHSTRING% and then it will give the job details in the below format:

    /tmp $ autorep -j Test_jobA -q
    
    
insert_job: Test_jobA   job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
run_calendar: Autosys_Calendar 
start_times: "09:09"
description: "test discription"
std_out_file: "/tmp/Test_jobA.out"
std_err_file: "/tmp/Test_jobA.err"
alarm_if_fail: 1
alarm_if_terminated: 1


insert_job: Test_JobB   job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: mo,tu,we,th,fr 
start_times: "21:05"
description: "test discription"
std_out_file: "/tmp/Test_JobB.out"
std_err_file: "/tmp/Test_JobB.err"
alarm_if_fail: 1
alarm_if_terminated: 1

insert_job: Test_JobC  job_type: BOX
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 0
description: "test discription"
std_out_file: "/tmp/Test_JobC.out"
std_err_file: "/tmp/Test_JobC.err"
alarm_if_fail: 1
alarm_if_terminated: 1

insert_job: Test_JobD   job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: su 
start_times: "08:50"
description: "test discription"
std_out_file: "/tmp/Test_JobD.out"
std_err_file: "/tmp/Test_JobD.err"
alarm_if_fail: 1
alarm_if_terminated: 1

insert_job: Test_JobE   job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: su 
start_times: "08:20"
description: "test discription"
std_out_file: "/tmp/Test_JobE.out"
std_err_file: "/tmp/Test_JobE.err"
alarm_if_fail: 1
alarm_if_terminated: 1


insert_job: Test_JobF   job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: all 
start_mins: 0,10,20,30,40,50
description: "test discription"
std_out_file: "/tmp/Test_JobF.out"
std_err_file: "/tmp/Test_JobF.err"
alarm_if_fail: 1
alarm_if_terminated: 1

So as you can see if date_condition: 0 then the job may or may not have condition: in it and wont have days_of_week: start_mins: run_window: run_calendar: and if a job has date_condition: 1 then it may or may not have days_of_week: start_mins: run_window: run_calendar:

I have the below script that does the filtering well enough:

#!/bin/bash

  TXT=/tmp/test1.txt
  CSV=/tmp/test1.csv

   echo "Enter the JOB_NAME or %SEARCHSTRING%"

   while read -r i;
      do
  awk '
    /^insert_job/ {if (flag) {printf "\n"}; 
                   printf "%s %s ", $2, $4; 
                   flag = 1}; 
    /^date_conditions/ {printf "%s", $2}; 
    /^condition:|^days_of_week:|^run_calendar:|^start_times:|^start_mins:/ {printf "%s", $2}
' < <(autorep -j $i -q) > $TXT
  break
  done

  if [ -s $TXT ]
    then
       (echo "job_name,job_type,Date_Conditions,condition,days_of_week,start_times,Start_mins" ; cat test1.txt) | sed 's/ \ /,/g' > $CSV
    else
       echo "Please check the %SEARCHSTRING% or JOB_NAME"
  fi

The While loop in the above script gives me the below output:

Test_jobA CMD 1 Autosys_Calendar "09:09"
Test_JobB CMD 1 mo,tu,we,th,fr "21:05"
Test_Jobc BOX 0
Test_JobD CMD 1 su "08:50"
Test_JobE CMD 1 su "08:20"
Test_JobF CMD 1 all "02:02,04:04,06:06,08:08,10:10,12:12,14:14,16:16,18:18,20:20,22:22"

The IF condition in the above script is used for converting the output of While loop into a .csv file but as the output is not being lenear, i am getting wrong data in the wrong columns.

is there anyway i can smooth it out.

EDIT: The required output for the CSV file:

I am looking for the below output, if a field is missing from the job then instead of an empty field it should print "NA" instead so that the .csv formating could be in line

job_name     job_type  date_conditions   condition run_calendar       days_of_week    start_times    start_mins
Test_jobA      CMD             1            NA     Autosys_Calendar     NA             "09:09"        NA
Test_JobB      CMD             1            NA          NA              mo,tu,we,th,fr "21:05"        NA
Test_Jobc      BOX             0            NA          NA              NA             NA             NA
Test_JobD      CMD             1            NA          NA              su             "08:50"        NA
Test_JobE      CMD             1            NA          NA              su             "08:20"        NA
Test_JobF      CMD             1            NA          NA              all             NA          0,10,20,30,40,50

CodePudding user response:

The following awk might be an option for you. You redirect the output to a .xlsx file or pipe the output to column -t for column formatted output.

#!/bin/bash

awk ' 
    BEGIN {
    print "job_name\tjob_type\tdate_conditions\tcondition\t\
        run_calendar\tdays_of_week\tstart_times\tstart_mins"
    }
    /job_type/ { jn=$2; jt=$4; dc="NA"; c="NA"; rc="NA"; dow="NA"; st="NA"; sm="NA" }
    /^date_conditions/ {dc=$2}
    /^condition/ {c=$2}
    /^run_calendar/ {rc=$2}
    /^days_of_week/ {dow=$2}
    /^start_times/ {st=$2;}
    /^start_mins/ {sm=$2;}
    /_if_terminated/{printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", jn, jt, dc, c, rc, dow, st, sm}
' job.dat

Assuming all of your job data is contained in a file named job.dat, here is sample output:

./script job.dat | column -t
job_name   job_type  date_conditions  condition  run_calendar      days_of_week    start_times  start_mins
Test_jobA  CMD       1                NA         Autosys_Calendar  NA              "09:09"      NA
Test_JobB  CMD       1                NA         NA                mo,tu,we,th,fr  "21:05"      NA
Test_JobC  BOX       0                NA         NA                NA              NA           NA
Test_JobD  CMD       1                NA         NA                su              "08:50"      NA
Test_JobE  CMD       1                NA         NA                su              "08:20"      NA
Test_JobF  CMD       1                NA         NA                all             NA           0,10,20,30,40,50

Alternative version of script for generating a .csv output:

#!/bin/bash

data_file="${1}"

awk ' 
    BEGIN {
    print "\"job_name\",\"job_type\",\"date_conditions\",\"condition\",\"run_calendar\",\"days_of_week\",\"start_times\",\"start_mins\""
    }
    /job_type/ { jn="\""$2"\""; jt="\""$4"\""; dc="\"NA\""; c="\"NA\""; rc="\"NA\""; dow="\"NA\""; st="\"NA\""; sm="\"NA\"" }
    /^date_conditions/ {dc="\""$2"\""}
    /^condition/ {c="\""$2"\""}
    /^run_calendar/ {rc="\""$2"\""}
    /^days_of_week/ {dow="\""$2"\""}
    /^start_times/ {gsub("\"",""); st="\""$2"\""}
    /^start_mins/ {sm="\""$2"\""}
    /_if_terminated/{printf "%s,%s,%s,%s,%s,%s,%s,%s\n", jn, jt, dc, c, rc, dow, st, sm}
' "$data_file"

CSV output:

"job_name","job_type","date_conditions","condition","run_calendar","days_of_week","start_times","start_mins"
"Test_jobA","CMD","1","NA","Autosys_Calendar","NA","09:09","NA"
"Test_JobB","CMD","1","NA","NA","mo,tu,we,th,fr","21:05","NA"
"Test_JobC","BOX","0","NA","NA","NA","NA","NA"
"Test_JobD","CMD","1","NA","NA","su","08:50","NA"
"Test_JobE","CMD","1","NA","NA","su","08:20","NA"
"Test_JobF","CMD","1","NA","NA","all","NA","0,10,20,30,40,50"

CodePudding user response:

I think you'll be much happier with this script that outputs CSV as a starting point rather than a script that'd produce the output you say you want:

$ cat tst.awk
BEGIN { OFS="," }
!NF { next }

match($0,/^[[:space:]]*insert_job: [^[:space:]] [[:space:]] /) {
    prt()
    delete tag2val
    numTags = 0
    set_tag2val(substr($0,1,RLENGTH))
    $0 = substr($0,RSTART RLENGTH)
}
{ set_tag2val($0) }
END { prt() }

function set_tag2val(str,       tag,val) {
    gsub(/^[[:space:]] |[[:space:]] $/,"",str)
    tag = val = str
    sub(/[[:space:]]*:.*/,"",tag)
    sub(/[^:]*:[[:space:]]*/,"",val)
    if ( !(tag in tag2val) ) {
        tags[  numTags] = tag
    }
    tag2val[tag] = val
}

function prt() {
    if ( numTags && !doneHdr   ) {
        for ( tagNr=1; tagNr<=numTags; tagNr   ) {
            tag = tags[tagNr]
            printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }
    for ( tagNr=1; tagNr<=numTags; tagNr   ) {
        tag = tags[tagNr]
        val = tag2val[tag]
        gsub(/^"|"$/,"",val)
        printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
    }
}

$ awk -f tst.awk file
"insert_job","job_type","command","machine","owner","permission","date_conditions","run_calendar","start_times","description","std_out_file","std_err_file","alarm_if_fail","alarm_if_terminated"
"Test_jobA","CMD","echo","machinename","owner","","1","Autosys_Calendar","09:09","test discription","/tmp/Test_jobA.out","/tmp/Test_jobA.err","1","1"
"Test_JobB","CMD","echo","machinename","owner","","1","mo,tu,we,th,fr","21:05","test discription","/tmp/Test_JobB.out","/tmp/Test_JobB.err","1","1"
"Test_JobC","BOX","echo","machinename","owner","","0","test discription","/tmp/Test_JobC.out","/tmp/Test_JobC.err","1","1"
"Test_JobD","CMD","echo","machinename","owner","","1","su","08:50","test discription","/tmp/Test_JobD.out","/tmp/Test_JobD.err","1","1"
"Test_JobE","CMD","echo","machinename","owner","","1","su","08:20","test discription","/tmp/Test_JobE.out","/tmp/Test_JobE.err","1","1"
"Test_JobF","CMD","echo","machinename","owner","","1","all","0,10,20,30,40,50","test discription","/tmp/Test_JobF.out","/tmp/Test_JobF.err","1","1"
  • Related