Home > database >  Unwanted blank row in CSV file
Unwanted blank row in CSV file

Time:01-26

I have a command here that i was able to put together with the help of Stackoverflow community. Now I have a small concern with the script, this is a very small issue but its bugging me.

Below is the script I use:

#!/bin/bash

  read -p "Enter /DIR/PATH/FILENAME where you wish to copy the data: " FILENAME
  echo "Enter the JOB_NAME or %SEARCHSTRING%"


 while read -r i;
   do

  awk '
    BEGIN {
    print "\"insert_job\",\"job_type\",\"box_name\",\"command\",\"machine\",\"owner\",\"date_conditions\",\"condition\",\"run_calendar\",\"exclude_calendar\",\"days_of_week\",\"run_window\",\"start_times\",\"start_mins\",\"resources\",\"profile\",\"term_run_time\",\"watch_file\",\"watch_interval\"" }

/job_type/ {
    if (NR>1){printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n", jn, jt, box, cmd, mcn, own, dc, c, rc, ec, dow, ruw, st, sm, res, prof, trt, wf, wi} jn="\""$2"\""; jt="\""$4"\""; box="\" \""; cmd="\" \""; mcn="\" \""; own="\" \""; dc="\" \""; c="\" \""; rc="\" \""; ec="\" \""; dow="\" \""; ruw="\" \""; st="\" \""; sm="\" \""; res="\" \""; prof="\" \""; trt="\" \""; wf="\" \""; wi="\" \""}
    /box_name/ {box="\""$2"\""}
    /command/ {$0=substr($0,index($0,$2)); cmd="\""$0"\""}
    /machine/ {mcn="\""$2"\""}
    /owner/   {own="\""$2"\""}
    /date_conditions/ {dc="\""$2"\""}
    /condition/ {$0=substr($0,index($0,$2)); c="\""$0"\""}
    /run_calendar/ {rc="\""$2"\""}
    /exclude_calendar/ {ec="\""$2"\""}
    /days_of_week/ {dow="\""$2"\""}
    /run_window/ {ruw="\""$2"\""}
    /start_times/ {gsub("\"",""); st="\""$2"\""}
    /^start_mins/ {sm="\""$2"\""}
    /profile/ {prof="\""$2"\""}
    /term_run_time/ {trt="\""$2"\""}
    /watch_file/ {wf="\""$2"\""}
    /watch_interval/ {wi="\""$2"\""}
    /resources/ {res="\""$2"\""}
    END{printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s\n", jn, jt, box, cmd, mcn, own, dc, c, rc, ec, dow, ruw, st, sm, res, prof, trt, wf, wi}
' < <(autorep -j $i -q) > $FILENAME.csv

break
done

The $i takes a wildcard entry and gives output asper that.

for eg: the below 4 jobs have Test in their name, so i will give Test% as the wildcard value and the script will give the output for all the 4 jobs.

These are the test jobs i am using:

/* ----------------- Test_A ----------------- */

insert_job: Test_A  job_type: CMD
command: sleep 3000
machine: machine1
owner: user1
permission:
date_conditions: 0
term_run_time: 1
alarm_if_fail: 1
alarm_if_terminated: 1


/* ----------------- Test_B ----------------- */

insert_job: Test_B    job_type: CMD
command: echo
machine: machine2
owner: user2
permission:
date_conditions: 0
description: "Test"
std_out_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.out"
std_err_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.err"
max_run_alarm: 1
alarm_if_fail: 0
alarm_if_terminated: 0
send_notification: 1



/* ----------------- Test_c ----------------- */

insert_job: Test_c   job_type: CMD
command: sleep 10
machine: machine3
owner: user3
permission:
date_conditions: 0
alarm_if_fail: 0
alarm_if_terminated: 0


/* ----------------- Test_d ----------------- */

insert_job: Test_d   job_type: CMD
command: ls
machine: machine4
owner: user4
permission:
date_conditions: 0
alarm_if_fail: 1
alarm_if_terminated: 1

But here is the issue, the csv file output has a blank row between the column names and the data like below:

"insert_job","job_type","box_name","command","machine","owner","date_conditions","condition","run_calendar","exclude_calendar","days_of_week","run_window","start_times","start_mins","resources","profile","term_run_time","watch_file","watch_interval"
,,,,,,,,,,,,,,,,,,
"Test_A","CMD"," ","sleep 3000","machine1","user1","0","0"," "," "," "," "," "," "," "," ","1"," "," "
"Test_B","CMD"," ","echo","machine2","user2","0","0"," "," "," "," "," "," "," "," "," "," "," "
"Test_c","CMD"," ","sleep 10","machine3","user3","0","0"," "," "," "," "," "," "," "," "," "," "," "
"Test_d","CMD"," ","ls","machine4","user4","0","0"," "," "," "," "," "," "," "," "," "," "," "

Required output is:

"insert_job","job_type","box_name","command","machine","owner","date_conditions","condition","run_calendar","exclude_calendar","days_of_week","run_window","start_times","start_mins","resources","profile","term_run_time","watch_file","watch_interval"
"Test_A","CMD"," ","sleep 3000","machine1","user1","0","0"," "," "," "," "," "," "," "," ","1"," "," "
"Test_B","CMD"," ","echo","machine2","user2","0","0"," "," "," "," "," "," "," "," "," "," "," "
"Test_c","CMD"," ","sleep 10","machine3","user3","0","0"," "," "," "," "," "," "," "," "," "," "," "
"Test_d","CMD"," ","ls","machine4","user4","0","0"," "," "," "," "," "," "," "," "," "," "," "

I have tried using (NR>=1) but it doesn't work. I know this is very trivial but I cant get my head around it, can someone help me?

FOLLOW UP QUESTION

so the solution provided by Ed Morton has fixed the issue, nut here is my follow up Question

i am trying to use the same script for other similar kind of data as the ones below:

Machine Name:       machine1
Type:               a
Node Name:          machine1.test
Agent Name:         AGENT
Operating System:   Windows Server 2012 
Agent Release:      12.0
Agent Build:        6181, Service Pack 00, Maintenance Level 00

Machine Name:       machine2
Type:               a
Node Name:          machine2.test
Agent Name:         AGENT
Operating System:   Windows Server 2012 for amd64
Agent Release:      12.0
Agent Build:        6181, Service Pack 00, Maintenance Level 00

Is there a way to use the same script for the above data? Output needed:

"Machine Name","Type","Node Name","Agent Name","Operating System","Agent Release","Agent Build"
"machine1"," a","  machine1.test","  AGENT","  Windows Server 2012","  12.0","  6181, Service Pack 00, Maintenance Level 00"
"machine2"," a","  machine2.test","  AGENT","  Windows Server 2012","  12.0","  6181, Service Pack 00, Maintenance Level 00"

CodePudding user response:

Here's how to robustly (your existing script will fail for various possible input values) and maintainably (if you want to add/remove fields to print or reorder them just update the first split() arg) do what you're trying to do, using cat file inplace of autorep -j $i -q which I don't have on my system:

$ cat tst.sh
#!/usr/bin/env bash

cat file |
awk '
    BEGIN {
        OFS = ","
        numTags = split("insert_job job_type box_name command machine owner date_conditions condition run_calendar exclude_calendar days_of_week run_window start_times start_mins resources profile term_run_time watch_file watch_interval",tags)
        for ( tagNr=1; tagNr<=numTags; tagNr   ) {
            tag = tags[tagNr]
            printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }

    !NF || /^\/\*/ { next }
    { gsub(/^[[:space:]] |[[:space:]] $/,"") }

    match($0,/[[:space:]]job_type:/) {
        if ( jobNr   ) {
            prt()
            delete tag2val
        }

        # save "insert_job" value
        tag = substr($1,1,length($1)-1)
        val = substr($0,length($1) 1,RSTART-(length($1) 2))
        gsub(/^[[:space:]] |[[:space:]] $/,"",val)
        tag2val[tag] = val

        # update $0 to start with "job_type" to look like all other input
        $0 = substr($0,RSTART 1)
    }

    {
        tag = val = $0
        sub(/:.*/,"",tag)
        sub(/[^:] :[[:space:]]*/,"",val)
        tag2val[tag] = val
    }

    END { prt() }

    function prt(    tagNr,tag,val) {
        for ( tagNr=1; tagNr<=numTags; tagNr   ) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }
'

$ ./tst.sh
"insert_job","job_type","box_name","command","machine","owner","date_conditions","condition","run_calendar","exclude_calendar","days_of_week","run_window","start_times","start_mins","resources","profile","term_run_time","watch_file","watch_interval"
"Test_A","CMD","","sleep 3000","machine1","user1","0","","","","","","","","","","1","",""
"Test_B","CMD","","echo","machine2","user2","0","","","","","","","","","","","",""
"Test_c","CMD","","sleep 10","machine3","user3","0","","","","","","","","","","","",""
"Test_d","CMD","","ls","machine4","user4","0","","","","","","","","","","","",""

The above is a simplified version of my answer to your previous question where I provided a script that'd output a CSV of all fields instead of the above which only outputs a specific list of fields.

It's not clear why you have a loop calling autorep in your existing script but if you do need one for some reason then chances are you should be calling autorep in a loop and piping the loop output to awk

while IFS= read -r i; do
    autorep -j "$i" -q
done |
awk '...'

instead of calling autorep and piping it's output to awk inside the loop:

while IFS= read -r i; do
    autorep -j "$i" -q |
    awk '...'
done

The above was run on this input copy/pasted from your question:

$ cat file
/* ----------------- Test_A ----------------- */

insert_job: Test_A  job_type: CMD
command: sleep 3000
machine: machine1
owner: user1
permission:
date_conditions: 0
term_run_time: 1
alarm_if_fail: 1
alarm_if_terminated: 1


/* ----------------- Test_B ----------------- */

insert_job: Test_B    job_type: CMD
command: echo
machine: machine2
owner: user2
permission:
date_conditions: 0
description: "Test"
std_out_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.out"
std_err_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.err"
max_run_alarm: 1
alarm_if_fail: 0
alarm_if_terminated: 0
send_notification: 1



/* ----------------- Test_c ----------------- */

insert_job: Test_c   job_type: CMD
command: sleep 10
machine: machine3
owner: user3
permission:
date_conditions: 0
alarm_if_fail: 0
alarm_if_terminated: 0


/* ----------------- Test_d ----------------- */

insert_job: Test_d   job_type: CMD
command: ls
machine: machine4
owner: user4
permission:
date_conditions: 0
alarm_if_fail: 1
alarm_if_terminated: 1
  • Related