Command output with empty values to CSV

> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME

NAME      LABEL FSTYPE MOUNTPOINT      SIZE TYPE
nvme0n1                              894.3G disk
nvme0n1p1              [SWAP]            4G part
nvme0n1p2                                1G part
nvme0n1p3 root         /home/cg/root 889.3G part

I need the output of this command in CSV format, but none of the methods I've tried so far handle the empty values correctly; they produce bad rows like these, which I got with sed:

> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME | sed -E 's/  /,/g'

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,894.3G,disk
nvme0n1p1,[SWAP],4G,part
nvme0n1p2,1G,part
nvme0n1p3,root,/home/cg/root,889.3G,part

Any idea how to add the extra commas for the empty fields?

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk

CodePudding user response：

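Put the fields that may be empty at the end of the line, then rearrange them into the required order. Note that this only works when an empty field is never followed by a non-empty one on the same line (eg, an empty FSTYPE followed by a MOUNTPOINT); otherwise awk's whitespace splitting shifts the later values left.

lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,LABEL -x NAME | awk -v OFS=';' '{ print $1,$6,$4,$5,$2,$3 }'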

CodePudding user response：

Not really bash, but a quick and dirty Perl script would be something like this (it replaces every character column that is blank in all rows with a ';'):

my $state=0;
my @input=<>;
my $maxlength=0;
for my $line ( 0 .. $#input){
        my $curlength= length($input[$line]);
        if ($curlength>$maxlength){$maxlength=$curlength;}
}
my $fill=' ' x $maxlength;
for my $line ( 0 .. $#input){
        chomp $input[$line];
        $input[$line]="$input[$line] $fill";
}


for (my $pos=0; $pos<$maxlength; $pos++){
        my $spacecol=1;
        for my $line ( 0 .. $#input){
                if (substr($input[$line],$pos,1) ne ' '){
                        $spacecol=0;
                }
        }
        if ($spacecol==1){
                for my $line ( 0 .. $#input){
                        substr($input[$line],$pos,1)=';';
                }
        }
}

for my $line ( 0 .. $#input){
        print "$input[$line]\n";
}
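
The same idea (a character column that is blank in every row separates two fields) can also be sketched in POSIX awk. This is only a minimal sketch, not a drop-in replacement for the Perl above: it collapses each run of all-blank columns into a single comma and trims the padding from every field. The function name fixed_to_csv and the row helper are made up for illustration, and it assumes space-padded input (no tabs) whose values contain no commas or double quotes.

```shell
# fixed_to_csv: hypothetical helper name; reads fixed-width text on stdin.
# Assumption: columns are padded with spaces and values need no CSV quoting.
fixed_to_csv() {
    awk '
    function trim(s) { gsub(/^ +| +$/, "", s); return s }
    {
        line[NR] = $0
        len = length($0)
        if (len > width) width = len
        # remember every character position that is non-blank in some row
        for (p = 1; p <= len; p++)
            if (substr($0, p, 1) != " ") used[p] = 1
    }
    END {
        for (r = 1; r <= NR; r++) {
            n = 0; infield = 0
            for (p = 1; p <= width; p++) {
                if (p in used) {
                    if (!infield) { n++; f[n] = ""; infield = 1 }
                    f[n] = f[n] substr(line[r], p, 1)
                } else {
                    infield = 0     # a column blank in every row ends the field
                }
            }
            for (i = 1; i <= n; i++)
                printf "%s%s", trim(f[i]), (i < n ? "," : "\n")
        }
    }'
}

# Rebuild the sample from the question with the same column widths.
row() { printf '%-9s %-5s %-6s %-13s %6s %s\n' "$@"; }
{
    row NAME LABEL FSTYPE MOUNTPOINT SIZE TYPE
    row nvme0n1 '' '' '' 894.3G disk
    row nvme0n1p1 '' '' '[SWAP]' 4G part
} | fixed_to_csv
# prints:
# NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
# nvme0n1,,,,894.3G,disk
# nvme0n1p1,,,[SWAP],4G,part
```

With real data you would pipe the lsblk command from the question into fixed_to_csv instead. The approach breaks down if two adjacent columns never leave a fully blank character column between them.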

CodePudding user response：

Assumptions:

output format is fixed-width
header record does not contain any blank fields
no fields contain white space (ie, only white space occurs between fields)
have access to GNU awk (for multi-dimensional arrays); otherwise we'd need a 2-pass solution, which is doable but takes a bit more coding

Design overview:

parse the header to get the initial index for each field; if all columns were left-justified this would be all we need to do; however, because some columns are right-justified (eg, SIZE), we also need to look for right-justified values that are longer than the associated header field (ie, the value starts at a lower index than the associated header)
for non-header rows we loop through our set of potential fields, using substr()/match() to find the non-space fields in the line and ...
if said field starts and ends before the next field's index then store in array as current field value but ...
if said field starts before next field's index but ends after next field's index then we're looking at a right-justified value of the next field which happens to have an earlier index than the associated header's index; in this case update the index for the next field
if said field starts after the index of the next field then the current field is empty (we could store a blank in the array but a missing array entry automatically evaluates as blank, so no need to save anything in the array for a missing field)
in the END{} block print our array to stdout
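
To make the first step concrete, here is a small sketch (an illustration only, separate from the full script) that prints the starting index of each header label. The function name header_starts is made up, and the printf line just rebuilds the header from the question with the same column widths.

```shell
# header_starts: hypothetical helper name, not part of the full script.
# Prints each header label with the 1-based column where it starts.
header_starts() {
    awk 'NR == 1 {
        rest = $0; offset = 0; sep = ""
        while (match(rest, /[^ ]+/)) {
            printf "%s%s=%d", sep, substr(rest, RSTART, RLENGTH), offset + RSTART
            sep = " "
            offset += RSTART + RLENGTH - 1      # characters consumed so far
            rest = substr(rest, RSTART + RLENGTH)
        }
        print ""
    }'
}

printf '%-9s %-5s %-6s %-13s %6s %s\n' NAME LABEL FSTYPE MOUNTPOINT SIZE TYPE | header_starts
# prints: NAME=1 LABEL=11 FSTYPE=17 MOUNTPOINT=24 SIZE=40 TYPE=45
```

In the sample data the value 894.3G starts at column 38, two columns before the SIZE header's index of 40; that is the right-justified case where the index for the field has to be moved to the left.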

One awk idea:

awk '
BEGIN   { OFS="," }

# use header record to determine initial set of indexes

NR==1   { maxNF=NF
          header=$0
          for (i=1;i<=maxNF;i++) {
              match(header,/[^[:space:]]+/)                             # find first non-space string
              ndx[i]=ndx[i-1] + prevlen + RSTART - (i==1 ? 0 : 1)       # make note of index
              fields[FNR][i]=substr(header,RSTART,RLENGTH)              # save value in fields[][] array
              prevlen=RLENGTH                                           # need for next pass through loop
              header=substr(header,RSTART+RLENGTH)                      # strip off matched string and repeat loop
          }
        }

# for rest of records need to determine which fields are empty and/or which fields need the associated index updated

        { for (i=1;i<maxNF;i++) {                                       # loop through all but last field
              restofline=substr($0,ndx[i])                              # work with current field thru to end of line
              if ( match(restofline,/[^[:space:]]+/) )                  # if we find a non-space match ...
                 if ( ndx[i]-1+RSTART < ndx[i+1] )                      # if match starts before index of next field and ...
                    if ( ndx[i]-1+RSTART+RLENGTH < ndx[i+1] )           # ends before index of next field then ...
                       fields[FNR][i]=substr(restofline,RSTART,RLENGTH) # store the value in our array
                    else {                                              # else if match finished beyond index of next field then ...
                       diff=ndx[i+1]-(ndx[i]+RSTART-1)                  # figure the difference and ...
                       ndx[i+1]-=diff                                   # update the index for the next field
                    }
          }
          field=substr($0,ndx[maxNF])                                   # process last field
          gsub(/[[:space:]]/,"",field)                                  # remove all remaining spaces
          if (field)                                                    # if non-empty then ...
             fields[FNR][maxNF]=field                                   # save in array

        }

# print our array

END     { for (i=1;i<=FNR;i++)
              for (j=1;j<=maxNF;j++)
                  printf "%s%s",fields[i][j], (j==maxNF ? "\n" : OFS)
        }
' lsblk.out

This generates:

NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk
nvme0n1p1,,,[SWAP],4G,part
nvme0n1p2,,,,1G,part
nvme0n1p3,root,,/home/cg/root,889.3G,part