> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME
NAME LABEL FSTYPE MOUNTPOINT SIZE TYPE
nvme0n1 894.3G disk
nvme0n1p1 [SWAP] 4G part
nvme0n1p2 1G part
nvme0n1p3 root /home/cg/root 889.3G part
I need the output of this command in csv format, but all the methods I've tried so far don't handle the empty values correctly, thus generating bad rows like these I got with sed:
> lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE -x NAME | sed -E 's/ /,/g'
NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,894.3G,disk
nvme0n1p1,[SWAP],4G,part
nvme0n1p2,1G,part
nvme0n1p3,root,/home/cg/root,889.3G,part
Any idea how to add the extra commas for the empty fields?
NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk
CodePudding user response:
Make sure that the fields that are possibly empty are at the end of the line. And then re-arrange them in the required sequence.
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,LABEL -x NAME | awk '{ print $1,";",$6,";",$4,";",$5,";",$2,";",$3 }'
CodePudding user response:
Not really bash, but a quick and dirty Perl would be something like:
my $state=0;
my @input=<>;
my $maxlength=0;
for my $line ( 0 .. $#input){
my $curlength= length($input[$line]);
if ($curlength>$maxlength){$maxlength=$curlength;}
}
my $fill=' ' x $maxlength;
for my $line ( 0 .. $#input){
chomp $input[$line];
$input[$line]="$input[$line] $fill";
}
for (my $pos=0; $pos<$maxlength; $pos ){
my $spacecol=1;
for my $line ( 0 .. $#input){
if (substr($input[$line],$pos,1) ne ' '){
$spacecol=0;
}
}
if ($spacecol==1){
for my $line ( 0 .. $#input){
substr($input[$line],$pos,1)=';';
}
}
}
for my $line ( 0 .. $#input){
print "$input[$line]\n";
}
CodePudding user response:
Assumptions:
- output format is fixed-width
- header record does not contain any blank fields
- no fields contain white space (ie, only white space occurs between fields)
- have access to
GNU awk
(for multi-dimensional arrays) (otherwise we'll need a 2-pass solution ... doable but a bit more coding)
Design overview:
- parse header to get initial index for each field; if all columns are left-justified then this would be all we need to do however, with the existence of right-justified columns (eg,
SIZE
) we need to look for right-justified values that are longer than the associated header field (ie, the value has a lower index than the associated header) - for non-header rows we loop through our set of potential fields, using
substr()/match()
to find the non-space fields in the line and ... - if said field starts and ends before the next field's index then store in array as current field value but ...
- if said field starts before next field's index but ends after next field's index then we're looking at a right-justified value of the next field which happens to have an earlier index than the associated header's index; in this case update the index for the next field
- if said field starts after the index of the next field then the current field is empty (we could store a blank in the array but a missing array entry automatically evaulates as blank, so no need to save anything in the array for a missing field)
- in the
END{}
block print our array to stdout
One awk
idea:
awk '
BEGIN { OFS="," }
# use header record to determine initial set of indexes
NR==1 { maxNF=NF
header=$0
for (i=1;i<=maxNF;i ) {
match(header,/[^[:space:]] /) # find first non-space string
ndx[i]=ndx[i-1] prevlen RSTART - (i==1 ? 0 : 1) # make note of index
fields[FNR][i]=substr(header,RSTART,RLENGTH) # save value in fields[][] array
prevlen=RLENGTH # need for next pass through loop
header=substr(header,RSTART RLENGTH) # strip off matched string and repeat loop
}
}
# for rest of records need to determine which fields are empty and/or which fields need the associated index updated
{ for (i=1;i<maxNF;i ) { # loop through all but last field
restofline=substr($0,ndx[i]) # work with current field thru to end of line
if ( match(restofline,/[^[:space:]] /) ) # if we find a non-space match ...
if ( ndx[i]-1 RSTART < ndx[i 1] ) # if match starts before index of next field and ...
if ( ndx[i]-1 RSTART RLENGTH < ndx[i 1] ) # ends before index of next field then ...
fields[FNR][i]=substr(restofline,RSTART,RLENGTH) # store the value in our array
else { # else if match finished beyond index of next field then ...
diff=ndx[i 1]-(ndx[i] RSTART-1) # figure the difference and ...
ndx[i 1]-=diff # update the index for the next field
}
}
field=substr($0,ndx[maxNF]) # process last field
gsub(/[[:space:]]/,"",field) # remove all remaining spaces
if (field) # if non-empty then ...
fields[FNR][maxNF]=field # save in array
}
# print our array
END { for (i=1;i<=FNR;i )
for (j=1;j<=maxNF;j )
printf "%s%s",fields[i][j], (j==maxNF ? "\n" : OFS)
}
' lsblk.out
This generates:
NAME,LABEL,FSTYPE,MOUNTPOINT,SIZE,TYPE
nvme0n1,,,,894.3G,disk
nvme0n1p1,,,[SWAP],4G,part
nvme0n1p2,,,,1G,part
nvme0n1p3,root,,/home/cg/root,889.3G,part