Here is my sample data:
DATA='target1.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target2.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:30
target3.domain,LAST_VULN_AGENT_SCAN,2022/12/07 03:14:49
target1.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target2.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target3.domain,LAST_VULN_NONCRED_SCAN,2022/12/07 00:08:43
target1.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target2.domain,LAST_VULN_CRED_SCAN,2022/12/07 04:59:06
target3.domain,LAST_VULN_CRED_SCAN,2022/12/07 03:03:52'
Here is my desired output:
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52
Here is my current for loop:
UNIQUETARGETS=$(echo "${DATA}" | cut -d , -f 1 | sort | uniq)
echo 'Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN'
for TARGET in $UNIQUETARGETS; do
LAST_VULN_AGENT_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_AGENT_SCAN' | cut -d , -f 3)
LAST_VULN_NONCRED_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_NONCRED_SCAN' | cut -d , -f 3)
LAST_VULN_CRED_SCAN=$(echo "${DATA}" | grep "${TARGET}," | grep 'LAST_VULN_CRED_SCAN' | cut -d , -f 3)
echo "${TARGET},${LAST_VULN_AGENT_SCAN},${LAST_VULN_NONCRED_SCAN},${LAST_VULN_CRED_SCAN}"
done
While this method works, I am confident that AWK can do this much faster. I've searched for several "transpose" awk snippets, but nothing I've found can do quite what I want. Any help is greatly appreciated!
CodePudding user response:
Using any awk:
$ cat tst.awk
BEGIN { FS=OFS="," }
{
names[$1]
times[$1,$2] = $3
}
END {
hdr = "Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN"
print hdr
numCols = split(hdr,scans)
for ( name in names ) {
printf "%s%s", name, OFS
for ( colNr=2; colNr<=numCols; colNr ) {
scan = scans[colNr]
time = times[name,scan]
printf "%s%s", time, (colNr<numCols ? OFS : ORS)
}
}
}
$ awk -f tst.awk file
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06
If you care about the order of output lines they can be sorted in whatever order you like.
CodePudding user response:
I would use ruby
with its csv parser for this personally.
ruby -r csv -e '
options={ :headers=>false }
inp=CSV.parse($<.read, **options)
names=inp.map{|sa| sa[0]}.uniq.sort
header=["Name"] inp.map{|sa| sa[1]}.uniq.sort
data=inp.each_with_object(Hash.new {|h,k| h[k]={} }) {
|(name, func, date), h|
h[name][func]=date
}
table=CSV::Table.new([], headers: header)
names.each{ |n| line=[n]; header[1..].each { |h| line << data[n][h] }
table << line }
puts table
' file
Prints:
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_CRED_SCAN,LAST_VULN_NONCRED_SCAN
target1.domain,2022/12/07 03:14:49,2022/12/07 04:59:06,2022/12/07 00:08:43
target2.domain,2022/12/07 03:14:30,2022/12/07 04:59:06,2022/12/07 00:08:43
target3.domain,2022/12/07 03:14:49,2022/12/07 03:03:52,2022/12/07 00:08:43
CodePudding user response:
If you just wanted to do it in bash
the logic still works with very few tweaks required.
declare -A names=() v=();
while IFS=, read n c t; do names[$n]=1; v["$n,$c"]="$t"; done < test.in
echo "Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN"
for n in "${!names[@]}"
do echo "$n,${v[$n,LAST_VULN_AGENT_SCAN]},${v[$n,LAST_VULN_NONCRED_SCAN]},${v[$n,LAST_VULN_CRED_SCAN]}"
done
or
hdr="Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN"
echo "$hdr"; IFS=, read -a col <<< "$hdr"
declare -A names=() v=();
while IFS=, read n c t; do names[$n]=1; v["$n,$c"]="$t"; done < test.in
for n in "${!names[@]}"; do echo "$n,${v[$n,${col[1]}]},${v[$n,${col[2]}]},${v[$n,${col[3]}]}"; done
The output -
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06
And as mentioned, you could always add sort
.
If that last line were
for n in "${!names[@]}"; do echo "$n,${v[$n,${col[1]}]},${v[$n,${col[2]}]},${v[$n,${col[3]}]}"; done | sort
then the output becomes
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
target1.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 04:59:06
target2.domain,2022/12/07 03:14:30,2022/12/07 00:08:43,2022/12/07 04:59:06
target3.domain,2022/12/07 03:14:49,2022/12/07 00:08:43,2022/12/07 03:03:52
But awk
is still smarter.