I am having a tough time getting the time difference and the size in uniform units (MB, GB, or TB) for the entire backup duration of a client.
Below is my command:
mminfo -v -q "group=testgroup1,savetime>=02/17/2022,savetime<=02/18/2022" -r \
"savetime,level,totalsize,volume,vmname,client,sscreate(20),sscomp(20)" -xc,
-q is for the query
-r is to retrieve params from the query:
- group contains the clients
- savetime is day of backup
- level is level of backup
- totalsize is size of backup in bytes
- volume is the name of volume where data is stored
- vmname gives the name of vm
- client gives the name of the client; vmname and client are one and the same, depending on the type of backup.
- sscreate(20) gives start time of backup
- sscomp(20) gives end time of backup
The output of my command looks like this:
17/02/22,incr,4853101080,volume.001,,testclient1,17/02/22 20:27:18,17/02/22 20:40:45
17/02/22,incr,404305556,volume.001,,testclient1,17/02/22 20:27:15,17/02/22 20:27:34
17/02/22,incr,645786660,volume.001,,testclient1,17/02/22 20:27:17,17/02/22 20:27:30
17/02/22,incr,4,volume.001,,testclient1,17/02/22 20:27:45,17/02/22 20:27:47
17/02/22,incr,4,volume.001,,testclient1,17/02/22 20:27:16,17/02/22 20:27:19
17/02/22,incr,4,volume.001,,testclient1,17/02/22 20:27:46,17/02/22 20:27:48
17/02/22,incr,4,volume.001,,testclient1,17/02/22 20:28:05,17/02/22 20:28:08
17/02/22,incr,4,volume.002,,testclient1,17/02/22 20:27:48,17/02/22 20:27:51
17/02/22,incr,6085356,volume.002,,testclient1,17/02/22 20:42:26,17/02/22 20:42:51
17/02/22,incr,53328,volume.004,,testclient1,17/02/22 20:43:13,17/02/22 20:43:22
17/02/22,incr,4,volume.004,,testclient1,17/02/22 20:27:34,17/02/22 20:27:37
For each drive/mount point of a backup there is one entry (line) in the output. I am able to sum up the size of the data backed up per client per day, but I can't work out the logic for calculating the time difference (backup duration) for a client on one particular day. Can someone help me with this?
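For reference, the difference for a single row is easy to compute by hand; a sketch with GNU date (date -d is GNU-specific), using the timestamps from the first output row above:

start=$(date -d "2022-02-17 20:27:18" +%s)  # sscreate of the first row
end=$(date -d "2022-02-17 20:40:45" +%s)    # sscomp of the first row
echo $((end - start))                       # prints 807 (seconds)

The hard part is aggregating these per-row differences into a single duration per client per day.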
The logic I used (awk is where I started filtering the data):
printf "show name\n p type:nsr group\n" |
nsradmin -i - |
grep -v ^$ |
cut -d: -f2- |
cut -d\; -f1 |
sort -u |
perl -pe 's/\ //' |
while read grp;do
mminfo -q "group=$grp,savetime>02/12/2022 16:00,savetime<02/13/2022 16:00,level=full" \
-r "savetime,level,totalsize,volume,vmname,client,sscreate(20),sscomp(20)" \
-xc, 2>/dev/null |
awk -F, 'BEGIN{OFS=FS}{if($5=="")$5=$6;print}' |
awk -F, '$5 != "vm_name" {a[$5","$2]+=$3;b[$5","$2]++;OFS=FS}
END{for (v in a) print v,b[v],a[v]}'|
while read j;do echo $grp,$j;
done
done
The output from the code looks like this:
testgroup1,testclient1,full,11,65959975044
CodePudding user response:
Suggesting to fold all the post-processing logic into a single gawk (standard Linux awk) script.
script.awk
function timestamp(dateStr) {
formattedStr = gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2}) ([[:digit:]]{2}):([[:digit:]]{2}):([[:digit:]]{2})/,
"20\\3 \\2 \\1 \\4 \\5 \\6", 1, dateStr);
# from: DD/MM/YY HH:MM:SS
# to: YYYY MM DD HH MM SS
return mktime(formattedStr);
}
BEGIN {
minStartTime = 9999999999999999999999;
}
$5 == "" {
$5 = $6;
}
{
accumulatedDailyStorage[$5","$2] += $3;
accumulatedDailyTime[$5","$2] += (timestamp($8) - timestamp($7));
startTime = timestamp($7);
minStartTime = (startTime < minStartTime) ? startTime : minStartTime ;
endTime = timestamp($8);
maxEndTime = (endTime > maxEndTime) ? endTime : maxEndTime ;
accumulatedDailyCount[$5","$2]++;
}
END {
for (clientName in accumulatedDailyStorage) {
print clientName, accumulatedDailyStorage[clientName], accumulatedDailyCount[clientName], accumulatedDailyTime[clientName], (maxEndTime - minStartTime);
}
}
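Note: gensub() and mktime() are gawk extensions, so the script needs gawk (the default awk on most Linux distributions) rather than mawk or BusyBox awk. The per-saveset durations are summed per client and level in accumulatedDailyTime, while minStartTime and maxEndTime track the wall-clock span from the earliest start to the latest end across all rows (the last field printed).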
Output:
awk -F, -f script.awk input.1.txt
testclient1,incr 5909332004 11 889 6666
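To also get the sizes in uniform units (the MB/GB/TB part of the question), a converter function could be added to script.awk. A minimal sketch, assuming the helper name human() (hypothetical, not part of mminfo or gawk):

# Hypothetical helper: repeatedly divide a byte count by 1024
# and attach the matching unit; units and i are awk-style locals.
function human(bytes,    units, i) {
  split("B KB MB GB TB PB", units, " ");
  i = 1;
  while (bytes >= 1024 && i < 6) {
    bytes /= 1024;
    i++;
  }
  return sprintf("%.2f %s", bytes, units[i]);
}

In the END block, printing human(accumulatedDailyStorage[clientName]) instead of the raw byte count would turn 5909332004 into 5.50 GB.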