Bash script to find specific files older than 90 days and put them in .csv file

Time:03-09

I have the bash script below, and I am trying to find a way to write additional functionality that also calculates the number of days between the last modification of the file and the current date:

#!/bin/bash

read -p "do you want to find the files? Y/N " -n 1 -r
echo

echo "path , $(date +%d-%m-%Y)" >> checked_files.csv
find . -name "data.csv" | xargs -d '\n' stat -c "%-25n;%y" | echo "$(date +%d-%m-%Y) - $(stat -c "%-25n;%y")" | bc >> checked_files.csv
find . -name "output_for_CPA_tool.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "info_table.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "int_2.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "intermediate.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "output_for_MME_tool.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "media_contacts.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "modeldata.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv
find . -name "modeldata.RData.csv" | xargs -d '\n' stat -c "%-25n;%y" >> checked_files.csv

Any help would be appreciated.

Thanks in advance :)

CodePudding user response:

You can use a small helper function:

datediff() {
    d1=$(date -d "$1" +%s)
    d2=$(date -d "$2" +%s)
    echo $(( (d1 - d2) / 86400 )) days
}

In your case, you can calculate the days like this (after adding the function above to your script):

datediff "$(date +%F)" "$(find . -name "modeldata.csv" | xargs -d '\n' stat -c "%-25n;%y" | cut -d";" -f2 | awk '{print $1}' | head -n 1)"

So a complete script could look like this:

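#!/bin/bash

# Print the number of whole days between two dates (first minus second).
datediff() {
    d1=$(date -d "$1" +%s)
    d2=$(date -d "$2" +%s)
    echo $(( (d1 - d2) / 86400 )) days
}

# First whitespace-separated field of the stat output (the path).
result1=$(find . -name "modeldata.csv" | xargs -d '\n' stat -c "%-25n;%y" | awk '{print $1}')
# Modification date (YYYY-MM-DD) of the first match, compared with today.
datediff1=$(datediff "$(date +%F)" "$(find . -name "modeldata.csv" | xargs -d '\n' stat -c "%-25n;%y" | cut -d";" -f2 | awk '{print $1}' | head -n 1)")
echo "$result1 $datediff1" >> checked_files.csv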
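
The script above only computes the difference for the first match (because of head -n 1). A minimal sketch of a per-file variant follows, assuming GNU find, stat and date. The function names age_in_days and list_ages are made up for illustration and are not part of the answer above:

```shell
#!/bin/bash
# Sketch only: age_in_days and list_ages are hypothetical helper names.

# Whole days between two epoch timestamps (newer first, older second).
age_in_days() {
    echo $(( ($1 - $2) / 86400 ))
}

# Print "path;YYYY-MM-DD;days" for every file called $2 below directory $1.
list_ages() {
    local dir=$1 name=$2 now file mtime
    now=$(date +%s)
    # -print0 with read -d '' keeps paths containing spaces or newlines intact.
    find "$dir" -type f -name "$name" -print0 |
    while IFS= read -r -d '' file; do
        mtime=$(stat -c '%Y' "$file")   # GNU stat: mtime in seconds since the epoch
        printf '%s;%s;%s\n' "$file" "$(date -d "@$mtime" +%F)" "$(age_in_days "$now" "$mtime")"
    done
}
```

It could then be called once per file name, for example: list_ages . modeldata.csv >> checked_files.csv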

CodePudding user response:

GNU find has a lot of helpful options.

find all .csv files
find . -type f -name '*.csv'
find all files whose name is data.csv or modeldata.csv
find . -type f '(' -name data.csv -o -name modeldata.csv ')'
find all files that weren't modified in the last 90 days
find . -type f -mtime +90
find all files and append their modification time (in the same format as stat -c '%y' and in seconds since epoch) to the output
find . -type f -printf '%h/%f;%TY-%Tm-%Td %TT %Tz;%T@\n'
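
One detail about -mtime +90: GNU find counts the age in whole 24-hour periods and ignores any fractional part, so +90 only matches files that are at least 91 days old. A small sketch to check this in a scratch directory (demo_mtime and the file names are made up for illustration; it assumes GNU touch and find):

```shell
#!/bin/bash
# Sketch only: demo_mtime and the file names are hypothetical.
demo_mtime() {
    local dir now
    dir=$(mktemp -d)
    now=$(date +%s)
    # One file aged 90 days + 1 hour, one aged 91 days + 1 hour.
    touch -d "@$(( now - 90*86400 - 3600 ))" "$dir/age_90d.csv"
    touch -d "@$(( now - 91*86400 - 3600 ))" "$dir/age_91d.csv"
    # Lists only the files whose age in whole days is greater than 90.
    find "$dir" -type f -mtime +90 -printf '%f\n'
    rm -rf "$dir"
}
```

Running demo_mtime lists age_91d.csv but not age_90d.csv.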

Now using that, you could do:

{
    echo 'path;datetime;elapsed(days)'

    find . -mtime +90 -type f \
        '(' \
            -name data.csv \
         -o -name output_for_CPA_tool.csv \
         -o -name info_table.csv \
         -o -name int_2.csv \
         -o -name intermediate.csv \
         -o -name output_for_MME_tool.csv \
         -o -name media_contacts.csv \
         -o -name modeldata.csv \
         -o -name modeldata.RData.csv \
        ')' \
        -printf '%h/%f;%TY-%Tm-%Td %TT %Tz;%T@\n' |

    awk -v FS=';' -v OFS=';' '
        BEGIN { now = systime() }
        {
            $3 = int( (now - $3) / 86400 )
            print
        }
    '
} > checked_files.csv
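
Note that systime() is not part of POSIX awk (gawk provides it). If the awk on your system lacks it, a possible variant is to pass the current time in from the shell. This is only a sketch, and add_elapsed_days is a made-up name; it expects the same path;datetime;epoch lines that the find command above prints:

```shell
#!/bin/bash
# Sketch only: add_elapsed_days is a hypothetical helper name.
# Reads "path;datetime;epoch" lines and replaces the third field
# with the number of whole days elapsed since that epoch time.
add_elapsed_days() {
    awk -v FS=';' -v OFS=';' -v now="$(date +%s)" '
        {
            $3 = int( (now - $3) / 86400 )
            print
        }
    '
}
```

It would take the place of the awk command at the end of the pipeline: find ... -printf '%h/%f;%TY-%Tm-%Td %TT %Tz;%T@\n' | add_elapsed_days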