I need your help with this, please.
I have an enormous directory with millions of files, and I'm trying to group them by year and month using the find command and then tar each group to save some space.
I have created a bash script like the following:
#!/bin/bash
DIR=/data/historical
/usr/bin/cd /data/backupfile
sleep 2
[ -e "$DIR" ] || mkdir "$DIR"
sleep 2
for year in 2019 2020 2021 2022
do
for month in jan feb mar apr may jun jul aug sept oct nov dec
do
mkdir -p /data/historical/"$year"/"$month"
done
for prev feb mar apr may jun jul aug sept oct nov dec jan
do
/usr/bin/find ! -newermt "$prev 31 $year" -newermt "$month 1 $year" -exec mv {} /data/historical/"$month" \;
done
done
I also tried it this way:
years=2019,2020,2021,2022
months=01,02,03,04,05,06,07,08,09,10,11,12
#months=`date '+%b'`
#after=`date -d '1 month' '+%b'`
for year in $(echo ${years})
do
for month in "${months[@]}"
do
/usr/bin/find ! -newermt "$year-$month-31" -newermt "$year-$month-01" -exec mv {} /data/historical/"$month" \;
done
done
So, this is what I really need: iterate through every year (2019 2020 2021 2022), starting with 2019, and every month (01 through 12), starting with 01; get the files grouped by month and year; tar each group; and then keep iterating through the next year, i.e. 2020.
For example:
/usr/bin/find ! -newermt "feb 29 2019" -newermt "jan 1 2019" -exec mv {} /data/historical/2019 ; && /usr/bin/tar -czf /data/historical/file.tar.gz /data/historical/2019
I have tried changing the variables and playing with the iteration, for loops, and nested for loops. The directories 2019/{jan...dec} are created, but the files I want to find and group by month and year are not there.
#EDIT
To help you understand better:
My enormous file is /data/backupfile
It contains files from 2019-2022
I want to group those files by year/month; that's why I'm trying to create directories such as 2019/Jan and move the jan-2019, feb-2019, etc. files there from /data/backupfile.
I've been trying to do that using nested loops. Maybe there's a better solution?
CodePudding user response:
If you have GNU findutils installed this should work fine:
cd /data/backupfile
find -type f -printf '%p/%TY/%Tb\0' |
xargs -0 sh -c '
for args; do
src=${args%/*/*}
dst=/data/historical${args#"$src"}
echo mkdir -p "$dst" &&
echo mv "$src" "$dst"
done' sh
Remove both echos if you're happy with the output.
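Once the files have been moved into /data/historical/YEAR/MONTH, the tar step from the question could be handled per directory. The following is only a sketch: the function name archive_months and the YEAR-MONTH.tar.gz naming are assumptions made for illustration, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: archive every YEAR/MONTH directory under ROOT into
# ROOT/YEAR-MONTH.tar.gz, deleting the directory only if tar succeeded.
archive_months() {
    local root=$1 dir year month
    for dir in "$root"/*/*/; do
        [ -d "$dir" ] || continue        # the glob matched nothing
        dir=${dir%/}                     # strip the trailing slash
        month=${dir##*/}                 # last path component, e.g. Jan
        year=${dir%/*}
        year=${year##*/}                 # parent component, e.g. 2019
        # -C keeps the paths inside the archive relative (YEAR/MONTH/...).
        tar -czf "$root/$year-$month.tar.gz" -C "$root" "$year/$month" &&
            rm -rf -- "$dir"
    done
}

# Usage, with the path assumed from the question:
# archive_months /data/historical
```

Removing the directory only after tar returns success avoids losing files if the archive could not be written, for example because the disk is full.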
CodePudding user response:
Okay, so you need to relearn how globbing and word splitting work. They are somewhat counterintuitive in bash, so I can understand the difficulty. To answer your question:
years="$(echo {2019..2022})"
Is how you want to get the years.
For months reference this: https://unix.stackexchange.com/questions/480806/how-to-generate-list-of-months-using-bash
months="$(locale mon | tr ';' ' ')"
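If numeric months are preferred, as in the question's second attempt, a zero-padded list can be built in a similar way. This is a small sketch, assuming the lists are later iterated unquoted with the default IFS:

```shell
#!/bin/bash
# Brace expansion yields one word per year.
years=$(echo {2019..2022})
# printf reuses its format for each argument, giving 01 02 ... 12.
months=$(printf '%02d ' {1..12})
months=${months% }      # drop the trailing space
echo "$years"
echo "$months"
```

Unlike the comma-separated strings in the question, these space-separated values split into separate words when expanded unquoted in a for loop.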
Your for loop should also be changed:
#ENSURE IFS IS PROPERLY SET
IFS=$' \t\n'
for year in ${years}
do
for month in ${months}
do
# Find statement here.
done
done
Your actual logic within the nested loops is confusing. To find files, do this:
files="$(grep -i "${year}-${month}" "/data/backupfile")"
After that, I don't understand what you want.
If you want to tar the files you found do this:
(IFS=$'\n'; tar -czvf "${year}-${month}.tar.gz" ${files[*]})
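If the goal is to select files by modification time rather than by name, the nested loops could instead drive find directly. The sketch below assumes GNU find, GNU date and GNU tar; the function names files_in_month and archive_by_month are hypothetical, and the month boundary is computed with date so that no month length has to be hard-coded.

```shell
#!/bin/bash
# Hypothetical helper: print, NUL-separated, the files under SRC modified
# after the start of YEAR-MONTH and not after the start of the next month.
files_in_month() {
    local src=$1 year=$2 month=$3 start next
    start="$year-$month-01"
    next=$(date -d "$start +1 month" +%Y-%m-%d)   # first day of the next month
    find "$src" -type f -newermt "$start" ! -newermt "$next" -print0
}

# Hypothetical driver: one archive per month in DST, skipping empty months.
archive_by_month() {
    local src=$1 dst=$2 year month list
    list=$(mktemp)
    for year in 2019 2020 2021 2022; do
        for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
            files_in_month "$src" "$year" "$month" > "$list"
            # --null -T reads the NUL-separated file names from the list.
            [ -s "$list" ] && tar -czf "$dst/$year-$month.tar.gz" --null -T "$list"
        done
    done
    rm -f "$list"
}

# Usage, with paths assumed from the question:
# archive_by_month /data/backupfile /data/historical
```

This sketch leaves the original files in place; GNU tar's --remove-files option is one way to delete them after they have been archived.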