UNIX command - clear files based on the date logic

Time:09-17

--Newbie to UNIX--

I am looking for a UNIX command to remove some of the files in the directory based on date logic. Sample file structure:

Aug 30 01:30 Test20210830.ctl
Aug 30 01:30 Test20210830.txt
Aug  3 01:30 Test20210803.ctl
Aug  3 01:30 Test20210803.txt
Aug  2 01:30 Test20210802.ctl
Aug  2 01:30 Test20210802.txt
Aug  1 01:30 Test20210801.ctl
Aug  1 01:30 Test20210801.txt
Jul  1 01:30 Test20210701.ctl
Jul  1 01:30 Test20210701.txt
Jun 16 01:30 Test20210616.ctl
Jun 16 01:30 Test20210616.txt
Jun 15 01:30 Test20210615.ctl
Jun 15 01:30 Test20210615.txt

ls -ltr Test* | grep "$(date +%b)" | head -2 -- this command gives the top 2 files that I want to keep (the first 2 files of each month). I want to remove the rest of the files from the same month (so the 2 files kept for July and for June still have to be there). There is a job that runs at the end of each month so..

What is the best approach to remove the files and keep the files I want?

CodePudding user response:

If you wanted to keep only the first two lexicographically-sorted filenames from each month, you could use a simple loop and shell filename expansion:

for year in 2021
do
  for month in {01..12}
  do
    set -- Test${year}${month}*
    if [ "$#" -gt 2 ]
    then
      shift 2
      echo would rm -- "$@"
    fi
  done
done

Note that this would remove Test20210616.ctl and Test20210616.txt as it's specifically shifting off the first two filenames of each month.

The core of this script is the set and shift 2 steps. The set -- Test${year}${month}* line expands the year and month variables and appends the * wildcard, so it matches every filename that starts with "Test" followed by the loop's current year and month. Once those filenames are expanded -- or not! -- they are available as the positional parameters, "$@". Note that if there are no matches, there will be one value in the list -- the unexpanded "Test"... wildcard pattern. This would be a special case to watch out for if we needed to know whether there was exactly one real filename match, but here we're only interested in the situation where there are more than two matches. The resulting filenames, if any, are stored in lexicographic order.
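The no-match special case mentioned above can be guarded against by checking that the first positional parameter names an existing file. The following is only a sketch: count_matches is a hypothetical helper that is not part of the answer's script, and the demo runs in a scratch directory created with mktemp so that no real files are touched.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): prints how many existing
# files start with the given prefix.
count_matches() {
  set -- "$1"*
  # With no match the pattern is left as a literal string,
  # so check that the first "match" really exists.
  if [ -e "$1" ]; then
    echo "$#"
  else
    echo 0
  fi
}

# Demo in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/Test20210701.ctl" "$demo_dir/Test20210701.txt"

count_matches "$demo_dir/Test202107"   # two files match
count_matches "$demo_dir/Test202109"   # nothing matches

rm -rf "$demo_dir"
```

The loop in the answer does not need this guard, because a single unexpanded pattern can never satisfy the "more than two" test.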

Once the filenames are in $@, the shift 2 simply pops the first two elements off the front of the list -- here, the first two filenames (sorted lexicographically). What remains is one or more filenames that are candidates to be removed.
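To see the effect of shift 2 in isolation, here is a small sketch that passes filenames as plain arguments instead of expanding them from a directory. after_first_two is a hypothetical name used only for this illustration; the guard mirrors the "more than two" test from the loop, since shifting more parameters than exist is an error.

```shell
#!/bin/sh
# Hypothetical helper: prints every argument after the first two,
# one per line, and prints nothing when there are two or fewer.
after_first_two() {
  if [ "$#" -gt 2 ]; then
    shift 2
    printf '%s\n' "$@"
  fi
}

# Four August names: the last two are the removal candidates.
after_first_two Test20210801.ctl Test20210801.txt Test20210802.ctl Test20210802.txt

# Only two July names: nothing is printed.
after_first_two Test20210701.ctl Test20210701.txt
```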

If the results look correct, remove the "echo would" prefix so that rm actually runs.

CodePudding user response:

I have put your provided input in a file called input.

This code is safe to run as it doesn't actually remove the file but you can see the output for yourself. It tells you which file would be removed and which would stay with some self-explanatory lines at the end of each section.

It simply counts how many times it has seen the month so far; if that is more than two, it outputs removing file. Otherwise it prints file stays!.

Output attached below.

lastmon=""
counter=0

# replace "cat input" with your "ls -l" or so
cat input | grep -Eo '\S+$' | sort -r | # extract filenames (last field of each line)
  for filename in $(cat)
  do
    # extract pieces from filename
    echo "filename: $filename"
    echo -n "date: "
    echo "$filename" | grep -Eo '[0-9]+'
    echo -n "month: "
    echo "$filename" | grep -Eo '[0-9]+' | cut -b5-6

    # compare month to last month
    mon="$(echo "$filename" | grep -Eo '[0-9]+' | cut -b5-6)"
    if [ "x$mon" != "x$lastmon" ]
    then counter=0
    fi
    lastmon="$mon"

    # keep counting
    counter=$((counter + 1))
    echo counter: $counter

    # if more than two in a row decide to remove file
    if [ $counter -gt 2 ]
    then echo result: removing file
    else echo result: file stays!
    fi
    
    echo # print empty line
  done

Output

filename: Test20210830.txt
date: 20210830                                                         
month: 08                                                              
counter: 1                                                             
file stays!                                                            
                                                                       
filename: Test20210830.ctl                                             
date: 20210830                                                         
month: 08                                                              
counter: 2                                                             
file stays!                                                            
                                                                       
filename: Test20210803.txt                                             
date: 20210803                                                         
month: 08                                                              
counter: 3                                                             
removing file                                                          
                                                                       
filename: Test20210803.ctl                                             
date: 20210803                                                         
month: 08                                                              
counter: 4                                                             
removing file                                                          
                                                                       
filename: Test20210802.txt                                             
date: 20210802                                                         
month: 08                                                              
counter: 5                                                             
removing file                                                          
                                                                       
filename: Test20210802.ctl                                             
date: 20210802                                                         
month: 08                                                              
counter: 6                                                             
removing file                                                          
                                                                       
filename: Test20210801.txt                                             
date: 20210801                                                         
month: 08                                                              
counter: 7                                                             
removing file                                                          

filename: Test20210801.ctl                                             
date: 20210801                                                         
month: 08                                                              
counter: 8                                                             
removing file       

filename: Test20210701.txt                                             
date: 20210701                                                         
month: 07                                                              
counter: 1                                                             
file stays!                                                            

filename: Test20210701.ctl                                             
date: 20210701                                                         
month: 07                                                              
counter: 2                                                             
file stays!                                                            

filename: Test20210616.txt                                             
date: 20210616                                                         
month: 06                                                              
counter: 1                                                             
file stays!                                                            

filename: Test20210616.ctl                                             
date: 20210616                                                         
month: 06                                                              
counter: 2                                                             
file stays!                                                            

filename: Test20210615.txt                                             
date: 20210615                                                         
month: 06                                                              
counter: 3                                                             
removing file                                                          

filename: Test20210615.ctl                                             
date: 20210615                                                         
month: 06                                                              
counter: 4                                                             
removing file                                                          
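
The counting logic above can also be written as a filter that prints only the names that would be removed. This is a sketch under stated assumptions, not the answer's own code: list_extras is a hypothetical name, the filenames are assumed to have the form Test followed by YYYYMMDD, and the input is assumed to arrive one name per line, already sorted so that the files to keep come first within each month. Unlike the script above, it compares year and month together, so the same month in different years is counted separately.

```shell
#!/bin/sh
# Hypothetical helper: reads filenames such as Test20210830.txt on stdin
# and prints every name beyond the first two of each year-month.
list_extras() {
  last=""
  count=0
  while IFS= read -r name; do
    digits=${name#Test}                       # drop the assumed "Test" prefix
    ym=$(printf '%s' "$digits" | cut -c1-6)   # YYYYMM
    if [ "$ym" != "$last" ]; then
      count=0                                 # new month: restart the count
      last=$ym
    fi
    count=$((count + 1))
    if [ "$count" -gt 2 ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Only the third August name is printed.
printf '%s\n' Test20210830.txt Test20210830.ctl Test20210803.txt \
  Test20210701.txt Test20210701.ctl | list_extras
```

The printed names can be reviewed first and passed to rm only once the list looks right.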