Remove Files Based On Partial Match In Unix

Time:09-02

Let's say I have a file named duplicates.txt that looks like this:

ID-32532
ID-78313
ID-89315

I also have a directory Fastq of files with the following names:

ID-18389_Feb92003_R1.fastq
ID-18389_Feb92003_R2.fastq
ID-32532_Feb142003_R1.fastq
ID-32532_Feb142003_R2.fastq
ID-48247_Mar202004_R1.fastq
ID-48247_Mar202004_R2.fastq

I want a command that reads duplicates.txt, finds every file in the Fastq directory whose name partially matches one of the listed IDs, and removes it. In the example above, this would remove ID-32532_Feb142003_R1.fastq and ID-32532_Feb142003_R2.fastq.

What Unix command should I use? If need be, I could write a script in Python.

CodePudding user response:

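In a Unix shell, replace the variable characters with glob wildcards: '?' matches exactly one character and '*' matches any run of characters. ('.*' is regular-expression syntax, not a shell glob.)

duplicates.txt

each entry has the form ID-?????

Fastq

rm Fastq/ID-?????_*_R?.fastq
rm Fastq/ID-*.fastq

Note that these patterns match every ID. To delete only the files for the IDs listed in duplicates.txt, put each listed ID in place of the ????? part.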
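
To see how '?' and '*' behave on the file names from the question, here is a small sketch using a shell case statement, which applies the same glob rules as file name expansion. The function name classify is made up for illustration. It shows that a pattern with exactly eight '?' for the date part matches Feb92003 but not Feb142003, whereas '*' matches both.

```shell
# Hypothetical helper: report which glob pattern a file name matches.
classify() {
  case $1 in
    # Five characters for the ID, exactly eight for the date part.
    ID-?????_????????_??.fastq) echo "fixed-width" ;;
    # Five characters for the ID, then anything up to .fastq.
    ID-?????_*.fastq) echo "star" ;;
    *) echo "no match" ;;
  esac
}

classify ID-18389_Feb92003_R1.fastq    # date part is 8 characters
classify ID-32532_Feb142003_R1.fastq   # date part is 9 characters
```

The first call prints fixed-width and the second prints star, so a '*' is the safer choice when the date part varies in length.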

CodePudding user response:

Here's a little bash function to do it:

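lrmduplicates(){

  # Read one ID per line; the || test also handles a last line
  # that has no trailing line feed.
  while IFS= read -r dupe || [ -n "$dupe" ];
  do
    # Skip blank lines: an empty ID would expand to Fastq/* and remove everything.
    [ -n "$dupe" ] || continue

    echo removing "$dupe" ;

    #fine tune with ls first...
    #ls "Fastq/$dupe"*

    rm -- "Fastq/$dupe"*

  done < duplicates.txt

}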


As a bonus, it would be nice to suppress the error when nothing matches, but I'm not sure how to do that myself. Neither rm -f nor rm 2>/dev/null did it (zsh on macOS).
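
One likely reason rm -f does not help: in zsh, a glob that matches nothing is reported by the shell itself, before rm ever runs. A possible workaround, sketched below, is to let find do the matching instead of the shell. The function name remove_listed_ids and its two optional arguments are made up for illustration, and the sketch assumes each ID is followed by an underscore in the file name and contains no glob characters. It relies on -maxdepth, which both GNU and BSD (macOS) find support.

```shell
# Hypothetical helper: delete files in a directory whose names start with
# an ID listed in a file, followed by an underscore.
# Usage: remove_listed_ids [id_file] [dir]
remove_listed_ids() {
  id_file=${1:-duplicates.txt}
  dir=${2:-Fastq}

  while IFS= read -r id || [ -n "$id" ]; do
    # Skip blank lines so an empty ID can never match every file.
    [ -n "$id" ] || continue

    # The pattern is quoted, so find (not the shell) does the matching.
    # When nothing matches, find removes nothing and prints no error.
    find "$dir" -maxdepth 1 -type f -name "${id}_*" -exec rm -- {} +
  done < "$id_file"
}
```

Call it as remove_listed_ids duplicates.txt Fastq. Replacing -exec rm -- {} + with -print turns it into a dry run that only lists what would be removed.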
