List (find) files with repeated pattern in their name

Time:11-11

My question is simple, but I have not found a solution yet. I need to find files whose names contain the same date string twice. For example:

20190101_fl_20190101.nc
20190101_fl_20190104.nc
20190102_fl_20190102.nc
20190102_fl_20190104.nc

I need to find 20190101_fl_20190101.nc and 20190102_fl_20190102.nc.

I have tried

ls 20190[0-9][0-9][0-9]_fl_20190[0-9][0-9][0-9].nc

But, as expected, it lists all four files, because the two glob patterns match independently and nothing forces the second date to equal the first.

Any help would be highly appreciated.

CodePudding user response:

With GNU find, you can use

find . -type f -regextype posix-extended -regex '.*/(20190[0-9]{3})_fl_\1\.nc$'

The regex matches

  • .*/ - any chars up to the rightmost / (necessary because the pattern used with find requires a full string match)
  • (20190[0-9]{3}) - Group 1: 20190 followed by any three digits
  • _fl_ - a fixed substring
  • \1 - backreference to Group 1 value
  • \.nc - a literal .nc string (the dot is escaped so it does not match any character)
  • $ - end of input.

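The -regextype posix-extended option is necessary because the pattern is written in POSIX ERE syntax, with unescaped parentheses for grouping and a {3} interval. GNU find's default regex type is emacs, in which unescaped parentheses are literal characters. Note that the \1 backreference inside an ERE is a GNU extension rather than part of POSIX ERE itself.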
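
If GNU find is not available, the same check can be sketched as a plain POSIX shell loop that splits each name at _fl_ and compares the two halves. This is only an illustration, not part of the original answer: the function name list_repeated_date is hypothetical, and the sketch assumes the files sit directly in one directory and that the date part consists of digits only.

```shell
# list_repeated_date DIR: print the names of files in DIR (default: .)
# that look like <date>_fl_<date>.nc with the same date on both sides.
# The function name is a hypothetical helper chosen for this sketch.
list_repeated_date() {
  dir=${1:-.}
  for path in "$dir"/*_fl_*.nc; do
    [ -e "$path" ] || continue     # the glob matched nothing
    name=${path##*/}               # strip the directory part
    first=${name%%_fl_*}           # text before the first _fl_
    second=${name#*_fl_}           # text after the first _fl_
    second=${second%.nc}           # drop the .nc extension
    case $first in
      ''|*[!0-9]*) continue ;;     # assumption: the date is digits only
    esac
    if [ "$first" = "$second" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running list_repeated_date . in the directory from the question should print only 20190101_fl_20190101.nc and 20190102_fl_20190102.nc. Unlike the find command, this does not descend into subdirectories.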
