Using regular expression in lftp to ignore some strings from file name

Time:12-09

I need to get specific files with names like abc_yyyymmdd_hhmmss.csv from a directory using mget. Example files in the folder:

abc_20221202_145911.csv
abc_20221202_145921.csv
abc_20221202_145941.csv
abc_20181202_145941.csv

But I want to ignore the hhmmss part: I want to get all files matching abc_20221202_*.csv

How do I include * in mget? My code is below:

File=abc_
Date=20221202
Filename=$File$Date"_*".csv
# Assume I have an sftp connection established and I am in the directory
# where files with the above naming convention are present, since I can
# download a file when hardcoding its exact name during testing.
conn=`lftp $protocol://$user:$password@$sftp_server -p $port <<EOF>/error.log
cd $path
mget $Filename
EOF`

The script is able to find the file but is not able to retrieve it from the server. However, if I remove the * and provide the entire file name abc_20221202_145941.csv, it downloads the file. Why is the * causing an issue in retrieving the file?
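One thing worth checking is what the command on the other end of the here-document actually receives. The following is a minimal sketch, assuming a POSIX shell, with cat standing in for lftp so that it runs without a server. The shell expands variables inside an unquoted here-document but does not perform filename globbing on its contents, so the * should arrive as a literal character.

```shell
#!/bin/sh
# cat stands in for lftp here (an assumption for illustration only).
File=abc_
Date=20221202
Filename=$File$Date"_*".csv

# Variables are expanded inside an unquoted here-document,
# but pathname expansion (globbing) is not applied to its contents.
sent=$(cat <<EOF
mget $Filename
EOF
)

# Show the exact command line that would be passed on.
printf '%s\n' "$sent"
```

If the printed line contains the pattern unchanged, any wildcard expansion is left to the receiving program rather than to the local shell.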

CodePudding user response:

Assuming mget actually accepts regex:

Currently your regexp is looking for files that match abc_20221202 followed by an underscore repeated any number of times, then any single character, then csv.

Just add a . before the * so that it matches any character, any number of times, after the underscore and before the .csv

Like so:

Filename=$File$Date"_.*".csv
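The two pattern syntaxes can be compared locally before trying them against the server. The sketch below is only an illustration: it uses the shell's own case matching for wildcards and grep -E for regular expressions as stand-ins, on the assumption that mget expands shell-style wildcards (which is how the lftp manual describes it) rather than regular expressions. The helper names count_glob and count_regex are hypothetical.

```shell
#!/bin/sh
# File names taken from the question; helper names are hypothetical.
files="abc_20221202_145911.csv abc_20221202_145921.csv abc_20221202_145941.csv abc_20181202_145941.csv"

# Count names matching a shell-style wildcard (glob) pattern.
count_glob() {
    n=0
    for f in $files; do
        # The unquoted $1 is treated as a glob pattern by case.
        case $f in
            $1) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Count names matching an extended regular expression, anchored at both ends.
count_regex() {
    n=0
    for f in $files; do
        if printf '%s\n' "$f" | grep -Eq "^$1\$"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

glob_star=$(count_glob 'abc_20221202_*.csv')         # glob: * is any run of characters
regex_star=$(count_regex 'abc_20221202_*.csv')       # regex: _* is zero or more underscores
regex_dotstar=$(count_regex 'abc_20221202_.*\.csv')  # regex: .* is any run of characters
glob_dotstar=$(count_glob 'abc_20221202_.*.csv')     # glob: the . is a literal dot

echo "glob  _*  matches: $glob_star"
echo "regex _*  matches: $regex_star"
echo "regex _.* matches: $regex_dotstar"
echo "glob  _.* matches: $glob_dotstar"
```

Which form is appropriate depends on whether the tool reading the pattern treats it as a wildcard or as a regular expression, so it is worth confirming that in the tool's documentation first.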

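If mget doesn't actually support regex, just use wget instead. Note that -A takes file name suffixes or shell-style wildcard patterns, not regular expressions (there is a separate --accept-regex option for those), so the pattern is written with a plain *:

wget -r -np -nH -A "abc_20221202_*.csv" --ftp-user=user --ftp-password=psd ftp://ip/*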

CodePudding user response:

You probably missed an underscore between File and Date. A good way to debug such problems is to enable debug output (the "debug" command) and command logging (set cmd:trace true).
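As a rough sketch of that debugging setup, the commands could be generated like this. The function name build_lftp_script, the directory /some/dir and the debug level 5 are placeholders chosen for illustration, not values from the question. The glob echo line is there to print what the pattern expands to on the server before mget runs.

```shell
#!/bin/sh
# Sketch only: print an lftp command script with debugging switched on.
# build_lftp_script, remote_dir and pattern are hypothetical names.
build_lftp_script() {
    remote_dir=$1
    pattern=$2
    # The here-document expands the variables but does not glob the pattern.
    cat <<EOF
debug 5
set cmd:trace true
cd $remote_dir
glob echo $pattern
mget $pattern
EOF
}

script=$(build_lftp_script /some/dir 'abc_20221202_*.csv')
printf '%s\n' "$script"
```

The printed commands could then be fed to lftp on standard input, in the same way as the here-document in the question, and the trace output compared against what was expected.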
