Home > Net >  Python: apply a range to wildcard when filtering or searching a str list (need to add any str list i
Python: apply a range to wildcard when filtering or searching a str list (need to add any str list i

Time:11-02

If I have a list of Windows file paths (strings), how would I search for all list objects that have a consecutive 10-digit number in the file path --to add to a list?

Is there a way to define a range of wildcard characters and search or apply a filter?

example:

from this list:

('C:\Users\ Documents\1H_1P_42497372610000\Kirkbride A1P_42497586550009\Well History.tif',
'C:\Users\ Documents\TEMPORARY\WISE\30497372610000\Kirkbride _42478972610009\ Drilling\Proposals.pdf',
'C:\Users\ Documents\Well History\Drilling\Proposals\Cement\Pilot hole KO plug\ Test Results.txt')

this would be my new list (or dataframe):

('C:\Users\ Documents\1H_1P_42497372610000\Kirkbride A1P_42497586550009\Well History.tif',
'C:\Users\ Documents\TEMPORARY\WISE\30497372610000\Kirkbride _42478972610009\ Drilling\Proposals.pdf')

I attempted a few tries with the glob() function and tried to piece together a filter with conditions where I defined a variable 'x' = ('1', '2', '3' . . .) and filtered items where 'x' 'x' 'x' 'x' 'x' 'x' 'x' 'x' 'x' 'x' didn't occur. I just couldn't come close to piecing together anything that made sense, or that wasn't searching for integers (which won't work).

Help me! Please and thank you!

CodePudding user response:

You can use regex to find strings with 10 consecutive numbers:

In [63]: [i for i in strings if len(re.findall('\d{10}',re.escape(i)))>0]
Out[63]: 
['C:\\Users\\ Documents\\1H_1P_42497372610000\\Kirkbride A1P_42497586550009\\Well History.tif',
 'C:\\Users\\ Documents\\TEMPORARY\\WISE\\30497372610000\\Kirkbride _42478972610009\\ Drilling\\Proposals.pdf']

You might not need the re.escape call, I had to on linux because of the escape characters, which explains the double backslashes '\\'.

  • Related