I have the following code that tries to retrieve the name of a file from a directory based on a double \ character:
import re
string = 'I:/Etrmtest/PZMALIo4/ETRM841_FX_Deals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'
pattern = r'(?<=*\\\\)*'
re.findall(pattern,string)
The reasoning is that the name of the file always comes after a double \ , so I try to match any string that is preceded by text ending in \ .
Nevertheless, when I run this code I get the following error:
error: nothing to repeat at position 4
What am I doing wrong?
Edit: The concrete output I am looking for is the string 'abo_st_gas_dtd.csv' as a match.
CodePudding user response:
There are a couple of things going on:
- You need to declare your string using the same r'string' notation as for the pattern; right now your string only has a single backslash, since the first of the two backslashes escapes the second.
- I'm not sure you're using * correctly. It means "repeat the immediately preceding element zero or more times", not "any string" (as in the usual shell patterns). The first * in the parentheses has nothing preceding it, so the regex is invalid; hence the error you see. I think what you want is .* , i.e., any character repeated 0 or more times. Furthermore, it is not needed inside the parentheses. A more correct regexp would be r'(?<=\\\\).*' :
import re
string = r'I:/Etrmtest/PZMALIo4/ETRM841_FX_Deals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'
pattern = r'(?<=\\\\).*'
re.findall(pattern,string)
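To make both points concrete, here is a small sketch. The names `plain`, `raw` and `compiled` are illustrative and not from the question; the literals are shortened to the last path component for readability.

```python
import re

# Hypothetical names: the same literal without and with the r prefix.
plain = 'FO_PRE\\abo_st_gas_dtd.csv'   # '\\' is an escape for ONE backslash
raw = r'FO_PRE\\abo_st_gas_dtd.csv'    # a raw literal keeps BOTH backslashes

print(plain.count('\\'))  # 1
print(raw.count('\\'))    # 2

# A quantifier with nothing in front of it cannot be compiled.
try:
    re.compile(r'(?<=*\\\\)*')
    compiled = True
except re.error as exc:
    compiled = False
    print(exc)

# The corrected pattern requires two literal backslashes,
# so it matches the raw string but not the plain one.
pattern = r'(?<=\\\\).*'
print(re.findall(pattern, raw))    # ['abo_st_gas_dtd.csv']
print(re.findall(pattern, plain))  # []
```

If the real paths contain only a single backslash, the lookbehind would need to be r'(?<=\\)' instead.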
CodePudding user response:
Your pattern is just a lookbehind, which is zero-width and so, by itself, can't match any text. I would use this re.findall approach:
import re
string = 'I:/Etrmtest/PZMALIo4/ETRM841_FX_Deals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'
filename = re.findall(r'\\([^.]+\.\w+)$', string)[0]
print(filename) # abo_st_gas_dtd.csv
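As a rough illustration of why a bare lookbehind is not enough, the sketch below (variable names are hypothetical) shows that it finds a position but returns no text, while adding a consuming part after it returns the file name. It assumes the string contains a single backslash, as the non-raw literal above does.

```python
import re

s = 'I:/Etrmtest/PZMALIo4/ETRM841_FX_Deals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'

# A lookbehind alone is zero-width: it locates the spot after the backslash
# but consumes nothing, so the match is an empty string.
only_lookbehind = re.findall(r'(?<=\\)', s)
print(only_lookbehind)  # ['']

# With a consuming part after the lookbehind, the match is the file name.
with_body = re.findall(r'(?<=\\)[^\\]+$', s)
print(with_body)  # ['abo_st_gas_dtd.csv']
```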
CodePudding user response:
files = 'I:E\\trm.csvest/PZMALIo4\\ETRM841_FX_.csvDeals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'
counter = -1
my_files = []
for f in files:
    counter += 1
    if ord(f) == 92:  # 92 is the code point of '\'
        # everything after the current backslash
        temp = files[counter + 1:len(files)]
        temp_file = ""
        for f1 in temp:
            temp_file += f1
            # if the three characters right after the '.' are "csv", we have a file name
            if f1 == '.' and temp[len(temp_file):len(temp_file) + 3] == "csv":
                my_files.append(temp_file + "csv")
                break
print(my_files)  # ['trm.csv', 'ETRM841_FX_.csv', 'abo_st_gas_dtd.csv']
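For comparison, a regex can produce the same list on this sample input. This is only a sketch, and an assumption on my part rather than an exact equivalent of the loop above: it requires the name between the backslash and ".csv" to contain no further dots or backslashes.

```python
import re

files = 'I:E\\trm.csvest/PZMALIo4\\ETRM841_FX_.csvDeals_Restructuring/FO_PRE\\abo_st_gas_dtd.csv'

# After each backslash, capture a run of characters that are neither a dot
# nor a backslash, followed by the literal ".csv".
csv_names = re.findall(r'\\([^\\.]*\.csv)', files)
print(csv_names)  # ['trm.csv', 'ETRM841_FX_.csv', 'abo_st_gas_dtd.csv']
```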