I have a list of file names like this:
["TYBN-220422-257172171.txt", "TYBN-120522-257172174.txt", "TYBN-320422-657172171.txt", "TYBN-220622-237172174.txt", "TYBN-FRTRE-FFF.txt",....]
I want to keep only the files whose names follow the format TYBN-220422-257172171.txt, i.e.
valid = "TYBN-{}-{}".format(numericvalue, numericvalue)
Only files of this type should remain in the list.
CodePudding user response:
Regex explanation:
- ^ start of the string
- $ end of the string
- \d matches any digit (for ASCII text, equivalent to [0-9])
- + one or more of the preceding expression
import re

files = ["TYBN-220422-257172171.txt", "TYBN-120522-257172174.txt"]
# Raw string so the backslashes reach the regex engine unchanged
pattern = re.compile(r"^TYBN-\d+-\d+\.txt$")
for f in files:
    if pattern.match(f):
        print(f + " matched naming convention.")
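The tokens listed above can also be written out one per line. The following is a minimal sketch using re.VERBOSE, which ignores whitespace inside the pattern and allows comments; the helper name is_valid_name is hypothetical and not part of the original answer.

```python
import re

# re.VERBOSE lets each token of the pattern carry its own comment.
PATTERN = re.compile(
    r"""
    ^        # start of the string
    TYBN-    # literal prefix
    \d+      # one or more digits
    -        # literal hyphen
    \d+      # one or more digits
    \.txt    # literal ".txt" (the dot is escaped)
    $        # end of the string
    """,
    re.VERBOSE,
)


def is_valid_name(name):
    """Return True if name looks like TYBN-<digits>-<digits>.txt (hypothetical helper)."""
    return PATTERN.match(name) is not None


print(is_valid_name("TYBN-220422-257172171.txt"))
print(is_valid_name("TYBN-FRTRE-FFF.txt"))
```

The first call should report a match and the second should not, since FRTRE and FFF are not digits.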
CodePudding user response:
This is probably most easily done using a regex to match the desired format, i.e.
TYBN-\d+-\d+\.txt$
which looks for a name starting with the characters TYBN-, followed by one or more digits (\d+), a -, some more digits, and then finishing with .txt.
Note that when using re.match (as in the code below), matches are automatically anchored to the start of the string, and thus a leading ^ (start-of-string anchor) is not required in the regex.
In Python:
import re

filelist = ["TYBN-220422-257172171.txt",
            "TYBN-120522-257172174.txt",
            "TYBN-320422-657172171.txt",
            "TYBN-220622-237172174.txt",
            "TYBN-FRTRE-FFF.txt"
            ]
# No leading ^ needed: re.match only tries to match at the start of the string
regex = re.compile(r'TYBN-\d+-\d+\.txt$')
valid = [file for file in filelist if regex.match(file)]
Output:
[
'TYBN-220422-257172171.txt',
'TYBN-120522-257172174.txt',
'TYBN-320422-657172171.txt',
'TYBN-220622-237172174.txt'
]
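To illustrate the anchoring note in this answer, here is a small sketch comparing match, search and fullmatch. The "old_" prefix and the variable names are made up for the demonstration and are not part of the original answer.

```python
import re

regex = re.compile(r'TYBN-\d+-\d+\.txt$')

name = "TYBN-220422-257172171.txt"
prefixed = "old_" + name  # hypothetical name with extra leading text

# match() only tries at position 0, so the prefixed name is rejected
# even though the pattern has no leading ^.
match_plain = regex.match(name) is not None
match_prefixed = regex.match(prefixed) is not None

# search() scans the whole string, so with search() a leading ^ would be needed.
search_prefixed = regex.search(prefixed) is not None

# $ also matches just before a trailing newline; fullmatch() must consume
# the entire string, so it rejects the trailing newline.
body = re.compile(r'TYBN-\d+-\d+\.txt')
dollar_newline = regex.match(name + "\n") is not None
full_newline = body.fullmatch(name + "\n") is not None

print(match_plain, match_prefixed, search_prefixed, dollar_newline, full_newline)
```

If the names might carry stray trailing newlines (for example when read from a text file), fullmatch without the $ may be the stricter choice.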
CodePudding user response:
Try this one.
lst = ["TYBN-220422-257172171.txt", "TYBN-120522-257172174.txt", "TYBN-320422-657172171.txt", "TYBN-220622-237172174.txt", "TYBN-FRTRE-FFF.txt"]
valid_format = ['TYBN', True, True]  # True marks a field that must be all digits
valid = []
for a in lst:
    # Drop the extension, then split the name into its hyphen-separated fields
    parts = a.replace('.txt', '').split('-')
    if parts[0] == valid_format[0]:
        # The comparison also fails if the number of fields is not exactly two
        if [i.isdigit() for i in parts[1:]] == valid_format[1:]:
            valid.append(a)
print(valid)
OUTPUT:
['TYBN-220422-257172171.txt',
'TYBN-120522-257172174.txt',
'TYBN-320422-657172171.txt',
'TYBN-220622-237172174.txt']
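One caveat with the replace-based approach: it does not require the name to actually end in .txt, so a name such as "TYBN-1-2" would be accepted. A possible variant that checks the suffix explicitly is sketched below; the function name has_valid_format and its default arguments are assumptions made for this example.

```python
def has_valid_format(name, prefix="TYBN", suffix=".txt"):
    """Return True for names shaped like <prefix>-<digits>-<digits><suffix>."""
    # Require the extension instead of silently stripping it.
    if not name.endswith(suffix):
        return False
    parts = name[:-len(suffix)].split("-")
    # Exactly three fields: the prefix and two all-digit fields.
    return (
        len(parts) == 3
        and parts[0] == prefix
        and all(p.isdigit() for p in parts[1:])
    )


lst = ["TYBN-220422-257172171.txt", "TYBN-120522-257172174.txt",
       "TYBN-320422-657172171.txt", "TYBN-220622-237172174.txt",
       "TYBN-FRTRE-FFF.txt"]
valid = [name for name in lst if has_valid_format(name)]
print(valid)
```

This keeps the same four numeric names as the output above while rejecting names that lack the extension.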