I add each file path to the groups_of_files list. But because of the nested loops, the list ends up with many identical elements. How can I add each file only once despite the loops?
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in groups_of_format.values():
            groups_of_files[groups_of_format.keys()].append(file)
CodePudding user response:
Use sets instead of lists. Elements in a set are kept unique by means of their hash.
Something like:
from collections import defaultdict

groups_of_files = defaultdict(set)
for file in files_names:
    for name_group, formats in groups_of_format.items():
        # Compare against this group's formats and use its name as the key.
        if file.split('.')[-1].upper() in formats:
            groups_of_files[name_group].add(file)
I assumed that groups_of_files is a dictionary. In the code example, when a key is missing from the dictionary, no exception is raised; instead the entry is created with an empty set as its value, and you can add your file to that set. If file is of a custom type, make sure to define the __hash__ and the __eq__ methods.
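As a minimal sketch of that last point, here is a custom type that can be stored in a set. TrackedFile is a hypothetical class invented for this illustration; it does not appear in the question.

```python
class TrackedFile:
    """Hypothetical wrapper type, used only for illustration."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        if not isinstance(other, TrackedFile):
            return NotImplemented
        return self.path == other.path

    def __hash__(self):
        # Hash the same field that __eq__ compares, so equal objects hash equally.
        return hash(self.path)


unique = {TrackedFile("a.txt"), TrackedFile("a.txt"), TrackedFile("b.py")}
print(len(unique))  # 2
```

Without both methods, the two TrackedFile("a.txt") objects would be treated as different elements, because the default comparison is by identity.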
If you need a list in the end, you can convert the set by passing it to list(). Note that a set does not preserve insertion order.
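The whole approach can be sketched end to end as follows. The contents of groups_of_format and files_names are made-up sample data, since the question does not show the real values, and sorted() is used instead of list() only to get a stable order.

```python
from collections import defaultdict

# Hypothetical sample data; the question does not show the real values.
groups_of_format = {
    "TEXT": ["TXT", "MD"],
    "CODE": ["PY"],
}
files_names = ["a.txt", "b.txt", "b.txt", "c.py", "notes.md", "c.py"]

groups_of_files = defaultdict(set)
for file in files_names:
    extension = file.split(".")[-1].upper()
    for name_group, formats in groups_of_format.items():
        if extension in formats:
            # Adding an element that is already in the set has no effect.
            groups_of_files[name_group].add(file)

# Sets are unordered, so sort while converting to get a stable result.
as_lists = {name: sorted(files) for name, files in groups_of_files.items()}
print(as_lists)
# {'TEXT': ['a.txt', 'b.txt', 'notes.md'], 'CODE': ['c.py']}
```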
CodePudding user response:
You can use a set to keep track of the files that have already been added to the groups_of_files list.
added_files = set()
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)
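Here is a self-contained version of the same idea that can be run as is. The sample data is hypothetical, and groups_of_files is assumed to be a defaultdict(list), which the question does not confirm.

```python
from collections import defaultdict

# Hypothetical sample data; the question does not show the real values.
groups_of_format = {"TEXT": ["TXT", "MD"], "CODE": ["PY"]}
files_names = ["b.txt", "a.txt", "b.txt", "c.py", "c.py"]

groups_of_files = defaultdict(list)  # assumption: a dict of lists
added_files = set()
for file in files_names:
    extension = file.split(".")[-1].upper()
    for name_group, formats in groups_of_format.items():
        if extension in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)

print(dict(groups_of_files))
# {'TEXT': ['b.txt', 'a.txt'], 'CODE': ['c.py']}
```

Because the values stay lists, the files keep the order in which they were first seen. Since added_files is shared across all groups, a file whose extension is listed in several groups ends up only in the first matching one.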
CodePudding user response:
Build a dictionary keyed on the filename extensions, with a set as each associated value.
Then build the required dictionary by converting the sets to lists, as follows:
import os

temp = dict()
files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']
for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], set()).add(file)
groups_of_files = {k: list(v) for k, v in temp.items()}
print(groups_of_files)
Output (the order within each list may vary, because sets are unordered):
{'TXT': ['e.txt', 'b.txt', 'a.txt'], 'PY': ['c.py', 'f.py']}
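If the original order of the files matters, one possible variation (not part of the answer above) is to swap each set for a dict used as an ordered set. This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
import os

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

temp = {}
for file in files_names:
    _, ext = os.path.splitext(file)
    # Inner dict acts as an ordered set: keys are unique and keep insertion order.
    temp.setdefault(ext.upper()[1:], {})[file] = None

groups_of_files = {k: list(v) for k, v in temp.items()}
print(groups_of_files)
# {'TXT': ['a.txt', 'b.txt', 'e.txt'], 'PY': ['c.py', 'f.py']}
```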