How to avoid repeating adding elements to a list?

Time:01-16

I add file paths to the groups_of_files list, but because of the nested loops the list ends up with many identical elements. How can I add each file only once despite the loops?

for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in groups_of_format.values():
            groups_of_files[groups_of_format.keys()].append(file)

CodePudding user response:

Use sets instead of lists. A set keeps its elements unique by hashing them.

Something like:

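from collections import defaultdict

groups_of_files = defaultdict(set)
for file in files_names:
  for name_group, formats in groups_of_format.items():
    # Compare against this group's formats, not the dict's values() view
    if file.split('.')[-1].upper() in formats:
      # Index by the current group name; a dict_keys view is unhashable
      groups_of_files[name_group].add(file)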

I assumed that groups_of_files is a dictionary. With defaultdict(set), accessing a missing key does not raise a KeyError; the key is created with an empty set as its value, and you can add your file to it. If file is of a custom type, make sure it defines the __hash__ and __eq__ methods.
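To illustrate both points, here is a minimal sketch. The FileEntry class is hypothetical (it does not appear in the question) and only stands in for "a custom type"; the group name "TEXT" is likewise made up.

```python
from collections import defaultdict


class FileEntry:
    """Hypothetical wrapper around a file path (not from the original post)."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        # Two entries are equal when they refer to the same path.
        return isinstance(other, FileEntry) and self.path == other.path

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.path)


groups_of_files = defaultdict(set)

# Accessing a missing key creates an empty set instead of raising KeyError.
groups_of_files["TEXT"].add(FileEntry("a.txt"))
groups_of_files["TEXT"].add(FileEntry("a.txt"))  # duplicate, ignored by the set

print(len(groups_of_files["TEXT"]))  # prints 1
```

Without __eq__ and __hash__, the two FileEntry objects would be treated as different elements and both would end up in the set.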

If you need a list in the end, convert the set by passing it to list().
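A small sketch of that conversion, using made-up sample data. Since a set has no defined order, sorted() may be preferable when you want a predictable result:

```python
unique_files = {"b.txt", "a.txt", "c.py"}  # sample data, not from the question

as_list = list(unique_files)      # a list, but in arbitrary order
as_sorted = sorted(unique_files)  # also a list, in sorted order

print(as_sorted)  # ['a.txt', 'b.txt', 'c.py']
```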

CodePudding user response:

You can use a set to keep track of the files that have already been added to the lists in groups_of_files.

added_files = set()
for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split('.')[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)
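For a runnable version of the same idea, the sketch below fills in sample inputs. The contents of groups_of_format and files_names are assumptions for illustration; the question does not show them.

```python
# Sample inputs: assumptions for illustration only.
groups_of_format = {"TEXT": ["TXT", "MD"], "CODE": ["PY"]}
files_names = ["a.txt", "b.py", "a.txt", "c.md", "b.py"]

# Start every group with an empty list so that append() always works.
groups_of_files = {name_group: [] for name_group in groups_of_format}
added_files = set()

for file in files_names:
    for name_group, formats in groups_of_format.items():
        if file.split(".")[-1].upper() in formats and file not in added_files:
            groups_of_files[name_group].append(file)
            added_files.add(file)

print(groups_of_files)
# {'TEXT': ['a.txt', 'c.md'], 'CODE': ['b.py']}
```

Note that this keeps the original order of the files, and that a file is added to at most one group, even if its extension were listed in several groups.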

CodePudding user response:

Build a dictionary keyed on the filename extensions. Each associated value should be a set.

Subsequently, build the required dictionary by converting the sets to lists as follows:

import os

temp = dict()

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

for file in files_names:
    _, ext = os.path.splitext(file)
    temp.setdefault(ext.upper()[1:], set()).add(file)

groups_of_files = {k: list(v) for k, v in temp.items()}

print(groups_of_files)

Output (the order within each list may vary, because sets are unordered):

{'TXT': ['e.txt', 'b.txt', 'a.txt'], 'PY': ['c.py', 'f.py']}
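If the order of the files matters, one possible variant (a different technique from the set-based code above, not part of the original answer) is to use an inner dictionary as an insertion-ordered set. This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
import os

files_names = ['a.txt', 'b.txt', 'b.txt', 'c.py', 'e.txt', 'f.py']

temp = {}
for file in files_names:
    _, ext = os.path.splitext(file)
    # The inner dict's keys act as an insertion-ordered set; values are unused.
    temp.setdefault(ext.upper()[1:], {})[file] = None

groups_of_files = {k: list(v) for k, v in temp.items()}

print(groups_of_files)
# {'TXT': ['a.txt', 'b.txt', 'e.txt'], 'PY': ['c.py', 'f.py']}
```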