I have two lists of lists, shown below, and I need to count how many times each pair of elements occurs, where a pair is one element from any sublist of staffs combined with one element from any sublist of processes.
For example, William_Delta appears 4 times.
The result is to be written to a text file.
processes = [['Iota', 'Gamma', 'Kappa'], ['Delta', 'Zeta', 'Beta'], ['Alpha', 'Zeta'], ['Alpha', 'Epsilon', 'Delta', 'Beta']]
staffs = [['William', 'James', 'Noah', 'Oliver'], ['Benjamin', 'Oliver', 'William'],['Oliver', 'Benjamin']]
list_output = []
# build every staff_process pair, one string per combination
for each_p in processes:
    for p in each_p:
        for each_s in staffs:
            for s in each_s:
                output = s + '_' + p
                list_output.append(output)
uniques = set(list_output)
with open('c:\\temp\\outfile.txt', 'a') as outfile:
    for ox in uniques:
        # list.count rescans the whole list for every unique pair
        outfile.write(ox + '@' + str(list_output.count(ox)) + "\n")
Both processes and staffs are very long, so this takes a long time to run.
Is there a faster way to do it?
Thank you.
CodePudding user response:
Use collections.Counter to count how many times each element appears across the sublists, then use itertools.product to form all the pairs. The total count for a pair is the product of the two individual counts.
For example, "William" appears 2 times and "Delta" appears 2 times, so the total count of the pair "William_Delta" is 4 (2 * 2).
from collections import Counter
from itertools import product
processes = [['Iota', 'Gamma', 'Kappa'], ['Delta', 'Zeta', 'Beta'], ['Alpha', 'Zeta'], ['Alpha', 'Epsilon', 'Delta', 'Beta']]
staffs = [['William', 'James', 'Noah', 'Oliver'], ['Benjamin', 'Oliver', 'William'],['Oliver', 'Benjamin']]
count_staffs = Counter(st for staff in staffs for st in staff)
count_processes = Counter(pr for process in processes for pr in process)
with open('outfile.txt', 'a') as outfile:
    for (staff, cs), (process, cp) in product(count_staffs.items(), count_processes.items()):
        outfile.write(f"{staff}_{process}@{cs * cp}\n")
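This should be faster than building every pair and counting them, because each list is scanned only once and the loop runs once per distinct pair instead of calling list.count on the full list for every unique pair.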
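As a quick sanity check, the sketch below compares the product-of-counts result with a brute-force count of every pair on the question's sample data. The helper names pair_counts_fast and pair_counts_brute are made up for this illustration and are not part of the answer above.

```python
from collections import Counter
from itertools import product

processes = [['Iota', 'Gamma', 'Kappa'], ['Delta', 'Zeta', 'Beta'],
             ['Alpha', 'Zeta'], ['Alpha', 'Epsilon', 'Delta', 'Beta']]
staffs = [['William', 'James', 'Noah', 'Oliver'],
          ['Benjamin', 'Oliver', 'William'], ['Oliver', 'Benjamin']]


def pair_counts_fast(staffs, processes):
    """Count each name once, then multiply the two counts per pair."""
    count_staffs = Counter(s for sub in staffs for s in sub)
    count_processes = Counter(p for sub in processes for p in sub)
    return {
        f"{s}_{p}": cs * cp
        for (s, cs), (p, cp) in product(count_staffs.items(),
                                        count_processes.items())
    }


def pair_counts_brute(staffs, processes):
    """Build every pair explicitly and count them (slow reference version)."""
    return dict(Counter(
        f"{s}_{p}"
        for p_sub in processes for p in p_sub
        for s_sub in staffs for s in s_sub
    ))


fast = pair_counts_fast(staffs, processes)
brute = pair_counts_brute(staffs, processes)
print(fast == brute)          # True
print(fast["William_Delta"])  # 4
```

Both functions return the same dictionary, but the fast version never materialises the full list of pairs.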
CodePudding user response:
You can use collections.Counter together with chain.from_iterable and product from itertools:
from collections import Counter
from itertools import product, chain
output = Counter(
    f"{s}_{p}" for p, s in
    product(*map(chain.from_iterable, [processes, staffs]))
)
# file is the path of the output file; it must be opened for writing
with open(file, 'w') as outfile:
    for name, count in output.items():
        outfile.write(f"{name}@{count}\n")
A slightly more verbose version would be:
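all_processes = chain.from_iterable(processes)
all_staffs = chain.from_iterable(staffs)
# product yields (process, staff) tuples, so unpack them in that order
name_counts = Counter(f"{s}_{p}" for p, s in product(all_processes, all_staffs))
# file is the path of the output file; it must be opened for writing
with open(file, 'w') as outfile:
    for name, count in name_counts.items():
        outfile.write(f"{name}@{count}\n")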
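For readers unfamiliar with the two itertools helpers, here is a small sketch on a shortened version of the question's data. chain.from_iterable flattens one level of nesting, and product then pairs every flattened process with every flattened staff name. The variable names flat_processes and flat_staffs are only for illustration.

```python
from collections import Counter
from itertools import chain, product

# shortened sample data, not the full lists from the question
processes = [['Delta', 'Zeta'], ['Delta']]
staffs = [['William'], ['Oliver', 'William']]

# flatten one level of nesting; list() is used here only so we can print them
flat_processes = list(chain.from_iterable(processes))
flat_staffs = list(chain.from_iterable(staffs))
print(flat_processes)  # ['Delta', 'Zeta', 'Delta']
print(flat_staffs)     # ['William', 'Oliver', 'William']

# product yields (process, staff) tuples; Counter tallies the joined names
name_counts = Counter(
    f"{s}_{p}" for p, s in product(flat_processes, flat_staffs)
)
print(name_counts["William_Delta"])  # 4
print(name_counts["Oliver_Zeta"])    # 1
```

Note that this approach still generates one string per pair, so len(flat_processes) * len(flat_staffs) strings in total. For very long inputs, multiplying the per-name counts does less work.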