Count the number of occurrences of values in a dictionary of lists in Python

I would like to count the occurrences of the values in my dict:

test = {
  "Staph": ["grp1","grp2","grp3"],
  "Lacto": ["grp2","grp3","grp4","grp5"],
  "Bacilus": ["grp2","grp4","grp6"]
}

I want to know how many of my keys each group belongs to. For example:

grp1 is only in "Staph", so grp1 = 1, and grp2 is in "Staph", "Lacto" and "Bacillus", so grp2 = 3

grp1 = 1 , grp2 = 3 , grp3 = 2, grp4 = 2 , grp5 = 1, grp6 = 1

After that, I would like to count how often each of those numbers occurs. For example:

I have grp1 = 1, grp5 = 1 and grp6 = 1, so the number of groups that appear in only one key is 3. Likewise, grp3 = 2 and grp4 = 2, so the number of groups shared by exactly two keys is 2.

So I would like a result like this:

number_of_n: the number of groups that appear in exactly n keys

Staph      grp1       grp2       grp3
Lacto                 grp2       grp3         grp4     grp5
Bacillus              grp2                    grp4               grp6
            1          3          2            2        1         1 


number_of_1 = 3
number_of_2 = 2
number_of_3 = 1

I hope that is clear. Thank you for your answers.

CodePudding user response：

There you go :)

test = {
  "Staph": ["grp1","grp2","grp3"],
  "Lacto": ["grp2","grp3","grp4","grp5"],
  "Bacilus": ["grp2","grp4","grp6"]
}

groups = set()

for i,j in test.items():
    for k in j:
        groups.add(k)

counts = []

new_test = {}

for k in groups:
    for i in test.keys():
        if k in test[i]:
            if k not in new_test:
                new_test[k] = 1
            else:
                new_test[k] += 1
print(new_test)

values = [i for i in new_test.values()]

values_set = set(values)

count_values = []


for i in values_set:
    count = 0
    for j in values:
        if i == j:
            count += 1
    count_values.append([i,count])

print(count_values)

CodePudding user response：

You could use collections.Counter to simplify the code considerably:

test = {
  "Staph": ["grp1","grp2","grp3"],
  "Lacto": ["grp2","grp3","grp4","grp5"],
  "Bacilus": ["grp2","grp4","grp6"]
}

from collections import Counter

grp_counter = Counter()
for k, v in test.items():
    grp_counter.update(Counter(v))

print(grp_counter)
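
The snippet above only covers the first step (how many keys each group appears in). A minimal sketch of the second step, assuming the same test dict with the "grp5" spelling from the question's expected output, is to run a second Counter over the values of the first one. The name number_of is only illustrative, and the set() call is an added guard in case a group is listed twice under the same key:

```python
from collections import Counter

test = {
    "Staph": ["grp1", "grp2", "grp3"],
    "Lacto": ["grp2", "grp3", "grp4", "grp5"],
    "Bacilus": ["grp2", "grp4", "grp6"],
}

# Step 1: for each group, count the keys whose list contains it.
# set(groups) makes sure a group repeated within one list is counted once.
grp_counter = Counter(g for groups in test.values() for g in set(groups))

# Step 2: count how many groups share each count.
number_of = Counter(grp_counter.values())

for n, times in sorted(number_of.items()):
    print(f"number_of_{n} = {times}")
```

With the data from the question this prints number_of_1 = 3, number_of_2 = 2 and number_of_3 = 1.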

CodePudding user response：

Do the following:

import pandas as pd

test = {
    "Staph": ["grp1", "grp2", "grp3"],
    "Lacto": ["grp2", "grp3", "grp4", "grp5"],
    "Bacilus": ["grp2", "grp4", "grp6"]
}

# prepare data
all_values = sorted(set().union(*test.values()))
data = {key: [val if val in values else None for val in all_values] for key, values in test.items()}

# construct dataframe from data
df = pd.DataFrame.from_dict(data,orient="index")

# compute counts
row = df.notna().sum().T
row.name = "Counts"

# append counts as a new row (DataFrame.append was removed in pandas 2.0, so use pd.concat)
res = pd.concat([df, row.to_frame().T]).fillna("")
print(res)

Output

            0     1     2     3     4     5
Staph    grp1  grp2  grp3                  
Lacto          grp2  grp3  grp4  grp5      
Bacilus        grp2        grp4        grp6
Counts      1     3     2     2     1     1
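
The Counts row above answers the first part of the question. For the second part (how many groups share each count), one possible sketch in pandas is to flatten the dict into (key, group) pairs and apply value_counts to the per-group counts. The column names "key" and "group" and the variable names are assumptions made for this example, not part of the answer above:

```python
import pandas as pd

test = {
    "Staph": ["grp1", "grp2", "grp3"],
    "Lacto": ["grp2", "grp3", "grp4", "grp5"],
    "Bacilus": ["grp2", "grp4", "grp6"],
}

# One row per (key, group) pair
pairs = pd.DataFrame(
    [(key, group) for key, groups in test.items() for group in groups],
    columns=["key", "group"],
)

# Number of distinct keys that contain each group
keys_per_group = pairs.groupby("group")["key"].nunique()

# Number of groups that share each count
number_of = keys_per_group.value_counts().sort_index()

for n, times in number_of.items():
    print(f"number_of_{n} = {times}")
```

Using nunique rather than a plain row count means a group listed twice under the same key is still counted once for that key.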