Updating dictionary within loops

I have a dictionary in which the keys are group names and the values are gene lists.

I want to add new genes to each group's list by looping through a species_list.

Here is my pseudocode:

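groups = ["group1", "group2"]
species_list = ["spA", "spB"]
gene_list = {}  # maps group name -> list of genes

def get_genes(group, sp):
    # Placeholder body: the real function returns the genes for this group and species.
    return []

for sp in species_list:
    for group in groups:
        gene_list[group] = get_genes(group, sp)  # assignment replaces the stored list
        gene_list.update(get_genes(group, sp))   # the line I'm unsure about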

The problem with this code is that the previous genes are overwritten by the new ones instead of the new genes being added to the dictionary. My question is where I should put the following line, although I'm not sure this is the only problem.

gene_list.update(get_genes(group,sp))

The data I have looks like this dataframe:

import pandas as pd

data = {"group1": ["geneA1", "geneA2"],
        "group2": ["geneB1", "geneB2"]}
pd.DataFrame.from_dict(data).T

The data I want to create should look like this:

data = {"group1": ["geneA1", "geneA2", "geneX"],
        "group2": ["geneB1", "geneB2", "geneX"]}
pd.DataFrame.from_dict(data).T

So in this case, "geneX" stands for the new genes returned by the get_genes function for each species, which should then be added to the existing dictionary.

Any help would be much appreciated!!

CodePudding user response:

You need to add the new genes to the list already stored under each dictionary key, not assign a new list to that key.

Use setdefault() to supply an empty list as the default when the dictionary key doesn't exist yet.

for sp in species_list:
    for group in groups:
        gene_list.setdefault(group, []).extend(get_genes(group, sp))
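
To see the effect end to end, here is a minimal runnable sketch. The get_genes below is a hypothetical stand-in that just builds a made-up gene name from the group and species (the real function isn't shown in the question), and gene_list starts from the sample data in the question:

```python
groups = ["group1", "group2"]
species_list = ["spA", "spB"]

# Existing data, taken from the question.
gene_list = {"group1": ["geneA1", "geneA2"],
             "group2": ["geneB1", "geneB2"]}

def get_genes(group, sp):
    # Hypothetical stand-in: the real lookup is not shown in the question.
    return [f"gene_{group}_{sp}"]

for sp in species_list:
    for group in groups:
        # Extend the list already stored under the key (or a new empty one).
        gene_list.setdefault(group, []).extend(get_genes(group, sp))

print(gene_list)
```

Note that extend() adds each returned gene individually, whereas append() would add the whole returned list as a single nested element.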

CodePudding user response:

From what I understand, you want to append a new gene to the list under each key. To do that:

new_gene = "gene_x"
data = {"group1": ["geneA1", "geneA2"], "group2": ["geneB1", "geneB2"]}

for value in data.values():
    value.append(new_gene)

print(data)

You can also use collections.defaultdict(list), which creates an empty list for any missing key so that you can append to it directly (see the docs for details).
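
A minimal sketch of the defaultdict variant, assuming the same sample data. The names new_genes, "geneX_spA", "geneX_spB" and "group3" are made up for illustration only:

```python
from collections import defaultdict

# defaultdict(list) creates an empty list the first time a missing key is accessed.
data = defaultdict(list, {"group1": ["geneA1", "geneA2"],
                          "group2": ["geneB1", "geneB2"]})

# Hypothetical new genes, one per species.
new_genes = ["geneX_spA", "geneX_spB"]

for gene in new_genes:
    for group in ["group1", "group2", "group3"]:
        # "group3" is not in the original data; it is created on first access.
        data[group].append(gene)

print(dict(data))
```

The trade-off compared with setdefault() is that a defaultdict also creates a key when you merely read a missing one, so it fits best when every lookup is meant to add data.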