Updating dictionary within loops

Time:11-13

I have a list of dictionaries in which keys are "group_names" and values are gene_lists.

I want to update each dictionary with a new list of genes by looping through a species_list.

Here is my pseudocode:

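groups = ["group1", "group2"]
species_list = ["spA", "spB"]
gene_list = {}  # maps group name -> list of genes

def get_genes(group, sp):
    # pseudocode: look up the genes for this group in this species
    return genes

for sp in species_list:
    for group in groups:
        gene_list[group] = get_genes(group, sp)
        gene_list.update(get_genes(group, sp))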

The problem with this code is that the genes already stored for a group are overwritten by the new ones instead of the new ones being added to the dictionary. My question is where I should put the following line, although I'm not sure this is the only problem.

gene_list.update(get_genes(group,sp))

The data I have looks like this dataframe:

import pandas as pd

data = {"group1": ["geneA1", "geneA2"],
        "group2": ["geneB1", "geneB2"]}
pd.DataFrame.from_dict(data).T

The data I want to create should look like this:

data = {"group1": ["geneA1", "geneA2", "geneX"],
        "group2": ["geneB1", "geneB2", "geneX"]}
pd.DataFrame.from_dict(data).T

Here, "geneX" stands for the new genes that the get_genes function returns for each species; these should be added to the lists already in the dictionary.

Any help would be much appreciated!!

CodePudding user response:

You need to add to the list stored under each dictionary key rather than assign a new list to it.

Use setdefault() to provide a default empty list if the dictionary key doesn't exist yet.

for sp in species_list:
    for group in groups:
        gene_list.setdefault(group, []).extend(get_genes(group, sp))
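
To see the effect end to end, here is a minimal runnable sketch. The body of get_genes is a hypothetical stand-in (the real function is not shown in the question), and the starting dictionary is the sample data from the question.

```python
# Hypothetical stand-in for the asker's get_genes(); the real one is not shown.
def get_genes(group, sp):
    return [f"{group}_{sp}_gene"]

groups = ["group1", "group2"]
species_list = ["spA", "spB"]

# Sample data from the question: group name -> list of genes.
gene_list = {
    "group1": ["geneA1", "geneA2"],
    "group2": ["geneB1", "geneB2"],
}

for sp in species_list:
    for group in groups:
        # setdefault() returns the existing list (or inserts an empty one),
        # and extend() adds the new genes to it instead of replacing it.
        gene_list.setdefault(group, []).extend(get_genes(group, sp))

print(gene_list["group1"])
# ['geneA1', 'geneA2', 'group1_spA_gene', 'group1_spB_gene']
```

Because extend() mutates the list in place, no separate gene_list.update(...) call should be needed.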

CodePudding user response:

From what I understand, you want to append a new gene to the list under each key. To do that:

new_gene = "gene_x"
data = {"group1": ["geneA1", "geneA2"], "group2": ["geneB1", "geneB2"]}

for value in data.values():
    value.append(new_gene)

print(data)

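You can also use collections.defaultdict(list), which creates an empty list the first time a key is accessed, so you can append or extend without checking whether the key exists (see the docs for details).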
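
A short sketch of the defaultdict variant, assuming the dictionary starts empty. As before, get_genes is a hypothetical placeholder and the name genes_by_group is made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the asker's get_genes(); the real one is not shown.
def get_genes(group, sp):
    return [f"{group}_{sp}_gene"]

# Missing keys get an empty list automatically on first access.
genes_by_group = defaultdict(list)

for sp in ["spA", "spB"]:
    for group in ["group1", "group2"]:
        genes_by_group[group].extend(get_genes(group, sp))

print(dict(genes_by_group))
# {'group1': ['group1_spA_gene', 'group1_spB_gene'],
#  'group2': ['group2_spA_gene', 'group2_spB_gene']}
```

If the dictionary already holds data, defaultdict(list, existing_dict) should keep the existing entries while still supplying empty lists for new keys.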
