I've got a dataframe column, colA, containing lists, and I'm trying to populate a new column with the values from each list in colA that aren't present in a secondary list.
d = {'colA': [['UVB', 'NER', 'GGR'], ['KO'], ['ERK1', 'ERK2'], []]}
df = pd.DataFrame(data=d)
The code I've tried is:
finaldf['colB'] = [i for i in list(finaldf.AllGenes) if i not in List]
But this just populates colB with the same lists of values that are in colA.
CodePudding user response:
It's not totally clear what you want, but assuming you want to drop the excluded values from each row's list:
import pandas as pd

d = {'colA': [['UVB', 'NER', 'GGR'], ['KO'], ['ERK1', 'ERK2'], []]}
df = pd.DataFrame(data=d)
"""
colA
0 [UVB, NER, GGR]
1 [KO]
2 [ERK1, ERK2]
3 []
"""
# filter: keep only the elements of each row's list that are not excluded
dont_include = ["NER", "ERK2"]
df["colB"] = df["colA"].apply(
    lambda col_a: [e for e in col_a if e not in dont_include]
)
"""
colA colB
0 [UVB, NER, GGR] [UVB, GGR]
1 [KO] [KO]
2 [ERK1, ERK2] [ERK1]
3 [] []
"""
Try using this.
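As for why the attempt in the question leaves everything unchanged: that comprehension loops over the rows, so each `i` is a whole list, and a whole list is never an element of the exclusion list. Here is a minimal sketch of the difference, reusing the sample data above. The names `row_level` and `exclude` are only illustrative, and converting the exclusion list to a set is an optional tweak for faster lookups, not something the question requires.

```python
import pandas as pd

df = pd.DataFrame({"colA": [["UVB", "NER", "GGR"], ["KO"], ["ERK1", "ERK2"], []]})
dont_include = ["NER", "ERK2"]

# Row-level test (what the question's comprehension does): each row is a
# whole list, which is never equal to a string in dont_include, so every
# row passes through untouched.
row_level = [row for row in df["colA"] if row not in dont_include]

# Element-level test: look inside each row's list instead.
# A set makes the membership check cheap when the exclusion list is long.
exclude = set(dont_include)
df["colB"] = [[e for e in row if e not in exclude] for row in df["colA"]]

print(df)
```

This gives the same colB as the `apply` version; the nested comprehension is just another way to write it.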