How do I iterate through a column with lists in each cell to find errors?

Time:02-16

I have a column with a list in each row, and I would like to check whether any of those lists contains duplicates.

updated_df.groupby('value 1')['value 2'].apply(list).reset_index(name='value 2')

| value 1 | value 2 |
|:--------|:--------|
| 25      | [22,22] |
| 265     | [4]     |
| 257     | [1,1,7] |

My intention is to add an adjacent column that contains True (or some other indicator) when a row's list has duplicates, and False when it does not.

Thanks!

CodePudding user response:

Assuming you created the dataframe with the statement you posted, you can run a similar groupby to check whether each group contains a duplicate and save the result as a column:

import pandas as pd

# Create your dataframe
df = pd.DataFrame({
    'value 1':[25,25,265, 257, 257, 257],
    'value 2':[22,22, 4,1,1,7],
})

# Your groupby
df_new = df.groupby('value 1')['value 2'].apply(list).reset_index(name='value 2')

# Add the duplicate column (both groupby calls sort by 'value 1', so the rows line up)
df_new['true_false'] = df.groupby('value 1')['value 2'].agg(lambda x : x.duplicated().any()).tolist()

print(df_new)

Output:

   value 1    value 2  true_false
0       25   [22, 22]        True
1      257  [1, 1, 7]        True
2      265        [4]       False
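
If you only have the grouped frame (the one with a list in each cell) and not the original long-format data, you could also check each list directly. The following is a minimal sketch, not the only way to do it: the helper name `has_duplicates`, the frame name `grouped`, and the new column name are made up for illustration, and it assumes the list elements are hashable (numbers, strings, and so on).

```python
import pandas as pd


def has_duplicates(values):
    """Return True if any element of the list appears more than once."""
    # A set drops repeated elements, so a shorter set means duplicates exist.
    return len(values) != len(set(values))


# Hypothetical frame shaped like the grouped result in the question
grouped = pd.DataFrame({
    'value 1': [25, 265, 257],
    'value 2': [[22, 22], [4], [1, 1, 7]],
})

# Apply the check to each cell of the list column
grouped['has_duplicates'] = grouped['value 2'].apply(has_duplicates)

print(grouped)
```

This works row by row on the lists themselves, so it does not depend on the row order of a second groupby.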