Data frame: match two columns and remove repeated values from the second column if they exist in the first c

Time:09-06

In a data frame, I want to match two columns: if any value from the second column also appears anywhere in the first column, that value should be removed from the second column.

col1 col2
1   
2     1
3     9
4
5     1
6     2

Output

col1 col2
1
2
3    9
4
5
6

Here, 1 and 2 from col2 also appear in col1, so these repeated values should be removed from col2. The 9 stays because it does not occur in col1.

CodePudding user response:

Using Series.mask together with isin to match values and replace them, we can do something along the lines of:

df['col2'] = df['col2'].mask(pd.to_numeric(df['col2']).isin(df['col1']), "")
col1    col2
0   1   
1   2   
2   3   9.0
3   4   
4   5   
5   6   
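
For reference, a self-contained sketch of the same mask/isin idea. It assumes the blank cells in col2 are missing values (NaN) rather than empty strings, and the helper name drop_values_seen_in is made up for illustration; it is not part of pandas.

```python
import numpy as np
import pandas as pd


def drop_values_seen_in(target: pd.Series, reference: pd.Series) -> pd.Series:
    """Return target with every value that also occurs in reference set to NaN."""
    # isin builds a boolean mask of matches; mask() replaces those positions with NaN.
    # dropna() keeps missing values in the reference from counting as matches.
    return target.mask(target.isin(reference.dropna()))


# Data from the question, with NaN standing in for the empty cells (an assumption).
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 6],
    "col2": [np.nan, 1, 9, np.nan, 1, 2],
})

df["col2"] = drop_values_seen_in(df["col2"], df["col1"])
print(df)
```

Leaving the removed cells as NaN keeps col2 numeric. If blank cells are preferred for display, df.fillna("") can be applied as a last step, at the cost of turning the column into object dtype.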

CodePudding user response:

import pandas as pd

# Data from the question; None marks the empty cells in col2
col1 = [1, 2, 3, 4, 5, 6]
col2 = [None, 1, 9, None, 1, 2]

df = pd.DataFrame({'col1': col1, 'col2': col2})

# iterate over columns
for col in df.columns:
    # remove values that are in previous columns
    for prev_col in df.columns[:df.columns.get_loc(col)]:
        # keep values not found in the earlier column; matches become NaN
        df[col] = df[col].where(~df[col].isin(df[prev_col]))

print(df)

# OUTPUT
#    col1  col2
# 0     1   NaN
# 1     2   NaN
# 2     3   9.0
# 3     4   NaN
# 4     5   NaN
# 5     6   NaN
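
To see why the loop walks over every earlier column, here is a small sketch with a third column. The data and the function name remove_seen_in_earlier_columns are invented for illustration, and removed values are assumed to become NaN.

```python
import pandas as pd


def remove_seen_in_earlier_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Set to NaN any value that already occurs in a column to its left."""
    result = frame.copy()  # leave the caller's frame untouched
    for position, col in enumerate(result.columns):
        for prev_col in result.columns[:position]:
            # Compare against the non-missing values of the earlier column.
            seen = result[prev_col].dropna()
            result[col] = result[col].mask(result[col].isin(seen))
    return result


# Hypothetical three-column input.
example = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": [2, 7, 8],
    "col3": [7, 3, 9],
})

cleaned = remove_seen_in_earlier_columns(example)
print(cleaned)
```

In this sketch, col2 loses the 2 because it occurs in col1, and col3 loses the 3 (seen in col1) and the 7 (seen in col2), so only the 9 survives in col3.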