Pandas dataframe - Removing repeated/duplicate column in dataframe but keep the values

Time:12-14

I have a dataframe with duplicate column names. I want to remove the repeated columns, but I need to keep their values.


I want to remove the second C and D columns at the end, but move their values into the same row of the first C and D columns.

df = df.loc[:,~df.columns.duplicated(keep='first')]

I tried this code. It removes the duplicate columns and keeps the first occurrence, but it also discards the values stored in the removed columns.

CodePudding user response:

Example

First, build a minimal, reproducible example:

import pandas as pd

# Columns B and D appear twice; each row has a value in only one copy
data = [[0, 1, 2, 3, None, None],
        [1, None, 3, None, 2, 4],
        [2, 3, 4, 5, None, None]]
df = pd.DataFrame(data, columns=list('ABCDBD'))

df

    A   B   C   D   B   D
0   0   1.0 2   3.0 NaN NaN
1   1   NaN 3   NaN 2.0 4.0
2   2   3.0 4   5.0 NaN NaN

Code

df.groupby(level=0, axis=1).first()

result:

    A   B   C   D
0   0.0 1.0 2.0 3.0
1   1.0 2.0 3.0 4.0
2   2.0 3.0 4.0 5.0
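
Note that in recent pandas versions the axis=1 argument to groupby is deprecated, so the one-liner above may emit a warning or fail depending on your version. A minimal sketch of an equivalent approach, assuming the same example data as above: transpose so the duplicated column names become the index, group on that index, take the first non-null value in each group, and transpose back. The variable name result is only for illustration.

```python
import pandas as pd

# Same example data as above: columns B and D are duplicated
data = [[0, 1, 2, 3, None, None],
        [1, None, 3, None, 2, 4],
        [2, 3, 4, 5, None, None]]
df = pd.DataFrame(data, columns=list('ABCDBD'))

# Transpose so column names become the row index, group rows that share
# a name, keep the first non-null value per group, then transpose back.
result = df.T.groupby(level=0).first().T

print(result)
```

This relies on first() skipping NaN within each group, which is why the values from the later duplicate columns fill the gaps in the earlier ones. If two columns with the same name both hold a value in the same row, the leftmost one wins.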