Pandas GroupBy and Remove Duplicates Without Shifting Cells

Time:12-01

I have a somewhat large array (~3000 rows) in which the first column contains duplicate string values, each repeated a varying number of times. I want to remove these duplicates without shifting the other cells in this column.

Input

row/rack     shelf    tilt
row1.rack1     B       5
row1.rack1     A       nan
row1.rack2     C       nan
row1.rack2     B       nan
row1.rack2     A       17

Desired Output

row/rack     shelf    tilt
row1.rack1     B       5
               A       nan
row1.rack2     C       nan
               B       nan
               A       17

Is there a good way to do this? I've been searching through Stack Overflow and other sites but haven't been able to find anything like this.

CodePudding user response:

Using .duplicated and .loc:

df.loc[df['row/rack'].duplicated(keep='first'),'row/rack'] = ''

print(df)

     row/rack shelf  tilt
0  row1.rack1     B   5.0
1                 A   NaN
2  row1.rack2     C   NaN
3                 B   NaN
4                 A  17.0
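
For anyone who wants to run the one-liner above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question's Input table, and the variable name `is_repeat` is only an illustrative choice, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample data (values copied from the Input table).
df = pd.DataFrame({
    "row/rack": ["row1.rack1", "row1.rack1", "row1.rack2", "row1.rack2", "row1.rack2"],
    "shelf": ["B", "A", "C", "B", "A"],
    "tilt": [5, np.nan, np.nan, np.nan, 17],
})

# duplicated(keep="first") is False for the first occurrence of each label
# and True for every later repeat of that label.
is_repeat = df["row/rack"].duplicated(keep="first")

# Overwrite only the repeated labels; no rows are dropped, so nothing shifts.
df.loc[is_repeat, "row/rack"] = ""

print(df)
```

Because only cell values are overwritten, the frame keeps all five rows and the shelf and tilt columns stay aligned with their original positions.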

CodePudding user response:

Mask the duplicates with empty strings:

df["row/rack"] = df["row/rack"].mask(df["row/rack"].duplicated(), "")

>>> df
     row/rack shelf  tilt
0  row1.rack1     B   5.0
1                 A   NaN
2  row1.rack2     C   NaN
3                 B   NaN
4                 A  17.0
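
One caveat that may or may not matter for your data: `duplicated()` flags every repeat anywhere in the column, not just consecutive ones. If the same rack label could reappear later in a separate block, a possible variation is to compare each label with the one directly above it using `shift()`. The sketch below uses made-up data (not from the question) to show the difference. The names `global_blank` and `consecutive_blank` are hypothetical.

```python
import pandas as pd

# Hypothetical data: "row1.rack1" appears in two separate blocks.
df = pd.DataFrame({
    "row/rack": ["row1.rack1", "row1.rack1", "row1.rack2", "row1.rack1"],
    "shelf": ["B", "A", "C", "D"],
})

labels = df["row/rack"]

# duplicated() blanks every repeat, including the label that starts the
# second "row1.rack1" block.
global_blank = labels.mask(labels.duplicated(), "")

# Comparing each label with the previous row blanks only consecutive repeats,
# so the first row of every block keeps its label.
same_as_previous = labels.eq(labels.shift())
consecutive_blank = labels.mask(same_as_previous, "")

print(global_blank.tolist())
print(consecutive_blank.tolist())
```

For the sample data in the question, both approaches give the same result, since each label occupies a single contiguous block.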