How to add a counter to pandas duplicated index?

Time:10-22

I want to transform the duplicated values of an index by adding a counter such as -01, -02, ..., -0n to every duplicated item, leaving non-duplicated labels unchanged.

That is, I want to transform index values from these:

one
one
two
three
three
four

to these:

one-01
one-02
two
three-01
three-02
four

What approach can I use?

CodePudding user response:

Use GroupBy.cumcount for the counter and append it only to duplicated labels, selected with Index.duplicated and Series.where:

import pandas as pd

df = pd.DataFrame({'a': range(6)},
                  index=['one', 'one', 'two', 'three', 'three', 'four'])

# Append a zero-padded counter to duplicated labels; unique labels get an empty suffix
df.index += (df.groupby(level=0).cumcount().add(1).astype(str).str.zfill(2).radd('-')
               .where(df.index.duplicated(keep=False), ''))
print(df)
          a
one-01    0
one-02    1
two       2
three-01  3
three-02  4
four      5
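
To make the chained expression easier to follow, here is a sketch that splits it into named steps. The names counter, mask, suffix and new_index are illustrative only and do not appear in the answer above; the final concatenation uses a plain list comprehension for clarity.

```python
import pandas as pd

df = pd.DataFrame({'a': range(6)},
                  index=['one', 'one', 'two', 'three', 'three', 'four'])

# 0-based position of each row within its group of equal index labels
counter = df.groupby(level=0).cumcount()

# True for every label that occurs more than once, including its first occurrence
mask = df.index.duplicated(keep=False)

# '-01', '-02', ... for duplicated labels, empty string for unique ones
suffix = (counter + 1).astype(str).str.zfill(2).radd('-').where(mask, '')

# Concatenate each original label with its suffix
new_index = pd.Index([label + s for label, s in zip(df.index, suffix)])
print(list(new_index))
```

Passing keep=False to Index.duplicated is what marks the first occurrence as a duplicate too, so 'one' becomes 'one-01' rather than staying unchanged.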

Or use numpy.where:

import numpy as np

df.index = np.where(df.index.duplicated(keep=False),
                    (df.index + df.groupby(level=0).cumcount().add(1)
                                  .astype(str).str.zfill(2).radd('-')),
                    df.index)
print(df)
          a
one-01    0
one-02    1
two       2
three-01  3
three-02  4
four      5
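
If the renaming is needed in more than one place, the same idea can be wrapped in a small function. This is only a sketch: add_duplicate_counter and its width and sep parameters are hypothetical names introduced here, not part of pandas, and labels are assumed to be convertible to strings.

```python
import pandas as pd

def add_duplicate_counter(index, width=2, sep='-'):
    """Hypothetical helper: suffix repeated labels with a zero-padded counter."""
    # Work on a Series with a default integer index so repeated labels cause no alignment issues
    labels = pd.Series(list(index)).astype(str)
    # 1-based position of each label within its group of equal labels
    counter = labels.groupby(labels).cumcount() + 1
    suffix = sep + counter.astype(str).str.zfill(width)
    # Labels that occur more than once, first occurrence included
    mask = labels.duplicated(keep=False)
    # Keep unique labels as they are; replace duplicated ones with label + suffix
    return pd.Index(labels.where(~mask, labels + suffix), name=index.name)

df = pd.DataFrame({'a': range(6)},
                  index=['one', 'one', 'two', 'three', 'three', 'four'])
df.index = add_duplicate_counter(df.index)
print(df)
```

A larger width, for example width=3, would give suffixes such as -001, which may matter when a label repeats more than 99 times.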

CodePudding user response:

You can do the same for a regular column with pandas:

import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'four']})

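# Number each occurrence within its group of equal values, starting at 01
counter = df.groupby('A').cumcount().add(1).astype(str).str.zfill(2)

# Append the counter only where the value occurs more than once
df['A'] = df['A'].where(~df['A'].duplicated(keep=False), df['A'] + '-' + counter)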